Stefano Tombolini
An empirical investigation of the Italian
digital publishing market
This work is licensed under a Creative Commons Attribution 3.0 License. To view a
copy of this license, visit bit.ly/T67jS...
To Giovanni and Paolino.
In December 2011, the EUROPEAN COMMISSION had already opened formal antitrust proceedings against the same firms for anti-...
quire very precise element spacing or include many image elements that
have to be precisely positioned”22, but pose techni...
size of the screen it is displayed on” 26 and to the viewing preferences of the
user (e.g. bigger font size).
This is a ve...
As a consequence, the EPUB and the AZW formats can be considered
very close substitutes.30

II.2. SUPPLY CHAIN
Fig. 2 pres...
Fig. 2: The publishing industry supply chain

Source: own elaboration on B. BLAZEJEWSKI, EPUB: new open standard in e-publ...
• Publishing: the author's input is treated and converted into output formats of quality high enough to be printed and bo...
It is very likely that he will need to outsource part of it, which explains
the existence of specialized companies (self-p...
In May 2012, 32,000 titles were available in digital format, 4.4% of the
2012 book catalog, a 180% increase over the previ...
• a supply-side survey about book production, administered to Italian publishers (almost 2,700 in total) in 2012;47
• a...
edition,53 while 79.1% of the Italian digital publications were protected by
DRM technologies54.55
In 2012, almost 5.5 mil...
III. REVIEW OF THE LITERATURE
Due to the youth of the digital publishing industry, especially in Italy,
our research subje...
In contrast, print format selection (hardcover vs. paperback, usually)
follows book title selection, which explains the pu...
books ranked by sales)67 consumers might be more likely to search for alternatives in their preferred channel.68
Unfortuna...
In 2002, associations of publishers and authors complained to AMAZON
CEO Jeff Bezos about the negative impact of the promo...
Used books may be poor substitutes for new books, mainly because of
eventual quality degradation and possible reseller unr...
• and video games.84
Among these product categories, books lag furthest behind in terms
of digitization: the total esti...
III.2. “SUPERSTARS” VS. “UNDERDOGS”
It has always been common knowledge among publishing industry
workers that “a small per...
best-sellers [defined in the paper as the top 124 best-sellers] were likely to have
generated nearly $1 billion in sales o...
Increased product availability implies a shift of the sales distribution towards the tail, i.e. more obscure titles, while...
From the supply-side point of view, information and communication
technology loosens physical constraints (virtual shelf-s...
Initially, the reason that prompted researchers to study the concentration of book sales was eminently practical.
Internet...
Even though extremely simple, the model in (1) fits the data fairly well
(R-squared: 0.8008) and is consistent enough with...
The stability across ranks and retailers, and over time, of β2 , which
measures “customers' relative tastes for popular an...
According to the “old” methodology, the proportion of total Amazon.com sales generated in 2008 by “niche” titles (defined ...
Fig. 4: Concentration can be a misleading measure of the “long tail”

Case 1A: 100 products are available and the top 50% ...
They analyze an identical selection of products, at an identical set of
prices, with the same order-fulfillment facilities...
In the case at hand, quantile linear regression is more appropriate than
OLS linear regression: the research is not concer...
creases substantially; it is now four times as high as in 2000. Many underdogs turn
out to be losers. We also find evidenc...
Fig. 5: Calibration between sales and rank

Source: M. D. SMITH, R. TELANG, Y. ZHANG, Analysis of the potential market for...
• For titles with high ranks (above 200,000), they “assign sales according to the expected sales that belong to the inter...
Fig. 6: Likelihood to pay for downloading a single e-book

Question: Assume you saw a new fiction e-book on an online serv...
Since 2008, the company has published more than 100,000 indie (independent) e-books distributed to multiple major retailer...
Historically, economic arguments have had, and still have, a central role
in public debates and political decisions about ...
For the titles composing Liebowitz Book Review Digest dataset,190 large
differences across subject categories emerge also ...
institutional framework of the publishing industry.
Data about prices, sales quantities, retail margins, royalties, and in...
So, under the assumption of positive yet marginally decreasing sales
growth in retail selling effort,198 there are two fir...
The margin varies by type of book. For example, the margin on academic books
is typically 25-30 percent, on literature 40 ...
Fig. 7: Zero-profit locus of price and output combinations

K is the fixed cost of book production, m is the marginal cost...
For the econometric estimation of price elasticity and retail margin
elasticity, the original dataset is divided into two ...
The empirical analysis suggests a price elasticity between -2 and -3 for
this dataset of academic and intellectual books, ...
So, according to (4), for such titles, the price elasticity of the demand
faced by the publisher is between -1.56 and -1.7...
websites of Amazon.com and BarnesandNoble.com during April, June, and
August 2001.
The period was characterized by major p...
However, “a firm maximizing dynamic profits might choose a price below […] [the] static profit-maximizing level,” 240 whic...
Log-linear models using sales rank data, scraped from the websites of
online retailers, are common in the stream of litera...
The relative price elasticities of the two online book retailers with respect to each other are similar: between -1.49 and...
Price elasticity for print books in the 20th and the 80th percentile is
about -1.6 and -1.3, respectively, vs. -3 and -1.9...
Then, with the propensity scores thus obtained, the two samples are
matched through the nearest neighbor and the stratific...
IV. E-BOOK PRICING BY MAJOR ITALIAN PUBLISHERS
Price information and, more in general, public catalog data for digital
boo...
• drm, a dummy set to one if the e-book is encrypted with DRM technologies275, to zero otherwise;
• watermark, a dummy ...
• Only a few e-books in the sample (2.76%) employ digital watermarking.
• Still fewer e-books in the sample do not employ...
Fig. 9: Pie chart of e-book subjects

Blue: non-fiction books, 56.25%;
Light blue: fiction books, 43.75%.
Source: own elab...
Tab. 2: Summary statistics for the quantitative catalog variables
Statistic    ebook.price    paper.price    ebook.pub.delay
10...

An empirical investigation of the Italian digital publishing market

Stefano Tombolini's master's thesis in business statistics on the digital publishing industry (dataset from Italy)

Published in: Business, Technology
Transcript of "An empirical investigation of the Italian digital publishing market"

  1. 1. Stefano Tombolini An empirical investigation of the Italian digital publishing market
  2. 2. This work is licensed under a Creative Commons Attribution 3.0 License. To view a copy of this license, visit bit.ly/T67jSf.
  3. 3. To Giovanni and Paolino.
  4. 4. ABSTRACT This thesis analyzes the Italian market for digital books (e-books) from a statistical and economic point of view, using previously unpublished catalog and sales data for the period 2010-2013. First, the pricing strategies of the major Italian publishing houses are described through OLS and quantile linear regression models, identifying multiple price points, on catalog data. Thanks to the high level of detail of the available cross-sectional dataset, it was possible to study the relationship between digital and print prices, an analysis that is original with respect to the reference literature. Second, the work examines the concentration and the price sensitivity, at the level of the individual title, of the sales of an Italian e-book distributor focused on small and medium-sized publishers and self-publishing. The reasoned use of concentration statistics and the linear regression model, estimated consistently with the longitudinal nature of the sales panel, reveal strong similarities between the demand for digital books and the demand for print books, rather than changes in purchasing patterns. This is due partly to intrinsic characteristics of the book market and partly to publishing supply policies developed by analogy with the print market, as highlighted by the analysis of catalog data. In the future, however, alternative business models based on tying and bundling strategies could emerge, owing to strong economic incentives in the Internet distribution of digital goods, as illustrated in the conclusions. Keywords: e-book, price elasticity, OLS linear regression, panel linear regression, quantile linear regression, concentration.
  5. 5. Table of contents
     Abstract
     Table of figures
     Table of tables
     I. Introduction
     II. The digital publishing industry
        II.1. History and technicalities
        II.2. Supply chain
        II.3. The Italian e-book market
     III. Review of the literature
        III.1. Digital “cannibalization” of physical sales
        III.2. “Superstars” vs. “underdogs”
        III.3. Survey and descriptive evidence
        III.4. Econometric evidence
     IV. E-book pricing by major Italian publishers
        IV.1. Catalog dataset
           IV.1.1. Descriptive statistics
           IV.1.2. Missing data
        IV.2. OLS linear regression results
        IV.3. Quantile linear regression results
     V. A “long-tail-oriented” distributor sales
        V.1. Sales dataset
           V.1.1. Panel structure
        V.2. Concentration analysis
        V.3. Panel linear regression results
     VI. Discussion
        VI.1. Limitations of the study
           VI.1.1. Catalog analysis
           VI.1.2. Sales analysis
        VI.2. Possible further developments
     Appendix
        Short glossary of the e-book: 3G, Adobe, Adobe DRM, AZW, DRM, E-book, E-book reader (e-reader), E Ink®, EPUB, LCD, PDF, Social DRM
  6. 6. Short glossary of the e-book (entries continued): Tablet, Wi-Fi
        Short chronology of the e-book: 1971, 1987, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2002, 2005, 2006, 2007, 2008, 2009, 2010, 2011
     Bibliography
  7. 7. Table of figures
     Fig. 1: Growth in US e-book revenue (2002-2011)
     Fig. 2: The publishing industry supply chain
     Fig. 3: Graphical illustration of the “long tail” hypothesis
     Fig. 4: Concentration can be a misleading measure of the “long tail”
     Fig. 5: Calibration between sales and rank
     Fig. 6: Likelihood to pay for downloading a single e-book
     Fig. 7: Zero-profit locus of price and output combinations
     Fig. 8: Pie chart of e-book protection mechanisms
     Fig. 9: Pie chart of e-book subjects
     Fig. 10: Histogram of ebook.price
     Fig. 11: Histogram of paper.price
     Fig. 12: Histogram of ebook.pub.delay
     Fig. 13: Scatter plot of the quantitative catalog variables
     Fig. 14: Residuals normal Q-Q plot (OLS)
     Fig. 15: Plot of residuals vs. fitted values (OLS)
     Fig. 16: Graphical view of const (QUANTREG)
     Fig. 17: Graphical view of paper.price (QUANTREG)
     Fig. 18: Graphical view of file.is.open (QUANTREG)
     Fig. 19: Graphical view of subj.fiction (QUANTREG)
     Fig. 20: Graphical view of ebook.pub.delay.pos (QUANTREG)
     Fig. 21: Graphical view of ebook.pub.delay.pos.sq (QUANTREG)
     Fig. 22: Graphical view of ebook.pub.delay.neg (QUANTREG)
     Fig. 23: Graphical view of ebook.pub.delay.neg.sq (QUANTREG)
     Fig. 24: Graphical view of the pseudo-R-squared (QUANTREG)
     Fig. 25: Bar chart of Stealth sales and catalog data
     Fig. 26: Lorenz curves of revenue distributions (TOT.SAMPLE)
     Fig. 27: Lorenz curves of unit sales distributions (TOT.SAMPLE)
     Fig. 28: Lorenz curves of revenue distributions (SUB.SAMPLE)
     Fig. 29: Lorenz curves of unit sales distributions (SUB.SAMPLE)
     Fig. 30: Plot of 95% confidence intervals for lnprice (PANEL)
     Fig. 31: A Google Trends query by author
     Fig. 32: A Google Trends query by title
     Fig. 33: Demand for bundles of information goods
  8. 8. Table of tables
     Tab. 1: Percentage of titles surviving more than 58 years
     Tab. 2: Summary statistics for the quantitative catalog variables
     Tab. 3: Correlation matrix of the catalog variables
     Tab. 4: Table of coefficients (LOGIT)
     Tab. 5: Table of coefficients – preliminary (OLS)
     Tab. 6: White test (OLS)
     Tab. 7: Table of coefficients – final (OLS)
     Tab. 8: Breusch-Godfrey test (OLS)
     Tab. 9: Ramsey RESET test (OLS)
     Tab. 10: Table of coefficients – 0.05th quantile (QUANTREG.05)
     Tab. 11: Table of coefficients – 0.25th quantile (QUANTREG.25)
     Tab. 12: Table of coefficients – 0.50th quantile (QUANTREG.50)
     Tab. 13: Table of coefficients – 0.75th quantile (QUANTREG.75)
     Tab. 14: Table of coefficients – 0.95th quantile (QUANTREG.95)
     Tab. 15: Gini coefficients of revenue distributions (TOT.SAMPLE)
     Tab. 16: Gini coefficients of unit sales distributions (TOT.SAMPLE)
     Tab. 17: Gini coefficients of revenue distributions (SUB.SAMPLE)
     Tab. 18: Gini coefficients of unit sales distributions (SUB.SAMPLE)
     Tab. 19: F test for individual effects (PANEL.1)
     Tab. 20: F test for individual effects (PANEL.2)
     Tab. 21: F test for individual effects (PANEL.3)
     Tab. 22: Hausman test (PANEL.1)
     Tab. 23: Hausman test (PANEL.2)
     Tab. 24: Hausman test (PANEL.3)
     Tab. 25: Table of coefficients (PANEL.1)
     Tab. 26: Table of coefficients (PANEL.2)
     Tab. 27: Table of coefficients (PANEL.3)
  9. 9. I. INTRODUCTION
     The present work is a statistical and economic study of original catalog and sales data for the Italian digital publishing market during the period 2010-2013. Essential historical, technical, and economic information about the digital publishing industry, and the Italian context in particular, is provided in chapter II. The review of the literature in chapter III is quite extensive, and not only for completeness' sake: a thorough review of the literature has been instrumental to the definition of the economic and statistical framework. Chapter IV analyzes catalog data for e-books by major Italian publishers through OLS and quantile linear regression, identifying multiple price points. Thanks to the detailed information available in the cross-sectional catalog dataset, it was possible to contribute a joint analysis of e-book and print book prices to the reference literature. An interpretation in terms of pricing behavior and market assumptions by incumbent print publishing houses is attempted. Chapter V presents the results of the analysis of sales data for a panel of titles distributed by a “long-tail-oriented” Italian e-book distributor. Concentration measures and panel linear regression models are used to investigate the distribution and the price sensitivity of sales on a title basis. We find little support in the data for the hypothesis of shifts in consumer tastes towards “niche” products. On the one hand, the finding is consistent with inherent characteristics of the book market; on the other, it might also be ascribed to a publishing offer based on analogy with the print market, as suggested by the analysis of catalog data.
  10. 10. Finally, in chapter VI, the methodological limitations of the research, in both technical and interpretative terms, are discussed. In addition to possible adjustments to the original models, alternative paradigms, more in line with the economics of digital markets for information goods, are outlined. Innovative tying and bundling strategies by e-book distributors could reshape the digital publishing industry and represent a serious competitive threat for incumbent players in the print publishing industry.
  11. 11. II. THE DIGITAL PUBLISHING INDUSTRY
     A basic understanding of the digital publishing industry supply chain requires some knowledge of the historical and technical aspects of e-book production and distribution. Sections II.1 and II.2 prepare the reader to understand the variables in the datasets and to interpret the results of the analyses. Readers unfamiliar with the digital publishing landscape might benefit from a quick survey of the short glossary and chronology in the Appendix.
     II.1. HISTORY AND TECHNICALITIES
     The digital storage of books probably dates as far back as the 1960s, in parallel to the development of the ASCII (American Standard Code for Information Interchange) character-encoding scheme.1 Over the years, the introduction of new encodings has allowed text files to represent many alphabets other than the English one. At least since the early 1990s, the supply side of the book market has engaged in the digitization of the production process; by that time, ADOBE Portable Document Format (PDF), which “made it possible to create complex text documents with professional grade software”2, was already available.3 However, it was only at the end of the 2000s, with the advent of specialized hardware devices such as the AMAZON Kindle e-reader and the APPLE iPad tablet computer, that consumers began to perceive e-books as a viable alternative to traditional printed books.4
     1 See ASCII, Wikipedia, accessed on 09/07/2013, bit.ly/WBUEXv.
     2 B. BLAZEJEWSKI, EPUB: new open standard in e-publishing, Université de Fribourg Suisse, 2011, p. 9, bit.ly/XKt7Cp.
     3 Portable Document Format, Wikipedia, accessed on 09/07/2013, bit.ly/WRAkm3.
     4 B. BLAZEJEWSKI, EPUB: new open standard in e-publishing, Université de Fribourg Suisse, 2011, p. 8, bit.ly/XKt7Cp.
During 2011, 20% of US Internet users purchased e-books, with sales reaching 8% and 18% of sales of trade books and of fiction books, respectively;5 the share of US book consumers who also buy e-books grew from 13% in 2010 to 17% in 2011.6

Fig. 1: Growth in US e-book revenue (2002-2011)
Source: M. D. SMITH, R. TELANG, Y. ZHANG, Analysis of the potential market for out-of-print ebooks, 08/04/2012, p. 30, fig. 1, available at SSRN: bit.ly/WcNhJq.

AMAZON, the multinational e-commerce company, is the most successful global market player: during 2011, 70% of US e-book consumers bought at least one of the 950,000 titles available in digital format on Amazon.com in the same year.7

5 Private estimates from SIMPLICISSIMUS BOOK FARM 2012 business plan.
6 Private estimates from SIMPLICISSIMUS BOOK FARM 2012 business plan.
7 Private estimates from SIMPLICISSIMUS BOOK FARM 2012 business plan.
In July 2010, AMAZON announced that, for the previous three months, “sales of books for its e-reader, the Kindle, outnumbered sales of hardcover books.”8 Six months later, Kindle books overtook paperback books to become the most popular format on Amazon.com.9 By May 2011, AMAZON was selling “more Kindle books than all print books – hardcover and paperback – combined”10.

These numbers are remarkable indeed: AMAZON has been selling printed books since 1995, whereas Kindle was introduced only in 2007,11 and the Kindle catalog is still a small fraction of the printed one.12 The figures do not even include free Kindle books, mostly out-of-copyright pre-1923 titles.13,14

The earnings reports of large book publishers for 2011 also suggest that “e-books are generally more profitable than print books,”15 despite roughly flat yearly revenues.

In April 2012, the US DEPARTMENT OF JUSTICE sued the technology giant APPLE and five major international book publishers (HACHETTE, HARPERCOLLINS, MACMILLAN, SIMON AND SCHUSTER, and PENGUIN) for violation of antitrust laws.16

8 C. CAIN MILLER, E-books top hardcovers at Amazon, “The New York Times”, 07/19/2010, nyti.ms/118yEuR.
9 Amazon.com now selling more Kindle books than print books, “Amazon Media Room: Press Releases”, 05/19/2011, bit.ly/119B0K2.
10 Ibid.
11 Ibid.
12 C. CAIN MILLER, E-books top hardcovers at Amazon, “The New York Times”, 07/19/2010, nyti.ms/118yEuR.
13 Ibid.
14 Amazon.com now selling more Kindle books than print books, “Amazon Media Room: Press Releases”, 05/19/2011, bit.ly/119B0K2.
15 L. HAZARD OWEN, Thanks to e-books, flat revenue is no problem for publishers, “TIME”, 03/30/2012, ti.me/UFbwiq.
16 US sues Apple and publishers over e-book prices, BBC, 04/11/2012, bbc.in/13fslFO.
In December 2011, the EUROPEAN COMMISSION had already opened formal antitrust proceedings against the same firms for anti-competitive practices.17 According to the accusations, book publishers teamed up with APPLE to restrain retail price competition in the e-book market.

Initially, e-books were sold under a wholesale model, in which publishers set a price and each retailer decides the cover price it wants to charge for the title. AMAZON set $9.99 as its own price ceiling for e-books, an aggressive marketing policy to attract customers to the Kindle platform.18 Publishers reacted by shifting to agency pricing, in which they fix the final customer price and pay a commission to the retailer. When APPLE launched the iBooks platform on iPad and iPhone, it accepted and advocated the adoption of such a pricing model for e-books. AMAZON, faced with the prospect of offering a smaller catalog than its competitors, had to stop its $9.99 price policy and accept agency terms from publishers.19,20

At the very beginning of the 2000s, the ADOBE PDF file format had already found success among consumers, thanks to the free ADOBE Acrobat Reader viewing tool, and was used to “pioneer the commercial distribution of ebooks in Internet”21. The PDF file format is page-oriented and provides very precise layout control; these features are convenient for “high quality publications that require very precise element spacing or include many image elements that have to be precisely positioned”22, but pose technical problems as far as the needs of the e-book market are concerned.

    […] the downturn of the PDF file format is that is quite difficult to read and use on small to medium size screens, like for example smartphones or very small tablet computers. The reason is that the text elements in PDF files are of a fixed size that is relative to the page size of the document and not to the size of the screen of the reader device. The user barely has the difficult choice between viewing the whole page with very small text characters or viewing only a magnified part of the page that has to be moved around all the time.23

During the 2000s, attempts by the digital publishing industry to address the file format issue led to the emergence of two different de facto standards. The OPEN EBOOK FORUM, an international publishing industry organization created in 2000 and later renamed INTERNATIONAL DIGITAL PUBLISHING FORUM (IDPF), proposed the Open eBook (OeB) format, later replaced by the EPUB format, “in an attempt to set a common industry standard”24. Meanwhile, MOBIPOCKET, a company founded in 2000, developed the proprietary MOBI file format, very similar to the Open eBook specification. In 2005, AMAZON acquired the company.

    […] the MOBI file format was modified and transformed into the AZW file format. The AZW file format is now a proprietary format of AMAZON and there is no public specification available.25

Unlike PDF, neither EPUB nor AZW allows pagination or precise page layout; however, their text content is reflowable: “it adapts itself to the size of the screen it is displayed on”26 and to the viewing preferences of the user (e.g. bigger font size). This is a very convenient feature for simple books that do not require precise layout and image positioning (unlike comics, textbooks, and other richly illustrated books).

The conventional wisdom about “open and standard formats” vs. “closed and proprietary formats” is that the former would be advantageous for producers, because of the complete control over the production process, and for consumers, because of the implicit guarantee against technological lock-in, whereas the latter would ensure distribution exclusivity and customer lock-in.27 This line of reasoning is broadly correct, but, in the specific case of EPUB vs. AZW, it must be qualified by two observations.28

First, AMAZON has taken care to provide interoperability for its format: to date, producers can easily create AZW files starting from EPUB files, and consumers can read AZW files with the official reading software, readily available for free on many platforms and devices other than the Kindle e-reader itself.

Second, in order to prevent copyright infringement, many publishers encrypt their EPUB files with DRM (Digital Rights Management)29 technologies, similar to those implemented by AMAZON in its AZW format. Usually, the user cannot copy and paste or print the content of encrypted e-books, which can be read only on a limited number of authorized DRM-compatible devices.

17 Antitrust: Commission opens formal proceedings to investigate sales of e-books, European Commission, 12/06/2011, bit.ly/ZBWTjW.
18 US sues Apple and publishers over e-book prices, BBC, 04/11/2012, bbc.in/13fslFO.
19 Ibid.
20 M. RICH, B. STONE, E-book price increase may stir readers' passions, “The New York Times”, 12/06/2010, nyti.ms/XF4v2l.
21 B. BLAZEJEWSKI, EPUB: new open standard in e-publishing, Université de Fribourg Suisse, 2011, p. 10, bit.ly/XKt7Cp.
22 Ibid., p. 29.
23 Ibid.
24 Ibid., p. 10. The EPUB file format is an open standard based on existing standard formats and algorithms: XML (eXtensible Markup Language), XHTML (eXtensible HyperText Markup Language), and ZIP (an open archiving file format).
25 Ibid., pp. 29-30.
26 Ibid., p. 20.
27 Ibid., p. 30.
28 Correspondence and conversations with SIMPLICISSIMUS BOOK FARM management.
29 See Digital Rights Management, Wikipedia, accessed on 09/07/2013, bit.ly/WzMF04.
As a consequence, the EPUB and the AZW formats can be considered very close substitutes.30

II.2. SUPPLY CHAIN

Fig. 2 presents a modified version of the publishing industry supply chain as depicted by Blazejewski,31 with the aim of providing a reference scheme and highlighting recent industry trends.

30 A note for the tech-savvy reader: hereafter EPUB refers to the widely adopted EPUB 2.0.1 specification. The latest EPUB 3.0 specification introduced many modifications, aimed at improving the presentation of multimedia content and the expression of complex mathematical notation. For more information, see B. BLAZEJEWSKI, EPUB: new open standard in e-publishing, Université de Fribourg Suisse, 2011, pp. 18-21, bit.ly/XKt7Cp. AMAZON responded with the development of its new KF8 (Kindle Format 8) file format. For more information, see Kindle Format 8 Overview, Amazon.com, accessed on 09/07/2013, amzn.to/Vx6NlD.
31 B. BLAZEJEWSKI, EPUB: new open standard in e-publishing, Université de Fribourg Suisse, 2011, p. 24, bit.ly/XKt7Cp.
Fig. 2: The publishing industry supply chain
Source: own elaboration on B. BLAZEJEWSKI, EPUB: new open standard in e-publishing, Université de Fribourg Suisse, 2011, p. 24, fig. VIII, bit.ly/XKt7Cp. For the sake of explanation, we have minimized the level of integration of the supply chain; in practice, we often observe a variety of higher degrees of vertical and horizontal integration (vertical integration between publishing houses and offline distributors, horizontal integration between offline and online bookstores, etc.).

The author's manuscript, typescript, or, more probably nowadays, digital text document (.doc, .docx, .rtf, .odt, etc.) has to be delivered to the reader in a suitable format, viz. a printed book or an e-book. We can identify three stages in the production and commercialization process.32

32 Again, these stages may or may not correspond to as many specialized market operators.
• Publishing: the author's input is treated and converted into output formats of quality high enough to be printed and bound, or displayed on e-book readers.
• Distribution: the output formats are cataloged, stocked, and distributed to retailers.
• Retailing: the printed and/or digital versions of the book are made available for purchase to the public.

While the second and the third stages are straightforward enough, the first stage needs a more in-depth analysis. During this phase, the author faces a choice: whether to sign a publishing contract with a publisher or to self-publish. The distinction is relevant from many points of view, and there is an ongoing debate about the advantages and disadvantages of each option. However, if we assume that the self-publishing author can find on the market the typical services provided by a publishing house (editing, translation, composition, etc.), it will suffice for us to neglect the technical aspects and focus on a few fundamental economic considerations.

On the one hand, a “traditional” author receives from his publishing house royalties on books sold, which provide “an incentive to the author to help market his book”33. The analogy of the author with “a salesman on commission”34 is intuitive and more persuasive than the analogy of royalties with taxes.

On the other hand, a self-publishing author has to arrange the whole production and commercialization process of his book.

33 G. BITTLINGMAYER, The elasticity of demand for books, resale price maintenance and the Lerner index, “Journal of Institutional and Theoretical Economics”, 148, 4, 1992, p. 590, bit.ly/17dxX0b.
34 Ibid., p. 591.
It is very likely that he will need to outsource part of it, which explains the existence of specialized companies (self-publishing platforms).

The higher the fixed and circulating capital requirements, the clearer the case for the role of intermediaries in the supply chain. By contrast, thanks to the introduction of low-cost and user-friendly desktop publishing, Internet distribution, and e-book reading devices, it is now virtually possible for an author to achieve a perfect vertical integration of the digital supply chain.35 Therefore, we may observe an even greater variety of degrees of supply chain integration in the future.

II.3. THE ITALIAN E-BOOK MARKET

In 2011, the Italian e-book market experienced fast growth. The number of e-book consumers reached 1.1 million people, approximately 2.3% of the Italian adult (14+) population, with a yearly growth rate of 59%.36 Sales of e-reading devices increased by 718%, from €16 million in 2010 to €131 million in 2011.37

In the same year, e-book sales still represented a tiny fraction of the publishing industry's total sales (€3 million vs. €1.3 billion, approximately),38 but the estimated 2011-2012 growth rate is 300%, with approximately €12 million worth of e-book sales.39

35 For a partial, yet noteworthy, example of disintermediation, see O. SOLON, J.K. Rowling's Pottermore details revealed: Harry Potter e-books and more, “Wired”, 06/23/2011, bit.ly/WLzGY4.
36 Private estimates from SIMPLICISSIMUS BOOK FARM 2012 business plan.
37 Private estimates from SIMPLICISSIMUS BOOK FARM 2012 business plan.
38 Private estimates from SIMPLICISSIMUS BOOK FARM 2012 business plan.
39 Private estimates from SIMPLICISSIMUS BOOK FARM 2012 business plan.
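The growth rates quoted in this section follow from the usual percentage-change formula. The following is a minimal Python sketch that uses only the rounded euro figures reported in the text; the function name `pct_growth` is illustrative and does not come from the source.

```python
def pct_growth(old: float, new: float) -> float:
    """Percentage change from `old` to `new`."""
    if old == 0:
        raise ValueError("growth is undefined for a zero base")
    return (new - old) / old * 100.0


# Rounded figures reported in the text, in million euros.
devices = pct_growth(16, 131)  # e-reading device sales, 2010 -> 2011
ebooks = pct_growth(3, 12)     # e-book sales, 2011 -> 2012 (estimated)

print(f"devices: {devices:.2f}%, e-books: {ebooks:.2f}%")
```

With the rounded inputs, the first rate comes out slightly above the 718% reported in the text; the small gap is presumably due to rounding in the published figures. The second rate matches the reported 300%.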
In May 2012, 32,000 titles were available in digital format, 4.4% of the 2012 book catalog, a 180% increase over the previous period.40

The e-book distribution platform Edigita, by FELTRINELLI, RCS, GEMS, and other Italian publishers, is the market leader in Italy, both in terms of value and catalog size.41 The publishing house MONDADORI, which distributes its own e-books, is comparable to Edigita in terms of value, in spite of a relatively small catalog. Stealth, the e-book distribution platform of SIMPLICISSIMUS BOOK FARM, is comparable to Edigita in terms of catalog size, with relatively low sales, because of its focus on small and medium publishers and self-publishing authors.42

Both the paper and the digital book retailing industries are quite crowded, not only by many publishing houses, distributors, and specialized retailers, but also by “outsiders”, such as consumer electronics retailers and telecommunications operators.43 However, the Italian market is probably being shaken by the entrance of the main global industry players, AMAZON in particular, with the national release of Amazon.it in September 2010 and Kindle in December 2011.44

On the 16th of May 2013, ISTAT (ISTITUTO NAZIONALE DI STATISTICA) published a comprehensive report about the supply and demand of books in Italy for the years 2011 and 2012.45 The report is based on the results of two different surveys:46

40 Private estimates from SIMPLICISSIMUS BOOK FARM 2012 business plan.
41 Private estimates from SIMPLICISSIMUS BOOK FARM 2012 business plan.
42 Private estimates from SIMPLICISSIMUS BOOK FARM 2012 business plan.
43 Private estimates from SIMPLICISSIMUS BOOK FARM 2012 business plan.
44 Private estimates from SIMPLICISSIMUS BOOK FARM 2012 business plan.
45 La produzione e la lettura di libri in Italia: anni 2011 e 2012, Istat, 05/16/2013, bit.ly/13RcbPE.
46 Nota metodologica, p. 1 in La produzione e la lettura di libri in Italia: anni 2011 e 2012, Istat, 05/16/2013, bit.ly/13RcbPE.
• a supply-side survey about book production, administered to Italian publishers (almost 2,700 in total) in 2012;47
• a demand-side survey about everyday habits and lifestyles, administered to a sample of Italian families (19,300 in total, distributed in 853 cities and towns) in 2011 and 2012.48

The Italian publishing industry is very concentrated: in 2011, 11.3% of active publishers49 published 75.8% of the entire book catalog and printed 88.7% of the total number of copies.50

In 2011, over 15% of the 9,000 print titles published in Italy were also made available in e-book format.51 Most of these digital titles (67.2%) are adult non-fiction books (natural sciences, linguistics, law and administration, geography and travel, information technology) and classic literary texts.52 It must be noted that, usually, fiction books attract a larger public and are more price-elastic and promotion-elastic than non-fiction books.

In the same year, only one e-book out of four presented extra content or additional features (hypertext links, multimedia, etc.) with respect to its print edition,53 while 79.1% of the Italian digital publications were protected by DRM technologies.54,55

In 2012, almost 5.5 million people aged 16-74 used a mobile device (cellphones, smartphones, PDAs, MP3 players, e-book readers, handheld game consoles, etc.) to connect to the Internet away from home or the workplace.56 Among them, 13.2% (over 700,000 people) read books online or downloaded e-books, in line with the European average (13%).57,58

47 Publications shorter than five pages and propagandist, advertising, and informative materials are excluded from the inquiry.
48 “Readers” are defined as persons aged 6+ who have read at least one book in their leisure time during the 12 months prior to the interview.
49 The survey also includes companies that publish and print books as an accessory activity: in 2011, 25.2% of the responding publishers did not publish any book. For more information, see Nota metodologica, p. 1 in La produzione e la lettura di libri in Italia: anni 2011 e 2012, Istat, 05/16/2013, bit.ly/13RcbPE and La produzione e la lettura di libri in Italia: anni 2011 e 2012, Istat, 05/16/2013, p. 11, note 3, bit.ly/13RcbPE.
50 La produzione e la lettura di libri in Italia: anni 2011 e 2012, Istat, 05/16/2013, p. 10, tab. 6, bit.ly/13RcbPE.
51 Ibid., p. 17.
52 Ibid.
53 Ibid.
54 See section II.1.
55 La produzione e la lettura di libri in Italia: anni 2011 e 2012, Istat, 05/16/2013, p. 17, bit.ly/13RcbPE.
56 Ibid., p. 15.
57 La produzione e la lettura di libri in Italia: anni 2011 e 2012, Istat, 05/16/2013, p. 15, bit.ly/13RcbPE.
58 The ISTAT estimate is lower than the SIMPLICISSIMUS BOOK FARM private estimate reported above in this section.
III. REVIEW OF THE LITERATURE

Due to the youth of the digital publishing industry, especially in Italy, our research subject is relatively unexplored. Nevertheless, we have drawn valuable contributions from the voluminous literature on related subjects, such as the commercial impact of digital products, the characteristics of Internet distribution, and the peculiarities of the publishing industry.

III.1. DIGITAL “CANNIBALIZATION” OF PHYSICAL SALES

Some publishers and authors are skeptical about the profitability of e-books and, thus, the sustainability of the industry: they fear that the digital versions of their books could “cannibalize” higher-priced print sales, without enough market growth to offset declining prices.59 The implicit assumption behind this line of reasoning is that paper books and e-books are products homogeneous enough to be considered very close substitutes, which implies highly positive cross-price elasticities.

However, other industry observers point to the fact that e-book consumers represent a relatively distinct market segment for a relatively differentiated product.60 Once consumers invest in a device that allows them to carry an entire library and to search, scale, and highlight text, they shop directly in the on-device e-book stores.

59 M. RICH, Steal this book (for $9.99), “The New York Times”, 05/16/2009, nyti.ms/11MVb0z. Interestingly, the article reports that publishers expressed similar concerns at the time of the introduction of the paperback format, which, eventually, expanded the demand for books, even though it cannibalized hardcover sales.
60 E. SCHNITTMAN, Ebooks don't cannibalize print, people do, “Black Plastic Glasses”, 09/27/2010, bit.ly/11Nk78l.
In contrast, print format selection (hardcover vs. paperback, usually) follows book title selection, which explains the publication delay of lower-priced paperback editions with respect to higher-priced hardcover editions.61

Hu and Smith (2011) use data from a natural experiment to test the significance of cross-channel effects between e-books and print books.62 In April and May 2010, a publisher stopped distributing Kindle titles to AMAZON, but returned to release Kindle e-books and print (hardcover) books simultaneously in June 2010. The titles published in April and May 2010 are similar to those published in March and June 2010 along some observable dimensions.63 The latter group serves as “control” (no publication delay of the e-book version with respect to the print edition) for the former “experimental” group (publication delay of the e-book version with respect to the print edition varying between one and eight weeks).64

The most robust finding of the research is a significant decrease in overall digital sales caused by delaying the publication of the e-book edition relative to the print edition. However, there is also some evidence of cross-channel substitution for “popular” books (defined in the paper as the top 20% of books ranked by sales).65,66 In the case of popular books, content selection might precede channel selection, whereas for “niche” books (defined in the paper as the bottom 80% of books ranked by sales)67 consumers might be more likely to search for alternatives in their preferred channel.68

Unfortunately, in order to look for further evidence on digital cannibalization of physical sales, we have to turn our attention to other publishing products that have already undergone a substantial process of digitization.

In an early study of the effects of the addition of Internet channels by newspaper companies, Deleersnyder et al. (2002) collect data for 85 online newspapers launched in the UK and the Netherlands between 1991 and 2001.69 In the newspaper industry context, cannibalization is represented by a reduction in circulation and/or advertising revenues.70 Based on individual and collective evidence, the researchers dismiss “the often-cited cannibalization fears”71 as “largely overstated”72. However, they also report a significantly higher probability of circulation revenue cannibalization in cases of high overlap between the online and offline versions of a newspaper, measured by surveying the respective webmasters.73

Another noteworthy secondary finding concerns the possible non-neutrality of the digitization process across product categories: some support emerges for the hypothesis that economic newspapers might benefit more than average from an online channel addition.74,75

61 S. LOWE, Do ebooks cannibalize print sales?, “Publishing Bits”, 07/28/2009, bit.ly/YymFj6.
62 Y. J. HU, M. D. SMITH, The impact of ebook distribution on print sales: analysis of a natural experiment, 08/29/2011, p. 9, available at SSRN: bit.ly/Y2G650.
63 Ibid., p. 12.
64 Ibid., p. 10.
65 Ibid., p. 19.
66 Ibid., p. 25.
67 Ibid., p. 19.
68 Ibid., p. 8.
69 B. DELEERSNYDER, I. GEYSKENS, K. GIELENS, M. G. DEKIMPE, How cannibalistic is the Internet channel? A study of the newspaper industry in the United Kingdom and the Netherlands, “International Journal of Research in Marketing”, 19, 4, 2002, p. 337, bit.ly/1bP5umm.
70 Ibid., p. 342.
71 Ibid., p. 337.
72 Ibid., p. 346.
73 Ibid.
74 Ibid., p. 343 and p. 344, note 10.
75 Recently, similar evidence emerged for the Italian market: in the first seven months of 2013, Il Sole 24 Ore, the most widespread national daily business newspaper, outperformed the other national newspapers in terms of digital subscriptions. For more information, see G. FUSINA, Gli abbonamenti ai quotidiani digitali, dataninja.it, 09/18/2013, bit.ly/1fgfbwG.
In 2002, associations of publishers and authors complained to AMAZON CEO Jeff Bezos about the negative impact of the promotion of used books by the famous e-tailer on the sales of new titles.76 Ghose et al. (2006) use data collected between 2002 and 2004 from the Amazon.com new- and used-book marketplace to empirically test this theoretically possible proposition.77

Information and communication technology transformed the very inefficient brick-and-mortar used-book market into a relatively efficient online market, which shares with the e-book market potentially lower price tags than the new-book print market.

    […] while brick-and-mortar bookstores have high search costs, limited inventory capacity, limited geographical coverage, and relatively high prices, IT-enabled markets for used books offer low search costs, nearly unlimited (virtual) inventory capacity, global coverage, and, through competition among sellers, relatively low prices. […] Internet sales of used books made up an estimated 67% of all used-book sales in 2004 (Wyatt 2005). This represents the highest Internet penetration for any physical product category that we are aware of […]78

The cross-price elasticity of new-book sales with respect to used-book prices has the expected positive sign, but is rather low.79 According to the theoretical model,80

    […] only 16% of AMAZON used-book sales directly cannibalize new-book purchases; the remaining 84% of sales represent purchases that otherwise would not have occurred at new-book prices.81

76 D. K. KIRKPATRICK, Online sales of used books draw protest, “The New York Times”, 04/10/2002, nyti.ms/Y7mbUm.
77 A. GHOSE, M. D. SMITH, R. TELANG, Internet exchanges for used books: an empirical analysis of product cannibalization and welfare impact, “Information Systems Research”, 17, 1, 2006, pp. 4-5, bit.ly/1983mWp.
78 Ibid., p. 4.
79 Ibid., pp. 13-14.
80 Ibid., pp. 6-9.
81 Ibid., p. 17.
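To make the notion of a positive but low cross-price elasticity concrete, the following Python sketch computes a textbook arc (midpoint) cross-price elasticity. It is not the estimation method of Ghose et al., who rely on their own theoretical and econometric model; the function name and all quantities and prices below are hypothetical and chosen only for illustration.

```python
def arc_cross_price_elasticity(q_a0: float, q_a1: float,
                               p_b0: float, p_b1: float) -> float:
    """Midpoint (arc) elasticity of the quantity of good A
    with respect to the price of good B."""
    pct_dq = (q_a1 - q_a0) / ((q_a1 + q_a0) / 2)
    pct_dp = (p_b1 - p_b0) / ((p_b1 + p_b0) / 2)
    if pct_dp == 0:
        raise ValueError("the price of good B did not change")
    return pct_dq / pct_dp


# Hypothetical example: the used-book price rises from 8 to 10,
# and new-book unit sales rise from 1000 to 1020.
elasticity = arc_cross_price_elasticity(1000, 1020, 8.0, 10.0)
print(f"cross-price elasticity: {elasticity:.3f}")
```

A value between 0 and 1, as in this made-up example, means that the two goods are substitutes (the sign is positive) but that new-book demand responds only weakly to used-book prices.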
Used books may be poor substitutes for new books, mainly because of potential quality degradation and reseller unreliability. The indirect method proposed by the researchers to quantify the substitution effect between used and new books treats the two products as homogeneous.82 If, instead, they were relatively differentiated products, such an estimate, based as it is on cross-price elasticity of demand, would be misleading.

This consideration may not be deemed relevant for the used-book market, but might be fundamental for e-books and, more generally, digital products. The often-cited fears expressed by publishing industry participants might rest not so much on sales cannibalization from cheaper versions of the same products, as on market “annihilation” from entirely new products.

In the UK, OFCOM (OFFICE OF COMMUNICATIONS) and the IPO (INTELLECTUAL PROPERTY OFFICE) commissioned KANTAR MEDIA to conduct an extensive and rigorous83 survey to measure online copyright infringement levels during the third quarter of 2012, consumer spending on recorded and digital media, and willingness to pay for six different content types:

• music,
• films,
• TV programs,
• computer software,
• books,
• and video games.84

Among these product categories, books lag furthest behind in terms of digitization: the total estimate of digital and physical books consumed is 176 million, of which 39% are e-books consumed via downloading or accessing online (59% for free, of which 21% illegally).85 These numbers are low compared to those of the other content types.

• The total estimate of digital and physical music tracks consumed is 1,403 million, of which 81% are digital tracks consumed via downloading or streaming (72% for free, of which 37% illegally).86
• The total estimate of digital and physical films consumed is 148 million, of which 56% are digital films consumed via downloading or streaming (61% for free, of which 57% illegally).87
• The total estimate of digital and physical TV programs consumed is 272 million, of which 92% are digital TV programs consumed via downloading or streaming (80% for free, of which 21% illegally).88
• The total estimate of digital and physical software products consumed is 69 million, of which 80% are computer software products consumed via downloading or accessing online (85% for free, of which 55% illegally).89
• The total estimate of digital and physical video games consumed is 68 million, of which 55% are digital video games consumed via downloading or accessing online (63% for free, of which 29% illegally).90

82 Ibid., p. 8, eq. 8.
83 For the report, data reconciliations, questionnaire, and data tables, see Online copyright infringement tracker benchmark study Q3 2012, Ofcom, 11/20/2012, bit.ly/Vw6IfS.
84 Report by Kantar Media, pp. 5-6 in Online copyright infringement tracker benchmark study Q3 2012, Ofcom, 11/20/2012, bit.ly/Vw6IfS.
85 Ibid., pp. 66-67.
86 Ibid., pp. 27-28.
87 Ibid., p. 38.
88 Ibid., pp. 48-49.
89 Ibid., p. 58.
90 Ibid., p. 77.
III.2. “SUPERSTARS” VS. “UNDERDOGS”

It has long been common knowledge within the publishing industry that “a small percentage of titles accounts for a large share of sales of copyrighted materials.”91 According to Lindy Hess, director of the Columbia Publishing Course,

    The truth about this business is that, with rare exceptions, nobody makes a great deal of money.92

In its 1643 petition to Parliament, the British publishing guild, the STATIONERS' COMPANY, argued that “scarce one book in three sells well, or proves gainfull to the publisher.”93 Similar evidence was presented by publishers and authors before the 1876-8 ROYAL COMMISSION ON COPYRIGHT.

    Four books out of five which are published do not pay their expenses […] The most experienced person can do no more than guess whether a book by an unknown author will succeed or fail.94

    […] only one book in four is a very moderate calculation of the books which are successful, or the books which pay their expenses.95

    […] not one book in nine has paid its expenses […] still they [two publishers] have been able to carry on the trade.96

In 1986 and 1987, according to Liebowitz's estimates,97

    best-sellers [defined in the paper as the top 124 best-sellers] were likely to have generated nearly $1 billion in sales out of a total of $1.7 billion.98

These estimates do not even include “sales of best-sellers from previous years that were still selling in relatively large numbers”99.

Liebowitz also addresses the question of market longevity, by constructing a small sample of 236 titles from a 1920s edition of the Book Review Digest, which reviewed approximately 25% of the new titles.100 These were the “titles attracting the most attention, written by the more important authors and published by the better-known houses”101. After 58 years, 54% of the 1920s best-selling titles were still in print, vs. only 33% of the 1920s non-best-selling titles.102

Brynjolfsson et al. (2006) report that a typical brick-and-mortar store in the early 2000s stocked only 40,000-100,000 unique titles, out of more than three million books in print.103 In the same period, Amazon.com and other Internet retailers were selling almost the entire catalog of books in print.104 The researchers estimate that 30-40% of Amazon.com sales were in books not normally available in brick-and-mortar stores.105

Digital markets may have not only increased product variety, but also deepened consumer preferences, a phenomenon that web marketing experts have dubbed the “long tail”.106

91 S. J. LIEBOWITZ, S. MARGOLIS, Seventeen famous economists weigh in on copyright: the role of theory, empirics, and network effects, “Harvard Journal of Law & Technology”, 18, 2, 2005, p. 454, bit.ly/GHOhQB.
92 M. RICH, Math of publishing meets the e-book, “The New York Times”, 02/28/2010, nyti.ms/YUlmO4.
93 M. RICH, Math of publishing meets the e-book, “The New York Times”, 02/28/2010, nyti.ms/YUlmO4.
94 Ibid., p. 183.
95 Ibid., p. 185.
96 Ibid.
97 S. J. LIEBOWITZ, S. MARGOLIS, Seventeen famous economists weigh in on copyright: the role of theory, empirics, and network effects, “Harvard Journal of Law & Technology”, 18, 2, 2005, pp. 454-455, bit.ly/GHOhQB.
98 Ibid., p. 455.
99 Ibid.
100 Ibid.
101 Ibid.
102 Ibid., tab. 1.
103 E. BRYNJOLFSSON, Y. J. HU, M. D. SMITH, From niches to riches: the anatomy of the long tail, “Heinz Research Papers”, 51, 06/01/2006, p. 3, bit.ly/GHk7y3.
104 Ibid.
105 Ibid.
106 Ibid., p. 4.
Increased product availability implies a shift of the sales distribution towards the tail, i.e. more obscure titles, while the dispersion of consumer preferences modifies the shape of the distribution: a long tail emerges, possibly at the expense of the head, i.e. top-selling hits.107

Fig. 3: Graphical illustration of the “long tail” hypothesis
Source: A. ELBERSE, F. OBERHOLZER-GEE, Superstars and underdogs: an examination of the long tail phenomenon in video sales, “Harvard Business School Working Paper Series”, 07-015, 09/05/2006, p. 40, fig. 1, bit.ly/19uLOzX.

The distinction between first-order (original) and second-order (derivative) drivers provides a simple framework to describe the supply-side (producers/retailers) and demand-side (consumers) causes of the long tail phenomenon, and to understand its dynamics.108

107 A. ELBERSE, F. OBERHOLZER-GEE, Superstars and underdogs: an examination of the long tail phenomenon in video sales, “Harvard Business School Working Paper Series”, 07-015, 09/05/2006, p. 5, bit.ly/19uLOzX.
108 E. BRYNJOLFSSON, Y. J. HU, M. D. SMITH, From niches to riches: the anatomy of the long tail, “Heinz Research Papers”, 51, 06/01/2006, p. 12, exhibit 4, bit.ly/GHk7y3.
From the supply-side point of view, information and communication technology loosens physical constraints (virtual shelf-space, aggregation of consumers from different geographical locations, etc.) and reduces production costs (e.g. make-to-order production, such as print-on-demand), distribution costs (e.g. electronic delivery of digital products), and marketing costs (websites, social networks, etc.).109

From the demand-side point of view, information and communication technology reduces search costs for consumers thanks to active search tools (search engines, sampling tools, etc.), passive search tools (recommender systems, product classifications, etc.), and user-generated content (customer reviews, online communities, etc.).110

Thus, “niche” products may become viable options for producers, retailers, and consumers. Moreover, these first-order (original) drivers of the long tail phenomenon could set up a potentially cumulative and self-perpetuating process triggered by second-order (derivative) drivers:
• the increased profitability of niche products for producers and retailers (supply-side incentive),
• and the further deepening of consumer tastes towards niche products (demand-side positive feedback).111

Whether or not the long tail hypothesis translates into a relevant business phenomenon is purely an empirical question, which, since the early 2000s, has proved to be an interesting line of research for both the academic and the business literature.112

109 Ibid., pp. 4-5.
110 Ibid., pp. 5-6.
111 Ibid., pp. 7-8.
112 In the following, we discuss only the statistical and econometric literature. For a broader overview, see Long tail, Wikipedia, accessed on 09/07/2013, bit.ly/ZX5aLb.
Initially, the reason that prompted researchers to study the concentration of book sales was eminently practical. Internet retailers are jealous of their own sales data, but usually report sales rankings, hence the need for researchers to map observable sales ranks to the corresponding sales quantities.

Given rank data, Chevalier and Goolsbee (2003) hypothesize that book sales follow a Pareto distribution, a distributional assumption already exploited by authors and publishers.113 So, a log-linear model can be used for demand estimation:114

$$\log(\mathit{Sales}) = \beta_1 + \beta_2 \cdot \log(\mathit{Rank}) + \epsilon \tag{1}$$

$\beta_2$ is the shape parameter relating sales quantities to sales ranks, while $\beta_1$ is a scale parameter. They estimate $\beta_2$ at -0.855 by means of the sales quantities and the sales ranks obtained before and shortly after a simple experiment.115

    […] they first obtained information from a publisher on a book with relatively constant weekly sales, then purchased six copies of the book in a 10-minute period, and tracked the Amazon.com rank […]116

Brynjolfsson et al. (2003) gather weekly sales data for 321 titles from one publisher during the summer of 2001 and the corresponding weekly sales rank data from Amazon.com.117

113 J. CHEVALIER, A. GOOLSBEE, Measuring prices and price competition online: Amazon.com and BarnesandNoble.com, “Quantitative Marketing and Economics”, 1, 2, 2003, pp. 208-209, bit.ly/1b24xGp.
114 Ibid.
115 E. BRYNJOLFSSON, Y. J. HU, M. D. SMITH, Consumer surplus in the digital economy: estimating the value of increased product variety at online booksellers, “Management Science”, 49, 11, 2003, pp. 1587-1588, bit.ly/18I1YJp.
116 Ibid.
117 Ibid., p. 1587.
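As a rough illustration of how the log-linear model in (1) can be calibrated once matched rank and sales observations are available, the following sketch fits it by ordinary least squares. The data are synthetic, and the intercept, slope, and noise level used to generate them are hypothetical values chosen only for the example; none of them comes from the cited papers. Observations with zero sales would have to be dropped before taking logarithms.

```python
import math
import random


def fit_log_linear(ranks, sales):
    """OLS fit of log(sales) = b1 + b2 * log(rank); returns (b1, b2, r_squared)."""
    x = [math.log(r) for r in ranks]
    y = [math.log(s) for s in sales]  # requires strictly positive sales
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b2 = sxy / sxx
    b1 = mean_y - b2 * mean_x
    ss_res = sum((yi - (b1 + b2 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return b1, b2, 1 - ss_res / ss_tot


# Synthetic sample: all three constants below are hypothetical.
TRUE_B1, TRUE_B2, NOISE_SD = 10.0, -0.87, 0.3
rng = random.Random(0)
ranks = [rng.randint(1, 1_000_000) for _ in range(500)]
sales = [
    math.exp(TRUE_B1 + TRUE_B2 * math.log(r) + rng.gauss(0, NOISE_SD))
    for r in ranks
]

b1, b2, r_squared = fit_log_linear(ranks, sales)
print(f"b1={b1:.3f}  b2={b2:.3f}  R^2={r_squared:.3f}")
```

With real data, `ranks` and `sales` would be replaced by the observed pairs; the closed-form estimator is used here only to keep the sketch free of external dependencies.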
Even though extremely simple, the model in (1) fits the data fairly well (R-squared: 0.8008) and is reasonably consistent with industry statistics.118 Since their estimate of $\beta_2$, -0.871, is based on 861 data points, other researchers have preferred to rely on it rather than run experiments with few data points.119

The most conservative estimate by Brynjolfsson et al. (2003) of the proportion of total Amazon.com sales generated by “niche” titles is 29.3%, computed as the proportion of total Amazon.com sales lying above rank 250,000, approximately the number of titles available at the largest BARNES & NOBLE superstore in New York City at the time (out of 2,300,000 books in print).120

Ghose and Gu (2006) analyze daily panel data for 3,210 books, gathered from Amazon.com and Barnes&Noble.com between September 2005 and April 2006.121 They show that, even in online markets, search costs for “obscure” books (defined in the paper as books with sales rank higher than 20,000 or 40,000, alternatively)122 are higher than for “popular” books (defined in the paper as books with sales rank lower than 20,000 or 40,000, alternatively)123, which may limit the scope of the long tail phenomenon.124

118 Ibid.
119 For example, see A. GHOSE, M. D. SMITH, R. TELANG, Internet exchanges for used books: an empirical analysis of product cannibalization and welfare impact, “Information Systems Research”, 17, 1, 2006, p. 11, bit.ly/1983mWp and A. GHOSE, B. GU, Search costs, demand structure and long tail in electronic markets: theory and evidence, “NET Institute Working Papers”, 06-19, 2006, p. 11, bit.ly/1aeZtxj.
120 E. BRYNJOLFSSON, Y. J. HU, M. D. SMITH, Consumer surplus in the digital economy: estimating the value of increased product variety at online booksellers, “Management Science”, 49, 11, 2003, pp. 1588-1589, bit.ly/18I1YJp. See above and below in this section for less conservative estimates from the same authors.
121 A. GHOSE, B. GU, Search costs, demand structure and long tail in electronic markets: theory and evidence, “NET Institute Working Papers”, 06-19, 2006, pp. 9-10, bit.ly/1aeZtxj.
122 Ibid., p. 20.
123 Ibid.
124 Ibid., p. 7.
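To see how a “niche” share of this kind can follow from the slope of (1) alone, the sketch below sums the sales implied by a power law over all ranks and reports the fraction lying above a cutoff rank. It is a simplified reading, not the authors' exact computation: it assumes that every one of the 2,300,000 titles in print has a distinct rank and that sales are proportional to rank raised to $\beta_2$. The scale parameter cancels in the ratio, so it is not needed.

```python
def tail_share(beta2, cutoff_rank, n_titles):
    """Fraction of total sales from ranks above cutoff_rank when sales ~ rank**beta2."""
    head = 0.0  # sales of titles ranked 1..cutoff_rank
    tail = 0.0  # sales of titles ranked above cutoff_rank
    for rank in range(1, n_titles + 1):
        weight = rank ** beta2
        if rank > cutoff_rank:
            tail += weight
        else:
            head += weight
    return tail / (head + tail)


# Slope, cutoff, and catalog size as quoted in the text above.
share = tail_share(beta2=-0.871, cutoff_rank=250_000, n_titles=2_300_000)
print(f"share of sales above rank 250,000: {share:.1%}")
```

Changing `cutoff_rank` or `beta2` shows how sensitive the estimated niche share is to both choices, which is the issue raised by the later literature.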
The stability across ranks and retailers, and over time, of $\beta_2$, which measures “customers' relative tastes for popular and obscure books”125, is a very strong assumption that has been criticized in subsequent works; a number of alternative techniques have been proposed in the literature.

Brynjolfsson et al. (2010) suggest that the relationship between sales and sales rank may not be purely log-linear.126 In order to improve the estimate of Amazon.com long tail sales, they use different slope coefficients127 to fit the sales-rank relationship on a sample of 1,598 Amazon.com titles, monitored over a ten-week period from June to August 2008.128

The estimation results of a negative binomial regression model with four splines (knot points at the 25th, 50th, and 75th percentiles of sales rank)129 show that “the coefficients on all four splines are negative and highly significant.”130

    In addition, the slope coefficients gradually become more negative as the book sales rank increases. […] book sales decrease at an increasingly faster pace, as we move from popular books to niche books […]131

The claimed advantage over the OLS linear regression in (1) is that the model also takes into account the frequent observations with zero sales.132

125 A. GHOSE, M. D. SMITH, R. TELANG, Internet exchanges for used books: an empirical analysis of product cannibalization and welfare impact, “Information Systems Research”, 17, 1, 2006, p. 11, bit.ly/1983mWp.
126 E. BRYNJOLFSSON, Y. J. HU, M. D. SMITH, The longer tail: the changing shape of Amazon sales distribution curve, 09/20/2010, p. 2, available at SSRN: bit.ly/15QfXyK.
127 Ibid.
128 Ibid., p. 3.
129 Ibid., p. 7.
130 Ibid.
131 Ibid.
132 Ibid., p. 6.
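The idea of letting the slope differ across segments of the rank distribution can be illustrated by building a linear spline basis with knots at the quartiles. The sketch below only constructs the four regressors; it does not fit the negative binomial model, and applying the splines to the logarithm of the rank is an assumption made here for illustration. The function names and the sample ranks are hypothetical.

```python
import math


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of an already sorted list."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def linear_spline_terms(x, knots):
    """Split x into len(knots) + 1 terms, one per segment; the terms sum to x.

    A regression on these terms estimates a separate slope for each segment.
    """
    terms = [min(x, knots[0])]
    for lo, hi in zip(knots[:-1], knots[1:]):
        terms.append(max(min(x, hi), lo) - lo)
    terms.append(max(x, knots[-1]) - knots[-1])
    return terms


# Hypothetical sales ranks, used only to show the construction.
ranks = [10, 100, 1_000, 10_000, 100_000, 1_000_000]
log_ranks = sorted(math.log(r) for r in ranks)
knots = [percentile(log_ranks, q) for q in (25, 50, 75)]
design = [linear_spline_terms(x, knots) for x in log_ranks]

for x, row in zip(log_ranks, design):
    print(f"{x:7.3f} -> " + ", ".join(f"{t:6.3f}" for t in row))
```

Each row of `design` would then enter a count-data regression in place of the single log-rank regressor of (1).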
According to the “old” methodology, the proportion of total Amazon.com sales generated in 2008 by “niche” titles (defined in the paper as books with sales rank higher than 100,000)133 would have been 82.57%, a clear overestimation with respect to 36.7%, the estimate obtained with the “new” methodology.134

The treatment of products with zero sales is critical also for the measurement of the significance of the long tail phenomenon through concentration statistics. Meaningful concentration comparisons, across channels, retailers, and over time, require similar product availability.

    […] the effect of product availability on the concentration of product sales may be nonmonotonic. A moderate increase in production selection may lead to a less concentrated distribution of product sales, but if the market is flooded by a large number of products that have minimal sales, product sales can actually appear to be more concentrated even if the sales don't change for any of the previously existing products.135

133 Ibid., p. 8.
134 Ibid.
135 E. BRYNJOLFSSON, Y. J. HU, M. D. SMITH, Goodbye Pareto principle, hello long tail: the effect of search costs on the concentration of product sales, “Management Science”, 57, 8, 2011, p. 1374, bit.ly/196Bu39.
Fig. 4: Concentration can be a misleading measure of the “long tail”
Case 1A: 100 products are available and the top 50% of products account for 75% of total sales.
Case 2: Add a “tail” of 100 niche products with small sales, while leaving the sales of existing products unchanged. Now 200 products are available, and the top 50% of products account for 95% of total sales.
Case 1B: Sales of the top 100 products are exactly the same as in Case 1A. The only change from Case 1A is we now consider 100 niche products that have zero sales. In this case, 200 products are available, and the top 50% of products account for 100% of total sales.
Source: E. BRYNJOLFSSON, Y. J. HU, M. D. SMITH, Goodbye Pareto principle, hello long tail: the effect of search costs on the concentration of product sales, “Management Science”, 57, 8, 2011, p. 1375, fig. 1, bit.ly/196Bu39.

Brynjolfsson et al. (2011) use the Lorenz curve and the Gini coefficient to study the concentration of product sales in the catalog channel and the Internet channel of a clothing retailer.136

136 Ibid., pp. 1378-1379.
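The three cases of Fig. 4 can be reproduced with a toy catalog. The unit sales below are hypothetical integers chosen only so that the top-half shares match the percentages in the caption; the Gini coefficient is computed with the standard formula on sorted data and is included to show that it moves in the same misleading direction.

```python
from fractions import Fraction


def top_half_share(sales):
    """Share of total sales accounted for by the best-selling 50% of products."""
    ordered = sorted(sales, reverse=True)
    top = ordered[: len(ordered) // 2]
    return Fraction(sum(top), sum(ordered))


def gini(sales):
    """Gini coefficient of non-negative sales (0 = perfectly even distribution)."""
    ordered = sorted(sales)
    n = len(ordered)
    weighted = sum((i + 1) * s for i, s in enumerate(ordered))
    return Fraction(2 * weighted, n * sum(ordered)) - Fraction(n + 1, n)


# Hypothetical unit sales: 50 strong sellers and 50 weaker ones.
case_1a = [57] * 50 + [19] * 50
# Case 2: add 100 niche products with small but positive sales.
case_2 = case_1a + [2] * 100
# Case 1B: add 100 niche products that never sell.
case_1b = case_1a + [0] * 100

for name, catalog in (("1A", case_1a), ("2", case_2), ("1B", case_1b)):
    print(
        f"Case {name}: {len(catalog)} products, "
        f"top 50% share = {float(top_half_share(catalog)):.0%}, "
        f"Gini = {float(gini(catalog)):.3f}"
    )
```

Sales of the original 100 products are identical in all three catalogs; only the number of products counted in the denominator changes, which is why comparisons require similar product availability.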
The authors analyze an identical selection of products, at an identical set of prices, with the same order-fulfillment facilities, available and “visible” within an identical time window.137 As expected, the Internet channel exhibits a less concentrated distribution of product sales than the catalog channel.138

Elberse and Oberholzer-Gee (2006) use three different techniques and Nielsen VideoScan data to “study the distribution of revenues across products in the context of the US home video industry for the 2000 to 2005 period”139.

First, they generate various descriptive statistics for the distribution of sales across titles from year to year and compute the Kolmogorov-Smirnov statistic for pairs of years to test for shifts in the distribution.140 Location, scale, skewness, kurtosis, and inter-quartile measures are “consistent with a scenario in which the distribution becomes more dispersed, more asymmetrical, and develops a sharper peak and a longer tail over time”141. The Kolmogorov-Smirnov tests reveal that the distributions of weekly sales across titles are significantly different across the years.142

Then, they “estimate a quantile regression model to examine the factors that underlie the shift in the distribution of sales”143.

137 Ibid., p. 1374.
138 Ibid., p. 1379.
139 A. ELBERSE, F. OBERHOLZER-GEE, Superstars and underdogs: an examination of the long tail phenomenon in video sales, “Harvard Business School Working Paper Series”, 07-015, 09/05/2006, p. 2, bit.ly/19uLOzX.
140 Ibid., pp. 8-9.
141 Ibid., p. 12.
142 Ibid.
143 Ibid., p. 8.
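The year-to-year comparison of sales distributions mentioned above rests on the two-sample Kolmogorov-Smirnov statistic, i.e. the largest vertical gap between two empirical distribution functions. A minimal sketch follows; it computes only the statistic, not its p-value, and the weekly sales figures are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(x) - F_b(x)| over all x."""
    a, b = sorted(sample_a), sorted(sample_b)
    points = sorted(set(a) | set(b))  # the gap can only change at observed values
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in points
    )


# Hypothetical weekly unit sales per title in two different years.
sales_year_1 = [0, 0, 1, 2, 3, 5, 8, 40, 120, 900]
sales_year_2 = [0, 0, 0, 0, 1, 1, 2, 4, 30, 300]
print(f"KS statistic: {ks_statistic(sales_year_1, sales_year_2):.2f}")
```

A value close to 0 indicates nearly identical distributions, while a value of 1 means the two samples do not overlap at all.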
In the case at hand, quantile linear regression is more appropriate than OLS linear regression: the research is not concerned with average effects, but with “how the entire distribution changes with certain covariates”144. Examining multiple quantiles allows for a richer inference than “segmenting the response variable into subsets according to its unconditional distribution and then doing least squares fitting on these subsets”145, a procedure that clearly suffers from sample selection bias.

Quantile linear regression results show that the distribution of sales has shifted down in general, but this shift is largest for the better-selling titles.146

    [...] the tail of the distribution has seen a much smaller decrease, implying a shift in the mass towards niche products.147

Finally, they estimate a negative binomial regression model to analyze the number of titles that meet certain weekly sales threshold levels (“zero sales, sales below the 70th quantile, sales between the 70th and the 80th quantile, sales between the 80th and the 90th quantile, and sales above the 90th quantile”148).149

The whole picture that emerges from this comprehensive study is quite complex: “superstar” and “long tail” effects are not necessarily antithetical, and, indeed, seem to coexist.

    Are there important superstar and long-tail effects in U.S. home video sales? The answers turn out to be of the “yes, but…” variety. Yes, there is a long-tail effect in that the number of titles that sell only a few copies every week increases during our study period. But at the same time, the number of non-selling titles also increases substantially; it is now four times as high as in 2000. Many underdogs turn out to be losers. We also find evidence of a superstar effect. Among the best-performing titles, it is an ever-smaller number of films that accounts for the bulk of sales. The caveat here is that today's superstars lack the punch of earlier years. Video sales generally decrease over time across all quantiles of the sales distribution, but this effect is most pronounced among best-selling titles.150

In order to estimate the potential producer and consumer welfare arising from the availability in e-book format of the world's out-of-print titles, Smith et al. (2012) match two random samples of titles: one composed of titles already available on the Kindle marketplace, the other of titles not yet available on the Kindle marketplace.151

The project required the mapping of sales ranks of Kindle out-of-print titles to the corresponding sales levels. Initially, they try to fit (1) to a dataset provided by a major publisher, covering weekly sales and weekly sales ranks for 713 e-book titles for a ten-week period. Since the research object is the “extreme tail”152 of out-of-print books, they also try various polynomial rank terms to produce stronger fits in the tail of the distribution, and obtain better results with a third-degree polynomial function:153

$$\log(\mathit{Sales}) = \beta_1 + \beta_2 \cdot \log(\mathit{Rank}) + \beta_3 \cdot \log(\mathit{Rank})^2 + \beta_4 \cdot \log(\mathit{Rank})^3 + \epsilon \tag{2}$$

144 Ibid., p. 9.
145 K. F. HALLOCK, R. KOENKER, Quantile regression, “Journal of Economic Perspectives”, 15, 4, 2001, p. 147, bit.ly/15gERVI.
146 A. ELBERSE, F. OBERHOLZER-GEE, Superstars and underdogs: an examination of the long tail phenomenon in video sales, “Harvard Business School Working Paper Series”, 07-015, 09/05/2006, p. 13, bit.ly/19uLOzX.
147 Ibid.
148 Ibid., p. 17.
149 Ibid.
150 Ibid., p. 18.
151 M. D. SMITH, R. TELANG, Y. ZHANG, Analysis of the potential market for out-of-print ebooks, 08/04/2012, pp. 2-3, available at SSRN: bit.ly/WcNhJq.
152 Ibid., p. 10.
153 Ibid.
Fig. 5: Calibration between sales and rank
Source: M. D. SMITH, R. TELANG, Y. ZHANG, Analysis of the potential market for out-of-print ebooks, 08/04/2012, p. 33, fig. 5, available at SSRN: bit.ly/WcNhJq.

Still unsatisfied with the fit produced by (2) for observations with ranks above 200,000, the researchers complement the estimation method with an experiment similar to that of Chevalier and Goolsbee (2003).154 They purchased between one and three copies of 30 randomly selected Kindle titles with ranks between 200,000 and 1,000,000, and tracked their sales before and after this experiment.155
• For titles with low ranks (below 200,000), they predict sales with (2).156
• For titles with high ranks (above 200,000), they “assign sales according to the expected sales that belong to the interval the title rank falls into based on the experiment described above”157.
• If a title does not have a rank, they assume that it has no sales.158

III.3. SURVEY AND DESCRIPTIVE EVIDENCE

In 2002, the OPEN EBOOK FORUM159 sponsored a consumer survey on e-books, administered during the New York City is Book Country event.160 263 volunteers self-completed the survey, so the results are limited by sample self-selection and incomplete questionnaires.161 Most participants (61%)162 reported that they were willing to buy e-books at the same price as paperback books.163

However, according to survey data from the UK OFCOM Online copyright infringement tracker benchmark study Q3 2012,164 e-book consumers expect lower prices than for print books.165 The likelihood of paying for a single book download decreases steadily with price: among those who have ever downloaded or accessed e-books, 78% are willing to pay at £2, falling to 7% at £10.166 The mean price respondents are willing to pay is £3.49;167 similar results arise for the willingness to pay for a subscription service.168

154 See above in this section.
155 Ibid.
156 Ibid., p. 11.
157 Ibid.
158 Ibid.
159 See section II.1.
160 Consumer survey on ebooks, Open eBook Forum, 2003, p. 4, available at im+m: bit.ly/15Olump.
161 Ibid., p. 8.
162 Ibid., p. 14.
163 Ibid., p. 19.
164 See section III.1.
165 Report by Kantar Media, p. 69 in Online copyright infringement tracker benchmark study Q3 2012, Ofcom, 11/20/2012, bit.ly/Vw6IfS.
166 Ibid.
167 Ibid.
168 Ibid., p. 70.
Fig. 6: Likelihood to pay for downloading a single e-book
Question: Assume you saw a new fiction e-book on an online service that you wanted to own. It would be high quality, and you knew it was a reputable and reliable service. How likely would you be to download it if it was the following prices?
Base: All 12+ in the UK that have ever downloaded/accessed e-books (652). £3.49 is the average price people are willing to pay for a single e-book download.
Source: OCI Benchmark study slide pack 5 - Books, p. 31, slide B23 in Online copyright infringement tracker benchmark study Q3 2012, Ofcom, 11/20/2012, bit.ly/Vw6IfS.

Compared to the users of other content types covered in the report, e-book downloaders are more skewed towards females (54%) and have an older age profile (58% are 35+).169 AMAZON Kindle is the most used service for e-books (80% of users, consistent among demographics and sub-groups)170, which might explain, in part, why the estimate of illegal behavior for books is the lowest across content types (11% of users)171.

Descriptive evidence on the e-book market was presented in 2012 by Mark Coker, founder of SMASHWORDS, a large distributor of self-published e-books.172

169 Ibid., p. 61.
170 Ibid., p. 66.
171 Ibid., p. 65.
172 M. COKER, How data-driven decisions *might* help indie ebook authors reach more readers, “RT Booklovers Convention”, 04/25/2012, slidesha.re/Xhccuo.
Since 2008, the company has published more than 100,000 indie (independent) e-books distributed to multiple major retailers, and remunerates authors with significantly higher royalties (60% of list price) than traditional publishing houses.173

Aggregate sales data for a nine-month period from the SMASHWORDS distribution network174 reveal strong overall growth, driven by titles achieving viral word of mouth.175 In fact, the sales distribution is characterized, unsurprisingly,176 by few titles selling extremely well, thousands of moderate sellers, and a vast majority of poorly selling titles.177

Sales of individual titles at APPLE iBookstore rise and fall over time, either randomly or based on author promotions, new releases, and promotions by retailers.178

As for price elasticity, e-books priced between $2.00 and $2.99 sell 6.2 times more units than those priced more than $10.00,179 which implies an approximate price elasticity of -2. However, e-books priced between $0.99 and $1.99 seem to underperform in terms of profitability with respect to those priced between $2.99 and $5.99.180

III.4. ECONOMETRIC EVIDENCE

We can identify historical and political reasons prompting econometric investigations of the publishing market, and of the relationship between quantity sold and prices in particular.

173 Ibid., pp. 8-9.
174 Ibid., p. 18.
175 Ibid., p. 24.
176 See section III.2.
177 Ibid., p. 61.
178 Ibid., p. 30.
179 Ibid., p. 51.
180 Ibid., p. 58.
Historically, economic arguments have had, and still have, a central role in public debates and political decisions about copyright laws. As far back as 1643, STATIONERS181 argued that “books are luxuries, the demand for which is elastic.”182

    Books (except the sacred Bible) are not of such general use and necessity, as some staple commodities are, which feed and clothe us, nor are they so perishable, or require change in keeping, some of them being once bought, remain to children's children, and many of them are rarities only and useful only to a very few, and of no necessity to any, few men bestow more in Books than what they can spare out of their superfluities […] And therefore property in Books maintained among stationers cannot have the same effect, in order to the public, as it has in other commodities of more public use and necessity.183

Before the 1876-8 ROYAL COMMISSION ON COPYRIGHT184, the English philosopher Herbert Spencer testified against the introduction of the “compulsory licence”185 system, where any person would be free to print any book after paying a fixed percentage of the selling price to the author.186 According to him, such a system would be “especially injurious to the particular class [of books] which of all others needs encouragement”187, described by the chairman of the commission as “the graver class [of books] which do not appeal to the popular tastes”188.

Contrary to STATIONERS' thesis, the demand for certain books, e.g. philosophical works, seems, according to their own authors, inelastic to a fall in price, which suggests significant economic differences across subject categories.189

181 See section III.2.
182 A. PLANT, The economic aspects of copyright in books, “Economica”, 1, 2, 1934, p. 177, bit.ly/1giWKb4.
183 Ibid.
184 See section III.2.
185 A. PLANT, The economic aspects of copyright in books, “Economica”, 1, 2, 1934, p. 183, bit.ly/1giWKb4.
186 Ibid., p. 188.
187 Ibid.
188 Ibid., pp. 188-189.
189 See section II.3.
For the titles composing Liebowitz' Book Review Digest dataset,190 large differences across subject categories emerge also in market longevity.191

Tab. 1: Percentage of titles surviving more than 58 years

Category         All titles   Best-sellers removed
Academic         68%          68%
Philosophical    52%          41%
History          51%          43%
Biography        49%          42%
Religion         46%          40%
Poetry           43%          40%
Fiction          36%          40%
Mystery          23%          16%
Comedy           25%          0%
Autobiography    19%          11%
Art              17%          17%
Travel           6%           6%
Sports           0%           0%

Source: S. J. LIEBOWITZ, S. MARGOLIS, Seventeen famous economists weigh in on copyright: the role of theory, empirics, and network effects, “Harvard Journal of Law & Technology”, 18, 2, 2005, p. 456, tab. 2, bit.ly/GHOhQB.

Bittlingmayer (1992) provides empirical estimates of the elasticity of the demand for books,192 as part of a more general economic analysis of the institutional framework of the publishing industry. Data about prices, sales quantities, retail margins, royalties, and inventories for over 1,000 titles for the period 1984-1986 were provided by a West German publishing house, specialized in the production of academic and intellectual books.193

In markets characterized by monopolistic competition, under the assumption of profit maximization, there is an inverse relationship between percentage margin (the difference between price, $p$, and marginal cost, $c$, as percentage of $p$) and demand elasticity ($\eta_p$, in absolute value):194

$$\frac{p - c}{p} = \frac{1}{\eta_p} \tag{3}$$

The marginal costs associated with the sale of an extra copy of a book (taxes, royalties, and the retailer's margin) amount to only 40 to 60 percent of the retail price, which in turn implies a price elasticity in the range 1.7 to 2.5.195

Resale price maintenance restricts retailing by allowing publishers to directly control both the retail price ($p$) and the wholesale price (the difference between the retail price, $p$, and the retail margin, $v$).196 In the period considered, resale price maintenance was legal in West Germany, and still is today in the Federal Republic of Germany, observed and enforced by the GERMAN BOOK TRADERS ASSOCIATION.197

190 See section III.2.
191 S. J. LIEBOWITZ, S. MARGOLIS, Seventeen famous economists weigh in on copyright: the role of theory, empirics, and network effects, “Harvard Journal of Law & Technology”, 18, 2, 2005, p. 456, bit.ly/GHOhQB.
192 G. BITTLINGMAYER, The elasticity of demand for books, resale price maintenance and the Lerner index, “Journal of Institutional and Theoretical Economics”, 148, 4, 1992, p. 602, tab. 3 and p. 603, tab. 4, bit.ly/17dxX0b.
193 Ibid., p. 589 and p. 597.
194 Ibid., p. 588.
195 Ibid., pp. 588-589.
196 See also section II.1 for information about the agency model.
197 German resale price maintenance act, Börsenverein des Deutschen Buchhandels, 07/14/2006, bit.ly/17bdJF6.
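The arithmetic behind the elasticity range implied by (3) can be checked directly: if marginal cost is a share $c/p$ of the retail price, the percentage margin is one minus that share and the elasticity (in absolute value) is its reciprocal. The helper name below is hypothetical and the sketch does nothing more than restate (3).

```python
def implied_elasticity(marginal_cost_share):
    """Absolute price elasticity implied by (p - c) / p = 1 / eta_p.

    marginal_cost_share is c / p, the marginal cost as a fraction of the price.
    """
    if not 0 <= marginal_cost_share < 1:
        raise ValueError("marginal cost share must lie in [0, 1)")
    percentage_margin = 1 - marginal_cost_share
    return 1 / percentage_margin


# Marginal costs of 40% and 60% of the retail price, as quoted in the text.
low = implied_elasticity(0.40)
high = implied_elasticity(0.60)
print(f"implied elasticity range: {low:.1f} to {high:.1f}")
```

The same function applies whenever a gross margin is known: a margin of $m$ corresponds to a marginal cost share of $1 - m$.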
So, under the assumption of positive yet marginally decreasing sales growth in retail selling effort,198 there are two first-order conditions for profit maximization,199

$$\frac{p - v - c}{p} = \frac{1}{\eta_p} \tag{4}$$

and200

$$\frac{p - v - c}{v} = \frac{1}{\eta_v}, \tag{5}$$

where $\eta_v$ is the elasticity of demand with respect to the retail margin. (4) and (5) imply201

$$-\eta_p + \eta_v = -1 - \frac{c}{p - v - c}. \tag{6}$$

Ceteris paribus, the larger the price elasticity ($\eta_p$, in absolute value), the larger the retail margin elasticity ($\eta_v$).

    For a cross section of books that are being optimally marketed and priced, those for which sales are most responsive to extra promotional efforts will also be those for which sales are most responsive to price changes. Novels have a comparatively large elasticity of demand because the quantity purchased can be influenced relatively easily with higher promotional effort. The demand for research monographs in entomology is price inelastic because the quantity sold is not sensitive to promotional efforts.202

Again, retail margin and, thus, price elasticity vary by type of book, and also by type of bookstore.

198 G. BITTLINGMAYER, The elasticity of demand for books, resale price maintenance and the Lerner index, “Journal of Institutional and Theoretical Economics”, 148, 4, 1992, p. 593, bit.ly/17dxX0b.
199 Ibid., eq. 5.
200 Ibid., p. 594, eq. 6.
201 Ibid., p. 594, eq. 7.
202 Ibid., p. 594.
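The step from the two first-order conditions to their combined form is short but easy to lose. Solving each condition for its elasticity and subtracting gives the result; the following lines are only a spelled-out version of that algebra, using the same symbols as above.

```latex
\begin{align*}
\eta_p &= \frac{p}{p - v - c}, \qquad \eta_v = \frac{v}{p - v - c} \\[4pt]
-\eta_p + \eta_v &= \frac{v - p}{p - v - c}
                  = -\frac{(p - v - c) + c}{p - v - c}
                  = -1 - \frac{c}{p - v - c}
\end{align*}
```

Since $c/(p - v - c)$ is non-negative whenever the publisher's margin is positive, the price elasticity exceeds the retail margin elasticity by at least one.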
    The margin varies by type of book. For example, the margin on academic books is typically 25-30 percent, on literature 40 percent. School books, which are sold through bookstores, have margins of 20 percent. Margins also vary by the type of bookstore. A publisher will often grant bookstores that specialize in a particular topic a deeper discount on the corresponding titles.203

In monopolistically competitive equilibrium, where free entry drives profits to zero, “books for which cost-covering prices allow large sales will have demand curves that are more elastic than books with more limited audiences.”204

203 Ibid., p. 591.
204 Ibid., p. 595.
Fig. 7: Zero-profit locus of price and output combinations
K is the fixed cost of book production, m is the marginal cost of book production and distribution, and q is the quantity produced.
Source: G. BITTLINGMAYER, The elasticity of demand for books, resale price maintenance and the Lerner index, “Journal of Institutional and Theoretical Economics”, 148, 4, 1992, p. 596, fig. 1, bit.ly/17dxX0b.

Analogously, “books that involve large marginal production or marketing expenses must have large sales.”205 It can be shown that206

$$-\eta_p^{*} + \eta_v^{*} = -1, \tag{7}$$

where $\eta_p^{*}$ is the equilibrium price elasticity (in absolute value) and $\eta_v^{*}$ is the equilibrium retail margin elasticity.

205 Ibid.
206 Ibid.
For the econometric estimation of price elasticity and retail margin elasticity, the original dataset is divided into two cross-sectional datasets, composed of the “year-to-year percentage changes in prices and quantities”207 for the periods 1984-1985 and 1985-1986, respectively.

When dealing with data from natural experiments, traditional “naïve”208 log-linear regression models cannot distinguish between shifts of the demand curve and movements along the demand curve.

    Suppose that the demand for a title shifts to the right, say, because this year is the centenary of the author's birth or the book has been made into a film. With unchanged margin costs […], the optimal price would increase. If this anticipation of changed demand conditions is common, […] the estimated [price elasticity] […] could very well be positive.209

The change in the inverse of (4) is an estimate of changes in price elasticity ($\Delta \eta^{e}_{pt}$) and can be employed as a control for shifts of the demand curve. The result $d\eta_{pt} = d\eta_{vt}$ from (7) yields the final econometric specification:210

$$\Delta \ln(q_t) = a + b_1 \cdot \Delta \ln(p_t) + b_2 \cdot \Delta \ln(v_t) + b_3 \cdot \Delta \eta^{e}_{pt} \cdot \left( \ln(p_t) - \ln(v_t) \right) + \sum_{i=1}^{T} b_i D_i + \epsilon, \tag{8}$$

where the dummies $D_i$, $i = 1, \dots, T$, reflect the title vintage. The coefficients $b_1$ and $b_2$ are the estimators of $\eta_p$ and $\eta_v$; theoretically, the coefficient $b_3$ should be equal to -1.

Unfortunately, the model in (8) has weak explanatory power.

    Most of the year-to-year variation in sales of books is attributable to influences not captured by price, margin or vintage.211

207 Ibid., p. 598.
208 Ibid.
209 Ibid.
210 Ibid., p. 599, eq. 14.
211 Ibid., p. 604.
The empirical analysis suggests a price elasticity between -2 and -3 for this dataset of academic and intellectual books,212 a figure slightly larger in absolute value than that implied by (4) (about -1.7 to -2.5)213. The estimate of retail margin elasticity is about 1.5, positive and “roughly consistent with the theory”214.

$b_3$ deviates from the predicted value of -1 and the estimated elasticities may be biased by measurement problems, especially for “poorly selling” titles (defined in the paper as the titles with sales lower than “the median number of books sold per title”215). Measurement problems include errors in the price and retail margin variables (e.g. “divergence of actual average sales price from the nominal price”216), and the failure to allocate “marginal printing costs”217 and “title-specific promotional expenses of the publisher”218.

Brynjolfsson et al. (2003) show that, under resale price maintenance219, the price elasticity of the aggregate demand for a given title in the retailing market equals the price elasticity of demand for the title faced by its publisher.220 Data from the ASSOCIATION OF AMERICAN PUBLISHERS and discussions of the researchers with various publishers indicate a gross margin of 56%-64% for “the typical obscure title”221.222

212 Ibid., p. 605.
213 See above in this section.
214 G. BITTLINGMAYER, The elasticity of demand for books, resale price maintenance and the Lerner index, “Journal of Institutional and Theoretical Economics”, 148, 4, 1992, p. 605, bit.ly/17dxX0b.
215 Ibid., p. 601.
216 Ibid., pp. 603-604.
217 Ibid., p. 605.
218 Ibid.
219 See above in this section.
220 E. BRYNJOLFSSON, Y. J. HU, M. D. SMITH, Consumer surplus in the digital economy: estimating the value of increased product variety at online booksellers, “Management Science”, 49, 11, 2003, p. 1585, eq. 10, bit.ly/18I1YJp.
221 Ibid., p. 1586.
222 Ibid.
So, according to (4), for such titles, the price elasticity of the demand faced by the publisher is between -1.56 and -1.79,223 which represents an estimate of the price elasticity of the aggregate demand in the retailing market, obtained “by taking advantage of the characteristics of the book industry structure and available industry statistics on gross margins”224.

The online book-selling business has been thriving for almost twenty years to date and anticipated the e-book market in many respects, such as the increased product availability.225 Studies about the electronic commerce of books could very well be relevant for the electronic book industry.

At the beginning of the 2000s, online book sales made up about 10% of total book sales in the US;226 the combined market share of Amazon.com and BarnesandNoble.com, the two dominant online bookstores, was higher than 85% in terms of sales, with the former selling almost four times as much as the latter.227

Chevalier and Goolsbee (2003) use sales rank data228 to estimate both the own-price elasticity faced by the two merchants and their cross-price elasticity with respect to each other. Data about a sample of 20,000 books, constructed through stratified random sampling from three different sources (“to get books representative of different parts of the sales distribution”229), were “scraped”230 from the websites of Amazon.com and BarnesandNoble.com during April, June and August 2001. The period was characterized by major pricing experiments by the two e-tailers, across broad categories of titles.231

BarnesandNoble.com data are censored for sales rankings greater than about 630,000, whereas Amazon.com provides complete sales rank data.232 Chevalier and Goolsbee (2003) use “the trimmed least absolute deviations (LAD) panel estimator of Honore (1992)”233 for the censored dataset and OLS for the complete one.234

The own-price elasticity of BarnesandNoble.com is around -3.5, against only about -0.45 for Amazon.com.235 The cross-price effect appears to be substantially positive only for BarnesandNoble.com,236 but a robustness check of the trimmed LAD estimator, performed by dropping the observations with missing sales rank and employing OLS in place of the trimmed LAD,237 shows a much lower degree of shifting from Amazon.com to BarnesandNoble.com.238

Interestingly, Amazon.com, the incumbent, prices in the inelastic portion of the demand curve, in contrast with the theory of static imperfectly competitive markets.239

223 Ibid.
224 Ibid., p. 1585.
225 See section III.2 for a discussion of the long tail phenomenon.
226 J. CHEVALIER, A. GOOLSBEE, Measuring prices and price competition online: Amazon.com and BarnesandNoble.com, “Quantitative Marketing and Economics”, 1, 2, 2003, p. 205, bit.ly/1b24xGp.
227 Ibid.
228 See section III.2.
229 J. CHEVALIER, A. GOOLSBEE, Measuring prices and price competition online: Amazon.com and BarnesandNoble.com, “Quantitative Marketing and Economics”, 1, 2, 2003, p. 206, bit.ly/1b24xGp.
230 See Web scraping, Wikipedia, accessed on 09/07/2013, bit.ly/1e1Oo8h.
231 J. CHEVALIER, A. GOOLSBEE, Measuring prices and price competition online: Amazon.com and BarnesandNoble.com, “Quantitative Marketing and Economics”, 1, 2, 2003, p. 206, bit.ly/1b24xGp.
232 Ibid.
233 Ibid., p. 215.
234 Ibid.
235 Ibid., p. 217.
236 Ibid., p. 218.
237 Ibid., p. 219.
238 Ibid.
239 Ibid., pp. 217-218.
However, “a firm maximizing dynamic profits might choose a price below […] [the] static profit-maximizing level,”240 which is of some interest for the fast-growing e-book market.

“Prices below the single-period profit-maximizing level would be attractive in a growing market with consumer switching costs, for example.”241

Ghose and Gu (2006) use data gathered from Amazon.com and BarnesandNoble.com to study the significance of search costs in online markets,242 which could create kinks in the demand curve. “The demand elasticity for price increases is different from the demand elasticity for price decreases”243 according to the magnitude of search costs.

When search costs are high, if a retailer increases prices, its own customers will notice it and have an incentive to look for better bargains at competing retailers. If the retailer decreases prices, instead, potential new customers will not be aware of it. As a consequence, demand elasticity (in absolute value) for price increases is higher than for price decreases.

Vice versa, when search costs are low, if a retailer increases prices, it will affect only its current customers. If the retailer decreases prices, instead, it will attract potential new customers. As a consequence, demand elasticity (in absolute value) for price increases is lower than for price decreases.

240 Ibid., p. 218.
241 Ibid.
242 See section III.2.
243 A. GHOSE, B. GU, Search costs, demand structure and long tail in electronic markets: theory and evidence, “NET Institute Working Papers”, 06-19, 2006, p. 7, bit.ly/1aeZtxj.
Log-linear models using sales rank data, scraped from the websites of online retailers, are common in the stream of literature under review.244 In order to capture the difference in demand elasticity between price increases and price decreases in such regression models, Ghose and Gu (2006) propose a decomposition of the price explanatory variable:

$\beta_1 \log(P_{it}) + \sum_{j=1,2,3,4} \beta_{2j} (\log(P_{it}) - \log(\bar{P}_{it})) \times PriceDecrease_{it} \times Week_{ijt}$,245    (9)

where:
• $P_{it}$ is the retailer price of product $i$ at time $t$,
• $\bar{P}_{it}$ is “the price before the most recent price change”246,
• $PriceDecrease_{it}$ is a dummy that “takes the value of 1 if the most recent action on product i is a price decrease”247,
• and $Week_{ijt}$ are four ($j = 1, 2, 3, 4$) weekly dummies that represent the number of weeks after the most recent price decrease and quantify the “time for information on price decreases to spread in the market”248.

Therefore, “$\beta_1$ represents demand elasticity for price increases,”249 while “$\beta_2$ denotes the difference between demand elasticity for price decreases and that for price increases.”250

244 See section III.2, and also above and below in this section.
245 A. GHOSE, B. GU, Search costs, demand structure and long tail in electronic markets: theory and evidence, “NET Institute Working Papers”, 06-19, 2006, p. 14, eq. 4, bit.ly/1aeZtxj.
246 Ibid.
247 Ibid., p. 13.
248 Ibid., p. 14.
249 Ibid.
250 Ibid.
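The price decomposition in (9) can be sketched as a small routine that turns a weekly price series for one product into the regressors multiplying $\beta_1$ and the four $\beta_{2j}$. This is an illustration only, not the authors' code. The function name `decompose_price` is hypothetical, and the sketch assumes that the week in which the price changes counts as week $j = 1$, a timing convention the text above does not specify.

```python
import math


def decompose_price(prices, n_weeks=4):
    """Build the price regressors of (9) for one product's weekly price series.

    Returns one dict per week with:
      log_p          -> log(P_it), the regressor multiplying beta_1
      decrease_terms -> n_weeks values, the regressors multiplying beta_2j
    """
    rows = []
    prev_price = None    # price before the most recent price change
    change_week = None   # index of the week of the most recent change
    for t, p in enumerate(prices):
        if t > 0 and p != prices[t - 1]:
            prev_price = prices[t - 1]
            change_week = t
        terms = [0.0] * n_weeks
        # PriceDecrease_it = 1 only if the most recent change lowered the price.
        if prev_price is not None and p < prev_price:
            j = t - change_week + 1   # assumed: the week of the change is j = 1
            if 1 <= j <= n_weeks:     # Week_ijt selects exactly one of the terms
                terms[j - 1] = math.log(p) - math.log(prev_price)
        rows.append({"log_p": math.log(p), "decrease_terms": terms})
    return rows
```

For a series such as `[10, 10, 8, 8, 8, 8, 8, 9]`, the interaction terms are non-zero only in the four weeks that follow the cut from 10 to 8, and they return to zero after the increase to 9, so that only $\beta_1$ applies to price increases.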
The relative price elasticities of the two online book retailers with respect to each other are similar: between -1.49 and -1.89 for Amazon.com, and between -1.53 and -1.60 for BarnesandNoble.com.251 However, the relative price elasticity of Amazon.com is higher for price decreases than for price increases and gradually increases over time after a price decrease, whereas the relative price elasticity of BarnesandNoble.com is lower for price decreases than for price increases, with little information diffusion over time.252

There are a number of possible explanations for why the two retailers face different search costs: specific consumer search preferences, targeting of different consumer segments, different implementations of active and passive search tools253, etc.

Hu and Smith (2011) analyze sales data, directly provided by a publisher, of both print books and e-books.254 The price elasticity estimates obtained by running an OLS linear regression on the whole dataset are around -3 for both print books and e-books.255

Quantile linear regression256, used in place of OLS on the same dataset, unveils significant differences between best-selling titles (defined in the paper as the top 20% of books ranked by sales)257 and non-best-selling titles (defined in the paper as the bottom 80% of books ranked by sales)258.

251 Ibid., p. 17.
252 Ibid., p. 25.
253 Ibid., pp. 26-27.
254 See section III.1.
255 Y. J. HU, M. D. SMITH, The impact of ebook distribution on print sales: analysis of a natural experiment, 08/29/2011, p. 9, available at SSRN: bit.ly/Y2G650.
256 See section III.2.
257 Y. J. HU, M. D. SMITH, The impact of ebook distribution on print sales: analysis of a natural experiment, 08/29/2011, p. 19, available at SSRN: bit.ly/Y2G650.
258 Ibid.
Price elasticity for print books in the 20th and the 80th percentile is about -1.6 and -1.3, respectively, vs. -3 and -1.9 for e-books in the 20th and 80th percentile.259

Smith et al. (2012) study the potential market for the digital version of out-of-print titles260 through sales rank data, scraped from Amazon.com and the Kindle marketplace.261

First, a probit regression model is applied to random samples of out-of-print titles already available on the Kindle marketplace and of out-of-print titles not yet available on the Kindle marketplace, to calculate the probability of an out-of-print title being digitized.262 The explanatory variables include:
• the price of the print version,
• the year of publication of the first print version,
• the number of pages of the print version,
• the subject category,
• the type of audience,
• the number of Bing search results for the ISBN263,
• the sales rank of the print version on Amazon.com,
• the binding format of the print version,
• and a dummy set to one for “large publishers”.264

259 Ibid., p. 24, tab. 10.
260 See section III.2.
261 M. D. SMITH, R. TELANG, Y. ZHANG, Analysis of the potential market for out-of-print ebooks, 08/04/2012, pp. 6-7, available at SSRN: bit.ly/WcNhJq.
262 Ibid., pp. 8-9.
263 See International Standard Book Number, Wikipedia, accessed on 09/07/2013, bit.ly/16kdpGT.
264 M. D. SMITH, R. TELANG, Y. ZHANG, Analysis of the potential market for out-of-print ebooks, 08/04/2012, pp. 7-8, available at SSRN: bit.ly/WcNhJq.
Then, with the propensity scores thus obtained, the two samples are matched through the nearest neighbor and the stratification method.265

In order to obtain more credible confidence intervals for propensity scores, the probit regression model is re-estimated with the Bayesian approach outlined in Chen and Kaplan (2011)266.267 As before, the resulting propensity scores are used to match the two samples, based, again, on the nearest neighbor and the stratification method.268

Finally, sales and price of the out-of-print titles not yet available on the Kindle marketplace are inferred from sales and price of the matched out-of-print titles already available on the Kindle marketplace.269

“[…] bringing the world's 2.7 million out-of-print titles back into print as eBooks could create $740 million in revenue in the first year after publication […]”270

Cost assumptions based on current Kindle sales contracts271 suggest that as much as $460 million would accrue to publishers and authors.272

265 Ibid., pp. 14-16.
266 C. J. S. CHEN, D. KAPLAN, Bayesian propensity score analysis: simulation and case study, Society for Research on Educational Effectiveness, 2011, bit.ly/16N541w.
267 M. D. SMITH, R. TELANG, Y. ZHANG, Analysis of the potential market for out-of-print ebooks, 08/04/2012, pp. 16-17, available at SSRN: bit.ly/WcNhJq.
268 Ibid., pp. 17-18.
269 Ibid., p. 5.
270 Ibid., p. 25.
271 Ibid., pp. 21-22.
272 Ibid., p. 25.
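The matching step described above can be sketched for the nearest neighbor method alone. The example assumes that propensity scores have already been estimated (the probit and Bayesian estimation steps are not shown) and that matching is done with replacement, which the text above does not specify. The function name `match_nearest` and the tuple layout are hypothetical; the stratification method is not covered.

```python
def match_nearest(unavailable, available):
    """Match each not-yet-digitized title to the digitized title with the closest score.

    unavailable: list of (title_id, propensity_score)
    available:   list of (title_id, propensity_score, ebook_sales)
    Returns {unavailable_id: (matched_id, inferred_sales)}.
    Matching is with replacement: a digitized title may be matched more than once.
    """
    if not available:
        raise ValueError("at least one digitized title is needed for matching")
    matches = {}
    for title_id, score in unavailable:
        # Nearest neighbor on the absolute difference in propensity scores.
        best = min(available, key=lambda row: abs(row[1] - score))
        matches[title_id] = (best[0], best[2])
    return matches


def total_inferred_sales(matches):
    """Sum the sales inferred for all matched, not-yet-digitized titles."""
    return sum(sales for _, sales in matches.values())
```

Each title that is not yet available inherits the sales of its closest digitized counterpart, and summing the inferred values gives a sample-level figure that would then have to be scaled to the whole population of out-of-print titles.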
IV. E-BOOK PRICING BY MAJOR ITALIAN PUBLISHERS

Price information and, more generally, public catalog data for digital books published in Italy can be obtained, with relatively little effort, by web scraping mainstream Italian e-book stores (in our case, Ultima Books). When an e-book has a corresponding print edition and the publisher has entered the ISBN of the print edition as part of the e-book metadata, it is possible to query print book catalog databases (in our case, Informazioni Editoriali) for information about the print edition.

In the regression of the e-book price on the price of the print edition and other explanatory variables, we restrict our attention to titles of publishing houses distributed by Edigita, the leading e-book distributor in Italy.273 Edigita distributes most of the major Italian publishers274 and provides well-formed and well-compiled metadata. Incidentally, the purposive sample reduction with respect to the whole market was also technically convenient, since it allowed for effective and timely querying of data sources.

IV.1. CATALOG DATASET

Cross-sectional catalog data for 14,794 e-books distributed by Edigita, and available for sale on the Ultima Books e-book store on the 28th of November 2012, were collected within a five-day time window (11/28/2012-12/02/2012). We arranged the information across a few variables of interest:
• ebook.price, the retail price of the e-book (in euro, VAT included);

273 See section II.3.
274 See section II.3 for information about market concentration in the Italian publishing industry.
• drm, a dummy set to one if the e-book is encrypted with DRM technologies275, to zero otherwise;
• watermark, a dummy set to one if the e-book embeds information about the purchase276, to zero otherwise;
• paper.price, the retail price of the print edition (in euro, VAT included);
• ebook.pub.delay, the distance in years between the publication of the e-book and of the print edition (either “positive” for print-first titles or “negative” for digital-first titles);277
• subj.fiction, a dummy set to one if the title subject is fiction, to zero if it is non-fiction.278

Unfortunately, only 6,053 observations out of a total of 14,794 include complete information about the print edition. Since experiments in digital-only publishing279 by major publishers have been very limited, if any,280 missingness should be attributed to incomplete reporting during metadata compilation. Paragraph IV.1.2 describes the treatment method adopted to deal with the issue of missing values in X.

IV.1.1. Descriptive statistics

• Almost every e-book in the sample (96.88%) embeds DRM technologies.

275 See section II.1.
276 Digital watermarking is a “social” deterrent against piracy and represents a loose alternative to DRM technologies.
277 All the e-books in the sample were published between 2010 and 2012, so, in practice, the publication delay reflects the print edition vintage.
278 See section II.3.
279 For example, see Cos'è quintadicopertina, quintadicopertina, accessed on 09/07/2013, bit.ly/1eeukj9.
280 Correspondence and conversations with SIMPLICISSIMUS BOOK FARM management.
• Only a few e-books in the sample (2.76%) employ digital watermarking.
• Still fewer e-books in the sample (0.36%) employ neither DRM encryption nor digital watermarking.
• Non-fiction e-books are slightly more represented in the sample than fiction e-books (56.25% vs. 43.75%).

Fig. 8: Pie chart of e-book protection mechanisms
Red: DRM encryption (Adobe Content Server 4), 96.88%; Yellow: digital watermarking, 2.76%; Green: open file, 0.36%.
Source: own elaboration of public catalog data for a sample of e-books by Italian publishers.
Fig. 9: Pie chart of e-book subjects
Blue: non-fiction books, 56.25%; Light blue: fiction books, 43.75%.
Source: own elaboration of public catalog data for a sample of e-books by Italian publishers.

Tab. 2 reports summary statistics for the quantitative variables in the dataset.
Tab. 2: Summary statistics for the quantitative catalog variables

Statistic      ebook.price   paper.price   ebook.pub.delay
Mean                10.893        15.984             2.045
Median               8.990        14.000             0.000
Minimum              0.000         3.900            -2.000
Maximum            109.990       150.000            36.000
Std. Dev.            7.682         9.899             3.610
C.V.                 0.705         0.619             1.765
Skewness             3.470         3.314             2.939
Ex. kurtosis        23.688        21.547            12.924
5% Perc.             3.990         7.000             0.000
95% Perc.           24.990        35.000            10.000
IQ range             5.010         9.000             3.000

Source: own elaboration of public catalog data for a sample of e-books by Italian publishers.

• All three variables are right-skewed, as evident from Fig. 10, Fig. 11, and Fig. 12.
• The median price, more robust to pricey outliers than the mean price, is €8.99 for e-books and €14.00 for print books.
• Very few titles have been published first in e-book format and only later in print edition.
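The statistics listed in Tab. 2 can be computed for any quantitative variable with a few lines of code. The sketch below is illustrative and is not the procedure used to produce the table: it assumes the sample standard deviation, moment-based skewness and excess kurtosis, and the default percentile interpolation of Python's `statistics.quantiles`, none of which is stated in the text. The function name `summary` is hypothetical.

```python
import statistics


def summary(values):
    """Return the statistics listed in Tab. 2 for one quantitative variable."""
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.stdev(values)   # sample standard deviation (assumed)
    # Central moments for moment-based skewness and excess kurtosis (assumed).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    quartiles = statistics.quantiles(values, n=4)
    twentieths = statistics.quantiles(values, n=20)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "std_dev": std,
        "cv": std / mean,                        # coefficient of variation
        "skewness": m3 / m2 ** 1.5,
        "ex_kurtosis": m4 / m2 ** 2 - 3.0,
        "p05": twentieths[0],                    # 5th percentile
        "p95": twentieths[-1],                   # 95th percentile
        "iq_range": quartiles[2] - quartiles[0], # interquartile range
    }
```

The C.V. row of Tab. 2 is consistent with the ratio used here: for ebook.price, 7.682 / 10.893 is approximately 0.705.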
