Saturday, February 28, 2009

Google Book Settlement doesn't address the hard problem

The Google Book Settlement (GBS) defines an arrangement between the Association of American Publishers (AAP), the Authors Guild and Google which allows Google to digitise and sell access to out-of-print books which are still subject to copyright, and to share the proceeds with the rights holders.

It's easy to see what AAP and the Authors Guild were thinking: books, like all information, are going 'e'; let's monetise these lazy assets; and if you can't beat them, join them. But AAP and the Authors Guild are "joining" Google like the Celtic Gauls joined the Roman Empire.

It probably costs Google about $90 to digitise each book covered by the GBS: $60 in up-front payment to the Books Rights Registry (BRR) and around $30 to perform the digitisation (*).

I'm not sure what it costs to author, edit, lay out, proof-read and index the typical book, but I've seen estimates that it's typically many tens of thousands of dollars. That is, the difference between the costs of digitisation and production is around 3 orders of magnitude.

But the split between Google and the BRR is 37%:63%. That is, despite costs hundreds or thousands of times higher, rights holders get less than twice Google's share of the income produced.

At an average sales price of $6, Google needs only $90 / $6 / 0.37 = 41 sales of a title to recoup its costs (**). Rights holders need more like 15,000 sales to recoup theirs. The risk/reward balance looks lopsided and hence unstable.
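The break-even arithmetic above can be sketched in a few lines of Python. The $90 cost, $6 price and 37%:63% split come from this post; the $57,000 production cost is purely an assumption, picked from the "many tens of thousands of dollars" range because it reproduces the rough 15,000 figure. The function and constant names are illustrative only.

```python
import math


def break_even_sales(cost, price, revenue_share):
    """Smallest whole number of sales whose revenue share covers the cost."""
    return math.ceil(cost / (price * revenue_share))


# Figures taken from the post.
GOOGLE_COST = 90.0           # $60 up-front to the BRR + ~$30 digitisation
AVERAGE_PRICE = 6.0          # assumed average sales price per title
GOOGLE_SHARE = 0.37
RIGHTS_HOLDER_SHARE = 0.63

# ASSUMPTION: not stated in the post; chosen only to illustrate the
# "many tens of thousands of dollars" estimate for producing a book.
PRODUCTION_COST = 57_000.0

google_sales = break_even_sales(GOOGLE_COST, AVERAGE_PRICE, GOOGLE_SHARE)
rights_holder_sales = break_even_sales(
    PRODUCTION_COST, AVERAGE_PRICE, RIGHTS_HOLDER_SHARE
)

print("Google breaks even after", google_sales, "sales")
print("Rights holders break even after", rights_holder_sales, "sales")
```

Changing PRODUCTION_COST moves the rights holders' figure proportionally, so the gap between the two break-even points stays at hundreds of times under any cost in that range.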

Maybe that's fine - after all, these books are out-of-print, and the rights holders have presumably already got all the revenue they can from these works? Well, no. All we can deduce from the fact that a book is out-of-print is that it is no longer commercially advantageous, given the high costs of producing, moving and selling physical books, to bother printing, distributing and selling it. Old books have to make way for the new on the book-store shelf.

The problem with the settlement is that, given the reality of inevitable piracy of digitised books, the interests of rights holders and Google are seriously misaligned. Google has little incentive to be very worried about piracy, and in any case, they're smart enough to know there's nothing they can do about it. All they need is to sell 40-odd copies (or get equivalent per-book institutional subscription revenue to their book database) and they're in the black. If they sell 100, they've earned roughly two and a half times their outlay, whereas the rights holders haven't even covered the costs of the layout artist.
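As a rough check on the 100-copy scenario, here is a minimal Python sketch using the same figures as the earlier arithmetic ($90 cost, $6 price, 37% share). The function name and its defaults are illustrative assumptions, and which number one quotes as "the return" depends on whether it is measured gross or net of the outlay.

```python
def google_return(units_sold, cost=90.0, price=6.0, share=0.37):
    """Return (gross multiple of cost, net return as a fraction of cost)."""
    income = units_sold * price * share
    gross_multiple = income / cost
    net_return = (income - cost) / cost
    return gross_multiple, net_return


gross_multiple, net_return = google_return(100)
print(f"Income is {gross_multiple:.2f}x the outlay")
print(f"Net return on investment is {net_return:.0%}")
```

On these assumptions, 100 sales bring in about $222 against a $90 outlay: about 2.5 times the cost, or a net return of just under 150%.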

Digitised books from the Google repository will be pirated and there's nothing that can be done about it: DRM won't help, copies will be untraceable and watermarks will be removed (***).

In the short term, those lucky enough to have access through personal wealth or institutional affiliation (or those happy to use pirated copies) will enjoy previously unimaginable access to our written culture, albeit on terms set by a for-profit corporation. But in the long term, we'll all suffer as the incentives to produce are reduced by uncontrollable piracy.

As Kevin Kelly says:
The internet is a giant copy machine ... a super-distribution system, where once a copy is introduced it will continue to flow through the network forever, much like electricity in a superconductive wire. We see evidence of this in real life. Once anything that can be copied is brought into contact with internet, it will be copied, and those copies never leave. Even a dog knows you can't erase something once it's flowed on the internet.

As Paul Krugman says:
Bit by bit, everything that can be digitized will be digitized, making intellectual property ever easier to copy and ever harder to sell for more than a nominal price. And we’ll have to find business and economic models that take this reality into account.

The Google Book Settlement does not take this reality into account. Rather, it is a short-term commercial play which helps to cement Google's pre-eminent position in the information business.

Google isn't being evil, or even tricky; it's just being rational. I assert that what the AAP, the Authors Guild and our society are really looking for is what Krugman describes as "business and economic models that take this reality into account".

One attempt at a model that fits Krugman's specification combines compulsory licensing with free, easy and anonymous access to, and downloading of, digitised materials, administered by commercially disinterested parties and funded by general taxation. More details are here.


* The Internet Archive asserts it costs them around $30 to digitise a typical book by scanning from paper and to store the digitised copy.

** The settlement claims that about half the books covered will be offered for sale for $5.99 or less.

*** Copies will be downloaded by individuals with access to large institutional subscriptions (eg, university students using their library's access), then programmatically combined with other copies to locate and remove or blur watermarks. The costs of piracy are near zero as everything can be automated (see, for example, the Google Book Downloader, which automates the process of creating a local PDF copy of books on Google whose pages can be viewed). The music industry has learnt that neither DRM nor attacking P2P networks materially helps.


  1. Kent,

    One important point in your financial analysis is that traditionally a retailer receives 50% of the retail price of the book. This is fairly standard in the industry, although there are sectors where the share varies. The 63/37 split should be compared against the more standard 50/50 split with retailers.

  2. Read the settlement. There are no downloads. You would have to hack Google's entire infrastructure to provide alternative access (your "piracy") to any book or all books.

    Further, the only books that fall under the settlement are books that are not being commercially exploited. What revenue content owners gain from the agreement is all revenue they had no access to. They aren't trying to make up their investment. It's a win for publishers, not Google. A win for Google would have been litigating to the finish and winning a fair use case. Then they wouldn't have to share revenues.

  3. Re the retailer taking 50%: that's a fair point, but this is 50% of a much higher retail sales price than the approximate modal price of $6 for Google Books. However, the publisher also has a higher cost base (the printing and distribution). But so does the retailer: it could be argued that the retailer may fairly "earn" their 50% by paying for store rent, fit-out, employee wages, power, dealing with returns and shop-lifting; that is, they have higher transaction costs per purchase than Google. There are thousands of book retailers free to compete on their internal costs and retail price (and margin). There is one Google Books...

  4. Re no downloads: people using an institutional subscription (eg, students at a subscribing university) or personal consumers are (legitimately) able to cut and paste or print the entire book (albeit print in chunks of 20 pages at a time). What "print" really means is up to the controller of the system on which printing is done - "printing" means creating a data stream which can just as easily be diverted to a file as a printer. And even if physically printed, the pages can be cheaply scanned and optionally OCR'ed. Watermarks and other identifiers will be removed/defeated and the anonymized content uploaded to file sharing networks.

  5. Re "win for publishers": I agree that Google is providing a mechanism whereby rights holders get something rather than nothing, and that something is better than nothing. However, I don't agree rights holders are getting enough to make it "fair", and Google Books' effective monopoly position is largely the reason.

    Re: "not a win for Google": Google had no chance at all of winning a "fair use" case which would allow them to sell in-copyright books without compensating rights holders.
