Sunday, July 25, 2010

Don't just rip a book, don't just rip a bookshelf - rip the library

The BookLiberator, another cheap DIY book scanning kit, has just been announced. There's even a site dedicated to DIY book scanners.

The relevance of mass digitisation for libraries has been highlighted by a report commissioned by CLIR: "On the Cost of Keeping a Book," by Paul Courant and Matthew "Buzzy" Nielsen, contained within the document "The Idea of Order: Transforming Research Collections for 21st Century Scholarship" published in June 2010.

Courant and Nielsen analyse the full costs of storing physical books, and conclude that for a typical book held by a US research library in an open stack, the "fully loaded" cost is $US4.26 per annum. Moving the book to a high density storage facility after 20 years reduces this cost to $US1.99 pa in perpetuity, assuming very low usage of the information it contains (that is, very low circulation).

Drawing on the experiences of the HathiTrust, Courant and Nielsen estimate that the comparable "fully loaded" cost of storing a black-and-white ebook in a mirrored digital archive with tape backup is less than $US0.15 pa, rising to $US0.40 pa for a full colour ebook with a third mirror site added.
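To make the gap concrete, here is a small sketch that turns the figures quoted above into ratios. The dictionary keys and the function name are my own labels, not terms from the report, and the black-and-white figure is treated as an upper bound because the report says "less than".

```python
# Annual "fully loaded" storage cost per book in $US, as quoted above.
ANNUAL_COST_USD = {
    "open stack": 4.26,
    "high-density storage": 1.99,
    "colour ebook, three mirrors": 0.40,
    "black-and-white ebook": 0.15,  # upper bound: the report says "less than"
}


def cost_ratio(physical: str, digital: str) -> float:
    """Return how many times more the physical option costs per year."""
    return ANNUAL_COST_USD[physical] / ANNUAL_COST_USD[digital]


if __name__ == "__main__":
    for physical in ("open stack", "high-density storage"):
        for digital in ("black-and-white ebook", "colour ebook, three mirrors"):
            ratio = cost_ratio(physical, digital)
            print(f"{physical} vs {digital}: {ratio:.1f}x")
```

On these numbers an open-stack book costs roughly 28 times as much to keep as a black-and-white ebook, and even high density storage is about 5 times the cost of a full colour ebook on three mirrors.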

A very significant difference; but the real "total societal" cost differential is far greater.

Consider the 3.1 million monographs held by the National Library of Australia in Canberra. Less than 2% of the Australian population has convenient physical access to the NLA, so for a typical Australian, getting access to a book held there is very expensive if they need to travel to the book, and expensive and slow if they are fortunate enough to be able to get the book to travel to them through an inter-library loan.

The 2001 inter-library loan benchmark by the NLA revealed the average cost to the participating libraries of a loan was over $A49, and the average delay between the request being made and the reader being informed that their requested book was waiting for them at their library was over 11 days.

But a scanned and OCR'ed version of the book, an ebook, could be delivered almost instantly and for negligible cost to the reader (or, if the reader did not have a means of receiving or reading it, to their library). It could be delivered in a format which was easier to read (reader selectable font) and which supported searching, annotation, copying and pasting and hyperlinking. It could be made available to 2 or more readers simultaneously.

But the primary benefit to libraries is the greatly reduced cost of storage and access.

The Internet Archive claims it costs about $US30 to digitise a book and store it in perpetuity using their widely deployed hardware and software. That's much less than the cost of one inter-library loan.
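As a rough check, the one-off $US30 figure can be set against the annual physical storage costs from the Courant and Nielsen report. This is a back-of-envelope sketch only: it ignores discounting and inflation, and the function and constant names are mine.

```python
DIGITISE_ONCE_USD = 30.00    # Internet Archive's claimed one-off cost per book
OPEN_STACK_USD_PA = 4.26     # Courant and Nielsen, open stack
HIGH_DENSITY_USD_PA = 1.99   # Courant and Nielsen, high density storage


def breakeven_years(one_off_cost: float, annual_cost: float) -> float:
    """Years of physical storage that cost as much as the one-off payment.

    Simplifying assumption: no discounting, constant annual cost.
    """
    return one_off_cost / annual_cost


if __name__ == "__main__":
    open_stack = breakeven_years(DIGITISE_ONCE_USD, OPEN_STACK_USD_PA)
    high_density = breakeven_years(DIGITISE_ONCE_USD, HIGH_DENSITY_USD_PA)
    print(f"Open stack: about {open_stack:.1f} years")
    print(f"High density storage: about {high_density:.1f} years")
```

Under these assumptions the digitisation cost is matched by about 7 years of open-stack storage, or about 15 years of high density storage.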

The Internet Archive recently announced a new digital lending library service with 3 categories of books, two of which are not controversial:

  • downloading public domain books
  • linking to the commercial OverDrive service for "current" books made available through that service by the reader's library

But a third, much smaller selection of out-of-print but in-copyright books has been scanned and made available for anyone to freely download and read for 2 weeks. After 2 weeks, the downloaded copy can no longer be read and the ebook becomes available for someone else to download.

There are currently fewer than 200 books in this third category. Eric Hellman speculates that Internet Archive's founder, Brewster Kahle, must have expensive legal advice, and that perhaps this ploy is a bait for the publishers.

And 200 books won't change much; they are just noise compared with the number of in-print and in-copyright books which have been "liberated" and circulate on peer-to-peer networks.

Very soon, Google Editions is likely to begin making cloud-hosted versions of in-copyright books available for an average price of $US6, of which about $US3.80 will be made available to the Book Rights Registry for distribution to the rights holders.

Libraries are funded to preserve and circulate books. They perform an essential role in enriching our society by making information and entertainment available to all. By storing and circulating ebooks rather than physical books, libraries can probably save around $4 per book per year and simultaneously provide a better service to their readers. Even a system which allowed just one electronic copy of each in-copyright book to circulate would provide a better service and be much cheaper than the current physical storage and circulation system.
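The "around $4" figure is simply the open-stack cost less the ebook cost. The sketch below shows that subtraction and then scales it to a collection the size of the NLA's 3.1 million monographs. Applying a US research library figure to the NLA is my assumption for illustration only, and the calculation leaves out the one-off cost of digitising; the names are mine.

```python
OPEN_STACK_USD_PA = 4.26      # physical book, open stack
EBOOK_BW_USD_PA = 0.15        # black-and-white ebook (upper bound)
EBOOK_COLOUR_USD_PA = 0.40    # full colour ebook, three mirrors
NLA_MONOGRAPHS = 3_100_000    # collection size quoted earlier in the post


def saving_per_book(physical_pa: float, digital_pa: float) -> float:
    """Annual storage saving per book from holding it as an ebook."""
    return physical_pa - digital_pa


def collection_saving(books: int, saving_pa: float) -> float:
    """Annual saving across a whole collection (illustrative only)."""
    return books * saving_pa


if __name__ == "__main__":
    low = saving_per_book(OPEN_STACK_USD_PA, EBOOK_COLOUR_USD_PA)
    high = saving_per_book(OPEN_STACK_USD_PA, EBOOK_BW_USD_PA)
    print(f"Saving per book per year: ${low:.2f} to ${high:.2f}")
    total = collection_saving(NLA_MONOGRAPHS, 4.0)
    print(f"At $4 per book, 3.1 million books: ${total:,.0f} per year")
```

At a round $4 per book, a collection of that size would save in the order of $12 million a year in storage costs, before counting any savings on inter-library loans.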

But what if the savings made from going "e" were made available to purchase additional "copies" for simultaneous circulation? Or what if rights holders could be compensated according to the circulation of their creations?

For decades, the Australian Government has run a public lending right program which makes payments to creators and publishers based on the physical holdings of their works in Australian libraries.

It's now time for libraries to provide a better service for their readers and reduce their own costs by digitising their collections.

It's time for libraries to build on the technology of their new commercial competitors and to spend fewer resources on shuffling and storing blocks of paper and more on encouraging and rewarding those who produce the content.

Unlike the costs of storing and circulating books, the benefits of a better informed citizenry are incalculable.

From the Internet Archive Digital Lending Library announcement:

"As the first American library to lend books, we believe it is only
fitting that we extend and upgrade this basic, yet crucial service in
the digital age," said Tom Blake, Digital Projects Manager, Boston
Public Library. "We hold the third largest research collection in the
country, much of which is available at our buildings only during
business hours. Digital lending allows us to circulate these rare,
precious, and unique holdings into our local neighborhoods and
beyond – anytime, anywhere, free to all."