One of the great tasks facing humanity today is digitizing the world's books and liberating the huge stores of knowledge they contain. The technology is there – scanners are now relatively fast and cheap – but the legal framework is struggling to keep up. That can be seen in the continuing uncertainty hovering over Google's massive book scanning project. It can also be observed in some recent digitization projects like Cambridge University's Digital Library:
Over the course of six centuries Cambridge University Library's collections have grown from a few dozen volumes into one of the world's great libraries, with an extraordinary accumulation of books, maps, manuscripts and journals. These cover every conceivable aspect of human endeavour, spanning most of the world's cultural traditions. While parts of the Library's manuscript collections have already been published in print, microfilm and digital formats, we are now building a substantial online resource so that our collections can be much more accessible to students, researchers and the wider public.
That's obviously a highly laudable aim. But the strict terms and conditions are not so praiseworthy:
Subject to statutory allowances, extracts of the Content and University Material from the site may be accessed, downloaded and printed for your personal and non-commercial use and you may draw the attention of others within your organisation to material posted on the site. Unless explicitly licensed or permitted by us, you may not:
use any part of the Content or University Material on the site for direct or indirect commercial purposes or advantage without obtaining a licence to do so from the University or its licensors
modify or alter the paper or digital copies of any Content or University Material printed off or downloaded in any way
sell, resell, license, transfer, transmit, display in any form, perform, hire, lease or loan any Content or University Material in whole or in part printed or downloaded from the site
systematically extract and/or re-utilise substantial parts of the Content or University Material from the site
create and/or publish your own database that features substantial parts of this site.
One of the jewels of the Cambridge University Digital Library is a collection of Newton's scientific papers. So far, a selection of important mathematical works from the 1660s has been digitized. These date from well before the first modern copyright act, the 1710 Statute of Anne. So it's an interesting question: what is the copyright situation of these papers and their digitized images?
Assuming that copyright dates from the "fixing" of the work, or from the date of the Statute of Anne, Newton's papers would clearly have passed into the public domain long ago. One technique that libraries have tried to employ in order to maintain their control is to claim that the act of digitizing creates a new copyright, although this seems dubious. After all, the whole point of digitization is to capture as faithfully as possible the physical appearance of a text: an artistic interpretation of that physical appearance would defeat the object of the exercise. But without that artistic element there seem to be no grounds for claiming copyright.
Moreover, even if there were copyright in the digitized image, it's hard to see how there is any basis for stopping people from transcribing the text, since that is undoubtedly in the public domain. But that's precisely what Cambridge University is trying to do in its conditions quoted above.
At least the Cambridge University Digital Library allows "personal and non-commercial use" for free; the British Library's new British Newspaper Archive doesn't even permit that:
The index of the newspaper archives featured on the website can be searched for free, from any location. If you are using the website in premises owned or operated by the British Library, you can view the images of the newspapers themselves for free also. If you are using the website anywhere else and want to view the images of the newspaper archive or use some features of the website you will need to buy either a Credit Package or a Subscription. You have to register with us and be signed in to buy credits or a subscription.
Here's what the British Newspaper Archive encompasses:
The British Library's newspaper collections are among the finest in the world, containing most of the runs of newspapers published in the UK since 1800.
Clearly, most of that material will be in the public domain. But as a result of this digitization project, the British Library is actually removing physical access to some of its public domain holdings, replacing it with virtual access through images it claims are under copyright:
The scale of the newspaper publishing industry from the early 19th century onwards is enormous, with many cities and towns publishing several newspapers simultaneously, often aimed at distinct audiences depending on social status, geographical location and political affiliation. The first stage of this project focuses on runs published before 1900 and will include titles from cities such as Birmingham, Derby, Manchester, Nottingham, Norwich, Leeds and York, along with local titles from London boroughs.
We have even scanned single pages more than two feet wide! These publications are now not available for public view or access through the Library's reading rooms; however, they will be available to view on this website.
And to those who say that digitization costs money, and that those costs must be recouped in some way, consider this: holding books in a library, and making them available to the public, costs money too, but that did not prevent the great libraries of the past from providing access to their holdings for free. Those trail-blazing institutions knew that charging people to read would have been a negation of their central role in making knowledge freely available to all. And so it is today: a key part of the modern library ought to be making digital knowledge available to all, without charge, and without limitations.
This current trend to limit access to digitized versions of public domain materials is a real betrayal of the original mission of public libraries like the British Library. These made possible the opening up of knowledge to huge numbers of ordinary people who otherwise would never have been able to access these materials. Today's massive digitization projects, which ought to be building on and extending that great tradition, are actually reversing it by seeking to take texts out of the public domain and charge for access to them. That's not just a shame, it's a scandal.
Follow me @glynmoody on Twitter or identi.ca, and on Google+