The Great Digitization Or The Great Betrayal?

from the so-what-exactly-are-libraries-for? dept

One of the great tasks facing humanity today is digitizing the world's books and liberating the huge stores of knowledge they contain. The technology is there – scanners are now relatively fast and cheap – but the legal framework is struggling to keep up. That can be seen in the continuing uncertainty hovering over Google's massive book scanning project. It can also be observed in some recent digitization projects like Cambridge University's Digital Library:

Over the course of six centuries Cambridge University Library's collections have grown from a few dozen volumes into one of the world's great libraries, with an extraordinary accumulation of books, maps, manuscripts and journals. These cover every conceivable aspect of human endeavour, spanning most of the world's cultural traditions. While parts of the Library's manuscript collections have already been published in print, microfilm and digital formats, we are now building a substantial online resource so that our collections can be much more accessible to students, researchers and the wider public.

That's obviously a highly laudable aim. But the strict terms and conditions are not so praiseworthy:

Subject to statutory allowances, extracts of the Content and University Material from the site may be accessed, downloaded and printed for your personal and non-commercial use and you may draw the attention of others within your organisation to material posted on the site. Unless explicitly licensed or permitted by us, you may not:
use any part of the Content or University Material on the site for direct or indirect commercial purposes or advantage without obtaining a licence to do so from the University or its licensors

modify or alter the paper or digital copies of any Content or University Material printed off or downloaded in any way

sell, resell, license, transfer, transmit, display in any form, perform, hire, lease or loan any Content or University Material in whole or in part printed or downloaded from the site

systematically extract and/or re-utilise substantial parts of the Content or University Material from the site

create and/or publish your own database that features substantial parts of this site.
If you print, copy, download or use any part of the site in breach of these terms of use, your right to use the site will cease immediately and you must at the option of the University return or destroy any copies of the material you have made.

One of the jewels of the Cambridge University Digital Library is a collection of Newton's scientific papers. So far, a selection of important mathematical works from the 1660s has been digitized. These date from well before the first modern copyright act, the 1710 Statute of Anne. So it's an interesting question -- what is the copyright situation of these papers and their digitized images?

Assuming that copyright dates from the "fixing" of the work, or from the date of the Statute of Anne, they would clearly have passed into the public domain long ago. One technique that libraries have tried to employ in order to maintain their control is to claim that the act of digitizing creates a new copyright, although this seems dubious. After all, the whole point of digitization is to capture as faithfully as possible the physical appearance of a text: an artistic interpretation of that physical appearance would defeat the object of the exercise. But without that artistic element there seem to be no grounds for claiming copyright.

Moreover, even if there were copyright in the digitized image, it's hard to see how there is any basis for stopping people from transcribing the text, since that is undoubtedly in the public domain. But that's precisely what Cambridge University is trying to do in its conditions quoted above.

At least the Cambridge University Digital Library allows "personal and non-commercial use" for free; the British Library's new British Newspaper Archive doesn't even permit that:

The index of the newspaper archives featured on the website can be searched for free, from any location. If you are using the website in premises owned or operated by the British Library, you can view the images of the newspapers themselves for free also. If you are using the website anywhere else and want to view the images of the newspaper archive or use some features of the website you will need to buy either a Credit Package or a Subscription. You have to register with us and be signed in to buy credits or a subscription.

Here's what the British Newspaper Archive encompasses:

The British Library's newspaper collections are among the finest in the world, containing most of the runs of newspapers published in the UK since 1800.

The scale of the newspaper publishing industry from the early 19th century onwards is enormous, with many cities and towns publishing several newspapers simultaneously, often aimed at distinct audiences depending on social status, geographical location and political affiliation. The first stage of this project focuses on runs published before 1900 and will include titles from cities such as Birmingham, Derby, Manchester, Nottingham, Norwich, Leeds and York, along with local titles from London boroughs.

Clearly, most of that material will be in the public domain. But as a result of this digitization project, the British Library is actually removing physical access to some of its public domain holdings, replacing it with virtual access through images it claims are under copyright:

We have even scanned single pages more than two feet wide! These publications are now not available for public view or access through the Library's reading rooms; however, they will be available to view on this website.

And to those who say that digitization costs money, and that those costs must be recouped in some way, consider this: holding books in a library, and making them available to the public, costs money too, but that did not prevent the great libraries of the past from providing access to their holdings for free. Those trail-blazing institutions knew that charging people to read would have been a negation of their central role in making knowledge freely available to all. And so it is today: a key part of the modern library ought to be making digital knowledge available to all, without charge, and without limitations.

This current trend to limit access to digitized versions of public domain materials is a real betrayal of the original mission of public libraries like the British Library. These made possible the opening up of knowledge to huge numbers of ordinary people who otherwise would never have been able to access these materials. Today's massive digitization projects, which ought to be building on and extending that great tradition, are actually reversing it by seeking to take texts out of the public domain and charge for access to them. That's not just a shame, it's a scandal.

Follow me @glynmoody on Twitter, and on Google+

Filed Under: copyright, digitization, public domain, uk

Reader Comments

  1. Michael Edson, 9 Jan 2012 @ 12:40pm

    Re: Re: Re: #38. It's a laudable aim, but...

    Nick - - so you're mad at...everyone? That's so unlike you!

    I don't want to presume, but I can see where I think you're coming from - - trying to make Big Progress in the bight (in the nautical sense) between Government, law, technologists, copyright evangelists, The Public (in aggregate), and pooles (no pun) of heritage organizations thinking about all of their collections. From that perspective, Gunning for the Big Wins but having to deal with the Big Headaches of the lowest common denominator of collective strengths, weaknesses, and neuroses, I’d be mad at everyone too!

    (Below, shadow-boxing with The World, through you. You're welcome.)

    I've been mulling this over and I'm uneasy casting Glyn's Public Domain WTF into the same parent class as a lobby of inflexible Copyright Evangelists and then dismissing it as being an impediment to progress. Glyn isn't saying all content should be free, he's saying that digitized Public Domain materials (like the 17th century mathematical treatises and 19th century newspapers described in his post) should not be enclosed in faux copyright or restrictive terms-of-use statements - - especially when the holding institutions are publicly funded. I think public institutions could afford to do this; I think they can't afford not to do it.

    Most of the institutions I've talked to and studied attempt to restrict access to digital reproductions of public domain works because they think they're making money off of them. With some groups of content they are (arguably, depending on how they account for cost-to-market), but as you know there's a lot of evidence that most heritage organizations lose money running licensing and rights-and-repro offices for their digital collections - - even when they only count a fraction of their true cost-to-market in their balance sheets. (I'm thinking of Simon Tanner's 2004 Mellon Foundation study and the V&A Images study - - {ref: }). I wonder if we could lose just as much money, but get better civic outcomes, by giving the public domain stuff away, in harmony with our missions and with the intended purpose of PD law (in the US at least)? Our digital strategy says we should do exactly that.

    I've also found that only a few organizations have teams with a solid grasp of digital content issues, public domain and copyright issues, business acumen, and desirable mission-based outcomes - - and can put them in the same room at the same time and allow them to share credit for each other's success. Without this kind of cross-disciplinary teamwork it's hard to imagine any other way to create value from digitized public domain content than to treat our online holdings like a rental property and let gatekeepers directly charge for limited access to the inventory.

    A final thought: digitizing heritage collections doesn't have to be an enormous public works project like roads or railways. In that analogy, the big infrastructure has already been built: it's the Internet. The rest can be (and probably will be) small pieces loosely joined. It's already happened - - huge aggregate numbers of small batches of useful things put online in an effort spread across individual collections, departmental work groups, volunteers, experts, and citizens. The work has already been done, the content is there, and we're pissing off what are potentially our super-fans, donors, and most passionate advocates by (what I personally feel is) gaming the system. We're also throttling re-use, which creates a negative feedback loop that undermines our efforts to convince new funders and long-tail audiences that heritage collections are worth supporting - - are worth fighting for.

    [And this article and thread aren't complete without a link to Yale's new(ish) public domain policy and associated documents/statements. (Trail head at same link as above: ) Yale does an awesome job of laying out the arguments for unrestricted access to high-quality public domain materials from the perspectives of mission, money, leadership, and the law. Quite well done.]
