Open Access Faces Many Problems; Here's One That The Indispensable Internet Archive Is Helping To Solve
from the now-would-be-a-good-time-to-make-a-donation dept
As Techdirt has reported many times, open access is a self-evidently great idea, but one that is still beset with many problems. That’s not least because academic publishers are keen to remain in control of any transition to open access, and aim to maintain their extremely high profit margins whatever the publishing model. But there’s one problem for open access that ironically derives from its greatest strength: the fact that anyone can access journals at any time, for free. Because material is always available, librarians have tended not to worry about making some kind of backup.
That’s not the case for traditional journals, where there is potentially a big problem if a subscription is cancelled. The end of a subscription often means that readers lose their existing access to journals. To address this, librarians have come up with a variety of ways to ensure “post-cancellation access”, explained well in a 2007 post on a blog about digital preservation, written by David Rosenthal.
A recent article on the Internet Archive site provides some interesting statistics on the scale of the problem of creating permanent copies of open access titles:
Of the 14.8 million known open access articles published since 1996, the Internet Archive has archived, identified, and made available through the Wayback Machine 9.1 million of them… In the jargon of Open Access, we are counting only “gold” and “hybrid” articles which we expect to be available directly from the publisher, as opposed to preprints, such as in arxiv.org or institutional repositories. Another 3.2 million are believed to be preserved by one or more contracted preservation organizations, based on records kept by Keepers Registry… These copies are not intended to be accessible to anybody unless the publisher becomes inaccessible, in which case they are “triggered” and become accessible.
This leaves at least 2.4 million Open Access articles at risk of vanishing from the web… While many of these are still on publisher’s websites, these have proven difficult to archive.
That’s a pretty serious problem, and one which the Internet Archive is taking steps to address, for example by trawling through the petabytes of Web content that it has built up since 1996. There’s an editable catalog with an open API that aims to provide “Perpetual Access to Millions of Open Research Publications From Around The World”. The Internet Archive has also created a full-text search index of over 25 million research articles and other scholarly documents.
Although few people are aware of this project, it is vital work. There is little point publishing open access titles, theoretically available to all, if those articles simply disappear at some point in the future. The Internet Archive’s copies will ensure that doesn’t happen. They are yet another indication of the invaluable and unique role the site plays in the online world. Without it, we would already have lost so much of the amazing material that was once online, but which has since vanished except for the copies held by the Wayback Machine. That is another good reason to support this incredible, free resource financially, and to help defend it from incredibly selfish attacks by publishers.