In The Vacuum Of AI Legislation, Libraries Have The Playbook
from the always-listen-to-librarians dept
The White House AI framework made official what we already knew: this administration has no interest in regulating AI. Any legislation that contradicts the framework will be a dead end. In this regulatory vacuum, it is instructive to turn to norms developed by libraries and archives through their decades of experience working through the same core issues that are now animating AI debate: understanding copyright law; providing machine access to data; contextualizing information; and adhering to responsible stewardship obligations to communities.
The Google Books Library Project can be instructive. In the mid-2000s, research libraries partnered with Google to digitize and preserve millions of volumes in their collections. To solve the problem of how to store and provide access to a massive number of scanned books, research libraries banded together to create HathiTrust, a secure, searchable repository that remains in use today. Of course, this didn’t happen without legal challenges. Authors Guild separately sued Google and HathiTrust for copyright infringement in what came to be known as the “Google Books” cases. But these cases ultimately established the legal precedent that copying books to create a digital searchable database is fair use. Based on this precedent, research methods such as text and data mining are possible because of mass digitization, and lawful under fair use.
Based on Google Books and other litigation, libraries put a stake in the ground when it comes to copyright law: training AI models on copyrighted works generally is fair use, a position articulated by the Library Copyright Alliance (LCA) in 2023, and updated in light of recent court decisions. In two of those decisions, Kadrey v. Meta and Bartz v. Anthropic, judges held that training AI models on copyrighted works is transformative and therefore fair use. It’s worth noting that these cases are in a commercial context. It is likely that a court would rule in favor of AI uses in educational, research, and scholarly contexts, as those are favored uses under fair use.
Meanwhile, disagreements over AI safety, harm prevention, bias mitigation, and abuse have held up federal AI legislation in the US. But these are not new problems for libraries, which have developed norms to balance the collection and preservation of sensitive information in archives and special collections with the imperative to provide the broadest possible user access to digitized content. One example is the 2010 ARL principles to guide vendor/publisher relations in large scale digitization projects with special collections, which calls for libraries to make material available to the public while providing context to aid in the understanding of that material. Libraries have also developed frameworks for stewarding materials of vulnerable communities and historically marginalized groups, like the Library of Congress access policy on culturally sensitive materials relating to Indigenous peoples, which includes transparent procedures for controlled access and use of culturally sensitive materials.
Congress has also been legislating in the dark around issues like transparency and provenance in AI training, and many of the proposals we have seen so misunderstand these concepts that they threaten to bring the university-based research enterprise to a halt. Libraries already do what Congress is trying to mandate — authenticating, contextualizing, and documenting collections — but the legislation is too disconnected from this expertise, and as a result unworkable for the institutions that actually practice rigorous provenance.
As AI governance debates continue to stall on Capitol Hill, library norms offer a foundation for approaching AI training and research in a way that is responsible, steeped in library expertise, and advances the public interest.
With gratitude to Betsy Rosenblatt, Professor of Law, Case Western Reserve University Law School
Katherine Klosek is the Director of Information Policy and Federal Relations at the Association of Research Libraries.
Filed Under: ai, ai policy, copyright, fair use, librarians, libraries
