from the behind-the-times dept
One of the biggest problems with copyright law is orphaned works: material for which the copyright status is impossible to determine, or for which the current copyright holder is unknown. Closely tied to the issue of orphaned works is the prohibitive difficulty of looking up copyright registration information: the Copyright Office has records going all the way back to 1870, but only those from 1978 onwards are available online. That leaves over a century of records that exist only in physical catalogues containing upwards of 70 million entries. Millions of these works have undoubtedly fallen into the public domain thanks to the shorter, stricter copyright laws in force at the time of their creation, and yet they continue to enjoy de facto copyright protection because very few people have the means to access and search the records.
Last year, the Copyright Office said that it would make digitizing these records a priority. Naturally, this is a difficult and expensive task: the scanning phase is well underway, but the office has yet to tackle the much bigger challenges of text recognition and metadata tagging that are necessary to make the records searchable. As a stopgap, it is now considering putting the raw scans online:
Of the 25,723 drawers in the Copyright Card Catalog, more than 12,000 have already been scanned resulting in more than 17 million card images safely tucked away in Library storage. The long term plan is to capture index terms from the card images using OCR and keyboarding and to build indexes for online searching. But this will require significant time and money to achieve. Must we wait to share these images with you? Maybe not.
As an interim step, the Copyright Office is considering making the images of the cards in the catalog available online through a hierarchical structure that would mimic the way a researcher would approach and use the physical card catalog. We’re calling this a virtual card catalog. While it would not provide the full record level indexing that remains a principal goal, it would make information available as we’re doing the scanning and as searchable as the actual cards.
Anything that makes these records more accessible is a good thing, but this situation really just serves to underline the massive imbalance in copyright law. The public domain—supposedly a key part of the bargain of copyright—is being curtailed by the failure of the system to keep up with technology and culture. In today's world, everyone is a content creator, and by extension, everyone is a remixer. For new generations, public records that don't exist online might as well not exist at all. Expecting people to wait for these records to be digitized (or fly to D.C. and request access to the physical catalogue) is laughable—a total denial of reality. Is it any surprise that people don't respect copyright law under such circumstances?