Yes, Digital Books Do Wear Out; Stop Accepting Publishers’ Claims That They Don’t

from the digital-data-wears-out dept

There’s a great post by Brewster Kahle on the Internet Archive blog with the title “Digital Books wear out faster than Physical Books”. He makes an important point about the work involved in providing and preserving digital books:

The Internet Archive processes and reprocesses the books it has digitized as new optical character recognition technologies come around, as new text understanding technologies open new analysis, as formats change from djvu to daisy to epub1 to epub2 to epub3 to pdf-a and on and on. This takes thousands of computer-months and programmer-years to do this work. This is what libraries have signed up for—our long-term custodial roles.

Also, the digital media they reside on changes, too—from Digital Linear Tape to PATA hard drives to SATA hard drives to SSDs. If we do not actively tend our digital books they become unreadable very quickly.

The issue is particularly acute for this sector because ebooks potentially offer huge advantages over physical ones, which encourages libraries and archives to adopt the format. Unfortunately, libraries and archives face two sets of problems: the one mentioned above, and the fact that publishers are making digital books less useful than analogue ones in order to boost their profits, as I detailed in Walled Culture the book.

Of course, ebooks are not the only digital artefacts subject to the problems pointed out by Brewster. Digital music and digital films also wear out in the sense that formats change and the media they are stored on must be replaced as technology progresses. The same applies to the world of video games – a cultural area often overlooked. Moreover, video games – like ebooks – are typically locked up using Digital Rights Management (DRM), which adds a further challenge to preserving them: it’s generally against the law to circumvent that DRM, even for the purpose of making backups or converting a work to another format.

In other words, the problem of archiving digital creations is hard enough, but thanks to copyright, it’s often impossible. So much for copyright supporting creativity…

Originally posted to the Walled Culture blog.


Comments on “Yes, Digital Books Do Wear Out; Stop Accepting Publishers’ Claims That They Don’t”

35 Comments
Alan says:

"So much for copyright supporting creativity…"

Copyright was designed to ensure some creative works are created. Whether the work can be enjoyed by people (especially those without the means to purchase the same work again and again) has always been a non-goal. Nor is it a goal to ensure that copyright encourages more creative works than would be created in a world without it.

It is defective by design. Not just a result of bastardized copyright laws.

Anonymous Coward says:

Re:

Indeed. Copyright was created to encourage culture production, but with no thought about public access to that production. That’s the fatal flaw. What rights do we the public have to access that culture production? Little, it seems. And what little right of access we have is being taken away bit by bit. Current culture production is locked up behind a paywall for the public’s lifetime. What’s so good about that?

Ever more restrictions are being imposed on access to the production that copyright supposedly encourages, and for what benefit to the public? None, as far as I can see. It’s not about the public but about corporate profits. What’s so good about that?

What’s good about encouraging culture production that is designed to be readily accessible to wealthy people and readily inaccessible to the marginalized population of the public? Copyright discriminates against the disadvantaged. What’s so good about that?

Copyright was designed for the physical world of an industrial era. It was not designed for the free, modern digital society of the information age. That’s a big flaw.

It is evident that Copyright is defective by design, but many people refuse to see this because they drank too much Kool-Aid of the Copyright cult. The Copyright cult wants us to think Digital Copyright is worth keeping, but that is a lie. Their shit does not smell like roses. This relic of a system, a legacy of the past, needs to go away in favor of a superior system more compatible with a modern digital world, with social justice and public access rights made an integral part of the design.

Queen Anne did not care about social justice. The founders of the Constitution did not care about social justice. They allowed slavery to be protected in the Constitution. For them it was social justice for the elites and screw the others. That’s it. Their values are not 21st-century values. Their values are not ours. So why should we preserve their legacy? Intellectual property is a big joke. Copyright is a big joke as well. It’s a mockery of the First Amendment right. It’s time to abolish Copyright for a more equitable system that gives more meaning to this amendment: Congress shall not pass laws to abridge the freedom of speech…

Why give more respect to an order of law when this order does not respect a higher order of law???

Time to abolish Copyright!

Arijirija says:

Re: Re: Re:

What incentive did the scribes in Anglo-Saxon England have to write down the entire Beowulf epic? What incentive did the ancient Vedic reciters have to memorize the entire Mahabharata, the Ramayana? What incentive did Virgil have to write the Aeneid?

For what very very little it is worth, the Aeneid was reportedly very popular in Rome at that time, and people actually could sing their favorite verses off by heart. Virgil never had a “copyright” as such on the Aeneid, but everybody in Rome knew who wrote it, and who their favorite epic reciters were, and undoubtedly Virgil did receive some honorarium from Caesar Octavius, self-titled Augustus.

Come to think of it, what incentive kept a fair number of singers in the UK and rural America singing some rather old ballads right up to the time they were written down and called the Child Ballads after their recorder, whereupon it was discovered just how old some of them were? And what incentive kept people making new ballads in the old mould for the same length of time? Without any copyright as such?

What incentive made a mother make up and sing a song about her absent son as recorded in Donald MacGregor’s Olo Language Materials, 1982? (And yes, I did know Donald MacGregor; I have read this book.)

Anathema Device (profile) says:

Re: Re: Re:2

“What incentive did the scribes in Anglo-Saxon England have to write down the entire Beowulf epic?”

They were almost certainly paid to do so as part of a pre-Norman court in England.

“What incentive did the ancient Vedic reciters have to memorize the entire Mahabharata, the Ramayana?”

Religious devotion

Your first two examples are not of creation, but of recording, so we can ignore them.

“What incentive did Virgil have to write the Aeneid?”

He was commissioned to by Augustus Caesar. In other words, he was paid.

“what incentive kept people making new ballads in the old mould for the same length of time?”

Entertainment. This has nothing to do with someone making a career out of writing. It’s the difference between fanfic and profic. You can do both for love, but you can only live on the latter.

Your examples are ridiculous. People will create, yes. But if every author who wants a living while they’re writing has to depend on the oligarchs of the day, then you’ll end up with a bunch of stuff which is probably okay if you don’t mind a strong bias to the material, access to which is controlled by the oligarch and not the creator.

If I write a book at the behest of King Charles III, he can lock it up in the Royal archives like so much art the royal family has looted over the centuries, and only offer access to amenable souls.

If I write a polemic on the evils of royalty at my own cost, I have the option of selling it to a newspaper, printing it and selling it through Amazon, or putting it on a website, or on Patreon. But the important thing is that I can do what I like and make my own profit from it, because I own the rights to it.

(I can’t do that with the journal papers I’ve written because oligarchs control academic press journals and demand I hand over my copyright to them. You’d be far better off screaming about the abuse in academic journal publishing and the restriction of access it exercises, than attacking poor old fiction authors much of whose work is available for one to five dollars. Access to a single paper in a scientific journal is usually in the region of $40 or more.)

Anonymous Coward says:

Re: Re: Re:3

Before the printing press, making a copy of a book was an expensive and time-consuming exercise. Indeed, one of the advantages of going to a university was access to books on one’s chosen topic, so that one could make one’s own copy. That made it almost impossible for an author to make money by writing a book, due to the labour needed to make each copy.

By the way, the scribes who wrote down many of the ancient works like Beowulf were not the original authors, but mere copiers of extant works, so whether they were paid or not has no bearing on whether the original story creators were paid.

Anonymous Coward says:

Re: Re: Re:2

What incentive did the scribes in Anglo-Saxon England have to write down the entire Beowulf epic?

If they weren’t paid to do so, then racial pride.

What incentive did the ancient Vedic reciters have to memorize the entire Mahabharata, the Ramayana?

If they weren’t paid to do so, definitely racial or religious pride.

What incentive did Virgil have to write the Aeneid?

100% paid for, if not, “muh legitimacy”. The Aeneid was fucking propaganda.

Come to think of it, what incentive kept a fair number of singers in the UK and rural America singing some rather old ballads right up to the time they were written down and called the Child Ballads after their recorder, whereupon it was discovered just how old some of them were? And what incentive kept people making new ballads in the old mould for the same length of time? Without any copyright as such?

I dunno, try “fucking slavery” if you’re talking about Black Gospel. Suffering has a way to create some of the most heartrending music.

Copyright’s outdated and stuff, yes. But without copyright, all you do is let corps and countries dictate what THEY want to do. DIRECTLY.

I don’t mind the violence that could ensue from such a thing, but to most people, would that be acceptable?

Anonymous Coward says:

Re:

Whether the work can be enjoyed by people (especially those without the means to purchase the same work again and again) has always been a non-goal

I’m looking at this from an American perspective, granted. That said, if you look at our Constitutional basis for copyright, you’ll find that this statement is patently false. Public domain is as close as you can get to ensuring wide access, short of actively investing to increase access. The original intent was, in part, to ensure that works could be widely enjoyed.

Anonymous Coward says:

Re:

Publishers and lawmakers simply need to accept that publishers don’t own the works that they publish.

If you look at the history of copyright, the Statute of Anne was written so that publishers could take ownership of works; that’s why copyright was made transferable, rather than a limited-time license for publishers to make copies.

Anonymous Coward says:

as formats change from djvu to daisy to epub1 to epub2 to epub3 to pdf-a and on and on

Isn’t this kind of the opposite of wearing out? I don’t know that my software supports epub3 or pdf-a, but when it comes to old formats, they work just fine. DjVu, GIF87a, Postscript, PDF 1.2, MP3, early Vorbis (though these were briefly broken till Monty fixed a bug), MIDI, MOD, text (via iconv).

It’s true that optical character recognition improves, but that has nothing to do with wearing out either. Re-OCRing is likely to improve an e-book; if not done, the book stays in exactly the same (possibly poor) condition as it started in.

“lucky if our digital books last a decade” is a serious exaggeration. If I put a book on 2 or 3 hard drives (not SSDs), store them for 20 years, and try to read them, I’ll almost certainly get one good copy. Having recently cleaned up a basement, the multiple decades-old computers down there booted up just fine and needed their data wiped before disposal. The “important” data had already been copied to one newer computer, then another, etc., and still works fine in Dosbox, Mikmod, and other “retrocomputing” software. Had I needed to copy anything, ISA/PCI ethernet cards (and FTP servers) aren’t hard to find and can interoperate with modern gigabit networks. For systems with no TCP/IP, they’ve probably got a serial port (that can talk to a USB-to-serial adapter on the “modern” end) and an installed Zmodem program.

It might be a pain in the ass to use a file from decades ago, whereas a book stored for a century is probably fine and can be immediately used. So it’s fair to say a digital collection may require more effort and attention. But unless we’re stupid enough to use DRM, the exact same bit sequence will be readable for a very long time, and software can trivially ensure copies are exact.
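As a rough illustration of that last point, here is a minimal Python sketch of one common way to check that a copy is exact: compare cryptographic digests of the two files. The function names and the choice of SHA-256 are assumptions made for this example, not something specified in the comment above.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original: Path, copy: Path) -> bool:
    """True when both files have the same SHA-256 digest.

    Equal digests mean identical contents, barring a hash collision.
    """
    return sha256_of(original) == sha256_of(copy)
```

Storing the digest alongside each copy would also let a later check detect silent corruption on a single drive, since a changed bit changes the digest.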

Anonymous Coward says:

Re: Re:

Most of these can be stored in e-mail or in the cloud.

Ah, yes, the cloud, in which services can be shut down on a whim (with 30-90 days’ notice if you’re lucky) and, more importantly, files can be deleted or accounts shut down over copyright claims. Never mind inactivity.

Every account-based service I used in the 1990s was dead by 2010. My iName “free e-mail for life” account, my Geocities web sites, and various others. I didn’t get 10 years from some of them. I know people who’ve had Gmail for 18 years now, but Google’s the last company I’d trust to keep something running over the long term.

I highly doubt you’d be able to just forget about a cloud service for decades and come back to find all your data intact.

Anathema Device (profile) says:

Re:

“a book stored for a century is probably fine and can be immediately used”

Yes, but a cheap paperback from 20 years ago is falling apart and the pages starting to rot.

I have ebooks in various formats going back at least that long, and they’re perfectly readable.

If you care about the environment, you should be demanding that books written purely as entertainment always be digitised where possible. Cutting down trees to produce what is essentially ephemera is immoral.

Very little that is currently printed is going to have any value to anyone in a hundred years, so why waste a valuable resource to make paper versions?

mechtheist (profile) says:

Re: Re:

People really need to stop saying ‘cutting down trees’ when it comes to construction and paper. They don’t cut down virgin forests to make paper or 2x4s, and haven’t for a pretty long time; they have tree farms for that. It’s cutting down old-growth trees that needs to stop. There is relatively little harm in harvesting trees grown for that purpose: the carbon price was already paid long ago, and paper and wood recycle well.

Anonymous Coward says:

Re: Re: Re: tree farms

Tree farms are monocultures, therefore subject to mass destruction by a single biological or insect attacker. It has happened in the past and is happening now.
The forests in the American west are under attack.
The Cavendish banana, the most popular variety worldwide, is under attack right now, and at risk of disappearing. The worldwide economic, social and political repercussions will be immense.
Monoculture is just plain bad.

Anonymous Coward says:

Re: Re:

Yes, but a cheap paperback from 20 years ago is falling apart and the pages starting to rot.

While it is true that paperbacks are not made with acid-free archival paper, storage method matters too.

I have pulps from 80 years ago that – with only a modicum of care – can still be read without problem. I have paperbacks from 50 years ago that are absolutely fine. Where do you purchase your paperbacks that rot in 20 years, and how do you store them?

On the other hand, digital storage media also “rots”. Magnetic domains on tapes and disks (hard and floppy) migrate. CD-Rs degrade. I’ve got examples of both, from 20 years ago and less.

Very little that is currently printed is going to have any value to anyone in a hundred years, so why waste a valuable resource to make paper versions?

… perhaps because the printed word on paper is an information transfer device that requires no special tools to read. Does not require exotic and rare elements to store or process. (Care to name how many such elements are used in cell phone construction?) Does not require outrageous legal permissions to use a copy that comes into your hands. And as noted by previous commenters, “no such valuable resource is present here”. Should I go on?

Anathema Device (profile) says:

Re: Re: Re:

“Where do you purchase your paperbacks that rot in 20 years, and how do you store them?”

From a bookstore, and in a bookshelf. I’m not building a special refrigerated room to keep my cheap books in. It’s bad enough I’ve moved house with them so many times! (Another reason I am getting rid of paper books and replacing them with digital.)

Anonymous Coward says:

Re: Re:

but a cheap paperback from 20 years ago is falling apart

A well-used library book, perhaps, but one that was read once and then sat on an indoor shelf for 20 years? My parents used to have lots of cheap paperback books, stored in no special way, till they got rid of them when the oldest were 50 years old. I don’t recall anything like that; any book I ever took off the shelves was readable.

Anathema Device (profile) says:

Re: Re: Re:

“one that was read once and then sat on an indoor shelf for 20 years?”

If you’re only reading the book once, isn’t that a strong argument for not having a paper version? As I get older, screens are much better for me for reading, and I never pick up paper versions of books any more. I’ve replaced them all with digital.

The books I’m talking about were bought new. Some of them are actually hardbacks, but the paper degrades just the same. I’m in Queensland, Australia where the humidity is high, and not kind to paper products.

Apart from all that, I’ve moved house and countries several times, humped hundreds and hundreds of books back and forth, and I am now saying ‘no more’. I won’t leave a pile of decaying paper to my inheritors, and they won’t want the books in any condition. Good books will survive in some form, and the bad ones will disappear, as is right, to make room for better ones. I despise the idea that books should be kept like religious icons – the value of the book is the content, not the container.

Anonymous Coward says:

Re: Re: Re:2

If you’re only reading the book once, isn’t that a strong argument for not having a paper version?

Well, let’s say everyone in a family reads it once, or you read it every year. In my experience, such a book will still be in fine condition, not “decaying” unless you’ve had a flood or something. I leave it to you to decide whether it makes sense to have such a collection; the one I’m thinking of mostly pre-dated e-books, and was sold exactly because the owner was moving house (for the first time in decades). Stay away from DRM, anyway.

Anonymous Coward says:

Re: Re:

There is one market of books that idea doesn’t really work out for logistically: books for young children (i.e. babies and toddlers).

We would be effectively giving breakable electronics (most likely with glass screens) to small humans still developing their strength and motor skills.

The idea would lead to increased environmental impact in the form of both e-waste from broken electronics and the making of replacement electronics.

This comment has been deemed insightful by the community.
Anonymous Coward says:

I can see copyright being useful.

But in my opinion, the current duration of Copyright is the main problem. In the USA, Copyright was originally for 14 years with an option to renew for an additional 14 years, giving a maximum duration of 28 years for Copyright. This was back in 1790. Now, the ability to transmit information since then has greatly increased. To illustrate, in 1872, the novel “Around the World in 80 Days” was published. The timeframe of 80 days was selected since it was possible, yet extremely difficult to accomplish the feat of circumnavigating the world. Today, we have some people circumnavigating the world every 90 to 93 minutes (ISS). And if we’re talking about just information, data can be transmitted in less than 170 milliseconds. With that greatly increased ability to transmit and copy information, why do we need a copyright that lasts 70 years past the death of the author? Roll back the duration of Copyright back to something reasonable, and many of the problems it causes will go away.

This comment has been deemed insightful by the community.
PaulT (profile) says:

Re:

“why do we need a copyright that lasts 70 years past the death of the author?”

Because corporations can’t push for literally infinite copyright, so they’ve had to settle for a version that’s effectively infinite, in the sense that most people won’t live to see art created when they were alive enter the public domain.

Fortunately, enough people are preserving art that’s not commercially viable to those corporations that it still exists, even if they have to break the law to do so.

“Roll back the duration of Copyright back to something reasonable, and many of the problems it causes will go away.”

Indeed, but the people paying to write the laws don’t want reasonable. They want infinite profit from what they own, and zero competition from things they don’t. So, that won’t change until the tie between politicians and lobbyists is broken.

Anonymous Coward says:

Re:

Well, that depends on the format; just try reading a Word document from the DOS days or early Windows versions. Word will choke on them. Plain text, or text with embedded plain-text markup such as HTML, is the best format: that is, any format where it is easy to separate the content from the markup by inspection in a text editor, and the content can be read normally.
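To make the point about separating content from markup concrete, here is a small Python sketch using only the standard library's `html.parser` module. The class and function names are hypothetical, chosen for this example, and the approach is deliberately naive: it keeps every run of text between tags, including the contents of any `<script>` or `<style>` elements.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect the text content of an HTML document, discarding the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called by the parser for each run of text between tags.
        self.parts.append(data)


def strip_markup(html):
    """Return the text of an HTML string with the markup removed."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


print(strip_markup("<p>Call me <em>Ishmael</em>.</p>"))  # prints: Call me Ishmael.
```

Nothing comparable is possible with an opaque binary format unless its structure is documented, which is the commenter's argument for plain-text markup.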
