Dear Internet, We Need Better Image Archives

from the The-Public-Domain-should-be-Public dept

Cross-posted from ninapaley.com

Dear Internet,

You know what should be really easy to find online? Good quality, Public Domain vintage illustrations. You know, things like this:

Hats / chapéus


I found this on Flickr, where someone claims full copyright on it. That's copyfraud, but understandable because Flickr's default license is full copyright (all the more reason to ignore copyright notices!). But copyfraud isn't not the main problem. The main problem is that images like this are painfully difficult to find online, especially at high resolutions (and this image is only available at medium resolution - up to 604 pixels high, which is barely usable for most purposes but higher than much of what you find online).

The images are out there - and with zillions of antique books being scanned, their vintage illustrations are being scanned right along with them. But the images are buried in the text, and often the scan quality is poor. Images should be scanned at high quality, and tagged for searchability.

Are archives ignoring the value of images?

Take the American Memory archive of the Library of Congress. Lots and lots of historical documents here, but no way for me to find an image of, say, a horse.

Most book-scanning projects focus on texts, not illustrations. Many interesting and useful illustrations are buried within these scans, uncatalogued and inaccessible. Scan quality is set for text, not illustrations, so even if one can find a choice illustration buried within, its quality is usually too low to use.

Archive.org is great (I love you, archive.org!) but does not have an image archive. Still images are not among their "Media Types" (which consist of Moving Images, Texts, Audio, Software, and Education). So I went spelunking through their texts, starting with "American Libraries," and searched for something easy: "horse." Surely I could find a nice usable etching of a horse in there somewhere. I eventually found "The Harness Horse" by Sir Walter Gilbey, from 1898.



Nice illustrations! Can I use them? Unfortunately, no. The book is downloadable as PDF and various e-publication formats, but when I try to extract the illustrations, I get a mess (which you can see, after the jump):


Copied and pasted from Adobe Acrobat. WTF?



The same image, inverted. Doesn't work.



"Save Image as..." from Acrobat. This worked, except where it didn't: part of the image is simply missing.


Clearly something is messed up here. Was it just that page? Alas, no:


This sad image from another page has the same problem.


The scans have some flaws that PDFs and Photoshop can't cope with:


Screen grab of zoomed-in view from Acrobat. What looks like a blur in the PDF renders the image unusable when extracted.


These images are not usable, which is a pity because they are very nice illustrations. And they seem to be among the higher quality scans, which again isn't saying much.

Let me add that it's great these books are being scanned at all! That's definitely better than losing them entirely. But as an artist, it saddens me that we're neglecting this wealth of visual art. I'd like to see our rich visual history properly archived. Our bias favoring text over pictures is especially ironic considering how much more efficiently information is communicated to humans through images; "A picture is worth a thousand words," or more. That's why I'm a cartoonist, after all.

I was able to extract one clean image from the book, on page 48:



Unfortunately I can't use this illustration for my purposes, but maybe someone else can. I've already gone through the trouble of finding it in a text, extracting it, and rotating it. If only there were some image archive I could upload it to at high resolution, so someone else could use it. I could tag it, to make it easier to find. I could include all kinds of useful metadata, like what book it was from and when it was published; but even if that was too bothersome, I could at least include tags like "horse," "rider" and "engraving." Wouldn't it be nice if such an archive existed? Wikimedia Commons is close, although I dread uploading things there after having all my open-licensed comics deleted by an overzealous editor. But maybe they're our best hope.

Continuing my searches on archive.org, I found this ostensibly Public Domain, vintage horse book with line illustrations. Unfortunately this is controlled by Google Books. It's "free" to read online in Google's reader, which doesn't allow any image export. It also doesn't allow me to zoom in.



All those illustrations, trapped at low resolution, unusable (even if they were tagged/catalogued, which they aren't). This is our "Public Domain." Who exactly is benefiting from having these 18th Century illustrations inaccessible to today's artists?

Then there's Dover Books. I loved Dover books growing up - they introduced me to the idea of the Public Domain. Dover reproduces vintage illustrations in books for artists and designers. Their paper books were reasonably priced, and you could use the illustrations for anything, without restriction. Browsing was free, so I would flip through the pages in the book store, and if it had what I needed, I'd buy it.



Dover is still selling books, but the prices are now relatively high, few are carried in bookstores, and they prohibit browsing online. You have to shell out $15 to find out if what you need is in the book, and how could you know? They seem to be clinging to an outdated copyright model, and rather than selling things of added value, they are simply blocking access to existing Public Domain works, in order to collect a toll.

What else has kept a good public archive of Public Domain images from existing? Some artists and archivists do make high quality scans of vintage illustrations - and keep them to themselves. I guess we could call this "image hoarding." I assume the reasoning is, "I went through all the trouble to scan it, why should I share? Others can pay me if they want a copy." Also there's the "finders, keepers" reasoning: "anyone else is free to find the same illustration in another antique book, but I found this one, so it's mine." And so these images remain inaccessible, not part of any public archive.

Wikimedia Commons is the best public image archive I know of right now. A bit of searching led me to their "Engravings of Horses" category, which yielded some nice images. Unfortunately, many of these are not available at sufficiently high resolutions.

File:Fotothek df tg 0005647 Nutztierhaltung ^ Tiermedizin ^ Pferd ^ Krankheit.jpg


The maximum size of this image is 800 × 608 pixels, which limits its use. Limited image sizes and limited selection have been the biggest obstacles to my relying more on Wikimedia Commons; but it can get better. Maybe it will. It would be nice if something became the public vintage image archive I and so many other artists need.


Reader Comments


  •  
    Anonymous Coward, Sep 29th, 2011 @ 2:55pm

    I bet if someone could charge monopoly rates for access to high-res images of such public domain works, you wouldn't have as much trouble finding them.

    Just sayin'

     


    •  
      Richard (profile), Sep 29th, 2011 @ 3:08pm

      Re:

      Right then - find me a hi res image from a point in time "just" inside copyright.

       


    •  
      crade (profile), Sep 29th, 2011 @ 3:15pm

      Re:

      You could charge monopoly rates for access to them if you want..

       


    •  
      The Incoherent One (profile), Sep 29th, 2011 @ 3:30pm

      Re:

      In that case you would not be charging for the photograph, but the service that you provide to distribute them.

       


    •  
      Marcus Carab (profile), Sep 29th, 2011 @ 4:10pm

      Re:

      I bet if someone could charge monopoly rates for access to high-res images of such public domain works, you wouldn't have as much trouble finding them.

      That's true, but then they would cease to be public domain, which kind of kills the point of the whole thing.

      Still, I do see what you're saying. But what I don't understand is why your mind goes straight to selling "access" to non-scarce goods - the hardest thing to put a price on in the digital era. It's like you have a mountain full of gold, and instead of mining it you decide to tax everyone in the countryside for looking at it. Why not focus on selling the scarcities? There are a few that seem obvious right off the bat:

      - High-res scanning services
      - Manual vectorization services
      - Archive/library/research services ("I need you to find me high-res engravings of horses, here are my requirements...")
      - Printing/mounting/framing/canvas-transfer/etc (we are talking about a wealth of artwork just waiting to be tapped)

      I bet there are more too - including some pretty clever and disruptive ones. But to figure them out you'd have to put your mind to it, instead of relying on outdated laws in the hopes of barely lifting a finger.

       


      •  
        Anonymous Coward, Sep 29th, 2011 @ 4:27pm

        Re: Re:

        I fail to see how selling any of the services you refer to would result in greater access to high-res copies of public domain works (though they may very well be a good business for the proprietor).

        I'm not trying to make an argument for doing away with the public domain, but I find it amusing that Nina would, without a hint of irony, bemoan how hard it is to get access to good quality copies of things in the public domain.

        I think the incentive not just to create, but to distribute, market, publicize, etc. works is one of the most often underappreciated incentives of copyright protection.

         


        •  
          Anonymous Coward, Sep 29th, 2011 @ 4:31pm

          Re: Re: Re:

          Have you tried Omemo?

          Also she can always, always make a torrent file and distribute that, and as long as someone shares it, it will never die.

           


          •  
            Anonymous Coward, Sep 29th, 2011 @ 4:33pm

            Re: Re: Re: Re:

            No. What is Omemo?

             


            •  
              Anonymous Coward, Sep 29th, 2011 @ 4:41pm

              Re: Re: Re: Re: Re:

              http://en.wikipedia.org/wiki/Omemo

              A social network storage.
              The biggest hard drive on earth.

               


              •  
                Anonymous Coward, Sep 29th, 2011 @ 5:05pm

                Re: Re: Re: Re: Re: Re:

                Ok, that's cool, but that's just a tool, like a scanner. The fact that it could potentially be used to store hi-res scans of old images and make them available to the masses doesn't mean it is being used that way, or will be.

                 


                •  
                  Anonymous Coward, Sep 29th, 2011 @ 5:18pm

                  Re: Re: Re: Re: Re: Re: Re:

                  Well, the tool makers did their part; now it's up to people to start using those tools :)

                  Besides, if people really want to store images they can find solutions.

                  Make a high resolution video of still images and upload to archive.org, upload to PD.org, upload to the other dozen websites that accept PD material, use Flickr and let search engines index those images, use distributed storage, use distributed websites that are hard to kill by any government unless they can remove all copies from all over the world.

                  Did you see the size of the list of places where one can find PD material?
                  Two years ago you could count them on your fingers; now it has grown to more than a hundred places.

                  You know what that means right?

                   


        •  
          Anonymous Coward, Sep 29th, 2011 @ 4:35pm

          Re: Re: Re:

          You don't want others to know that there are free alternatives to your way of incentivizing things, because people are already doing it in a variety of forms.

          - Using social image-sharing websites like Flickr.
          - Torrents.
          - Dedicated PD websites, of which there is more than one.
          - Creating a movie with still images and uploading it to archive.org.
          - Using distributed storage solutions.

          The real problem is not that there are no solutions; the problem is that there is no marketing involved, so few people know about them.

          But with others bringing attention to the issue that soon may change.

           


          •  
            Anonymous Coward, Sep 29th, 2011 @ 5:07pm

            Re: Re: Re: Re:

            Um...you have no idea what I want. Check your prejudices.

            "The real problem is not that there are no solutions, the problem is that there are no marketing involved so few people know about it."

            I agree that's a problem. That's one of my points.

             


        •  
          Marcus Carab (profile), Sep 29th, 2011 @ 4:53pm

          Re: Re: Re:

          I fail to see how selling any of the services you refer to would result in greater access to high-res copies of public domain works (though they may very well be a good business for the proprietor).

          - Scanning and vectorizing: customers pay to have this done so they can be the first to use something that has never been digitized (an illustration they've found in a book, for example), but the work is PD so once the deed is done it increases access for everyone. Thus a core group of people who hunt out source material incrementally increase the wealth of quality PD images by funding digitization on an as-needed basis

          - Archive/library/research: if there proves to be a demand for such services (and I'm not saying I'm certain, but it seems likely) then naturally that incentivizes whoever offers those services to constantly increase their database of PD works, in order to make curation all the more valuable.

          - Printing: again, if there proves to be a demand for printed reproductions of PD artwork, that incentivizes the creation of larger databases of such artwork and increased access to those databases in order to drive sales

           


          •  
            Nina Paley (profile), Sep 29th, 2011 @ 5:50pm

            Re: Re: Re: Re:

            Vectorizing etchings is a bad idea. All those lines and vertices add up to big memory-heavy files that crash graphics programs. Simple line art is better vectorized, but if it has a lot of hatching, high res raster images are superior.

             


            •  
              Anonymous Coward, Sep 29th, 2011 @ 6:42pm

              Re: Re: Re: Re: Re:

              Well, I could see vectorizing images having a purpose. I once had Inkscape do this automatically with an xkcd comic, to use as tool paths for the CNC machine. How else am I supposed to read xkcd in plywood?

               


              •  
                Nina Paley (profile), Sep 29th, 2011 @ 7:28pm

                Re: Re: Re: Re: Re: Re:

                XKCD doesn't have a lot of hatching. Old etchings do. It really depends on the type of image. Simpler line art works great as vectors! Most of my own illustrations work better as vector art. But things with hatched shading are much less manageable as vectors than as raster images. Old etchings have a lot of lines.

                 


          •  
            Nina Paley (profile), Sep 29th, 2011 @ 5:52pm

            Re: Re: Re: Re:

            That said, properly scanning and preparing black and white line art does require some skill, so I'd just say "scanning and cleaning up" instead of "scanning and vectorizing."

             


            •  
              Marcus Carab (profile), Sep 29th, 2011 @ 9:57pm

              Re: Re: Re: Re: Re:

              Well, actually I meant both scanning and vectorizing as two entirely separate things - as in my original comment :) I do know what you are talking about with vectorized etchings - however I have found that it's not so bad depending on what level of detail you are talking about, and can still have certain advantages. However, I also wasn't really limiting my thoughts to this specific type of etching, but public domain artwork in general.

               


            •  
              Marcus Carab (profile), Sep 29th, 2011 @ 10:01pm

              Re: Re: Re: Re: Re:

              btw, www.vintagevectors.com has some cool stuff - but it's a very small collection, not a big archive. Fun though - and a pretty good gauge of what we are talking about, because some of the vectorized etchings are indeed too detailed and hard on the computer, while others are quite nice to work with (though of course this depends to some degree on the computer in question, too)

               


              •  
                Nina Paley (profile), Sep 30th, 2011 @ 6:07am

                Re: Re: Re: Re: Re: Re:

                Thanks for the link to vintagevectors.com. I downloaded this caloric engine illustration. The vector file is 12.6 MB. The high-res Photoshop file is 11.3 MB. This is a fairly typical etching, and the vector version is larger than the (large) raster. It would be a monster to work with in any of my vector graphics programs; I'd get minutes of the "spinning rainbow" if I tried to edit it. Of course this isn't true for all images, and I respect vintagevectors.com's handling of their images - breaking them into smaller chunks, for example, as in the case of these highly detailed border parts.

                 


        •  
          Ed C., Sep 30th, 2011 @ 7:29am

          Re: Re: Re:

          Wait, I thought the argument for copyright was that without protection, works could be copied with reckless abandon? Actually, if you thought about it for even a moment, you would realize that copyright has nothing to do with the need to distribute, market, publicize, etc.; everyone has those costs, whether they own the copyright or not. The only difference that copyright makes is the need to pay someone else before doing any of the above.

           


    •  
      Anonymous Coward, Sep 29th, 2011 @ 4:32pm

      Re:

      No need. We didn't need monopolies for archivers to catalog every other kind of public domain work, no reason why we can't do the same for images.

      In fact, resources are listed in the comments below that do this very thing.


      Yet the maximalists will still try to claim that this is a failure of the sharing model. *sigh*

       


      •  
        Anonymous Coward, Sep 29th, 2011 @ 4:37pm

        Re: Re:

        I think sharing PD works is great. I just think you're not likely to get as much effort put into such work from nonprofit goodwill toward man as you would from cold, hard, greedy, money-grubbing profit.

        Hell, you can't escape popular music. It's everywhere. I don't think that's because of its inherent qualities. I think that's because people stand to make a lot of money by making such music popular.

         


      •  
        bob, Sep 29th, 2011 @ 6:27pm

        Actually there is a reason

        Images are one of the older data formats on the web, much older than music or video. If such a repository were going to be created by the goodwill of people, it would already exist.

        But there's a problem here. Scanning takes work. It's not like ripping a file or copying it with an automated program. Someone has to pick something up and put it in a box. It's much harder to share these things.

        And don't be fooled by the vast array of pirated material. There are reasons to believe that companies seed it to increase their revenues from people who pay for access to the pirated material. They're running stores, they're just not sharing anything with the artists. Many of these aren't the grassroots efforts that you would like to believe.

        Trust me. Big Piracy is a big business. If they want people to come back each month and pay for access, they've got seeders putting up the stuff.

        Also don't be fooled by the existence of open source software. Many of the companies that release software to the open source stacks are doing it for selfish reasons. They share with other programmers because they hope that the other programmers will do some of the work and share the development costs. It often works for some areas.

        But don't think it works for all. There are precious few open source games and it looks like open source productivity software could be heading south now that Sun/Oracle is giving up on getting anyone to pay.

        Face it. The cases when people share successfully are rare. I'm pretty sure that the free scanning services that Mike would like aren't going to appear any time soon.

         


    •  
      bob, Sep 29th, 2011 @ 6:42pm

      Re:

      What's great is that no one is picking up the sarcasm here. They're just assuming...

       


    •  
      Anonymous Coward, Sep 30th, 2011 @ 7:44am

      Re:

      They can. You can do whatever you want with public domain works; that's what makes 'em public domain.

       


  •  
    Anonymous Coward, Sep 29th, 2011 @ 3:06pm

    Maximize view on screen. PrtScr. Paste into Photoshop. Crop. Good enough for website or inclusion in small media.

     


  •  
    Richard (profile), Sep 29th, 2011 @ 3:06pm

    Eh

    I picked up "The Harness Horse" and selected an illustration at random.

    Using Foxit Reader's snapshot tool I got a pretty good image out of it (636x451), with none of the problems you encountered.

     


  •  
    iamtheky (profile), Sep 29th, 2011 @ 3:20pm

    But copyfraud isn't not the main problem.

    Aha! So it is the main problem.

     


  •  
    Marius, Sep 29th, 2011 @ 3:21pm

    You might want to try Adobe Acrobat Professional.

    It has an export function which would allow you to export each page (as shown on screen) to a PNG, TIFF, JPG or other formats at various DPI values.

    For example, you could export it to a 600dpi PNG file, which would probably give you a 4000 by 2000 image or even larger.

    I know I've used this to export scanned newspaper articles, to import them in OCR software later on (these programs require high dpi images)

    It is an expensive piece of software, though; other PDF readers may be able to do the same thing, but I can't vouch for any.
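
As a rough illustration of the arithmetic behind that estimate: the pixel dimensions of an exported page are simply the page size in inches multiplied by the DPI. The Python sketch below assumes a hypothetical 5 x 8 inch page; the function name and the page size are made up for illustration and are not measurements of any actual book.

```python
def export_pixel_size(width_in, height_in, dpi):
    """Return (width_px, height_px) for a page of the given size in inches
    rendered at the given dots per inch."""
    if width_in <= 0 or height_in <= 0 or dpi <= 0:
        raise ValueError("page size and dpi must be positive")
    return round(width_in * dpi), round(height_in * dpi)


if __name__ == "__main__":
    # Hypothetical page size, chosen only to show how DPI scales the output.
    for dpi in (72, 300, 600):
        w, h = export_pixel_size(5.0, 8.0, dpi)
        print(f"{dpi} dpi -> {w} x {h} px")
```

Note that exporting at a higher DPI than the original scan only interpolates existing pixels; it cannot recover detail the scan never captured.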

     


  •  
    cjstg (profile), Sep 29th, 2011 @ 3:37pm

    lessons learned

    i completely agree with you. but i am about to do something that a mentor used to do to me until i figured how to keep my mouth shut.

    nina, if you want it that badly, then do it. maybe you can even figure out a way to make money off it. but don't expect us to do it.

     


  •  
    Anonymous Coward, Sep 29th, 2011 @ 3:37pm

    Google has options to search images by license type and size.

    http://www.google.com/support/bin/answer.py?answer=29508

    http://images.google.com/search?as_st=y&tbm=isch&hl=en&as_q=horse&as_epq=&as_oq=&as_eq=&as_sitesearch=&cr=&safe=off&btnG=Search+images&tbs=isz:lt,islt:svga&gbv=1&ei=ze6ETtSBN9HtsgavhaHhAQ

    For uploading, and thus archiving, those findings:

    http://pddepot.com/

    Found it here
    http://meta.wikimedia.org/wiki/Help:Public_domain_image_resources

    There are apparently several websites dedicated to PD images, and some let users upload images.

    But there are other solutions too: you could use Flickr and let search engines do the work, like this one.

    http://www.everystockphoto.com/about.php

    It indexes only images with liberal licenses that let you use it or PD, at least that is what it says.

     


  •  
    out_of_the_blue, Sep 29th, 2011 @ 3:38pm

    "I'd like to see our rich visual history properly archived."

    You need an archivist. But as I pointed out in one of the JSTOR threads, the advocates of "free" here aren't willing for librarians or archivists to get any income from their services. They'd rather sneak in and "liberate" the data, oblivious to the efforts of scanning and classifying: JSTOR can beg for contributions. -- You should re-consider your stance on "free", as you're simply wishing for people to give their time for your possible convenience.

    Not intended as overly personal or emphatic. I'm sure you're a "good" person, in short. But those services don't come for free.

    My solution to the problem of paying archivists and librarians is gov't subsidies, and the argument concludes that it'd be far better spent than on killing people in needless wars.

     


    •  
      Anonymous Coward, Sep 29th, 2011 @ 3:46pm

      Re: "I'd like to see our rich visual history properly archived."

      ...or you go look here for sources and places to upload free content.

      http://meta.wikimedia.org/wiki/Help:Public_domain_image_resources

       


    •  
      Nina Paley (profile), Sep 29th, 2011 @ 5:48pm

      Re: "I'd like to see our rich visual history properly archived."

      the advocates of "free" here aren't willing for librarians or archivists to get any income from their services.

      Actually I'd very much like some of the funding that currently exists for text, audio and motion picture archives to go towards making a PD image library, at least of black and white line illustrations, etchings, engravings and woodcuts. People need to be paid for something like that to work. The Library of Congress pays its staff; most archives have professional staff that are paid. But image archiving isn't valued the way text archiving is, and so it isn't funded as well. I assume most funders just don't think there's a need for it. I'm pointing out that yes, there is a need. A funded archive could include contributions from unpaid participants as well, but I don't think a proper image archive is going to happen without some real money.

       


      •  
        bob, Sep 29th, 2011 @ 6:41pm

        Re: Re: "I'd like to see our rich visual history properly archived."

        I think you're just being cynical. If the images just manage to Connect with their Fans and give them a real Reason to Buy, the Image Archive is going to be flying high! Don't be cynical and talk about money. This is the Internet. All of the cool dudes are going to be running to copy this stuff for you because they're so grateful that they were able to snarf some free MP3s. Yup. That's how the web rolls all right.

         


        •  
          Ninja (profile), Sep 30th, 2011 @ 6:55am

          Re: Re: Re: "I'd like to see our rich visual history properly archived."

          Fail. Missed the point by a few hundred light years, troll.

          But yes, I'd be one to donate money for such a service. I've helped other archives already. Unfortunately the donations have to fit my earnings and expenditures or I'd give them more =(

           


    •  
      PaulT (profile), Sep 30th, 2011 @ 3:08am

      Re: "I'd like to see our rich visual history properly archived."

      "the advocates of "free" here aren't willing for librarians or archivists to get any income from their services"

      The problem is, when you start out from an assumption that's unmitigated bullshit, the conclusions you draw from that tend to be bullshit as well...

       


  •  
    out_of_the_blue, Sep 29th, 2011 @ 3:43pm

    PDF is the most convoluted, horrible format possible.

    Why your extraction results are random.

    Don't get me started on Adobe and its insane format. If there's one company and format that should be literally outlawed, that's it.

     


    •  
      Marcus Carab (profile), Sep 29th, 2011 @ 5:14pm

      Re: PDF is the most convoluted, horrible format possible.

      If there's one company and format that should be literally outlawed, that's it.

      Personally I think PowerPoint is worse. And I don't have any problem with Adobe as a whole. But dangit blue, for once I completely agree with you on something: PDFs are a pain in the ass.

       


    •  
      Nina Paley (profile), Sep 29th, 2011 @ 5:55pm

      Re: PDF is the most convoluted, horrible format possible.

      I agree with OOTB on this one, too. Kumbaya!

       


  •  
    Anonymous Coward, Sep 29th, 2011 @ 3:43pm

    Also there is an index on the Gimp, yay!
    http://gimp-savvy.com/PHOTO-ARCHIVE/

     


  •  
    Anonymous Coward, Sep 29th, 2011 @ 3:49pm

    Another one for sharing free photos and images.
    http://www.freephotos.se/

     

    reply to this | link to this | view in chronology ]

  •  
    identicon
    Anonymous Coward, Sep 29th, 2011 @ 3:50pm

    A list of websites that offer public domain material, or let you upload it, for free.
    http://www.xpase.net/index.php?category=public%20domain

     

    reply to this | link to this | view in chronology ]

  •  
    identicon
    Anonymous Coward, Sep 29th, 2011 @ 4:01pm

    Now if people want distributed storage on the cheap to store petabytes of information maybe things like OMEMO are the solution.
    http://www.omemo.com/

    It is encrypted and fairly anonymous.

    http://en.wikipedia.org/wiki/Omemo

    There are other projects that are less known like Osiris Serverless Portal that don't even depend on DNS on the normal internet.

    http://en.wikipedia.org/wiki/Osiris_%28Serverless_Portal_System%29

    Microsoft has their own distributed tech too.
    http://en.wikipedia.org/wiki/BitVault

    Tahoe-LAFS
    http://en.wikipedia.org/wiki/Tahoe-LAFS

    http://en.wikipedia.org/wiki/GlusterFS

     

    reply to this | link to this | view in chronology ]

  •  
    identicon
    Anonymous Coward, Sep 29th, 2011 @ 4:07pm

    best pd archive on the net:
    http://www.artrenewal.org

     

    reply to this | link to this | view in chronology ]

  •  
    identicon
    Anonymous Coward, Sep 29th, 2011 @ 4:09pm

     

    reply to this | link to this | view in chronology ]

  •  
    icon
    Brock Phillimore (profile), Sep 29th, 2011 @ 4:11pm

    I have a family bible that's over 100 years old. It's nearly a foot thick with over 1500 pages and as many illustrations. It has both the old and new testaments side by side and a huge concordance (index) at the back.

    The last copyright I can find in it is from 1890. Could I scan the pages and put them into some public domain web site or does copyright still stop me from doing that?

     

    reply to this | link to this | view in chronology ]

  •  
    identicon
    bob, Sep 29th, 2011 @ 4:22pm

    Can you say "death of the commons"?

    I can.

    So Mike, why don't you step up and run this? Isn't that how it's supposed to work? You've got a great idea. You've got the vision. So jump on it! I'm happy to support you by using it to illustrate my blog and skip paying real artists. I'll even fill out some review somewhere that says you're really great.

    Oh what? You need some help doing the work. Don't worry. I'm sure someone's going to step forward. There are tons of cool artists and they're all pissed off at the man and those big corporate machines that only give them a small percentage of what their art is worth. They'll be rushing over to help you because zero is somehow better than a small percentage.

    So stick it to Dover and their outdated business model. Show us how it's done for free. Maybe you can get Silent Bob to tell you how to create a paywall and collect a toll without calling it a paywall or a toll. Yeah. That's the ticket. Just change the words.

    But whatever you do, blame evil copyright for creating this death of the commons. I know that all of the public domain work is free of copyright and it's going to take a bit of work to actually blame copyright, but I'm sure that somehow you'll find a way to blame Righthaven or the RIAA or the artists who somehow want to cash a check from time to time.

    Go for it. Then get to work showing us how it's done.

     

    reply to this | link to this | view in chronology ]

    •  
      identicon
      Anonymous Coward, Sep 29th, 2011 @ 4:29pm

      Re: Can you say "death of the commons"?

      You are afraid aren't you?

       

      reply to this | link to this | view in chronology ]

    •  
      identicon
      Anonymous Coward, Sep 29th, 2011 @ 4:30pm

      Re: Can you say "death of the commons"?

      "skip paying real artists"

      yeahh, 150 years old artists

       

      reply to this | link to this | view in chronology ]

    •  
      identicon
      Anonymous Coward, Sep 29th, 2011 @ 5:15pm

      Re: Can you say "death of the commons"?

      Why would he pay dead artists whose works are public domain?

       

      reply to this | link to this | view in chronology ]

    •  
      icon
      ChurchHatesTucker (profile), Sep 29th, 2011 @ 6:19pm

      Re: Can you say "death of the commons"?

      So stick it to Dover and their outdated business model. Show us how it's done for free.

      OK.

      I've got some Dover CDs. Let's take the scans, re-title them, and add metadata. The US has no 'sweat of the brow' BS, and we'd be making our own compilation, so we should be good to go.

       

      reply to this | link to this | view in chronology ]

      •  
        identicon
        bob, Sep 29th, 2011 @ 6:36pm

        Re: Re: Can you say "death of the commons"?

        Hey, I'm a big believer in getting you to do the work for me and give it to me for free. I'll be sitting here with a gin and tonic cheering you on. Go for it! You're doing great! We're all rooting for you!

         

        reply to this | link to this | view in chronology ]

        •  
          identicon
          Anonymous Coward, Sep 29th, 2011 @ 6:52pm

          Re: Re: Re: Can you say "death of the commons"?

          Of course you are drunk, when was the last time you saw any drunk do any work in their life?

           

          reply to this | link to this | view in chronology ]

  •  
    identicon
    Anonymous Coward, Sep 29th, 2011 @ 4:46pm

    How to store anything and make it fast to deliver to everybody on earth?

    Use Gmail or other email accounts, overlay a virtual filesystem on them and let those servers do the distributing.

    http://en.wikipedia.org/wiki/GmailFS

     

    reply to this | link to this | view in chronology ]

    •  
      icon
      RobShaver (profile), Sep 29th, 2011 @ 7:04pm

      Unfortunately ...

      "Unfortunately the GmailFS project has come to an end. libgmail has ceased being maintained by its developers, and as a result libgmail no longer works with the latest Gmail interface (and has not done so for many weeks). Without a working libgmail, GmailFS does not function, so the end of libgmail also spells the end of GmailFS."

       

      reply to this | link to this | view in chronology ]

      •  
        icon
        RobShaver (profile), Sep 29th, 2011 @ 7:06pm

        Re: Unfortunately ...

        Oh, and "Note that Google's terms of use prohibit the use of their services by any automated means or any means other than through the interface provided by Google. These restrictions would make use of GmailFS a direct violation of the Service agreement."

        As we know from reading TechDirt, violating any company's terms of service is now a criminal offence ... so maybe it's good that GmailFS doesn't work any more.

         

        reply to this | link to this | view in chronology ]

  •  
    identicon
    Rekrul, Sep 29th, 2011 @ 5:08pm

    That Harness Horse PDF is bizarre. None of the freeware image extraction programs I have can extract the images correctly. A couple of them won't even open it. The only program that seems to work on it is "PDF To JPG"

     

    reply to this | link to this | view in chronology ]

  •  
    identicon
    Anonymous Coward, Sep 29th, 2011 @ 5:42pm

    seems the complaint is someone else hasnt done the work to make the scans you want, but you cant be bothered to go scan them yourself, quit complaining

    dover doesnt have an outdated business model, those people do this thing called 'work' to make those PD images useable for others, if they posted them all online so you can "see" them first, people would just copy them and not pay dover for the time they invested in making them, but you dont seem to care about that

     

    reply to this | link to this | view in chronology ]

  •  
    identicon
    Anonymous Coward, Sep 29th, 2011 @ 6:08pm

    Another option: get a 3D model that is CC0, pose it, and have your horsie.

    http://www.blendswap.com/3D-models/characters/alex-2-5-2/

    Other 3D models that are PD.
    http://www.blendswap.com/?license=cc0

     

    reply to this | link to this | view in chronology ]

  •  
    icon
    RobShaver (profile), Sep 29th, 2011 @ 6:43pm

    I had no problem ...

    getting a fairly nice high resolution copy of that horse.

    I'm not disagreeing with you about the Public Domain ... I do think it's a travesty that nothing goes into the Public Domain any more ... in fact stuff is getting removed.

    Here's a little video of the first way I tried (which I created using the free version of Jing):
    http://screencast.com/t/PFQ8Kt1RNd

    Next I downloaded the PDF and opened it in Adobe Acrobat Reader v.10.1.1. I went to the page, rotated it 90 degrees and then used Jing again to capture a still image which you can see here:
    http://screencast.com/t/G6amW0xG6

    Next I futzed around using Adobe Reader to make the picture as big as I could on my screen. Then I used Jing to capture this larger image and saved it to my local disk. I opened it in Gimp and found that the pixel size is 1350x9034. So I zoomed into the part of the picture where you had seen a lot of blurring and took another snap with Jing. I think mine looks much sharper than yours. Here's the link to it.
    http://screencast.com/t/harjCYlxgea1

    Of course you are limited to your screen resolution when capturing from your screen. Even 1920x1080 isn't really high enough for good print design of any size.

    Peace,

    Rob:-]
    p.s. I've been enjoying planting your Intellectual Pooperty pamphlets in some strategic places.

     

    reply to this | link to this | view in chronology ]

    •  
      icon
      Nina Paley (profile), Sep 29th, 2011 @ 8:00pm

      Re: I had no problem ...

      Very nice. But since the scans were originally captured as image files, wouldn't it be sensible if they could be obtained as such, rather than being converted back and forth? Can you imagine going through that for every image, when it's totally unnecessary?

      Fortunately, Rick Prelinger left this comment on my blog:
      You can easily download the still images from which Internet Archive PDFs were derived. There should be a link on the left side to “All Files,” right with the links to the various versions. You will see a menu, and what you want is the .zip of all the .jp2 files. It’s usually a large download, but you will then have each page in much better resolution and quality.

      It's not quite that simple, but close. I replied:
      Thanks Rick. The files listed for “The Harness Horse” are:
      (2.8 M)PDF
      (2.2 M)B/W PDF
      (~72 pg)EPUB
      (~72 pg)Kindle
      (~72 pg)Daisy
      (47.6 K)Full Text
      (1.5 M)DjVu

      Below that, there is “All Files: HTTP”. When I clicked that, I got a list of all kinds of things – and one was indeed .jp2 zip! Now that I know what it is, I can use it. But it’s very hidden! And we still don’t have an image archive, although poring through .jp2 files and cleaning up and tagging images found therein could be a way to contribute to one.
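For anyone who would rather skip the menu hunt, here is a minimal Python sketch of the direct link to that page-image archive. It assumes the item follows the `<identifier>_jp2.zip` naming seen elsewhere in this thread and that files are served under `https://archive.org/download/<identifier>/`; the function name `jp2_zip_url` is made up for illustration, and not every item is guaranteed to have such a file.

```python
from urllib.parse import quote


def jp2_zip_url(identifier: str) -> str:
    """Build the likely download URL for an item's page-image archive.

    Assumption: the item exposes a file named <identifier>_jp2.zip under
    https://archive.org/download/<identifier>/. Check the "All Files" list
    if the link does not resolve.
    """
    safe_id = quote(identifier, safe="")  # escape any unusual characters
    return f"https://archive.org/download/{safe_id}/{safe_id}_jp2.zip"


if __name__ == "__main__":
    # Identifier for "The Harness Horse", as it appears in this thread.
    print(jp2_zip_url("harnesshor00gilb"))
```

The zip is usually large, so it is worth downloading once and pulling individual pages out of it locally.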

       

      reply to this | link to this | view in chronology ]

  •  
    identicon
    Anonymous Coward, Sep 29th, 2011 @ 6:58pm

    Bob loves making an idiot of himself.

    And he's your uncle.

     

    reply to this | link to this | view in chronology ]

  •  
    icon
    Nina Paley (profile), Sep 29th, 2011 @ 7:49pm

    artists and techies

    More comments than I expected on this article. One thing is clear: TD commenters are not well versed about graphics and how artists use them.

    This may explain why there are no really good public image archives online: the leaders of public/open source projects are mostly techies, who (in general) don't understand images so well. And most visual artists, who do understand images, tend to cling to proprietary models and disdain public archives.

     

    reply to this | link to this | view in chronology ]

    •  
      icon
      Karl (profile), Sep 29th, 2011 @ 8:33pm

      Re: artists and techies

      the leaders of public/open source projects are mostly techies, who (in general) don't understand images so well.

      Nor music, unfortunately, which is why Open Source music software has been about ten years behind the times.

      Fortunately, things are getting a lot better, very quickly. I'm sure if you got some actual graphic artists on board with this, things would eventually take off.

      Perhaps some kind of SETI@home type deal? It's kind of what Wikimedia Commons does, but there should be a service that is focused mainly on the images themselves.

      Or, perhaps, some sort of incentive for book stores (those that are left) to help out? Scan in the drawings from a PD book, and you can have some sort of "sponsorship" ad on the site, or something.

       

      reply to this | link to this | view in chronology ]

    •  
      identicon
      Anonymous Coward, Sep 30th, 2011 @ 5:48am

      Re: artists and techies

      TD commenters are not well versed about graphics and how artists use them.

      This may explain why there are no really good public image archives online: the leaders of public/open source projects are mostly techies


      The first may be true, but does not explain shit. I bet the largest reason is because there is no money in trying to appease a niche group, who also happens to be ridiculously anal and arrogant about stuff that nobody but the niche group notices or cares about. It seems more likely that

      And most visual artists, who do understand images, tend to cling to proprietary models and disdain public archives.

      explains that they are far less technical (meaning: can't figure out how to load an image and right click), than the technical people are visual artists.

       

      reply to this | link to this | view in chronology ]

    •  
      identicon
      Anonymous Coward, Sep 30th, 2011 @ 7:13am

      Re: artists and techies

      Who needs images? A text-only VT100 terminal is all one would ever need.

      ;-)

       

      reply to this | link to this | view in chronology ]

    •  
      identicon
      Anonymous Coward, Sep 30th, 2011 @ 8:22am

      Re: artists and techies

      I think this is the heart of the problem. Artists can't wait for other people to give them what they want. The great thing about the internet is if you see a need for something, you can fill that need yourself. You probably have the clout to start such an image archive and get a group of people going filling it with quality scans.

      It would be a vintage clip art collection for artists and designers. I would love to have such an archive for design work too.

      I tried the Smithsonian - not much luck with high quality images there either.

       

      reply to this | link to this | view in chronology ]

    •  
      identicon
      Anonymous Coward, Sep 30th, 2011 @ 9:47am

      Re: artists and techies

      Nice comments there Nina. Why not just call them fucking morons?

      The truth is people who live a leeching life rarely learn the tools to actually make anything for themselves. You have made it all the way up to bad cartoons, which puts you in the top 1 or 2 percent on this site, considering most people here (like the talentless schmuck Marcus Carab) think that taking something and chanting bad poetry over it is somehow "art".

      Don't be lazy - if there is a need, make it your life's work. Give up your time and really give back to the Tardian world. Stop all this other stuff you are wasting your time on, and give back to the community that has so well rewarded you by ignoring all your previous works.

       

      reply to this | link to this | view in chronology ]

      •  
        icon
        PaulT (profile), Sep 30th, 2011 @ 10:27am

        Re: Re: artists and techies

        Funny, I don't think I saw you post your resume at any point...

        How about you do and then we can contrast your work with Nina and Marcus'?

         

        reply to this | link to this | view in chronology ]

        •  
          icon
          Marcus Carab (profile), Sep 30th, 2011 @ 12:43pm

          Re: Re: Re: artists and techies

          In his mind, it is better to not create anything at all than to risk creating something that isn't 100% "original" or something that a random anonymous weirdo on the internet might (gasp!) make fun of you for.

          We should cut him some slack though - when the human fire in your belly is all but extinguished by bile and uncle-sperm, the world must seem like a cruel ironic place: so many people walking around, mocking you, making it look so easy to just be happy and not have dicks in their mouths while you struggle with the mystery of how that is accomplished. It must be a sad little life in his basement, with nothing to keep him company but porn blogs and photos of Mike with the eyes scratched out - he deserves our pity more than anything.

           

          reply to this | link to this | view in chronology ]

      •  
        identicon
        Anonymous Coward, Sep 30th, 2011 @ 3:10pm

        Re: Re: artists and techies

        Did you not see Sita Sings the Blues? It was awesome and mostly done by just Nina.

        In fact, I think your "hate" of Nina's work is purely political. You just want to believe that anyone who disagrees with you is a lazy leech even though the facts don't bear that out.

         

        reply to this | link to this | view in chronology ]

  •  
    identicon
    Anonymous Coward, Sep 29th, 2011 @ 7:52pm

    Archive.org apparently has the original images for all their books. This one uses JPEG 2000 compression, which may be the cause of the fuzziness in the PDF.

    http://ia700406.us.archive.org/6/items/harnesshor00gilb/harnesshor00gilb_orig_jp2.tar

    Using openjpeg to extract the image to TGA, I got a 30MB file that looks clear on my end.
    http://code.google.com/p/openjpeg/wiki/DocJ2KCodec

    j2k_to_image -i /home/thepirate/Documents/harnesshor00gilb_orig_0006.jp2 -o /home/thepirate/Documents/harnesshor00gilb_orig_0006.tga
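To run that same conversion over a whole folder of pages, a small Python wrapper along these lines might do. It is only a sketch: the helper names `build_decode_command` and `convert_all` are made up for illustration, and it assumes the `j2k_to_image` tool from the command above is on your PATH. Newer OpenJPEG releases are understood to ship the decoder as `opj_decompress`, so the tool name is left as a parameter.

```python
import subprocess
from pathlib import Path
from typing import List


def build_decode_command(src: Path, tool: str = "j2k_to_image") -> List[str]:
    """Return the argument list that decodes one .jp2 file to .tga.

    Mirrors the command shown above: <tool> -i input.jp2 -o output.tga
    """
    dst = src.with_suffix(".tga")  # same name, .tga extension
    return [tool, "-i", str(src), "-o", str(dst)]


def convert_all(folder: Path, tool: str = "j2k_to_image") -> None:
    """Decode every .jp2 file in `folder`, stopping at the first failure."""
    for src in sorted(folder.glob("*.jp2")):
        subprocess.run(build_decode_command(src, tool), check=True)


if __name__ == "__main__":
    # Hypothetical folder holding the unpacked page images.
    convert_all(Path("harnesshor00gilb_orig_jp2"))
```

TGA files are uncompressed, so expect each page to take tens of megabytes, as with the 30MB file mentioned above.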

    If the DJVU came from Archive.org, there are often high-quality JPG files that are viewable online (go to the Archive.org details page and choose "read online"; from there you can increase the size of the image, then right click and save it). This is in fact easier than ripping from DJVU, as you don't have to mess around with a screenshot and trimming the image, and the resulting quality is hugely better.

    http://en.wikisource.org/wiki/Help:DjVu_files

     

    reply to this | link to this | view in chronology ]

  •  
    identicon
    Anonymous Coward, Sep 29th, 2011 @ 8:59pm

    Scan Tailor
    http://sourceforge.net/project/screenshots.php?group_id=227253

    Open source post processor for scanned images :)

     

    reply to this | link to this | view in chronology ]

  •  
    identicon
    Anonymous Coward, Sep 29th, 2011 @ 9:02pm

    Nina, quit being a lazy git. Use your time and your efforts to make better images, and put them in the public domain. Make it your life's work.

    PLEASE?

     

    reply to this | link to this | view in chronology ]

  •  
    identicon
    Anonymous Coward, Sep 30th, 2011 @ 1:58am

    Nina, use screenshots to get the image from pdfs.

     

    reply to this | link to this | view in chronology ]

  •  
    identicon
    Anonymous Coward, Sep 30th, 2011 @ 2:32pm

    "I assume the reasoning is, 'I went through all the trouble to scan it, why should I share? Others can pay me if they want a copy.'"

    That reasoning is common in the real world, where the rest of us live. If someone takes the trouble to do painstaking work and expects to be compensated, why is that a bad thing? Google scanned a great deal of books and put them up online, because they have the effing money, and stand to make more money from their efforts.

    Instead of whining about the lack of images, why don't you do something about it? Start a project. Who knows, Google might acquire it.

    "'Anyone else is free to find the same illustration in another antique book, but I found this one, so it's mine.' And so these images remain inaccessible, not part of any public archive."

    Because no one wants to work for free. At least in this area. Maybe someone will read this post and take the cue, pumping in a great deal of money, energy and enthusiasm to create a wonderful free online archive that everyone has instant access to.

     

    reply to this | link to this | view in chronology ]

  •  
    identicon
    ed.ch, Sep 30th, 2011 @ 8:44pm

    +100000000

    When we rebuilt Encyclopedia Dramatica we got a huge number of the articles from Archive.org (We love you too!), but we lost sooooo many images. We are still missing something like 60 thousand image files, and many more of the files that we have are only thumbnails we pulled from Google cache snapshots of ED articles. The Internet needs an Image archive database. I guess it comes down to who would pay for it. I would donate whatever server space I could just to make sure that something like this never happens again, and I know many others would do the same.

     

    reply to this | link to this | view in chronology ]

    •  
      identicon
      Anonymous Coward, Sep 30th, 2011 @ 9:00pm

      Re: +100000000

      Wait, are you the one who revived ED on the .ch domain?

      All hail the troll god!

      *Places an offering of LULZ on altar*

       

      reply to this | link to this | view in chronology ]

  •  
    identicon
    Gaz Davidson, Feb 14th, 2012 @ 6:50am

    Here it is

    The Internet Archive allows you to export directly via the web interface, just right click on the image, copy the URL, then edit it to change the rotation and scale:

    http://ia600406.us.archive.org/BookReader/BookReaderImages.php?zip=/6/items/harnesshor00gilb/harnesshor00gilb_jp2.zip&file=harnesshor00gilb_jp2/harnesshor00gilb_0006.jp2&scale=1&rotate=90
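Editing that URL by hand gets tedious for more than a page or two, so here is a rough Python sketch that assembles it from its parts. The function name `page_image_url` and its parameters are made up for illustration. The server name, the `/6/items` path prefix, and the four-digit page numbering are copied from the example link and will likely differ for other books. What `scale` and `rotate` accept beyond the values shown is an assumption to verify in the browser.

```python
from urllib.parse import urlencode


def page_image_url(server: str, path_prefix: str, identifier: str,
                   page: int, scale: int = 1, rotate: int = 0) -> str:
    """Assemble a BookReaderImages.php link for one page of a scanned book.

    All parts are taken from the example link above; nothing here is an
    official API description.
    """
    zip_path = f"{path_prefix}/{identifier}/{identifier}_jp2.zip"
    file_path = f"{identifier}_jp2/{identifier}_{page:04d}.jp2"
    query = urlencode(
        {"zip": zip_path, "file": file_path, "scale": scale, "rotate": rotate},
        safe="/",  # keep slashes readable, as in the original link
    )
    return f"http://{server}/BookReader/BookReaderImages.php?{query}"


if __name__ == "__main__":
    print(page_image_url("ia600406.us.archive.org", "/6/items",
                         "harnesshor00gilb", page=6, scale=1, rotate=90))
```

Looping `page` over a range gives one link per page, which any download tool can then fetch.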

     

    reply to this | link to this | view in chronology ]

