Scribd Puts User Docs Behind A Paywall Without Them Realizing It

from the totally-not-cool dept

Last year, I wrote about some issues I had with the way Scribd tried to avoid liability by suggesting that public domain documents couldn’t be hosted on the site or that fair use was not allowed. To the company’s credit, it responded quickly and fixed the situation, but soon after that I switched to (mostly) using Docstoc to host documents. Docstoc has its own problems as well, but for the most part has worked well for me. In my experience, though, Scribd is still quite popular among folks, especially for uploading and hosting legal documents. Apparently, the company recently made some quiet changes and it’s seriously pissed off law professor Eric Goldman, who has relied on the site for quite some time.

The key problem? Without clear notification, Scribd took “older” documents (and “older” is left undefined) and put them behind a paywall. As Goldman notes, the whole reason he used Scribd was to make the documents available, and it was quite a shock to suddenly find them behind a paywall:

Scribd’s paywall stunt instantly put Scribd on my shitlist because it vitiates the reason I chose to use Scribd in the first place. I don’t know that they ever promised me perpetual free access to the documents I post, but their value proposition always has been open access to the documents–freely shared with everyone and indexed in the search engines. The paywall destroys that value proposition. They’ve taken the documents that I wanted to freely share with the public (many of them public documents like court rulings and filings) and made them inaccessible. If my readers can’t freely get the documents I wanted to share with them, then what’s the point of using Scribd in the first place???

I also feel like Scribd used me. With their implicit promise of open access, they got me to share a lot of high-interest documents and generate lots of link love, then they flipped the default (from free to paywall) as part of a cash grab. I could check out of Scribd, but then I would break a lot of links and it would take a lot of time. So now I feel trapped. It’s a terrible feeling.

Goldman is looking at other options, including Docstoc and Rapidshare. Another one worth checking out could be Slideshare, or even potentially Google Docs. However, all this has me thinking again about the wisdom of relying on third parties for such things (even though I do it myself). I do like the ability to display PDF documents, such as legal filings, embedded within a post, but I’m wondering if there are any simple solutions for setting up that sort of thing on your own server. Anyone know of any?

Companies: scribd

Comments on “Scribd Puts User Docs Behind A Paywall Without Them Realizing It”

John Schmidt says:

Eric Schmidt is on Colbert tomorrow night.

Here’s the debate, Mike:

Either you PAY for the privilege for it to be private or you make it freely available under what they call “ad-supported” but, it’s actually paid for with tax dollars.

It’s what’s called the Tax-funded “CIA-interwebs-gotta-get-em-terrerests-by-selling-snooping/tracking-technology” business model.

Jon Renaut (profile) says:

A lot of room in that space

I’m not sure why the WordPress model isn’t more widely used. You essentially have three options – free and limited hosted at wordpress.com, paid and supported hosted at wordpress.com, or free and whatever you want hosted yourself.

Document hosting or nearly any sort of web application could function the same way. With the cost of cloud storage dropping daily, it seems like someone should be able to make this model work for tons of useful things, like embeddable PDF hosting.

I’m currently accepting venture capital to get right on this.

Karl (profile) says:

Know of any?

If you don’t mind using Flash, then one option is to go with FlexPaper. It’s exactly what you want, but though it claims to be “GPL v3,” it’s really not (you have to display their logo even on modified versions, and you can’t use it for free on a commercial site). Might be worth the $70, though.

There’s also SWFTools, which includes PDF2SWF, and is completely open source. However, this generates a distinct .swf file for each PDF, so I don’t know if it’s the right solution.

If you don’t want Flash and your site uses PHP, you might be able to hack something together using Samuraj Data’s online converter and embedding the HTML in an iframe.

There’s also CynergyPDF, but it’s only for Joomla-powered websites.

Mat says:

Have you thought of the idea of PAYING people to host your documents so you can make them available for free?

You put your documents in the cloud for free then they have no value and you can’t really get upset if the people you gave them to choose to do things with them that you don’t like.

You don’t get stuff for free – and if you’re dumb enough to think that you do then you deserve everything you get. Apparently our “law professor” shouldn’t really be trusted to make adult decisions.

Anonymous Coward says:

FTP meets my needs.

I use something called FTP. It’s reliable.

Steps to replicate:
1.) Buy used computer on craigslist.
2.) Install TFTP server.
3.) Connect to interwebs via wired cable (So the Google Car and their legal team that’s driving by can’t steal your info!)
4.) Sign up for GoToMyPC and give all your students the login and ability to remotely read the documents.
5.) Proceed to Cheezburger.

Karl (profile) says:

Re: Karl

My issue with FlexPaper isn’t with the product, which actually looks very good (and worth the $70 that Mike would have to pay).

The issue is that it’s supposedly GPL, even though it’s not. If you look at the FSF’s Categories of free and nonfree software page, it would actually be what used to be called “semifree software,” and is now just called “proprietary software.”

Still, that’s an issue for the FSF to deal with, not us.

Karl (profile) says:

Re: Re: Re: Karl

Yes, but unlike FlexPaper, Flowplayer allows commercial use, which is a requirement of the GPL. From their FAQ:

I’d like to license my code under the GPL, but I’d also like to make it clear that it can’t be used for military and/or commercial uses. Can I do this?

No, because those two goals contradict each other. The GNU GPL is designed specifically to prevent the addition of further restrictions. GPLv3 allows a very limited set of them, in section 7, but any other added restriction can be removed by the user.

(Emphasis mine.)

But I guess you’re right about the requirement that the logo stay in place. You learn something new every day, I guess.

We’re probably just picking nits at this point. FlexPaper seems like a good program, so even if it was proprietary, it would be worth using IMHO.

Anonymous Coward says:

Re: Re: Re:2 Karl

No, it allows commercial use. Read what the site says.

“This is the appropriate option if you are creating a commercial website and you are not prepared to distribute and share the source code of your application under the GPL.”


You can use it for commercial use but you must then release it under the GPL-V3. If you want a different license that allows you to use it for commercial use and keep what you made a secret then you must buy that different license.

Same thing if you want a license that allows you to bundle it with proprietary software.

“This is the appropriate license to use if you intend to bundle or ship FlexPaper as part of a product.”

It’s released under the GPL-V3, you can do whatever you want with that provided you maintain the license because the license requires that you do so. If you want a different license, if you want a license that allows you to do something without maintaining the GPL-V3 license, then you must pay.

Anonymous Coward says:

Maybe there is a way to put it on Google Books?

“Can Authors and Publishers distribute their works under the settlement for free, under a Creative Commons license or otherwise?
Yes. Rightsholders are free to set any price for their work including the ability to distribute their work free of charge. If you are interested in distributing your work for free, including under a Creative Commons license, then you should claim your Book on the Claim Form and, on the “Manage Your Books” page, fill in the box asking you to specify your sale price for the book at “zero.” In the future, the Claim Form will also provide an option for you to offer your Book under a Creative Commons license, and you should check the Claim Form periodically for that option to appear. The Registry will inform Google of your request, and Google will include information on its web site so that end users are aware of the licensing terms chosen by you. Rightsholders are also free to authorize Google directly to distribute their book through a Creative Commons license.”


Chunky Vomit says:

I guess there is something of a Catch 22 here. After all, he did put the documents on servers that don’t belong to him.

I wonder: is there a reason why Google Docs isn’t an option here? I realize that the service has its limitations, but I have had great luck with sharing documents on that service.

I wonder if they index documents open to the entire web?

Coises (profile) says:


I do like the ability to display PDF documents, such as legal filings, embedded within a post, but I’m wondering if there are any simple solutions for setting up that sort of thing on your own server. Anyone know of any?

Does anything prevent you from storing the files on your own server and using an OBJECT tag in your posts?

This page has instructions for using the OBJECT tag to embed a PDF.

The link on that page to the explanation of PDF Open Parameters is stale, but here is a PDF that explains PDF Open Parameters.
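For anyone who wants to try the OBJECT approach, here is a minimal sketch of generating the markup for a self-hosted PDF. It is only an illustration, not taken from the linked instructions: the function name `pdf_object_tag`, the example path `/docs/filing.pdf`, and the default width and height are all made up for this example. It assumes the `page` and `zoom` PDF Open Parameters, and whether a given browser or plugin honors them is not guaranteed.

```python
from html import escape
from urllib.parse import urlencode


def pdf_object_tag(pdf_url, width="100%", height="600", open_params=None):
    """Build an HTML <object> element that embeds a PDF (hypothetical helper).

    open_params is an optional dict of PDF Open Parameters, which are
    appended to the URL after '#', for example {"page": 2, "zoom": 100}.
    """
    src = pdf_url
    if open_params:
        src = f"{pdf_url}#{urlencode(open_params)}"

    # Escape every value that ends up inside an HTML attribute.
    safe_src = escape(src, quote=True)
    safe_link = escape(pdf_url, quote=True)
    safe_width = escape(str(width), quote=True)
    safe_height = escape(str(height), quote=True)

    # The content inside <object> is shown only when the browser cannot
    # render the PDF inline, so it doubles as a plain download link.
    return (
        f'<object data="{safe_src}" type="application/pdf" '
        f'width="{safe_width}" height="{safe_height}">\n'
        f'  <p>This browser cannot display the PDF inline. '
        f'<a href="{safe_link}">Download the PDF</a> instead.</p>\n'
        f'</object>'
    )


# Example path is an assumption; point it at a file on your own server.
print(pdf_object_tag("/docs/filing.pdf", open_params={"page": 2, "zoom": 100}))
```

The fallback link inside the element is a deliberate choice: readers without a working PDF plugin still get a way to reach the file.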

Karl (profile) says:

Re: OBJECT tag?

Does anything prevent you from storing the files on your own server and using an OBJECT tag in your posts?

The fact that users must have the Acrobat plugin installed. Naturally, this causes browser incompatibility issues. See the “Compatibility” section of the PDFObject guide.

Incidentally, PDFObject seems like it would be useful if you want to go this route, as it gets around most browser limitations using JavaScript.

But I should note that I have Acrobat installed, and I can’t view the PDF in my browser (Chrome), even using PDFObject.

There’s also one other, possibly major, drawback: No search engine will index anything in an OBJECT tag. Of course, that applies to Flash as well. If that’s a worry, you’d have to convert the PDF into HTML before displaying it.

mariush (profile) says:

Re: Re: OBJECT tag?

Lots of people disable Adobe Acrobat from automatically opening documents in the page because of all the vulnerabilities and critical bugs it has.

Plus, it loads very slowly and would annoy users if you embed 10 PDF files on a single page.

Flash is more reasonable as I can just use the Flashblock extension for Firefox to block all flash on the page and, if I’m interested in seeing the PDF file, I can just click on the flash icon for that object and unblock it without reloading the page, and I can then see it loading in the Flash object on the page.
