Research Shows That Published Versions Of Papers In Costly Academic Titles Add Almost Nothing To The Freely-Available Preprints They Are Based On

from the all-that-glitters-is-not-gold dept

The open access movement believes that academic publications should be freely available to all, not least because most of the research is paid for by the public purse. Open access supporters see the high cost of many academic journals, whose subscriptions often run into thousands of dollars per year, as unsustainable for cash-strapped libraries, and unaffordable for researchers in emerging economies. The high profit margins of leading academic publishers -- typically 30-40% -- seem even more outrageous when you take into account the fact that publishers get almost everything done for free. They don't pay the authors of the papers they publish, and rely on the unpaid efforts of public-spirited academics to carry out crucial editorial functions like choosing and reviewing submissions.

Academic publishers justify their high prices and fat profit margins by claiming that they "add value" as papers progress through the publication process. Although many have wondered whether that is really true -- does a bit of sub-editing and design really justify the ever-rising subscription costs? -- hard evidence has been lacking that could be used to challenge the publishers' narrative. A paper from researchers at the University of California and Los Alamos Laboratory is particularly relevant here. It appeared first on arXiv.org in 2016 (pdf), but has only just been "officially" published (paywall). It does something really obvious but also extremely valuable: it takes around 12,000 academic papers as they were originally released in their preprint form, and compares them in detail with the final version that appears in the professional journals, sometimes years later, as the paper's own history demonstrates. The results are unequivocal:

We apply five different similarity measures to individual extracted sections from the articles' full text contents and analyze their results. We have shown that, within the boundaries of our corpus, there are no significant differences in aggregate between pre-prints and their corresponding final published versions. In addition, the vast majority of pre-prints (90%-95%) are published by the open access pre-print service first and later by a commercial publisher.

That is, for the papers considered, which were taken from the arXiv.org preprint repository, and compared with the final versions that appeared, mostly in journals published by Elsevier, there were rarely any important additions. That applies to titles, abstracts and the main body of the articles. The five metrics applied looked at letter-by-letter changes between the two versions, as well as more subtle semantic differences. All five agreed that the publishers made almost no changes to the initial preprint, which nearly always appeared before the published version, minimizing the possibility that the preprint merely reflected the edited version.
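
To make the idea of comparing two versions of a text more concrete, here is a minimal Python sketch of two simple similarity measures, one working at the character level and one at the word level. These are illustrative stand-ins chosen for this example, not necessarily the five metrics the researchers used; the function names and the sample sentences are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher
import math
import re


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], using difflib's matching-blocks ratio."""
    # autojunk=False stops difflib from ignoring frequent characters in long texts.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def word_cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of word-count vectors: a crude proxy for content overlap."""
    counts_a = Counter(re.findall(r"[a-z0-9]+", a.lower()))
    counts_b = Counter(re.findall(r"[a-z0-9]+", b.lower()))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no meaningful direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical example: a preprint sentence and its copy-edited counterpart.
preprint = "We measure the colour of the sample at low temperature."
published = "We measure the color of the sample at low temperature."

print(f"character-level: {char_similarity(preprint, published):.3f}")
print(f"word-level:      {word_cosine_similarity(preprint, published):.3f}")
```

A copy-edit such as a British-to-American spelling change barely moves the character-level score but counts as a whole changed word at the word level, which is one reason a study of this kind would want several measures rather than one.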

The authors of the paper point out a number of ways in which their research could be improved and extended. For example, the reference section of papers before and after editing was not compared, so it is possible that academic publishers add more value in this section; the researchers plan to investigate this aspect. Similarly, since the arXiv.org papers are heavily slanted towards physics, mathematics, statistics, and computer science, further work will look at articles from other fields, such as economics and biology.

Such caveats aside, this is an important result that has not received the attention it deserves. It provides hard evidence of something that many have long felt: that academic publishers add almost nothing during the process of disseminating research in their high-profile products. The implications are that libraries should not be paying for expensive subscriptions to academic journals, but simply providing access to the equivalent preprints, which offer almost identical texts free of charge, and that researchers should concentrate on preprints, and forget about journals. Of course, that means that academic institutions must do the same when it comes to evaluating the publications of scholars applying for posts.

If it were felt that more user-friendly formats were needed than the somewhat austere preprints, it would be enough for funding organizations to pay third-party design companies to take the preprint texts as-is, and simply reformat them in a more attractive way. Given the relatively straightforward skills required, the costs of doing so would be far less than paying high page charges, which is the main model used to fund so-called "gold" open access journals, as opposed to the "green" open access based on preprints freely available from repositories.

In theory, gold open access offers "better" quality texts than green open access, which supposedly justifies the higher cost of the former. What the research shows is that when it comes to academic publishing, as in many other spheres, all that glitters is not gold: humble preprints turn out to be almost identical to the articles later published in big-name journals, but available sooner, and much more cheaply.

Follow me @glynmoody on Twitter or identi.ca, and +glynmoody on Google+


Reader Comments



  • icon
    charliebrown (profile), 13 Mar 2018 @ 11:48pm

    Of course researchers would say that. They just want all their work available for free.


  • icon
    Peter (profile), 14 Mar 2018 @ 2:25am

    Are there any models where libraries or funding agencies ...

    ... sponsor open-access platforms? In the Bioinformatics/Systems biology arena, it used to be common practice for industry and funding organizations to sponsor personnel, technology development and platforms. Their reasoning would apply to open access in the same way: The community needs certain tools and services. Sponsoring open systems was considered to be cheaper than licensing closed systems.

    The prerequisite is, of course, to accept that open-source is not free, but requires an (up-front) investment of some of the money saved from paid subscriptions (later).


    • identicon
      Monica, 14 Mar 2018 @ 10:38am

      Re: Are there any models where libraries or funding agencies ...

      The short answer to your question is yes. Many research libraries pay to support Cornell University Library, who maintain the arXiv.org repository. Many more also provide their researchers with open-access repository services using open source tools such as DSpace and Fedora. And a growing number of libraries offer publishing services, enabling the publication of open access journals. Universities also support organizations like the Center for Open Science, which provides a range of platforms for preprints and data, by way of institutional memberships.

      In short, university libraries are paying for *both* the subscription journals and the services and platforms that enable open-access publishing and data sharing. While some have suggested that we cancel subscriptions and channel those funds to more support for OA and open source platforms, to my knowledge nobody has done this in any large-scale way.


  • icon
    kmo12345 (profile), 14 Mar 2018 @ 2:26am

    Somewhat misleading

    I am a physicist and have a couple issues with this conclusion. I think the general claim that the editors of journals add very little to the published work is probably correct. However, the claim that there are no significant changes between the pre-print version and the published version is rubbish.

    My most recent paper, which was just accepted for publication in Physical Review B, was sent out to two referees. Referee A did not have a whole lot to say but pointed out that an explanation we had provided for a method was unclear to non-experts. Referee B was perhaps overly thorough but actually pointed out a few instances where specific word choices could lead to incorrect conclusions being drawn. He or she also found a couple of minor stylistic errors that had slipped through our editing.

    While the total number of words changed was probably under 5% (maybe even closer to 1 or 2%), the revised manuscript is certainly better than the pre-print version.

    The next step is for the journal to copy-edit the manuscript. This step usually consists of changing British English to American English and spelling out some abbreviations or abbreviating other words but can sometimes uncover typos that made it through peer editing. In any case, I agree that this is less useful than the peer review.

    It is certainly ridiculous that the public has to triple pay for research (they pay me to do it, they pay for me to access and submit to journals, and they have to pay if they want to access the research). However, I have yet to see an alternative to the current peer review process that is facilitated by the journals.

    In many cases, I have encountered papers on the arXiv which are completely incorrect. The papers in question have not been published and likely wouldn't be published without significant changes. I myself have manuscripts on the arXiv that contain small errors which have been fixed in the published versions. Depending on the journal we can usually replace the pre-print version with the published version after some length of time (6 months I believe) but this is not always done.


    • identicon
      Anonymous Coward, 14 Mar 2018 @ 2:54am

      Re: Somewhat misleading

      As the peer review process is provided by academia, they could come together to manage that process for themselves. It is not like the peer reviewers are employed by the journals, or paid to carry out that work. Indeed, a wiki like Wikipedia provides the basic system for publication and peer review.

      The outstanding problem is the use of publications in prestigious journals as a measure of academic ability when academics seek new posts.


    • icon
      Prashanth (profile), 14 Mar 2018 @ 5:15am

      Re: Somewhat misleading

      Indeed, there is a lot of work in sifting through what people submit to whittle down to work that is worthy of publication (though that is not to say that there aren't problems in how those determinations are made, nor that there aren't problems in using that to justify the obscenely high prices of journals, because there are).


  • identicon
    Yes, I know I'm commenting anonymously, 14 Mar 2018 @ 5:02am

    It is not that the publishers improve the quality of the papers that is at issue but that they publish important journals: Writers get more funding points from their bosses for publishing in prestigious journals. These journals happen to be owned by the big publishers and this is where the publishers have their value.

    The way forward is to have a transparent system to rank the importance of papers that does not depend on the chosen magazine. This way, there can be a lot of papers in any open access repository without the importance of the few critical articles being watered down.


  • identicon
    Annonymouse, 14 Mar 2018 @ 6:41am

    Really 95% of the tools and resources are already there.
    All that is needed is just two things.
    First, as already pointed out, is an open platform equivalent to a wiki that has the logistics hammered out and is middleman-proof.
    Second is to beat the various admins and funding bodies about the head and shoulders until they get it into their syphilis-addled heads to stop looking at the colour of the covers and actually do their jobs.


    • identicon
      Anonymous Coward, 14 Mar 2018 @ 10:33am

      Re:

      All that is needed is just two things. First as already pointed out is an open platform equivalent to wiki that has the logistics hammered out and middlemen proof.

      What prevents a wiki from being set up for this purpose? Which new software features are required, and has anyone written or requested them?

      Would Reddit or StackOverflow-style software be better suited? They have voting, comments etc.


  • icon
    Toom1275 (profile), 14 Mar 2018 @ 7:20am

    The "value" of journals is their "prestige." It's kind of like bitcoin - it only has value because some people perceive it does.


  • identicon
    Anonymous Coward, 14 Mar 2018 @ 9:42am

    Isn't the main value of the editorial process determining WHICH papers to publish? It's not surprising that comparing earlier and final versions of the papers chosen doesn't show much of a difference.

    What would be more interesting is a study comparing the papers published to those NOT published by some metrics of quality. In other words, measure if the journals are performing a valuable "gatekeeper" function or not.


  • This comment has been flagged by the community. Click here to show it
    icon
    Richard Bennett (profile), 14 Mar 2018 @ 1:33pm

    Moody being Moody

    When did this hack start writing for Techdirt? Man, this place has become Troll Central.


    • identicon
      Anonymous Coward, 14 Mar 2018 @ 6:20pm

      Re: Moody being Moody

      Troll Central? I was wondering why you kept coming back. Ars Technica too left for you eh Dick?


      • identicon
        Anonymous Coward, 14 Mar 2018 @ 6:48pm

        Re: Re: Moody being Moody

        You'd think Dickface would be happy that Pai is in charge and doing whatever he wants, but the truth is trying to satisfy the whims of IP fanatics is a fool's errand.


      • This comment has been flagged by the community. Click here to show it
        icon
        Richard Bennett (profile), 14 Mar 2018 @ 8:03pm

        Re: Re: Moody being Moody

        Why are you so obsessed with dicks, Anono-coward?


        • identicon
          Anonymous Coward, 14 Mar 2018 @ 11:08pm

          Re: Re: Re: Moody being Moody

          The only "Dick" brought up was the derivative nickname for "Richard". You chose to bring up the plural form, and not the proper noun the original was referred to. Nice projection.


          • icon
            Richard Bennett (profile), 15 Mar 2018 @ 1:44am

            Re: Re: Re: Re: Moody being Moody

            Calling Richards “Dick” is such a clever insult...on Techdirt.

            But I've seen you explode into paroxysms of homophobia at the slightest provocation, because dicks trigger you.


            • identicon
              Anonymous Coward, 15 Mar 2018 @ 9:41am

              Re: Re: Re: Re: Re: Moody being Moody

              The insult was the suggestion that you are attracted to Troll Central, as a Troll.

              If you choose to be triggered by the usage of a common derivative for Richard, that's on you.


  • identicon
    McDawg, 15 Mar 2018 @ 2:56am

    Scholarly Kitchen perspective


  • identicon
    Anonymous Coward, 15 Mar 2018 @ 6:31am

    One of the ironies of this...

    ...is that a single deep-pocketed investor could end this entire farce in a single day. Drop a billion dollars on a foundation (900M in endowment, 100M in operating capital) and go full open-access with all research. It would be an enormous service to humanity and it would crush the Elseviers of the world out of existence (good riddance to them).

    There are people who could do this without even blinking. And while there are numerous other worthy causes, making all academic knowledge free would serve those too -- maybe not today, but certainly in the future.


    • identicon
      Anonymous Coward, 15 Mar 2018 @ 9:51am

      Re: One of the ironies of this...

      Even if all future research were published open access, the Elseviers of the world would linger because they hold the copyrights on a large number of foundational and important research papers in many subject areas. Those companies would have to be bought out to free all the existing papers and truly stop their blood-sucking on academic research.


