Science, meet World
Time for a new type of peer review?
Over the past year or so I’ve had a number of interesting conversations with people about peer review. It seems as though many people think the current system is broken, although I have yet to hear many suggestions on where to go from here. Sometimes people mention wikis or other web-based means of publication, but many (myself included) worry that there needs to be some form of peer review to assess study quality, fraud, and other things that are undesirable (although whether traditional peer review achieves these goals is very much up for debate).
Here are my thoughts on the problem, and what I see as a relatively simple solution.
How do we assess study quality?
Historically, if you had to assess the quality of a study without having the luxury of reading it, you would probably ask two questions:
1. Was it published in a peer reviewed journal?
2. If published, how prestigious is the journal?
While far from perfect, these questions give a general sense of the quality of a piece of research. Something that was published in Nature is likely of higher quality than something published in a small society journal (or at least that’s an assumption that many of us are willing to make when pressed for time), and both of these papers are likely to be of higher quality than a paper that has been rejected from multiple journals and now sits unpublished in a desk drawer.
This quick and dirty assessment of paper quality worked for a long time, since there were a fairly limited number of journals where you could publish research on any given topic. If peer reviewers deemed your work to be of high enough quality and/or impact, then it was accepted for publication. If not, it went unpublished. That served as a simple, albeit crude, way to assess the quality of a study or experiment. If no one was willing to publish your paper, then it must not be of very high quality.
Taken a step further, these questions can also be used to assess the quality of a researcher. Are you publishing many peer-reviewed papers? Are they in top journals? If the answer to either of those questions is no, then the implication would be that your research was of lower quality than someone who answered yes.
There are problems with this line of reasoning (most obviously, not all papers that get rejected are low quality, and not all papers that sneak through the peer review process are high quality), but in general I would say that many people were happy with the system, since it was simple and (at least perceived to be) reasonably effective at keeping low quality studies on the outside and higher quality studies on the inside.
Why don’t these questions work anymore?
There are a lot of new journals popping up. Not just one or two, but hundreds. New Open Access publisher Hindawi publishes more than 120 journals in medicine alone! I get several emails every week publicizing the launch of one or more new journals, most of which are open access, and which seem to have varying standards of peer review (some use external reviewers, others are only reviewed in-house by the editors). The issue now is that if you can afford to pay the open access publishing fees, no paper is unpublishable. If you submit to enough journals, then your paper will almost certainly be accepted eventually, at which point you can say that the study has been published in a peer reviewed journal. So the first question from above (“Was it published in a peer reviewed journal?”) is no longer a useful way to assess paper quality, since almost anything can be published in some form of peer reviewed journal eventually.
Related to the issue of journal proliferation, people are becoming less and less devoted to any single journal. Rather than reading a specific journal from cover to cover each month, I have email alerts that send me a message whenever a paper is published on certain topics, regardless of the journal. As a result, papers published in low-impact journals can still get lots of attention, even if few people actually read that journal on a regular basis. In contrast, before online journal access it would have been much less likely that anyone would come across a paper in an obscure journal, no matter how relevant to their work.
Article-level metrics (e.g. assessing the number of citations for a specific paper, rather than the impact of the journal itself) are also reducing the importance of publishing in “prestigious” journals, since people now have more precise ways of determining whether your paper is being cited regularly. This isn’t to say that there are no benefits to publishing in a prestigious journal – far from it. But the penalties of publishing in a low-impact journal are now much less than they used to be.
Is this a good or bad thing?
This depends on your perspective. If you liked the system where there were few journals and not everything could be published, then this will almost certainly seem like a bad thing. Suddenly we’ve lost one of the simplest (albeit very imperfect) ways of determining the quality of a study, or the quality of a researcher. This means that you could conceivably find a paper (or publish a paper yourself) that proves/supports just about anything, regardless of how poorly the study was conducted, which is a big problem.
Despite these obvious problems, however, I think that the new system could be a good thing… if we are willing to tweak the peer review and publishing process.
A journal that publishes everything
This may sound a bit far-fetched, but hear me out on this. At this point, almost every paper will get published eventually. What’s worse, it will often get peer reviewed at multiple journals before finally being accepted. This means that much of the time and expense of reviewing/rejecting the paper at higher-tier journals was wasted, since it didn’t actually keep the paper from being published – it just bumped it down to a lower quality journal.
So why not just publish everything that is submitted to a journal? PLoS ONE already does a more restricted version of this – they publish everything that they receive that is above a certain threshold of quality (as opposed to other journals, that consider both quality and “impact”, e.g. whether it’s a splashy finding or not). The papers would still be peer reviewed, but the purpose of the review would be to assess the study quality, using a pre-determined checklist (I’m picturing something similar to, but much more detailed than the Downs and Black checklist that is sometimes used to assess study quality in systematic reviews).
The checklist would include things like methodology (x-sectional, intervention, RCT, etc), number of participants, likelihood of bias, etc, and could range from 0-100. The final score and the checklists themselves would be published along with the paper, along with any additional reviewer comments. The peer review could be done using the current method of simply sending manuscripts out for review, or there could be a central clearing house run by the NIH or some such organization. The critical point is that the articles would be peer reviewed, and the quality of the article would be made abundantly clear on the article itself.
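As a rough illustration of how such a checklist score could be computed, here is a minimal Python sketch. The item names and point weights are hypothetical, chosen only so that the maximum adds up to 100; a real checklist would be far more detailed and would need to be agreed on by the field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChecklistItem:
    name: str
    max_points: int  # most points this item can contribute


# Hypothetical items and weights (assumptions, not a real instrument).
# They are chosen only so that the maximum possible total is 100.
CHECKLIST = (
    ChecklistItem("study_design", 40),
    ChecklistItem("sample_size", 30),
    ChecklistItem("risk_of_bias", 30),
)


def quality_score(awarded: dict[str, int]) -> int:
    """Sum the reviewer's points per item, rejecting out-of-range values."""
    total = 0
    for item in CHECKLIST:
        points = awarded.get(item.name, 0)  # unscored items count as 0
        if not 0 <= points <= item.max_points:
            raise ValueError(
                f"{item.name}: {points} is outside 0-{item.max_points}"
            )
        total += points
    return total


# One reviewer's filled-in checklist for a single paper.
review = {"study_design": 30, "sample_size": 20, "risk_of_bias": 15}
print(quality_score(review))  # 65
```

Publishing the per-item points alongside the total, as suggested above, would let readers see where a paper lost marks rather than only the final number.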
Using this system, you could publish a paper as soon as it’s received for review – it would simply need to say “pending quality review” or something of that nature. You could also require that all studies put their full dataset online in order to aid with replication and hopefully reduce the likelihood of fraud, which isn’t easily caught by traditional peer review anyway (some journals, such as BMC Public Health, already require that authors be willing to share data upon request, although I don’t think there is any mechanism for determining whether this actually takes place). The quality score of a paper could even be amended as authors improve their study by performing additional experiments or analyses.
This would keep the best aspects of peer review – extra eyes and ears providing thoughtful comments on how a paper could be improved – while acknowledging the fact that the current system doesn’t do a tremendous job of quality control (for an excellent look at the shortcomings of traditional peer review, please check out this paper by Richard Smith titled Classical Peer Review: An Empty Gun).
What are the advantages of this system?
I see a number of benefits to adopting this “publish everything” model.
1. This new system would make paper quality exceedingly clear – if I say that wifi causes cancer, and can only point to a study that scored 2/100 for quality, and you point to a study that found the opposite that scored a 90/100, then we have a better idea of which side to take. If the findings are conflicting and study quality is similar, then we know that the issue is yet to be settled. Essentially we are making it easier to do systematic reviews, by assessing study quality when a paper is published, rather than waiting for the systematic review to come along.
2. This system would incentivize high quality research, rather than “sexy” findings. If I know that my study will be judged on the quality of my methods, rather than the controversy or novelty of the findings, then it will help to improve the methodological quality of studies in general.
3. This system would make it ok to replicate prior work, or publish null findings. We talk a lot about the importance of replication in science, but we also know that it’s really hard to publish a replication study in a prestigious journal (it’s hard to spin a replication study as “cutting edge” since, by definition, it’s already been done by someone else first). It’s the same thing with null results – we know they’re harder to publish, we know this introduces biases into systematic reviews, and yet there haven’t been many effective ways to fix it (I’m curious if PLoS ONE publishes more “null” studies than other journals, since it doesn’t concern itself with a study’s impact – anyone with info on this please let me know).
If papers are judged solely on quality rather than the novelty of the finding, that removes the incentives against performing/writing up replication studies and null results.
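The comparison described in the first advantage (a 2/100 study against a 90/100 study with the opposite finding) can be sketched in a few lines of Python. The function name and the 10-point margin for treating two scores as "similar" are assumptions made purely for illustration.

```python
def better_supported(score_a: int, score_b: int, margin: int = 10) -> str:
    """Compare two conflicting studies by their 0-100 quality scores.

    Returns "a" or "b" for the better-supported side, or "unsettled"
    when the scores are within `margin` points of each other.
    The default margin of 10 is an arbitrary, hypothetical choice.
    """
    if abs(score_a - score_b) <= margin:
        return "unsettled"
    return "a" if score_a > score_b else "b"


print(better_supported(2, 90))   # b
print(better_supported(70, 75))  # unsettled
```

This only ranks two studies against each other; it says nothing about how the scores were arrived at, which is where the checklist itself would have to do the work.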
What are the downsides of this system?
The biggest downside of this system is that everything, regardless of quality, would be published. So if your assessment of study quality begins and ends with “was it published in a peer reviewed journal?”, then this is obviously going to be a problem. Of course my counter-argument is that we’re already in a situation where you can publish anything regardless of quality, so that’s not really going to be a big change anyway. Of course there would be a lot of complicating factors (what goes into the quality checklist, who performs the review, how to make sure it’s applied consistently, ways to appeal if something was done incorrectly, etc), but if the over-arching idea has merit then I think the plumbing could be dealt with in turn.
Why isn’t this working already?
As I was writing up this post, James Coyne pointed out that WebmedCentral has most of the characteristics I’m looking for. They publish everything, they do so rapidly, they publish their reviews online, and they include a quality score. However, the quality score seems to be completely arbitrary, and their 10-question quality checklist focuses on the writing (e.g. “Is the quality of the diction satisfactory?”) rather than the quality of the study methodology, which is the real issue. I think it’s a worthwhile attempt, but I don’t think any modified form of peer review (including post-publication peer review, which has been spectacular in a few specific situations – e.g. Rosie Redfield and #arsenicDNA – and generally underwhelming elsewhere) will really catch on without a true assessment of study methodology published alongside the paper.
So, what do you think?
I’ve been mulling this over in my head for a while, and I’m very curious to hear if anyone thinks this is even remotely plausible. It’s basically a more extreme version of PLoS ONE, which was pretty extreme in its own way when it first came out. Could this idea ever work in practice? If not, why not? And specifically, if you think it’s a bad idea, I’m curious to hear how this type of peer review would be worse than the current form, given that we’re already at the point where peer review is weeding out less and less material with the creation of every new journal.
I’d love to hear what you think!
Travis
This entry was posted by Travis Saunders on January 31, 2012 at 10:00 am, and is filed under Doing Science, Knowledge Translation.
- Science Policy Around the Web – January 31, 2012 « Science Policy For All
- Interesting reads: January 29th – February 4th, 2012 « Mr Epidemiology
- Opening up Science | Over the Dither and Through the Words
- Weekly List Bookmarks (weekly) | Eccentric Eclectica @ ToddSuomela.com
about 3 months ago
I think you’re on the right track with your argument. In fact, there has long been a trickledown effect, with papers that are rejected from the most prestigious journals cascading down into second and third tier journals. Determined authors could always be published in the past; of course with the web, the mere fact of publication is now trivial.
The task, as you ably outline, is to reconstruct the metadata/reputation framework that is created on publication in a traditional journal in the online environment.
I think your rating scale proposal is superficially attractive, but the devil will be in the detail. Scoring each piece of work will not be a task for the fainthearted, given that the author will fight tooth and nail for the highest score possible, and that the skills needed to rate the work are highly sophisticated.
Making critical comment “stick” so that the searcher who finds the paper also finds the criticism is the crucial thing.
A simple model of peer review is easy to construct online: the question is how to get people to believe in it and participate in it. I don’t think it is in the power of any single individual or organisation to prescribe any solution. Making a publication open access, thereby enabling third party addons and analysis of the raw publications should be enough for a workable online solution to arrive.
The key obstacle at the moment is how much valuable content is paywalled, which inhibits access to knowledge and innovation around it. I suspect that a considerable factor in the conservatism evident in the face of the obvious superiority of the world wide web for the dissemination of research, is that many, many authors know how feeble their work is, and don’t want it exposed to the harsh oxygen outwith their hitherto cosy club. Existing journal editors and learned societies are also quite happy with their current grasp on the reins of power.
I think you need to look more widely at combining your idea of a standardised rating for publications with the idea of metajournals being constructed by interest groups over the contents of open access repositories.
about 3 months ago
F1000 Research (just announced) plans to do exactly what you are suggesting. See the first blog post at http://f1000research.com which provides detail on the plans (COI statement: I am running this project). I would be keen to hear your thoughts on how we are planning to tackle these issues.
about 3 months ago
Thanks for the comment Rebecca!
I read the Retraction Watch post, and I’m curious about your sanity check. You say that it’s similar to PLoS ONE, but their check is quite a bit more rigorous since they are still looking for studies using top notch methodologies. How high or low would you set the bar? (To avoid simply being another PLoS ONE I’m suggesting a very low bar, personally.)
Also, would you have some way of “scoring” articles so that people can tell the quality without having to read through the reviewer’s comments?
Very interested in what you’re doing, and to hear how you plan to move forward!
about 3 months ago
We are still finalising exactly what the sanity check would entail but the idea is to simply check that it looks like science and to weed out anything that looks ridiculous. With the data articles, we may be able to do a little more in terms of ensuring there is what looks like a reasonable amount of protocol information alongside the data and basic checks like the column headings are explained etc. But we are certainly not checking for scientific soundness – this has to come from experts in the field at the peer review stage.
As you suggest, we will make sure you can tell at a glance if the article is awaiting peer review or, if it has been peer reviewed, what the overall outcome was (i.e. ‘approved’ or ‘not approved’) with the name of the reviewer. Of course the great advantage with this approach is that contrary to the current system where a paper can be rejected by numerous reviewers for a whole host of journals before finally being accepted in a small but ‘peer reviewed’ journal and the reader is none the wiser, with our system it will be immediately obvious that most reviewers did not feel the research was scientifically sound.
about 3 months ago
A couple nuts-and-bolts questions, as this is where I find these things get complicated.
So if a paper is “not approved” would it be taken down? Could you simply publish it somewhere else? Also, is there any way to easily distinguish whether one “approved” paper is of higher quality than another “approved” paper? And are the reviews based specifically on a study’s methodology and interpretations, or could someone “not approve” a paper simply due to null results (which is not uncommon in the current literature)?
My only concern with your proposed system (as I understand it) is that it seems to rely on the standard dichotomy of published=good vs not published=not good, even though there is a wide range of study quality. This is why I like the idea of an actual numerical score attached to the paper (but that’s just an idea, other things could work as well).
It’s important that people can tell at a glance not just whether a methodology is “good enough” (which is essentially what the current system does), but also roughly how strong the methodology was relative to other papers.
about 3 months ago
Thanks for the detailed questions. If a paper is ‘not approved’ by a reviewer, it may be that other reviewers disagree and say it is ‘approved’ or it may be that the author decides to amend it to try and get it ‘approved’. The article is published so you wouldn’t submit it elsewhere – the whole point is that you don’t get articles being published in say the 5th journal the authors submitted to and no-one is the wiser that previously everyone thought the article was poor quality.
The idea is to encourage null results and so it should not be ‘not approved’ because of this. Reviewers will be directed to specifically focus on the scientific quality of the work and whether it ‘seems ok’.
The issue of impact and importance of the study is something that our existing F1000 evaluation service addresses and we believe that judgement should be kept separate from assessing if the scientific methodology was okay.
about 3 months ago
Ok, so “approval” and “publication” would be separated here. Very interesting.
Just for clarification, the rating system I’m suggesting would have nothing to do with impact or importance – it would be focused exclusively on a study’s methodological quality. The whole purpose being that it would be a simple way to know how the methodological quality of one paper compares with that of another. That would hopefully influence the impact of a paper, but it wouldn’t be influenced by the sexiness of a finding.
about 2 months ago
It strikes me that there is a lot of overlap with Wikipedia’s struggle for better and more transparent reliability. Wikipedia editors try to signal article quality with tools like edit patrols, quality flags, and page ratings. Indeed there’s now a site-wide public ratings system, though its resulting scores are not obvious in the articles themselves.
There are plenty of differences to scientific review, but some similarities too.