I saw this on Twitter yesterday:
About 200,000 academic journals are published in English. The average number of readers per article is 5.
I don’t know where those numbers came from, but five readers per article sounds about right.
When I was a grad student, I felt like a fraud for writing papers that I wouldn’t want to read. Only later did I realize this is the norm. You’re forced to publish frequently. You can’t wait until you think you have something worthwhile to say.
If the average academic article has five readers, most would have fewer. Since some articles have hundreds of readers, there would have to be even more with practically no readers to balance out the average.
Readership may follow something like a power law distribution. It could be that the vast majority of articles have one or two readers even though some papers have thousands of readers.
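Here is a minimal sketch of that argument in Python. It assumes, purely for illustration, that the article ranked r gets top_readers // r readers (a Zipf-like falloff). The function name zipf_readership and the figures 10,000 and 5,000 are made up for the example, not data.

```python
from statistics import mean, median

def zipf_readership(num_articles, top_readers):
    """Hypothetical model: the article ranked r gets top_readers // r readers."""
    return [top_readers // rank for rank in range(1, num_articles + 1)]

# Illustrative numbers only: 10,000 articles, the most popular has 5,000 readers.
readers = zipf_readership(num_articles=10_000, top_readers=5_000)

avg = mean(readers)
med = median(readers)
# Fraction of articles whose readership is below the mean.
share_below_avg = sum(r < avg for r in readers) / len(readers)
# Fraction of articles with two readers or fewer.
share_at_most_two = sum(r <= 2 for r in readers) / len(readers)

print(f"mean readers per article:     {avg:.2f}")
print(f"median readers per article:   {med}")
print(f"share below the mean:         {share_below_avg:.0%}")
print(f"share with at most 2 readers: {share_at_most_two:.0%}")
```

Under these assumptions the mean lands between four and five readers while the median is below one, and nine articles in ten fall below the mean. A different falloff would give different numbers, but any heavy-tailed distribution shows the same pattern.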
John, to support your point:
1) once in a while I had to do some due diligence in our department, and check the average number of citations per paper of our researchers. Excluding self-citations, it rarely exceeded 5. Even a Fulkerson Prize (a pretty big deal in optimization) winner never got more than 20;
2) I once sampled a few articles from the Annals of Mathematics. Incredibly, some did not get a single citation; none of them got more than 10.
I am hopeful that this state of affairs will change. Not sure how.
@gappy
The average number of citations per paper is known, precisely. It is the average number of papers appearing in the reference section of papers. Depending on your field, this can vary between 5 and 20. (Of course, it is not because you cite a paper that you read it.) Yet, this is the average. In my experience (but this is very specific to your field), the median number of citations per paper is *zero*.
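The counting argument can be checked on a toy example. The sketch below assumes a closed corpus, meaning every cited paper is itself in the collection; the seven papers and their reference lists are hypothetical.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical closed corpus: paper -> papers listed in its reference section.
references = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["A", "B"],
    "E": ["A", "B"],
    "F": ["A"],
    "G": ["A"],
}

# Citations given: length of each paper's reference list.
given = [len(refs) for refs in references.values()]

# Citations received: how often each paper appears in someone else's list.
counts = Counter(cited for refs in references.values() for cited in refs)
received = [counts[paper] for paper in references]

print("mean citations given:      ", round(mean(given), 2))
print("mean citations received:   ", round(mean(received), 2))
print("median citations received: ", median(received))
```

Every citation is counted once on the giving side and once on the receiving side, so the two means must agree. The received citations pile up on papers A and B, which is why the median is zero in this example.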
@John I am not sure why this is a problem. People write a lot of papers that nobody will read… so? There are lots of open source projects with no users. There are lots of blogs with no readers. Overabundance does not have to be a problem.
Even highly cited and read papers are often not all that important in themselves. They become important because they establish common ground between many people.
What matters is the output of this process. Science is all about grinding through…
Well, now I feel better about never publishing my dissertation. In fact, this makes reason #42,562 that I’m glad to be out of academia altogether.
You can’t assume that someone has read a paper just because they cited it. It’s quite common to cite a paper you’ve never seen. You ask your colleagues for a reference on something and they give you one. Maybe they’ve read it, or maybe they got it the same way you did. Or maybe a reviewer gives you a couple references he thinks you should add and you comply.
I have a question and a comment:
[Question] What do you think about scientific blogs in this regard? I don’t know the average readership of a blog post, but I would imagine it is much more than 10-20 (blogs with a few hundred or a few thousand RSS subscribers are not unusual). Of course, most blog posts do not have the quality, depth, or coverage of a research paper, but they are still a way to communicate with an audience.
[Comment] In your last comment, you mentioned that “you cannot assume that someone has read a paper just because they cited it”. This is true, but on the other hand, it is quite possible to read a paper and not cite it. For me, I read many many papers and may just pick one of them to cite. I have not kept track of the ratio, but I guess it would be something in the order of 10-100 to 1.
I agree that citation != readership, but it cuts both ways;
for example, the link below shows the number of downloads for a working paper in theoretical stats as 1820 while the actual paper has been cited just 5 times.
http://ecares.org/index.php?option=com_docman&task=doc_details&gid=56&Itemid=204
Yeah, this almost certainly follows Zipf’s law. Most articles probably have zero readers beyond the reviewers.
It starts early. I’m finishing up my (late) undergraduate capstone (on computability structures of inner product and Hilbert spaces), and I feel pretty lame about it. There are real mitigating factors, but I often feel I’m riding on my advisor’s coattails and just regurgitating her advice.
Daniel Black: That’s OK. Riding your advisor’s coattails is a sensible strategy for now. Jazz trumpeter Clark Terry gave this advice to musicians and it applies to math as well: “Imitate. Assimilate. Innovate.”
I have to doubt that figure for two reasons: first, there’s no reasonable way of determining how many readers an article has, and second, there are too many classes of reader to make such a small number likely. A couple examples of the sorts of reader I have in mind: people like me who are voracious readers of academic journals, and prospective graduate students who want to evaluate the work of professors and students at an institution. When I was applying to graduate schools — before I decided it wasn’t worth it — I spent many hours in the basement of the library reading journal articles by potential professors and their students. I know other classmates who did the same. Right there, in one Midwestern public university library, we probably exceeded five readers for several articles.
@Jason: I don’t know whether the five-readers-per-article stat was made up or based on some kind of study. But you could do a study. Online journals know how often articles are downloaded. I’ve seen a little data along those lines.
You could also survey authors about their preprint requests. I’ve gotten several requests for my most popular articles, but zero requests for most things I’ve written.
Here’s an idea for studying how often paper journals are read. Go to a college library and look at how dirty the pages are for the bound journals. Simon Newcomb discovered what is now known as “Benford’s Law” by noticing that the pages in the front of books of logarithms were black on the edges from repeated use while the pages in the back were clean.
I’m not saying people don’t read journal articles, only that most journal articles aren’t read. If every academic read an article a day, but they all read the same articles, that would still mean that most articles are never read.
@John: The point about online journals is granted.
A count of preprint requests represents a far more committed audience than casual readers. I have happened across and read many more articles — probably 20 or 30 times more — than I would have actively sought. There may be many articles that are casually read but which would never be requested.
Newcomb’s observation of books of logarithms probably wouldn’t be useful here; I guess, though I don’t know, that all pages of books of logarithms in 1880 were more heavily-used than almost any paper scholarly journal today. I don’t recall ever seeing one fingered black on any section.
@Jason: Good point. No article will ever be read as often as logarithm books were used in the past. Maybe you could dust journals for finger prints. :)
If journal volumes are allowed to circulate, you could see which ones are checked out. Otherwise you could look in the copier area and see which journals are left there to be reshelved.
I agree that journals should give more feedback to authors about the number of times their article has been read.
One of the motivating and educative features of blogging is that you can see immediately how many hits you are getting.
However, if you hunt around, you can put together some rough estimates of journal article readership. The bibliometric literature has reported on some of this.
I can’t remember all the articles but my impression was that:
1. Decent journals have several thousand paid subscribers.
This includes institutional subscriptions which support many more students and researchers.
2. Readership levels are much greater than suggested in this post:
Journal of Vision (a middle to upper level journal based on impact factor) provides download reports:
http://www.journalofvision.org/content/9/4/i?related-urls=yes&legid=jov;9/4/i
“In the most recent accounting in July, 2008, the top five articles were each downloaded between 1,993 and 3,478 times.”
This presumably does not include readers who download by other means or who read paper copies, so actual readership could easily be two to ten times larger.
If I were to formulate a very rough estimate I’d say that the number of readers is somewhere between 50 and 1,000 times the number of citations.
@Jeromy: I’m not saying that nobody reads journal articles. I’m saying that nobody reads most journal articles. I’m sure that popular articles have tens of thousands of readers.
I’m skeptical of inferring readership from subscriptions. A university may subscribe to a list of journals just because they’re expected to. If an individual subscribes to a journal, they probably read articles from it. It would be interesting to survey subscribers to ask which articles they read.
So most articles will barely be read (if at all). Is it predictable prior to publication which ones are which?
I think you’re saying it is predictable: people are forced to publish things they know no one wants to read. In that case, you’re pointing to a problem that needs fixing.
But if it weren’t predictable, then I’m not sure you’re pointing to a problem. If it’s unpredictable which ones will never be read, then saying they shouldn’t be published is like saying I shouldn’t have submitted most of the job applications I did. (I know that most of them won’t even result in interviews! But no one can know which ones…)
Check out PLoS ONE: they show how many views an article has had. I haven’t yet found a summary of these view counts, but when I clicked on around 10 articles from different areas, each 2 or 3 years old, most had view counts between 1000 and 4000.
After looking a little further, I found the data summarising PLoS ONE article usage:
http://www.plosone.org/static/journalStatistics.action#PLoSONE
There’s even a link at the bottom to the raw data with article-level usage for (from what I can tell) all articles in the PLoS journals.