ArXiv crackpot filter developed by accident

Dr. Courtney · May 31, 2016

Bill McKeeman said:

I think it is no accident that breakthroughs often come from the young or weirdos -- they are not yet entrapped in the warm and blinding company of their peers. We do not want to be bored by crackpots, and they surely would disrupt a forum designed to explain the scientific edifice, but we also do not want to miss a grand idea just because the presentation or presenter is rough around the edges.

Dale said:

I don't think this has ever happened.

It depends on how big a threshold you have for a "grand idea."

My most widely cited paper was in a field I had never published in before. My PhD is in experimental atomic physics; the paper described a hypothesis my wife and I developed in blast-related traumatic brain injury. We've offended lots of neuroscientists since then due to our rough edges and lack of knowledge regarding the secret handshake and subtle use of the language. But our hypothesis that the brain can be injured by the blast wave impacting the thorax has since been experimentally verified and stood the test of time.

Unlike the authors of many crackpot papers, my wife and I spent a long time carefully reviewing a large volume of relevant literature while developing the idea and writing the paper. We considered carefully what kinds of experiments would be needed to test the hypothesis and whether there was already a convincing experimental disproof in the literature. We knew that there was a strong bias among experts in the field of brain injury (and the medical community in general) that only insults to the head/brain can cause traumatic brain injuries. See: https://arxiv.org/ftp/arxiv/papers/0812/0812.4757.pdf

We knew through private communications and a review of the literature that our paper would likely be subject to strong biases in the peer-review process. As a result, we chose to publish it in a journal that was not peer reviewed (Medical Hypotheses) as well as arXiv. The follow-up paper where we estimated injury thresholds for the thoracic mechanism of blast-induced traumatic brain injury was published through a back door: a special issue of a journal reporting findings presented at a conference (as well as arXiv). The existing biases against the new idea would probably have otherwise prevented either paper from being published through the more traditional peer review process of submitting the paper to established experts. See: https://arxiv.org/ftp/arxiv/papers/1102/1102.1508.pdf

Both papers have been widely cited, and the benefit of hindsight shows them to be basically right, although the threshold paper somewhat overestimated the blast pressure needed to produce a traumatic brain injury by impacting the thorax. The whole field of blast-TBI shows how wrong the experts can be. For many decades most of the brain injury experts scoffed at the idea of blast-induced TBI (shell shock) and mis-attributed all the symptoms to psychological effects and malingering.

ShayanJ · May 31, 2016

Today a crackpot gave a talk in my university.
Unfortunately I wasn't good enough at math and physics to criticize him properly (he was of the advanced kind, and the audience were master and doctoral students and two professors! Oh... and a postdoc!) but I did put up a fight.
It's amazing how many people actually take them seriously.

BiGyElLoWhAt · May 31, 2016

I suppose let me reiterate. I'm not suggesting that we are missing out on any idea from "crackpots", but that we are potentially missing out on ideas from papers that are mislabeled as crackpottery and fall through the cracks.
As said in the article, there isn't a clear definition of what is considered crackpottery, we just sort of "know" when it is or isn't, which is why it's so surprising that this labeling algorithm is pulling out the crackpot papers. This is mainly due to the correlation between odd phrases and crackpottery. The two don't necessarily have to go hand in hand, either. Looking through the comments after the article, the point arises: How many of Einstein's papers would be filtered out as crackpottery?
Of course, those would probably belong to the group of crackpot papers that were later realized not to be crackpot papers at all.

Dale · May 31, 2016

Dr. Courtney said:

My most widely cited paper...

And so your grand idea was not lost.

Dale · May 31, 2016

BiGyElLoWhAt said:

that we are potentially missing out on ideas from papers mislabeled as crackpottery

Again, I don't think that this happens. Important ideas are usually developed by multiple people independently.

To me, this is like saying that the NFL is missing out on all of the touchdowns that could be scored by people who have played sports video games. Or that medicine is missing out on all of the lives that could have been saved by all of the people who have watched an episode of Grey's Anatomy. In principle it could happen, but in practice it is a nonexistent concern.

BiGyElLoWhAt said:

How many of Einstein's papers would be filtered out as crackpottery?

Probably 0. He was not a crackpot and he did not write like a crackpot.

f95toli · May 31, 2016

BiGyElLoWhAt said:

How many of Einstein's papers would be filtered out as crackpottery?
Of course, those would probably belong to the group of crackpot papers that were later realized not to be crackpot papers at all.

Maybe a few (some of the papers he wrote towards the end of his life WERE borderline crackpottery). However, that would be true of many papers written during that era. One of the signs of a crackpot is someone who writes in an anachronistic style, usually because they are not familiar with modern terminology or notation (typically because someone like Tesla is their big hero). Modern physics and its many conventions when it comes to style of writing etc. have only really been around since the 1930s and did not really take off until after WWII (due to the explosion in the number of researchers and the number of journals and the establishment of English as the lingua franca of physics).
If someone were to submit an article written using 1970s notation (using, say, exclusively CGS units*) to a journal and I was the referee, I would reject it outright and I would presume it was written by a crackpot.

*Yes I do know CGS units are still used in certain fields (magnetic materials etc), but they are all weird...

BiGyElLoWhAt · May 31, 2016

Dale said:

Probably 0. He was not a crackpot and he did not write like a crackpot.

I think you're missing what I'm saying, which is just emphasizing one of the points made in the article. Define: Crackpottery. (rigorously define, please).
The article emphasized the ability of this algorithm to find crackpot articles, but the surprising thing is not that it does this, it's that it isn't designed to. It has effectively found a correlation between how uncommon a paper's phrases are and the quality of the paper's substance. This in no way says that someone who uses odd phrases is a crackpot; they might just not have either A) the experience to know which phrases are common enough to use, or B) the ability to care about which phrases are common enough to use.

Is it possible that we have missed zero ideas because crackpottery is ill-defined? Sure. Is it possible (and in my opinion more likely) that we have missed at least one good idea because of this? Also yes.
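
To make the "uncommon phrases" idea above concrete, here is a minimal sketch of one way to score how unusual a text's phrasing is relative to a reference corpus. It is not the arXiv classifier described in the article; the function names (`bigrams`, `rarity_score`), the use of word bigrams with add-one smoothing, and the toy sentences are all assumptions made for illustration.

```python
from collections import Counter
import math


def bigrams(text):
    """Split text into lowercase word bigrams."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


def rarity_score(document, reference_counts, total):
    """Mean surprisal (in bits) of a document's bigrams under the reference corpus.

    Add-one smoothing keeps unseen bigrams finite. Higher means more unusual phrasing.
    """
    grams = bigrams(document)
    if not grams:
        return 0.0
    vocab = len(reference_counts) + 1  # one extra bucket for unseen bigrams
    surprisals = [
        -math.log2((reference_counts[g] + 1) / (total + vocab)) for g in grams
    ]
    return sum(surprisals) / len(surprisals)


# Toy reference corpus, invented for illustration only.
reference = [
    "we measure the cross section of the process",
    "we measure the decay rate of the particle",
    "the cross section of the process is small",
]
counts = Counter(g for doc in reference for g in bigrams(doc))
total = sum(counts.values())

common = rarity_score("we measure the cross section", counts, total)
odd = rarity_score("vortex aether resonates cosmic truth", counts, total)
print(f"common phrasing: {common:.2f} bits, odd phrasing: {odd:.2f} bits")
```

A score like this only measures how far the wording is from the reference corpus. As the post says, a high score could equally come from an inexperienced author or one who does not care about convention, so it is a correlate of crackpottery at best, not a definition of it.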

BiGyElLoWhAt · May 31, 2016

f95toli said:

some of the papers he wrote towards the end of his life WERE borderline crackpottery

Which papers are you referring to? As far as I know, when he was dying, he was working on Kaluza-Klein. Surely that's not what you're referring to? If so then it follows that string theory is crackpottery?

Dale · May 31, 2016

BiGyElLoWhAt said:

I think you're missing what I'm saying, which is just emphasizing one of the points made in the article. Define: Crackpottery. (rigorously define, please).

Crackpottery in physics is the thing measured by the following test:
http://math.ucr.edu/home/baez/crackpot.html

Similar measurement instruments could be developed in other fields.

BiGyElLoWhAt · May 31, 2016

Dale said:

Crackpottery in physics is the thing measured by the following test:
http://math.ucr.edu/home/baez/crackpot.html
Similar measurement instruments could be developed in other fields.

That's interesting. I'm not sure how this actually works, though. It seems as though the more points, the more crackpot it is. Considering number 1 is a 5 point starting credit, everything is a little bit crackpotty.
2) Were I working on, say, quantum gravity, it is likely that Einstein and Feynman would pop up on occasion, rendering my otherwise perfectly valid paper highly crackpotty. I suppose (from reading further on) that he's referring to the misspellings?
3) I'm pretty sure there's a reward out there to prove asymptotic freedom (maybe it's quark confinement?). I would assume that this reward would be equally applicable to someone proving the nonexistence of the proof for asymptotic freedom (or quark confinement).
4) Quite honestly, most of that is just silly, and I could probably come up with another list that included none or few of these elements and it could be equally applicable.

It is kind of funny that someone put the time into coming up with that list, though.
I particularly like number 31) "30 points for claiming that your theories were developed by an extraterrestrial civilization (without good evidence)."
I think the theory would potentially be overlooked by the "good evidence" part

BiGyElLoWhAt · May 31, 2016

However, this does seem to justify my reemphasis of the point that there is no rigorous definition of whether a paper is crackpottery or not.

ShayanJ · May 31, 2016

BiGyElLoWhAt said:

Considering number 1 is a 5 point starting credit, everything is a little bit crackpotty.

That's a -5 starting credit!

BiGyElLoWhAt said:

It is kind of funny that someone put the time into coming up with that list, though.

Not funny when you remember that he used to be (and maybe still is) flooded with letters from crackpots!

BiGyElLoWhAt · May 31, 2016

I suppose that is a (-)5 point credit. Apologies.

JorisL · May 31, 2016

Also, you have to read the names carefully: Einstien, Feynmann and Hawkins!
They are misspelled.

BiGyElLoWhAt · May 31, 2016

I did see that, but only after his point about messaging him about how he misspelled Einstien. I just sort of skimmed through it, if that wasn't obvious.

Hornbein · May 31, 2016

Dale said:

I don't think this has ever happened.

Galois couldn't get his paper accepted because it was written so badly. "The ink was almost white," the reviewer said.

Grassmann couldn't get linear algebra accepted because of the rough presentation. He self-published.

JorisL · May 31, 2016

Hornbein said:

Galois couldn't get his paper accepted because it was written so badly. "The ink was almost white," the reviewer said.

Well then he should have put more time into writing his paper. It's like complaining that your paper isn't accepted because it has a spelling error every other line.

Grassmann couldn't get linear algebra accepted because of the rough presentation. He self-published.

What do you mean by rough presentation? Ambiguous? Or perhaps too fast (using the trivial-argument)? In both cases I'd say that it doesn't matter how good your research is if the reader has to go to extraordinary lengths to comprehend it,
be it using a magnifying glass to read it or figuring out new techniques with only a statement of the problem and the result.

Hornbein · May 31, 2016

JorisL said:

What do you mean with rough presentation, ambiguous? Or perhaps too fast (using the trivial-argument)?

He was self-taught and used strange terminology, I think. In at least one case the referee was enthusiastic about the contents, but thought the presentation was too poor. It didn't help that his ideas were so original.

JorisL · May 31, 2016

Hornbein said:

He was self-taught and used strange terminology, I think. In at least one case the referee was enthusiastic about the contents, but thought the presentation was too poor.

Isn't that normal? How can you understand something if the terminology isn't standard? That adds an extra layer of difficulty.
Ideally the referee would help him contact someone who's willing to help, but this requires at least a basic understanding of what's actually done.

Hornbein · May 31, 2016

JorisL said:

Isn't that normal? How can you understand something if the terminology isn't standard? That adds an extra layer of difficulty.
Ideally the referee would help him contact someone who's willing to help, but this requires at least a basic understanding of what's actually done.

The question under discussion is whether we might "miss a grand idea just because the presentation or presenter is rough around the edges." The question is not whether poor presentations are difficult to understand.

Dale · May 31, 2016

BiGyElLoWhAt said:

However, this does seem to justify my reemphasis of the point that there is no rigorous definition of whether a paper is crackpottery or not.

The point is that it isn't necessary to have a separate definition, just a measurement that you define as crackpottery.

The arXiv filter itself can be considered a measurement, and crackpottery can be defined as a particular score or range of scores on that measurement. You could even take some reference standard crackpots and calibrate other similar measurements. All without ever providing a non-empirical definition.
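
The calibration idea above can be sketched in a few lines: take papers whose status is already agreed on, score them with whatever instrument is being calibrated, and pick the cutoff that misclassifies the fewest of them. This is only an illustration of the principle. The function names and every score below are hypothetical, and nothing here reflects how the arXiv filter actually sets its threshold.

```python
def calibrate_threshold(crackpot_scores, mainstream_scores):
    """Pick the cutoff that misclassifies the fewest reference papers.

    A paper is labeled crackpot when its score is >= the cutoff.
    Returns (cutoff, number_of_reference_papers_misclassified).
    """
    candidates = sorted(set(crackpot_scores) | set(mainstream_scores))
    best_cutoff, best_errors = None, None
    for cutoff in candidates:
        missed = sum(s < cutoff for s in crackpot_scores)      # crackpots let through
        flagged = sum(s >= cutoff for s in mainstream_scores)  # sound papers flagged
        errors = missed + flagged
        if best_errors is None or errors < best_errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff, best_errors


def is_crackpot(score, cutoff):
    """Apply the empirical definition: crackpottery is a score at or above the cutoff."""
    return score >= cutoff


# Hypothetical scores for "reference standard" papers, made up for this example.
reference_crackpots = [40, 55, 70, 32]
reference_mainstream = [-5, -3, 0, 10, 35]

cutoff, errors = calibrate_threshold(reference_crackpots, reference_mainstream)
print(cutoff, errors)  # 32 1
```

With these made-up numbers one mainstream paper still lands above the cutoff, which is the mislabeling concern raised earlier in the thread: an empirical definition is workable, but it has a nonzero error rate on anything near the boundary.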

BiGyElLoWhAt · May 31, 2016

I suppose that's a valid point.

Dale · May 31, 2016

Hornbein said:

Galois couldn't get his paper accepted because it was written so badly. "The ink was almost white," the reviewer said.

Grassmann couldn't get linear algebra accepted because of the rough presentation. He self-published.

And we have not missed their ideas, we have them and use them.

I think that Galois' case is pretty much a worst-case example; the dissemination was delayed by about 10 years due to the poor presentation.

Vanadium 50 · May 31, 2016

Dale said:

the dissemination was delayed by about 10 years due to the poor presentation.

And the fact that Galois was unable to revise anything in those 10 years, being rather inconveniently dead.

Dale · May 31, 2016

Vanadium 50 said:

And the fact that Galois was unable to revise anything in those 10 years, being rather inconveniently dead.

Yes, pretty much a worst case indeed.

Dr. Courtney · May 31, 2016

Dale said:

And so your grand idea was not lost.

Ah, but only because we were able to chart a course to publication that cleverly avoided the peer review that would have risked biased reviewers labeling our hypothesis as "crackpottery."

It is notable that the journal we published the paper in is now peer-reviewed and, in fact, rejected a later paper we submitted. We eventually got it published in a lower tier journal, but the key idea of that one has been under-appreciated, and is in danger of being lost. The key idea (hypothesis) in the later paper is that the cranium exhibits hysteresis with regard to blast wave transmission - previous exposures to blast waves increase the blast wave transmitted in subsequent exposures. Unlike the first paper, no one has bothered to do the necessary experiments to further test the hypothesis, even though the idea is fairly simple.

Hornbein · May 31, 2016

The hypothesis that "the good ideas always come through eventually" is both unverifiable and unfalsifiable. If a good idea never came through, then we would not know about it. There simply is no way to collect any evidence one way or the other.

Dr. Courtney · May 31, 2016

Hornbein said:

The hypothesis that "the good ideas always come through eventually" is both unverifiable and unfalsifiable. If a good idea never came through, then we would not know about it. There simply is no way to collect any evidence one way or the other.

This is a great point.

Further, the idea that "the good ideas always come through eventually" neglects the benefits to others of great ideas coming through in a timely manner. How many lives did Pasteur save with the rabies vaccine? Jenner with the smallpox vaccine?

Finally, the idea that "the good ideas always come through eventually" also neglects the fundamental injustice in failing to recognize the scientist(s) who were truly first with the idea. Posting to arXiv establishes priority, even if the idea seems like crackpottery. I hope arXiv is keeping records of all their rejections, so any ideas that turn out to be true and important can be duly noted.

strangerep · Jun 1, 2016

Dr. Courtney said:

I hope arXiv is keeping records of all their rejections, [...]

Heh, isn't that what viXra is for?

mfb · Jun 1, 2016

Hornbein said:

The hypothesis that "the good ideas always come through eventually" is both unverifiable and unfalsifiable. If a good idea never came through, then we would not know about it. There simply is no way to collect any evidence one way or the other.

It is possible to study cases in the past: if something was rejected initially but accepted later (which lets us collect a sample), how long did it take? Can we fit some spectrum to it and extrapolate to "very long times"?
It is also possible to study who suggested it earlier. And, not surprisingly, none of them were crackpots.

Bill McKeeman said:

I think it is no accident that breakthroughs often come from the young or weirdos

Do you have a reference for that? Also keep in mind that most researchers are young - because most PhD students don't get a permanent position.

BiGyElLoWhAt · Jun 1, 2016

Most of QM was developed by young "weirdos". That's easy to look up. It was even nicknamed young science or something along those lines, because most of the people involved were in their early 20s.

Dale · Jun 1, 2016

Dr. Courtney said:

Ah, but only because we were able to chart a course to publication that cleverly avoided the peer review that would have risked biased reviewers labeling our hypothesis as "crackpottery."

First, you didn't go that route, so you don't know what would have happened if you had. It very well could have been accepted, or it could have been rejected in the first tier journal and accepted in a second tier journal as is fairly common. You don't know and can't know. If you are going to assert knowledge of hypothetical scenarios then I would assert that someone else would have figured out the same idea.

Second, the point remains that the idea itself was not lost. This is the supposed risk, which I don't believe: that good ideas will be lost because of who presents the idea or how it is presented. There are, IMO, just too many redundancies: outsiders can polish their presentation and get it published in peer reviewed journals, they can publish in nonstandard channels, or other people can develop the same idea and present it better.

Dr. Courtney · Jun 1, 2016

Dale said:

First, you didn't go that route, so you don't know what would have happened if you had. It very well could have been accepted, or it could have been rejected in the first tier journal and accepted in a second tier journal as is fairly common. You don't know and can't know. If you are going to assert knowledge of hypothetical scenarios then I would assert that someone else would have figured out the same idea.

We've submitted enough papers related to blast injury to peer reviewed journals to be well familiar with the biases inherent against newcomers in the field. Like many fields in science, there is an established "in" crowd, and if you are not part of it, your submissions are more likely than not to be stalled in the process.

Quick publication of the paper was key in being widely cited: the field was hot, DoD was pouring a lot of money into research, and it turned out to be very useful to have all the dominant mechanistic hypotheses summarized in print in a timely manner. That paper set a speed record for submission to acceptance to publication:

7/28/2008 Approved for public release by Department of Defense
7/31/2008 Submitted to Medical Hypotheses
8/1/2008 Submission Acknowledged
8/3/2008 Accepted for Publication
9/12/2008 Published online

In contrast, we average about an 18-month delay from initial submission to publication for papers that are rejected by the first journal. One recent paper we published in blast injury took 12 months from submission to acceptance, even though it was an invited paper accepted by the first journal we submitted to. A 12-18 month delay would have meant research going forward, and many millions of DoD dollars being spent by scientists, without the benefit of considering our new hypothesis.
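
For a sense of scale, the arithmetic behind that comparison can be checked directly from the dates listed above. The sketch below uses only those dates; the dictionary keys are made-up labels, and converting "18 months" to days with an average month length is an approximation.

```python
from datetime import date

# Dates as given in the timeline above (month/day/year).
timeline = {
    "approved": date(2008, 7, 28),
    "submitted": date(2008, 7, 31),
    "acknowledged": date(2008, 8, 1),
    "accepted": date(2008, 8, 3),
    "published_online": date(2008, 9, 12),
}

days_to_acceptance = (timeline["accepted"] - timeline["submitted"]).days
days_to_publication = (timeline["published_online"] - timeline["submitted"]).days

# Rough conversion of the quoted 18-month average, using 365.25 / 12 days per month.
typical_delay_days = round(18 * 365.25 / 12)

print(days_to_acceptance)   # 3
print(days_to_publication)  # 43
print(typical_delay_days)   # 548
```

So the paper went from submission to acceptance in 3 days and to online publication in 43 days, against roughly 548 days for the 18-month case.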

arXiv may not offer the benefits of peer-review or filtering out of crackpottery, but it does offer the benefit of rapid dissemination. Discerning readers are capable of filtering out the nonsense themselves. When fields are hot and there is need for rapid progress, it helps to have ideas and new results available to a wide audience quickly.

Dale · Jun 1, 2016

Dr. Courtney said:

the field was hot

Making it even more likely that someone else would come up with the same idea even if you hadn't been able to get the word out.
