TL;DR: There’s been a recent major upgrade to the private Papermill Alarm. It looks like we can predict the inclusion of journals on the Chinese Academy of Sciences (CAS) lists. You know how data scientists make predictions?
This is Ralph. How tall is Ralph? It seems simple: you could just hold a ruler up to the screen. But when you look at the ruler and use it to measure Ralph, are you actually measuring Ralph, or are you measuring the ruler and using that as a proxy? How accurate is your measurement? Are you including fur in the measurement? What if Ralph were to stand on his hind legs, like a mighty bear? How tall would he be then?
There’s a quote attributed to Ernest Rutherford: “That which is not measurable is not science. That which is not physics is stamp collecting”. I think his point was that a lot of scientific work is just documenting things. In Rutherford’s day, there was a lot of exciting new creative work happening in physics, so perhaps physics seemed special to him.
Recently, I gave a presentation on the APIs for Papermill Detection offered by Clear Skies Ltd. I also touched on a newer service called the Clear Skies Standard Report. More on that in future posts… :) Here’s the video: https://medium.com/media/ee5a04aaed9fc53d2748e64516178ebe/href Would you like to know more?
TL;DR: Join me at ConTech Live to hear about a recent project with Open Credo to see if we could detect unusual co-authorships in a dataset created by Anna Abalkina. Sign up here! Papermilling has a few definitions which you see here and there.
Also… what is an API? The Papermill Alarm API is a service to which you can send some article metadata, and which will return an alert telling you if the paper looks like past papermill-products. Anyone can use it, but it definitely helps to have the support of an IT or data professional.
Last week, a paper I wrote on the subject of peer-review fraud was published in the journal Scientometrics (free link here, preprint here). It was an interesting project to work on. I found a lot of examples where one referee would write a report during peer-review and then another referee would write an identical report in some other peer-review of some other paper.
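To give a flavour of how identical reports from different referees might be spotted, here is a minimal sketch. It is not the method used in the paper: the function names, the `(report_id, referee, text)` layout and the sample reports are all made up for illustration, and it only catches reports that match exactly after lowercasing and collapsing whitespace.

```python
import hashlib
import re
from collections import defaultdict


def normalise(text):
    """Lowercase the text and collapse runs of whitespace to single spaces."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def find_duplicate_reports(reports):
    """Group reports whose normalised text is identical.

    `reports` is a list of (report_id, referee, text) tuples (a hypothetical
    layout). Only groups involving more than one referee are returned.
    """
    groups = defaultdict(list)
    for report_id, referee, text in reports:
        digest = hashlib.sha256(normalise(text).encode("utf-8")).hexdigest()
        groups[digest].append((report_id, referee))
    return [
        sorted(group)
        for group in groups.values()
        if len({referee for _, referee in group}) > 1
    ]


# Made-up example data, for illustration only.
reports = [
    ("r1", "referee_x", "The paper is well written.  Accept."),
    ("r2", "referee_y", "the paper is well written. accept."),
    ("r3", "referee_z", "Needs major revision."),
]

duplicates = find_duplicate_reports(reports)
print(duplicates)
```

Exact matching is the simplest possible choice. Reports that were lightly edited before reuse would need a fuzzier comparison.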
This post is about The Papermill Alarm: an API for detecting potential papermill-products. There’s a field of study called ‘stylometry’ where we look at the statistical properties of someone’s writing and use that to model their ‘style’. People write in idiosyncratic ways.
I once saw a brilliant presentation about how simple data analysis can detect credit card fraud. The presentation showed a pattern in how people use their credit cards. Given a large number of people who had been victims of credit card fraud, this pattern showed there was just one store in particular where they had all used their cards. There was no observational evidence of someone at that store stealing card details.
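The idea of finding the one store that every victim has in common can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the data layout and the store names are invented here, and the presentation's actual analysis is not known.

```python
from collections import Counter


def count_victims_per_store(victim_histories):
    """For each store, count how many fraud victims used their card there.

    `victim_histories` maps a victim id to the list of stores they visited
    (a hypothetical layout). Each store counts once per victim.
    """
    counts = Counter()
    for stores in victim_histories.values():
        counts.update(set(stores))
    return counts


# Made-up example data, for illustration only.
victims = {
    "victim_a": ["grocer", "petrol_station", "cafe"],
    "victim_b": ["bookshop", "petrol_station"],
    "victim_c": ["petrol_station", "grocer", "petrol_station"],
}

counts = count_victims_per_store(victims)
shared_by_all = [store for store, n in counts.items() if n == len(victims)]
print(shared_by_all)
```

With real data, a store that everybody uses anyway (a big supermarket, say) would also show up, so the counts would need comparing against how often non-victims shop at the same place.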
The first time I flew over the North Atlantic was quite an experience. Through the clouds, I could see some little white boats out sailing in the sea. It was puzzling: from 30,000 feet, those boats must have been huge for me to be able to see them at all.