Researchers look at how ‘e-prints’ add to, or detract from, online discussions
Scientific papers that have not been peer-reviewed have advantages but can also spread misinformation
Traditionally, scientific research undergoes a rigorous peer review before publication in a professional journal. It’s not a perfect system, but it filters out a lot of flawed methodology and poorly thought-out conclusions.
The internet, of course, is a lot less discerning about what to accept as truth. One thing muddying the waters is the recent proliferation of “e-prints,” which are uploaded and disseminated without peer review.
In a study recently published in the Proceedings of the 15th Association for Computing Machinery (ACM) Web Science Conference 2023, PhD student Satrio Yudhoatmojo and Associate Professor Jeremy Blackburn from Binghamton University’s Thomas J. Watson College of Engineering and Applied Science collaborated with Professor Emiliano De Cristofaro at University College London (now at the University of California, Riverside) to examine how these e-prints are used on the online forums Reddit and 4chan.
Through their work at the International Data-driven Research for Advanced Modeling and Analysis (iDRAMA) Lab, the researchers have delved into the darker corners of the internet and studied the extremist political and social views found there.
“Those communities have access to our papers, and they’ve talked about our conclusions,” said Blackburn, a faculty member in Watson’s Department of Computer Science. “That got us thinking: Wait a second — it’s not just scientists who are reading the research that we put out there. It’s laypeople who aren’t trained scientists, and it’s definitely bad actors as well.”
Blackburn believes there are good aspects of e-prints and the servers that host them. Because the papers are free to download and read, they are available to anyone who can’t afford the fees to access professional journal articles. They also can provide timely information unimpeded by a lengthy peer review process. However, flawed research or even purposely deceptive claims spread false information that could do real damage.
“There was an explosion of e-prints talking about COVID during the early months of the pandemic, and many of them were cited by news and social media,” Yudhoatmojo said. “Some of those findings were not completely true at that moment.”
Looking at Reddit posts between mid-2005 and mid-March 2021 as well as posts from 4chan’s Politically Incorrect board from mid-2016 to mid-March 2021, the researchers found that topics ranged from computer science and physics to genetics and neuroscience, with particular emphasis on COVID-19 at the height of the pandemic.
When posters included links to e-prints to bolster their arguments, they sometimes misinterpreted what the research said or read just the abstracts without looking at or understanding the researchers’ methodologies. They missed flaws in how the research was done or failed to recognize when it was pseudoscience masquerading as serious scholarship.
“It’s interesting to see laypeople citing scientific articles directly, rather than news articles, blogs or something like that,” Yudhoatmojo said.
While there are clear advantages to getting scientific papers published more quickly, the researchers are concerned about how online commenters with different levels of expertise may view the content differently. Incorrect interpretations or low-quality papers may become the “gold standard” for some people. That could have dangerous implications for the scientific community.
“There are still a lot of open questions about this particular research,” Yudhoatmojo said. “We didn’t consider peer-reviewed papers that are open access, because the eight preprint servers that we decided to look at are not peer-reviewed. So there is a question about the quality of discussion that cited the peer-reviewed, open-access papers compared to the ones that we looked at.”
Blackburn added: “Science is about the production and dissemination of knowledge, and there’s always risks in it. Peer review is not perfect. There are politics involved, people make mistakes and most of the e-prints out there are not completely invalid. I think, ultimately, it’s been a big boon to science.
“At the same time, social media and all this kind of stuff mean that there are going to be laypeople accessing our work as well. We as scientists should think about how we’re not just communicating with other scientists.”
A happy reunion
While writing (and rewriting) this paper, Yudhoatmojo was also coping with a separation from his family that began before the pandemic.
He came to Binghamton from Indonesia through a Fulbright scholarship in fall 2019, leaving behind a wife and young daughter. He planned to bring them to the U.S. after his first year here, but then COVID-19 prevented that.
Although Yudhoatmojo posted the paper as an e-print in late 2021, it underwent several revisions before passing a peer review for conference publication.
“That was an interesting struggle for Satrio, dealing with COVID, with being here for four years without his family and trying to get a paper about science accepted as science,” Blackburn said.
Meanwhile, Yudhoatmojo hasn’t been able to go home to see his family until this summer. He’s glad he can finally see his daughter, now 6 years old, and that he has some progress to show for his time in the U.S.
“I’m feeling a little bit of relief about going back home with this paper published so that I can continue with my work when I return to Binghamton,” he said.