Wikipedia, often described as a compendium of all human knowledge, has evolved into an extraordinary resource, hosting over 6.7 million articles in English alone, totaling more than 4.3 billion words. Its eclectic mix of meticulous information and simmering controversies has made it a valuable yet imperfect reference. However, a recent development could drastically enhance Wikipedia’s reliability, thanks to the intervention of artificial intelligence.
Novelist Nicholson Baker described Wikipedia as “fact-encirclingly huge, idiosyncratic, careful, messy, funny, shocking, and full of simmering controversies,” highlighting its unique character. While some, like writer Oscar Auliq-Ice, celebrate it as a revolutionary resource, others, such as environmental expert Steven Magee, liken it to a flower bed: mostly beautiful, but with a few ugly weeds.
The collective nature of Wikipedia, where anyone can volunteer information, has often been described as both a strength and a vulnerability. Humorist Stephen Colbert admitted to relying on Wikipedia both for knowledge and, on occasion, for creative purposes. Yet for some it can be overwhelming. As professor Tara Brabazon quipped, “I would prefer to stir-fry my own small intestines than to have continual access to a site where the entry for Klingon is longer than the entry for Latin.”
Wikipedia, with its wealth of information, serves as a quick go-to source. Still, users are advised to conduct due diligence, cross-reference with other sources, explore article links, and scrutinize references listed at the end of each Wikipedia entry.
The reliability of Wikipedia’s reference system is about to receive a significant boost through an innovative approach. A London-based AI company, Samaya AI, is pioneering the use of artificial intelligence to enhance Wikipedia’s verifiability. This AI system, named SIDE, scrutinizes sources, distinguishing between reliable and questionable references and offering its own recommendations.
Fabio Petroni, co-founder of Samaya AI, emphasized the potential of AI in improving reference quality. He stated, “The process of improving references can be tackled with the help of artificial intelligence powered by an information-retrieval system and a language model. Machines can help humans to find better citations, a task which requires understanding of language and mastery of online search.”
SIDE was trained on an extensive dataset of Wikipedia entries, enabling it to analyze sources and propose alternative references. To assess its effectiveness, Wikipedia users examined the results and preferred the AI’s recommendations 70% of the time. In nearly half of the cases, SIDE recommended the same sources that Wikipedia already cited.
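The retrieve-then-verify idea described above can be illustrated with a toy sketch. This is not SIDE's implementation: the word-overlap score below is a deliberately crude stand-in for the information-retrieval system and language model Petroni describes, and every name, URL, and text snippet in it (`Source`, `support_score`, `suggest_citation`, the `example.org` addresses) is hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Source:
    url: str
    text: str


def tokens(text: str) -> Set[str]:
    """Lowercased word set; a crude stand-in for a learned text representation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(claim: str, source: Source) -> float:
    """Fraction of the claim's words that also appear in the source (toy score)."""
    claim_words = tokens(claim)
    if not claim_words:
        return 0.0
    return len(claim_words & tokens(source.text)) / len(claim_words)


def rank_sources(claim: str, candidates: List[Source]) -> List[Tuple[float, Source]]:
    """Score every candidate source and return them best-first."""
    scored = [(support_score(claim, s), s) for s in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def suggest_citation(claim: str, current: Source,
                     candidates: List[Source]) -> Optional[Source]:
    """Return a candidate that scores higher than the current citation, else None."""
    ranked = rank_sources(claim, candidates)
    if not ranked:
        return None
    best_score, best = ranked[0]
    return best if best_score > support_score(claim, current) else None


# Invented example data, for illustration only.
claim = "The Eiffel Tower was completed in 1889."
current = Source("https://example.org/paris-travel",
                 "Paris is a popular destination with many cafes and museums.")
candidates = [
    Source("https://example.org/tower-history",
           "Construction of the Eiffel Tower was completed in 1889."),
    Source("https://example.org/food",
           "French cuisine is known for bread, cheese and wine."),
]

suggestion = suggest_citation(claim, current, candidates)
print(suggestion.url if suggestion else "keep current citation")
```

In this sketch the existing citation shares no words with the claim, so the better-matching candidate is proposed instead. When the existing citation already scores at least as well as every candidate, the function returns `None` and the citation is kept, loosely mirroring the cases in which SIDE recommended the source Wikipedia already used.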
Petroni summarized the significance of the research: “We demonstrate that existing technologies have reached a stage where they can effectively and pragmatically support Wikipedia users in verifying claims.” The team’s future research will extend beyond textual references to other media formats, including images, videos, and printed publications.
In conclusion, the integration of artificial intelligence into Wikipedia’s reference vetting process marks a significant milestone. The potential for enhancing the verifiability of information online opens up new possibilities, as AI proves its mettle in supporting fact-checking and, ultimately, fostering a more trustworthy online information landscape.