

Artificial Intelligence

A Hitch in Accurate Detection of AI-Written Content

Study finds AI GPT detectors misclassify work by non-native English students.

JuliusH/Pixabay

Can artificial intelligence (AI) accurately spot GPT-generated content? A new Stanford University study finds that AI GPT detectors are unreliable, particularly when evaluating content by non-native English authors.

“This paper is among the first to systematically examine the biases present in GPT detectors and advocates for further research into addressing these biases and refining the current detection methods to ensure a more equitable and secure digital landscape for all users,” the Stanford researchers shared.

GPTs (Generative Pre-trained Transformers) are a type of AI large language model (LLM) built from artificial neural networks and trained with a semi-supervised method for language understanding tasks. Transformers are a deep-learning architecture for machine learning. A GPT first undergoes unsupervised generative pre-training on massive datasets of unlabeled text to determine the model parameters, followed by supervised fine-tuning, in which the model is adapted to a discriminative task using labeled data.
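The two-stage recipe can be illustrated with a minimal sketch. This is purely illustrative: a real GPT is a transformer neural network, whereas here a simple next-word frequency table stands in for the model, and the fine-tuning "labels" are invented examples.

```python
# Illustrative sketch of "pre-train on unlabeled text, then fine-tune
# on labeled data". A bigram count table stands in for a real model.
from collections import Counter, defaultdict

def pretrain(unlabeled_corpus):
    """Unsupervised stage: learn next-word statistics from raw text."""
    model = defaultdict(Counter)
    for sentence in unlabeled_corpus:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            model[current][nxt] += 1
    return model

def finetune(model, labeled_examples):
    """Supervised stage: adapt the pre-trained model to a
    discriminative task using (input, label) pairs."""
    task_head = {text: label for text, label in labeled_examples}
    return model, task_head

model = pretrain(["the cat sat on the mat", "the dog sat on the rug"])
model, task_head = finetune(model, [("the cat sat on the mat", "about an animal")])
print(model["sat"].most_common(1))  # most likely word after "sat"
```

The key point the sketch preserves is the ordering: the bulk of the learning happens on unlabeled text, and only a small labeled dataset is needed afterward to specialize the model.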

Examples of tools built on such large language models include Google Bard, Microsoft Bing, Amazon CodeWhisperer, YouChat, ChatSonic, GitHub Copilot, OpenAI Playground, Character AI, Elicit, Perplexity AI, Jasper, Anthropic Claude, and the widely popular ChatGPT by OpenAI. Within two months of its public release in November 2022, the AI chatbot ChatGPT had gained over 100 million monthly unique visitors, according to a UBS study based on data analytics from Similarweb (NYSE: SMWB), a digital intelligence platform provider.

ChatGPT is affecting education. According to March 2023 research by the Walton Family Foundation, ChatGPT has spread widely in schools. Of the 1,000 students polled, 47% of those aged 12-14 and 33% of those aged 12-17 reported using ChatGPT for school. The figure is even higher for educators: 51% of the 1,000 K-12 teachers surveyed reported using ChatGPT.

“Many teachers consider GPT detection as a critical countermeasure to deter ‘a 21st-century form of cheating,’ but most GPT detectors are not transparent,” wrote the Stanford researchers. “Claims of GPT detectors’ ‘99% accuracy’ are often taken at face value by a broader audience, which is misleading at best, given the lack of access to a publicly available test dataset, information on model specifics, and details on training data.”

For this study, the Stanford research team of James Zou, Eric Wu, Yining Mao, Mert Yuksekgonul, and Weixin Liang analyzed seven commonly used GPT detectors on 88 essays written by American eighth graders from the Hewlett Foundation ASAP dataset and 91 TOEFL (Test of English as a Foreign Language) essays from a Chinese forum.

The researchers found that, across the board, the AI GPT detectors showed bias against non-native English authors, with an average false-positive rate of over 61% on the TOEFL essays written by non-native speakers; one detector incorrectly flagged over 97% of the TOEFL essays as AI-generated. According to the researchers, the detectors' reliance on text perplexity is the culprit. Text perplexity measures how difficult it is for a generative language model to predict the next word: predictable prose yields low perplexity, while varied, idiosyncratic prose yields high perplexity. Non-native writing, which tends toward a more limited vocabulary and simpler constructions, therefore often scores low on perplexity, the very signal detectors treat as evidence of machine generation.
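The perplexity signal itself is simple to compute. The sketch below shows the standard formula, exp of the average negative log-probability per token; the per-token probabilities are invented for illustration, since a real detector would obtain them from a language model.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability
    the model assigns to each observed token."""
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)

# Hypothetical per-token probabilities a language model might assign.
# Simple, predictable prose -> high probabilities -> low perplexity.
simple_prose = [0.60, 0.55, 0.70, 0.50]
# Varied, idiosyncratic prose -> low probabilities -> high perplexity.
varied_prose = [0.10, 0.05, 0.20, 0.08]

print(perplexity(simple_prose))  # the lower of the two values
print(perplexity(varied_prose))  # the higher of the two values
```

A detector that flags low-perplexity text as machine-generated will therefore also flag human writing that happens to be highly predictable, which is the failure mode the study documents for non-native authors.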

“Our findings emphasize the need for increased focus on the fairness and robustness of GPT detectors, as overlooking their biases may lead to unintended consequences, such as the marginalization of non-native speakers in evaluative or educational settings,” the Stanford researchers concluded.

Copyright © 2023 Cami Rosso All rights reserved.
