ChatGPT, AI, and Implications for Higher Education

Detecting Machine Generated Text

Tools to detect machine-generated text are relatively new and are being developed, released, and improved all the time. As of early February 2023, no tool is perfectly accurate.

OpenAI, the creator of ChatGPT, released its own detection tool, the AI Text Classifier, on January 31, 2023, then withdrew it on July 20, 2023, because of its low rate of accuracy. OpenAI acknowledged the classifier's limitations at launch and expanded on them in a blog post, and The Guardian described the imperfect tool in an article published the day after its release: ChatGPT maker OpenAI releases ‘not fully reliable’ tool to detect AI generated content.

Because of these limitations, no single tool should be trusted on its own; running the same writing sample through multiple detection tools and comparing their results will give the most reliable picture, as sketched below.
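As a rough illustration of that advice, the sketch below combines scores from several detectors into a single summary. The tool names and probabilities are purely illustrative; in practice you would substitute whatever scores you can actually obtain from the tools available to you.

```python
from statistics import mean
from typing import Dict, List


def combine_detector_scores(scores: Dict[str, float], threshold: float = 0.5) -> Dict[str, object]:
    """Summarize results from several AI-text detectors run on the same sample.

    `scores` maps a detector name to the probability (0.0-1.0) it assigned to
    the sample being machine generated. Names and numbers here are illustrative.
    """
    flagged: List[str] = [name for name, s in scores.items() if s >= threshold]
    return {
        "average_score": mean(scores.values()),   # simple average across tools
        "flagged_by": flagged,                    # tools that crossed the threshold
        "agreement": len(flagged) / len(scores),  # fraction of tools that flagged it
    }


# Example with made-up numbers: two of three tools flag the sample.
print(combine_detector_scores({"tool_a": 0.91, "tool_b": 0.64, "tool_c": 0.22}))
```

Looking at agreement across tools, rather than any single score, is the point of the advice above: one false positive is much less damning when the other tools disagree.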


 

Free Detection Tools

Turnitin AI Detection

Turnitin has released an AI writing detection feature: as of April 4, 2023, an AI writing indicator is included in the similarity report.

Sneak preview of Turnitin’s AI writing and ChatGPT detection capability from February 2023

Learn more about Turnitin and similarity reports:

Further Reading

Research on Detection

Research on Detecting AI-Generated Text

More articles on this topic continue to be published. See the suggestions on the left side of the page to learn how to search for more.

Can AI Generated Text be Reliably Detected? | arXiv


The rapid progress of Large Language Models (LLMs) has made them capable of performing astonishingly well on various tasks including document completion and question answering. The unregulated use of these models, however, can potentially lead to malicious consequences such as plagiarism, generating fake news, spamming, etc. Therefore, reliable detection of AI-generated text can be critical to ensure the responsible use of LLMs. Recent works attempt to tackle this problem either using certain model signatures present in the generated text outputs or by applying watermarking techniques that imprint specific patterns onto them. In this paper, both empirically and theoretically, we show that these detectors are not reliable in practical scenarios. Empirically, we show that paraphrasing attacks, where a light paraphraser is applied on top of the generative text model, can break a whole range of detectors, including the ones using the watermarking schemes as well as neural network-based detectors and zero-shot classifiers. We then provide a theoretical impossibility result indicating that for a sufficiently good language model, even the best-possible detector can only perform marginally better than a random classifier. Finally, we show that even LLMs protected by watermarking schemes can be vulnerable against spoofing attacks where adversarial humans can infer hidden watermarking signatures and add them to their generated text to be detected as text generated by the LLMs, potentially causing reputational damages to their developers. We believe these results can open an honest conversation in the community regarding the ethical and reliable use of AI-generated text.
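The paraphrasing attack the authors describe amounts to simple bookkeeping: take machine-generated passages, run them through a light paraphraser, and compare how often a detector flags them before and after. The sketch below shows only that comparison; `detector_score` and `paraphrase` are hypothetical stand-ins for a real detector and a real paraphrasing model (the paper uses dedicated neural paraphrasers).

```python
from typing import Callable, Dict, List


def paraphrase_attack_rate(
    machine_texts: List[str],
    detector_score: Callable[[str], float],  # hypothetical: returns P(machine generated)
    paraphrase: Callable[[str], str],        # hypothetical: light paraphraser
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Compare a detector's catch rate on raw vs. paraphrased machine text."""
    n = len(machine_texts)
    caught_raw = sum(detector_score(t) >= threshold for t in machine_texts)
    caught_para = sum(detector_score(paraphrase(t)) >= threshold for t in machine_texts)
    return {
        "detection_rate_raw": caught_raw / n,
        "detection_rate_paraphrased": caught_para / n,  # the paper finds this drops sharply
    }
```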

 

GPT detectors are biased against non-native English writers | arXiv

The rapid adoption of generative language models has brought about substantial advancements in digital communication, while simultaneously raising concerns regarding the potential misuse of AI-generated content. Although numerous detection methods have been proposed to differentiate between AI and human-generated content, the fairness and robustness of these detectors remain underexplored. In this study, we evaluate the performance of several widely-used GPT detectors using writing samples from native and non-native English writers. Our findings reveal that these detectors consistently misclassify non-native English writing samples as AI-generated, whereas native writing samples are accurately identified. Furthermore, we demonstrate that simple prompting strategies can not only mitigate this bias but also effectively bypass GPT detectors, suggesting that GPT detectors may unintentionally penalize writers with constrained linguistic expressions. Our results call for a broader conversation about the ethical implications of deploying ChatGPT content detectors and caution against their use in evaluative or educational settings, particularly when they may inadvertently penalize or exclude non-native English speakers from the global discourse.
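The fairness check in this study boils down to comparing false-positive rates across writer groups: how often is genuinely human writing from each group mislabeled as AI-generated? A minimal sketch of that comparison follows, with a hypothetical `detector_score` function standing in for any real detector.

```python
from typing import Callable, Dict, List


def false_positive_rate(
    human_texts: List[str],
    detector_score: Callable[[str], float],  # hypothetical: returns P(machine generated)
    threshold: float = 0.5,
) -> float:
    """Fraction of genuinely human-written texts that a detector flags as AI."""
    flagged = sum(detector_score(t) >= threshold for t in human_texts)
    return flagged / len(human_texts)


def compare_groups(
    samples_by_group: Dict[str, List[str]],  # e.g. {"native": [...], "non_native": [...]}
    detector_score: Callable[[str], float],
) -> Dict[str, float]:
    """False-positive rate per writer group; a large gap signals biased behavior."""
    return {
        group: false_positive_rate(texts, detector_score)
        for group, texts in samples_by_group.items()
    }
```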

 

ChatGPT Generated Text Detection | pre-print


Generative models, such as ChatGPT, have gained significant attention in recent years for their ability to generate human-like text. However, it is still a challenge to automatically distinguish between text generated by a machine and text written by a human. In this paper, we present a classification model for automatically detecting essays generated by ChatGPT. To train and evaluate our model, we use a dataset consisting of essays written by human writers and ones generated by ChatGPT. Our model is based on XGBoost, and we report its performance on two different feature extraction schemas. Our experimental results show that our model successfully (with 96% accuracy) detects ChatGPT-generated text. Overall, our results demonstrate the feasibility of using machine learning to automatically detect ChatGPT generated text, and provide a valuable resource for researchers and policymakers interested in understanding and combating the use of ChatGPT for malicious purposes.
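The abstract does not spell out the feature pipeline, but the general recipe (text features fed to an XGBoost classifier) is easy to sketch. The version below uses TF-IDF features as a stand-in for the paper's own feature-extraction schemas and assumes you already have lists of human-written and ChatGPT-generated essays; treat it as an illustration, not the authors' implementation.

```python
# Minimal sketch: TF-IDF features + XGBoost. Requires scikit-learn and xgboost.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier


def train_essay_detector(human_essays, chatgpt_essays):
    texts = human_essays + chatgpt_essays
    labels = [0] * len(human_essays) + [1] * len(chatgpt_essays)  # 1 = ChatGPT

    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.2, random_state=42, stratify=labels
    )

    vectorizer = TfidfVectorizer(max_features=20_000, ngram_range=(1, 2))
    clf = XGBClassifier(n_estimators=300, max_depth=6, eval_metric="logloss")

    clf.fit(vectorizer.fit_transform(X_train), y_train)
    preds = clf.predict(vectorizer.transform(X_test))
    print(f"held-out accuracy: {accuracy_score(y_test, preds):.3f}")
    return vectorizer, clf
```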

 

How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection | arXiv


The introduction of ChatGPT has garnered widespread attention in both academic and industrial communities. ChatGPT is able to respond effectively to a wide range of human questions, providing fluent and comprehensive answers that significantly surpass previous public chatbots in terms of security and usefulness. On one hand, people are curious about how ChatGPT is able to achieve such strength and how far it is from human experts. On the other hand, people are starting to worry about the potential negative impacts that large language models (LLMs) like ChatGPT could have on society, such as fake news, plagiarism, and social security issues. In this work, we collected tens of thousands of comparison responses from both human experts and ChatGPT, with questions ranging from open-domain, financial, medical, legal, and psychological areas. We call the collected dataset the Human ChatGPT Comparison Corpus (HC3). Based on the HC3 dataset, we study the characteristics of ChatGPT's responses, the differences and gaps from human experts, and future directions for LLMs. We conducted comprehensive human evaluations and linguistic analyses of ChatGPT-generated content compared with that of humans, where many interesting results are revealed. After that, we conduct extensive experiments on how to effectively detect whether a certain text is generated by ChatGPT or humans. We build three different detection systems, explore several key factors that influence their effectiveness, and evaluate them in different scenarios. The dataset, code, and models are all publicly available.
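Because HC3 pairs each question with both human and ChatGPT answers, it is straightforward to flatten into labeled training data for a detector. The sketch below assumes you have downloaded HC3 as a JSON-lines file whose records contain `question`, `human_answers`, and `chatgpt_answers` fields; the file path and field names are assumptions to verify against the version you download.

```python
import json
from typing import List, Tuple


def load_hc3_pairs(path: str = "hc3_all.jsonl") -> Tuple[List[str], List[int]]:
    """Flatten HC3 records into (text, label) pairs; label 1 = ChatGPT."""
    texts: List[str] = []
    labels: List[int] = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            # Each record carries lists of answers from both sources (assumed field names).
            for answer in record.get("human_answers", []):
                texts.append(answer)
                labels.append(0)
            for answer in record.get("chatgpt_answers", []):
                texts.append(answer)
                labels.append(1)
    return texts, labels


texts, labels = load_hc3_pairs()
print(f"{len(texts)} answers loaded, {sum(labels)} of them from ChatGPT")
```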

 

DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature | arXiv


The fluency and factual knowledge of large language models (LLMs) heightens the need for corresponding systems to detect whether a piece of text is machine-written. For example, students may use LLMs to complete written assignments, leaving instructors unable to accurately assess student learning. In this paper, we first demonstrate that text sampled from an LLM tends to occupy negative curvature regions of the model's log probability function. Leveraging this observation, we then define a new curvature-based criterion for judging if a passage is generated from a given LLM. This approach, which we call DetectGPT, does not require training a separate classifier, collecting a dataset of real or generated passages, or explicitly watermarking generated text. It uses only log probabilities computed by the model of interest and random perturbations of the passage from another generic pre-trained language model (e.g., T5). We find DetectGPT is more discriminative than existing zero-shot methods for model sample detection, notably improving detection of fake news articles generated by 20B parameter GPT-NeoX from 0.81 AUROC for the strongest zero-shot baseline to 0.95 AUROC for DetectGPT. Code, data, and other project information are available.
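The heart of DetectGPT is a single statistic: the log-probability of the passage under a scoring model, minus the average log-probability of several perturbed versions, normalized by their standard deviation. The sketch below computes that statistic with GPT-2 as the scoring model; the `perturb` function here is a crude word-dropout placeholder rather than the T5 span mask-filling the paper actually uses, so treat it as an illustration of the criterion, not a faithful reimplementation.

```python
# Sketch of the DetectGPT curvature criterion. Requires torch and transformers;
# GPT-2 stands in for the scoring model, and the perturbation step is simplified.
import random

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()


@torch.no_grad()
def avg_log_prob(text: str) -> float:
    """Average per-token log-probability of `text` under the scoring model."""
    ids = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024).input_ids
    out = model(ids, labels=ids)
    return -out.loss.item()  # loss is the mean negative log-likelihood per token


def perturb(text: str, drop_frac: float = 0.15) -> str:
    """Placeholder perturbation: randomly drop words (the paper uses T5 mask-filling)."""
    words = text.split()
    kept = [w for w in words if random.random() > drop_frac]
    return " ".join(kept) if kept else text


def detectgpt_score(text: str, n_perturbations: int = 20) -> float:
    """Large positive values suggest the passage sits at a local peak of the
    model's log-probability surface, i.e. that it is machine generated."""
    original = avg_log_prob(text)
    perturbed = [avg_log_prob(perturb(text)) for _ in range(n_perturbations)]
    mu = sum(perturbed) / len(perturbed)
    sigma = max((sum((p - mu) ** 2 for p in perturbed) / len(perturbed)) ** 0.5, 1e-8)
    return (original - mu) / sigma


print(detectgpt_score("Paste a passage here to score it."))
```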