Bassett et al. · 2026

Heads we win, tails you lose: AI detectors in education

AI detection tools cannot reliably distinguish human from AI-generated writing — and using them in academic integrity proceedings is both methodologically unsound and procedurally unjust.

Bassett, Bradshaw, Bornsztejn, Hogg, Murdoch, Pearce & Webber · Open Access

Achievable false positive rate in practice — any text could plausibly be human-written

∞ · Ways to evade detection: AI output is easy to alter, creating false negatives

Unknown · Base rate of AI use in any cohort, making detector scores statistically uninterpretable

Core argument

Why AI detectors are fundamentally flawed

Unlike plagiarism detection, AI detection relies on unverifiable probabilistic estimates. These tools analyse linguistic features to estimate AI likelihood — but their results cannot be independently verified in the real world, where the true origin of any text is unknown.

🔮

Unverifiable outputs

In real-world conditions, no external evidence can confirm whether a flagged text was AI-generated. Without ground truth, validation relies on circular reasoning rather than objective verification.

⚠️

Outdated training data

Detectors were trained on human writing from before 2019 (before GPT-3). It is far from certain that this pre-AI writing reflects how students write today, since current writing may be influenced by widespread AI exposure.

🔄

Shifting linguistic norms

As students consume more AI-generated content, their own writing naturally adopts similar patterns. Some entirely human writing will inevitably match what detectors flag as "AI."

🔒

Security risks

Many AI detection platforms store submissions on overseas servers without clear data retention policies. Student work may be exposed to misuse, including by contract cheating services.

Validation myths

Common "verification" methods — all flawed

Institutions attempt to validate AI detector results using additional evidence. Every approach below shares the same critical flaw: confirmation bias, not independent verification.

Linguistic markers

Words like "delve," em-dashes, or formal structure appear in AI writing because they appear in the human writing that models are trained on. They are not AI-exclusive.

Multiple detectors

Running text through several tools doesn't provide independent verification — it amplifies shared flaws. Consensus among flawed tools is still a flawed consensus.
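A rough sketch of why agreement between tools adds little. It assumes, purely for illustration, that every detector has the same 1% false positive rate and that errors follow a simple mixture: some share of texts trigger the same flaw in every tool, while the rest are misjudged independently. The function name, the `shared_flaw_weight` parameter and the mixture model are hypothetical and are not taken from the paper; the real degree of overlap between tools is unknown.

```python
def joint_false_positive_rate(fpr, n_detectors, shared_flaw_weight):
    """P(all detectors flag a human-written text) under a simple mixture.

    shared_flaw_weight = 0.0 -> detector errors are fully independent
    shared_flaw_weight = 1.0 -> detectors err on exactly the same texts
    The mixture itself is an illustrative assumption.
    """
    independent = fpr ** n_detectors   # errors multiply only if independent
    fully_shared = fpr                 # shared flaws: no reduction at all
    return (shared_flaw_weight * fully_shared
            + (1 - shared_flaw_weight) * independent)


# Three hypothetical detectors, each with a 1% false positive rate.
for weight in (0.0, 0.5, 1.0):
    p = joint_false_positive_rate(fpr=0.01, n_detectors=3,
                                  shared_flaw_weight=weight)
    print(f"shared-flaw weight {weight:.1f}: "
          f"P(all three flag a human text) = {p:.6f}")
```

Only in the fully independent case does a unanimous flag become much rarer than a single flag. As soon as the tools share flaws, the joint false positive rate stays close to that of one tool, so consensus provides little extra assurance.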

Student confessions

Students facing serious allegations sometimes confess under duress. A confession that follows a flag doesn't validate the flag — correlation is not causation.

LLM comparison

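Generating an AI response and comparing it to student work penalises adherence to disciplinary conventions, because both texts will follow the same expected structure and vocabulary. Asking an LLM whether a text is AI-generated is no better: LLMs cannot recognise their own outputs.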

Past writing style

Writing evolves through feedback, familiarity, and growth. Treating improvement as evidence of misconduct punishes legitimate academic development.

Hidden prompts

Embedding invisible "Trojan horse" text assumes dishonesty by default, undermines trust, and is effective only until AI tools learn to handle it.

"Hunting for supporting evidence reinforces a confirmation bias loop where staff prioritise evidence that supports the AI detector's result while overlooking counterexamples." — Bassett et al., 2026

Interactive explainer

The base rate fallacy

A 1% false positive rate does not mean a flagged paper is 99% likely to be AI-generated. The probability depends critically on the base rate of AI use, which is unknown. The settings below show how the interpretation changes.

False positive rate (FPR): 1%

True positive rate (TPR): 90%

Base rate (AI use in cohort): 1%

Outcomes: true positives (correct flags), false positives (innocent students flagged), false negatives (AI not caught), true negatives (correctly cleared)

Probability a flagged paper is actually AI-generated: 47.6%

With these settings, a flagged paper is more likely to be human-written than AI-generated.
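The calculation behind this figure is Bayes' rule, and it can be sketched in a few lines. The function names and the cohort size of 1,000 are illustrative assumptions; the 47.6% result corresponds to a 1% false positive rate, a 90% true positive rate and a 1% base rate. In practice the base rate is unknown, so none of these outputs can be read as the "true" probability for a real cohort.

```python
def flag_outcomes(cohort_size, base_rate, tpr, fpr):
    """Expected counts in each outcome category (illustrative only)."""
    ai_papers = cohort_size * base_rate
    human_papers = cohort_size - ai_papers
    return {
        "true_positives": ai_papers * tpr,          # correct flags
        "false_negatives": ai_papers * (1 - tpr),   # AI not caught
        "false_positives": human_papers * fpr,      # innocent students flagged
        "true_negatives": human_papers * (1 - fpr), # correctly cleared
    }


def prob_flag_is_ai(base_rate, tpr, fpr):
    """Bayes' rule: P(AI-generated | flagged)."""
    flagged_ai = tpr * base_rate
    flagged_human = fpr * (1 - base_rate)
    return flagged_ai / (flagged_ai + flagged_human)


# Same detector (TPR 90%, FPR 1%), different assumed base rates.
for base_rate in (0.01, 0.05, 0.10, 0.50):
    ppv = prob_flag_is_ai(base_rate, tpr=0.90, fpr=0.01)
    print(f"base rate {base_rate:.0%}: P(AI | flagged) = {ppv:.1%}")
# base rate 1%: P(AI | flagged) = 47.6%
# base rate 5%: P(AI | flagged) = 82.6%
# base rate 10%: P(AI | flagged) = 90.9%
# base rate 50%: P(AI | flagged) = 98.9%

print(flag_outcomes(cohort_size=1000, base_rate=0.01, tpr=0.90, fpr=0.01))
```

In a hypothetical cohort of 1,000 papers with a 1% base rate, the detector is expected to flag about 9 AI-generated papers and about 9.9 human-written ones, which is why a flag is slightly more likely to be wrong than right.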

Conceptual flaw

The false dichotomy of AI detection

All AI detection rests on an assumption that text is either wholly human-written or wholly AI-generated. Reality is a spectrum. Student work is frequently created with, not by, generative AI — and the boundary is impossible to define.

Authorship spectrum: Human · Hybrid · AI. Where does "AI-assisted" become "AI-generated"?

AI detection tools force a binary verdict on this continuous spectrum. The attempt to draw a strict line creates more problems than it solves, fostering suspicion while failing to address the real challenges AI poses to education.

Procedural fairness

Evidence that does not meet the balance of probabilities

Academic misconduct proceedings require evidence that is credible, relevant, and probative — meaning it must establish guilt as more likely than not. None of the following, individually or in combination, meets this standard.

Does NOT meet the standard

AI detector score (single or multiple tools)
Presence of "AI hallmark" words or phrases
Similarity to an LLM-generated response
An LLM claiming the text is AI-generated
Changes in writing style vs past work
Student silence or refusal to respond
Confessions made under accusatory pressure
Absence of drafts or revision history

What institutions should do instead

Redesign assessments to be AI-resilient
Use supervised or oral assessments where feasible
Define clear, prospective AI use policies
Require drafts only when stated in advance
Treat students as participants, not suspects
Pursue cases only with independent evidence
Respect students' right to silence under investigation

Conclusion

AI detection does not safeguard academic integrity — it undermines it

The focus must move from detection and enforcement to assessment design that recognises AI's role in learning. The continued use of AI detectors exposes students to procedural injustices and signals a fundamental misunderstanding of education's purpose.

Bassett M.A., Bradshaw W., Bornsztejn H., Hogg A., Murdoch K., Pearce B. & Webber C. (2026). Heads we win, tails you lose: AI detectors in education. DOI: 10.1080/1360080X.2026.2622146