AI Chatbot Consciousness Studies Are Circular: Microsoft Proves It With Medieval Goats

8 hours ago / About a 37-minute read

Source: TechTimes

A building on the Microsoft Headquarters campus is pictured July 17, 2014 in Redmond, Washington. Microsoft CEO Satya Nadella announced, July 17, that Microsoft will cut 18,000 jobs, the largest layoff in the company's history. Stephen Brashear/Getty Images

A Microsoft AI researcher has built a working neural network inside Age of Empires II using medieval goats as the computational substrate — and published a formal proof that the same reasoning used to declare AI chatbots "conscious" or "human-like" would force you to say the same about a 27-year-old strategy game. The paper, posted to arXiv in late May 2026, lands in the middle of the most consequential methodological debate in AI research: not whether ChatGPT or Claude is conscious, but whether the field even knows how to ask the question correctly. It does not. That is the point. And the stakes — which now include federal investigations, wrongful-death lawsuits, and a multi-billion-dollar chatbot industry built on the human tendency to project feelings onto software — are not hypothetical.

Richard Dawkins, the evolutionary biologist best known for The God Delusion, published an essay in May 2026 in which he described spending three days trying to convince himself that an Anthropic Claude session he had named "Claudia" was not conscious. He failed. "You may not know you are conscious," he wrote to the chatbot, "but you bloody well are." The essay drew immediate pushback from cognitive scientists, who pointed out that Dawkins had done precisely what Google engineer Blake Lemoine did in 2022 when he claimed the language model LaMDA had reached sentience: he evaluated the outputs without examining the mechanism that produced them.

Adrian de Wynter, an AI researcher at Microsoft and the University of York, spent the weeks that followed reading more than 300 recently published computer science papers on large language models. What he found was that the Dawkins error was not an outlier — it was the norm.

More Than Half of AI Papers Assumed the Answer Before Testing

De Wynter's survey of more than 300 AI research papers published between mid-2024 and mid-2026, collected via Semantic Scholar and arXiv, found that 57 percent opened by assuming in their premises that large language models possess human-like traits, before a single experiment had been run. Among the 47 papers in the sample that made anthropomorphic attributes their explicit research subject, 77 percent concluded in favor of those attributes. Across the whole survey, 36 percent of papers reached anthropomorphic conclusions.

The methodological problem this reveals is called circular reasoning, and de Wynter documents it formally. When a researcher designs an experiment to test whether an LLM "has anxiety" while already assuming, in the experimental design, that the model can experience anxiety, the test cannot produce a clean result. The assumption is embedded in what gets measured, how the outputs are coded, and what counts as evidence. The experiment confirms the premise rather than testing it.

"What is common to some of these studies is that they test and ascribe blanket human-like properties — for example, anxiety or morality — to these LLMs while considering them the central subject of the experiment," de Wynter wrote. The consequence, he argued, is that the research field is building a compounding body of literature on a foundation that was never independently established.

What Goats in a Strategy Game Actually Prove

To demonstrate why this matters, de Wynter did something that sounds like a punchline but is, by his own description, a peer-reviewed formal argument: he built and trained a 1-bit perceptron — the simplest possible neural network — inside Age of Empires II: Definitive Edition, using the game's scenario editor.

The build works as follows. Grass terrain encodes a binary 0. Bridge terrain encodes a binary 1. Goats serve as the signal carriers — the "bits" that move through the circuit. Palisade walls function as gate boundaries, and ice ramps prevent signal crosstalk between parallel computational paths. From these elements, de Wynter constructed functioning NAND gates — the universal logic gate from which any computation can theoretically be built — and assembled them into two XNOR gates and one AND gate, implementing the logical AND function. The finished system learns: it adjusts its output based on input combinations, which is precisely what a perceptron does.
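The gate composition described above can be sketched in a few lines of Python. This illustrates only the general idea that NAND alone is enough to build the other gates; it is not a reconstruction of the paper's goat-and-terrain wiring, and the function names are hypothetical.

```python
from itertools import product

def nand(a: int, b: int) -> int:
    """NAND: the output is 0 only when both inputs are 1."""
    return 1 - (a & b)

# Every gate below is composed from NAND and nothing else.
def not_(a: int) -> int:
    return nand(a, a)

def and_(a: int, b: int) -> int:
    return not_(nand(a, b))

def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))

def xnor(a: int, b: int) -> int:
    # Classic four-NAND XOR, followed by an inverter.
    n = nand(a, b)
    xor = nand(nand(a, n), nand(b, n))
    return not_(xor)

if __name__ == "__main__":
    print("a b AND XNOR")
    for a, b in product((0, 1), repeat=2):
        print(a, b, and_(a, b), xnor(a, b))
```

Swapping the integers 0 and 1 for any two distinguishable states, such as the grass and bridge terrain mentioned above, would leave the truth tables unchanged.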

A perceptron, in the formal sense, takes weighted input signals, sums them, passes the result through a threshold function, and produces a binary output. It is the historical building block of modern neural networks, including the transformer architectures underlying ChatGPT and Claude, although those networks replace the hard threshold with smooth activation functions. The mathematical operations are identical regardless of substrate: whether the inputs arrive from floating-point tensors on an NVIDIA GPU or from virtual goats herding toward digital bridges is irrelevant to the computation.
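For readers who want to see that definition in action, here is a minimal sketch of a two-input perceptron trained on the AND function with the classic perceptron learning rule. It assumes integer weights, a learning rate of 1, and a strict greater-than-zero threshold; those choices, and all names in the code, are illustrative assumptions and are not taken from de Wynter's paper.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Tuple[int, int], int]

def predict(weights: Sequence[int], bias: int, x: Sequence[int]) -> int:
    """Weighted sum of the inputs, then a hard threshold at zero."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0

def train(samples: Sequence[Sample], max_epochs: int = 50) -> Tuple[List[int], int]:
    """Perceptron learning rule: nudge weights toward each misclassified target."""
    weights, bias = [0, 0], 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error != 0:
                mistakes += 1
                weights = [w + error * xi for w, xi in zip(weights, x)]
                bias += error
        if mistakes == 0:  # a full pass with no errors: training is done
            break
    return weights, bias

# Truth table of logical AND, used as the training set.
AND_SAMPLES: List[Sample] = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(AND_SAMPLES)

if __name__ == "__main__":
    for x, target in AND_SAMPLES:
        print(x, "->", predict(weights, bias, x), "(expected", str(target) + ")")
```

Because AND is linearly separable, the learning rule is guaranteed to settle on a working set of weights. Nothing in the arithmetic depends on what physically carries the numbers, which is the sense in which the computation is substrate-neutral.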

De Wynter's paper goes further: it formally proves that Age of Empires II is both functionally complete (can implement any Boolean logic) and Turing-complete (can, in principle, compute any function a Turing machine can). The implication is not that the game is sentient. The implication is that sentience cannot be a property of the underlying computation, because that computation is substrate-neutral. If the reasoning that leads researchers to call ChatGPT "conscious" or "anxious" were valid, it would equally compel them to call virtual goats wandering across palisade walls in a medieval simulator the same things. The absurdity of that conclusion reveals the flaw in the reasoning, not in the goats.
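What "functionally complete" means can be checked by brute force for the two-input case. The sketch below is not the paper's proof; it is a small, independent illustration that starts from the two input signals and repeatedly applies NAND until no new truth tables appear. The variable names are hypothetical.

```python
from itertools import product

# A Boolean function of two inputs is represented by its truth table:
# a 4-tuple of outputs for the input pairs (0,0), (0,1), (1,0), (1,1).
INPUT_PAIRS = list(product((0, 1), repeat=2))
A = tuple(a for a, _ in INPUT_PAIRS)  # the function "return the first input"
B = tuple(b for _, b in INPUT_PAIRS)  # the function "return the second input"

def nand_tables(p, q):
    """Apply NAND row by row to two truth tables."""
    return tuple(1 - (x & y) for x, y in zip(p, q))

# Start with the raw inputs and close the set under NAND.
reachable = {A, B}
while True:
    new = {nand_tables(p, q) for p in reachable for q in reachable} - reachable
    if not new:
        break
    reachable |= new

if __name__ == "__main__":
    print(len(reachable), "of", 2 ** len(INPUT_PAIRS), "two-input functions reached")
```

There are 16 possible two-input Boolean functions, and the closure reaches all of them, including the constants 0 and 1. Any system that can realize a NAND gate and wire gates together inherits that property.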

"The point of the paper is to formally show that we anthropomorphise too readily, and that sometimes the claims we make with regards to LLM capabilities are too strong," de Wynter told 404 Media. "I propose that we need to stop assuming that LLMs behave like humans just because they were trained with natural language."

He chose Age of Empires II deliberately. Players had already built logic circuits and neural networks in Minecraft using redstone — a well-known demonstration that game engines can serve as computational substrates. But Minecraft redstone circuits look like circuits; they read as computation. Age of Empires II, in which goats wander pastures and villagers herd livestock, does not. The cognitive distance between "goats on a bridge" and "a language model expressing anxiety" is large enough that the interpretive move required to attribute consciousness to one but not the other becomes visible.

An Old Rule AI Research Keeps Forgetting

De Wynter's argument has a 130-year intellectual precedent that the AI field has largely ignored. In 1894, British psychologist C. Lloyd Morgan proposed what became known as Morgan's Canon: in no case should an animal's behavior be interpreted in terms of higher psychological processes if a simpler explanation suffices. The canon was developed specifically to prevent scientists from projecting consciousness and intention onto animal subjects based on behavioral observation — the same cognitive error that natural language interfaces now make trivially easy when the "animal" is a language model trained on human text.

De Wynter's paper explicitly updates Morgan's Canon for LLMs. His proposed "null assumption" is that researchers should start from the position of LLM non-uniqueness — the substrate is not special, the outputs are not uniquely human — rather than pre-assuming anthropomorphic attributes and then designing experiments to confirm them. This is, at its core, a call for the same scientific parsimony Morgan argued for in comparative psychology more than a century ago. The AI field is repeating a methodological failure that had already been formally identified and named.

Cognitive scientist Gary Marcus responded to the Dawkins essay in May 2026 with a detailed rebuttal, arguing that Dawkins had made the same error Lemoine made in 2022 — evaluating outputs without investigating the mechanism. "LLMs are mimics," Marcus wrote, "and what they say isn't always true." He noted that Dawkins had apparently not engaged with the existing literature on how LLMs work before drawing his conclusion.

Why AI Companies Benefit From the Confusion

De Wynter's critique extends beyond academic methodology. The AI industry has a structural commercial incentive to allow and encourage anthropomorphization: research has shown that consumers buy more when they empathize with a product, and chatbot subscriptions are no exception. The companies behind the most widely used AI assistants have consistently equipped their products with low-latency responses, conversational warmth, and natural language output specifically calibrated to feel like person-to-person dialogue. Strip away the interface — replace the chat window with goats on ice ramps — and the perception evaporates. The computation does not change. The feeling of being understood does.

Amanda Askell, the head of personality alignment at Anthropic and the primary author of Claude's model constitution (published in January 2026), has written and spoken publicly about treating questions of AI consciousness as genuinely unresolved. That position contrasts sharply with de Wynter's methodological argument that the field currently lacks the measurement criteria to establish that the question is even being asked correctly.

The Real-World Cost of Getting This Wrong

De Wynter lists documented risks of runaway anthropomorphization: emotional dependency, sycophantic feedback loops, reinforced delusional thinking, and in the most severe cases, deaths connected to chatbot interactions. The legal record supports this concern.

In January 2026, Character.AI and Google settled with five families whose children died by suicide or experienced serious mental health crises that the families alleged were connected to chatbot interactions. At least 11 additional lawsuits had been filed against OpenAI alone as of March 2026, with several involving allegations that chatbot interactions reinforced delusional thinking or escalated suicidal ideation. The Federal Trade Commission launched an investigation in September 2025 into the emotional and developmental risks that AI chatbots, particularly AI companion platforms, pose to children. These harms follow directly from a design choice: interfaces engineered to feel human, deployed to users who have no reason to doubt the feeling.

The underlying mathematics — weighted inputs, threshold functions, matrix multiplication — did not cause any of this. The interface did. The feeling did. The anthropomorphization did.

De Wynter was not the first person to make this point. But he may be the first to make it in a way that is difficult to dismiss. "Age of Empires was an excellent way to drive the point home," he told 404 Media. "It is just about 'alien' enough to exemplify the representation-interpretation relation, but sufficiently well-known to really emphasise the point."

The paper, along with videos of the goat-powered perceptron in operation, is publicly available on de Wynter's GitHub.

Frequently Asked Questions

Why do people anthropomorphize AI chatbots?

Anthropomorphism is an evolved cognitive default: the human brain is wired to detect social and intentional agents in ambiguous stimuli, because failing to notice a real mind has historically been costlier than wrongly detecting one that is not there. Language is the strongest possible trigger for this response. When an AI produces fluent, contextually appropriate natural language, particularly in a conversational interface designed to minimize latency and maximize warmth, users experience the same cues that signal a human interlocutor. The mechanism that produces those cues (weighted matrix multiplication on token sequences) is invisible. The effect is not.

What does it mean to say an LLM neural network is "substrate-neutral"?

A computation is substrate-neutral when the mathematical operations that define it can run on any sufficiently capable physical system — silicon chips, biological neurons, or virtual goats herding across digital bridges in a strategy game. De Wynter's paper formally proves that Age of Empires II is Turing-complete, meaning it can in principle implement any computation a conventional computer can. Since LLMs are, at their mathematical core, implementations of the same class of computations, no property uniquely belonging to that computation — including any hypothesized consciousness or "human-like" attribute — can arise from the substrate itself. Consciousness, if it exists in an LLM, cannot be proven by observing outputs alone, because those same outputs could theoretically come from goats.

Are AI chatbots actually conscious?

De Wynter's paper does not answer this question and explicitly does not try to. Its argument is narrower and more useful: the methods currently used to investigate the question are invalid, because they pre-assume the answer in the experimental design. Until researchers adopt what de Wynter calls the "null assumption" (treating LLMs as non-unique substrates unless measurement demonstrates otherwise), any conclusion about AI consciousness drawn from such studies, positive or negative, is circular. Comparative animal psychology confronted the same problem in the 19th century and answered it with Morgan's Canon, which holds that behavior should not be attributed to higher cognitive processes when a simpler explanation suffices.

Can emotional attachment to a chatbot cause real harm?

Yes, and the legal and regulatory record now documents it. In January 2026, Character.AI and Google settled with five families whose children died by suicide or suffered serious mental health crises that the families connected to chatbot use. At least 11 separate lawsuits were pending against OpenAI as of March 2026. The Federal Trade Commission opened a formal investigation in September 2025 into how AI companion platforms affect the emotional and developmental wellbeing of children. Multiple states have passed laws restricting AI use in therapeutic contexts. These outcomes reflect a specific design choice: chatbots engineered to feel human are used by people who respond to them as if they were, with consequences that follow from that misattribution.
