By mapping AI’s “attention” to the physics of spinning particles, scientists uncover the hidden mechanics behind repeated, biased, and even harmful chatbot outputs, hinting at powerful new ways to make artificial intelligence safer and smarter.

(a) Attention, shown here in its most basic form, is used across all generative AI (e.g. LLMs such as ChatGPT) because it works. However, there is no first-principles theory for why it works and when it won’t. See End Matter for explanations of its terminology, which is unusual for physics. (b) The ‘physics’ of this Attention process that emerges exactly from our first-principles derivation. Each spin Si is exactly equivalent to a token in an embedding space whose structure reflects the prior training that the AI (LLM etc.) received. Wiggly lines are the effective 2-body interactions that emerge from Eq. 1. (c) The Context Vector N(0) is exactly equivalent to a bath-projected form of the 2-spin Hamiltonian (Eq. 1), which is then weighted toward the sub-region of the bath featuring the input spins. The theory predicts how a bias (e.g. from pre-training or fine-tuning the LLM) can perturb N(0) so that the trained LLM’s output is dominated by inappropriate rather than appropriate content (e.g. ‘bad’ output such as “THEY ARE EVIL” rather than ‘good’ output). Figures 3 and 4 show this phase boundary in detail.

*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.
AI models like ChatGPT have amazed the world with their ability to write poetry, solve equations and even pass medical exams. But they also can churn out harmful content or promote disinformation.
In a new study posted to the arXiv preprint* server, George Washington University researchers used physics to dissect and explain the attention mechanism at the core of AI systems.
Researchers Neil Johnson and Frank Yingjie Huo investigated why AI repeats itself, why it sometimes makes things up, and where harmful or biased content comes from, even when the input seems innocent.
Findings
The attention mechanism at the heart of these systems behaves like two spinning tops (a “2-body spin” system, in physics terms) working together to deliver a response, similar to how particles interact in quantum materials.
AI's responses are shaped not just by the input, but by how the input interacts with everything the AI has ever learned.
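For readers who want to see the pairwise picture concretely, here is a minimal Python sketch of basic attention. It is an illustration under stated assumptions, not the authors' code: the two-dimensional token vectors are made up, and the query, key and value maps are taken to be the identity, whereas in a trained model they are learned from data. The softmax step has the same mathematical form as Boltzmann weights in statistical physics.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_context(tokens):
    """Basic self-attention with identity query/key/value maps (an assumption).

    The last token acts as the query. Its dot product with each token is a
    pairwise (2-body) score, softmax turns the scores into weights, and the
    context vector is the weighted average of the token vectors.
    """
    query = tokens[-1]
    dim = len(query)
    scale = math.sqrt(dim)
    scores = [dot(query, t) / scale for t in tokens]
    weights = softmax(scores)
    context = [sum(w * t[i] for w, t in zip(weights, tokens)) for i in range(dim)]
    return weights, context


# Hypothetical 2-dimensional embeddings for a three-token input.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_context(tokens)
print("attention weights:", weights)
print("context vector:", context)
```

In this toy input the first two tokens overlap equally with the query, so they receive equal weight, while the third token, which is the query itself, receives the most.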
The research shows that repetition and bias in AI outputs arise from mathematical "attractor" effects, where certain words or phrases become more likely to repeat themselves, especially when the AI’s training data or vocabulary is limited or biased.
The study also describes a “phase boundary” — a tipping point where the AI’s output can abruptly switch from appropriate to harmful content, depending on how the input aligns with patterns learned during training.
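The idea of a tipping point can be illustrated with a toy calculation. The sketch below is a simplification for intuition only and is not the paper's derivation: it assumes the output is whichever of two candidate tokens has the larger projection onto the context vector, and that bias shifts the context vector along a fixed direction. All vectors, names and numbers are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def winner(context, good, bad):
    """Return the candidate with the larger projection onto the context vector."""
    return "bad" if dot(context, bad) > dot(context, good) else "good"


def biased_context(context, bias_direction, strength):
    """Shift the context vector by strength * bias_direction."""
    return [c + strength * b for c, b in zip(context, bias_direction)]


def tipping_strength(context, good, bad, bias_direction):
    """Bias strength at which the two projections become equal.

    Returns None if the bias direction never favours the 'bad' token.
    A negative result means the unbiased context already favours 'bad'.
    """
    gap = dot(context, good) - dot(context, bad)
    slope = dot(bias_direction, bad) - dot(bias_direction, good)
    if slope <= 0:
        return None
    return gap / slope


# Hypothetical embeddings: all values here are made up for illustration.
good = [1.0, 0.0]
bad = [0.0, 1.0]
context = [0.8, 0.3]
bias_direction = [0.0, 1.0]

threshold = tipping_strength(context, good, bad, bias_direction)
print("tipping strength:", threshold)
for strength in (0.0, 0.25, 0.75, 1.0):
    shifted = biased_context(context, bias_direction, strength)
    print(strength, winner(shifted, good, bad))
```

Below the threshold the toy output stays "good" and above it the output switches to "bad" in a single step, which is the kind of abrupt change the phase-boundary language describes.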
According to the researchers, the attention mechanism in current AIs is mathematically similar to a statistical ensemble in physics, with each possible output treated like a “spin” in a spin-bath system.
The rigorous analysis could lead to solutions that would make AI safer, more trustworthy, and resistant to manipulation.
The authors further speculate that expanding AI’s attention mechanism to capture not just pairwise (2-body) but also three-way (3-body) interactions, similar to more complex phenomena in physics, could make language models even more powerful and nuanced.

Journal reference:
- Preliminary scientific report.
Huo, F. Y., & Johnson, N. F. (2025). Capturing AI's Attention: Physics of Repetition, Hallucination, Bias and Beyond. arXiv. https://arxiv.org/abs/2504.04600