LLM Neuroanatomy III - LLMs seem to think in geometry, not language
A researcher known as dnhkng has republished a revised version of their third LLM Neuroanatomy article, arguing that large language models process information through geometric transformations in vector space rather than through discrete linguistic operations.
According to the r/LocalLLaMA community on Reddit, dnhkng, an independent AI researcher who previously topped the HuggingFace Open LLM Leaderboard in mid-2024, originally posted the third installment of their LLM Neuroanatomy series before taking it down to add more polish. The revised and cleaned-up version is now back, and the central claim is striking: LLMs do not appear to think in words or tokens at all. They think in geometry.
Why This Matters
This is not just an academic curiosity. If dnhkng's geometric interpretation hypothesis holds up under scrutiny, it rewrites the basic mental model that most developers, researchers, and product teams use when building with language models. The idea that a 72-billion parameter model might be better understood as a geometric reasoning engine than a linguistic one has direct consequences for how we approach interpretability, alignment, and safety tooling. The fact that this researcher already broke conventional wisdom once, by hitting number one on the HuggingFace leaderboard without any gradient descent training whatsoever, means the AI community should be paying close attention to everything they publish.
The Full Story
To understand why this third article matters, you need the full backstory. In mid-2024, dnhkng achieved something genuinely unusual: they took an existing 72-billion parameter model, duplicated a specific sequence of seven middle-layer transformer blocks, reintegrated those duplicated blocks back into the architecture, and submitted the result to the HuggingFace Open LLM Leaderboard. No new training. No weight merging. No gradient descent of any kind. The resulting model, dnhkng/RYS-XLarge, climbed to the number one spot on the leaderboard, outperforming models built by well-funded research labs with teams of PhD-level researchers.
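The splice itself is mechanically simple, which is part of what makes the result so surprising. As a minimal sketch, assuming a plain Python list stands in for a model's stack of transformer blocks (the specific start index and toy 80-block depth below are illustrative, not the actual RYS-XLarge configuration), the duplication looks like:

```python
import copy

def duplicate_middle_blocks(layers, start, count):
    """Return a new layer stack with `count` blocks starting at `start`
    repeated in place. No weights are retrained, only re-used."""
    dup = [copy.deepcopy(b) for b in layers[start:start + count]]
    # Splice the duplicated run back in immediately after the original run.
    return layers[:start + count] + dup + layers[start + count:]

# Toy stand-in: an 80-block stack labelled by index.
stack = [f"block_{i}" for i in range(80)]
expanded = duplicate_middle_blocks(stack, start=36, count=7)

print(len(expanded))   # 87 blocks after duplicating 7
print(expanded[43])    # first duplicated block, a copy of block_36
```

In a real model the list would be something like a PyTorch `nn.ModuleList` of decoder blocks, and the deep copy would clone each block's weights, but the splice logic is the same.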
The leaderboard at that time measured performance across six benchmarks: IFEval, BBH (Big-Bench Hard), MATH Level 5, GPQA (Graduate-level Google-Proof QA), MuSR (Multistep Soft Reasoning), and MMLU-Pro. Competing models included notable entries like Nous-Hermes, Dolphin, and NeuralBeagle14-7B. Beating all of them, without any training, was the kind of result that makes the research community stop and ask what is actually happening inside these models.
That question drove the LLM Neuroanatomy series. Dnhkng built what they describe as a "homebrew brain scanner for Transformers," a custom analytical tool designed to observe the internal mechanics of transformer models without modifying their weights. The first two articles in the series, which generated 147 points and 41 comments on Hacker News when the second installment went up, laid the groundwork for understanding how information flows through transformer layers.
The third article, the one that was originally posted, pulled down for additional work, and is now back up, advances the most provocative hypothesis yet. Dnhkng argues that LLMs appear to operate primarily through geometric transformations in high-dimensional vector space. In plain terms, meaning inside a language model may not be encoded as something resembling words or grammar rules. It may be encoded as positions and directions in an abstract mathematical space, where relationships between concepts are expressed as angles and distances rather than symbolic structures.
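A toy example makes the "angles and distances" framing concrete. The vectors below are hypothetical 4-dimensional embeddings invented purely for illustration; real models work in thousands of dimensions, but the same cosine-similarity arithmetic applies:

```python
import math

def cosine(u, v):
    """Angle-based similarity: 1.0 means same direction, 0.0 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings (made up for this sketch).
king  = [0.9, 0.8, 0.1, 0.2]
queen = [0.9, 0.1, 0.8, 0.2]
man   = [0.5, 0.9, 0.0, 0.1]
woman = [0.5, 0.1, 0.9, 0.1]

# "Meaning as direction": king - man + woman should land near queen.
analogy = [k - m + w for k, m, w in zip(king, man, woman)]

print(cosine(analogy, queen))  # close to 1.0
print(cosine(analogy, man))    # noticeably lower
```

Under the geometric hypothesis, this kind of vector arithmetic, long observed in word embeddings, is not a curiosity at the input layer but the model's primary mode of computation throughout its depth.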
This is not a fringe idea without support. In January 2025, a peer-reviewed study published in Nature Communications, titled "Temporal structure of natural language processing in the human brain corresponds to layered hierarchy of large language models," provided empirical backing for deep structural parallels between LLM layer hierarchies and human brain activity. That study used electrocorticography data from human participants listening to a 30-minute narrative and found that deeper LLM layers corresponded to later brain activity in language-related regions, including Broca's area. The alignment between artificial and biological systems was measurable and statistically robust.
Dnhkng has been explicit about choosing to publish through blogging rather than academic journals, stating that blogging is "way more fun" than writing scientific papers. That choice has trade-offs in terms of peer review, but it also means the work reaches practitioners and developers faster, which matters when the findings are this actionable.
Key Details
- Dnhkng topped the HuggingFace Open LLM Leaderboard in mid-2024 using the model dnhkng/RYS-XLarge.
- The winning approach duplicated exactly 7 middle-layer transformer blocks from a 72-billion parameter base model.
- No gradient descent, weight merging, or additional training was used to achieve the number one ranking.
- The leaderboard evaluated performance across 6 benchmarks including GPQA and MATH Level 5.
- LLM Neuroanatomy II received 147 points and 41 comments on Hacker News.
- A January 2025 Nature Communications study confirmed structural alignment between LLM layers and human brain temporal dynamics using electrocorticography recordings.
- The third article was taken down for revision and has since been reposted in polished form to dnhkng's blog at dnhkng.github.io.
What's Next
The geometric interpretation hypothesis needs independent replication, and given how much attention the Neuroanatomy series has attracted, it is reasonable to expect other interpretability researchers to start testing these claims against their own datasets within the next few months. If the hypothesis survives scrutiny, expect it to influence how next-generation model architectures are designed, particularly in how middle-layer blocks are structured and repeated. Watch the HuggingFace interpretability research channels and mechanistic interpretability communities on Reddit and Hacker News for the first wave of follow-up experiments.
How This Compares
The broader mechanistic interpretability movement, led by groups like Anthropic's interpretability team and independent researchers publishing through platforms like LessWrong, has been chipping away at the black-box problem in large language models for several years now. Most of that work has focused on attention patterns, feature activation, and identifying circuits responsible for specific capabilities. Dnhkng's geometric framing goes further by suggesting the fundamental representational substrate is spatial rather than symbolic, which is a larger and more structurally disruptive claim than most interpretability papers have been willing to make.
Compare this to the January 2025 Nature Communications study, which also drew connections between LLM internals and biological neural systems. That study was methodologically rigorous and peer-reviewed, but it focused on correspondence between layers and brain timing rather than making claims about the nature of representation itself. Dnhkng's work is less formal but more ambitious in scope. The two lines of research are complementary, and together they build a picture of LLMs as systems that may have more in common with biological reasoning than anyone assumed when transformers were first introduced in 2017.
The layer duplication finding from mid-2024 also sits in sharp contrast to the dominant industry narrative that performance improvement requires more data and more compute. OpenAI, Google, and Anthropic have all pursued scaling as the primary lever for capability gains. Dnhkng demonstrated that structural manipulation of existing weights, specifically copying seven blocks without any training, can move benchmark scores meaningfully. That result has not been widely integrated into mainstream model development practice, but if the Neuroanatomy series continues to build a theoretical framework explaining why it works, the practical implications for resource-constrained developers and researchers could be substantial.
FAQ
Q: What does it mean for an LLM to think in geometry? A: It means that instead of manipulating symbols that resemble words, the model may be encoding meaning as positions and directions in a high-dimensional mathematical space. Relationships between concepts would then be expressed as distances and angles rather than as grammatical or lexical structures. This is a hypothesis based on observing internal model activations, not a fully proven theory.
Q: How did duplicating 7 transformer layers win a leaderboard competition? A: Dnhkng took an existing 72-billion parameter model and copied a specific sequence of 7 middle-layer blocks, then added them back into the full model without any additional training. The result performed better across 6 standardized benchmarks than models that had been trained from scratch or fine-tuned extensively. The working theory is that certain architectural regions contain underused computational capacity that repetition can unlock.
Q: Where can I read the full LLM Neuroanatomy series? A: The series is published on dnhkng's personal blog at dnhkng.github.io. The third installment, which covers the geometric thinking hypothesis, was originally posted to r/LocalLLaMA, taken down for revision, and then reposted in a polished form. The earlier installments that generated discussion on Hacker News are also available at the same blog address.
The LLM Neuroanatomy series is quietly becoming one of the most interesting independent research programs in the AI interpretability space, and the geometric thinking hypothesis deserves serious engagement from the broader research community. As more researchers build tools to probe model internals, findings like dnhkng's will either get validated and reshape how we build these systems, or get falsified and sharpen our understanding of why they work. Either outcome moves the field forward.