James Vornov, MD PhD
Neurologist, drug developer and philosopher exploring the neuroscience of decision-making and personal identity.
ChatGPT is pretty dumb when compared to an agentic complex system like a worm
LLMs and the meaning of “intelligence”
I use a selection of large language models every day. I think they are actually kind of dumb.
Yet I keep hearing about “Artificial General Intelligence” being reached, the prospect of superintelligence, and the replacement of knowledge workers by LLMs. And then there are the questions about sentience that just make me roll my eyes.
I’ll admit that at this point, LLMs make excellent assistants. They help with fact-checking, reflecting back ideas, and making counterarguments based on conventional wisdom. There remain huge problems with hallucinations and with guessing when something could easily be looked up on the internet, and they don’t seem to understand Bayesian induction. They are much better at summarizing and analyzing text than they are at producing it. New ideas are almost entirely absent. And they mess up numerical and quantitative arguments all the time. Which is not to say that exploratory chats don’t spark new ideas in me; it’s just that the ideas are mine, never the model’s. Why do we insist on ascribing general intelligence to them?
The subjective experience of talking to our current models is weirdly persuasive that there’s an intelligence there. It’s not just that the answers are fast and fluent; it’s that the model can hold a thread, shift registers, and generate language that looks like it came from a person who has actually spent time thinking. They feel alien and at the same time oddly knowable as another intelligence.
Comparing intelligence: an LLM vs. a worm?
Exactly how intelligent is an LLM? So I got to thinking that I could simply count connections or potential network states. After all, if you think the model is intelligent, that intelligence has to come down to the connections and the complexity of the states they can produce. My gut is that the LLM is pretty stupid, really; it’s just a model of something intelligent.
So what’s the simplest thing I could compare it to? What’s a simple, mapped-out intelligence? How about our old friend, C. elegans, that simple worm that lives in leaf litter and has been the subject of so much study? Its nervous system was the first to be fully mapped and remains the best-charted we have, and it’s the kind of organism that tempts you into thinking the hard part is over. It’s tiny. The wiring diagram of its 302 neurons has been charted. Its behavioral repertoire is modest: it feeds, avoids danger, and reproduces, and not much more. Very modest compared to mammals, or to a summary of recent dining trends in New York City. If you’ll allow the possibility that “intelligence” is to be found in a network, then the worm should be the perfect comparison.
So let’s compare an LLM to a worm. On the one hand is a machine that’s startlingly competent in conversation and appears intelligent, approaching AGI, whatever that is. The number of nodes and weights in the LLM is really large, but countable, albeit in the billions. We can package the LLM as a downloadable file that can be duplicated, moved, and run anywhere. On the other hand is a simple organism whose nervous system is measured in hundreds of neurons and whose connectome is public knowledge, ready to reproduce as connections and weights in a similar neural network.
Seems fair. We’re comparing two systems that, on paper, look fully specified—an LLM with a complete set of parameters and a worm with a simple wiring diagram. The LLM produces sophisticated, meaningful output, and the worm eats and reproduces. So which is more intelligent?
Counting parts is seductive—and misleading.
If you start with component counts, the comparison looks almost absurd. A modern LLM is often described in terms of parameters: tens to hundreds of billions of scalar weights, organized into layers thousands of units wide and stacked dozens to around a hundred deep. That number is so large that it seems obvious why the LLM is so intelligent. With that many possible states, it must be intelligent, having a huge range of potential responses to a huge range of inputs. C. elegans, by contrast, has exactly 302 neurons, on the order of several thousand chemical synapses, and a few hundred gap junctions. Even if you’re generous with what you count, the worm’s nervous system is very simple compared to a transformer with billions of weights. If you reduce the comparison to “connections,” it looks like the LLM is six or seven orders of magnitude more intelligent than the worm.
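Just to make that ratio concrete, here’s the back-of-the-envelope arithmetic with illustrative round numbers (a 70-billion-parameter model on one side; roughly 7,000 chemical synapses plus a few hundred gap junctions on the other; the exact counts vary by model and by connectome reconstruction):

```python
import math

# Illustrative, rounded numbers only; exact figures vary by model and by connectome study.
llm_parameters = 70e9            # a mid-sized modern LLM: ~70 billion scalar weights
worm_chemical_synapses = 7_000   # C. elegans chemical synapses, order of magnitude
worm_gap_junctions = 600         # C. elegans gap junctions, order of magnitude

worm_connections = worm_chemical_synapses + worm_gap_junctions
ratio = llm_parameters / worm_connections

print(f"LLM weights per worm connection: {ratio:,.0f}")   # roughly 9 million to one
print(f"Orders of magnitude: {math.log10(ratio):.1f}")    # about 7
```

Smaller models land closer to six orders of magnitude and the largest ones push past seven; either way, the counting game makes the worm look trivial.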
But viewed another way, the parameter-count framing depicts the LLM as complicated but pretty dumb in the real world. A trained LLM is essentially a frozen object. Once training is complete, the model is a static mapping from input tokens to output tokens. Yes, it is high-dimensional, yes, it is nonlinear, and yes, it can generate long sequences that feed back into themselves through context. But its fundamental internal machinery does not change while you talk to it, much to my chagrin. I can get it to shift around, but I can’t teach it anything, and I can’t get it to adapt outside of a limited context window. The weights don’t drift or adapt with experience. It does nothing without input. It’s an inert thing.
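If that sounds abstract, here’s a toy caricature of the point (a frozen lookup table standing in for billions of weights; not any real model’s architecture or API): at inference time the function is fixed, and the only thing that changes from step to step is the growing context.

```python
# A toy caricature of inference in a frozen model; not any real model's architecture or API.
# The point is structural: the "weights" never change during a conversation;
# only the context grows.

def forward_pass(weights, context):
    """Stand-in for a transformer forward pass: look up the next token
    from a frozen table, falling back to a fixed token if unseen."""
    return weights.get(context[-1], "<unk>")

def generate(weights, prompt_tokens, steps):
    context = list(prompt_tokens)
    for _ in range(steps):
        next_token = forward_pass(weights, context)  # same frozen function on every call
        context.append(next_token)                   # the only thing that changes
    return context

# "Training" has already happened; at inference time this table is read-only.
frozen_weights = {"the": "worm", "worm": "eats", "eats": "the"}
print(generate(frozen_weights, ["the"], steps=5))
# ['the', 'worm', 'eats', 'the', 'worm', 'eats']
```

Swap the lookup table for a transformer’s weight tensors and the structure is the same: a fixed function applied over and over to an expanding context, with nothing carried over once the call returns.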
The worm is the opposite kind of thing. Even if you stay inside the nervous system, a synapse is not a scalar weight in the way a transformer weight is a scalar weight. Synapses are state-dependent. Neurons change excitability. Neuromodulators reconfigure circuits so that the same wiring diagram produces different functional modes. Activity triggers intracellular signaling cascades that shift responsiveness on timescales that matter behaviorally, and on longer timescales can recruit gene expression programs that alter future behavior. And then, of course, you can’t stop at the nervous system because the worm is embodied. Its sensory experience depends on where it is, what it’s doing, and what just happened. The environment is not an “input” presented to it; the environment is a medium it lives in, pushes against, and changes by moving. In that sense, the worm is a continuous dynamical system in closed-loop interaction with the world, and if you widen your definition of “state” to include receptor conformations, intracellular messenger levels, muscle dynamics, posture, and the physics of the substrate, the number of possible configurations is not just large—it’s effectively uncountable.
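To make the structural contrast concrete, here’s an equally toy sketch of a closed-loop agent. The variable names (hunger, food, position) are purely illustrative, not a model of C. elegans; the point is that internal state evolves on its own, behavior depends on that state, and acting changes both the world and the agent, with no prompt anywhere in the loop.

```python
import random

# A deliberately crude sketch of a closed-loop agent; the variables (hunger, food,
# position) are illustrative, not a model of C. elegans. Internal state drifts on
# its own, behavior depends on that state, and acting changes both the world and
# the agent. No prompt appears anywhere in the loop.

def simulate(ticks=20, seed=0):
    random.seed(seed)
    hunger, position = 0.5, 0
    food = {3: 1.0, 7: 0.6, -4: 0.8}               # a patchy, consumable environment
    for t in range(ticks):
        hunger = min(1.0, hunger + 0.05)           # metabolism runs whether or not anything happens
        if hunger > 0.6:                           # behavior driven by internal state, not by input
            position += random.choice([-1, 1])     # forage by wandering
        if position in food:
            hunger = max(0.0, hunger - food.pop(position))  # eating changes the world and the self
        print(f"t={t:2d}  pos={position:3d}  hunger={hunger:.2f}")

simulate()
```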
Now the LLM is revealed to be a big dumb thing. There’s no physiological state that has to be maintained. There are no neuromodulators shifting global regimes. There is no metabolic constraint, no homeostasis, no hunger, no stress response, no sleep-like reorganization, no developmental arc. It is silent until prompted, and when it responds, it does so by running the same learned function again and again. It looks like something intelligent, but it has no agency, no purpose, no complexity of its own.
The Map Is Not the Territory
This is why the worm is such a useful antidote to the “we can imitate it, therefore we have reproduced it” attribution of AGI to LLMs. A connectome is a map of who connects to whom, and that is an extraordinary achievement, but it is not a full specification of behavior. If you hand me the weight tensors and architecture of an LLM, you have handed me the system in a practical engineering sense. I can run it, and it will behave like itself, because the artifact is closed and complete. It’s a map that I can duplicate and use because it’s only a file, not a thing.
If you hand me the connectome of C. elegans, you have handed me one layer of description of an organism that is not closed. The worm’s behavior is an emergent property of wiring plus biophysics plus modulation plus body plus environment plus history. The connectome map is not wrong; it’s just incomplete in the specific way maps are always incomplete. It leaves out the biochemistry. And that incompleteness matters enormously when you’re trying to get the thing to *behave*.
LLMs hold up a simplified mirror to us that fools us into seeing intelligence that’s not there. They don’t learn by living in the world the way real nervous systems do; they learn by being fed the output of the Earth’s most sophisticated predictive world models, our brains. The training data contains centuries of accumulated artifacts of human brains. We’re realizing that while the mechanism is next-token prediction, there are high-dimensional relationships that reflect human styles of reasoning, ways of arguing, rhetorical patterns, mathematical notation, social norms, cultural taboos, and the implicit rules of how humans interact and explain things to each other through language. No way does this result in a model of the human mind or of brain connectivity, but it does create a model in the high-dimensional shape of human thought as expressed in text. It feels intelligent to us because we’ve trained it to produce the kinds of responses that humans produce when they’re being intelligent. The model’s ability is, in a very literal sense, a learned approximation of the most compressed interface to human cognition we’ve ever created: written language. And we figured out how to create a multidimensional map that captures what we mean and how we think.
This is also where I think we have to be a lot more careful about the way we talk about “intelligence” in LLMs. Imitating human language output is a neat trick, but reproduction is trivial for the LLM because the artifact contains everything required to reinstate the behavior. The worm offers no such shortcut. Biology is not a file you can copy. The worm forces you to confront the uncomfortable truth that even with a complete wiring diagram, behavior does not automatically fall out unless you also capture the dynamics that make the wiring diagram into a living control system. Which is why, even with that wiring diagram in hand, we still can’t reproduce the worm’s behavior in a simulated environment.
Maybe worms are a type of AGI
I think I started thinking about this comparison for a simple reason: LLMs feel alive in conversation, and worms do not. I can have a long, nuanced discussion with a language model about personal identity, the boundaries of self, or the neurobiology of decision-making, and the model can generate coherent paragraphs that sound like it’s tracking the argument. Meanwhile, C. elegans will never surprise me with an opinion. So it’s natural to ask how a system that is, in one sense, just a static mapping can look so sophisticated while an organism that is undeniably alive looks so cognitively opaque.
One answer is that the LLM is working through the human interface that’s easiest to fake: language. When the chatbot produces fluent, context-sensitive language, it directly activates the machinery in the human brain that constructs other minds. We infer agency from coherent language output. We infer reasoning from well-formed argument. We infer intention from an answer that anticipates our next question. This is not a moral failing; it’s how social cognition works, and the source of the input is irrelevant. It’s the same reason we can feel emotionally attached to a character in a novel or feel anger at a stranger behind a screen name. The medium is persuasive because it’s disembodied by nature. Within limits, a language model can look exactly like a mind even if what it’s actually doing is approximating the statistical surface of human language output.
But the deeper answer, and the one that matters for the AGI conversation, is that the worm is solving a different class of problem. The worm is an autonomous agent embedded in a hostile and changing world, and the sophistication of its behavior is not measured in the ability to produce abstractions; it’s measured in the ability to regulate itself and survive over time. As a species, geological time. In leaf litter, the worm has to embody an entire lifecycle: locomotion, chemotaxis, feeding, threat avoidance, mating, egg-laying, sleep-like states, and developmental transitions, all under fluctuating temperature, nutrient availability, and chemical cues. It doesn’t get to wait for a prompt. It doesn’t get to be wrong five percent of the time and apologize with a plausible paragraph. It has to keep itself within the bounds of survival. That kind of capability is not flashy in a chat window, but it is the real substrate of what we mean when we talk about general intelligence in the biological sense.
So what does this say about AGI?
This is where I think the term “AGI” quietly misleads us when we use it to characterize LLMs, because it suggests that the LLM is a brain in silico. But it’s not; it’s a map of human brain outputs. LLMs don’t have the machinery of brains. They don’t have continuous dynamics, embodiment, self-regulation, or closed-loop interaction with an environment. They are not “language generators” in the sense of generating meaning in the world like a brain making sense of its environment. An LLM is a big static network trained on enormous amounts of human output, adjusted until its responses match what a person might say to a prompt closely enough to be useful to people. And they are useful, sometimes astonishingly so, because modern work is so often the manipulation of language, symbols, and plans rather than direct engagement with the physical world.
But an LLM is useful in the way a map is. It’s an extraordinarily detailed map of human language and knowledge, trained on centuries of human output. You can navigate with a map; you can plan and try out routes or find things you didn’t know were there before you looked. But the map doesn’t do anything without being used. It has no territory of its own; it’s just input for the human brain in the world.
So the right question isn’t whether an LLM “becomes” a human brain or acquires general intelligence when trained on enough data. It never will, by its nature. I think the worm comparison makes that obvious. The correct question is more pragmatic and, to me, more interesting: how far can a static, disembodied model go in approximating the outputs of a complex biological system well enough that, for many tasks, the distinction stops mattering? That reframes AGI away from metaphysics and toward engineering. We may not be building minds in the biological sense, but we are building tools that emulate enough of the outward behavior of cognition that they become functionally transformative, even while remaining fundamentally different from the organisms whose outputs they learned to imitate. A better and better map, arbitrarily close to the territory, limited only by storage and compute. Can you model the intuitive representations of symbols and metaphor that underlie the words when needed? The other parts of the brain besides language generation?
And if you want a single sentence that captures why I keep returning to the worm: if we can’t convincingly simulate C. elegans, even with its connectome in hand, then we should be cautious about confusing conversational fluency with general intelligence. At the same time, we should recognize what we’ve achieved: we’ve learned to build static artifacts that can approximate the linguistic surface of human thought at scale. That isn’t life. It isn’t a worm. It isn’t a self. It is not thinking. But it is a new kind of machine that lives in the medium that we use to exchange thought, which is to say language, and that’s why it feels, so insistently, like it’s halfway to a mind.
Prefer to follow via Substack? You can read this and future posts (and leave comments) by subscribing to On Deciding… Better on Substack: Brain, Self, and Mind
© 2025 James Vornov MD, PhD. This content is freely shareable with attribution. Please link to this page if quoting.