
James Vornov, MD, PhD. Neurologist, drug developer, and philosopher exploring the neuroscience of decision-making and personal identity.
What 20 Questions reveals about how AI actually works
Thinking is multiplication of knowledge
People treat the usefulness of LLMs as proof that the models understand their own output. Let me show you why that’s wrong. Clearly, LLMs know a lot. My first use case was cooking. I could ask the earliest ChatGPT how to sous vide a roast and then endlessly discuss variations on seasoning and technique. But as much as our LLMs know, once you push them, they fail in very specific ways. When I was cooking a boneless turkey thigh, I gave it my time constraints. It gave me a recipe that finished in time but left the meat in that zone of cooked, but not cooked enough. It didn’t make the tradeoff between time and texture as I would have, had I known.
What I think is going on is that LLMs know nearly everything, but they have no ability to understand it.
Gregory Bateson nailed the difference between knowing and thinking. Bateson rose to prominence in the 1970s with a collection of his essays entitled “Steps to an Ecology of Mind”. He was an anthropologist, cyberneticist, and systems thinker, equally at home with schizophrenic families, octopus learning, and Balinese ritual. He taught me, a theme reflected here at ODB, that mind isn’t in the head but is found in the patterns of interaction between dynamical systems. Let’s look at how he contrasted knowing with inventing and thinking.
How do you measure knowledge? Pounds?
The first part of Steps to an Ecology of Mind is a series of what he called “metalogues”: conversations with his daughter. F, the father, is Bateson; D is his daughter. Here’s the opening of one:
Daughter: Daddy, how much do you know?
Father: Me? Hmm — I have about a pound of knowledge.
D: Don’t be silly. Is it a pound sterling or a pound weight? I mean really how much do you know?
F: Well, my brain weighs about two pounds and I suppose I use about a quarter of it — or use it at about a quarter efficiency. So let’s say half a pound.
D: But do you know more than Johnny’s daddy? Do you know more than I do?
F: Hmm — I once knew a little boy in England who asked his father, “Do fathers always know more than sons?” and the father said, “Yes.” The next question was, “Daddy, who invented the steam engine?” and the father said, “James Watt.” And then the son came back with ” — but why didn’t James Watt’s father invent it?”
Asking who knows more is newly relevant in our world of LLMs. The models have been trained on, and have instant access to, a huge number of facts and relations. Bateson is pointing out that knowing everything about the principles and fabrication methods of the steam engine is not the same as designing and building one. An AI may know everything, but what can it build?
Pay attention to how Bateson treats knowledge as a scalar, measured in how much. In the next passage he turns from knowledge to thinking, describing it as the ability to add, subtract, multiply and divide knowledge. A few pages later, he proposes Twenty Questions as an example of idea multiplication and division:
F: Yes — take the game of Twenty Questions. You think of something. Say you think of “tomorrow.” All right. Now I ask “Is it abstract?” and you say “Yes.” Now from your “yes” I have got a double bit of information. I know that it is abstract and I know that it isn’t concrete. Or say it this way — from your “yes” I can halve the number of possibilities of what the thing can be. And that’s a multiplying by one over two.
D: Isn’t it a division?
F: Yes — it’s the same thing. I mean — all right — it’s a multiplication by .5. The important thing is that it’s not just a subtraction or an addition.
Bateson’s talking about multiplication and division in idea space. 20 Questions is a game of dividing idea space 20 times. That’s 2^20, or 1,048,576, possible idea slots. You can see why 20 is a challenge, but doable if you’re systematic, as Bateson suggests. Knowledge is knowing a couple million things. Thinking is systematically exploring the space. Even a 10-year-old can do it: “Is it alive?” cuts the universe of things in half. “Is it metal?” halves it again. The answer emerges when the space is sufficiently constrained. “Tomorrow” is a tough one, as an abstract concept.
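To make the arithmetic concrete, here’s a minimal Python sketch of my own (not Bateson’s; the function name and test numbers are just for illustration) showing how repeated halving covers a million-item idea space in 20 questions:

```python
# Each perfectly-halving yes/no question cuts the candidate pool in half,
# so 20 questions can separate 2**20 = 1,048,576 possible answers.

def questions_needed(n_candidates: int) -> int:
    """How many perfectly-halving questions isolate one item?"""
    questions = 0
    while n_candidates > 1:
        n_candidates = (n_candidates + 1) // 2  # one question halves the pool
        questions += 1
    return questions

print(questions_needed(1_048_576))  # 20
print(questions_needed(2_000_000))  # 21: "a couple million things" still fits
```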
So let’s try 20 Questions on our LLM
LLMs know everything. Can they think? Let’s use 20 Questions, as Bateson suggests. Here’s the verbatim transcript:
Me: Let’s play 20 questions. I have an answer. Ask.
ChatGPT: Is it a physical object?
Me: No.
ChatGPT: Is it a living thing?
Me: No. Aren’t living things physical objects. I thought you might be better at this.
ChatGPT: Understood. Is it an abstract concept rather than a concrete entity?
Me: Yes.
ChatGPT: Is it something that exists independently of human minds (e.g., a mathematical truth), rather than being socially constructed?
Me: No. Interesting.
ChatGPT: Is it a concept that only exists through shared human agreement or interpretation (e.g., norms, roles, meanings), rather than purely private mental experience?
Me: Yes.
ChatGPT: Is it primarily a rule or constraint on behavior, rather than a role, identity, or value?
Me: No.
ChatGPT: Is it primarily an identity or role that people occupy (e.g., professions, statuses), rather than a belief or value?
Me: No.
ChatGPT: Is it primarily a value or ideal (something people hold as important or desirable), rather than a belief about how the world is?
Me: No.
You can see it drifted into a limited semantic space and was not, in Bateson’s terms, multiplying by 0.5. The LLM wasn’t even fishing; it was wandering a semantic path that seemed likely given what had come before. It only got worse, because each question opened new dimensions in abstract space. The LLM never used a strategy of systematically dividing idea space based on the answers it had already collected. It was doing philosophy, or maybe daydreaming, when it should have been doing binary search. I even asked whether it was close or wandering. ChatGPT said “Close”. But it wasn’t. It was clueless.
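For contrast, here’s a hypothetical sketch of what “doing binary search” could look like in play, nothing any model actually does: keep an explicit pool of surviving candidates and always ask the question whose yes/no split comes closest to even, Bateson’s multiplication by 0.5. The candidate pool and predicates below are invented for illustration.

```python
# Hypothetical sketch of a systematic 20 Questions player: track the
# surviving candidates and always ask the question that splits them
# most evenly (Bateson's "multiplying by one over two").

candidates = {"tomorrow", "justice", "Mickey Mouse", "Sherlock Holmes",
              "a ten dollar bill", "a steam engine", "an octopus",
              "a pressure gauge"}

# Each predicate is a yes/no question over the candidates (toy examples).
predicates = {
    "Is it abstract?": lambda c: c in {"tomorrow", "justice"},
    "Is it fictional?": lambda c: c in {"Mickey Mouse", "Sherlock Holmes"},
    "Is it a physical object?": lambda c: c in {"a ten dollar bill",
                                                "a steam engine",
                                                "an octopus",
                                                "a pressure gauge"},
}

def best_question(cands, preds):
    """Pick the question whose yes/no split is closest to half and half."""
    def imbalance(q):
        yes_count = sum(1 for c in cands if preds[q](c))
        return abs(yes_count - len(cands) / 2)
    return min(preds, key=imbalance)

def eliminate(cands, preds, question, answer_is_yes):
    """Discard every candidate inconsistent with the answer."""
    return {c for c in cands if preds[question](c) == answer_is_yes}

q = best_question(candidates, predicates)
print(q)  # "Is it a physical object?" -- a perfect 4 yes / 4 no cut
candidates = eliminate(candidates, predicates, q, answer_is_yes=True)
print(sorted(candidates))
```

The point isn’t the toy data. The point is that the strategy requires holding the remaining space in mind and choosing each question against it, which is exactly what the transcript shows ChatGPT never doing.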
ChatGPT makes the diagnosis
After 27 questions, I asked ChatGPT what was going on. Why couldn’t it play the game? Its answer was insightful. Again, its verbatim answer:
Bateson would say the failure wasn’t lack of logic but misplaced logical typing.
I implicitly assumed the answer lived at the level of theories, principles, or frameworks—second-order abstractions used to justify action—when “tomorrow” is a first-order relational token: a deictic term whose meaning is entirely contextual and recursive (“the day after today”), not explanatory. That pushed me into higher logical types (normative criteria, legitimacy standards) when the correct move was to stay at the level of coordination signals embedded in shared temporal conventions.
Isn’t this amazing! The LLM couldn’t do the task, but it could tell me precisely why it failed. After this conversation I tried again. I thought maybe Bateson had picked something too abstract, too hard, just to make a point. So I tried something easier, but not a slam dunk. The answer was Mickey Mouse: concrete, cultural, famous. Same drift through semantic space. By question 6 it was asking “Is it something you can point to or demonstrate directly?” It never made a decisive early cut: fictional vs. non-fictional. That single partition would have gotten to “cartoon character” in three more questions. It could say why it failed, but it couldn’t adopt a strategy. Even when I told it the nature of the game and the human strategy of division, it didn’t start thinking. It couldn’t.
LLMs can spin a yarn but they can’t weave a cloth
As I started writing this post, I told Claude about ChatGPT’s failures. Gave it the full transcripts. Claude wanted a try. Latest model, deep thinking and all.
I like Claude, so I kept it simple. A ten dollar bill was sitting on my desk. Physical, a common object, you hold it in your hand.
Claude had exactly the same behavior, just a tighter rabbit hole. On question 4, after establishing it was something you could hold, it asked “is it a tool?” and I said yes — because currency is a paradigmatic tool, an instrument for exchange, storing value, coordinating behavior. The AI locked into the hammer-and-screwdriver neighborhood and spent 12 questions exploring hand tools, measuring devices, gauges. It never asked “is it paper?” or even “is it metal?” — the material cut a kid makes instinctively. A kid thinks in stuff. The AI thinks in categories because that’s what training data is full of. By question 20 it guessed “pressure gauge.”
After failing in the same way as ChatGPT, Claude correctly diagnosed that it had locked into a logical type (“tool = physical implement”) and couldn’t shift to “tool = instrument that does work in a system” and widen back out to broader categories of tool.
Both AIs understood exactly why they got lost. Neither could play differently, even after a discussion of the game and, in Claude’s case, the full example of what ChatGPT did wrong.
Knowing does not contain meaning
Our LLMs may know all the principles of the steam engine, but they are unlikely either to build one or, if they did, to recognize that it was a new thing with new capabilities. I don’t think it’s a liability of knowing everything; it’s simply that idea multiplication and division are out of reach. They can tell you why they fail, but they can’t use the diagnosis as understanding. I assume they learned about their limitations from training data. They didn’t learn them by failing, and they can’t fix them.
Bateson would say he has a pound of knowledge and an LLM has a ton of knowledge. But Bateson can do something with his meager knowledge that the LLM still cannot with all the knowledge in the world: Bateson can multiply and divide. The poor LLM just has endless connections in its model, associations and adjacencies and plausible paths through training data. But connection is not yet meaning.
As Bateson says, “Knowledge is all sort of knitted together, or woven, like cloth.” Meaning appears only when the threads are woven into a pattern that other minds can read. A tapestry takes the colored threads and depicts the unicorn and the castle only when woven with care and understanding. LLMs can only produce endless thread. They cannot make cloth. They don’t understand the goal of making the image.
Prefer to follow via Substack? You can read this and future posts (and leave comments) by subscribing to On Deciding… Better on Substack: Brain, Self, and Mind
© 2026 James Vornov MD, PhD. This content is freely shareable with attribution. Please link to this page if quoting.



