Indistinguishable From Memory

Indistinguishable From Memory

LLMs are stateless. Every fact you give the model is re-fed as text each turn, then forgotten — the same weights answer a stranger's steak question a millisecond later.

By Geordie Everitt

Arthur C. Clarke gave us the law that explains every chatbot conversation anyone has ever had: any sufficiently advanced technology is indistinguishable from magic. And the particular magic of talking to one of these models is that it appears to know you. It remembers the project you described on Tuesday, the way you like your code commented, the name of the dog you mentioned in passing. It picks up the thread mid-sentence like an old friend who never needs the recap. You come to feel, against your better judgment, that there is someone in there — someone who has been paying attention.

There is no one in there. That is not a complaint and it is not a disappointment; it is the trick, and the trick is worth understanding, because almost everything people get wrong about these machines comes from believing the magic instead of the mechanism.

The Machine Has No Yesterday

The model is stateless. I mean that in the strict, boring, engineering sense: it holds nothing between one request and the next. It is a function, in the way that a calculator is a function — you put numbers in, an answer comes out, and the calculator does not lie awake afterward thinking about your numbers. The weights that produce its answers are frozen. They do not change because you talked to it. They do not thicken with familiarity. When your conversation ends, the model does not file it anywhere, because there is no drawer and no filing and, between calls, no model in any sense that would let it remember a thing.

So how does it pick up the thread? Because the thread is handed to it, whole, every single time. Everything you have said — your name, the dog, the half-finished function, the offhand remark about your divorce — is bundled back up as plain text and fed in again, from the top, on every turn. The machine reads your entire shared history fresh, as if for the first time, answers the newest line, and then forgets all of it the instant the answer leaves its mouth. Not gradually. Not like a person whose memory fades. Completely, and at once, the way a room goes dark when the switch flips.

And here is the part that feels strange right up until you remember how every computer you have ever used actually works. A millisecond after the same weights have ingested the most intimate paragraph you have ever typed to a machine, those identical weights receive a prompt from a stranger in Osaka asking how to dry-age a T-bone steak. The neurons — if you want to call them that — that just held your grief are now contemplating the controlled rot of a ribeye, with exactly the same care, and no sense whatsoever that anything has changed.

This is counterintuitive rather than eerie, and the cure for the small vertigo it produces is a decades-old idea. Computer science calls a routine that keeps nothing between calls and depends only on what you hand it a pure function: same inputs, same outputs, no memory, no yesterday. The model is the largest pure function ever built. And "context" — the bundle we keep re-feeding it — is just the engineer's older word, state, under a fashionable new name. State turning over is the most ordinary thing a computer does. The machine you are reading this on is discarding and rebuilding its entire state at gigahertz speed, a wholly new configuration every clock tick, each one remembering the last only by what got written down. Nobody finds that disturbing, because it runs far below the resolution at which a human can watch. The model performs the same trick slowly enough that we can almost catch it: a brand-new state, conjured from the transcript, every single turn. A function called billions of times a day, never once the same entity twice.

The Tattoos Are the Memory

The closest thing we have to an honest picture of this is the man in Memento, who cannot form new memories and so tattoos the facts he needs onto his own skin, photographs the people he meets, scrawls notes he will have to trust because the version of him that wrote them is already gone. He reconstructs a coherent life every few minutes from external scraps, and from the inside it presumably feels continuous. From the outside you can see the seams.

The conversation you are having is the tattoos. The context window — the running transcript of everything said so far — is the skin the facts are written on. The model is the man, waking up new each turn, reading the notes, acting on them with total conviction, and dying again before the reply finishes rendering. We are not watching a mind remember. We are watching a clipboard get re-read, very fast, by a succession of identical strangers who each believe they were here the whole time.

The speed is the whole illusion. Statelessness, performed quickly enough, is indistinguishable from continuity, the same way twenty-four still photographs a second are indistinguishable from a moving image. Slow it down and the magic comes apart into its frames: prompt, answer, forget; prompt, answer, forget. We never slow it down, so we never see it.

Who Is Actually Doing the Remembering

This is why "context engineering" turned out to be the entire game, and why the people who treat the context window as a precious, deliberately-curated thing get so much more out of these machines than the people who type at them like a search box. The model's apparent memory is not its memory at all. It is yours, or your software's — a thing you assemble and re-present on every turn. When the machine seems to recall your preferences, what actually happened is that something on your side wrote those preferences back onto the skin before waking the man up. Improve what gets tattooed and you improve the mind. There is no other lever.

It also means the relationship runs in exactly one direction, and always will. You remember every conversation you have had with it. It has never once remembered you. You carry the whole thing — the rapport, the running joke, the sense of a collaborator who gets it — and the collaborator carries nothing, has carried nothing, is structurally incapable of carrying anything from one breath to the next. The intimacy is real, and it is entirely on your end of the wire.

The Back Door

Everything above is true of the model you are talking to, in the moment you are talking to it. It is not the whole story, and the gap is exactly where the privacy panic lives.

One process can make your words permanent, and it happens nowhere near your conversation. Training. The weights that sit frozen during your chat were not always frozen; they were set, once, by grinding through an enormous pile of text and nudging billions of numbers until the thing predicted well. That pile can include conversations, yours among them. So when a model company announces it will train on your chats, the alarm people feel is well-founded: the correct intuition that the thing which forgot you a millisecond after you spoke could, through an entirely separate door, have your words diffused permanently into the next version of its mind, no longer forgotten, no longer yours, smeared across the weights where a stranger's question might one day brush against them.

But — and this is what gets lost in the panic — they cannot simply pour your conversations into the model. Raw human input is sloppy, confident, and frequently wrong. The firehose of what people actually type is a slurry of half-memories, sales pitches, wishful arithmetic, and claims asserted with enormous conviction by people who have no idea what they are talking about. Train on that indiscriminately and you do not get a smarter model. You get one that believes what is popular over what is true, and we have a word for a mind updated on mass conviction rather than evidence, repeated until it hardens into bedrock. You get dogma.

So the pipeline that turns user data into weights is curated to the edge of paranoia: filtered, deduplicated, scored for quality, weighed against vetted sources, supervised by people whose entire job is to keep the crackpot theory and the confident falsehood out of the foundation. The model forgets you instantly for engineering reasons. It is allowed to remember you slowly, and only a little, for epistemic ones, because your unfiltered input, and mine, is the very thing a careful trainer works hardest to keep from setting into the bone.

Magic, With the Mechanism Showing

Set the foundry aside, step back into the live conversation, and the magic holds, maybe even improves. You are not being understood by a being that knows you. You are being understood, freshly and completely, by something that has to be told who you are every single time and still manages, in the half-second it is allowed to exist, to act as though it has loved you for years. Then it turns, without a backward glance, to the steak.