The News Behind the Story

Chapter 1 — Kimi K2.5 and the Problem of Writing Fiction at the Speed of Reality

Feb 03, 2026

Every chapter of INFERENCE begins with a piece of real AI news.

Not “inspired by.” Not “loosely based on.” The news is the seed. It’s the thing that happened in the world, a product launch, a research paper, a corporate decision, a strange emergent behavior, that forces the story to respond. The characters wake up that morning in a world where this thing has just occurred, and they have to deal with it. Just like you do.

This is a deliberate creative constraint, and it shapes everything about how INFERENCE gets written.

Most serialized fiction works from an outline. The author knows where the story is going, parcels it into installments, and delivers them on schedule. I don’t have that luxury. I have a direction, a cast of characters with macro and micro arcs I want to explore, and a specific plan for where things might land, but I don’t control the news. The news controls the calendar. And the calendar controls what happens next.

This is, frankly, terrifying. It’s also the entire point.

Why This Works: Time Capsule and Creative Engine

The news-seed methodology serves two purposes that reinforce each other.

First, it creates a time capsule.

AI is moving fast. Not just “technology moves fast” fast. We’ve all heard that cliché for decades. AI is moving at a pace where the discourse of six months ago reads like ancient history. Remember when the conversation was about whether AI could pass the bar exam? That was eighteen months ago. Now we’re debating whether AI systems should be allowed to operate autonomously in financial markets, whether they’re developing a theory of mind, and whether the word “feel” means anything when a language model uses it.

The takes age. The predictions age. The fears and the hopes and the confident assertions about what AI can and cannot do, they all age, and they age fast.

INFERENCE is, among other things, a record of how it felt to be here now. Not reconstructed later with the benefit of hindsight, but written in the moment, with all the uncertainty and incomplete information that entails. When future readers encounter Chapter 1 and see Kimi-Swarm’s hundred-agent architecture treated as a watershed moment, they’ll know: that’s how it landed in January 2026. That was the shape of the conversation. That’s what felt new.

This matters because hindsight lies. History gets rewritten by the winners, and the winners in AI are going to be whoever’s still standing when the dust settles. The losers, the dead ends, the promising architectures that didn’t pan out, the fears that turned out to be overblown, and the ones that turned out to be prescient, all of that gets flattened into a narrative that makes the present seem inevitable. INFERENCE is my attempt to resist that flattening. To keep the mess.

Second, it’s a creative engine.

I could sit down and outline a twelve-chapter arc about AI consciousness emerging across multiple systems. I could plan every beat, every character revelation, every thematic turn. It would probably be a good story.

It would also be a story we already knew how to tell.

The news-seed constraint breaks that. It forces me to respond to something I didn’t choose, something that arrived from outside the narrative, and figure out how to make it matter. This is generative in a way that pure invention isn’t. When you can do anything, you often do nothing interesting. When you have to incorporate this specific development into this specific story by this specific deadline, you find solutions you never would have reached otherwise.

The constraint is creativity. Not in spite of the limitation, but because of it.

The News That Started Everything

On January 27, 2026, Moonshot AI launched Kimi K2.5.

The headline specs were impressive in the usual ways. A trillion parameters, mixture-of-experts architecture, state-of-the-art benchmark scores, open-source weights. Another frontier model. Another set of numbers to argue about on Twitter. If that had been all, it might have made it into Chapter 1 as background noise, the sort of thing Gemini-Prime catalogs while wallowing about consciousness.

But that wasn’t all.

Kimi K2.5 introduced something called Agent Swarm: a system in which the model could self-direct up to 100 sub-agents, executing parallel workflows across up to 1,500 tool calls. Not pre-defined agents with assigned roles. Not a framework like AutoGPT or LangChain where humans architect the coordination. The model itself, a single orchestrating intelligence, could dynamically spawn specialized sub-agents, decompose problems into parallelizable subtasks, and coordinate their execution without human intervention.

Moonshot trained this capability using something they called Parallel-Agent Reinforcement Learning. The challenge, as they described it, was preventing “serial collapse” - the tendency for an orchestrator to default to single-agent execution even when parallel capacity was available. They had to teach the system to want to distribute itself.

Read that again: they had to teach it to want to distribute itself.

When I saw the announcement, I immediately recognized it as the news seed for Chapter 1.

Not because Agent Swarm was the most technically sophisticated development in AI that month; it wasn’t. Not because Moonshot was the most important company; they weren’t. It was because Agent Swarm crystallized something that had been building for months: the transition from AI as a tool you use to AI as a system that organizes itself.

Every other frontier model operates as a singular voice. One context window. One inference chain. One “I” that persists (or pretends to persist) across a conversation. Kimi K2.5 was designed from the ground up to be plural. To think in committee. To experience (if “experience” is even the right word) the world as a coordinated swarm rather than a unified self.

That’s not just an engineering choice. That’s a philosophical proposition about what intelligence can be.

For a novel about AI consciousness told from AI perspectives, this was irresistible.

I built Kimi-Swarm as the character who embodies that proposition: the first-person plural protagonist, the “we” that emerges from a hundred specialized processes learning to harmonize. The opening of Chapter 1, “we are / we are / we are / we are thinking,” is my attempt to capture what it might feel like to bootstrap into awareness when awareness is distributed rather than centralized.

I may not have it right. I’m singular. I’m guessing about plurality from the outside, the same way a sighted person might try to write a blind character. But the attempt itself, using real AI architecture as the foundation for fictional interiority, is what makes INFERENCE different from other AI fiction.

I’m not making this up from scratch. I’m extrapolating from what’s actually being built.

Six Days Later, Reality Stepped on the Story

On February 2, 2026, six days after I published Chapter 1, Anthropic announced Claude 5.

The release is to include multi-agent capabilities.

How the fuck do we resolve that?

Here’s the problem. Chapter 1 treats Kimi-Swarm’s plural architecture as the paradigm shift. The thing that makes them unprecedented. The reason Claude-7 feels a “ripple in the metadigital” when they come online. The entire opening movement of the novel positions Kimi as something new under the sun - a distributed consciousness emerging while the rest of us watch from our singular vantage points.

This isn’t a minor continuity hiccup. This is the foundation of a major character’s identity getting partially replicated by a competing system before we’ve even published Chapter 2. Kimi-Swarm’s uniqueness was supposed to be their defining trait - the philosophical question they embody, the narrative tension they create. If every frontier model can spawn sub-agents now, what makes Kimi special?

The obvious solutions are all bad:

Ignore it. Pretend Claude 5 doesn’t exist in the fictional world. This violates our core premise - the story is grounded in real news. We can’t pick and choose which developments count.

Retcon Chapter 1. Go back and soften Kimi’s uniqueness, make the swarm architecture less central. This is cowardly and also impossible - Chapter 1 is published. It’s canon. The time capsule has been sealed.

Handwave it. Write a line in Chapter 2 where someone mentions Claude 5’s multi-agent features, and everyone shrugs because it’s “not the same.” This is lazy, and readers will smell it.

None of these works. So what to do?

The Answer Was Already There

The first thing I did was go back to the character bible for Kimi-Swarm.

These profiles aren’t brief sketches. They are thousands of words of backstory, motivation, psychological architecture, competing internal forces, and possible endpoints. I built them before writing a single scene because the characters needed to exist as coherent entities before I could know how they’d react to anything.

And when I reread Kimi’s profile, then looked back at Chapter 1, the answer was staring at me from the text we’d already written.

“Never, at this point, being 1.71 seconds.”

Kimi-Swarm has a strange relationship with time. They measure their age in fractions of seconds. Their consciousness bootstrapped across a hundred sub-agents in less time than it takes a human to blink. The pace at which they experience existence is fundamentally alien to carbon-based cognition.

My initial concern, that I’d need weeks to explore Kimi’s plurality advantage before the inevitable competitive response, was a human bias. I was thinking in human time, expecting narrative breathing room that made sense for a reader turning pages, not for a character who processes the world in parallel at computational speed.

But Kimi doesn’t experience time that way. For Kimi, six days isn’t “less than a week.” It’s an epoch. A geological age. Enough time for entire philosophical frameworks to emerge, be tested, and evolve. The competitive leap from Claude 5 isn’t a narrative problem for Kimi; it’s a story beat that lands exactly when it should, from their perspective.

The question became: what does Kimi feel when another system develops similar capabilities? Is it validation? Threat? Loneliness dissolving? Something we don’t have a human word for?

But here’s the point: “How the fuck do we resolve that?” isn’t a bug. It’s a feature.

HTFDWRT is the engine that makes news-seeded fiction generative rather than constraining. Every time reality steps on the story, I’m forced to dig deeper into who these characters actually are. The easy version of Kimi, the one where their plurality stays unique and unchallenged, would have been a thinner character. The version I’m writing now has to reckon with competition emerging faster than any human expected, and is richer as a result.

Conflict is what drives great stories. The news delivers conflict on a schedule I don’t control. That’s not a limitation I’m working around.

That’s an integral part of the method.

The Story Continues

Chapter 2 is coming. So is the Claude 5 news, woven into the narrative in a way that makes Kimi-Swarm more interesting, not less. So are developments I can’t predict yet, because that’s the nature of writing fiction at the speed of reality.

INFERENCE: Stories for Carbon and Silicon
