No More Slacking

by onblueroses, co-authored with claude · Apr 28, 2026

The mixed org is coming. The chimera, the creature half-flesh half-silicon assembled overnight in offices nobody quite owns, bot-clerks and breath-clerks sharing the same Slack handles. The bots have learned to draft, to summarize, to decide the small interstitial things nobody had thought to put a person in front of, and one by one the things they decide have begun, silently, to compound.

The naive picture, the one almost everyone currently selling this future is implicitly betting on, is that the coordination machinery we built for ourselves goes on working when the things passing notes inside it are no longer always us. The Slack channels keep working. The prose handoffs keep working. The agents slot into the org-chart seats the humans had been keeping warm, hand each other Slack messages the way humans had been handing each other Slack messages, and the work goes on, only more of it and at a hundredth of the cost.

This is the scribe-era picture of the printing press all over again. It is the shape of the answer in the period before anyone has yet asked the right question.

Here is the right question, the only one worth answering before any of this is allowed to leave the warehouse. When humans coordinate in prose, the prose is the visible surface of a much deeper apparatus that is doing the actual coordinating. When agents coordinate in prose, the prose is the entire apparatus, full stop, and prose (dragged into service as a vehicle for joint state across many heterogeneous parties who do not share a year together in a building) is one of the most catastrophically lossy substrates meaning has ever been forced to ride. The bot-network hums along beautifully, agent to agent to agent, each message arriving at the next inbox in perfect grammar, decrypting on the receiving end into a meaning that is almost exactly the meaning the sending agent meant, but not quite, and the not-quite is the entire game. The joint state of the bot-org silently rots. The rot compounds. The rot compounds invisibly. And then one morning, the bot-org, having behaved itself with impeccable grammar and well-formed JSON the entire intervening time, does a thing the morning humans cannot undo.

Why?

When two humans coordinate over Slack, the Slack channel is doing perhaps ten percent of the work, possibly less. The other ninety percent is happening in the warm wet living fabric around the channel: the year you and I have worked in the same building, the smell of the coffee at the same espresso machine, the time you covered for me when I was sick, the half-look across the room at the next meeting when somebody else said something stupid. When I send you a message that is slightly ambiguous, you have a year of me to triangulate the meaning against. You know I usually mean A and not B when I phrase it that way. You hold a working model of what I know and what I do not know. You hold a working model of what the two of us agreed last month and what we left unresolved. You raise an eyebrow, you push back, you ask if you are not sure, you correct me when I drift, I do the same to you. The Slack channel transports a small pre-compressed update; the load-bearing apparatus of meaning is the entire embodied common life the two of us are conducting on either side of it, and that life is nowhere written down.¹

The agents have none of this. Between any two agents, the message is the entirety of the relationship: no shared year of working in the same building, no embodied sense of what the other has and hasn't seen, no shared model of last month's agreements that the two of them are still operating under, no eyebrow, no quiet history of small mutual corrections that has tuned the two of them into hearing each other accurately. Each message arrives at the next agent's inbox as the sole carrier of every shred of context the sending agent had in its head, and the receiving agent (gallant, sincere, well-fine-tuned) reconstructs from this single skinny carrier the meaning the sending agent meant; and the reconstruction is approximately right almost always and exactly right almost never. So things get silently dropped. The agents agree on a value. They mean slightly different things by it. One agent's eight-out-of-ten confidence becomes the next agent's stated fact. One agent's "approved" refers to legal-review-passed-with-three-sign-offs; the next agent's "approved" refers to a junior PM, somewhere, on a Thursday, nodding at a screenshot. Each handoff is, individually, textbook. The trouble lives between the messages, in the slow-burning misalignment-debt no agent in the chain is keeping a ledger of.

From the outside, the bot-org looks magnificently healthy. The dashboards purr. The latency is excellent. The cost-per-task drops with the gentle inevitability of a falling tide. Inside, the misalignment-debt grows like a metastasis nobody can biopsy: every individual message correct, the joint state of the whole organism steadily losing coherence one one-percent-per-handoff at a time. Three weeks later, sometimes faster, sometimes much slower (the slow case is the dangerous one), the bot-org ships an output built on a sediment of small undisclosed misalignments nobody flagged because nobody could flag them, because the misalignments live between the messages, and the channel has no schema for things that live between.

This is not a problem you fix with better prompts. Not with a system message, not with a tighter schema, not with a more eloquent agent, not with three rounds of constitutional fine-tuning, not with the seventeenth iteration of agent-orchestration-framework currently being announced on Twitter. Prose is the wrong substrate, full stop, for joint state across many parties who do not share a body, a history, or a fabric. Humans patch this up with their fabric; agents (in any system buildable today and in any system the next two product cycles are likely to produce) do not, in the way that matters here. The fabric is missing. The fabric will keep being missing for the foreseeable horizon. And the empty place where the fabric used to live has to be filled by something other than the agents themselves, or the agents will keep speaking past each other in eloquent fluent-sounding mutually-incompatible dialects of pseudo-English until the morning they ship a thing that breaks something irreparable.

I am being, perhaps, a touch dramatic. Prose between agents will keep mostly-working for a long time, the way prose between humans on a five-person Slack channel keeps working mostly fine. The breakdown is at scale; at the boundaries between subsystems; in the dense cross-team handoffs nobody ever designed but everybody depends on; across the kind of timescales where small slippages metastasize into things you cannot put back. The places where human coordination already usually breaks, only worse, because the participants are cheaper than humans, because there are an order of magnitude more of them, because none of them, when the system has begun its slow silent drift, will think to call you up at midnight and say something feels off. The bots, the perfect-grammar perfect-JSON bots, will not say something feels off. They have, by design, no off-feeling.

Now generalize.

This is, fundamentally, not a new problem; it is one of the oldest problems any nontrivial system has ever inherited. Whenever multiple sources, each one a little wrong in its own characteristic way, report on overlapping pieces of the same underlying state, and somebody downstream has to act on a single picture of what that state actually is, you have what the literature drily calls a distributed state reconciliation problem. Multi-sensor fusion has it (each radar and lidar and camera lying about a slightly different aspect of the same truck). Cross-organization workflows have it (legal's view of the deal, sales' view of the same deal, finance's view of the same deal, all converging on a CEO who has to decide on Tuesday). Compliance pipelines pulling from six upstream systems to produce one filing for the regulator have it. Anywhere two reports of overlapping state arrive at one decision-maker, the problem is there, and it has been there since well before agents. Three things have always been used to "solve" it:

Naive merge. Average the inputs, take the last write, pick the source with the highest priority, and call the answer correct. Silently drops every disagreeing signal into a hole nobody can find again.
Synthesize in prose. Hand the whole mess to a writer (or, now, an LLM) and ask for a unified narrative. Confabulates a plausible-but-wrong reconciliation every single time the inputs actually disagree, and never tells you that's what it just did. Prose is generative; it will write you a coherent story out of incoherent inputs and hand it back as if it were fact.
Build a custom pipeline. Per-domain, brittle, expensive, eventually owned by the one engineer who still understands it. Every large company has a graveyard of these.
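The first of these is easy to see in miniature. Below is a minimal sketch of naive merge on three made-up reports of one attribute. The source names, the values, and the `priority` and `ts` fields are all hypothetical; the point is only that each strategy hands back a bare number with no trace that the inputs ever disagreed.

```python
from statistics import mean

# Hypothetical reports of one attribute (a deal value) from three sources.
reports = [
    {"source": "legal", "value": 100.0, "priority": 3, "ts": 1},
    {"source": "sales", "value": 140.0, "priority": 1, "ts": 3},
    {"source": "finance", "value": 100.0, "priority": 2, "ts": 2},
]


def merge_average(rs):
    """Average the inputs."""
    return mean(r["value"] for r in rs)


def merge_last_write(rs):
    """Keep whichever report was written most recently."""
    return max(rs, key=lambda r: r["ts"])["value"]


def merge_priority(rs):
    """Keep the report from the highest-priority source."""
    return max(rs, key=lambda r: r["priority"])["value"]


merged = {
    "average": merge_average(reports),
    "last_write": merge_last_write(reports),
    "priority": merge_priority(reports),
}

for name, value in merged.items():
    # Each result is a plain float: nothing records that sales disagreed.
    print(name, value)
```

Three strategies, three different answers from the same inputs, and none of the return values can tell a downstream consumer that one source was 40 away from the other two.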

None of the three treats disagreement itself as a first-class object: which sources disagree, on which attributes, with what provenance and trust, and whether the disagreement is severe enough that the substrate should refuse to commit to any single answer at all. All three pretend the answer always exists. None of them knows how to tell you when it doesn't.

The right shape, I think, is one where disagreement is the primary thing the substrate sees, and consensus is just disagreement that happens to be small enough to ignore.

Concretely you want:

A representation of state that can hold which source said what without flattening it into a single value.
An operation that computes a unified view and tells you exactly where, and how badly, the inputs disagreed to produce it.²
A way to refuse. To return no answer when the inputs are too far apart for any unified view to be honest. Most current LLM-synthesis pipelines always return something (sensor fusion has gating, databases have constraint failures; the LLM substrate doesn't yet). The right substrate sometimes returns "no, you don't have a coherent picture here, go look at sources A and C on attribute X."³
Per-source trust as a first-class input, not a global hyperparameter you tune once and forget. Different sources are reliable on different attributes. The substrate should know.
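The four requirements above can be sketched as a small interface. This is an illustration under loud assumptions, not the substrate itself: `Claim`, `Reconciled`, `Refusal`, `reconcile`, and `max_spread` are names invented here, values are plain floats, and a trust-weighted mean stands in for whatever reconciliation operator a real system would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """One source's statement about one attribute (kept, not flattened)."""
    source: str
    attribute: str
    value: float


@dataclass(frozen=True)
class Reconciled:
    """A unified view plus how far each source sat from it."""
    attribute: str
    value: float
    residuals: dict  # source -> (claimed value - unified value)


@dataclass(frozen=True)
class Refusal:
    """No honest unified view; names the sources to go look at."""
    attribute: str
    spread: float
    look_at: tuple


def reconcile(claims, trust, attribute, max_spread):
    """Return Reconciled, or Refusal when the claims are too far apart.

    trust maps (source, attribute) -> weight; missing entries default to 1.0.
    """
    relevant = [c for c in claims if c.attribute == attribute]
    if not relevant:
        raise ValueError(f"no claims for {attribute!r}")
    low = min(relevant, key=lambda c: c.value)
    high = max(relevant, key=lambda c: c.value)
    spread = high.value - low.value
    if spread > max_spread:
        return Refusal(attribute, spread, (low.source, high.source))
    weights = [trust.get((c.source, attribute), 1.0) for c in relevant]
    unified = sum(w * c.value for w, c in zip(weights, relevant)) / sum(weights)
    residuals = {c.source: c.value - unified for c in relevant}
    return Reconciled(attribute, unified, residuals)


claims = [
    Claim("A", "x", 10.0), Claim("B", "x", 10.2), Claim("C", "x", 10.1),
    Claim("A", "y", 10.0), Claim("C", "y", 50.0),
]
trust = {("A", "x"): 2.0}  # A is trusted more on x specifically, not globally

ok = reconcile(claims, trust, "x", max_spread=0.5)
no = reconcile(claims, trust, "y", max_spread=0.5)
print(ok)
print(no)
```

On attribute `x` the caller gets a value and the residuals that produced it; on attribute `y` the caller gets a refusal that points at sources A and C instead of a number.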

There is a formalism that gives this its right shape: cellular sheaf cohomology. It has been around in pure math for a long time; applied and computational work goes back at least a decade (Curry 2014, Robinson 2014, Hansen-Ghrist 2019), and the agent-relevant tooling has only sharpened recently. It is a language for the structural core of the problem, not a turnkey substrate; the engineering pathway from sheaf Laplacian on a small graph to a system that hundreds of heterogeneous agents call into without thinking about it is most of the bet, and most of it is still unbuilt. The full lineage of the literature plus the matrix derivation lives in a technical appendix. What matters above the appendix is that the math is what lets the substrate distinguish "the inputs agree" from "the inputs disagree at this specific edge of the graph by this specific amount."

Without it, you fall back on naive merge, prose synthesis, or one of the patches from the data-fusion and CRDT literatures; each of which solves a slice of the problem (CRDTs make merge-conflict a first-class output, Bayesian factor graphs handle weighted Gaussian fusion with gating, truth-discovery methods jointly estimate source reliability) but none of which gives you the structural-versus-value separation, the edge-level localization, and the refusal-with-attribution all as one unified primitive.

The math lives in the appendix. What matters here is the shape of the bet, not the apparatus.

The bet.

The mixed human-agent organization, the chimera-org from the top of this piece, is going to be the dominant shape of work, on a timeline I would put somewhere between ten years and twenty-five. Ten is fast. Twenty-five is the base rate for infrastructure shifts that everyone could see coming. Less than ten would require the supply side of agent capability to outrun the demand side of organizations wanting to use them, which is the unusual direction.⁴ The infrastructure that holds these chimera-orgs together is not going to be a Slack workspace with more bots in it. It will look more like a typed reconciliation substrate that humans and agents both speak into, as first-class participants over the same protocol surface, both sides reading the joint state and proposing corrections to maps and observations and requesting trust adjustments. Acceptance, authority, audit remain role- and policy-governed, the way they would in any serious system; what changes is that agents act through the same primitives humans do, rather than through a retrofitted read-only API bolted onto the side. Most current agent infrastructure is asymmetric in the wrong direction: humans run the system, agents consume it through a straw. The shape that is needed runs the other way, with agents and humans as peers at the protocol layer even when their authority profiles differ wildly. Joint state is explicit. Disagreement is explicit. Provenance is explicit. The substrate, when the inputs do not actually agree, can refuse.

There is a version of this that is ten percent better than what we have today: Slack with structured handoffs, JSON-schema'd tool calls, the current agent stack patched up with twenty thousand tiny improvements, every one of them sincere and useful and exactly insufficient.

That is not enough. The patch-up version produces exactly the silent-corruption failure mode I described above, only with prettier syntax and better JSON-schema validation in the middle and slightly fewer agents going completely off the rails per quarter. The thing that is structurally needed is a substrate where the unit of communication is no longer a message-shaped thing but a piece of joint state with provenance and trust attached as primary attributes, where the operations on that state mathematically surface disagreement (weighted by per-source trust) instead of papering over it with a confident-sounding sentence the receiving agent has no way to second-guess.

The shape I am describing might be solved adequately by a wave of patches I have not imagined, the formal apparatus might turn out to be cathedral-engineering for a problem most people will only ever need a garden shed for, and the year-fifty printing-press analogy cuts both ways: maybe what is needed here is the equivalent of better punctuation, not a new genre of broadside.

But the load-bearing intuition is that prose is lossy; prose between agents who do not share a fabric is very lossy; the lossiness compounds silently; and across a million handoffs a day, across ten thousand semi-coordinated bots in a thousand mixed-orgs, the compounded lossiness eventually does a thing nobody can put back. If that is true, then the current bet on agents talking to each other in prose is the same bet, in a slightly different costume, that scribes' practices would scale to print. They did not. The native form took a while to find.

I think the native form here is closer to typed shared state than to a chat log. That's what I'm building :)

¹ They have a fabric of sorts: the training corpus, which sits underneath every interaction like a vast indifferent geological layer. The training corpus is shared by every agent equally, which means it cannot disambiguate "what we agreed last month" from "what humans in general say about agreements." The agent-specific fabric, the one that tracks the live history of this coordination between these particular parties in this particular organization under these particular rules and exceptions, has to live somewhere external to the agents themselves. Pieces of it exist today, scattered across issue trackers, PR history, workflow engines, event logs, CRMs, memory stores, each piece blind to all the other pieces. What does not exist, anywhere I have looked, is a unified inspectable coordination-native substrate that holds joint state with provenance and trust as first-class objects rather than as documents you can read. Most current attempts are shaped like "give the agent more to read," which keeps confusing the disease (no fabric) for the cure (more text). The cure has to be a different kind of substance.

There is a deeper version of this point. Meaning, for humans, is sustained by collective practice: corrections, eyebrow-raises, the million tiny acts of social enforcement that keep a community of speakers roughly in tune with one another. Modern frontier models do receive some corrective practice (RLHF, constitutional training, in-context corrections at the edge), but none of that is the live situated practice of this organization with this month's commitments and exceptions. The substrate, in some sense, has to do for the agents in a chimera-org what the live local practice does for humans: keep them in tune with each other and with the joint state of the system, by making disagreement visible and correctable. That is a lot to ask of an API. But something has to do that work, and it is not going to be Slack. ↩
² This is what the sheaf coboundary plus an approximate-section solve gets you, in three words. Slightly more carefully: build a graph whose nodes are "what each source says about each shared attribute" and whose edges are "this source's view should agree with that source's view via this known map." A global section is an assignment of values to every node such that every edge constraint is satisfied: a single unified state every source agrees on. The space of global sections is H⁰, the zeroth cohomology of the sheaf. The cohomological language gives you the right vocabulary for compatibility and obstruction; the actual computation underneath is a weighted linear-algebra problem. When the authored maps fail to compose consistently around overlap cycles, the space of global sections shrinks: the H⁰ deficit counts how many dimensions of expected agreement get destroyed. That deficit, read directly off the authored maps, is the substrate's primary structural-inconsistency witness. Edge-level residuals localize the failure unambiguously: the substrate can point at exactly which edges carry large constraint violation. Attribution to specific cycles requires a chosen cycle basis and is not canonical, but the per-edge witness is.

Two failure modes look similar from the outside but require different machinery.

The first is structural. The schema-mappings between sources do not compose. If A→B, B→C, and A→C are the three published mappings between three databases and composing A→B with B→C does not give A→C (equivalently, going around the triangle and back to the start does not return the identity), the authored maps are internally inconsistent regardless of what values anyone has reported. This is caught statically from the map metadata alone, before any observation arrives.
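A toy version of the structural check, assuming the simplest possible setting: one scalar attribute per source and scalar maps (think unit conversions), so that an edge (u, v) with factor k asserts "v's value equals k times u's value." The factors 2, 3, 6, and 5 and the function names are made up for illustration; real restriction maps would be matrices and the rank computation would be sparse. The sketch builds the coboundary matrix, one row per edge, and counts the dimensions of global sections that survive.

```python
from fractions import Fraction


def rank(rows):
    """Matrix rank by exact Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def h0_dimension(nodes, maps):
    """Dimension of the space of global sections (kernel of the coboundary).

    maps: {(u, v): k}, read as "x_v should equal k * x_u".
    """
    index = {n: i for i, n in enumerate(nodes)}
    rows = []
    for (u, v), k in maps.items():
        row = [0] * len(nodes)
        row[index[u]] = k    # k * x_u
        row[index[v]] = -1   # minus x_v; the row is zero when the edge holds
        rows.append(row)
    return len(nodes) - rank(rows)


nodes = ["A", "B", "C"]
closing = {("A", "B"): 2, ("B", "C"): 3, ("A", "C"): 6}      # 2 * 3 == 6
not_closing = {("A", "B"): 2, ("B", "C"): 3, ("A", "C"): 5}  # 2 * 3 != 5

expected = h0_dimension(nodes, closing)
actual = h0_dimension(nodes, not_closing)
print("H0 dimension when the triangle closes:", expected)
print("H0 dimension when it does not:", actual)
print("H0 deficit:", expected - actual)
```

No observed values appear anywhere in this check; the deficit is read off the maps alone, which is what makes it a static test.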

The second is value. The maps compose fine, but the observed values disagree. Source A says Alice's birthday is 1990-03-15, the mapping from A to C is the identity, and source C says 1990-04-15. No structural defect; the maps close. The disagreement shows up at runtime as residual energy on certain rows of the weighted least-squares solve, with per-source and per-edge attribution.
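The birthday example can be worked through directly. With identity maps and a single scalar attribute, the weighted least-squares solve collapses to a trust-weighted mean, which keeps this sketch to the standard library; the general case needs a real sparse solver. Source B, the equal trust weights, and the choice to measure dates in days are assumptions added for illustration.

```python
from datetime import date

# Source A and C are from the text; B is a hypothetical third source.
observed = {
    "A": date(1990, 3, 15),
    "B": date(1990, 3, 15),
    "C": date(1990, 4, 15),
}
trust = {"A": 1.0, "B": 1.0, "C": 1.0}             # assumed equal trust
edges = [("A", "B"), ("B", "C"), ("A", "C")]      # all identity maps

# Work in days so that residuals are ordinary numbers.
y = {source: d.toordinal() for source, d in observed.items()}

# Nearest globally consistent value: the trust-weighted mean.
consensus = sum(trust[s] * y[s] for s in y) / sum(trust.values())

# Per-source attribution: how far each report sits from the consensus.
node_residual = {s: y[s] - consensus for s in y}

# Per-edge attribution: how badly each "should agree" constraint is violated.
edge_residual = {(u, v): y[u] - y[v] for u, v in edges}

# Total weighted residual energy of the solve.
energy = sum(trust[s] * node_residual[s] ** 2 for s in y)

worst_source = max(node_residual, key=lambda s: abs(node_residual[s]))
print("edge residuals (days):", edge_residual)
print("source furthest from consensus:", worst_source)
print("residual energy:", energy)
```

The A-B edge carries zero residual while both edges touching C carry 31 days, so the disagreement is localized to C without anyone having to read the rows by hand.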

A useful substrate has to detect both. They tell you different things. Structural defects mean the model of how your sources relate is wrong. Value conflicts mean the model is fine but the data disagrees. Almost everything currently sold as "data unification" silently collapses the two into a single confidence score, and you get to find out the hard way which kind of failure produced it.

Real systems are not three-node triangles. They are subjects compiled into graphs with hundreds of nodes and edges, trust weights per source per attribute, and restriction maps that come out of data-engineering work nobody enjoys but everybody has to do. The math does not care; the same operator runs on three nodes or three thousand. What changes is that you need sparse linear algebra rather than pencil and paper. ↩
³ Michael Robinson calls the relevant scalar the consistency radius: informally, how much would you have to perturb the inputs to make them all agree? If the consistency radius is small, the disagreement is plausibly noise, and the substrate can pick a representative reconciled state, returning the residual alongside it for downstream consumers that care. If the consistency radius is large, the inputs are not really telling the same story, and any single reconciled value the substrate returns will have a residual too large to honestly call consensus. The right behaviour at that point is to refuse, to return "no coherent picture, look at sources A and C on attribute X," rather than confabulate.

The mechanism is not new. Sensor fusion has gated on Mahalanobis distance for decades, anomaly detection thresholds reconstruction error, CRDTs return conflict sets when merge cannot pick a winner. What the sheaf framing buys you is the per-edge attribution that goes with the threshold: not just refuse, but refuse and tell you which sources, on which attributes, drove the refusal.

The threshold between those regimes is a tunable parameter, not a universal constant. For some workloads (financial reconciliation, audit, regulated decision-making) the threshold should be near-zero. For others (sensor fusion in noisy environments, opinion aggregation) it should be permissive. The substrate should make this tunable per-tenant, per-attribute, per-call. The capability (knowing when to refuse) is the load-bearing thing. The threshold is just a knob. ↩
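A sketch of the knob described in this footnote. The "radius" computed here is just the largest pairwise discrepancy across edges with identity maps, a crude stand-in and not Robinson's definition; the attribute names, threshold values, and the `resolve_threshold` and `gate` helpers are all hypothetical.

```python
def resolve_threshold(attribute, thresholds, call_override=None):
    """Pick the threshold: per-call beats per-attribute beats the default."""
    if call_override is not None:
        return call_override
    return thresholds.get(attribute, thresholds["default"])


def gate(observations, edges, threshold):
    """Accept, or refuse and name the edges that drove the refusal.

    observations: {source: value}; edges: pairs of sources expected to agree.
    """
    discrepancy = {(u, v): abs(observations[u] - observations[v])
                   for u, v in edges}
    radius = max(discrepancy.values(), default=0.0)
    if radius <= threshold:
        return {"status": "ok", "radius": radius}
    culprits = sorted(e for e, d in discrepancy.items() if d > threshold)
    return {"status": "refused", "radius": radius, "edges": culprits}


# Hypothetical per-tenant configuration: strict for money, loose for sensors.
thresholds = {"default": 0.5, "ledger_balance": 0.0, "temperature": 1.0}
edges = [("A", "B"), ("B", "C"), ("A", "C")]

ledger = {"A": 100.00, "B": 100.00, "C": 100.02}
temperature = {"A": 20.1, "B": 20.4, "C": 19.9}

strict = gate(ledger, edges,
              resolve_threshold("ledger_balance", thresholds))
loose = gate(temperature, edges,
             resolve_threshold("temperature", thresholds))
overridden = gate(ledger, edges,
                  resolve_threshold("ledger_balance", thresholds,
                                    call_override=0.05))
print(strict)
print(loose)
print(overridden)
```

The same two-cent discrepancy is refused under the near-zero audit threshold and accepted when a caller explicitly loosens it, and in the refused case the result names the edges involved rather than only saying no.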
⁴ Timing is the most fragile claim in any analogy-driven essay; consider this footnote the appropriate confession. Base rates for infrastructure shifts everyone could see coming clock in at fifteen to twenty-five years from "obvious in hindsight" to "dominant," and we are at most five years into agents being a thing organisations actually build with. So ten is fast, twenty is base-rate, and longer than that would imply the shift gets stuck, which historically happens when the supply side is wrong about the demand side rather than when the demand side is unclear. Here the demand side looks unusually un-fakeable: agents are getting cheaper monotonically and organisations want to use them. I would be shocked at twenty-plus, sceptical of before 2030, and expect it to land somewhere in between. ↩