Code Is Still Law
Lawrence Lessig argued code regulates as powerfully as law. AI model weights now do the same, and nobody can read them.
The FOIA That Doesn't Work
For twenty-five years, open-source code served as a kind of freedom-of-information law for the digital world. If software constrained your behavior, you could read the source, hire someone to audit it, or build a competing version without the constraint. The legal scholar Lawrence Lessig gave this principle its sharpest articulation in his 1999 book Code and Other Laws of Cyberspace and its 2006 update Code: Version 2.0. His central claim was methodological, not triumphant: study code the way you'd study legislation, because it constrains behavior the same way. "Code is never found," Lessig wrote. "It is only ever made, and only ever made by us." His deeper argument was about transparency. When France asked Netscape to modify SSL encryption to enable government surveillance, the open-source architecture meant anyone could build a module without the backdoor. "The module that wins would be the one users wanted," Lessig observed. Transparency was the structural check on power. Not perfection, not democracy, but a minimum: the governed could read the rules.
Then the rules started being written in a language nobody can read.
The Opacity Escalation
Lessig's framework assumed a legibility spectrum. Statutes are public and debatable. Open-source code narrows the audience to those with programming literacy, but a determined regulator can still hire someone to read it. Closed-source code hides the text in practice while leaving it readable in principle. Reverse engineering exists, leaks happen, and "Closed code is the propagandist's best strategy," Lessig warned, but the strategy has cracks.
Model weights are something else entirely. You can download Meta's Llama 4 weights today, every parameter, every layer, the full architecture laid bare. You cannot read them. A 400-billion-parameter model encodes its constraints across billions of numerical values whose individual contributions to any given output no one can trace. In 2024, Anthropic used sparse autoencoders to identify individual features inside Claude that correspond to recognizable concepts, a genuine first step toward mapping neural circuits. The distance between that step and explaining why a frontier model produces a specific output is still measured in orders of magnitude.
Imagine a freedom-of-information law that releases all government documents, but in an alien language no one on Earth can translate. The documents are technically public, the transparency requirement technically met, and it accomplishes nothing. That is model weights. Not a locked filing cabinet but documents written in a script nobody has deciphered.
Lessig worried about code that hides its constraints. Model weights don't hide their constraints; they don't have constraints anyone can point to. The regulation is real, shaping what hundreds of millions of users can and cannot do every day, and the text of that regulation does not exist in any human-readable form.
Architecture Wins by Default
Four forces regulate behavior in Lessig's framework: law, norms, markets, and architecture. Architecture was the sleeper. It doesn't ask for compliance or cooperation. Law says "you must not." Architecture says "you cannot."
In April 2026, security firm OX Security published research showing that the Model Context Protocol, the open standard for connecting AI models to external tools, has a critical vulnerability enabling remote code execution across every major implementation. According to OX Security, as many as 200,000 servers were vulnerable. The protocol meant to connect agents became a universal attack surface: multiple layers of code, written by different actors with different incentives, interacting in ways none of them fully controlled. That is what architecture winning looks like in practice. Not a dramatic power grab, just quietly shipped infrastructure whose consequences emerge after deployment.
The other three forces are trying to keep up. Norms are pushing back hard: AI has worse favorability than ICE in American polling, according to an NBC News poll, with analysis by CNET and The Verge confirming AI ranked below ICE in favorability, and a Quinnipiac survey found more than half of Americans believe AI will do more harm than good. But you cannot demand changes to a system you cannot describe. Public hostility is real and diffuse, directed at architecture it cannot see into.
Law is trying hardest. The EU AI Act, the most ambitious AI regulation on Earth, imposes conformity assessments and technical documentation requirements on high-risk systems and general-purpose AI models, with obligations taking effect August 2, 2026. The framework was designed for predictable algorithmic systems. It landed in a world of emergent behavior, and conformity assessment implicitly assumes the system is auditable. For a model whose builders can't fully explain its outputs, that assumption fails before the audit begins. The United States has no federal AI legislation at all; Brookings described the situation as "the empty national AI policy framework." China's Interim Measures for Generative AI Services mandate algorithmic transparency and user-facing explanations, but a Chinese regulator auditing Alibaba's Qwen faces the same legibility wall as the Brussels AI Office evaluating GPT-5.5. The opacity escalation does not respect borders.
Lessig warned about this twenty years ago. "We should worry about a regime that makes invisible regulation easier," he wrote. "We should worry about a regime that makes it easier to regulate." Architecture is outrunning the other three forces because none of them can read what architecture is doing.
The Open-Weight Paradox
The solution Lessig proposed for closed-code power was structural: open the code. If anyone can read it, transparency becomes the check on architectural power. Open-weight AI models are supposed to serve this function. Meta's Llama releases weights with limited documentation. DeepSeek publishes weights, architecture details, and training methodology. A paper published on arXiv in April 2026, titled "Why Restricting Access to AI Models May Undermine the Safety It Seeks to Protect," argued that restricting access to weights may undermine safety, because access is a precondition for accountability. Governments cannot force a backdoor that sticks, for the same reason France couldn't force one into SSL: anyone can build a version without it.
But Lessig assumed that open code was readable. Open weights are downloadable but not interpretable. The structural safeguard is technically in place, and it accomplishes nothing, because the documents can't be read.
Lessig himself was careful about this distinction. He did not argue that open code was unambiguously good. "For some, the objective is to build code that disables any possible governmental control," he wrote. "That is not my objective." His value was transparency, not deregulation. Open weights deliver the deregulation without the transparency, the opposite of what he wanted. Treating weight access as the end of the transparency project, rather than its beginning, is a category error.
The two strongest objections to this argument are worth taking seriously.
The first: behavioral testing doesn't require interpretability. Red-teaming, evaluations, and behavioral audits test what a model does without requiring an explanation of why. This is not a hypothetical approach; it's how most safety evaluation works today. The analogy to medicine is instructive and underappreciated. Drug regulators approve medications whose mechanisms of action remain poorly understood, provided clinical trial evidence demonstrates safety and efficacy. Lithium was prescribed for bipolar disorder for decades before researchers began to understand how it worked. Acetaminophen remains somewhat mechanistically mysterious. If pharmaceutical regulation can function without complete molecular understanding, why can't AI governance function without complete model interpretability? Rigorous behavioral testing, applied systematically across diverse scenarios, might be enough. The systems don't need to be legible; they need to be safe, and safety is measurable through outcomes.
The second objection: interpretability research is on a trajectory, and it's moving faster than critics acknowledge. Anthropic's sparse autoencoder work, DeepMind's circuit analysis, and independent representation engineering research have all produced meaningful results in the past two years. The field barely existed five years ago; now it has dedicated teams at every major lab and a growing research community with shared tools and benchmarks. History suggests that "we can't understand X" is almost always a statement about the present rather than a law of nature. Give researchers a decade with the right incentives and legibility might arrive on roughly the same timeline as mature AI governance frameworks. Governance built on patience might prove wiser than governance built on panic.
These objections are strong enough that they could be right. They deserve the weight of genuine uncertainty rather than rhetorical dismissal.
But neither fully answers the structural problem. Behavioral testing catches the problems you think to test for. It is reactive by design, built on the assumption that you can anticipate the failure modes worth probing. With systems that exhibit emergent behavior and novel failure modes under distributional shift, the failures you didn't anticipate are precisely the ones that matter most. Medicine's tolerance for mechanistic opacity works partly because drug interactions are bounded in ways that model interactions with open-ended prompts are not. A drug has a defined chemical structure; a frontier model responds to an unbounded input space, and the behavioral surface is too vast to test exhaustively. The analogy holds up to a point, then quietly breaks down.
And interpretability research, however promising its trajectory, has not yet explained a single frontier model's behavior comprehensively. The gap may close, but governance requires legibility now, not eventually. Billions of people interact with these systems daily. "We'll understand it later" is a promissory note written against a capability nobody has demonstrated at scale, and the systems keep shipping while the note remains outstanding.
The argument is not that interpretability is impossible or that behavioral testing is useless. Both are necessary and good. The argument is that governance built on the assumption of legibility fails when legibility does not yet exist, and when governance cannot verify what the rules are, the question shifts from whether the rules work to something more basic: who is writing them, and by what authority?
Who Writes the Constitution?
The "constitution" of an AI system in practice is multi-layered: model training encodes base behavior, system prompts override it, deployment guardrails constrain outputs further, and enterprise compliance wrappers add another layer on top. Each layer is written by different people with different incentives. Unlike actual constitutions, there is no hierarchy, no Supremacy Clause, no clear answer to which layer wins when they conflict. A system prompt can override training, an enterprise filter can override both, and no layer is clearly supreme.
The result is not constitutional order but something closer to constitutional chaos. Lessig saw this coming for code twenty years ago. "Code codifies values," he wrote, "and yet, oddly, most people speak as if code were just a question of engineering."
Anthropic, OpenAI, Google DeepMind, and Meta all have alignment teams and published safety frameworks. Anthropic named their approach "Constitutional AI," the most articulate framing of the problem. But everywhere, foundational rules are written by engineers without public input or transparency, and with no appeals process when the model gets it wrong. Nilay Patel, editor of The Verge, described the worldview driving these decisions as "software brain," the conviction that everything is an optimizable system, every problem a database waiting to be structured. That worldview determines whose values get codified at each layer, and the people affected by those values have no seat at the table.
The diffuse authorship of trained models compounds this problem in a way that has no real precedent. At least with Lessig's closed-source nightmare, someone wrote the code. You could subpoena them, hold a hearing, ask what they intended and whether the outcome matched the design. Responsibility was obscured but not dissolved. With trained models, the "author" is a dataset of billions of text samples and an optimization function that minimized a loss metric. No human decided that Claude should refuse a particular chemistry question on Tuesday and answer it on Wednesday. No committee voted on the boundary between helpful and harmful. The regulatory choices are real, experienced by hundreds of millions of people as the texture of the tools they use daily, and their authorship is genuinely nobody's. The decisions exist without a decision-maker, which means they exist without anyone who can be held accountable for them.
This creates a crisis not just for transparency but for the entire concept of democratic governance over consequential infrastructure. If the governed cannot identify who wrote the rules, they cannot petition for changes. If the authors themselves cannot point to where a behavioral constraint lives in the model, they cannot testify meaningfully about it. The whole apparatus of accountability (legislative hearings, regulatory review, public comment periods) assumes that someone, somewhere, can explain what the system does and why. That assumption is breaking.
Constitutional law has a name for when private power becomes public enough to warrant constitutional constraints: the state action doctrine. Scholars are beginning to ask whether AI systems present a stronger case for this analysis than social media platforms, where the question is already being litigated in Moody v. NetChoice. But even state action doctrine assumes you can identify the action and the actor. Model behavior functions as regulation in the Lessig sense: architecture constraining behavior, experienced as the built environment rather than as a rule imposed from outside. Lessig's own formulation remains the sharpest version of the problem: "If code functions as law, then we are creating the most significant new jurisdiction since the Louisiana Purchase. Yet we are building it just outside the Constitution's review."
Reading the Law
The honest version of the demand is circular. We need interpretability audits for systems we cannot yet interpret. The EU AI Act's conformity assessment framework assumes a kind of system legibility that does not exist for frontier foundation models. China's algorithmic transparency mandates assume companies can explain systems whose reasoning is opaque to their own engineers. The regulations take effect against systems that exhibit emergent behavior nobody fully understands. The gap is not in the regulation itself but in the technology that regulation depends on.
Accept that this is a circular problem: governance requires legibility, legibility requires research, research requires governance to mandate and fund it, and someone has to go first. The circularity is real. The paralysis it produces is a choice. Lessig himself identified this kind of paralysis. "There are choices we could make," he wrote, "but we pretend that there is nothing we can do. We choose to pretend; we shut our eyes."
We have been here before, or close enough. Societies have repeatedly been governed by forces they could not fully read. Financial derivatives grew so complex that regulators couldn't model their risk. Global supply chains fragmented until no single actor could map the whole system. Legal codes swelled beyond any citizen's capacity to know what bound them. The response in every case was never perfect legibility. It was always some negotiated, imperfect, politically contested arrangement that made the illegibility survivable. AI governance will probably look the same: not a clean solution but a messy, ongoing argument about how much opacity a society is willing to tolerate and who bears the cost when that tolerance is wrong.
The difference is the pace. Every previous round of the legibility problem played out over decades. This one is measured in quarters.
Lessig closed Code with a line that still works twenty years later, updated only slightly for what has changed: "We build these models, then we are constrained by these models we have built. We will speak about their 'nature' making it so, forgetting that here, we are nature."
We are nature. And we can no longer read what we have written.