From Finding to Fixing: The Case for Client-in-the-Loop
The conference room has gone quiet again. Not the tense quiet of the first session, when the engineers watched the list scroll and could not look away. This is a different kind of quiet — the quiet of weeks passing, of meetings scheduled and rescheduled, of legal reviews and risk assessments and escalation chains that lead to more escalation chains. The list Mythos produced is still there. It has not shrunk. If anything, it has grown. And the question that nobody in that room has fully answered is the same one that was hanging in the air the moment the scrolling stopped.
Now what?
The Doctor Who Diagnoses and Leaves
Consider a patient who visits a doctor with a serious illness. The doctor is highly trained. The doctor has seen this condition before — many times. The doctor knows exactly what the illness is, understands its progression, and is fully capable of prescribing treatment. The examination concludes. The doctor closes the folder.
You have an illness, the doctor says. Best wishes to you. Our visit is now over. See you again soon.
The patient sits there, confused. That is it? That is the whole visit?
That is, in precise terms, what Project Glasswing currently is. Mythos performed the examination. Mythos produced the diagnosis — thousands of vulnerabilities, many of them critical, some of them decades old. Mythos is fully capable of prescribing the treatment. And then Mythos handed over the report and stepped back.
The industry calls this the boundary between discovery and remediation. It sounds like a reasonable professional distinction. In practice, at this scale and at this rate, it functions more like abandonment.
But there is a second version of this story, and it is equally uncomfortable.
In this version, the doctor performs the examination, identifies the illness, and writes the prescription. The patient takes the prescription slip, reads it carefully, thanks the doctor politely — and declines.
Thank you, the patient says. But no. I'll handle it myself.
That is the Glasswing partners right now. The diagnosis exists. The treatment exists. The capability to administer it exists. And organizations are declining — not out of ignorance, not out of negligence, but out of something more complicated and ultimately more interesting: a reluctance to cede control of their own code to a system they do not yet fully trust, operating under a liability framework that has not yet been written.
The Math Nobody Wants to Say Out Loud
Here is the arithmetic that the industry is carefully not discussing in public.
Security teams work at human speed. Triage, assignment, development, code review, testing, staging, deployment, verification — each step requires human attention, human judgment, human time. A competent security engineer might address a handful of significant vulnerabilities in a week under normal conditions. Mythos found thousands in days.
The gap between those two rates is not a gap that closes with more headcount. You cannot hire your way out of a machine-speed problem. If Mythos continued scanning at its current capability and human teams continued patching at their current capacity, the list would grow indefinitely. Not because the engineers are slow or careless — they are neither — but because the arithmetic is simply not in their favor.
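To make the arithmetic concrete, here is a minimal sketch of how the backlog behaves when discovery outpaces remediation. Every number in it is a hypothetical placeholder chosen for illustration, not a figure reported by Glasswing or its partners, and the function names are invented for this example.

```python
import math


def backlog_after(weeks, found_per_week, engineers, fixed_per_engineer_per_week, initial=0):
    """Return the open-vulnerability count at the end of each week."""
    backlog = initial
    history = []
    for _ in range(weeks):
        fixed = engineers * fixed_per_engineer_per_week
        # The backlog cannot drop below zero even if capacity exceeds inflow.
        backlog = max(0, backlog + found_per_week - fixed)
        history.append(backlog)
    return history


def break_even_headcount(found_per_week, fixed_per_engineer_per_week):
    """Smallest team size whose weekly fix rate matches the discovery rate."""
    return math.ceil(found_per_week / fixed_per_engineer_per_week)


# Assumed rates: 1,000 findings a week, 20 engineers, 5 fixes per engineer per week.
history = backlog_after(weeks=4, found_per_week=1000, engineers=20,
                        fixed_per_engineer_per_week=5)
print(history)                        # [900, 1800, 2700, 3600]
print(break_even_headcount(1000, 5))  # 200
```

Under these assumed rates the list grows by 900 items every week, and the team would need to grow tenfold just to hold it flat, before clearing a single item of the existing backlog.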
That is the actual liability sitting at the center of this conversation. Not the liability of letting an AI touch production code. The liability of a vulnerability list that compounds in silence while the industry debates the philosophical framework for acting on it.
Malicious actors are not having this debate. They do not convene ethics reviews before deploying their tools. They are not waiting for the liability framework to be written. They are using every available capability right now, today, against the same codebases that Glasswing identified as vulnerable. Every day the list sits unaddressed is another day on which attackers hold an advantage that defenders have chosen not to claim for themselves.
What Client-in-the-Loop Actually Means
The phrase "autonomous AI" has done significant damage to this conversation. It conjures an image that nobody in the serious security community is actually proposing — a model running loose through production systems, rewriting code without oversight, accountable to no one.
That is not what client-in-the-loop means. Not even close.
Client-in-the-loop is a specific architectural approach. Mythos identifies a vulnerability and proposes a remediation. The proposed fix — the exact code change, with full documentation of what it does and why — is presented to a human engineer for review and authorization. The engineer approves, modifies, or rejects it. If approved, Mythos implements the fix. Mythos then runs its own simulation suite against the change — testing for downstream effects, edge cases, interactions with other components, regressions — and reports the results back to the human team before anything goes to production.
The human is in the chain at every critical decision point. The AI does the work at which it is faster and more thorough. The human exercises the judgment for which humans remain, for now, the appropriate authority.
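The workflow described above can be sketched as a short pipeline with two human gates. This is an illustration under stated assumptions, not Mythos's actual interface: every class, function, and parameter name here is hypothetical, and the sketch assumes the approved fix is applied to a staging copy rather than directly to production.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Decision(Enum):
    APPROVE = "approve"
    MODIFY = "modify"
    REJECT = "reject"


@dataclass
class Proposal:
    vuln_id: str
    patch: str       # the exact code change being proposed
    rationale: str   # what the change does and why


@dataclass
class Outcome:
    vuln_id: str
    status: str
    log: List[str] = field(default_factory=list)


def remediate(proposal, review, apply_patch, simulate, release_gate):
    """Run one proposal through review, staging, simulation, and release."""
    log = [f"proposed {proposal.vuln_id}"]

    # Human gate 1: the engineer approves, modifies, or rejects the fix.
    decision, patch = review(proposal)
    log.append(f"review: {decision.value}")
    if decision is Decision.REJECT:
        return Outcome(proposal.vuln_id, "rejected", log)

    # The AI applies whichever patch the engineer returned (assumed: to staging).
    build = apply_patch(patch)
    log.append("applied to staging")

    # The AI runs its simulation suite and collects any failures.
    failures = simulate(build)
    log.append(f"simulation failures: {len(failures)}")

    # Human gate 2: results go back to the team before anything ships.
    status = "released" if release_gate(failures) else "held"
    return Outcome(proposal.vuln_id, status, log)


# Stub callables stand in for the engineer and the tooling.
outcome = remediate(
    Proposal("VULN-001", "add bounds check", "prevents a buffer overflow"),
    review=lambda p: (Decision.APPROVE, p.patch),
    apply_patch=lambda patch: f"staging+{patch}",
    simulate=lambda build: [],
    release_gate=lambda failures: not failures,
)
print(outcome.status)  # released
```

The point of the design is that both gates are callables supplied by the client: the pipeline cannot reach "released" without passing through them.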
This is not a radical proposal. It is a workflow. And versions of it already exist in software development — automated testing pipelines, AI-assisted code review, continuous integration systems that run thousands of checks before a human approves a merge. Client-in-the-loop for security remediation is an extension of practices the industry already accepts, applied to a problem the industry has not yet found another way to solve.
The Liability Flip
The conventional framing of the liability question goes like this: if an organization allows an AI to modify production code and something goes wrong downstream, who is responsible? The AI? The vendor? The organization that authorized it? The legal framework for answering that question does not yet exist in a clean form, and the absence of that framework is, for many organizations, sufficient reason to wait.
That framing has the liability pointed in the wrong direction.
The more accurate question is this: if an organization is informed of thousands of critical vulnerabilities in its production systems, has access to a tool capable of remediating those vulnerabilities, declines to use that tool, and those vulnerabilities are subsequently exploited — what is the liability then?
That question has a much cleaner legal answer, and it is not a comfortable one for the organizations currently sitting on unpatched Glasswing findings. Informed non-action in the face of known risk is not a liability shield. It is a liability.
The industry is so focused on the risk of doing something new that it has not fully reckoned with the risk of continuing to do nothing.
The Trust Question — and What Comes After It
Underneath the liability debate and the workflow arguments is a simpler and more honest question: do we trust the AI enough to let it do this?
The answer, at present, is: not quite. And that answer deserves to be taken seriously rather than dismissed.
Trust in a system like Mythos is not irrational to withhold. It has to be built. It has to be demonstrated through track record, through transparency, through the accumulation of instances where the system did what it said it would do and did not do what it said it would not do. We are early in that process. The Glasswing program itself is part of building that record — controlled access, documented findings, verified results, partner feedback. That is how institutional trust in a new capability gets constructed.
But here is what makes the trust question genuinely interesting rather than simply cautious: the question is not really about whether Mythos can fix bugs correctly. The evidence from the discovery phase alone suggests the answer to that question is yes, it can, with a thoroughness that exceeds what human teams can produce. The question is about something more complex — about oversight, about accountability, about what it means for an institution to remain in control of its own systems while delegating significant work to a non-human actor.
That is where the idea of meta-systems becomes worth thinking about seriously. Not AI replacing human oversight, but AI augmenting it — systems designed to watch the AI, to verify its reasoning, to flag anomalies in its proposed changes, to provide a layer of machine-speed accountability that human review alone cannot supply at the scale Glasswing demands. An AI that watches the AI, not to remove the human from the loop but to make the human's role in the loop more meaningful and better informed.
This is early thinking. The architecture for that kind of oversight does not yet exist in mature form. But the direction is clear, and it points toward something more sophisticated than either "humans fix everything" or "AI fixes everything unsupervised." It points toward a new kind of human-AI collaboration in which the AI's work is verified not just by humans but by systems built to hold the AI accountable at the speed the AI operates.
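As a very small illustration of what one layer of such oversight might check, the sketch below audits a proposed change against a few fixed rules and returns flags for the human reviewer. It is a rule-based stand-in, not an AI verifier and not an existing system: the field names, the thresholds, and the rules themselves are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class ProposedChange:
    vuln_id: str
    declared_files: Set[str]  # files the AI says the fix needs to touch
    touched_files: Set[str]   # files the patch actually modifies
    lines_changed: int
    tests_added: int


def audit(change: ProposedChange, max_lines: int = 200) -> List[str]:
    """Return human-readable flags; an empty list means nothing was flagged."""
    flags = []

    # A fix that edits files outside its stated scope deserves a closer look.
    undeclared = sorted(change.touched_files - change.declared_files)
    if undeclared:
        flags.append("touches undeclared files: " + ", ".join(undeclared))

    # Unusually large patches are harder for a human to review meaningfully.
    if change.lines_changed > max_lines:
        flags.append(f"large change: {change.lines_changed} lines (limit {max_lines})")

    # A fix with no accompanying test leaves the regression unguarded.
    if change.tests_added == 0:
        flags.append("no regression test accompanies the fix")

    return flags


change = ProposedChange(
    vuln_id="VULN-001",
    declared_files={"auth.c"},
    touched_files={"auth.c", "log.c"},
    lines_changed=12,
    tests_added=1,
)
print(audit(change))  # ['touches undeclared files: log.c']
```

The flags do not block anything on their own. They tell the reviewer where to look first, which is the sense in which a watcher makes the human's role better informed rather than smaller.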
Aaron's Take
I want to be clear about what I am and am not doing here. I am not prescribing. I am not telling Anthropic, or the Glasswing partners, or the security industry that they should immediately authorize AI-driven remediation and move on. These are genuinely hard questions and the people working through them are not stupid or cowardly. The caution is real and some of it is warranted.
What I am doing is turning the stones over and looking at what is underneath them.
Underneath the liability debate is a vulnerability list that is not getting fixed at any meaningful rate. Underneath the trust question is a capability that demonstrably exists and is not being used. Underneath the caution about AI touching production code is the reality that malicious actors have no equivalent caution — they are using every available tool right now, today, and they are not filing legal reviews before they do it.
The asymmetry there should be concentrating minds.
The meta-system idea interests me most. Not because it solves the trust problem immediately, but because it reframes it productively. Instead of asking "do we trust the AI enough?" — which is a question with no clean answer — it asks "what would a system look like that makes AI trustworthy enough?" That is an engineering question. Engineering questions have answers. They take time and work, but they are solvable in a way that philosophical debates about trust rarely are.
The vulnerability list is still scrolling. Somewhere, on a screen in a conference room, it is still scrolling. The question is whether the industry will build the systems needed to act on what it sees before the attackers do.
Tech Reader Magazine covers the ideas, institutions, and technologies reshaping the world — with the depth and editorial independence that daily news cycles rarely allow. Each piece examines not just what happened, but what it means: for the industry, for the organizations navigating it, and for the broader relationship between technology and human judgment. If the questions matter beyond today, they belong here.