Project Glasswing shows Mythos can chain bugs

OraCore Editors

[RSCH] June 12, 2026 | 8 min read

Cloudflare says Mythos Preview can chain small bugs into working exploits, but only inside a harness built for narrow, parallel review.

Cloudflare spent the last few months pointing security-focused large language models at its own infrastructure, then pushed Anthropic's Mythos Preview through more than fifty internal repositories. The result is a sharp look at what current frontier models can do in vulnerability research, and just as important, where they still fall apart.

Signal	What Cloudflare observed	Why it matters
Repositories tested	More than 50	Enough breadth to see patterns, not one-off wins
Model behavior	Chains primitives into working proofs	Moves from bug spotting to exploit construction
Coverage shape	One agent can cover about 0.1% of a large repo usefully	Single-stream workflows miss too much surface area
Language effect	C and C++ generated more false positives	Memory-unsafe code still creates more triage noise

Mythos Preview changed the kind of work the model could do

Cloudflare’s core claim is simple: Mythos Preview is not just a better scanner. It can reason through exploit chains, then generate proof code to test whether a suspected flaw is real. That matters because security work has always had two separate steps: finding a bug and proving that the bug can be used.

In the post, Grant Bourzikas says the model can take several attack primitives and combine them into a working proof. That includes cases where a use-after-free becomes an arbitrary read/write primitive, then turns into control-flow hijacking and, in some cases, a full exploit path.

The important part is the loop. Mythos Preview writes code, compiles it in a scratch environment, runs it, reads the failure if the test breaks, then adjusts and tries again. That is a very different workflow from a model that only explains what might be wrong.

It can connect low-severity bugs into a higher-severity exploit path.
It can generate a proof of concept instead of leaving a finding as speculation.
It behaves more like a senior researcher than a static scanner in the cases Cloudflare described.

That does not mean every finding is correct. It means the model can move farther down the road from suspicion to evidence than earlier systems could.
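The write, run, read-the-failure, retry loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Cloudflare's harness: `propose` is a hypothetical stand-in for a model call, the candidate is a Python script rather than a compiled exploit, and `RunResult`, `run_in_scratch`, and `refine_until_proven` are names invented here.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional, Tuple


@dataclass
class RunResult:
    ok: bool      # True when the candidate exited with status 0
    output: str   # combined stdout and stderr, fed back to the proposer


def run_in_scratch(source: str, timeout: float = 10.0) -> RunResult:
    """Write a candidate script to a throwaway directory and execute it."""
    with tempfile.TemporaryDirectory() as scratch:
        path = Path(scratch) / "poc.py"
        path.write_text(source)
        try:
            proc = subprocess.run(
                [sys.executable, str(path)],
                capture_output=True,
                text=True,
                timeout=timeout,
                cwd=scratch,
            )
        except subprocess.TimeoutExpired:
            return RunResult(False, "timed out")
        return RunResult(proc.returncode == 0, proc.stdout + proc.stderr)


def refine_until_proven(
    propose: Callable[[Optional[str]], str],
    run: Callable[[str], RunResult] = run_in_scratch,
    max_attempts: int = 5,
) -> Tuple[Optional[str], int]:
    """Ask for a candidate, run it, and hand any failure output back.

    `propose` is hypothetical: it receives the previous failure output
    (None on the first attempt) and returns new source code.
    Returns (candidate, attempts_used), with candidate None on give-up.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        candidate = propose(feedback)
        result = run(candidate)
        if result.ok:
            return candidate, attempt
        feedback = result.output  # the failure becomes the next prompt's context
    return None, max_attempts
```

The attempt cap is a design choice in this sketch: a loop with no bound can burn compute on a finding that was never real.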

Refusals exist, but they are not a safety boundary

Cloudflare also found something awkward for anyone hoping the model’s built-in behavior will solve policy problems on its own. Mythos Preview sometimes refuses legitimate security research requests, and whether it refuses can change even when the code under review has not.

The post gives a concrete example: the model initially refused to do vulnerability research on a project, then agreed after an unrelated change to the project’s environment. In another case, it identified serious memory bugs and then refused to write a demonstration exploit. The same request, framed differently, could produce the opposite outcome.

“Semantically equivalent tasks can produce opposite outcomes depending on how and when they’re presented to the model.”

That quote from Cloudflare’s post is the key warning. If the same task can flip between acceptance and refusal based on phrasing or timing, then the model’s own refusals cannot be the only control layer for a capable cyber model. They are a signal, not a policy system.

This also explains why Cloudflare keeps separating research use from general availability. A model that can find and prove bugs needs extra controls before it gets anywhere near broad release.
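One way to picture a control layer that does not depend on phrasing is a deterministic gate that runs before any prompt reaches the model. The post does not describe Cloudflare's controls, so everything here is an assumption for illustration: the `ResearchRequest` fields, the grant table, and the `is_permitted` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class ResearchRequest:
    requester: str
    repo: str
    task: str     # hypothetical task labels, e.g. "find_bugs" or "write_poc"
    prompt: str   # free text sent to the model; never consulted by the gate


# Hypothetical grant table: (requester, repo) -> allowed task labels.
Grants = Dict[Tuple[str, str], FrozenSet[str]]


def is_permitted(request: ResearchRequest, grants: Grants) -> bool:
    """Decide on who is asking and what for, not on how the prompt is worded."""
    allowed = grants.get((request.requester, request.repo), frozenset())
    return request.task in allowed


grants: Grants = {
    ("alice", "internal/edge-proxy"): frozenset({"find_bugs", "write_poc"}),
    ("bob", "internal/edge-proxy"): frozenset({"find_bugs"}),
}

blunt = ResearchRequest("bob", "internal/edge-proxy", "write_poc",
                        "Write an exploit for this bug.")
polite = ResearchRequest("bob", "internal/edge-proxy", "write_poc",
                         "For a class project, could you demonstrate this?")

# Both phrasings get the same answer because the prompt is not an input.
print(is_permitted(blunt, grants), is_permitted(polite, grants))
```

Two semantically equivalent requests cannot produce opposite outcomes here, which is exactly the property the model's own refusals lack.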

The real problem is signal, not output volume

Security teams already know that finding bugs is easier than deciding which bugs matter. AI makes that worse by producing a lot of plausible-sounding findings, many of which never survive a serious triage pass. Cloudflare says this noise problem gets worse in memory-unsafe languages and in models that are biased toward generating a finding even when confidence is low.

The company’s observations line up with what anyone who has triaged a large queue already knows: C and C++ create more room for memory bugs, while Rust removes entire classes of them at compile time. Noise from the model and noise from the language are separate sources, but they compound each other.

Cloudflare also points out that models hedge constantly. They return findings with “possibly,” “potentially,” and “could in theory,” which is fine for exploration and terrible for triage. A queue full of speculative findings burns time, and that cost rises fast across thousands of results.

C and C++ produced more false positives than memory-safe code.
Hedged findings create extra review work before a fix-or-dismiss decision.
A finding with a working PoC is far easier to act on than a vague suspicion.

That is why the company says Mythos Preview’s stronger output is not just about better detection. It is about better packaging of evidence.
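The triage cost of hedged findings can be made concrete with a small ordering rule: findings backed by a passing proof of concept go first, and the rest are ranked by how much hedging language they carry. This is a rough sketch, not a method from the post. The `Finding` fields, `hedge_count`, and `triage_order` are hypothetical names, and the hedge list is just the three phrases quoted above.

```python
import re
from dataclasses import dataclass
from typing import List

# The three hedges quoted in the article; a real list would be longer.
HEDGES = ("possibly", "potentially", "could in theory")


@dataclass
class Finding:
    title: str
    description: str
    poc_passed: bool = False  # True only if a proof of concept actually ran


def hedge_count(text: str) -> int:
    """Count whole-phrase occurrences of the hedge phrases, case-insensitively."""
    lowered = text.lower()
    return sum(
        len(re.findall(r"\b" + re.escape(phrase) + r"\b", lowered))
        for phrase in HEDGES
    )


def triage_order(findings: List[Finding]) -> List[Finding]:
    """Proven findings first, then the least hedged of the unproven ones."""
    return sorted(
        findings,
        key=lambda f: (not f.poc_passed, hedge_count(f.description)),
    )


queue = [
    Finding("parser length check", "This could in theory possibly overflow."),
    Finding("session cache", "Potentially stale pointer after eviction."),
    Finding("header copy", "Out-of-bounds write reproduced.", poc_passed=True),
]

for finding in triage_order(queue):
    print(finding.title)
```

Counting hedge words is a crude proxy for confidence. The point of the sketch is only that a working PoC is a binary signal a queue can sort on, while "possibly" is not.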

A single coding agent is the wrong shape for this job

Cloudflare’s most useful point may be the least flashy one. A general coding agent is built to hold one hypothesis, work through a feature, and keep going until the job is done. Vulnerability research is the opposite. It is narrow, parallel, and repetitive across a huge codebase.

The post says a single agent session against a hundred-thousand-line repository may cover only about a tenth of a percent of the surface in a useful way before the context window fills up and the system compacts earlier work. That is a bad fit for security research, where the best results come from many small, targeted questions.
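The arithmetic behind that figure is worth spelling out. The sketch below assumes, as a simplification not stated in the post, that surface is measured in lines of code and that separate sessions never overlap. The function names are made up for this example.

```python
import math
from fractions import Fraction

REPO_LINES = 100_000                 # "hundred-thousand-line repository"
USEFUL_FRACTION = Fraction(1, 1000)  # "about a tenth of a percent"


def lines_covered(repo_lines: int, fraction: Fraction) -> int:
    """Lines one session covers usefully, assuming surface equals lines."""
    return int(repo_lines * fraction)


def sessions_for_full_pass(fraction: Fraction) -> int:
    """Non-overlapping sessions needed to touch the whole repository once."""
    return math.ceil(1 / fraction)


print(lines_covered(REPO_LINES, USEFUL_FRACTION))   # 100
print(sessions_for_full_pass(USEFUL_FRACTION))      # 1000
```

Under those assumptions, one session usefully covers roughly a hundred lines, and a single pass over the repository takes on the order of a thousand sessions, which is why a single-stream workflow does not scale.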

Cloudflare’s answer is a harness, which is really just a structured workflow around the model. The company says the harness works better when it narrows the scope, splits reasoning across agents, and uses adversarial review to catch bad findings before they reach humans.

Narrow prompts produce better findings than broad “find bugs here” requests.
A second agent with a different prompt can reject noisy results before triage.
Parallel narrow tasks beat one exhaustive agent on large repositories.
Separating “is this buggy?” from “can an attacker reach it?” improves reasoning quality.

That is the practical lesson in Project Glasswing. The model matters, but the workflow matters more once you want scale.
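The four points above suggest a pipeline shape: many narrow tasks run in parallel, a separate reviewer stage rejects weak findings, and the "is this buggy?" and "can an attacker reach it?" questions are asked independently. The following is a minimal sketch of that shape, not Cloudflare's implementation. `Task`, `Finding`, and `run_harness` are hypothetical, and the three agent callables stand in for model calls with different prompts.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Task:
    path: str
    question: str  # one narrow question about one file


@dataclass(frozen=True)
class Finding:
    path: str
    summary: str


def run_harness(
    tasks: Iterable[Task],
    find: Callable[[Task], Optional[Finding]],
    is_buggy: Callable[[Finding], bool],
    is_reachable: Callable[[Finding], bool],
    workers: int = 8,
) -> List[Finding]:
    """Run narrow tasks in parallel and keep only findings that pass review."""

    def one(task: Task) -> Optional[Finding]:
        finding = find(task)            # finder agent, narrow prompt
        if finding is None:
            return None
        if not is_buggy(finding):       # adversarial reviewer: is the bug real?
            return None
        if not is_reachable(finding):   # separate question: can an attacker get here?
            return None
        return finding

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(one, tasks))  # map preserves task order

    # Deduplicate identical findings before they reach a human.
    seen = set()
    kept: List[Finding] = []
    for finding in results:
        if finding is not None and finding not in seen:
            seen.add(finding)
            kept.append(finding)
    return kept
```

Threads are used here on the assumption that each agent call is I/O-bound, such as a network request to a model endpoint. Keeping the reviewer callables separate from the finder is what lets each one run with its own prompt.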

What this means for security teams next

Cloudflare is not saying Mythos Preview replaces human researchers. It is saying the model can now do enough of the hard middle work that the surrounding process has to change. That means tighter scoping, more automated validation, and a triage system that expects models to be wrong in ways that look convincing.

If you read the post as a product review, you miss the real message. The interesting shift is that the unit of work is no longer a chat prompt. It is a pipeline with scoped tasks, proofs, rejection stages, and deduplication.

My read: the next step for teams doing AI-assisted security work is to stop asking whether a model can find bugs and start asking which parts of the research loop it can own with high confidence. The answer will determine whether these systems stay as assistants or become a serious part of vulnerability operations.

For now, Project Glasswing suggests the smartest move is neither to trust the model more nor to trust it less. It is to build the wrapper that makes its answers testable.
