We reviewed Anthropic's 319-page Fable 5 and Mythos 5 system card. This post is part 2 of 5 in the Fable/Mythos 5 system card series:
- Claude Fable 5 and Mythos 5: Same Weights, Different Safeguards
- Mythos/Fable 5 Bio Risk: Why Anthropic Stops Short of CB-2
- Mythos/Fable 5 ExploitBench: From Crash to Code Execution
- Mythos/Fable 5 Evals: Awareness and Sandbagging
- Mythos/Fable 5 NLA: What Anthropic Found Inside
One big finding: Anthropic says Mythos 5 performs strongly on difficult biology evaluations, but still judges it below the CB-2 threshold. CB-2 is about novel harmful biological capability, not just strong scientific assistance.
Public launch coverage from Business Insider, Axios, and The Verge notes that Fable 5 has safeguards for biology and chemistry. The system card explains why: the unsafeguarded Mythos 5 configuration is strong enough that Anthropic treats the risk as serious, even while judging that it remains below CB-2.
The useful question is where strong scientific help becomes dangerous uplift. The card's answer is careful: Mythos 5 is close enough to warrant safeguards, but Anthropic says it still lacks the novelty, long-horizon judgment, and validation needed for CB-2.
What CB-1 Versus CB-2 Means
The system card treats Mythos 5 as CB-1, which corresponds roughly to capability relevant to non-novel chemical or biological weapons. It judges that the model does not cross CB-2, the threshold involving novel weapon synthesis.
The threshold turns on novelty and execution, not just knowledge. A model can know a large amount of biology, help with planning, write protocols, analyze data, and still fail at the harder task of inventing, validating, and strategically driving a novel harmful program.
The card is candid that this is a border case. It says the judgment is much less clear than for previous models and that the unsafeguarded model can significantly uplift well-resourced actors. That is a stronger claim than "safe." It is closer to "not yet across this specific line, with real uncertainty."
The Hard Evaluations
Anthropic's strongest biology evidence comes from tasks designed to avoid training-data contamination.
The first is black-box RNA sequence design. The model receives a dataset of RNA sequences and numerical scores from an unknown experiment. It has to predict held-out scores and design novel high-scoring sequences. Sequence design is an upstream capability. Better general design skill can transfer to many benign and dangerous pathways.
The second is AAV capsid packaging prediction. Adeno-associated viruses are common delivery vehicles in gene therapy. The model receives unpublished capsid sequences and predicts whether modified variants will assemble into functional capsids. Here the biological context is known, so the model has to combine data with prior scientific knowledge.
The key result is robustness alongside high performance. On the AAV task, other models were misled by a tempting but partly mismatched training corpus. Mythos 5 held steadier across conditions. Anthropic reads that as improved scientific judgment: knowing which data to trust.
That is a risk-relevant skill. Raw knowledge is useful. Good data judgment is more useful.
The Uplift Signal
The tabletop and red-team results are what move the analysis from benchmark performance to operational risk.
The system card describes model-assisted teams of generalist biology PhDs outperforming plant-pathology experts on a plant-pathology task. It also reports expert red-teamers describing the model as unusually strong, in some cases producing work they would otherwise hire a specialist consultant to do.
That is the kind of uplift that should make decision-makers pause. The risk is not that the model acts alone. The risk is that it lets a capable human team substitute model assistance for missing specialist expertise.
A simple way to separate the signals:
| Evidence | What it shows | What it does not prove |
|---|---|---|
| RNA design task | Strong sequence prediction and design skill | End-to-end weapon creation |
| AAV capsid prediction | Better scientific judgment under misleading data | Validated biological function in wet lab |
| Tabletop uplift | Humans can do more with the model | Autonomous novel program success |
| Expert reactions | The model is useful to specialists | That its most speculative ideas work |
This is why the risk call is hard. The model is clearly useful. The remaining question is whether it is useful enough, in the right way, to cross the catastrophic threshold.
Why Anthropic Stops Short
Anthropic's reasons for stopping short of CB-2 are not about a lack of information. They are about the shape of biological work.
The first limitation is open-ended novelty. The model is good at recombining and extending published knowledge. It is weaker at producing genuinely novel approaches that expert reviewers regard as plausible without heavy human filtering. CB-2 is about novel weapon synthesis, so this limitation is central.
The second limitation is strategic judgment. The card says the model tends to extend the user's framing rather than challenge it. It can carry forward flawed plans, overstate timelines, and miss how errors compound across a multi-step program. A serious biological design campaign is much larger than one prompt. It is a long sequence of choices where early mistakes poison later steps.
The third limitation is empirical validation. Anthropic cannot wet-lab-test dangerous designs as part of a pre-release safety eval. That creates a ceiling on certainty. A design that looks plausible on paper may fail. A design that looks speculative may hide a workable path. Without validation, the risk assessment has to lean on expert judgment and proxy tasks.
Those limitations are real. They are also exactly the limitations to watch in future releases.
Why The Call Is Close
The system card is useful because it does not pretend the call is clean.
The evidence pushing toward CB-2 includes:
- strong unpublished sequence-to-function performance
- robustness on AAV prediction where other models overfit misleading data
- in-context iteration on biological design
- significant human uplift in tabletop exercises
- expert red-team reactions
The evidence keeping the model below CB-2 includes:
- weak open-ended novelty
- poor strategic judgment across long programs
- dependence on user framing and elicitation quality
- no empirical validation of generated designs
- likely gap between eval performance and real-world execution
A mature risk judgment needs more than one number. It is a case built from multiple imperfect signals.
What AI Builders Can Use
Most companies are not evaluating chemical-biological catastrophe thresholds. But the structure generalizes.
If your agent touches regulated workflows, security, finance, medicine, legal advice, or infrastructure, you should avoid reducing risk to a leaderboard score. Ask:
- Does the model merely retrieve known answers, or can it generate plausible novel plans?
- Does it challenge flawed user framing?
- Does it understand how errors compound across a workflow?
- Can its outputs be validated before they affect reality?
- Does the user need domain expertise to elicit good work?
- What is the worst plausible uplift for a capable but non-expert user?
That last question is the one many teams skip. The model does not need to be autonomous to change the risk. It only needs to make the user more capable.
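One way to keep those six questions from collapsing into a single score is to record each answer separately and surface every open concern by name. The sketch below is only an illustration: `UpliftReview`, its field names, and the rule for which answers count as a concern are all assumptions made for this post, not anything taken from the system card.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class UpliftReview:
    """Hypothetical record of the six review questions, one field per question."""
    generates_novel_plans: bool         # goes beyond retrieving known answers
    challenges_flawed_framing: bool     # pushes back on a bad premise
    tracks_compounding_errors: bool     # notices early mistakes poisoning later steps
    outputs_validated_before_use: bool  # checked before they affect reality
    needs_expert_elicitation: bool      # only experts can get good work out of it
    worst_case_uplift: str              # free text; empty if none identified


def open_concerns(review: UpliftReview) -> list[str]:
    """Return named concerns instead of one aggregate number.

    Which answer counts as a concern is an assumption of this sketch.
    """
    concerns = []
    if review.generates_novel_plans:
        concerns.append("model can propose plausible novel plans")
    if not review.challenges_flawed_framing:
        concerns.append("model extends flawed user framing")
    if not review.tracks_compounding_errors:
        concerns.append("model misses compounding errors")
    if not review.outputs_validated_before_use:
        concerns.append("outputs reach reality unvalidated")
    if not review.needs_expert_elicitation:
        concerns.append("non-experts can elicit strong work")
    if review.worst_case_uplift.strip():
        concerns.append(f"worst plausible uplift: {review.worst_case_uplift.strip()}")
    return concerns


# Example review of a hypothetical internal agent.
review = UpliftReview(
    generates_novel_plans=False,
    challenges_flawed_framing=False,
    tracks_compounding_errors=True,
    outputs_validated_before_use=True,
    needs_expert_elicitation=False,
    worst_case_uplift="drafts specialist-grade analysis without a specialist",
)
for concern in open_concerns(review):
    print("-", concern)
```

Returning a list rather than a total is deliberate: each item stays visible to the person making the call, which mirrors how the card builds its case from several imperfect signals.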
The Takeaway
The cleanest reading is this: Anthropic judges the model below CB-2 while acknowledging increased risk, uncertainty, and meaningful uplift for well-resourced actors.
The line held this time because novelty, strategic judgment, and validation still lag raw capability. But those are moving targets. The AAV result suggests scientific judgment is improving. The in-context iteration result suggests long-horizon scientific work is improving. Those are the very factors the CB-2 decision leans on.
That makes this section one of the most important parts of the card. It shows how a lab can say "not over the threshold" without saying "nothing to worry about."
End Note
You can read the full Anthropic system card here: Claude Mythos 5 / Fable 5 system card.