AI safety theater exposed

  • @ClementDelangue and @elder_plinius both call frontier guardrails "shallow smokescreens"—jailbreaks are trivial, yet Anthropic still markets them as meaningful protection.
  • @zoink pushes for CVSS-style risk scoring to distinguish "LLM Advil" from "LLM Heroin," highlighting how the term "jailbreak" conflates harmless prompts with real threats.
  • @arthurctellis notes the real tradeoff: aggressive safety filters block legitimate research while doing little to stop determined actors.

Export controls as competitive weapon

  • @AnthropicAI confirms the US suspended foreign access to Fable 5 and Mythos 5; @steph_palazzolo reports Amazon flagged security risks to Trump officials, triggering the controls.
  • @dylan522p and @yishan argue OpenAI now has incentive to sandbag releases to avoid similar bans, turning safety rhetoric into market-share strategy.
  • @AndrewCurran_ points out even green-card holders like Karpathy are locked out, accelerating the "citizenship required" reality @MohapatraHemant predicted.

Moats are downstream of stateless compute

  • @thdxr states the core problem: models are interchangeable overnight, so all the safety theater, export drama, and hype cycles are attempts to create artificial scarcity.
  • @ZenMagnets shows the flip side—Alibaba's Qwen fades while a Rio de Janeiro city IT dept ships a 397B model, proving geography and openness matter more than pedigree.

Anthropic's self-inflicted regulatory trap

  • @DavidSacks, @firstadopter, and @MatthewBerman converge: Dario's repeated "nuclear weapon" framing spooked politicians into action; now the company is surprised when the rules apply to them.
  • @buccocapital and @nic_carter note the hypocrisy—beg for regulation, then complain when it constrains your own customers and employees.

## On my radar

  • @RampLabs released Ramp SWE-Bench, a private, production-grade coding benchmark—worth watching as a potential standard for real-world agent evaluation.
  • @jeff_weinstein launched Stripe Projects as an agent skill across Hermes, Factory, and Warp—first credible "agent-native infrastructure" primitive.

## Thread to pull

If export controls and safety theater both tighten, will frontier model development effectively nationalize, or will open labs in non-aligned jurisdictions (Rio, Shenzhen, elsewhere) simply pull ahead on capability while the US optimizes for control?