Anthropic Meters Programmatic Usage, Mythos Cracks Cyber Ranges & Pocock Re-Grills
Claude Code & Anthropic Updates
The $20-to-$200 programmatic credit — a rebrand or a 10× cut?
The thread that ate everyone's timeline. @ClaudeDevs announced that from June 15, paid Claude plans get a dedicated monthly credit for programmatic usage, covering:
- Claude Agent SDK
- claude -p
- Claude Code GitHub Actions
- Third-party apps built on the Agent SDK
Matt Pocock's read (218k views, 2.7k likes):
This is the clarity we've been crying out for. But it's a poisoned chalice. This is a 10X cut to claude -p disguised as a monthly bonus. Anthropic is discouraging any kind of programmatic usage. And that's fine — no subsidy lasts forever. But it's time to try Codex.
In follow-ups he noted he'd been asking for clarity on this for nearly two months and that he had built a whole orchestration framework around claude -p. Other replies framed the same trade clearly: "a cap on programmatic usage is the headline. For a team running 5 agents the new math is a 10-20x downgrade dressed up as a credit." Pocock also went live on YouTube to walk through the math.
Theo: "any statement from an Anthropic employee is a lie on a timer"
Theo Browne's reaction post hit 494k views:
I can't help but feel personally burned by the Claude Code changes announced today. We put so much work into wrapping the (atrocious) Claude Agent SDK in T3 Code… Now our users are getting their rate limits cut by 40x, despite us doing everything right.
Until we see significant change, it is safe to assume any statement from an Anthropic employee is a lie on a timer.
He followed up with a $10-per-cancelled-subscription bounty for open source (cap: $20,000) and posted that "Anthropic just freed up a bunch of compute by blocking open source devs and apps from using Claude Code". He also corrected the narrative on T3 Code — T3 makes no money on it; users bring their own inference (Codex / Claude Code / Cursor / OpenCode) and T3 just gives them a better UI on top. He says Zed users will be hit too because ACP won't work with subscription limits, forcing them back to the terminal app.
Jeremy Howard's add-on community note made the structural point that's resonating with critics:
This policy redefines the term "interactive" to mean "using an Anthropic front-end". If you use claude -p or Agent SDK to do something interactively, it now uses credits, not your subscription limits.
Armin Ronacher just sighed: "This at least makes their policy consistent." And Simon Willison's one-liner — "Doing this is a great way to make a bonfire of your reputation" — felt like it landed in the same crater.
The sweetener: weekly limits +50% through July 13
@ClaudeDevs posted that Claude Code weekly limits are increasing 50% through July 13 for Pro, Max, Team, and seat-based Enterprise. The interactive pool is unchanged; the new credit is the programmatic pool. Several replies pointed out this only sharpens the implicit deal: interactive in the Anthropic-blessed UI is still subsidised, anything else is metered.
Mythos Preview clears the AISI Cooling Tower
Boris Cherny, who leads Anthropic's offensive-AI program Glasswing, shared a Mythos Preview update (122k views):
The UK AISI found Mythos Preview is the first model to solve both their cyber ranges end-to-end. No model had ever solved the AISI's "Cooling Tower" cyber range before. We're getting it to defenders as fast as we responsibly can.
The quoted background thread says Mythos cleared every AISI task estimated >8 hours within the deliberately low 2.5M-token cap; XBOW called the precision "token-for-token, unprecedented", and Glasswing partners report finding "sometimes double what they'd normally find in a year" in high/critical vulns. Reports: XBOW · UK AISI.
/goal as the third autonomy rung
Boris also previewed the /goal slash command in Claude Code, framed as the AI-evaluated tier of autonomy. Swyx tied it together neatly:
increasing levels of autonomy:
- /skill: preset prompts
- /plan: human-refined inputs
- /goal: AI-evaluated outputs
Replies dug into the tension — "each rung loses a class of guarantee. /skill: deterministic. /plan: reproducible. /goal: probabilistic — no two runs need match. The hard problem isn't autonomy itself, it's that you stop being able to ask 'did it work' and have to ask 'did the distribution shift'."
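One way to make "did the distribution shift" concrete is to run the same goal several times before and after a change and compare pass rates. The sketch below is not from the thread: it assumes each run can be scored pass/fail, and it swaps in a plain two-proportion z-test as the comparison. The function names and the 1.96 threshold are illustrative choices, and the normal approximation is rough for small batches.

```python
import math


def two_proportion_z(pass_a, n_a, pass_b, n_b):
    """z-score for the difference between two pass rates (pooled variance)."""
    p_a = pass_a / n_a
    p_b = pass_b / n_b
    pooled = (pass_a + pass_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Every run passed or every run failed in both batches: no evidence of a shift.
        return 0.0
    return (p_a - p_b) / se


def distribution_shifted(pass_a, n_a, pass_b, n_b, threshold=1.96):
    """True when the pass rates differ by more than `threshold` standard errors."""
    return abs(two_proportion_z(pass_a, n_a, pass_b, n_b)) > threshold


if __name__ == "__main__":
    # Hypothetical numbers: 18/20 passes on the old prompt, 11/20 on the new one.
    print(distribution_shifted(18, 20, 11, 20))
```

The point of the sketch is the shape of the question: a single green run says little about a probabilistic rung, so the unit of evidence becomes a batch.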
Agentic Coding & Agent Harnesses
Steipete's claw orders an Uber over Tailscale
Peter Steinberger streamed an Android phone from a data centre to his Mac via Tailscale + scrcpy, then drove the screen with his claw via Peekaboo (127k views, 1.6k likes). "Now my claw can order me an Uber." When skeptics called it overengineered, he replied: "How many services are not available on the web but exist as app…" — including payments, if you disable biometrics. Adjacent: he showed Codex resolving a Telegram token rotation by using Peekaboo to drive the Telegram Mac app and talk to BotFather.
He also shipped Crabbox 0.13.0: Modal sandbox runs, full resync for stale workdirs, native Windows script support, clearer SSH/sync error hints. Repo: github.com/openclaw/crabbox.
T3 Code, OpenCode, and the harness exodus
Theo rebuilt the T3 Code marketing page to emphasise its BYO-inference model and asked for testimonials. The tone of the replies is uniformly "already migrated, never going back." Many of Matt Pocock's commenters echoed this: "Codex run is coming for everyone whether they like it or not."
Theo: CLI vs desktop, and "impossible to paste images over SSH"
A couple of zoomed-out posts worth their own bullet:
- "Just learned it's literally impossible to paste images into Claude Code over SSH. How do you CLI people live like this??"
- "Are you still using the CLI versions of your preferred agent instead of desktop apps like Codex App, Conductor, or T3 Code? Tell me why below."
Codex & OpenAI Updates
Codex's in-app browser learns viewport switching
LLMJunky highlighted a small but telling Codex update: the in-app browser can now resize the viewport to test mobile / tablet / desktop breakpoints, take screenshots at key points, and even hide the IAB to disable animations and run testing 1–2× faster. "You can tell they really use and love these products internally."
GPT 5.5 in Cursor
"GPT 5.5 in Cursor is honestly cracked. The psychosis got me good tonight," reported am.will — light on detail but the replies ("the codebase indexing and all features are very helpful") suggest GPT-5.5 has settled into Cursor as the default option people are reaching for after the Claude Code wobble.
A Codex plugin for daily dependency vuln-scanning
A retweet via @joe_lgtm / @LLMJunky describes a scheduled Codex plugin that "inventories dependency manifests, checks exact package/version exposure through OSV, pulls current vulnerability intelligence from CISA KEV, NVD, RSS feeds, and optional X recent search" — i.e. a daily cron that knows your supply chain. Practical pattern worth stealing.
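For anyone who wants to steal the pattern, here is a minimal sketch of the first two steps only (inventory a manifest, then check exact versions against OSV). It is not the plugin's code, which the post does not show. It assumes a `requirements.txt`-style manifest with `name==version` pins and uses OSV's public `POST /v1/query` endpoint; the helper names are made up for illustration, and the KEV, NVD, RSS and X lookups are left out.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def parse_requirements(text):
    """Return (name, version) pairs for exactly pinned `name==version` lines.

    Simplification: extras such as `pkg[extra]` are not handled.
    """
    pins = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if "==" not in line:
            continue  # unpinned or blank: there is no exact version to check
        name, version = line.split("==", 1)
        version = version.split(";", 1)[0].strip()  # drop environment markers
        pins.append((name.strip(), version))
    return pins


def build_osv_query(name, version, ecosystem="PyPI"):
    """Request body for OSV's query-by-package-version endpoint."""
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}


def query_osv(name, version, ecosystem="PyPI", timeout=10):
    """Return the advisory IDs OSV reports for one exact package version."""
    body = json.dumps(build_osv_query(name, version, ecosystem)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = json.load(response)
    return [vuln["id"] for vuln in data.get("vulns", [])]


if __name__ == "__main__":
    with open("requirements.txt") as manifest:
        for name, version in parse_requirements(manifest.read()):
            print(name, version, query_osv(name, version))
```

Run from a daily scheduler, the `__main__` block is the whole "cron that knows your supply chain" loop in miniature; a real version would batch queries and diff against yesterday's results.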
Security & Supply Chain
Shai-Hulud, open-sourced and then re-buried
am.will reported that the Shai-Hulud worm — the npm-spreading exploit that hit 170+ packages and 400+ repos — was open-sourced on GitHub. GitHub removed the repo; vx-underground apparently mirrored it. Quote-tweet sentiment: "It can no longer be studied… unless there was someone who collected this sort of thing and has a local copy." Don't run the code, but the post-mortem fuel is now in the wild.
GitHub's Advisory DB incorrectly flagged Puppeteer as malware
Mathias Bynens (via an @mitsuhiko RT) flagged that GitHub's Advisory Database mis-listed Puppeteer as malware, blocking new releases. A precise illustration of how automated supply-chain defences eat their own when the truth-of-record turns out to be a low-quality list.
Claude vs a 12-year-old locked Bitcoin wallet
LLMJunky surfaced a viral story of someone using Claude to recover $400,000 of their own BTC from a wallet they hadn't been able to access for 12 years. Not strictly security in the defensive sense, but a nice reminder that interactive frontier-model coding remains absurdly powerful on niche, well-scoped reverse-engineering problems.
Skills, Workflows & Dev Tools
Matt Pocock retires /grill-me for code
Matt Pocock (35k views) says /grill-me is his most popular skill ever — and he's stopped using it for code:
/grill-me is my most popular skill ever. I get 5-10 messages a day about how it's changed people's workflows for the better. But… I've stopped using it for code. Here's the improved version.
Replies framed the evolution clearly: "grilling is opinion, adversarial review is architecture — it asks 'what breaks this in prod' not 'what would I do differently.'" The successor, grill-with-docs, is referenced in a follow-up. Skills index: aihero.dev.
swyx on what model-router data reveals
Swyx browsed the latest Vercel AI Gateway numbers and called out a few counterintuitive splits for that subset of traffic:
- Gemini leads in education and personal assistants
- Anthropic leads in vibecoding, coding, and back office
- OpenAI leads in recruiting outreach
Best reply, from Danny Livshits: "Those category splits hint at procurement gravity: Gemini rides Workspace, Anthropic rides IDE plugins, OpenAI rides sales stacks. Distribution is shaping model share more than evals."
Mitsuhiko / Pi: killing dependencies, why not Rust
Armin Ronacher's working-out-loud thread on his Pi project is a useful counterweight to the "agents make language choice irrelevant" take:
- Killing dependencies in Pi — "because reasons."
- Why not Rust/Go for Pi: "Extensibility is key to it. That leaves Ruby, Python, JS, PHP for the most part unless you want to ship an interpreter. None of those languages have any benefit over Node."
- Stowaway QuickJS: Pi was unknowingly shipping a WASM-compiled QuickJS interpreter because proxy-agent supports PAC scripts. "Does anyone need PAC?"
Industry & Misc
Theo's Anthropic vs xAI podcast (and a Bun-in-Rust update)
Theo dropped a podcast episode covering Anthropic/xAI, the OpenAI lawsuit, the Bun-in-Rust port (still progressing — recall the 99% milestone from a few days ago), security, and — yes — pottery. Useful single-listen for catching up if you haven't been following the threads.
"Code is actually the right abstraction"
Lee Robinson, two days back but still circulating:
Too often I see the future of software engineering diminished down to, effectively, writing and reviewing markdown files. Yes, it will be hard to review thousands of lines of agent code. But maybe the takeaway is that code remains the right abstraction.
A useful rebuttal to the "specs replace code" drift in agentic-engineering discourse.
Document parsing without a model
Jerry Liu shipped LiteParse, an open-source, model-free document parser for AI agents — 50+ document types, clean text out in seconds, with a self-hostable HTTP server (liteparse-server) for fully local stacks. Worth a look if you've been paying Claude/GPT tokens just to OCR PDFs.
Coolify's bot-bait trap
A retweet via @theo / @heyandras describes a fun defensive trick: "We made a fake repo with fake bounties, and the bots are applying fake PRs, so we know who is fake, and we can ban them from the Coolify repo." Honey-pot OSS contribution detection. Cheap, mean, effective.
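The logic of the trap is simple enough to sketch. The snippet below is a hypothetical illustration, not Coolify's tooling: it assumes you already have a list of pull requests with an author and a target repo (fetching them is out of scope), and the type, function, and repo names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PullRequest:
    author: str
    repo: str


def bait_takers(prs, honeypot_repo):
    """Authors who opened at least one PR against the honeypot repo."""
    return {pr.author for pr in prs if pr.repo == honeypot_repo}


def filter_real_prs(prs, honeypot_repo, real_repo):
    """PRs against the real repo, minus anything from a bait-taking author."""
    banned = bait_takers(prs, honeypot_repo)
    return [pr for pr in prs if pr.repo == real_repo and pr.author not in banned]


if __name__ == "__main__":
    prs = [
        PullRequest("bounty-bot-42", "example/fake-bounties"),
        PullRequest("bounty-bot-42", "example/real-project"),
        PullRequest("human-dev", "example/real-project"),
    ]
    print(filter_real_prs(prs, "example/fake-bounties", "example/real-project"))
```

The trick works because no legitimate contributor has a reason to touch the bait repo, so a single PR there is enough signal to ban on.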