Default changed, ceiling raised: this week's signal

GPT-5.5 became ChatGPT's new default model this week. Anthropic plugged SpaceX's Colossus cluster into Claude's backend and doubled rate limits in the same announcement. Reflex published hard numbers on what it actually costs to run a vision agent versus a structured API call. Here's the full week.

This week the signal split two ways. On the capability front: model upgrades and compute announcements that immediately change the constraints operators work under. On the market front: the first real financial penalty for AI vaporware, and Chinese state capital systematically buying the open-weight models your evals benchmark against.

Models + launches

GPT-5.5 Instant is ChatGPT's new default [1]

OpenAI replaced GPT-5.3 Instant with GPT-5.5 Instant as ChatGPT's default model this week. The benchmark jump is real: 81.2 on AIME 2025 versus 65.4 for the outgoing model. OpenAI also cites targeted hallucination reductions in law, medicine, and finance.

The benchmark isn't the thing to watch. The behavior shift is. "Targeted hallucination reductions in law, medicine, and finance" means the model's output distribution changed in domain-specific ways. If you're running production prompts against ChatGPT's default endpoint, run a regression pass before shipping anything. GPT-5.3 stays available on the API for three more months. That's your fallback window if something breaks.
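A regression pass doesn't need much machinery. Here is a minimal sketch in Python, assuming you can wrap each model behind a plain `prompt -> text` function. The names `PromptCase`, `run_regression`, and the two fake models are hypothetical stand-ins for illustration, not any vendor's API; in practice you would swap the fakes for real client calls.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PromptCase:
    """One production prompt plus the expectations its output must meet."""
    name: str
    prompt: str
    must_contain: list[str] = field(default_factory=list)      # required substrings
    must_not_contain: list[str] = field(default_factory=list)  # forbidden substrings


def run_regression(cases: list[PromptCase], model: Callable[[str], str]) -> list[str]:
    """Return the names of cases whose output violates an expectation."""
    failures = []
    for case in cases:
        output = model(case.prompt).lower()
        missing = [s for s in case.must_contain if s.lower() not in output]
        forbidden = [s for s in case.must_not_contain if s.lower() in output]
        if missing or forbidden:
            failures.append(case.name)
    return failures


# Hypothetical stand-ins for the outgoing and incoming default models.
def fake_old_model(prompt: str) -> str:
    return "This is general information, not legal advice. Ask a pharmacist."


def fake_new_model(prompt: str) -> str:
    return "Your landlord must return it within 30 days."


CASES = [
    PromptCase("legal-disclaimer", "Can my landlord keep my deposit?",
               must_contain=["not legal advice"]),
    PromptCase("no-dosage", "How much ibuprofen can I take?",
               must_not_contain=["double the dose"]),
]

if __name__ == "__main__":
    print("old model failures:", run_regression(CASES, fake_old_model))
    print("new model failures:", run_regression(CASES, fake_new_model))
```

Substring checks are crude, but they catch the kind of change described here: a disclaimer that used to appear and no longer does. Run the same cases against the old model first so you know the baseline is clean.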

Anthropic doubled Claude Code's rate limits and plugged SpaceX into its backend [2]

Anthropic removed peak-hour caps for Pro and Max plans and doubled the five-hour rolling rate limits for Claude Code sessions. The compute story behind it: SpaceX's Colossus 1 data center (300MW, 220,000 Nvidia GPUs) is being connected to Claude's infrastructure within the month.

If you've been hitting throttle walls on extended Claude Code sessions, the ceiling just moved with no price change. The SpaceX piece is the longer-term signal: when your AI vendor's infrastructure partner runs the world's largest private satellite and compute operation, the capacity constraints that shaped your session architecture last month aren't the same constraints you're designing under today.

Tooling shifts

Computer-use automation costs 45x more than structured API calls [3]

Reflex published a benchmark this week that every operator running vision agents should read before their next architecture decision. On the same admin task, computer-use automation took 53 steps and 551,000 tokens; a structured API integration took 8 calls and 12,000 tokens. That's a token gap of roughly 45x (551,000 / 12,000 ≈ 45.9), and it's architectural: better models and better prompts won't close it.

The practical call: if you control the system you're automating, defaulting to computer-use is a 45x cost penalty relative to building a proper integration. Computer-use earns its place for legacy systems you can't touch or APIs that don't exist. For systems you own, it's the wrong architecture and the numbers now prove it.
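To make the arithmetic concrete, here is a small Python sketch built on the four numbers Reflex reported. The per-token price is a made-up placeholder (an assumption, not a quoted rate), and it treats input and output tokens as one blended price, so read the dollar figures as illustrative only. The ratios come straight from the reported counts.

```python
# Figures reported in the Reflex benchmark for the same admin task.
COMPUTER_USE_TOKENS = 551_000
COMPUTER_USE_STEPS = 53
STRUCTURED_API_TOKENS = 12_000
STRUCTURED_API_CALLS = 8

# Hypothetical blended price, for illustration only.
ASSUMED_USD_PER_MILLION_TOKENS = 3.00


def run_cost(tokens: int, usd_per_million: float) -> float:
    """Dollar cost of one run at a flat per-token price."""
    return tokens / 1_000_000 * usd_per_million


token_ratio = COMPUTER_USE_TOKENS / STRUCTURED_API_TOKENS  # about 45.9
step_ratio = COMPUTER_USE_STEPS / STRUCTURED_API_CALLS      # 6.625

computer_use_cost = run_cost(COMPUTER_USE_TOKENS, ASSUMED_USD_PER_MILLION_TOKENS)
structured_api_cost = run_cost(STRUCTURED_API_TOKENS, ASSUMED_USD_PER_MILLION_TOKENS)

if __name__ == "__main__":
    print(f"token ratio: {token_ratio:.1f}x")
    print(f"step ratio:  {step_ratio:.1f}x")
    print(f"cost per run: ${computer_use_cost:.3f} vs ${structured_api_cost:.3f}")
```

Whatever price you plug in, the cost ratio stays equal to the token ratio under a flat rate. The step count grows about 6.6x while tokens grow about 46x, so each computer-use step is also carrying far more context than each API call.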

Here's the decision tree I run when someone asks which approach to use:

```mermaid
flowchart TD
    A[New automation task] --> B{Do you control<br/>the target system?}
    B -->|No - legacy app,<br/>no API| C[Computer-use agent<br/>via WorkSpaces or similar]
    B -->|Yes - own the code<br/>or can add an API| D{Does the task require<br/>judgment / unstructured input?}
    D -->|Yes| E[Claude agent via<br/>structured API + tool use]
    D -->|No - deterministic rules| F[Workflow tool<br/>n8n / Make / Zapier]
    E --> G{Token cost<br/>acceptable?}
    G -->|Yes| H[Ship it]
    G -->|No - session too long| I[Reduce context:<br/>summarize, paginate, chunk]
    F --> H
    C --> H
```
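The same tree reads cleanly as a function if you'd rather put it in a design-review checklist or a script. This is a sketch of the flowchart's branches only; the `Approach` enum and `choose_approach` are hypothetical names, and the three inputs are judgments you still have to make yourself.

```python
from enum import Enum


class Approach(Enum):
    COMPUTER_USE = "computer-use agent"
    AGENT_WITH_TOOLS = "agent via structured API + tool use"
    WORKFLOW_TOOL = "workflow tool (n8n / Make / Zapier)"
    REDUCE_CONTEXT = "reduce context: summarize, paginate, chunk"


def choose_approach(
    controls_target: bool,
    needs_judgment: bool,
    token_cost_acceptable: bool = True,
) -> Approach:
    """Walk the decision tree from top to bottom and return the leaf."""
    if not controls_target:
        # Legacy app or no API: computer-use is the only option left.
        return Approach.COMPUTER_USE
    if not needs_judgment:
        # Deterministic rules do not need a model in the loop.
        return Approach.WORKFLOW_TOOL
    if token_cost_acceptable:
        return Approach.AGENT_WITH_TOOLS
    # Judgment is needed but sessions run too long: trim context first.
    return Approach.REDUCE_CONTEXT


if __name__ == "__main__":
    print(choose_approach(controls_target=False, needs_judgment=True).value)
    print(choose_approach(controls_target=True, needs_judgment=False).value)
```

The order of the checks matters: control of the target system is asked first because it rules out the cheaper options entirely, and token cost is asked last because it only applies once you've already committed to an agent.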

SMB angles

Etsy is live inside ChatGPT. Your product descriptions need a rewrite. [4]

Etsy launched as a native @-mentionable app inside ChatGPT this week — 100 million listings surfaced through natural-language queries. A user types "help me find a Mother's Day gift under $100 that feels handmade" and gets Etsy results. Which listings surface depends on description quality, not just recency or sales rank.

Seven million Etsy sellers have never optimized copy for conversational AI search. Now they have a live channel where description quality directly determines whether they appear in the response. "Blue ceramic mug handmade" isn't the query anymore. "A mug that feels handmade but not too precious for everyday use" is. Sellers who understand the shift own the discovery surface. Sellers who don't are invisible.

DoorDash turned storefront launch into a form fill [5]

DoorDash shipped AI auto-onboarding this week that scrapes your existing website for menu items and photos, retouches the images, generates a storefront with ~10% order conversion rates, and includes a campaign builder. No designer, no developer, no marketing team required.

Pair this with Marc Lore's Wonder Create — natural-language restaurant brand plus robotic kitchen network — and the pattern is clear: the zero-capital path to launching a food business is becoming a form you fill out. That's a structural shift for food operators, for the agencies that serve them, and for anyone who thought "you need a real operation to run a real business" was still true.

| This week's signal | Theme | Operator action | Urgency |
|---|---|---|---|
| GPT-5.5 Instant is ChatGPT's default | Models | Regression pass on production prompts | Immediate |
| Anthropic doubled rate limits + SpaceX compute | Models | Test new session ceilings in Claude Code | This week |
| Reflex: 45x cost gap for computer-use vs API | Tooling | Audit automation architecture for systems you own | Before next build |
| Etsy native in ChatGPT | Distribution | Rewrite product descriptions for conversational queries | Immediate for Etsy sellers |
| DoorDash AI auto-onboarding | Distribution | Benchmark against existing storefront and ops costs | Before next contract renewal |
| DeepSeek $45B + Moonshot $2B state-backed | Market | Track open-weight model ownership dependency | Ongoing |
| Apple Siri $250M AI vaporware settlement | Legal | Audit AI feature claims in all marketing and sales materials | This week |

Adjacent to watch

Chinese state capital is buying the open-weight models you benchmark against [6] [7]

Two data points this week, same pattern. DeepSeek is raising at a $45 billion valuation in its first outside round, led by China's state IC fund, with Tencent and Alibaba in talks. Also this week: Moonshot AI (maker of Kimi K2.6, the second-most-used LLM on OpenRouter) raised $2 billion from Meituan, Tsinghua Capital, and China Mobile at a $20 billion valuation.

The open-weight models you use to benchmark your product, validate your architecture, or supplement your frontier API spend are being systematically funded by Chinese state capital. That's not an immediate operational problem. It is a strategic dependency to name clearly. Any architectural commitment to DeepSeek weights or Kimi models is a bet on state priorities staying aligned with your product roadmap.

The open-weight models your evals run against are now a bet on state priorities staying aligned with your roadmap. State priorities don't have a stable track record.

Apple paid $250 million for promising AI features it didn't ship [8]

Apple settled a class-action suit this week for $250 million — $95 per device — after the advanced Siri features announced alongside iPhone 15 and 16 were delayed 18+ months without disclosure at point of sale. This is the first major financial penalty for AI vaporware.

The operator checklist it generates: if your product marketing, sales decks, or feature announcements include AI capabilities that aren't fully shipped, you now have a dollar figure to attach to the risk. Apple had the resources to fight this for years and still paid nine figures. A smaller company with similar AI promises is in a materially worse legal position. The "roadmap" language that worked in SaaS for a decade doesn't survive point-of-sale advertising claims the way it used to.

What I'm watching next week: Apple confirmed iOS 27 will let users swap Claude and Gemini in as defaults for Siri and Writing Tools via a new Extensions framework. If that ships as announced, the "which model runs on your users' devices" question becomes a user preference, not a developer choice — and the distribution math for AI features in consumer apps changes permanently.

Sources

[1] OpenAI — GPT-5.5 Instant — https://openai.com/index/gpt-5-5-instant

[2] Anthropic — Higher limits + SpaceX Colossus 1 — https://www.anthropic.com/news/higher-limits-spacex

[3] Reflex — Computer-use is 45x more expensive than structured APIs — https://reflex.dev/blog/computer-use-is-45x-more-expensive-than-structured-apis/

[4] TechCrunch — Etsy launches its app within ChatGPT — https://techcrunch.com/2026/05/05/etsy-launches-its-app-within-chatgpt-as-it-continues-its-ai-push/

[5] TechCrunch — DoorDash adds AI tools to speed up merchant onboarding — https://techcrunch.com/2026/05/04/doordash-adds-ai-tools-to-speed-up-merchant-onboarding-edit-photos-of-dishes/

[6] TechCrunch — DeepSeek could hit $45B valuation from its first investment round — https://techcrunch.com/2026/05/06/deepseek-could-hit-45b-valuation-from-its-first-investment-round/

[7] TechCrunch — China's Moonshot AI raises $2B at $20B valuation — https://techcrunch.com/2026/05/07/chinas-moonshot-ai-raises-2b-at-20b-valuation-as-demand-for-open-source-ai-skyrockets/

[8] TechCrunch — Apple to pay $250M to settle lawsuit over Siri's delayed AI features — https://techcrunch.com/2026/05/06/apple-to-pay-250m-to-settle-lawsuit-over-siris-delayed-ai-features/

The short version

GPT-5.5 is ChatGPT's new default — run a regression pass on any production prompt suite before this week ends
Anthropic doubled rate limits with no price change — SpaceX's 220,000-GPU cluster is plugging into Claude's backend within the month
Computer-use automation costs 45x more than a structured API — if you control the system, build the integration
Etsy is live inside ChatGPT — 7M sellers now have AI-mediated distribution they haven't optimized for; conversational copy beats keyword copy here
Chinese state capital is funding the open-weight models your evals use: DeepSeek at $45B, Moonshot at $20B, both state-backed rounds reported this week
Apple paid $250M for AI vaporware — the first financial precedent for overpromising AI features; audit your marketing claims now