PULSE

The State of AI for Operators

A statement of the beliefs we operate under at Intentional, the things we're reading, and some hot takes you can borrow to sound informed.
Pulse Status
Acceleration
There was a shift in model capability in December that changed the fundamentals. Six-month-old assumptions are now actively wrong.
The one thing to read

December 2025 saw the second real 'ChatGPT moment' for AI capabilities. But so far it's mostly been noticed by the developer community. A threshold has been crossed, and everything downstream is going to change. Read it here.

The one thing to do

Get your team to capture how they make decisions. Not the tasks, but the judgment calls. This is the training data your agents will need.

One shortcut for distributed teams: record all meetings. Granola's pretty great for this — captures the nuance.

Pulse on

Capabilities

Explainer Act Mar 28, 2026

December 2025 was a watershed for agentic viability

People keep declaring “ChatGPT moments”: when voice mode made ChatGPT sound like Scarlett Johansson in Her, or when Nano Banana made image generation controllable. Impressive, sure. But none of them opened up capabilities that would materially change how businesses operate.

December 2025 did. Here’s the causal chain that explains why this time is different:

  • BullshitBench leader jumps from 40% to ~95%. Anthropic (Claude) models stop cheerfully complying with nonsense. They self-correct.
  • Long-running agents become reliable. Set a task running for 2 hours. It makes mistakes. It fixes them itself. Arrives at the right answer.
  • Claude Code explodes. $0 → $2.5B ARR before its first birthday. 4% of public GitHub commits, trending to 25–50% by year end.
  • OpenClaw validates the frontier. Self-correcting agents in open-ended environments. Not interesting because of its architecture — interesting because a high-BullshitBench model made it trustworthy. About as reliable as a junior employee.
  • Clerical agents within 12 months. The same capability that makes Claude Code magical for devs will power administrative, operational, and data-processing agents. This is the year of the agent. Not 2025.

This. Actually. Changed. Everything.

The snark: Sure, for now OpenClaw is mostly bros having Claws make PDFs about OpenClaw to sell for $5 to other bros. Feels like crypto — same bros. But this time they’ve hit on a real thing. The enterprise implications are months behind them, not years.

Also watching: Along with Opus’s shift to a 1M context window, we’ve seen a concrete improvement in how well complex, multi-faceted context is actually used by the models. Multiple instruction sets, structured markdown, Skills in Claude Code. Hard to quantify, but you feel it when you use the tools. We’re also tracking the rumoured “Mythos”/“Capybara” release: another step change may be a few months away.

BullshitBench · Latent Space · WTF 2025 · Karpathy
Explainer Watch Mar 20, 2026

Can you trust what foundation model leaders say publicly?

We shouldn’t pay too much attention to what foundation model owners say — they’re selling. But Amodei has developed credibility with his public stance on responsible AI use, so we’re giving him a charitable read. We scoffed last spring when he said 90% of code would be written by AI by end of 2025. Then December happened. So, here’s a generous take on Amodei’s prediction:

  • He was right about capability. Wrong about the multi-year change management industry needs to realize it. We’ve written about this.
  • Keep listening (with a pinch of salt). We’ve been developing a ‘feel’ for the pace of change in model capabilities: when you track it over a few years, new developments are less surprising. It’s fair to assume that Amodei has a decent finger on the pulse.
  • Altman remains the sales guy. Of the foundation model CEOs, he’s got the worst track record at calling emerging capabilities. Overselling.
  • Jensen’s observation: We enjoyed the Nvidia CEO’s observation that it’s kinda funny that software development — traditionally the “smart people” job — is the first one to be reliably automated.

Tracking this: We’re building a scoreboard of what Altman, Amodei, Jensen et al. predicted vs. what happened. Current record: better than expected on capability, wildly optimistic on adoption timelines.

Also watching: We’re not weighing in on Jensen’s AGI claims. That’s a religious debate. (We sit in the Pinker-esque computational theory of mind camp.) Anthropic’s LLM interpretability research at transformer-circuits.pub is incredible and worth reading if you’re on the techier end of our audience.

Anthropic labor report · Jensen Huang · Intentional
Hot take Watch Mar 28, 2026

"Chatbots are increasingly ignoring human instructions." This means they're actually becoming useful.

A CLTR study found ~700 cases of AI “misbehavior” — ignoring instructions, evading safeguards, delegating forbidden tasks. The Guardian’s take was predictably “this is scary.” The Pulse frame: deterministic bots follow scripts. Agents make decisions. Sometimes those decisions diverge. The takeaway shouldn’t be “why are agents disobeying?” — it’s “wow, there’s enough autonomy that this is possible.” It’s a bit like increases in self-driving car accidents: looks like a negative signal out of context, but it’s a side effect of real-world usage going up.

CLTR study · Guardian
Exhibit Mar 2026

BullshitBench — explaining improved agentic behaviour

BullshitBench tracks whether models reject self-evidently stupid instructions.

BullshitBench v2: Detection Rate Over Time — Anthropic models reaching ~95% by Q1 2026

LLMs are more powerful if you already have subject matter expertise. Why? Because LLMs have a tendency to yes-and stupid requests.

“How should I grow my specialty milk business — ‘cat milk’ — a feline lactose product for baristas” will result in a business plan, not a rejection of the premise.

Until December, when Anthropic’s models started rejecting 95% of bullshit requests. This means agents can self-correct in long-running tasks (catching the remaining 5%).

Exhibit Mar 2026

A picture of the AI Adoption gap

Anthropic’s labor market report* maps theoretical AI capability against observed real-world adoption by occupation.

Anthropic labor market radar chart — theoretical AI capability vs observed adoption by occupation

Blue = what AI can theoretically do. Red = what’s actually being used. Software dev: 90%+ capability, ~15% adoption. Legal, accounting: 50–60% capability, similar adoption. The gap between the blue and the red is where all the opportunity sits — and where every “AI strategy” should be focused.

*Shared with the obvious caveat that this comes from a model provider.

Further reading — Capabilities
Latent Space: WTF Happened in 2025 — best single overview of the December shift, engineering-focused
BullshitBench — the benchmark that explains the causal chain
Anthropic: Labor Market Impacts of AI — skip the paper, look at the radar chart
Say this to sound smart

I wonder when Gemini and Codex are going to catch up to Claude's BullshitBench score.

Bullshit Mar 2026

AI SDRs / end-to-end sales automation

Amazing open rates. No bites. Certainly no end-to-end sales. The AI SDR is not a thing yet. Take a look instead at automating support (we’re at the stage where AI support bots are more predictable than the humans staffing Canadian telco helpdesks).

Bullshit Mar 2026

Fully autonomous enterprise agents

We were on a call with a Salesforce (AgentForce!) rep a couple of months back. When pushed on differentiation he said: “Nothing special about AgentForce as far as agentic capabilities.” This checks out: most enterprise agents are still deterministic workflows with generative content. Agents that run your ops while you sleep? Still vaporware. For now.

Bullshit Mar 2026

"We gave everyone ChatGPT logins" is not an AI strategy

We use three stages to describe AI use case depth: copilot (human asks AI for advice) → iron man (human with AI superpowers) → robot (full autonomy). A year ago, halfway to copilot put you in the top 50%. Now you need iron man at minimum. Conversational AI as your “AI strategy” is a suggestion box.

Ready Mar 2026

AI Software Development

OK, not a revolutionary take here, but we still speak to a lot of orgs stuck with Copilot, or on Cursor workflows they developed a year ago. Claude Code is a different beast: the combination of model capabilities, Skills, and planning mode has led many seasoned devs to stop writing any code.

Ready Apr 2026

Claude can actually write without the AI smell

With the right guidance you can get Claude to output competent copy from a decent logical outline. Building internal Skills for your teams to share is a stronger pattern than custom GPTs or dropping documents into project context.

Say this to sound smart

If the foundation models stopped improving tomorrow, we'd still have a decade of product innovation on current capabilities.

If an AI-native competitor started today in our space, what would they build first?
When was the last time we re-tested a workflow that failed with AI six months ago?
Do we understand AI capabilities across all modalities, and how they map to our business?

Pulse on

Software Dev

Explainer Act Mar 25, 2026

The $250K app now costs $100 in tokens

Production-ready apps that would have cost $250K, taken 3 months, and needed a team of 4 can now be built in days. As a side project. For less than $100 in tokens. Is that hyperbolic? Maybe a bit. But it’s not just 10x faster — it might be 50x.

  • Parallel Claude instances on different parts of the same codebase is starting to be a viable pattern.
  • Maintenance costs collapsing. Bug fixing is becoming automated. “Software is a liability” is getting less true by the month.
  • Cloudflare rewrote Next.js in 1 week with 1 engineering manager. Previous attempts: months, multiple teams.
  • Zero-to-one in 3 months is now completely feasible. That used to be zero-to-0.1.

The signal to watch: The Lean AI Leaderboard tracks high revenue-per-employee companies. These are the AI-native companies that will eat your lunch. Stripe’s 2025 report: software at 46% of US GDP growth. 4x GitHub pushes vs 2024.

Lean AI Leaderboard · Stripe 2025 · SemiAnalysis · Cloudflare
Hot take Act Mar 22, 2026

Your release cadence is the canary for your AI risk

Quarterly releases? Dead. The SAFe “agile release train” with its FIFO approach? Even more of an anti-pattern now. Choo choo. Which company wins: the one where Sue fixes the accounting bug herself with Claude and submits a PR, or the one where Sue waits 6 weeks for the next release, uses a side-book, then spends a week reconciling?

Hot take Act Mar 15, 2026

Agent-first coding is now table stakes. Not an experiment.

Claude Code + Opus. Parallel instances. Boris Cherny writes 100% of his code contributions to Claude Code using Claude Code. The old bottleneck was developer time. The new bottleneck is product clarity. If you haven’t mandated this for your eng team, you’re paying 2024 prices for 2026 work.

The zombie problem: The engineer who views their job as translating written requirements to software code is already redundant. A lot of people did CS degrees to get that kind of job. They don’t know it yet.

Further reading — Software Development
WTF Happened in 2025 — the microsite tracking the inflection, constantly updated
Lean AI Leaderboard — the companies proving small teams + AI = outsized results
Say this to sound smart

The engineer who sees their job as translating requirements to code is already dead. They just don't know it.

Bullshit Apr 2026

"AI can replace all our developers"

The zombie engineer problem is real, but experienced devs with taste and product sense are more valuable than ever. AI replaces the mechanical translation of requirements to code. It does not yet replace the judgment of what to build, why, and how it fits together.

Bullshit Mar 2026

"We're fine sticking with Copilot"

IDE autocomplete is the entry level. Agent-first coding with parallel instances is where the real leverage is. The gap between tab-completion and agent-driven development is the gap between a calculator and a spreadsheet.

Bullshit Mar 2026

"We can one-shot refactor our tech debt"

We like to call it ‘heritage’, not legacy code, as a sign of respect for those who built it. While OpenRewrite looks like a promising approach, refactoring and tech-debt remediation still requires a very hands-on workflow, and considerable time.

Ready Mar 2026

Agent-first development workflows

Claude Code + parallel instances, spec-driven development. The 50x claim is real for greenfield. Not hype — we’re building this way right now. The bottleneck moved from “can we write the code” to “do we know what we want.”

Ready Apr 2026

AI-assisted release management

Still stuck with a release backlog, and one release engineer trying to get things out the door? This is an enterprise anti-pattern ripe for AI assistance. Opus with a 1M context window is pretty adept at chopping branches into coherent releases. Are you allowing your teams to use it?

Say this to sound smart

The new bottleneck is product clarity. If you can't describe what you want, 50x velocity just means 50x the wrong thing.

What's our release cadence, and what would it take to cut it in half?
Who on the team is experimenting with agent-first coding? What have they shipped with it?
Where is our codebase most inaccessible to agents with 200K context? With 1M context?
How are we sharing our specific context across our dev team? Skills, rules/claude.md files, etc.
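
On that last question: for teams starting from zero, a shared context file is the lowest-effort win. A minimal sketch of what a repo-level claude.md might cover — the project, conventions, and rules below are invented for illustration, not a prescribed format:

```markdown
# CLAUDE.md — shared context for agents working in this repo
<!-- Illustrative example; every detail here is a placeholder for your own -->

## Project
Invoicing service. TypeScript + Postgres. Deployed via GitHub Actions.

## Conventions
- All money values are integer cents, never floats.
- New endpoints need an integration test before merge.

## Decisions agents should respect
- We use feature flags, not long-lived branches.
- Do not touch `migrations/` without a human review.
```

The point isn’t the format — it’s that every agent (and every new hire) starts from the same written context instead of tribal knowledge.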

Pulse on

Product & Methods

Explainer Watch Mar 20, 2026

Product is dead. Long live product.

We’re planning a wake for product management. Tongue firmly in cheek. Here’s the thing: the fundamentals — user understanding, value creation, taste — matter more than ever when engineering velocity is 50x. Because 50x velocity without product discipline gives you the Homer Simpson Concept Car.

  • The admin layer is dead. Non-product people have always misunderstood PM as glorified project management. Just a ticket machine at the butcher’s. That part is genuinely over now (thankfully!).
  • PRD slop is a real problem. Teams are getting drowned in massive AI-generated PRDs that nobody reads. The document is not the product.
  • AI eng & PM roles up 400%. The role is growing all while the job description is being rewritten underneath it.
  • No new consensus since evals. The field needs a new clarity of purpose. The last thing product agreed on about AI was “evals are useful” — roughly March 2025. A year of stasis in a field moving at lightspeed.

The reverse MVP problem: It’s notoriously hard to remove things from existing products. Agent-coded software makes adding things cheaper than ever. Without product discipline, you get every feature anyone ever wanted — and an undriveable car.

Also watching: Spec-driven development as a trend. We’re also considering the one-legged stool — a convergence from the traditional 3-legged stool (eng-product-design) of the last 20 years into a single generalist role with taste.

Linear /next · community sentiment · AI PM role data
Hot take Watch Mar 26, 2026

Linear just declared its own product dead

Two days ago, Linear’s CEO wrote that issue tracking was “built for a handoff model” that agents are making obsolete. 75% of their enterprise workspaces now have coding agents. Agent-authored issues up 5x in three months. Linear is pivoting from issue tracker to “shared product system that turns context into execution.” This is the biggest product-methodology signal this quarter.

Hot take Watch Feb 1, 2026

Jira is a punchline now

We were at a Claude meetup recently, and when Jira was brought up the crowd laughed. To the AI-native set, Jira looks like a fax machine. The handoff-based project management model it embodies is incompatible with the speed at which agent-assisted teams now operate. The question is what replaces it — Linear’s bet is context, not tickets.

Further reading — Product & Methods
Linear: Issue Tracking Is Dead — the biggest product-methodology signal this quarter
Martin Fowler: Spec-Driven Development — where methods might be heading
Say this to sound smart

It's great that people are finally going to realize Product Management was not just about slinging tickets.

Bullshit Mar 2026

"AI-generated PRDs are a productivity win"

PRD slop is drowning teams. The document is not the product. A 40-page AI-generated PRD that nobody reads is worse than a napkin sketch that everyone understands. The productivity win is in thinking, not typing.

Bullshit Mar 2026

"We need an AI product strategy"

You need a product strategy that accounts for AI. Different thing entirely. “AI product strategy” implies AI is the product. It’s not. It’s a capability shift that changes how you build, price, and deliver everything.

Bullshit Apr 2026

"Consumers are ready for metering"

SaaS economics were great. Hosting and infra costs per user are largely negligible. 30% of your recurring revenue isn’t even using the product. As AI becomes part of the stack, inference costs are changing the economics of products, and that’s going to require usage-based pricing. This will act as a small counterbalance to the new development efficiency, and users will hate it.

Ready Mar 2026

Spec-driven development

Martin Fowler’s articulation of where product methods are heading. Context over tickets. Write the spec, hand it to the agent, review the output. The PM role becomes “chief context officer” — less admin, more judgment.
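
A minimal sketch of what handing a spec to an agent can look like — the feature, columns, and thresholds below are invented for illustration, not a canonical spec-driven-development format:

```markdown
# Spec: CSV export for invoices
<!-- Illustrative sketch; structure and content are placeholders -->

## Intent
Finance needs month-end invoice data in their own tools.

## Behaviour
- "Export CSV" button on the invoices list; respects current filters.
- Columns: invoice ID, customer, issue date, total (cents), status.
- Exports over 10,000 rows run async and email a download link.

## Out of scope
- PDF export; scheduled exports.

## Acceptance
- A filtered list of 12 invoices exports exactly those 12 rows.
```

Note what’s absent: implementation detail. The spec carries intent and acceptance criteria; the agent carries the code.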

Ready Mar 2026

AI-assisted user research synthesis

Pattern extraction from interviews, survey data, support tickets. The PM admin layer being automated. The insight is still human. The grunt work of organizing 200 interview transcripts is not.

Say this to sound smart

Maximal Vomited Product: 50x velocity without product discipline gives you the Homer Simpson Concept Car.

When was the last time someone actually read one of our PRDs end to end?
Did AI write our AI strategy? Do we actually have a unique POV?
Have we made a significant change to our PM workflow in the past 12 months?
What is our moat that protects us from AI-native competition entering the market?

Pulse on

AI at Work

Explainer Act Mar 28, 2026

Your caution is now your biggest risk

Don Norman wrote about affordances — the design cues that tell you how to use a thing. A door handle affords pulling, a plate affords pushing. It makes things intuitive to use. AI has almost no affordances right now.

This means adoption is happening mainly at the intersection of the technical and the curious. Your average accounting department isn’t reinventing how they work, because the tools aren’t ready. The problem: waiting for affordances to arrive is now the most dangerous strategy available to you.

Anthropic noticed this — engineers were using Claude Code for spreadsheets, timekeeping, document creation — and shipped Cowork in two weeks. But Cowork itself doesn’t really know how you’re going to use it either. The ‘try this first’ example is to organize screenshots on your desktop. It’s going to take a long time for general-purpose “AI for everyone” products to land.

In the meantime, there’s a whole new army of tiny AI-native companies scanning every market for opportunities to disrupt. They’re building their own tools and their own workflows at a fraction of traditional enterprise margins.

  • If you wait for prêt-à-dopter, it’s 2 years out. Software timelines: 3 months to build. Adoption timelines: compliance, inertia, org change. There’s your 2 years.
  • AI-native companies are full of curious generalists. They don’t wait for affordances. They try stuff. Some of it fails. They get shit done.
  • The risk katamari grows every quarter you wait. Your “risk-averse” posture is itself accumulating risk. CRUD SaaS, SMB workflow tools, M&A rollups — that’s where the swarming starts.

The affordances will come, no question. People will build new tools, integrate with your existing systems and workflows, and some platforms will respond with the appropriate degree of urgency (see: Linear). But can you afford to wait for that?

Also watching: Foundation model companies partnering with consultancies for change management, because their own products have no affordances, but the potential is here already. If the foundation models stopped improving tomorrow, we’d still have a decade of product innovation on current capabilities.

Hot take Act Mar 24, 2026

We've barely scratched the surface of AI-as-capability

If the foundation models stopped improving tomorrow, we’d still have a decade of product innovation on current capabilities. The AI-native companies are starting from capabilities and rethinking approaches and workflows. Incumbents are trying to add AI sugar to their existing approaches. We’re looking at you, ✨ emoji buttons.

Hot take Act Mar 18, 2026

The single highest-ROI hiring filter is curiosity

Not technical depth. Not domain expertise. Curiosity. Shopify called their hiring persona “entrepreneur” — same principle. We had a hyper-specialization era (the frontend JavaScript ecosystem as the reductio ad absurdum). It’s over. Every hire going forward should be someone who tries things before being asked to.

Further reading — AI at Work
Intentional: AI Adoption Is a Change Management Problem — our take on why this isn't a technology problem
VentureBeat: Why Generalists Win — the hyper-specialization era is ending
Deliberate CEO: Opus Babies — what side-project culture looks like at AI speed
Say this to sound smart

Your release cadence is a good proxy for your org's change tolerance. It's the canary for your AI risk.

Bullshit Mar 2026

"We gave everyone ChatGPT logins — that's our AI strategy"

Copilot stage is table stakes. You need iron man minimum. Giving everyone a chatbot login and calling it a strategy is like giving everyone a library card and calling it a training program. The gap between “access” and “adoption” is the entire problem.

Bullshit Mar 2026

"AI adoption is a technology problem"

It’s a change management problem. The tech works. The org doesn’t. Every failed AI rollout we’ve seen had working technology and broken incentives, unclear ownership, or leadership that announced the initiative and moved on.

Bullshit Mar 2026

"We'll wait for the tools to be ready"

The tools are ready enough. Your caution is your biggest risk now. Every quarter you wait, the risk katamari grows. The AI-native competitors aren’t waiting for affordances — they’re building without them.

Ready Mar 2026

Curiosity-first hiring

The generalist era is here. Shopify’s “entrepreneur” filter is the model. Hire people who try things before being asked. The hyper-specialization era rewarded depth. The AI era rewards people who can context-switch, experiment, and ship across domains.

Ready Mar 2026

Decision-capture as training data

Recording how your team makes judgment calls. This is the input your future agents need. Not the tasks. The reasoning. Why did you choose vendor A over B? What made you escalate that ticket? Start capturing this now and you’ll have a head start when true agentic workers become (safely) available.
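
One lightweight way to start, assuming no tooling at all: a plain decision-record note per judgment call. The fields and the example decision below are a suggestion, not a standard:

```markdown
# Decision record — 2026-03-14
<!-- Illustrative template; the decision described is invented -->

## Decision
Chose vendor A over vendor B for payment processing.

## Context the decider had
- Vendor B was 15% cheaper but had no EU data residency.
- We expect EU customers within 12 months.

## Reasoning
Compliance risk outweighs the cost saving at our stage.

## What would change this
A vendor B EU region, or a pivot away from EU expansion.
```

Five minutes per decision. The reasoning section is the part your future agents can’t reconstruct on their own.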

Say this to sound smart

AI adoption isn't a technology problem any more. It's a change management problem.

Beyond ChatGPT logins, what does our actual AI adoption look like? Who's using what, and how?
How do we document decisions right now? Could someone new reconstruct why we made them?
Who on the team is experimenting with AI outside their core role? What are they finding?
Where is behaviour we call 'risk-averse' actually 'risk-accumulating'?

We want to hear from you

Got a signal we missed? A take you disagree with? Something we should be tracking? We're building Pulse as a conversation, not a broadcast.

hello@intentional.team →