PULSE

The State of AI for Operators

A statement of the beliefs we operate under at Intentional, the things we're reading, and some hot takes you can borrow to sound informed.
Pulse Status
Acceleration
December shifted capabilities. May is shifting economics. Both sets of assumptions are now wrong.
Deceleration · Steady · Acceleration · Step Change
The one thing to read

Are we seeing a bait-and-switch starting on AI economics? Enterprise accounts are now metered-tokens-only if you want security features. 5x the price in some cases. The tokenizer got 27% heavier. Frontier AI's IPO clock is ticking and you're holding the bill. Read it here.

The one thing to do

It's worth hedging against a cost explosion in your AI usage. Check in with your team, and try to instill token-efficiency as a side-goal in your ways of working. Some easy cost-control habits: HTML over PPTX for slides, Skills over MCP for repeated patterns, Caveman-style compression for high-volume prompts.

Pulse on

Capabilities

Explainer Act May 25, 2026

Memory is the next frontier battle.

In its short life, the GenAI frontier battle has been fought on several fronts: perplexity, multimodal, reasoning. The current frontier is memory: how AI systems capture, distill, and recall information, and how that memory evolves over time.

We’ve always been fascinated by the (deceptively hard) problem of mapping messy knowledge onto coherent structures. Ontologies, taxonomies, schemas, graphs, embeddings, oh my. Well it seems our niche interest is now the tip of the frontier.

  • 2025: foundation chat agents got rudimentary memory. Anthropic shipped Claude’s first persistent memory; OpenAI and Google followed within months.
  • For agentic development, RAG has been the blunt instrument. Retrieval-augmented generation works, but approximate similarity isn’t memory, it’s search.
  • Patterns mainstreamed in 2026. Claude Code shipped Skills with layered persistent memory; OpenClaw and Nous Hermes demonstrated self-correcting agents holding real context.
  • Nuance at scale is in sight. Bigger context windows plus better context usage mean models can juggle multiple concepts in working memory.
  • Cognitive science has entered the chat. Working / Episodic / Semantic / Procedural memory have become architectural primitives. Claude’s Dreaming experiments are async memory consolidation (incidentally, the same pattern we’ve been building into Dennett)
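To make the four memory types concrete, here is a minimal sketch of them as one small store with a consolidation step. It is an illustration only, not how Claude's Dreaming or Dennett works: `MemoryKind`, `MemoryStore`, and the `summarize` callback are hypothetical names, and in a real system the consolidation would run as a background job with a model doing the summarizing.

```python
from dataclasses import dataclass, field
from enum import Enum


class MemoryKind(Enum):
    WORKING = "working"        # what the agent is juggling right now
    EPISODIC = "episodic"      # what happened, in order
    SEMANTIC = "semantic"      # distilled facts
    PROCEDURAL = "procedural"  # how to do things


@dataclass
class MemoryStore:
    # One list of notes per memory kind (hypothetical layout).
    items: dict = field(default_factory=lambda: {kind: [] for kind in MemoryKind})

    def remember(self, kind: MemoryKind, text: str) -> None:
        self.items[kind].append(text)

    def consolidate(self, summarize) -> None:
        """Distill all episodes into one semantic fact, then clear them.

        Runs synchronously here; the assumption is that a real system
        would call this out of band, between sessions.
        """
        episodes = self.items[MemoryKind.EPISODIC]
        if episodes:
            self.items[MemoryKind.SEMANTIC].append(summarize(episodes))
            self.items[MemoryKind.EPISODIC] = []


store = MemoryStore()
store.remember(MemoryKind.EPISODIC, "Client asked for weekly reports")
store.remember(MemoryKind.EPISODIC, "Client rejected PDF format")
# Stand-in summarizer: joins the episodes; a model call would go here.
store.consolidate(lambda episodes: " / ".join(episodes))
```

The point of the split is that each kind can have its own retention and retrieval rules, which is hard to do when everything lives in one vector index.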

Memory and knowledge are key IP for any organization, so when the models bury them under the hood, you’ve ceded control of your key logic and mental models. Platform-agnostic architecture has historically been an anti-pattern; cloud-agnosticism was an expensive engineering boondoggle project a decade ago reserved for the terminally risk-averse. But in the fast-moving waters of frontier AI, optionality has business value.

The landscape is exploding. memclaw, supermemory, Karpathy’s knowledge-raven. RAG pioneer Pinecone shipped Nexus and KnowQL - a declarative query language for agentic memory that we wouldn’t bet against. Our own work on this in Dennett - the internal consulting platform we’ve been building since well before memory was cool - validated periodic knowledge compilation as a valuable approach for async reconciliation & de-duping.
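For a feel of what periodic knowledge compilation can look like at its simplest, here is a rough sketch of a de-duping pass over accumulated notes. It is not the Dennett implementation: `normalize` and `compile_knowledge` are hypothetical helpers, and exact-match on normalized text is a stand-in for whatever semantic reconciliation a real system would use.

```python
import re


def normalize(note: str) -> str:
    """Lowercase, strip punctuation, collapse whitespace so near-identical notes collide."""
    without_punctuation = re.sub(r"[^\w\s]", "", note.lower())
    return re.sub(r"\s+", " ", without_punctuation).strip()


def compile_knowledge(notes: list[str]) -> list[str]:
    """Keep the first occurrence of each normalized note, preserving order."""
    seen: set[str] = set()
    compiled: list[str] = []
    for note in notes:
        key = normalize(note)
        if key and key not in seen:
            seen.add(key)
            compiled.append(note)
    return compiled


# Hypothetical notes captured across several sessions.
raw_notes = [
    "Client prefers HTML decks.",
    "client prefers  HTML decks",
    "Renewal is due in Q3.",
]
compiled = compile_knowledge(raw_notes)
```

Running this on a schedule, rather than on every write, is what keeps the capture path cheap and pushes the reconciliation work out of band.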

Worth mentioning: Three markdown files in a trenchcoat still solves a surprising number of cases. Throw Tobi’s QMD in the pot, and you’ve got yourself a stew.

No winner has emerged yet. This is a “be aware” moment, not a “commit” moment. The frontier is moving too fast for a multi-year vendor commit. Build literacy. Prototype. Own your corpus.

Also watching: Anthropic is shipping Memory Files — the most accessible vendor-managed memory layer yet. TBD whether it’s increased lock-in or a portability win. We’ll know by how openly they treat the file format.

Explainer Act Mar 28, 2026

December 2025 was a watershed for agentic viability

People keep calling “ChatGPT moments.” When voice mode made ChatGPT sound like Scarlett Johansson in Her. The control possible in image generation with Nano Banana. Impressive, sure. But none of them opened up capabilities that would materially change how businesses operate.

December 2025 did. Here’s the causal chain that explains why this time is different:

  • BullshitBench leader jumps from 40% to ~95%. Anthropic (Claude) models stop cheerfully complying with nonsense. They self-correct.
  • Long-running agents become reliable. Set a task running for 2 hours. It makes mistakes. It fixes them itself. Arrives at the right answer.
  • Claude Code explodes. $0 → $2.5B ARR before its first birthday. 4% of public GitHub commits, trending to 25–50% by year end.
  • OpenClaw validates the frontier. Self-correcting agents in open-ended environments. Not interesting because of its architecture — interesting because a high-BullshitBench model made it trustworthy. About as reliable as a junior employee.
  • Clerical agents within 12 months. The same capability that makes Claude Code magical for devs will power administrative, operational, and data-processing agents. This is the year of the agent. Not 2025.

This. Actually. Changed. Everything.

The snark: Sure, for now OpenClaw is mostly bros having Claws make PDFs about OpenClaw to sell for $5 to other bros. Feels like crypto — same bros. But this time they’ve hit on a real thing. The enterprise implications are months behind them, not years.

Also watching: Along with Opus’ shift to 1M context window, we’ve seen a concrete improvement in how well complex, multi-faceted context is actually used by the models. Multiple instruction sets, structured markdown, Skills in Claude Code. Hard to quantify, but you feel it when you use the tools. We’re also tracking the rumoured “Mythos”/“Capybara” release - another step change may be a few months away.

BullshitBench · Latent Space · WTF 2025 · Karpathy
Exhibit Mar 1, 2026

BullshitBench — explaining improved agentic behaviour

BullshitBench tracks whether models reject self-evidently stupid instructions.

BullshitBench v2: Detection Rate Over Time — Anthropic models reaching ~95% by Q1 2026

LLMs are more powerful if you already have subject matter expertise. Why? Because LLMs have a tendency to yes-and stupid requests.

“How should I grow my specialty milk business — ‘cat milk’ — a feline lactose product for baristas” will result in a business plan, not a rejection of the premise.

Until December, when Anthropic’s models started rejecting 95% of bullshit requests. This means agents can self-correct in long-running tasks (catching the remaining 5%).
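A back-of-envelope way to see why 95% matters for long-running agents: if each review pass catches 95% of nonsense, a second pass over the leftovers shrinks the miss rate quickly. This sketch assumes the passes are independent, which is generous, since the same model tends to repeat its own blind spots. `residual_miss_rate` is a hypothetical helper.

```python
def residual_miss_rate(detection_rate: float, passes: int) -> float:
    """Share of bad instructions that slip through every pass.

    Assumes each pass is an independent check, which real agents only approximate.
    """
    return (1.0 - detection_rate) ** passes


one_pass = residual_miss_rate(0.95, 1)   # roughly 5% slips through
two_passes = residual_miss_rate(0.95, 2)  # roughly 0.25% slips through
old_two_passes = residual_miss_rate(0.40, 2)  # at a 40% detection rate, about 36% still slips through
```

At the old 40% detection rate, re-checking barely helps. At 95%, a couple of self-review loops get you most of the way, which is roughly the behaviour you see in a two-hour agent run.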

Explainer Watch Mar 20, 2026

Can you trust what foundation model leaders say publicly?

We shouldn’t pay too much attention to what foundation model owners say — they’re selling. But Amodei has developed credibility with his public stance on responsible AI use, so we’re giving him a charitable take. We scoffed last Spring when he said 90% of code would be written by AI by end of 2025. Then December happened. So, here’s a generous take on Amodei’s prediction:

  • He was right about capability. Wrong about the multi-year change management industry needs to realize it. We’ve written about this.
  • Keep listening (with a pinch of salt). We’ve been developing a ‘feel’ for the pace of change in model capabilities - when you track it over a few years, new developments are less surprising. It’s fair to assume that Amodei has a decent finger on the pulse.
  • Altman remains the sales guy. Of the foundation model CEOs, he’s got the worst track record at calling the shots of emerging capabilities. Overselling.
  • Jensen’s observation: We enjoyed the Nvidia CEO’s observation that it’s kinda funny that software development — traditionally the “smart people” job — is the first one to be reliably automated.

Tracking this: We’re building a scoreboard of what Altman, Amodei, Jensen et al. predicted vs. what happened. Current record: better than expected on capability, wildly optimistic on adoption timelines.

Also watching: We’re not weighing in on Jensen’s AGI claims. That’s a religious debate. (We sit in the Pinker-esque computational theory of mind camp.) Anthropic’s LLM interpretability research at transformer-circuits.pub is incredible and worth reading if you’re on the techier end of our audience.

Anthropic labor report · Jensen Huang · Intentional
Exhibit Mar 1, 2026

A picture of the AI Adoption gap

Anthropic’s labor market report* maps theoretical AI capability against observed real-world adoption by occupation.

Anthropic labor market radar chart — theoretical AI capability vs observed adoption by occupation

Blue = what AI can theoretically do. Red = what’s actually being used. Software dev: 90%+ capability, ~15% adoption. Legal, accounting: 50–60% capability, similar adoption. The gap between the blue and the red is where all the opportunity sits — and where every “AI strategy” should be focused.

*Shared with the obvious caveat that this comes from a model provider.

  • "Chatbots are increasingly ignoring human instructions." This means they're actually becoming useful. Mar 2026
Further reading — Capabilities
Anthropic Memory Files — the vendor-managed memory layer landing now
VentureBeat: end of the RAG era — compilation-stage knowledge layer, explained
Latent Space: WTF Happened in 2025 — best single overview of the December shift
BullshitBench — the benchmark that explains the causal chain
Anthropic: Labor Market Impacts of AI — skip the paper, look at the radar chart
Say this to sound smart

The model trained on your judgment doesn't port to the next vendor. Own your memory before the contracts arrive.

Bullshit Mar 2026

AI SDRs / end-to-end sales automation

Amazing open rates. No bites. Certainly no end-to-end sales. The AI SDR is not a thing yet. Take a look instead at automating support. (We’re at the stage where AI support bots are more predictable than the humans we have at the Canadian telco helpdesks.)

Bullshit Mar 2026

Fully autonomous enterprise agents

We were on a call with a Salesforce (AgentForce!) rep a couple of months back. When pushed on differentiation he said: “Nothing special about AgentForce as far as agentic capabilities”. This checks out - most enterprise agents are still deterministic workflows, with generative content. Agents that run your ops while you sleep? Still vaporware. For now.

Bullshit Mar 2026

"We gave everyone ChatGPT logins" is not an AI strategy

We use three stages to describe AI use case depth: copilot (human asks AI for advice) → iron man (human with AI superpowers) → robot (full autonomy). A year ago, halfway to copilot put you in the top 50%. Now you need iron man at minimum. Conversational AI as your “AI strategy” is a suggestion box.

Ready Mar 2026

AI Software Development

OK, not a revolutionary take here, but we still speak to a lot of orgs stuck with Copilot, or on Cursor workflows they developed a year ago. Claude Code is a different beast - model capabilities + skills + planning mode has led many seasoned devs to stop writing any code.

Ready Apr 2026

Claude can actually write without the AI smell

With the right guidance you can get Claude to output competent copy from a decent logical outline. Building internal Skills for your teams to share is a stronger pattern than custom GPTs or dropping documents into project context.

Say this to sound smart

I wonder when Gemini and Codex are going to catch up to Claude's BullshitBench score.

If an AI-native competitor started today in our space, what would they build first?
When was the last time we re-tested a workflow that failed with AI six months ago?
Do we understand AI capabilities across all modalities, and how they map to our business?

Pulse on

Engineering

Explainer Act May 22, 2026

Forward Deployed Engineering: What you reach for when your product is opaque

Palantir popularized FDE: send your engineers to live inside the customer, learn the domain, ship the solutions hand-in-hand. It was needed because Palantir’s platform was opaque without a human translator. Now Anthropic, OpenAI and Google are doing the same thing, for the same reason. The models are powerful, but the affordances aren’t there and solutions don’t install themselves.

As someone who ran a solutions engineering team, this is all very validating. As an AI consulting firm we could see it as a threat, but we also think there’s plenty of pie.

  • Applied AI teams, Solutions Architects, AI consulting wings are all variants of the same playbook.
  • Joint ventures are forming up with Blackstone & Goldman Sachs getting hands-on with Anthropic, and more private equity getting in on the game with TPG & Advent contributing to “The OpenAI Deployment Company”.
  • What about the big 4, I hear you ask? Well, there’s ‘partnerships’ that have been announced over the year with Deloitte, McKinsey, BCG et al. Basically AI is hard enough to figure out, they’ll grab anyone’s hand. We’re here if you need us chaps: hello@intentional.team
  • The job is essentially translation. What does this business actually do? Which workflows have effort worth capturing? Then an army of fresh compsci grads looking to shave some yaks.
  • For many orgs, this is going to be required. You have data. You have appetite. You have AI budget. What you don’t have (or can’t afford or entice) is the engineer who can bridge between the GTM team and a functioning agentic workflow.

It’s not going to be cheap, and as with most scaled consulting models, it’s going to look like a claws-in model. We don’t imagine there will be much discussion of model-agnosticism, token optimization, or avoiding vendor lock-in.

So…consulting?: Yes, “Forward Deployed Engineer” is consulting with a title for those allergic to our fine profession.

Also watching: Pragmatic Engineer’s deep-dive on FDE in their catchily named “The Pulse” is the best primer if you want to brief your leadership team. More on the new role taxonomy emerging in the same wave.

Explainer Act May 26, 2026

Security teams need yoga skills to keep changing their posture.

We’ve talked about Mythos before - Anthropic’s AI bug-hunter that was too effective to release, or so the press release legend goes. It seems now that they’re preparing for a public release, after several months of Project Glasswing. Or, at least some ‘Mythos-class’ models, that are likely more than a little nerfed, and with some hefty guardrails.

Snark aside, there’s a pretty defensible logic to the ways the security threat model is going to change. The evidence is stacking up: OSS maintainers starting to see real value in the AI-generated patches coming their way, DARPA competitions bearing fruit and sneaky individuals jailbreaking foundation models to help with exploits. Our founder’s prediction for the year is looking pretty healthy.

  • AI is finding real bugs in OSS now. Indie researchers are running agents against major codebases. Patches are landing.
  • Enterprise security teams face an AI-vs-AI dynamic. Your blue team needs agentic tooling. Your red team already has it (whether you sanctioned that or not!)
  • Mythos and successors will reshape what’s exploitable. Expect some interesting stories on the horizon.

Much of the world is still dragging their feet, or actively avoiding AI, but certain jobs are right in the firing line. As you sit down with a gin & tonic tonight, spare a thought for the Enterprise security teams in the financial sector who probably are still basking in the glow of their monitors (along with pretty much any Enterprise CFO).

Also watching: The leaked Mythos data-store incident and unauthorized access to the model tells you that opsec is just as important as infosec, but maybe not as easily automated.

Hot take Act May 15, 2026

The new engineering role taxonomy.

Three roles are emerging from the vibe-coding aftermath. They’re not all entirely new, but demand is already going through the roof for these specializations.

  • Applied AI Engineer. AKA “Person who knows how to use AI to do stuff”. Five years ago, they were a full-stack engineer. Now they’re wiring up models, tools, skills, evals, deployment. Emerging grads will slot in here.
  • Forward Deployed Engineer. Sits inside the customer. Translates the customer’s workflows into agentic implementation designs. Lives at the boundary between consulting and engineering. Discussed above.
  • The Turd Polisher. OK, so maybe not as clearly in demand, but we’re starting to see this person emerge as the hero of small prod/eng teams. Calm, unflappable, and happy to show up every day to take the vibe-coded MVPs that a self-declared ‘AI Product Manager’ slopped out, and help them not embarrass the org. Patience a required virtue. Officially: Production Engineer, SRE, Staff+. Unofficially: turd polisher.
Hot take May 8, 2026

The vibe coding shitstorm clouds are rolling in.

Twelve months in, the rough edges are showing. Non-engineers using Claude Code, Cursor, v0 are shipping tantalizing prototypes. They’re also shipping things that break in production. Auth that doesn’t auth. Race conditions in checkout. Migrations that nuke prod.

We love a good vibe coding session for making a POC projection mapping system, or reverse engineering a cheapo bluetooth display, but production code it ain’t.

The market is correcting. Demand for senior engineers who can productionize AI output is climbing. See: the turd polisher.

  • Your release cadence is the canary for your AI risk Mar 2026
  • The $250K app now costs $100 in tokens Mar 2026
Further reading — Engineering
Pragmatic Engineer: Forward Deployed Engineering — the best primer on FDE for leadership briefings
Anthropic: Unreasonable effectiveness of HTML — why their own team uses HTML over Markdown
WTF Happened in 2025 — the microsite tracking the inflection, constantly updated
Lean AI Leaderboard — the companies proving small teams + AI = outsized results
Say this to sound smart

Solution architecture didn't die. It got the title "Forward Deployed Engineer" and a 10x rate card.

Bullshit May 2026

"Agents as architects"

Opus and Codex are pretty good at writing code. But they’re only OK at designing architectures (just try the same thing three times, you get some pretty wide variations without spoonfeeding constraints). Large context windows and clever tricks are still not sufficient to ingest existing codebases of any significance.

Bullshit Apr 2026

"AI can replace all our developers"

The zombie engineer problem is real, but experienced devs with taste and product sense are more valuable than ever. AI replaces the mechanical translation of requirements to code. It does not yet replace the judgment of what to build, why, and how it fits together.

Bullshit Mar 2026

"We can one-shot refactor our tech debt"

We like to call it ‘heritage’ not legacy code as a sign of respect for those who built it. While OpenRewrite looks like a promising approach, refactoring & tech debt remediation still require a very hands-on workflow, and considerable time.

Ready Mar 2026

Agent-first development workflows

Claude Code + parallel instances, spec-driven development. The 50x claim is real for greenfield. Not hype — we’re building this way right now. The bottleneck moved from “can we write the code” to “do we know what we want.”

Ready Apr 2026

AI-assisted release management

Still stuck with a release backlog, and one release engineer trying to get things out the door? This is an Enterprise anti-pattern ripe for AI assistance. Opus with 1MM context is pretty adept at chopping up branches into coherent releases. Are you allowing your teams to use it?

Say this to sound smart

The engineer who sees their job as translating requirements to code is already dead. They just don't know it.

What's our release cadence, and what would it take to cut it in half?
Who on the team is experimenting with agent-first coding? What have they shipped with it?
Where is our codebase most inaccessible to agents with 200K context? 1MM context?
How are we sharing our specific context across our dev team? Skills, rules/claude.md files, etc.

Pulse on

Product

Explainer Act May 22, 2026

Design systems are coming! Useful ones!

For at least a decade “design system” has meant a Figma library nobody enforced and a Storybook nobody read. While everyone else has been off at AI sleepaway camp, design has been staying at home, retaking its credits in summer school.

Design folks we speak to have, not unreasonably, been waiting on Figma to get their shit together. Unlike product (words!) and engineering (<words!>), the visual/creative component of design isn’t well served even by the multimodal capabilities. The delta between a nano-banana output and real graphic design remains pretty large. But hope is at hand!

  • Claude Design and equivalents are landing. Foundation labs are productizing design tooling that can consume myriad inputs, and output something on-brand, on-grid, and ship-ready. But it’s pretty pricey. Holy token consumption batman.
  • Figma is finally shipping useful things. The “design as final artifact” era is over. Make-systems are converging with build-systems and Figma’s finally getting itself onto the right side of that line.
  • Design systems become AI lego bricks. Tokens, components, layout grammar are what the model needs to do useful work. Treat them as instructions for an agent, not a style bible for humans.
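Here is a small sketch of what "tokens as instructions for an agent" could mean in practice: the same token dictionary feeds both the stylesheet and the constraint you hand the model. The token names and values are made up for illustration, and `tokens_to_css` and `tokens_to_instructions` are hypothetical helpers, not part of any design tool's API.

```python
# Hypothetical design tokens; a real system would export these from the design tool.
design_tokens = {
    "color-primary": "#0b5fff",
    "space-unit": "8px",
    "font-body": "Inter, sans-serif",
}


def tokens_to_css(tokens: dict[str, str]) -> str:
    """Render tokens as CSS custom properties on :root."""
    lines = [f"  --{name}: {value};" for name, value in tokens.items()]
    return ":root {\n" + "\n".join(lines) + "\n}"


def tokens_to_instructions(tokens: dict[str, str]) -> str:
    """Render the same tokens as a constraint sentence for an agent prompt."""
    names = ", ".join(f"var(--{name})" for name in tokens)
    return f"Use only these CSS variables for color, spacing, and type: {names}."


css = tokens_to_css(design_tokens)
instructions = tokens_to_instructions(design_tokens)
```

One source of truth, two consumers: the browser gets the values, the agent gets the vocabulary it is allowed to use.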

Leave it to Anthropic (who else!) to stir things up with Claude Design. The interface is ironically terrible, the affordances weak, and the workflow opaque, but when it works it’s pretty sweet. The productivity gains of a central digital system are starting to show up in a way that doesn’t require the type of tasteful UI engineers that were the preserve of the few. Models can now consume design tokens, component libraries, and layout grammars, and produce production-faithful output, without it all looking like purple-gradient-Claude-slop.

The design system isn’t documentation any more, it’s a tool.

Also watching: This is the year where the “your non-AI-native toolchain is legacy” narrative starts to soften. Atlassian and Figma both have credible AI-forward paths. Jira (with Rovo) is suddenly useful again if you have a real Confluence base for the model to ground in. The legacy toolchain isn’t dead - don’t rip-and-replace until you’ve checked back in on recent developments.

Ed - this construct is ‘not-a, but-b’, one of our least favourite AI-slop writing tells, but in this case we wrote it by hand, and it’s accurate

Claude Design · Figma · Atlassian Rovo
Explainer Act May 15, 2026

HTML beats Markdown for most work. And PowerPoint for all work.

Anthropic’s own team uses HTML as a primary communications format. Our ‘Deck-orator’ outputs slides in HTML. Why is HTML the format of choice? (and wasn’t Markdown the new hotness just last week?). Two reasons: training-data weight, and rendering control.

  • HTML and CSS dominate the training corpus. Decades of websites means models are vastly better at HTML/CSS than at PowerPoint XML, Keynote, or SVG for anything requiring layouts.
  • Layout control is finally good. Grid, flexbox, modern CSS — models can produce precise, on-brand artifacts that don’t need a designer pass.
  • PPTX is the worst of both worlds. Verbose XML, tokens wasted on layout cruft, output the model is bad at, and the result still needs cleanup inside PowerPoint. Having Claude write Powerpoint slides is expensive. And painful.
  • Markdown is great for prose, mid for documents, bad for presentation. When you want it to look like something, reach for HTML.
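To show how little machinery an HTML deck needs, here is a minimal sketch that turns an outline into a single self-contained page. It is not our Deck-orator and not Anthropic's tooling: `render_slide` and `render_deck` are hypothetical helpers, and the one-rule stylesheet is a placeholder for real brand CSS.

```python
from html import escape


def render_slide(title: str, bullets: list[str]) -> str:
    """Render one slide as a <section>, escaping user text."""
    items = "\n".join(f"    <li>{escape(bullet)}</li>" for bullet in bullets)
    return (
        '<section class="slide">\n'
        f"  <h1>{escape(title)}</h1>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</section>"
    )


def render_deck(slides: list[tuple[str, list[str]]]) -> str:
    """Wrap slides in one HTML page; each slide fills the viewport via CSS grid."""
    style = ".slide { display: grid; place-content: center; min-height: 100vh; }"
    body = "\n".join(render_slide(title, bullets) for title, bullets in slides)
    return (
        "<!doctype html>\n"
        f"<html><head><style>{style}</style></head>\n"
        f"<body>\n{body}\n</body></html>"
    )


# Hypothetical outline.
deck = render_deck([("Q3 plan", ["Ship <fast>", "Own the meter"])])
```

Compare that with the pile of XML parts behind a single PPTX slide and the token argument mostly makes itself.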

Obvious eyeroll: Follow the money. Of course Anthropic recommends HTML over the more token-conservative markdown. Their team isn’t paying metered rates.

Hot take Watch May 8, 2026

Cambrian explosion in PM, incoming.

Software projects scale with cost. Cost is collapsing. The number of in-flight initiatives in any given org is about to explode, including in orgs that have never historically built software at all. Each initiative needs a product person to make sure it’s…useful, basically. The PM job market doesn’t shrink with AI. It explodes.

  • Cheap software → more software → more contexts that need product thinking → more PMs.
  • The PM job changes. Less roadmap, more orchestration. Managing agentic workflows looks more like managing freelancers than managing engineers.
  • Non-software orgs become software orgs. Law firms, plumbing supply distributors, regional logistics — all suddenly shipping internal apps. Each one needs a product mind.

Jevons Paradox: Every “AI will replace PMs” hot take gets it backwards. The product function expands when the cost of building drops (once we’ve got through the initial wave of crappy products that nobody uses).

Hot take Watch Mar 26, 2026

Linear just declared its own product dead

Two days ago, Linear’s CEO wrote that issue tracking was “built for a handoff model” that agents are making obsolete. 75% of their enterprise workspaces now have coding agents. Agent-authored issues up 5x in three months. Linear is pivoting from issue tracker to “shared product system that turns context into execution.” This is the biggest product-methodology signal this quarter.

  • Product is dead. Long live product. Mar 2026
  • Jira is a punchline now Feb 2026
Further reading — Product
Anthropic: HTML as primary format — why HTML beats Markdown for most product comms
Linear: Issue Tracking Is Dead — the biggest product-methodology signal this quarter
Martin Fowler: Spec-Driven Development — where methods might be heading
Say this to sound smart

Your design system stopped being documentation. It's now an interface for the agent.

Bullshit Mar 2026

"AI-generated PRDs are a productivity win"

PRD slop is drowning teams. The document is not the product. A 40-page AI-generated PRD that nobody reads is worse than a napkin sketch that everyone understands. The productivity win is in thinking, not typing.

Bullshit Mar 2026

"We need an AI product strategy"

You need a product strategy that accounts for AI. Different thing entirely. “AI product strategy” implies AI is the product. It’s not. It’s a capability shift that changes how you build, price, and deliver everything.

Bullshit Apr 2026

"Consumers are ready for metering"

SaaS economics were great. Hosting & infra costs per user are largely negligible. 30% of your recurring revenue comes from users who aren’t even using the product. As AI becomes part of the stack, inference costs are changing the economics of products, and that’s going to require usage-based pricing. This is going to act as a small counterbalance to the new development efficiency, and users will hate it.

Ready Mar 2026

Spec-driven development

Martin Fowler’s articulation of where product methods are heading. Context over tickets. Write the spec, hand it to the agent, review the output. The PM role becomes “chief context officer” — less admin, more judgment.

Ready Mar 2026

AI-assisted user research synthesis

Pattern extraction from interviews, survey data, support tickets. The PM admin layer being automated. The insight is still human. The grunt work of organizing 200 interview transcripts is not.

Say this to sound smart

It's great that people are finally going to realize Product Management was not just about slinging tickets.

When was the last time someone actually read one of our PRDs end to end?
Did AI write our AI strategy? Do we actually have a unique POV?
Have we made a significant change to our PM workflow in the past 12 months?
What is our moat that protects us from AI-native competition entering the market?

Pulse on

Business

Explainer Act May 25, 2026

The bait-and-switch is starting. Get AI FinOps-ready.

Anthropic and OpenAI are heading toward IPOs in 26–27. Profitable companies IPO well. Unprofitable ones don’t. The next few months are going to feature increasingly creative ways for them to extract revenue without renegotiating the contract.

  • Enterprise token allocations are going away. Renewals are landing as metered-only. The subsidized seats model is dying for Enterprise, will SMB follow?
  • Uber spent their annual AI budget in 4 months. Token-based pricing is hard to track, harder to control. Your CFO is panicking.
  • The tokenizer changed. Claude 4.7’s tokenizer counts ~27% more tokens than 4.6 for identical content. Yay invisible price-hikes.
  • Enterprise security is now paywalled. Want domain capture, audit logs, RBAC, Compliance API? Welcome to the metered tier.
  • AI FinOps is about to be a function. Someone in your org needs to own the meter. If it’s nobody, it’s your CFO’s surprise in Q3.
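The arithmetic behind two of those bullets is simple enough to sketch. Every number below is hypothetical except the ~27% tokenizer inflation quoted above: the token volume, the price per million, and the budget figures are placeholders, and `effective_cost` and `months_of_runway` are illustrative helpers rather than any vendor's billing formula.

```python
def effective_cost(base_tokens: float, price_per_million: float,
                   tokenizer_inflation: float = 0.0) -> float:
    """Bill for the same content when the tokenizer counts more tokens."""
    billed_tokens = base_tokens * (1.0 + tokenizer_inflation)
    return billed_tokens / 1_000_000 * price_per_million


def months_of_runway(annual_budget: float, monthly_spend: float) -> float:
    """How many months an annual budget lasts at the current burn rate."""
    return annual_budget / monthly_spend


# Hypothetical workload: 100M tokens a month at $10 per million tokens.
before = effective_cost(100_000_000, 10.0)
after = effective_cost(100_000_000, 10.0, tokenizer_inflation=0.27)

# Hypothetical budget: planned for 12 months, burning at 3x the planned rate.
runway = months_of_runway(annual_budget=1_200_000, monthly_spend=300_000)
```

Same content, same list price, 27% bigger bill. And a team burning at three times plan exhausts a twelve-month budget in four, which is the shape of the Uber story. Whoever owns the meter should be able to produce both numbers on demand.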

Cute playbook: Ship the breakthrough capability to consumers at a loss. Hook the developer ecosystem. Lock the enterprise into security features only the meter buys. Bump the tokenizer. The first one is always free.

It’s not all doom & gloom. Open-weights models are trailing the frontier by 12-18 months in capability. Good enough for many tasks, trending to good enough for agentic coding. Combine OpenCode and DeepSeek-V4-Pro and you have a fallback for when you hit your Claude limits.

While we don’t know the true cost of inference at Anthropic et al, the cost of raw compute is visible through inference marketplaces like OpenRouter. Data suggests there’s a hefty markup coming from the closed shops. For fixed capability, inference costs are dropping, maybe even ~5-10x annually due to techniques like sparse MoE, specialized hardware, etc.

But we’ve still not hit the goldilocks zone yet, the ‘good enough’ cost:performance, and the goalposts keep moving. Tool usage, Reasoning, these all increase demands on inference, with the tradeoff of increased utility.

Also watching: Competition is the only counterweight. DeepSeek made their 75% discount on V4-pro permanent. Kimi K2 prices in at 5–10x cheaper for a similar parameter count. Cursor’s Composer 2.5 is showing real promise for coding-specific work. It’s the ‘solar panels on your roof’ hedge against utility dominance.

Hot take Watch May 15, 2026

Assume AI is always listening. It's a legal timebomb.

We use Granola. We recommend it. Big fans.

We’re also watching the legal industry and risk teams get nervous about AI notetakers, and they should be. The NYT covered the growing pile of cases where the transcript, not the audio, is being introduced as evidence. Audio doesn’t get retained. Accuracy becomes the dispute. The transcript is treated as fact.

  • Discovery now includes AI-generated transcripts. Granola, Otter, Fathom, Fellow, Zoom, Notion…
  • Audio is rarely retained. When the transcript is wrong, you have no source of truth to dispute it with.
  • The “casual” channel is gone. Side comments, jokes, half-thoughts are all transcribed verbatim, all archived, all eventually surfaceable.
  • We’ve been here before. Slack brought the workplace casual chat into a subpoena-ble digital footprint. “Never write something in an email you wouldn’t want read out in court” is an old aphorism.

We’re not telling you to stop using notetakers. We’re telling you to act like every meeting is one subpoena away from being read aloud. Because it is.

Explainer Act Mar 28, 2026

Your caution is now your biggest risk

Don Norman wrote about affordances — the design cues that tell you how to use a thing. A door handle affords pulling, a plate affords pushing. It makes things intuitive to use. AI has almost no affordances right now.

This means adoption is happening mainly at the intersection of the technical and the curious. Your average accounting department isn’t reinventing how they work, because the tools aren’t ready. The problem: waiting for affordances to arrive is now the most dangerous strategy available to you.

Anthropic noticed this — engineers were using Claude Code for spreadsheets, timekeeping, document creation — and shipped Cowork in two weeks. But Cowork itself doesn’t really know how you’re going to use it either. The ‘try this first’ example is to organize screenshots on your desktop. It’s going to take a long time for general-purpose “AI for everyone” products to land.

In the meantime, while you wait, there’s a whole new army of tiny AI-native companies scanning every market looking for opportunities to disrupt. They’re building their own tools, their own workflows, at a fraction of traditional enterprise margins.

  • If you wait for prêt-à-dopter, it’s 2 years out. Software timelines: 3 months to build. Adoption timelines: compliance, inertia, org change. There’s your 2 years.
  • AI-native companies are full of curious generalists. They don’t wait for affordances. They try stuff. Some fails. They get shit done.
  • The risk katamari grows every quarter you wait. Your “risk-averse” posture is itself accumulating risk. CRUD SaaS, SMB workflow tools, M&A rollups: that’s where the swarming starts.

The affordances will come, no question. People will build new tools, integrate with your existing systems and workflows, and some platforms will respond with the appropriate degree of urgency (see: Linear). But can you afford to wait for that?

Also watching: Foundation model companies partnering with consultancies for change management, because their own products have no affordances, but the potential is here already. If the foundation models stopped improving tomorrow, we’d still have a decade of product innovation on current capabilities.

Exhibit May 21, 2026

Embodiment is coming...wait, it just fell over.

Figure’s humanoid robot got beaten by a high-school intern at package sorting. The intern wasn’t really trying.

Figure AI's humanoid robot competing against a teenage intern in a package-sorting challenge

The teenager won this round. Just. But this is also the worst version of the robot that will ever ship. This is a public benchmark for physical embodiment: a non-juiced livestream, and a measurable result without anyone falling over.

The reality: Every “AI can’t do X” benchmark has a half-life of about nine months (see Will Smith Eating Spaghetti). The clock has started for Embodiment.

Hot take Act Mar 18, 2026

The single highest-ROI hiring filter is curiosity

Not technical depth. Not domain expertise. Curiosity. Shopify called their hiring persona “entrepreneur”; same principle. We had a hyper-specialization era (the frontend JavaScript ecosystem as the reductio ad absurdum). It’s over. Every hire going forward should be someone who tries things before being asked to.

  • We've barely scratched the surface of AI-as-capability Mar 2026
  • "We gave everyone ChatGPT logins — that's our AI strategy" Mar 2026
Further reading — Business
Forbes: Uber burned its 2026 AI budget in four months — the canonical "your CFO is panicking" story
a16z: LLMflation (inference costs dropping ~5-10x annually at fixed capability) — the price-trend essay
Epoch: LLM inference price trends — the data backing the LLMflation argument
Claude 4.7 tokenizer: the invisible price hike — the 27% tokenizer bump, measured
Bloomberg: DeepSeek 75% discount, permanent — competition begins to bite
NYT Dealbook: AI notetakers and legal risk — why your Granola transcripts matter to your GC
Intentional: AI Adoption Is a Change Management Problem — our take on why this isn't a technology problem
Say this to sound smart

AI FinOps is about to be a function. If nobody owns the meter, your CFO is in for a nasty surprise in Q3.

Bullshit Mar 2026

"AI adoption is a technology problem"

It’s a change management problem. The tech works. The org doesn’t. Every failed AI rollout we’ve seen had working technology and broken incentives, unclear ownership, or leadership that announced the initiative and moved on.

Bullshit Mar 2026

"We'll wait for the tools to be ready"

The tools are ready enough. Your caution is your biggest risk now. Every quarter you wait, the risk katamari grows. The AI-native competitors aren’t waiting for affordances — they’re building without them.

Ready Mar 2026

Curiosity-first hiring

The generalist era is here. Shopify’s “entrepreneur” filter is the model. Hire people who try things before being asked. The hyper-specialization era rewarded depth. The AI era rewards people who can context-switch, experiment, and ship across domains.

Ready Mar 2026

Decision-capture as training data

Recording how your team makes judgment calls. This is the input your future agents need. Not the tasks. The reasoning. Why did you choose vendor A over B? What made you escalate that ticket? Start capturing this now and you’ll have a head start when true agentic workers become (safely) available.
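
If you want a starting point, here is a minimal sketch of what a captured decision could look like. It is one possible shape, not a standard: the `DecisionRecord` class, its field names, and the vendor example are all hypothetical, chosen only to show the reasoning being stored next to the outcome.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DecisionRecord:
    """One judgment call, stored with the reasoning behind it (hypothetical schema)."""
    question: str        # what was being decided
    options: list[str]   # the alternatives that were considered
    chosen: str          # the option that was picked
    reasoning: str       # the why: the part a future agent would need
    decided_by: str      # who made the call
    decided_on: str      # ISO date, e.g. "2026-03-15"

    def __post_init__(self) -> None:
        # Reject records whose outcome was never listed as an option.
        if self.chosen not in self.options:
            raise ValueError(f"chosen option {self.chosen!r} not in options")


def to_jsonl_line(record: DecisionRecord) -> str:
    """Serialize a record as a single JSON line, ready to append to a log file."""
    return json.dumps(asdict(record), sort_keys=True)


# Illustrative data only.
record = DecisionRecord(
    question="Which vendor for invoice processing?",
    options=["Vendor A", "Vendor B"],
    chosen="Vendor A",
    reasoning="B was cheaper, but A integrates with our existing ledger.",
    decided_by="ops-lead",
    decided_on="2026-03-15",
)
line = to_jsonl_line(record)
```

An append-only file of lines like this is enough to begin with. The point is that someone new (or something new) could reconstruct why the call was made, not just what it was.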

Say this to sound smart

Your release cadence is a good proxy for your org's change tolerance. It's the canary for your AI risk.

Beyond ChatGPT logins, what does our actual AI adoption look like? Who's using what, and how?
How do we document decisions right now? Could someone new reconstruct why we made them?
Who on the team is experimenting with AI outside their core role? What are they finding?
Where are we calling our behaviour 'risk-averse' that's actually 'risk-accumulating'?

We want to hear from you

Got a signal we missed? A take you disagree with? Something we should be tracking? We're building Pulse as a conversation, not a broadcast.

hello@intentional.team →