The Work Primitive: What Every AI Product Leader Gets Wrong

Transcript

0:00

This is a piece about the strategy that

0:02

we have to build as product leaders when

0:05

we think about where agents play best.

0:08

And what I'm asserting is that the work

0:10

primitive is what really matters. And a

0:12

lot of us are assuming that the agent's

0:15

ability to use the computer sort of

0:16

levels the playing field because we can

0:18

all sort of put our programs out there

0:20

and the agent can use it. Or we can

0:21

build an MCP server and it's just going

0:23

to work. I want to suggest that there's

0:25

a deeper strategy in play that some of

0:27

the hyperscalers understand and that

0:29

needs to be more widely shared and

0:31

understood. I don't want us to get stuck

0:34

in a world where we just build demos

0:36

that look good in a Twitter video and

0:38

we're not thinking about that more

0:40

carefully. So, let's dive in and

0:42

understand what happens under the

0:44

surface when you see an agentic workflow

0:46

and what we mean by controlling a work

0:49

primitive. Cuz it's a new term, so we're

0:50

going to define it, we're going to

0:51

explain what it means, and we're going

0:52

to explain why it's valuable. Let's jump

0:54

in. When an AI agent opens a browser and

0:56

moves through tabs and clicks buttons

0:58

and fills out a form or checks your

1:00

calendar and it can do all of that now,

1:02

it feels like the model has crossed a

1:03

line. And I will say specifically Codex

1:05

computer use can do that. It is no

1:07

longer just answering questions, it's

1:09

doing real work for you. But I think

1:11

that the visible work that the model

1:12

does is distracting us from the platform

1:15

shift underneath. The future is not an

1:17

AI that gets really good at clicking

1:19

buttons for you. That's the bridge. The

1:21

real fight is over who defines what the

1:24

button means. Because once agents start

1:27

acting inside companies, the question is

1:29

not just can it click a button for me?

1:30

The question is does the system

1:32

understand what kind of work is being

1:33

done, who's allowed to do it, what could

1:36

go wrong, and how the result is checked.

1:38

I've seen this personally just in using

1:40

Codex computer use. I feel like I'm

1:42

running into these sort of friction

1:44

points that I never would have expected

1:45

to see because I am now using an agent

1:49

on my computer at the same time as I am

1:51

on that computer. And so I'm trying to

1:53

figure out what does it look like when

1:55

we have a different set of permission

1:57

states for agents versus people. Let's

1:59

jump into the details here. There are

2:00

three layers to keep in your head.

2:02

Access, meaning, and authority. Those

2:04

are all layers that agents can touch.

2:06

Computer use lets agents access parts of

2:08

the computer. Semantic work primitives

2:11

give agents meaning. So, there are

2:13

three layers to keep in your head as we

2:15

go through this video. The layer of

2:16

access, underneath it the layer of

2:18

meaning, and deeper still the layer of

2:20

authority. Computer use is what I've

2:22

been messing with. It gives agents

2:23

access. Semantic work primitives give

2:27

agents a real sense of meaning. And the

2:29

companies that control those primitives

2:31

are the ones that end up with real

2:33

platform power. So, there's three

2:34

levels, right? And that sounds abstract,

2:36

so we're going to start with something

2:37

really simple. Imagine an AI agent

2:39

moving a calendar invite. I've had Codex

2:41

do that. On the screen, that looks like

2:43

changing a time and clicking save. But

2:45

the action is not really click save. It

2:48

may notify five people, it may move prep

2:50

time, it may break a commitment someone

2:52

made to a customer, it may turn a

2:53

private conversation into a meeting that

2:55

now conflicts with something more

2:56

important. The human sees a calendar

2:59

event and brings all of that context

3:00

with them. The software sees fields in a

3:03

database, right? The agent sees that it

3:04

needs to fill out the calendar and just

3:06

do the job. It doesn't necessarily

3:08

understand the human intent behind the

3:09

meeting. And the human intent behind the

3:11

meeting, making that more legible, is

3:13

what I mean by a semantic work

3:15

primitive. It's a fancy word, but it

3:17

means basically, does the computer

3:20

understand what it's doing and what we

3:22

humans need it to do when it does a

3:23

task, or is it just using the fields?

3:26

And that's a big difference. The same

3:28

thing happens with checkout. A button

3:30

that says buy is not just a button. It

3:32

represents money, user consent, tax,

3:35

merchant identity, fraud risk,

3:36

fulfillment, returns, card security, and

3:39

maybe a dispute a few weeks from now. Or

3:41

take deleting a file. One file might be

3:42

harmless cleanup, another might be the

3:44

only copy of a signed agreement. On the

3:46

screen, those actions can look

3:47

identical. In the work, they're very

3:49

different. So, yes, agents need to use

3:51

computers. They need browsers, they need

3:53

desktops, they need to survive inside

3:55

software that was built for people. But,

3:56

computer use is not a long-term moat.

3:59

Computer use is like how agents reach

4:01

the old world, right? The thing that

4:03

makes agents really valuable long-term

4:05

is the layer that tells the agent what

4:07

it is touching and why it matters. And

4:10

right now, we're kind of

4:11

getting hints of that. So, the auto

4:13

review feature in Codex basically is

4:15

there to guard human intent and ensure

4:18

that the agent using the computer is

4:21

actually using it to do the right task.

4:23

I love it. It works pretty well, but it

4:25

feels like an initial draft in that

4:27

direction because it's very much a

4:29

guardrail tool. It's there to guardrail

4:32

the agent and keep it from doing

4:33

something it shouldn't. That's good. I

4:35

want it to do its job, but that's

4:37

different from positively ensuring that

4:40

agents have the semantic meaning they

4:42

need to really deeply understand my

4:45

calendar. Calendars are complex things.

4:47

Deeply understand the email context for

4:50

a relationship I've had for 3 and 1/2

4:51

years with someone when they write one

4:53

message. That's a larger piece of

4:55

context. And look, I get it. Most of the

4:57

world is not agent-native, and the fact

4:59

that we have computer use is hugely

5:01

helpful. The fact that we have jumped in

5:03

just a few months to the point where

5:04

it's useful is a godsend. Companies are

5:06

full of software that assumes a human is

5:08

sitting there interpreting everything,

5:10

right? Internal dashboards, procurement

5:11

tools, shared drives, government

5:13

websites, Excel workflows, the whole

5:14

thing, right? All of computing

5:17

assumes a human will use it. If an agent

5:19

cannot use a computer visually in that

5:21

world, it cannot reach so much of our

5:23

work. It is stuck inside the clean,

5:26

modern, API-friendly part of the world,

5:28

which is much smaller than people in

5:30

tech want it to be. So, computer use is

5:32

absolutely necessary. It is the

5:33

universal adapter for the messy middle

5:35

period. It's kind of like screenshots,

5:37

right? It just is going to be a

5:38

universal adapter. But, a universal

5:41

adapter is typically a shallow

5:42

interface. A screenshot can show the

5:44

agent what is on the screen, but it does

5:46

not automatically reveal the structure

5:48

underneath. A browser can reach almost

5:51

every web app, but it does not

5:52

automatically know the domain meaning of

5:54

each workflow. A desktop controller can

5:56

click a button, but it does not

5:57

automatically know whether that button

5:59

is reversible, whether that button is

6:02

financially material or dangerous. The

6:04

agent can guess, and the guesses are

6:06

getting much, much better, but guessing

6:08

is not a strategy for high consequence

6:10

work. If an agent is helping you

6:12

summarize an article, then guessing is

6:14

probably something you can fix. If it is

6:16

deciding whether to issue a contract,

6:18

that's a different thing entirely,

6:20

right? If it's deciding whether to email

6:21

a customer, that's a different thing

6:23

entirely, or spend money, you have to be

6:24

sure. And this is where the hierarchy of

6:27

meaning becomes clear. Agents should use

6:30

the richest semantic interface

6:31

available. If there's a connector, use

6:33

the connector. If there's a proper

6:35

protocol, use the protocol. If the

6:36

system exposes a typed object and a

6:38

permissioned action, use that. Only

6:40

fall back to a browser or desktop

6:42

control when the richer interface

6:43

doesn't exist. This is not just

6:45

engineering preference here. This is how

6:47

things should be architected, and as far

6:49

as I can see, this is generally how the

6:51

hyperscalers have built their models.

6:53

Codex works this way, Claude prefers to

6:55

work through MCPs when it can, and I

6:57

think that's correct. Ultimately, it is

6:59

that hierarchy of meaning that ensures

7:01

that we get the richest possible

7:02

experience for any given task. So, we're

7:05

not likely to have as many issues as

7:08

long as we have as many connectors as

7:10

possible plugged into our preferred AI

7:12

systems, which by the way is an intended

7:15

plug for you adding plugins to your

7:18

ChatGPT, to your Codex, to your Claude.

7:21

Make sure it has those rich tools if

7:23

they are available to you. And

7:24

increasingly, for so much of our work, we

7:27

have MCPs or APIs that are already

7:30

pre-built as plugins for these tools;

7:32

you should add them. That is just a very

7:34

practical takeaway here. If you want

7:36

your agent to not have to use computer

7:38

use all the time, add the plugins. Add

7:40

the connectors. All of that is there

7:43

just to facilitate access, right? The

7:45

model needs access to tools, the agent

7:46

needs access to the browser, the

7:48

assistant needs to access your files.

7:49

So, you get the idea. But, access only

7:51

gets the agent into the workspace. It

7:54

doesn't make the work understandable.
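
The interface hierarchy described above (connector first, then protocol, then a typed and permissioned action, with browser or desktop control as the last resort) can be sketched in a few lines. This is only an illustration: the `Richness` levels, the `Interface` type, and `pick_interface` are hypothetical names, and the exact ranking among the three richer options is an assumption based on the order in which they are mentioned.

```python
from dataclasses import dataclass
from enum import IntEnum


class Richness(IntEnum):
    """Higher value = more semantic meaning exposed (assumed ordering)."""
    COMPUTER_USE = 0   # screenshots and clicks: universal but shallow
    TYPED_ACTION = 1   # a typed object with a permissioned action
    PROTOCOL = 2       # a proper protocol, e.g. an MCP server
    CONNECTOR = 3      # a purpose-built connector or plugin


@dataclass(frozen=True)
class Interface:
    name: str
    richness: Richness


def pick_interface(available: list[Interface]) -> Interface:
    """Return the richest available interface, else fall back to computer use."""
    fallback = Interface("browser/desktop control", Richness.COMPUTER_USE)
    return max(available, key=lambda i: i.richness, default=fallback)


# Hypothetical example: a calendar reachable through its UI and an MCP server.
calendar = [
    Interface("calendar web UI", Richness.COMPUTER_USE),
    Interface("calendar MCP server", Richness.PROTOCOL),
]
chosen = pick_interface(calendar)
```

Here `chosen` is the MCP server, and an empty list falls back to browser or desktop control, which matches the "universal adapter" role computer use plays in the talk.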

7:56

The next layer that we are just getting

7:58

to now is meaning. What is this object?

8:01

What action is being proposed? Who owns

8:03

it? Who's allowed to change it? What

8:05

happens if the action succeeds? What

8:06

happens if it fails? Is it reversible?

8:09

Does it touch the money? Does it touch

8:11

customer data? Does it touch production?

8:12

Does it create an obligation outside the

8:14

company? Does it require approval? Can

8:16

another agent review it? Can the system

8:18

tell whether the outcome is correct?
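
One way to picture these questions is as fields on a description of the action itself, so the system can answer them before the agent acts. The sketch below is an assumption throughout: `WorkAction`, its field names, and the policy inside `supervision_for` are invented for illustration and are not taken from any product mentioned in the video.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkAction:
    """Hypothetical descriptor answering the 'meaning' questions for one action."""
    name: str
    owner: str
    reversible: bool
    touches_money: bool = False
    touches_customer_data: bool = False
    touches_production: bool = False
    creates_external_obligation: bool = False
    outcome_checkable: bool = False


def supervision_for(action: WorkAction) -> str:
    """Map the answers to a level of supervision (illustrative policy only)."""
    high_stakes = (
        action.touches_money
        or action.touches_production
        or action.creates_external_obligation
    )
    if high_stakes and not action.reversible:
        return "human_approval"
    if high_stakes or action.touches_customer_data or not action.outcome_checkable:
        return "agent_review"
    return "autonomous"


refund = WorkAction("issue_refund", owner="support", reversible=False,
                    touches_money=True, touches_customer_data=True)
reschedule = WorkAction("reschedule_meeting", owner="me", reversible=True,
                        creates_external_obligation=True, outcome_checkable=True)
summarize = WorkAction("summarize_article", owner="me", reversible=True,
                       outcome_checkable=True)
```

Under this made-up policy the refund needs human approval, the reschedule gets a second review, and the summary can run on its own. The more of these fields a product can fill in truthfully, the less a person has to supervise.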

8:20

These sound like governance questions,

8:22

but they're really product questions.

8:24

The more clearly a system can answer

8:26

those correctly, the more autonomy it

8:28

can support. The less clearly it can

8:30

answer them, the more the human has to

8:32

sit there and supervise it. This is why I

8:34

think describing the agent having the

8:36

power to write as just trusted

8:39

write access, where access is

8:40

the engineering term, is too small a

8:42

way of picturing what we're doing here.

8:44

Trust is not a switch. An agent might be

8:46

trusted to read but not write, draft but

8:48

not send, stage but not deploy,

8:51

recommend but not approve, change a

8:53

sandbox but not production, write in one

8:56

space but not another. All of those

8:58

distinctions depend on semantics. If it

9:00

cannot tell the difference between

9:01

issuing a refund from your chosen

9:03

Shopify shop versus issuing a refund

9:06

from your Stripe, you're going to have

9:07

problems as well. If it cannot tell the

9:10

difference between staging and

9:11

production, which by the way there were

9:13

real production systems deleted as a

9:14

result of exactly that issue, then it

9:16

shouldn't be anywhere near the deploy

9:17

button. So, the real primitive here is

9:20

not the ability of the agent to use the

9:23

computer. It's not even the browser tab

9:25

for web browsing. The real primitive,

9:27

the foundation on which we're building,

9:29

is a semantically meaningful unit of

9:31

work. A refund, a reschedule, a payment

9:34

authorization, a compliance exception, a

9:36

meeting brief, all examples of this,

9:39

right? Those are things that agents need

9:40

to understand as units of work. Human

9:42

software hides them behind buttons and

9:44

forms, but humans have always understood

9:46

them intuitively. Agent-native software

9:48

needs to expose them directly. This, by

9:50

the way, is why coding agents arrived

9:52

first. It is very

9:56

tempting to say that coding agents

9:58

worked first because code is text and

10:00

language models are good at text. That's

10:02

part of it, but it's not the whole

10:03

story. Coding agents worked first

10:05

because software development already has

10:08

unusually rich work semantics. A code

10:10

base is not just a pile of text files.

10:13

It has modules and dependencies and

10:15

tests and type systems and linters and

10:17

package managers and Git history, et

10:19

cetera, right? It has all of these

10:20

things. That means the agent can

10:22

perceive state and act on state and

10:25

observe feedback and revise its actions.
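
The perceive, act, observe, revise loop can be shown with a toy example. Everything here is a stand-in: `run_tests` plays the role of a real test suite, each candidate function represents one revision by the agent, and `agent_loop` is a hypothetical name. The point is only that the environment, not a person, says whether the work is right.

```python
from typing import Callable, Iterable, Optional

Candidate = Callable[[int, int], int]


def run_tests(impl: Candidate) -> list[str]:
    """Environment feedback: a list of failure messages (empty means pass)."""
    failures = []
    for a, b, expected in [(1, 2, 3), (0, 0, 0), (-1, 1, 0)]:
        got = impl(a, b)
        if got != expected:
            failures.append(f"add({a}, {b}) returned {got}, expected {expected}")
    return failures


def agent_loop(candidates: Iterable[Candidate]) -> Optional[Candidate]:
    """Try revisions until the tests pass; nobody is asked 'is this right?'."""
    for impl in candidates:         # each candidate stands in for one revision
        failures = run_tests(impl)  # observe feedback from the environment
        if not failures:
            return impl             # hand the result back
    return None                     # nothing passed: escalate to a human


# Three hypothetical attempts at implementing addition; only the last is right.
revisions = [lambda a, b: a - b, lambda a, b: a * b, lambda a, b: a + b]
result = agent_loop(revisions)
```

The first two revisions fail the tests and are discarded without human input; the third passes and is returned. A strategy doc or a calendar has no equivalent of `run_tests`, which is the gap the rest of the talk is about.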

10:27

It can inspect the repo. It can edit a

10:29

file. It can run a test. It can see the

10:30

error. It can change the implementation

10:32

and hand the result back. The loop is

10:34

powerful because the work environment

10:37

itself gives the agent semantic

10:40

feedback. The human doesn't have to

10:42

answer every 30 seconds, is this right,

10:44

if the test is failing. The agent can

10:47

just tell it's wrong. In other words,

10:49

when we are talking about coding tests,

10:51

we are not just talking about

10:53

verification artifacts. We're talking

10:55

about semantic meaning artifacts. They

10:58

tell the agent what world it's operating

11:01

in. Most knowledge work is not like that

11:03

yet, right? A strategy doc doesn't have

11:05

tests. A calendar has events, but the

11:08

importance of those events is hidden

11:09

behind politics and priorities and

11:11

relationships. A sales process might

11:13

depend on unwritten account history.

11:15

Often it does. A procurement decision

11:17

may depend on budget timing and risk

11:19

tolerance, which isn't written down.

11:20

Agents can help in those domains. They

11:22

already do, but the environment doesn't

11:23

give them the same density of meaning

11:25

that a code base would give them. This

11:27

is why coding is a wedge, not because

11:29

all work automatically becomes coding or

11:32

every worker becomes a coder. Coding is

11:34

a wedge because code is legible enough

11:37

that an agent can facilitate and

11:39

participate in it without a human being

11:40

a full-time supervisor. So, once you see

11:42

the world that way, products like Codex

11:45

stop looking like coding tools and they

11:46

start looking like labs for where the

11:49

future of work is going to be. And

11:51

that's where the product strategy starts

11:52

to get really interesting. The model is

11:54

still central, right? Better models

11:56

definitely matter, faster models matter,

11:58

reasoning matters, but the model alone

12:01

is not the product and hasn't been for a

12:02

while. Because to do work, a model needs

12:05

to be in a harness that can enable it to

12:08

access and operate against

12:13

units of work. And if you want it to be

12:15

non-coding work, then the non-coding

12:18

work has to be semantically meaningful.

12:22

So, harnesses really matter. Harnesses

12:24

help the agent access the work, but you

12:27

also have to make sure that the work

12:29

that's being accessed is actually done

12:31

in a way that makes sense. The whole

12:33

point of an agent doing the work is to

12:35

reduce the amount of attention I have to

12:37

spend coordinating the work. If I still

12:39

carry all of that harness intuition that

12:41

makes the semantic meaning of work

12:43

legible inside my head, I'm not getting

12:46

very far. If I carry all the meaning of

12:48

my three calendars and the agent can't

12:50

figure it out, we're not getting very

12:51

far. And I want to be blunt here. I know

12:53

that this is a hard problem, but it is

12:55

exactly the hard problems that are

12:57

valuable to solve. This is basically a

13:00

free roadmap if you are a startup.

13:02

Because as a startup, you want to be in

13:04

a position where you can solve problems

13:07

that are not easy for someone else to

13:08

come and grab. And one of those classic

13:11

problem shapes is: make the semantic

13:14

meaning of work legible to agents today.

13:16

Don't just rely on a standard MCP

13:18

interface, try and break it. Understand

13:21

where it's not working. Understand where

13:22

it connects to levers, but the agent

13:24

doesn't know how to reliably drive the

13:26

levers from a prompt because there's

13:27

something else about understanding the

13:29

task that isn't there. I get super

13:30

passionate about this because if we

13:32

don't have agents that understand the

13:33

meaning of work, we get bad calendar

13:35

invites, decks that feel like they're

13:37

off on tone but we can't explain why. We

13:39

get refunds that are issued to customers

13:41

that shouldn't be issued to customers.

13:43

All kinds of things go wrong not because

13:45

the agents can't control the system but

13:47

because the semantic meaning of your

13:49

work is not available. Now, in the

13:51

article that I'm writing for this on

13:52

Substack, I spend more time on getting

13:55

into the commerce stack, understanding

13:57

the difference between discovery and

13:58

checkout and infrastructure, and how our

14:01

agentic commerce strategies are shaped

14:04

by this approach, by how we understand

14:07

semantic meanings of work. Because

14:08

there's a critical semantic layer to

14:10

agentic transactions that's super

14:11

important. But for our purposes today,

14:13

we're going to assume that you realize

14:15

we have to have a semantic meaning to

14:17

transactions, that transactions

14:19

themselves are part of the semantic

14:21

meaning of work, that there's a whole

14:22

strategy there, and we're also going to

14:25

put a pin in that and look at something

14:27

that is more tangible and easy to

14:29

understand in a quick video. And that's

14:31

Perplexity. Perplexity's strategy is

14:33

super interesting here. If you think

14:34

about it from a move to the semantic

14:36

meaning of work perspective, a lot more

14:38

makes sense. This is why Perplexity has

14:41

to move toward products like Comet and

14:44

Computer and Personal Computer long

14:46

term. It needs to get away from search

14:48

per se and closer to the browser, the

14:50

desktop, the files, the apps, the

14:52

workflows where research becomes action.

14:54

That move makes sense. The browser is

14:56

where a huge amount of work already

14:57

happens. Email, documents, dashboards,

14:59

SaaS apps, analytics, shopping,

15:01

calendar, support tools, customer

15:03

systems, internal tools, they all

15:04

collapse into tabs. An agent inside the

15:06

browser can see context between web apps

15:08

and compare pages and take multi-step

15:10

actions, and it just becomes legible

15:12

because it sees your work. And this is

15:15

why browsers and AI are interesting and

15:17

why one of the things that is really

15:19

undecided in 2026 is who is going to

15:21

have the AI browser. If Perplexity

15:24

becomes an AI browser for someone else's

15:27

tools and other tools plug into it, it

15:29

gets durable control here because it

15:31

manages the browser that can see your

15:33

calendar and the calendar system owns

15:35

your recurrence and your attendees and

15:36

your notifications and your meeting

15:37

state. It can see your GitHub. It can

15:39

see whatever you're logging into. But

15:41

the browser war is not just about which

15:43

company gets closest to the user. It's

15:45

about whether the browser can assemble

15:47

cross-domain meaning for you. If

15:49

Perplexity owns the browser in Comet,

15:51

can it build a durable work graph above

15:53

the underlying apps? Can it turn search

15:55

results into structured actions with

15:56

permissions and validation and review?

15:59

Can it remember the user's projects and

16:00

policies in a way that makes work easier

16:02

or does it remain just an operator of

16:04

interfaces? And that is the trap for any

16:06

kind of search native or browser native

16:09

agent. And that is why even though

16:10

browser is a play for Perplexity,

16:13

Perplexity also has to move to the

16:14

computer because if they're not on the

16:16

computer, if they're not handling those

16:18

compute files I talked about close to

16:20

semantic meaning, where it basically has

16:22

an OpenClaw in your computer,

16:23

Perplexity Personal Computer, and it

16:25

touches files. It touches these compute

16:27

primitives I was talking about earlier

16:28

in this video, it still has kind of a

16:30

shaky hold on the semantic meaning of

16:33

work. Basically, there are two big plays

16:35

going on right now to figure out how

16:37

agents will do meaningful work in the

16:39

world. Play one is to start from the

16:42

semantic meaning of work out here in the

16:44

real world where we do work and work

16:46

back to the agents. It's the only play a

16:48

lot of people who aren't hyperscalers

16:50

have. That's why I chose the Perplexity

16:52

example. The other play is the play

16:54

that's available to the hyperscalers.

16:57

And that is to start from the models

16:59

themselves and their ability to

17:00

understand and use code and move out

17:02

through computing primitives to figure

17:05

out how to do work from there. And that

17:06

is why people have made a lot of hay out

17:08

of the fact that Claude and Codex have

17:11

not too many tools, but use those tools

17:13

super, super well.

17:15

They are close to the computer. They use

17:17

the tools that make sense for them,

17:19

they're allowed to compose tools to

17:21

accomplish complex workflows, and their

17:23

ability to understand the semantic

17:25

meaning of code turns out to be a good

17:27

general unlock for a lot of other work.

17:29

But, the thing is, the bridge in between

17:32

those two approaches has some holes in

17:35

it. If you're just coming from the

17:36

computer side, as I've been sharing many

17:38

specific examples of, your computer may

17:41

not fully understand the purpose of the

17:43

work it's doing. Your agent may not

17:45

fully understand the purpose of the work

17:46

it's doing. The calendar example is a

17:48

good one. If it moves the calendar

17:49

invite, does it really realize it's

17:51

inconveniencing two or three other

17:52

people you don't want to mess with?

17:53

Probably not. On the other hand, if

17:55

you're coming from the semantic meaning

17:57

of work, if you're coming from making

17:59

sure that you understand how to bundle

18:00

that together and make it useful, sort

18:02

of like Perplexity is doing, you have to

18:04

think about it and say, "Am I ready to

18:06

make this bridge into the hyperscalers,

18:09

and where do I plug in?" And what

18:11

Perplexity has basically decided to do

18:12

is to say, "We welcome all models. We're

18:14

going to be the shop where you have all

18:16

models, and our focus is going to be

18:18

making these semantic

18:20

units of work very, very legible and

18:22

easy." And that's why Personal Computer

18:24

is full of specific workflows for

18:25

knowledge work, like finance. They're so

18:27

far in on finance. And so, you kind of

18:30

have to pick a lane, one approach or the

18:32

other. And if you're not a hyperscaler,

18:34

the lane's been picked for you because

18:36

you don't have a gigantic model that you

18:38

can use to do code with, one that doesn't

18:40

belong to someone else that you're

18:41

renting. And so, when you think about it

18:43

that way, the world becomes simpler.

18:45

Humans need clear interfaces, agents

18:48

need clear semantics, the best software

18:50

will provide both. It is going to stay

18:53

simple for people while making

18:55

underlying objects and operations really

18:57

legible to agents, and that is going to

19:00

generate software where AI and humans

19:02

can coexist. And that's what

19:04

this video is really about. Software

19:06

that is ready for AI to tell the agent

19:08

what exists, what can be done, what each

19:11

action means, what permission is

19:12

required, how the results should be

19:14

checked, and what happens next. That is

19:16

a way, way higher bar to software than I

19:18

see for most software today. It is the

19:20

future of software in 2026.
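
The "what permission is required" part of that bar connects to the earlier point that trust is not a switch (read but not write, stage but not deploy, sandbox but not production). A minimal sketch of graded trust follows. The `Trust` levels, the `GRANTS` and `REQUIRED` tables, and `is_allowed` are all hypothetical and chosen only to illustrate the idea.

```python
from enum import IntEnum


class Trust(IntEnum):
    NONE = 0
    READ = 1
    DRAFT = 2    # draft, stage, or recommend without committing anything
    COMMIT = 3   # send, deploy, approve, write


# Hypothetical grants: what this agent is trusted to do in each environment.
GRANTS = {"sandbox": Trust.COMMIT, "production": Trust.READ}

# Hypothetical requirements: the trust each action declares it needs.
REQUIRED = {
    "read_config": Trust.READ,
    "stage_release": Trust.DRAFT,
    "deploy": Trust.COMMIT,
}


def is_allowed(action: str, environment: str) -> bool:
    """Allow an action only if the grant for that environment is high enough."""
    granted = GRANTS.get(environment, Trust.NONE)   # unknown environment: no trust
    required = REQUIRED.get(action, Trust.COMMIT)   # unknown action: highest bar
    return granted >= required
```

With these tables, `is_allowed("deploy", "sandbox")` is true while `is_allowed("deploy", "production")` is false. The check only works because the action and the environment are named things the system understands, rather than pixels on a screen.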

19:22

That is your road map if you are not

19:24

doing that today. So, the coming

19:26

platform fight is not going to look like

19:28

one company simply winning AI. It's

19:30

going to look like a negotiation across

19:31

the whole stack. Model companies want

19:33

broad agents that can operate across

19:35

domains. Browser companies want to

19:36

orchestrate work across applications.

19:38

SaaS companies want to preserve

19:40

authority over domain semantics.

19:42

Identity providers want to govern

19:44

authorization. They all have their

19:45

interests, right? The question is going

19:47

to be which layer owns the meaning of

19:51

work? Which layer owns the meaning of

19:53

work that the agent can read? And every

19:55

software company is going to have to

19:56

decide how much semantic access to

19:59

expose and to whom. If you expose too

20:01

little, generic agents will operate

20:03

clumsily through the UI. If you expose

20:05

too much, the product risks becoming

20:06

back-end infrastructure for someone

20:08

else's agentic interface. That is the

20:10

tension that anyone in software is

20:12

facing today. This is actually a great

20:14

tension exemplified by Salesforce 360

20:17

versus how SAP is handling agents. SAP

20:19

is locking off agents right now. They

20:21

don't want agents to use their products.

20:23

Salesforce is going the other way.

20:25

They're leaning into

20:26

agents and saying, "Let agents operate

20:27

across our substrate and grab MCPs and

20:30

grab APIs and we're going to be headless

20:31

from the get-go because we know that's

20:33

the future." I think Salesforce is more

20:35

correct here, especially from their

20:37

perspective as a system of record. They

20:40

want to be a system of record that's

20:41

sticky, and so they want to be legible

20:44

semantically to agents and humans. And I

20:46

think that's a good example. I think SAP

20:48

is not going to last with that approach.

20:49

Like SAP deciding, "Eh, we're going to

20:52

say no, no to agents" is like sticking

20:54

your head in the sand when the tidal

20:56

wave is coming. Pardon my mixed

20:57

metaphors. It's going to be a disaster.

20:59

And under this deeper test for semantic

21:02

meaning, a lot of flashy products start

21:04

to look thin. A A clicking through a

21:06

website is great today. It does do work.

21:09

I'm glad it works, but it's not the end

21:11

result when we think about the kinds of

21:13

work we want to do with agents long-term

21:16

that are durable and repeatable. And

21:18

that's the question I ask every time I

21:20

see a new AI product. Does this give the

21:22

model access or does it give the model a

21:24

meaningful set of levers it can really

21:27

use to drive the product? I love raw

21:29

computer access. I love that we're

21:31

getting the agent closer to file

21:32

primitives, closer to the work. I love

21:34

these MCP services. I love that we're

21:36

talking about access in 2026. I want to

21:38

talk about semantic control and semantic

21:40

meaning. I want to talk about an AI

21:42

understanding the implications of my

21:44

calendar and how messy it is. And that

21:47

is going to require a new set of

21:49

rethought software that is designed to

21:52

be agent readable from the get-go,

21:54

semantically readable to the agent, not

21:56

just technically legible, not just that

21:58

the agent can use edit calendar and move

22:00

the date, but that the agent understands

22:02

the semantic context of this particular

22:04

environment with these people. We don't

22:06

have software for that yet. We need a

22:08

lot of software like that. This is part

22:10

of why I think software isn't dead. And

22:12

it's part of why Perplexity moving

22:14

toward the computer is strategically

22:16

necessary, but maybe not complete.

22:19

Because Perplexity has to move into a

22:22

world where it is able to deliver a lot

22:25

more workflows like the finance workflow

22:27

it's talking about to become truly

22:29

sticky. Because the future is not an AI

22:31

that clicks every button for you. That's

22:33

the bridge we have today. The future is

22:35

software where the button is no longer

22:37

the primitive. The primitive is the

22:39

action behind it. It's described, it's

22:41

permissioned, it's reviewable, it's

22:42

reversible where possible, it's

22:44

composable. So, computer use and tools

22:47

like that give agents hands. MCP gives

22:49

agents hands. Semantic controls tell the

22:52

agent what it's touching. And that is

22:54

the deeper moat. Now, if you want to

22:55

dive deeper on this, I'm going to go

22:57

into memory ownership, enterprise

22:58

permissions, browser strategy, and

23:00

agentic commerce on the Substack. But

23:01

the core lens here is the same one I

23:03

would use for every AI product over the

23:05

next year. Do not ask only whether the

23:07

agent can act. Ask whether the product

23:10

knows what that action means. That is

23:13

your key takeaway. All right, I'll see

23:15

you next time. Cheers.

Interactive Summary

The video argues that for AI agents to become truly useful and reliable in a work environment, we must move beyond merely giving them the ability to use computers (like clicking buttons or navigating tabs) and focus on defining 'semantic work primitives.' These are structured, meaningful representations of work—such as authorizing a payment or rescheduling a meeting—that provide the agent with necessary context regarding intent, permissions, and consequences. While 'computer use' acts as a universal adapter for current software, it is not a long-term moat. Companies that successfully architect their software to be 'agent-readable,' allowing agents to understand the deeper meaning and authority behind their actions rather than just performing surface-level interactions, will define the future of enterprise software.
