
The 3 Levels of Context Engineering


Transcript


Most people treat coding agents like chatbots. They go back and forth until compaction ruins their flow. In this video, I'll show you the three tiers of context engineering and why the third tier will change the way you code with agents forever. I'm Roman. I published a top 3% paper at NeurIPS, the largest AI conference in the world. Now, I'm on a mission to become the best AI coder.

There are three tiers of working with context. 90% of users fall into the first tier. They vibe code their sessions, expressing their intent to the agent, allowing compaction, and continuing as if nothing happened. This causes severe context rot and information loss, leading to catastrophic performance decreases and bug-ridden code. Since most people learned AI through chatbots, this comes naturally, and they don't even know that there is a better way.

The second tier is the intentional developer, where 9% of users fall. They understand that context rot exists and that intentional context engineering is the best way to work. They frequently compact manually, clear context, and curate handoff documents. However, they still suffer from context rot and struggle to get the models up to speed with where they are. They still treat conversations with coding agents as linear.

Then there's the third tier: trajectory engineers. After thousands of hours of trial and error, they have internalized the fact that conversations with LLMs don't have to be linear. They frequently use /rewind to jump back in time with a model, not just to try again, but to keep context lean. They fork sessions and parallelize different trajectories, exploring many at once and picking the best outcome.

But why is trajectory engineering possible with LLMs? Well, LLMs have no internal state. Every single time you hit enter, the model processes your request from scratch, just with your request appended to the conversation array. The chat interface just gives you the illusion of continuity. Many people who know this think of the statelessness of LLMs as a limitation, because you constantly have to reteach the model things. But this is actually a superpower, and it allows you to time travel through conversations with models.

Human memory degrades over time due to its statefulness. Captured notes are just a lossy snapshot of the place you were mentally when you wrote them.
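As a minimal sketch of the statelessness described above: the only "memory" is a list of messages that gets re-sent in full on every turn. The `call_model` function here is a hypothetical placeholder, not any real API.

```python
# Sketch of a stateless chat loop. `call_model` is a hypothetical stand-in
# for an LLM API call; a real implementation would send `messages` over HTTP.
def call_model(messages: list[dict]) -> str:
    return f"(response conditioned on {len(messages)} prior messages)"

messages = []  # the ONLY state: a plain list of turns

def send(user_text: str) -> str:
    # Every turn, the full history is re-sent and processed from scratch.
    messages.append({"role": "user", "content": user_text})
    reply = call_model(messages)
    messages.append({"role": "assistant", "content": reply})
    return reply

send("Refactor the auth module.")
send("Now add tests.")
# Nothing persists inside the model between calls; edit this list and you
# have edited the model's entire "memory" of the conversation.
```

Because the list is the whole state, jumping back in time is nothing more exotic than truncating or swapping the list before the next call.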

However, agent memory can be respawned with 100% fidelity. The state you load implies the trajectory of what the model's response will be. LLMs are incredibly sensitive, meaning that small perturbations in their input space, their context, lead to possibly big trajectory changes in their output space.

So, what does this really look like in practice? Well, trajectory engineers are modern-day time travelers.

In Claude Code, time travel is activated by double-pressing Escape. You can jump back in time to any previous point in a session and explore different trajectories.

I like to think of context as a tree. The tree has a trunk, which lays the foundation for the rest of it. This trunk is analogous to your important and reusable information: maybe context about your repo or the plan you are implementing.

Traditionally, context will grow from this trunk until the tree falls down, a.k.a. the user hits /clear. But why let the tree grow to an unstable point when we can keep trimming it down to the trunk?
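Treating a conversation as a plain message list, the trunk-and-branches idea can be sketched like this. The messages and the `fork` helper are illustrative, not any real tool's API:

```python
import copy

# The trunk: important, reusable context (repo overview, the plan).
trunk = [
    {"role": "user", "content": "Here is my repo structure and the plan..."},
    {"role": "assistant", "content": "Understood. Ready to implement."},
]

def fork(state: list[dict]) -> list[dict]:
    # A branch is just an independent copy of the message list.
    return copy.deepcopy(state)

# Explore several trajectories in parallel from the same trunk.
branch_a = fork(trunk)
branch_a.append({"role": "user", "content": "Try approach A: middleware."})

branch_b = fork(trunk)
branch_b.append({"role": "user", "content": "Try approach B: decorator."})

# Pick the best outcome, then trim back to the trunk: continue from the
# lean state plus one distilled note, so exploratory context never piles up.
best_insight = "Approach B was cleaner."
next_session = fork(trunk)
next_session.append({"role": "user", "content": f"Recon: {best_insight} Implement it."})
```

The trunk stays small and reusable; branches are disposable recon missions whose only lasting output is the note you carry back.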

3:44

Once you become a trajectory engineer,

3:47

you can intentionally create branches,

3:49

explore, and then trim the branches

3:51

back. You can gather more context on the

3:53

main branch, use that as recon, and then

3:55

trim back with slre. This allows you to

3:58

explore perturbations of different

4:00

magnitudes to the input space, letting

4:03

you explore what the model has to say

4:06

and refine your prompts details in order

4:08

to reach or approach the optimal

4:10

trajectory. The optimal output that a

4:13

model could have in your situation. I

4:15

personally use /re multiple times per

4:19

session, whether it be to try an

4:21

approach with a different prompt or just

4:23

trim session context down. Now, let me

4:26

motivate context trimming with an

4:28

example. If you are deep into a coding

4:30

session and your agent implements a bug,

4:33

what you can do is fix the bug by going

4:35

back and forth with the agent. Now that

4:38

the bug is fixed, instead of proceeding

4:40

with your work, you can trim out the

4:43

context related to fixing the bug as it

4:46

is no longer relevant. Time travel back

4:49

to before the bug was spotted. Let the

4:51

model know what happened and how you

4:53

fixed it and proceed from there. Once

4:55

you realize that you can respawn,

4:57

parallelize and jump back to any state

4:59

in a context window, this unlocks the

5:02

true skill ceiling of agentic coding and

5:04

the sky is the limit. Context

5:06

engineering in this way will widen the

5:08

gap between you and other AI users due

5:11

to the power and quality difference of

5:13

experiencing no context rot and

5:15

approaching the optimal trajectory.
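The bug-fix trimming workflow above can be sketched on a plain message list. The checkpoint logic here is illustrative Python, not Claude Code internals; in Claude Code the equivalent move is /rewind:

```python
# Session state just before the bug is noticed.
session = [
    {"role": "user", "content": "Implement the payment endpoint."},
    {"role": "assistant", "content": "Done (introduces a subtle bug)."},
]
checkpoint = len(session)  # remember the point before debugging began

# ...many back-and-forth turns spent diagnosing and fixing the bug...
session += [
    {"role": "user", "content": "There's a rounding bug."},
    {"role": "assistant", "content": "Fixed: use Decimal instead of float."},
]

# Rewind to the checkpoint and inject only the distilled lesson, so the
# debugging noise never re-enters the context window.
session = session[:checkpoint]
session.append({
    "role": "user",
    "content": "Note: a rounding bug was found and fixed by using Decimal. Continue.",
})
```

The model proceeds as if the debugging detour never happened, keeping only the one-line lesson that still matters.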

That's why I'm calling these new-age time travelers trajectory engineers: we aren't just engineering the context or the prompts anymore. We are engineering the output of a black-box system.

If you want to learn how to build your app or transform your business by building with coding agents, then join my free community. It is the number one agentic coding community on Skool. I'll see you in there.

Interactive Summary

This video introduces three tiers of context engineering for coding agents. The first tier, used by 90% of users, involves natural conversation leading to context rot and performance degradation. The second tier, for 9% of users, involves intentional context management like manual compaction and clearing, but still suffers from linear conversation flow. The third tier, 'trajectory engineers' (1%), understands that LLM conversations need not be linear because models are stateless. These users rely on /rewind to jump back in time, fork sessions, and explore different development paths, effectively treating context as a tree that can be pruned and branched. This 'time travel' capability, enabled by LLMs processing requests from scratch with appended context, allows for precise state management and exploration of optimal code trajectories, leading to significantly improved code quality and a wider skill gap between users.
