Why Coding Agents Hallucinate.

Transcript

0:00

If you have ever tried to solve hallucinations in coding agents, then this video is for you. In today's video, I'm going to take you through why coding agents are mathematically designed to hallucinate, and how this information will change your coding workflows forever. And if you've been trying to fix this with a better prompt, bigger specs, or more instructions, you've been making it worse. I'm Roman. I published a top-3% paper at NeurIPS, the largest AI conference in the world. Now I'm on a mission to become the best AI coder.

0:30

Here's how the naive user builds with Claude Code. Claude hallucinates a function that doesn't exist, so you add rules to prevent this in the future. Claude will remember this. You pile rules into your CLAUDE.md, thinking that Claude will learn. After Claude still doesn't clean up its act, you decide to take it to the specs. You draft up a massive spec document telling Claude exactly what to do in every situation, deliberately specifying all of the things it can't or shouldn't do. And after 5 hours, you have a 10,000-line monolith of a plan. Claude surely won't screw this up. Except Claude just gets worse. It begins to ignore instructions, hallucinates more, and leaves you with a bug-ridden codebase.

1:10

Because with all of this added context, what you effectively just did is cause severe context rot and context poisoning. Context rot is a phenomenon where the attention in a language model gets stretched thin, leading to a model that ignores instructions and has lower intelligence, because just like a human's, an LLM's attention is a finite resource. On top of this, with all of these unnecessary lines in your specs, you are boxing in the model. The creator of Claude Code himself said, "Don't box in the model." Coding models perform best when they are given a strict goal (the specification) and a path to achieve that goal (the implementation plan). When the model has nowhere else to go, it bluffs. But there is a better way, and it starts with understanding why LLMs hallucinate. Researchers at OpenAI released a paper that mathematically proves that hallucinations are a default behavior of how large language models are trained. In the paper, there are two patterns in particular that are important to learn. I call them the knowledge gap and the confidence trap.

2:14

The first pattern is what I call the knowledge gap. The model only knows what it was trained on and what it is given in context. If a fact appeared once or never in its training data, the model is forced to make an educated guess. In my experience, this guess looks like an interpolation over the model's training data, meaning it will try to choose the most plausible answer based on what it knows. This is why models are so good at boilerplate code: they have seen it millions of times in their training data. For your novel codebase, however, they are forced to interpolate, and when a model interpolates from the wrong reference points, you get hallucinations. Even worse, you can't tell whether these are well-behaved outputs or hallucinations, because hallucinations are meant to be convincing.

2:58

However, the knowledge gap is a solvable problem. Since the model has knowledge gaps from training, we turn to the other way to teach models things: in-context memory. We must give the model proper information on how we write code, the layout of our codebase, and the docs it needs to make the implementation. However, as we discussed, you can't just give the model everything upfront every time, because this causes context rot. So what does this look like in our codebase? We want to give the model a map, not a novel. Specs should be traversable and bite-sized instead of monolithic, and should tell the model what we want, not exactly how it should do it. We want to map out what is going on in our repository in a way that makes sense to the model, without forcing it to go and read all of the code, which is an impossible task. The model should be able to grab what it needs, when it needs it. Part of our job as agentic engineers is to steer the model and tell it where to look, instead of just letting it run free or giving it too much information up front and boxing it in. I call this LLM-friendly architecture. And outside of your codebase, the real solution to knowledge gaps is to get familiar with context engineering: maximizing signal while minimizing noise. Mastering this principle will decrease hallucination rates drastically. Learn the best ways to trim context during your coding session instead of just defaulting to autocompact.
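As a rough sketch of the "map, not a novel" idea, a root context file can be a single page of pointers that the model follows on demand. Everything below (the directory names, the spec files, the `make check` command) is hypothetical; the point is the shape, not the contents:

```markdown
# CLAUDE.md -- repository map (pointers, not prose)

## Layout
- `api/`   HTTP routes; one file per resource
- `core/`  business logic; no framework imports here
- `specs/` one bite-sized spec per feature (see below)

## Conventions
- Read the relevant `specs/<feature>.md` before touching `core/`.
- Run `make check` after every change.

## Specs
- `specs/auth.md`    login, sessions, token refresh
- `specs/billing.md` invoicing and webhooks
```

Each spec file states what the feature must do, not how to implement it, so the model grabs only the context a given task needs.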

4:21

The second problem posed in the paper is the confidence trap, and it is a very logical and simple reason why models actually hallucinate: even when the model is uncertain, it was rewarded for sounding confident. Think of it like taking a multiple-choice test. You never just leave the bubbles blank; you choose the option that looks the most reasonable. This is exactly the same thing LLMs do: they are maximizing the expected value of their output.
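The multiple-choice analogy can be made concrete with a little expected-value arithmetic. A minimal sketch, assuming a grader that scores 1 for a correct answer and 0 for both wrong answers and abstentions (an illustrative setup, not the paper's exact formalism):

```python
# Why a model trained on accuracy-style rewards learns to bluff.
# Assumed grader: 1 point for a correct answer, 0 points for a
# wrong answer AND for abstaining ("I don't know").

def expected_score(p_correct: float, abstain: bool) -> float:
    """Expected reward for one question under a 1/0 grader."""
    return 0.0 if abstain else p_correct

# A model that is only 25% sure, e.g. four equally plausible
# candidate function names for an API it barely saw in training:
guess = expected_score(0.25, abstain=False)   # expected reward 0.25
honest = expected_score(0.25, abstain=True)   # expected reward 0.0

# Guessing strictly dominates admitting uncertainty, so the
# reward-maximizing policy is a confident bluff.
assert guess > honest
```

Under this kind of grading, abstaining is never optimal at any nonzero confidence, which is exactly the incentive problem the confidence trap describes.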

4:49

Here's the thing about the confidence trap: you cannot solve it with context. The confidence trap is an incentive problem, not a knowledge problem. The model was trained to always give an answer. You can give it perfect context, every file in your repo, complete documentation, and it will still confidently pick one approach when it should have been asking you a question. It will still generate code instead of saying, "I'm not sure this is right."

5:13

So, what do you do? You build verification loops around the model so that when it bluffs, the bluff gets caught. The most reliable verification is deterministic: tests don't hallucinate, a linter doesn't hallucinate, a type checker doesn't hallucinate. But don't count out non-deterministic verification either. Not every problem can be boxed into a test. Parallel sub-agents are fantastic at sweeping code, making sure it matches the specs and does exactly what it is supposed to do. This works because verification and generation should be split into different context windows. The more layers of verification we can add after code is generated, the better. So instead of tackling the problem of hallucinations by preventing them, try to catch them instead.

5:57

The same training objective that makes models good at writing code is the objective that guarantees they'll sometimes write the wrong code confidently. That is not changing. There's no framework, plugin, or architecture that will solve all your problems, and when everyone has access to the same tools, they are no longer a moat. If you ground yourself and stop chasing the next big thing, you will see that coding with AI takes an incredible amount of skill. The biggest reframe for me personally was that all of these quote-unquote AI breakthroughs are just reskins of the same two core patterns: context engineering, for when the model doesn't have the answers, and harness engineering, for when the model inevitably tries to convince you it does. You are the variable at every stage, from planning to implementation to verification; the models are the same. The fact that these models are imperfect and require skill to wield is not a bad thing. The skill is what separates agentic coders from vibe coders, and the returns to those who learn the tools will be monumental. Now is the best time to learn agentic coding from first principles and build the workflows and harnesses that fit you and your apps. The largest agentic coding community on Skool is linked below. It's free. I'll see you in there.

Interactive Summary

This video explains why coding agents like Claude hallucinate and how to mitigate this issue. The speaker, Roman, a researcher with a top publication at NeurIPS, argues that traditional methods like adding more rules or detailed specifications worsen the problem by causing "context rot" and "context poisoning," where the model's attention is diluted and it becomes less intelligent. Instead, understanding the root causes is key: the "knowledge gap" and the "confidence trap."

The knowledge gap occurs because models only know their training data and provided context. When faced with novel information, they interpolate, which can lead to hallucinations if they reference incorrect data. The solution is not to provide more context, but to use "context engineering" to deliver concise, relevant information the model can access when needed, like a "map" rather than a "novel." This involves creating "LLM-friendly architecture" with traversable, bite-sized specifications.

The confidence trap stems from the model being trained to always provide an answer, even when uncertain, to appear confident. This is an incentive problem, not a knowledge one, and cannot be solved by more context. The solution is to implement "verification loops" such as deterministic tests, linters, type checkers, or parallel sub-agents. The key is to split generation and verification into different context windows and to catch hallucinations rather than trying to prevent them entirely.

Ultimately, mastering AI coding requires skill in context engineering and harness engineering, understanding that AI breakthroughs often stem from these two core patterns. The skill of the user is the differentiating factor, and learning agentic coding principles now offers significant advantages.
