The 5 Levels of AI Coding (Why Most of You Won't Make It Past Level 2)

Transcript

0:00

90% of Claude Code was written by Claude

0:02

Code. Codex is releasing features

0:04

entirely written by Codex. And yet most

0:07

developers using AI empirically get

0:10

slower, at least at first. The gap

0:12

between these two facts is where the

0:13

future of software lives. Imagine

0:15

hearing this at work. Code must not be

0:18

written by humans. Code must not be even

0:20

reviewed by humans. Those are the first

0:23

two principles of a real production

0:24

software team called StrongDM and their

0:27

software factory. They're just three

0:30

engineers. No one writes code. No one

0:32

reviews code. The system is a set of AI

0:35

agents orchestrated by markdown

0:37

specification files. The system is

0:39

designed to take a specification, build

0:41

the software, test the software against

0:43

real behavior scenarios, and

0:45

independently ship it. All the humans do

0:48

is write the specs and evaluate the

0:50

outcomes. The machines do absolutely

0:53

everything in between. As I was saying,

0:55

meanwhile, 90% and yes, it's true. Over

0:58

at Anthropic, 90% of Claude Code's

1:00

codebase was written by Claude Code

1:02

itself. Boris Cherny, who leads the

1:04

Claude Code project at Anthropic, hasn't

1:06

personally written code in months. And

1:08

Anthropic's leadership is now estimating

1:10

that functionally 100% the entirety of

1:12

code produced at the company is AI

1:14

generated. And yet at the same time, in

1:17

the same industry, with us here on the

1:19

same planet, a rigorous 2025 randomized

1:22

controlled trial by METR found that

1:24

experienced open-source developers using

1:27

AI tools took 19% longer to complete

1:32

tasks than developers working without

1:34

them. There is a mystery here. They're

1:36

not going faster, they're going slower.
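To make the 19% figure concrete, here is a small arithmetic sketch. It assumes that "19% longer" means completion time is multiplied by 1.19; the 60-minute baseline task is a made-up example, not a number from the study.

```python
# Assumption: "19% longer" means completion time is multiplied by 1.19.
SLOWDOWN = 0.19


def time_with_ai(baseline_minutes: float) -> float:
    """Expected completion time when the same task is done with AI tools."""
    return baseline_minutes * (1 + SLOWDOWN)


def relative_throughput() -> float:
    """Tasks finished per hour with AI, as a fraction of the no-AI rate."""
    return 1 / (1 + SLOWDOWN)


if __name__ == "__main__":
    # Hypothetical 60-minute task, chosen only for illustration.
    print(f"60-minute task with AI: {time_with_ai(60):.1f} minutes")
    print(f"Relative throughput: {relative_throughput():.2f}")
```

Under that reading, an hour-long task takes a little over 71 minutes, and throughput falls to roughly 84% of the no-AI rate. Taking 19% longer is not the same thing as finishing 19% fewer tasks.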

1:38

And here's the part that should really

1:40

unsettle you. Those developers are bad

1:42

at estimation. They believed AI had made

1:45

them 24% faster. They were wrong not

1:48

just about the direction but about the

1:50

magnitude of the change. Three teams are

1:53

running lights-out software factories.

1:56

The rest of the industry is getting

1:57

measurably slower. Just a few teams

1:59

around tech are running truly lights-out

2:02

software factories. The rest of the

2:04

industry tends to get measurably slower

2:06

while convincing themselves and everyone

2:08

around them with press releases that

2:09

they're speeding up. The distance

2:11

between these two realities is the most

2:14

important gap in tech right now and

2:16

almost nobody is talking honestly about

2:19

it and what it takes to cross it. That

2:21

is what this video is about. Dan

2:22

Shapiro, the CEO over at Glowforge and

2:25

the veteran of multiple companies built

2:26

on the boundary between software and

2:28

physical products, just published a

2:30

framework earlier this year in 2026 that

2:32

maps where the industry stands. He calls

2:35

it the five levels of vibe coding. And

2:37

the name is deliberately informal

2:38

because the underlying reality is what

2:40

matters. Level zero is what he calls

2:43

spicy autocomplete. You type the code,

2:45

the AI suggests the next line. You

2:48

accept or reject. This is GitHub Copilot

2:50

in its original format. Just a faster

2:52

tab key. The human is really writing the

2:54

software here. And the AI is just

2:56

reducing the keystrokes and the effort

2:57

your fingers have. Level one is coding

3:00

intern. You hand the AI a discrete,

3:02

well-scoped task. You write the function. You

3:05

build the component. You refactor the

3:06

module. That's the task you give the AI.

3:08

You hand the AI a discrete and

3:10

well-scoped task like write this function or

3:13

build this component or refactor this

3:15

module. You then review as the human

3:17

everything that comes back. The AI

3:19

handles the tasks. The human handles the

3:21

architecture, the judgment and the

3:22

integration. Do you see the pattern

3:24

here? Do you see how the human is

3:25

stepping back more and more through

3:27

these levels? Let's keep going. Level

3:29

two is the junior developer. The AI

3:31

handles multifile changes. It can

3:33

navigate a codebase. It can understand

3:35

dependencies. It can build features that

3:36

span modules. You're reviewing more

3:39

complicated output, but you as a human

3:41

are still reading all of the code.

3:42

Shapiro estimates that 90% of developers

3:45

who say they are AI native are operating

3:48

at this level. And I think from what

3:49

I've seen, he's right. Software

3:51

developers who operate here think

3:53

they're farther along than they are.

3:55

Let's move on. Level three, the

3:57

developer is now the manager. This is

3:59

where the relationship starts to flip.

4:01

This is where it gets interesting.

4:02

You're now not writing code and having

4:04

the AI help. You're simply directing the

4:06

AI and you're reviewing what it

4:08

produces. Your day is whether you want

4:11

to read, whether you want to approve,

4:12

whether you want to reject, but at the

4:14

feature level, at the PR level. The

4:17

model is doing the implementation. The

4:18

model is submitting PRs for your review.

4:21

You have to have the judgment. Almost

4:23

everybody tops out here right now. Most

4:26

developers, Shapiro says, hit that

4:27

ceiling at level three because they are

4:30

struggling with the psychological

4:33

difficulty of letting go of the code.

4:35

But there are more levels. And this is

4:37

where it gets spicy and exciting. Level

4:39

four is the developer as the product

4:41

manager. You write a specification, you

4:44

leave, you come back hours later and

4:46

check whether the tests pass. You're not

4:48

really reading the code anymore. You're

4:50

just evaluating the outcomes. The code

4:52

is a black box. You care whether it

4:54

works, but because you have written your

4:56

evals so completely, you don't have to

4:59

worry too much about how it's written if

5:01

it passes. This requires a level of

5:03

trust both in the system and in your

5:06

ability to write specs. And that quality

5:08

of spec writing almost nobody has

5:10

developed well yet. Level five, the dark

5:13

factory. This is effectively a black box

5:16

that turns specs into software. It is

5:18

where the industry is going. No human

5:20

writes the code. No human even reviews

5:23

the code. The factory runs autonomously

5:26

with the lights off. Specification goes

5:29

in, working software comes out. And you

5:32

know, Shapiro is correct. Almost nobody

5:34

on the planet operates at this level.
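As a compact recap, the framework can be written down as a small data structure. This is only a sketch: the identifier names and the one-line role summaries are paraphrases of the descriptions in this talk, not Shapiro's own wording, and the `human_reads_code` cutoff reflects the talk's claim that people stop reading the code at level four.

```python
from enum import IntEnum


class Level(IntEnum):
    """The five levels (plus level zero) as described in the talk."""
    SPICY_AUTOCOMPLETE = 0
    CODING_INTERN = 1
    JUNIOR_DEVELOPER = 2
    DEVELOPER_AS_MANAGER = 3
    DEVELOPER_AS_PRODUCT_MANAGER = 4
    DARK_FACTORY = 5


# What the human still does at each level (paraphrased summaries).
HUMAN_ROLE = {
    Level.SPICY_AUTOCOMPLETE: "types the code, accepts or rejects suggestions",
    Level.CODING_INTERN: "hands over scoped tasks, reviews everything",
    Level.JUNIOR_DEVELOPER: "reviews multi-file changes, still reads all code",
    Level.DEVELOPER_AS_MANAGER: "directs the AI, reviews at the PR level",
    Level.DEVELOPER_AS_PRODUCT_MANAGER: "writes specs, checks tests and outcomes",
    Level.DARK_FACTORY: "writes specs, evaluates outcomes",
}


def human_reads_code(level: Level) -> bool:
    """True while a person is still expected to read the generated code."""
    return level <= Level.DEVELOPER_AS_MANAGER


if __name__ == "__main__":
    for level in Level:
        marker = "reads code" if human_reads_code(level) else "outcomes only"
        print(f"Level {level.value} ({marker}): {HUMAN_ROLE[level]}")
```

The useful dividing line is the jump from level three to level four, where review shifts from the diff to the outcome.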

5:36

The rest of the industry is mostly

5:38

between level one and level three, and

5:40

most of them are treating AI kind of

5:42

like a junior developer. I like this

5:44

framework because it gives us really

5:46

honest language for a conversation

5:48

that's been drowning in hype. When a

5:50

vendor tells you their tool writes code

5:52

for you, they often mean level one. When

5:55

a startup says they're doing agentic

5:57

software development, they often mean

5:59

level two or three. But when StrongDM

6:01

says their code must not be written by

6:03

humans, they really do mean level five,

6:06

the dark factory, and they actually

6:08

operate there. The gap between marketing

6:11

language and operating reality is

6:13

enormous, and collapsing that gap into

6:16

what is actually going on on the ground

6:18

requires changes that go way beyond

6:21

picking a better AI tool. So many people

6:24

look at this problem and think this is a

6:26

tool problem. It's not a tool problem.

6:28

It's a people problem. So what does

6:31

level five software development actually

6:34

look like? I think StrongDM's software

6:37

factory is the most thoroughly

6:38

documented example of level five in

6:40

production. Simon Willison, one of the

6:42

most careful and credible observers in

6:44

the developer tooling space, calls

6:46

StrongDM's Software Factory, quote, "The

6:49

most ambitious form of AI-assisted

6:51

software development that I've seen

6:53

yet." The details are really worth

6:55

digging into here because they reveal

6:57

what it looks like to run a dark factory

6:59

for software on today's agents. And as

7:02

we have this discussion, I want you to

7:05

keep in mind that for most of us

7:07

listening, we are getting to time

7:09

travel. We are seeing how a bold vision

7:12

for the future can be translated into

7:14

reality with today's agents and today's

7:16

agent harnesses. It is only going to get

7:19

easier as we go into 2026 which is one

7:22

of the reasons I think this is going to

7:25

be a massive center of gravity for

7:27

future agentic software development

7:29

practices. We are all going to level

7:31

five. So what does StrongDM do? The

7:34

team is three people. Justin McCarthy,

7:36

CTO, Jay Taylor, and Navan Chauhan. They've

7:39

been running the factory since July of

7:41

last year, actually. And the inflection

7:44

point they identify is Claude 3.5

7:46

Sonnet, which shipped actually in the

7:49

fall of 2024. That's when long horizon

7:52

agentic coding started compounding

7:54

correctness more than compounding

7:56

errors. Give them credit for thinking

7:58

ahead. Almost no one was thinking in

8:00

terms of dark factories that far back.

8:03

But they found that 3.5 Sonnet could

8:06

sustain coherent work across sessions

8:09

long enough that the output was reliable

8:11

and it wasn't just a flash in the pan.

8:14

It wasn't just demo worthy and so they

8:16

built around it. The factory runs on an

8:18

open-source coding agent called

8:19

Attractor. The repo is just three

8:22

markdown specification files and that's

8:24

it. That's the agent. The specifications

8:27

describe what the software should do.

8:29

The agent reads them. It writes the code

8:31

and it tests it. And here's where it

8:33

gets really interesting and where most

8:35

people's mental model really starts to

8:37

break down. StrongDM doesn't actually

8:40

use traditional software tests. They use

8:42

what they call scenarios. And the

8:44

distinction is important. Tests

8:46

typically live inside the codebase. The

8:48

AI agent can read them, which means the

8:50

AI agent can intentionally or not

8:53

optimize for passing the tests rather

8:55

than building correct software. It's the

8:58

same problem as teaching to the test in

9:00

education. You can get perfect scores

9:02

and shallow understanding. Scenarios are

9:04

different. Scenarios live outside the

9:06

codebase. They're behavioral

9:08

specifications that describe what the

9:10

software should do from an external

9:12

perspective, stored separately so the

9:15

agent cannot see them during

9:16

development. They function as a holdout

9:19

set. The same concept that machine

9:21

learning practitioners use to prevent

9:23

overfitting. The agent builds the

9:25

software and the scenarios evaluate

9:27

whether the software actually works. The

9:30

agent never sees the evaluation

9:32

criteria. It can't game the system. This

9:34

is really a new idea in software

9:36

development and I don't see it

9:38

implemented very frequently yet. But it

9:40

solves a problem that nobody was

9:42

thinking about when all the code was

9:44

written by humans. When humans write

9:46

code, we don't tend to worry about the

9:48

developer gaming their own test suite

9:50

unless incentives are really, really

9:52

skewed at that organization and then you

9:54

have bigger problems. When AI writes the

9:57

code, optimizing for test passage is the

10:00

default behavior unless you deliberately

10:02

architect around it. And it's one of the

10:04

most important differences to really

10:07

understand as you start to think about

10:09

AI as a code builder. StrongDM

10:11

architected around that with external

10:14

scenarios. The other major piece of the

10:16

architecture is what StrongDM calls

10:18

their digital twin universe. Behavioral

10:21

clones of every external service the

10:24

software interacts with. A simulated

10:26

Okta, a simulated Jira, a simulated

10:29

Slack, Google Docs, Google Drive, Google

10:31

Sheets. The AI agents develop against

10:34

these digital twins, which means they

10:36

can run full integration testing

10:38

scenarios without ever touching real

10:41

production systems, real APIs, or real

10:44

data. It's a complete simulated

10:46

environment purpose-built for autonomous

10:48

software development. And the output is

10:50

real. CXDB, their AI context store, has

10:53

16,000 lines of Rust, nine and a half

10:55

thousand lines of Go, and 700 lines of

10:58

TypeScript. It's shipped, it's in

11:00

production, it works, it's real

11:01

software, and it's built by agents end

11:03

to end. And then the metric that tells

11:04

you how seriously they take it. They say

11:07

if you haven't spent $1,000 per human

11:10

engineer, your software factory has room

11:12

for improvement. I think they're right.

11:15

That's not a joke. $1,000 per engineer

11:17

per day enables AI agents to run at a

11:20

volume that makes the cost of compute

11:23

meaningful if you are giving them a

11:25

mission to build software that has real

11:27

scale and real utility in production use

11:30

cases and it's often still cheaper than

11:32

the humans they're replacing. Let's hop

11:34

over and look at what the hyperscalers

11:36

are doing. The self-referential loop has

11:39

taken hold at both Anthropic and

11:41

OpenAI, and it's stranger than the hype

11:43

might make it sound. Codex 5.3 is the

11:46

first frontier AI model that was

11:47

instrumental in creating itself. And

11:50

that's not a metaphor. Earlier builds of

11:51

Codex would analyze training logs,

11:53

would flag failing tests, and might

11:55

suggest fixes to training scripts. But

11:58

this model shipped as a direct product

12:01

of its own predecessors' coding labor.

12:04

OpenAI reported a 25% speed improvement

12:07

and 93% fewer wasted tokens in the

12:11

effort to build Codex 5.3. And those

12:14

improvements came in part from the model

12:16

identifying its own inefficiencies

12:19

during the build process. Isn't that

12:21

wild? Claude Code is doing something

12:22

similar. 90% of the code in Claude Code,

12:25

including the tool itself, was built by

12:27

Claude Code, and that number is rapidly

12:29

converging toward 100%.

12:31

Boris Cherny isn't joking when he talks

12:34

about not writing code in the last few

12:35

months. He's simply saying his role has

12:37

shifted to specification, to direction,

12:40

to judgment. Anthropic is estimating all

12:43

of their company moving to entirely AI

12:45

generated code about now. Everyone at

12:48

Anthropic is architecting and the

12:51

machines are implementing. And the

12:52

downstream numbers tell the same story.

12:55

When I made a video on Cowork and

12:57

talked about how it was written in 10

12:59

days by four engineers, what I want you

13:02

to remember is it wasn't just four

13:04

engineers hyper-typing so that they could

13:06

get that out super fast and write every

13:08

line by hand. No, no, no. They were

13:11

directing machines to build the code for

13:14

Cowork. And that's why it was so fast.

13:16

4% of public commits on GitHub are now

13:19

directly authored by Claude Code, a

13:21

number that Anthropic thinks will exceed

13:23

20% by the end of this year. I think

13:25

they're probably right. Claude Code by

13:27

itself has hit a billion dollar run rate

13:30

just 6 months since launch. This is all

13:33

real today in February of 2026. The

13:36

tools are building themselves. They're

13:38

improving themselves. They're

13:40

enabling us to go faster at improving

13:42

themselves and that means the next

13:44

generation is going to be faster and

13:46

better than it would have been otherwise

13:47

and we're going to keep compounding. The

13:49

feedback loop on AI has closed and the

13:53

question is not whether we're going to

13:55

start using AI to improve AI. The

13:57

question is how fast that loop is going

13:59

to accelerate and what it means for the

14:02

40 or 50 million of us around the world

14:04

who currently build software for a

14:05

living. This is true for vendors as much

14:08

as it's true for software developers.

14:10

And I don't think we talk about that

14:11

enough because the gap between what's

14:13

possible at the frontier in February of

14:15

2026 and what tends to happen in

14:18

practice and what vendors want to sell

14:20

has never been wider. That METR study, a

14:23

randomized controlled trial, by the way,

14:24

not a survey, found that open source

14:27

developers using AI coding tools

14:29

completed their task 19% slower. We

14:32

talked about that, right? The

14:33

researchers controlled for task

14:34

difficulty. They controlled for

14:36

developer experience. They controlled

14:38

even for tool familiarity and none of it

14:40

mattered. AI made even experienced

14:42

developers slower. Why? In a world where

14:45

Cowork can ship that fast. Why? Because

14:48

the workflow disruption outweighed the

14:50

generation speed. Developers spent time

14:53

evaluating AI suggestions, correcting

14:56

almost right code, context switching

14:58

between their own mental model and the

15:00

model's output, and debugging really

15:02

subtle errors introduced by generated

15:04

code that looked correct but wasn't.

15:06

46% of developers in broader surveys say

15:09

they don't fully trust AI generated

15:11

code. These guys aren't Luddites, right?

15:13

These are experienced engineers running

15:15

into a consistent problem. The AI is

15:18

fast, but it struggles with the

15:19

reliability to trust without what they

15:22

view as vital human review. And this

15:25

irony is the J curve that adoption

15:28

researchers keep identifying. When you

15:30

bolt an AI coding assistant onto an

15:33

existing workflow, productivity dips

15:36

before it gets better. It goes down like

15:38

the bottom of a J. Sometimes for a

15:40

while, sometimes for months. And the dip

15:42

happens because the tool changes the

15:44

workflow, but the workflow has not been

15:46

redesigned around the tool explicitly.

15:49

And so you're kind of running a new

15:51

engine on old transmission. The gears

15:54

are going to grind. Most organizations

15:55

are sitting in the bottom of that J

15:57

curve right now. And many of them are

15:59

interpreting the dip as evidence that AI

16:02

tools don't work, that the vendors did

16:04

not tell them the truth, and that the

16:06

evidence that their workflows haven't

16:08

adapted is really evidence that AI is

16:11

hype and not real. I think GitHub

16:13

Copilot might be the clearest

16:15

illustration of this. It has 20 million

16:17

users, 42% market share among AI coding

16:20

tools, apparently. And lab studies

16:22

show 55% faster code completion on

16:25

isolated tasks. I'm sure that makes the

16:28

people driving GitHub Copilot happy in

16:30

their slide decks. But in production,

16:32

the story is much more complicated.

16:35

There are larger pull requests. There

16:36

are higher review costs. There's more

16:38

security vulnerabilities introduced by

16:40

generated code. And developers are

16:43

wrestling with how to do it well. One

16:44

senior engineer put it really sharply.

16:46

Copilot makes writing code cheaper but

16:49

owning it more expensive. And that is

16:51

actually a very common sentiment I've

16:52

heard across a lot of engineers in the

16:54

industry, not just for Copilot but for

16:56

AI generated code in general. The

16:58

organizations that are seeing

17:00

significant, call it 25 to 30% or more,

17:02

productivity gains with AI are not the

17:05

ones that just installed Copilot, had a

17:08

one-day seminar and called it done.

17:10

They're the ones that thought carefully,

17:12

went back to the whiteboard and

17:14

redesigned their entire development

17:16

workflow around AI capabilities.

17:19

changing how they write their specs,

17:20

changing how they review their code,

17:22

changing what they expect from junior

17:24

versus senior engineers, changing their

17:26

CI/CD pipelines to catch the new

17:28

category of errors that AI generated

17:30

code introduces. End-to-end process

17:33

transformation. It's not about tool

17:35

adoption. And end-to-end transformation

17:37

is hard. Sometimes it's politically

17:40

contentious. It's expensive. It's slow

17:42

and most companies don't have the

17:44

stomach for it. Which is why most

17:46

companies are stuck at the bottom of the

17:48

J curve. Which is why the gap between

17:50

frontier teams and everyone else is not

17:53

just widening, it's accelerating

17:55

rapidly. Because those teams on the edge

17:57

that are running dark factories, they

17:59

are positioned to gain the most. As

18:01

tools like Opus 4.6 and Codex 5.3

18:05

enable widespread agentic powers for

18:08

every software engineer on the planet.

18:10

95% of those software engineers don't

18:12

know what to do with that. It's the ones

18:14

that are actually operating at level

18:15

four, level five that truly get the

18:18

multiplicative value of these tools. So

18:20

if this is a politically contentious

18:22

problem, if this is not just a tool

18:24

problem but a people problem, we need to

18:26

look at the nature of our software

18:29

organizations. Most software

18:31

organizations were designed to

18:33

facilitate people building software.

18:36

Every process, every ceremony, every

18:38

role. They exist because humans building

18:41

software in teams need coordination

18:44

structures. Stand-up meetings exist

18:46

because developers working on the same

18:47

codebase, they got to synchronize every

18:50

single day. Sprint planning exists

18:52

because humans can only hold a certain

18:54

number of tasks in working memory and

18:56

then they need a regular cadence to

18:58

reprioritize. Code review exists because

19:00

humans make mistakes that other humans

19:02

can catch. QA teams exist because the

19:05

people who build software, they can't

19:07

evaluate it objectively. You get the

19:09

idea. Every one of these structures is a

19:12

response to a human limitation. And when

19:14

the human is no longer the one writing

19:16

the code, the structures, they're not

19:19

optional, they're friction. So what does

19:22

sprint planning look like when the

19:24

implementation happens in hours, not

19:26

weeks? What does code review look like

19:28

when no human wrote the code and no

19:31

human can really review the diff that AI

19:34

produced in 20 minutes because it's

19:35

going to produce another one in 20 more

19:37

minutes. So what does a QA team do when

19:39

the AI already tested against scenarios

19:42

it was never shown? StrongDM's

19:43

three-person team doesn't have sprints.

19:46

They don't have standups. They don't

19:48

have a Jira board. They write specs and

19:50

they evaluate outcomes. That is it.
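The "write specs, evaluate outcomes" loop with scenarios the agent never sees can be sketched in a few lines. This is a toy illustration of the holdout idea, not StrongDM's actual harness: `agent_build`, the discount spec, and the three scenarios are all hypothetical, and the agent is replaced by a hand-written stand-in so the example runs.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

PriceFn = Callable[[int, int], int]


@dataclass(frozen=True)
class Scenario:
    """One externally stored behavioral check (hypothetical format)."""
    name: str
    args: tuple[int, int]
    expected: int


def agent_build(spec: str) -> PriceFn:
    """Stand-in for the coding agent. It receives the spec text and nothing else."""

    def discounted_price(price_cents: int, percent: int) -> int:
        # Integer math; the result is floored at zero as the spec requires.
        return max(0, price_cents - price_cents * percent // 100)

    return discounted_price


# Holdout set: kept outside the agent's workspace and never passed to agent_build.
HOLDOUT_SCENARIOS = [
    Scenario("ten percent off", (1000, 10), 900),
    Scenario("discount over 100 percent floors at zero", (1000, 150), 0),
    Scenario("free item stays free", (0, 50), 0),
]


def evaluate(build: PriceFn, scenarios: list[Scenario]) -> dict[str, bool]:
    """Run every holdout scenario against the finished build."""
    return {s.name: build(*s.args) == s.expected for s in scenarios}


if __name__ == "__main__":
    spec = "Apply a percentage discount to a price in cents; never go below zero."
    build = agent_build(spec)
    for name, passed in evaluate(build, HOLDOUT_SCENARIOS).items():
        print(("PASS" if passed else "FAIL"), name)
```

The property that matters is structural: `agent_build` takes only the spec, so nothing in the build step can be tuned to the scenarios that later judge it.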

19:53

The entire coordination layer that

19:55

constitutes the operating system of a

19:57

modern software organization. The layer

19:59

that most managers spend 60% of their

20:02

time maintaining is just deleted. It

20:05

does not exist. Not because it was

20:07

eliminated as a cost-saving measure, but

20:09

because it no longer serves a purpose.

20:12

This is the structural shift that's

20:13

harder to see than the tech shift, and

20:16

it might matter more. The question is

20:18

becoming what happens to the

20:19

organizational structures that were

20:21

built for a world where humans write

20:24

code? What happens to the engineering

20:26

manager whose primary value is

20:28

coordination? What happens to the scrum

20:31

master, the release manager, the

20:32

technical program manager whose job is

20:34

to make sure a dozen teams ship on time?

20:38

Look, those roles don't disappear

20:39

overnight, but the center of gravity is

20:42

shifting. The engineering manager's

20:44

value is moving from coordinate the team

20:48

building the feature to define the

20:50

specification clearly enough that agents

20:52

build the feature. The program manager's

20:54

value is moving from track dependencies

20:57

between human teams to architect the

20:59

pipeline of specs that flow through the

21:01

factory. The skills that matter are

21:03

shifting very rapidly from coordination

21:06

to articulation. From making sure people

21:08

are rowing in the same direction to

21:10

making sure the direction is described

21:12

precisely enough that machines can go do

21:14

it. And oh, by the way, for engineering

21:16

managers, there's an extra challenge.

21:18

How do you coach your engineers to do

21:20

the same thing? It's a people challenge.

21:22

If you think this is a trivial shift,

21:24

you have never tried to write a

21:26

specification detailed enough for an AI

21:28

agent to implement it correctly without

21:30

human intervention. And you've certainly

21:32

never sat down and tried to coach an

21:34

engineer to do the same. It is a

21:35

different skill. It requires the kind of

21:38

rigorous systems thinking that most

21:40

organizations have never needed from

21:42

most of their people because the humans

21:44

on the other end of the spec could fill

21:45

in the gaps with judgment, with context,

21:48

with a slack message that says, "Did you

21:49

mean X or Y?" The machines don't have

21:52

that layer of human context. They build

21:54

what you described. If what you

21:56

described was ambiguous, you're going to

21:58

get software that fills in the gaps with

22:00

software guesses, not customer-centric

22:02

guesses. The bottleneck has moved from

22:04

implementation speed to spec quality.

22:07

And spec quality is a function of how

22:10

deeply you understand the system, your

22:12

customer, and your problem. That kind of

22:15

understanding has always been the

22:17

scarcest resource in software

22:19

engineering. The dark factory doesn't

22:20

reduce the demand for that. It just

22:22

makes the demand an absolute law. It

22:25

becomes the only thing that matters.

22:28

Now, let's be honest. Everything that I

22:30

have just talked about assumes you're

22:32

building from scratch. Most of the

22:34

software economy is not built from

22:36

scratch. The vast majority of enterprise

22:39

software is brownfield. It's existing

22:41

systems. It's accumulated over years,

22:43

over decades. It's running in

22:45

production, serving real users, carrying

22:47

real revenue. CRUD applications that

22:50

process business transactions. Monoliths

22:52

that have grown organically through 15

22:54

years of feature additions. CI/CD

22:56

pipelines tuned to the quirks of a

22:58

specific codebase and a specific team's

23:00

workflow. Config management that exists

23:02

in the heads of the three people who've

23:04

been at the company long enough to

23:05

remember why that one environment

23:07

variable is set to that one value. You

23:09

know who you are. You cannot dark

23:11

factory your way through a legacy

23:13

system. You cannot just pretend that you

23:15

can bolt that on. It doesn't work that

23:17

way. The specification for that does not

23:19

exist. The tests, if there are any, cover

23:22

30% of your existing codebase, and the

23:24

other 70% runs on institutional

23:26

knowledge and tribal lore and someone

23:29

who shows up once a week in a polo shirt

23:31

and knows where all the skeletons are

23:33

buried in the code. The system is the

23:35

specification. It's the only complete

23:38

description of what the software does

23:40

because no one ever wrote down the

23:42

thousand implicit decisions that

23:44

accumulated over a decade or more of

23:47

patches, of hot fixes, of temporary

23:49

workarounds that of course became

23:51

permanent. This is the truth about the

23:54

interstitial states that lie along this

23:57

continuum toward more autonomous

23:59

software development. For most

24:01

organizations, the path is not to start

24:04

with deploy an agent that writes code.

24:06

It starts with let's develop a

24:08

specification for what your real

24:11

existing software really actually does.
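One common way to start that work is a characterization test: record what the running code does today and treat the recording as a first-draft behavioral spec. The sketch below assumes a made-up legacy function with an undocumented regional surcharge. Every name and number in it is hypothetical.

```python
from __future__ import annotations

from typing import Callable

FeeFn = Callable[[str, int], int]
Case = tuple[str, int]


def legacy_shipping_fee(country: str, subtotal_cents: int) -> int:
    """Hypothetical legacy function with an undocumented edge case."""
    if subtotal_cents >= 5000:
        return 0
    if country == "CA":  # nobody wrote down why this branch exists
        return 1200
    return 800


def record_behavior(func: FeeFn, cases: list[Case]) -> dict[Case, int]:
    """Capture current outputs; the recording becomes the draft spec."""
    return {case: func(*case) for case in cases}


def mismatches(func: FeeFn, spec: dict[Case, int]) -> list[Case]:
    """Return every recorded case where a new implementation disagrees."""
    return [case for case, expected in spec.items() if func(*case) != expected]


def rewritten_shipping_fee(country: str, subtotal_cents: int) -> int:
    """A plausible rewrite that silently drops the regional surcharge."""
    return 0 if subtotal_cents >= 5000 else 800


CASES: list[Case] = [("US", 1000), ("US", 5000), ("CA", 1000), ("CA", 5000)]
SPEC = record_behavior(legacy_shipping_fee, CASES)

if __name__ == "__main__":
    print("Recorded spec:", SPEC)
    print("Rewrite disagrees on:", mismatches(rewritten_shipping_fee, SPEC))
```

The recording only covers the inputs someone thought to try, so choosing the cases still depends on the people who know where the odd branches are.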

24:14

And that specification work that reverse

24:17

engineering of the implicit knowledge

24:19

embedded in a running system is very

24:22

difficult and it's deeply human work. It

24:25

requires the engineer who knows why the

24:27

billing module has the one edge case for

24:29

Canadian customers. It requires the

24:31

architect who remembers which

24:34

microservice it was that was carved out of the

24:36

monolith under duress during the 2021

24:38

outage and we've always maintained it

24:39

ever since. It requires the product

24:41

person who can explain what the software

24:44

actually does for real users versus what

24:46

the PRD says it does. Domain expertise,

24:49

ruthless honesty, customer

24:51

understanding, systems thinking. Exactly

24:54

the human capabilities that matter even

24:57
