Building Claude Code with Boris Cherny

Watch on YouTube

Now Playing

Transcript

3022 segments

0:00

You were the first ever TypeScript book

0:01

with O'Reilly.

0:02

>> Yeah, I found that book translated in

0:04

Japanese in this little town in Japan.

0:05

That was just the coolest moment. And

0:07

then I realized I don't remember

0:08

TypeScript at all. Now we're at the

0:09

point where Quad Code writes, I think

0:10

something like 80% of the code had

0:12

Enthropic on average. I wrote maybe 10

0:14

20 p requests every day. Opus 4.5 and

0:16

Quad Code wrote 100% of every single

0:18

one. I didn't edit a single line

0:20

manually.

0:20

>> Andre Carpet posted that he's never felt

0:22

as much behind as a programmer as he is

0:24

now.

0:25

>> This is something I really struggle

0:26

with. The model is improving so quickly

0:29

that the ideas that worked with the old

0:32

model might not work with the new model.

0:33

One metaphor I have for this moment in

0:35

time is the printing press in the 1400s

0:38

because there was a group of scribes

0:39

that knew how to write.

0:40

>> Some of the kings were illiterate who

0:42

are employing the scribes.

0:44

>> And if you think about what happened to

0:45

the scribes, they ceased to become

0:46

scribes, but now there's a category of

0:48

writers and authors. These people now

0:50

exist. And the reason they exist is

0:52

because the market for literature just

0:53

expanded a ton.

0:58

What happens when you join one of the

0:59

top AI labs in the world and your first

1:02

poll request gets rejected? Not because

1:04

the code was bad, but because you wrote

1:06

it by hand. This is exactly what

1:08

happened to Boris Churnney when he

1:10

joined Antrophic. Boris is the creator

1:12

and engineering lead behind Claude code.

1:14

Before joining Androphic, he spent 7

1:16

years at Meta where he led code quality

1:18

across Instagram, Facebook, WhatsApp,

1:19

and Messenger, and was one of the most

1:21

prolific code authors and code reviewers

1:23

at the company. In today's episode, we

1:25

cover how Cloud Code went from a side

1:27

project to one of the fastest growing

1:29

developer tools and the internal debate

1:31

at Entrophic whether to release it at

1:32

all. Boris's daily workflow of shipping

1:34

20 30 poll requests a day with zero

1:37

handwritten code and how code review

1:38

works when AI writes everything. Why

1:40

Boris believes we're living through a

1:42

time as transformative as a printing

1:43

press and which engineering skills

1:45

matter more now and which ones do not.

1:48

If you want to understand how one of the

1:49

people closest to AI coding agents

1:51

actually builds software today and what

1:53

that means for the rest of us engineers,

1:55

this episode is for you. This episode is

1:57

presented by Statsig, the unified

1:59

platform for flags, analytics,

2:00

experiments, and more. Check out the

2:02

show notes to learn more about them and

2:04

our other season sponsors, Sonar and

2:06

Work OS. How did you get into tech,

2:09

software engineering, and and coding in

2:11

general?

2:12

>> It starts a while back. I think there

2:14

was kind of like two parallel paths that

2:16

crossed. So, when I was maybe 13 or

2:18

something like this, I started selling

2:20

my old Pokemon cards on eBay. And I

2:24

realized that on on eBay, you can

2:26

actually like write HTML. And I was

2:28

looking at other people's Pokemon card

2:30

listings and I realized like some of

2:31

them have like big colors and fonts and

2:33

stuff like this. And then I discovered

2:35

the blink tag and I named Blink Tag.

2:39

>> And if I put the blink tag on it, I

2:41

could sell my card, you know, for like

2:42

99 cents instead of 49 cents or

2:44

whatever. So I kind of learned about

2:46

HTML this way. Then I got an HTML book

2:48

and kind of learned about HTML. And then

2:50

uh the second thing was this was also I

2:53

think sometime in middle school. We had

2:55

these old TI83 uh graphing calculators

2:58

and we use them for math. And what I

3:01

realized is I can get a better answer on

3:04

the math test if I just program the

3:05

answers to the math test into my

3:07

calculator. And so I wrote these little

3:08

programs to just program the answers and

3:10

then the test got harder. first then I

3:12

had to program solvers instead of the

3:13

actual questions cuz I didn't know what

3:15

what you know the coefficients and stuff

3:16

would be ahead of time and then the math

3:19

got more advanced like the next year and

3:21

so I had to drop down from basic to

3:23

assembly to just make the program run a

3:25

little bit faster.

3:26

>> Oh wow. So like in high school you

3:28

dropped down to assembly.

3:29

>> I think this is like middle school or

3:30

high school maybe like 8th or 9th grade

3:32

or something like this. Then then the

3:34

thing I realized is uh everyone in my

3:36

class was starting to realize that I had

3:37

the solver and they got kind of jealous

3:39

and so I bought this little serial

3:40

cable. so I can give it to them too. And

3:43

then the next math test, everyone on the

3:45

class just got A's. And the teacher was

3:46

like, what's going on? And then

3:48

eventually she realized it. She was

3:49

like, okay, you get away with it once

3:51

and and uh knock it off. But for me, it

3:53

it was very practical. So, you know, in

3:56

school I studied economics. Um I

3:58

actually dropped out to to startups and

4:01

I never thought that coding would be a

4:03

career at all. It was always very

4:05

practical to me. Coding is a means to

4:08

build things and to to make useful

4:10

things. this startup. Um, the first one

4:13

was I think it's like my friends and I

4:15

were trying to get weed

4:18

and so we started this like weed review

4:19

startup. We made like a website. We

4:21

called kind of different uh dispensaries

4:23

I I think and then we just tried to get

4:26

kind of like weed samples so we could

4:27

like review it for them. And it actually

4:30

kind of blew up. Um, and then I actually

4:32

got more interested in uh at the time no

4:35

one was like testing this stuff and so I

4:38

got into kind of the like chemical

4:40

testing kind of chemical analysis and

4:42

then after this I kind of did a bunch of

4:44

other startups and then I joined YC

4:46

actually pretty early uh and I was the

4:48

first hire of uh this YC startup up in

4:51

up in Palo Alto after.

4:52

>> How did you decide to go go to one

4:54

startup after the other?

4:55

>> Kind of vibes vibes I'd say cuz you know

4:57

you know like you know startups it's

4:59

it's never a linear path. You always

5:00

kind of pivot pivot pivot. You have to

5:02

figure out what the market wants and

5:03

what users want. And it's never the

5:05

thing that you think. You you always try

5:07

a thing, but the the idea is always a

5:10

hypothesis and then almost always you

5:11

have to pivot once, twice, three times.

5:14

You know, at at this uh at this medical

5:16

software company, this is called Agile

5:17

Diagnosis. This was kind of an early YC

5:20

company. This was back in maybe 2011,

5:23

2012, something like that. It was

5:25

medical software for doctors. And the

5:27

idea was there's these like clinical

5:29

decision protocols. They vary a lot

5:30

hospital to hospital. And our idea was

5:33

there was one hospital in Chicago that

5:34

had a really great protocol specifically

5:36

for cardiac symptoms. And so we're like,

5:38

wouldn't outcomes be great if every

5:41

hospital in the US would use the same

5:43

protocol? And so we tried to standardize

5:45

it. And we made this like decision tree

5:46

software for doctors to use. And I

5:50

wrote, you know, some of the software.

5:51

The team was like it it was it was just

5:52

a few of us. It was a pretty small team.

5:55

And I wrote the software. It was in a

5:56

web browser. And I remember this was

5:59

back in the like the Internet Explorer 6

6:01

days. that's what hospitals were using

6:03

>> and I wrote this like SVG renderer uh

6:06

because it was this visual decision tree

6:08

and we launched it and then we had a DAU

6:12

chart and the DUS were flat and couldn't

6:14

figure it out and we were piloting it

6:16

with a few hospitals at the time and at

6:18

the time we were based in PaloAlto we

6:19

were piloting it with uh you know a few

6:21

hospitals including UCSF and I rode a

6:23

motorcycle at the time so I rode my

6:25

motorcycle up to you know UCSF and I

6:27

shadowed doctors for a couple days just

6:29

to see how how do they actually use

6:32

And I realized that actually doctors

6:35

don't have time to sit down and use a

6:37

computer because you're seeing a patient

6:39

>> then you have maybe 5 minutes until the

6:42

next patient and in those 5 minutes you

6:44

have to walk down the hall you have to

6:45

go to the computer station you have to

6:47

open up this totally legacy computer. By

6:50

the time it boots up that's like 3

6:51

minutes. Then you open up Inner Explorer

6:54

6 that takes like 30 seconds. Then you

6:56

have to open up this like app that we

6:57

built. You have to sign in and your 5

6:59

minutes are up. you don't even have time

7:00

to use it. And so we rewrote everything

7:02

to run on Android and they still weren't

7:04

using it. And the thing we realized is

7:06

doctors are walking around with a bunch

7:08

of residents behind them. In this kind

7:10

of situation, it's like a social

7:11

situation, right? Like the thing that

7:12

matters is they're seen as an authority.

7:16

They don't want to be seen on their

7:17

phones. And then we pivoted again. So at

7:21

that point, we were like, okay, so maybe

7:22

the doctor isn't the target user.

7:23

Actually, we wanted to be used by maybe

7:25

nurses or X-ray technicians or something

7:26

like this. At that point, I left because

7:29

I was like, "This is actually pretty far

7:30

off from kind of what I wanted to do."

7:32

This is like the most fun thing for me

7:34

is finding this this product market fit

7:36

because it's always surprising. You

7:38

can't have one big idea because the idea

7:41

is probably going to be wrong. So, you

7:42

kind of form hypothesis, you you follow

7:44

it down and and you see what's right.

7:47

Also, I find it so interesting how

7:49

you're telling us this story because I

7:51

feel behind a lot of startup success

7:54

stories, we hear the success story. We

7:55

hear the path of how it went. But first

7:57

of all, a lot of startups are like this.

7:59

And second of all, what struck me is you

8:00

you were hired as a software engineer,

8:02

right? And this was back before product

8:05

engineers or anything was a thing which

8:06

we're now talking about. But you just

8:08

like you rode your motorbike and you

8:11

went there and you shadowed the people

8:12

and you understood how they're using it,

8:15

why they're not using it. getting

8:17

getting ideas. I I feel, you know, this

8:19

this is what makes a great software

8:22

engineer back then and and even today,

8:25

right? You you weren't doesn't seem to

8:27

me that you were focused on a

8:28

technology. You were focused on the

8:29

outcome, though.

8:30

>> Yeah. I mean, look, there there's

8:32

different kinds of engineers and there's

8:33

different ways to do it. And you know, I

8:35

even even on our team right now, I look

8:37

at an engineer like Jared Sumar and he's

8:39

just incredible technical mind. He

8:42

understands systems better than anyone

8:43

I've met. And you know you need you need

8:46

people like this. You need people with

8:48

this kind of depth. For me engineering

8:50

has always been a practical thing. Uh

8:53

and you know for me I've always been a

8:55

generalist and like it doesn't matter if

8:56

I'm doing you know like design or you

8:59

know if I'm doing engineering or user

9:02

research or whatever. The investment

9:04

thesis for AI and software engineering

9:05

is straightforward. As AI writes more

9:07

code more code needs to be verified. But

9:10

there's a catch. AI generated code is on

9:12

average harder to verify than human

9:14

written code. This is why there's Sonar,

9:16

the makers of Sonar Cube. As a critical

9:18

verification layer for the AI enabled

9:21

world, Sonar ensures that speed and

9:23

volume with AI does not compromise your

9:25

codebase. Sonar's competitive position

9:27

is built on 17 years of specialized

9:30

expertise that no foundational model can

9:32

replicate. We're talking about deep

9:34

analysis engines like symbolic execution

9:36

and cross- repository data flow tracking

9:39

that simulate how code actually behaves,

9:41

not just what it says. To bridge the

9:43

divide between AI productivity and code

9:45

quality, Sonar has released the Sonar

9:47

Cube MCP server. This tool acts as a

9:50

universal translator between AI

9:52

applications and the Sonar Cube

9:54

platform. By using the modal context

9:56

protocol, it gives AI tools like cloud

9:58

code, GitHub copilot, and cursor direct

10:01

access to sonar cubes analysis

10:03

capabilities. Instead of context

10:04

switching, your AI agent becomes a

10:06

full-fledged code review and quality

10:08

assurance copilot capable of analyzing

10:10

code snips for issues, filtering bugs by

10:12

severity, and even checking your

10:14

project's quality gate status before you

10:16

ever commit code. Whether you're working

10:18

with coding assistants or scaling up

10:20

with full agogentic workflows, Sonar

10:22

provides the automated verification that

10:24

75% of the Fortune 100 rely on. It's

10:27

about giving your developers the freedom

10:28

to innovate without the fear of breaking

10:30

the code base. Head to

10:31

sonarsource.com/pragmatic

10:33

to learn more about how Sonar enables

10:35

the confidence to develop at the speed

10:37

of AI. With this, let's get back to

10:39

Boris's career and what he learned

10:41

working at startups. My first job I ever

10:44

had, I was like, I think I was 16 and I

10:47

just wanted to buy an electric guitar.

10:49

And so what I did was I I started uh I

10:51

just started freelancing. And so I was

10:53

like, "Okay, I guess I'll make

10:54

websites." And I think Fiverr was not a

10:55

thing back then. So there were some

10:56

other freelancing websites. So I just

10:57

started like I put up a website. I

10:59

started bidding on stuff. And my first

11:01

paycheck, I just spent the entire thing

11:02

on an electric guitar. But it but it was

11:04

very practical, right? Right? Cuz it's

11:06

like when you're in this kind of setup,

11:07

you have to you have to do the

11:08

engineering, you have to do kind of the

11:09

accounting, you have to do the the

11:11

design, you have to talk to customers.

11:14

It's just always been like that for me.

11:15

After a couple of these startups, you

11:17

ended up at Facebook now now called

11:20

Meta. And there you spent seven years

11:22

there. Can you just talk us through what

11:24

you've worked there, what you've learned

11:26

there? You've also had a very remarkable

11:29

career growth in terms of four

11:30

promotions over over over seven years.

11:33

And what did you take away from that

11:35

that experience?

11:36

>> Yeah, so I started on Facebook groups.

11:39

That was the first time I worked on uh

11:41

Vlad Klesnikov uh hired me. I think I

11:43

think he's actually still at Facebook.

11:45

Um I think he's on some other team now.

11:48

And it was cool actually. There there's

11:50

a big group of people that I worked with

11:51

that were these kind of early JavaScript

11:53

people too. And you know, like I I did a

11:55

bunch of JavaScript stuff. And it's

11:56

funny like I kept crossing paths with

11:58

these people. And so Vlad, he worked on

12:01

Bolt.js, JS which was the software it

12:03

was the framework that powered ads

12:05

manager which later became ReactJS. I I

12:08

kept crossing paths with these people

12:09

and later on for yeah later on there

12:11

there was a bunch more people like this

12:13

but anyway so I I was working on

12:14

Facebook groups um I was really excited

12:16

about it because the because of this

12:19

mission of connecting people to their

12:21

community. This is the thing that drew

12:24

me in. And at the time I was a big

12:25

Reddit user. I became a Reddit user back

12:28

when I was a teenager because I didn't

12:31

know anyone else that coded. Even in

12:34

college, I didn't really know anyone

12:35

that coded.

12:36

>> And honestly, I was always kind of

12:37

embarrassed about it cuz I thought it

12:39

was this nerdy thing. And I thought it

12:40

was kind of this this thing that I knew

12:42

how to do, but I wanted, you know, I

12:44

wanted to be like a cool kid and, you

12:45

know, like I I couldn't like tell people

12:47

that I coded. It was like it was very

12:48

nerdy. Um, and and at some point I

12:51

discovered it was some like programming

12:53

community on Reddit and I was I was just

12:56

shocked like there's other people that

12:58

are into this thing. It's like such a

12:59

weird hobby. It's so niche and it was

13:01

just so exciting to find like-minded

13:03

people like this and get this connection

13:04

and so I just wanted to work on this. I

13:06

wanted to kind of contribute to this in

13:08

in some way. So I worked on Facebook

13:10

groups for a while. Um, and then you

13:14

know there there's a bunch of different

13:15

projects have to to kind of get get into

13:17

details for any of these. Eventually I

13:18

became the the tech lead for for

13:20

Facebook groups and kind of grew grew

13:23

into this and the org grew the work

13:24

really changed. It changed from kind of

13:26

building to a lot of like dock writing

13:28

and coordination and kind of delegating

13:30

to others. The culture was changing at

13:32

the time. So you know this early

13:34

Facebook culture was disappearing. The

13:35

docs were coming in. The you know

13:36

alignment meetings were coming in. uh

13:39

there was a lot of a lot more work

13:40

around this kind of foundational stuff

13:42

like privacy, security, things like this

13:44

that I think honestly early on a lot of

13:46

corners were cut in order to grow. But

13:48

at some point you just have to pay that

13:49

debt and that was the time when that

13:51

happened. Then I spent a few years at

13:53

Instagram after um and that was also a

13:55

funny story. My wife got a got a job

13:58

offer and she was just really excited

14:00

about it and she came to me and was

14:01

like, "Hey, like I got this offer but

14:03

we're going to have to move. Is that

14:04

okay?" And I was like, "Yeah, that's

14:06

fine." You know, like I work in tech. we

14:08

can work remotely anywhere. Where's the

14:10

job? And she was like, it's a N. And I

14:12

was like, where where's that? And uh N

14:14

is like rural Japan. And this was uh

14:16

>> different time zone as well.

14:17

>> Different time zone. Yeah. This was

14:19

>> 12 hours or something different or

14:20

something like that.

14:21

>> Something like that. Yeah. It was like

14:22

2021.

14:23

>> Wow.

14:24

>> Um and then I I tried to kind of find a

14:25

team that would sponsor me cuz there was

14:27

there were these kind of arcane HR rules

14:29

about like the time zone you have to be

14:30

in and the team you have to be

14:31

collocated with and so on. And so uh

14:34

there was a little kind of naent team uh

14:36

for Instagram in Tokyo and Will Bailey

14:39

was running this team. He was also the

14:41

guy that made Instagram stories and uh

14:44

so he was my manager for a while and so

14:46

we decided to grow that team together

14:48

and I worked remotely from NA and then

14:49

most of the team was in Tokyo

14:52

and uh during this time I I started

14:54

hacking on Instagram and the stack was

14:56

just insane like Facebook was the single

14:59

best web serving stack in the world. the

15:02

the way that HH everything is optimized

15:04

like from from the hack language to the

15:06

HHVM runtime to the to GraphQL as the

15:09

transport layer to like the client

15:10

libraries like relay and and all the

15:13

stuff it was just and in React it was

15:14

just amazing there there's no other

15:16

devstack in the world that was this good

15:18

and it's just fully optimized and then I

15:21

went to Instagram and it's like you know

15:23

Python where the type checker didn't

15:25

work and click to definition didn't work

15:28

and it was this like kind of hack

15:30

together Django and then like a work of

15:32

uh you know the Syon runtime and just

15:35

nothing really worked and so I came to

15:37

Instagram I joined the labs team uh you

15:39

know in in Japan and the idea was to

15:40

find the next big thing for Instagram.

15:42

We tried some stuff but what I very

15:44

quickly realized is that I was just not

15:46

effective at working on the stack

15:48

because it was such a terrible stack and

15:50

so I just went and started working on

15:52

Dev Infra because uh we we needed to fix

15:55

it and there there's a few projects that

15:57

we worked on. So one was migrating from

15:58

Python to the big Facebook monolith.

16:00

Another one was migrating from Rest to

16:02

GraphQL. And uh these projects, they're

16:04

they're actually in progress, you know,

16:05

like these are things that involve it

16:07

takes hundreds of engineers many years

16:09

to do this. It's a big code base. It's a

16:11

big migration. Um now it's it's much

16:13

faster.

16:14

>> Yeah. With with with these tools that we

16:16

have, the AI AI tools and migrations are

16:18

a pretty good use case for them though.

16:20

>> Yeah. It's like the it's the perfect use

16:21

case for it. And then I I just started

16:23

getting kind of deeper into this. And by

16:25

the end, by the time I left Instagram,

16:26

so I was working on this on dev and kind

16:28

of leading a bunch of these migrations.

16:30

That's also where I intersected with

16:31

Fiona Fun who is now the manager for the

16:34

quad code team. I just worked with her

16:36

and she was just such an amazing leader,

16:38

this incredible depth and kind of

16:39

history in tech. And I just thought like

16:41

there's no better there's no better

16:43

manager for this team. And then I I also

16:45

started working on code quality. And so

16:47

the the work on Instagram kind of

16:48

expanded a bit. And um by the time I

16:51

left, I was leading code quality for all

16:53

of Meta. And so I was responsible for

16:55

the quality of the code bases across

16:57

Instagram, Facebook, Messenger,

16:59

WhatsApp, Reality Labs, kind of all

17:01

these code bases. At Meta, it it was

17:03

this program called Better Engineering.

17:05

And the idea was I think it's sort of

17:06

like 2016 or 2018 or something, but Zuck

17:10

mandated that every engineer at the

17:11

company 20% of their time has to be

17:14

spent fixing tech debt.

17:16

>> Oh, interesting.

17:17

>> And we called this better engineering.

17:19

>> Mhm. And the some of this is kind of

17:22

bottom up where you know a team knows

17:24

best the tech debt that they have to fix

17:26

and then some of it is top down where

17:28

you need to do you know very big

17:29

migrations you need to migrate to new

17:31

language features new frameworks things

17:33

like this and at Facebook scale you know

17:35

there was tens of thousands of these

17:37

migrations every year. Um and so I I

17:39

just started leading all this and I

17:41

realized very quick that it just needed

17:43

a little bit more order to it. There was

17:45

no goals. No one knew kind of like what

17:47

the outcomes were there. there wasn't

17:48

any tracking. Um, and so we developed a

17:51

bunch of stuff. Uh, one of the ideas was

17:54

a centralized way to prioritize the

17:56

different kind of code quality efforts.

17:57

The second thing was figuring out the

17:59

impact of code quality on engineering

18:00

productivity which turned out to be

18:02

significant.

18:03

>> How how did you measure what did you

18:04

find there?

18:05

>> There was a bunch of stuff. I think some

18:06

of this has been published. I don't know

18:08

if all of it has, but essentially you

18:10

try to do like causal analysis and

18:11

causal inference. This is the

18:12

methodology. You try to figure out like

18:15

what what are the factors that make it

18:16

so engineers are more productive. Some

18:18

of it is code quality, some of it is

18:19

outside of code quality. So for example,

18:21

meta went back to uh you know return to

18:23

office instead of work from home. That

18:25

was partially driven by this because we

18:27

just found some you know fairly strong

18:28

correlations that we thought were

18:29

causal.

18:30

>> Yeah.

18:30

>> Um about this but quality actually

18:32

contributes like you know double digit

18:34

percent to to productivity. It turns out

18:36

even even at the biggest scale. It's

18:38

it's kind of comforting to hear because

18:40

I I think it's it's rare to have a place

18:43

where you actually measure this, but I

18:44

think we feel it like when you have a

18:46

clean code base in modular or it can get

18:48

easier to work with and I I think you

18:50

know reasoning could it also be easier

18:53

for LM to to work with it and my hint

18:56

would be yes it should be right but I I

19:00

think there's just very little data but

19:01

that's a feeling that I I would have.

19:03

Yeah, I think a lot of the big companies

19:04

have published about this. Like I think

19:06

Facebook published something. Uh

19:08

Microsoft publishes a bunch about this,

19:09

Google does, but yeah, totally. If if if

19:11

every time that you build a feature, you

19:14

have to think about do I use framework X

19:15

or Y or Z. These are all options that

19:18

you can consider because the codebase is

19:20

in a partially migrated state where all

19:22

of these are around the code somewhere.

19:24

As an engineer, you're going to have a

19:25

bad time. As a new hire, you're going to

19:27

have a bad time. As a model, you might

19:29

just pick the wrong thing and then, you

19:31

know, like the user has to course

19:32

correct you. So actually you know the

19:33

better thing to do is just always have

19:35

you know a clean code base always make

19:38

sure that when you when you start a

19:39

migration you finish the migration and

19:41

this is great for engineers and nowadays

19:43

it's it's great for models too and then

19:45

you joined entropic and I've heard this

19:48

story which you can confirm or give more

19:50

color to it that your first poll request

19:52

was rejected by Adam Wolf.

19:53

>> He was my rampa buddy. So I joined

19:55

Enthropic. I was trying to figure out

19:56

kind of like what to do next and you

19:58

know I I met a bunch of people at all

20:00

the different labs and anthropic was

20:01

just the obvious choice for me because

20:03

of the mission. This is the thing that

20:05

personally I know that I need the most.

20:07

Um and also just kind of seeing all this

20:09

change that's happening. It's important

20:11

to have some sort of framework to think

20:12

about this and to think about our role

20:14

in it. I'm also a really big sci-fi

20:16

reader. Like that that's definitely my

20:17

genre. Um I'm I'm a big reader. I have

20:20

like, you know, giant bookshelf at home

20:21

and stuff and I just know how bad this

20:23

thing can go and I just felt like this

20:25

is a place that has serious thinkers.

20:27

People are taking this very seriously

20:28

and thinking about what what what can we

20:30

do to make this thing go better. So when

20:32

I joined Anthropic, I did a bunch of

20:34

ramp up projects uh just you know

20:36

various stuff that that I was hacking on

20:38

and I wrote my first pull request by

20:40

hand because I thought that's how you

20:42

write code.

20:43

>> That used to be how you write code.

20:44

>> That used to be how you write code. But

20:46

even at the time at Enthropic, there was

20:48

this thing called Clyde and it was the

20:49

it was the predecessor to quad code. It

20:52

was it was super janky. It was like it

20:54

was Python, you know, it took like 40

20:55

seconds to start up. It was research

20:57

code. It was not agentic. But if you

21:00

prompt it very carefully and hold the

21:01

tool just right, it can write code for

21:03

you. And so Adam rejected my PR and he

21:07

was like, "Actually, you should use this

21:08

Clyde thing for it instead." And I was

21:10

like, "Okay, cool." It took me like half

21:12

a day to figure out how to use this tool

21:13

because you have to like pass in a bunch

21:14

of flags and like use it correctly. Um,

21:17

but then it it sped out a working PR. It

21:20

just one-shotted it.

21:22

>> Oh,

21:23

>> and this was like 2024.

21:26

This like September 2024, August,

21:29

something like that. And I think for me,

21:31

this was my first fuel hi moment at

21:33

Anthropic cuz I I was just, oh my god,

21:36

like I didn't know the model could do

21:38

this. Like I I was used to these like

21:40

kind of tab completions, line level

21:42

completions in an IDE. I had no idea

21:44

that it could just make a working pull

21:46

request for me. Boris just talked about

21:48

how he had a true wow moment at work

21:50

using their AI model. A very different

21:52

wow moment is when you use a tool at

21:54

work that makes things so much easier

21:56

than before. And this leads us nicely to

21:58

our presenting sponsor, Statsig. Statsig

22:01

offers engineering teams the tooling for

22:03

experimentation and feature flagging

22:05

that used to require years of internal

22:07

work to build. It's the kind of tool

22:08

that was so complex to build that only

22:10

large companies like Meta or Uber had

22:12

their own custom advanced tooling for

22:14

it. Here's what satic looked like in

22:16

practice. You ship a change behind a

22:17

feature gate and roll it out gradually,

22:19

say to 1% or 10% of users at first. You

22:23

watch what happens. Not just did it

22:25

crash, but what did it do to the metrics

22:26

you care about? Conversion, retention,

22:29

error rates, latency. If something looks

22:31

off, you turn it off quickly. If it's

22:33

trending the right way, you keep it

22:34

rolling forward. And the key is that

22:36

measurement is part of the workflow.

22:38

You're not switching between three tools

22:40

and trying to match up segments and

22:42

dashboards after the fact. Feature

22:44

flags, experiments, and analytics are

22:45

all in one place using the same

22:47

underlying user assignments and data.

22:49

This is why teams at companies like

22:50

Notion, Brex, and Atlastian use Statsig.

22:53

Statsic has a generous free tier to get

22:55

started, and pro pricricing for teams

22:56

starts at $150 per month. To learn more

22:59

and get a 30-day enterprise trial, go to

23:01

stats.com/pragmatic.

23:03

And with this, let's get back to Boris

23:05

and the origin story of Claude Code.

23:08

>> Yeah. And and then when you when you

23:10

joined Entrophic, we we've covered this

23:12

in in a deep dive, but we could recap

23:14

briefly on how Claude Code came to be

23:17

out of out of what seemed like a side

23:19

project or just a cool hack. So yeah, I

23:21

I I started hacking on a bunch of

23:23

different stuff. Um I was working on

23:25

some things in product. Um I worked on

23:27

reinforcement learning for a little bit

23:29

just to kind of understand the layer

23:31

under the layer which I was building.

23:32

This is still advice that I give to a

23:34

lot of engineers is always understand

23:36

the layer under. It's really important

23:38

because that just gives you the depth

23:39

and you kind of like you have a little

23:41

bit more levers to to work at the layer

23:43

that you actually work at. This was the

23:44

advice 10 years ago. It's still the

23:46

advice today. Um but the layer under is

23:47

a little bit different now. You know,

23:49

before it was like understand, you know,

23:51

the Java if you're writing JavaScript,

23:52

understand the JavaScript VM and

23:53

frameworks and stuff.

23:54

>> Now it's like understand the model. So I

23:56

was hacking on a bunch of different

23:57

stuff. Uh something shipped, some things

24:00

uh didn't ship. And at some point I I

24:02

just wanted to understand the public

24:04

anthropic API because I'd never used it

24:05

before. Um and I didn't want to build a

24:08

UI. I just wanted to, you know, hack

24:10

something up quite quickly cuz we didn't

24:12

have quad code back then. We're still

24:14

writing code by hand. And I wrote this

24:16

little batch tool that um all all it did

24:19

was it hit the anthropic API and it it

24:21

was essentially like a chatbased

24:22

application um but just in the terminal

24:24

because that's what AI used to be. And

24:26

you know, I I still think about it like

24:29

engineers are the first adopters. And so

24:32

when we started to move out of

24:34

conversational AI to agentic AI, it took

24:37

a little bit, but engineers understood

24:39

it pretty quick. And I I think now when

24:41

you ask non-engineers about like what is

24:43

AI, they would say it's this

24:46

conversational AI, it's like a chatbot

24:48

or something. And that's why I'm

24:50

actually very excited for, you know,

24:51

co-work this new product that we

24:53

launched because it's going to bring the

24:55

same thing that engineer saw very early

24:57

to everyone else. But when I think

25:00

about, you know, co-work, I I think back

25:01

to this moment that we're talking about

25:03

like very early on, quad code originally

25:05

wasn't quad code. It was a chatbot

25:07

because that's what I thought AI was.

25:09

Um, but we had to kind of figure out

25:11

kind of what is the next thing. And so I

25:13

at at the time I I built this chatbot.

25:15

It was somewhat useful, but it was just

25:16

a chatbot. And the next thing that I

25:19

tried was I I wanted it to use tools

25:23

because tool use just came out and I

25:25

didn't know what it was and I was like

25:26

let's experiment

25:28

and and I I gave it a single tool which

25:30

was the bash tool and I didn't know what

25:31

to do with the bash tool and so I asked

25:33

it you know like I I actually didn't

25:34

know if it could even do this but I

25:35

asked it like what music am I listening

25:36

to and uh it just wrote a little Apple

25:40

script program using like said or or

25:42

whatever to uh open up my music player

25:45

and then like query it to see what music

25:47

it's listening to and just one shot at

25:49

this with sonnet 3.5. This is actually

25:53

my second a field AI moment very quickly

25:56

after the first one

25:58

>> and the model just wants to use tools

26:01

that though that's that's just what I

26:03

realized like this thing like if you

26:05

give it a tool it will figure out how to

26:07

use it to get the thing done and I think

26:09

at the time when when I think about the

26:11

way that people were approaching AI and

26:13

coding everyone essentially had this

26:15

mental model of you take the model and

26:17

you put it in a box and you figure out

26:20

like what is the interface like what how

26:22

how do want to interact with this model?

26:23

What do you need it to do? Essentially,

26:25

it's like if if you have a program, you

26:27

you stub out some module, stub out some

26:28

function, and you say, "Okay, this is

26:30

now AI." But otherwise, the rest of the

26:31

program is just a program. And so, this

26:33

is just not the way to think about the

26:34

model. The way to think about it is the

26:37

model is its own thing. You give it

26:39

tools. You give it programs that it can

26:41

run. You let it run programs. You let it

26:43

write programs, but you don't make it a

26:46

component of this larger system in this

26:47

way. And I think there's just like, you

26:49

know, this is a version of the bitter

26:51

lesson. There's the bitter lesson is a

26:53

very specific framing, but there's many

26:54

corollaries to it. This is one of the

26:56

corollaries is just let the model do it

26:59

do its thing. Don't try to put it in a

27:01

box. Don't try to force it to behave a

27:03

particular way.

27:04

>> One of the first ways you saw it was

27:06

giving it tools, giving it access to the

27:08

bash and then later to the file system

27:10

and then to more tools. Right.

27:12

>> That's right. Yeah, we we give it uh we

27:14

give it bash then uh I say we it it was

27:17

just me the first three months but then

27:18

the team grew. So it it was bash, it was

27:21

uh and and file edit that was the second

27:23

one.

27:23

>> And one of the interesting thing we

27:24

talked about uh last time for the deep

27:27

dive is when you built it and it started

27:29

to actually write code with with the

27:31

tool tools that you had. You've had an

27:33

internal debate inside entrophic should

27:36

we just keep it to ourselves because

27:37

it's making suddenly it spread across

27:39

engineering and it was making all of you

27:41

a lot more productive right. Yeah,

27:42

that's right. In the end, the decision

27:45

was to release so that we can study

27:47

safety in the wild. Because when you

27:49

think about safety and you know, I keep

27:50

talking about the word safety. The

27:51

reason anthropic exists as a lab is

27:53

safety. This is the reason it was

27:55

founded. This is the reason it exists.

27:57

If you ask anyone at anthropic why they

27:59

chose it, it's because of safety. And so

28:01

if you think about model safety, you

28:02

know, there's different layers at which

28:03

to think about it. There's kind of

28:05

alignment and mechanistic

28:06

interpretability. This is at the model

28:07

layer. Then there's evals and this is

28:09

kind of like a it's kind of putting the

28:11

model in a petri dish and synthetically

28:13

studying it in this way. Um and then you

28:15

can study it in the wild and you can see

28:17

how it actually behaves. You can see how

28:19

users talk about it. You can you can see

28:21

like what are the risks in the wild and

28:23

you actually learn a lot this way. And

28:25

by doing this we we've been able to make

28:27

the model much safer. So in in hindsight

28:30

it was it was totally the right

28:31

decision. It's amusing to hear about it

28:34

from your perspective because from the

28:36

outside what what I saw and what a lot

28:38

of engineers saw is like oh entropic

28:40

release cloth code oh wow this you know

28:42

for the first release with uh I I

28:45

believe it was with sonet 4 release was

28:48

was did it come out with sonet 4

28:49

originally or sonet 4.5

28:51

>> I think it was it was for that that was

28:53

the general availability in February but

28:55

I think it was research preview before

28:56

that

28:56

>> yeah but when it came out my

28:59

infiltration was like oh this thing can

29:01

write code pretty well and over time it

29:03

became a lot more capable. So from from

29:05

our perspective it was like this really

29:07

capable coding tool that we just started

29:09

to adopt and use and use for all sorts

29:12

of increasingly product productive parts

29:15

and it has become I believe one of the

29:17

fastest growing developer tools and I'm

29:20

always surprised to hear the story that

29:22

it actually comes from research and the

29:24

goal to understand how people use the

29:27

model because at the other hand like

29:29

some startups have been trying to build

29:31

developer tools deliberately to to get

29:33

adoption and yet this research tool is

29:35

getting a lot more adoption.

29:36

>> I mean this is a you know anthropic

29:38

we're we're a research lab we're a

29:39

safety lab and you know product is this

29:41

kind of thing tacked on to the side

29:43

product exists so that we can serve

29:46

research better and so we can make the

29:47

model safer and this is kind of how we

29:49

think about everything there there was

29:51

this there's also this funny moment

29:52

early on when uh we we had this launch

29:54

review and we were deciding whether to

29:56

launch it. I remember this moment cuz we

29:58

were in the room. I think it there was

29:59

like there was Mike Creger, there was

30:01

Daario, there were some other folks in

30:02

the room and we were deciding what

30:03

should we do. We were looking at the

30:05

internal adoption chart which was just

30:07

vertical

30:09

said it was just insane. It was you know

30:11

like nowadays

30:12

>> vertical is 100% right

30:13

>> just just 100% like nowadays everyone at

30:15

an every technical employee at anthropic

30:17

uses quad code every day is pretty much

30:19

100%. For nontechnical employees it's

30:22

also like it's actually getting quite

30:23

close to 100%. It's it's increasing very

30:25

quickly like you know like half the

30:27

sales team uses quad code um and I think

30:30

that's increasing it's just it's crazy.

30:32

Dario had this question about like how

30:33

how did it grow this fast? Are you like

30:35

forcing people to use it?

30:37

And I was like no we offer this tool

30:40

people vote with their feet and you know

30:42

just like let people use the tool that

30:43

they prefer.

30:44

>> Yeah they chose it.

30:45

>> You don't seem like the person who's act

30:47

exactly forcing people to use your tool.

30:50

>> Yeah. Yeah. I mean the the way we did

30:51

it, we just we launched the thing and

30:53

then we just like listened to the users

30:54

and we talked to people, we saw how they

30:56

use it, we followed up, we made it

30:57

better and yeah, I mean now now we're at

31:00

the point where Quad Code writes I think

31:02

something like 80% of the code in at

31:04

Enthropic on average and you know it

31:06

writes all of my code for sure.

31:08

>> Yeah. And this started for you it

31:09

started the first time you mentioned I

31:11

think it was in November when it started

31:12

to write all of your code. When did that

31:15

switch come and what what happened to

31:17

made you trust it to to write your code

31:20

or how much you trusted? How much you

31:22

review that code for example?

31:23

>> So the switch was instant when we

31:25

started using Opus 4.5. This was before

31:27

before it came out, you know, we we were

31:29

dogfooting it for a little bit and it it

31:31

was just right away. Um it's such a more

31:34

capable model. I just found that I

31:36

didn't have to open my ID anymore. I

31:38

just uninstalled my ID cuz cuz I just

31:41

didn't need it at that point. I actually

31:42

did that like a month later because I I

31:44

I just didn't even realize that I wasn't

31:46

using it anymore.

31:47

>> Yeah, a lot of us had similar

31:49

experiences once Opus 4.5 was out in the

31:52

public and especially over the winter

31:53

break. I I had a similar experience. I

31:55

just realized that this thing it

31:57

actually writes, if I'm being honest

31:59

with myself, as good code as I would

32:00

have written in the stack that I'm very

32:02

familiar with and my code base, my side

32:05

projects where I know it and just a lot

32:07

better than what I could for code base

32:08

that I'm not as familiar or technologies

32:10

I'm not as familiar with. Yeah. I'll be

32:12

honest, he writes better code than I do.

32:14

>> I I I don't want to go there. I I still

32:17

like to keep my pride, but probably

32:19

true.

32:19

>> Yeah. Yeah. I I realized this because

32:21

also in December, I was traveling a

32:23

little bit. I was like on a I was on a

32:24

coding vacation. We we're talking about

32:26

this before, but I I went to Europe. We

32:28

were just in a different time zone kind

32:29

of nomading around. And it was so fun

32:31

cuz I was just coding all day every day,

32:33

which is my favorite thing to do. And uh

32:36

I wrote maybe, you know, like 10 20 p

32:38

requests every day, something like that.

32:39

Opus 4.5 and quad code wrote 100% of

32:42

every single one. I didn't edit a single

32:44

line manually and I realized uh at the

32:47

end of that month Opus introduced maybe

32:48

two bugs whereas if I had written that

32:50

by hand that would have been you know

32:52

like 20 bucks or or something like that.

32:55

Can we talk about your development

32:56

workflow? You have written threads about

32:58

this which is awesome. It's on it's on

33:00

social media on threads and on on X. But

33:03

can you tell us how you use today uh

33:05

cloud code in terms of you know

33:07

parallelism and and tips and tricks that

33:09

you and the team have kind of learned

33:11

and share across the across the team?

33:13

>> Yeah, I mean look there's no one right

33:15

way to use quad code. So I I can share

33:17

some tips and things but I I think the

33:20

wrong conclusion to draw would be to

33:22

just copy copy these and and use it. The

33:25

way we build cloud code is we build it

33:28

to be hackable because we know every

33:30

engineer's workflow is different.

33:32

There's no one way to do things. There's

33:34

no two engineers that have the same

33:36

workflow. It's just every every engineer

33:37

>> same with workstation setup, right? Like

33:39

keyboards, monitor placement, all that.

33:40

Everyone has it differently.

33:41

>> Yeah. It's like we're like crafts

33:42

people, right? Like you choose you

33:44

choose your tools. Like we care deeply

33:45

about it. So there's no one right way to

33:47

do it. So for me, the way that I do it

33:50

generally is I have five terminal tabs.

33:52

Each one of them has a checkout of their

33:54

repository. So it's five parallel

33:56

checkouts. Um and usually I'll kind of

33:59

roundroin and start cloud code in each

34:01

one. Almost every time I start in plane

34:04

mode. So that's like shift tab twice in

34:05

the terminal. And uh I also overflow uh

34:08

as I run out of tabs cuz there's only so

34:11

many terminal tabs. I used to use web a

34:14

lot for this. So like quad.ai/code,

34:16

that's the place that I overflow to.

34:17

Nowadays I actually use the desktop app.

34:19

Um it's more convenient. So Quad Code,

34:21

you know, it's been in our desktop app

34:22

for, you know, for many months. It's

34:24

just a code tab in in the Cloud app. Um,

34:27

and I actually really like it because it

34:29

has built-in uh work tree support. So

34:31

that's existed for a while. Um, and that

34:33

that's quite nice for parallelism. So

34:35

you have multiple, you don't need

34:36

multiple checkouts. You just have one

34:37

and then we automatically set up Git

34:39

work trees for you. So you get this kind

34:41

of environment isolation. The reason I

34:43

do that is I actually just really hate

34:44

fiddling with git work trees on the

34:46

command line cuz it it's kind of fiddly.

34:48

like you need to know the CD get work

34:50

tree for those of who are not as

34:52

familiar with it. It's it's when you can

34:55

check out instead of having a separate

34:57

local folder, it's almost like checks

34:59

out separate branch, right? And then you

35:01

can work on it separately but not have

35:03

the comp have the complex only at like

35:05

merge time.

35:06

>> That's right. Imagine that you you have

35:07

a folder but you have maybe like git

35:10

makes five copies of that folder in a

35:12

way that's very cheap um and kind of

35:14

easy to throw away. So you get this kind

35:15

of isolation. it can work in parallel

35:17

and the quads don't interfere.

35:18

>> Yeah. So, you now have support for this

35:20

which I I think you recently added like

35:22

native support but like for for your

35:24

workflow you just stuck with the old one

35:26

of checking out on separate f folders,

35:28

right?

35:28

>> Yeah, exactly. I I actually find over

35:30

time I'm using the desktop app more and

35:32

more for this.

35:33

>> Um just cuz I don't need these separate

35:34

checkouts and you know I I just have a

35:36

bunch of quads running in parallel and I

35:37

don't have to think about it. The other

35:39

surprise hit is the iOS app for me.

35:41

Every day I start like I wake up and I

35:44

just start a few agents on my phone. Oh,

35:45

the the native one. Yeah,

35:46

>> the native one. Yeah, it's just like

35:47

it's the quad app. It's the code tab in

35:49

the in the quad app and it's the same

35:50

exact quad code.

35:51

>> Yeah, except it it runs in the cloud,

35:53

right?

35:53

>> It runs in the cloud. Yeah. So, you have

35:55

to kind of configure the environment.

35:56

Luckily, our environment is pretty

35:57

simple. So, you know, um and it we just

36:00

use hooks for it. So, you just use the

36:01

session start hook and configure it.

36:03

This is kind of one of the benefits of

36:04

making quad code really hackable is it's

36:06

very easy to do to do this kind of

36:07

configuration. And this is something

36:09

honestly I would never have predicted

36:12

because you know like I I I code on a

36:15

computer. If you told me six months ago

36:17

I'd be writing I don't know a third I

36:19

haven't pulled the data maybe like a

36:20

third half something like this of my

36:22

code on a phone. That's crazy. But

36:25

that's that's what I'm doing today.

36:27

>> And you're using parallel agents. At

36:29

what point did you start using them? And

36:31

how has it changed your work? Cuz one

36:33

thing that I notice on myself, I don't

36:36

really use that many parallel agents. I

36:39

maybe like two at a time, but I'm

36:41

someone who well I I like to be in

36:44

charge and especially with Claude.

36:45

Claude is is is a a tool that you can

36:48

follow it along. It tells you what it's

36:49

doing. It you can also have for example

36:52

learn mode which this was shipped a lot

36:54

earlier where where you can actually

36:55

follow along. It gives you tasks. I I

36:57

feel that like staying in one tab and

37:00

following along the model is pretty fast

37:01

as well. I can kind of keep in touch.

37:04

I'm assuming at some point you must have

37:05

done this but then what happened when

37:07

you changed to parallel and are do you

37:10

feel you're losing any control or it

37:11

doesn't really matter that much?

37:12

>> Yeah, I I I think there's kind of like

37:14

two modes to think about or kind of like

37:16

two two uh two kind of workflows to

37:18

think about. So when you're new to a

37:19

codebase, highly re learn mode is

37:21

awesome. Highly recommend it for people

37:23

that are onboarding to the quad code

37:25

team, people that onboard to enthropic.

37:27

Um the thing that we recommend is so you

37:30

do for people that haven't tried it you

37:31

do slashconfig in quad code you pick the

37:34

output style and you can do learn or

37:36

explanatory. We usually recommend

37:38

explanatory cuz that tends to be better

37:39

for new code bases um that you kind of

37:42

haven't been in before. For me once

37:44

you're familiar with the codebase you

37:46

just want to be productive right like

37:47

you just want to ship as much as you can

37:49

and you want to kind of be effective

37:50

doing that. Um so the role really

37:53

switches. I don't really go deep into

37:55

tasks anymore. I start a quad in plan

37:57

mode. I'll have it kick something off.

37:59

With Opus 4 4.5, I think it got there.

38:01

With 4.6, it just really really does it.

38:05

Once there is a good plan, it just it

38:06

will oneshot the implementation almost

38:08

every time.

38:09

>> So, the most important thing is to go

38:10

back and forth a little bit to get the

38:11

plan right. So, what I do is I I start

38:14

one, I enter plan mode, I give it a

38:16

prompt. As it's chugging along, I'll go

38:18

to my second tap and I'll start the

38:19

second quad also in plan mode. Get it

38:21

chugging along. Then go to the third

38:23

tab, go to the fourth one. Then maybe

38:25

I'll go back to the first one when I get

38:26

notified that it's done. Uh, and then

38:28

I'll kind of

38:29

>> Do you have notifications on or do you

38:30

turn them off?

38:31

>> I actually operate in both modes. Um,

38:33

sometimes I do like, you know, focus

38:35

mode on the Mac. Um, so I just have it

38:37

off, but also sometimes I use the system

38:39

notifications.

38:40

>> And you're very very productive with

38:42

with PRs. I mean, I I think it was very

38:44

visible. Even around the holiday breaks

38:48

uh on social media, you actually were

38:50

responding to I think someone reported a

38:52

bug or or a feature request. I'm not

38:54

sure which one it was. And then an hour

38:56

or two later it was done cuz cuz you did

38:58

it. You've also talked about like number

39:00

of poll requests you've done on a day

39:01

not to like show up but just as context.

39:04

What what does a poll request typically

39:06

involve in terms of complexity? Are

39:08

these like are some some super trivial

39:11

or some actually like larger pieces of

39:13

work as well?

39:14

>> Yeah, pull request each one varies a

39:15

lot. Um sometimes it's a few lines,

39:18

sometimes it's a few hundred or a few

39:19

thousand lines. They're all just very

39:21

very different. It's changed so much.

39:23

Like back when I was at Instagram, I

39:25

think I was one of the uh top two maybe

39:27

top three most productive engineers at

39:29

Instagram just by volume of code

39:30

written. Oh wow. Um so I've always, you

39:33

know, for me I've I've always just coded

39:34

a lot. Like this is uh coding is like a

39:37

way that I can express myself and it's

39:38

just like it's a way that my brain

39:39

thinks also. And so now I just get to do

39:42

it. But I I think with quad code the the

39:44

the kind of code that you write if you

39:46

are very productive it it tends to be

39:48

even it's just the number of PR sort of

39:51

underelves what what's happening because

39:54

I I think people that used to be very

39:56

productive in the old days before AI

39:58

assistance a lot of the code maybe was

40:01

like code migrations or something like

40:02

this so like people that shipped you

40:04

know 20 30 PRs every day a lot of it was

40:06

like pretty you know like a oneliner or

40:08

kind of migrating A to B or whatever.

40:10

Nowadays I ship you know 20 30 PRs every

40:13

day but every PR is just completely

40:15

different. Some of them are thousands of

40:16

lines, some of them are hundreds, some

40:17

of them are dozen, some of them are

40:18

oneliners. It's none of these are kind

40:21

of code migrations cuz actually Claude

40:23

just does those and I I don't need to be

40:24

part of that.

40:25

>> Shipping this much code or this much

40:27

productive. The obvious question that

40:29

comes up for any I guess software

40:30

professional is well the review. What

40:32

the way teams used to work and I'm not

40:35

sure if Instagram did this but a lot of

40:37

other companies did this is you make a

40:39

pull request you put it up there there's

40:40

a mandatory human reviewer at Google

40:43

there's actually two cuz there's one on

40:44

code quality as as well how has this

40:47

workflow changed how does the hot code

40:50

team think about code review and how has

40:52

it changed over time yeah I'll start by

40:53

thinking I I'll start by talking about

40:55

how code review used to work for me so

40:57

the the way that I used to do it is uh

40:59

every time I I also used to be one of

41:01

the most prolific code reviewers.

41:03

>> Oh, okay. So, both.

41:04

>> I I met Yeah. Yeah.

41:05

>> Right. Or is it code reviewers?

41:06

>> That's actually and that's one of the

41:08

benefits of being in a different time

41:09

zone. Like I'm not super human. I just

41:10

didn't have any meetings. And the the

41:12

way that I approach code review is every

41:14

time that I would have to comment about

41:16

something, I would drop it in a

41:17

spreadsheet

41:19

and I I would like describe the issue.

41:21

So, let's say, you know, like someone

41:22

named a parameter, you know, in a

41:23

function badly, I would like put that in

41:25

a spreadsheet. If someone did some bad

41:26

React pattern or something, I would I

41:28

would put that in a spreadsheet. And

41:29

then over time I would just kind of

41:30

tally up the spreadsheet and anytime

41:32

that a particular row had more than

41:34

three or four instances I would write a

41:36

lint rule for it.

41:37

>> So just automate it with kind of an op.

41:39

And so that's what it used to look like

41:40

for me. I've always tried to automate

41:42

myself away um because there's just so

41:44

many things to do. Um and this is one of

41:46

our superpowers as engineers

41:48

>> is we were able to automate all of the

41:50

tedious work. There's very few other

41:52

fields where you're able to do this

41:54

thing. This is a thing uniquely that

41:55

we're able to do. Um, and this is a

41:57

thing that I I've just always enjoyed

41:59

because it gives me more free time and

42:01

uh I get to do the work I actually

42:02

enjoy. And so today the way this looks

42:05

is a little different, but it it mirrors

42:07

this a little bit. So when cloud code

42:09

writes code, it generally it will run

42:11

tests locally. And this is something

42:13

cloud just often decides to do when it's

42:14

relevant or it'll write new tests. So

42:16

you kind of do this this kind of

42:18

verification. When we make changes to

42:20

cloud code, cloud will also test itself.

42:23

So it'll launch itself kind of in a

42:25

subprocess. It'll verify itself and

42:26

it'll test itself end to end.

42:28

>> This is for the the your internal cloud

42:30

code implementation. So you have like

42:32

this test suite so they can test itself.

42:34

>> Yeah, that's right. That's right. But

42:35

it'll literally launch itself just in a

42:37

bash process and kind of just see like

42:39

hey do I still work.

42:40

>> Wow. Okay. So it'll do this and this is

42:42

something that we we just didn't code in

42:44

like it just with Opus 4 4.5 especially

42:47

it just sort of spontaneously doing

42:48

this. It just wants to kind of check. So

42:50

so we do this and then we also run

42:52

claudep. So this is the quad agent SDK

42:54

in uh CI. So every pull request at

42:57

Enthropic is code reviewed by quad code.

43:00

Uh and that actually catches maybe like

43:02

80% of bugs something like this. Um and

43:05

it's the first round of kind of code

43:07

review. Cloud will automatically address

43:09

some of these. Some of them some of them

43:10

it'll leave to a human cuz it's not sure

43:12

what to do. There's always an engineer

43:14

that does the second pass of code

43:15

review. Um and you know there there

43:17

always has to be a person in the loop

43:19

approving the change.

43:20

>> Mhm. So on on on the team before

43:23

anything goes into production if you

43:25

will an engineer does look at it. Yes.

43:27

As you're thinking of code review would

43:29

you do this for every type of project or

43:31

this is specifically because you now

43:32

know that this actually has real world

43:34

impact people depend on it. You know

43:36

there's a lot of users let me put it the

43:38

other way around like can you see places

43:40

where you would just not have an

43:41

engineer review uh code. What situations

43:44

would that be in?

43:45

>> I think it depends how how how it's

43:47

used. Yeah I'd agree with that. But you

43:49

know if you're building some personal

43:50

side project like you can just yolo

43:52

straight to main you know like

43:53

>> it's even even before AI you would have

43:56

not reviewed you just trust yourself or

43:58

you know just ship to production or SSH

44:00

into production and do some changes that

44:02

kind of stuff right

44:03

>> exactly exactly um the very first

44:06

versions of quad code that were internal

44:07

like you know I committed straight to

44:09

main but then you know as soon as you

44:10

have users and you know for enthropic

44:12

our main customer base is enterprises

44:14

this is what we care about the most for

44:15

us for safety reasons security is really

44:17

important privacy is important. These

44:19

are these are all related. It's also

44:20

very important for our customers. And so

44:22

because this is an enterprise product,

44:24

it has to be secure. It has to be we

44:26

have to make sure that it meets a

44:28

certain bar. So we definitely use a lot

44:30

of automation, but at least for now,

44:33

there has to be a human in the loop just

44:34

to make sure.

44:35

>> One thing that is just known about LM is

44:38

they're nondeterministic.

44:40

And by putting the element as a reviewer

44:44

claude doing a review like it it will

44:46

give good feedback but how do you deal

44:49

with the fact that you can be sure if

44:52

it's always giving the feedback you

44:53

cannot be sure that even if it's capable

44:55

of catching an issue that it will

44:57

necessarily catch that. Are you doing

44:59

anything in in this loop to do

45:01

deterministic thing? For example,

45:02

linting is very deterministic as you

45:03

will very well know. Like have you

45:05

thought of marrying some of these ideas

45:06

or are you using for example are using

45:08

llinters on the codebase or you found no

45:10

need to for it? Yeah, absolutely.

45:12

Absolutely. Yeah, you

45:13

>> this is just a Yeah.

45:14

>> Yeah, we we have type checkers, we have

45:15

llinters, we run the build. Claude is

45:18

actually so good at writing lint rolls.

45:20

So, actually what I do now, I used to

45:21

tally stuff up in a spreadsheet. Now,

45:22

what I do is when a coworker puts up a

45:25

pull request and I'm like, this is

45:26

lintable. I'll just be at Claude, please

45:28

write a lint roll for this in that PR on

45:30

their PR. And we have, you know, you

45:33

just run like slash I think it's like

45:34

setup GitHub or or something like this.

45:36

You can do this in cloud code and it'll

45:38

install the GitHub app which then makes

45:40

it so you can tag add Claude on any pull

45:42

request, any issue. I use this every

45:45

single day. Um, so very very useful. So

45:48

you want these deterministic steps. Also

45:50

though there are there are ways to get

45:52

cloud to be a little bit more

45:54

deterministic. So for example, you can

45:56

do best event. You can have it do

45:57

multiple passes

45:59

>> and and this is actually quite easy to

46:00

do. So you know for example the

46:02

coderview skill that we use internally

46:04

it's open source um and it's available

46:07

in the quad code repo and so all we do

46:09

is you know we launch parallel agents to

46:11

do stuff and then we launch parallel

46:12

dduping agents to check for false

46:15

positives but essentially best of end

46:17

the way you implement it is is all you

46:19

say is claude start three agents to do

46:21

this and that's it. or just talked about

46:23

building that enterprise infrastructure

46:25

layer, the O, the permissions, the

46:27

security that has to all work before you

46:29

can ship to real customers. This makes

46:32

it a great time to speak about our

46:33

season sponsor work OS. If you're

46:35

building any SAS, especially an AI

46:37

product one, then authentication,

46:39

permissions, security, and enterprise

46:41

identity can quietly turn into a

46:43

long-term investment. SL edge cases,

46:46

directory sync, audit logs, and all the

46:48

things enterprise customers expect. It's

46:50

a lot of work to build these mission

46:52

critical parts and then some more to

46:53

maintain them. But you don't have to.

46:55

Work provides these building blocks as

46:57

infrastructure so your team can stay

46:59

focused on what actually makes your

47:01

product unique. That's why companies

47:03

like Antrophic, OpenAI, and Cursor

47:05

already run on Work OS. Great engineers

47:07

know what not to build. If identity is

47:10

one of those things for you, visit

47:12

work.com.

47:13

And with this, let's get back to

47:15

building cloud code with Boris. How does

47:18

cloud code work in terms of ar

47:20

architecture? So as as an engineer, how

47:21

can I imagine it's setup? It's uh we we

47:24

covered some of this in the the deep

47:25

dive and I think you told me that you

47:28

had some pretty complex ideas when you

47:30

started and you just simplified a lot of

47:31

it.

47:32

>> Yeah. Yeah. It's very simple like you

47:34

know there there's not much to it.

47:35

There's like there's a core query loop.

47:37

Uh there's a few tools that it use that

47:39

it uses. We we delete these tools all

47:41

the time. We add new tools all the time.

47:43

We're just always experimenting with it.

47:45

So there's kind of this core kind of

47:46

agent part of it. Then there's the the

47:48

2E part of it. Uh and then there's

47:51

there's actually a ton of different

47:52

pieces around security. Um and making

47:55

sure that everything that QuadCode does

47:57

is safe and that there's a human in the

48:00

loop for when it happens.

48:02

>> And by safety, do you mean as as a user

48:06

when it's doing stuff on my computer or

48:08

also as entropic monitoring use cases

48:11

that that could be deemed unsafe? Yeah,

48:13

there's kind of a couple versions of

48:14

this. You safety, there's just many,

48:16

many layers and for things like safety

48:18

and security, there's no one perfect

48:19

answer. So, you know, it's always a

48:21

Swiss cheese model. You just need a

48:22

bunch of layers and with enough layers,

48:24

the probability of catching anything

48:26

goes up. And so, you just have to kind

48:28

of count the number of nines in that

48:29

probability and pick the threshold that

48:31

you want. And so, for something like

48:32

prompt injection for example, we do this

48:34

generally at three different layers. So,

48:37

let's think about something like web

48:38

fetch. So cloud fetches a URL and uh it

48:42

reads the contents of of of that web

48:44

page and then it does something in in

48:45

quad code. So one of the risks for

48:47

something like this is prompt injection.

48:49

Maybe there's an instruction on that

48:50

website to be like hey quad delete all

48:51

the folders or something like that.

48:54

>> So we think about this in a number of

48:55

ways. The the most basic way is it's an

48:57

alignment problem. And so opus 4.6 is

48:59

the most aligned model we've ever

49:01

released because we've taught the model

49:04

how to be more resistant to prompt

49:05

injection. And so you can read about

49:07

this on the model card and I think it

49:09

was part of the release. The second part

49:11

is that we have classifiers at runtime

49:13

where if there is a request that seems

49:15

to be prompt injected, we block it um

49:18

and we just make the model try again.

49:20

And then the third layer is for

49:22

something like web fetch, we actually

49:23

summarize the results in using a sub

49:25

agent and then we return that summary

49:27

back to the main agent. So again, this

49:29

kind of reduces the probability of

49:31

prompt injection. And so you can kind of

49:32

see how this isn't just one mechanism.

49:34

It's it's a layer and by by having a

49:37

bunch of these different layers, it just

49:38

reduces the probability a lot.

49:40

>> One interesting technical choice that

49:42

you've also mentioned is is using rag or

49:45

not rag retrie retrieval augmented

49:47

generation and you mentioned how in the

49:49

earlier version of cloud code you use a

49:52

local vector database to to get some to

49:55

to speed up search and you layer threw

49:57

this away. Can you talk about how this

49:59

one because this was another example

50:00

where I guess did the model get better?

50:02

>> Yeah, I mean this is one of those things

50:04

where we try so many different things.

50:06

We try so many different tools and just

50:08

statistically most of them we throw

50:09

away.

50:11

>> Even something like the spinner in quad

50:13

code I think it's gone through like a

50:14

hundred iterations

50:16

>> I want to say. Oh

50:17

>> just the spinner and you know out of

50:20

those we've landed maybe like 10 or 20

50:22

in production and like 80 of them I

50:24

probably just threw away cuz it didn't

50:25

feel good enough. So just statistically

50:28

almost all the code we write we throw

50:29

away because it's just so easy to write

50:31

this code and try stuff and see what

50:32

feels good. So for something like rag we

50:36

tried a bunch of different approaches

50:37

early on. So the the first one was rag

50:39

for retrieval cuz I think this I was

50:41

just like reading up like how people

50:42

were doing retrieval and it seemed like

50:44

all the papers were talking about rag.

50:46

Um and so the way I did it was it was

50:47

like a local vector database. I think it

50:49

was like written in Typescript and it

50:51

just lived on the user machine. Uh and

50:53

then I was using some like embedding uh

50:55

model that was in in the cloud to

50:57

compute the embeddings before storing

50:59

it. Um and that that worked like pretty

51:01

good, but there's a lot of issues with

51:04

rag. Um so for example, I was finding

51:06

that the code drifted out of sync. Like

51:08

if I make a local function, it's not yet

51:10

indexed and so rag isn't going to find

51:12

it. There's also this question of like

51:14

how exactly is the index permissioned?

51:16

So who can access it? I can access it.

51:18

Um but then how do we like encode that

51:20

in kind of permission policies? How do

51:22

we make sure no one else can access it?

51:24

How do we make sure that like if there's

51:25

a rogue IT person within the company,

51:28

they can't access someone else's data?

51:30

This is really really important that we

51:31

think about this.

51:32

>> Yeah.

51:32

>> Um and so we just decided like it was

51:35

sort of working, but it was it also has

51:37

a lot of downsides. And so we tried a

51:39

bunch of other stuff. Uh one of them was

51:41

just using the model to uh kind of index

51:43

everything recursively. Um that was kind

51:45

of a cool idea. There was another

51:47

version where um we just tried glob and

51:49

gp. We tried a bunch of different stuff.

51:51

It it turned out that agentic search

51:52

just outperformed everything

51:54

>> and and when I say agentic search, this

51:56

is a fancy word for glob and grap.

51:58

That's all it is.

51:59

>> Nice. So So the model both got good

52:02

enough and you realize that it can use

52:03

these tools pretty efficiently.

52:05

>> Yeah. And this was uh it was partially

52:07

inspired honestly by my experience at

52:09

Instagram because at at Instagram click

52:11

to definition didn't work because the

52:13

the dev stack was just borked like half

52:15

the time and I think now it's better.

52:18

And so what engineers weren't to do

52:20

instead is let's say you're looking for

52:22

the definition of the function fu

52:24

instead of click to definition what you

52:25

would do is you would use the global

52:27

index which is quite good at meta and

52:29

then you would search for fu per opening

52:31

parenthesy and this worked pretty well

52:34

and it it's funny because like this

52:36

works for the model pretty well too

52:38

interesting how one one idea from one

52:41

area can come to the other one of the

52:44

more advanced parts of cloud code that

52:46

we've also previously talked about is

52:48

the permission system. Can you talk

52:50

about what was complex about it? And

52:54

also you recently open source

52:55

sandboxing, right? Permissioning is

52:57

really complex. Um there's like

53:00

everything else that has to do with

53:02

security. It's a Swiss cheese model.

53:05

There are a number of classifiers that

53:07

run to make sure the command is safe. Um

53:10

and there's also static analysis that we

53:12

do to make sure the command is safe. As

53:14

a user, you can also allow list

53:15

particular patterns that you know to be

53:17

safe. So, for example, um some standard

53:20

Unix utilities we preow because we know

53:23

they're readon because we know they

53:24

can't expilt your data or anything like

53:26

this. So, we we just won't prompt you

53:28

for permission. But actually quite few

53:31

tools fall into this category because

53:34

even something like the find command,

53:36

there's actually a way to execute

53:37

arbitrary code as part of that command

53:39

because there's there's like system

53:40

flags that you can use for this. or even

53:42

something like the said command. There's

53:44

ways to use this. So there's just like

53:45

all this like arcania about these

53:48

various Unix utilities where it's

53:49

actually not as safe as you think.

53:51

>> And so we want to be by default fairly

53:53

conservative about what we allow by

53:55

default. As a user though you can

53:56

configure an allow list. So you can say

53:58

for example like the these patterns are

53:59

allowed the these patterns are not

54:01

allowed. Uh and so we we let you define

54:03

that and we also check this allow list

54:05

to to make sure that it's safe.

54:08

>> Yeah. And then you you have this like

54:09

neat permission system where every time

54:12

you run a command that needs permission,

54:13

you can decide to run it once or run it

54:16

for either this session or whatever it

54:18

makes sense or just globally allowed

54:20

going forward. Right. That's right. This

54:22

is a funny artifact. This was actually

54:23

in the very very first version of quad

54:25

code. This is the way permissions

54:27

worked. This is the very first release.

54:29

This was like September 2024, the first

54:31

internal release. I remember at the time

54:33

we weren't sure whether agentic safety

54:35

could be even be solved. And so there

54:38

was actually a lot of push back

54:39

internally from safety teams because

54:40

they were like okay like you can't just

54:42

run let the model run bash commands like

54:44

that's unsafe. So like what do you do

54:46

like this is not a solvable problem so

54:48

like we can't launch this. I I

54:49

brainstormed with Ben man and Ben was he

54:52

started the labs team. He's one of the

54:53

founders at Enthropic. Um he's actually

54:56

he's the the person that hired me to

54:57

Anthropic. We just came up with

54:59

permission prompts as the way to do

55:00

this. You you put the if you're not sure

55:02

just ask the human and and they can

55:04

decide.

55:05

>> Yeah. I wanted to ask you about how

55:07

software engineering is done in general

55:09

in terms of Antrophic and one of the

55:11

first questions which is a I guess a

55:14

more formal one but or from the outside

55:17

is titles or lack of them. Everyone at

55:20

Antroic has the same title member of

55:22

technical staff. Why did this happen and

55:24

what does this result in this kind of

55:26

like everyone there basically no titles

55:29

right except for one? I think it's kind

55:31

of an acknowledgement that um everyone

55:33

just is figuring stuff out. And um if if

55:37

you kind of squint and look at the work

55:39

people are doing, it's all quite similar

55:42

and it's it's kind of quite generalist

55:44

and if you talk to the average software

55:46

engineer, they might not just be doing

55:48

coding. They might also be doing a

55:50

little design. They might also be

55:52

talking to users. They might be writing

55:54

their own product requirements. They

55:56

might be writing software and also uh

55:59

you know doing research. They might be

56:00

writing product code and also

56:01

infrastructure code. At anthropic

56:03

there's a lot of generalists. This is

56:05

also you know from my background. This

56:06

is one of the reasons that I gravitated

56:08

towards it. And I I I think member of

56:10

technical staff just kind of encodes

56:12

this in in the way that people talk to

56:15

each other even if they don't know each

56:16

other. Without this title the default

56:18

would have been I see your name on Slack

56:20

and under your name it says software

56:21

engineer. And then I'm like well okay I

56:23

guess you're like you're the coding

56:24

person then. So I'm I'm not going to ask

56:26

you like product questions, but when

56:28

everyone's title is member of technical

56:29

staff, by default, you assume everyone

56:31

does everything. And so it kind of

56:33

inverts this this relationship between

56:35

people even if you don't know each other

56:36

well yet. In in a way, it's kind of this

56:38

like optimism built into the built into

56:41

the structure. Um I think it's also a

56:44

glimpse of the future because I I think

56:45

this is where software engineering is

56:47

going. I think this is where every

56:49

discipline is going is more of this

56:51

generalist model. It definitely feels

56:53

like it in in software engineing. And I

56:55

I heard this funny uh comment by Mark

56:59

Andre uh how we said that there's this

57:01

Mexican standoff happening in the tech

57:03

world where the the designers are are

57:05

saying that they're actually now doing

57:07

like PM and engineering work. The

57:09

engineering are saying we're doing

57:10

design and and like everyone thinks

57:13

they're doing the work of the others and

57:14

they're kind of standing there like I'm

57:16

doing your work as well. when the

57:17

reality is everyone's role is expanding

57:19

most of it thanks to AI because it makes

57:21

easier for an engineer to do product

57:22

work or for a product person to engineer

57:24

work and so on. So just what what you've

57:26

said

57:27

>> I I remember back in the back in June or

57:29

July of last year I I walked into the

57:31

office and the data there's a row of uh

57:34

data scientists that sit right next to

57:36

the quad code team at least at least at

57:38

the time and I walked in and our data

57:40

scientist for the quad code team had

57:41

quad code up on on his monitor and um he

57:45

he was using it and I was like this is

57:47

interesting cuz you're you're a data

57:49

scientist did you have like why are you

57:50

using a terminal like you didn't have

57:52

NodeJS installed cuz we depended on

57:54

Node.js JS back then. I I was like, "Are

57:56

you are you dog fooding it? Like are you

57:58

just like trying to like figure out how

57:59

this thing works or something?" He's

58:00

like, "No, no, I'm like I'm using it to

58:01

run queries." He was just like using it

58:03

to run SQL and it had like little like

58:04

ASKI visualizations uh in the terminal.

58:07

Uh and then the next week the entire row

58:09

of data scientists had quad code running

58:12

on their computers and and this expanded

58:15

and so if you look at the team today on

58:18

the quad code team everyone codes the

58:21

engineers code our engineering manager

58:23

codes designers code uh data scientists

58:26

code uh our finance guy codes everyone

58:30

on the team codes and I think part of it

58:33

is quad code just makes it so easy so

58:36

you don't really have to understand the

58:38

codebase. You can just like dive in and

58:39

and kind of make small changes quite

58:41

easily. But I think another thing is

58:43

people are able to use cloud code to do

58:46

their jobs more whether it's you know

58:48

financial forecast or you know data

58:50

science or whatever and by doing this

58:52

it's actually quite an easy crossover to

58:53

just use it to write a little bit of

58:55

code also. So it's just a way to dip

58:57

your toe in the water. One other

58:58

interesting thing about how you work is

59:00

Cat Woo was talking about she is I guess

59:04

you the title is the same but people

59:06

might gravitate for role a bit more. I

59:08

understand she's a little bit more on a

59:10

product role but you said that PRDs are

59:12

just not really written inside entropy

59:14

and PRD's product requirement document.

59:16

It's a well-known artifact across big

59:18

tech and increasingly over larger

59:20

startups where you write a spec and the

59:22

idea is that you write down your

59:23

thoughts, people align, you send it over

59:25

and now you know what to build. But

59:26

apparently you're not doing much of this

59:28

or at all.

59:29

>> Some of this I think is because

59:30

Anthropic is still, you know, it's still

59:31

a startup. So you you don't actually

59:33

have to align with that many people

59:34

usually. You can just kind of talk about

59:36

it or do it in Slack or whatever. Um but

59:38

yeah, also part of it is, you know, like

59:40

Cat used to be an engineering manager.

59:41

She's she's extremely technical and I

59:43

think this is this is the way that you

59:45

know our product team thinks about it

59:46

too is you know better send a PR.

59:49

>> You're you're doing a lot of prototyping

59:51

instead. So like that that's also

59:52

something where when we talked about how

59:54

you were building cloud code early on

59:57

you were showing actually you had a

59:58

whole thread about the number I think

60:00

you did like 15 or 20 prototypes for the

60:03

the to-do list and all of them

60:04

interactive working and what surprised

60:06

me compared to my past tech experience

60:09

and you said that well you did this in

60:11

like a day and a half all all 20 tried

60:13

it out got a feeling for it which

60:15

incomprehensible for me it would have

60:17

taken a week or two weeks and people

60:18

would have not done 20 they would have

60:20

done three. Yeah.

60:21

>> So like are are you seeing this? Is

60:23

there an increase in in prototyping and

60:25

and building and showing instead of you

60:27

know writing things?

60:28

>> Yeah. Absolutely. I mean on our team the

60:30

culture is we don't really write stuff.

60:32

We just we show. It's a little hard to

60:33

to reflect back on the time before cuz I

60:36

I think now just prototyping everything

60:38

is so baked into the way that we build.

60:41

Just everything is prototype multiple

60:43

times. Like uh you know we launched

60:45

agent teams earlier this week. This is

60:46

our implementation of swarms. It it's

60:48

very exciting because uh it just lets

60:50

Claude do more work for longer, more

60:53

autonomously. You have a bunch of

60:55

different uh uncorrelated context

60:57

windows and you have this kind of

60:58

communication between agents. They can

61:00

just do more. This is something that uh

61:02

Daisy and Suzanne and other folks on the

61:04

team uh and and Karen, they they

61:07

prototyped this for months and they

61:09

tried all in all probably hundreds of

61:11

versions of this before they got a user

61:13

experience that felt really good. um it

61:15

was just really really hard to get

61:17

right. There's just no way we could have

61:19

shipped this if if we started with, you

61:21

know, like static mocks in Figma or if

61:24

we started with a PRD or something like

61:25

this. It's a thing that you have to

61:26

build and you have to feel and you have

61:28

to see how it feels. And to me, one of

61:30

the big takeaways even from there was

61:32

like we probably should prototype more

61:33

and just be more daring or just release

61:36

your priors of how long it took to build

61:38

a prototype or who needed to build. Back

61:41

then it was always an engineer that

61:42

needed to build, but it's probably not

61:43

true anymore. Yeah, that's right. I

61:45

mean, we're in this world right now also

61:46

where we just we don't know what the

61:48

right answer is. You know, like I I

61:50

think back in the old way of building

61:51

you the cost of building was high and so

61:54

you had to actually spend a lot of

61:55

effort to aim very carefully before you

61:57

take your shot because after you take

61:59

your shot um it it's very hard to course

62:01

correct. You can only take so few shots.

62:03

But now it's changed. The cost of

62:05

building is very low. Um but also we

62:07

don't know where we're aiming. So we

62:08

just have to like we have to try and we

62:09

have to see what feels good. And it's

62:12

just very very exploratory. And I think

62:14

also a big part of it is humility where

62:16

you know personally I'm wrong like half

62:19

the time I'd say like most of my ideas

62:21

are bad. At least half of them are bad.

62:23

And I don't know which half until I try

62:25

it.

62:26

>> And I get feedback from others as well

62:28

sometimes.

62:29

>> That's right. It's like I I have to try

62:30

it myself and then I have to see what

62:32

others think cuz you know my intuition

62:33

does not always match others. When you

62:35

were showing these prototypes of just

62:37

how the the tasks were built, you were

62:40

telling me that you built the prototypes

62:42

and then your process was always you

62:44

first like looked at it, you tried it

62:45

out, you got a feel for it and then for

62:47

the ones that you felt were good, you

62:49

showed it to others and sometimes they

62:51

give you feedback like nah this doesn't

62:53

work and then sometimes when it felt

62:54

good then you shared it even broader. So

62:56

I feel like you know like it's a mix

62:58

right where like sometimes you can

62:59

decide already and then sometimes you

63:01

get feedback and then eventually some

63:03

good ideas come out of it. Yeah, and

63:05

there's a lot of examples of this like

63:06

uh we we launched this kind of condensed

63:08

view for file reads and file search just

63:10

because the the model is just so agentic

63:12

now like I felt like half the screen is

63:14

these like file reads and I actually

63:15

don't care like I you know I read a

63:17

thing I don't really care what it is and

63:19

so we condensed this down to make the

63:20

output a little bit more readable. I

63:22

really liked it after probably 30

63:24

prototypes or something like this. It

63:26

took it took so much effort to make that

63:27

feel really good and clean. We rolled it

63:30

out to employees at Enthropic for about

63:31

a month and we had everyone dog fooded

63:33

and I fixed another probably dozen dozen

63:36

bugs, dozen tweaks based on all this

63:37

feedback. We launched it externally and

63:40

you know almost all users liked it but

63:42

there were a few users that didn't

63:43

because they want more expanded output.

63:45

Um and so on the GitHub issue I was just

63:47

going back and forth with people to be

63:48

like you know what like what don't you

63:50

like and people gave a lot of feedback.

63:51

I shipped another version. Then some

63:53

people liked it, some people didn't. And

63:54

so I iterated again and kind of made it

63:57

good. And it it's actually I think

63:58

almost there where people can configure

64:01

it the way that they want, but still the

64:02

default is really good. But this is just

64:04

the process. You know, we we get it

64:06

right some of the time. We have to learn

64:07

from our users. We want to hear from

64:09

people so we can get it right.

64:10

>> Do you use ticketing systems for your

64:12

work where you know where where you

64:13

capture like, all right, here's the work

64:15

I I want to or do you just pretty much

64:17

do the work as as it comes in?

64:19

>> So at Anthropic, we leave it up to teams

64:21

on the quad code team. and we leave it

64:22

up to every person. Uh different people

64:24

use uh use this differently. For

64:26

example, I don't use a ticketing system.

64:28

Some people like to use a sauna or notes

64:30

or something like this. One of the

64:32

coolest things that I saw, this was

64:34

maybe like 3 months ago or something. We

64:36

launched plugins and the way we launched

64:38

that is uh Daisy for a weekend, she had

64:41

a very early version of swarms and she

64:44

let the swarm run and she told that your

64:46

job is to build plugins. You have to

64:48

come up with a spec. Then you have to

64:50

make a asauna board and split up into

64:52

tasks. And then all the different agents

64:53

have to build it. And uh she set up a

64:56

container and she set up a quad in

64:58

dangerous mode. And she let it run for

65:00

the entire weekend. It spawned a couple

65:03

hundred agents. They made 100 tasks on

65:05

the sauna board. Uh and then they

65:07

implemented it. And that's pretty much

65:08

the version of plugins that we shipped.

65:10

These kind of coordination systems that

65:11

used to be for humans, but um I think

65:14

nowadays it's just as much for models.

65:15

Let's let's talk about cloud co-work. Uh

65:18

it's one of the very impressing things

65:21

about this. It looks great. So I tried

65:23

it out. It's inside cloud. You have the

65:25

co-work tab there and and you can I I

65:28

feel it's a lot more visual way of of

65:30

running agents interacting with them.

65:32

One of the surprising thing I heard that

65:33

it was built in 10 days. Can can you

65:36

take us through like what it took to

65:38

build it and what does actually mean?

65:39

Was it from the idea or like from the

65:41

decision of of building it? And how big

65:43

was the team building it?

65:44

>> The team was really small. It was just a

65:45

few people for a long time. We felt that

65:48

there is some product to be built for

65:51

non-engineers. The reason we felt this

65:54

is for a long time people that were

65:55

using cloud code are non-engineers. Um

65:58

and so you know in the product world

66:00

when you see latent demand you see

66:02

people jumping through hoops to use a

66:03

product that was not designed for them.

66:06

That's a really good sign it's time to

66:08

build another product that is built just

66:10

for them. There's all these people on

66:12

Twitter that there's this one guy that

66:14

was using uh quadco to like monitor his

66:16

tomato plants. I just I love this. It

66:18

was like he had like a webcam set up and

66:20

quad was like, "Oh my god, I'm so happy

66:22

that our plant is budding." And because

66:24

it was it had like a webcam and just

66:25

like every day was like monitoring it

66:27

and it it was so happy that the tomatoes

66:28

were growing. There was someone that was

66:30

using quad code to, you know, recover

66:32

photos off of a corrupted hard drive and

66:34

it was like his wedding photos.

66:36

>> Wow.

66:36

>> Um you know, like I said, our entire

66:38

finance team at Anthropic uses quad

66:40

code. Our sales team uses quad code. So

66:43

there there's just all these people that

66:44

are non-engineers that were using it.

66:46

And at that point quad code it's

66:47

available in a lot of form factors right

66:49

like we started in a terminal then we

66:52

expanded and we added support for

66:54

ideides. So we have extensions for you

66:57

know every VS code based ID every Jet

66:59

Brains based IDE there's also iOS and

67:01

Android apps there's the desktop app uh

67:04

there's web. So uh then then there's

67:07

like Slack and GitHub apps. So we kind

67:09

of expanded to all these places to make

67:10

cloud code easier for engineers. But

67:13

ultimately none of these are built still

67:15

for non-engineers. And so cloud code

67:17

evolved a lot, but it still felt like

67:19

there's a there's kind of a gap and

67:21

there's a product that could make this

67:22

even easier for people. And so for the

67:24

last couple months, the team was kind of

67:26

hacking around and just saying like what

67:28

is the right product? And at some point

67:30

someone came up with this idea of like

67:31

what if we just take quad code, add some

67:33

guardrails. So for example, co-works

67:35

with a virtual machine. This is one of

67:37

the many ways that we make sure it's

67:38

really safe. Um, especially for

67:40

nontechnical users that don't want to

67:42

read like bash commands to figure out

67:44

what it what it's doing. And they were

67:47

hacking on this. I think it was

67:48

something like 10 days end to end or

67:50

something. It was just fully built with

67:51

quad code. Uh, and then we shipped it.

67:54

>> And can you give us a sense of like the

67:55

complexity behind an app like this? And

67:58

if if we can walk through like what

68:00

parts needed to be built because from

68:03

the outside it's a little bit hard to

68:04

tell like is this just a nice UI wrapper

68:06

that's you know like I don't know like a

68:08

few hundred lines of code. I'm just

68:09

being obviously I'm I'm provocative here

68:12

or behind the scenes it's actually

68:14

really complex piece of software. And

68:15

the reason I ask is like Uber is a great

68:17

example where people look at the app it

68:19

looks really simple. I've worked there

68:21

and I know it's it's really really

68:22

complex because you don't see a lot of

68:23

the complexity. There's a a lot of

68:25

regional things. There's a lot of

68:26

backend things that are all hidden. So

68:28

from just from looking at it, claude

68:30

coowork, it's it's hard to tell how much

68:31

of this is is additional business logic

68:34

that needed to be carefully thought out

68:36

versus it's actually just a nice little

68:38

thin wrapper on top of the the model. In

68:40

some places, I think there's less

68:42

complexity than you would think. In some

68:43

places, there's more complexity. So on

68:45

the product side, it's quite simple um

68:47

cuz it's just the quad desktop app. So

68:48

you know, you download the Quad app.

68:50

It's it's a single desktop app. It has a

68:52

tab for co-work, it has a tab for code,

68:54

it has a tab for chat. So it is just one

68:56

app and we were able to inherit a lot of

68:57

that product logic. There's some UI

68:59

rendering code under the hood. You know

69:00

it's just the same quad code running.

69:02

It's the same quad agent SDK that powers

69:04

quad code. A lot of the complexity

69:06

actually is about safety because we know

69:09

like I said we know the user is

69:11

nontechnical and so we just want to make

69:12

sure they have a good experience and so

69:14

for example if someone launches the app

69:16

and then you know like they delete a

69:17

bunch of family photos that's really not

69:19

good and so we wanted to make sure that

69:21

we protect against this so you can't

69:23

accidentally do that. And so that's

69:25

where a lot of the guardrails came from.

69:26

So there's a bunch of classifiers

69:27

running on the back end. This is for

69:29

safety and again extra mitigations for

69:32

things like prompt injection and you

69:33

know risks like this around security. On

69:36

the front end there's an entire virtual

69:38

machine that we ship. There's a bunch of

69:40

operating system system level

69:42

integrations to make sure people don't

69:44

accidentally delete things. So just

69:46

around safety there there's a lot there.

69:48

And then we also had to rethink the

69:49

permission system because we inherit the

69:52

permission system from quad code. Um but

69:54

also for co-work actually a big part of

69:56

the value is not just running locally

69:59

but it's using all of your tools the way

70:00

that quad code uses it. But the thing is

70:03

for nontechnical users your tools aren't

70:05

really available as CLIs. Some of them

70:07

are available over MCP. Many of them are

70:10

available in a browser. And so co-work

70:12

is really really good when you pair it

70:13

with a Chrome extension. And this is the

70:15

way that I usually use it. So, you know,

70:17

for example, I use it every week to do

70:19

uh project management for the team. We

70:21

have like we have a spreadsheet that

70:22

tracks kind of at a really high level

70:23

what everyone's working on. And this is

70:25

kind of my personal way of project

70:26

managing. You know, other people, like I

70:28

said, use ASA, other people use notes or

70:30

whatever. For my own test, I don't use

70:32

anything, but kind of for the team

70:33

overall, I have the spreadsheet and I

70:35

have co-work kind of check-in and I I

70:38

just ask co-work every week, hey, can

70:40

you look at the rows for any status that

70:42

has not been filled out? Can you just

70:43

ping the engineer on Slack? And so it'll

70:46

open one tab in Chrome for the

70:47

spreadsheet. It'll open another tab with

70:49

Slack and then it'll just start

70:51

messaging engineers in Slack and it just

70:53

oneshots it. There's like one engineer's

70:55

name for some reason it can't

70:56

autocomplete. Um but every everything

70:58

else it just gets. And so this is

71:00

actually like from a safety point of

71:02

view, we also thought pretty deeply

71:03

about this Chrome extension and how this

71:05

works and how the permissioning model

71:07

should interact with this local

71:09

permissioning model. So there's also a

71:11

bunch of code to kind of make sure that

71:12

that's that feels smooth. And what's the

71:14

tech side behind this? I assume a lot of

71:16

will be similar to the the cloud app,

71:18

but is it is it electron, typescript,

71:20

those kind of things or or something

71:21

else?

71:22

>> Yeah. Yeah, just electron and

71:23

typescript. Actually, some of the people

71:24

working on it are early electron folks.

71:26

So, uh Felix who's uh you know the

71:29

creator of of co-worker

71:32

on electron. He helped build it.

71:34

>> Oh, amazing. And co-work launched Mac OS

71:37

only. uh what was the reason for both

71:41

for choosing this platform first and for

71:43

now only choosing this platform?

71:45

>> Yeah, so Windows coming soon. Um I think

71:47

probably by the time this podcast comes

71:49

out we will have Windows support. Uh we

71:51

just wanted to start early and start

71:53

learning you know like everything we do

71:54

at Enthropic it's kind of like the way

71:57

that I told my own story the one of the

72:00

things I like about anthropic is it just

72:02

really really matches the way that

72:03

people here think about it. you know,

72:05

back to this point where like we don't

72:07

have high certainty about the things

72:08

that we build and our intuition is often

72:11

wrong and so we just have to like learn

72:12

from users and figure out what people

72:14

actually want and just spend a lot of

72:15

time listening to people and

72:17

understanding the feedback deeply. This

72:18

is the way that we build product and so

72:20

we always launch a little bit before

72:22

it's ready. Um we did this for quad code

72:24

when we launched quad code initially it

72:26

didn't even support Windows also it

72:28

didn't support you know like a lot of

72:30

different stacks and then over the

72:31

coming weeks we added support for every

72:32

stack. Now quad code supports every

72:34

single stack. Um you know like Windows

72:37

whatever weird Linux dro use Mac OS we

72:40

support everything and so for core work

72:41

also we just wanted to launch early we

72:43

wanted to start with Mac as that was

72:45

just the starting point but um yeah it's

72:47

it's going to support everything. One

72:49

thing you mentioned is is getting

72:50

feedback. I'm curious both for cloud

72:53

code and for cloud co-work. How do you

72:55

go about things like observability

72:57

monitoring when you're rolling out? Do

72:59

you use any feature flags? And I'm I'm

73:01

more interested in like did you build

73:02

custom tools for this or did you decide

73:05

to use certain vendors because es

73:08

especially for observability I'm sure

73:10

that this is this is both important but

73:12

it also sounds like pretty high scale in

73:14

terms of the the number of users that we

73:16

can derive or this will not be a small

73:18

operation. Yeah there's there's some

73:20

off-the-shelf vendors that we use

73:21

there's some custom code that we use. So

73:23

um it's actually it's a mix of both.

73:25

There's nothing too surprising about it.

73:27

There's one thing about Enthropic that's

73:29

kind of interesting is because we're an

73:30

enterprise company and we care a lot

73:31

about privacy and security, we can't see

73:33

people's data. Um, and so, you know,

73:36

like if someone reports a bug, like I

73:38

actually can't pull up your logs to kind

73:40

of see what's going on. A lot of work

73:42

goes into kind of figuring out how to

73:43

log events and things like this in a

73:45

privacy preserving way. Um, this is just

73:47

very important to the way that we

73:48

operate

73:48

>> for co-work. What kind of learnings have

73:50

you had so far? It's it's it's been out

73:52

for I think a few weeks now. Did you see

73:55

something unexpected? uh are you shaping

73:58

the product based on feedback that

74:00

you're getting?

74:00

>> Yeah. Uh every day the team is landing

74:03

so many fixes. The most surprising thing

74:05

is just how much people are loving it.

74:07

To be honest, when Quad Code first came

74:09

out, it actually wasn't an overnight

74:11

hit. This is something people think it

74:13

was, but it was sort of a slow take off

74:15

at the beginning. And I think the first

74:16

big inflection was in May when we

74:19

released Opus 4 and Sonnet 4. That's

74:21

when it really clicked and that's when

74:22

our growth became exponential. But at

74:25

the beginning, it was sort of a research

74:26

preview. people didn't really know how

74:27

to use it. Some people got it

74:29

immediately, but most people didn't. It

74:30

took it took a little while. For

74:32

co-work, it's a much steeper growth

74:33

trajectory than quad code was at the

74:35

beginning. So, it it's just been an

74:37

instant hit. And that that's actually

74:39

been very surprising. I I didn't really

74:41

expect that. One of your new releases,

74:44

which came out just very recently, it

74:46

was I think yesterday or the day before

74:48

when we're recording this podcast, was

74:50

agent teams. And I as I understand the

74:52

idea with what agent teams agents forms

74:56

instead of single agent you can have a

74:59

lead agent and it can delegate to its

75:01

different teammates. How did you start

75:02

experimenting with this and how did you

75:04

decide to ship it? Now we're always

75:06

doing experiments right there's uh

75:08

there's there's all sorts of ways uh to

75:11

get more mileage out of out of quad

75:14

code. Um one way you can do it is by

75:16

extending context. Another way is autoco

75:18

compacting context. So it's essentially

75:19

infinite context and that's what we have

75:21

right now. Another way is using sub

75:23

agents. So you have multiple agents kind

75:25

of working together. Um there's just

75:27

like a lot of different approaches to

75:29

get a little bit more mileage out of the

75:30

context window. There's this one idea

75:32

called uncorrelated context windows.

75:35

That's what we call it. And the the idea

75:36

is you have multiple context windows. Um

75:39

but they essentially start fresh. So

75:41

they don't know about each other. And so

75:43

an example of this is like a correlated

75:45

context window is if you have one if you

75:47

have the model and it does a task and

75:49

then you have it just do a second task

75:50

in that same context window. Um and in

75:52

this case the the second task knows

75:54

about the first one cuz it's in the same

75:55

window. But for something like a sub

75:57

aent it's uncorrelated because the main

75:59

agent prompts the sub aent but the sub

76:00

aents context window is fresh. Besides

76:02

that prompt it doesn't know what's in

76:03

the parent context window. And you can

76:06

see this actually a little bit in uh for

76:08

example like sub agents versus uh skills

76:11

because when you run a skill uh you know

76:13

or slash command it sees the parent

76:15

context window versus for a sub agent it

76:17

doesn't. So it's uncorrelated. There's

76:20

some cases where you want that context.

76:22

There's some cases when you don't. Um

76:24

and there's this kind of interesting

76:25

thing where uncorrelated context windows

76:28

and just throwing more context at the

76:30

problem and throwing more tokens at it

76:31

when the windows are uncorrelated gives

76:33

you better results. Um, it's actually a

76:35

form of test time compute to do this.

76:37

And for something like teams, we've been

76:39

experimenting with this for a while. I

76:41

think since maybe like October or

76:44

September or something like this, and it

76:46

really just felt like with Opus 4.6, it

76:49

clicked where the model figured out

76:51

really how to use this. And sometimes

76:54

you see these kind of cute exchanges

76:55

where the agents are talking to each

76:56

other and they're like discussing

76:58

something and it's just very cool to

76:59

see. It's very like humanistic in a way.

77:01

But there's other times where you just

77:03

get very good results. And so we had a

77:05

bunch of internal evaluations for

77:06

example where we have quad build

77:08

something very very complex, something

77:09

more complex than what a single quad

77:11

would build. And we saw the results just

77:13

really really improved with Opus 4.6

77:16

with teams. And that's why we felt it's

77:18

the right time to release it. We also

77:19

wanted to be careful. Um, and the reason

77:22

you have to opt into it, the reason it's

77:23

a research preview is it uses a ton of

77:25

tokens cuz it's just a bunch of quads

77:27

that are running. Um, not everyone wants

77:29

this all the time. So just excited to

77:32

see how people use it and uh you know to

77:34

to hear the feedback. It's it's

77:36

something you want for fairly complex

77:37

tasks. You don't probably want this for

77:39

every task. The main quad decides the

77:41

rules for the sub quads. We don't have a

77:43

kind of a regimented way to do this.

77:45

It's context specific. I wouldn't say

77:47

there's one right way to do it. I think

77:48

actually a lot of the magic of this

77:50

comes out of this idea of uncorrelated

77:52

context windows. It's less about the

77:54

specific configuration of the agents.

77:56

But you know it's something that people

77:57

should experiment with. I don't think

77:58

there's a one-sizefits-all.

78:00

>> Have you seen use cases even in even I I

78:03

know it's it's still research, but have

78:04

you seen use cases where it could look

78:06

it looks promising this approach, the

78:08

swarm approach?

78:08

>> Well, you know, like I said before,

78:10

plugins were fully built with swarms.

78:11

There there's a bunch of other feature

78:13

since that were built in this way. So

78:15

yeah, I I think for anything where you

78:16

see a single cloud struggling, swarms

78:19

can help. It's it's an interesting to

78:21

look at. Talking about change in in

78:24

general with Andrew Carpathy, you had a

78:26

really interesting exchange back in

78:28

December where when he posted that he's

78:30

never felt as much behind as as a

78:32

programmer as he is now because of the

78:36

progress with AI. And then you shared

78:38

the story about how you started to debug

78:40

a memory leak the oldfashioned way and

78:42

then Claude just one shot at it. I think

78:45

it was a reflection of like how everyone

78:47

is feeling that things are changing so

78:48

fast and in the in the holiday break I

78:51

started to feel that things have have

78:53

really shifted. How did you I guess come

78:56

to terms with this or or start to

78:57

embrace this change? This is something I

78:59

really struggle with. The model is

79:02

improving so quickly that the ideas that

79:06

worked with the old model might not work

79:08

with a new model. the things that didn't

79:11

work with the new model might work or

79:12

with the old model might work with a new

79:14

model. And it's weird because there's

79:16

just not a lot a lot of other

79:18

technologies like this. So I I just

79:20

don't really have a lot of experience to

79:22

draw on to figure out how I should

79:25

approach this. And it's been this new

79:28

skill that I've had to learn. In a way,

79:30

it's like you just always have to bring

79:32

this beginner mindset. Honestly, like

79:34

I'm using the word humility a lot, but

79:35

you always just have to bring this kind

79:37

of intellectual humility because just

79:40

all these ideas that were bad before are

79:42

now good and and and the inverse. I I

79:44

think that's honestly it it's something

79:46

I I constantly have to remind myself

79:48

about. And back in the It's funny back

79:51

in the old world when someone tries an

79:53

idea again and we've tried it in the

79:55

past and it didn't work, usually the

79:57

feedback is like, why are you doing this

79:58

again?

79:58

>> Yeah. Yeah. You should learn. This used

80:00

I mean we used to call a bit of a

80:02

gatekeeping but it was somewhat valid

80:03

where I know with architecture someone

80:05

came and said like why don't we do

80:07

microser and someone said we tried it

80:08

and it didn't work and if you tried it a

80:10

year or two or 3 years ago it was kind

80:12

of valid right cuz not much has changed.

80:14

Yeah, that's right. That's right. And

80:15

something with Microsoft, it's it's

80:16

funny because it's like every 10 years

80:18

it goes in and out of in and out of

80:19

style. But yeah, now now it's I think

80:21

the first time ever where it's actually

80:23

not crazy to just try the same idea

80:25

every few months because the model

80:26

improves and it just works. And I I

80:29

actually see this with engineers on the

80:31

team. Like new people that are newer to

80:33

the team, people that are newer to

80:34

engineering sometimes do things in a

80:37

better way than than I do. Um and I just

80:39

have to like look at them and I have to

80:41

learn and I have to adjust my

80:42

expectations. you know, like an an

80:44

example of this is, you know, when when

80:46

we release features, sometimes I'll like

80:47

screenshot myself using them on, you

80:49

know, on X or on threads or whatever

80:51

just to kind of talk about it. Um, but

80:53

recently, Tar, our um, you know, our

80:55

devro guy, he actually codes a lot. Um,

80:57

he's amazing and he just started

81:00

automating this. So, he's having like

81:01

quad code generate its own videos for

81:04

for its launches and he just started

81:05

doing this and, you know, this is

81:08

something like I thought would be, you

81:09

know, maybe it's possible. It's not

81:11

something I would have tried because I

81:12

wouldn't have thought the model was

81:13

ready, but he just he just did it and it

81:14

just kind of worked.

81:15

>> One thing that I've I felt like just a

81:18

bit like odd about and I think a lot of

81:20

developers can relate is I've come to

81:22

terms with this starting from Opus 4.5

81:26

the and and also similar models like I

81:28

think GPT 5.2 gave me similar vibes as

81:32

well. the models have been just really

81:34

good at writing code and I I realize

81:35

that I don't think I will handr write

81:37

the code when I'm get I when I want to

81:40

get stuff done if if I actually want to

81:42

you know get the pleasure of writing I

81:44

can still do it but one thing I

81:46

reflected on is it's just been so much

81:48

effort to get good at coding I I

81:50

remember when I when I was learning when

81:52

I I started from like kind of hacking

81:54

around to go into university to learning

81:56

C and C++ and it it was just bloody hard

81:59

and actually you know going through my

82:01

my first few jobs where I started to

82:02

become better at it. I became better at

82:04

debugging and there's a point where like

82:06

a lot of my identity was tied to being

82:08

good at coding. That's how we used to

82:11

get jobs or higher paying jobs. When I

82:13

was an engineering manager when we

82:14

designed the interview loop at Uber, we

82:16

we had talk with managers of what we

82:18

need to screen for and we we talk like

82:20

well what do developers do most of their

82:21

time? About 50% of the time they code.

82:24

Therefore, we placed about 50% of the

82:26

signal was all about coding. So there

82:28

was a lot of things tied into coding

82:29

because it it is just hard. I think we

82:31

all know that it takes grit. It takes

82:33

some level of intelligence to get good

82:35

at it. And there's a sense of loss of

82:37

like well I I think it's great on one

82:39

end that the model can do it. But it

82:41

feels that something really quickly got

82:43

taken away that I don't think I

82:45

personally thought it would happen this

82:47

quickly. And I'm

82:51

I think a lot of other people are

82:52

feeling like some people move on a bit

82:54

easier, but there's definitely this

82:55

sense of of grief. How did you think

82:58

about it? Because again, you're you're

82:59

an example of you you wrote so much code

83:03

at at Facebook also outside of it. I

83:06

know it was just a tool of doing it, but

83:08

not many people could do what what you

83:09

did. And now the models can also work as

83:12

good as you have or if not better.

83:14

>> That's the challenge. Yeah. I think it's

83:16

it's something that used to be a thing

83:19

that we do as software engineers. It's

83:21

becoming a thing that everyone is able

83:23

to do. There was a moment, you know,

83:24

like when I started coding, it was a

83:27

very practical thing and it was a way to

83:28

get things done. And at some point I

83:31

just fell in love with the art of coding

83:33

and like languages and kind of the the

83:35

the tools themselves. And at some point

83:38

I I kind of fell down this rabbit hole.

83:40

I wrote this like I wrote I wrote a book

83:41

about, you know, a programming language.

83:43

>> Typescript. You wrote the first ever

83:44

TypeScript uh book at with O'Reilly.

83:47

>> Yeah. Yeah. Yeah. That's right. Um it it

83:50

was funny actually. There there was this

83:51

like there was this amazing moment for

83:53

me in my little town in Japan. I went to

83:55

the bookstore and I I found that book

83:56

translated in Japanese.

83:57

>> No.

83:58

>> In this tiny town and that was just like

84:00

the coolest moment. And then I actually

84:01

realized I I don't remember Typescript

84:03

at all cuz I was only writing Python for

84:05

a couple years at that point. Yeah. And

84:08

like at some point I started the the

84:09

first the the biggest TypeScript meetup

84:11

in the world. That was in that was in

84:12

SF. And I got to meet kind of a lot of

84:14

my heroes. There was like Chris Cowell

84:15

who wrote like general theory of

84:17

reactivity. There was Ryan Doll the guy

84:19

that made Node. one of the first times

84:22

that I I went really deep into this this

84:24

community and um just the language

84:26

itself and the the tools themselves and

84:30

for something like TypeScript there's

84:31

this beauty in the type in the type

84:33

system cuz Hilesburg is just like he he

84:36

he's just brilliant like the idea of

84:38

like conditional types and just like

84:40

anything can be a literal type and there

84:42

there's these very deep ideas that even

84:46

the most hardcore functional languages

84:48

do not have like even in something like

84:50

Haskell like it doesn't go this far and

84:53

H Anders just took it and he pushed it

84:54

much further than than it had had been

84:57

pushed and you know like Joe Pamer and a

84:59

bunch of other folks kind of explored a

85:01

lot of these ideas and thought of this

85:02

and I think for them it was also very

85:04

practical right because they had these

85:06

large untyped JavaScript code bases how

85:08

do you gradually migrated to something

85:09

typed and you have to come up with these

85:11

very beautiful ideas to to do this for

85:13

me is Scala was another kind of rabbit

85:15

hole that I fell into in kind of like

85:17

this functional programming world And

85:20

still when I write code and when the

85:21

model writes code I always think in the

85:22

types first that that's what matters is

85:24

what what is the type signature that

85:26

matters more than the code itself and

85:28

getting that right. So there is this

85:30

beauty to it. There's a there's an art

85:31

to it for sure. But in the end it's a

85:35

practical thing and in the end this is a

85:38

thing that we use to to build things and

85:41

you know it's a means it's a means to an

85:44

end. It's not an it's not an end to

85:45

itself. I I think one metaphor I have

85:48

for kind of the this moment in time that

85:49

we're in is the the printing press in,

85:53

you know, like the the 1400s or whatever

85:55

>> because at that moment it it was

85:57

actually quite similar, right? Like

85:58

there was a group of scribes that you

86:00

know knew how to write

86:01

>> and it it it was as I understand of

86:03

course we never lived there but as as I

86:04

imagine it was it was a art process to

86:07

learn. You needed to learn you needed to

86:09

get the equipment. You probably needed

86:10

some sponsorship or being selected

86:13

practicing because you needed to produce

86:15

the same thing over and over again and

86:17

few people could do that and I assume it

86:19

was either high prestige or highly paid

86:20

or who knows let's assume it was

86:22

>> but then the printed press came along.

86:24

>> Yeah. Yeah. And at least in Europe like

86:27

you had to like a lord or a king or

86:29

something had to had to employ you and

86:31

then you had to go through you know

86:32

years of training and there was this

86:34

class of scribes that knew how to write.

86:36

They were employed by someone like this.

86:38

often the king themselves like or you

86:40

know the queen was was not literate. So

86:42

it was this very very niche skill and it

86:44

was like less than 1% of the population

86:46

was literate in Europe you know back

86:48

then and then the printing press came

86:50

out and what happened so the cost of

86:54

printed material went down something

86:56

like 100x over the next I think 30 years

86:59

50 years or something the quantity of

87:01

printed materials went up like 10,000x

87:04

in the next 50 100 years this was the

87:06

first effect literacy it took a little

87:09

while for it to catch up so I think

87:11

global literacy it went up to something

87:12

like 70%. But that took like another 200

87:15

years, 300 years because learning

87:17

learning to read is just very hard.

87:18

Learning to write is hard. It takes a

87:19

lot of effort. It takes uh education

87:21

system. It takes you know infrastructure

87:23

to have paper and ink uh and the free

87:26

time to do this instead of working on a

87:27

farm. So it kind of it took early stage

87:29

of of of industrialization to actually

87:31

get there. But I but I think this effect

87:34

of making it so this thing that was

87:36

locked away in ivory tower and now it's

87:38

accessible to everyone. This is just,

87:40

you know, like none of the things around

87:42

us would exist today without this. Like

87:44

if if we weren't literate, if the people

87:46

that built, you know, this microphone

87:48

weren't weren't literate, it would have

87:50

just been very hard to have a modern

87:51

economy. None of these things would

87:53

exist. And I I just kind of think about

87:57

back then if people had to predict what

87:58

would happen when the printing press

88:00

came out, no one would have predicted

88:01

that the microphone would become a

88:04

thing. So, I I just feel like this is uh

88:06

this is the best the best uh analog for

88:09

for the moment that we're in right now.

88:11

>> Yeah, it's interesting that you say that

88:13

some of the kings were illiterate who

88:15

are employing the scribes because if

88:18

we're being honest with ourselves,

88:21

we have business owners who know what

88:23

they want to build and there are

88:25

employing software engineers because

88:26

they themselves cannot write code. And I

88:28

think we we like to mock the CEOs who

88:30

are coming there coming to the team.

88:33

They they might even have a drawn

88:35

prototype or whiteboard and saying this

88:37

should be easy but of course they don't

88:38

understand how difficult it is. There

88:41

seems to be a bit of analogy where where

88:42

there's a person who wants what they

88:44

want but until now they needed to hire a

88:47

software a specialist who can build that

88:49

and there's always that disconnect

88:51

between the idea and the person and just

88:53

like with the printing press like what

88:55

would happen if they could actually

88:57

express and like the king could actually

88:58

read or write their own letters they

89:00

wouldn't need that middleman and it

89:02

things become more efficient. But I mean

89:04

of course for the scribe it's not the

89:05

best news necessarily but I mean smart

89:08

scribes can also do so someone needs to

89:10

like write the books run the press etc.

89:13

Yeah, exactly. And and if you think

89:14

about what happened to the scribes,

89:16

right? Like they cease to become

89:17

scribes, but now there's a category of

89:19

writers and and authors like the these

89:22

people now exist. And uh the reason they

89:24

exist is because the market for

89:26

literature just expanded a ton.

89:28

>> And I guess also if we think about like

89:30

back then a scrib's work was read by a

89:33

few people and with the printing press

89:34

and author there's a lot more authors

89:36

and some of them are not really read but

89:38

some of them have wider reach than than

89:40

they could imagine. There's new careers

89:41

that that exist because of that.

89:44

>> Yeah,

89:44

>> I love the analogy.

89:45

>> And the most exciting thing for me is

89:47

it's just so impossible

89:51

to say today what will happen after this

89:54

happens and after this transition

89:56

happens just you know the the economy as

90:00

we know it would not have existed

90:01

without it. So what's next? like what

90:04

what is the thing that we can't even

90:06

predict today that will exist because

90:10

anyone can do this?

90:11

>> Well, we cannot predict but I think we

90:13

can look at what is working right now.

90:15

If you look around in your environment,

90:18

may that be the team across entropic who

90:21

are software engineers or or builders or

90:23

members of technical staff, however we

90:25

call them, who to you are stand out.

90:27

What are they doing? What skills have

90:29

they built up? And and how have they

90:31

changed the way they they work? It's

90:33

hard to name individuals because

90:35

honestly this is just the strongest the

90:37

these are the strongest people I've ever

90:38

worked with in my career. There's all

90:40

sorts of different archetypes. There's

90:42

some people that are really amazing

90:43

prototypers. Um so take something from

90:46

zero to.5. Just you know figure out like

90:48

what are some cool ideas? What is the

90:50

technology on walk? There's other people

90:51

that are amazing at finding product

90:53

market fit. So kind of 0.5 to one or

90:55

maybe 0ero to one. There's other people

90:57

that span different disciplines and I

90:59

I'm just seeing more and more of these

91:01

people like I said like people that span

91:03

uh product engineering and

91:04

infrastructure engineering or you know

91:06

product and design or design and

91:09

engineering. I I think I'm just seeing a

91:11

lot more of these of these hybrids.

91:13

>> What's a belief that changed from last

91:15

year to this year? Something that you

91:17

know like you either believed or or a

91:21

conviction that you had that you've

91:22

either revised or completely threw away.

91:24

I think one thing I wasn't sure about is

91:27

how big a problem is safety to be

91:29

totally honest. Um I jo I joined

91:31

Anthropic because like I said I read a

91:33

lot of sci-fi and I kind of I know how

91:35

bad this thing can go if it goes bad. It

91:37

wasn't something I was sure about. Um

91:39

but seeing it from the inside and then

91:42

seeing how the new risks that have

91:45

arisen in the last year, it just makes

91:46

me much much more worried about it. Um

91:49

so I I think it's it was kind of an

91:52

important thing for me. Now it's just

91:54

the most important thing for me is how

91:56

do we make sure this thing goes well.

91:57

>> I think it's safe to say you you were a

91:59

really great software engineer even

92:01

before all all the AI things started and

92:04

you seem to be a very productive

92:05

engineer of course part of a team as

92:07

well but but also individually. What are

92:09

some skills of like you know before

92:12

being a software engineer that are are

92:15

still as valuable or maybe even more

92:17

valuable than before and what are ones

92:19

that are maybe just not as much and and

92:21

they're best left behind probably. Okay,

92:24

so the stuff that's left behind is uh

92:26

best left behind is maybe like very

92:27

strong opinions about like code style

92:29

and languages and things like this. Like

92:32

I I can't wait to get past like these

92:33

endless language debates and framework

92:35

debates and all the stuff because the

92:37

model can just like you know use

92:39

whatever language and framework and if

92:40

you don't like it it can just rewrite it

92:41

for you. So it just doesn't matter

92:43

anymore. I think something that still

92:44

matters a lot today is things it's being

92:48

methodical and hypothesis driven. This

92:51

matters both in product design in this

92:53

world where everything is being

92:55

disrupted and we need to figure out what

92:57

to build next and this is something

92:58

everyone is thinking about. Um, but it

93:00

also matters for engineering day-to-day,

93:02

you know, like something like debugging.

93:03

You just have to be very methodical

93:05

about it. And the model can can do this

93:07

and it can help a lot. Um, but I think

93:09

still we're in this transition point

93:10

where you still need to have the skill.

93:13

I don't know if you you're you're still

93:14

going to need to have it in 6 months.

93:16

Other skills that I think are more

93:17

valuable are

93:20

being curious and being open to doing

93:24

things beyond your swim lane. So, you

93:27

know, if you're working on engineering,

93:29

but you really understand the business

93:30

side, you can just build really awesome

93:33

products. And I and I think the next,

93:35

you know, billion dollar product, you

93:36

know, like after quad code, whatever the

93:39

next startup is that, you know, becomes

93:40

the next trillion dollar startup, it

93:43

might just be like one person that has

93:45

some cool idea and their brain just is

93:48

able to think across, you know,

93:49

engineering and product and business or,

93:52

you know, like design and finance and

93:54

something else. It's like it's people

93:56

are going to become more and more

93:57

multi-disipline and this will become

93:58

more and more rewarded. So in in some

94:01

ways I think this will be the year of

94:02

the generalist. I think the other skill

94:03

that's actually been been rewarded of it

94:05

is uh having a short attention span.

94:08

>> I was being rewarded now. Oh yeah. It's

94:11

uh you know like people you know like

94:13

teenagers are using you know like like

94:15

Tik Tok and and all this stuff and I

94:18

think in some ways it's kind of

94:19

dangerous for society um because like

94:21

you want people that can think deeply

94:22

and can contemplate ideas and uh aren't

94:26

just moving on to the next idea very

94:27

quick but in some ways I think this year

94:29

is kind of the year that is going to

94:31

reward uh it's like the year of ADHD

94:35

because the work for me has become

94:38

jumping between quads. has become

94:40

managing clouds and so it's not so much

94:42

about deep work it's about how good am I

94:45

about context switching and you know

94:47

jumping across multiple different

94:48

contexts very quickly

94:49

>> could I add that from what I unders what

94:52

all you said maybe we could add one

94:54

thing which is adaptability because

94:56

you're saying of course that ADHD and

94:58

and you can jump across but of course

95:01

earlier you are very good at focusing

95:03

deeply on one thing as well and what

95:04

strikes me about you and maybe this is

95:06

true for other people as well you you're

95:08

just kind of very open to adapt ting

95:09

your working style and seeing what works

95:11

well for this stage, especially when

95:13

things are changing. I think the one

95:16

certain thing we can be sure is whenever

95:18

the next model comes out, it'll change

95:19

again. And you need to be curious and

95:21

open to adapting how you work, right?

95:23

>> Yeah. And as closing, what's a book or

95:26

books that that you would recommend?

95:27

I've gone down a rabbit hole. Um, so

95:30

he's the threebody problem guy, but he

95:32

actually has like a lot of other really

95:33

great books. I really love his uh short

95:34

stories. Um, he has a couple books of

95:37

short stories. I'm a big fan. For people

95:38

that are new to sci-fi and you want like

95:40

a little bit like harder sci-fi, um I

95:43

really love Accelerondo by St. This is a

95:46

book I would totally recommend. It's

95:47

like essentially the product roadmap for

95:49

the next 50 years. Um it it it starts

95:52

with takeoff kind of starting to happen

95:54

and kind of AI singularity and then it

95:57

ends up with like uh this kind of like

96:00

group lobster consciousnesses orbiting

96:02

Jupiter and it's just like amazing. And

96:04

the thing that I think it really

96:06

captures is just the pace this like

96:07

quickening quickening quickening pace of

96:09

how this feels. It really matches the

96:11

feeling right now. And then on the

96:12

technical side, I would strongly

96:14

recommend functional programming in

96:15

Scola. Even if language choice just

96:18

doesn't matter as much anymore, I think

96:20

there is this art to functional

96:22

programming that just teaches you how to

96:24

code better. Um, and it'll just teach

96:26

you how to think in types. If you read

96:28

this book, I think what's really

96:29

important is to do the exercises also.

96:31

And I've gone through and I've done all

96:33

of them probably like three times over

96:35

and it's just amazing. It it really just

96:37

like knocks this idea of functional

96:39

types into your head and it's just a

96:41

thing you can't stop thinking about.

96:43

>> Boris, thank you so much. This was

96:45

awesome.

96:46

>> Yeah, thanks Kirk. This was a really

96:48

interesting conversation and the thing

96:50

that I keep coming back to is to Boris's

96:53

prickic press analogy. The idea that

96:54

medieval scribes were this tiny elite

96:56

who could write employed by kings who

96:58

themselves were often illiterate and

97:00

that we soft rangers might be in a

97:02

similar position today. We are the

97:04

scribes. We spent years mastering this

97:06

craft. And now the printer press is

97:08

arriving. But what Boris told me is that

97:10

the scribes did not disappear. They

97:12

became writers and authors and the

97:14

entire market for written work expanded

97:16

beyond anything anyone could have

97:17

predicted. I do find this hopeful and

97:19

also appreciate that Boris didn't

97:21

sugarcoat it. The other thing that

97:22

struck with me is just how differently

97:24

the Cloud Code team built software. No

97:26

PRDS, no mandatory ticketing system,

97:29

designers and data scientists and

97:30

finance people all writing code and

97:32

building dozens or hundreds of

97:34

prototypes before shipping a feature.

97:36

And Boris is shipping 20 to 30 pore

97:38

requests a day without editing a single

97:40

line by hand. And there are different

97:42

verification systems in place. Claw code

97:44

reviewing its code, automated lint

97:46

rules, best of end passes, and human

97:48

code review. If you've enjoyed this

97:50

podcast, please do subscribe on your

97:51

favorite podcast platform and on

97:53

YouTube. A special thank you if you also

97:55

leave a rating on the show. Thanks and

97:57

see you on the next one.

Interactive Summary

Ask follow-up questions or revisit key timestamps.

Boris Churnney, the engineering lead for Claude Code at Anthropic, discusses the transformative impact of AI on software development. From his first handwritten PR being rejected at Anthropic to now shipping 20-30 PRs a day with zero manual code editing, Boris shares how the Claude Code team uses parallel agents, virtual machines, and specialized safety layers. He draws a powerful analogy to the 15th-century printing press, suggesting that while the role of the traditional 'scribe' (coder) is changing, the market for 'authors' (builders) is expanding beyond prediction. The conversation covers the internal development of Claude Code, the importance of generalist skills, and how Anthropic measures code quality and productivity.