Building Claude Code with Boris Cherny
3022 segments
You were the first ever TypeScript book
with O'Reilly.
>> Yeah, I found that book translated in
Japanese in this little town in Japan.
That was just the coolest moment. And
then I realized I don't remember
TypeScript at all. Now we're at the
point where Quad Code writes, I think
something like 80% of the code had
Enthropic on average. I wrote maybe 10
20 p requests every day. Opus 4.5 and
Quad Code wrote 100% of every single
one. I didn't edit a single line
manually.
>> Andre Carpet posted that he's never felt
as much behind as a programmer as he is
now.
>> This is something I really struggle
with. The model is improving so quickly
that the ideas that worked with the old
model might not work with the new model.
One metaphor I have for this moment in
time is the printing press in the 1400s
because there was a group of scribes
that knew how to write.
>> Some of the kings were illiterate who
are employing the scribes.
>> And if you think about what happened to
the scribes, they ceased to become
scribes, but now there's a category of
writers and authors. These people now
exist. And the reason they exist is
because the market for literature just
expanded a ton.
What happens when you join one of the
top AI labs in the world and your first
poll request gets rejected? Not because
the code was bad, but because you wrote
it by hand. This is exactly what
happened to Boris Churnney when he
joined Antrophic. Boris is the creator
and engineering lead behind Claude code.
Before joining Androphic, he spent 7
years at Meta where he led code quality
across Instagram, Facebook, WhatsApp,
and Messenger, and was one of the most
prolific code authors and code reviewers
at the company. In today's episode, we
cover how Cloud Code went from a side
project to one of the fastest growing
developer tools and the internal debate
at Entrophic whether to release it at
all. Boris's daily workflow of shipping
20 30 poll requests a day with zero
handwritten code and how code review
works when AI writes everything. Why
Boris believes we're living through a
time as transformative as a printing
press and which engineering skills
matter more now and which ones do not.
If you want to understand how one of the
people closest to AI coding agents
actually builds software today and what
that means for the rest of us engineers,
this episode is for you. This episode is
presented by Statsig, the unified
platform for flags, analytics,
experiments, and more. Check out the
show notes to learn more about them and
our other season sponsors, Sonar and
Work OS. How did you get into tech,
software engineering, and and coding in
general?
>> It starts a while back. I think there
was kind of like two parallel paths that
crossed. So, when I was maybe 13 or
something like this, I started selling
my old Pokemon cards on eBay. And I
realized that on on eBay, you can
actually like write HTML. And I was
looking at other people's Pokemon card
listings and I realized like some of
them have like big colors and fonts and
stuff like this. And then I discovered
the blink tag and I named Blink Tag.
>> And if I put the blink tag on it, I
could sell my card, you know, for like
99 cents instead of 49 cents or
whatever. So I kind of learned about
HTML this way. Then I got an HTML book
and kind of learned about HTML. And then
uh the second thing was this was also I
think sometime in middle school. We had
these old TI83 uh graphing calculators
and we use them for math. And what I
realized is I can get a better answer on
the math test if I just program the
answers to the math test into my
calculator. And so I wrote these little
programs to just program the answers and
then the test got harder. first then I
had to program solvers instead of the
actual questions cuz I didn't know what
what you know the coefficients and stuff
would be ahead of time and then the math
got more advanced like the next year and
so I had to drop down from basic to
assembly to just make the program run a
little bit faster.
>> Oh wow. So like in high school you
dropped down to assembly.
>> I think this is like middle school or
high school maybe like 8th or 9th grade
or something like this. Then then the
thing I realized is uh everyone in my
class was starting to realize that I had
the solver and they got kind of jealous
and so I bought this little serial
cable. so I can give it to them too. And
then the next math test, everyone on the
class just got A's. And the teacher was
like, what's going on? And then
eventually she realized it. She was
like, okay, you get away with it once
and and uh knock it off. But for me, it
it was very practical. So, you know, in
school I studied economics. Um I
actually dropped out to to startups and
I never thought that coding would be a
career at all. It was always very
practical to me. Coding is a means to
build things and to to make useful
things. this startup. Um, the first one
was I think it's like my friends and I
were trying to get weed
and so we started this like weed review
startup. We made like a website. We
called kind of different uh dispensaries
I I think and then we just tried to get
kind of like weed samples so we could
like review it for them. And it actually
kind of blew up. Um, and then I actually
got more interested in uh at the time no
one was like testing this stuff and so I
got into kind of the like chemical
testing kind of chemical analysis and
then after this I kind of did a bunch of
other startups and then I joined YC
actually pretty early uh and I was the
first hire of uh this YC startup up in
up in Palo Alto after.
>> How did you decide to go go to one
startup after the other?
>> Kind of vibes vibes I'd say cuz you know
you know like you know startups it's
it's never a linear path. You always
kind of pivot pivot pivot. You have to
figure out what the market wants and
what users want. And it's never the
thing that you think. You you always try
a thing, but the the idea is always a
hypothesis and then almost always you
have to pivot once, twice, three times.
You know, at at this uh at this medical
software company, this is called Agile
Diagnosis. This was kind of an early YC
company. This was back in maybe 2011,
2012, something like that. It was
medical software for doctors. And the
idea was there's these like clinical
decision protocols. They vary a lot
hospital to hospital. And our idea was
there was one hospital in Chicago that
had a really great protocol specifically
for cardiac symptoms. And so we're like,
wouldn't outcomes be great if every
hospital in the US would use the same
protocol? And so we tried to standardize
it. And we made this like decision tree
software for doctors to use. And I
wrote, you know, some of the software.
The team was like it it was it was just
a few of us. It was a pretty small team.
And I wrote the software. It was in a
web browser. And I remember this was
back in the like the Internet Explorer 6
days. that's what hospitals were using
>> and I wrote this like SVG renderer uh
because it was this visual decision tree
and we launched it and then we had a DAU
chart and the DUS were flat and couldn't
figure it out and we were piloting it
with a few hospitals at the time and at
the time we were based in PaloAlto we
were piloting it with uh you know a few
hospitals including UCSF and I rode a
motorcycle at the time so I rode my
motorcycle up to you know UCSF and I
shadowed doctors for a couple days just
to see how how do they actually use
And I realized that actually doctors
don't have time to sit down and use a
computer because you're seeing a patient
>> then you have maybe 5 minutes until the
next patient and in those 5 minutes you
have to walk down the hall you have to
go to the computer station you have to
open up this totally legacy computer. By
the time it boots up that's like 3
minutes. Then you open up Inner Explorer
6 that takes like 30 seconds. Then you
have to open up this like app that we
built. You have to sign in and your 5
minutes are up. you don't even have time
to use it. And so we rewrote everything
to run on Android and they still weren't
using it. And the thing we realized is
doctors are walking around with a bunch
of residents behind them. In this kind
of situation, it's like a social
situation, right? Like the thing that
matters is they're seen as an authority.
They don't want to be seen on their
phones. And then we pivoted again. So at
that point, we were like, okay, so maybe
the doctor isn't the target user.
Actually, we wanted to be used by maybe
nurses or X-ray technicians or something
like this. At that point, I left because
I was like, "This is actually pretty far
off from kind of what I wanted to do."
This is like the most fun thing for me
is finding this this product market fit
because it's always surprising. You
can't have one big idea because the idea
is probably going to be wrong. So, you
kind of form hypothesis, you you follow
it down and and you see what's right.
Also, I find it so interesting how
you're telling us this story because I
feel behind a lot of startup success
stories, we hear the success story. We
hear the path of how it went. But first
of all, a lot of startups are like this.
And second of all, what struck me is you
you were hired as a software engineer,
right? And this was back before product
engineers or anything was a thing which
we're now talking about. But you just
like you rode your motorbike and you
went there and you shadowed the people
and you understood how they're using it,
why they're not using it. getting
getting ideas. I I feel, you know, this
this is what makes a great software
engineer back then and and even today,
right? You you weren't doesn't seem to
me that you were focused on a
technology. You were focused on the
outcome, though.
>> Yeah. I mean, look, there there's
different kinds of engineers and there's
different ways to do it. And you know, I
even even on our team right now, I look
at an engineer like Jared Sumar and he's
just incredible technical mind. He
understands systems better than anyone
I've met. And you know you need you need
people like this. You need people with
this kind of depth. For me engineering
has always been a practical thing. Uh
and you know for me I've always been a
generalist and like it doesn't matter if
I'm doing you know like design or you
know if I'm doing engineering or user
research or whatever. The investment
thesis for AI and software engineering
is straightforward. As AI writes more
code more code needs to be verified. But
there's a catch. AI generated code is on
average harder to verify than human
written code. This is why there's Sonar,
the makers of Sonar Cube. As a critical
verification layer for the AI enabled
world, Sonar ensures that speed and
volume with AI does not compromise your
codebase. Sonar's competitive position
is built on 17 years of specialized
expertise that no foundational model can
replicate. We're talking about deep
analysis engines like symbolic execution
and cross- repository data flow tracking
that simulate how code actually behaves,
not just what it says. To bridge the
divide between AI productivity and code
quality, Sonar has released the Sonar
Cube MCP server. This tool acts as a
universal translator between AI
applications and the Sonar Cube
platform. By using the modal context
protocol, it gives AI tools like cloud
code, GitHub copilot, and cursor direct
access to sonar cubes analysis
capabilities. Instead of context
switching, your AI agent becomes a
full-fledged code review and quality
assurance copilot capable of analyzing
code snips for issues, filtering bugs by
severity, and even checking your
project's quality gate status before you
ever commit code. Whether you're working
with coding assistants or scaling up
with full agogentic workflows, Sonar
provides the automated verification that
75% of the Fortune 100 rely on. It's
about giving your developers the freedom
to innovate without the fear of breaking
the code base. Head to
sonarsource.com/pragmatic
to learn more about how Sonar enables
the confidence to develop at the speed
of AI. With this, let's get back to
Boris's career and what he learned
working at startups. My first job I ever
had, I was like, I think I was 16 and I
just wanted to buy an electric guitar.
And so what I did was I I started uh I
just started freelancing. And so I was
like, "Okay, I guess I'll make
websites." And I think Fiverr was not a
thing back then. So there were some
other freelancing websites. So I just
started like I put up a website. I
started bidding on stuff. And my first
paycheck, I just spent the entire thing
on an electric guitar. But it but it was
very practical, right? Right? Cuz it's
like when you're in this kind of setup,
you have to you have to do the
engineering, you have to do kind of the
accounting, you have to do the the
design, you have to talk to customers.
It's just always been like that for me.
After a couple of these startups, you
ended up at Facebook now now called
Meta. And there you spent seven years
there. Can you just talk us through what
you've worked there, what you've learned
there? You've also had a very remarkable
career growth in terms of four
promotions over over over seven years.
And what did you take away from that
that experience?
>> Yeah, so I started on Facebook groups.
That was the first time I worked on uh
Vlad Klesnikov uh hired me. I think I
think he's actually still at Facebook.
Um I think he's on some other team now.
And it was cool actually. There there's
a big group of people that I worked with
that were these kind of early JavaScript
people too. And you know, like I I did a
bunch of JavaScript stuff. And it's
funny like I kept crossing paths with
these people. And so Vlad, he worked on
Bolt.js, JS which was the software it
was the framework that powered ads
manager which later became ReactJS. I I
kept crossing paths with these people
and later on for yeah later on there
there was a bunch more people like this
but anyway so I I was working on
Facebook groups um I was really excited
about it because the because of this
mission of connecting people to their
community. This is the thing that drew
me in. And at the time I was a big
Reddit user. I became a Reddit user back
when I was a teenager because I didn't
know anyone else that coded. Even in
college, I didn't really know anyone
that coded.
>> And honestly, I was always kind of
embarrassed about it cuz I thought it
was this nerdy thing. And I thought it
was kind of this this thing that I knew
how to do, but I wanted, you know, I
wanted to be like a cool kid and, you
know, like I I couldn't like tell people
that I coded. It was like it was very
nerdy. Um, and and at some point I
discovered it was some like programming
community on Reddit and I was I was just
shocked like there's other people that
are into this thing. It's like such a
weird hobby. It's so niche and it was
just so exciting to find like-minded
people like this and get this connection
and so I just wanted to work on this. I
wanted to kind of contribute to this in
in some way. So I worked on Facebook
groups for a while. Um, and then you
know there there's a bunch of different
projects have to to kind of get get into
details for any of these. Eventually I
became the the tech lead for for
Facebook groups and kind of grew grew
into this and the org grew the work
really changed. It changed from kind of
building to a lot of like dock writing
and coordination and kind of delegating
to others. The culture was changing at
the time. So you know this early
Facebook culture was disappearing. The
docs were coming in. The you know
alignment meetings were coming in. uh
there was a lot of a lot more work
around this kind of foundational stuff
like privacy, security, things like this
that I think honestly early on a lot of
corners were cut in order to grow. But
at some point you just have to pay that
debt and that was the time when that
happened. Then I spent a few years at
Instagram after um and that was also a
funny story. My wife got a got a job
offer and she was just really excited
about it and she came to me and was
like, "Hey, like I got this offer but
we're going to have to move. Is that
okay?" And I was like, "Yeah, that's
fine." You know, like I work in tech. we
can work remotely anywhere. Where's the
job? And she was like, it's a N. And I
was like, where where's that? And uh N
is like rural Japan. And this was uh
>> different time zone as well.
>> Different time zone. Yeah. This was
>> 12 hours or something different or
something like that.
>> Something like that. Yeah. It was like
2021.
>> Wow.
>> Um and then I I tried to kind of find a
team that would sponsor me cuz there was
there were these kind of arcane HR rules
about like the time zone you have to be
in and the team you have to be
collocated with and so on. And so uh
there was a little kind of naent team uh
for Instagram in Tokyo and Will Bailey
was running this team. He was also the
guy that made Instagram stories and uh
so he was my manager for a while and so
we decided to grow that team together
and I worked remotely from NA and then
most of the team was in Tokyo
and uh during this time I I started
hacking on Instagram and the stack was
just insane like Facebook was the single
best web serving stack in the world. the
the way that HH everything is optimized
like from from the hack language to the
HHVM runtime to the to GraphQL as the
transport layer to like the client
libraries like relay and and all the
stuff it was just and in React it was
just amazing there there's no other
devstack in the world that was this good
and it's just fully optimized and then I
went to Instagram and it's like you know
Python where the type checker didn't
work and click to definition didn't work
and it was this like kind of hack
together Django and then like a work of
uh you know the Syon runtime and just
nothing really worked and so I came to
Instagram I joined the labs team uh you
know in in Japan and the idea was to
find the next big thing for Instagram.
We tried some stuff but what I very
quickly realized is that I was just not
effective at working on the stack
because it was such a terrible stack and
so I just went and started working on
Dev Infra because uh we we needed to fix
it and there there's a few projects that
we worked on. So one was migrating from
Python to the big Facebook monolith.
Another one was migrating from Rest to
GraphQL. And uh these projects, they're
they're actually in progress, you know,
like these are things that involve it
takes hundreds of engineers many years
to do this. It's a big code base. It's a
big migration. Um now it's it's much
faster.
>> Yeah. With with with these tools that we
have, the AI AI tools and migrations are
a pretty good use case for them though.
>> Yeah. It's like the it's the perfect use
case for it. And then I I just started
getting kind of deeper into this. And by
the end, by the time I left Instagram,
so I was working on this on dev and kind
of leading a bunch of these migrations.
That's also where I intersected with
Fiona Fun who is now the manager for the
quad code team. I just worked with her
and she was just such an amazing leader,
this incredible depth and kind of
history in tech. And I just thought like
there's no better there's no better
manager for this team. And then I I also
started working on code quality. And so
the the work on Instagram kind of
expanded a bit. And um by the time I
left, I was leading code quality for all
of Meta. And so I was responsible for
the quality of the code bases across
Instagram, Facebook, Messenger,
WhatsApp, Reality Labs, kind of all
these code bases. At Meta, it it was
this program called Better Engineering.
And the idea was I think it's sort of
like 2016 or 2018 or something, but Zuck
mandated that every engineer at the
company 20% of their time has to be
spent fixing tech debt.
>> Oh, interesting.
>> And we called this better engineering.
>> Mhm. And the some of this is kind of
bottom up where you know a team knows
best the tech debt that they have to fix
and then some of it is top down where
you need to do you know very big
migrations you need to migrate to new
language features new frameworks things
like this and at Facebook scale you know
there was tens of thousands of these
migrations every year. Um and so I I
just started leading all this and I
realized very quick that it just needed
a little bit more order to it. There was
no goals. No one knew kind of like what
the outcomes were there. there wasn't
any tracking. Um, and so we developed a
bunch of stuff. Uh, one of the ideas was
a centralized way to prioritize the
different kind of code quality efforts.
The second thing was figuring out the
impact of code quality on engineering
productivity which turned out to be
significant.
>> How how did you measure what did you
find there?
>> There was a bunch of stuff. I think some
of this has been published. I don't know
if all of it has, but essentially you
try to do like causal analysis and
causal inference. This is the
methodology. You try to figure out like
what what are the factors that make it
so engineers are more productive. Some
of it is code quality, some of it is
outside of code quality. So for example,
meta went back to uh you know return to
office instead of work from home. That
was partially driven by this because we
just found some you know fairly strong
correlations that we thought were
causal.
>> Yeah.
>> Um about this but quality actually
contributes like you know double digit
percent to to productivity. It turns out
even even at the biggest scale. It's
it's kind of comforting to hear because
I I think it's it's rare to have a place
where you actually measure this, but I
think we feel it like when you have a
clean code base in modular or it can get
easier to work with and I I think you
know reasoning could it also be easier
for LM to to work with it and my hint
would be yes it should be right but I I
think there's just very little data but
that's a feeling that I I would have.
Yeah, I think a lot of the big companies
have published about this. Like I think
Facebook published something. Uh
Microsoft publishes a bunch about this,
Google does, but yeah, totally. If if if
every time that you build a feature, you
have to think about do I use framework X
or Y or Z. These are all options that
you can consider because the codebase is
in a partially migrated state where all
of these are around the code somewhere.
As an engineer, you're going to have a
bad time. As a new hire, you're going to
have a bad time. As a model, you might
just pick the wrong thing and then, you
know, like the user has to course
correct you. So actually you know the
better thing to do is just always have
you know a clean code base always make
sure that when you when you start a
migration you finish the migration and
this is great for engineers and nowadays
it's it's great for models too and then
you joined entropic and I've heard this
story which you can confirm or give more
color to it that your first poll request
was rejected by Adam Wolf.
>> He was my rampa buddy. So I joined
Enthropic. I was trying to figure out
kind of like what to do next and you
know I I met a bunch of people at all
the different labs and anthropic was
just the obvious choice for me because
of the mission. This is the thing that
personally I know that I need the most.
Um and also just kind of seeing all this
change that's happening. It's important
to have some sort of framework to think
about this and to think about our role
in it. I'm also a really big sci-fi
reader. Like that that's definitely my
genre. Um I'm I'm a big reader. I have
like, you know, giant bookshelf at home
and stuff and I just know how bad this
thing can go and I just felt like this
is a place that has serious thinkers.
People are taking this very seriously
and thinking about what what what can we
do to make this thing go better. So when
I joined Anthropic, I did a bunch of
ramp up projects uh just you know
various stuff that that I was hacking on
and I wrote my first pull request by
hand because I thought that's how you
write code.
>> That used to be how you write code.
>> That used to be how you write code. But
even at the time at Enthropic, there was
this thing called Clyde and it was the
it was the predecessor to quad code. It
was it was super janky. It was like it
was Python, you know, it took like 40
seconds to start up. It was research
code. It was not agentic. But if you
prompt it very carefully and hold the
tool just right, it can write code for
you. And so Adam rejected my PR and he
was like, "Actually, you should use this
Clyde thing for it instead." And I was
like, "Okay, cool." It took me like half
a day to figure out how to use this tool
because you have to like pass in a bunch
of flags and like use it correctly. Um,
but then it it sped out a working PR. It
just one-shotted it.
>> Oh,
>> and this was like 2024.
This like September 2024, August,
something like that. And I think for me,
this was my first fuel hi moment at
Anthropic cuz I I was just, oh my god,
like I didn't know the model could do
this. Like I I was used to these like
kind of tab completions, line level
completions in an IDE. I had no idea
that it could just make a working pull
request for me. Boris just talked about
how he had a true wow moment at work
using their AI model. A very different
wow moment is when you use a tool at
work that makes things so much easier
than before. And this leads us nicely to
our presenting sponsor, Statsig. Statsig
offers engineering teams the tooling for
experimentation and feature flagging
that used to require years of internal
work to build. It's the kind of tool
that was so complex to build that only
large companies like Meta or Uber had
their own custom advanced tooling for
it. Here's what satic looked like in
practice. You ship a change behind a
feature gate and roll it out gradually,
say to 1% or 10% of users at first. You
watch what happens. Not just did it
crash, but what did it do to the metrics
you care about? Conversion, retention,
error rates, latency. If something looks
off, you turn it off quickly. If it's
trending the right way, you keep it
rolling forward. And the key is that
measurement is part of the workflow.
You're not switching between three tools
and trying to match up segments and
dashboards after the fact. Feature
flags, experiments, and analytics are
all in one place using the same
underlying user assignments and data.
This is why teams at companies like
Notion, Brex, and Atlastian use Statsig.
Statsic has a generous free tier to get
started, and pro pricricing for teams
starts at $150 per month. To learn more
and get a 30-day enterprise trial, go to
stats.com/pragmatic.
And with this, let's get back to Boris
and the origin story of Claude Code.
>> Yeah. And and then when you when you
joined Entrophic, we we've covered this
in in a deep dive, but we could recap
briefly on how Claude Code came to be
out of out of what seemed like a side
project or just a cool hack. So yeah, I
I I started hacking on a bunch of
different stuff. Um I was working on
some things in product. Um I worked on
reinforcement learning for a little bit
just to kind of understand the layer
under the layer which I was building.
This is still advice that I give to a
lot of engineers is always understand
the layer under. It's really important
because that just gives you the depth
and you kind of like you have a little
bit more levers to to work at the layer
that you actually work at. This was the
advice 10 years ago. It's still the
advice today. Um but the layer under is
a little bit different now. You know,
before it was like understand, you know,
the Java if you're writing JavaScript,
understand the JavaScript VM and
frameworks and stuff.
>> Now it's like understand the model. So I
was hacking on a bunch of different
stuff. Uh something shipped, some things
uh didn't ship. And at some point I I
just wanted to understand the public
anthropic API because I'd never used it
before. Um and I didn't want to build a
UI. I just wanted to, you know, hack
something up quite quickly cuz we didn't
have quad code back then. We're still
writing code by hand. And I wrote this
little batch tool that um all all it did
was it hit the anthropic API and it it
was essentially like a chatbased
application um but just in the terminal
because that's what AI used to be. And
you know, I I still think about it like
engineers are the first adopters. And so
when we started to move out of
conversational AI to agentic AI, it took
a little bit, but engineers understood
it pretty quick. And I I think now when
you ask non-engineers about like what is
AI, they would say it's this
conversational AI, it's like a chatbot
or something. And that's why I'm
actually very excited for, you know,
co-work this new product that we
launched because it's going to bring the
same thing that engineer saw very early
to everyone else. But when I think
about, you know, co-work, I I think back
to this moment that we're talking about
like very early on, quad code originally
wasn't quad code. It was a chatbot
because that's what I thought AI was.
Um, but we had to kind of figure out
kind of what is the next thing. And so I
at at the time I I built this chatbot.
It was somewhat useful, but it was just
a chatbot. And the next thing that I
tried was I I wanted it to use tools
because tool use just came out and I
didn't know what it was and I was like
let's experiment
and and I I gave it a single tool which
was the bash tool and I didn't know what
to do with the bash tool and so I asked
it you know like I I actually didn't
know if it could even do this but I
asked it like what music am I listening
to and uh it just wrote a little Apple
script program using like said or or
whatever to uh open up my music player
and then like query it to see what music
it's listening to and just one shot at
this with sonnet 3.5. This is actually
my second a field AI moment very quickly
after the first one
>> and the model just wants to use tools
that though that's that's just what I
realized like this thing like if you
give it a tool it will figure out how to
use it to get the thing done and I think
at the time when when I think about the
way that people were approaching AI and
coding everyone essentially had this
mental model of you take the model and
you put it in a box and you figure out
like what is the interface like what how
how do want to interact with this model?
What do you need it to do? Essentially,
it's like if if you have a program, you
you stub out some module, stub out some
function, and you say, "Okay, this is
now AI." But otherwise, the rest of the
program is just a program. And so, this
is just not the way to think about the
model. The way to think about it is the
model is its own thing. You give it
tools. You give it programs that it can
run. You let it run programs. You let it
write programs, but you don't make it a
component of this larger system in this
way. And I think there's just like, you
know, this is a version of the bitter
lesson. There's the bitter lesson is a
very specific framing, but there's many
corollaries to it. This is one of the
corollaries is just let the model do it
do its thing. Don't try to put it in a
box. Don't try to force it to behave a
particular way.
>> One of the first ways you saw it was
giving it tools, giving it access to the
bash and then later to the file system
and then to more tools. Right.
>> That's right. Yeah, we we give it uh we
give it bash then uh I say we it it was
just me the first three months but then
the team grew. So it it was bash, it was
uh and and file edit that was the second
one.
>> And one of the interesting thing we
talked about uh last time for the deep
dive is when you built it and it started
to actually write code with with the
tool tools that you had. You've had an
internal debate inside entrophic should
we just keep it to ourselves because
it's making suddenly it spread across
engineering and it was making all of you
a lot more productive right. Yeah,
that's right. In the end, the decision
was to release so that we can study
safety in the wild. Because when you
think about safety and you know, I keep
talking about the word safety. The
reason anthropic exists as a lab is
safety. This is the reason it was
founded. This is the reason it exists.
If you ask anyone at anthropic why they
chose it, it's because of safety. And so
if you think about model safety, you
know, there's different layers at which
to think about it. There's kind of
alignment and mechanistic
interpretability. This is at the model
layer. Then there's evals and this is
kind of like a it's kind of putting the
model in a petri dish and synthetically
studying it in this way. Um and then you
can study it in the wild and you can see
how it actually behaves. You can see how
users talk about it. You can you can see
like what are the risks in the wild and
you actually learn a lot this way. And
by doing this we we've been able to make
the model much safer. So in in hindsight
it was it was totally the right
decision. It's amusing to hear about it
from your perspective because from the
outside what what I saw and what a lot
of engineers saw is like oh entropic
release cloth code oh wow this you know
for the first release with uh I I
believe it was with sonet 4 release was
was did it come out with sonet 4
originally or sonet 4.5
>> I think it was it was for that that was
the general availability in February but
I think it was research preview before
that
>> yeah but when it came out my
infiltration was like oh this thing can
write code pretty well and over time it
became a lot more capable. So from from
our perspective it was like this really
capable coding tool that we just started
to adopt and use and use for all sorts
of increasingly product productive parts
and it has become I believe one of the
fastest growing developer tools and I'm
always surprised to hear the story that
it actually comes from research and the
goal to understand how people use the
model because at the other hand like
some startups have been trying to build
developer tools deliberately to to get
adoption and yet this research tool is
getting a lot more adoption.
>> I mean this is a you know anthropic
we're we're a research lab we're a
safety lab and you know product is this
kind of thing tacked on to the side
product exists so that we can serve
research better and so we can make the
model safer and this is kind of how we
think about everything there there was
this there's also this funny moment
early on when uh we we had this launch
review and we were deciding whether to
launch it. I remember this moment cuz we
were in the room. I think it there was
like there was Mike Creger, there was
Daario, there were some other folks in
the room and we were deciding what
should we do. We were looking at the
internal adoption chart which was just
vertical
said it was just insane. It was you know
like nowadays
>> vertical is 100% right
>> just just 100% like nowadays everyone at
an every technical employee at anthropic
uses quad code every day is pretty much
100%. For nontechnical employees it's
also like it's actually getting quite
close to 100%. It's it's increasing very
quickly like you know like half the
sales team uses quad code um and I think
that's increasing it's just it's crazy.
Dario had this question about like how
how did it grow this fast? Are you like
forcing people to use it?
And I was like no we offer this tool
people vote with their feet and you know
just like let people use the tool that
they prefer.
>> Yeah they chose it.
>> You don't seem like the person who's act
exactly forcing people to use your tool.
>> Yeah. Yeah. I mean the the way we did
it, we just we launched the thing and
then we just like listened to the users
and we talked to people, we saw how they
use it, we followed up, we made it
better and yeah, I mean now now we're at
the point where Quad Code writes I think
something like 80% of the code in at
Enthropic on average and you know it
writes all of my code for sure.
>> Yeah. And this started for you it
started the first time you mentioned I
think it was in November when it started
to write all of your code. When did that
switch come and what what happened to
made you trust it to to write your code
or how much you trusted? How much you
review that code for example?
>> So the switch was instant when we
started using Opus 4.5. This was before
before it came out, you know, we we were
dogfooting it for a little bit and it it
was just right away. Um it's such a more
capable model. I just found that I
didn't have to open my ID anymore. I
just uninstalled my ID cuz cuz I just
didn't need it at that point. I actually
did that like a month later because I I
I just didn't even realize that I wasn't
using it anymore.
>> Yeah, a lot of us had similar
experiences once Opus 4.5 was out in the
public and especially over the winter
break. I I had a similar experience. I
just realized that this thing it
actually writes, if I'm being honest
with myself, as good code as I would
have written in the stack that I'm very
familiar with and my code base, my side
projects where I know it and just a lot
better than what I could for code base
that I'm not as familiar or technologies
I'm not as familiar with. Yeah. I'll be
honest, he writes better code than I do.
>> I I I don't want to go there. I I still
like to keep my pride, but probably
true.
>> Yeah. Yeah. I I realized this because
also in December, I was traveling a
little bit. I was like on a I was on a
coding vacation. We we're talking about
this before, but I I went to Europe. We
were just in a different time zone kind
of nomading around. And it was so fun
cuz I was just coding all day every day,
which is my favorite thing to do. And uh
I wrote maybe, you know, like 10 20 p
requests every day, something like that.
Opus 4.5 and quad code wrote 100% of
every single one. I didn't edit a single
line manually and I realized uh at the
end of that month Opus introduced maybe
two bugs whereas if I had written that
by hand that would have been you know
like 20 bucks or or something like that.
Can we talk about your development
workflow? You have written threads about
this which is awesome. It's on it's on
social media on threads and on on X. But
can you tell us how you use today uh
cloud code in terms of you know
parallelism and and tips and tricks that
you and the team have kind of learned
and share across the across the team?
>> Yeah, I mean look there's no one right
way to use quad code. So I I can share
some tips and things but I I think the
wrong conclusion to draw would be to
just copy copy these and and use it. The
way we build cloud code is we build it
to be hackable because we know every
engineer's workflow is different.
There's no one way to do things. There's
no two engineers that have the same
workflow. It's just every every engineer
is
>> same with workstation setup, right? Like
keyboards, monitor placement, all that.
Everyone has it differently.
>> Yeah. It's like we're like crafts
people, right? Like you choose you
choose your tools. Like we care deeply
about it. So there's no one right way to
do it. So for me, the way that I do it
generally is I have five terminal tabs.
Each one of them has a checkout of their
repository. So it's five parallel
checkouts. Um and usually I'll kind of
roundroin and start cloud code in each
one. Almost every time I start in plane
mode. So that's like shift tab twice in
the terminal. And uh I also overflow uh
as I run out of tabs cuz there's only so
many terminal tabs. I used to use web a
lot for this. So like quad.ai/code,
that's the place that I overflow to.
Nowadays I actually use the desktop app.
Um it's more convenient. So Quad Code,
you know, it's been in our desktop app
for, you know, for many months. It's
just a code tab in in the Cloud app. Um,
and I actually really like it because it
has built-in uh work tree support. So
that's existed for a while. Um, and that
that's quite nice for parallelism. So
you have multiple, you don't need
multiple checkouts. You just have one
and then we automatically set up Git
work trees for you. So you get this kind
of environment isolation. The reason I
do that is I actually just really hate
fiddling with git work trees on the
command line cuz it it's kind of fiddly.
like you need to know the CD get work
tree for those of who are not as
familiar with it. It's it's when you can
check out instead of having a separate
local folder, it's almost like checks
out separate branch, right? And then you
can work on it separately but not have
the comp have the complex only at like
merge time.
>> That's right. Imagine that you you have
a folder but you have maybe like git
makes five copies of that folder in a
way that's very cheap um and kind of
easy to throw away. So you get this kind
of isolation. it can work in parallel
and the quads don't interfere.
>> Yeah. So, you now have support for this
which I I think you recently added like
native support but like for for your
workflow you just stuck with the old one
of checking out on separate f folders,
right?
>> Yeah, exactly. I I actually find over
time I'm using the desktop app more and
more for this.
>> Um just cuz I don't need these separate
checkouts and you know I I just have a
bunch of quads running in parallel and I
don't have to think about it. The other
surprise hit is the iOS app for me.
Every day I start like I wake up and I
just start a few agents on my phone. Oh,
the the native one. Yeah,
>> the native one. Yeah, it's just like
it's the quad app. It's the code tab in
the in the quad app and it's the same
exact quad code.
>> Yeah, except it it runs in the cloud,
right?
>> It runs in the cloud. Yeah. So, you have
to kind of configure the environment.
Luckily, our environment is pretty
simple. So, you know, um and it we just
use hooks for it. So, you just use the
session start hook and configure it.
This is kind of one of the benefits of
making quad code really hackable is it's
very easy to do to do this kind of
configuration. And this is something
honestly I would never have predicted
because you know like I I I code on a
computer. If you told me six months ago
I'd be writing I don't know a third I
haven't pulled the data maybe like a
third half something like this of my
code on a phone. That's crazy. But
that's that's what I'm doing today.
>> And you're using parallel agents. At
what point did you start using them? And
how has it changed your work? Cuz one
thing that I notice on myself, I don't
really use that many parallel agents. I
maybe like two at a time, but I'm
someone who well I I like to be in
charge and especially with Claude.
Claude is is is a a tool that you can
follow it along. It tells you what it's
doing. It you can also have for example
learn mode which this was shipped a lot
earlier where where you can actually
follow along. It gives you tasks. I I
feel that like staying in one tab and
following along the model is pretty fast
as well. I can kind of keep in touch.
I'm assuming at some point you must have
done this but then what happened when
you changed to parallel and are do you
feel you're losing any control or it
doesn't really matter that much?
>> Yeah, I I I think there's kind of like
two modes to think about or kind of like
two two uh two kind of workflows to
think about. So when you're new to a
codebase, highly re learn mode is
awesome. Highly recommend it for people
that are onboarding to the quad code
team, people that onboard to enthropic.
Um the thing that we recommend is so you
do for people that haven't tried it you
do slashconfig in quad code you pick the
output style and you can do learn or
explanatory. We usually recommend
explanatory cuz that tends to be better
for new code bases um that you kind of
haven't been in before. For me once
you're familiar with the codebase you
just want to be productive right like
you just want to ship as much as you can
and you want to kind of be effective
doing that. Um so the role really
switches. I don't really go deep into
tasks anymore. I start a quad in plan
mode. I'll have it kick something off.
With Opus 4 4.5, I think it got there.
With 4.6, it just really really does it.
Once there is a good plan, it just it
will oneshot the implementation almost
every time.
>> So, the most important thing is to go
back and forth a little bit to get the
plan right. So, what I do is I I start
one, I enter plan mode, I give it a
prompt. As it's chugging along, I'll go
to my second tap and I'll start the
second quad also in plan mode. Get it
chugging along. Then go to the third
tab, go to the fourth one. Then maybe
I'll go back to the first one when I get
notified that it's done. Uh, and then
I'll kind of
>> Do you have notifications on or do you
turn them off?
>> I actually operate in both modes. Um,
sometimes I do like, you know, focus
mode on the Mac. Um, so I just have it
off, but also sometimes I use the system
notifications.
>> And you're very very productive with
with PRs. I mean, I I think it was very
visible. Even around the holiday breaks
uh on social media, you actually were
responding to I think someone reported a
bug or or a feature request. I'm not
sure which one it was. And then an hour
or two later it was done cuz cuz you did
it. You've also talked about like number
of poll requests you've done on a day
not to like show up but just as context.
What what does a poll request typically
involve in terms of complexity? Are
these like are some some super trivial
or some actually like larger pieces of
work as well?
>> Yeah, pull request each one varies a
lot. Um sometimes it's a few lines,
sometimes it's a few hundred or a few
thousand lines. They're all just very
very different. It's changed so much.
Like back when I was at Instagram, I
think I was one of the uh top two maybe
top three most productive engineers at
Instagram just by volume of code
written. Oh wow. Um so I've always, you
know, for me I've I've always just coded
a lot. Like this is uh coding is like a
way that I can express myself and it's
just like it's a way that my brain
thinks also. And so now I just get to do
it. But I I think with quad code the the
the kind of code that you write if you
are very productive it it tends to be
even it's just the number of PR sort of
underelves what what's happening because
I I think people that used to be very
productive in the old days before AI
assistance a lot of the code maybe was
like code migrations or something like
this so like people that shipped you
know 20 30 PRs every day a lot of it was
like pretty you know like a oneliner or
kind of migrating A to B or whatever.
Nowadays I ship you know 20 30 PRs every
day but every PR is just completely
different. Some of them are thousands of
lines, some of them are hundreds, some
of them are dozen, some of them are
oneliners. It's none of these are kind
of code migrations cuz actually Claude
just does those and I I don't need to be
part of that.
>> Shipping this much code or this much
productive. The obvious question that
comes up for any I guess software
professional is well the review. What
the way teams used to work and I'm not
sure if Instagram did this but a lot of
other companies did this is you make a
pull request you put it up there there's
a mandatory human reviewer at Google
there's actually two cuz there's one on
code quality as as well how has this
workflow changed how does the hot code
team think about code review and how has
it changed over time yeah I'll start by
thinking I I'll start by talking about
how code review used to work for me so
the the way that I used to do it is uh
every time I I also used to be one of
the most prolific code reviewers.
>> Oh, okay. So, both.
>> I I met Yeah. Yeah.
>> Right. Or is it code reviewers?
>> That's actually and that's one of the
benefits of being in a different time
zone. Like I'm not super human. I just
didn't have any meetings. And the the
way that I approach code review is every
time that I would have to comment about
something, I would drop it in a
spreadsheet
and I I would like describe the issue.
So, let's say, you know, like someone
named a parameter, you know, in a
function badly, I would like put that in
a spreadsheet. If someone did some bad
React pattern or something, I would I
would put that in a spreadsheet. And
then over time I would just kind of
tally up the spreadsheet and anytime
that a particular row had more than
three or four instances I would write a
lint rule for it.
>> So just automate it with kind of an op.
And so that's what it used to look like
for me. I've always tried to automate
myself away um because there's just so
many things to do. Um and this is one of
our superpowers as engineers
>> is we were able to automate all of the
tedious work. There's very few other
fields where you're able to do this
thing. This is a thing uniquely that
we're able to do. Um, and this is a
thing that I I've just always enjoyed
because it gives me more free time and
uh I get to do the work I actually
enjoy. And so today the way this looks
is a little different, but it it mirrors
this a little bit. So when cloud code
writes code, it generally it will run
tests locally. And this is something
cloud just often decides to do when it's
relevant or it'll write new tests. So
you kind of do this this kind of
verification. When we make changes to
cloud code, cloud will also test itself.
So it'll launch itself kind of in a
subprocess. It'll verify itself and
it'll test itself end to end.
>> This is for the the your internal cloud
code implementation. So you have like
this test suite so they can test itself.
>> Yeah, that's right. That's right. But
it'll literally launch itself just in a
bash process and kind of just see like
hey do I still work.
>> Wow. Okay. So it'll do this and this is
something that we we just didn't code in
like it just with Opus 4 4.5 especially
it just sort of spontaneously doing
this. It just wants to kind of check. So
so we do this and then we also run
claudep. So this is the quad agent SDK
in uh CI. So every pull request at
Enthropic is code reviewed by quad code.
Uh and that actually catches maybe like
80% of bugs something like this. Um and
it's the first round of kind of code
review. Cloud will automatically address
some of these. Some of them some of them
it'll leave to a human cuz it's not sure
what to do. There's always an engineer
that does the second pass of code
review. Um and you know there there
always has to be a person in the loop
approving the change.
>> Mhm. So on on on the team before
anything goes into production if you
will an engineer does look at it. Yes.
As you're thinking of code review would
you do this for every type of project or
this is specifically because you now
know that this actually has real world
impact people depend on it. You know
there's a lot of users let me put it the
other way around like can you see places
where you would just not have an
engineer review uh code. What situations
would that be in?
>> I think it depends how how how it's
used. Yeah I'd agree with that. But you
know if you're building some personal
side project like you can just yolo
straight to main you know like
>> it's even even before AI you would have
not reviewed you just trust yourself or
you know just ship to production or SSH
into production and do some changes that
kind of stuff right
>> exactly exactly um the very first
versions of quad code that were internal
like you know I committed straight to
main but then you know as soon as you
have users and you know for enthropic
our main customer base is enterprises
this is what we care about the most for
us for safety reasons security is really
important privacy is important. These
are these are all related. It's also
very important for our customers. And so
because this is an enterprise product,
it has to be secure. It has to be we
have to make sure that it meets a
certain bar. So we definitely use a lot
of automation, but at least for now,
there has to be a human in the loop just
to make sure.
>> One thing that is just known about LM is
they're nondeterministic.
And by putting the element as a reviewer
claude doing a review like it it will
give good feedback but how do you deal
with the fact that you can be sure if
it's always giving the feedback you
cannot be sure that even if it's capable
of catching an issue that it will
necessarily catch that. Are you doing
anything in in this loop to do
deterministic thing? For example,
linting is very deterministic as you
will very well know. Like have you
thought of marrying some of these ideas
or are you using for example are using
llinters on the codebase or you found no
need to for it? Yeah, absolutely.
Absolutely. Yeah, you
>> this is just a Yeah.
>> Yeah, we we have type checkers, we have
llinters, we run the build. Claude is
actually so good at writing lint rolls.
So, actually what I do now, I used to
tally stuff up in a spreadsheet. Now,
what I do is when a coworker puts up a
pull request and I'm like, this is
lintable. I'll just be at Claude, please
write a lint roll for this in that PR on
their PR. And we have, you know, you
just run like slash I think it's like
setup GitHub or or something like this.
You can do this in cloud code and it'll
install the GitHub app which then makes
it so you can tag add Claude on any pull
request, any issue. I use this every
single day. Um, so very very useful. So
you want these deterministic steps. Also
though there are there are ways to get
cloud to be a little bit more
deterministic. So for example, you can
do best event. You can have it do
multiple passes
>> and and this is actually quite easy to
do. So you know for example the
coderview skill that we use internally
it's open source um and it's available
in the quad code repo and so all we do
is you know we launch parallel agents to
do stuff and then we launch parallel
dduping agents to check for false
positives but essentially best of end
the way you implement it is is all you
say is claude start three agents to do
this and that's it. or just talked about
building that enterprise infrastructure
layer, the O, the permissions, the
security that has to all work before you
can ship to real customers. This makes
it a great time to speak about our
season sponsor work OS. If you're
building any SAS, especially an AI
product one, then authentication,
permissions, security, and enterprise
identity can quietly turn into a
long-term investment. SL edge cases,
directory sync, audit logs, and all the
things enterprise customers expect. It's
a lot of work to build these mission
critical parts and then some more to
maintain them. But you don't have to.
Work provides these building blocks as
infrastructure so your team can stay
focused on what actually makes your
product unique. That's why companies
like Antrophic, OpenAI, and Cursor
already run on Work OS. Great engineers
know what not to build. If identity is
one of those things for you, visit
work.com.
And with this, let's get back to
building cloud code with Boris. How does
cloud code work in terms of ar
architecture? So as as an engineer, how
can I imagine it's setup? It's uh we we
covered some of this in the the deep
dive and I think you told me that you
had some pretty complex ideas when you
started and you just simplified a lot of
it.
>> Yeah. Yeah. It's very simple like you
know there there's not much to it.
There's like there's a core query loop.
Uh there's a few tools that it use that
it uses. We we delete these tools all
the time. We add new tools all the time.
We're just always experimenting with it.
So there's kind of this core kind of
agent part of it. Then there's the the
2E part of it. Uh and then there's
there's actually a ton of different
pieces around security. Um and making
sure that everything that QuadCode does
is safe and that there's a human in the
loop for when it happens.
>> And by safety, do you mean as as a user
when it's doing stuff on my computer or
also as entropic monitoring use cases
that that could be deemed unsafe? Yeah,
there's kind of a couple versions of
this. You safety, there's just many,
many layers and for things like safety
and security, there's no one perfect
answer. So, you know, it's always a
Swiss cheese model. You just need a
bunch of layers and with enough layers,
the probability of catching anything
goes up. And so, you just have to kind
of count the number of nines in that
probability and pick the threshold that
you want. And so, for something like
prompt injection for example, we do this
generally at three different layers. So,
let's think about something like web
fetch. So cloud fetches a URL and uh it
reads the contents of of of that web
page and then it does something in in
quad code. So one of the risks for
something like this is prompt injection.
Maybe there's an instruction on that
website to be like hey quad delete all
the folders or something like that.
>> So we think about this in a number of
ways. The the most basic way is it's an
alignment problem. And so opus 4.6 is
the most aligned model we've ever
released because we've taught the model
how to be more resistant to prompt
injection. And so you can read about
this on the model card and I think it
was part of the release. The second part
is that we have classifiers at runtime
where if there is a request that seems
to be prompt injected, we block it um
and we just make the model try again.
And then the third layer is for
something like web fetch, we actually
summarize the results in using a sub
agent and then we return that summary
back to the main agent. So again, this
kind of reduces the probability of
prompt injection. And so you can kind of
see how this isn't just one mechanism.
It's it's a layer and by by having a
bunch of these different layers, it just
reduces the probability a lot.
>> One interesting technical choice that
you've also mentioned is is using rag or
not rag retrie retrieval augmented
generation and you mentioned how in the
earlier version of cloud code you use a
local vector database to to get some to
to speed up search and you layer threw
this away. Can you talk about how this
one because this was another example
where I guess did the model get better?
>> Yeah, I mean this is one of those things
where we try so many different things.
We try so many different tools and just
statistically most of them we throw
away.
>> Even something like the spinner in quad
code I think it's gone through like a
hundred iterations
>> I want to say. Oh
>> just the spinner and you know out of
those we've landed maybe like 10 or 20
in production and like 80 of them I
probably just threw away cuz it didn't
feel good enough. So just statistically
almost all the code we write we throw
away because it's just so easy to write
this code and try stuff and see what
feels good. So for something like rag we
tried a bunch of different approaches
early on. So the the first one was rag
for retrieval cuz I think this I was
just like reading up like how people
were doing retrieval and it seemed like
all the papers were talking about rag.
Um and so the way I did it was it was
like a local vector database. I think it
was like written in Typescript and it
just lived on the user machine. Uh and
then I was using some like embedding uh
model that was in in the cloud to
compute the embeddings before storing
it. Um and that that worked like pretty
good, but there's a lot of issues with
rag. Um so for example, I was finding
that the code drifted out of sync. Like
if I make a local function, it's not yet
indexed and so rag isn't going to find
it. There's also this question of like
how exactly is the index permissioned?
So who can access it? I can access it.
Um but then how do we like encode that
in kind of permission policies? How do
we make sure no one else can access it?
How do we make sure that like if there's
a rogue IT person within the company,
they can't access someone else's data?
This is really really important that we
think about this.
>> Yeah.
>> Um and so we just decided like it was
sort of working, but it was it also has
a lot of downsides. And so we tried a
bunch of other stuff. Uh one of them was
just using the model to uh kind of index
everything recursively. Um that was kind
of a cool idea. There was another
version where um we just tried glob and
gp. We tried a bunch of different stuff.
It it turned out that agentic search
just outperformed everything
>> and and when I say agentic search, this
is a fancy word for glob and grap.
That's all it is.
>> Nice. So So the model both got good
enough and you realize that it can use
these tools pretty efficiently.
>> Yeah. And this was uh it was partially
inspired honestly by my experience at
Instagram because at at Instagram click
to definition didn't work because the
the dev stack was just borked like half
the time and I think now it's better.
And so what engineers weren't to do
instead is let's say you're looking for
the definition of the function fu
instead of click to definition what you
would do is you would use the global
index which is quite good at meta and
then you would search for fu per opening
parenthesy and this worked pretty well
and it it's funny because like this
works for the model pretty well too
interesting how one one idea from one
area can come to the other one of the
more advanced parts of cloud code that
we've also previously talked about is
the permission system. Can you talk
about what was complex about it? And
also you recently open source
sandboxing, right? Permissioning is
really complex. Um there's like
everything else that has to do with
security. It's a Swiss cheese model.
There are a number of classifiers that
run to make sure the command is safe. Um
and there's also static analysis that we
do to make sure the command is safe. As
a user, you can also allow list
particular patterns that you know to be
safe. So, for example, um some standard
Unix utilities we preow because we know
they're readon because we know they
can't expilt your data or anything like
this. So, we we just won't prompt you
for permission. But actually quite few
tools fall into this category because
even something like the find command,
there's actually a way to execute
arbitrary code as part of that command
because there's there's like system
flags that you can use for this. or even
something like the said command. There's
ways to use this. So there's just like
all this like arcania about these
various Unix utilities where it's
actually not as safe as you think.
>> And so we want to be by default fairly
conservative about what we allow by
default. As a user though you can
configure an allow list. So you can say
for example like the these patterns are
allowed the these patterns are not
allowed. Uh and so we we let you define
that and we also check this allow list
to to make sure that it's safe.
>> Yeah. And then you you have this like
neat permission system where every time
you run a command that needs permission,
you can decide to run it once or run it
for either this session or whatever it
makes sense or just globally allowed
going forward. Right. That's right. This
is a funny artifact. This was actually
in the very very first version of quad
code. This is the way permissions
worked. This is the very first release.
This was like September 2024, the first
internal release. I remember at the time
we weren't sure whether agentic safety
could be even be solved. And so there
was actually a lot of push back
internally from safety teams because
they were like okay like you can't just
run let the model run bash commands like
that's unsafe. So like what do you do
like this is not a solvable problem so
like we can't launch this. I I
brainstormed with Ben man and Ben was he
started the labs team. He's one of the
founders at Enthropic. Um he's actually
he's the the person that hired me to
Anthropic. We just came up with
permission prompts as the way to do
this. You you put the if you're not sure
just ask the human and and they can
decide.
>> Yeah. I wanted to ask you about how
software engineering is done in general
in terms of Antrophic and one of the
first questions which is a I guess a
more formal one but or from the outside
is titles or lack of them. Everyone at
Antroic has the same title member of
technical staff. Why did this happen and
what does this result in this kind of
like everyone there basically no titles
right except for one? I think it's kind
of an acknowledgement that um everyone
just is figuring stuff out. And um if if
you kind of squint and look at the work
people are doing, it's all quite similar
and it's it's kind of quite generalist
and if you talk to the average software
engineer, they might not just be doing
coding. They might also be doing a
little design. They might also be
talking to users. They might be writing
their own product requirements. They
might be writing software and also uh
you know doing research. They might be
writing product code and also
infrastructure code. At anthropic
there's a lot of generalists. This is
also you know from my background. This
is one of the reasons that I gravitated
towards it. And I I I think member of
technical staff just kind of encodes
this in in the way that people talk to
each other even if they don't know each
other. Without this title the default
would have been I see your name on Slack
and under your name it says software
engineer. And then I'm like well okay I
guess you're like you're the coding
person then. So I'm I'm not going to ask
you like product questions, but when
everyone's title is member of technical
staff, by default, you assume everyone
does everything. And so it kind of
inverts this this relationship between
people even if you don't know each other
well yet. In in a way, it's kind of this
like optimism built into the built into
the structure. Um I think it's also a
glimpse of the future because I I think
this is where software engineering is
going. I think this is where every
discipline is going is more of this
generalist model. It definitely feels
like it in in software engineing. And I
I heard this funny uh comment by Mark
Andre uh how we said that there's this
Mexican standoff happening in the tech
world where the the designers are are
saying that they're actually now doing
like PM and engineering work. The
engineering are saying we're doing
design and and like everyone thinks
they're doing the work of the others and
they're kind of standing there like I'm
doing your work as well. when the
reality is everyone's role is expanding
most of it thanks to AI because it makes
easier for an engineer to do product
work or for a product person to engineer
work and so on. So just what what you've
said
>> I I remember back in the back in June or
July of last year I I walked into the
office and the data there's a row of uh
data scientists that sit right next to
the quad code team at least at least at
the time and I walked in and our data
scientist for the quad code team had
quad code up on on his monitor and um he
he was using it and I was like this is
interesting cuz you're you're a data
scientist did you have like why are you
using a terminal like you didn't have
NodeJS installed cuz we depended on
Node.js JS back then. I I was like, "Are
you are you dog fooding it? Like are you
just like trying to like figure out how
this thing works or something?" He's
like, "No, no, I'm like I'm using it to
run queries." He was just like using it
to run SQL and it had like little like
ASKI visualizations uh in the terminal.
Uh and then the next week the entire row
of data scientists had quad code running
on their computers and and this expanded
and so if you look at the team today on
the quad code team everyone codes the
engineers code our engineering manager
codes designers code uh data scientists
code uh our finance guy codes everyone
on the team codes and I think part of it
is quad code just makes it so easy so
you don't really have to understand the
codebase. You can just like dive in and
and kind of make small changes quite
easily. But I think another thing is
people are able to use cloud code to do
their jobs more whether it's you know
financial forecast or you know data
science or whatever and by doing this
it's actually quite an easy crossover to
just use it to write a little bit of
code also. So it's just a way to dip
your toe in the water. One other
interesting thing about how you work is
Cat Woo was talking about she is I guess
you the title is the same but people
might gravitate for role a bit more. I
understand she's a little bit more on a
product role but you said that PRDs are
just not really written inside entropy
and PRD's product requirement document.
It's a well-known artifact across big
tech and increasingly over larger
startups where you write a spec and the
idea is that you write down your
thoughts, people align, you send it over
and now you know what to build. But
apparently you're not doing much of this
or at all.
>> Some of this I think is because
Anthropic is still, you know, it's still
a startup. So you you don't actually
have to align with that many people
usually. You can just kind of talk about
it or do it in Slack or whatever. Um but
yeah, also part of it is, you know, like
Cat used to be an engineering manager.
She's she's extremely technical and I
think this is this is the way that you
know our product team thinks about it
too is you know better send a PR.
>> You're you're doing a lot of prototyping
instead. So like that that's also
something where when we talked about how
you were building cloud code early on
you were showing actually you had a
whole thread about the number I think
you did like 15 or 20 prototypes for the
the to-do list and all of them
interactive working and what surprised
me compared to my past tech experience
and you said that well you did this in
like a day and a half all all 20 tried
it out got a feeling for it which
incomprehensible for me it would have
taken a week or two weeks and people
would have not done 20 they would have
done three. Yeah.
>> So like are are you seeing this? Is
there an increase in in prototyping and
and building and showing instead of you
know writing things?
>> Yeah. Absolutely. I mean on our team the
culture is we don't really write stuff.
We just we show. It's a little hard to
to reflect back on the time before cuz I
I think now just prototyping everything
is so baked into the way that we build.
Just everything is prototype multiple
times. Like uh you know we launched
agent teams earlier this week. This is
our implementation of swarms. It it's
very exciting because uh it just lets
Claude do more work for longer, more
autonomously. You have a bunch of
different uh uncorrelated context
windows and you have this kind of
communication between agents. They can
just do more. This is something that uh
Daisy and Suzanne and other folks on the
team uh and and Karen, they they
prototyped this for months and they
tried all in all probably hundreds of
versions of this before they got a user
experience that felt really good. um it
was just really really hard to get
right. There's just no way we could have
shipped this if if we started with, you
know, like static mocks in Figma or if
we started with a PRD or something like
this. It's a thing that you have to
build and you have to feel and you have
to see how it feels. And to me, one of
the big takeaways even from there was
like we probably should prototype more
and just be more daring or just release
your priors of how long it took to build
a prototype or who needed to build. Back
then it was always an engineer that
needed to build, but it's probably not
true anymore. Yeah, that's right. I
mean, we're in this world right now also
where we just we don't know what the
right answer is. You know, like I I
think back in the old way of building
you the cost of building was high and so
you had to actually spend a lot of
effort to aim very carefully before you
take your shot because after you take
your shot um it it's very hard to course
correct. You can only take so few shots.
But now it's changed. The cost of
building is very low. Um but also we
don't know where we're aiming. So we
just have to like we have to try and we
have to see what feels good. And it's
just very very exploratory. And I think
also a big part of it is humility where
you know personally I'm wrong like half
the time I'd say like most of my ideas
are bad. At least half of them are bad.
And I don't know which half until I try
it.
>> And I get feedback from others as well
sometimes.
>> That's right. It's like I I have to try
it myself and then I have to see what
others think cuz you know my intuition
does not always match others. When you
were showing these prototypes of just
how the the tasks were built, you were
telling me that you built the prototypes
and then your process was always you
first like looked at it, you tried it
out, you got a feel for it and then for
the ones that you felt were good, you
showed it to others and sometimes they
give you feedback like nah this doesn't
work and then sometimes when it felt
good then you shared it even broader. So
I feel like you know like it's a mix
right where like sometimes you can
decide already and then sometimes you
get feedback and then eventually some
good ideas come out of it. Yeah, and
there's a lot of examples of this like
uh we we launched this kind of condensed
view for file reads and file search just
because the the model is just so agentic
now like I felt like half the screen is
these like file reads and I actually
don't care like I you know I read a
thing I don't really care what it is and
so we condensed this down to make the
output a little bit more readable. I
really liked it after probably 30
prototypes or something like this. It
took it took so much effort to make that
feel really good and clean. We rolled it
out to employees at Enthropic for about
a month and we had everyone dog fooded
and I fixed another probably dozen dozen
bugs, dozen tweaks based on all this
feedback. We launched it externally and
you know almost all users liked it but
there were a few users that didn't
because they want more expanded output.
Um and so on the GitHub issue I was just
going back and forth with people to be
like you know what like what don't you
like and people gave a lot of feedback.
I shipped another version. Then some
people liked it, some people didn't. And
so I iterated again and kind of made it
good. And it it's actually I think
almost there where people can configure
it the way that they want, but still the
default is really good. But this is just
the process. You know, we we get it
right some of the time. We have to learn
from our users. We want to hear from
people so we can get it right.
>> Do you use ticketing systems for your
work where you know where where you
capture like, all right, here's the work
I I want to or do you just pretty much
do the work as as it comes in?
>> So at Anthropic, we leave it up to teams
on the quad code team. and we leave it
up to every person. Uh different people
use uh use this differently. For
example, I don't use a ticketing system.
Some people like to use a sauna or notes
or something like this. One of the
coolest things that I saw, this was
maybe like 3 months ago or something. We
launched plugins and the way we launched
that is uh Daisy for a weekend, she had
a very early version of swarms and she
let the swarm run and she told that your
job is to build plugins. You have to
come up with a spec. Then you have to
make a asauna board and split up into
tasks. And then all the different agents
have to build it. And uh she set up a
container and she set up a quad in
dangerous mode. And she let it run for
the entire weekend. It spawned a couple
hundred agents. They made 100 tasks on
the sauna board. Uh and then they
implemented it. And that's pretty much
the version of plugins that we shipped.
These kind of coordination systems that
used to be for humans, but um I think
nowadays it's just as much for models.
Let's let's talk about cloud co-work. Uh
it's one of the very impressing things
about this. It looks great. So I tried
it out. It's inside cloud. You have the
co-work tab there and and you can I I
feel it's a lot more visual way of of
running agents interacting with them.
One of the surprising thing I heard that
it was built in 10 days. Can can you
take us through like what it took to
build it and what does actually mean?
Was it from the idea or like from the
decision of of building it? And how big
was the team building it?
>> The team was really small. It was just a
few people for a long time. We felt that
there is some product to be built for
non-engineers. The reason we felt this
is for a long time people that were
using cloud code are non-engineers. Um
and so you know in the product world
when you see latent demand you see
people jumping through hoops to use a
product that was not designed for them.
That's a really good sign it's time to
build another product that is built just
for them. There's all these people on
Twitter that there's this one guy that
was using uh quadco to like monitor his
tomato plants. I just I love this. It
was like he had like a webcam set up and
quad was like, "Oh my god, I'm so happy
that our plant is budding." And because
it was it had like a webcam and just
like every day was like monitoring it
and it it was so happy that the tomatoes
were growing. There was someone that was
using quad code to, you know, recover
photos off of a corrupted hard drive and
it was like his wedding photos.
>> Wow.
>> Um you know, like I said, our entire
finance team at Anthropic uses quad
code. Our sales team uses quad code. So
there there's just all these people that
are non-engineers that were using it.
And at that point quad code it's
available in a lot of form factors right
like we started in a terminal then we
expanded and we added support for
ideides. So we have extensions for you
know every VS code based ID every Jet
Brains based IDE there's also iOS and
Android apps there's the desktop app uh
there's web. So uh then then there's
like Slack and GitHub apps. So we kind
of expanded to all these places to make
cloud code easier for engineers. But
ultimately none of these are built still
for non-engineers. And so cloud code
evolved a lot, but it still felt like
there's a there's kind of a gap and
there's a product that could make this
even easier for people. And so for the
last couple months, the team was kind of
hacking around and just saying like what
is the right product? And at some point
someone came up with this idea of like
what if we just take quad code, add some
guardrails. So for example, co-works
with a virtual machine. This is one of
the many ways that we make sure it's
really safe. Um, especially for
nontechnical users that don't want to
read like bash commands to figure out
what it what it's doing. And they were
hacking on this. I think it was
something like 10 days end to end or
something. It was just fully built with
quad code. Uh, and then we shipped it.
>> And can you give us a sense of like the
complexity behind an app like this? And
if if we can walk through like what
parts needed to be built because from
the outside it's a little bit hard to
tell like is this just a nice UI wrapper
that's you know like I don't know like a
few hundred lines of code. I'm just
being obviously I'm I'm provocative here
or behind the scenes it's actually
really complex piece of software. And
the reason I ask is like Uber is a great
example where people look at the app it
looks really simple. I've worked there
and I know it's it's really really
complex because you don't see a lot of
the complexity. There's a a lot of
regional things. There's a lot of
backend things that are all hidden. So
from just from looking at it, claude
coowork, it's it's hard to tell how much
of this is is additional business logic
that needed to be carefully thought out
versus it's actually just a nice little
thin wrapper on top of the the model. In
some places, I think there's less
complexity than you would think. In some
places, there's more complexity. So on
the product side, it's quite simple um
cuz it's just the quad desktop app. So
you know, you download the Quad app.
It's it's a single desktop app. It has a
tab for co-work, it has a tab for code,
it has a tab for chat. So it is just one
app and we were able to inherit a lot of
that product logic. There's some UI
rendering code under the hood. You know
it's just the same quad code running.
It's the same quad agent SDK that powers
quad code. A lot of the complexity
actually is about safety because we know
like I said we know the user is
nontechnical and so we just want to make
sure they have a good experience and so
for example if someone launches the app
and then you know like they delete a
bunch of family photos that's really not
good and so we wanted to make sure that
we protect against this so you can't
accidentally do that. And so that's
where a lot of the guardrails came from.
So there's a bunch of classifiers
running on the back end. This is for
safety and again extra mitigations for
things like prompt injection and you
know risks like this around security. On
the front end there's an entire virtual
machine that we ship. There's a bunch of
operating system system level
integrations to make sure people don't
accidentally delete things. So just
around safety there there's a lot there.
And then we also had to rethink the
permission system because we inherit the
permission system from quad code. Um but
also for co-work actually a big part of
the value is not just running locally
but it's using all of your tools the way
that quad code uses it. But the thing is
for nontechnical users your tools aren't
really available as CLIs. Some of them
are available over MCP. Many of them are
available in a browser. And so co-work
is really really good when you pair it
with a Chrome extension. And this is the
way that I usually use it. So, you know,
for example, I use it every week to do
uh project management for the team. We
have like we have a spreadsheet that
tracks kind of at a really high level
what everyone's working on. And this is
kind of my personal way of project
managing. You know, other people, like I
said, use ASA, other people use notes or
whatever. For my own test, I don't use
anything, but kind of for the team
overall, I have the spreadsheet and I
have co-work kind of check-in and I I
just ask co-work every week, hey, can
you look at the rows for any status that
has not been filled out? Can you just
ping the engineer on Slack? And so it'll
open one tab in Chrome for the
spreadsheet. It'll open another tab with
Slack and then it'll just start
messaging engineers in Slack and it just
oneshots it. There's like one engineer's
name for some reason it can't
autocomplete. Um but every everything
else it just gets. And so this is
actually like from a safety point of
view, we also thought pretty deeply
about this Chrome extension and how this
works and how the permissioning model
should interact with this local
permissioning model. So there's also a
bunch of code to kind of make sure that
that's that feels smooth. And what's the
tech side behind this? I assume a lot of
will be similar to the the cloud app,
but is it is it electron, typescript,
those kind of things or or something
else?
>> Yeah. Yeah, just electron and
typescript. Actually, some of the people
working on it are early electron folks.
So, uh Felix who's uh you know the
creator of of co-worker
on electron. He helped build it.
>> Oh, amazing. And co-work launched Mac OS
only. uh what was the reason for both
for choosing this platform first and for
now only choosing this platform?
>> Yeah, so Windows coming soon. Um I think
probably by the time this podcast comes
out we will have Windows support. Uh we
just wanted to start early and start
learning you know like everything we do
at Enthropic it's kind of like the way
that I told my own story the one of the
things I like about anthropic is it just
really really matches the way that
people here think about it. you know,
back to this point where like we don't
have high certainty about the things
that we build and our intuition is often
wrong and so we just have to like learn
from users and figure out what people
actually want and just spend a lot of
time listening to people and
understanding the feedback deeply. This
is the way that we build product and so
we always launch a little bit before
it's ready. Um we did this for quad code
when we launched quad code initially it
didn't even support Windows also it
didn't support you know like a lot of
different stacks and then over the
coming weeks we added support for every
stack. Now quad code supports every
single stack. Um you know like Windows
whatever weird Linux dro use Mac OS we
support everything and so for core work
also we just wanted to launch early we
wanted to start with Mac as that was
just the starting point but um yeah it's
it's going to support everything. One
thing you mentioned is is getting
feedback. I'm curious both for cloud
code and for cloud co-work. How do you
go about things like observability
monitoring when you're rolling out? Do
you use any feature flags? And I'm I'm
more interested in like did you build
custom tools for this or did you decide
to use certain vendors because es
especially for observability I'm sure
that this is this is both important but
it also sounds like pretty high scale in
terms of the the number of users that we
can derive or this will not be a small
operation. Yeah there's there's some
off-the-shelf vendors that we use
there's some custom code that we use. So
um it's actually it's a mix of both.
There's nothing too surprising about it.
There's one thing about Enthropic that's
kind of interesting is because we're an
enterprise company and we care a lot
about privacy and security, we can't see
people's data. Um, and so, you know,
like if someone reports a bug, like I
actually can't pull up your logs to kind
of see what's going on. A lot of work
goes into kind of figuring out how to
log events and things like this in a
privacy preserving way. Um, this is just
very important to the way that we
operate
>> for co-work. What kind of learnings have
you had so far? It's it's it's been out
for I think a few weeks now. Did you see
something unexpected? uh are you shaping
the product based on feedback that
you're getting?
>> Yeah. Uh every day the team is landing
so many fixes. The most surprising thing
is just how much people are loving it.
To be honest, when Quad Code first came
out, it actually wasn't an overnight
hit. This is something people think it
was, but it was sort of a slow take off
at the beginning. And I think the first
big inflection was in May when we
released Opus 4 and Sonnet 4. That's
when it really clicked and that's when
our growth became exponential. But at
the beginning, it was sort of a research
preview. people didn't really know how
to use it. Some people got it
immediately, but most people didn't. It
took it took a little while. For
co-work, it's a much steeper growth
trajectory than quad code was at the
beginning. So, it it's just been an
instant hit. And that that's actually
been very surprising. I I didn't really
expect that. One of your new releases,
which came out just very recently, it
was I think yesterday or the day before
when we're recording this podcast, was
agent teams. And I as I understand the
idea with what agent teams agents forms
instead of single agent you can have a
lead agent and it can delegate to its
different teammates. How did you start
experimenting with this and how did you
decide to ship it? Now we're always
doing experiments right there's uh
there's there's all sorts of ways uh to
get more mileage out of out of quad
code. Um one way you can do it is by
extending context. Another way is autoco
compacting context. So it's essentially
infinite context and that's what we have
right now. Another way is using sub
agents. So you have multiple agents kind
of working together. Um there's just
like a lot of different approaches to
get a little bit more mileage out of the
context window. There's this one idea
called uncorrelated context windows.
That's what we call it. And the the idea
is you have multiple context windows. Um
but they essentially start fresh. So
they don't know about each other. And so
an example of this is like a correlated
context window is if you have one if you
have the model and it does a task and
then you have it just do a second task
in that same context window. Um and in
this case the the second task knows
about the first one cuz it's in the same
window. But for something like a sub
aent it's uncorrelated because the main
agent prompts the sub aent but the sub
aents context window is fresh. Besides
that prompt it doesn't know what's in
the parent context window. And you can
see this actually a little bit in uh for
example like sub agents versus uh skills
because when you run a skill uh you know
or slash command it sees the parent
context window versus for a sub agent it
doesn't. So it's uncorrelated. There's
some cases where you want that context.
There's some cases when you don't. Um
and there's this kind of interesting
thing where uncorrelated context windows
and just throwing more context at the
problem and throwing more tokens at it
when the windows are uncorrelated gives
you better results. Um, it's actually a
form of test time compute to do this.
And for something like teams, we've been
experimenting with this for a while. I
think since maybe like October or
September or something like this, and it
really just felt like with Opus 4.6, it
clicked where the model figured out
really how to use this. And sometimes
you see these kind of cute exchanges
where the agents are talking to each
other and they're like discussing
something and it's just very cool to
see. It's very like humanistic in a way.
But there's other times where you just
get very good results. And so we had a
bunch of internal evaluations for
example where we have quad build
something very very complex, something
more complex than what a single quad
would build. And we saw the results just
really really improved with Opus 4.6
with teams. And that's why we felt it's
the right time to release it. We also
wanted to be careful. Um, and the reason
you have to opt into it, the reason it's
a research preview is it uses a ton of
tokens cuz it's just a bunch of quads
that are running. Um, not everyone wants
this all the time. So just excited to
see how people use it and uh you know to
to hear the feedback. It's it's
something you want for fairly complex
tasks. You don't probably want this for
every task. The main quad decides the
rules for the sub quads. We don't have a
kind of a regimented way to do this.
It's context specific. I wouldn't say
there's one right way to do it. I think
actually a lot of the magic of this
comes out of this idea of uncorrelated
context windows. It's less about the
specific configuration of the agents.
But you know it's something that people
should experiment with. I don't think
there's a one-sizefits-all.
>> Have you seen use cases even in even I I
know it's it's still research, but have
you seen use cases where it could look
it looks promising this approach, the
swarm approach?
>> Well, you know, like I said before,
plugins were fully built with swarms.
There there's a bunch of other feature
since that were built in this way. So
yeah, I I think for anything where you
see a single cloud struggling, swarms
can help. It's it's an interesting to
look at. Talking about change in in
general with Andrew Carpathy, you had a
really interesting exchange back in
December where when he posted that he's
never felt as much behind as as a
programmer as he is now because of the
progress with AI. And then you shared
the story about how you started to debug
a memory leak the oldfashioned way and
then Claude just one shot at it. I think
it was a reflection of like how everyone
is feeling that things are changing so
fast and in the in the holiday break I
started to feel that things have have
really shifted. How did you I guess come
to terms with this or or start to
embrace this change? This is something I
really struggle with. The model is
improving so quickly that the ideas that
worked with the old model might not work
with a new model. the things that didn't
work with the new model might work or
with the old model might work with a new
model. And it's weird because there's
just not a lot a lot of other
technologies like this. So I I just
don't really have a lot of experience to
draw on to figure out how I should
approach this. And it's been this new
skill that I've had to learn. In a way,
it's like you just always have to bring
this beginner mindset. Honestly, like
I'm using the word humility a lot, but
you always just have to bring this kind
of intellectual humility because just
all these ideas that were bad before are
now good and and and the inverse. I I
think that's honestly it it's something
I I constantly have to remind myself
about. And back in the It's funny back
in the old world when someone tries an
idea again and we've tried it in the
past and it didn't work, usually the
feedback is like, why are you doing this
again?
>> Yeah. Yeah. You should learn. This used
I mean we used to call a bit of a
gatekeeping but it was somewhat valid
where I know with architecture someone
came and said like why don't we do
microser and someone said we tried it
and it didn't work and if you tried it a
year or two or 3 years ago it was kind
of valid right cuz not much has changed.
Yeah, that's right. That's right. And
something with Microsoft, it's it's
funny because it's like every 10 years
it goes in and out of in and out of
style. But yeah, now now it's I think
the first time ever where it's actually
not crazy to just try the same idea
every few months because the model
improves and it just works. And I I
actually see this with engineers on the
team. Like new people that are newer to
the team, people that are newer to
engineering sometimes do things in a
better way than than I do. Um and I just
have to like look at them and I have to
learn and I have to adjust my
expectations. you know, like an an
example of this is, you know, when when
we release features, sometimes I'll like
screenshot myself using them on, you
know, on X or on threads or whatever
just to kind of talk about it. Um, but
recently, Tar, our um, you know, our
devro guy, he actually codes a lot. Um,
he's amazing and he just started
automating this. So, he's having like
quad code generate its own videos for
for its launches and he just started
doing this and, you know, this is
something like I thought would be, you
know, maybe it's possible. It's not
something I would have tried because I
wouldn't have thought the model was
ready, but he just he just did it and it
just kind of worked.
>> One thing that I've I felt like just a
bit like odd about and I think a lot of
developers can relate is I've come to
terms with this starting from Opus 4.5
the and and also similar models like I
think GPT 5.2 gave me similar vibes as
well. the models have been just really
good at writing code and I I realize
that I don't think I will handr write
the code when I'm get I when I want to
get stuff done if if I actually want to
you know get the pleasure of writing I
can still do it but one thing I
reflected on is it's just been so much
effort to get good at coding I I
remember when I when I was learning when
I I started from like kind of hacking
around to go into university to learning
C and C++ and it it was just bloody hard
and actually you know going through my
my first few jobs where I started to
become better at it. I became better at
debugging and there's a point where like
a lot of my identity was tied to being
good at coding. That's how we used to
get jobs or higher paying jobs. When I
was an engineering manager when we
designed the interview loop at Uber, we
we had talk with managers of what we
need to screen for and we we talk like
well what do developers do most of their
time? About 50% of the time they code.
Therefore, we placed about 50% of the
signal was all about coding. So there
was a lot of things tied into coding
because it it is just hard. I think we
all know that it takes grit. It takes
some level of intelligence to get good
at it. And there's a sense of loss of
like well I I think it's great on one
end that the model can do it. But it
feels that something really quickly got
taken away that I don't think I
personally thought it would happen this
quickly. And I'm
I think a lot of other people are
feeling like some people move on a bit
easier, but there's definitely this
sense of of grief. How did you think
about it? Because again, you're you're
an example of you you wrote so much code
at at Facebook also outside of it. I
know it was just a tool of doing it, but
not many people could do what what you
did. And now the models can also work as
good as you have or if not better.
>> That's the challenge. Yeah. I think it's
it's something that used to be a thing
that we do as software engineers. It's
becoming a thing that everyone is able
to do. There was a moment, you know,
like when I started coding, it was a
very practical thing and it was a way to
get things done. And at some point I
just fell in love with the art of coding
and like languages and kind of the the
the tools themselves. And at some point
I I kind of fell down this rabbit hole.
I wrote this like I wrote I wrote a book
about, you know, a programming language.
>> Typescript. You wrote the first ever
TypeScript uh book at with O'Reilly.
>> Yeah. Yeah. Yeah. That's right. Um it it
was funny actually. There there was this
like there was this amazing moment for
me in my little town in Japan. I went to
the bookstore and I I found that book
translated in Japanese.
>> No.
>> In this tiny town and that was just like
the coolest moment. And then I actually
realized I I don't remember Typescript
at all cuz I was only writing Python for
a couple years at that point. Yeah. And
like at some point I started the the
first the the biggest TypeScript meetup
in the world. That was in that was in
SF. And I got to meet kind of a lot of
my heroes. There was like Chris Cowell
who wrote like general theory of
reactivity. There was Ryan Doll the guy
that made Node. one of the first times
that I I went really deep into this this
community and um just the language
itself and the the tools themselves and
for something like TypeScript there's
this beauty in the type in the type
system cuz Hilesburg is just like he he
he's just brilliant like the idea of
like conditional types and just like
anything can be a literal type and there
there's these very deep ideas that even
the most hardcore functional languages
do not have like even in something like
Haskell like it doesn't go this far and
H Anders just took it and he pushed it
much further than than it had had been
pushed and you know like Joe Pamer and a
bunch of other folks kind of explored a
lot of these ideas and thought of this
and I think for them it was also very
practical right because they had these
large untyped JavaScript code bases how
do you gradually migrated to something
typed and you have to come up with these
very beautiful ideas to to do this for
me is Scala was another kind of rabbit
hole that I fell into in kind of like
this functional programming world And
still when I write code and when the
model writes code I always think in the
types first that that's what matters is
what what is the type signature that
matters more than the code itself and
getting that right. So there is this
beauty to it. There's a there's an art
to it for sure. But in the end it's a
practical thing and in the end this is a
thing that we use to to build things and
you know it's a means it's a means to an
end. It's not an it's not an end to
itself. I I think one metaphor I have
for kind of the this moment in time that
we're in is the the printing press in,
you know, like the the 1400s or whatever
>> because at that moment it it was
actually quite similar, right? Like
there was a group of scribes that you
know knew how to write
>> and it it it was as I understand of
course we never lived there but as as I
imagine it was it was a art process to
learn. You needed to learn you needed to
get the equipment. You probably needed
some sponsorship or being selected
practicing because you needed to produce
the same thing over and over again and
few people could do that and I assume it
was either high prestige or highly paid
or who knows let's assume it was
>> but then the printed press came along.
>> Yeah. Yeah. And at least in Europe like
you had to like a lord or a king or
something had to had to employ you and
then you had to go through you know
years of training and there was this
class of scribes that knew how to write.
They were employed by someone like this.
often the king themselves like or you
know the queen was was not literate. So
it was this very very niche skill and it
was like less than 1% of the population
was literate in Europe you know back
then and then the printing press came
out and what happened so the cost of
printed material went down something
like 100x over the next I think 30 years
50 years or something the quantity of
printed materials went up like 10,000x
in the next 50 100 years this was the
first effect literacy it took a little
while for it to catch up so I think
global literacy it went up to something
like 70%. But that took like another 200
years, 300 years because learning
learning to read is just very hard.
Learning to write is hard. It takes a
lot of effort. It takes uh education
system. It takes you know infrastructure
to have paper and ink uh and the free
time to do this instead of working on a
farm. So it kind of it took early stage
of of of industrialization to actually
get there. But I but I think this effect
of making it so this thing that was
locked away in ivory tower and now it's
accessible to everyone. This is just,
you know, like none of the things around
us would exist today without this. Like
if if we weren't literate, if the people
that built, you know, this microphone
weren't weren't literate, it would have
just been very hard to have a modern
economy. None of these things would
exist. And I I just kind of think about
back then if people had to predict what
would happen when the printing press
came out, no one would have predicted
that the microphone would become a
thing. So, I I just feel like this is uh
this is the best the best uh analog for
for the moment that we're in right now.
>> Yeah, it's interesting that you say that
some of the kings were illiterate who
are employing the scribes because if
we're being honest with ourselves,
we have business owners who know what
they want to build and there are
employing software engineers because
they themselves cannot write code. And I
think we we like to mock the CEOs who
are coming there coming to the team.
They they might even have a drawn
prototype or whiteboard and saying this
should be easy but of course they don't
understand how difficult it is. There
seems to be a bit of analogy where where
there's a person who wants what they
want but until now they needed to hire a
software a specialist who can build that
and there's always that disconnect
between the idea and the person and just
like with the printing press like what
would happen if they could actually
express and like the king could actually
read or write their own letters they
wouldn't need that middleman and it
things become more efficient. But I mean
of course for the scribe it's not the
best news necessarily but I mean smart
scribes can also do so someone needs to
like write the books run the press etc.
Yeah, exactly. And and if you think
about what happened to the scribes,
right? Like they cease to become
scribes, but now there's a category of
writers and and authors like the these
people now exist. And uh the reason they
exist is because the market for
literature just expanded a ton.
>> And I guess also if we think about like
back then a scrib's work was read by a
few people and with the printing press
and author there's a lot more authors
and some of them are not really read but
some of them have wider reach than than
they could imagine. There's new careers
that that exist because of that.
>> Yeah,
>> I love the analogy.
>> And the most exciting thing for me is
it's just so impossible
to say today what will happen after this
happens and after this transition
happens just you know the the economy as
we know it would not have existed
without it. So what's next? like what
what is the thing that we can't even
predict today that will exist because
anyone can do this?
>> Well, we cannot predict but I think we
can look at what is working right now.
If you look around in your environment,
may that be the team across entropic who
are software engineers or or builders or
members of technical staff, however we
call them, who to you are stand out.
What are they doing? What skills have
they built up? And and how have they
changed the way they they work? It's
hard to name individuals because
honestly this is just the strongest the
these are the strongest people I've ever
worked with in my career. There's all
sorts of different archetypes. There's
some people that are really amazing
prototypers. Um so take something from
zero to.5. Just you know figure out like
what are some cool ideas? What is the
technology on walk? There's other people
that are amazing at finding product
market fit. So kind of 0.5 to one or
maybe 0ero to one. There's other people
that span different disciplines and I
I'm just seeing more and more of these
people like I said like people that span
uh product engineering and
infrastructure engineering or you know
product and design or design and
engineering. I I think I'm just seeing a
lot more of these of these hybrids.
>> What's a belief that changed from last
year to this year? Something that you
know like you either believed or or a
conviction that you had that you've
either revised or completely threw away.
I think one thing I wasn't sure about is
how big a problem is safety to be
totally honest. Um I jo I joined
Anthropic because like I said I read a
lot of sci-fi and I kind of I know how
bad this thing can go if it goes bad. It
wasn't something I was sure about. Um
but seeing it from the inside and then
seeing how the new risks that have
arisen in the last year, it just makes
me much much more worried about it. Um
so I I think it's it was kind of an
important thing for me. Now it's just
the most important thing for me is how
do we make sure this thing goes well.
>> I think it's safe to say you you were a
really great software engineer even
before all all the AI things started and
you seem to be a very productive
engineer of course part of a team as
well but but also individually. What are
some skills of like you know before
being a software engineer that are are
still as valuable or maybe even more
valuable than before and what are ones
that are maybe just not as much and and
they're best left behind probably. Okay,
so the stuff that's left behind is uh
best left behind is maybe like very
strong opinions about like code style
and languages and things like this. Like
I I can't wait to get past like these
endless language debates and framework
debates and all the stuff because the
model can just like you know use
whatever language and framework and if
you don't like it it can just rewrite it
for you. So it just doesn't matter
anymore. I think something that still
matters a lot today is things it's being
methodical and hypothesis driven. This
matters both in product design in this
world where everything is being
disrupted and we need to figure out what
to build next and this is something
everyone is thinking about. Um, but it
also matters for engineering day-to-day,
you know, like something like debugging.
You just have to be very methodical
about it. And the model can can do this
and it can help a lot. Um, but I think
still we're in this transition point
where you still need to have the skill.
I don't know if you you're you're still
going to need to have it in 6 months.
Other skills that I think are more
valuable are
being curious and being open to doing
things beyond your swim lane. So, you
know, if you're working on engineering,
but you really understand the business
side, you can just build really awesome
products. And I and I think the next,
you know, billion dollar product, you
know, like after quad code, whatever the
next startup is that, you know, becomes
the next trillion dollar startup, it
might just be like one person that has
some cool idea and their brain just is
able to think across, you know,
engineering and product and business or,
you know, like design and finance and
something else. It's like it's people
are going to become more and more
multi-disipline and this will become
more and more rewarded. So in in some
ways I think this will be the year of
the generalist. I think the other skill
that's actually been been rewarded of it
is uh having a short attention span.
>> I was being rewarded now. Oh yeah. It's
uh you know like people you know like
teenagers are using you know like like
Tik Tok and and all this stuff and I
think in some ways it's kind of
dangerous for society um because like
you want people that can think deeply
and can contemplate ideas and uh aren't
just moving on to the next idea very
quick but in some ways I think this year
is kind of the year that is going to
reward uh it's like the year of ADHD
because the work for me has become
jumping between quads. has become
managing clouds and so it's not so much
about deep work it's about how good am I
about context switching and you know
jumping across multiple different
contexts very quickly
>> could I add that from what I unders what
all you said maybe we could add one
thing which is adaptability because
you're saying of course that ADHD and
and you can jump across but of course
earlier you are very good at focusing
deeply on one thing as well and what
strikes me about you and maybe this is
true for other people as well you you're
just kind of very open to adapt ting
your working style and seeing what works
well for this stage, especially when
things are changing. I think the one
certain thing we can be sure is whenever
the next model comes out, it'll change
again. And you need to be curious and
open to adapting how you work, right?
>> Yeah. And as closing, what's a book or
books that that you would recommend?
I've gone down a rabbit hole. Um, so
he's the threebody problem guy, but he
actually has like a lot of other really
great books. I really love his uh short
stories. Um, he has a couple books of
short stories. I'm a big fan. For people
that are new to sci-fi and you want like
a little bit like harder sci-fi, um I
really love Accelerondo by St. This is a
book I would totally recommend. It's
like essentially the product roadmap for
the next 50 years. Um it it it starts
with takeoff kind of starting to happen
and kind of AI singularity and then it
ends up with like uh this kind of like
group lobster consciousnesses orbiting
Jupiter and it's just like amazing. And
the thing that I think it really
captures is just the pace this like
quickening quickening quickening pace of
how this feels. It really matches the
feeling right now. And then on the
technical side, I would strongly
recommend functional programming in
Scola. Even if language choice just
doesn't matter as much anymore, I think
there is this art to functional
programming that just teaches you how to
code better. Um, and it'll just teach
you how to think in types. If you read
this book, I think what's really
important is to do the exercises also.
And I've gone through and I've done all
of them probably like three times over
and it's just amazing. It it really just
like knocks this idea of functional
types into your head and it's just a
thing you can't stop thinking about.
>> Boris, thank you so much. This was
awesome.
>> Yeah, thanks Kirk. This was a really
interesting conversation and the thing
that I keep coming back to is to Boris's
prickic press analogy. The idea that
medieval scribes were this tiny elite
who could write employed by kings who
themselves were often illiterate and
that we soft rangers might be in a
similar position today. We are the
scribes. We spent years mastering this
craft. And now the printer press is
arriving. But what Boris told me is that
the scribes did not disappear. They
became writers and authors and the
entire market for written work expanded
beyond anything anyone could have
predicted. I do find this hopeful and
also appreciate that Boris didn't
sugarcoat it. The other thing that
struck with me is just how differently
the Cloud Code team built software. No
PRDS, no mandatory ticketing system,
designers and data scientists and
finance people all writing code and
building dozens or hundreds of
prototypes before shipping a feature.
And Boris is shipping 20 to 30 pore
requests a day without editing a single
line by hand. And there are different
verification systems in place. Claw code
reviewing its code, automated lint
rules, best of end passes, and human
code review. If you've enjoyed this
podcast, please do subscribe on your
favorite podcast platform and on
YouTube. A special thank you if you also
leave a rating on the show. Thanks and
see you on the next one.
Ask follow-up questions or revisit key timestamps.
Boris Churnney, the engineering lead for Claude Code at Anthropic, discusses the transformative impact of AI on software development. From his first handwritten PR being rejected at Anthropic to now shipping 20-30 PRs a day with zero manual code editing, Boris shares how the Claude Code team uses parallel agents, virtual machines, and specialized safety layers. He draws a powerful analogy to the 15th-century printing press, suggesting that while the role of the traditional 'scribe' (coder) is changing, the market for 'authors' (builders) is expanding beyond prediction. The conversation covers the internal development of Claude Code, the importance of generalist skills, and how Anthropic measures code quality and productivity.
Videos recently processed by our community