Dario Amodei — “We are near the end of the exponential”
We talked three years ago. In your view, what has been the biggest update over the last three years?
What has been the biggest difference between what it felt like then versus now?
Broadly speaking, the exponential of the underlying technology has gone about as
I expected it to go. There's plus or minus
a year or two here and there. I don't know that I would've
predicted the specific direction of code. But when I look at the exponential,
it is roughly what I expected in terms of the march of the models from smart high
school student to smart college student to beginning to do PhD and professional stuff,
and in the case of code reaching beyond that. The frontier is a little bit uneven, but it's
roughly what I expected. What has been the most surprising
thing is the lack of public recognition of how close we are to the end of the exponential.
To me, it is absolutely wild that you have people — within the bubble and outside the
bubble — talking about the same tired, old hot-button political issues, when we are
near the end of the exponential. I want to understand what that
exponential looks like right now. The first question I asked you when
we recorded three years ago was, "what’s up with scaling and why does it work?"
I have a similar question now, but it feels more complicated.
At least from the public's point of view, three years ago there were well-known public trends
across many orders of magnitude of compute where you could see how the loss improves.
Now we have RL scaling and there's no publicly known scaling law for it.
It's not even clear what the story is. Is this supposed to be teaching the model skills?
Is it supposed to be teaching meta-learning? What is the scaling hypothesis at this point?
I actually have the same hypothesis I had even all the way back in 2017.
I think I talked about it last time, but I wrote a doc called "The Big Blob of Compute Hypothesis".
It wasn't about the scaling of language models in particular.
When I wrote it GPT-1 had just come out. That was one among many things.
Back in those days there was robotics. People tried to work on reasoning as
a separate thing from language models, and there was scaling of the kind of RL
that happened in AlphaGo and in Dota at OpenAI. People remember StarCraft at DeepMind, AlphaStar.
It was written as a more general document. Rich Sutton put out "The Bitter
Lesson" a couple years later. The hypothesis is basically the same.
What it says is that all the cleverness, all the techniques, all the "we need a new method to do
something", that doesn't matter very much. There are only a few things that matter.
I think I listed seven of them. One is how much raw compute you have.
The second is the quantity of data. The third is the quality and distribution of data.
It needs to be a broad distribution. The fourth is how long you train for.
The fifth is that you need an objective function that can scale to the moon.
The pre-training objective function is one such objective function.
Another is the RL objective function that says you have a goal,
you're going to go out and reach the goal. Within that, there's objective rewards like
you see in math and coding, and there's more subjective rewards like you see in
RLHF or higher-order versions of that. Then the sixth and seventh were things
around normalization or conditioning, just getting the numerical stability
so that the big blob of compute flows in this laminar way instead of running into problems.
That was the hypothesis, and it's a hypothesis I still hold.
I don't think I've seen very much that is not in line with it.
The pre-training scaling laws were one example of what we see there. Those have continued going.
Now it's been widely reported, we feel good about pre-training.
It’s continuing to give us gains. What has changed is that now we're
also seeing the same thing for RL. We're seeing a pre-training phase
and then an RL phase on top of that. With RL, it’s actually just the same.
Even other companies have published things in some of their releases that say, "We train the
model on math contests — AIME or other things — and how well the model does is log-linear in
how long we've trained it." We see that as well,
and it's not just math contests. It's a wide variety of RL tasks.
We're seeing the same scaling in RL that we saw for pre-training.
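A minimal way to write down the two relationships being described here, as a sketch with placeholder constants rather than any lab's published fit:

```latex
% Pre-training: loss falls roughly as a power law in training compute C.
L(C) \approx L_{\infty} + A \, C^{-\alpha}

% RL: benchmark score grows roughly log-linearly in RL training compute,
% up to the ceiling of the benchmark.
\mathrm{score}(C_{\mathrm{RL}}) \approx a + b \, \log C_{\mathrm{RL}}
```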
You mentioned Rich Sutton and "The Bitter Lesson". I interviewed him last year,
and he's actually very non-LLM-pilled. I don’t know if this is his perspective,
but one way to paraphrase his objection is: something that possesses the true core of human learning would not require all these billions of dollars of data and compute and these bespoke environments to learn how to use Excel, how to use PowerPoint, how to navigate a web browser.
The fact that we have to build in these skills using these RL environments hints that we are
actually lacking a core human learning algorithm. So we're scaling the wrong thing. That does raise
the question. Why are we doing all this RL scaling if we think there's something that's going to be
human-like in its ability to learn on the fly? I think this puts together several things that
should be thought of differently. There is a genuine puzzle here,
but it may not matter. In fact, I would guess it probably
doesn't matter. There is an interesting thing. Let me take the RL out of it for a second, because I
actually think it's a red herring to say that RL is any different from pre-training in this matter.
If we look at pre-training scaling, it was very interesting
back in 2017 when Alec Radford was doing GPT-1. The models before GPT-1 were trained on datasets
that didn't represent a wide distribution of text. You had very standard language
modeling benchmarks. GPT-1 itself was trained on
a bunch of fanfiction, I think actually. It was literary text, which is a very
small fraction of the text you can get. In those days it was like a billion words
or something, so small datasets representing a pretty narrow distribution of what you can
see in the world. It didn't generalize well. If you did better on some fanfiction corpus,
it wouldn't generalize that well to other tasks. We had all these measures of how well it did at predicting all these other kinds of texts.
It was only when you trained over all the tasks on the internet — when you did a general internet
scrape from something like Common Crawl or scraping links from Reddit, which is what we did for
GPT-2 — that you started to get generalization. I think we're seeing the same thing on RL.
We're starting first with simple RL tasks like training on math competitions, then moving to
broader training that involves things like code. Now we're moving to many other tasks.
I think then we're going to increasingly get generalization.
So that kind of takes out the RL vs. pre-training side of it.
But there is a puzzle either way, which is that in pre-training we use trillions of tokens.
Humans don't see trillions of words. So there is an actual sample
efficiency difference here. There is actually something different here.
The models start from scratch and they need much more training.
But we also see that once they're trained, if we give them a long context length of
a million — the only thing blocking long context is inference — they're very good at
learning and adapting within that context. So I don’t know the full answer to this.
I think there's something going on where pre-training is not like
the process of humans learning, but it's somewhere between the process of humans
learning and the process of human evolution. We get many of our priors from evolution.
Our brain isn't just a blank slate. Whole books have been written about this.
The language models are much more like blank slates.
They literally start as random weights, whereas the human brain starts with all these regions
connected to all these inputs and outputs. Maybe we should think of pre-training — and
for that matter, RL as well — as something that exists in the middle space between
human evolution and human on-the-spot learning. And we should think of the in-context learning
that the models do as something between long-term human learning and short-term human learning.
So there's this hierarchy. There’s evolution, there's long-term learning, there's short-term
learning, and there's just human reaction. The LLM phases exist along this spectrum,
but not necessarily at exactly the same points. There's no analog to some of the human modes of learning; the LLMs are falling in between the points. Does that make sense?
Yes, although some things are still a bit confusing.
For example, if the analogy is that this is like evolution so it's fine that it's
not sample efficient, then if we're going to get a super sample-efficient
agent from in-context learning, why are we bothering to build all these RL environments?
There are companies whose work seems to be teaching models how to use this API,
how to use Slack, how to use whatever. It's confusing to me why there's so much emphasis
on that if the kind of agent that can just learn on the fly is emerging or has already emerged.
I can't speak for the emphasis of anyone else. I can only talk about how we think about it.
The goal is not to teach the model every possible skill within RL,
just as we don't do that within pre-training. Within pre-training, we're not trying to expose
the model to every possible way that words could be put together.
Rather, the model trains on a lot of things and then reaches generalization across pre-training.
That was the transition from GPT-1 to GPT-2 that I saw up close. The model reaches a point. I had
these moments where I was like, "Oh yeah, you just give the model a list of numbers — this is
the cost of the house, this is the square feet of the house — and the model completes the pattern
and does linear regression." Not great, but it does it,
and it's never seen that exact thing before. So to the extent that we are building these
RL environments, the goal is very similar to what was done five or ten years ago with pre-training.
We're trying to get a whole bunch of data, not because we want to cover a specific document or a
specific skill, but because we want to generalize. I think the framework you're laying down obviously
makes sense. We're making progress toward AGI. Nobody at this point disagrees we're going to
achieve AGI this century. The crux is you say we're
hitting the end of the exponential. Somebody else looks at this and says,
"We've been making progress since 2012, and by 2035 we'll have a human-like agent."
Obviously we’re seeing in these models the kinds of things that evolution did,
or that learning within a human lifetime does. I want to understand what you’re seeing
that makes you think it's one year away and not ten years away.
There are two claims you could make here, one stronger and one weaker.
Starting with the weaker claim, when I first saw the scaling back in 2019,
I wasn’t sure. This was a 50/50 thing. I thought I saw something. My claim was that this
was much more likely than anyone thinks. Maybe there's a 50% chance this happens.
On the basic hypothesis of, as you put it, within ten years we'll get to what I call a "country of
geniuses in a data center", I'm at 90% on that. It's hard to go much higher than 90%
because the world is so unpredictable. Maybe the irreducible uncertainty puts us at 95%,
where you get to things like multiple companies having internal turmoil, Taiwan gets invaded,
all the fabs get blown up by missiles. Now you've jinxed us, Dario.
You could construct a 5% world where things get delayed for ten years.
There's another 5% which is that I'm very confident on tasks that can be verified.
With coding, except for that irreducible uncertainty,
I think we'll be there in one or two years. There's no way we will not be there in ten years
in terms of being able to do end-to-end coding. My one little bit of fundamental uncertainty,
even on long timescales, is about tasks that aren't verifiable: planning a mission to Mars;
doing some fundamental scientific discovery like CRISPR; writing a novel.
It’s hard to verify those tasks. I am almost certain we have a
reliable path to get there, but if there's a little bit of uncertainty it's there.
On the ten-year timeline I'm at 90%, which is about as certain as you can be.
I think it's crazy to say that this won't happen by 2035.
In some sane world, it would be outside the mainstream.
But the emphasis on verification hints at a lack of belief that these models generalize.
If you think about humans, we're both good at things for which we get verifiable reward
and things for which we don't. No, this is why I’m almost sure.
We already see substantial generalization from things that verify to things that
don't. We're already seeing that. But it seems like you were emphasizing this as a spectrum that will split apart the domains in which we see more progress.
That doesn't seem like how humans get better. The world in which we don't get there is the world
in which we do all the verifiable things. Many of them generalize,
but we don't fully get there. We don’t fully color in the other side
of the box. It's not a binary thing. Even if generalization is weak and you can only do
verifiable domains, it's not clear to me you could automate software engineering in such a world.
You are "a software engineer" in some sense, but part of being a software engineer for you involves
writing long memos about your grand vision. I don’t think that’s part of the job of SWE.
That's part of the job of the company, not SWE specifically.
But SWE does involve design documents and other things like that.
The models are already pretty good at writing comments.
Again, I’m making much weaker claims here than I believe, to distinguish between two things.
We're already almost there for software engineering.
By what metric? There's one metric which is how many lines of code are written by AI.
If you consider other productivity improvements in the history of software engineering,
compilers write all the lines of software. There's a difference between how many lines
are written and how big the productivity improvement is. "We’re almost there" meaning…
How big is the productivity improvement, not just how many lines are written by AI?
I actually agree with you on this. I've made a series of predictions on
code and software engineering. I think people have repeatedly misunderstood them.
Let me lay out the spectrum. About eight or nine months ago,
I said the AI model will be writing 90% of the lines of code in three to six months.
That happened, at least at some places. It happened at Anthropic, happened with
many people downstream using our models. But that's actually a very weak criterion.
People thought I was saying that we won't need 90% of the software engineers. Those things are worlds
apart. The spectrum is: 90% of code is written by the model, 100% of code is written by the model.
That's a big difference in productivity. 90% of the end-to-end SWE tasks — including
things like compiling, setting up clusters and environments, testing features,
writing memos — are done by the models. 100% of today's SWE tasks are done by the models.
Even when that happens, it doesn't mean software engineers are out of a job.
There are new higher-level things they can do, where they can manage.
Then further down the spectrum, there's 90% less demand for SWEs, which I think
will happen but this is a spectrum. I wrote about it in "The Adolescence
of Technology" where I went through this kind of spectrum with farming.
I actually totally agree with you on that.
These are very different benchmarks from each other,
but we're proceeding through them super fast. Part of your vision is that going from 90 to 100
is going to happen fast, and that it leads to huge productivity improvements.
But what I notice is that even in greenfield projects, where people start with Claude Code or something, people report starting a lot of projects… Do we see in the world out there
a renaissance of software, all these new features that wouldn't exist otherwise?
At least so far, it doesn't seem like we see that. So that does make me wonder.
Even if I never had to intervene with Claude Code, the world is complicated.
Jobs are complicated. Closing the loop on self-contained systems, whether it's just writing software or something, how much broader would the gains be just from that?
Maybe that should dilute our estimation of the "country of geniuses".
I simultaneously agree with you that it's a reason why these things don't happen instantly,
but at the same time, I think the effect is gonna be very fast.
You could have these two poles. One is that AI is not going to make
progress. It's slow. It's going to take forever to diffuse within the economy.
Economic diffusion has become one of these buzzwords that's a reason why
we're not going to make AI progress, or why AI progress doesn't matter.
The other axis is that we'll get recursive self-improvement, the whole thing.
Can't you just draw an exponential line on the curve?
We're going to have Dyson spheres around the sun so many nanoseconds after we get recursive self-improvement.
I'm completely caricaturing the view here, but there are these two extremes.
But what we've seen from the beginning, at least if you look within Anthropic, there's this bizarre
10x per year growth in revenue that we've seen. So in 2023, it was zero to $100 million.
In 2024, it was $100 million to $1 billion. In 2025, it was $1 billion to $9-10 billion.
You guys should have just bought a billion dollars of your own products so you could just…
And the first month of this year, that exponential is...
You would think it would slow down, but we added another few billion to revenue in January.
Obviously that curve can't go on forever. The GDP is only so large.
I would even guess that it bends somewhat this year, but that is a fast curve. That's a really
fast curve. I would bet it stays pretty fast even as the scale goes to the entire economy.
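To make the compounding concrete, here is a toy projection; only the 2023-2025 figures come from the conversation, and the growth rates are illustrative:

```python
# Toy projection of sustained revenue growth from ~$10B annualized entering 2026.
start_revenue = 10e9

for rate in (10, 5, 3):
    revenue = start_revenue
    print(f"\nAt {rate}x per year:")
    for year in range(2026, 2031):
        revenue *= rate
        print(f"  end of {year}: ~${revenue / 1e9:,.0f}B")
# At 10x a year the curve passes US GDP (roughly $28T) before 2030,
# which is why it has to bend at some point.
```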
So I think we should be thinking about this middle world where things are extremely fast, but not
instant, where they take time because of economic diffusion, because of the need to close the loop.
Because it's fiddly: "I have to do change management within my enterprise… I set this up,
but I have to change the security permissions on this in order to make it actually work…
I had this old piece of software that checks the model before it's compiled
and released and I have to rewrite it. Yes, the model can do that, but I have
to tell the model to do that. It has to take time to do that."
So I think everything we've seen so far is compatible with the idea that there's one fast
exponential that's the capability of the model. Then there's another fast exponential
that's downstream of that, which is the diffusion of the model into the economy.
Not instant, not slow, much faster than any previous technology, but it has its limits.
When I look inside Anthropic, when I look at our customers: fast adoption, but not infinitely fast.
Can I try a hot take on you? Yeah.
I feel like diffusion is cope that people say. When the model isn't able to do something,
they're like, "oh, but it's a diffusion issue." But then you should use the comparison to humans.
You would think that the inherent advantages that AIs have would make diffusion a much easier
problem for new AIs getting onboarded than new humans getting onboarded.
An AI can read your entire Slack and your drive in minutes.
They can share all the knowledge that the other copies of the same instance have.
You don't have this adverse selection problem when you're hiring AI, so you
can just hire copies of a vetted AI model. Hiring a human is so much more of a hassle.
People hire humans all the time. We pay humans upwards of $50 trillion
in wages because they're useful, even though in principle it would be much easier to integrate
AIs into the economy than it is to hire humans. The diffusion doesn't really explain it.
I think diffusion is very real and doesn't exclusively have
to do with limitations on the AI models. Again, there are people who use diffusion
as kind of a buzzword to say this isn't a big deal. I'm not talking about that. I'm
not talking about how AI will diffuse at the speed of previous technologies.
I think AI will diffuse much faster than previous technologies have, but not infinitely fast.
I'll just give an example of this. There's Claude Code. Claude Code is extremely easy to set up.
If you're a developer, you can just start using Claude Code.
There is no reason why a developer at a large enterprise should not be adopting
Claude Code as quickly as an individual developer or developer at a startup.
We do everything we can to promote it. We sell Claude Code to enterprises.
Big enterprises, big financial companies, big pharmaceutical companies, all of them are adopting
Claude Code much faster than enterprises typically adopt new technology. But again,
it takes time. Any given feature or any given product, like Claude Code or Cowork, will get
adopted by the individual developers who are on Twitter all the time, by the Series A startups,
many months faster than they will get adopted by a large enterprise that does food sales.
There are just a number of factors. You have to go through legal,
you have to provision it for everyone. It has to pass security and compliance.
The leaders of the company who are further away from the AI revolution are forward-looking,
but they have to say, "Oh, it makes sense for us to spend 50 million.
This is what this Claude Code thing is. This is why it helps our company.
This is why it makes us more productive." Then they have to explain
to the people two levels below. They have to say, "Okay, we have 3,000 developers.
Here's how we're going to roll it out to our developers."
We have conversations like this every day. We are doing everything we can to make
Anthropic's revenue grow 20 or 30x a year instead of 10x a year.
Again, many enterprises are just saying, "This is so productive.
We're going to take shortcuts in our usual procurement process."
They're moving much faster than when we tried to sell them just
the ordinary API, which many of them use. Claude Code is a more compelling product,
but it's not an infinitely compelling product. I don't think even AGI or powerful AI or
"country of geniuses in a data center" will be an infinitely compelling product.
It will be a product compelling enough maybe to get 3-5x, or 10x, a year of growth, even when
you're in the hundreds of billions of dollars, which is extremely hard to do and has never been
done in history before, but not infinitely fast. I buy that it would be a slight slowdown.
Maybe this is not your claim, but sometimes people talk about this like,
"Oh, the capabilities are there, but because of diffusion... otherwise we're basically at AGI".
I don't believe we're basically at AGI. I think if you had the "country
of geniuses in a data center"... If we had the "country of geniuses
in a data center", we would know it. We would know it if you had the
"country of geniuses in a data center". Everyone in this room would know it.
Everyone in Washington would know it. People in rural parts might not know it,
but we would know it. We don't have that now. That is very clear.
Coming back to concrete prediction… Because there are so many different things to disambiguate,
it can be easy to talk past each other when we're talking about capabilities.
For example, when I interviewed you three years ago, I asked you a prediction about what
we should expect three years from now. You were right. You said, "We should expect systems which,
if you talk to them for the course of an hour, it's hard to tell them apart from
a generally well-educated human." I think you were right about that.
I think spiritually I feel unsatisfied because my internal expectation was that such a system could
automate large parts of white-collar work. So it might be more productive to talk about
the actual end capabilities you want from such a system.
I will basically tell you where I think we are. Let me ask a very specific question so that
we can figure out exactly what kinds of capabilities we should think about soon.
Maybe I'll ask about it in the context of a job I understand well, not because it's the most
relevant job, but just because I can evaluate the claims about it. Take video editors. I have
video editors. Part of their job involves learning about our audience's preferences,
learning about my preferences and tastes, and the different trade-offs we have.
They’re, over the course of many months, building up this understanding of context.
The skill and ability they have six months into the job: when should we expect an AI system that can pick up that skill on the job, on the fly?
I guess what you're talking about is that we're doing this interview for three hours.
Someone's going to come in, someone's going to edit it.
They're going to be like, "Oh, I don't know, Dario scratched his head and we could edit that out."
"Magnify that." "There was this long
discussion that is less interesting to people. There's another thing that's more interesting
to people, so let's make this edit." I think the "country of geniuses in
a data center" will be able to do that. The way it will be able to do that is it will
have general control of a computer screen. You'll be able to feed this in.
It'll be able to also use the computer screen to go on the web, look at all your previous
interviews, look at what people are saying on Twitter in response to your interviews,
talk to you, ask you questions, talk to your staff, look at the history of edits
that you did, and from that, do the job. I think that's dependent on several things.
I think this is one of the things that's actually blocking deployment:
getting to the point on computer use where the models are really masters at using the computer.
We've seen this climb in benchmarks, and benchmarks are always imperfect measures.
But I think when we first released computer use a year and a quarter ago, OSWorld was at maybe 15%.
I don't remember exactly, but we've climbed from that to 65-70%.
There may be harder measures as well, but I think computer use has to pass a point of reliability.
Can I just follow up on that before you move on to the next point?
For years, I've been trying to build different internal LLM tools for myself.
Often I have these text-in, text-out tasks, which should be dead center
in the repertoire of these models. Yet I still hire humans to do them.
If it's something like, "identify what the best clips would be in this transcript",
maybe the LLMs do a seven-out-of-ten job on them. But there's not this ongoing way I can engage
with them to help them get better at the job the way I could with a human employee.
That missing ability, even if you solve computer use, would still block
my ability to offload an actual job to them. This gets back to what we were talking about
before with learning on the job. It's very interesting. I think with the coding agents,
I don't think people would say that learning on the job is what is preventing the coding agents
from doing everything end to end. They keep getting better. We have engineers
at Anthropic who don't write any code. When I look at the productivity, to your
previous question, we have folks who say, "This GPU kernel, this chip, I used to write it myself.
I just have Claude do it." There's this enormous improvement in productivity.
When I look at Claude Code, familiarity with the codebase, or a feeling that the model hasn't worked at the company for a year, is not high up on the list of complaints I see.
I think what I'm saying is that we're kind of taking a different path.
Don't you think with coding that's because there
is an external scaffold of memory which exists instantiated in the codebase?
I don't know how many other jobs have that. Coding made fast progress precisely because
it has this unique advantage that other economic activity doesn't.
But when you say that, what you're implying is that by reading the codebase into the context,
I have everything that the human needed to learn on the job.
So that would be an example of—whether it's written or not, whether it's available or
not—a case where everything you needed to know you got from the context window.
What we think of as learning—"I started this job, it's going to take me six months to understand the
code base"—the model just did it in the context. I honestly don't know how to think about
this because there are people who qualitatively report what you're saying.
I'm sure you saw last year, there was a major study where they had experienced developers try
to close pull requests in repositories that they were familiar with. Those developers reported an
uplift. They reported that they felt more productive with the use of these models.
But in fact, if you look at their output and how much was actually merged back in,
there was a 20% downlift. They were less productive
as a result of using these models. So I'm trying to square the qualitative
feeling that people feel with these models versus, 1) in a macro level,
where is this renaissance of software? And then 2) when people do these independent
evaluations, why are we not seeing the productivity benefits we would expect?
Within Anthropic, this is just really unambiguous. We're under an incredible amount of commercial
pressure, and we make it even harder for ourselves because of all the safety stuff we do, which I think is more than other companies do. The pressure to survive economically
while also keeping our values is just incredible. We're trying to keep this 10x revenue curve going.
There is zero time for bullshit. There is zero time for feeling
like we're productive when we're not. These tools make us a lot more productive.
Why do you think we're concerned about competitors using the tools?
Because we think we're ahead of the competitors. We wouldn't be going through all this trouble if
this were secretly reducing our productivity. We see the end productivity every few
months in the form of model launches. There's no kidding yourself about this.
The models make you more productive. 1) People feeling like they're productive is
qualitatively predicted by studies like this. But 2) if I just look at the end output,
obviously you guys are making fast progress. But the idea was supposed to be that with
recursive self-improvement, you make a better AI, the AI helps you build a
better next AI, et cetera, et cetera. What I see instead—if I look at you,
OpenAI, DeepMind—is that people are just shifting around the podium every few months.
Maybe you think that stops because you've won or whatever.
But why are we not seeing the person with the best coding model have this lasting advantage, if in fact there are these enormous productivity gains from the latest coding model?
I think my model of the situation is that there's an advantage that's gradually growing.
I would say right now the coding models give maybe, I don't know,
a 15-20% total factor speed up. That's my view. Six months ago, it was maybe 5%.
So it didn't matter. 5% doesn't register. It's now just getting to the point where it's
one of several factors that kind of matters. That's going to keep speeding up.
I think six months ago, there were several companies that were at roughly the same
point because this wasn't a notable factor, but I think it's starting to speed up more and more.
I would also say there are multiple companies that write models that are used for code and we're not
perfectly good at preventing some of these other companies from using our models internally.
So I think everything we're seeing is consistent with this kind of snowball model.
Again, my theme in all of this is all of this is soft takeoff, soft, smooth exponentials,
although the exponentials are relatively steep. So we're seeing this snowball gather momentum
where it's like 10%, 20%, 25%, 40%. As you go, it's Amdahl's law: you have to get all the things that are preventing you from closing the loop out of the way.
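The Amdahl's law point in symbols, with purely illustrative numbers: if a fraction p of the overall work is sped up by a factor s, the total speedup is

```latex
\text{speedup} = \frac{1}{(1 - p) + p/s},
\qquad \text{e.g. } p = 0.5,\ s = 10 \;\Rightarrow\; \frac{1}{0.5 + 0.05} \approx 1.8\times
```

Even a huge acceleration of the automated part is capped by whatever has not yet been pulled into the loop.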
But this is one of the biggest priorities within Anthropic.
Stepping back, before in the stack we were talking about when do we get this on-the-job learning?
It seems like the point you were making on the coding thing is that we actually
don't need on-the-job learning. You can have tremendous productivity
improvements, you can have potentially trillions of dollars of revenue for AI companies, without
this basic human ability to learn on the job. Maybe that's not your claim, you should clarify.
But in most domains of economic activity, people say, "I hired somebody, they weren't that useful
for the first few months, and then over time they built up the context, understanding."
It's actually hard to define what we're talking about here.
But they got something and now they're a powerhouse and they're so valuable to us.
If AI doesn't develop this ability to learn on the fly, I'm a bit skeptical that we're going to see
huge changes to the world without that ability. I think two things here. There's the state
of the technology right now. Again, we have these two stages.
We have the pre-training and RL stage where you throw a bunch of data and tasks into
the models and then they generalize. So it's like learning, but it's like
learning from more data and not learning over one human or one model's lifetime.
So again, this is situated between evolution and human learning.
But once you learn all those skills, you have them.
Just like with pre-training, where the models just know more: if I look at a pre-trained model, it knows more about the history of samurai in Japan than I do.
It knows more about baseball than I do. It knows more about low-pass filters
and electronics, all of these things. Its knowledge is way broader than mine.
So I think even just that may get us to the point where the models are better at everything.
We also have, again, just with scaling the kind of existing setup, the in-context learning.
I would describe it as kind of like human on-the-job learning,
but a little weaker and a little short term. You look at in-context learning and if you give
the model a bunch of examples it does get it. There's real learning that happens in context.
A million tokens is a lot. That can be days of human learning.
If you think about the model reading a million words, how long would it
take me to read a million? Days or weeks at least. So you have these two things.
I think these two things within the existing paradigm may just be enough to get you the
"country of geniuses in a data center". I don't know for sure, but I think
they're going to get you a large fraction of it. There may be gaps, but I certainly think that just
as things are, this is enough to generate trillions of dollars of revenue. That's one. Two,
is this idea of continual learning, this idea of a single model learning on the job.
I think we're working on that too. There's a good chance that in the next
year or two, we also solve that. Again, I think you get most
of the way there without it. The trillions of dollars a year market,
maybe all of the national security implications and the safety implications that I wrote about in
"Adolescence of Technology" can happen without it. But we, and I imagine others, are working on it.
There's a good chance that we will get there within the next year or two.
There are a bunch of ideas. I won't go into all of them in detail, but
one is just to make the context longer. There's nothing preventing
longer contexts from working. You just have to train at longer contexts
and then learn to serve them at inference. Both of those are engineering problems that
we are working on and I would assume others are working on them as well.
This context length increase, it seemed like there was a period from 2020 to 2023
where from GPT-3 to GPT-4 Turbo, there was an increase from 2,000-token context lengths to 128K.
I feel like for the two-ish years since then, we've been in the same-ish ballpark.
When context lengths get much longer than that, people report qualitative
degradation in the ability of the model to consider that full context.
So I'm curious what you're internally seeing that makes you think, "10-million-token contexts, 100-million-token contexts, to get six months of human learning and building context".
This isn't a research problem. This is an engineering and inference problem.
If you want to serve long context, you have to store your entire KV cache.
It's difficult to store all the memory in the GPUs, to juggle the memory around.
I don't even know the details. At this point, this is at a level of detail
that I'm no longer able to follow, although I knew it in the GPT-3 era. "These are the weights,
these are the activations you have to store…" But these days the whole thing is flipped
because we have MoE models and all of that. Regarding this degradation you're talking about,
without getting too specific, there's two things. There's the context length you train at and
there's a context length that you serve at. If you train at a small context length
and then try to serve at a long context length, maybe you get these degradations.
It's better than nothing, you might still offer it, but you get these degradations.
Maybe it's harder to train at a long context length.
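For a sense of why serving long context is an inference-engineering problem, here is a back-of-the-envelope KV-cache estimate with made-up transformer dimensions, not those of any Claude model:

```python
# Rough KV-cache size for one request:
#   2 (keys and values) * layers * KV heads * head_dim * tokens * bytes per element.
# All model dimensions here are illustrative.
def kv_cache_bytes(tokens, n_layers=80, n_kv_heads=8, head_dim=128, dtype_bytes=2):
    return 2 * n_layers * n_kv_heads * head_dim * tokens * dtype_bytes

for tokens in (128_000, 1_000_000, 10_000_000):
    gib = kv_cache_bytes(tokens) / 2**30
    print(f"{tokens:>12,} tokens -> ~{gib:,.0f} GiB of KV cache for one request")
# Around a million tokens, a single request's cache rivals or exceeds one
# accelerator's memory, so it has to be sharded and juggled across devices.
```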
I want to, at the same time, ask about maybe some rabbit holes.
Wouldn't you expect that if you had to train on longer context length,
that would mean that you're able to get fewer samples in for the same amount of compute?
Maybe it's not worth diving deep on that. I want to get an answer to the
bigger picture question. For me not to feel a preference for a human editor that's been working for me for six months versus an AI that's been working with me for six months, what year do you predict that will be the case?
My guess for that is there's a lot of problems where basically we can do this when we have
the "country of geniuses in a data center". My picture for that, if you made me guess, is
one to two years, maybe one to three years. It's really hard to tell. I have a strong view—99%,
95%—that all this will happen in 10 years. I think that's just a super safe bet.
I have a hunch—this is more like a 50/50 thing—that it's going to be more like
one to two, maybe more like one to three. So one to three years. Country of geniuses,
and the slightly less economically valuable task of editing videos.
It seems pretty economically valuable, let me tell you.
It's just there are a lot of use cases like that. There are a lot of similar ones.
So you're predicting that within one to three years.
And then, generally, Anthropic has predicted that by late '26 or early '27 we will have AI systems
that "have the ability to navigate interfaces available to humans doing digital work today,
intellectual capabilities matching or exceeding that of Nobel Prize winners, and the ability to
interface with the physical world". You gave an interview two months ago
with DealBook where you were emphasizing your company's more responsible compute
scaling as compared to your competitors. I'm trying to square these two views.
If you really believe that we're going to have a country of geniuses, you want as
big a data center as you can get. There's no reason to slow down.
The TAM of a Nobel Prize winner, that can actually do everything a Nobel Prize
winner can do, is trillions of dollars. So I'm trying to square this conservatism,
which seems rational if you have more moderate timelines, with your stated views about progress.
It actually all fits together. We go back to this fast, but not infinitely fast, diffusion.
Let's say that we're making progress at this rate. The technology is making progress this fast.
I have very high conviction that we're going to get there within a few years.
I have a hunch that we're going to get there within a year or two.
So there’s a little uncertainty on the technical side, but pretty strong
confidence that it won't be off by much. What I'm less certain about is, again,
the economic diffusion side. I really do believe that we could
have models that are a country of geniuses in the data center in one to two years.
One question is: How many years after that do the trillions in revenue start rolling in?
I don't think it's guaranteed that it's going to be immediate.
It could be one year, it could be two years, I could even stretch it to five
years although I'm skeptical of that. So we have this uncertainty. Even if the technology goes as
fast as I suspect that it will, we don't know exactly how fast it's going to drive revenue.
We know it's coming, but with the way you buy these data centers, if you're off by a couple
years, that can be ruinous. It is just like how I
wrote in "Machines of Loving Grace". I said I think we might get this powerful AI,
this "country of genius in the data center". That description you gave comes
from "Machines of Loving Grace". I said we'll get that in 2026, maybe 2027. Again,
that is my hunch. I wouldn't be surprised if I'm off by a year or two, but that is my hunch.
Let's say that happens. That's the starting gun. How long does it take to cure all the diseases?
That's one of the ways that drives a huge amount of economic value. You cure every disease. There's
a question of how much of that goes to the pharmaceutical company or the AI company,
but there's an enormous consumer surplus because —assuming we can get access for everyone,
which I care about greatly—we cure all of these diseases. How long does it take? You
have to do the biological discovery, you have to manufacture the new drug,
you have to go through the regulatory process. We saw this with vaccines and COVID.
We got the vaccine out to everyone, but it took a year and a half.
My question is: How long does it take to get the cure for everything—which AI is the genius
that can in theory invent—out to everyone? How long from when that AI first exists
in the lab to when diseases have actually been cured for everyone?
We've had a polio vaccine for 50 years. We're still trying to eradicate it in the
most remote corners of Africa. The Gates Foundation is trying
as hard as they can. Others are trying as hard
as they can. But that's difficult. Again, I don't expect most of the economic diffusion
to be as difficult as that. That's the most difficult case. But there's a real dilemma here.
Where I've settled on it is that it will be faster than anything we've seen in the
world, but it still has its limits. So when we go to buying data centers,
again, the curve I'm looking at is: we've had a 10x a year increase every year.
At the beginning of this year, we're looking at $10 billion in annualized revenue.
We have to decide how much compute to buy. It takes a year or two to actually build out
the data centers, to reserve the data center. Basically I'm saying, "In 2027,
how much compute do I get?" I could assume that the
revenue will continue growing 10x a year, so it'll be $100 billion at the end of
2026 and $1 trillion at the end of 2027. Actually it would be $5 trillion
of compute because it would be $1 trillion a year for five years.
I could buy $1 trillion of compute that starts at the end of 2027.
If my revenue is not $1 trillion, if it's even $800 billion, there's no force on earth,
there's no hedge on earth that could stop me from going bankrupt if I buy that much compute.
Even though a part of my brain wonders if it's going to keep growing 10x,
I can't buy $1 trillion a year of compute in 2027. If I'm just off by a year in that rate of growth,
or if the growth rate is 5x a year instead of 10x a year, then you go bankrupt.
So you end up in a world where you're supporting hundreds of billions, not trillions.
You accept some risk that there's so much demand that you can't support the revenue,
and you accept some risk that you got it wrong and it's still slow.
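A toy version of that bet, with invented numbers, showing how a one-year miss in the growth rate turns a huge commitment into a shortfall:

```python
# Committing to compute ahead of demand. All numbers invented for illustration.
start_revenue = 10e9        # ~$10B annualized at the start of 2026
committed_compute = 1e12    # suppose you commit to ~$1T/year of compute for 2027

for growth in (10, 5):
    revenue_2027 = start_revenue * growth ** 2   # two years of compounding
    gap = revenue_2027 - committed_compute
    print(f"{growth}x/year -> ~${revenue_2027 / 1e9:,.0f}B revenue, "
          f"gap vs commitment: ${gap / 1e9:,.0f}B")
# 10x/year roughly covers the bill; 5x/year leaves a ~$750B annual shortfall,
# which is the "off by a year in the growth rate" bankruptcy case.
```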
When I talked about behaving responsibly, what I meant actually was not the absolute amount.
I think it is true we're spending somewhat less than some of the other players.
It's actually the other things, like have we been thoughtful about it or are we YOLOing and saying,
"We're going to do $100 billion here or $100 billion there"?
I get the impression that some of the other companies have not written down
the spreadsheet, that they don't really understand the risks they're taking.
They're just doing stuff because it sounds cool. We've thought carefully about it. We're
an enterprise business. Therefore, we can rely more on revenue. It's less fickle than consumer.
We have better margins, which is the buffer between buying too much and buying too little.
I think we bought an amount that allows us to capture pretty strong upside worlds.
It won't capture the full 10x a year. Things would have to go pretty badly for
us to be in financial trouble. So we've thought carefully and
we've made that balance. That's what I mean when
I say that we're being responsible. So it seems like it's possible that we
actually just have different definitions of the "country of geniuses in a data center".
Because when I think of actual human geniuses, an actual country of human geniuses in a data center,
I would happily buy $5 trillion worth of compute to run an actual country of
human geniuses in a data center. Let's say JPMorgan or Moderna or
whatever doesn't want to use them. I've got a country of geniuses.
They'll start their own company. If they can't start their own company and they're bottlenecked
by clinical trials… It is worth stating that with clinical trials, most clinical trials fail because
the drug doesn't work. There's no efficacy. I make exactly that point in "Machines of Loving Grace". I say the clinical trials are going to go much faster
than we're used to, but not infinitely fast. Okay, and then suppose it takes a year for
the clinical trials to work out so that you're getting revenue from that and can make more drugs.
Okay, well, you've got a country of geniuses and you're an AI lab.
You could use many more AI researchers. You also think there are these self-reinforcing
gains from smart people working on AI tech. You can have the data center
working on AI progress. Are there substantially
more gains from buying $1 trillion a year of compute versus $300 billion a year of compute?
If your competitor is buying a trillion, yes there is.
Well, no, there's some gain, but then again, there's this chance that they go bankrupt before.
Again, if you're off by only a year, you destroy yourselves. That's the balance. We're
buying a lot. We're buying a hell of a lot. We're buying an amount that's comparable to
what the biggest players in the game are buying. But if you're asking me, "Why haven't we signed
$10 trillion of compute starting in mid-2027?"... First of all, it can't be produced.
There isn't that much in the world. But second, what if the country of
geniuses comes, but it comes in mid-2028 instead of mid-2027? You go bankrupt.
So if your projection is one to three years, it seems like you should want
$10 trillion of compute by 2029 at the latest? Even in the longest version of the timelines
you state, the compute you are ramping up to build doesn't seem in accordance.
What makes you think that? Human wages, let's say,
are on the order of $50 trillion a year— So I won't talk about Anthropic in particular,
but if you talk about the industry, the amount of compute the industry is building this year is
probably, call it, 10-15 gigawatts. It goes up by roughly 3x a year.
So next year's 30-40 gigawatts. 2028 might be 100 gigawatts. 2029 might be like 300 gigawatts.
I'm doing the math in my head, but each gigawatt costs maybe $10 billion,
on the order of $10-15 billion a year. You put that all together and you're
getting about what you described. You’re getting exactly that. You're getting multiple
trillions a year by 2028 or 2029. You're getting exactly what you predict.
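Spelling that back-of-the-envelope industry math out, with the rough gigawatt and cost figures from the conversation:

```python
# Industry build-out: ~10-15 GW this year, growing ~3x a year,
# at roughly $10-15B per gigawatt per year. Rough numbers from the conversation.
gigawatts = {2026: 12}   # midpoint of the 10-15 GW estimate
for year in range(2027, 2030):
    gigawatts[year] = gigawatts[year - 1] * 3

for year, gw in gigawatts.items():
    low, high = gw * 10e9, gw * 15e9
    print(f"{year}: ~{gw} GW -> roughly ${low / 1e12:.1f}-{high / 1e12:.1f}T per year")
# That lands at multiple trillions of dollars a year of industry spend by 2028-2029.
```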
That's for the industry. That's for the industry, that’s right.
Suppose Anthropic's compute keeps 3x-ing a year, and then by 2027-28, you have 10 gigawatts.
Multiply that by, as you say, $10 billion. So then it's like $100 billion a year.
But then you're saying the TAM by 2028 is $200 billion.
Again, I don't want to give exact numbers for Anthropic, but these numbers are too small.
Okay, interesting. You've told investors
that you plan to be profitable starting in 2028. This is the year when we're potentially getting
the country of geniuses in a data center. This is now going to unlock all this progress
in medicine and health and new technologies. Wouldn't this be exactly the time where you'd
want to reinvest in the business and build bigger "countries" so they can make more discoveries?
Profitability is this kind of weird thing in this field.
I don't think in this field profitability is actually a measure of spending down
versus investing in the business. Let's just take a model of this.
I actually think profitability happens when you underestimated the amount of demand you were going
to get and loss happens when you overestimated the amount of demand you were going to get,
because you're buying the data centers ahead of time. Think about it this way. Again,
these are stylized facts. These numbers are not exact. I'm just trying to make a toy model here.
Let's say half of your compute is for training and half of your compute is for inference.
The inference has some gross margin that's more than 50%.
So what that means is that if you were in steady-state, you build a data center and if
you knew exactly the demand you were getting, you would get a certain amount of revenue.
Let’s say you pay $100 billion a year for compute. On $50 billion a year you support
$150 billion of revenue. The other $50 billion is used for training.
Basically you’re profitable and you make $50 billion of profit.
Those are the economics of the industry today, or not today but where we’re
projecting forward in a year or two. The only thing that makes that not the
case is if you get less demand than $50 billion. Then you have more than 50% of your data center
for research and you're not profitable. So you train stronger models,
but you're not profitable. If you get more demand than you thought, then
research gets squeezed, but you're kind of able to support more inference and you're more profitable.
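That toy model in code, using the stylized numbers from the conversation rather than real figures:

```python
# Stylized single-year economics: buy compute ahead of time, plan to split it
# evenly between inference and training, and let realized demand decide the rest.
def year_pnl(compute_bill=100e9, inference_share=0.5, gross_margin=2 / 3,
             demand_revenue=150e9):
    inference_capacity = compute_bill * inference_share      # ~$50B of serving
    max_revenue = inference_capacity / (1 - gross_margin)    # ~$150B supportable
    revenue = min(demand_revenue, max_revenue)
    return revenue - compute_bill   # the rest of the compute bill goes to training

for demand in (150e9, 80e9, 300e9):
    print(f"demand ${demand / 1e9:.0f}B -> P&L {year_pnl(demand_revenue=demand) / 1e9:+.0f}B")
# Demand as planned: +50B. Demand overestimated: -20B, with spare compute going to
# research. Demand underestimated: capped at +50B unless research gets squeezed.
```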
Maybe I'm not explaining it well, but the thing I'm trying to say is that you
decide the amount of compute first. Then you have some target desire of
inference versus training, but that gets determined by demand.
It doesn't get determined by you. What I'm hearing is the reason
you're predicting profit is that you are systematically underinvesting in compute?
No, no, no. I'm saying it's hard to predict. These things about 2028 and when it will happen,
that's our attempt to do the best we can with investors.
All of this stuff is really uncertain because of the cone of uncertainty.
We could be profitable in 2026 if the revenue grows fast enough.
If we overestimate or underestimate the next year, that could swing wildly.
What I'm trying to get at is that you have a model in your head of a business that invests,
invests, invests, gets scale and then becomes profitable.
There's a single point at which things turn around.
I don't think the economics of this industry work that way.
I see. So if I'm understanding correctly, you're saying that because of the discrepancy
between the amount of compute we should have gotten and the amount of compute we got,
we were sort of forced to make profit. But that doesn't mean we're going
to continue making profit. We're going to reinvest the money
because now AI has made so much progress and we want a bigger country of geniuses.
So we're back to a situation where revenue is high, but losses are also high.
If every year we predict exactly what the demand is going to be, we'll be profitable every year.
Because spending 50% of your compute on research, roughly, plus a gross margin that's higher than
50% and correct demand prediction leads to profit. That's the profitable business model that I think
is kind of there, but obscured by these building ahead and prediction errors.
I guess you're treating the 50% as a sort of given constant, whereas in fact,
if AI progress is fast and you can increase the progress by scaling up more, you should just have
more than 50% and not make profit. But here's what I'll say. You
might want to scale it up more. Remember the log returns to scale.
If 70% would get you a very little bit of a smaller model through a factor of 1.4x...
That extra $20 billion, each dollar there is worth much less to you because of the log-linear setup.
So you might find that it's better to invest that $20 billion in serving
inference or in hiring engineers who are kind of better at what they're doing.
So the reason I said 50%... That's not exactly our target. It's not exactly going to be 50%.
It’ll probably vary over time. What I'm saying is the log-linear return, what it leads to is you
spend of order one fraction of the business. Like not 5%, not 95%. Then you get diminishing returns.
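In symbols, the diminishing-returns point, as a sketch with an illustrative exponent rather than a measured one: if the capability gain from extra training compute scales like a small power of the compute ratio, going from 50% to 70% of a fixed budget is only a 1.4x factor, for example

```latex
\left( \frac{0.7\,C}{0.5\,C} \right)^{\alpha} = 1.4^{\,\alpha} \approx 1.03
\quad \text{for } \alpha \approx 0.1
```

Roughly a 3% improvement for 40% more training spend, which is why each marginal dollar there can be worth less than a dollar spent on inference or engineering.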
I feel strange that I'm convincing Dario to believe in AI progress or something.
Okay, you don't invest in research because it has diminishing returns,
but you invest in the other things you mentioned. I think profit at a sort of macro level—
Again, I'm talking about diminishing returns, but after you're spending $50 billion a year.
This is a point I'm sure you would make, but diminishing returns on a genius could
be quite high. More generally,
what is profit in a market economy? Profit is basically saying other
companies in the market can do more things with this money than I can.
Put aside Anthropic. I don't want to give information about Anthropic.
That’s why I'm giving these stylized numbers. But let's just derive the
equilibrium of the industry. Why doesn't everyone spend 100% of their
compute on training and not serve any customers? It's because if they didn't get any revenue,
they couldn't raise money, they couldn't do compute deals,
they couldn't buy more compute the next year. So there's going to be an equilibrium where every
company spends less than 100% on training and certainly less than 100% on inference.
It should be clear why you don't just serve the current models and never train another model,
because then you don't have any demand because you'll fall behind. So there's some equilibrium.
It's not gonna be 10%, it's not gonna be 90%. Let's just say as a stylized fact, it's 50%.
That's what I'm getting at. I think we're gonna be in a position where that equilibrium of how much
you spend on training is less than the gross margins that you're able to get on compute.
So the underlying economics are profitable. The problem is you have this hellish demand
prediction problem when you're buying the next year of compute and you might guess under and be
very profitable but have no compute for research. Or you might guess over and you are not
profitable and you have all the compute for research in the world. Does that make sense?
Just as a dynamic model of the industry? Maybe stepping back, I'm not saying I think
the "country of geniuses" is going to come in two years and therefore you should buy this compute.
To me, the end conclusion you're arriving at makes a lot of sense.
But that's because it seems like "country of geniuses" is hard and there's a long way to go.
So stepping back, the thing I'm trying to get at is more that it seems like your worldview
is compatible with somebody who says, "We're like 10 years away from a world in which we're
generating trillions of dollars of value." That's just not my view. So I'll make
another prediction. It is hard for me to see that there won't be trillions
of dollars in revenue before 2030. I can construct a plausible world.
It takes maybe three years. That would be the end of what I think is plausible.
Like in 2028, we get the real "country of geniuses in the data center".
The revenue's going into the low hundreds of billions by 2028, and then the country
of geniuses accelerates it to trillions. We’re basically on the slow end of diffusion.
It takes two years to get to the trillions. That would be the world where it takes until 2030.
I suspect even composing the technical exponential and diffusion exponential,
we’ll get there before 2030. So you laid out a model where Anthropic makes
profit because it seems like fundamentally we're in a compute-constrained world.
So eventually we keep growing compute— I think the way the profit comes is… Again,
let's just abstract the whole industry here. Let's just imagine we're in an economics textbook.
We have a small number of firms. Each can invest a limited amount.
Each can invest some fraction in R&D. They have some marginal cost to serve.
The gross profit margins on that marginal cost are very high because inference is efficient.
There's some competition, but the models are also differentiated.
Companies will compete to push their research budgets up.
But because there's a small number of players, we have the... What is it called?
The Cournot equilibrium, I think, is what the small-number-of-firms equilibrium is called.
The point is it doesn't equilibrate to perfect competition with zero margins.
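For reference, the textbook symmetric Cournot result makes this concrete. This is a generic sketch with linear inverse demand P = a - bQ, constant marginal cost c, and n identical firms, not figures from this conversation:

$$ q^* = \frac{a - c}{b(n+1)}, \qquad P^* = \frac{a + nc}{n+1}, \qquad P^* - c = \frac{a - c}{n+1} $$

The margin P^* - c stays strictly positive for any finite n and only approaches zero as the number of firms grows without bound; with n = 3 it is a quarter of the gap a - c.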
If there's three firms in the economy and all are kind of independently behaving rationally,
it doesn't equilibrate to zero. Help me understand that, because
right now we do have three leading firms and they're not making profit. So what is changing?
Again, the gross margins right now are very positive.
What's happening is a combination of two things. One is that we're still in the exponential
scale-up phase of compute. A model gets trained. Let's say a model got
trained that costs $1 billion last year. Then this year it produced $4 billion of
revenue and cost $1 billion to inference from. Again, I'm using stylized numbers here, but that
would be 75% gross margins and this 25% tax. So that model as a whole makes $2 billion.
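To make the stylized arithmetic explicit, here is a minimal sketch in Python using the illustrative numbers from this exchange, including the $10 billion next-model training run mentioned just below; these are not actual company figures:

# Per-model vs. company-level P&L with the stylized numbers above.
training_cost = 1e9        # what the current model cost to train last year
revenue = 4e9              # revenue it produces this year
inference_cost = 1e9       # cost of serving that revenue
next_training_run = 10e9   # the exponentially larger next model (see below)

gross_margin = (revenue - inference_cost) / revenue               # 0.75
model_pnl = revenue - inference_cost - training_cost              # +$2B
company_cash_flow = revenue - inference_cost - next_training_run  # -$7B

print(f"gross margin {gross_margin:.0%}, "
      f"model P&L ${model_pnl / 1e9:+.0f}B, "
      f"company cash flow ${company_cash_flow / 1e9:+.0f}B")
# Each model is profitable on its own; the company shows a loss only
# because the next training run is 10x the last one.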
But at the same time, we're spending $10 billion to train the next model because
there's an exponential scale-up. So the company loses money. Each model
makes money, but the company loses money. The equilibrium I'm talking about is an
equilibrium where we have the "country of geniuses in a data center", but that
model training scale-up has equilibrated more. Maybe it's still going up. We're still trying to
predict the demand, but it's more leveled out. I'm confused about a couple of things there.
Let's start with the current world. In the current world, you're right that,
as you said before, if you treat each individual model as a company, it's profitable.
But of course, a big part of the production function of being a frontier lab is training
the next model, right? Yes, that's right.
If you didn't do that, then you'd make profit for two months and then
you wouldn't have margins because you wouldn't have the best model.
But at some point that reaches the biggest scale that it can reach.
And then in equilibrium, we have algorithmic improvements, but we're spending roughly the
same amount to train the next model as we spend to train the current model.
At some point you run out of money in the economy. A fixed lump of labor fallacy… The economy is
going to grow, right? That's one of your predictions. We're going
to have the data centers in space. Yes, but this is another example
of the theme I was talking about. The economy will grow much faster
with AI than I think it ever has before. Right now the compute is growing 3x a year.
I don't believe the economy is gonna grow 3x a year.
I said this in "Machines of Loving Grace", I think we may get 10-20%
per year growth in the economy, but we're not gonna get 3x growth in the economy.
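To see why a 3x-per-year compute scale-up cannot continue indefinitely against 10-20% economic growth, here is a toy projection; the starting figures are assumptions for illustration only:

# Toy projection: compute spend tripling yearly vs. the whole economy growing 20% yearly.
compute_spend = 0.3e12   # assume ~$0.3T/year of AI compute spend today (illustrative)
world_gdp = 110e12       # roughly $110T of world GDP
years = 0
while compute_spend < world_gdp:
    compute_spend *= 3.0
    world_gdp *= 1.20
    years += 1
print(f"Compute spend would exceed world GDP within about {years} years.")
# With these assumptions that happens in roughly seven years, so the 3x
# scale-up has to level off well before compute is "most of the economy".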
So I think in the end, if compute becomes the majority of what the economy produces,
it's gonna be capped by that. So let's assume a model
where compute stays capped. The world where frontier labs are making money
is one where they continue to make fast progress. Because fundamentally your margin is limited by
how good the alternative is. So you are able to make money
because you have a frontier model. If you didn't have a frontier model
you wouldn't be making money. So this model requires there
never to be a steady state. Forever and ever you keep
making more algorithmic progress. I don't think that's true. I mean,
I feel like we're in an economics class. Do you know the Tyler Cowen quote?
We never stop talking about economics. We never stop talking about economics.
So no, I don't think this field's going to be a monopoly.
All my lawyers never want me to say the word "monopoly".
But I don't think this field's going to be a monopoly.
You do get industries in which there are a small number of players.
Not one, but a small number of players. Ordinarily, the way you get monopolies
like Facebook or Meta—I always call them Facebook—is these kinds of network effects.
The way you get industries in which there are a small number of players,
is very high costs of entry. Cloud is like this. I think cloud is a good example of this.
There are three, maybe four, players within cloud. I think that's the same for AI, three, maybe four.
The reason is that it's so expensive. It requires so much expertise and so
much capital to run a cloud company. You have to put up all this capital.
In addition to putting up all this capital, you have to get all of this other stuff
that requires a lot of skill to make it happen. So if you go to someone and you're like, "I want
to disrupt this industry, here's $100 billion." You're like, "okay, I'm putting in $100 billion
and also betting that you can do all these other things that these people have been doing."
Only to decrease the profit. The effect of your entering
is that profit margins go down. So, we have equilibria like this
all the time in the economy where we have a few players. Profits are not astronomical. Margins
are not astronomical, but they're not zero. That's what we see on cloud. Cloud is very
undifferentiated. Models are more differentiated than cloud.
Everyone knows Claude is good at different things than GPT is good at, than Gemini is good at.
It's not just that Claude's good at coding, GPT is good at math and reasoning.
It's more subtle than that. Models are good at different types of coding. Models have different
styles. I think these things are actually quite different from each other, and so I would expect
more differentiation than you see in cloud. Now, there actually is one counter-argument.
That counter-argument is that if AI models can themselves do the process of producing models,
then that capability could spread throughout the economy.
But that is not an argument for commoditizing AI models in general.
That's kind of an argument for commoditizing the whole economy at once.
I don't know what quite happens in that world where basically anyone
can do anything, anyone can build anything, and there's no moat around anything at all.
I don't know, maybe we want that world. Maybe that's the end state here.
Maybe when AI models can do everything, if we've solved all the safety and security problems,
that's one of the mechanisms for the economy just flattening itself again.
But that's kind of far post-"country of geniuses in the data center."
Maybe a finer way to put that potential point is: 1) it seems like AI research is especially
loaded on raw intellectual power, which will be especially abundant in the world of AGI.
And 2) if you just look at the world today, there are very few technologies that seem to be
diffusing as fast as AI algorithmic progress. So that does hint that this industry is
sort of structurally diffusive. I think coding is going fast, but
I think AI research is a superset of coding and there are aspects of it that are not going fast.
But I do think, again, once we get coding, once we get AI models going fast, then that will speed up
the ability of AI models to do everything else. So while coding is going fast now, I think once
the AI models are building the next AI models and building everything else,
the whole economy will kind of go at the same pace. I am worried geographically, though.
I'm a little worried that just proximity to AI, having heard about AI, may be one differentiator.
So when I said the 10-20% growth rate, a worry I have is that the growth rate could be like 50%
in Silicon Valley and parts of the world that are socially connected to Silicon Valley, and not that
much faster than its current pace elsewhere. I think that'd be a pretty messed up world.
So one of the things I think about a lot is how to prevent that.
Do you think that once we have this country of geniuses in a data center, that
robotics is sort of quickly solved afterwards? Because it seems like a big problem with robotics
is that a human can learn how to teleoperate current hardware, but current AI models can't,
at least not in a way that's super productive. And so if we have this ability to learn like
a human, shouldn't it solve robotics immediately as well?
I don't think it's dependent on learning like a human.
It could happen in different ways. Again, we could have trained the model on
many different video games, which are like robotic controls, or many different simulated robotics
environments, or just train them to control computer screens, and they learn to generalize.
So it will happen... it's not necessarily dependent on human-like learning.
Human-like learning is one way it could happen. If the model's like, "Oh, I pick up a robot,
I don't know how to use it, I learn," that could happen because we discovered continual learning.
That could also happen because we trained the model on a bunch of environments and
then generalized, or it could happen because the model learns that in the context length.
It doesn't actually matter which way. If we go back to the discussion we had
an hour ago, that type of thing can happen in several different ways.
But I do think when for whatever reason the models have those skills, then robotics will be
revolutionized—both the design of robots, because the models will be much better than humans at
that, and also the ability to control robots. So we'll get better at building the physical
hardware, building the physical robots, and we'll also get better at controlling it.
Now, does that mean the robotics industry will also be generating
trillions of dollars of revenue? My answer there is yes, but there will be
the same extremely fast, but not infinitely fast diffusion. So will robotics be revolutionized?
Yeah, maybe tack on another year or two. That's the way I think about these things.
Makes sense. There's a general skepticism about extremely fast progress. Here's my view. It sounds
like you are going to solve continual learning one way or another within a matter of years.
But just as people weren't talking about continual learning a couple of years ago,
and then we realized, "Oh, why aren't these models as useful as they could be right now,
even though they are clearly passing the Turing test and are experts in so many different domains?
Maybe it's this thing." Then we solve this thing and we realize, actually, there's another thing
that human intelligence can do that's a basis of human labor that these models can't do.
So why not think there will be more things like this, where
we keep finding more missing pieces of human intelligence? Well, to be clear, I think continual learning, as
I've said before, might not be a barrier at all. I think we may just get there by pre-training
generalization and RL generalization. I think there just
might not be such a thing at all. In fact, I would point to the history
in ML of people coming up with things that are barriers that end up kind of
dissolving within the big blob of compute. People talked about, "How do your models
keep track of nouns and verbs?" "They can understand syntactically,
but they can't understand semantically; it's only statistical correlations."
"You can understand a word, but you can't understand a paragraph."
"There's reasoning: you can't do reasoning." But then suddenly it turns out you can
do code and math very well. So I think there's actually a
stronger history of some of these things seeming like a big deal and then kind of dissolving. Some
of them are real. The need for data is real, maybe continual learning is a real thing.
But again, I would ground us in something like code.
I think we may get to the point in a year or two where the models can
just do SWE end-to-end. That's a whole task. That's a whole sphere of human activity that
we're just saying models can do now. When you say end-to-end, do you mean
setting technical direction, understanding the context of the problem, et cetera?
Yes. I mean all of that. Interesting. I feel like that is AGI-complete,
which maybe is internally consistent. But it's not like saying 90%
of code or 100% of code. No, I gave this spectrum:
90% of code, 100% of code, 90% of end-to-end SWE, 100% of end-to-end SWE.
New tasks are created for SWEs. Eventually those get done as well.
It's a long spectrum there, but we're traversing the spectrum very quickly.
I do think it's funny that I've seen a couple of podcasts you've done where
the hosts will be like, "But Dwarkesh wrote the essay about the continuous learning thing."
It always makes me crack up because you've been an AI researcher for 10 years.
I'm sure there's some feeling of, "Okay, so a podcaster wrote an essay,
and every interview I get asked about it." The truth of the matter is that we're all
trying to figure this out together. There are some ways in which I'm
able to see things that others aren't. These days that probably has more to do
with seeing a bunch of stuff within Anthropic and having to make a bunch of decisions than I have
any great research insight that others don't. I'm running a 2,500-person company.
It's actually pretty hard for me to have concrete research insight, much harder than it would have
been 10 years ago or even two or three years ago. As we go towards a world of a full drop-in
remote worker replacement, does an API pricing model still make the most sense?
If not, what is the correct way to price AGI, or serve AGI?
I think there's going to be a bunch of different business models here, all at once,
that are going to be experimented with. I actually do think that the API
model is more durable than many people think. One way I think about it is if the technology
is advancing quickly, if it's advancing exponentially, what that means is there's
always a surface area of new use cases that have been developed in the last three months.
Any kind of product surface you put in place is always at risk of sort of becoming irrelevant.
Any given product surface probably makes sense for a range of capabilities of the model.
The chatbot is already running into limitations where making it smarter doesn't really help the
average consumer that much. But I don't think that's
a limitation of AI models. I don't think that's evidence
that the models are good enough and that getting better doesn't matter to the economy;
it just doesn't matter as much to that particular product. So I think the value of the API is that the API
always offers an opportunity, very close to the bare metal, to build on what the latest thing is.
There's always going to be this front of new startups and new ideas that
weren't possible a few months ago and are possible because the model is advancing.
I actually predict that it's going to exist alongside other business models, but we're always going
to have the API business model because there's always going to be a need for a thousand different
people to try experimenting with the model in a different way. 100 of them become startups and
ten of them become big successful startups. Two or three really end up being the way
that people use the model of a given generation. So I basically think it's always going to exist.
At the same time, I'm sure there are going to be other business models as well.
Not every token that's output by the model is worth the same amount.
Think about what is the value of the tokens that the model outputs when someone calls
them up and says, "My Mac isn't working," or something, the model's like, "restart it."
That person hasn't heard it before, but the model has said it 10 million times.
Maybe that's worth like a dollar or a few cents or something.
Whereas if the model goes to one of the pharmaceutical companies and it says, "Oh,
you know, this molecule you're developing, you should take the aromatic ring from that end of the
molecule and put it on that end of the molecule. If you do that, wonderful things will happen."
Those tokens could be worth tens of millions of dollars.
So I think we're definitely going to see business models that recognize that.
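As a back-of-the-envelope illustration of why a single per-token price struggles here; both dollar values are hypothetical, following the examples above:

# Same-length answers, wildly different value per token.
tokens_per_answer = 50
value_restart_tip = 0.05          # "restart your Mac": worth a few cents
value_molecule_hint = 10_000_000  # a drug-design suggestion: worth tens of millions

per_token_support = value_restart_tip / tokens_per_answer
per_token_pharma = value_molecule_hint / tokens_per_answer
print(f"~${per_token_support:.4f} vs. ~${per_token_pharma:,.0f} of value per token, "
      f"a {value_molecule_hint / value_restart_tip:,.0f}x spread")
# A flat per-token price can't capture a spread of eight orders of magnitude,
# which is why outcome-based or labor-like pricing comes up.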
At some point we're going to see "pay for results" in some form, or we may see forms of compensation
that are like labor, that kind of work by the hour. I don't know. I think because it's a new
industry, a lot of things are going to be tried. I don't know what will turn out to
be the right thing. I take your point that
people will have to try things to figure out what is the best way to use this blob of intelligence.
But what I find striking is Claude Code. I don't think in the history of startups
there has been a single application that has been as hotly competed in as coding agents.
Claude Code is a category leader here. That seems surprising to me. It doesn't seem
intrinsically that Anthropic had to build this. I wonder if you have an accounting of why it had
to be Anthropic or how Anthropic ended up building an application in addition
to the model underlying it that was successful. So it actually happened in a pretty simple way,
which is that we had our own coding models, which were good at coding.
Around the beginning of 2025, I said, "I think the time has come where you can have
nontrivial acceleration of your own research if you're an AI company by using these models."
Of course, you need an interface, you need a harness to use them.
So I encouraged people internally. I didn't say this is one thing that you have to use.
I just said people should experiment with this. I think it might have been originally
called Claude CLI, and then the name eventually got changed to Claude Code.
Internally, it was the thing that everyone was using and it was seeing fast internal adoption.
I looked at it and I said, "Probably we should launch this externally, right?"
It's seen such fast adoption within Anthropic. Coding is a lot of what we do.
We have an audience of many, many hundreds of people that's in some ways at least
representative of the external audience. So it looks like we already have product
market fit. Let's launch this thing. And then we launched it. I think just the fact that we
ourselves are developing the model and we ourselves know what we most need the model for,
I think it creates this feedback loop. I see. In the sense that, let's say, a
developer at Anthropic is like, "Ah, it would be better if it was better at this X thing."
Then you bake that into the next model that you build.
That's one version of it, but then there's just the ordinary product iteration.
We have a bunch of coders within Anthropic, they use Claude Code
every day and so we get fast feedback. That was more important in the early days.
Now, of course, there are millions of people using it, and so we get
a bunch of external feedback as well. But it's just great to be able to get
kind of fast internal feedback. I think this is the reason why we
launched a coding model and didn't launch a pharmaceutical company.
My background's in biology, but we don't have any of the resources that
are needed to launch a pharmaceutical company. Let me now ask you about making AI go well.
It seems like whatever vision we have about how AI goes well has to be compatible with two things:
1) the ability to build and run AIs is diffusing extremely rapidly and 2) the
population of AIs, the number we have and their intelligence, will also increase very rapidly.
That means that lots of people will be able to build huge populations of misaligned AIs,
or AIs that are just like companies trying to increase their
footprint, or that have weird psyches like Sydney Bing, but now superhuman.
What is a vision for a world in which we have an equilibrium that is compatible
with lots of different AIs, some of which are misaligned, running around?
I think in "The Adolescence of Technology", I was skeptical of the balance of power.
But the thing I was specifically skeptical of is the idea that, with three or four of these companies
all building models derived from the same thing, they would check each other.
Or even that any number of them would check each other.
We might live in an offense-dominant world where one person or one AI model is smart enough to do
something that causes damage to everyone else. In the short run, we have a limited number
of players now. So we can start
within the limited number of players. We need to put in place the safeguards.
We need to make sure everyone does the right alignment work.
We need to make sure everyone has bioclassifiers. Those are the immediate things we need to do.
I agree that that doesn't solve the problem in the long run, particularly if the ability of
AI models to make other AI models proliferates, then the whole thing can become harder to solve.
I think in the long run we need some architecture of governance.
We need some architecture of governance that preserves human freedom,
but also allows us to govern a very large number of human systems, AI systems, hybrid
human-AI companies or economic units. So we're gonna need to think about:
how do we protect the world against bioterrorism? How do we protect the world against mirror life?
Probably we're gonna need some kind of AI monitoring system
that monitors for all of these things. But then we need to build this in a way
that preserves civil liberties and our constitutional rights.
So I think just as anything else, it's a new security landscape with a new set of
tools and a new set of vulnerabilities. My worry is, if we had 100 years for this
to happen all very slowly, we'd get used to it. We've gotten used to the presence of explosives
in society or the presence of various new weapons or the presence of video cameras.
We would get used to it over 100 years and we’d develop governance mechanisms. We'd
make our mistakes. My worry is just that this is happening all so fast.
So maybe we need to do our thinking faster about how to make these governance mechanisms work.
It seems like in an offense-dominant world, over the course of the next century—the idea is that AI
is making the progress that would happen over the next century happen in some period of five to ten
years—we would still need the same mechanisms, or balance of power would be similarly intractable,
even if humans were the only game in town. I guess we have the advice of AI.
But it fundamentally doesn't seem like a totally different ball game here.
If checks and balances were going to work with AIs, they would work with humans as well.
If they aren't going to work with AIs, they wouldn't work with humans either.
So maybe this just dooms human checks and balances as well.
Again, I think there's some way to make this happen.
The governments of the world may have to work together to make it happen.
We may have to talk to AIs about building societal structures in such a way that these
defenses are possible. I don't know. I don't want to say this is so far ahead in time,
but it's so far ahead in technological ability, even though that ability may arrive over a short period of time,
that it's hard for us to anticipate it in advance. Speaking of governments getting involved,
on December 26, the Tennessee legislature introduced a bill which said, "It would
be an offense for a person to knowingly train artificial intelligence to provide
emotional support, including through open-ended conversations with a user."
Of course, one of the things that Claude attempts to do is be a thoughtful, knowledgeable friend.
In general, it seems like we're going to have this patchwork of state laws.
A lot of the benefits that normal people could experience as a result of AI are going to be
curtailed, especially when we get into the kinds of things you discuss in "Machines
of Loving Grace": biological freedom, mental health improvements, et cetera.
It seems easy to imagine worlds in which these get Whac-A-Moled away by different laws, whereas
bills like this don't seem to address the actual existential threats that you're concerned about.
I'm curious to understand, in the context of things like this, Anthropic's position
against the federal moratorium on state AI laws. There are many different things going on at once.
I think that particular law is dumb. It was clearly made by legislators
who just probably had little idea what AI models can and can't do.
They're like, "AI models serving us, that just sounds scary.
I don't want that to happen." So we're not in favor of that.
But that wasn't the thing that was being voted on. The thing that was being voted on is:
we're going to ban all state regulation of AI for 10 years with no apparent plan to do any
federal regulation of AI, which would take Congress to pass, which is a very high bar.
So the idea that we'd ban states from doing anything for 10 years… People said they had
a plan for the federal government, but there was no actual proposal on the table. There was
no actual attempt. Given the serious dangers that I lay out in "Adolescence of Technology"
around things like biological weapons and bioterrorism, autonomy risk, and the
timelines we've been talking about—10 years is an eternity—I think that's a crazy thing to do.
So if that's the choice, if that's what you force us to choose, then we're going
to choose not to have that moratorium. I think the benefits of that position
exceed the costs, but it's not a perfect position if that's the choice.
Now, I think the thing that we should do, the thing that I would support, is the federal
government should step in, not saying "states you can't regulate", but "Here's what we're going to
do, and states you can't differ from this." I think preemption is fine in the sense of
saying that the federal government says, "Here is our standard. This applies to everyone.
States can't do something different." That would be something I would support
if it would be done in the right way. But this idea of states, "You can't do
anything and we're not doing anything either," that struck us as very much not making sense.
I think it will not age well, it is already starting to not age well with
all the backlash that you've seen. Now, in terms of what we would want,
the things we've talked about are starting with transparency standards in order to monitor some
of these autonomy risks and bioterrorism risks. As the risks become more serious, as we get more
evidence for them, then I think we could be more aggressive in some targeted ways and say, "Hey,
AI bioterrorism is really a threat. Let's pass a law that forces
people to have classifiers." I could even imagine… It depends.
It depends how serious the threat ends up being. We don't know for sure. We need to pursue
this in an intellectually honest way where we say that ahead of time, the risk has not emerged yet.
But I could certainly imagine, with the pace that things are going at,
a world where later this year we say, "Hey, this AI bioterrorism stuff is really serious.
We should do something about it. We should put it in a federal standard.
If the federal government won't act, we should put it in a state standard." I could totally see that.
I'm concerned about a world where if you just consider the pace of progress you're expecting,
the life cycle of legislation... The benefits arrive, as you say, slowly enough because
of diffusion lag that I really do think this patchwork of state
laws, on the current trajectory, would curtail them. I mean if having an emotional chatbot friend is
something that freaks people out, then just imagine the kinds of actual benefits from AI
we want normal people to be able to experience. From improvements in health and healthspan and
improvements in mental health and so forth. Whereas at the same time, it seems like you
think the dangers are already on the horizon and I just don't see that much… It seems like it would
be especially injurious to the benefits of AI as compared to the dangers of AI.
So that's maybe where the cost-benefit calculus makes less sense to me.
So there's a few things here. People talk about there being
thousands of these state laws. First of all, the vast,
vast majority of them do not pass. The world works a certain way in theory,
but just because a law has been passed doesn't mean it's really enforced.
The people implementing it may be like, "Oh my God, this is stupid.
It would mean shutting off everything that's ever been built in Tennessee."
Very often, laws are interpreted in a way that makes them not as dangerous or harmful.
Of course, the same applies on the other side: if you're passing a law to stop a bad thing,
you have this problem as well. My basic view is that if we could
decide what laws were passed and how things were done—and we’re only one small input
into that—I would deregulate a lot of the stuff around the health benefits of AI.
I don't worry as much about the chatbot laws. I actually worry more about the drug approval
process, where I think AI models are going to greatly accelerate the rate at which we discover
drugs, and the pipeline will get jammed up. The pipeline will not be prepared to process
all the stuff that's going through it. I think reform of the regulatory process
should lean into the fact that we have a lot of things coming where the safety and
efficacy are actually going to be really crisp and clear, a beautiful thing, and really effective.
Maybe we don't need all this superstructure around it that was designed around an era of drugs that
barely work and often have serious side effects. At the same time, I think we should be
ramping up quite significantly the safety and security legislation.
Like I've said, starting with transparency is my view of trying not to hamper the industry,
trying to find the right balance. I'm worried about it. Some people criticize
my essay, saying, "That's too slow. The dangers of AI will come too soon
if we do that." Well, basically,
I think the last six months and maybe the next few months are going to be about transparency.
Then, if these risks emerge when we're more certain of them—which
I think we might be as soon as later this year—then I think we need to act very fast
in the areas where we've actually seen the risk. I think the only way to do this is to be nimble.
Now, the legislative process is normally not nimble, but we need to emphasize the
urgency of this to everyone involved. That's why I'm sending this message of urgency.
That's why I wrote "Adolescence of Technology". I wanted policymakers, economists, national
security professionals, and decision-makers to read it so that they have some hope of acting
faster than they would have otherwise. Is there anything you can do or advocate
that would make it more certain that the benefits of AI are better instantiated?
I feel like you have worked with legislatures to say, "Okay,
we're going to prevent bioterrorism here. We're going to increase transparency, we're
going to increase whistleblower protection." But I think by default, the actual benefits
we're looking forward to seem very fragile to different kinds of moral panics or
political economy problems. I don't actually agree that
much regarding the developed world. I feel like in the developed world,
markets function pretty well. When there's a lot of money to
be made on something and it's clearly the best available alternative, it's actually hard for
the regulatory system to stop it. We're seeing that in AI itself.
A thing I've been trying to fight for is export controls on chips to China.
That's in the national security interest of the US.
That's squarely within the policy beliefs of almost everyone in Congress of both parties.
The case is very clear. The counterarguments against it, I'll politely call them fishy.
Yet it doesn't happen and we sell the chips because there's so much money riding on it.
That money wants to be made. In that case, in my opinion, that's a bad thing.
But it also applies when it's a good thing. So if we're talking about drugs and benefits of
the technology, I am not as worried about those benefits being hampered in the developed world.
I am a little worried about them going too slow. As I said, I do think we should work to speed
the approval process in the FDA. I do think we should fight against
these chatbot bills that you're describing. Taken individually, I'm against them. I
think they're stupid. But I actually think the bigger worry is the developing world, where we
don't have functioning markets and where we often can't build on the technology that we've had.
I worry more that those folks will get left behind.
And I worry that even if the cures are developed, maybe there's someone in rural
Mississippi who doesn't get them either. That's a smaller version of the concern
we have in the developing world. So the things we've been doing
are working with philanthropists. We work with folks who deliver medicine and
health interventions to the developing world, to sub-Saharan Africa, India, Latin America,
and other developing parts of the world. That's the thing I think that
won't happen on its own. You mentioned export controls.
Why shouldn't the US and China both have a "country of geniuses in a data center"?
Why won’t it happen or why shouldn't it happen? Why shouldn't it happen.
If this does happen, we could have a few situations.
If we have an offense-dominant situation, we could have a situation
like nuclear weapons, but more dangerous. Either side could easily destroy everything.
We could also have a world where it's unstable. The nuclear equilibrium is
stable because of deterrence. But let's say there was uncertainty about,
if the two AIs fought, which AI would win? That could create instability. You often have
conflict when the two sides have a different assessment of their likelihood of winning.
If one side is like, "Oh yeah, there's a 90% chance I'll win," and the other side thinks
the same, then a fight is much more likely. They can't both be right,
but they can both think that. But this seems like a fully general argument
against the diffusion of AI technology. That's the implication of this world.
Let me just go on, because I think we will get diffusion eventually.
The other concern I have is that governments will oppress their own people with AI.
I'm worried about a world where you have a country in which there’s already a government that's
building a high-tech authoritarian state. To be clear, this is about the government.
This is not about the people. We need to find a way for
people everywhere to benefit. My worry here is about governments.
My worry is if the world gets carved up into two pieces, one of those two pieces
could be authoritarian or totalitarian in a way that's very difficult to displace.
Now, will governments eventually get powerful AI, and is there a risk of authoritarianism?
Yes. Will governments eventually get powerful AI, and is there a risk of
bad equilibria? Yes, I think both things. But the initial conditions matter. At some point, we're
going to need to set up the rules of the road. I'm not saying that one country, either the United
States or a coalition of democracies—which I think would be a better setup, although it
requires more international cooperation than we currently seem to want to make—should just say,
"These are the rules of the road." There's going to be some negotiation.
The world is going to have to grapple with this. What I would like is for the democratic nations of
the world—those whose governments represent something closer to pro-human values—to hold the
stronger hand and have more leverage when the rules of the road are set.
So I'm very concerned about that initial condition.
I was re-listening to the interview from three years ago, and one of the ways it
aged poorly is that I kept asking questions assuming there was going to be some key
fulcrum moment two to three years out. In fact, now that we are that far out, it just seems
like progress continues, AI improves, AI is more diffused, and people will use it for more things.
It seems like you're imagining a world in the future where the countries get together, and
"Here's the rules of the road, here's the leverage we have, and here's the leverage you have."
But on the current trajectory, everybody will have more AI.
Some of that AI will be used by authoritarian countries.
Some of that within the authoritarian countries will be used by private
actors versus state actors. It's not clear who will benefit more.
It's always hard to tell in advance. It seems like the internet privileged
authoritarian countries more than you would've expected.
Maybe AI will be the opposite way around. I want to better understand what
you're imagining here. Just to be precise about it,
I think the exponential of the underlying technology will continue as it has before.
The models get smarter and smarter, even when they get to a "country of geniuses in a data center."
I think you can continue to make the model smarter.
There's a question of getting diminishing returns on their value in the world.
How much does it matter after you've already solved human biology?
At some point you can do harder, more abstruse math problems, but nothing after that matters.
Putting that aside, I do think the exponential will continue, but there will be certain
distinguished points on the exponential. Companies, individuals, and countries
will reach those points at different times. In "The Adolescence of Technology" I talk about:
Is a nuclear deterrent still stable in the world of AI?
I don't know, but that's an example of one thing we've taken for granted.
The technology could reach such a level that we can no longer be certain of it.
Think of others. There are points where if you reach a certain level, maybe you have offensive
cyber dominance, and every computer system is transparent to you after that unless the
other side has an equivalent defense. I don't know what the critical moment
is or if there's a single critical moment. But I think there will be either a critical
moment, a small number of critical moments, or some critical window where AI confers
some large advantage from the perspective of national security, and one country or
coalition has reached it before others. I'm not advocating that they just say,
"Okay, we're in charge now." That's not how I think about it.
The other side is always catching up. There are extreme actions you're not
willing to take, and it's not right to take complete control anyway.
But at the point that happens, people are going to understand that the world has changed.
There's going to be some negotiation, implicit or explicit, about what the
post-AI world order looks like. My interest is in making that
negotiation be one in which classical liberal democracy has a strong hand.
I want to understand what that better means, because you say in the essay,
"Autocracy is simply not a form of government that people can accept in the post-powerful AI age."
That sounds like you're saying the CCP as an institution cannot exist after we get AGI.
That seems like a very strong demand, and it seems to imply a world where the leading lab
or the leading country will be able to—and by that language, should get to—determine
how the world is governed or what kinds of governments are, and are not, allowed.
I believe that paragraph said something like, "You could take it even further and say X."
I wasn't necessarily endorsing that view. I was saying,
"Here's a weaker thing that I believe. We have to worry a lot about authoritarians and
we should try to check them and limit their power. You could take this much further and have a more
interventionist view that says authoritarian countries with AI are these self-fulfilling
cycles that are very hard to displace, so you just need to get rid of them from the beginning."
That has exactly all the problems you say. If you were to make a commitment to
overthrowing every authoritarian country, they would take a bunch of actions now
that could lead to instability. That just may not be possible.
But the point I was making that I do endorse is that it is quite possible that...
Today, my view, and the view in most of the Western world, is that democracy is a better form of
government than authoritarianism. But if a country’s authoritarian,
we don’t react the way we’d react if they committed a genocide or something.
I guess what I'm saying is I'm a little worried that in the age of AGI, authoritarianism will
have a different meaning. It will be a graver thing.
We have to decide one way or another how to deal with that.
The interventionist view is one possible view. I was exploring such views. It may end up being the
right view, or it may end up being too extreme. But I do have hope. One piece of hope I have is
that we have seen that as new technologies are invented, forms of government become obsolete.
I mentioned this in "Adolescence of Technology", where I said feudalism
was basically a form of government, and when we invented industrialization, feudalism was no
longer sustainable. It no longer made sense. Why is that hopeful? Couldn't that imply that
democracy is no longer going to be a competitive system?
Right, it could go either way. But these problems with
authoritarianism get deeper. I wonder if that's an indicator of
other problems that authoritarianism will have. In other words, because authoritarianism becomes
worse, people are more afraid of it. They work harder to stop it.
You have to think in terms of total equilibrium. I just wonder if it will motivate new ways
of thinking about how to preserve and protect freedom with the new technology.
Even more optimistically, will it lead to a collective reckoning and a more emphatic
realization of how important some of the things we take as individual rights are?
A more emphatic realization that we really can't give these away.
We've seen there's no other way to live that actually works.
I am actually hopeful that—it sounds too idealistic, but I believe it could be the
case—dictatorships become morally obsolete. They become morally unworkable forms of
government and the crisis that that creates is sufficient to force us to find another way.
I think there is genuinely a tough question here which I'm not sure how you resolve.
We've had to come out one way or another on it through history.
With China in the '70s and '80s, we decided that even though it's an
authoritarian system, we will engage with it. I think in retrospect that was the right call,
because it's stayed an authoritarian system, but a billion-plus people are much wealthier and
better off than they would've otherwise been. It's not clear that it would've stopped being
an authoritarian country otherwise. You can just look at North Korea
as an example of that. I don't know if it takes
that much intelligence to remain an authoritarian country that continues to consolidate its own power.
You can imagine a North Korea with an AI that's much worse than everybody else's,
but still enough to keep power. In general, it seems like we should just
have this attitude that the benefits of AI—in the form of all these empowerments
of humanity and health—will be big. Historically, we have decided it's good
to spread the benefits of technology widely, even to people whose governments are authoritarian.
It is a tough question, how to think about it with AI, but historically we have said, "yes,
this is a positive-sum world, and it's still worth diffusing the technology."
There are a number of choices we have. Framing this as a government-to-government
decision in national security terms is one lens, but there are a lot of other lenses.
You could imagine a world where we produce all these cures to diseases.
The cures are fine to sell to authoritarian countries, but the data centers just aren't.
The chips and the data centers aren't, and the AI industry itself isn't.
Another possibility I think folks should think about is this.
Could there be developments we can make—either that naturally happen as a result of AI,
or that we could make happen by building technology on AI—that
create an equilibrium where it becomes infeasible for authoritarian countries
to deny their people private use of the benefits of the technology?
Are there equilibria where we can give everyone in an authoritarian country their own AI model that
defends them from surveillance and there isn't a way for the authoritarian country to crack
down on this while retaining power? I don't know. That sounds to me like if that went far enough,
it would be a reason why authoritarian countries would disintegrate from the inside.
But maybe there's a middle world where there's an equilibrium where, if they want to hold on
to power, the authoritarians can't deny individualized access to the technology.
But I actually do have a hope for the more radical version.
Is it possible that the technology might inherently have properties—or
that by building on it in certain ways we could create properties—that have this
dissolving effect on authoritarian structures? Now, we hoped originally—think back to the
beginning of the Obama administration—that social media and the internet would have
that property, and it turned out not to. But what if we could try again with the
knowledge of how many things could go wrong, and that this is a different technology?
I don't know if it would work, but it's worth a try.
It's just very unpredictable. There are first principles reasons why
authoritarianism might be privileged. It's all very unpredictable. We just
have to recognize the problem and come up with 10 things we can try, try those,
and then assess which ones are working, if any. Then try new ones if the old ones aren't working.
But I guess that nets out today, as you say, to not selling data centers,
chips, or the ability to make chips to China. So in some sense, you are denying… there would be
some benefits to the Chinese economy, the Chinese people, et cetera, that get denied because we're doing that.
Then there'd also be benefits to the American economy because it's a positive-sum world.
We could trade. They could have their country's data centers doing one thing.
We could have ours doing another. Already, you're saying it's not worth that
positive-sum stipend to empower those countries? What I would say is that we are about to be
in a world where growth and economic value will come very easily if we're
able to build these powerful AI models. What will not come easily is distribution
of benefits, distribution of wealth, political freedom.
These are the things that are going to be hard to achieve.
So when I think about policy, I think that the technology and the market will deliver all the
fundamental benefits, this is my fundamental belief, almost faster than we can take them.
These questions about distribution and political freedom and rights are the ones that will actually
matter and that policy should focus on. Speaking of distribution, as you were
mentioning, we have developing countries. In many cases, catch-up growth has been
weaker than we would have hoped for. But when catch-up growth does happen,
it's fundamentally because they have underutilized labor.
We can bring the capital and know-how from developed countries to these countries,
and then they can grow quite rapidly. Obviously, in a world where labor is no
longer the constraining factor, this mechanism no longer works.
So is the hope basically to rely on philanthropy from
the people or countries who immediately get wealthy from AI? What is the hope?
Philanthropy should obviously play some role, as it has in the past.
But I think growth is always better and stronger if we can make it endogenous.
What are the relevant industries in an AI-driven world?
I said we shouldn't build data centers in China, but there's no reason we shouldn't
build data centers in Africa. In fact, I think it'd be
great to build data centers in Africa. As long as they're not owned by China,
we should build data centers in Africa. I think that's a great thing to do.
There's no reason we can't build a pharmaceutical industry that's AI-driven.
If AI is accelerating drug discovery, then there will be a bunch of biotech startups.
Let's make sure some of those happen in the developing world.
Certainly, during the transition—we can talk about the point where humans have no
role—humans will still have some role in starting up these companies and supervising the AI models.
So let's make sure some of those humans are in the developing world
so that fast growth can happen there as well. You guys recently announced that Claude is going
to have a constitution that's aligned to a set of values, and not necessarily just to the end user.
There's a world I can imagine where if it is aligned to the end user,
it preserves the balance of power we have in the world today because everybody gets to have their
own AI that's advocating for them. The ratio of bad actors to
good actors stays constant. It seems to work out for our world today.
Why is it better not to do that, but to have a specific set of values that the
AI should carry forward? I'm not sure I'd quite
draw the distinction in that way. There may be two relevant distinctions here.
I think you're talking about a mix of the two. One is, should we give the model a set of
instructions about "do this" versus "don't do this"?
The other is, should we give the model a set of principles for how to act?
It's kind of purely a practical and empirical thing that we've observed.
By teaching the model principles, getting it to learn from principles,
its behavior is more consistent, it's easier to cover edge cases, and the model is more
likely to do what people want it to do. In other words, if you give it a list of
rules—"don't tell people how to hot-wire a car, don't speak in Korean"—it doesn't
really understand the rules, and it's hard to generalize from them.
It's just a list of dos and don'ts. Whereas if you give it principles—it
has some hard guardrails like "Don't make biological weapons" but—overall you're
trying to understand what it should be aiming to do, how it should be aiming to operate.
So just from a practical perspective, that turns out to be a more effective way to train the model.
That's the rules versus principles trade-off. Then there's another thing you're talking about,
which is the corrigibility versus intrinsic motivation trade-off.
How much should the model be a kind of "skin suit" where it just directly
follows the instructions given to it by whoever is giving those instructions,
versus how much should the model have an inherent set of values and go off and do things on its own?
There I would actually say everything about the model is closer to the direction that
it should mostly do what people want. It should mostly follow instructions.
We're not trying to build something that goes off and runs the world on its own.
We're actually pretty far on the corrigible side. Now, what we do say is there are certain
things that the model won't do. I think we say it in various ways in the
constitution, that under normal circumstances, if someone asks the model to do a task, it should do
that task. That should be the default. But if you've asked it to do something dangerous, or
to harm someone else, then the model is unwilling to do that.
So I actually think of it as a mostly corrigible model that has some limits,
but those limits are based on principles. Then the fundamental question is,
how are those principles determined? This is not a special question for Anthropic.
This would be a question for any AI company. But because you have been the ones to actually
write down the principles, I get to ask you this question.
Normally, a constitution is written down, set in stone, and there's a process of
updating it and changing it and so forth. In this case, it seems like a document
that people at Anthropic write, that can be changed at any time,
that guides the behavior of systems that are going to be the basis of a lot of economic activity.
How do you think about how those principles should be set?
I think there are maybe three sizes of loop here, three ways to iterate.
One is we iterate within Anthropic. We train the model, we're not happy with it,
and we change the constitution. I think that's good to do.
Putting out public updates to the constitution every once in a while
is good because people can comment on it. The second level of loop is different companies
having different constitutions. I think it’s useful. Anthropic puts out a constitution,
Gemini puts out a constitution, and other companies put out a constitution.
People can look at them and compare. Outside observers can critique and say,
"I like this thing from this constitution and this thing from that constitution."
That creates a soft incentive and feedback for all the companies to
take the best of each element and improve. Then I think there's a third loop, which is
society beyond the AI companies and beyond just those who comment without hard power.
There we've done some experiments. A couple years ago, we did an experiment with the Collective
Intelligence Project to basically poll people and ask them what should be in our AI constitution.
At the time, we incorporated some of those changes.
So you could imagine doing something like that with the new approach we've
taken to the constitution. It's a little harder because
it was an easier approach to take when the constitution was a list of dos and don'ts.
At the level of principles, it has to have a certain amount of coherence.
But you could still imagine getting views from a wide variety of people.
You could also imagine—and this is a crazy idea, but this whole
interview is about crazy ideas—systems of representative government having input.
I wouldn't do this today because the legislative process is so slow.
This is exactly why I think we should be careful about the legislative process and AI regulation.
But there's no reason you couldn't, in principle, say, "All AI models have to have a constitution
that starts with these things, and then you can append other things after it, but there has to
be this special section that takes precedence." I wouldn't do that. That's too rigid and sounds
overly prescriptive in a way that I think overly aggressive legislation is.
But that is a thing you could try to do. Is there some much less heavy-handed
version of that? Maybe. I really like control loop two.
Obviously, this is not how constitutions of actual governments do or should work.
There's not this vague sense in which the Supreme Court will feel out how people
are feeling—what are the vibes—and update the constitution accordingly.
With actual governments, there's a more formal, procedural process.
But you have a vision of competition between constitutions, which is actually very reminiscent
of how some libertarian charter-city people used to talk about what an archipelago of different
kinds of governments would look like. There would be selection among them of
who could operate the most effectively and where people would be the happiest.
In a sense, you're recreating that vision of a utopia of archipelagos.
I think that vision has things to recommend it and things that will go wrong with it.
It's an interesting, in some ways compelling, vision, but things will
go wrong that you hadn't imagined. So I like loop two as well,
but I feel like the whole thing has got to be some mix of loops one, two, and three,
and it's a matter of the proportions. I think that's gotta be the answer.
When somebody eventually writes the equivalent of The Making of the Atomic Bomb for this era,
what is the thing that will be hardest to glean from the historical record that
they're most likely to miss? I think a few things. One is,
at every moment of this exponential, the extent to which the world outside it didn't understand it.
This is a bias that's often present in history. Anything that actually happened looks
inevitable in retrospect. When people look back, it will
be hard for them to put themselves in the place of people who were actually making a bet on this
thing happening when it wasn't inevitable, that we had these arguments, like the arguments I make for
scaling or that continual learning will be solved. Some of us internally put a high probability
on this happening, but there's a world outside us that's not acting on that at all.
I think the weirdness of it, unfortunately the insularity of it...
If we're one year or two years away from it happening,
the average person on the street has no idea. That's one of the things I'm trying to change with
the memos, with talking to policymakers. I don’t know but I think
that's just a crazy thing. Finally, I would say—and this
probably applies to almost all historical moments of crisis—how absolutely fast it was happening,
how everything was happening all at once. Decisions that you might think were
carefully calculated, well actually you have to make that decision,
and then you have to make 30 other decisions on the same day because it's all happening so fast.
You don't even know which decisions are going to turn out to be consequential.
One of my worries, although it's also an insight into what's happening, is that some
very critical decision will be one where someone just comes into my office and
is like, "Dario, you have two minutes. Should we do thing A or thing B on this?"
Someone gives me this random half-page memo and asks, "Should we do A or B?" I'm like, "I
don't know. I have to eat lunch. Let's do B." That ends up being the most consequential thing ever.
So, final question. There aren't many tech CEOs who write 50-page memos every few months.
It seems like you have managed to build a role for yourself, and a company around
you, that is compatible with this more intellectual kind of CEO role.
I want to understand how you construct that. How does that work? Do you just go away for
a couple of weeks and then you tell your company, "This is the memo. Here's what
we're doing"? It's also reported that you write a bunch of these internally.
For this particular one, I wrote it over winter break.
I was having a hard time finding the time to actually write it.
But I think about this in a broader way. I think it relates to the culture of the company.
I probably spend a third, maybe 40%, of my time making sure the culture of Anthropic is good.
As Anthropic has gotten larger, it's gotten harder to get directly involved in the training
of the models, the launch of the models, the building of the products. It's 2,500
people. I have certain instincts, but it's very difficult to get involved in every single detail.
I try as much as possible, but one thing that's very leveraged is making sure Anthropic is a good
place to work, people like working there, everyone thinks of themselves as team members, and everyone
works together instead of against each other. As some of the other AI companies
have grown, without naming any names, we're starting to see decoherence and people fighting each other.
I would argue there was even a lot of that from the beginning, but it's gotten worse.
I think we've done an extraordinarily good job, even if not perfect, of holding the
company together, making everyone feel the mission, that we're sincere about the mission,
and that everyone has faith that everyone else there is working for the right reason.
That we're a team, that people aren't trying to get ahead at each other's expense or
backstab each other, which again, I think happens a lot at some of the other places.
How do you make that the case? It's a lot of things. It's me,
it's Daniela, who runs the company day to day, it's the co-founders,
it's the other people we hire, it's the environment we try to create.
But I think an important part of the culture is that the other leaders, and especially me,
have to articulate what the company is about, why it's doing what it's doing,
what its strategy is, what its values are, what its mission is, and what it stands for.
When you get to 2,500 people, you can't do that person by person.
You have to write, or you have to speak to the whole company.
This is why I get up in front of the whole company every two weeks and speak for an hour.
I wouldn't say I write essays internally. I do two things. One, I write this thing
called a DVQ, Dario Vision Quest. I wasn't the one who named it that.
That's the name it received, and it's one of these names that I tried to fight because it made it
sound like I was going off and smoking peyote or something. But the name just stuck. So I get up
in front of the company every two weeks. I have a three or four-page document,
and I just talk through three or four different topics about what's going on internally,
the models we're producing, the products, the outside industry, the world as a whole
as it relates to AI and to geopolitics in general. Just some mix of that. I go
through very honestly and I say, "This is what I'm thinking, and this is what Anthropic leadership
is thinking," and then I answer questions. That direct connection has a lot of value that
is hard to achieve when you're passing things down the chain six levels deep.
A large fraction of the company comes to attend, either in person or virtually.
It really means that you can communicate a lot. The other thing I do is I have a channel in
Slack where I just write a bunch of things and comment a lot.
Often that's in response to things I'm seeing at the company or questions people ask.
We do internal surveys, and when there are things people are concerned about, I'll write them up.
I'm just very honest about these things. I just say them very directly.
The point is to get a reputation for telling the company the truth about what's happening, to call
things what they are, to acknowledge problems, to avoid the sort of corpo speak, the kind of
defensive communication that often is necessary in public because the world is very large and full of
people who are interpreting things in bad faith. But if you have a company of people who you trust,
and we try to hire people that we trust, then you can really just be entirely unfiltered.
I think that's an enormous strength of the company.
It makes it a better place to work, it makes people more than the sum of their parts,
and increases the likelihood that we accomplish the mission because everyone is on the same page
about the mission, and everyone is debating and discussing how best to accomplish the mission.
Well, in lieu of an external Dario Vision Quest, we have this interview.
This interview is a little like that. This has been fun, Dario. Thanks for doing it.
Thank you, Dwarkesh.