Michael Nielsen – Why aliens will have a different tech stack than us

Transcript

0:00

Today, I'm speaking with Michael Nielsen. You have  done many things. You're one of the pioneers of  

0:04

quantum computing (you wrote the main textbook in the field) and of the open science movement. 

0:08

You wrote a book about deep learning  that Chris Olah and Greg Brockman  

0:12

credit with getting them into the field. More recently, you're a research fellow  

0:15

at the Astera Institute and writing a book  about religion, science, and technology. 

0:20

I'm going to ask you about none of those things. The conversation I want to have today is,  

0:25

how do we recognize scientific progress? It's especially relevant for AI because  

0:31

people are trying to close the RL  verification loop on scientific discovery. 

0:35

What does it mean to close that loop? But in preparing for this interview,  

0:39

I've realized that it's a more  mysterious and elusive force,  

0:43

even in the history of human  science, than I understood. 

0:46

I think a good place to start will be  Michelson-Morley and how special relativity  

0:51

was discovered, and how it differs from the story that you get off of YouTube videos. 

0:58

I will prompt you that way,  and then we'll go in there. 

1:02

Michelson-Morley is the famous result often  presented as this experiment that was done in  

1:09

the 1880s that helped Einstein come up with the  special theory of relativity a little bit later,  

1:15

changing the way we think about space and time  and our fundamental conception of those things. 

1:21

And there's a big gap, I think, between the  way Michelson and Morley and other people  

1:27

at the time thought about the experiment and  certainly the way in which Einstein thought  

1:32

or did not think about the experiment. In actual fact, he stated later in his  

1:39

life he wasn't even sure whether he  was aware of the paper at the time. 

1:42

There's a lot of evidence that he probably was  aware of the paper at the time, but it actually  

1:46

wasn't dispositive for his thinking at all. Something else completely was going on. 

1:55

What Michelson and Morley thought they  were doing was testing different theories  

2:01

of what was called the ether. If you go back to the 1600s,  

2:05

Robert Boyle introduced the idea of the ether. We know that sound is vibrations in the air. 

2:14

Boyle and other people got  interested in the question of  

2:16

whether light is vibrations in something,  and they couldn't figure out what it was. 

2:21

Boyle did an experiment where he  tested whether you could propagate  

2:24

light through a vacuum. He found that  you could. You couldn't do it with sound. 

2:28

He introduced this idea of the ether,  and for the next two hundred or so years,  

2:32

people had all these conversations about  what the ether was and what its nature was. 

2:38

The Michelson and Morley experiment was really an  experiment to test different theories of the ether  

2:44

against one another, in particular to find out  whether or not there was a so-called ether wind. 

2:50

The idea was that the Earth is maybe  passing through this ether wind. 

2:55

And if it is passing through the ether  wind and you shoot a light beam parallel  

3:03

to the direction the ether wind is going  in, it'll get accelerated a little bit. 

3:08

If it's being passed back in the opposite  direction, it'll get slowed down a little bit,  

3:12

and you should be able to see this in  the results of interference experiments. 
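
To get a feel for the size of the effect being looked for, here is a minimal sketch of the textbook ether-wind calculation. The arm length (about 11 m of folded light path), the wavelength (about 500 nm), and the wind speed (Earth's orbital speed, about 30 km/s) are illustrative assumptions, not figures given in this conversation, and the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def round_trip_times(arm_length_m, wind_speed_ms):
    """Round-trip light travel times under a classical stationary-ether model."""
    beta2 = (wind_speed_ms / C) ** 2
    # Arm parallel to the wind: L/(c - v) + L/(c + v)
    t_parallel = (2 * arm_length_m / C) / (1 - beta2)
    # Arm perpendicular to the wind: light follows a diagonal path in the ether frame
    t_perpendicular = (2 * arm_length_m / C) / math.sqrt(1 - beta2)
    return t_parallel, t_perpendicular


def expected_fringe_shift(arm_length_m, wind_speed_ms, wavelength_m):
    """Fringe shift expected when the apparatus is rotated by 90 degrees."""
    t_par, t_perp = round_trip_times(arm_length_m, wind_speed_ms)
    # Rotating by 90 degrees swaps the two arms, doubling the path difference.
    return 2 * C * (t_par - t_perp) / wavelength_m


# Assumed, illustrative values (not taken from the conversation).
shift = expected_fringe_shift(arm_length_m=11.0, wind_speed_ms=3.0e4, wavelength_m=5.0e-7)
print(f"expected shift: {shift:.2f} fringes")
```

Under these assumptions the classical prediction is a shift of roughly 0.4 of a fringe, which is why seeing essentially nothing was so striking.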

3:16

What they found, much to their surprise,  was that in fact there was no ether wind. 

3:22

That ruled out some theories of the  ether, but not all, and Michelson  

3:26

certainly continued to believe in the ether. This is what was a shocking part of reading  

3:32

this story from the biography of Einstein that  you recommended by... what was his first name? 

3:36

Abraham Pais. Abraham Pais.  

3:38

Subtle is the Lord. Also from Imre Lakatos, The  Methodology of Scientific Research Programmes. 

3:45

The way it's told is that Michelson-Morley  proved that the ether did not exist. 

3:51

Therefore, it created a crisis in physics  that Einstein solved with special relativity. 

3:56

What you're pointing out is he  actually was trying to distinguish  

3:58

between many different theories of ether. If you're in space or if you're on Earth,  

4:02

it's the same direction of ether, or maybe the  ether wind is being carried around by the Earth,  

4:06

and so you can't really experience it on Earth. But if you go to a high enough altitude,  

4:08

you might be able to experience it. In fact, Michelson's experiments,  

4:12

the famous one is 1887, but he conducted  these experiments for basically two decades. 

4:17

For longer than that. He conducted  the first one in 1881, I think,  

4:21

but he continued to believe until he died. He died, I think it was 1929 or so. It was the  

4:26

late twenties. He was still doing experiments in  the 1920s about whether or not the ether existed. 

4:34

So he continued to believe in  the ether to the end of his life. 

4:38

I think the last public statement he  made was a year or two before he died,  

4:42

and he basically still believed it at that point. In fact, there was another physicist, Miller,  

4:48

who kept doing these experiments in the 1920s. He thought that if he went to a high enough  

4:51

altitude, Mount Wilson in California…  "Oh, I'm high enough that the ether  

4:57

winds are not being dragged by the Earth. And I've measured the effect of the ether." 

5:03

Einstein hears about this and he says, and  this is where you get the famous quote,  

5:06

"Subtle is the Lord, but malicious He is not." Anyways, I think the reason the story is  

5:10

interesting is for many different reasons. One of the ways in which the real history of  

5:16

science is different from this idea you get of the  scientific method is that you really can't apply  

5:22

falsification as easily as you might think. It's not clear what is being falsified. 

5:29

Is it just another version of the theory  of the ether that's being falsified? 

5:33

Certainly you can't induce the theory  of special relativity from the fact  

5:36

that one version of the ether seems to  be disconfirmed by these experiments. 

5:42

It certainly doesn't show that ideas about  falsification are wrong or falsified,  

5:47

but it does show that the most naive ideas… Things  are often much more complicated than you think. 

5:54

Michelson did this experiment in 1881. He was a very young man, and then other people,  

5:58

I think Rayleigh was one of them, pointed out that  there were some problems with the way he did it,  

6:02

so they had to redo it in 1887. At that point, a lot of the leading  

6:08

physicists of the day basically accepted  this result, that there was no ether wind. 

6:16

But what to do about this? Sure, maybe you falsified  

6:20

some theories of the ether. There are others that you  

6:23

haven't falsified at all at this point,  and people set to work on developing those. 

6:29

It is funny, people will phrase it as  showing that the ether didn't exist. 

6:34

Even just the word "the" there is a misnomer. You actually had a ton of different theories  

6:40

and a couple of leading contenders. So yes, there's some version of  

6:45

falsification going on, but how you respond  to this new experiment is very complicated. 

6:54

Certainly the leading physicists of the day  responded by saying, "Okay, this gives us a  

6:59

lot of information about what the ether must be,  but it doesn't tell us that there is no ether." 

7:04

In fact, Lorentz at the end of  the 19th century, before Einstein,  

7:10

figures out the math of how you convert from  one reference frame to another reference frame,  

7:15

and comes up with the Lorentz transformations,  which is the basis of special relativity. 

7:20

But his interpretation is that you are  converting from the ether reference  

7:25

frame to these non-privileged other reference  frames if you're moving relative to the ether. 

7:30

His interpretation of length contraction and time  dilation is that this is the effect of moving  

7:36

through the ether, and you have this pressure.  This pressure is warping clocks. It's warping  

7:43

measures of length. The interesting thing here  is that experimentally you cannot distinguish  

7:50

Lorentz's interpretation from special relativity. I think that's a strong statement. 

8:00

Lorentz introduces this quantity called  local time, which he regards as... 

8:06

My understanding is he's not trying to  give a physical interpretation of this,  

8:11

but it's what Einstein would later just recognize  as time in another inertial reference frame. 

8:18

He's not trying to attribute  much physical meaning to it. 

8:20

I think Poincaré gets much closer  later on to realizing that this  

8:25

is the time that's registered by clocks. About forty-odd years later, people start  

8:35

doing these muon experiments where they see  cosmic rays hit the top of the atmosphere. 

8:40

They produce a shower of muons, and you can look  to see at different heights in the atmosphere how  

8:45

many of those muons remain. They decay over time,  

8:52

and a very strange thing happens, which  is that they're decaying way too slow. 

8:57

You expect they shouldn't be able to last  the whole way through the atmosphere at all. 

9:05

Their decay rate is too quick, if  you were in a classical theory. 

9:10

But if in fact their time really  has slowed down, it's okay. 

9:16

In fact, the measured decay rates in  1940—and there have since been more  

9:21

accurate experiments done—match exactly  what you expect from special relativity. 
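
The muon argument can be sketched numerically. This is a rough illustration, not a reconstruction of the 1940 measurement: the production altitude (15 km) and the muon speed (0.998c) are assumed values chosen for illustration, and the function names are hypothetical. The muon's rest-frame mean lifetime of about 2.2 microseconds is the standard figure.

```python
import math

C = 299_792_458.0                 # speed of light in m/s
MUON_MEAN_LIFETIME_S = 2.197e-6   # mean lifetime of a muon at rest, in seconds


def lorentz_factor(beta):
    """Time-dilation factor gamma for a speed of beta * c."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)


def surviving_fraction(distance_m, beta, time_dilation=True):
    """Fraction of muons that survive a trip of distance_m at speed beta * c."""
    travel_time_s = distance_m / (beta * C)  # as measured on the ground
    # Without time dilation the muon's clock ticks at the ground rate (gamma = 1).
    gamma = lorentz_factor(beta) if time_dilation else 1.0
    return math.exp(-travel_time_s / (gamma * MUON_MEAN_LIFETIME_S))


# Assumed, illustrative values (not taken from the conversation).
ALTITUDE_M = 15_000.0
BETA = 0.998

classical = surviving_fraction(ALTITUDE_M, BETA, time_dilation=False)
relativistic = surviving_fraction(ALTITUDE_M, BETA, time_dilation=True)
print(f"gamma        = {lorentz_factor(BETA):.1f}")
print(f"classical    = {classical:.2e}")
print(f"relativistic = {relativistic:.2f}")
```

With these assumed numbers, the classical picture leaves essentially no muons at ground level, while the time-dilated picture leaves a substantial fraction, which is the qualitative gap the experiments probe.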

9:28

That's the kind of thing where if Lorentz had  been alive—he'd been dead ten or so years at that  

9:34

point—it seems quite likely that he would have  tried to save his theory by patching it up yet  

9:41

again, but it would have been a massive setback. It starts to just look like time—this  

9:50

thing that Lorentz introduced as  a mathematical convenience—that's  

9:53

actually what time is, for the muons at least. Then there's a whole bunch of other experiments  

9:58

that show this very similar phenomenon. When was that experiment done? 

10:01

That was, I think, 1940. It might  have been published in 1941. 

10:05

Maybe to rephrase and change my claim: it's  not that you could not have distinguished them,  

10:12

but the scientific community adopted what  we in retrospect consider the more correct  

10:17

interpretation before it was actually  experimentally shown to be preferred. 

10:25

So there's clearly some process that human science  does which can distinguish different theories. 

10:29

Can I just interrupt? You used the word process,  and it's interesting to think about that term. 

10:36

Process carries connotations  of something set in advance. 

10:43

It's much more complicated in practice. You have people like Lorentz, who Einstein  

10:49

absolutely and utterly admired, and Poincaré,  one of the greatest scientists who ever lived,  

10:57

and Michelson, another truly outstanding  scientist, who never reconciled themselves. 

11:02

It's not as though there's some standard procedure  that we're all using to reconcile these things. 

11:08

Great scientists can remain wrong for a very  long time after the scientific community has  

11:14

broadly changed its opinion. But there's no centralized  

11:18

authority or centralized method. That is the interesting thing. There's  

11:24

progress even though it is hard to articulate the  process by which it happens, the heuristics that  

11:30

are used. You mentioned Poincaré. Lorentz has  the math right, but the interpretation wrong. 

11:38

It seems like Poincaré had the opposite, where he  understood that it's hard to define simultaneity  

11:44

because it requires a circular definition with  time, or velocity of something that might arrive  

11:51

at a midpoint together, but velocity is defined in  terms of time. I find this interesting. There are  

11:56

a couple of other examples we could call on. There is this phenomenon in the history of  

12:01

science where somebody asks the right  question, but then they don't clinch it. 

12:07

I'm curious what you think  is happening in those cases. 

12:11

You actually do want to go case  by case and try to understand. 

12:14

It's not necessarily clear that they're  doing the same thing wrong in all of the  

12:18

cases. The Poincaré case is amazing. He seems  to have understood the principle of relativity,  

12:24

the idea that the laws of physics are the  same in all inertial reference frames. 

12:28

He seems to have understood  that the speed of light is  

12:31

the same in all inertial reference frames. He doesn't phrase it quite that way, but it is  

12:36

my understanding, though I don't speak French. These are basically the ideas that Einstein  

12:44

uses to deduce special relativity. But then he also has this additional  

12:49

misunderstanding where he thinks that length  contraction is a dynamical effect, that somehow  

12:58

particles are being pushed together by some  external force, something is going on dynamically. 

13:04

He doesn't understand that it's purely kinematics. That actually space and time are different from  

13:11

what we thought, and you need to  fundamentally rethink those things. 

13:15

It's almost like he knew too much. He had almost too grand a vision in mind. 

13:22

Einstein subtracts from that and says, "No. Space and time are just different than what we  

13:29

thought, and here's the correct picture." There's a paper in, I think it's 1909,  

13:36

where Poincaré still has this dynamical picture  of what's going on with the length contraction.  

13:44

This is just not necessary. This is a mistake  from the modern point of view. Why is he doing  

13:51

this? Why is he clinging onto this idea? I  don't know. I've obviously never met the man. 

13:59

It would be fascinating to be able to  talk it over and try and understand. 

14:05

His expertise seems to be getting in the way. He knows so much, he understands so much,  

14:13

and then he's not able to let go of these things. A really interesting fact is that a few years  

14:20

prior, in the 1890s, Einstein's a teenager  and he believes in the ether too. He knows  

14:26

about this stuff. But he's not quite  as attached as these older people were. 

14:34

Maybe they were a little bit prisoners of  their own expertise. That's my guess. Some  

14:38

historians of science would certainly disagree. Then there's the obvious stories where Einstein  

14:45

himself later on is said to have not latched onto  the correct interpretations of quantum mechanics  

14:53

or cosmology because of his own attachments. Yeah. 

14:57

Here’s the bigger question I have. The muon example is a great example of  

15:04

these long verification loops and how progress  seems to happen in the scientific community  

15:09

faster than these verification loops imply. Maybe the clearest example is Aristarchus, who  

15:15

in the third century BC comes up with the idea of heliocentrism. 

15:20

The ancient Athenians dismiss it on the grounds that, as the Earth  

15:24

moves around the Sun, if the Sun really is the center of the solar system,  

15:27

we should see the stars shift relative to the Earth. The only reason that would not be the  

15:31

case is the stars are so far away  that you would not observe this. 

15:35

And it's only in 1838 that stellar  parallax was actually measured. 
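
A small sketch shows why the objection was so hard to answer by observation. It computes the annual parallax angle for a star at a given distance and compares it with a naked-eye angular resolution of about one arcminute. That resolution figure and the sample distances (about 1.3 parsecs for the nearest star system, about 3.5 parsecs for 61 Cygni, the star measured in 1838) are approximate values assumed for illustration.

```python
import math

AU_M = 1.495978707e11              # astronomical unit in metres
PARSEC_M = 3.0856775814913673e16   # parsec in metres
ARCSEC_PER_RAD = 180 * 3600 / math.pi

# Assumption: the unaided eye resolves roughly one arcminute.
NAKED_EYE_RESOLUTION_ARCSEC = 60.0


def parallax_arcsec(distance_pc):
    """Annual parallax angle, in arcseconds, of a star distance_pc parsecs away."""
    return math.atan(AU_M / (distance_pc * PARSEC_M)) * ARCSEC_PER_RAD


# Approximate, illustrative distances in parsecs.
for distance_pc in (1.3, 3.5, 10.0):
    p = parallax_arcsec(distance_pc)
    ratio = NAKED_EYE_RESOLUTION_ARCSEC / p
    print(f"{distance_pc:5.1f} pc -> {p:.3f} arcsec ({ratio:.0f}x below naked-eye resolution)")
```

Even for the nearest stars the shift is under one arcsecond, far below what anyone could see without precision instruments.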

15:40

And so, we didn't need to wait  until 1838 to have heliocentrism. 

15:44

We didn't need to wait for  the experimental validation to  

15:47

understand that Copernicus is better in some way. In fact, when Copernicus first came up  

15:53

with his theories, it's well known that the  Ptolemaic model was more accurate because it  

15:59

had centuries of adding on these epicycles. What's maybe less well appreciated is that it  

16:05

was also in some sense simpler. Because Copernicus actually  

16:10

had to add extra epicycles. It had more epicycles than the  

16:12

Ptolemaic model because he had this bias that the  Earth should go in a perfect circle in equal time. 

16:20

Anyway, I think this is an interesting story  because it's not a more accurate theory. It's  

16:26

not a simpler theory. So how  could you have known ex ante  

16:30

that Copernicus was correct and Ptolemy was not? Good question. I don't entirely know the answer. 

16:41

I can give you a partial answer that I, centuries  in the future, start to find very compelling. 

16:52

I'm sure it's part of the historic story at least. One of the big shocks for Newton,  

17:01

he did understand Kepler's laws of motion  eventually, so you're able to explain the  

17:07

motions of the planets in the sky. But he also, out of the same theory,  

17:12

his theory of gravitation, was  able to explain terrestrial motion. 

17:16

He's able to explain why objects move in  parabolas on the Earth, and he's able to explain  

17:20

the tides in terms of the moon and the sun's  gravitational effect on water on the Earth. 

17:31

You have what seem like three very different  disconnected phenomena all being explained by  

17:37

this one set of ideas. That starts to feel  

17:42

very compelling, at least to me. I think most people find that very  

17:48

satisfying once they eventually realize it. Have you read the Keynes biography of Newton? 

17:54

He wrote an entire biography? No, the essay. 

17:57

Sure. I love that. This description of him  as the last of the magicians is wonderful. 

18:05

In fact, I think it's maybe worth superimposing. Or you should read out that one  

18:09

passage of the thing. Alright. It's from a talk  

18:17

that he gave at Cambridge not long before he died. He'd acquired Newton's papers somehow and gave a  

18:26

lecture twice about this, or his brother Geoffrey gave it the other time because he was too ill. 

18:33

There's this wonderful,  wonderful quote in the middle. 

18:36

The whole thing is really interesting,  but I love this particular quote:  

18:41

"Newton was not the first of the age of reason. He was the last of the magicians, the last  

18:46

great mind which looked out on the visible and  intellectual world with the same eyes as those  

18:50

who began to build our intellectual inheritance  rather less than ten thousand years ago." 

18:56

This idea people have that Newton was the  first modern scientist is somehow wrong. 

19:07

There's some truth to it, but he really  had this very different way of looking  

19:12

at the world that was part superstitious  and part modern. It was a funny hybrid.  

19:19

He's a transitional figure in some sense. That phrase, "the last of the magicians,"  

19:27

really points at something. The thing I'm very curious  

19:30

about with Newton is whether it was  the same program, the same heuristics,  

19:34

the same biases that he applied to his alchemical  work as he did to his understanding of astronomy. 

19:43

This is from the Keynes essay: "There  was extreme method in his madness. 

19:47

All his unpublished works on esoteric  and theological matters are marked by  

19:50

careful learning, accurate method,  and extreme sobriety of statement. 

19:54

They are just as sane as the Principia if their  whole matter and purpose were not magical. 

19:59

They were nearly all composed during the  same 25 years of his mathematical studies." 

20:06

Clearly, there was some aesthetic that motivated  people like Einstein to reject earlier ways of  

20:12

thinking and say, "No, the ether is wrong, and there's a better way to think about things." 

20:16

The same is true with Newton. The question I have is whether  

20:24

similar heuristics toward parsimony,  aesthetics, and so on, would be equally  

20:32

useful across time and across disciplines,  or whether you need different heuristics. 

20:37

The reason that's relevant is even if we  can't build a verification loop for science,  

20:41

maybe if the taste tests point in the same  direction, you can at least encode that bias  

20:46

into the AIs. That would maybe be enough. The point is that where we always get  

20:54

bottlenecked is where the previous  processes and heuristics don't apply. 

21:01

That's almost definitionally  what causes the bottlenecks. 

21:05

Because people are smart, they know what  has worked before. They study it. They apply  

21:09

the same kinds of things, so they don't  get stuck in the same places as before. 

21:14

They keep getting bottlenecked  in different places. 

21:18

I'm overgeneralizing a bit,  but I think it's right. 

21:22

If you're attempting to reduce science to  a process, you're attempting to reduce it  

21:27

to something where there is just a  method which you can apply, and you  

21:31

turn the crank and out pops insight. You can do a certain amount of that,  

21:37

but you're going to get bottlenecked at the  places where your existing method doesn't apply. 

21:43

Definitionally, there's no crank you can turn. You need a lot of people trying different ideas. 

21:53

The more difficult the idea is to  have, the greater the bottleneck,  

21:57

but then also the greater the triumph. Quantum mechanics is a great example of this. 

22:02

It's such a shocking set of ideas. It's such  a shocking theory. The theory of evolution in  

22:07

some sense is also quite a shocking idea, not the  principle of natural selection, but that it can  

22:15

explain so much. That's a shocking idea. 

23:26

So Principia Mathematica is released in 1687. The Origin of Species is released in 1859. 

23:33

At least naively, it seems like Darwin's  theory of natural selection is conceptually  

23:38

easier than the theory of gravity. I asked Terence Tao this question. 

23:46

There was this biologist contemporaneous with Darwin, Thomas Huxley,  

23:49

who read this and said, "How extremely  stupid to not have thought of this." 

23:54

Nobody ever reads the Principia Mathematica  and thinks, "God, why didn't I beat Newton to  

23:59

the punch here?" So what's going on here?  Why did Darwinism take so much longer? 

24:07

The idea must have been known to animal  breeders for a long time at some level,  

24:15

or certainly large chunks of the idea were  known, that artificial selection was a thing. 

24:23

In some sense, Darwin's genius  wasn't in having that idea, it was  

24:29

understanding just how central it was to biology. You can go back and explain a tremendous amount  

24:39

about all the variety of what we see in the world  with this as not necessarily the only principle,  

24:46

but certainly a core principle. He writes this wonderful book,  

24:52

The Origin of Species. It's just so much evidence and  

24:57

so many examples, trying to tease this  out and see what the implications are,  

25:04

and connecting it to as much else as he possibly  can, to geology and all these other things. 

25:15

That hard work—making the case that  it's actually relevant all across the  

25:21

biosphere—is what he's doing there. He's not just having the idea,  

25:25

he's making a compelling case that it's  intertwined with absolutely everything else. 

25:30

The motivation for the question was Lucretius, this first-century BC Roman poet who has an idea  

25:37

that seems analogous to natural selection. It's about species getting fitted more over  

25:42

time to their environments, or species  losing fit to their environment. 

25:46

And so, why did this go  nowhere for nineteen centuries? 

25:51

Then I looked into it or, more accurately, asked  LLMs what exactly Lucretius's idea here was. 

25:56

It is extremely different from  what real natural selection is. 

25:59

He thought there was this generative period  in the past where all the species came about,  

26:03

and then there was this one-time filter which  resulted in the species that are around today,  

26:07

and they became fit to the environment. He did not have this idea that it is an  

26:10

ongoing gradual process or that there  is a tree of life that connects all  

26:17

life forms on Earth together, which, by the way,  

26:18

is an incredibly weird fact that every single  life form on Earth has a common ancestor. 

26:23

It's not incredibly weird. If you think that  the origin of life must have been very hard,  

26:29

that there's a bottleneck there,  then it's not so surprising. 

26:32

There's also this verification loop aspect where  even if Newton might be harder in some sense, if  

26:38

you've clinched it, you can experimentally… I know  "validate" is the wrong word philosophically, but  

26:44

you can give a lot of Bayes points to the theory. You can be like, "Okay, I have this idea  

26:48

of why things fall on Earth. I have this idea of why orbital  

26:50

periods for planets have a certain pattern. Let's try it on the Moon, which orbits the Earth." 

26:54

And in fact, it’s weird but the orbital  period matches what my calculations imply. 
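
The Moon check described here can be sketched in a few lines. The idea is to scale surface gravity by the inverse square of distance and ask what period a circular orbit at the Moon's distance would then have. The constants are standard approximate values, the circular-orbit simplification ignores the Moon's own mass, and the function name is hypothetical.

```python
import math

G_SURFACE = 9.81               # gravitational acceleration at Earth's surface, m/s^2
EARTH_RADIUS_M = 6.371e6       # mean radius of the Earth, metres
MOON_ORBIT_RADIUS_M = 3.844e8  # mean Earth-Moon distance, metres
SECONDS_PER_DAY = 86_400


def orbital_period_days(orbit_radius_m):
    """Period of a circular orbit, assuming gravity falls off as 1/r^2."""
    # Scale surface gravity out to the orbit by the inverse-square law.
    accel = G_SURFACE * (EARTH_RADIUS_M / orbit_radius_m) ** 2
    # Circular motion: accel = (2*pi/T)^2 * r, so T = 2*pi*sqrt(r/accel).
    period_s = 2 * math.pi * math.sqrt(orbit_radius_m / accel)
    return period_s / SECONDS_PER_DAY


predicted = orbital_period_days(MOON_ORBIT_RADIUS_M)
print(f"predicted lunar period: {predicted:.1f} days")
```

The prediction lands within about a day of the observed sidereal month of roughly 27.3 days, using nothing but how fast things fall at the surface and how far away the Moon is.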

26:58

And the tides work correctly. It's just amazing. Exactly. Whereas for Darwinism, it takes a ton  

27:05

of work for Darwin to compile all the  cumulative evidence, but there's no  

27:08

individual piece that is overwhelmingly powerful. And there's a whole bunch of problems as well. 

27:12

He doesn't really understand  what the mechanism is. 

27:16

He doesn't understand genes, all these things. The very interesting thing in the history of  

27:20

Darwinism is, this idea which theoretically you  could come up with at any time, there is almost  

27:29

identical independent creation of that idea  between Alfred Wallace and Charles Darwin. 

27:34

So much so that I think Wallace sends  his manuscript to Darwin and is like,  

27:37

"What do you think of this  idea?" And Darwin's like, "Fuck." 

27:40

I don't think that's an exact  quote, but it's pretty much correct. 

27:44

They end up presenting their ideas  together in the spirit of sportsmanship. 

27:49

Why was this period in the 1850s or 1860s  the right time for these ideas to form? 

27:53

You can come up with different ideas. One is  geology. In the 1830s, Charles Lyell figures  

28:00

out that there's been millions and billions  of years of time that's existed on Earth. 

28:04

The paleontology shows you that fossils  have existed for that entire time. 

28:10

Life goes back a long way. In fact, you can even find  

28:12

fossils for intermediate species  that show you the tree of life. 

28:16

Between humans and other apes as  well, there's intermediate humans. 

28:20

There's also the age of colonization, and we  have all these voyages doing biogeography. 

28:27

That all must have been necessary. In fact, there's a huge history of  

28:31

parallel innovation and discovery  in the history of science. 

28:33

So maybe it is another piece of  evidence that more had to be in  

28:37

place for a given idea to be discovered. Because if it's not discovered for a long  

28:41

time and then spontaneously many different people  are coming up with it, that shows you that the  

28:46

building blocks were in some sense necessary. This example of Lyell and other geologists  

28:56

in the early 1800s having this idea of  deep time does seem to have been crucial. 

29:02

I know Darwin was very influenced by Lyell. If you don't have at least tens or hundreds  

29:13

of millions of years, evolution  starts to look like a non-starter. 

29:20

In order to make it work on a timescale of 5,000 to 10,000 years, or 6,000 years with Bishop Ussher,  

29:28

you would need to see evolution occurring  at a massive rate during human lifetimes,  

29:34

and we're just not seeing that. That does seem to have been a blocker. 

29:39

To your question of what other blockers were  there, were there any others? I don't know. 

29:47

Or how much earlier could you, in principle,  have come up with it if you were much smarter? 

29:52

Let's go back and zoom out to your original  question about the verification loop in AI. 

30:00

An example that should give you pause  there is the big signature success so  

30:06

far, which is certainly AlphaFold. AlphaFold  really isn't about AI. A massive fraction of  

30:13

the success there is the Protein Data Bank. It's X-ray diffraction, NMR, cryo-EM,  

30:20

and the several billion dollars that were spent  obtaining those 180,000-odd protein structures. 

30:28

It's basically the story of how we spent  many decades obtaining protein structure  

30:34

just by going out and looking very hard  at the world experimentally, and then we  

30:38

fitted a nice model at the end of it, which  was a tiny fraction of the entire investment. 

30:46

That's a story of data acquisition principally. The AI bit is very impressive and quite  

30:52

remarkable, but it is only a  small part of the total story. 

30:56

AlphaFold is very interesting, and  philosophically I wonder what you think  

31:00

of it as a scientific theory or explanation. I guess over time the world is becoming harder  

31:07

to understand… As I'm saying things, because  you're such a careful speaker, I say a phrase  

31:16

and wonder if you'll actually buy that premise. But in some domains, we need to fit models to  

31:25

things rather than coming up  with underlying principles  

31:27

that explain a broad range of phenomena. Compare the theory of general relativity,  

31:35

or any theory which just nets out to some  equations, versus AlphaFold, which is encoding  

31:40

these different relationships between things we  can't even interpret over 100 million parameters. 

31:46

Are those really the same thing? GR can predict things you could  

31:53

have never anticipated, things it was never meant to do, like why Mercury's orbit precesses. 

31:58

AlphaFold is not going to have  that kind of explanatory reach. 

32:03

I want to get your reaction to that. I think it's an incredibly interesting  

32:07

question. Maybe a really pivotal question. If you  take a very classic point of view, you want these  

32:17

deep explanatory principles. You want as few free  

32:21

parameters as you possibly can. You want very simple models which explain a lot,  

32:27

and AlphaFold doesn't look anything like that. You might just say, "It's nice and maybe  

32:32

helpful as a model, but it's  not a scientific explanation." 

32:37

That's a conservative point of  view, answer one to the question. 

32:42

Answer two is to say maybe you shouldn't  think about AlphaFold as an explanation in  

32:51

the classic sense, but maybe it contains  lots of little explanations inside it. 

32:56

Part of what you can get out of  interpretability work is you can go into  

33:00

AlphaFold and start to extract certain things. Maybe by doing an archeology of AlphaFold,  

33:08

we can actually understand a great  deal more about these principles. 

33:12

You can start to extract that a certain circuit  does this interesting thing, and we learn from it. 

33:16

I don't know to what extent that's been done with  AlphaFold, but it's been done a little bit with  

33:22

some of the chess models, like AlphaZero. There seem to be some strategies which  

33:28

were borrowed by Magnus Carlsen, which he  seems to have just taken from AlphaZero. 

33:35

I don't think there's any public confirmation  of this, but some experts have noticed that he  

33:41

changed his game quite radically after some public  forensics were released on how AlphaZero worked. 

33:49

That's an example where human beings are  starting to extract meaning out of these models. 

33:55

That leads to viewing the models as  a potential source of explanations. 

34:01

You need to do more work because  they're not very legible up front,  

34:04

but you can potentially extract them. That's an interesting intermediate  

34:10

situation where they're not explanations  themselves, but you can extract interesting  

34:13

explanations out of them and use them as a source. The third and most interesting possibility is  

34:20

that they're a new type of object. They should be taken very seriously  

34:25

as explanations, but where in the past we haven't  had the ability to really do anything with them,  

34:30

now we have interesting new actions we can do. We can merge them, we can distill them. 

34:41

It's a big opportunity in  the philosophy of science. 

34:48

There's an anticipation of this in some way. Some mathematicians and physicists work today…  

34:58

Historically, if you had a 100-page equation—which  is the kind of thing that does come up—there's  

35:05

just nothing you can do if it's 1920. At that point, you give up on the problem. 

35:11

But today, with tools like  Mathematica, you can just keep going. 

35:18

That's an object now, a  thing that you can work with. 

35:21

There are examples where people work with  these things that formerly were regarded  

35:25

as too complicated, and sometimes  they get simple answers out the end. 

35:28

That's just an intermediate working state. So I wonder if something similar is going to  

35:33

happen in this case, where you could take these  models and use them in a similar way that people  

35:43

do with Mathematica, and take them seriously. They're not explanations in the classic sense,  

35:48

but they'll be something else which  interesting operations can be done on. 

35:54

The thing I worry about is, suppose it's  1500 and you're training a model on… This  

36:03

is a weird history where we developed  deep learning before we had cosmology. 

36:08

Suppose we live in that world. You're observing how the stars don't seem to move. 

36:13

The planets have all these weird behaviors. Then you train a model on that, and you do  

36:18

some kind of interp on it trying to  figure out what the patterns are. 

36:22

You'd just be able to keep  building on Ptolemy's model. 

36:26

You'd see there's another  epicycle we didn't notice. 

36:31

Parameters X to Y encode this epicycle,  parameters whatever encode the next epicycle. 

36:37

If you were just trying to figure out  why the solar system is the way it is  

36:41

from observational data, you could just  keep adding epicycles upon epicycles,  

36:45

but it really took one mind to integrate it all in  and say, "Here's what makes more sense overall." 

36:56

This is to my point that we don't really  understand what to do with the models. 

37:03

We don't have the verbs yet. It is certainly interesting  

37:08

to think about the question where you  start to apply constraints to the models,  

37:14

essentially saying, "What's the simplest  possible explanation?" Or, "Can you  

37:19

simplify? Can you give me the 90/10 explanation?" And go further and further in boiling it down. 

37:28

It might be that indeed they  start out by providing a very,  

37:31

very complicated, many-parameter model. But you can just force the case, and basically  

37:38

that's scaffolding, which maybe is the very early  days of their attempt to understand something. 

37:48

They're forced through that to a  much more simple understanding. 

37:52

Sorry for misunderstanding, but it sounds  like you're saying maybe there's some  

37:54

regularizer or some distillation you could do of  a very complicated model that gets you to a truer,  

38:02

more parsimonious theory. Take Ptolemy versus  Copernicus. You start off with lots of Ptolemy  

38:09

epicycles, and then you try to distill this  model, and maybe it gets rid of some of the  

38:15

epicycles that are less and less necessary to get  the mean squared error of the orbits to match. 
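
As a rough illustration of the epicycle point, here is a minimal sketch, not anything from the conversation itself. It simulates Mars as seen from Earth using Kepler ellipses, then greedily fits rotating circles (epicycles) to the geocentric path by least squares. The function names, the frequency grid, and the simplification that both orbits are coplanar with aligned perihelia are assumptions made for the example. The error keeps shrinking as circles are added, yet nothing in the loop ever moves the Sun to the center.

```python
import numpy as np

def kepler_position(t, a, e, period):
    """Heliocentric position as a complex number, Sun at one focus (units: AU, years)."""
    M = 2 * np.pi * t / period                 # mean anomaly
    E = M.copy()
    for _ in range(50):                        # fixed-point iteration for E = M + e*sin(E)
        E = M + e * np.sin(E)
    return a * (np.cos(E) - e) + 1j * a * np.sqrt(1 - e**2) * np.sin(E)

def fit_epicycles(t, z, n_epicycles, freq_grid):
    """Greedy fit of z(t) as a sum of uniformly rotating circles (hypothetical helper)."""
    basis = np.exp(2j * np.pi * np.outer(t, freq_grid))   # one column per candidate circle
    chosen, mse_history = [], []
    residual = z.copy()
    for _ in range(n_epicycles):
        # Pick the candidate circle that best matches what is still unexplained.
        scores = np.abs(basis.conj().T @ residual)
        chosen.append(int(np.argmax(scores)))
        # Refit the radii and phases of all circles chosen so far.
        A = basis[:, chosen]
        coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
        residual = z - A @ coeffs
        mse_history.append(float(np.mean(np.abs(residual) ** 2)))
    return freq_grid[chosen], coeffs, mse_history

# Assumption: coplanar orbits with aligned perihelia, approximate orbital elements.
t = np.linspace(0.0, 30.0, 1500)                           # 30 years of observations
earth = kepler_position(t, a=1.000, e=0.0167, period=1.000)
mars = kepler_position(t, a=1.524, e=0.0934, period=1.881)
geocentric_mars = mars - earth                             # what an Earth-bound observer tracks

freqs, radii, mse_history = fit_epicycles(
    t, geocentric_mars, n_epicycles=8, freq_grid=np.linspace(-3.0, 3.0, 601)
)
for k, mse in enumerate(mse_history, start=1):
    print(f"{k} epicycles: mean squared error = {mse:.6f} AU^2")
```

Each added circle can only lower the squared error, because the earlier circles remain in the fit. That is the local improvement being described: the procedure rewards one more epicycle and never proposes the change of center.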

38:22

But at some point it has to do this  thing which is to switch two things. 

38:26

Locally, it actually doesn't  make things more accurate. 

38:29

It's in a global sense that  it's a more progressive theory. 

38:34

There's some process which obviously  humanity did over its span, which did  

38:38

that regularization or did that swap. But with raw gradient descent,  

38:43

I don't really feel like it would do that. Think about the example of going from  

38:49

Newtonian gravity to Einstein's  general theory of relativity. 

38:53

These are shockingly different theories,  and the question is what causes that flip. 

39:00

As nearly as I understand the history, what goes  on is Einstein develops special relativity and  

39:06

pretty much straight away he understands the problem. It's a very obvious observation. In special relativity,  

39:13

influences can't propagate faster than  the speed of light, and in Newtonian  

39:17

gravity, action at a distance is instantaneous. Straight away in special relativity,  

39:24

you could use Newtonian gravity  to do faster-than-light signaling. 

39:28

You could send information backwards in time. You could do all kinds of crazy stuff. 

39:32

It's not a big leap to realize we have  a big problem here. That's the forcing  

39:39

function there. You've realized that  your old explanation is not sufficient.  

39:43

You need something new. Then you're going to  start by doing the simplest possible stuff. 

39:52

It just turns out that a lot of that stuff  doesn't work very well, so you're forced to go  

40:00

through these steps where gradually it gets more  complicated, and it's wrong in a variety of ways. 

40:08

The final theory appears shockingly  simple and beautiful, but it's gone  

40:15

through some somewhat ugly intermediate stages. If you're thinking about what it looks like to  

40:22

have AI accelerate science, there's one for  well-understood domains where we just want  

40:28

local solutions, like how does this protein fold. We just train a raw model using gradient descent. 

40:33

Then there's things like coming up with general  relativity, where you couldn't really just train  

40:37

on every single observation in the universe  and hope that general relativity pops out.  

40:44

What would it require? It also certainly wasn't  immediately discovered. It was decades of thought.  

40:52

You'd need independent research programs where  people start off with these biases, where Einstein  

40:57

is initially motivated by this thought experiment  of whether you can distinguish the effect of  

41:03

gravity from just being accelerated upwards. You just need different AI thinkers to start  

41:10

off with these initial biases and  see what can germinate out of them. 

41:14

The verification loop for that might be quite  long, but you just need to keep all those  

41:17

research programs alive at the same time. This point you make about keeping all  

41:24

the different research programs alive, I  think that is very important and central. 

41:30

A great example is situations where the  same answer has been correct in some  

41:37

circumstances and wrong in other circumstances. The planet Uranus was not in quite the right spot,  

41:45

and people famously predicted the  existence of Neptune on this basis. 

41:52

Wonderful, massive success for Newtonian gravity. The planet Mercury is not in quite the right spot. 

41:58

You predict the existence of  some other distorting planet. 

42:02

It turns out that doesn't exist. Actually, the reason Mercury is not in the right  

42:06

spot is because you need general relativity. You've pursued very similar ideas,  

42:13

and it's been very successful in one  case, and it's been completely and  

42:16

utterly unsuccessful in the other case. A priori, you can't tell which of these is the  

42:20

thing to do, and you actually need to do both. This is certainly very true in the  

42:27

history of science. This kind of diversity,  

42:32

where you just have lots of people go off and  pursue lots of potentially promising ideas,  

42:36

you just need to support that for a long time. It's hard to do that for a variety of reasons,  

42:42

but it does seem to be very, very important. This example of Uranus versus Mercury  

42:52

is very interesting. I think it illustrates  

42:57

the difficulty with falsificationism. The orbit of Uranus is in some sense  

43:02

falsifying Newtonian mechanics. But then you make some ancillary  

43:08

prediction that says, "Oh, the reason this  is happening is there must be another planet  

43:12

which is perturbing Uranus's orbit." I think it's Le Verrier in 1846.  

43:18

"Point a telescope in the right  direction, you find Uranus." 

43:20

Neptune. Sorry. Neptune,  

43:23

yes. But with Mercury, it's observed that the  ellipse which forms its orbit is rotating 43  

43:29

arcseconds more every century than Newtonian  mechanics would imply, so people say that  

43:34

there must be a planet inside Mercury's orbit. They call it Vulcan and point the telescopes.  
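
For reference, the 43-arcsecond figure can be reproduced from the standard general-relativistic formula for perihelion advance per orbit, 6πGM / (c²a(1 − e²)). The sketch below is an editorial illustration rather than part of the conversation; the constants are approximate textbook values and the function name is made up for the example.

```python
import math

GM_SUN = 1.32712440018e20    # m^3/s^2, gravitational parameter of the Sun
C = 299_792_458.0            # m/s, speed of light
A_MERCURY = 5.7909e10        # m, semi-major axis of Mercury's orbit (approximate)
E_MERCURY = 0.2056           # orbital eccentricity (approximate)
PERIOD_DAYS = 87.969         # orbital period in days (approximate)

def gr_perihelion_advance_per_orbit(gm, a, e):
    """Extra perihelion rotation per orbit predicted by general relativity, in radians."""
    return 6 * math.pi * gm / (C**2 * a * (1 - e**2))

orbits_per_century = 36525.0 / PERIOD_DAYS
radians_per_century = (
    gr_perihelion_advance_per_orbit(GM_SUN, A_MERCURY, E_MERCURY) * orbits_per_century
)
arcsec_per_century = math.degrees(radians_per_century) * 3600.0
print(f"GR perihelion advance for Mercury: {arcsec_per_century:.1f} arcsec per century")
```

With these inputs the result comes out close to 43 arcseconds per century, which is the residual that Newtonian mechanics left unexplained.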

43:39

It's not there. But if you're a proper Newtonian,  what you do is say, "Well, maybe there's some  

43:44

cosmic dust that's occluding this planet, or  maybe the planet is so small we can't see it,  

43:49

or let's build an even more powerful telescope,  or maybe there's some magnetic field which is  

43:55

occluding our measurement." At any one of these steps— 

43:56

And this happens over and over. There are just so many stories  

44:00

which are exactly like this. An example I love from the 1990s. 

44:07

Some people noticed that the Pioneer spacecraft  weren't quite where they were supposed to be. 

44:11

You can get very excited about this. "Oh  my goodness, general relativity is wrong. 

44:16

Maybe we're going to discover  the next theory of gravity." 

44:20

Today the accepted explanation is that there's  just a slight asymmetry in the spacecraft. 

44:30

It turns out that the thermal radiation  is slightly larger in one direction than  

44:35

the other, and that's causing a tiny  little acceleration towards the sun. 

44:40

Most of the time when there's  these apparent exceptions,  

44:44

it's just something like that going on. It's very much like the Mercury-Vulcan case. 

44:50

But every once in a while, it's not. A priori, you can't distinguish these. 

44:56

Science is just full of these. It's funny too, the way we tell  

44:59

the history of science, it sounds so simple. You just focus on the right exception and  

45:07

you realize that you need to throw out the old  theory and lo and behold, your Nobel Prize awaits. 

45:14

But in fact, these exceptions are all over the  place. 99.9% of the time, it just turns out to be  

45:20

some effect like this thermal acceleration  in the case of the Pioneer spacecraft. 

45:28

Unfortunately, there's a lot of  selection bias going into those stories. 

45:32

The thing is there's no ex ante heuristic  which tells you which case you're in. 

45:38

To spell out why I think this is important, some  people have this idea that AI is going to make  

45:44

disproportionate progress towards science because  it makes disproportionate progress towards domains  

45:49

where there's tight verification loops. It's really good at coding because you  

45:52

can run unit tests. Science may be similar  

45:54

because you can run experiments. What that doesn't appreciate is that  

46:01

there's an infinite number of theories that  are compatible with any given experiment. 
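
A toy sketch of that underdetermination point, added for illustration with made-up data: every "rival theory" below agrees exactly with the four observations, because the extra term vanishes at each observed point, yet the rivals disagree about the next measurement. The names `theory_a` and `make_rival` are hypothetical.

```python
import numpy as np

# Four made-up observations that happen to lie on a straight line.
x_obs = np.array([0.0, 1.0, 2.0, 3.0])
y_obs = 2.0 * x_obs + 1.0

def theory_a(x):
    """The 'simple' theory: a straight line."""
    return 2.0 * x + 1.0

def make_rival(k):
    """Build a rival theory that matches every observation but differs elsewhere."""
    def theory(x):
        bump = np.ones_like(x, dtype=float)
        for xi in x_obs:
            bump = bump * (x - xi)       # zero at every observed x, nonzero elsewhere
        return theory_a(x) + k * bump
    return theory

# Any real k gives another compatible theory; three are enough to make the point.
rivals = [make_rival(k) for k in (0.5, -1.0, 3.0)]

x_new = np.array([4.0])
print("simple theory predicts:", float(theory_a(x_new)[0]))
for rival in rivals:
    fits = bool(np.allclose(rival(x_obs), y_obs))
    print("rival fits all data:", fits, "| predicts at x=4:", float(rival(x_new)[0]))
```

The experiment alone cannot rank these candidates. Preferring the straight line takes some extra criterion, such as simplicity, that the data does not supply.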

46:04

Over time, why we latch onto the one we  think is more correct in retrospect is,  

46:10

as we're discussing, hard to articulate. Lakatos has all kinds of interesting examples  

46:16

in the book about these hostile verification  loops that are extremely long-lasting. 

46:24

One he talks about is Prout. There's this chemist in 1815 who hypothesizes that  

46:31

all atoms must have whole-number weights. They're basically all made of hydrogen. 

46:38

The reason he thinks this is because if you  look at the measured weights of all elements,  

46:42

it does seem that almost all of  them have whole number weights. 

46:45

But then there are some exceptions. For example, chlorine comes out at 35.5. 

46:51

So then there's all these ad hoc theories that  people in this school keep coming up with, like,  

46:55

"Oh, maybe there's chemical impurities." But there's no chemical reaction you  

46:59

can do which seems to get rid of this. Maybe it's fractions of whole numbers,  

47:03

so 35.5 can be halves. But actually, if you  

47:05

measure chlorine even closer, it's 35.46, so it's  getting further away from the correct fraction. 

47:11

Later on, what is discovered is what you're  actually measuring is different isotopes,  

47:15

which cannot be chemically distinguished. They can only be physically distinguished. 
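
The arithmetic behind the chlorine anomaly can be sketched as follows. This is an editorial illustration using approximate modern values for the isotope masses and natural abundances, none of which appear in the conversation.

```python
# Approximate modern values: (atomic mass in unified mass units, natural abundance).
isotopes = {
    "Cl-35": (34.96885, 0.7576),
    "Cl-37": (36.96590, 0.2424),
}

# A bulk sample is a mixture, so a chemist measures the abundance-weighted mean.
average_weight = sum(mass * abundance for mass, abundance in isotopes.values())

for name, (mass, abundance) in isotopes.items():
    print(f"{name}: mass {mass:.3f}, abundance {abundance:.2%}")
print(f"Measured average atomic weight: {average_weight:.2f}")
```

Each isotope is individually close to a whole number, so Prout's idea was roughly right about the isotopes. The fractional value near 35.45 only reflects the mixture, which chemical purification cannot separate.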

47:20

So you have 85 years before we realize what  an isotope is, where the verification loop is  

47:25

actively hostile against the correct theory. You just need this remnant to be defending…  

47:30

There's no ex ante reason  it's the preferred theory. 

47:33

As a community, we should just have people  try to integrate new observations, even if  

47:38

they don't seem to fit their school of thought,  and hopefully enough of that happens… Anyways,  

47:45

I guess the thing I'm trying to articulate  is the difficulty with automating science. 

47:51

The question is, where is  the bottleneck at some level? 

47:56

Are we primarily bottlenecked  on one type of thing, or are  

47:59

we bottlenecked on multiple types of things? Certainly, talking to structural biology people,  

48:07

they seem to think that AlphaFold was an  enormous advance. It was a shock. At some level,  

48:12

yes, AI can certainly help us speed up science. It is helping with a certain type of bottleneck. 

48:22

That doesn't mean though, as you're  saying, that it's necessarily going  

48:24

to help with all kinds of bottlenecks. I suppose the question you're pointing at is,  

48:29

what are the types of bottlenecks that remain,  and what are the prospects for getting past them? 

48:35

Even in the case of coding, it's really  interesting talking to programmer friends. 

48:40

At the moment they're all in this  state of shock and high excitement,  

48:45

and they're all over the place. You do wonder where the  

48:51

bottleneck is going to move to. Certainly, one thing that a lot  

48:54

of them seem to be bottlenecked on now is  having interesting ideas, and in particular,  

48:59

having interesting design ideas. There's not really a verification loop for  

49:04

knowing that a design idea is very interesting. They're no longer nearly as bottlenecked by their  

49:12

ability to produce code, but they are  still bottlenecked by this other thing. 

49:17

Formerly, they weren't bottlenecked on it because  just writing code took so much of their time. 

49:22

They could have lots of ideas while they were  taking three weeks to implement their prototype,  

49:28

and then they would implement the next version. Now they're taking three hours to implement the  

49:32

prototype, and they don't have as good ideas  after that, from a design point of view. 

50:54

You have a very interesting take. I think it was a footnote in one of  

50:58

your essays, and I couldn't find it again,  which was that it's very possible that if  

51:02

we met aliens, they would have a totally  different technological stack than us. 

51:07

That contradicts a common assumption I  had that I never questioned, which is that  

51:11

science is this thing you do relatively  early on in the history of civilization. 

51:17

You get to a point and you have a couple hundred  years of just cranking through the basics,  

51:21

understanding how the universe works, and you've  got it. You've got science. Then everybody  

51:27

would converge on the same "science." I found that a very interesting idea,  

51:31

and I want you to say more about it. The idea there that I'm at least somewhat  

51:39

attached to is that the tech tree or the  science and tech tree is probably much  

51:48

larger than we realize. We're in this funny  situation. People will sometimes talk about  

51:55

a theory of everything as a potential goal for  physics, and then there's this presumption that  

52:02

physics is done once you get there. Of course, this is not true at all. 

52:06

If you think about computer science,  computer science started in the 1930s  

52:12

when Turing and Church and so on laid  down what the theory of everything was. 

52:18

They just said, "Here's how computation works." We've spent ninety-odd years since then  

52:24

exploring the consequences of that and gradually  building up more and more interesting ideas. 

52:29

Those ideas, to some extent,  you can regard as technology. 

52:32

But insofar as they're discovered principles  inside that theory of computation, I think  

52:38

they're best regarded as science and in  some cases, very fundamental science. 

52:42

Ideas like public-key cryptography are  incredibly deep, very non-obvious ideas which  

52:50

lay hidden already in the 1930s. My expectation is that there will  

52:59

be different ways of exploring this tech  tree, and we're still relatively low down. 

53:03

We're still at the point where we're just  understanding these basic fundamental theories,  

53:08

and we haven't yet explored them. A thing which I think is quite fun  

53:13

is if you look at the phases of matter. When I was in school, we'd get taught that  

53:17

there are three phases of matter, or sometimes  four or five, depending on what you included. 

53:26

As an adult, as a physicist, you start to  realize we've been adding to this list. 

53:33

We've got superconductors and superfluids,  and maybe different types of superconductors,  

53:38

and Bose-Einstein condensates,  the quantum Hall systems,  

53:41

fractional quantum Hall systems, and so on. It's starting to turn out there's a lot of  

53:49

phases of matter to discover, and we're  going to discover a lot more of them. 

53:54

In fact, we're going to be able to  start to design them in some sense. 

53:57

We'll still be subject to the laws of physics,  but there is this tremendous freedom in there. 

54:02

This looks to me like we're down  at the bottom of the tech tree. 

54:06

We've barely gotten started there, and  I expect that to be the case broadly. 

54:13

Certainly, programming is a  very natural place to look. 

54:18

The idea that we've discovered all the deep ideas  in programming just seems obviously ludicrous. 

54:25

We keep discovering what seem like deep, new,  fundamental ideas. We're very limited. We're  

54:34

basically slightly jumped-up chimpanzees,  so we're slow and it's taking us time. 

54:43

But what do we look like another million  years in the future, in terms of all the  

54:51

different ideas people have had around how  to manipulate computers and information? 

54:58

I think we're likely to discover that there are  a lot of very deep ideas still to be discovered. 

55:07

I think it was Knuth in the preface to The Art of Computer Programming who says something like this. 

55:12

He started this book back in the sixties. He talked to a mathematician who was a bit  

55:18

contemptuous and said, "Look, computer  science isn't really a thing yet. 

55:22

Come back to me when there's  a thousand deep theorems." 

55:26

Knuth remarks, writing the preface decades later,  "There clearly are a thousand deep theorems now." 

55:36

It's really interesting to think what the  long-term future is as you get higher and  

55:41

higher up in the tech tree, choices about which  direction we go and how we choose to explore. 

55:51

It's potentially the case that different  civilizations or different choices mean  

55:56

we end up in different parts of that tree. In particular, there are just very basic things  

56:03

about how we're very visual creatures, while  certain other animals are much more aurally based. 

56:10

Does that bias the types  of thoughts that you have? 

56:15

Then you extend it to much more exotic kinds  of civilizations where maybe their biases in  

56:23

terms of how they perceive and manipulate  the world are quite different than ours. 

56:29

That might make some significant  changes in terms of how they do  

56:35

that exploration of the tech tree.  It's all speculation, obviously. 

56:39

This is such an interesting take. I want to better understand it. 

56:43

One way to understand it is that  there might be some things which are  

56:47

so fundamental and have such a wide collision  area against reality that they're inevitably  

56:51

going to be discovered, like general relativity. Numbers. Numbers. Of all the intelligences in  

56:59

the Milky Way galaxy… Maybe that number is one. Well, actually, arguably we've already  

57:04

increased the number. But of all of those,  

57:09

what fraction have the concept of counting?  It does seem very natural. What fraction have  

57:17

discovered the idea of some kind of decimal place  system? Interesting question. Maybe we're missing  

57:25

something really simple and obvious that's  actually way better than that. What fraction  

57:30

got there immediately? What fraction had to  go through some other intermediate state? 

57:34

What fraction uses linear representations  versus a two-dimensional or a  

57:39

three-dimensional representation? I think the answers to these questions  

57:42

are just not at all obvious. It's a lot of design freedom. 

57:45

On theoretical computer science, this is going to  be extremely naive and arrogant, but I took Scott  

57:54

Aaronson's class on complexity theory, and I  was by far the worst student he's ever had. 

58:02

What I remember is there was this period, in  which you were one of the pioneers, where we  

58:08

figured out the class of problems that  quantum computers can solve and how it  

58:13

relates to problems that classical computers  can solve. It was groundbreaking. It's  

58:16

crazy that this works. Since then… There's  literally this website called Complexity  

58:22

Zoo which lists out all the complexity classes. If you have this complexity class with this kind  

58:27

of oracle, it's equivalent to this other class. It feels like we're building out that taxonomy. 

58:34

There are a couple ways to  understand what you're saying. 

58:35

One, maybe you disagree with me that this  is actually what's happened with this field. 

58:39

Another is that while that might happen to  any one field, who would've thought in 1880  

58:44

that computer science, other than Babbage,  was going to be a thing in the first place? 

58:49

We're underestimating how many  more fields there could be. 

58:52

Or maybe you think both, or maybe a  third secret thing. I'd be curious. 

58:59

A very common argument here is  the low-hanging fruit argument. 

59:03

The argument that says there  should be diminishing returns. 

59:06

In fact, empirically we see this. The number of scientists in the  

59:09

world has exponentially increased. I think it's worth thinking about why  

59:16

you expect diminishing returns and how well  that argument actually applies in practice. 

59:24

An analogy I like is thinking about  going to an event, like a wedding,  

59:30

and you go to the dessert buffet. They've  put out thirty desserts. Naturally,  

59:37

what people do is the best desserts go first. We don't quite have a well-ordered preference  

59:43

there, so maybe there's some difference,  but human beings are fairly similar,  

59:47

so the best desserts will go first. This is an argument for why you expect  

59:53

diminishing returns in a lot of different fields. If it's relatively easy to see what's available  

59:58

and people have similar preferences,  then the best stuff goes first and it  

60:03

just gets worse and worse after that. If you look at a very static snapshot  

60:10

in time of scientific progress,  maybe there's some truth to that. 

60:16

But if somebody is standing behind the dessert  table and is replenishing and restocking the  

60:21

desserts and keeps adding new ones in,  it may turn out that a little bit later,  

60:27

much better desserts appear, and you're  going to go and eat those instead. 

60:33

Scientific progress has a  little bit of that flavor. 

60:36

We go through these funny time periods. Computer science is a great example,  

60:40

where computer science basically arose as a  side effect of some pretty abstruse questions  

60:48

in the philosophy of mathematics and logic. You've got these people trying to attack  

60:57

these rather esoteric questions that  seem quite high up in exploration,  

61:04

and they discover this fundamental new field,  and all of a sudden there's an explosion there. 

61:09

The diminishing returns argument  just didn't apply there. 

61:12

We just weren't able to see what was there. This has been the case over and over again. 

61:19

New fields arrive, and all of a sudden, boom, it's easy to make progress again. 

61:24

Young people flood in because you can be  twenty-one and make major breakthroughs  

61:28

rather than having to spend twenty-five  years mastering everything that's been  

61:32

done before. It's obviously very attractive.  I'm not sure anybody understands very well  

61:40

the dynamics of that, or how to think about  why the structure of knowledge is that way,  

61:47

where these new fields keep opening up. But it does seem empirically to be the case. 

61:54

Despite the fact that that is  the case… Take deep learning. 

61:58

Obviously, this is an example of a new field  where twenty-one-year-olds can make progress and  

62:05

it's relatively new. Fifteen years or so  

62:08

since it got back into high gear. But already we're in a stage where  

62:16

you need billions, tens of billions,  or hundreds of billions of dollars  

62:20

to keep making progress at the frontier. There are a couple ways to understand that. 

62:24

One is that it actually is harder than the  kinds of things the ancients had to do,  

62:28

or is more intensive at least. Second is it might not have been,  

62:33

but because our civilizational resources are so large, the number of people is so large,  

62:37

the amount of money is so large, we can basically  make the kind of progress it would have taken  

62:41

the ancients forever to make almost immediately. We notice something is productive and immediately  

62:47

dump in all the resources. But it's also weird that  

62:51

there's not that many of them. I feel like deep learning is notable  

62:55

because it is the one big exception; it's hard to think of other examples. 

62:59

I think that's a consequence of  the architecture of attention. 

63:03

At any given time, there's  always a most successful thing. 

63:09

If deep learning wasn't a thing,  maybe you'd be talking about CRISPR. 

63:12

Maybe we wouldn't think about solving the protein  structure prediction problem as a success of AI. 

63:23

Maybe we would have figured out how to do it  with curve fitting, more broadly construed,  

63:28

and we'd just be like, "Wow, that  took a lot of computing resources." 

63:31

But protein structure prediction might  be an enormously important thing. 

63:36

There is always our biggest thing. What you're pointing at is more a consequence  

63:43

of the way in which attention gets centralized. It's basically fashion, is what I'm saying. 

63:49

It's not just fashion, but  there is some dynamic there. 

63:54

There's a very interesting and  important implication of this idea. 

63:59

That the branching is so wide and so  contingent and so path-dependent that  

64:05

different civilizations would stumble  on entirely different technology stacks. 

64:08

There's a very interesting implication that  there will be gains from trade into the far,  

64:13

far future, which might actually be one of the  most important facts about the far future in  

64:17

terms of how civilizations are set up, how  they coordinate, and how they interface. 

64:22

There's not this "go forth and exploit." There are humongous gains to trade from  

64:28

adjacent colonies or whatever. Sort of. There's a question of  

64:35

what's actually hard. If it's just the ideas,  

64:40

well, those spread relatively quickly. It's relatively easy to share ideas. 

64:44

If it's something more, it's  almost a Dan Wang kind of idea  

64:48

where there's some notion of capacity. You need all the right techs, you need all  

64:53

the right manufacturing capacity, and so on. So civilization A has a very different kind  

64:59

of manufacturing capacity, and it's just  not so easy to build in civilization B. 

65:03

Even if civilization B is ahead,  I think that becomes true. 

65:08

There is a comparative advantage which  is going to provide massive benefits  

65:16

to trade in both directions. Eventually, you expect some  

65:19

diffusion of innovation. It is funny to think about  

65:23

what the barriers are there. A fun thought experiment I like  

65:26

to think about is GitHub but for aliens. Somebody presents you with all of the code  

65:36

from some alien civilization. I don't even know what code means  

65:40

there, but their specification of algorithms. It would have many interesting new ideas in there,  

65:50

and it would take forever for human beings to  dig through and try and extract all of those. 

65:56

The origin of this for me was  thinking about proteins in nature. 

66:05

We've been gifted this incredible variety of  machines which we don't really understand at all. 

66:12

We just have to go and try and  understand them on a one-by-one basis. 

66:17

We're still understanding hemoglobin  and insulin and things like this. 

66:23

There are hundreds of millions of proteins known. So it is a little bit like that. 

66:28

We've been gifted by biology this immense library  of machines, no doubt containing an enormous  

66:36

number of very interesting ideas, and we're just  at the very, very beginning of understanding it. 

66:43

I suppose your point—I need to relabel  your argument slightly—but you think of  

66:51

that as a gift from an alien civilization, which  obviously it isn't, but you think of it that way. 

66:56

And oh my goodness, there's so much  in there and we're going to study it. 

67:02

Goodness knows how long we  could continue to study it. 

67:04

There are tens of thousands of papers  about hemoglobin and things like that,  

67:09

and we still don't understand them, and yet  we're getting so much out of it. Just think about  

67:13

insulin alone. It's such an important thing. That's an incredibly useful intuition pump,  

67:23

that you have on Earth… I had Nick Lane on  where he had this theory about how life emerged,  

67:27

but whatever theory you have, something  like DNA has had four billion years. 

67:34

You have an alien civilization come here  and be like, "There's all these interesting  

67:37

things to learn about materials science." Think about kinesin walking along. We know

67:47

almost nothing about these proteins, and yet the  tiny few facts we do know are just incredible. 

67:52

The ribosome is another example, this  miraculous sort of device, a little factory. 

68:01

All seeded by this particular chemistry on Earth  with nucleic acids and carbon-based life forms. 

68:09

That chemistry gives rise to all of  these interesting things which an  

68:13

alien civilization would find very interesting. That very seed, which must be one among trillions  

68:19

of possible seeds of general intellectual  ideas, leads to all this fecundity. 

68:25

That's a very interesting intuition pump. I want to meditate on this "gains from trade"  

68:29

thing because I feel like there's something very  interesting about this idea that if you have this  

68:34

vision of how technology progresses and how it  may be different in different civilizations,  

68:39

it actually has important implications  about how different civilizations  

68:42

might interact with each other. The fact that there are going  

68:45

to be these huge gains from trade. It makes friendliness much more rewarding? 

68:48

Yes. That's a very important observation. I hadn't thought about that at all. 

68:54

That is a very interesting observation. It  is funny. Comparative advantage is something  

69:02

that people love to invoke and it's a very  beautiful idea obviously. There are limits to it.  

69:16

It's a special limited model. Chimpanzees can do  interesting things, but we don't trade with them. 

69:26

I think it's interesting to  think about the reasons why. 

69:31

Part of it is just power, I think. Once there's a sufficiently large power imbalance,  

69:38

very often—not always, but very often—groups  of people seem to shift into this other  

69:44

mode where they just seek to dominate. Maybe there's something special about human  

69:49

beings, but maybe it's also a more general thing. You need all these special things to be  

69:57

true before groups will trade.  It's not necessarily obvious. 

70:05

I think the big thing going on  here is one, transaction costs. 

70:09

Two, comparative advantage does not tell you  that the terms on which the trade happens  

70:17

are above subsistence for any given producer. People often bring this up in the context of,  

70:22

"Well, humans will be employed even in a  post-AGI world because of comparative advantage." 
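The subsistence point can be made concrete with a toy two-producer, two-good example. This is a minimal sketch: every number, the producer labels "A" and "H", and the goods "X" and "Y" are hypothetical and chosen only for illustration. It shows a slow producer that does hold a comparative advantage, yet whose best possible terms of trade still fall short of what it costs to keep it going.

```python
# Hypothetical output per hour of work for two goods, X and Y.
output = {
    "A": {"X": 100.0, "Y": 100.0},  # fast producer (think: the car, or the AI)
    "H": {"X": 1.0, "Y": 2.0},      # slow producer (think: the horse, or the human)
}


def opportunity_cost_of_y(producer):
    """Units of X the producer gives up to make one unit of Y."""
    rates = output[producer]
    return rates["X"] / rates["Y"]


cost_a = opportunity_cost_of_y("A")  # 1.0 X per Y
cost_h = opportunity_cost_of_y("H")  # 0.5 X per Y

# H gives up less X per unit of Y, so H has the comparative advantage in Y.
has_comparative_advantage = cost_h < cost_a

# A will never pay more than cost_a units of X for a unit of Y, because it
# could make Y itself at that price. H's best-case hourly income, in X:
max_income_h = output["H"]["Y"] * cost_a

SUBSISTENCE = 5.0        # hypothetical hourly upkeep of H, in units of X
TRANSACTION_COST = 0.5   # hypothetical hourly overhead of dealing with H, in X

trade_is_viable = max_income_h - TRANSACTION_COST >= SUBSISTENCE

print(has_comparative_advantage, max_income_h, trade_is_viable)
```

With these made-up numbers the comparative advantage is real, but the most H can earn per hour is below its upkeep, so the trade never happens. Different numbers would of course give a different verdict; the sketch only shows that comparative advantage alone does not settle the question.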

70:26

There are five different ways that argument breaks  down, but the easiest way to understand it is:  

70:32

why don't we have horses all around on the roads? Because there's some comparative advantage  

70:36

between cars and horses. One, there are huge  

70:40

transaction costs to building roads that are  compatible with horses and cars at the same time. 

70:45

In a similar way, AIs that think at 1,000 times human speed and can shoot their latent states at each

70:52

other are going to find that interacting with a human being in the supply chain costs far more

70:57

than the benefit it brings. Second, just because horses have a comparative

71:07

advantage mathematically does not mean that it  is worth paying $100,000 a year, or whatever  

71:14

it costs to sustain a horse in San Francisco. That subsistence isn't going to be worth the  

71:19

benefit you get out of the horse. I do think it's interesting,  

71:23

the sheer fact… My expectation and my intuition obviously differ a great deal from yours on this.

71:32

Most parts of the tech tree  are never going to be explored. 

71:35

There are just too many interesting  ways of combining things. 

71:38

There are too many deep ideas waiting to be  discovered, and not only we, but nobody ever  

71:45

is going to discover most of them. So choices about how to do the  

71:50

exploration actually matter quite a bit. It's something I really dislike about  

71:55

technological determinist arguments. I'm willing to buy it low enough  

71:59

down in the tech tree, where progress is relatively simple. But higher up, you start to get to shape

72:06

the way in which you do the exploration. And it's interesting, we are starting to  

72:12

shape it in interesting ways. There are various technologies  

72:17

that have been essentially banned. You think about DDT, chlorofluorocarbons,  

72:22

restrictions on the use of nuclear weapons,  the Nuclear Non-Proliferation Treaty. 

72:27

Those kinds of things weren't done before the  fact, but they're starting to get pretty close  

72:36

in some cases, where we just preemptively decide,  "Oh, we're not going to go down that path." 

72:43

So that starts to look like a set of  institutions where we are actually  

72:47

influencing how we explore the tech tree. On where you would see these gains from trade,  

72:54

obviously you'd see the most where it's pure  information that could be sent back and forth,  

72:59

because the information has this quality  where it is expensive to produce,  

73:02

but cheap to verify and cheap to send. It'll be interesting how much of future  

73:09

productivity can be distilled down to information. Right now, it's hard to do. 

73:14

If China's really good at manufacturing  something, there's this process knowledge  

73:18

that's in the heads of 100 million people  involved in the manufacturing sector in China. 

73:23

But in the future, it might  be easier if AIs are doing it. 

73:26

The question is to what extent our  fabrication gets very uniform and  

73:32

gets really commoditized. 3D printers have been  the next big thing for at least 20 years now. 

73:39

Why do they still not work all that well? Why are they still not at the center of  

73:44

manufacturing, and what comes after that? It is funny to look at the ribosome by contrast,  

73:50

which really is at the center of biology  in a whole lot of really interesting ways. 

73:55

Whether or not that's the future of  manufacturing is something very simple,  

74:00

where everything goes as throughput through  a bioreactor or something like that. 

74:07

You send the information, and then you grow stuff,  or you have some 3D printer that actually works. 

74:15

If they're good enough, then it does become much  more a pure information problem, and some of this  

74:20

process knowledge becomes much less important. Jane Street has a lot of compute, but GPUs are  

74:27

very expensive, and so even optimizations  that have a relatively small effect on GPU  

74:32

utilization are still extremely valuable. Two of Jane Street's ML engineers,  

74:36

Corwin and Sylvain, walked through some  of their optimization workflows at GTC. 

74:40

You're not bottlenecked on the network  being too slow, you're bottlenecked on  

74:42

waiting for a different rank in your  training not having completed the work. 

74:47

They talked about how Jane Street profiles  traces and diagnoses bottlenecks, and then  

74:51

how they solve them using techniques like CUDA  graphs and CUDA streams and custom kernels. 

74:56

With these sorts of optimizations,  Corwin and Sylvain were able to  

74:59

get their training steps down from 400  milliseconds to 375 milliseconds each. 
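The arithmetic behind the step-time figures just quoted can be sketched as follows. The 400 ms and 375 ms values come from the transcript; the fleet size is a hypothetical placeholder, since no actual figure is given.

```python
BEFORE_MS = 400.0  # step time before optimization (from the transcript)
AFTER_MS = 375.0   # step time after optimization (from the transcript)

# Fraction of accelerator time saved on a fixed amount of training work.
time_saved_fraction = (BEFORE_MS - AFTER_MS) / BEFORE_MS

# Equivalent view: extra steps completed per unit of time.
throughput_gain = BEFORE_MS / AFTER_MS - 1


def gpus_freed(fleet_size):
    """GPUs no longer needed for the same workload, given a fleet size."""
    return fleet_size * time_saved_fraction


HYPOTHETICAL_FLEET = 10_000  # assumption for illustration only

print(f"time saved: {time_saved_fraction:.2%}")
print(f"throughput gain: {throughput_gain:.2%}")
print(f"GPUs freed in a fleet of {HYPOTHETICAL_FLEET}: {gpus_freed(HYPOTHETICAL_FLEET):.0f}")
```

The saving scales linearly with fleet size, which is why a change of a few percent per step can matter at scale. This assumes the whole fleet runs workloads that benefit equally, which the transcript does not claim.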

75:04

This 25 millisecond difference might sound  small, but given the size of Jane Street's fleet,  

75:08

that improvement could free up thousands of B200s. Jane Street open sourced all the relevant code. 

75:13

If you want to check it out, I've linked the  GitHub repo and the talk in the description below. 

75:17

And if you find this stuff exciting, Jane  Street is hiring researchers and engineers. 

75:21

Go to janestreet.com/dwarkesh to learn more. Can I ask a very clumsily phrased question? 

75:29

There are a couple of deep principles that we've discovered.

75:35

One is this idea that if there's a symmetry  across a dimension, it corresponds to a conserved  

75:39

quantity. It's a very deep idea. There's  another—which you've written a lot about,  

75:43

written a textbook about in fact—about ways to  understand what kinds of things you can compute,  

75:52

what kinds of physical systems you can  understand with other physical systems,  

75:56

what a universal computer looks like, et cetera. Is your view that if you go down to this level of  

76:01

idea of Noether's theorem or the Church-Turing  principle, that there's an infinite number  

76:07

of extremely deep such principles? Because I feel what makes them special  

76:11

is that they themselves encompass so many  different possible ways the world could be. 

76:17

But no, the world has to be compatible with  a couple of these very deep principles. 

76:23

I don't know. All I have here  is speculation and instinct. 

76:29

My instinct is that we keep finding  very fundamental new things. 

76:33

It was quite formative for me to understand,  as I gave the example before, these wonderful  

76:40

ideas of Church and Turing and these other  people about universal programmable devices. 

76:46

Then you understand later, this also contains  within it the ideas of public-key cryptography. 

76:51

Then you understand later,  that also contains within it  

76:55

the ideas people refer to as cryptocurrency. There's a very deep set of ideas there about  

77:00

the ability to collectively maintain an  agreed-upon ledger, which is built upon this. 

77:11

It's taken many years to figure out  the right canonical form of those. 

77:18

Just this fact that you keep finding what  seem like deep new fundamental primitives  

77:28

has been a very important intuition pump for me. I've given that particular example, but I think  

77:35

you see that same pattern  in a lot of different areas. 

77:37

What is your interpretation then of this empirical  phenomenon where whatever input you consider into  

77:44

the scientific process or technological progress…  Economists have studied this a million ways. 

77:50

It just seems to require a very consistent  rate of X percent more researchers per year. 

77:55

There's this famous paper from a couple years  ago by Nicholas Bloom and others where they say,  

78:00

"How many people are working in the semiconductor  industry, and how has it increased over time  

78:05

through the history of Moore's law?" I think they find that Moore's law  

78:08

means transistor density increases  40% a year, but to keep that going  

78:14

the number of scientists has increased  9% a year, in the semiconductor industry. 
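Taking the two growth rates exactly as quoted in the conversation (they are stated from memory and not checked against the paper here), the implied decline in output per researcher can be worked out directly. A minimal sketch, assuming both rates stay constant:

```python
import math

# Figures as quoted in the conversation, assumed constant over time.
DENSITY_GROWTH = 0.40     # transistor density growth per year
RESEARCHER_GROWTH = 0.09  # growth in the number of researchers per year


def researcher_multiple(years):
    """How many times larger the research workforce is after `years`."""
    return (1 + RESEARCHER_GROWTH) ** years


def relative_productivity(years):
    """Density growth delivered per researcher, relative to year 0.

    The growth rate of density is held fixed, so per-researcher
    productivity falls as the inverse of headcount.
    """
    return 1 / researcher_multiple(years)


# Years for per-researcher productivity to halve.
halving_time = math.log(2) / math.log(1 + RESEARCHER_GROWTH)

# Years for transistor density to double at the quoted rate.
density_doubling_time = math.log(2) / math.log(1 + DENSITY_GROWTH)

print(f"productivity halves every {halving_time:.1f} years")
print(f"density doubles every {density_doubling_time:.1f} years")
print(f"researchers after 50 years: x{researcher_multiple(50):.0f}")
```

Under these assumptions, the same exponential rate of improvement requires a workforce that keeps compounding, which is the sense in which ideas "keep getting harder to find" in this measurement.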

78:19

They go through industry after  industry with this observation. 

78:23

Is your view that there are these deep  ideas, but they keep getting harder to find? 

78:25

Or is there another way to think about what's  happening with these empirical observations? 

78:32

First of all, all of their examples are narrow. They pick a particular thing, and then they  

78:37

look at a particular metric. GPUs don't show up  there. All of a sudden you get this ability to  

78:49

parallelize, and that's really interesting. There are a lot of external consequences. 

79:02

Basically they have these  simple quantitative measures. 

79:04

They look at it in agricultural productivity. 

79:06

They look at it in a whole lot of different  ways, but you do have to focus narrowly. 

79:14

I'm certainly interested in the fact that  new types of progress keep becoming possible. 

79:21

But I think even there, there does still seem  to be some phenomenon of diminishing returns.  

79:33

Is that intrinsic? Is that something about the  structure of the world? What is it? One thing  

79:38

which hasn't changed that much is the individual  minds which are doing this kind of work. 

79:44

Maybe those should be improved as well,  or some feedback process going on there. 

79:54

Maybe that changes the nature of things. I look at scientific progress up until,  

80:02

let's say, 1700, and it was very  slow, and also very irregular. 

80:08

You had the Ionians back five centuries before  Christ doing these quite remarkable things, and so  

80:16

much knowledge would get lost, and then it would  be rediscovered, and then it would be lost again. 

80:22

You'd have to say that progress was very slow. It's partially just bound up with the fact  

80:28

that there were some very good  ideas that we just didn't have. 

80:31

Even once you've had the ideas, you  need to build institutions around them. 

80:35

You actually need to solve a whole lot of  different problems about training, allocation  

80:39

of capital, and all these kinds of things. Even just basic security for researchers,  

80:44

so they're not worried about the  Inquisition or things like that. 

80:48

There are all these complicated problems. You solve all those complicated problems,  

80:51

and then all of a sudden, boom, there's  a massive burst of scientific progress. 

80:56

If there's some kind of stagnation, if you're  not changing those external circumstances, yes,  

81:02

you may start to get diminishing returns again. But that doesn't mean there's anything  

81:07

intrinsic about the situation. Maybe something external needs to change again. 

81:14

Obviously, a lot of people think AI  is potentially going to be a driver. 

81:19

It certainly will at some level. To that extent, you can think of a lot  

81:24

of modern scientific instrumentation  as really, at some level, robots. 

81:31

What is the James Webb Space Telescope? It's unconventional maybe to describe  

81:37

it as a robot, but it's not  completely unreasonable either. 

81:42

It is an example of a highly automated, very  sophisticated system with electronically  

81:47

mediated sensors and actuators, where machine  learning is being used to process the data. 

81:55

In that sense, we're already  starting to see that transition. 

81:58

We've been seeing it for decades. I have this "smoke a joint and take  

82:03

a puff" thought, which— I think we've had a few. 

82:06

I think we're getting to that part of the  conversation, and then you can help me get  

82:08

my foot out of my mouth and figure out  a more concrete way to think about it. 

82:14

To your point that there was the Industrial  Revolution, the Enlightenment, and now there's AI,  

82:21

and each might be a different pace or a  different way in which science happens. 

82:26

If you think about the pace of how fast  such transitions have been happening,  

82:32

you can draw over the long span of human  history this hyperbolic rate of growth that is  

82:39

increasing over time as well. A hundred thousand years ago,  

82:41

you had the Stone Age. 

82:43

You go back even much further, how  long have primates been around? 

82:46

It would be millions of years. A hundred thousand years ago, the Stone Age,  

82:49

then ten thousand years ago, the Agricultural  Revolution, then three hundred years ago,  

82:54

the Industrial Revolution, each marked by this  increase in the rate of exponential growth. 

83:00

Then people think it's going  to happen again with AI. 

83:04

But that would happen potentially even faster. It would not have occurred to somebody  

83:09

at the beginning of the Industrial  Revolution that the next demarcation  

83:12

in this trend will be artificial intelligence. So things are getting faster, and it's hard

83:20

to anticipate what the next transition will be. I guess we just think of this singularity between  

83:25

now and AI as what distinguishes  the past from the future. 

83:29

But applying the same heuristic that  many people in the past should have had,  

83:36

maybe the "Intelligence Age" is also  quite short and the next thing after that,  

83:40

we don't even have the ontology to describe what it is. The future will not think of the

83:46

past as pre-AI and post-AI. No, obviously we can't prove this,

83:55

but it certainly seems quite plausible. Part of the issue is just that the substrate  

84:01

we have available to conceive seems all wrong. You can't speculate with a bunch of chimpanzees  

84:09

about what it would be to have language. Just to pick a major transition in the past,  

84:20

the transition itself is the thing. It  seems likely. If we're talking about  

84:27

"taking a puff" kind of thoughts, I'm certainly  amused by the idea that there's going to be some  

84:34

transition involving artificial general  intelligence using classical computers. 

84:42

But actually, there'll be an interesting  transition with quantum computers as well. 

84:45

They're probably capable of a strictly larger  class of potentially interesting computations. 

84:52

So maybe the character of AQGI,  or whatever it should be called,  

84:59

is actually qualitatively different. So maybe there's a brief period  

85:04

between those two things. As I say, this is just speculation,  

85:09

but it's certainly amusing. Is there a reason to think that? 

85:12

From what I understand, for decades people  like you have put pretty tight bounds on  

85:17

the kinds of things quantum computers are  going to do. It'll speed up search somewhat.  

85:24

The kinds of things it speeds up extremely,  like Shor's algorithm, it seems like… Again,  

85:28

maybe this is to your point that we can't predict  in advance what's down the tech tree, but at least  

85:32

from here, it seems like you break encryption, but  what else are you using Shor's algorithm to do? 

85:36

We've only been thinking  about it for 40 or so years. 

85:43

Not for very long, and we haven't thought  that hard about it as a civilization. 

85:52

Does it turn out that it's very narrow?  Maybe. Does it turn out that it's very broad? 

85:56

That's also a really radical expansion  that seems distinctly possible. 

86:01

Keep in mind as well, we've been doing it  without the benefit of having the devices. 

86:06

That's a pretty big bottleneck to have. If you're thinking about computer science  

86:11

in the 1700s and you're like, "it can do AND/OR,  what can come out of that?" You can't anticipate  

86:16

Bitcoin. You can't anticipate deep learning. Maybe you could if you were sufficiently bright,  

86:21

but it is a pretty hard situation. What is your inside view, having been  

86:30

in and contributing to quantum information and  quantum computing back in the '90s and 2000s? 

86:35

What is your telling of the  history of what was the bottleneck? 

86:40

What was the key transition  that made it a real field? 

86:46

How do you rank the contributions from Feynman  to Deutsch to everybody else who came along? 

86:53

Let's just focus on the question  about what actually changed. 

86:57

Why was quantum computing not a thing in  the 1950s? It could have been. Somebody  

87:04

like John von Neumann is a good example. He  was absolutely pioneering computation. He also  

87:10

wrote a very important book about quantum  mechanics and was deeply interested in it. 

87:14

He could have invented quantum  computing at that time,  

87:18

and I think there were quite a number  of people who potentially could have. 

87:21

So why do we have these papers by people  like Feynman and Deutsch in the '80s? 

87:25

Those are fairly regarded as  the foundation of the field. 

87:31

There are some partial anticipations a little  bit earlier, but they were nowhere near as  

87:36

comprehensive and nowhere near as deep. You should  ask David. You can't ask Feynman, unfortunately,  

87:46

but he'll know much better than I do. A couple things that I think are interesting. 

87:51

One is that computation became far more  salient in the late '70s and early '80s. 

87:58

It just became a thing which many more people were  interested in, partially for very banal reasons. 

88:04

You could go and buy a PC. You could buy an Apple II. 

88:06

You could buy a Commodore 64. You could buy all these kinds of things. 

88:09

It became apparent to people that these were very  powerful devices, very interesting to think about. 

88:14

At the same time, in the quantum case,  that was also the time of the Paul  

88:20

trap and the ability to trap single ions. Up to that point, we hadn't really had the  

88:26

ability to manipulate single quantum states. You got these two separate things that  

88:31

for historically contingent reasons  had both matured around 1980 or so. 

88:41

Somebody like von Neumann could have had the idea  earlier, but it is quite an interesting factor. 

88:52

There's a story about Richard Feynman. He went and got one of the  

88:55

first PCs around 1980 or 1981. He was apparently so excited with this device,  

89:04

he actually tripped and hurt himself quite  badly carrying his brand-new computing device. 

89:16

That's a very historically contingent  coincidence, having somebody who's very  

89:25

talented and understanding of quantum mechanics  also just very excited about these new machines. 

89:32

It's not so surprising perhaps  that he's thinking about it then. 

89:36

What similar story could you  have told 10 years earlier? 

89:41

The conditions don't exist for it. I mean, it's quite a banal story, but… 

89:47

One of the things we were going to discuss was  this idea you had about the market for follow-ups. 

89:53

I think this is the perfect story for discussing it, because you wrote the

89:58

textbook about the field. "Mike and Ike" is  the definitive textbook on quantum information. 

90:06

You presumably came in after Deutsch. But you in the '90s somehow identified  

90:12

it as the thing that is worth  following up on and building on. 

90:17

Instead of talking about it more abstractly,  I'd love to just hear the firsthand story  

90:20

of how you knew that this is the thing to do. Of all the things that were happening in physics  

90:25

and computing, how did you decide  you want to think about this problem? 

90:30

Richard Feynman writes this great paper in 1982. David Deutsch writes an absolutely fantastic  

90:35

paper in 1985 sketching out a lot of the  fundamental ideas of quantum computing. I'm  

90:44

11 in 1985. I'm not thinking about this.  I'm playing soccer and doing whatever. 

90:49

But in 1992, I took a class on quantum mechanics  that was really terrific, given by Gerard Milburn. 

90:56

I just went and asked Gerard one day  after the fifth lecture or something. 

91:02

I said, "Do you have any papers or  whatever that you could give me?" 

91:09

He said, "Come by my office  in a couple of days' time." 

91:12

I did, and he presented me with a giant stack  of papers, which included the Deutsch paper,  

91:19

the Feynman paper, and a whole bunch of other  very fundamental papers about quantum computing  

91:24

and quantum information at a time when essentially  nobody in the world was working on it. He was. I  

91:32

think he wrote the very first paper that proposed  a practical approach to quantum computing. 

91:39

It wasn't very practical, but it  was actually in a real system. 

91:44

So in some sense, I'm benefiting  from the taste of this other person. 

91:51

As soon as I read the papers…  These are exciting papers. 

91:57

They're asking very fundamental questions,  and you realize I can make progress here. 

92:03

These are things that one  could potentially work on. 

92:06

Deutsch has this conjecture, or thesis or whatever  you’d call it, that a universal model, a quantum  

92:22

Turing machine, should be capable of efficiently  simulating any physical system at all. 

92:28

This is a very provocative idea. I think in that paper,  

92:32

he more or less claims that he's proved it. I'm not sure everybody would agree with that. 

92:39

There are questions about whether or not you  can simulate quantum field theory effectively. 

92:45

That kind of question is very  interesting and very exciting. 

92:52

It's obviously a fundamental  question about the universe. 

92:56

He has some wonderful ideas in there about  quantum algorithms, where they come from,  

93:04

what they mean, and how they relate to the meaning of the wave function.

93:07

Questions like this are still not  agreed upon amongst physicists. 

93:16

There's just some sense of, "Oh, I am in contact  with something which is (A) deeply important,  

93:20

and (B) we as a civilization don't have this." Of course, you start to focus  

93:26

your attention a little bit there. I'm not sure I got the answer to the question… 

93:35

Maybe I misunderstood the question. Maybe I'll explain the motivation first. 

93:44

In a previous conversation, we were discussing how  you could have known in the 1940s that the Shannon  

93:48

theorems and Shannon's way of thinking about a  communication channel is a deep idea that goes  

93:57

beyond the problems with pulse-code modulation  that Bell Labs was trying to solve at the time,  

94:02

and that it applies to everything from quantum  mechanics to genetics to computer science. 

94:09

One of the ideas you stated that we  didn't get a chance to talk about  

94:15

yet… Shannon published this paper. There are all these other papers,  

94:19

but there's some market of follow-ups where  people gravitate to and build upon Shannon's work. 

94:23

How do they realize that that's the thing  to do, and how does that process happen? 

94:29

I guess you gave your local answer. You read these papers, and you  

94:33

immediately realized there's work to be  done here. There's low-hanging fruit.  

94:37

There's some deep provocative idea  that I need to better understand,  

94:40

and I could tractably make progress on. To some extent, you're saying, "Okay,  

94:48

I wanted to get into this game of contributing to  humanity's understanding of the universe," and you  

94:57

are applying this low-hanging fruit algorithm. You're like, "Relative to my particular set of

95:01

interests and abilities, where should  I pick up my shovel and start digging?" 

95:08

There it was like, "Oh, this looks like  quite a good place to start digging." 

95:16

Different people, of course,  chose very differently. 

95:21

It was a very unusual choice at the time. This was  1992. Very few people were thinking about that. 

95:29

Fast-forwarding a bit, I don't know how  you think about your work on the open  

95:34

science movement now, but did it work? What does success there look like? 

95:41

What is the movement trying to accomplish? It's interesting. You didn't stop and define  

95:51

open science there, which 20 years ago you would  have had to do. People recognize the phrase.  

95:58

People have some set of associations with it. Most often, they have a relatively simple  

96:03

set of associations. It means maybe something  

96:06

about making scientific papers open access. Very often they have some set of notions about  

96:11

also making code openly available  or making data openly available. 

96:19

Those are already very large successes of the open  science movement, to make those salient issues. 

96:27

Those are issues on which people have opinions,  and there are relatively common arguments. 

96:35

This is like the meme version: publicly  funded science should be open science. 

96:42

That's a distillation of a set of ideas  which you might be able to contest. 

96:47

But if you can get people actually thinking  about it and engaged with that kind of argument,  

96:53

that's a very fundamental issue to be considering  in the whole political economy of science. 

97:01

If you go back three centuries, there  was a very similar argument prosecuted,  

97:09

which is the question: do we publicly  disclose our scientific results or not? 

97:13

If you look at people like Galileo and  Kepler, the extent to which they publicly  

97:19

disclosed was done in a very odd way. Sometimes they did bizarre things where  

97:26

they published some of their results as anagrams. They'd find some discovery, write down the result  

97:38

in a sentence, scramble it, and publish that. Then if somebody else later made the same  

98:02

discovery, they would unscramble the anagram  and say, "Oh, yeah, I actually did it first." 
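The anagram scheme just described works as a crude commitment: publish something that hides the claim now but can be checked against it later. A minimal sketch follows, using a sorted-letter anagram as one simple way to scramble, and a salted hash as a modern analogue. The example sentence, the nonce, and all function names are hypothetical; this is not a reconstruction of what any historical figure actually published.

```python
import hashlib
from collections import Counter


def letters(text):
    """Lowercased alphabetic characters of the text, sorted."""
    return sorted(ch for ch in text.lower() if ch.isalpha())


def make_anagram(sentence):
    """Scramble a sentence by sorting its letters."""
    return "".join(letters(sentence))


def reveal_matches(sentence, published_anagram):
    """Check that a revealed sentence uses exactly the published letters."""
    return Counter(letters(sentence)) == Counter(published_anagram)


def commit(sentence, nonce):
    """Modern analogue: publish a salted SHA-256 digest instead of an anagram."""
    return hashlib.sha256((nonce + sentence).encode("utf-8")).hexdigest()


claim = "the planet has a ring"          # hypothetical discovery
published = make_anagram(claim)          # what goes into print today
digest = commit(claim, "secret-nonce")   # hypothetical nonce

print(published)
print(reveal_matches(claim, published))
print(commit(claim, "secret-nonce") == digest)
```

One weakness of the anagram form is visible in `reveal_matches`: any other sentence built from the same letters passes the check too, so the commitment does not pin down a single claim. The hash version avoids that in practice, but neither form shares the knowledge itself, which is the speaker's point about it being a poor foundation for a discovery system.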

98:07

This is not an ideal foundation  for a discovery system. 

98:12

It took a very long time, over a century, I think,  to obtain more or less the modern ideals, in which  

98:20

you disclose the knowledge in the form of a paper. There is an expectation of attribution, and a  

98:27

reputation economy gets built. "So-and-so did this  work, so they deserve the credit for that," and  

98:34

that's the basis for their careers. This is the underlying  

98:37

political economy of science. That made a lot of sense when you have a printing  

98:42

press and the ability to do scientific journals. Then you transition to this modern situation,  

98:48

where you can start to share a lot more. You can share your code,  

98:52

your data, your in-progress ideas. But there's no direct credit associated to those. 

99:00

It's not at all obvious how much reputation  should be associated to them. That's all  

99:10

constructed socially. Making it a live issue  is a very important thing to have done. 

99:18

I view that as one of the main positive  outcomes of work on open science. 

99:23

I'll give you a really practical  example to illustrate the problem. 

99:28

For a long time in physics, there was a preprint  culture in which people would upload preprints  

99:37

to the preprint archive, and  in biology, this didn't happen.  

99:42

There was no preprint culture. That's changing  now, but for a long time, this was the case. 

99:47

I used to amuse myself by asking physicists  and biologists why this was the case. 

99:54

What I would hear from biologists was they would  say, "Biology is so much more competitive than  

100:01

physics that we need to protect our priority,  so we can't possibly upload to the archive. 

100:10

We have to just publish in journals." Then I would sometimes hear from physicists,  

100:14

"Physics is so much more competitive  than biology that we need to establish  

100:18

our priority by uploading as rapidly  as possible to the preprint archive. 

100:22

We can't possibly wait to  do it with the journals." 

100:25

I think this emphasizes the extent to  which this kind of attribution economy  

100:28

is just something we construct. It's something we do by agreement. 

100:36

Any attempt to change that economy results in a  different system by which we construct knowledge. 

100:43

There is this very fundamental set of problems  around the political economy of science. 

100:51

We've got this collective project,  and how we mediate it depends upon  

100:56

the economy we have around ideas. One of the things you've emphasized  

101:01

as a part of this project of open science, and we  talked about it earlier, is collective science,  

101:06

or groups of people making progress on a  problem where no individual understands  

101:11

all the logical and explanatory levels  necessary to make a leap or a connection. 

101:20

Outside of mathematics, what is the  best example of such a discovery? 

101:24

I'm not sure I have a well-ordering  of them to give you a best. 

101:29

An example that I think is very  interesting is the LHC, where it's  

101:34

just this immensely complicated object. Years ago, I snuck into an accelerator  

101:42

physics conference. I didn't know anything at  

101:44

all about accelerator physics, but I was just  curious to see what they were talking about. 

101:49

This particular group of people  were experts on numerical methods,  

101:53

in particular on inverse methods. Inside these accelerators,  

101:59

you have these cascades. A particle will be massively accelerated, maybe  

102:04

it'll be collided, and then you'll get a shower  of particles which decays and decays and decays. 

102:10

There's just this incredible, consequential  shower, which is ultimately what you see  

102:17

at the detector. Then you have to  

102:20

retroactively figure out what produced it. There are these very complicated inverse  

102:25

problems that need to be solved. You've got this final data,  

102:29

but you need to figure out what produced it,  and that's how you look for signatures of these. 

102:34

Many of these people were incredibly  deep experts on simulation methods  

102:40

for following particle tracks. This was really deep and difficult stuff. 

102:46

I was like, "Wow, you could spend a lifetime  just learning how to do this and how to solve  

102:52

some of these inverse problems, and you would know  very little about quantum field theory, detector  

103:00

physics, vacuum physics, or data processing,  all these things that are absolutely essential  

103:09

to understanding, say, the Higgs boson." I don't think it's possible for one person  

103:17

to understand everything in depth. Lots of people broadly understand a  

103:22

lot of these ideas, but they don't understand  everything in the depth that is actually utilized. 

103:29

That's why there are these papers  with well over a thousand authors. 

103:34

Those people can talk to one another at a  high level, but they don't understand each  

103:39

other's specialties in all that much depth. Things like detector physics, vacuum physics,  

103:45

solving inverse problems, this stuff is  incredibly different from each other. 

103:52

To understand it in real detail is serious work. How do you think about prolificness versus depth? 

104:02

Maybe Darwin's an example of somebody who's  just gestating on something for many decades.  

104:07

There are other examples. Einstein during the  year he comes up with special relativity is  

104:11

just doing a bunch of different things. And Pais talks about how they were all  

104:14

relevant to the eventual build-up. It's something I stress about a lot. 

104:20

Sometimes I feel I'm too slow. It's funny though, the Darwin  

104:24

example is really interesting. Prolific at  what? God knows how many letters he wrote. 

104:33

It must have been an enormous number. So he was certainly very active. 

104:41

There are two types of work that tend to be  involved in any kind of creative project. 

104:46

There's routine stuff, and there you  just want to avoid procrastination. 

104:49

You just want to ask, "How do I get good at this?"  or "How do I outsource it?" and "How do I do it as  

104:54

rapidly as possible?" and just avoid getting  into a situation where you're prolonging it. 

105:02

Then there's high-variance stuff where you  actually need to be willing to take a lot of time. 

105:11

You need to be willing to go to different  places and talk to different people,  

105:15

where in any given instance, most of  it is just not going to be an input. 

105:20

Somehow balancing those two things… I  think a lot of people are very good at  

105:25

doing one or the other, but it's almost like  a personality trait which one you prefer. 

105:31

People tend to end up doing a lot  of one and not enough of the other. 

105:37

So I certainly try and balance those two things. Einstein is such an interesting example. 1905  

105:45

is just this extraordinary year. You can delete special relativity  

105:48

entirely, and it's an extraordinary year. You can delete special relativity, and you can  

105:53

delete the photoelectric effect for which he won  the Nobel Prize, and it's still an extraordinary  

105:58

year, plausibly a multi-Nobel-Prize-winning  year. So what's he doing? Maybe the answer is  

106:07

just that he's smarter than the rest of us. There's a lot of luck as well. 

106:16

Certainly for myself anyway, trying to  identify those things that are routine  

106:22

that I should get good at, and then just  try to do them as quickly as possible. 

106:27

I think that's yielded a  certain amount of returns. 

106:30

But also being willing to bet a  little bit more on myself on the  

106:34

variance side has also been very, very helpful. That's really hard, because intrinsically you're  

106:41

putting yourself in situations where you  don't know what the outcome is going to be. 

106:45

If you're very driven to be productive, and  actually mostly it's not working over there,  

106:52

you think, "Let's reduce this." It doesn't  feel right. When I worked in San Francisco,  

106:58

a practice I used to have each day was  instead of taking the 15-minute walk to work,  

107:04

I would take the more beautiful 30-minute walk. 

107:08

Partially just because it was beautiful, but  partially also as just a reminder that there  

107:14

are real benefits to not being efficient. But it's not an answer to your question. 

107:19

Really, I think all I'm saying is  I struggle a lot with the question. 

107:22

I think Dean Keith Simonton has this  famous equal odds rule where he says  

107:30

the probability that any given thing you  release—any paper, book, whatever—will  

107:34

be extremely important for a given person  through their lifetime is not that different. 

107:40

What really determines in what era they are the  most productive is how much they're publishing. 

107:48

Any given thing has equal odds  of being extremely important. 

107:53

I think some of the most successful creatives  or scientists, they're just doing a lot. 

107:58

Shakespeare was just publishing a lot. Of course, then there are counterexamples. Gödel  

108:03

published almost nothing. But broadly speaking,  you need a very good reason to not do that. 

108:17

It's funny, I've met a lot of people  over the years who are clearly brilliant,  

108:23

and they're just obsessed that they are going to  work on the great project that makes them famous,  

108:28

and they never do anything. That seems connected.  It's a type of aversiveness. I think very  

108:33

often they just don't want public judgment. Something that I would love to see… There's  

108:39

an awful lot of biographies and memoirs  and histories of people who achieve a lot. 

108:45

I wish there was a very large  number of biographies of people  

108:50

who are fantastically talented who just missed. I've known people who won gold medals at IMOs  

109:02

and things like that, who then tried to become  mathematicians and failed. What happened? What  

109:11

was the reason? I suspect in many cases that's  actually more informative than anything else. 

109:18

You have this essay that I was reading  before this interview about how you  

109:23

think about what the work you're doing is. And "writer" doesn't seem like the right label. 

109:28

As you say, was Charles Darwin a writer? What  exactly is that label? I'm a podcaster. In a way,  

109:36

obviously our work is very different,  but I also think a lot about what this  

109:41

work is and how I get better at it. In particular, how can I make sure  

109:45

there's some compounding between the  different people I talk to on the podcast? 

109:50

I worry that instead of this compounding, I build  up some understanding that's somewhat superficial  

109:58

about a topic, and then it depreciates. I move down to the next topic, and it depreciates. 

110:04

There are a lot of podcasters in the world who  will interview way more experts than I have,  

110:10

and I don't think they're much the  wiser or more knowledgeable as a result. 

110:15

So it's clearly possible to mess this up. I wonder if you have thoughts or takes or  

110:21

advice on how one actually learns in  a deeper way from this kind of work. 

110:29

It's an incredibly complicated and rich question. It seems like the question is,  

110:37

how do you make it a higher-growth context? How do you make it a more demanding context? 

110:42

You can do that in relatively small ways  that might yield compounding returns,  

110:47

or you can do something that is more radical. Maybe it means starting a parallel project in  

110:52

which you do something that is  actually quite a bit different. 

110:55

There is something really interesting  about how being very demanding can  

111:02

simply change your response to something. Something that I would sometimes do with  

111:07

students and sometimes with myself,  it was really aimed more at myself,  

111:10

was they would say some week, "I'm going to  try and do this work over the coming week." 

111:18

Then the next week would come by  and they hadn't solved the problem. 

111:23

I'd ask, "If a million dollars had been at stake,  would you have put the same effort in?" 

111:27

And the answer is no, invariably. They've tried, but they haven't really tried. 

111:36

I think that's a very familiar  feeling for all of us. 

111:38

You could do a lot more if you had just the right  demanding taskmaster standing by you and saying,  

111:50

"Look, you're barely operating here." I do wonder a little bit about  

111:56

what's the demanding taskmaster? What can they ask you that is going  

112:01

to make your preparation way more intense? The most helpful thing honestly is… For some  

112:09

subjects it is very clear how I prep. I'm doing an upcoming episode on chip  

112:14

design with the founder of a company that does  chip design, and he wrote a textbook on it. 

112:20

Yesterday I went over to his office, and we  brainstormed five roofline analyses I can do. 

112:27

If I understand that, I have  some good understanding. 

112:31

The problem is with almost every other  field, there's not this curriculum. 

112:37

When I interviewed Ilya three, four years  ago, it was: implement the transformer,  

112:41

and if you implement it, you have some nugget  of understanding you have clamped down. 

112:45

With other fields, it's just that I vaguely  understand this. It's not clamped. There's  

112:52

no forcing function of "do this exercise,  and if you do it, you will understand." 

112:58

Really what you're saying is you can do  a good job at podcasting without actually  

113:04

attaining this kind of understanding, and  that's the problem from your point of view. 

113:07

You want to change your job description so that  you are internalizing these chunks and just  

113:13

getting this kind of integration each time. It seems to me that what that means is you  

113:18

actually want to change the structure  of the work output at some level. 

113:27

There's this terrible idea that lots of people  have that they should be in flow all of the time. 

113:34

And as far as I can tell, high performers  just don't believe this at all. 

113:38

They're in flow some of the time. You certainly see this with athletes. 

113:41

When they're actually out there  playing basketball or tennis,  

113:46

ideally they are in flow much of the time. But when they're training they're not. 

113:51

They're stuck a lot of the time,  or they're doing things badly. 

113:55

I suppose I wonder what that looks like for you. That I would be extremely satisfied with. 

114:00

The problem is I just don't know what  the equivalent of doing 64 laps is. 

114:06

This is a thing you can change by choosing  guests where there is a legible curriculum. 

114:12

So maybe it's a mistake not to have done that. Also, there's no real way to prep for Terence Tao. 

114:19

There's no curriculum that's a plausible one. There are many failure modes, but one  

114:28

long-term dynamic I'm worried about is that you  can have a good podcast and reach a local maximum,  

114:34

but for no particular guest or  topic are you going deep enough. 

114:39

My model of learning is that if you don't really  understand the deeper mechanism, you're just  

114:44

mapping inputs and outputs of a black box. That just fades incredibly fast or is  

114:49

not worth it in the first place. You just move on and it's over. 

114:54

You need to build the intermediate connection. AI in a weird way is really easy for that reason,  

115:05

because there is a clear thing you can do. Just implement it, and then you understand it. 

115:12

If I applied that criterion elsewhere,  do I just not do history episodes? 

115:16

Exactly. Ada Palmer. Wonderful to  talk to, incredibly interesting. 

115:22

But for you personally, what changed? There are some things I learned. 

115:27

If I had allocated more time,  especially after the interview, to  

115:32

write up 2,000 words on everything I learned  and how it connects to other things I know. 

115:36

Maybe that's a thing worth doing,  spreading out the episodes more  

115:39

and spending more time afterwards consolidating. I would pay infinite amounts of money if there was  

115:46

somebody who was really good at coming up with the  curriculum, the practice problems you need to do,  

115:51

and the exercise you need to do after the  interview to clamp what you have learned. 

115:55

Have you tried doing that with somebody? It's hard to find someone. I haven't  

116:00

tried super hard, but isn't it going to be  tough to find somebody who could do that  

116:04

for every single kind of discipline? Maybe I should just hire different  

116:08

ones for different topics. Maybe. There's something about,  

116:12

what problem are you solving for each episode? As far as I can tell, that's the only way I  

116:18

really understand anything. I get interested in  something. At first, I don't even have a problem,  

116:24

but there's just some sense that there's some  contribution to make here, and gradually you  

116:27

hone in, and there's a problem. Funnily enough, spending time  

116:32

stuck is incredibly important. That used to just be annoying. 

116:39

Now it seems like it's maybe even the  most important part of the whole process. 

116:47

That hard-won nature of it means  that I internalize it afterwards. 

116:53

I've written 10,000-word essays in  a couple of days, and I've written  

117:01

them in three months or six months. I feel like I didn't learn very much  

117:08

from the ones that only took a couple of days. Whereas some of the ones that took three months,  

117:16

15 years later, I'll still remember. Can you describe outside of physics how  

117:23

you learn, of the ones that took three months? By far the most common thing is there's always  

117:31

some creative artifact. Sometimes it's a  class. Sometimes it's engagement with a  

117:37

group of people who are working on some  collective creative artifact together. 

117:45

You might not even be aware  of it, but you're acting as  

117:48

an input to their creative ends in some way. Sometimes it's an essay or a book or whatever. 

117:56

It's one of the reasons why I  often quite enjoy doing podcasts. 

118:03

I said yes to come here partially because I  know you ask unusually demanding questions. 

118:10

That's an attempt to get this sort of perspective  from a different kind of a forcing function. 

118:17

Trying to pick the most  demanding creative context. 

118:20

For this interview, I went through three  lectures of the Susskind special relativity book. 

118:24

The problem is that there's  almost no practice problems in it. 

118:27

So I hired a physicist friend. I haven't done it yet, but for every lecture  

118:33

I want a bunch of practice problems to go through,  and I'm planning on being appropriately humbled. 

118:39

How do you make it as jugular as possible? The higher you can raise the stakes, the better. 

118:46

The interview is in some  sense high stakes, but also  

118:49

it doesn't necessarily test deep understanding. I don't think the interview is that high stakes. 

118:54

You're not writing a book about special  relativity, and you're not trying to write a  

118:57

book that replaces whatever the existing standard  textbook is. That's a really high stake. By the  

119:05

way, there's a phrase that I find particularly difficult. People will talk about "going deep" on a subject,  

119:16

and it turns out different people have  different ideas of what this means. 

119:19

For some people it means they  read a couple of blog posts. 

119:22

For some people it means  they read a book about it. 

119:24

For some people it means  they wrote a book about it. 

119:32

The standard you hold yourself  to determines a lot about your  

119:36

ability to integrate knowledge in this way. I found that I'm in some sense able to move  

119:47

much faster on some things through the help of  AI, but I don't know if I'm learning better. 

119:51

I think it's probably because… The hardest  thing, the thing that is most demanding,  

119:56

is so aversive that you try to take  any excuse you can to get out of it. 

120:00

Just having a back-and-forth conversation  with an LLM where you gloss over… 

120:04

It's entertaining but not  necessarily anything else. 

120:07

It's such an easy way to get out of the thing. In fact, it makes it easier because instead  

120:12

of doing some intermediate thinking, there's  always a next question you can ask a chatbot. 

120:17

Yeah. And it's somewhat valuable. That's part of  the seductiveness, of course. It's not actually  

120:25

useless. But it can substitute for actually  doing the thing that maybe you should be doing.  

120:33

It's interesting. To what extent should  you be outsourcing that kind of stuff?  

120:43

It's an interesting judgment call. There is a  whole bunch of routine work that you want done. 

120:55

It's low value for you, so if you can  get a chatbot to do it, you may as well. 

121:01

Somebody interviewed the pioneering  computer scientist Alan Kay years ago,  

121:04

and he was asked what he thought about Linux. If I remember his answer correctly, he basically  

121:09

said, "It doesn't have anything  to do with computer science. 

121:13

It's just a great big ball of mud. There are a few interesting ideas  

121:17

in there which are worth understanding, but  mostly all you're learning is stuff about Linux. 

121:24

You're not actually learning  anything which is transferable." 

121:27

I thought that was very interesting. There's a certain kind of seductiveness  

121:33

to some things where it's sort  of a Rube Goldberg machine. 

121:38

You can just learn about all the  bits, and it feels entertaining. 

121:42

But if you step back and think about  what you're actually doing here,  

121:48

it might not actually be meeting your objectives. Maybe you want to become a sysadmin, and learning  

121:52

Linux is a great use of your time. There's no harm in that at all. 

121:57

But if your objective is to understand the  fundamentals of computing, it's much less  

122:03

clear that that's a good use of your time. It was certainly an answer I've thought a  

122:08

lot about, where for a certain type of mind,  there is a seductiveness in just learning  

122:17

systems and confusing that with understanding. Okay, I'll keep you updated on how this goes. 

122:24

I owe you a text within a month  of some revamped learning system. 

122:28

I'd be really curious. It's also true  that tiny incremental improvements  

122:33

in this are just worth so much. It's the main input into the podcast. 

122:39

It's great that the bookshelves are fancy  and I've got a blackboard or whatever,  

122:43

but really the thing that makes the podcast  better is if I can improve the learning I do. 

122:47

So yes, it's worth every morsel of improvement. All right, thanks for the therapy session. Great  

122:58

note to end on. Thanks, Michael. All right. Thanks, Dwarkesh.

Interactive Summary

This video is a discussion about how scientific progress is recognized and achieved, exploring the complexities beyond the simplistic view of the scientific method. It delves into historical examples like the Michelson-Morley experiment and Einstein's theory of relativity, highlighting how scientific understanding evolves through a nuanced process of refinement and reinterpretation rather than straightforward falsification. The conversation also touches upon the nature of scientific theories, the role of intuition and aesthetics, and the challenges of automating scientific discovery with AI. It further explores how scientific progress is not always linear, with new fields emerging and challenging existing paradigms, and discusses the idea of a larger, unexplored 'tech tree' of knowledge. The discussion touches upon the idea of 'gains from trade' between different civilizations and the importance of diverse research programs. Finally, it examines the personal and societal aspects of scientific work, including the balance between routine and high-variance tasks, the value of demanding contexts for learning, and the potential of AI to accelerate discovery.
