
Are LLMs a Dead-End? (Investors Just Bet $1 Billion on “Yes”) | AI Reality Check | Cal Newport


Transcript


0:00

We've been told time and again that the

0:03

massive large language models trained by

0:05

companies like OpenAI and Anthropic are

0:08

poised to utterly transform our world.

0:11

We've been told that huge percentages of

0:13

existing jobs are soon to be automated.

0:16

We've been told that skills like writing

0:18

and photography and film making are all

0:21

about to be outsourced. And we've been told

0:23

if we're not careful, the systems built

0:25

on these models might someday soon

0:27

become sentient and even threaten the

0:30

existence of the human race. But here's

0:34

the thing. One of the AI pioneers who

0:37

helped usher in this current age is not

0:40

convinced. His name is Yan Lakun and

0:43

he has long been arguing that not only

0:45

will LLM-based AI fail to deliver all of

0:48

these disruptions, but that it is, and

0:50

I'm quoting him here, a technological

0:53

dead end.

0:56

People have started to listen. Earlier

0:59

this month, a syndicate of investors,

1:01

including Jeff Bezos and Mark Cuban,

1:03

along with a bunch of different VC

1:04

firms, raised over a billion dollars to

1:07

fund LeCun's new startup, Advanced

1:09

Machine Intelligence Labs, which seeks

1:12

to build an alternative path to true AI,

1:15

one that avoids LLMs all together.

1:20

After all of the hype and stress and

1:22

hand-wringing around LLM-based tools like

1:24

ChatGPT and Claude Code, is it

1:26

possible that Yann LeCun was right that

1:29

those specific types of tools won't

1:31

change everything? And if so, what's

1:35

going to come next? If you've been

1:37

following AI news recently, you've

1:39

probably been asking these questions.

1:40

And today, we're going to seek some

1:43

measured answers.

1:45

I'm Cal Newport and this is the AI

1:48

reality check.

1:52

Okay, so here's the plan. I've broken

1:54

down this discussion into three sub

1:57

questions. Sub question number one, what

2:00

exactly is Yann LeCun up to and how does

2:02

this differ from what the existing major

2:04

AI companies are doing? Sub question

2:06

number two. How is it possible that he

2:10

could be right about LLMs running out of

2:12

steam if everything we've been hearing

2:14

recently from tech CEOs and news media

2:15

is about how fast LLMs are advancing and

2:18

how this tech is about to change

2:19

everything? And number three,

2:22

if Lun is right, what should we expect

2:24

to happen in the next few years? And

2:26

what should we expect to happen in

2:28

the maybe decade time span? All right,

2:31

so that's our game plan here. It's going

2:33

to get a little technical. I'm going to

2:34

put on my computer science hat, but I'll

2:35

try to keep things simple, which really

2:38

is the worst of both worlds because it

2:39

means that the technical people will say

2:41

I'm oversimplifying and the nontechnical

2:43

people will say it still doesn't make

2:44

sense. So, I'm going to do my best here

2:45

to walk this high wire act. Let's get

2:47

started with our first sub question.

2:50

What is Yann LeCun up to? All right. Well,

2:52

let's just start with the basics. Um, I

2:54

want to read a couple quotes here from a

2:56

recent article that Cade Metz wrote for

2:57

the New York Times discussing what just

3:00

happened with LeCun's new company. All

3:02

right. So, I'm quoting here. LeCun's

3:05

startup Advanced Machine Intelligence

3:07

Labs or AMI Labs has raised over $1

3:10

billion in seed funding from investors

3:12

in the United States, Europe, and Asia.

3:15

Although AMI Labs is only a month old

3:17

and employs only 12 people, this funding

3:19

round values the company at $3.5

3:22

billion.

3:24

Dr. LeCun, who's 65, was one of the three

3:26

pioneering researchers who received the

3:28

Turing Award often called the Nobel

3:31

Prize in computing for their work on the

3:33

technology that is now the foundation of

3:35

modern AI. Dr. LeCun has long argued that

3:38

LLMs are not a path to truly intelligent

3:41

machines. The problem with LLMs, he

3:44

said, is that they do not plan ahead.

3:46

Trained solely on digital data, they do

3:48

not have a way of understanding the

3:49

complexities of the real world. quote,

3:52

"If you try to take robots into open

3:53

environments, into households, or into

3:55

the street, they will not be useful with

3:57

current technology." End quote. Uh, Mr.

3:59

LeCun, who's the CEO of AMI Labs, told

4:01

the Times, "We want to help them react

4:04

to new situations

4:06

with more common sense." All right, so

4:08

that's kind of a high-level summary of

4:10

what's going on. Let's get in the weeds

4:11

here to really get into the technical

4:12

details of what LeCun is saying and how

4:14

it differs, how his vision differs from

4:16

what the major existing frontier AI

4:19

companies are actually doing. All right,

4:21

let's start with a basic idea here. If

4:23

you're an AI company, you're trying to

4:26

build artificial intelligence-based

4:28

systems that help people do useful

4:29

things. This could be like by asking

4:31

them questions with a chatbot or having

4:33

the system help you produce computer

4:35

code if we're talking about coding

4:36

agents. At the core of all these

4:38

products needs to be some sort of what

4:39

we can call digital brain, something

4:42

that encapsulates the core of the

4:45

artificial intelligence that your tool

4:47

or system is leveraging.

4:50

So the major AI companies like OpenAI

4:53

and Anthropic have a different strategy

4:56

for creating those underlying digital

4:58

brains than Yann LeCun's new company has.

5:03

All right. So what are the existing AI

5:04

companies doing? They're all in on the

5:07

idea that the digital brain behind these

5:09

AI products should be a large language

5:12

model. Now, we've talked about this

5:14

before. You've heard this before, so

5:15

I'll go quick, but it's worth

5:16

reiterating.

5:18

A large language model is an AI system

5:21

that takes this input text and it

5:24

outputs a prediction of what word or

5:27

part of a word should follow. So if we

5:29

want to be sort of anthropomorphic here,

5:31

what it's trying to do is that it

5:33

assumes the text it has as input is a

5:36

real pre-existing text and that what

5:39

it's trying to do is correctly guess

5:41

what followed that text in the actual

5:43

pre-existing text. That's

5:45

really what a language model does.

5:48

So if you call it a bunch of times, so

5:50

you give it input, you get a word or

5:52

part of the word as output. You then

5:54

append that to your input and now put

5:55

the slightly longer input into the

5:57

language model, you get another word or

6:00

part of a word. And if you add that to

6:01

the input and put that through the

6:02

model, you slowly expand the input into

6:05

a longer answer. This is called

6:08

autoregressive text production: you

6:09

keep taking the output and putting it

6:11

back into the input until the model

6:13

finally says, uh, I'm done. And then you

6:15

have your response. So we can think

6:17

about it. Then if we zoom out a little

6:19

bit, the large language model takes text

6:22

as input and then expands whatever story

6:24

you told it to try to finish it in a way

6:26

that it feels is reasonable.
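To make that append-and-repeat loop concrete, here is a minimal sketch in Python. The `model` object and its `predict_next_token` method, and the stop token, are hypothetical stand-ins for illustration, not any particular vendor's API; the point is only the structure of autoregressive generation described above.

```python
# Minimal sketch of autoregressive generation (hypothetical model interface).
def generate(model, prompt_tokens, max_new_tokens=100, stop_token="<end>"):
    tokens = list(prompt_tokens)                       # start from the text we were given
    for _ in range(max_new_tokens):
        next_token = model.predict_next_token(tokens)  # guess one word or word-piece
        if next_token == stop_token:                   # the model signals "I'm done"
            break
        tokens.append(next_token)                      # feed the output back in as input
    return tokens
```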

6:29

Under the hood, they look something like

6:30

this. Jesse, can we bring this up on the

6:32

screen here? Um, this is like a typical

6:34

architecture for a large language model.

6:36

You have input like here it says to

6:38

cats. You that gets broken into tokens.

6:41

Those get embedded into some sort of

6:42

mathematical semantic space. Don't worry

6:44

about that. They then go through a bunch

6:45

of transformer layers. Uh each layer has

6:48

two sub-layers: an attention sub-layer

6:50

and a feed forward neural network. And

6:52

out of the end of those layers comes

6:53

some information that goes into an

6:55

output head that selects what word or

6:56

part of a word to output next. So that's

6:58

kind of a linear structure that

7:01

is the architecture of a large

7:03

language model.
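Here is a rough numpy sketch of that linear structure: tokens become vectors, pass through a stack of layers (attention sub-layer plus feed-forward sub-layer, each with a skip connection), and an output head produces a distribution over the next token. This is a simplified illustration, assuming made-up weight names and omitting normalization, multi-head attention, and other real-architecture details.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(x, Wq, Wk, Wv):
    # Attention sub-layer: each token position mixes in information
    # from the other positions in the input.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    weights = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return weights @ v

def feed_forward(x, W1, W2):
    # Feed-forward sub-layer: the part where much of the "knowledge"
    # ends up being stored after training.
    return np.maximum(0.0, x @ W1) @ W2

def llm_forward(token_ids, embed, layers, W_out):
    x = embed[token_ids]                 # tokens -> vectors in a semantic space
    for layer in layers:                 # a stack of transformer layers
        x = x + attention(x, layer["Wq"], layer["Wk"], layer["Wv"])
        x = x + feed_forward(x, layer["W1"], layer["W2"])
    logits = x[-1] @ W_out               # output head looks at the last position
    return softmax(logits)               # probability of each possible next token
```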

7:05

So the way you train a large language model is you give it lots

7:07

of real existing text and what you do is

7:10

you knock words out of that text. You

7:12

have it try to predict the missing word

7:14

and then you correct it to try to uh

7:16

make it a little bit more accurate. If

7:18

you do this long enough on a big enough

7:19

network with enough words, this process

7:22

which is called pre-training produces

7:24

language models that are really good at

7:27

predicting missing words. And to get

7:29

really good at predicting missing words,

7:31

they end up encoding into those

7:34

feed forward neural network layers

7:36

within their architecture lots of

7:39

knowledge about the world sort of uh how

7:42

things work, different types of tones.

7:44

They get really good pattern

7:45

recognizers, really good rules. So,

7:46

sort of implicitly and

7:49

emergently, within the

7:51

feed forward neural networks in the

7:53

language models, a lot of sort of smarts

7:55

and knowledge begins to emerge.
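To see what "knock words out and correct the model" looks like as an objective, here is a toy sketch of the next-word-prediction (cross-entropy) loss. The `model.next_token_probs` method is a hypothetical stand-in; real pre-training runs this over enormous corpora with gradient updates, which are omitted here.

```python
import math

def pretraining_loss(model, token_ids):
    # For each position, hide the next word, ask the model to predict it,
    # and penalize the model when it puts low probability on the true word.
    total = 0.0
    for t in range(1, len(token_ids)):
        context = token_ids[:t]                    # the text so far
        target = token_ids[t]                      # the "knocked out" next word
        probs = model.next_token_probs(context)    # model's guess, a distribution
        total += -math.log(probs[target] + 1e-12)  # cross-entropy at this position
    return total / (len(token_ids) - 1)            # training pushes this average down
```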

7:57

That's the basic idea with a large

7:59

language model. So the AI

8:01

companies' bet is that if these

8:03

things are large enough and we train

8:04

them long enough uh and then we do

8:07

enough sort of fine-tuning afterwards

8:08

with post-training

8:10

you can use a single massive large

8:12

language model as the digital brain for

8:14

many many different applications. Right?

8:17

So when you're talking with a chatbot,

8:19

it's referencing the

8:21

same large language model that your

8:23

coding agent might also be talking to to

8:25

help figure out what computer code to

8:27

produce. It'll be the same large

8:29

language model that your OpenClaw

8:31

personal assistant agent is also

8:33

accessing. So it's all about one HAL

8:35

9000-style massive model, a massive large

8:39

language model that is so smart you can

8:42

use it as a digital brain for anything

8:43

that people might want to do in the

8:45

economic sphere. That is the model of

8:46

companies like OpenAI and Anthropic.

8:50

All right. So what is Yann LeCun's AMI Labs

8:53

doing differently? Well, he doesn't

8:55

believe in this idea that having a

8:57

single large model that implicitly

9:00

learns how to do everything makes sense.

9:03

He thinks that's going to hit a uh dead

9:05

end. That's an incredibly inefficient

9:07

way uh to try to build intelligence. And

9:10

the intelligence you get is going to be

9:12

brittle because it's all implicit and

9:14

emergent. you're going to get

9:15

hallucinations or sort of odd flights of

9:18

responses that really don't make

9:19

sense in the real world. So what is his

9:21

alternative approach? Well, he says

9:24

instead of having just one large single

9:26

model, he wants to shift to what we

9:28

could call a modular architecture where

9:32

your digital brain has lots of different

9:35

modules in it that each specialize in

9:37

different things and that are all wired

9:39

together.

9:40

Let me show you what this might look

9:41

like. I'm going to bring on the screen

9:43

here a key paper that LeCun published in

9:46

2022 called "A Path Towards Autonomous

9:48

Machine Intelligence." This has most of

9:50

the ideas that are behind AMI labs. Um

9:52

this paper has this diagram here I have

9:54

on the screen. Uh it's an example of a

9:57

modular architecture. So he imagines an

9:59

AI digital brain now has multiple

10:02

modules including a world model which is

10:04

separate from an actor which is separate

10:07

from the critic which is separate from a

10:08

perception module which is separate from

10:10

short-term memory which is separate from

10:12

an overall configurator that helps move

10:14

information between each of these

10:15

different modules. So you might have for

10:17

example the perception module makes

10:19

sense of input it's getting maybe

10:21

through text or through cameras if it's

10:22

a robot. It passes that to an actor

10:24

which is going to propose like here's

10:26

what we should do next. But then the

10:27

critic is going to analyze the different

10:29

options using the world model which has

10:32

a model of how the relevant world works

10:34

to try to figure out which of these

10:35

options is best pulling from short-term

10:37

memory. Then the actor can choose the

10:39

best of those options which then gets

10:40

executed. So it's much more of a design where we

10:42

have different pieces that do different

10:45

things.
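Here is a very loose structural sketch, in Python, of the kind of wiring that diagram implies. The module names follow the paper's figure, but everything inside them is a placeholder for illustration, not LeCun's actual design.

```python
# Loose sketch of a modular agent in the spirit of LeCun's 2022 diagram.
# All internals are hypothetical stand-ins.
class ModularAgent:
    def __init__(self, perception, world_model, actor, critic, memory):
        self.perception = perception    # turns raw input (text, camera) into a state
        self.world_model = world_model  # predicts how the world evolves under an action
        self.actor = actor              # proposes candidate actions
        self.critic = critic            # scores predicted outcomes
        self.memory = memory            # short-term memory of recent states

    def act(self, raw_input):
        state = self.perception.encode(raw_input)
        self.memory.store(state)
        candidates = self.actor.propose(state)
        # Evaluate each candidate by imagining its outcome with the world model,
        # then letting the critic score that imagined outcome.
        scored = []
        for action in candidates:
            predicted = self.world_model.predict(state, action, self.memory.recent())
            scored.append((self.critic.score(predicted), action))
        best_score, best_action = max(scored, key=lambda pair: pair[0])
        return best_action
```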

10:47

Now, another piece of the LeCun vision is that you can train

10:50

different modules within a modular

10:52

architecture differently. Again, in a

10:54

language model, there's like one way you

10:56

train the whole model and all the

10:57

intelligence implicitly emerges. In

10:59

LeCun's architecture, he says, "Well,

11:00

wait a second. Train each module

11:04

in whatever way makes

11:06

the most sense for what that module

11:07

does." So, like the perception module,

11:09

let's say it's making sense of the world

11:11

through cameras. Well, there we want to

11:13

use a sort of uh vision network that's

11:15

trained with sort of like classic deep

11:17

learning vision recognition of the type

11:19

that, you know, LeCun actually helped

11:21

pioneer back in the '90s and early 2000s.

11:24

But then the world model which is trying

11:26

to build an understanding of how the

11:27

world works, he's like oh we would train

11:29

that very differently. In fact, he has a

11:32

particular technique. So if you've heard

11:33

of JEPA, joint embedding predictive

11:35

architecture,

11:37

this is a new training technique that

11:39

LeCun came up with for training a world

11:41

model where at a very high level he says

11:43

here's the right way to do that. Don't

11:45

train a model that tries to understand

11:47

how a particular domain works. Don't

11:51

just train it with the low-level data

11:53

like the actual raw words from a book or

11:56

raw images from a camera. What you want

11:59

to do is take these real world

12:01

experiences and convert them all to

12:03

high-level representations and train

12:05

them on the high-level representations. So

12:07

like I'm simplifying here a lot. Let's

12:10

say you have as input a picture of a

12:12

baseball about to hit a window and then

12:14

a subsequent picture where the window is

12:16

broken. You don't want to train a world

12:19

model he argues just on those pictures.

12:20

Like if I see a picture like this, the

12:23

picture that would follow is one where

12:25

the glass is broken. That's how maybe

12:26

something like a standard LLM-style

12:30

generative picture generator might work.

12:32

He's like instead take both pictures and

12:34

have a high-level representation. So it's

12:36

like a mathematical encoding of like a

12:39

baseball is getting near a window. Like

12:40

what actually matters? What are the key

12:42

factors of this picture and then the

12:44

next picture is the window breaks. And

12:46

what you really want to teach the model

12:47

is when it has this high-level setup, a

12:50

baseball's about to hit the window. It

12:52

learns that leads to the window

12:53

breaking. So it's not stuck in

12:55

particular inputs but learning causal

12:58

rules about how the relevant domain

13:00

works. And anyways there's a lot of

13:01

other ideas like this. The critic and

13:03

actor come out of the RL, reinforcement

13:05

learning, world and are sort of well

13:07

known: you've trained one network

13:09

with rewards and another one to propose

13:11

actions. And so there's a lot of

13:12

different ideas coming together here.
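As one hedged illustration of the "joint embedding predictive" idea in the baseball example above, here is a toy sketch: both observations are encoded into high-level representations, the model predicts the next representation, and the error is measured in representation space rather than on raw pixels. The `encoder` and `predictor` are hypothetical callables; real JEPA training has extra machinery (for example, measures to avoid representation collapse) that is omitted here.

```python
import numpy as np

def jepa_style_loss(encoder, predictor, frame_before, frame_after):
    # Encode both observations into abstract representations
    # (e.g. "a baseball is approaching a window"), discarding
    # pixel-level detail that doesn't matter.
    z_before = encoder(frame_before)
    z_after = encoder(frame_after)
    # Predict the *next representation* from the current one, and score the
    # error in representation space, not in raw-image space.
    z_predicted = predictor(z_before)
    return float(np.mean((z_predicted - z_after) ** 2))
```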

13:15

The third piece of LeCun's vision

13:17

that differs from the big AI companies

13:20

is he doesn't believe in having just one

13:23

system that you train once and is then

13:27

the digital brain for all the different

13:29

types of things you should do. He says

13:31

this architecture is the right

13:32

architecture for everything. But you

13:35

train different systems for different

13:37

domains. So if I want a digital brain

13:39

that we can build computer programming

13:42

agent tools on, I'm going to take one of

13:44

my systems with its world model and

13:46

perception and actor and critics and I'm

13:48

going to train it specifically for the

13:50

domain of producing computer programs

13:53

and then all my computer programming

13:55

agents that people are building will use

13:56

that particular system. But if I want to

13:58

do uh help with call centers or

14:00

whatever, I might completely train a

14:02

different version of the system just to

14:05

be really good at call centers. So we

14:07

don't have just one massive HAL 9000

14:09

that everything uses which is the OpenAI

14:11

plan or the Anthropic plan. We custom

14:14

train systems that maybe all use the

14:16

same general architecture but we train

14:18

them from scratch for different types of

14:20

domains. You're going to get much better

14:22

performance out of it.
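One way to picture "same architecture, trained separately per domain" is the contrast below. All names here are hypothetical and purely illustrative: the hyperscaler plan routes every task to one giant model, while the modular plan keeps a registry of domain-specific systems that share an architecture but are trained on different data.

```python
# Illustrative contrast only (hypothetical names, no real training shown).
def hyperscaler_brain(task, giant_llm):
    return giant_llm(task)               # one massive model for everything

DOMAIN_SYSTEMS = {}                      # e.g. "programming" -> its own trained system,
                                         #      "call_center" -> another one

def modular_brain(task, domain):
    system = DOMAIN_SYSTEMS[domain]      # pick the system trained for this domain
    return system(task)
```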

14:24

All right. So that is Yann LeCun's vision, and he says

14:28

this is how you're going to get uh much

14:31

more reliable and smart and useful

14:34

activity out of AI. this idea that we're

14:36

just going to train like a massive model

14:37

that can do everything based off of just

14:39

text. He's like, "Come on, this makes no

14:40

sense. That can't possibly be the best,

14:43

most efficient route towards actually

14:44

having smarter AI." All right, so that

14:47

is the key tension between the existing

14:49

AI companies and Yann LeCun's idea. This

14:52

brings us to our second sub question.

14:54

How is it possible that LeCun

14:57

could be right that LLMs are a dead end

14:59

if we've been hearing nonstop in recent

15:02

months about how these LLM based

15:03

companies are about to destroy the

15:05

economy and change everything? How could

15:07

we be so wrong?

15:09

LeCun is not surprised by that. I think

15:12

there if we asked him, I'll simulate

15:14

LeCun. If we asked him, he would say the

15:16

short answer to that question is, look,

15:19

a lot of coverage of LLMs recently has

15:21

been a mixture of hype and confusing the

15:26

specific LLM strategies of the frontier

15:28

companies with the idea and

15:29

possibilities of AI more generally and

15:30

kind of mixing those things together,

15:34

which is fine if you're Sam Altman or

15:36

Dario Amodei; that's great for you

15:37

because you need investment, but it's

15:40

probably not the most accurate way to

15:41

think about it. Now, if we ask LeCun in

15:43

this hypothetical to give a longer

15:44

explanation about how we could be so

15:46

wrong about LLMs, he would probably say,

15:49

"Okay, let me let me explain to you the

15:52

trajectory of the LLM technology in

15:55

three stages." And I think this will

15:57

clarify a lot. All right. So, the first

15:59

stage was the pre-training scaling

16:02

stage. And this is the stage where the

16:05

the AI companies kept increasing the

16:08

size of the LLM. So how big those layers

16:11

are inside of them, the size of the

16:13

LLMs, the amount of data they trained

16:15

them on, and how long they trained them.

16:17

And there was a period starting in 2020

16:20

and lasting until 2024

16:23

where making the model bigger and

16:25

training them longer demonstrably and

16:27

unambiguously increased their

16:29

capabilities. This petered out after

16:32

about GPT-4 at OpenAI.

16:37

um we have evidence that XAI had the

16:39

same issue. We have evidence that Meta

16:40

had the same issue. When they continued

16:42

to make their models bigger, they

16:44

stopped getting those big performance

16:46

jumps. So, they couldn't just scale them

16:48

to be more capable. This led to stage

16:50

two, which I think of as starting in the

16:53

summer of 2024, which is where they

16:56

shifted their attention to post

16:57

training. So now like we can't make the

17:00

underlying smarts of these LLMs

17:03

um better by making them bigger,

17:05

training them longer. So what we need to

17:06

do is try to get more useful stuff out

17:10

of these existing pre-trained LLMs. And

17:12

so the first approach they came up with

17:14

and we we saw this with the the alphabet

17:16

soup of models that were released

17:17

starting in the fall of 2024: o1, o3, Nano

17:20

Banana, all these types of names. The

17:23

first approach they tried was um telling

17:26

the models to think out loud. So instead

17:28

of uh just directly producing a

17:30

response, they post-train the models to

17:33

be like actually explain your thinking

17:36

and it was sort of a way because

17:37

remember it's auto reggressive. So as

17:38

the model sort of explains its thinking

17:40

that's always going back as input into

17:42

the model and it gives it more to work

17:43

off of in reaching an answer. So it

17:46

turned out if you had the model think

17:47

out loud you got slightly better on

17:49

certain types of benchmarks. So these

17:51

were the so-called reasoning models. Um,

17:52

but it was a bit of a wash because this

17:54

also made it more expensive to use the

17:56

models because it burned a lot more

17:57

tokens, since it

17:59

produced a lot more tokens to get to the

18:01

answer you cared about. So it did better

18:04

but it was unclear like how much of that

18:05

we actually want to turn on for users.
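As a small illustration of the "think out loud" trick and why it burns more tokens, here is a sketch in which reasoning text is generated first and then appended to the context before the final answer. The `model.generate` method is a hypothetical stand-in for the autoregressive loop sketched earlier; the exact prompt wording is made up.

```python
# Sketch of a reasoning-style call: reasoning tokens feed back into the
# context before the final answer, improving some answers at extra cost.
def answer_with_reasoning(model, question):
    reasoning = model.generate(question + "\nLet's think step by step.\n")
    final_answer = model.generate(question + "\n" + reasoning + "\nAnswer:")
    extra_tokens = len(reasoning.split())   # rough count of the added "thinking" cost
    return final_answer, extra_tokens
```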

18:07

Um, the second approach they used in the

18:09

second stage was post training. So now

18:12

if you have for example a lot of

18:15

examples of a particular type of

18:17

question prompts correct answers prompts

18:19

correct answers you could use those

18:21

combined with techniques out of

18:22

reinforcement learning to nudge the

18:24

existing pre-trained model to be better

18:26

on those type of tasks. So, we entered

18:28

this stage, stage two, the sort of

18:30

post-training stage where because we

18:33

couldn't make these uh LLM brains

18:36

fundamentally smarter, we wanted to try

18:39

to tune them to get more performance out

18:41

of them uh on particular types of task.
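A toy sketch of that kind of nudging is below: sample an answer, reward it when it matches a known correct answer, and push the model toward rewarded behavior. The `model.sample_answer` and `model.reinforce` methods are hypothetical stand-ins for a real reinforcement-learning fine-tuning setup, which involves much more machinery than shown here.

```python
# Toy sketch of reward-based post-training on prompt/correct-answer pairs.
def post_train(model, examples, epochs=1):
    for _ in range(epochs):
        for prompt, correct_answer in examples:
            answer = model.sample_answer(prompt)             # try the task
            reward = 1.0 if answer == correct_answer else 0.0
            model.reinforce(prompt, answer, reward)          # nudge toward what scored well
    return model
```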

18:43

This is when we began to see less of

18:45

just, hey, try this model and it's going

18:47

to blow your socks off, and we

18:49

instead got lots of charts of inscrutable

18:52

benchmarks. Look, the chart is going

18:54

up on this alphabet soup benchmark

18:56

because, you know, you could post train

18:58

for particular benchmarks. It was less

19:01

obvious in a lot of use cases for the

19:02

regular user that like well the

19:04

underlying smarts seem to be the same.

19:06

We then entered a stage three. I think

19:08

this started in the fall of 2025

19:11

where the LLM companies said really the

19:14

big gains going forward are in the

19:17

applications that use the LLMs. Let's

19:19

make these applications smarter. So it's

19:22

not just how capable the LLM is. It's

19:24

like how capable are the programs that

19:26

are prompting the LLM. Let's make those

19:28

smarter. So we saw a lot of this effort

19:31

going into the programs that are called

19:32

coding agents that help computer

19:34

programmers edit and produce and plan

19:37

computer code. Now, these types of agents

19:39

had been around for many years, but a

19:42

lot of the AI companies got really serious

19:44

about them last year, especially coming

19:46

into the fall of last year, asking how do we

19:49

make the programs better. They weren't

19:51

really changing the LLMs much; they did

19:53

some fine-tuning for programming, but

19:56

really the big breakthroughs in coding

19:57

agents were in the programs that call

19:59

the LLMs, and they figured out how can we

20:01

make these coding agents capable of

20:03

working with enterprise code bases so uh

20:05

not just for individuals vibe coding web

20:07

apps but something you could use if

20:09

you're a professional programmer in a

20:10

big company. All of that's tool

20:12

improvements.

20:14

Making sure that you're able to send

20:16

better prompts to the LLM. When you hear

20:18

about things like skill files and um

20:21

managing like hierarchies of agents,

20:23

this is all improvements in the programs

20:24

that use the LLM; none of this

20:27

is breakthroughs in the digital brain

20:29

itself. And so this is the stage that we

20:30

are in now is we're spending a lot more

20:33

time building smarter programs that sit

20:37

between us and the LLMs that they're

20:38

querying as their digital brain so that

20:40

in very particular domains it

20:42

is more useful.
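To illustrate "the smarts live in the program that calls the LLM," here is a sketch of such a wrapper: it assembles a domain-specific prompt from project context and skill notes, calls a generic completion function, and does a trivial check on the result. The `call_llm` function and the details of context selection are hypothetical, and real coding agents are far more elaborate.

```python
# Sketch of a "program that sits between us and the LLM": the improvements
# live here (prompt assembly, context selection, checking), not in the model.
def coding_agent_step(call_llm, task, repo_files, skill_notes):
    context = "\n".join(repo_files[:5])          # pick a slice of the code base
    prompt = (
        "You are a coding assistant.\n"
        f"Project conventions:\n{skill_notes}\n"
        f"Relevant files:\n{context}\n"
        f"Task: {task}\n"
    )
    draft = call_llm(prompt)                     # the digital brain is queried here
    if "TODO" in draft:                          # trivial stand-in for a real check/test run
        draft = call_llm(prompt + "\nFinish all TODOs before answering.")
    return draft
```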

20:44

So this all tells us right this is like

20:46

what LeCun would tell you, right, I'm

20:47

channeling LeCun, he would say once you

20:50

understand this reality you see that

20:52

this impression that LLM based AI has

20:54

been on this super fast like upper

20:56

trajectory of lots of fast advances is

20:59

pretty illusory

21:01

the fundamental improvements in the

21:02

underlying brain stopped a couple years

21:04

ago

21:06

what we saw was then a period of lots of

21:08

bragging about benchmarks doing

21:10

better but this was all about post-

21:12

training and now for the last four

21:14

months like all these improvements we've

21:16

been hearing is about the programs that

21:17

use the LLMs are being made smarter and

21:20

they're better fitting particular use

21:22

cases but there really hasn't been major

21:25

fundamental uh improvements in the

21:28

underlying smartness of the digital

21:30

brains which is why all the problems

21:31

like hallucinations and unreliability

21:33

persist. the brains are actually

21:35

incrementally improving either in narrow

21:38

areas um or in narrow ways. And it's

21:41

what we're building on top of them. This

21:42

creating an illusion of increasing

21:45

trajectory of artificial intelligence

21:46

when in reality we might just be in a

21:48

very long-tail stage of: now we're going

21:50

to do product market fit and actually

21:51

do the work of building more

21:53

useful products on top of a mature

21:55

digital brain technology that's only

21:57

advancing at a very slow rate. That

21:58

would be LeCun's argument. Uh, therefore we

22:02

will find some good fits, but this is

22:03

not a technology that's on a trajectory

22:05

where it's going to be able to make

22:06

massive leaps in what it's actually able

22:09

to do.

22:11

All right. Um

22:13

so there you go. That would be the

22:15

argument for how we could have gotten

22:18

LLM progress so wrong. All right. Sub

22:21

question number three.

22:23

Let's follow through this thought

22:24

experiment. What would happen if LeCun is

22:28

right about that? What would we

22:29

expect then to happen in the near

22:32

future? Well, let's start with the the

22:34

window of the next one to three years.

22:36

If he is right, we would see a long tail

22:40

of applications based on existing LLMs

22:43

to begin to fill in. So, computer coding

22:45

agents have gotten more useful. We will

22:47

see other use cases like that that don't

22:50

exist now, but where people are really

22:51

experimenting to try to figure out

22:53

applications that are going to uh work

22:56

in other types of fields. So there'll be

22:58

sort of claude code moments in other

23:00

fields which I think will be useful and

23:02

exciting. Um the tool sets used in many

23:06

jobs will change but because we're now

23:09

just trying to like find areas where we

23:11

can build useful applications on top of

23:13

existing LLMs. these doomsday scenarios

23:16

like we've been talking about on this

23:17

AI reality check recently where

23:19

knowledge workers are going to have to

23:21

become pet masseuses and then after

23:23

that they're going to have to cook the

23:24

pets on garbage can fires because

23:26

there's no money left in the economy; none

23:28

of those scenarios would unfold

23:30

based on LLMs. In this current vision,

23:33

there would be a big economic hit

23:34

because, if we've

23:36

shifted our attention to building better

23:38

applications on top of the LLM what

23:40

we're going to see is a lot more

23:41

companies get into that game and they're

23:43

going to say, "I don't want to pay for a

23:46

cutting edge frontier hyperscaled LLM.

23:50

It's too expensive. Let's look at

23:52

cheaper LLMs. Let's look at open source

23:53

LLMs. Let's look at LLMs that can fit on

23:56

chip."

23:58

We saw this already with the OpenClaw

24:00

framework, which allowed people to build

24:02

their own custom applications that use

24:04

LLMs to do personal assistant type

24:06

roles. And right away, people are like,

24:08

I don't want to pay all the money to use

24:09

Claude or GPT. And you saw an explosion

24:13

of interest in on-chip machines and open

24:14

source machines. All this is going to be

24:16

I think good news for the consumers.

24:18

That means we could have more people

24:19

building these applications. There'll be

24:21

more variety of these applications and

24:22

they'll be cheaper. It's bad news for

24:24

the stock market because we've invested

24:26

depending on who you ask somewhere

24:27

between 400 to 600 billion dollars into

24:30

these LLM hyperscalers like OpenAI and

24:33

Anthropic.

24:34

That market's not going to support it.

24:36

So there's going to be a big crash. This

24:38

will, if

24:40

this vision is correct,

24:41

temporarily slow down AI progress

24:43

because investors are going to feel

24:46

burnt. All right. What's going to happen

24:48

now if we zoom out to like a a 3 to 10

24:50

year range? Um, that's roughly the range

24:53

in which the modular architecture

24:55

approach that LeCun is talking about

24:56

would reach maturity. That's what their

24:58

current CEO is saying. Again, it's

25:00

a research company now and they said

25:02

it'll be several years until we really

25:03

get the products that are ready for

25:04

market. If LeCun is right, what we're

25:07

going to see is domain by domain, you're

25:10

going to have these uh very bespoke

25:14

trained, domain-specific modular

25:16

architecture systems which if he's right

25:19

are going to be way more reliable and

25:21

more smart in the sense of like they do

25:23

the thing I asked them in a way that's

25:26

good and as good as like uh some of my

25:28

human employees and in a way that I can

25:30

actually trust. We're going to see a lot

25:32

more of that. What's promised with LLMs

25:34

we're going to see instead on that

25:36

3 to 10 year basis, if LeCun is right.

25:40

Because they're based on this modular

25:43

architecture, I think these systems

25:44

will, you know, be more reliable

25:47

um they're also going to be easier to

25:48

align

25:50

LLMs are so obfuscated. It's just like,

25:52

here's 600 billion parameters in this

25:54

big box that we trained for a month on

25:56

all the text on the internet let's just

25:58

see what it does modular architectures

26:00

are way more alignable. Like, you have

26:02

literally a critic module in there that

26:04

evaluates plans based on both a world

26:07

model and some sort of hard-coded value

26:08

system to say which of these do I like

26:10

better. And you could just go in there

26:10

and hardcode: don't do these types

26:13

of plans; you know, really give a low

26:16

score to plans that lead to, whatever,

26:19

a lot of variability in outcome,

26:21

or something like that. You have more

26:23

direct knobs to turn, so it does make

26:25

alignment easier.
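As a purely illustrative sketch of that kind of "direct knob," here is a critic that combines a learned preference score with a hard-coded veto list. Everything in it (the tag names, the assumption that the world model attaches tags to predicted outcomes) is hypothetical; it only shows the shape of the idea that a modular design exposes a place to write rules down.

```python
# Illustrative only: a critic that mixes a learned score with hard-coded rules.
FORBIDDEN_TAGS = {"irreversible", "high_outcome_variance"}

def critic_score(predicted_outcome, learned_score):
    # `predicted_outcome` is assumed to carry tags from the world model;
    # `learned_score` is whatever the trained critic thinks of it.
    if FORBIDDEN_TAGS & set(predicted_outcome.get("tags", [])):
        return float("-inf")          # hard-coded: never pick these plans
    return learned_score              # otherwise defer to the learned evaluation

def choose_plan(scored_plans):
    # scored_plans: list of (predicted_outcome, learned_score, plan) tuples
    best = max(scored_plans, key=lambda p: critic_score(p[0], p[1]))
    return best[2]
```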

26:27

They would also be more economically efficient because

26:29

when you have to train one

26:30

model long enough

26:32

that it can do everything, it has to be huge

26:34

and it takes a huge amount of energy.

26:36

But when you're training different

26:37

modules in a domain specific system,

26:39

these can be much smaller. I like to

26:41

point out the example of a DeepMind, a

26:43

Google DeepMind tool called Dreamer V3,

26:46

which can learn how to play video games

26:48

from scratch. It's famous for

26:50

figuring out how to find diamonds in

26:52

Minecraft. And it uses a modular

26:53

architecture um very similar to what

26:56

Lacun is proposing here. And we just

26:58

read a paper about it in my doctoral

26:59

seminar I'm teaching on super

27:00

intelligence right now. Dreamer V3, which

27:03

can play Minecraft well, better than

27:06

if you ask an LLM to do it, is

27:07

domain-specific and requires around 200

27:10

million parameters which is a factor of

27:14

10 or less than what you would get in a

27:16

standard LLM. It could be trained on a

27:17

single GPU chip and it could do this

27:20

domain way better than uh a frontier

27:23

language model which is significantly

27:24

larger and trained significantly more

27:26

exhaustively. So there would be some

27:28

advantages here. There would also be

27:29

some, there's a little bit of digital

27:31

ick around this world, because way

27:35

more so than LLMs again these domain

27:37

specific models might actually have more

27:39

of a displacement capability. So we'd

27:40

have to keep an eye on them. All right,

27:42

conclusion. What do I think is going to

27:46

happen here? Well, you know, I don't

27:48

know, right? It's possible that there's

27:51

more performance breakthroughs to get on

27:53

LLMs and we're going to get more useful

27:55

tools. Gun to the head, if I had to

27:58

predict,

27:59

you know, through my computer science

28:01

glasses, LeCun's modular architecture,

28:05

it feels like that has to be the right

28:06

answer.

28:08

I think this doubling down on LLMs

28:13

is something we're going to look back at like

28:14

an economic mistake. It was the first

28:16

really promising new AI technology uh

28:20

widespread AI technology built on top of

28:22

deep learning and it did cool things.

28:25

But instead of stepping back and like

28:26

okay what will this be good for and what

28:29

types of domains might we want different

28:31

models we said no let's just raise half

28:33

a trillion dollars and just go all in on

28:36

everything: text-based LLMs, which are

28:38

trained on text and are made to produce

28:40

text; all artificial intelligence will

28:42

run off of these things. I just think

28:43

when we zoom out on the 30-year scale,

28:45

we'll be like that was so naive. This

28:47

idea that like this was the only type of

28:49

model we need for artificial intelligence. It's

28:50

super inefficient for like 99% of the

28:53

domains we want to use. It's great for

28:54

text-based domains and computer

28:56

programming, kind of; the planning is a

28:58

little suspect, but the code production

29:00

is okay. But we're going to make all

29:02

intelligence based off just massive LLMs

29:04

and there'll be like four of them, like

29:06

four companies that have like massive

29:08

ones and that's it. That this can't be

29:10

the right way to do it. So my my

29:11

computer science instincts say modular

29:13

architecture it just makes so much more

29:15

sense: domain specificity, differential

29:18

training of modules

29:20

you have much more alignment capability

29:22

they're much more economically feasible

29:24

like it just feels to me like that

29:26

probably is going to be the right answer

29:29

which means we're going to have to have

29:30

some bumpiness in the stock market

29:32

because I don't think that if this is

29:33

true, the hyperscalers survive as is: either they

29:35

have to pivot to those quickly enough

29:36

before they run out of money or some of

29:38

them are going to go out of business

29:39

and the others are going to have to

29:40

collapse before they expand again. So I

29:44

think the modular architecture approach

29:46

will work better. I don't know if

29:46

LeCun's company's going to do it or not,

29:48

but I think that architecture it makes a

29:50

lot of sense to a lot of computer

29:51

scientists. Now, I hope they don't get

29:52

too much better, because I

29:56

can much more imagine a highly trained

29:59

modular architecture AI digital brain

30:02

creating justified ick than I can

30:05

building these Python agent programs

30:07

that access some sort of massive LLM

30:09

somewhere. All right. So, yes, we'll

30:11

know. I think within a year we'll begin

30:13

to get a sense of which of these

30:14

trajectories is actually true. Um, I of

30:18

course will do my best to keep you

30:19

posted here on the AI reality check. All

30:21

right, that's enough computer science

30:23

talk for one day. Hopefully that made

30:24

sense. Hopefully that's useful.

30:26

Be back soon with another one of these

30:28

checks. And until then, remember, take

30:31

AI seriously, but not everything that's

30:33

written about it. Hey, if you like this

30:34

video, I think you'll really like this

30:36

one as well. Check it out.

Interactive Summary

The video discusses the perspective of AI pioneer Yann LeCun, who argues that Large Language Models (LLMs) are a technological dead end. While LLMs like ChatGPT have gained significant attention and investment, LeCun believes they are inefficient and brittle, prone to hallucinations due to their lack of real-world understanding and planning capabilities. He proposes a modular architecture for AI, where specialized modules handle different tasks, trained in ways best suited for each module. This contrasts with the current industry trend of using massive, single LLMs for various applications. The video explores how LeCun's ideas differ from major AI companies, analyzes why the current hype around LLMs might be misleading, and speculates on future AI developments, suggesting that a shift towards domain-specific, modular systems could lead to more reliable, efficient, and aligned AI, potentially disrupting the market for current LLM hyperscalers.
