Are LLMs a Dead-End? (Investors Just Bet $1 Billion on “Yes”) | AI Reality Check | Cal Newport
We've been told time and again that the
massive large language models trained by
companies like OpenAI and Anthropic are
poised to utterly transform our world.
We've been told that huge percentages of
existing jobs are soon to be automated.
We've been told that skills like writing
and photography and film making are all
about to be outsourced. And we've been
told that if we're not careful, the
systems built on these models might
someday soon become sentient and even
threaten the existence of the human
race. But here's
the thing. One of the AI pioneers who
helped usher in this current age is not
convinced. His name is Yann LeCun, and
he has long argued that not only
will LLM based AI fail to deliver all of
these disruptions, but that it is, and
I'm quoting him here, a technological
dead end.
People have started to listen. Earlier
this month, a syndicate of investors,
including Jeff Bezos and Mark Cuban,
along with a bunch of different VC
firms, raised over a billion dollars to
fund LeCun's new startup, Advanced
Machine Intelligence Labs, which seeks
to build an alternative path to true AI,
one that avoids LLMs all together.
After all of the hype and stress and
hand-wringing around LLM-based tools
like ChatGPT and Claude Code, is it
possible that Yann LeCun was right that
those specific types of tools won't
change everything? And if so, what's
going to come next? If you've been
following AI news recently, you've
probably been asking these questions.
And today, we're going to seek some
measured answers.
I'm Cal Newport and this is the AI
reality check.
Okay, so here's the plan. I've broken
down this discussion into three sub
questions. Sub question number one, what
exactly is Yann LeCun up to, and how does
this differ from what the existing major
AI companies are doing? Sub question
number two: How is it possible that he
could be right about LLMs running out of
steam if everything we've been hearing
recently from tech CEOs and the news
media is about how fast LLMs are
advancing and how this technology is
about to change everything? And number three,
if LeCun is right, what should we expect
to happen in the next few years? And
what should we expect to happen on the
maybe decade time span? All right,
so that's our game plan here. It's going
to get a little technical. I'm going to
put on my computer science hat, but I'll
try to keep things simple, which really
is the worst of both worlds because it
means that the technical people will say
I'm oversimplifying and the nontechnical
people will say I still don't make
sense. So, I'm going to do my best here
to walk this high wire act. Let's get
started with our first sub question.
What is Yann LeCun up to? All right. Well,
let's just start with the basics. Um, I
want to read a couple quotes here from a
recent article that Cade Metz wrote for
the New York Times discussing what just
happened with LeCun's new company. All
right. So, I'm quoting here: LeCun's
startup Advanced Machine Intelligence
Labs or AMI Labs has raised over $1
billion in seed funding from investors
in the United States, Europe, and Asia.
Although AMI Labs is only a month old
and employs only 12 people, this funding
round values the company at $3.5
billion.
Dr. LeCun, who is 65, was one of the three
pioneering researchers who received the
Turing Award often called the Nobel
Prize in computing for their work on the
technology that is now the foundation of
modern AI. Dr. LeCun has long argued that
LLMs are not a path to truly intelligent
machines. The problem with LLMs, he
said, is that they do not plan ahead.
Trained solely on digital data, they do
not have a way of understanding the
complexities of the real world. quote,
"If you try to take robots into open
environments, into households, or into
the street, they will not be useful with
current technology." End quote. Uh, Mr.
Le Brun, who's the CEO of AMI Labs, told
the Times, "We want to help them reach
new situation, react to new situations
with more common sense." All right, so
that's kind of a a highle summary of
what's going on. Let's get in the weeds
here to really get into the technical
details of what LeCun is saying and how
his vision differs from
what the major existing frontier AI
companies are actually doing. All right,
let's start with a basic idea here. If
you're an AI company, you're trying to
build artificial intelligence-based
systems that help people do useful
things. This could be answering their
questions through a chatbot, or having
the system help them produce computer
code if we're talking about coding
agents. At the core of all these
products needs to be some sort of what
we can call digital brain, something
that encapsulates the core of the
artificial intelligence that your tool
or system is leveraging.
So the major AI companies like OpenAI
and Anthropic have a different strategy
for creating those underlying digital
brains than Yann LeCun's new company has.
All right. So what are the existing AI
companies doing? They're all in on the
idea that the digital brain behind these
AI products should be a large language
model. Now, we've talked about this
before. You've heard this before, so
I'll go quick, but it's worth
reiterating.
A large language model is an AI system
that takes input text and outputs a
prediction of what word, or part of a
word, should follow. If we want to be
sort of anthropomorphic here, the model
assumes the text it has as input is a
real pre-existing text, and it is
trying to correctly guess what word
followed that text in the actual
pre-existing source. That's really what
a language model does.
So you call it a bunch of times: you
give it input, and you get a word or
part of a word as output. You then
append that to your input, put the
slightly longer input back into the
language model, and you get another
word or part of a word. Add that to the
input, put it through the model again,
and you slowly expand the input into a
longer answer. This is called
autoregressive text production: you
keep taking the output and feeding it
back into the input until the model
finally says, I'm done, and then you
have your response. If we zoom out a
little bit, then, the large language
model takes text as input and expands
whatever story you gave it, trying to
finish it in a way that it feels is
reasonable.
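To make that loop concrete, here's a minimal Python sketch of autoregressive generation. The `predict_next_token` function is a toy stand-in for a real model (here just a lookup table), and the `END` token is a hypothetical stand-in for a model's stop signal; the point is the feedback loop, not the prediction.

```python
# A minimal sketch of autoregressive generation. predict_next_token is a toy
# stand-in for a real language model, not any company's actual API.

END = "<end>"  # special token the model emits when it decides it is done

def predict_next_token(tokens):
    # Stand-in for a real model: a tiny lookup table of continuations.
    continuations = {
        ("the", "cat"): "sat",
        ("the", "cat", "sat"): END,
    }
    return continuations.get(tuple(tokens), END)

def generate(prompt_tokens, max_tokens=50):
    tokens = list(prompt_tokens)
    for _ in range(max_tokens):
        next_token = predict_next_token(tokens)  # one "forward pass"
        if next_token == END:
            break
        tokens.append(next_token)  # the output becomes part of the input
    return tokens

print(generate(["the", "cat"]))  # the loop expands the prompt word by word
```

Each pass through the loop is one call to the model; the growing `tokens` list is exactly the "keep feeding the output back into the input" behavior described above.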
Under the hood, they look something like
this. Jesse, can we bring this up on
the screen here? This is a typical
architecture for a large language
model. You have input text; here it
says "two cats." That gets broken into
tokens, and those get embedded into
some sort of mathematical semantic
space; don't worry about that. They
then go through a bunch of transformer
layers. Each layer has two sublayers:
an attention sublayer and a
feed-forward neural network. Out of the
end of those layers comes some
information that goes into an output
head, which selects what word or part
of a word to output next. So that's the
architecture of a large language model:
this kind of linear structure. Now, the
way you train a
large language model is you give it
lots of real existing text, and you
knock words out of that text. You have
it try to predict the missing word, and
then you correct it to make it a little
more accurate. If you do this long
enough, on a big enough network, with
enough words, this process, which is
called pre-training, produces language
models that are really good at
predicting missing words. And to get
really good at predicting missing
words, they end up encoding into those
feed-forward neural network layers
within their architecture lots of
knowledge about the world: how things
work, different types of tones. They
become really good pattern recognizers.
Implicitly and emergently, within the
feed-forward neural networks in these
language models, a lot of smarts and
knowledge begins to emerge.
That's the basic idea with a large
language model. Now, the AI companies'
bet is that if these things are large
enough, and we train them long enough,
and then we do enough fine-tuning
afterwards with post-training, you can
use a single massive large language
model as the digital brain for many,
many different applications. So when
you're talking with a chatbot, it's
referencing the same large language
model that your coding agent might also
be talking to to help figure out what
computer code to produce. It'll be the
same large language model that your
OpenClaw personal assistant agent is
also accessing. So it's all about one
HAL 9000-style massive large language
model that is so smart you can use it
as a digital brain for anything that
people might want to do in the economic
sphere. That is the model of companies
like OpenAI and Anthropic.
All right. So what is Yann LeCun's AMI
Labs doing differently? Well, he
doesn't believe in this idea that
having a single large model that
implicitly learns how to do everything
makes sense. He thinks that's going to
hit a dead end: it's an incredibly
inefficient way to try to build
intelligence, and the intelligence you
get is going to be brittle, because
it's all implicit and emergent. You're
going to get hallucinations, or odd
flights of response that really don't
make sense in the real world. So what
is his alternative approach? Well, he
says, instead of having just one large
single model, shift to what we could
call a modular architecture, where your
digital brain has lots of different
modules in it that each specialize in
different things, and they're all wired
together.
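As a toy illustration of that modular idea, here's a Python sketch of a digital brain with separate perception, world-model, actor, and critic modules. Every module body here is a made-up placeholder, not LeCun's actual design; only the wiring, separate specialized pieces passing information to one another, reflects the proposal.

```python
# A highly simplified sketch of a modular digital brain. Module internals are
# hypothetical toys; the point is the separation of specialized components.

class Perception:
    def interpret(self, raw_input):
        # Turn raw sensor input into a structured state (toy version).
        return {"object": "baseball", "near": "window"}

class WorldModel:
    def predict(self, state, action):
        # Predict the outcome of taking `action` in `state` (toy rules).
        if action == "catch" and state["object"] == "baseball":
            return {"window": "intact"}
        return {"window": "broken"}

class Actor:
    def propose(self, state):
        return ["catch", "ignore"]  # candidate actions

class Critic:
    def score(self, outcome):
        return 1.0 if outcome.get("window") == "intact" else 0.0

def act(raw_input, perception, world_model, actor, critic):
    state = perception.interpret(raw_input)
    options = actor.propose(state)
    # The critic evaluates each option using the world model's predictions.
    return max(options, key=lambda a: critic.score(world_model.predict(state, a)))

best = act("camera frame", Perception(), WorldModel(), Actor(), Critic())
print(best)
```

Because each module is a distinct object with a distinct job, you can, in principle, swap one out or train it separately, which is exactly what a single monolithic language model does not let you do.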
Let me show you what this might look
like. I'm going to bring up on the
screen here a key paper that LeCun
published in 2022 called "A Path
Towards Autonomous Machine
Intelligence." This has most of the
ideas behind AMI Labs. This paper has
the diagram I have on the screen, an
example of a modular architecture. He
imagines an AI digital brain that now
has multiple modules, including a world
model, which is separate from an actor,
which is separate from a critic, which
is separate from a perception module,
which is separate from short-term
memory, which is separate from an
overall configurator that helps move
information between each of these
different modules. So you might have,
for example, the perception module make
sense of the input it's getting, maybe
through text, or through cameras if
it's a robot. It passes that to an
actor, which proposes options: here's
what we could do next. Then the critic
analyzes those different options using
the world model, which has a model of
how the relevant world works, pulling
from short-term memory, to figure out
which option is best. The actor can
then choose the best of those options,
which gets executed. So it's much more
of a "we have different pieces that do
different things" design. Now, another
piece of the Yann LeCun vision is that
you can train
different modules within the modular
architecture differently. Again, in a
language model, there's one way you
train the whole model, and all the
intelligence implicitly emerges. In
LeCun's architecture, he says, "Well,
wait a second. Train each module with
whatever way makes the most sense for
what that module does." So take the
perception module; let's say it's
making sense of the world through
cameras. Well, there we want to use a
vision network trained with classic
deep-learning vision recognition, of
the type that LeCun actually helped
pioneer back in the '90s and early
2000s.
But then the world model, which is
trying to build an understanding of how
the world works, he's like, oh, we
would train that very differently. In
fact, he has a particular technique. If
you've heard of JEPA, joint embedding
predictive architecture, this is a
training technique that LeCun came up
with for training a world model. At a
very high level, he says, here's the
right way to do that: when you train a
model that tries to understand how a
particular domain works, don't just
train it with low-level data, like the
actual raw words from a book or raw
images from a camera. What you want to
do is take these real-world experiences
and convert them all to high-level
representations, and train on the
high-level representations. I'm
simplifying here a lot, but let's say
you have as input a picture of a
baseball about to hit a window, and
then a subsequent picture where the
window is broken. You don't want to
train a world model, he argues, just on
those pictures, as in: if I see a
picture like this, the picture that
would follow is one where the glass is
broken. That's how something like a
standard LLM-style generative picture
generator might work. He's saying,
instead, take both pictures and build a
high-level representation: a
mathematical encoding of "a baseball is
getting near a window." What actually
matters? What are the key factors of
this picture? And in the next picture,
the window breaks. What you really want
to teach the model is that when it has
this high-level setup, a baseball about
to hit the window, that leads to the
window breaking. So it's not stuck on
particular inputs, but learning causal
rules about how the relevant domain
works. And there are other ideas like
this, too: the critic and actor come
out of the reinforcement learning (RL)
world and are well known there; you
train one network with rewards and
another to propose actions. So there
are a lot of different ideas coming
together here.
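Here's a deliberately tiny sketch of that JEPA-flavored idea: encode scenes into a few high-level factors, then learn to predict outcomes in that representation space rather than at the pixel level. The encoder, the "frames," and the linear predictor are all hypothetical stand-ins for illustration, nothing here is LeCun's actual architecture.

```python
# Toy sketch of learning in representation space: encode scenes into
# high-level factors, then train a predictor on those factors, not raw pixels.
# Encoder, frames, and predictor are made-up stand-ins for illustration.

def encode(frame):
    # Stand-in encoder: map a raw "frame" to a few high-level factors.
    return (
        1.0 if "baseball" in frame else 0.0,     # is a ball present?
        1.0 if "near window" in frame else 0.0,  # is it near the window?
    )

weights = [0.0, 0.0]  # a linear predictor over the high-level factors

def predict_broken(rep):
    return weights[0] * rep[0] + weights[1] * rep[1]

def train(frame_before, window_breaks, lr=0.5):
    rep = encode(frame_before)
    error = (1.0 if window_breaks else 0.0) - predict_broken(rep)
    for i in range(2):
        weights[i] += lr * error * rep[i]  # nudge toward the observed outcome

# Train on high-level descriptions of before/after pairs, not raw images.
for _ in range(20):
    train("baseball near window", window_breaks=True)
    train("baseball far away", window_breaks=False)

print(round(predict_broken(encode("baseball near window")), 2))
```

The predictor ends up capturing the causal rule "ball near window leads to broken window" in two numbers, which is the spirit of the argument: learn the relationship between the factors that matter, not between individual pixels.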
The third piece of LeCun's vision that
differs from the big AI companies is
that he doesn't believe in having just
one system that you train once and that
is then the digital brain for all the
different types of things you want to
do. He says this architecture is the
right architecture for everything, but
you train different systems for
different domains. So if I want a
digital brain that we can build
computer-programming agent tools on,
I'm going to take one of my systems,
with its world model and perception and
actor and critic, and I'm going to
train it specifically for the domain of
producing computer programs; then all
the computer-programming agents that
people build will use that particular
system. But if I want to help with call
centers or whatever, I might train a
completely different version of the
system just to be really good at call
centers. So we don't have just one
massive HAL 9000 that everything uses,
which is the OpenAI plan or the
Anthropic plan. We custom-train systems
that maybe all use the same general
architecture, but we train them from
scratch for different types of domains.
You're going to get much better
performance out of that. All right. So
that is Yann LeCun's vision, and he
says this is how you're going to get
much more reliable and smart and useful
activity out of AI. This idea that
we're just going to train a massive
model that can do everything based off
of just text? He's like, "Come on, this
makes no sense. That can't possibly be
the best, most efficient route towards
actually having smarter AI." All right,
so that is the key tension between the
existing AI companies and Yann LeCun's
idea. This
brings us to our second sub question.
How is it possible that LeCun
could be right that LLMs are a dead end
if we've been hearing nonstop in recent
months about how these LLM based
companies are about to destroy the
economy and change everything? How could
we be so wrong?
LeCun is not surprised by that. If we
asked him, and I'll simulate LeCun
here, he would say the short answer to
that question is: look, a lot of
coverage of LLMs recently has been a
mixture of hype and of confusing the
specific LLM strategies of the frontier
companies with the idea and
possibilities of AI more generally,
kind of mixing those things together.
Which is fine if you're Sam Altman or
Dario Amodei; that's great for you,
because you need investment. But it's
probably not the most accurate way to
think about it. Now, if we asked LeCun
in this hypothetical to give a longer
explanation of how we could be so wrong
about LLMs, he would probably say,
"Okay, let me explain to you the
trajectory of the LLM technology in
three stages." And I think this will
clarify a lot. All right. So, the first
stage was the pre-training scaling
stage. This is the stage where the AI
companies kept increasing the size of
the LLMs, meaning how big the layers
inside of them are, along with the
amount of data they trained them on and
how long they trained them. There was a
period, starting in 2020 and lasting
until 2024, where making the models
bigger and training them longer
demonstrably and unambiguously
increased their capabilities. This
petered out after about GPT-4. After
GPT-4, OpenAI hit this wall, and we
have evidence that xAI had the same
issue, and that Meta had the same
issue: when they continued to make
their models bigger, they stopped
getting those big performance jumps.
So, they couldn't just scale them
to be more capable. This led to stage
two, which I think of as starting in
the summer of 2024, where they shifted
their attention to post-training. The
thinking now is: we can't make the
underlying smarts of these LLMs better
by making them bigger and training them
longer, so we need to try to get more
useful stuff out of the existing
pre-trained LLMs. The first approach
they came up with, and we saw this with
the alphabet soup of models released
starting in the fall of 2024 (o1, o3,
Nano Banana, all these types of names),
was telling the models to think out
loud. So instead of directly producing
a response, they post-trained the
models to actually explain their
thinking. This helps because, remember,
generation is autoregressive: as the
model explains its thinking, that text
is always going back as input into the
model, and it gives the model more to
work off of in reaching an answer. It
turned out that if you had the model
think out loud, you got slightly better
results on certain types of benchmarks.
These were the so-called reasoning
models. But it was a bit of a wash,
because this also made the models more
expensive to use: they burned a lot
more tokens, since they produced a lot
more tokens on the way to the answer
you cared about. So they did better,
but it was unclear how much of that we
actually want to turn on for users.
The second approach they used in this
post-training stage was fine-tuning on
examples. If you have, say, a lot of
examples of a particular type of
question (prompts and correct answers,
prompts and correct answers), you can
use those, combined with techniques out
of reinforcement learning, to nudge the
existing pre-trained model to be better
on those types of tasks. So we entered
stage two, the post-training stage:
because we couldn't make these LLM
brains fundamentally smarter, we tried
to tune them to get more performance
out of them on particular types of
tasks. This is when we began to see
less of "hey, try this model, it's
going to blow your socks off," and
instead got lots of charts of
inscrutable benchmarks. Look, the chart
is going up on this alphabet-soup
benchmark! Because, you know, you could
post-train for particular benchmarks.
In a lot of use cases, though, it was
less obvious to the regular user; the
underlying smarts seemed to be the
same.
We then entered stage three. I think
this started in the fall of 2025, when
the LLM companies said: really, the big
gains going forward are in the
applications that use the LLMs. Let's
make those applications smarter. So
it's not just how capable the LLM is;
it's how capable the programs that are
prompting the LLM are. We saw a lot of
this effort going into the programs
called coding agents, which help
computer programmers edit, produce, and
plan computer code. Now, these types of
agents had been around for a while, but
a lot of the AI companies got really
serious about them coming into the fall
of last year. And they weren't really
changing the LLMs much; they did some
fine-tuning for programming, but the
big breakthroughs in coding agents were
in the programs that call the LLMs.
They figured out how to make these
coding agents capable of working with
enterprise code bases, so not just for
individuals vibe-coding web apps, but
something you could use if you're a
professional programmer in a big
company. All of that is tool
improvement: making sure that you're
able to send better prompts to the LLM.
When you hear about things like skill
files and managing hierarchies of
agents, this is all improvement in the
programs that use the LLM; none of it
is a breakthrough in the digital brain
itself. And so this is the stage we are
in now: we're spending a lot more time
building smarter programs that sit
between us and the LLMs they're
querying as their digital brain, so
that, in very particular domains, the
result is more useful.
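To see where that stage-three "intelligence" lives, here's a sketch of an agent program whose smarts are entirely in how it assembles the prompt. The `llm` function is a hypothetical stand-in for a call to some model API, and the skill-file content is invented; the structure, not the details, is the point.

```python
# Sketch of a stage-three agent: the cleverness is in the program that
# assembles context and prompts, not in the model. llm() is a hypothetical
# placeholder for a real model call; the skill text is invented.

def llm(prompt):
    # Placeholder for a real model call; echoes a canned reply for the demo.
    return f"[model reply to {len(prompt)} chars of prompt]"

def load_skill_file(task):
    # Tool-side knowledge the program injects per task type.
    skills = {"fix_bug": "Run the tests first. Change as little as possible."}
    return skills.get(task, "")

def coding_agent(task, relevant_files):
    # The agent's job: decide WHAT context to send the model.
    prompt = "\n".join([
        load_skill_file(task),
        "Relevant files:",
        *relevant_files,
        f"Task: {task}",
    ])
    return llm(prompt)

print(coding_agent("fix_bug", ["src/parser.py", "tests/test_parser.py"]))
```

Swap in a cheaper or open-source model behind `llm` and the agent still works, which is exactly why, in this stage, the value migrates into the wrapper program rather than the digital brain.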
So this all tells us, and again I'm
channeling LeCun here, he would say:
once you understand this reality, you
see that this impression that LLM-based
AI has been on a super fast upward
trajectory of rapid advances is pretty
illusory. The fundamental improvements
in the underlying brain stopped a
couple of years ago. What we saw then
was a period of lots of bragging about
benchmarks doing better, but that was
all about post-training. And for the
last four months, all these
improvements we've been hearing about
are the programs that use the LLMs
being made smarter, better fitting
particular use cases. There really
haven't been major fundamental
improvements in the underlying
smartness of the digital brains, which
is why all the problems like
hallucinations and unreliability
persist. The brains are only
incrementally improving, in narrow
areas or in narrow ways; it's what
we're building on top of them that is
creating an illusion of an increasing
trajectory of artificial intelligence.
In reality, we might just be in a very
long-tail stage of product-market fit,
doing the work of building more useful
products on top of a mature digital
brain technology that's only advancing
at a very slow rate. That would be
LeCun's argument. We will, therefore,
find some good fits, but this is not a
technology on a trajectory where it's
going to make massive leaps in what
it's actually able to do.
All right, so there you go. That would
be the argument for how we could have
gotten LLM progress so wrong. All
right. Sub question number three: let's
follow through this thought experiment.
What would happen if LeCun is right
about all that? What would we expect to
happen in the near future? Well, let's
start with the window of the next one
to three years.
If he is right, we would see a long
tail of applications based on existing
LLMs begin to fill in. Computer coding
agents have gotten more useful; we will
see other use cases like that, ones
that don't exist now but where people
are really experimenting to figure out
applications that will work in other
types of fields. So there will be sort
of Claude Code moments in other fields,
which I think will be useful and
exciting. The tool sets used in many
jobs will change. But because we're now
just trying to find areas where we can
build useful applications on top of
existing LLMs, the doomsday scenarios
we've been talking about on this AI
Reality Check recently, where knowledge
workers have to become pet masseuses
and then cook the pets on garbage-can
fires because there's no money left in
the economy, none of those scenarios
would unfold based on LLMs. In this
current vision, though, there would be
a big economic hit. Because if we've
shifted our attention to building
better applications on top of the LLMs,
what we're going to see is a lot more
companies get into that game, and
they're going to say, "I don't want to
pay for a cutting-edge, frontier,
hyperscaled LLM. It's too expensive.
Let's look at cheaper LLMs. Let's look
at open-source LLMs. Let's look at LLMs
that can fit on-chip."
We saw this already with the OpenClaw
framework, which allowed people to
build their own custom applications
that use LLMs in personal-assistant
type roles. And right away, people
said, I don't want to pay all this
money to use Claude or GPT, and you saw
an explosion of interest in on-chip
models and open-source models. All of
this is, I think, good news for
consumers: it means more people
building these applications, more
variety among them, and cheaper prices.
It's bad news for the stock market,
because we've invested, depending on
who you ask, somewhere between 400 and
600 billion dollars into these LLM
hyperscalers like OpenAI and Anthropic.
The market is not going to support
that, so there is going to be a big
crash. This would probably temporarily
slow down AI progress, if this vision
is correct, because investors are going
to feel
burnt. All right. What's going to happen
now if we zoom out to a 3-to-10-year
range? That's roughly the range in
which the modular architecture approach
that LeCun is talking about would reach
maturity; that's what he, as CEO, is
saying. Again, it's a research company
for now, and they have said it will be
several years until they really have
products that are ready for market. If
LeCun is right, what we're going to
see, domain by domain, are these very
bespoke, domain-specific, modular
architecture systems, which, if he's
right, are going to be way more
reliable and smarter, in the sense of:
they do the thing I asked them, in a
way that's good, as good as some of my
human employees, and in a way that I
can actually trust. That is what we're
going to see, on that 3-to-10-year
basis, instead of what's promised with
LLMs. And because they're based on this
modular architecture, I think these
systems will be more reliable, and
they're also going to be easier to
align.
LLMs are so obfuscated; it's just,
here's 600 billion parameters in this
big box that we trained for a month on
all the text on the internet, let's
just see what it does. Modular
architectures are way more alignable.
You literally have a critic module in
there that evaluates plans based on
both a world model and some sort of
hard-coded value system, saying which
of these do I like better. You could go
in there and essentially hard-code:
don't do these types of plans; give a
really low score to plans that lead to,
say, a lot of variability in outcome.
You have more direct knobs to turn, so
it does make alignment easier. These
systems would also be more economically
efficient, because when you have to
train one model big enough that it can
do everything, it has to be huge, and
it takes a huge amount of energy.
But when you're training different
modules in a domain specific system,
these can be much smaller. I like to
point out the example of a Google
DeepMind tool called DreamerV3, which
can learn how to play video games from
scratch; it's famous for figuring out
how to find diamonds in Minecraft. And
it uses a modular architecture very
similar to what LeCun is proposing
here. We just read a paper about it in
the doctoral seminar I'm teaching on
superintelligence right now. DreamerV3,
which plays Minecraft far better than
an LLM asked to do the same, because
it's domain-specific, requires around
200 million parameters, a small
fraction of what you would get in a
standard LLM. It can be trained on a
single GPU chip, and it handles this
domain way better than a frontier
language model that is significantly
larger and trained significantly more
exhaustively. So there would be some
advantages here. There would also be a
little bit of ick around this world,
because, way more so than LLMs, these
domain-specific models might actually
have more of a job-displacement
capability. So we'd have to keep an eye
on them. All right,
conclusion. What do I think is going to
happen here? Well, you know, I don't
know, right? It's possible that there
are more performance breakthroughs to
get out of LLMs, and we're going to get
more useful tools. But gun to the head,
if I had to predict, looking through my
computer science glasses, LeCun's
modular architecture feels like it has
to be the right answer.
I think we're going to look back at
this doubling down on LLMs as an
economic mistake. It was the first
really promising widespread AI
technology built on top of deep
learning, and it did cool things. But
instead of stepping back and asking,
okay, what will this be good for, and
for what types of domains might we want
different models, we said: no, let's
just raise half a trillion dollars and
go all in on everything being
text-based LLMs, which are trained on
text and made to produce text; all
artificial intelligence will run off of
these things. I just think, when we
zoom out on the 30-year scale, we'll
say: that was so naive, this idea that
this was the only type of model we need
for artificial intelligence. It's super
inefficient for, like, 99% of the
domains we want to use it for. It's
great for text-based domains, and for
computer programming (the planning is a
little suspect, but the code production
is okay). But to make all intelligence
run off just massive LLMs, with maybe
four companies that have massive ones
and that's it? That can't be the right
way to do it. So my
computer science instincts say modular
architecture: it just makes so much
more sense. Domain specificity,
differential training of modules, much
more alignment capability, much more
economic feasibility. It just feels to
me like that is probably going to be
the right answer. Which means we're
going to have some bumpiness in the
stock market, because, if this is true,
the hyperscalers either have to pivot
quickly enough, before they run out of
money, or some of them are going to go
out of business, and the others are
going to have to collapse before they
can expand again. So I think the
modular architecture approach will work
better. I don't know if LeCun's company
is going to be the one to do it or not,
but I think that architecture makes a
lot of sense to a lot of computer
scientists. Now, I hope these systems
don't get too much better too fast,
because I can much more easily imagine
a very well-trained modular
architecture AI digital brain creating
justified ick than I can these Python
agent programs that access some sort of
massive LLM somewhere. All right. So,
yes, we'll
somewhere. All right. So, yes, we'll
know. I think within a year we'll begin
to get a sense of which of these
trajectories is actually true. Um, I of
course will do my best to keep you
posted here on the AI reality check. All
right, that's enough computer science
talk for one day. Hopefully that made
sense. Hopefully that's useful.
I'll be back soon with another one of these
checks. And until then, remember, take
AI seriously, but not everything that's
written about it. Hey, if you like this
video, I think you'll really like this
one as well. Check it out.