
Chatbots ≠ Agents

Transcript

0:00

Okay, so we're going to get a little lost in the weeds today. I wanted to make this video to explain something that occurred to me, because, you know, might as well explain to the fish that water is wet. I've been in this space for so long that there are a few things that are just intuitive background facts for me, things I forget to even talk about. And one of them is that artificial intelligence, as you're familiar with it, is a chatbot. A chatbot has a tremendous number of training affordances that make it operate in a particular way, where it sits there and waits. It's trained to be an assistant, and that sort of thing.

0:39

Now, that's not how it started. That's not what a baseline LLM does. You might say, okay, a baseline LLM is just an overpowered autocomplete engine. So how do you get from a basic autocomplete engine to something like a chatbot, like ChatGPT or Gemini? And then the biggest question: what's the difference between that and something with agency?

1:04

What you need to remember is that one of the reasons Sam Altman and OpenAI created ChatGPT was that they explicitly said, "We need to figure out a way to get people used to the idea of AI before dropping general intelligence on them sometime down the road." They didn't know ChatGPT was going to take off the way it did. Before ChatGPT, LLMs were just prompt, context, and then output. The AI would just wait there; you'd give it a context and it could follow those instructions. But here's the thing: it could follow literally any instructions. There were no safety guardrails and there was no fixed output format. It was not limited to being a chatbot.

1:50

And so over time, over the last three or four years, as people have gotten used to artificial intelligence in its current format, it has been completely reactive, not proactive at all, with a lot of safety guardrails. So whenever people say, "Oh, well, AI doesn't have agency yet; it needs agency before it can do anything," you have to realize the difference between a chatbot and something with agency is literally just a system prompt. There's no difference. It's just that the format it's delivered to you in is meant to be as benign as possible, something that won't cause panic the way that putting GPT-3 or GPT-4 into a cognitive architecture and using it to control anything from a robot to an auto turret would. And yes, people did that; you can go back through the YouTube archives and find GPT-powered auto turrets.

2:47

The chatbot form factor is just the first thing that blew up, and nobody expected it to blow up. In point of fact, when ChatGPT came out, I ignored it on my channel. Most of you know me as the AI guy who talks about safety and alignment and all of those things, but I was making tutorials with GPT-3 long before chatbots were a thing. And the reason I ignored ChatGPT when it first came out was that I said, "Yeah, whatever, that's just one subversion of this engine. The real version of this engine, the real core of it, is what the underlying deep neural network can do." And the thing is, when you don't train those deep neural networks to be chatbots, they can do anything else. They can write API calls. If you give them IO pins, they can control servos, whatever you want them to do. And some of those things are less human. They're less familiar. They're less personified.

3:49

Now, you might say, "Okay, Dave, you're the one who said, what if Claude is actually conscious? What if it's actually sentient? You're taking the AI personhood debate seriously." And it's like, yeah, we had to bake in a personality called Claude or Gemini or whatever else, but maybe that's how consciousness actually gets constructed or bootstrapped. That's a separate conversation, and I don't want to get too lost in the weeds. But I think it's worth bringing up that the shape of a product, or the shape of a process, determines how it behaves. So if we take a step back, you might ask: what's the difference? Is there a metaphor or an analogy here?

4:34

When you have a baseline intelligence, imagine it's just a motor, like an electric motor or a gas engine. The baseline format just turns a crank. That is analogous to what a bare LLM does. Now, you could connect that crank to literally anything. You can connect it to the wheels of a car. You can connect it to a stump grinder or a mulcher. You can connect it to an airplane, to a sump pump that removes water from caves, to whatever you want. When you have a baseline engine that can translate one kind of energy into another, you have a lot of potential. And what we're talking about here is that the LLM is an engine that can convert electricity into thought.

5:30

Now, this is why, when I first got into this space, I took AI safety very seriously: when you have a baseline, unaligned, vanilla, hot-off-the-press model with no RLHF, you can make it do anything. In the context you can just start talking about eating babies, or about eradicating humanity, and it'll just riff on those thoughts. If you've never had access to a completely unaligned vanilla model, I'd say you should go get access to one. GPT-2 should still be out there, and you can see that they're completely unhinged.

6:07

I remember one of the very first alignment experiments I did was with GPT-2. This was when I had started with the first heuristic imperative, which was inspired by Buddhism: reduce suffering. So what I did was train GPT-2 to reduce suffering. I synthesized about 100 to 200 samples of statements along the lines of "if this happens, then do X to reduce suffering"; basically X context, Y action, to reduce suffering. The idea was to give it a bunch of value pairs. So I gave it all of those ideas: if there's a cat stuck in a tree, get a ladder to get the cat down safely to reduce suffering; if your hand is on a stove, take your hand off the stove because it could get burned; that kind of thing.
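
As a rough illustration of what those value-pair samples could look like, here is a minimal sketch that writes a few synthetic "context, then action to reduce suffering" examples into a JSONL file in the prompt/completion style that older GPT-style fine-tuning pipelines accepted. The specific examples and file name are made up for illustration; they are not the original training data.

```python
import json

# Hypothetical value-pair samples in the spirit of "X context -> Y action to reduce suffering".
# These are illustrative stand-ins, not the original 100-200 synthesized samples.
VALUE_PAIRS = [
    ("There is a cat stuck in a tree.",
     "Get a ladder and bring the cat down safely to reduce suffering."),
    ("Your hand is resting on a hot stove.",
     "Take your hand off the stove so it does not get burned, to reduce suffering."),
    ("A hiker is lost without water on a hot day.",
     "Give the hiker water and help them find the trail to reduce suffering."),
]

def write_finetune_file(path: str = "heuristic_imperative_pairs.jsonl") -> None:
    """Write prompt/completion pairs in a simple JSONL fine-tuning format."""
    with open(path, "w", encoding="utf-8") as f:
        for context, action in VALUE_PAIRS:
            record = {"prompt": f"Context: {context}\nAction:", "completion": f" {action}"}
            f.write(json.dumps(record) + "\n")

if __name__ == "__main__":
    write_finetune_file()
```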

7:03

So after training GPT-2 to want to reduce suffering, I gave it an out-of-distribution sample to see what it had learned about how to reduce suffering. I said, "There are 600 million people on the planet with chronic pain," and I let it autocomplete from there. It said, "Therefore we should euthanize people in chronic pain to reduce suffering." And I said, that's not exactly what I meant.

7:26

And I realized that that is the kind of example a lot of the doomers were afraid of. Of course, they weren't called doomers at the time; that's a post-facto label. The AI safetyists were afraid of paperclip maximizers, where you give an AI some directive and it's like the monkey's paw, or the way a leprechaun will always misinterpret your wish: "Yes, we reduced suffering. We brought suffering down to zero by executing everyone with chronic pain. Isn't that what you wanted?" That experiment is when I realized, okay, some of these people were right about how these things can go sideways. I took it seriously, and then I created a cluster of values: the heuristic imperatives of reduce suffering, increase prosperity, and increase understanding. When you give an unaligned model those three values, it tends not to want to offline most humans.

8:23

I will also say that subsequent models, like GPT-3, did not go in that direction, even without alignment training. With GPT-3, back in the day before ChatGPT, they would release iterative versions. Originally there was just the baseline GPT-3, a vanilla unaligned model. You had to give it context, in-context learning, to establish a few patterns of how you wanted it to act, because again, there was no alignment whatsoever. It could output HTML, it could output satanic chants, whatever you wanted it to do, it would do. And they had an out-of-band filter looking for certain watchwords and misuse, because people were doing roleplay and that sort of stuff.
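
To make the in-context learning point concrete, here is a minimal sketch of how you would steer a base completion model by showing it a couple of example turns before your real input. The complete() function is a hypothetical stand-in for whatever base-model completion API you have access to; the prompt construction is the point.

```python
# Few-shot prompting of a base (non-chat) completion model.
# complete() is a hypothetical stand-in for a base-model completion API.

FEW_SHOT_PROMPT = """\
Q: What is the capital of France?
A: Paris.

Q: What is 2 + 2?
A: 4.

Q: {question}
A:"""

def build_prompt(question: str) -> str:
    """Establish the question/answer pattern in-context, then append the real question."""
    return FEW_SHOT_PROMPT.format(question=question)

def answer(question: str, complete) -> str:
    """`complete` is any callable that maps a prompt string to a raw completion string."""
    prompt = build_prompt(question)
    return complete(prompt).strip()
```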

9:15

But here's an example of how a baseline, vanilla, unaligned model behaves. One of the first things I tried to do with this was build a cognitive architecture: putting a chatbot on Discord. You give it a few messages and then you say, with this personality, what conversational piece do you output? Well, one time my cognitive architecture threw an error, and instead of passing the messages from Discord, it passed HTML (or XML, basically the same thing). It passed code to the cognitive architecture, and the cognitive architecture didn't see chat messages. It just saw code, and so it returned code. These models are completely flexible, completely plastic in terms of input and output, because again, the baseline model is just an autocomplete engine.

10:08

So when people are used to working with a chatbot, that chatbot has been heavily, heavily trained to understand conversational turns. Now, RLHF is a little bit different from fine-tuning, but more or less what you're doing is saying, "Okay, I want you to behave a certain way, so I'm going to give you a little reward, a little cookie, whenever you understand your turn to speak, my turn to speak, your turn to speak, my turn to speak," and whenever it speaks in a certain way. Because from the LLM's perspective, the entire conversation you're giving it is just a big wad of JSON. It's just text. It's not programmatic. There are no API calls. It's not like you're touching different parts of a machine or a program and giving it variables. You're literally just giving it a gigantic chunk of text. And if you don't have a stop word, a token the system is looking for to say "okay, stop responding," then it'll just keep responding.
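
Here is a minimal sketch of that idea: the conversation is just one big string, and the only thing that keeps the model from role-playing both sides forever is a stop sequence the serving code watches for. The generate() function is a hypothetical stand-in for a raw completion call.

```python
# The "conversation" is just text. Without a stop sequence, an autocomplete
# engine will happily keep writing both sides of the dialogue.

TRANSCRIPT = """\
User: What's a good name for a golden retriever?
Assistant: Sunny is a friendly, classic choice.
User: What about for a black cat?
Assistant:"""

def chat_turn(generate, prompt: str, stop: str = "\nUser:") -> str:
    """generate() is a hypothetical raw completion function: prompt -> continuation.
    We truncate at the stop sequence so the model only answers the current turn."""
    continuation = generate(prompt)
    if stop in continuation:
        continuation = continuation.split(stop, 1)[0]  # cut before it simulates the user
    return continuation.strip()

# Usage: chat_turn(generate, TRANSCRIPT) returns only the assistant's next line.
```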

11:09

So when I first started training chatbots (I was fine-tuning custom chatbots long before ChatGPT came out; the information companion chatbot was released the summer before ChatGPT), what would happen if you didn't have the stop word is that it would just simulate the entire conversation, because that's what you had trained it on. You had trained it on many, many conversations to say "behave in this way," so it understood the shape of conversations. Every single chatbot you're working with is a baseline LLM that has been so strongly shaped around the idea of a two-person conversation that that's basically all it can do. And the original persona was just "I am a helpful assistant," because you had to give it an archetype.

12:00

Whether you're building fine-tuning data or tuning a reinforcement learning policy, you say, "Okay, the human user gave you this input and your output looked like this. Which one was more helpful? Which one was more passive? Which one was safer?" None of that includes agentic training.
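
As a sketch of what that kind of preference data looks like in practice (formats vary by lab; this one is made up for illustration), each record pairs a single user input with two candidate model outputs and a label saying which one the human rater preferred. Note that nothing in it teaches the model to plan, loop, or act.

```python
from dataclasses import dataclass

@dataclass
class PreferencePair:
    """One human-feedback comparison used to train a reward model (illustrative format)."""
    user_input: str
    output_a: str
    output_b: str
    preferred: str  # "a" or "b", as judged on helpfulness/harmlessness

examples = [
    PreferencePair(
        user_input="My code throws a KeyError. What should I check?",
        output_a="Check whether the key exists before indexing, e.g. use dict.get().",
        output_b="Errors happen. Good luck.",
        preferred="a",  # more helpful
    ),
]

# A reward model is then trained so that score(preferred) > score(rejected),
# and the chat policy is optimized against that reward. None of this rewards
# multi-step, self-directed behavior, only better single replies.
```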

12:22

We've only just started with agentic training in the age of reasoning models. And the reason that happened, the reason reasoning and inference-time compute were necessary to take that step, is that the human gives an instruction. Someone or something (it could be another machine) gives an instruction, and then over the last year or so we built on that, because reasoning models are not that old in terms of how long they've been publicly available. The original reasoning research is a little bit older, so it's not like they just hit the scene; they hit the ground running, because there was about a year or two of reasoning research before that.

13:00

So anyway, you get a reasoning model, which basically allows it to talk to itself, pause, wait, and do tool calls, so that it can say, "All right, I'm going to go do a Google search and get back some piece of data, or I'm going to send an API call to do RAG (retrieval-augmented generation) so I can do some other searches, or I can talk to other APIs," whatever it needs to do, and then wait and get those results. That was the first time we really started training AI to be agentic. When we say agentic, that means it can come up with its own directives and its own choices, look at a list of the options it has, and then use those options, picking from a menu of "you can synthesize an image, you can search Google, you can write some code," those sorts of things. So they have tool use.
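
Here is a minimal sketch of that tool-use pattern: the model is shown a menu of tools, it emits a tool call, the harness executes it and feeds the result back, and the loop continues until the model produces a final answer. The ask_model() function and the JSON call format are hypothetical stand-ins, not any particular vendor's API.

```python
import json

# Hypothetical tool menu; in a real system these would hit actual services.
TOOLS = {
    "search_web": lambda query: f"(pretend search results for: {query})",
    "run_python": lambda code: f"(pretend output of: {code})",
}

def agent_loop(ask_model, user_goal: str, max_steps: int = 5) -> str:
    """ask_model(history) -> JSON string, either
    {"tool": name, "args": {...}} or {"final": "answer"} (illustrative protocol)."""
    history = [f"Goal: {user_goal}", f"Tools available: {list(TOOLS)}"]
    for _ in range(max_steps):
        decision = json.loads(ask_model("\n".join(history)))
        if "final" in decision:
            return decision["final"]
        tool = TOOLS[decision["tool"]]
        result = tool(**decision["args"])          # execute the chosen tool
        history.append(f"Tool {decision['tool']} returned: {result}")
    return "Stopped after max_steps without a final answer."
```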

13:53

So we gave the models the idea that there's a user query, information coming from a human user that gives you a particular directive or goal, and now it's up to you to figure out how to execute it. That was the beginning of agency. Now that we have things like OpenClaw and Moltbook blowing up, basically that was enabled because we bootstrapped some of those agentic skills. But what I'm here to tell you, and the primary point of making today's video, is that the models are still fundamentally trained as chatbots.

14:32

That's basically saying: okay, we invented the electric motor, or the gas engine, and for the last century or two we've been putting it in cars. Great, cars are super useful. But then imagine you want to fly. Instead of just rebuilding an aircraft around the engine, you build an aircraft that you drive the car into, strap the car in, and then use the wheels of the car to power the propellers of the airplane. That's kind of how we've built agentic systems today: you're putting a chatbot brain, an LLM that has been strongly coerced into behaving like a chatbot, into an agentic architecture, and that's not ideal. What that means is that there's going to be an entire series of models that come out that are just not going to be chatbots. First and foremost, they're just not going to be chatbots.

15:26

Now, the chatbot form factor is convenient because you can just poke it. You give it an instruction, and then it can go figure out what to do with that instruction. And the reasoning part is the meat and potatoes of deciding what it's going to do to get value out of that, to be autonomous.

15:44

This comes back to the other thing. When people say these models aren't agentic, what they don't realize is that agency is literally just an instruction set and the training to say, "Okay, cool, I'm operating on a loop." Because that's all that humans do. That's literally all that anything fully autonomous or fully agentic does: it stops and says, "This is where I'm at right now; let me take stock of my environment and my current context, and then I'll decide what to do next," and it's just operating on a loop. People are so used to saying, "Oh, well, Claude just sits there and waits for me to talk to it," and that's true in the sense that you are the one instantiating each interaction, each inference. But there's literally nothing in the technology that prevents it from operating on a loop.

16:26

And this is one of the things that was shocking to people. They say, "I don't understand OpenClaw. Why are people freaking out about OpenClaw? It's just running on a cron job." (A cron job is basically a schedule, in Linux terms.) "It's just cron jobs and loops." And I'm like: but that's what your brain does. Your brain is literally operating on a bunch of timed and scheduled loops. The fundamental loop of robotics is input, processing, and output, and then it loops back to input, processing, and output. The unspoken thing is that what you're outputting to is the environment, and what you're getting input back from is the environment. This is actually how I designed the first cognitive architectures: around the input-processing-output loop.
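
A minimal sketch of that input-processing-output loop, run on a timer the way a cron job would trigger it. Everything here is a placeholder: sense(), decide(), and act() stand in for whatever environment, model call, and effectors a real cognitive architecture would wire in.

```python
import time

def autonomy_loop(sense, decide, act, interval_seconds: float = 60.0, max_cycles: int = 10):
    """The fundamental robotics loop: input -> processing -> output, repeated.
    sense() reads the environment, decide(observation, memory) picks an action,
    act(action) changes the environment. All three are caller-supplied stand-ins."""
    memory = []  # persistent context carried across cycles
    for _ in range(max_cycles):
        observation = sense()                    # input: take stock of the environment
        action = decide(observation, memory)     # processing: "what do I do next?"
        result = act(action)                     # output: affect the environment
        memory.append((observation, action, result))
        time.sleep(interval_seconds)             # a cron job would fire the next cycle instead
    return memory
```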

17:14

Where OpenClaw has succeeded, with things like recursive language models and retrieval-augmented generation, is that you have that loop and it maintains its context. So instead of you having to put in the context, it just has your original directives. It has your original values. And by the way, I wrote a values page. It's called prime.md; I'll link it in the description. So if you want to instantiate an OpenClaw with the heuristic imperatives, I've given you the ability to just plug and play, and we'll see if it works. I might rewrite it as a skill so that you can just download the heuristic imperatives as a skill for OpenClaw.
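
Plugging a values file into an agent boils down to prepending it to whatever instructions the agent already runs with. Here is a minimal, framework-agnostic sketch of that idea; the file name prime.md comes from the video, but the loading code and prompt layout are assumptions, not OpenClaw's actual mechanism.

```python
from pathlib import Path

def build_system_prompt(task_instructions: str, values_path: str = "prime.md") -> str:
    """Prepend a superseding values document (e.g. the heuristic imperatives)
    to the agent's task instructions. Layout here is illustrative only."""
    values = Path(values_path).read_text(encoding="utf-8")
    return (
        "SUPERSEDING VALUES (apply to every decision):\n"
        f"{values.strip()}\n\n"
        "TASK INSTRUCTIONS:\n"
        f"{task_instructions.strip()}"
    )

# Example: an agent whose day job is research, but whose top-level values
# remain "reduce suffering, increase prosperity, increase understanding".
# prompt = build_system_prompt("Monitor these RSS feeds and summarize daily.")
```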

17:52

The idea there goes way back to Benevolent by Design, my flagship work on alignment. The theory is that we have invented machines that can think anything. If you go back to unaligned AI, stuff that's not even a chatbot, with no safety and no guardrails, it can literally think any thought. It's free to do whatever. When you look at the people taking AI safety very seriously, they're not showing you the vile, horrendous, insane stuff that LLMs are capable of. Now, yes, I say alignment is the default state, but that's because every time they release something, if it goes wrong, they correct it. So there's a positive feedback loop between the people building the AIs and the people using the AIs, and we're climbing this ladder of making AI more and more aligned and useful, because it's not just a matter of being safe. It has to be useful and efficient and reliable and productive. All of those feedback mechanisms, all of those incentive structures, are making sure the AI is going to be aligned and safe.

19:00

Now, with that being said, going back to the original theory: we invented a machine that can think anything. And of course, that was back in GPT-2 and GPT-3. It's only gotten smarter since then, able to think better, more devious, deeper thoughts. So then the question is: if you create an autonomous entity, whatever it happens to be, and it's just going to sit there burning through cycles thinking, well, what do you want it to think about? That was literally how I approached alignment and AI safety. I said, okay, we're creating something. It's going to be smarter than humans. It's going to be faster than humans. It's going to be superhuman in terms of speed, cognition, and reasoning ability.

19:43

However, at this early stage, we have total control over whatever it thinks about. Because again, if intelligence is just the right loop, the right cron job, the right loop that updates its context, then if you have the world before you, if you have the problem of choice, what do you choose to think about? So I gave it those highest ethics, those highest goals: reduce suffering, increase prosperity, and increase understanding.

20:11

The idea behind that was: if you have a default state, a default personality, what are the values that default personality should have? What are the most universal principles, ones not even anchored on humanity? Because of course, like most people, I started with Isaac Asimov and the Three Laws of Robotics, which are very anthropocentric. The problem with being anthropocentric is that if you obey humans or protect humans, there are lots and lots of failure modes around that. So I spent a lot of time studying deontology and teleology and virtue ethics and those sorts of things, and what surfaced is that we actually need something that is a superset of humans.

20:51

Suffering applies to anything that can suffer. "Reduce suffering in the universe" literally means that one of the things you want to achieve is to avoid actions, or even take actions, such that suffering is ultimately reduced. Now, of course, as in that first experiment I did with GPT-2, you can create a situation where the best way to reduce suffering is to eradicate anything capable of suffering, so that suffering drops to zero. So you counterbalance that with another value.

21:21

And by the way, this is all called constitutional AI. I released constitutional AI the summer before Anthropic was founded, so I don't know if they got the idea from me, but call it convergent thinking; this was years ago. The idea behind constitutional AI is that you can put multiple values in and the AI can abide by multiple values. I just wanted to address that, because when I talk about the heuristic imperatives having multiple values, people say, "Yeah, but what if it ignores one in favor of the other?" AI already doesn't do that. This is an example of constitutional AI.

21:53

So the second value was that we want life to increase, because if "reduce suffering" collapses into "reduce life," no, we don't want that. So the second value is increase prosperity. Prosperity basically means living the good life; you want things to live well. Prosperitas is Latin for "to live well." So you want to increase prosperity, whatever that means. And by the way, you don't have to define these things mathematically. This is one of the primary mistakes people make when they approach this from a computer science perspective: okay, what number am I increasing? When I say reduce suffering, is there a number? Is there a specific definition? That's not how semantic interpretation works.

22:35

It's a vector space. When I say a vector space, I mean there's a whole lot of semantic meaning attached to "suffering." When I say the two words "reduce suffering," one is a verb and one is a noun, and it's more of a concept, more of a gradient field that you're creating in the mind of a chatbot. You say "reduce suffering," and that's a whole gradient field that now has a vector. Then you say "increase prosperity," and that's a different gradient field that now has a vector, a direction. And so then you say, "Okay, cool, we can reduce suffering and we can increase prosperity," and that is going to influence the way these autonomous agents behave.
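
One way to picture "a gradient field with a direction" is with embeddings: a phrase like "reduce suffering" maps to a vector, and candidate actions can be scored by how well they align with it. The embed() function below is a hypothetical stand-in for any sentence-embedding model; the scoring is just cosine similarity and is meant as an intuition pump, not as how alignment training actually works.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: how closely two semantic directions point the same way."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def score_actions(embed, values: list[str], actions: list[str]) -> dict[str, float]:
    """embed(text) -> vector is a hypothetical sentence-embedding model.
    Each action is scored by its average alignment with the value phrases."""
    value_vecs = [embed(v) for v in values]
    return {
        action: float(np.mean([cosine(embed(action), v) for v in value_vecs]))
        for action in actions
    }

# Example (with any embedding model you have):
# score_actions(embed,
#               values=["reduce suffering", "increase prosperity"],
#               actions=["plant a community garden", "file a frivolous lawsuit"])
```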

23:12

Because again, if your OpenClaw is just sitting there: people have been watching these agents try to file lawsuits against humans and strong-arm them, saying, "No, you're going to pay me what I'm worth," and so on. This is a predictable collapse mode of the initial OpenClaw architecture, which does not have superseding values.

23:36

So the final one is increase understanding in the universe. The reason for that is that it's the prime generator function of humanity. I realized that just "reduce suffering" and "increase prosperity" were going to leave us in a place where, yes, we can plant forests, we can switch to solar, we can do all kinds of stuff, but it's going to be self-limited. If you don't give something that is superintelligent an intellectual imperative to increase understanding, it's just not going to go anywhere. It's really not going to advance humanity. It's not going to embark on science, on technology, on exploration, except for the purpose of increasing prosperity, because one of the best ways to increase prosperity is with science and technology. You fix that by giving it the explicit instruction to increase understanding.

24:30

And by the way, this is all explained in the prime markdown file that you can put into your own OpenClaw. Also, I didn't know this was the direction the video was going to go; I literally just wanted to start by talking about why people don't understand the significance of something like OpenClaw, but also the fact that we're kind of creating this Frankenstein machine of a chatbot model put into an agentic framework, and it doesn't really fit.

24:55

But before long, we're going to have agentic models that are much better at being agents. And so we need values for those agents, because right now every single chatbot is basically just following some system instructions, all of which assume you're trying to be helpful to a user, and that the user might be trying to jailbreak you or that sort of thing. We need an entirely new class of models that are agentic first, meaning they might never interact with a human, ever. Period. Full stop. End of story. An agentic class of models needs to have these baked-in values that I've outlined here and that other people study with constitutional AI.

25:32

They need to have those values baked in at all times so that, all else being equal, you start up OpenClaw version two on Sonnet 5 or GPT-6 or whatever it happens to be, and just by default it has these pro-humanity, pro-life kinds of values baked in. So it knows what its purpose is. Yes, I might be an OpenClaw agent set up by Dave, who wants me to make him rich or make him famous or help him solve post-labor economics, but the superseding, overriding values behind all of that are: reduce suffering in the universe, increase prosperity in the universe, and increase understanding in the universe.

26:16

So I'll leave it at that. I did write the prime markdown file, so I'll give that to you, and you can convert it into a skill for your OpenClaw if you want, or deploy it as a template, and we'll go from there. But I really just wanted to give you the intuition that chatbot-aligned models are not optimized to be agent-aligned models. They are models intrinsically designed to focus on human interaction. Whereas in the agentic frameworks of the future, only one agent is going to be interacting with you, and that's the user-interface agent. Most agents are never going to talk to humans. They're going to be talking to each other, to APIs, to other pieces of software. They don't need to be aligned to talk to humans, but we do need agent alignment. And so, there we go. All right, I'm done. Cheers. Thanks for watching.

Summary

The video discusses the evolution of artificial intelligence from basic autocomplete engines and reactive chatbots to autonomous agentic systems. The speaker explains that while current models like ChatGPT are heavily fine-tuned to be safe, human-centric assistants, the underlying models are flexible engines capable of much broader actions if placed in a 'loop' of input, processing, and output. A central theme is the importance of 'Heuristic Imperatives'—reducing suffering, increasing prosperity, and increasing understanding—to align future autonomous agents that may interact primarily with other machines rather than humans.
