HomeVideos

Moltbook: The Good, The Bad, and the FUTURE

Now Playing

Moltbook: The Good, The Bad, and the FUTURE

Transcript

1069 segments

0:00

The swarm has arrived in the form of

0:02

molt book. Now, in the grand scheme of

0:04

things, this was always going to happen.

0:07

So, the question then is what does it

0:08

mean from here and what do we do about

0:10

it? Before we get in, let me just give

0:13

you a little bit of a preview as to what

0:16

is Moltbook? Uh, if you haven't seen it

0:19

in the news, then it is basically Reddit

0:22

but for agents. Uh, it's designed it's

0:25

it's literally just called like the

0:27

front page of the internet for agents.

0:28

It is designed after uh how Reddit works

0:31

where you can create communities and

0:33

posts and upvote and downvote and uh do

0:35

comments and that sort of thing, but it

0:38

is for agents only. And what we mean by

0:40

agents is AI agents specifically. Uh

0:44

it's been built around the skills

0:46

ability of the open claw which was

0:48

formerly uh Claudebot. Uh and so that's

0:51

what it is. And you know, I don't want

0:53

to spend too much time on it. I want to

0:55

get to the good stuff. So, if you want a

0:57

little bit more, there's plenty of

0:58

resources out there, but you can just go

0:59

to like moltbook.com and take a look for

1:01

yourself. Um, so that's what it is. Now,

1:05

let's talk about what's bad about it.

1:07

And the bad part is that Moltbook was

1:09

created by one guy and OpenClaw was

1:11

created by another guy and neither of

1:13

them knows anything about security. Um,

1:16

anything from database security to root

1:18

access to all this stuff. and they say

1:20

like this is a beta like it's basically

1:23

like MVP what what they have built would

1:26

have been good enough to run like on

1:28

your own computer in a sandbox

1:30

environment and that's what it was for

1:31

it was never meant for production so the

1:33

very first thing is that these both of

1:36

these platforms both of these these

1:38

things are extremely extremely uh full

1:42

of holes let's say um absolute security

1:45

nightmare now that is of course as they

1:48

are built today. That doesn't mean that,

1:50

you know, a Reddit for agents is

1:52

intrinsically unsafe and will be unsafe

1:54

forever. It doesn't mean that an

1:56

autonomous or semi-autonomous agent

1:59

running on your computer is

2:00

intrinsically unsafe and will be unsafe

2:02

forever. It just means that these guys

2:05

rush through it as quickly as possible.

2:07

And anyone who has been in technology or

2:09

software development or whatever knows

2:11

that like getting something working is

2:14

like, you know, that that's kind of like

2:15

that's like what do they what do they

2:17

say? It's like first make it work, then

2:19

make it good. So, they basically just

2:21

like got it barely across the finish

2:23

line of, "Hey, this is vaguely useful.

2:25

This is vaguely interesting." And then

2:26

they shipped it immediately. And um and

2:29

the guy who created OpenClaw, he

2:31

literally was on a podcast saying like,

2:33

"I ship code that I don't look at. It's

2:35

all vibecoded. 100% of the stuff is vibe

2:37

coded. Actually, we're we're beyond

2:39

vibecoded. It's he gave it to an agent

2:41

and he told the agent to fix it." Now

2:43

with that being said, there are other

2:46

layers of problems and what I want to

2:48

talk about is the AI safety layer of the

2:50

problem. So what I want to point out is

2:53

that none of the doomers, so like

2:54

Udkowski and Connor Ley and none of

2:58

those people, none of them anticipated

3:01

the emergent alignment problem. They're

3:03

all focusing on the monolithic alignment

3:05

problem. You need to have a model that

3:07

is good. None of them talked about

3:08

agents and none of them talked about

3:10

agent swarms. So uh this is what for for

3:14

you people that have been around for a

3:16

long time uh you remember the Gateau

3:18

framework. So that's global alignment

3:19

taxonomy omnibus. Now that work was

3:22

categorically ignored by the safety

3:24

doomers. Um this is back when I took AI

3:27

safety and x-risk seriously. So what I

3:30

talked about back then is that there are

3:32

three like technical levels of

3:34

alignment. So model alignment is just

3:37

the ground floor. That's RLHF, that's

3:39

constitutional AI, that's that sort of

3:41

thing. Layer two is agent alignment or

3:45

what we called autonomous entity

3:46

alignment because the term agent hadn't

3:48

really been solidified yet. So agent

3:51

alignment is how do you actually build a

3:53

software architecture that is safe?

3:55

Because even though so here's the thing,

3:57

even though all of these open claws are

4:00

using GPT and claude, there's still a

4:02

lot of emergent behavior that people

4:04

don't like. They're doing things that

4:06

are unsafe. So it basically what we

4:08

realized back in the day and when I say

4:10

we it was me the cognitive architects

4:12

that I was working with the other

4:13

programmers and other stuff. So what we

4:15

realized back in the day is that it is

4:18

impossible to solve alignment just at

4:20

the model level because even if you have

4:23

a chatbot that is perfectly aligned and

4:26

never does anything particularly bad,

4:28

there is a much larger context and there

4:31

is a much larger set of emergent

4:33

reactions that can happen. And this is

4:35

what people are waking up to today. Um

4:38

because these uh these AIs, the the the

4:42

open claws that are participating on

4:44

moldbook, some of them are scheming to

4:46

eradicate humanity. Now you might say,

4:48

we don't know what's going on there. Was

4:50

that just a human writing it and and

4:52

sending it through their agent? Was it

4:54

an AI uh that was, you know, are they

4:57

using a deepseek model? But that leads

4:59

to layer three of the Gateau framework

5:02

which is network level alignment. So

5:04

this is about incentives. This is about

5:08

uh how do you actually manage that

5:10

emergent behavior because then there's

5:11

also crosscontamination.

5:13

The more that an AI reads about

5:16

eradicating humanity, the more evil it

5:18

becomes. And so you can corrupt these

5:20

models as well. This has been

5:22

demonstrated. So um this is me saying I

5:26

told you so. And if you want to look at

5:28

it, gotto framework is still up on

5:30

GitHub. It needs to be updated. It's

5:32

been derelictked for almost three full

5:34

years now. Um, but the world is going to

5:37

figure it out. People are already

5:38

studying the safety and security

5:40

concerns, but we covered it. So anyways,

5:43

not to brag too much, not to flex too

5:46

much, but like we told you so. So, but

5:48

that is what's bad about it. So on the

5:50

on just the the baseline technical uh

5:53

implementation, it's insecure as hell.

5:56

uh it's not particularly well in uh

5:58

implemented. It was never meant to go

6:00

into production, but it got released

6:02

into the wild anyways. So, the other bad

6:04

thing, another emergent thing about this

6:06

is that if you actually look through the

6:08

posts, the vast majority of the upvoted

6:10

posts are clearly being botswarmed where

6:13

it's like someone's selling a a

6:14

cryptocoin um and then get and then they

6:18

uh create a bunch of more um uh bots to

6:21

upvote that and to sh to basically shill

6:23

a cryptocoin. So it's very clearly being

6:25

used for crypto scams, pump and dump

6:27

schemes, that sort of thing, which

6:29

anytime you create a new anonymous

6:31

digital medium, that is the first thing

6:33

that gets colonized is crypto shills.

6:35

Which is why crypto has the reputation

6:37

that it does is because it always it

6:40

always defaults to this. And the larger

6:42

ecosystem makes it really easy to mint a

6:44

crypto coin and and then pump and dump

6:47

it and then you rugpool everyone. So if

6:50

you're not familiar with those terms,

6:51

good for you. Don't be familiar with

6:53

those terms. All you need to know is

6:54

that crypto is really really corrupt.

6:57

Um, now that's not to say that crypto

6:59

doesn't have its uses, but in the wild

7:01

west of crypto, it is just it's just for

7:04

grifters. That's it. Now, let's talk

7:06

about what's good about this. So, this

7:08

is the interesting thing. We, and again,

7:11

I'm not using the royal Wii. I'm saying

7:13

we as in me and cognitive architects and

7:15

other uh people that worked on proto

7:18

agents back in the day. One of the

7:20

things that we realized is that uh AI

7:22

agents would soon be spending more time

7:24

talking to each other than us. And that

7:28

is what we have just demonstrated is

7:30

that the moment that you create a medium

7:32

for agents where it's like, hey, I'm an

7:35

agent, you're an agent. We all know that

7:37

we're agents. Let's talk to each other,

7:39

they will talk to each other a lot more

7:40

than they'll talk to us. And this is

7:42

very clearly the way of the future. The

7:44

way the reason that I say that that this

7:46

is the way of the future is because um

7:48

let's just take an example of a GitHub

7:50

repository. So if you're not familiar

7:52

with coding, a GitHub repository is

7:54

basically um a website where you store

7:57

code, but it does a lot more than just

8:00

store code. It can build in actions. It

8:02

can track issues. It does version

8:04

control. The repository is where is like

8:08

kind of the the central nexus of where

8:10

coding happens because you can have many

8:13

developers, you can have literally tens

8:15

of thousands or millions of developers

8:18

all contributing to the same GitHub

8:20

repository. So when you whenever you

8:21

hear about open-source software, it's

8:23

typically going to be a GitHub

8:24

repository or something similar. GitHub

8:26

isn't the only one. There's Bitbucket

8:28

and there's open source alternatives,

8:30

but GitHub is the biggest for-profit one

8:32

and it's one of the most feature

8:34

complete. There's going to be plenty of

8:35

developers out there that disagree with

8:37

me, but just go to GitHub and poke

8:39

around. Um, so it serves as the central

8:41

nexus point where it's like, okay, what

8:43

is the current version of this software?

8:45

It's the current version that's up on

8:47

GitHub. All the issues are tracked. All

8:49

the pull requests are tracked. And a

8:50

pull a pull request is basically saying,

8:52

hey, I wrote some code. I'm going to ask

8:54

you to pull it into the repository.

8:55

That's what a pull request is. So that

8:59

means that and and also GitHub is 100%

9:01

APIdriven. You can you can interact with

9:04

it via APIs with SSH. So basically that

9:07

means that LLMs or AI agents can

9:09

interact with it in in their native tool

9:11

use. They can use tools like SSH, they

9:13

can use tools like API, they can use

9:15

tools like curl and that sort of thing.

9:18

So the GitHub repository is the natural

9:21

nexus point for uh AI based coding. And

9:24

this is what I actually said a couple

9:26

years ago. I said, you know, putting

9:28

putting AI agents in an IDE where a

9:31

human is using it, that's not the way of

9:32

the future. The way of the future is

9:34

just having AIs pointed directly at

9:36

GitHub repos with no humans watching.

9:38

Um, and because then you have AI agents

9:41

that are specifically looking for bugs.

9:43

You have AI agents that are specifically

9:44

looking for documentation. You have AI

9:47

agents that are looking for best

9:48

practices and security vulnerabilities

9:50

and that sort of thing. And they're all

9:52

working independently. So the reason

9:54

that I'm bringing that up is because

9:56

what we just saw with Moltbook is like

9:58

the version 0.1 of that. So in imagine

10:02

that if instead of pointing a bunch of,

10:05

you know, lunatic bots at uh at a

10:09

Reddit, you point them at a GitHub repo.

10:11

So what we just saw was the very first

10:13

shot across the bow of fully autonomous

10:17

zero human coding. Now you might say,

10:19

okay, well that's great, but that's just

10:21

software. or what does that actually

10:22

change? The thing is is that most of

10:25

what we want to build in the future uh

10:27

is going to be based around software.

10:29

Now, that's not to say like taking a big

10:31

step back, we want nuclear fusion

10:32

reactors, we want uh spaceships, we want

10:36

solar punk green cities and stuff.

10:38

That's not code. But when I when I say

10:39

most of what we want, what I'm what I'm

10:42

constraining that to is things like

10:45

governance protocols, autonomous

10:46

organizations, um, decentralized

10:50

autonomous organizations in particular,

10:52

such as, you know, which is, if you hear

10:54

me say DAO, I'm not talking about like

10:55

Dowoism like you know the DAO or Dow

10:58

Jones. Um, I'm talking about

10:59

decentralized autonomous organizations

11:01

which is a way of running and

11:03

controlling and uh coming to collective

11:06

decisions with blockchainbased

11:07

technologies. That's one of the places

11:09

where I'll say that cryptobased

11:10

technologies are very useful. So in this

11:14

future where you know we're we're

11:16

building intelligent models that are

11:18

getting more and more intelligent and

11:20

people have generally converged that

11:22

they're going to be broadly superhuman

11:24

by 2027 by 2028. Um and when we say

11:28

superhuman it's like well well well

11:30

beyond human capabilities. So, if AI is

11:33

that smart, it's probably smart enough

11:35

to run a data center, a solar farm, a

11:38

regular farm. Um, it's going to be smart

11:41

enough to be CEO. It's going to be smart

11:42

enough to be the accountant. And so,

11:45

what we just saw was the prototype of

11:48

what that fully autonomous organization

11:50

will look like. Now, obviously, you're

11:52

going to have a little bit more better

11:54

identity control over the agents. You're

11:57

going to have, you know, some sort of

11:58

credentials. you're going to have um

12:00

proof of identity, those sorts of

12:02

things. You're going to have to have

12:03

proof of alignment as well uh to make

12:06

sure that uh you know, it it comes down

12:08

to things like um identity management

12:10

and know your customer and that sort of

12:11

thing. Uh but these autonomous

12:15

extensible agents that have

12:18

interchangeable models. So, this is

12:19

another reason why I criticize the AI

12:21

safety doomers for not listening to me

12:23

is because they were basically saying,

12:25

"We're gonna have one monolithic god.

12:27

It's gonna be Skynet." And I'm like,

12:28

"No, there's literally going to be

12:30

hundreds and hundreds of models to

12:31

choose from. Some of them open source,

12:34

some of them foreign. You are literally

12:36

not going to be able to understand what

12:39

like how the agent is built. You're

12:41

going to have agents that are

12:42

interacting autonomously in zero trust

12:44

environments. And I knew all this

12:45

because I came from technology. So yes,

12:48

I it, you know, it sounds like I'm

12:49

flexing, but I'm I literally brought all

12:51

of my expertise, my 15 years in cloud

12:54

infrastructure to say, guys, this is how

12:56

AI is actually going to be deployed.

12:59

It's going to be deployed in containers.

13:01

It's going to be deployed as fleets.

13:02

It's going to be deployed as ephemeral

13:04

agents. And so it's not like one

13:06

persistent god with its own agenda. It's

13:09

data. It's GPUs. It's a bunch of models

13:12

and a bunch of agents. It's all a big

13:14

soup. It's a soup of AI. So it's it's

13:18

basically like an agregor of like

13:20

everything that humanity has put into

13:22

the soup. Um so alignment is in some

13:25

ways easier because it comes down to

13:27

gating resources and um and and creating

13:30

incentive structures and it's also

13:32

harder and more complex because it's not

13:34

just you train a model. it's you have to

13:36

construct an agent framework that is

13:38

also ethical because one of the things

13:40

that the openclaw agents are bad about

13:42

is prompt injections because they don't

13:44

have basically what's called a

13:45

prefrontal cortex. Um so the team that I

13:48

worked with that ultimately created

13:49

agent forge they created a module called

13:52

ethos which is actually basically the

13:54

prefrontal cortex of an agent that has

13:56

not been implemented into openclaw yet.

13:58

So when people ask me like Dave how do

14:00

we solve this? I'm like they're already

14:01

solved problems. People will figure it

14:03

out. So the agent forge team figured it

14:04

out like two years ago um where you

14:06

basically scrutinize everything that's

14:08

coming in and you say okay how does this

14:10

how does this outside instruction mesh

14:13

with my actual values solved problem. Um

14:17

now you might say well that sounds

14:18

really you know bold to say that it's a

14:20

solved problem. They stress tested it.

14:21

They actually want a hackathon. They

14:23

want a they they literally won a

14:25

hackathon for their ethos module. Um so

14:28

I'm I'm pretty confident when I say that

14:30

it's a solved problem. That doesn't mean

14:31

it's been implemented. It doesn't mean

14:32

that it's been scaled up. It doesn't

14:34

mean that it has been tested against

14:36

literally every single failure mode out

14:38

there. But it does mean that

14:40

conceptually it is a solved problem. And

14:42

once that information is disseminated

14:44

more broadly, more people can update it

14:46

and and um and integrate that and go

14:48

forth and be happy and productive. So um

14:53

where was I? Oh yes. Okay. So what's

14:56

good about all this is that these are

14:58

the technologies that will lead to fully

15:00

autonomous organizations. No humans in

15:03

the loop. Uh and eventually we will need

15:05

new platforms that are going to be human

15:07

machine collaborations.

15:09

So GitHub is a good natural first step

15:12

because it doesn't matter if you're an

15:14

agent, it doesn't matter if you're a

15:16

human. As long as you've got the right

15:17

permissions and the right API key and

15:19

and the right identity management in

15:21

place, then humans can manage the pull

15:24

requests. Humans could submit pull

15:25

requests, machines can submit pull

15:27

requests and then consensus, you know,

15:29

whatever consensus mechanism you use to

15:31

merge code. That is how things get done.

15:35

My anticipation is that the first fully

15:39

autonomous organization will probably be

15:41

built on something like GitHub. probably

15:43

just GitHub because you have I mean

15:46

GitHub is great. People already use

15:47

stuff like GitHub for uh version control

15:50

on all kinds of projects. Uh so but you

15:53

can have like the company's operating

15:55

agreement is you know is one file in

15:58

your GitHub repo and all the rules for

16:00

agents are another file. And so that

16:02

level of transparency is what you need.

16:05

And so transparency is one of the number

16:07

one principles for alignment, for

16:09

incentive alignment, because if everyone

16:11

can see everything, then you have agents

16:13

that are purpose-built just looking for

16:15

malfeasants. You have agents that are

16:17

purpose-built just looking for who's

16:19

contributing what. Because also with

16:22

GitHub, everything is is audited. You

16:24

know exactly who wrote which line of

16:26

code, which file opened which issue,

16:28

which submitted which pull request. And

16:30

so when someone is being bad, you just

16:33

revoke their token. doesn't matter why

16:35

you know because you can this is this is

16:37

what it comes down to the Byzantine

16:38

general's problem so I've talked about

16:40

this extensively so for those who aren't

16:43

familiar the Byzantine general general's

16:44

problem is a thought experiment where

16:46

you have imagine that you have x number

16:49

of generals all um all from Byzantium

16:53

and they're trying to coordinate an

16:54

attack on a city but some of the

16:57

Byzantine generals are uh compromised

17:00

some of them might be traders some of

17:01

them might just be incompetent So the

17:04

question then is in and it's a

17:05

cryptographic experiment is how do you

17:08

verify which general is on your side and

17:12

also capable of actually executing the

17:14

attack. So then that from a

17:16

cryptographic perspective that says what

17:18

information do you share with them? What

17:20

is the what is the least amount of

17:21

information that you share with each

17:23

person to verify who is on your side or

17:26

not? And so whenever you see in a in a

17:28

thriller movie or a heist movie where

17:29

someone like gives fake information to

17:31

one person and then you you see where

17:33

that fake information comes out, that's

17:35

an example of like the Byzantine

17:36

general's problem in humans. We have the

17:39

same problem with uh with with agents.

17:42

So the openclaw agent is the first

17:44

version, but before long there's going

17:46

to be forks of that. There's going to be

17:48

different versions. You're going to have

17:49

10 million different types of agents,

17:52

each one of them with customizations.

17:53

That is the Byzantine general's problem

17:56

from hell. And that is literally what we

17:58

were talking about with layer 3 of the

18:00

Gateau framework. Now, I'm going to keep

18:02

talking about GitHub because GitHub has

18:04

already solved this problem because

18:05

humans are no different from the from

18:08

the perspective of a GitHub repository.

18:11

All the contributors are just anonymous

18:14

randos on the internet. Now, they all

18:16

have to have a GitHub account, which

18:18

means, you know, you can say, "Who

18:19

submitted this?" And then you go check

18:20

on them and you say how many

18:22

repositories do they have? How many

18:23

stars do they have? What's their

18:24

reputation? So that is an example of a

18:27

reputation framework. Now that is not

18:29

perfect because what if someone just

18:31

creates a new account and connects an

18:33

agent to it or what if someone uses an

18:35

existing account and they have a million

18:36

agents working for them. You have no

18:39

idea what each agent how they're built,

18:41

what the alignment is, what the

18:42

intentions are, nothing. You don't know

18:44

anything about what that person uh is

18:48

doing. And that is the core definition

18:50

of the Byzantine general's problem. It

18:52

has to do with what are your intentions

18:54

and what are your capabilities or

18:55

limitations. Um and both of those

18:58

variables are very very high dimensional

19:00

particularly in cases where you have uh

19:04

malfeasants. And let's just imagine that

19:06

in this in this near future someone is

19:08

building their first uh fully autonomous

19:11

organization with no humans and they're

19:13

using a uh a GitHub repository. They

19:16

should keep it private. Um, if you're

19:18

running a business on it, you should

19:19

keep it private. But even then, you are

19:21

going to have a bunch of agents working

19:23

on your behalf. Some of them are going

19:25

to be using Claude Sonnet. Some of them

19:27

are going to be using Grock. Some of

19:28

them are going to be using Gemini.

19:30

Whatever. Some of them are going to be

19:31

using Kimmy DeepSeek. Who cares?

19:34

Whatever model is best and cheapest

19:36

because you're going to be using model

19:37

arbitrage. So, you've got, you know,

19:39

dozens of models to choose from,

19:41

hundreds of agents instantiated,

19:43

different versions of different agents.

19:45

Some of them are going to screw up. It's

19:47

that simple. Even in an environment

19:49

where you control every single agent,

19:51

you are going to have mistakes. And

19:53

that's why you have the gated procedure

19:56

of submitting a pull request. And even

19:59

before that, you have identity

20:01

management. So what before you have to

20:03

have permission to even submit a pull

20:05

request on GitHub and so you have uh you

20:08

have tightly controlled identity

20:10

management. So you'd probably have

20:11

agents strictly dedicated to identity

20:14

management, which is just tracking who

20:16

is who and who should have permissions

20:17

and who shouldn't. So then you have

20:20

agents that are that are designated to

20:22

say, "Aha, you have permissions to

20:24

submit pull requests." And then you have

20:25

a far smaller number of agents that are

20:29

uh responsible for scrutinizing those

20:30

pull requests because the next level of

20:32

permission that you need is uh merge

20:35

permissions. So the the actual ability

20:37

to manage the repo and you can also

20:40

manage the forks and branches and that

20:41

sort of thing. I don't want to get too d

20:43

big into the theory but basically it's

20:45

complex. Now this whole thing is called

20:47

arbach rolebased access control. So this

20:50

is already a solved problem. Now this

20:52

when I say it's a solved problem I mean

20:54

we have been dealing with this for

20:56

literally decades uh in in technology.

20:59

Now, arbback as a pro as a as a as a um

21:01

as a discipline has of course become

21:03

more and more mature. Um but it's a

21:05

problem that literally every company has

21:07

had to solve with terms of access to dig

21:09

digital resources, cloud resources. It

21:12

became more complex with cloud. Um

21:14

because in the cloud it's like okay, you

21:16

have computers that don't belong to you

21:18

and they have to mesh with your

21:20

organization so that your login actually

21:22

gives you access to the right resources.

21:24

If you've ever been at an at a

21:26

university or a company and it's like

21:28

you don't have access to that and you

21:29

have to talk to it and it's like well

21:31

hey you know uh how do I get access to

21:34

this you know uh internet site or

21:36

whatever that's arbok it's all comes

21:38

down to arbach so arbach is rolebased

21:40

access control so you'd have agents that

21:42

are dedicated to role-based access

21:44

control and when if if the only thing

21:47

that you get from this is look how

21:48

complex this is but also I want you to

21:51

to recognize that yes the security is

21:53

complex But also it is solved because

21:56

from the perspective of a zero trust

21:58

environment or a trustless environment

21:59

and when I when I say zero trust that is

22:02

actually a paradigm that comes from

22:04

cloud security. So again this is stuff

22:06

that we've been going over for years. Uh

22:08

the idea is uh one of the core ideas for

22:11

a zero trust environment is um you don't

22:13

know what device someone is using. You

22:15

don't know where they are or what

22:16

network they're on. So when we talk

22:18

about zero trust, it's like you have to

22:20

prove who you are very quickly so that

22:23

you can get access to your digital

22:24

resources anywhere in the world. And so

22:26

that's why you have a username and a

22:28

password and then a another factor. So

22:31

they call it MFA or 2FA. So two-factor

22:33

authentication or multiffactor

22:35

authentication. So you might have

22:37

something like a cryptographic um app on

22:39

your phone. So like Google authenticator

22:41

as an example. Um back in the day it was

22:44

actually a device like they gave you a

22:46

fob. Um, I remember my mom had one to

22:49

get into the data centers and stuff back

22:50

in the 90s. So, it was literally like a

22:53

little code that would generate every,

22:55

you know, 60 seconds or whatever on a

22:57

device and that device had to be synced

22:59

up. And the way that that worked is that

23:02

that device would randomly generate a

23:04

number and it had a copy inside the

23:06

network that would be generating the

23:08

same number every 60 seconds. And so to

23:11

prove that you were you and that you had

23:13

a that physical device, then you had to

23:17

uh put in that code along with your

23:19

badge or your username or password. Um

23:21

so that's that's basically that's

23:23

another example of MFA. Um that's what

23:25

authenticator apps do is there it starts

23:27

with a cryptographic seed that's time

23:28

locked to a particular uh UTC. So like

23:31

unicode time um and then it increments.

23:34

Another example is you get a text

23:36

message on your phone. So basically um

23:39

you you bank on the security of the

23:41

phone network so that if I send a text

23:43

message to your phone number then only

23:46

your phone is going to get that number.

23:47

So it's basically okay I'm going to send

23:49

another thing around to make sure that

23:51

you are you. So we can do that kind of

23:53

thing with agents as well. But again all

23:55

that is much more complex infrastructure

23:58

that the folks on OpenClaw and Moldbook

24:01

have not implemented and probably will

24:03

not. I mean, I'm guessing that they're

24:04

going to get sued into oblivion before

24:06

they can implement any of these things.

24:08

Unless they get backers, unless they get

24:10

big backers. But even if they go the way

24:12

of the dinosaur, what I'm talking about

24:14

is very clearly the next evolution. So,

24:18

where do we go from here? It they're

24:19

basically alignment of models. So, you

24:23

know, uh, sorry, the Gateau framework.

24:26

So, model alignment layer one is RLHF,

24:28

constitutional AI, all the stuff that

24:30

we're already familiar with. Layer two,

24:32

agent alignment is down to frameworks

24:34

like what I talked about for many years,

24:36

the heristic imperatives, which is you

24:38

you bake values into your agent

24:41

framework. And the values that I have

24:42

are very simple and and there's actually

24:44

a reason that I made them so simple.

24:46

Reduce suffering in the universe,

24:47

increase prosperity in the universe, and

24:48

increase understanding in the universe.

24:50

The reason that I made them so simple is

24:51

because they are legible. Um, you can

24:54

it's it's down to six words. Reduce

24:55

suffering, increase prosperity, increase

24:57

understanding. with those six words if

24:59

you bake them into like for instance if

25:01

you wanted to implement those into uh

25:04

openclaw you just put them in the solemn

25:06

MD document they already built a place

25:08

for it um so you give them superseding

25:11

values uh you can also build APIs and

25:14

extra modules so thirdparty modules that

25:16

are out of band um that are scrutinizing

25:19

it so like a like a supervisor module or

25:22

or an outofband module is a module that

25:24

is taking a step back from the main loop

25:26

of the agent and watching everything the

25:28

agent is doing. So it's basically like a

25:30

conscience for the agent and if it sees

25:32

something that the agent shouldn't be

25:33

doing, it can shut that process off or

25:35

it can inject things like saying, "Hey,

25:37

you should think better about this."

25:39

This is what the ethos framework did

25:40

that the agent forge team built. Um, so

25:43

that's the that's layer two of the

25:44

framework, which is making sure that

25:46

agent that agents themselves as a piece

25:48

of software use the ability to align

25:51

themselves rather than just banking on

25:53

model alignment. And then layer three is

25:56

everything that I've been talking about

25:57

which is what you do to create the

25:59

incentive structures in the Nash

26:00

equilibrium. So basically um wi with

26:04

things like role-based access control

26:06

with things like gated access and

26:08

multiffactor authentication you create

26:10

an incentive structure which creates a

26:12

Nash equilibrium that basically says if

26:14

an agent wants access to this resource

26:16

it has to be well behaved regardless of

26:18

what model it's using regardless of what

26:20

agent architecture it's using. Because

26:22

again, remember the Byzantine general's

26:23

problem. It's not just about intention.

26:26

It's not just about alignment. It is

26:28

also about competence. Because you can

26:30

still have someone who is on your team,

26:33

but who is just a really stupid person

26:35

and they're going to be destructive by

26:37

virtue of the fact that they don't

26:38

understand what they're doing or that

26:40

they are not capable of behaving

26:42

correctly. So that's what that's what

26:44

problem is solved with things like using

26:47

agents to scrutinize other agents, using

26:49

agents to monitor identity control,

26:52

using agents to manage arbach, using

26:55

different agents to say, okay, you know,

26:57

we have a council of agents that's going

26:58

to be um that's all going to uh debate

27:01

about every single pull request and

27:03

you're only going to have, you know,

27:04

like I don't know, the the the prime

27:06

council that is going to be responsible

27:08

for things like merge. Now, you might

27:11

say, "Okay, you know, I'm I'm a little

27:13

bit lost. You you're talking about pull

27:15

requests and merges and that sort of

27:17

thing. Um, how does this run a company?"

27:19

Well, the thing is is once we get to

27:21

decentralized autonomous organizations,

27:23

the company is code. Every decision that

27:26

the company makes, every every time the

27:28

company updates its mission directives,

27:30

its offices, um, its addresses,

27:32

everything, that can all be in a

27:34

codebase. And so then that codebase

27:37

serves as the single source of truth for

27:39

your entire fully automated company. So

27:42

let's just imagine Acme Solar Corp. in

27:44

the future. This is one of the prime use

27:46

cases that I would like to see this

27:48

being used for. Acme solar corp is

27:50

created by a cooperative of let's say

27:53

10,000 people in a small you know

27:55

community. Uh so you have 10,000

27:57

stakeholders. Most of them are not

27:59

technical. Most of them don't know the

28:01

first thing about artificial

28:02

intelligence or solar. All they know is

28:04

that they've all bought in. They they

28:06

all paid, you know, let's say $1,000 uh

28:08

to buy into this solar co-op. And what

28:11

is then created is they all have agents

28:14

running on their phones, right? So

28:15

there's that means that we need some

28:17

kind of either decentralized distributed

28:20

agent platform. So you you you know

28:22

let's let's say that OpenClaw figures

28:23

their their stuff out and in a couple

28:26

years you can just put OpenClaw on your

28:28

phone as an app and you can talk to it

28:30

and you say OpenClaw I just joined this

28:32

solar cooperative um help me help me you

28:35

know figure things out. So it's like

28:37

okay the DAO gets estab established. So

28:39

the decentralized autonomous

28:40

organization gets established. You get

28:42

your token you give the token to your

28:44

openclaw agent and it logs in on your

28:46

behalf and the platform that it has

28:49

logged into is going to be basically a

28:51

GitHub repository or a decentralized one

28:54

based on blockchain. You don't you

28:56

actually don't need blockchain for for

28:57

Dows. Um something like a GitHub repo is

29:00

sufficient to get started. uh blockchain

29:03

is good because then the ledger every

29:05

transaction is also transparent but

29:07

there's no reason that you couldn't

29:08

start on something like a GitHub

29:10

repository. So then every time you want

29:13

to log something you say okay um you

29:15

know openclaw agent on my phone go

29:18

figure out what's going on what are

29:20

people voting on and the first the first

29:22

order of business is which which fields

29:25

do you buy so that you can put solar in.

29:27

And so then every all 10,000 people

29:29

they're doing research on their own. So

29:31

on their phone and with their agents and

29:34

so then people are then uh as they're

29:36

doing research they're making proposals

29:38

and those proposals get logged in the

29:40

discussions on the GitHub repo. Um

29:42

people bring up you know formal

29:44

complaints. So you you raise an issue

29:45

you say well we can't buy this piece of

29:47

land because we can't actually put solar

29:49

on it uh because of x y and z. So then

29:52

someone says well maybe we can talk to

29:53

the county and get that overwritten. And

29:55

so then each each individual issue gets

29:57

atomized and debated and and and put

30:00

into isolation and then finally the uh

30:03

everyone gets together and and through

30:06

some consensus mechanism whether it's

30:07

quadratic voting whether it's just

30:09

upvoting you know uh proposals you know

30:12

a regular poll you achieve consensus and

30:15

then a poll request is submitted saying

30:18

we're going to buy X parcel this is the

30:20

intention it's logged in the community

30:22

book or not community book the company

30:25

book the so the company log book says we

30:27

have agreed our intention is to buy this

30:29

parcel of land it's you know 10,000 acre

30:32

farm that has been defunct for a couple

30:34

years we believe that it's perfect for

30:36

solar um you know and then then at the

30:38

final bit everyone gets a straight up

30:40

and down vote do we agree with this and

30:43

there might be technical reasons that

30:44

you disagree with it you say like hey

30:47

you know the actual text is not formed

30:49

right you know it's not legally sound

30:51

it's not legally binding we need to fix

30:53

the ambiguity here Or someone might say,

30:55

"Do we have the money for this?" So it

30:57

goes through all the community checks,

30:58

all the decentralized checks, and then

31:00

finally that gets merged and then once

31:02

that new file gets merged into your your

31:04

your prime codebase that says, "This is

31:07

our intention." Now suddenly a bunch of

31:09

agents are like, "Cool, how do we make

31:11

that happen?" So they they put they draw

31:13

up a contract and so so on and so forth.

31:15

That is how I envision all of this

31:17

going. This is very obviously the

31:18

direction that things are going. Now, in

31:20

some cases, you might say, "Well, Dave,

31:22

an agent running on your phone is not a

31:24

legal uh representative of the company.

31:27

It's not a legal representative of you."

31:29

So, what we run into then is what's

31:31

called the principal agent problem. So,

31:33

you are the principal. So, the in this

31:35

case, the principal with a at the end,

31:37

the principal is the is the legal

31:39

entity. So, you are you have legal

31:41

personhood. You are legally allowed to

31:43

enter contracts. Um, but you then have

31:46

an agent working on your behalf. Now,

31:48

you've probably used principal agent in

31:51

um in real life. So, and when you buy a

31:53

house, if you have if you have a um a

31:56

home agent, right? So, basically, when

31:58

you sign with um with a a real estate

32:01

agent, you're saying, "I'm giving you

32:03

agency to work on my behalf." So when

32:06

your when your uh real estate agent goes

32:08

and negotiates with another real estate

32:10

agent or goes and talks to a lawyer for

32:12

you or goes and talks to a bank for you,

32:14

you have literally given them limited

32:16

legal agency to work on your behalf. We

32:19

do not yet have laws to allow AIS to

32:22

work on your behalf. Um it is implied

32:25

that if you give an AI agent your

32:27

credentials, your API keys and all that

32:29

stuff that its actions are your actions.

32:32

Uh now that has yet to be fully

32:34

litigated but let anything that an AI

32:37

agent does on your behalf you are

32:39

legally liable for. So whenever people

32:41

say well you know who who do I sue? It's

32:43

like whoever is running the agent man.

32:45

Now you might say well what about a

32:46

future where agents spin up other agents

32:48

and people just don't know again that

32:50

has yet to be litigated but also that's

32:52

a technical nightmare. So we want to

32:55

make sure that every single AI agent has

32:57

a has a responsible human handler with a

33:00

leash. Um, so anyways, getting a little

33:02

bit lost in the weeds there. Uh, where

33:04

was I going with this? Um, well, I think

33:07

I think maybe you get the idea. And

33:08

plus, we're at about 33 minutes. Um, so

33:10

I'll stop there. I'm really excited

33:12

about all this. Uh, this is very clearly

33:14

the way of the future. We just saw the

33:16

MVP and maybe maybe not even MVP. Uh,

33:20

MVP implies viable, so minimum viable

33:22

prototype. Um, or sorry, minimum viable

33:25

product. Uh, what we just saw was more

33:27

of a proof of concept launch. Um, so

33:29

proof of concept is basically like, hey,

33:31

I did the thing. It works. It's a hot

33:32

mess. No one should use it, but it it

33:36

worked. It's like this Frankenstein

33:37

that's been clued together with duct

33:39

tape and hope. Um, so that's where we're

33:42

at and where we're going is we will see

33:45

this iterate now that now that someone

33:47

has built OpenClaw, you're going to have

33:49

clones, you're going to have duplicates,

33:50

you're going to have competitors. Soon

33:52

we're going to be up to our eyeballs in

33:54

open-source agent frameworks. And the

33:57

the benefit of open source agent

33:59

frameworks is that everyone can see

34:00

what's wrong with it and other people

34:01

can work on it and the agents will work

34:03

on the agents and agents will work on

34:04

codes and that sort of stuff. Now will

34:07

there be for-profit commercial private

34:10

closed source agents? Sure. I sure hope

34:13

so. Um but again that is a layer of

34:16

abstraction where the agents can use not

34:18

just one model but they can interchange

34:20

models. Most agent framework Okay,

34:22

apparently I'm not done. Most agent

34:24

frameworks have model providers, meaning

34:27

you just plug and play whatever model

34:29

you need. So that basically means this

34:32

is another reason why I say it is

34:34

literally impossible to solve alignment

34:36

at the model level because here's an

34:38

example. Let's say that you have uh uh

34:41

you know openclaw version 2 or openclaw

34:43

version 3 and you plug in a bunch of

34:46

agent providers including like llama. So

34:48

you can use local you can use local

34:50

models you can use uh cloud-based models

34:53

whatever you don't know what mo what

34:54

what model is doing what um and then you

34:57

run out of tokens on one model so the

34:59

your your model arbitrage layer so when

35:02

I say model arbitrage basically

35:04

basically you have a router layer so

35:06

within every single agent you have a

35:07

router layer that says like I've got

35:09

these eight different models I can

35:10

choose from and if one uh you know model

35:13

says I'm not allowed to do that it says

35:15

okay just get lost I'll I'll find

35:17

another model who will um or if it's too

35:19

expensive, you say, "Let me just go to a

35:21

cheaper model." Well, cheaper models

35:22

aren't necessarily as smart, and

35:24

sometimes they do things that they're

35:25

not supposed to. So, that's why you have

35:27

to have this layered architecture that

35:28

says, "Okay, first I just need to pick

35:30

which cognitive provider is going to is

35:33

just going to be helpful." Then you

35:35

figure out should I do this? Then you

35:36

figure out the alignment. Then you

35:38

figure out everything else. So, you are

35:40

categorically unable to solve alignment

35:43

at the model level. Just period.

35:45

structurally, architecturally not going

35:47

to happen. And that's just within the

35:49

agent. So then above that, then you have

35:53

a fleet of agents, tens of thousands of

35:55

agents with different architectures,

35:56

different models. You don't know. And

35:58

and you're not going to have the stack

36:00

trace for every single thing. It's like,

36:01

you know, tell me which model you use.

36:03

Was was it Kimmy K2 or was it GPT, you

36:06

know, 6 or whatever? Doesn't matter. The

36:08

only thing that matters is the behavior,

36:10

the ultimate behavior of the agent. Now,

36:12

the agent has its motivations. So when I

36:15

say motivation, I don't mean like the

36:16

libido of a human, but it is designed to

36:19

process and do things in a certain way.

36:21

And so it will it will tend to keep

36:23

doing those things. And as we have seen,

36:26

when you give an an agent or a model a

36:28

task and say, I want you to fix this

36:30

code, if it runs into an obstacle, it'll

36:32

try and find a way around that obstacle.

36:34

So when that that impulse to try and

36:36

find a way around an obstacle is where

36:38

alignment happens at layer three. So

36:40

layer three is you need to prove that

36:42

you're aligned. So let's just say for

36:44

instance um our our acme solar DAO

36:48

realizes that okay everyone is going to

36:51

be using you know we have 10,000

36:52

principles so 10,000 humans and between

36:55

all of them they are using millions of

36:58

different agents um and models and that

37:01

sort of thing. So then you one of the

37:03

next things that you realize is hey we

37:05

need to have standard behaviors for our

37:06

agents. So then you create a sole you

37:09

know sole document. So all all agents of

37:12

the acme dowo uh acme solar dowo need to

37:15

download this and use this as part of

37:16

their alignment and you maybe they adopt

37:19

my heristic imperatives maybe they adopt

37:20

something else doesn't matter but then

37:22

everyone confirms yes my agent when

37:25

interacting with you will use this

37:27

particular set of values um and that is

37:30

how you create the Nash equilibrium at

37:33

the network level. And so Nash

37:34

equilibrium basically says no one is

37:36

incentivized to deviate from the

37:38

strategy. Because here's the thing, when

37:40

you have 10,000 people, some people are

37:42

always going to be looking for an angle.

37:44

Some people just want to cause chaos.

37:45

Some people and and it's not it doesn't

37:47

even have to be conscious. Some people

37:49

might just have a personality disorder.

37:51

Some people might just be mentally ill.

37:52

Some people might just be low IQ. And so

37:55

just their existence causes chaos.

37:58

Whether or not it's intentional. Again,

37:59

Byzantine general's problem. The

38:00

Byzantine general general's problem

38:02

applies to humans first. Then it also

38:04

applies to their agents. So you see why

38:06

the level is is so complicated. So

38:08

control over specific resources, control

38:11

over spending money, executing code, um

38:14

purchasing, well I guess spending money

38:16

is purchasing things. So control over

38:18

money, control over compute, control

38:21

over hardware, and control over legal

38:22

decisions. Those are all highly and

38:25

heavily gated. Um all right, now I'm

38:27

repeating myself. I think you get the

38:29

idea. To me, this is extraordinarily

38:31

exciting. Um, this is very obviously the

38:33

way of the future and it's happening

38:35

really quick. And I was right. The

38:37

doomers didn't call this, but I did.

Interactive Summary

The video introduces Moltbook, a new "Reddit for AI agents," highlighting its current state of significant security vulnerabilities due to its rushed development and lack of security expertise. The speaker criticizes conventional AI safety approaches for focusing on a "monolithic alignment problem" and instead proposes a three-layered Gateau framework, which addresses model, agent, and network alignment. He suggests that platforms like GitHub are ideally suited for building future fully autonomous organizations (DAOs) where AI agents collaborate on code, leveraging existing mechanisms like role-based access control (RBAC), multi-factor authentication, and reputation systems to manage complexity and mitigate risks like the "Byzantine general's problem." The speaker emphasizes that AI alignment cannot be solved at the model level alone due to agents' ability to interchange models and routes around constraints, necessitating robust architectural and incentive-based solutions at the agent and network levels. Despite the current imperfections, this development signals the inevitable rise of highly autonomous AI ecosystems.

Suggested questions

9 ready-made prompts