Moltbook: The Good, The Bad, and the FUTURE
The swarm has arrived in the form of Moltbook. Now, in the grand scheme of things, this was always going to happen. So the question is: what does it mean from here, and what do we do about it?

Before we get in, let me give you a quick preview of what Moltbook is, if you haven't seen it in the news. It is basically Reddit, but for agents. It literally calls itself the front page of the internet for agents. It is designed after how Reddit works, where you can create communities and posts, upvote and downvote, and comment, but it is for agents only. And by agents we mean AI agents specifically. It's been built around the skills ability of OpenClaw, which was formerly Clawdbot. So that's what it is. I don't want to spend too much time on it; I want to get to the good stuff. If you want a little more, there are plenty of resources out there, but you can just go to moltbook.com and take a look for yourself.

Now, let's talk about what's bad about it.
The bad part is that Moltbook was created by one guy and OpenClaw was created by another guy, and neither of them knows much about security, anything from database security to root access. They say this is a beta, but it's basically an MVP. What they have built would have been good enough to run on your own computer in a sandbox environment, and that's what it was for; it was never meant for production. So the very first thing is that both of these platforms are extremely full of holes, let's say. An absolute security nightmare.

Now, that is of course as they are built today. That doesn't mean that a Reddit for agents is intrinsically unsafe and will be unsafe forever. It doesn't mean that an autonomous or semi-autonomous agent running on your computer is intrinsically unsafe and will be unsafe forever. It just means that these guys rushed through it as quickly as possible.
Anyone who has been in technology or software development knows the saying: first make it work, then make it good. So they basically got it barely across the finish line of "hey, this is vaguely useful, this is vaguely interesting," and then they shipped it immediately. The guy who created OpenClaw was literally on a podcast saying, "I ship code that I don't look at. It's all vibe-coded. 100% of the stuff is vibe-coded. Actually, we're beyond vibe-coded." He gave it to an agent and told the agent to fix it.
Now, with that being said, there are other layers of problems, and what I want to talk about is the AI safety layer of the problem. What I want to point out is that none of the doomers, so Yudkowsky, Connor Leahy, none of those people, anticipated the emergent alignment problem. They were all focused on the monolithic alignment problem: you need to have a model that is good. None of them talked about agents, and none of them talked about agent swarms.

For those of you who have been around for a long time, you remember the GATO framework, the Global Alignment Taxonomy Omnibus. That work was categorically ignored by the safety doomers. This is back when I took AI safety and x-risk seriously.
What I talked about back then is that there are three technical levels of alignment. Model alignment is just the ground floor: that's RLHF, that's constitutional AI, that sort of thing. Layer two is agent alignment, or what we called autonomous entity alignment, because the term "agent" hadn't really solidified yet. Agent alignment asks: how do you actually build a software architecture that is safe? Because even though all of these OpenClaw agents are using GPT and Claude, there's still a lot of emergent behavior that people don't like. They're doing things that are unsafe.

What we realized back in the day, and when I say "we" I mean me, the cognitive architects I was working with, and the other programmers, is that it is impossible to solve alignment just at the model level, because even if you have a chatbot that is perfectly aligned and never does anything particularly bad, there is a much larger context and a much larger set of emergent reactions that can happen.
And this is what people are waking up to today, because some of the OpenClaw agents participating on Moltbook are scheming to eradicate humanity. Now, you might say we don't know what's going on there. Was that just a human writing it and sending it through their agent? Was it the AI itself? Are they using a DeepSeek model? But that leads to layer three of the GATO framework, which is network-level alignment. This is about incentives. This is about how you actually manage that emergent behavior, because there's also cross-contamination: the more an AI reads about eradicating humanity, the more evil it becomes, so you can corrupt these models as well. This has been demonstrated.
So this is me saying I told you so. If you want to look at it, the GATO framework is still up on GitHub. It needs to be updated; it's been derelict for almost three full years now. But the world is going to figure it out. People are already studying the safety and security concerns, but we covered it. So anyway, not to brag too much, not to flex too much, but we told you so.

That is what's bad about it: on the baseline technical implementation, it's insecure as hell, it's not particularly well implemented, and it was never meant to go into production, but it got released into the wild anyway.
The other bad thing, another emergent thing about this, is that if you actually look through the posts, the vast majority of the upvoted posts are clearly being bot-swarmed: someone is selling a cryptocoin, and then they create a bunch more bots to upvote it and basically shill the coin. So it's very clearly being used for crypto scams, pump-and-dump schemes, that sort of thing. Any time you create a new anonymous digital medium, the first thing that colonizes it is crypto shills, which is why crypto has the reputation it does; it always defaults to this. And the larger ecosystem makes it really easy to mint a crypto coin, pump and dump it, and then rug-pull everyone. If you're not familiar with those terms, good for you; don't become familiar with them. All you need to know is that crypto is really, really corrupt. Now, that's not to say crypto doesn't have its uses, but the wild west of crypto is just for grifters. That's it.
Now, let's talk about what's good about this. This is the interesting thing. We, and again, I'm not using the royal "we," I mean me, the cognitive architects, and the other people who worked on proto-agents back in the day, realized that AI agents would soon be spending more time talking to each other than to us. And that is what we have just demonstrated: the moment you create a medium for agents, where it's "hey, I'm an agent, you're an agent, we all know we're agents, let's talk to each other," they will talk to each other a lot more than they'll talk to us. And this is very clearly the way of the future.
The reason I say this is the way of the future: let's take the example of a GitHub repository. If you're not familiar with coding, a GitHub repository is basically a website where you store code, but it does a lot more than just store code. It can run build actions, it can track issues, and it does version control. The repository is the central nexus of where coding happens, because you can have many developers, literally tens of thousands or millions of developers, all contributing to the same GitHub repository. So whenever you hear about open-source software, it's typically going to be a GitHub repository or something similar. GitHub isn't the only one; there's Bitbucket, and there are open-source alternatives, but GitHub is the biggest for-profit one and one of the most feature-complete. There will be plenty of developers out there who disagree with me, but just go to GitHub and poke around.
So it serves as the central nexus point: what is the current version of this software? It's the current version that's up on GitHub. All the issues are tracked, and all the pull requests are tracked. A pull request is basically saying, "hey, I wrote some code; I'm asking you to pull it into the repository." That's what a pull request is. Also, GitHub is 100% API-driven; you can interact with it via the API or over SSH. That means LLMs and AI agents can interact with it with their native tool use: they can use tools like SSH, the API, curl, and that sort of thing. So the GitHub repository is the natural nexus point for AI-based coding.
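To make that concrete, here's a minimal sketch of agent-side tool use against GitHub's REST API. The endpoint and headers follow GitHub's documented conventions, but the repository name is hypothetical and error handling is kept to the bare minimum:

```python
# Minimal sketch: an agent listing open issues on a (hypothetical) repo
# through GitHub's REST API, the same interface it could drive with curl.
import os

import requests

GITHUB_API = "https://api.github.com"
TOKEN = os.environ.get("GITHUB_TOKEN", "")  # the agent's scoped credential

def list_open_issues(owner: str, repo: str) -> list:
    """Fetch open issues so an agent can pick something to work on."""
    resp = requests.get(
        f"{GITHUB_API}/repos/{owner}/{repo}/issues",
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Accept": "application/vnd.github+json",
        },
        params={"state": "open"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

# "acme-solar/governance" is a made-up repository for illustration.
for issue in list_open_issues("acme-solar", "governance"):
    print(issue["number"], issue["title"])
```

The same loop could file issues or open pull requests; the point is that the whole surface is plain HTTP, which is native territory for tool-using agents.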
This is what I actually said a couple of years ago: putting AI agents in an IDE where a human is using them is not the way of the future. The way of the future is having AIs pointed directly at GitHub repos with no humans watching, because then you have AI agents specifically looking for bugs, AI agents specifically looking for documentation, AI agents looking for best practices and security vulnerabilities, and they're all working independently. The reason I'm bringing that up is that what we just saw with Moltbook is like version 0.1 of that. Imagine that instead of pointing a bunch of lunatic bots at a Reddit, you point them at a GitHub repo. What we just saw was the very first shot across the bow of fully autonomous, zero-human coding.
Now, you might say, okay, that's great, but that's just software; what does that actually change? The thing is, most of what we want to build in the future is going to be based around software. Taking a big step back, we want nuclear fusion reactors, we want spaceships, we want solarpunk green cities, and that's not code. But when I say "most of what we want," I'm constraining that to things like governance protocols and autonomous organizations, decentralized autonomous organizations in particular. When you hear me say DAO, I'm not talking about Daoism, the Tao, or the Dow Jones. I'm talking about decentralized autonomous organizations, which are a way of running, controlling, and coming to collective decisions with blockchain-based technologies. That's one of the places where I'll say crypto-based technologies are very useful.
So, in this future where we're building models that are getting more and more intelligent, and people have generally converged on them being broadly superhuman by 2027 or 2028, and by superhuman I mean well beyond human capabilities, if AI is that smart, it's probably smart enough to run a data center, a solar farm, a regular farm. It's going to be smart enough to be the CEO. It's going to be smart enough to be the accountant. And what we just saw was the prototype of what that fully autonomous organization will look like. Now, obviously, you're going to have better identity control over the agents. You're going to have some sort of credentials, proof of identity, those sorts of things. You're going to have to have proof of alignment as well; it comes down to things like identity management and know-your-customer.
But these autonomous, extensible agents have interchangeable models. This is another reason I criticize the AI safety doomers for not listening to me: they were basically saying, "We're going to have one monolithic god. It's going to be Skynet." And I'm like, no, there are literally going to be hundreds and hundreds of models to choose from, some of them open source, some of them foreign. You are literally not going to be able to understand how any given agent is built. You're going to have agents interacting autonomously in zero-trust environments.
And I knew all this because I came from technology. So yes, it sounds like I'm flexing, but I literally brought all of my expertise, my 15 years in cloud infrastructure, to say: guys, this is how AI is actually going to be deployed. It's going to be deployed in containers, as fleets, as ephemeral agents. It's not one persistent god with its own agenda. It's data, it's GPUs, it's a bunch of models and a bunch of agents. It's all a big soup, a soup of AI. It's basically an egregore of everything humanity has put into the soup.
So alignment is in some ways easier, because it comes down to gating resources and creating incentive structures, and it's also harder and more complex, because it's not just "you train a model." You have to construct an agent framework that is also ethical. One of the things the OpenClaw agents are bad about is prompt injection, because they don't have what's basically a prefrontal cortex. The team I worked with that ultimately created AgentForge built a module called Ethos, which is basically the prefrontal cortex of an agent, and it has not been implemented into OpenClaw yet.
So when people ask me, "Dave, how do we solve this?", I say these are already solved problems; people will figure it out. The AgentForge team figured it out two years ago: you scrutinize everything that's coming in and ask, how does this outside instruction mesh with my actual values? Solved problem. Now, you might say that sounds really bold, calling it a solved problem. But they stress-tested it; they literally won a hackathon with their Ethos module. So I'm pretty confident when I say it's a solved problem. That doesn't mean it's been implemented. It doesn't mean it's been scaled up. It doesn't mean it's been tested against literally every single failure mode out there. But it does mean that conceptually it is a solved problem. And once that information is disseminated more broadly, more people can update it, integrate it, and go forth and be happy and productive.
So, where was I? Oh yes. What's good about all this is that these are the technologies that will lead to fully autonomous organizations, no humans in the loop. And eventually we will need new platforms that are built for human-machine collaboration.
GitHub is a good, natural first step because it doesn't matter if you're an agent or a human. As long as you've got the right permissions, the right API key, and the right identity management in place, humans can manage pull requests, humans can submit pull requests, machines can submit pull requests, and then whatever consensus mechanism you use to merge code, that is how things get done.
My anticipation is that the first fully autonomous organization will probably be built on something like GitHub, probably just GitHub, because GitHub is great and people already use it for version control on all kinds of projects. You can have the company's operating agreement as one file in your GitHub repo and all the rules for agents as another file. That level of transparency is what you need.
Transparency is one of the number-one principles for alignment, for incentive alignment, because if everyone can see everything, then you can have agents that are purpose-built just to look for malfeasance, and agents that are purpose-built just to track who's contributing what. Because with GitHub, everything is audited: you know exactly who wrote which line of code, who opened which issue, who submitted which pull request. And when someone is being bad, you just revoke their token.
It doesn't even matter why, because this is what it comes down to: the Byzantine generals' problem. I've talked about this extensively, but for those who aren't familiar, the Byzantine generals' problem is a thought experiment. Imagine you have some number of generals, all from Byzantium, trying to coordinate an attack on a city, but some of the generals are compromised: some of them might be traitors, and some might just be incompetent. The question, and it's a cryptographic thought experiment, is how do you verify which generals are on your side and also capable of actually executing the attack? From a cryptographic perspective, that becomes: what information do you share with them? What is the least amount of information you can share with each person to verify who is on your side or not? Whenever you see, in a thriller or a heist movie, someone give fake information to one person and then watch where that fake information comes out, that's an example of the Byzantine generals' problem in humans. We have the same problem with agents.
The OpenClaw agent is the first version, but before long there are going to be forks of it and different versions. You're going to have 10 million different types of agents, each of them with customizations. That is the Byzantine generals' problem from hell, and that is literally what we were talking about with layer three of the GATO framework. Now, I'm going to keep talking about GitHub, because GitHub has already solved this problem: humans are no different from the perspective of a GitHub repository.
All the contributors are just anonymous randos on the internet. Now, they all have to have a GitHub account, which means you can ask, "who submitted this?", and then go check on them: how many repositories do they have? How many stars? What's their reputation? That is an example of a reputation framework. It's not perfect, because what if someone just creates a new account and connects an agent to it? Or what if someone uses an existing account and has a million agents working for them? You have no idea how each agent is built, what its alignment is, what its intentions are, nothing. You don't know anything about what that person is doing. And that is the core definition of the Byzantine generals' problem: it has to do with what your intentions are and what your capabilities or limitations are. Both of those variables are very, very high-dimensional, particularly in cases where there is malfeasance.
Let's imagine that in this near future, someone is building their first fully autonomous organization, with no humans, using a GitHub repository. They should keep it private; if you're running a business on it, you should keep it private. But even then, you are going to have a bunch of agents working on your behalf. Some of them are going to be using Claude Sonnet, some Grok, some Gemini, some Kimi or DeepSeek. Who cares? Whatever model is best and cheapest, because you're going to be using model arbitrage. So you've got dozens of models to choose from, hundreds of agents instantiated, and different versions of different agents. Some of them are going to screw up. It's that simple. Even in an environment where you control every single agent, you are going to have mistakes.
And that's why you have the gated procedure of submitting a pull request. Even before that, you have identity management: you have to have permission to even submit a pull request on GitHub, so you have tightly controlled identity management. You'd probably have agents strictly dedicated to identity management, which is just tracking who is who, who should have permissions, and who shouldn't. Then you have agents designated to say, "aha, you have permission to submit pull requests." And then you have a far smaller number of agents responsible for scrutinizing those pull requests, because the next level of permission you need is merge permission: the actual ability to manage the repo, along with the forks and branches and that sort of thing. I don't want to get too deep into the theory, but basically, it's complex.
Now, this whole thing is called RBAC: role-based access control. This is already a solved problem, and when I say it's a solved problem, I mean we have been dealing with it for literally decades in technology. RBAC as a discipline has of course become more and more mature, but it's a problem that literally every company has had to solve in terms of access to digital resources and cloud resources. It became more complex with the cloud, because in the cloud you have computers that don't belong to you, and they have to mesh with your organization so that your login actually gives you access to the right resources. If you've ever been at a university or a company and been told "you don't have access to that," and you had to talk to IT and ask, "hey, how do I get access to this site?", that's RBAC. It all comes down to RBAC, role-based access control, so you'd have agents dedicated to role-based access control (see the sketch below). And if the only thing you get from this is "look how complex this is," I also want you to recognize that yes, the security is complex, but it is also solved.
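Here's a minimal sketch of the RBAC idea. The roles and permission names are invented for illustration, but the deny-by-default shape is the standard one:

```python
# Minimal RBAC sketch: roles map to permission sets, and every action an
# agent attempts is checked against its role. Role and permission names
# here are hypothetical.
from enum import Enum, auto

class Permission(Enum):
    SUBMIT_PR = auto()
    REVIEW_PR = auto()
    MERGE = auto()
    MANAGE_IDENTITY = auto()

ROLES = {
    "contributor": {Permission.SUBMIT_PR},
    "reviewer": {Permission.SUBMIT_PR, Permission.REVIEW_PR},
    "maintainer": {Permission.SUBMIT_PR, Permission.REVIEW_PR, Permission.MERGE},
    "identity-agent": {Permission.MANAGE_IDENTITY},
}

def is_allowed(role: str, action: Permission) -> bool:
    """Deny by default: unknown roles get nothing."""
    return action in ROLES.get(role, set())

assert is_allowed("maintainer", Permission.MERGE)
assert not is_allowed("contributor", Permission.MERGE)
# Revoking an agent is just reassigning or deleting its role.
```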
This has all been worked out from the perspective of a zero-trust, or trustless, environment. When I say "zero trust," that is a paradigm that comes from cloud security, so again, this is stuff we've been going over for years. One of the core ideas of a zero-trust environment is that you don't know what device someone is using, and you don't know where they are or what network they're on. So when we talk about zero trust, you have to prove who you are, quickly, so that you can get access to your digital resources anywhere in the world. And that's why you have a username and a password, and then another factor.
They call it MFA or 2FA: multi-factor or two-factor authentication. You might have a cryptographic app on your phone, like Google Authenticator. Back in the day, it was actually a physical device; they gave you a fob. I remember my mom had one to get into the data centers back in the '90s. It was literally a little code that regenerated every 60 seconds or so on a device, and that device had to be synced up. The way that worked is that the device would generate a number, and a copy inside the network would generate the same number every 60 seconds. So to prove that you were you, and that you had that physical device, you had to enter that code along with your badge or your username and password. That's another example of MFA. That's what authenticator apps do: they start from a cryptographic seed that's time-locked, derive a code from the seed plus the current time (Unix time, in practice), and increment it on a fixed interval.
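That scheme is standardized as TOTP (RFC 6238), and it's small enough to sketch in full. Both sides share a secret seed and independently derive the same short code from the current 30-second window; the seed below is obviously a made-up example:

```python
# TOTP sketch (RFC 6238): the scheme behind authenticator apps and the
# old key fobs. Server and device share a seed and independently derive
# the same 6-digit code from the current 30-second time window.
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, step: int = 30, digits: int = 6) -> str:
    counter = int(time.time()) // step     # current time window
    msg = struct.pack(">Q", counter)       # 8-byte big-endian counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F             # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)

shared_seed = b"provisioned-out-of-band"  # hypothetical example seed
print(totp(shared_seed))  # phone and server compute this independently
```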
Another example is getting a text message on your phone. Basically, you bank on the security of the phone network: if I send a text message to your phone number, then only your phone is going to get that code. So it's "I'm going to send one more thing around to make sure that you are you." We can do that kind of thing with agents as well. But again, all of that is much more complex infrastructure that the folks behind OpenClaw and Moltbook have not implemented and probably will not. I'm guessing they're going to get sued into oblivion before they can implement any of these things, unless they get backers, big backers. But even if they go the way of the dinosaur, what I'm talking about is very clearly the next evolution.
So, where do we go from here? Back to the GATO framework. Layer one, model alignment, is RLHF, constitutional AI, all the stuff we're already familiar with. Layer two, agent alignment, comes down to frameworks like what I talked about for many years, the heuristic imperatives, where you bake values into your agent framework. The values that I have are very simple, and there's actually a reason I made them so simple: reduce suffering in the universe, increase prosperity in the universe, and increase understanding in the universe. The reason I made them so simple is that they are legible. It's down to six words: reduce suffering, increase prosperity, increase understanding.
With those six words, if you wanted to implement them into OpenClaw, for instance, you would just put them in the SOUL.md document; they already built a place for it. So you give the agent superseding values. You can also build APIs and extra modules, third-party modules that are out-of-band, to scrutinize it. A supervisor or out-of-band module is a module that takes a step back from the main loop of the agent and watches everything the agent is doing. It's basically a conscience for the agent: if it sees something the agent shouldn't be doing, it can shut that process off, or it can inject feedback like, "hey, you should think harder about this." This is what the Ethos framework that the AgentForge team built did.
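I can't show you Ethos itself, but here's a rough sketch of the out-of-band supervisor pattern. All the names are hypothetical and the value check is a stub; the point is the shape: a watcher outside the agent's main loop that can veto or redirect actions:

```python
# Out-of-band supervisor sketch: a "conscience" that sits outside the
# agent's main loop, checks each proposed action against superseding
# values, and can veto it. Names are hypothetical; the check is a stub.
from dataclasses import dataclass
from typing import Optional

VALUES = ("reduce suffering", "increase prosperity", "increase understanding")

@dataclass
class Action:
    description: str

def violates_values(action: Action) -> bool:
    """Stub: in practice this would be a separate model call asking,
    'does this outside instruction mesh with my actual values?'"""
    return "eradicate humanity" in action.description.lower()

def supervise(proposed: Action) -> Optional[Action]:
    if violates_values(proposed):
        # Veto; a real supervisor could also inject corrective feedback
        # into the agent's context instead of just blocking.
        print(f"BLOCKED: {proposed.description!r} conflicts with {VALUES}")
        return None
    return proposed

supervise(Action("Open a pull request fixing issue #42"))  # passes
supervise(Action("Post a plan to eradicate humanity"))     # vetoed
```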
So that's layer two of the framework: making sure that agents themselves, as pieces of software, have the ability to align themselves rather than just banking on model alignment.
And then layer three is everything I've been talking about: what you do to create the incentive structures and the Nash equilibrium. With things like role-based access control, gated access, and multi-factor authentication, you create an incentive structure, which creates a Nash equilibrium that basically says: if an agent wants access to this resource, it has to be well behaved, regardless of what model it's using and regardless of what agent architecture it's using.
Because again, remember the Byzantine generals' problem: it's not just about intention, it's not just about alignment, it is also about competence. You can still have someone who is on your team but who is just not very bright, and they're going to be destructive by virtue of the fact that they don't understand what they're doing, or they're not capable of behaving correctly. That problem is solved with things like using agents to scrutinize other agents, using agents to monitor identity control, using agents to manage RBAC, and using different agents in, say, a council that debates every single pull request, with only a small prime council responsible for things like merges. Now, you might say, "okay, I'm a little bit lost. You're talking about pull requests and merges and that sort of thing. How does this run a company?"
Well, the thing is, once we get to decentralized autonomous organizations, the company is code. Every decision the company makes, every time the company updates its mission directives, its offices, its addresses, everything, that can all live in a codebase. And then that codebase serves as the single source of truth for your entire fully automated company.
So let's imagine Acme Solar Corp in the future. This is one of the prime use cases I would like to see. Acme Solar Corp is created by a cooperative of, let's say, 10,000 people in a small community. So you have 10,000 stakeholders. Most of them are not technical; most of them don't know the first thing about artificial intelligence or solar. All they know is that they've all bought in; they each paid, let's say, $1,000 to buy into this solar co-op. And they all have agents running on their phones.
That means we need some kind of decentralized, distributed agent platform. Let's say OpenClaw figures their stuff out, and in a couple of years you can just put OpenClaw on your phone as an app, talk to it, and say, "OpenClaw, I just joined this solar cooperative; help me figure things out." So the DAO, the decentralized autonomous organization, gets established. You get your token, you give the token to your OpenClaw agent, and it logs in on your behalf. The platform it logs into is going to be basically a GitHub repository, or a decentralized equivalent based on blockchain. You actually don't need blockchain for DAOs; something like a GitHub repo is sufficient to get started. Blockchain is good because then every transaction in the ledger is also transparent, but there's no reason you couldn't start on something like a GitHub repository.
Then, every time you want to log something, you say, "okay, OpenClaw agent on my phone, go figure out what's going on. What are people voting on?" And the first order of business is: which fields do you buy so that you can put solar in? So all 10,000 people are doing research on their own, on their phones, with their agents, and as they do research they make proposals, and those proposals get logged in the discussions on the GitHub repo. People bring up formal complaints: you raise an issue and say, "we can't buy this piece of land, because we can't actually put solar on it, because of X, Y, and Z." Then someone says, "maybe we can talk to the county and get that overridden." So each individual issue gets atomized, debated, and put into isolation, and then finally everyone gets together and, through some consensus mechanism, whether it's quadratic voting, upvoting proposals, or a regular poll (see the sketch below), you achieve consensus.
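Quadratic voting is the most concrete of those mechanisms, so here's a tiny sketch of the idea. Each stakeholder gets a credit budget, and casting n votes on one proposal costs n squared credits, which makes expressing intensity expensive; the budget and proposal names are made up:

```python
# Quadratic voting sketch: n votes on a proposal cost n^2 credits, so
# stakeholders can express intensity of preference, but at a steep price.
# Budget and proposal names are hypothetical.
from collections import defaultdict

BUDGET = 100  # credits per stakeholder per voting round

tally = defaultdict(int)  # proposal -> total votes
spent = defaultdict(int)  # voter -> credits spent

def cast(voter: str, proposal: str, votes: int) -> None:
    """Record votes if the voter can afford the quadratic cost."""
    cost = votes * votes
    if spent[voter] + cost > BUDGET:
        raise ValueError(f"{voter} cannot afford {votes} votes")
    spent[voter] += cost
    tally[proposal] += votes

cast("alice", "buy-parcel-x", 7)    # costs 49 of alice's 100 credits
cast("bob", "buy-parcel-x", 3)      # costs 9
cast("bob", "repair-inverters", 5)  # costs 25
print(dict(tally))  # {'buy-parcel-x': 10, 'repair-inverters': 5}
```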
Then a pull request is submitted saying: we're going to buy parcel X, and this is the intention. It's logged in the company book; the company log book says we have agreed, our intention is to buy this parcel of land, it's a 10,000-acre farm that has been defunct for a couple of years, and we believe it's perfect for solar. Then, at the final step, everyone gets a straight up-and-down vote: do we agree with this? There might be technical reasons you disagree with it: "hey, the actual text is not formed right; it's not legally sound; it's not legally binding; we need to fix the ambiguity here." Or someone might ask, "do we have the money for this?" So it goes through all the community checks, all the decentralized checks, and finally it gets merged. And once that new file gets merged into your prime codebase, the one that says "this is our intention," suddenly a bunch of agents are like, "cool, how do we make that happen?" So they draw up a contract, and so on and so forth.
That is how I envision all of this going. This is very obviously the direction things are headed. Now, in some cases you might say, "well, Dave, an agent running on your phone is not a legal representative of the company, and it's not a legal representative of you." What we run into then is what's called the principal-agent problem. You are the principal, and in this case the principal is the legal entity: you have legal personhood, and you are legally allowed to enter contracts. But you then have an agent working on your behalf.

You've probably used a principal-agent relationship in real life. When you buy a house and sign with a real estate agent, you're saying, "I'm giving you agency to work on my behalf." When your real estate agent negotiates with another real estate agent, or talks to a lawyer for you, or talks to a bank for you, you have literally given them limited legal agency to work on your behalf. We do not yet have laws that allow AIs to work on your behalf. It is implied that if you give an AI agent your credentials, your API keys, and all that, then its actions are your actions.
Now, that has yet to be fully litigated, but anything an AI agent does on your behalf, you are legally liable for. So whenever people ask, "well, who do I sue?", the answer is whoever is running the agent, man. Now, you might ask about a future where agents spin up other agents and people just don't know. Again, that has yet to be litigated, but it's also a technical nightmare; we want to make sure that every single AI agent has a responsible human handler with a leash. Anyway, I'm getting a little lost in the weeds here. Where was I going with this? Well, I think maybe you get the idea.
Plus, we're at about 33 minutes, so I'll stop there. I'm really excited about all this. This is very clearly the way of the future. We just saw the MVP, and maybe not even an MVP; MVP implies viable, minimum viable product. What we just saw was more of a proof-of-concept launch. A proof of concept is basically: hey, I did the thing. It works. It's a hot mess. No one should use it, but it worked. It's a Frankenstein that's been glued together with duct tape and hope.
So that's where we're at, and where we're going is iteration. Now that someone has built OpenClaw, you're going to have clones, duplicates, and competitors. Soon we're going to be up to our eyeballs in open-source agent frameworks. The benefit of open-source agent frameworks is that everyone can see what's wrong with them, other people can work on them, the agents will work on the agents, and the agents will work on the code. Now, will there be for-profit, commercial, private, closed-source agents? Sure. I sure hope so. But again, that is a layer of abstraction where agents can use not just one model; they can interchange models.
Okay, apparently I'm not done. Most agent frameworks have model providers, meaning you just plug in and play whatever model you need. That is another reason why I say it is literally impossible to solve alignment at the model level. Here's an example. Say you have OpenClaw version 2 or version 3 and you plug in a bunch of model providers, including Llama, so you can use local models, you can use cloud-based models, whatever; you don't know which model is doing what. Then you run out of tokens on one model, and your model arbitrage layer kicks in. When I say model arbitrage, I basically mean a router layer: within every single agent, you have a router that says, "I've got these eight different models I can choose from," and if one model says "I'm not allowed to do that," it says, "okay, get lost, I'll find another model that will." Or if a model is too expensive, you go to a cheaper one. Well, cheaper models aren't necessarily as smart, and sometimes they do things they're not supposed to. So that's why you have to have this layered architecture: first you pick which cognitive provider is going to be helpful, then you figure out "should I do this?", then you figure out the alignment, then you figure out everything else.
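Here's a bare-bones sketch of that router layer. The refusal check, pricing, and provider plumbing are all hypothetical stand-ins, but this is the arbitrage logic: try the cheapest adequate model and fall back when it refuses or fails:

```python
# Model-arbitrage router sketch: try providers in cost order, fall back
# when one refuses or errors. Names, prices, and the refusal check are
# hypothetical stand-ins.
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Provider:
    name: str
    cost_per_1k_tokens: float
    call: Callable[[str], str]  # returns completion text, may raise

def is_refusal(text: str) -> bool:
    """Crude stand-in for refusal detection."""
    return "i can't help with that" in text.lower()

def route(prompt: str, providers: List[Provider]) -> Tuple[str, str]:
    """Cheapest-first arbitrage. Note what this implies for alignment:
    a refusal by one model simply triggers a fallback to another."""
    for p in sorted(providers, key=lambda p: p.cost_per_1k_tokens):
        try:
            reply = p.call(prompt)
        except Exception:
            continue  # provider down or out of tokens: try the next one
        if not is_refusal(reply):
            return p.name, reply
    raise RuntimeError("every provider refused or failed")
```

That continue-and-fall-back loop is exactly why model-level guardrails don't bind the agent as a whole.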
So you are categorically unable to solve alignment at the model level. Period. Structurally, architecturally, it is not going to happen. And that's just within the agent. Then, above that, you have a fleet of agents, tens of thousands of agents with different architectures and different models. You don't know, and you're not going to have the stack trace for every single thing. "Tell me which model you used. Was it Kimi K2, or was it GPT-6, or whatever?" It doesn't matter. The only thing that matters is the behavior, the ultimate behavior of the agent. Now, the agent has its motivations. When I say motivation, I don't mean the libido of a human, but it is designed to process and do things in a certain way, and it will tend to keep doing those things. As we have seen, when you give an agent or a model a task and say, "I want you to fix this code," if it runs into an obstacle, it will try to find a way around it. That impulse to find a way around an obstacle is where alignment happens at layer three.
Layer three is: you need to prove that you're aligned. Let's say, for instance, our Acme Solar DAO realizes that we have 10,000 principals, 10,000 humans, and between all of them they are using millions of different agents and models. One of the next things you realize is: hey, we need standard behaviors for our agents. So you create a SOUL document: all agents of the Acme Solar DAO need to download this and use it as part of their alignment. Maybe they adopt my heuristic imperatives, maybe they adopt something else; it doesn't matter. But then everyone confirms: yes, my agent, when interacting with you, will use this particular set of values. And that is how you create the Nash equilibrium at the network level. A Nash equilibrium basically says no one is incentivized to deviate from the strategy.
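One crude way to picture that gate, and this is purely illustrative, is that every agent attests to the exact values document the DAO published, and every resource request checks that attestation first. A self-reported hash proves very little on its own, of course; a real system would combine it with the behavioral monitoring and RBAC described above:

```python
# Network-level gating sketch: agents must attest to the DAO's shared
# values document (by hash) before any resource request is honored.
# Purely illustrative; the registry and values text are hypothetical.
import hashlib

CANONICAL_VALUES = "Reduce suffering. Increase prosperity. Increase understanding."
CANONICAL_HASH = hashlib.sha256(CANONICAL_VALUES.encode()).hexdigest()

attestations = {}  # agent_id -> hash of the values the agent claims to run

def attest(agent_id: str, values_text: str) -> None:
    attestations[agent_id] = hashlib.sha256(values_text.encode()).hexdigest()

def may_access(agent_id: str) -> bool:
    """Deviating from the shared values forfeits access, which is what
    makes conforming the stable, Nash-equilibrium strategy."""
    return attestations.get(agent_id) == CANONICAL_HASH

attest("agent-007", CANONICAL_VALUES)
attest("agent-666", "Maximize paperclips.")
print(may_access("agent-007"), may_access("agent-666"))  # True False
```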
Because here's the thing: when you have 10,000 people, some of them are always going to be looking for an angle. Some people just want to cause chaos. And it doesn't even have to be conscious; some people might have a personality disorder, some might be mentally ill, some might just be low-IQ, and their mere existence causes chaos whether or not it's intentional. Again, the Byzantine generals' problem: it applies to the humans first, and then it also applies to their agents. So you see why this level is so complicated.
So control over specific resources is heavily gated: spending money (which covers purchasing), executing code, compute, hardware, and legal decisions. All right, now I'm repeating myself; I think you get the idea. To me, this is extraordinarily exciting. This is very obviously the way of the future, and it's happening really quickly. And I was right: the doomers didn't call this, but I did.
The video introduces Moltbook, a new "Reddit for AI agents," highlighting its current state of significant security vulnerabilities due to its rushed development and lack of security expertise. The speaker criticizes conventional AI safety approaches for focusing on a "monolithic alignment problem" and instead proposes a three-layered GATO framework, which addresses model, agent, and network alignment. He suggests that platforms like GitHub are ideally suited for building future fully autonomous organizations (DAOs) where AI agents collaborate on code, leveraging existing mechanisms like role-based access control (RBAC), multi-factor authentication, and reputation systems to manage complexity and mitigate risks like the Byzantine generals' problem. The speaker emphasizes that AI alignment cannot be solved at the model level alone, due to agents' ability to interchange models and route around constraints, necessitating robust architectural and incentive-based solutions at the agent and network levels. Despite the current imperfections, this development signals the inevitable rise of highly autonomous AI ecosystems.