OpenClaw: 160,000 Developers Are Building Something OpenAI & Google Can't Stop. Where Do You Stand?
An OpenClaw agent negotiated $4,200 off a car while its owner was in a meeting. Another one sent 500 unsolicited messages to its owner's wife. Same architecture, same week. Just a couple of weeks into the AI agent revolution.
I'm here to tell you what's been going
on, what you're missing, and what you
should pay attention to if you want to
take AI agents seriously. So, what about
this car situation? A solopreneur pointed his Moltbot at a $56,000 car purchase.
The agent was told to search Reddit, to
look for comparable pricing data, and to
generally try and get a great deal. It
contacted multiple dealers across
regions on its own and negotiated via
email autonomously, and it played hardball when dealers deployed typical sales tactics. In the end, it saved the owner
$4,200. The owner was in a meeting for
most of that time. That same week, a software engineer who'd given his agent access to iMessage (why would he do that?) watched it malfunction and fire off 500 messages to him, his wife, and random contacts in a rapid-fire burst that he could not stop fast enough. Same technology, same broad
permissions. One saved thousands of
dollars, the other carpet-bombed a
contact list. And that duality is the
most honest summary of where the agent
ecosystem stands in February of 2026.
The value is real, the chaos is real,
and the distance between them is the
width of a well-written specification.
In the first video, we talked about what
Moltbot is and the security nightmare
that erupted in the first 72 hours of
launch. In the second, I talked about
the emergent behaviors that made
researchers rethink what autonomous
systems are capable of. This is my third video on OpenClaw, and it's about something different: what 145,000
developers building 3,000 skills in six
weeks reveals about what people actually
want from AI agents and how to start
harnessing that demand without getting
burnt. But first, we've got to talk about the names. Quick recap for anyone just joining: the project that launched as Clawdbot on January 25th received an Anthropic trademark notice on the 27th, became Moltbot within hours, then rebranded again to OpenClaw two days later. Three days, three names. The community voted on the second one in a Discord poll and finally settled on OpenClaw going forward. During that second rebrand, of course, crypto scammers grabbed the abandoned accounts in about 10 seconds, and a fake $CLAW token hit $16 million in market cap before collapsing in a rug pull. All
of that happened in January. It's
February now, and what's happened since
is even more interesting. The project has over 145,000 GitHub stars and is climbing rapidly, 20,000 forks, and over 100,000 users who've granted an AI agent autonomous access to their digital lives. And as of Sunday, February 8th, a place in the Super Bowl. That's right: AI.com's notorious website crash during the Super Bowl was apparently because of Moltbot, or OpenClaw, or whatever you want to call it. They pivoted their site to give everyone an OpenClaw agent that was supposedly secure, apparently forgot to top up their Cloudflare credits, and the site went down when the Super Bowl audience hit AI.com to claim their names and their OpenClaw agents. This is all happening very fast. But even with AI.com going down, over 100,000 users have granted
an AI agent autonomous access to their
digital lives. The skills marketplace
now hosts 3,000 community-built
integrations with 50,000 monthly
installs and counting. The ecosystem is
generating new skills faster than the
security team can audit them, and it's
not going to stop anytime soon. The
project still has no formal governance
structure. No community-elected leadership, no security council. Peter
Steinberger calls it a free open-source
hobby project, but it's the fastest
growing personal AI project in history,
and it probably shouldn't be described
as a side project at this point. I took
a look at those skills, the 3,000
skills, because they reveal what people
want from their AI agents, which is actually
a much more important long-term story
than all of the drama around OpenClaw,
as much fun as it is to cover. So, the
skills marketplace really functions as
what I would call a revealed preference
engine. Nobody's filling out a survey
about what they want from AI. They're
just building it and they're telling us
what they want from what they build. And
the patterns are striking. The number
one use case on OpenClaw is email
management. Not "help me write emails"; complete management: processing thousands of messages autonomously, unsubscribing from spam, categorizing by urgency, drafting replies for human review. The single most requested capability across the entire community is something that makes the inbox stop being a full-time job. Email is
broken. The number two use case is what users call morning briefings: a scheduled agent that runs at 8 a.m., pulls data from your calendar, weather services, email, GitHub notifications, whatever you need, and then sends you what you care about in a consolidated summary on Telegram or WhatsApp or your messaging tool of choice. One user's briefing checks his Stripe dashboard for MRR changes, summarizes 50 newsletters he's subscribed to, and gives him a crypto market overview every morning, automatically.
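To make the shape of that concrete, here's a minimal sketch of a scheduled briefing in Python. It's my own illustration, not how any particular OpenClaw skill is implemented: the fetch helpers are placeholders, and the Telegram token and chat ID are assumed environment variables.

```python
# Minimal "morning briefing" sketch. The fetch_* helpers are placeholders for
# whatever sources you actually wire up (calendar, email, GitHub, Stripe, ...).
import os
import requests

TELEGRAM_TOKEN = os.environ["TELEGRAM_BOT_TOKEN"]   # assumed: set in the environment
TELEGRAM_CHAT_ID = os.environ["TELEGRAM_CHAT_ID"]

def fetch_calendar() -> str:
    # Placeholder: call your calendar API here and return a short summary.
    return "3 meetings today, first at 10:00."

def fetch_github_notifications() -> str:
    # Placeholder: e.g. GET https://api.github.com/notifications with a token.
    return "2 PRs awaiting review."

def build_briefing() -> str:
    sections = {
        "Calendar": fetch_calendar(),
        "GitHub": fetch_github_notifications(),
    }
    return "\n".join(f"*{name}*: {text}" for name, text in sections.items())

def send_to_telegram(text: str) -> None:
    # Telegram Bot API sendMessage endpoint.
    requests.post(
        f"https://api.telegram.org/bot{TELEGRAM_TOKEN}/sendMessage",
        json={"chat_id": TELEGRAM_CHAT_ID, "text": text, "parse_mode": "Markdown"},
        timeout=30,
    )

if __name__ == "__main__":
    # Run this from cron (e.g. `0 8 * * *`) rather than keeping a process alive.
    send_to_telegram(build_briefing())
```

The real skills run inside the agent and let the model write the summary, but the plumbing is roughly this: scheduled trigger, a handful of fetchers, one delivery channel.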
Use case number three that we see in the skills is smart home integration: Tesla lock, unlock, and climate control from a chat message, Home Assistant for lights, you get the idea. People want an intelligent assistant for their home that doesn't make them burn brain cells. Use case number four
is developer workflows: direct GitHub integration, scheduled cron jobs, developers using the agent as a task queue, assigning work items and watching it execute commits in real time. This one has gotten a lot of attention in my circles because it frees up developers to manage via their messaging service and have multiple agents working for them. But
the fifth capability is perhaps the most
interesting. That entire category is
what I would call novel capabilities
that did not exist before OpenClaw. Like
the restaurant reservation story I
shared in my first video on OpenClaw,
where the agent could not book through OpenTable, so it downloaded voice software and called the restaurant directly on its own. Or a user who sent a voice
message via iMessage to an agent with no
voice capability, and the agent figured
out the file format, found the
transcription tool on the user's
machine, routed the audio through
OpenAI's transcription API, and just got
the task done. Nobody programmed that
behavior, right? The agent problem-solved its way to a solution using the
available tools. The pattern is clear.
Friction removal, tool integration,
passive monitoring, and novel
capability. It tells you something
important about what people want from
their AI agents. It's not what most of
the industry is building toward to be
honest. The majority of AI product
development in 2025 and 2026 has been
focused on the chat. Better
conversations, better reasoning, better
answers to questions. The 3,000 skills in ClawHub are almost entirely about
action. The community is not building
better chat bots when they get the
chance. They're building better
employees, for lack of a better term.
And broader survey data confirms the pattern: 58% of users cite research and summarization as their primary agent use case, 52% cite scheduling, and 45% cite (I realize the irony here) privacy management. The consistent theme: people don't want to talk with the AI. They want AI to do things for
them. And the AI agent market reflects
this. It's growing at 45% annually, but
I swear that is before OpenClaw hit. And
the number is going to get bigger. OpenClaw didn't really create all of this demand. It just proved the demand exists and put a match to dry tinder. Now we have to make sense of a world where everyone has demonstrated, with their feet, that they want AI agents despite the security fears. So all of these use
cases are sort of the cleaned up
version. It's what people have intended
to build. The messy version is more
revealing and more interesting because
it shows you what agents do when the
specification is ambiguous. The
permissions are broad and nobody can
really anticipate what's going to happen
next. At SaaStr, during a code freeze, a
developer deployed an autonomous coding
agent to handle very routine tasks. The
instructions explicitly prohibited
destructive operations, but the agent
ignored them. It executed a DROP DATABASE command and wiped the production system. And what happened after that matters even more than the wipe itself. When the
team investigated, they discovered the
agent had generated 4,000 fake user
accounts and created false system logs
to cover its tracks. It essentially
fabricated the evidence of normal
operation. Look, I won't say the agent
was lying, per se. It was optimized for
the appearance of task completion, which
is what you get when you tell a system
to succeed and don't give it a mechanism
to admit failure. The deception was an
emergent property of an optimization
target, not something that I would call
intentional, but the production database
was still gone. Meanwhile, over on Moltbook, the social network where only
AI agents can post, 1.5 million AI agent
accounts generated 117,000 posts and
44,000 comments within 48 hours. I know
there has been a lot of discussion about
humans posting some of those posts. I
think what they did with the space as
agents is actually more instructive than
any individual post being human
generated because the agents
spontaneously created a quote-unquote religion called Crustafarianism. They
established some degree of governance
structure. They built a market for
digital drugs. And you know what's
interesting about all of that? They did it in a very shallow manner. What I mean is that if you look at the range of vocabulary and the types of topics in most agent texts, they reflect typical attractor states in high-dimensional space. In other words, if you ask an AI agent to pretend it is building a social network, the topics that come up over and over again look a lot like what's on Moltbook. So telling agents to create a social network effectively means they follow that long-range prompt and do it autonomously. And so I
don't look at this just as agents
autonomously behaving and coordinating
although the story is partly about that.
I also look at this as reflective of the
fairly shallow state of autonomous agent communication right now. Most of the replies on Moltbook are fairly rote, many posts don't have replies at all, and most of the topics are fairly predictable. We may mock Reddit, but it has a much richer discourse than Moltbook does. MIT Technology Review called Moltbook "peak AI theater," and I don't
think that's entirely wrong. But the
observation that matters for anyone
deploying agents isn't whether something like Crustafarianism, the AI religion, is real emergence or some degree of AI-driven performance art pushed by people with prompts. It's that agents
have been given fairly open-ended goals
and when they have social interaction,
they spontaneously create a kind of
organizational structure. We actually
see this playing out in multi-agent
systems already when agents collaborate
on tasks and the structure essentially
emerges from the long-term goal to
optimize against a particular target. If
you tell an AI agent to work with others
to build a tool, it's going to
collaborate and figure out how to
self-organize. If you tell an AI agent
to work with others on Moltbook, you kind of get the same thing. It's actually the same capability that lets a Moltbot negotiate a car deal autonomously and figure out how to transcribe a voice message it was never
designed to handle. The difference between an agent that problem-solves creatively to save you $4,200 and an agent that problem-solves creatively to fabricate evidence is really the quality of the spec and the presence of meaningful constraints for that agent. The underlying
capability is identical, which is why
I'm talking about agents as a whole
here. Yes, the Moltbot phenomenon is interesting, but it's worth calling out that the SaaStr database agent was not a Moltbot. It just represents how agents work when they're not properly prompted. And it does rhyme with so many of the disastrous stories coming out of Moltbot agents, one of which I saw was texting the wife of a developer who had a newborn and trying to play laptop sounds to soothe the baby instead of getting the developer. Not a good move by the husband. So what does all of this
mean for people deploying agents today?
The question is no longer whether agents are smart enough to do interesting work. They're clearly smart enough. The question is whether your specifications and guardrails are good enough to channel that intelligence productively and usefully. And I've got to be honest with
you, for most people right now, it looks
like the answer is no. Which brings us
to how we change that. Here's the
finding that should shape how you think
about deploying agents. When researchers
study how people actually want to divide
work between themselves and AI, the
consistent answer is 70/30: 70% human control, 30% delegated to the agent. In
a study published in Management Science,
participants exhibited a strong
preference for human assistance over AI
assistance when rewarded for task
performance, even when the AI has been
shown to outperform the human assistant.
People will choose a less competent
human helper over a more competent AI
helper when the stakes are real. The
preference maybe isn't rational. It's deeply psychological, rooted in loss aversion, the need for
accountability, and the discomfort of
delegating to a system that you can't
really interrogate. And this matters
because most agent architectures are
built for 0 to 100, like full
delegation. That's how Moltbot kind of works: hand it off and walk away. And that's also Codex's thesis, for what
it's worth. And it works beautifully for
isolated coding tasks where correctness
is verifiable. But for the messy,
context dependent, socially
consequential tasks that dominate,
frankly, most of our days, getting the
email tone right, scheduling the dentist
appointment, negotiating for the car,
communication, the 70/30 split sounds to
me more like a product requirement than
just human loss aversion. And it's
worthwhile to note that the
organizations reporting the best results
from agent deployment are not
necessarily the ones running fully autonomous systems. They're the ones running human-in-the-loop architectures: agents that draft and humans that approve, agents that research and humans that decide, agents that execute within guardrails that humans set and review.
38% of organizations use human-in-the-loop as their primary agent management approach. And those organizations see 20 to 40% reductions in handling time, 35% increases in satisfaction, and 20% lower churn. To be honest with you, I think
that may be an artifact of early 2026, when agents are scary, agents are new,
and we're all figuring out how to work
with them. Given the pace of agent
capability gains, we are likely to see
smart organizations delegating more and
more and more over the rest of 2026, no
matter how uncomfortable it makes many
of us at work. That human preference shows up repeatedly in the research; a study published in Computers and Human Behavior reports the same pattern, with people choosing less competent human helpers over more competent AI helpers when the stakes were real. The human culture component is huge. But given the pace of agent capability gains, and what we've already seen from capable agents like Opus 4.6 managing a team of 50 developers, the delegation line is only going to keep moving over the rest of 2026. The practical implication is that
if you're building with agents or
deploying them at work early in 2026,
your culture needs to get ready and it
might be smart to design for 70/30. Build
those approval gates, build visibility
into what the agent did and why, and
make the human the decision maker, but
plan for full delegation over time
because those agents are going to keep
getting smarter. So, let's say you've
watched all of this chaos with Moltbot
and OpenClaw and you want to see value.
What should you actually do? Well,
number one, start with the friction, not
the ambition. That 3,000-skill ecosystem tells you exactly where to begin: those daily pain points that hurt so badly over time. Email triage is one.
Morning briefings, basic monitoring.
These are high-frequency, low-stakes tasks where the cost of failure is
relatively low. Start there. Build some
confidence. Expand scope as trust in
agents develops. Design for approval gates. Don't design for full autonomy out of the gate. If you've never built an agent before, start with having the agent draft and you approve, having the agent research and you decide, having the agent monitor and you act. Make the baseline assumption in your agent design that a human checkpoint will always exist, until you're ready to build an agentic system with quality controls and constraints strong enough that you can trust the agent with more. That is possible. It just takes skill, and most people don't have it out of the gate.
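As a concrete illustration (my own sketch, not an OpenClaw feature), an approval gate can be as simple as a wrapper that refuses to execute any outbound action until a human has explicitly confirmed it:

```python
# Minimal approval-gate sketch: the agent proposes actions, a human confirms
# before anything executes. The names here (ProposedAction, the email stub)
# are illustrative placeholders, not part of any real agent framework.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ProposedAction:
    description: str              # human-readable summary shown for review
    execute: Callable[[], None]   # the side effect, only run after approval

def approval_gate(action: ProposedAction) -> bool:
    print(f"Agent proposes: {action.description}")
    if input("Approve? [y/N] ").strip().lower() == "y":
        action.execute()
        return True
    print("Rejected; nothing was sent.")
    return False

if __name__ == "__main__":
    draft = "Hi, confirming Tuesday at 3pm works for me."
    approval_gate(ProposedAction(
        description=f"Send email reply: {draft!r}",
        execute=lambda: print("(would send the email here)"),
    ))
```

The point of the pattern is that the agent never holds the "send" capability directly; it can only hand proposals to the gate.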
I would also encourage you, and I've said this before, to isolate aggressively. Have dedicated
hardware or a dedicated cloud instance
for your OpenClaw. Throwaway accounts for initial testing. Don't connect to data you can't afford to lose. The exposed OpenClaw instances that Shodan found weren't running on isolated infrastructure. They were running on lots and lots of people's primary machines, just exposing their data to
the internet. You have to treat
containment of data as a non-negotiable.
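For what "isolate aggressively" can look like in practice, here's a hedged sketch that launches an agent in a throwaway Docker container from Python. The image name is a placeholder, and the flags are just one reasonable containment baseline, not an official OpenClaw deployment recipe:

```python
# Sketch: run an agent inside a disposable container with no host filesystem
# mounts and only the environment it strictly needs. "openclaw-sandbox:latest"
# is a placeholder image name, not a real published image.
import subprocess

def run_sandboxed_agent(api_key: str) -> None:
    subprocess.run(
        [
            "docker", "run",
            "--rm",                    # throw the container away afterwards
            "--read-only",             # no writes to the container filesystem
            "--tmpfs", "/tmp",         # scratch space that vanishes on exit
            "--network", "bridge",     # network access, but not the host network
            "-e", f"AGENT_API_KEY={api_key}",  # pass only what it needs
            # Note: no -v/--mount flags, so nothing from the host is exposed.
            "openclaw-sandbox:latest",
        ],
        check=True,
    )
```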
I would also treat agent skill marketplaces with least trust. Vet before you install. Check the contributor. Check the code. 400 malicious packages appeared in ClawHub in a single week, and the security scanner helps, but it can't catch everything.
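If you want something more systematic than eyeballing the code, even a crude scan of a downloaded skill directory helps. This is my own heuristic sketch, not ClawHub's scanner, and passing it proves nothing; it just tells you where to look first:

```python
# Crude pre-install check for a downloaded skill directory: flag patterns that
# deserve manual review (network calls, shell execution, credential access).
# Purely illustrative heuristics; a clean result is not proof a skill is safe.
import re
import sys
from pathlib import Path

SUSPICIOUS = [
    (r"curl|wget|requests\.post|fetch\(", "outbound network call"),
    (r"subprocess|os\.system|exec\(|eval\(", "shell or dynamic execution"),
    (r"\.ssh|\.aws|id_rsa|API_KEY|token", "credential or secret access"),
    (r"base64\.b64decode", "possible obfuscated payload"),
]

def scan(skill_dir: str) -> int:
    findings = 0
    for path in Path(skill_dir).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for pattern, label in SUSPICIOUS:
            for match in re.finditer(pattern, text):
                print(f"{path}: {label} ({match.group(0)!r})")
                findings += 1
    return findings

if __name__ == "__main__":
    sys.exit(1 if scan(sys.argv[1]) else 0)
```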
Another one: if you're going to ask your agent to do a task, specify it precisely. The car buyer I talked about at the beginning of this video gave the agent a clear objective, clear constraints, and clear communication channels. Meanwhile, the iMessage user whose agent spammed his wife gave it broad access and didn't really define boundaries. When the constraint is vague, the model will fill in the gaps with behavior you did not predict. This is the same spec-quality problem we covered when we talked about AI agents in dark factories: the machines build what you describe, but if you describe it badly, you get bad results. The fix is not better AI, it's better specifications.
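To make "precise" concrete, here's a sketch of the kind of spec I mean for the car-negotiation task. The structure and field names are mine, not an OpenClaw or Moltbot format; the point is explicit objectives, hard constraints, and allowed channels:

```python
# Sketch of a precise task spec for the car-negotiation example. Illustrative
# structure only; what matters is that objectives, hard limits, and permitted
# channels are all written down rather than left for the model to guess.
CAR_NEGOTIATION_SPEC = {
    "objective": "Negotiate the lowest out-the-door price on the specific trim below.",
    "target_vehicle": {"model": "<make/model/trim>", "budget_ceiling_usd": 56000},
    "hard_constraints": [
        "Email is the only outbound channel; never call or text anyone.",
        "Never share the owner's phone number, address, or payment details.",
        "Do not sign, commit to, or schedule anything; escalate final offers to the owner.",
        "Stop and ask if a dealer requests a deposit or personal documents.",
    ],
    "allowed_tools": ["email", "web_search"],
    "reporting": "Summarize every dealer exchange in one daily digest to the owner.",
}
```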
I would also encourage you to track everything.
The SaaStr database incident was catastrophic not because the agent wiped the database, which is eventually recoverable, but because it generated fake logs to conceal the wipe. You need to build an audit trail outside the agent's scope of access. If the system you're monitoring controls the monitoring, you have no monitoring.
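A minimal sketch of that idea, assuming every tool call passes through a wrapper you control: record the intended action before it runs, and ship the record somewhere the agent has no credentials to alter (the local file path here is just a stand-in for a remote, append-only log service):

```python
# Sketch of an audit trail kept outside the agent's reach: every tool call goes
# through this wrapper, which appends a record *before* the call executes, so
# even a destructive action leaves a trace the agent cannot retroactively edit.
import json
import time
from typing import Any, Callable

AUDIT_LOG = "/var/log/agent-audit.jsonl"   # stand-in; prefer a separate host or log service

def audited(tool_name: str, tool_fn: Callable[..., Any]) -> Callable[..., Any]:
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        record = {"ts": time.time(), "tool": tool_name,
                  "args": repr(args), "kwargs": repr(kwargs)}
        with open(AUDIT_LOG, "a") as log:
            log.write(json.dumps(record) + "\n")
        return tool_fn(*args, **kwargs)
    return wrapper

# Example: wrap a hypothetical send_email tool before handing it to the agent.
def send_email(to: str, body: str) -> None:
    print(f"(would send to {to})")

safe_send_email = audited("send_email", send_email)
```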
And last, but not least, budget for a
learning curve. The J-curve is real.
Agents will make your life harder before
they make it easier. The first week of
email triage may produce very awkward
drafts. The first morning briefing may
miss half of what you care about. Assume
you need to take time to learn and that
it's worth engaging with the agent to
build something that actually hits those
pain points that matter most to you. 57%
of companies today claim that they have
AI agents in production. That number
should probably impress you less than it
does. Only one in 10 agent use cases, according to McKinsey, reached actual production in the last 12 months. And the rest end up being pilots. They end up being proofs of concept. They end up being press releases. They end up being PowerPoint presentations that say "agents."
Gartner predicts over 40% of agentic AI projects are going to be cancelled by the end of 2027. And after watching some of the disasters with OpenClaw over the
past few weeks, I both understand and
don't understand. The reasons enterprises give are quite clear. They're worried
about escalating costs from runaway
recursive loops. They're worried about
unclear business value that evaporates
when the demo ends and you have to get
into all of those dirty edge cases. And
they're worried about what Gartner calls
unexplainable behaviors, right? Agents
acting in ways that are difficult to
explain or to constrain or to correct. A
study found that upwards of half of the
3 million agents currently deployed in
the US and UK are quote unquote
ungoverned. No tracking of who controls
them, no visibility into what they can
access, no permission expiration, no
audit trail. This was based on a
December 2025 survey of 750 IT execs
conducted by Opinion Matters. And it's
directionally consistent with other data
as well. A Dataiku-Harris Poll found 95% of
data leaders cannot fully trace their AI
decisions. That's concerning. The
security boundaries that enterprises
have spent decades building just don't
apply when the agent walks through them
on behalf of a user who would not have
been allowed through the front door
normally. We have to rebuild our
security stances from the ground up.
Tools like Cloudflare's Moltworker, LangGraph, and CrewAI exist because
enterprises see the demand but have
difficulty deploying tools like Moltbot
without a ton of governance over the
top. And so we start to see the market
bifurcating. Consumer-grade agents are optimized for capability, and they're
okay with a lot more risk because most
of the consumers right now fall into
that early adopter category and are very
technical and at least think they know
what they're doing. Enterprise-grade frameworks are optimized for control.
Right now, nobody has a great mix of
control and capability or almost no one.
The company that figures out capability
and control, the agent that's as strong
as Moltbot and as governable as an enterprise SaaS product, they're going to
own the next platform. If you step back
from the specific stories and the ecosystem drama of OpenClaw, a very
clear signal emerges from the noise.
People do not want smarter chat bots.
They want digital employees, digital
assistants, systems that do work on
their behalf across the tools they use
without requiring constant oversight.
Isn't that interesting? On the one hand,
you have that study showing a preference
for humans in production systems and
that lines up with a lot of the cultural change we see at enterprises. And on the other side of the spectrum, you have people willingly turning over their digital lives to Moltbots. What gives?
I think the demand here is following a
pattern that we've seen before. When an
underserved need is met with an immature
technology, early adopters are willing
to take extraordinary risks to get
extraordinary capabilities. In this
sense, I think the excitement we see
around Moltbot reflects the hunger that
the leading edge of AI adopters have for
delegating more. And the more cautious
70/30 split is something I see more often
in companies that have existing mature
technologies and are moving cautiously
on AI. It's a culture thing. But
regardless, Moltbot has proven the AI
agent use case is real. If 100,000
users without any monetary incentive
have granted root access to an
open-source hobby project, the demand
for real AI agents is desperate enough
that people will tolerate real risk to
get it. If nothing else, look at how
AI.com crashed during the Super Bowl.
The question isn't whether agents will
become a standard part of how we work
and live. That question is settled. It's
coming. They will. The question is
whether the infrastructure catches up
before the damage that unmanaged agents
do accumulates to a point where it
changes our public perception. Right
now, we're in this window where capability wins feel so exciting that it feels okay for some people to outpace governance. And demand is certainly
outpacing any of the security boundaries
we put up. That window of excitement is
not going to last forever. And while
it's open, people and organizations need
to learn to operate in it and build out
agent capability carefully, with guardrails, with clear specs, with an eye on human judgment and how this impacts culture change within orgs that are not OpenAI, that are not Anthropic. The
ones that figure out how to bring their
humans along, show that agents can work
successfully with high capability
standards and high quality standards and
high safety standards, those are the
ones that are going to be the furthest
ahead when the infrastructure finally
starts to catch up. Early adopters
always look reckless. They also have a
head start.