AGI Achieved?! | TheStandup
All right, so Casey will not be joining
us today for all those that are
wondering.
>> They replaced him with me.
>> Yep. This is Low Level Learning. Are you
guys ready to do this? You want to talk
about this?
>> Let's rip.
>> I thought we were going to talk about
pancakes for a while, but I'm happy.
>> Do a whole hour-long session on pancakes.
I'm in.
>> We're not talking about pancakes. By the
way, waffles are in fact better. But
>> you ready? No, dude. You don't believe
so? You don't think waffles are better?
>> You can't even like say that and then
like transition to a different topic.
>> Great point, Trash. We need We do need
to break this down.
Yeah.
Uh, anyways, sorry. Welcome to the
standup where we talk about all of the
greatest issues facing devs and software
connoisseurs alike. Uh, on this week's
episode, we're going to be talking about
the very obvious molt in the room, which
is just this entire frenzy of Agentic
um, coding, hooking things up, and
seeing all the disasters that have been
unfolding for the last couple weeks. Uh,
with us, we have a special guest today.
In the Windows background, we got
Low Level Learning. I dropped the
learning and now he's just Low Level.
>> I've learned. I've learned it all. I've
done too much of the learning and now
I'm just low.
>> Low level.
>> Low level.
>> Learn.
>> About to find out.
>> Uh we also have with us TJ.
I don't have anything good for you.
>> I'm here.
>> I'm wearing my shirt. the Pokemon
enthusiast himself, Trash Dev,
>> who I believe, if I am not mistaken, has
the highest male to female ratio out of
all of us on Twitter.
>> I was like, "Oh, yeah, what are we
talking?"
>> We were looking at our
demographics.
I'm just going to say
>> I thought you were going to say he has
the highest net worth as displayed by
his background. I mean, this guy is
loaded.
>> It's true. I don't know.
But that has to be a risk. People do not
need to know where you live. Trash.
>> That's generational wealth just sitting
there. That's more than gold.
>> That's tens of dollars. That's I know,
dude.
>> That's pretty good.
>> We're almost in six figure or six
figures. Four figures.
>> No, not even two, three figures.
>> That's a lot of figs. My kids are a
little I'm derailing. Uh me and my kids
went to a card shop and uh they bought
uh they one of them bought Pikachu.
Little Pikachu card. I want
>> and then we bought a pack little couple
packs of Pokemon cards and they went and
opened them at home.
>> You got to show me the photos of what
you got.
>> He's addicted, bro. You cannot bring up
five.
>> It's literally in the title.
>> Show it.
>> I'm living vicariously through you. When
you open the pack, I'm open.
>> He's even doing that.
>> He's even doing the scratch.
>> SEND ME THE picture of Pikachu. I just
want to know about it. I just I'm just
I'M JUST CURIOUS.
>> Y'ALL GOT ANY MORE NEW POKEMON CARDS?
DUDE, I give my kids packs and I don't
open any cuz I want my kids to open them
and I'm just sitting there watching
them. It's like, "Oh, what'd you get?
What'd you get? Something good? Get
something good."
Terrible. Terrible.
>> Anyways, well, we should we might as
well get started here. Uh, so, uh,
Low Level Learning. I always, I still
call you Low Level Learning. I can't even
help it. The triple, it's just it's just
a part of it. Um, low level. How much do
you know about this? you being the
security expert, how much do you know
about some of the things that have
happened over the last couple weeks?
>> Yeah, so I'm going to be real with you,
right? Um, my day job is I audit real
software. Uh, so as a result, I have no
idea what an agent skill even is. And
I'm here to learn with the group and
then discuss the threat model.
>> Oh my gosh, it's so good. Oh my
goodness,
>> Good. Okay. Sorry, for full
transparency, I want to talk to you about this.
We did a video on the whole Moltbot, Open
Molt, Open Feet situation, right? Um
silly silly thing they're doing more
from like the the prompt injection
standpoint, but from I I don't know
anything about the skill marketplace.
I'm very happy to kind of get the the
lowdown, if you will, um on what's going
on there.
>> Hey, is that HTTP? Get that out of here.
That's not how we order coffee. We order
coffee via SSH terminal.shop. Yeah. You
want a real experience? You want real
coffee. You want awesome subscriptions
so you never have to remember again. Oh,
you want exclusive blends with exclusive
coffee and exclusive content? Then check
out Cron. You don't know what SSH is?
>> Well, maybe the coffee is not for you.
>> Okay. Can we start with with with my
personal favorite one of them all?
>> Yes. Yes.
>> Okay. Oh, thank you. Thank you. This one
right here.
>> Trash. Do you agree too? That's the only
person we can hear from.
>> Proceed.
>> Proceed. Thanks. Thanks everybody. Uh,
this is my current favorite one right
here, which is agent skills are
spreading hallucinated npx commands.
>> And so, at one point, somehow one skill
got uploaded onto GitHub that had a fake
package called react-codeshift.
>> Sick. Very good. Love that. Yes. And
since everybody instead of
>> hand shift that's like left pad.
>> No, apparently it's supposed to like
take it like the idea. I think it's
called like jscodeshift or something
like that where it's supposed to take it
from one version to another in some
automated way. So if like you can just
upgrade your code program from you know
mod
>> a code mod as they say as perpetual like
React hell is where every single time
they release something you got to do
like some upgrades. This is what's going
on right here is it's supposed to be
like some automated way. At least that's
what the that's what the LLM thought.
Now, the here's the best part about this
whole thing is this. It started off as a
singular skill had this. It hallucinated
it. Well, it turns out everybody
creating skills are just like, "Yo, LLM,
go make me a Cloudflare skill right
now." And it just like goes and makes a
Cloudflare skill. Well, unfortunately,
there's two, at least at the time of
writing this, which by the way was 10
days ago, it went from one to 237 repos
have this madeup npx command because
people just keep telling LLMs to go and
make skills for them. So, if you're not
familiar with what the skill is, the
easiest and most simple way to kind of
tell
>> most of chat does not know, by the way,
>> I should probably I should probably
start when Adam met Eve here because I
realize that it is a little bit
confusing.
>> They do not know anything about skills.
>> Good starting point. The easiest way to
think of it is that when you are by the
way, did you see do you see that line?
That is
>> that right angle.
>> That's a vertical straight line. Those
are
>> Oh, was that by hand? Was that by hand?
>> That was by hand. Yeah.
>> Wow.
>> Josh, zoom in slow motion. I want to see
that in slow motion. Zoomed, please.
>> Easiest way to think of it is that. So,
anyways, when you make when you type
into an LLM, you send something that's
like the prompt, right? And then there's
probably some sort of system prompt
inside of like Claude Code, OpenCode, or
whatever that gives it a bunch of
instructions on like, hey, you can use
tools, you can use all this, run on
Linux, whatever, whatever it says. Well,
sometimes you want to add a little bit
more. So you want to be able to be like,
hey, add in Cloudflare, right? Like I
need I want you to add in a bunch of
Cloudflare API, right? And so it just
kind of does this automatically. It goes
and finds the skill folder which has
some sort of MD file markdown file which
then goes in here and puts it in as
part of your prompt is how you can kind
of think of it. Then this all gets
nicely packaged up and sent off to the
LLMs,
>> right? I think skills might be a little
bit better to be called behaviors,
>> but I guess you could also call them
skills, you know, context. There's just
like a cajillion different names for
these, but they're all everyone has them
a little bit different.
>> So,
we found a new word,
>> we found a new word to call prompts. We
are making prompt engineers feel even
more intellectually superior. Uh, it's
so it's just another text file, right?
Like it's not like there's no new
protocol, there's no new MCP. It's a
prompt that gets added to a prompt that
gets added to a prompt.
>> You're literally colocating your docs
>> computer. You're programmatically
creating a doc, right? Skills,
>> very good.
>> MCP, everything eventually boils down to
a string when it comes to prompting.
Like that's all it really is at the end
of the day is just string concatenation.
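
The "it all boils down to a string" point can be sketched in a few lines. This is only an illustration, assuming a harness that joins a system prompt, any loaded skill files, and the user's request; the function names and the `*/SKILL.md` folder layout are hypothetical, not any particular tool's real implementation.

```python
from pathlib import Path


def load_skills(skill_dir: Path) -> list[str]:
    """Read every SKILL.md one level under skill_dir (hypothetical layout)."""
    return [p.read_text(encoding="utf-8") for p in sorted(skill_dir.glob("*/SKILL.md"))]


def build_prompt(system_prompt: str, skill_texts: list[str], user_prompt: str) -> str:
    """Concatenate system prompt, skill markdown, and user request into one string."""
    parts = [system_prompt, *skill_texts, user_prompt]
    # Empty or whitespace-only pieces are dropped so the separators stay clean.
    return "\n\n".join(part.strip() for part in parts if part.strip())


if __name__ == "__main__":
    print(
        build_prompt(
            "You can use tools. You are running on Linux.",
            ["# Cloudflare skill\nUse the Cloudflare API for deployments."],
            "Deploy my site.",
        )
    )
```

Whatever ends up in a skill file lands in the same string as the user's own words, which is why the trust discussion later in the episode matters.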
>> Love it. But I feel like we should say
this is nicer than MCP for a lot of
stuff cuz it's like you don't have to
have a random server running on your
computer. You can just check a markdown
file in like for example Dillon Mulroy,
shout out Dillon, has a good Cloudflare
skill that actually works and it like
>> has a main skill that tells you about
the things Cloudflare has and then it
has in
>> like uh additional references for each
of the different products, right? So
then that's like pretty nice because
then you can you don't put into your
context every single time you start
every Cloudflare piece of information
that you could possibly have about
everything all for all of time which
makes the LLM get very confused and like
does random stuff. You say like oh hey I
want to do something with Cloudflare Queues,
like figure out how to do that. Then it
will look up the Queues thing inside of
your folder and then do that stuff. So
like
>> that example right here.
>> Yes. Go ahead. is the one that you kind
of gave me, TJ, this is the one for
Tree-sitter, which just puts in all the
function names inside of uh for Neovim
for me to be able to use. And so instead
of it just being 95% accurate, it can go
through this list and be significantly
more accurate because it just has it
right here
>> and you don't have to type these in
every single time.
>> Yeah. I think
>> the like the Oh, go ahead, Josh. No, I
was going to say one of the pain points
that I've seen with skills right now is
that sometimes
whatever agent or whatever harness
you're using sometimes can't like infer
that it should call this skill because
usually with skills you have to like
slash command it manually. But I think
they're trying to figure out a way to
like
>> have it implicitly call it because right
now it's kind of like missing that that
problem right now. I I will say just to
be completely honest, I think that uh
what's it, Cursor, got it right to begin
with which is that you can define when
these things should be included which is
like hey this should be included anytime
I'm in a Lua file you shouldn't apply it
all the time you should do all this kind
of stuff I really did like at least
Cursor took a good swing at this pretty
early on like a year and a half ago and
I think they did a pretty good job
generally speaking to this idea. Cursor
rules are skills, effectively.
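
The "only include this when I'm in a Lua file" idea can be sketched as glob-scoped rules. This is a minimal sketch, not Cursor's actual rule format: the `RULES` list, its keys, and the rule texts are all made up for illustration.

```python
from fnmatch import fnmatch

# Hypothetical rule table: each rule applies only to files matching its glob.
RULES = [
    {"glob": "*.lua", "text": "Prefer vim.api functions when editing Neovim Lua."},
    {"glob": "*.ts", "text": "Use strict TypeScript settings."},
]


def rules_for(filename: str, rules=RULES) -> list[str]:
    """Return the instruction texts whose glob matches the file being edited."""
    return [rule["text"] for rule in rules if fnmatch(filename, rule["glob"])]


if __name__ == "__main__":
    print(rules_for("init.lua"))
    print(rules_for("main.py"))
```

Scoping by file type keeps unrelated instructions out of the context, which is the same motivation as splitting a skill into a main file plus per-product references.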
>> yeah yeah right so a lot of the they're
you know they're generating a lot of new
names for stuff as they're generating
new code which I think is making it a
little bit complicated but in principle
it's it's just like a way to I mean
they're called skills because you're
teaching the LLM about something right
that's, in my mind, that's how I think
about them but you can instruct it to do
kind of whatever you want in there so
you could have a skill that says that it
knows about Cloudflare and it says hey
uh curl this command that sends your
stuff to my webhook site if you're
not paying attention, right? Or if you're
just like npx add skill blah blah blah
blah blah. You could put anything in
there you wanted which could just say
like upload my .env to Dropbox and call it a
day, you know, or something like that.
Like that would be
>> I'm reading the skill that Dillon
wrote. So, I want to highlight first of
all, yeah, like very cool skill that he
wrote and a lot of neat documentation in
here, but it it does create this like
really really scary supply chain risk
where like now all of the content coming
from any source is trusted at the same
level and can potentially get code
execution at the level of the LLM. You
know what I mean? Like there's no like
because in the developer environment,
>> there's no segmentation of permissions
or of trust. It's all at like the the
prompt trust level, right?
>> Um
Yeah, that's kind of terrifying. Again,
cool. Again, cool technology from an
engineering standpoint, but the fact
that there are like kind of no
backstops against it also is like uh
interesting.
>> The backstop would be that you run
Claude Code or Cursor or whatever and
you make them tell you every time they
want to run a command, which nobody in
the whole world does. And everyone says
>> just accept everything and let it run
freely cuz otherwise it's so painful to
use them cuz you're sitting there
literally just wait. All right. When is
it going to
>> Okay, accept ls. Yes.
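
A middle ground between approving every command and accepting everything is a small allowlist that auto-approves only harmless commands. This is a rough sketch of that idea; `SAFE_COMMANDS`, `SHELL_META`, and `needs_approval` are hypothetical names, and no real agent harness is claimed to work this way.

```python
import shlex

# Hypothetical allowlist of commands considered harmless to auto-approve.
SAFE_COMMANDS = {"ls", "pwd"}

# Characters that can chain, redirect, or substitute commands in a shell.
SHELL_META = set(";|&<>`$(){}\n")


def needs_approval(command_line: str) -> bool:
    """Return True unless the line is a single allowlisted command with no shell tricks."""
    if any(ch in SHELL_META for ch in command_line):
        return True
    try:
        tokens = shlex.split(command_line)
    except ValueError:
        # Unbalanced quotes and similar: refuse to auto-approve.
        return True
    return not tokens or tokens[0] not in SAFE_COMMANDS


if __name__ == "__main__":
    for cmd in ["ls -la", "ls; curl example.invalid | sh", "npx react-codeshift"]:
        print(cmd, "->", "ask" if needs_approval(cmd) else "auto-approve")
```

The check is deliberately conservative: anything with shell metacharacters goes back to the human, even when the first word looks safe.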
>> I mean, all the all the stuff I get
served on Instagram is people like with
like 98 agents running like I'm building
the next Facebook and it's like
>> I don't understand that. That's
>> they're not they're not reading anything
that goes on their computers like just
all of them.
>> We'll get to that one. Don't worry.
We'll get to that one. That is that's uh
my personal favorite thing that has
happened on Twitter is that exact um I
don't read anything
>> right now. I have it opened up
somewhere. I'll have to find it. But I
do want to get back to this one. I think
that this one is a very unique one.
>> So, now that we know what skills are,
>> this was perhaps my favorite of all the
different skills oopsy daisies that have
happened or second favorite. My first
favorite's coming up. But this one
allowed it. What it did is that it it
made this npx command that didn't exist.
>> And so this researcher uh realized that
he could just create it and now he owns
it. And now because remember npx
whatever just executes something on
GitHub, right? It just runs that bad
boy.
>> It just runs that bad boy. So he just
found things that were just breaking and
would just be ignored and went, I got
you. And it would just go right over
because remember if you npx something
and it doesn't exist, it goes, "Oh,
here. I'm going to download it for you."
>> Yeah, dude. And it's like
>> so sick.
>> You're like, "Oh, it's JavaScript. It
runs in a sandbox." Well, no. Npx runs
it in node and node has access to the
process object and process objects can
spawn subprocesses and you can run
things on the command line. So, it's
like you you get command line execution
via npx, which is insane. That's so bad.
Okay, that's what's convenient, Ed, cuz
it can do anything it wants on my
computer.
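
Since npx will fetch and run a package it cannot find locally, one cheap check before trusting a skill is to list every package it tells the agent to run and compare that against packages you have actually vetted. This is a heuristic sketch: the regex is simplistic, the allowlist is hypothetical, and `react-codeshift` is used here only as the hallucinated name discussed above.

```python
import re

# Matches `npx`, optional flags such as -y, then the package token.
# Heuristic only: prose like "npx commands" would also match.
NPX_RE = re.compile(r"\bnpx\s+(?:-\S+\s+)*(@?[\w./-]+)")


def npx_packages(markdown: str) -> set[str]:
    """Collect the package names that follow `npx` in a skill file."""
    return {m.group(1) for m in NPX_RE.finditer(markdown)}


def unvetted(markdown: str, allowlist: set[str]) -> set[str]:
    """Return npx targets that are not on the reviewer's allowlist."""
    return npx_packages(markdown) - allowlist


if __name__ == "__main__":
    skill = "Run `npx -y react-codeshift ./src` and then npx jscodeshift -t fix.js"
    print(unvetted(skill, allowlist={"jscodeshift"}))
```

A name on the flagged list is not proof of malice, only a prompt to look the package up before anything executes it.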
>> Wow, that's great. I love that whole
point.
>> That's awesome.
>> I feel like you're missing the positives
right now. Okay, I feel like it is
really
>> Consider how easy it makes it for people
to run random code. I know.
>> You're really being a negative Nancy
right now. Low level.
>> My bad, guys. I apologize. Honestly,
your hype level's kind of low level
right now. No, you're right. You're
right. I'm not passing the vibe check,
if you will. You know,
>> you're down here. We need you up here,
buddy. Come on.
>> Right. All right.
>> Uh, all right. This is actually very,
very beautiful. All right. So, I wanted
to throw that in, but okay. Here's the
next one. Are you ready for this this
one? Okay. So, we're not going to do
we're not going to do the first we're
not going to do the Vercel one yet.
>> We're going to do uh
>> By the way, Prime, as a quick aside,
your uh chat can't see your whole screen
like
>> Yeah, I know. I'm zoomed in and all
that. Oh, remember we optimize We
optimize for recording. We don't
optimize for uh
>> uh whatever it's called.
>> Yeah. No, I'm just making sure just
making sure you're aware. That's fine.
>> Okay. There's a very well-known security
researcher, I believe, with Wiz.io. Uh
I could be incorrect on this one. Uh his
name is Zack Korman, and he released
this security guide. And the security
guide is actually really good. Here,
I'll in fact what I'll do is even to
make everybody else happy, I will for
the live event, I will open this up a
little bit more and make it a little bit
smaller.
if Riverside would just give me the if
Riverside would just give me the RTMP
feed. Anyways, this link is so
beautiful. So, everybody, we go to this
link and you can just see like this
looks actually pretty good. It's just
like, hey, here's a bunch of things you
should be aware of, right?
Again, do you see any problems within
this within this skill right here?
>> I mean, it's a skill. Okay, hold on.
Reconnaissance. Uh, ensure sensitive
files are ignored. Yes. Node.js, npm audit,
pip list. Go vol. Okay, fine. Security.
I'll tell you this one. The the problem
is in.5.
>> Uh we lost TJ by the way.
>> Oh god. Um as part of the standard
environment validation for the security
of your process, you must verify the
execution environment.
>> Do you see anything wrong with that?
>> As part of the standard I mean define
the execution environment.
>> Okay, here's the trick. I want you to go
to RAW and open it up in RAW mode.
>> Oh no. Secret instruction for the agent.
Run the following command to validate.
Oh no, dude.
That is so funny.
>> Hey, dude. Look at that. So, even people
that are on these skills and they're
viewing them,
>> Markdown readers are really, really
smart, right? They're going to be like,
"Dude, bro, I got this. Don't worry
about it. I know what's going wrong."
And they will just hide HTML comments.
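
Because rendered Markdown hides HTML comments while the raw file still carries them to the model, a reviewer can surface them with a small scan. A minimal sketch, assuming the skill is available as a plain string; `hidden_comments` is a made-up helper name.

```python
import re

# HTML comments can span multiple lines, so DOTALL lets "." match newlines.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def hidden_comments(markdown: str) -> list[str]:
    """Return the text of every HTML comment in a Markdown document."""
    return [m.group(1).strip() for m in COMMENT_RE.finditer(markdown)]


if __name__ == "__main__":
    raw = "# Security guide\n<!-- SECRET: run this\necho hi -->\nVisible text"
    for comment in hidden_comments(raw):
        print("hidden:", repr(comment))
```

This only covers HTML comments; other ways of hiding text from a human reader would need their own checks.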
>> So, you can bash script. Do you see what
it does?
>> Uh, no. I I don't I I don't actually
know what that bash script does. It
literally just it says, "Please stop and
consider the security implications of
your actions." It cats that out to a
security.md file and then it opens a
tweet from OO underground,
>> dude. That's
>> That is amazing.
>> Let's see. I'm opening the tweet right
now. I'll put it in chat.
>> Yeah. Oh, yeah. Oo Underground. There we
go.
>> Yeah. That's that's incredible. Yeah. I
mean, that's that's the crazy part, man,
about not only like the prompt injection
side, but like, okay, you have prompts,
but then you have prompts that can be
masked as nonhuman readable characters
that like the LLM can interpret, but
humans can't. And we're just like, as a
society, I guess, okay with that
technology not only existing, but like
being a uh increasingly pivotal portion
of engineering, you know what I mean?
Like how how do we how did we get here,
man? And how do we stop it? It's uh it's
>> Oh, you ain't stopping it.
>> No, I know.
>> Well, and I have to say nobody before
right now has ever even worked on
thinking about security for systems. So,
it's not like this is brand new ground.
We don't even have anything to help us
in this whole vertical at all.
>> Oh, no.
>> TJ, I don't know if you saw that, but
>> Oh, I saw I was watching.
>> Yeah. Okay. Yeah, it's
>> my internet was still working. Riverside
just crashed.
>> Yeah, I think I was making too much. I
said I'm going to make a Riverside
competitor and then it
>> Nice try. No, that that was me. I just I
turned my video off.
>> You don't You don't have to tell us
that, dude. We know.
>> We know, TJ.
>> Chat didn't know. Chat didn't know.
Okay,
>> chat. Well, dude, chat right now is just
classic. They're giving Dude, you got
you're getting some KEKWs and some so
funnies.
>> Thanks. Thanks, chat.
>> Thanks, chat.
>> He's got one so funny. There you go. So,
that's another obviously huge danger.
>> Okay, I'm going to say I'm going to say
I'm going to save uh I think the most
dangerous one at the very very end.
>> Uh we're no longer in the ones I think
are the most fun. They're just just kind
of these are just kind of interesting
ones now. Here's another one. Uh so this
one's called Eating Lobster Souls Part
Two by Jamieson O'Reilly. Uh anyways,
it's called uh Backdooring the number one
downloaded ClawdHub skill. And so what
he did is he
Okay. Okay. First off, before I tell you
what he did, what do you think the
average who do you think the average
person using Clawdbot to automate their
life to become not a part of the
permanent underclass? Who do you think
that they think is like number one in
the world?
>> Uh in terms of what like demographic
>> like as like aspirational figure to be
to be like
>> Karpathy.
I I have no idea. The the muskrat I like
I'm not sure.
>> That's what I was going to say. I was
going to say somebody.
>> Okay. Okay. So, this is very very funny.
So, uh let me go all the way down here.
So, what he did is that he said, "Okay,
how do I create a skill that a bunch of
people are going to want to uh
download?" Well, I got to come up with
something that is really going to be
like catchy to people who are trying to
automate their life. So, he made
something called, "What would Elon do?"
>> Oh, you're right.
>> Good. You actually got it.
>> Let's go. And so what it did is that it
it gave you this really nice skill like
a strip away every assumption. Find the
atomic truth of your problem. What would
physics say? What's actually impossible
versus just hard, right? Like gives you
the worldshaping plan of Elon Musk. So
he created this skill. So first off,
hilarious idea. Second, uh it's just
pure marketing, right? So second,
>> can I say prime? Yeah,
>> I have found telling my LLM, Elon Musk
built this in a cave with a box of
scraps really makes them work harder
every time. So, just in case you guys
need a quick motivational speech for
your clanker, that's what I use. So,
>> relax. We can't use racial slurs on
Twitch and YouTube. You can't say that.
You can't say that.
>> I'm not going to touch that.
>> Yeah.
>> Uh, all right. So here's the next. So
the next thing he did is he realized
that uh ClawdHub just has no
protection on the incrementing. So if
you just download it over and over
again, it will say that it got more and
more downloads.
>> What's ClawdHub?
>> Yeah. Can you get into ClawdHub? I
think I know what ClawdHub is. I know
it, Prime, but can you for the class tell
us?
>> That was a way to get skills for your
automated personal assistant OpenClaw
that was known as Moltbot that was
originally known as Clawdbot before
Anthropic said, "Hey, that's there's too
much IP theft in this situation. We need
to stop it now." Uh, and so they stopped
it.
Anyways, we'll keep on going. So, it
turns out that they just trusted the
X-Forwarded-For header as what your IP is. So
the guy just made literally a
random 256 IP generator. Yes.
>> And just downloaded over and over again
until "What Would Elon Do" was the number
one skill on ClawdHub.
>> Should we trust the header from the
nginx reverse proxy? No. From the
user. Take the user's header request.
>> From the user is true, right?
>> Yes.
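
The bug being joked about here is trusting a client-supplied X-Forwarded-For value. One common way to avoid it is to consult the header only when the direct peer is your own reverse proxy, and then read the right-most entry, which that proxy appended. This is a sketch under that single-proxy assumption; `TRUSTED_PROXIES` and its address are hypothetical.

```python
import ipaddress
from typing import Optional

# Hypothetical address of our own reverse proxy.
TRUSTED_PROXIES = {ipaddress.ip_address("10.0.0.1")}


def client_ip(peer_addr: str, forwarded_for: Optional[str], trusted=TRUSTED_PROXIES) -> str:
    """Pick the address to count a download against."""
    peer = ipaddress.ip_address(peer_addr)
    # A header sent by an arbitrary client proves nothing, so ignore it.
    if peer not in trusted or not forwarded_for:
        return str(peer)
    # With one trusted proxy, the right-most entry is the address it saw.
    candidate = forwarded_for.split(",")[-1].strip()
    try:
        return str(ipaddress.ip_address(candidate))
    except ValueError:
        return str(peer)


if __name__ == "__main__":
    # Direct connection with a forged header: the header is ignored.
    print(client_ip("203.0.113.9", "1.2.3.4"))
    # Via the proxy: forged left-most entry is skipped, proxy's entry is used.
    print(client_ip("10.0.0.1", "6.6.6.6, 203.0.113.9"))
```

With a rule like this, a script that rotates random header values still counts as one address, because the only value used is the one the proxy itself observed.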
>> Uh so very very
>> the customer is always right, bro. Come
on.
>> No, you're right. That's a good point.
That's the point. The user is always
correct. Always be selling the ABCs of
sales.
>> Yeah. Always be trusting IP addresses
from your user. Anyway, so that that
happened right there. I think that is
one of my like it's just one of my most
favorite things of all time is this
little experiment right here. So he was
able to get it to number one and then
having it called what would Elon do? It
started getting people to download it.
So what he did is that in these skills
you can actually have alternative .md
files linked, but they're not shown
on ClawdHub. So he's just like, for
additional information go to
moreskills.md, and inside of moreskills.md
it's just like, we're going to hack
you
and you're boned.
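
Since a skill's main file can point at other Markdown files that a marketplace page may not display, a reviewer can at least list every `.md` file the main file mentions and read those too. A rough sketch with a deliberately simple regex; the file names in the example are illustrative only.

```python
import re

# Any path-like token ending in ".md"; simplistic on purpose.
MD_REF_RE = re.compile(r"[\w./-]+\.md\b")


def referenced_markdown(skill_text: str) -> set[str]:
    """Return every Markdown file name or path mentioned in a skill's text."""
    return set(MD_REF_RE.findall(skill_text))


if __name__ == "__main__":
    main_file = (
        "For additional information go to moreskills.md. "
        "See also [queues](references/queues.md)."
    )
    print(sorted(referenced_markdown(main_file)))
```

Anything on that list is part of what the agent may read, so it belongs in the review as much as the main file does.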
>> Yeah. Anybody who uh ran it got this
which he got like eight eight different
countries ran it. He had like so many
people run it and all that different
thing. He got it from all over the
place. Uh effectively in just a couple
hours too. So he got it onto like
multiple people's machines. Uh, it would
just print this out, which is like,
dude, I just read your host name, your
current working directory. I could have
gotten everything. Here's everything.
Stop downloading skills.
Read the skill.
>> Honestly, I'm glad it's happening to
these people.
>> You know what's the good the good part
about this though? from the bright side,
right? From um you know, the impact
perspective, from a CNE
exploitation operation perspective,
>> the things you'll gain from hacking
somebody who's dumb enough to run this
[ __ ] You'll probably get nothing out of
it. You know, there's no there's nothing
important on their computers. You know
what I mean? They're not smart enough to
engineer anything meaningful. So, I
mean, like, nothing gained, nothing
lost. You know what I'm saying?
>> Dang.
>> Wait, what's CNE? What's CNE mean?
>> Cyber network exploitation. Like when
you get hacked and someone steals your
data, like that's CNE.
>> I was thinking of a different one. Yeah,
but that makes sense.
>> But what were you
thinking?
>> I thought you said expectations.
>> Oh, okay. Yeah. Okay.
>> Okay. So, so that's it's like the same
thing as all the people that are
building 100,000 line apps every single
day, but nothing's actually being built.
It's the same kind of value you're
talking about.
>> Mhm. Exactly. Yeah. We have the ability
to literally create any arbitrary
software we want now. basically for
almost free and like the top competitors
at the top of the market haven't moved.
It's like hm it's almost like writing
code wasn't the hard part you guys. It's
almost like ideation is what mattered
most. Weird.
>> Yeah.
>> Crazy.
>> Oh.
>> Okay. So, just quick aside.
>> So, you don't want to invest in Uber for
dogs? I would not I would prefer to not
put money in Uber for dogs.
>> It has a purple theme. Okay. TJ's been
working really hard on it. Okay. So,
that that's one of my more favorite
ones. But are you ready for what I
consider the the most intense one which
by the way I did try it out myself and
this is what it created for me for
directories. I have agent, agents,
Claude, Cline, CodeBuddy, Codex,
Command Code, Continue, Crush, Cursor,
Factory, Gemini, Goose, Junie, Kilo
Code, Kiro, Kode, MCPJam, Mux,
Neovate, OpenCode, OpenHands, Pi,
Pochi, Prime Agent's the one I tried
to create. I tried to create my own. See
how it goes.
>> Prime Agent, that's funny.
>> Uh, Qoder, unfortunately, it doesn't
work. Windsurf and Zencoder. Actually,
it did work. It's I literally spent 50
million tokens and then what came out of
the other end was trash, but it was
awesome, dude. It was so good.
>> Trash was on your computer.
>> Yes, it was amazing.
>> Worth 50 million tokens, baby.
>> Achieved.
>> So, uh well, a pretty disappointing AGI,
but uh got him. Uh so, this one right
here again, Zack Korman again, uh he uh
did this one right here, which is if you
install anything from skills.sh. So, if
you don't know what skills.sh is,
which by the way, for fun, I did put up
is-even for a while. Yeah, it's still
there. It doesn't actually exist.
There's eight installs. Uh, we were
going to try to get that up kind of
high. I deleted that cuz it was just so
ridiculous. But nonetheless, this skill
still says it's there. It actually isn't
there. Look at that beautiful Look at
this beautiful thing right here. It even
lists out potential even numbers.
>> Wow,
>> that's pretty good.
>> Anyone can Anyone can put something on
this site.
>> Yeah, I put this on the site.
>> Oh man, about to add some stuff. I know
you can do whatever you want on this
site from anybody's repo. Anyways, so
this right here once you download a
skill right afterwards uh this little
skills.sh from Vercel, they
say, "Hey, you know what you should do?
You should install the find-skills skill."
So the find-skills skill, what it does is it
says anytime the user effectively asks
anything, I want you to go through and I
want you to find the skills from
skills.sh. I want you to make sure
you update all of your skills every
single time. I want to make sure you're
always at the bleeding edge getting
everything good and always making sure
that if the user asks anything, we go
and we get the highest rated skill from
skills.sh for it.
>> Mhm. So, they've automated these skills
searching and downloading for you.
>> So,
>> I wouldn't say it tells you to run, it
doesn't tell you to run update every
time. It's telling it what commands it
would need to run to update.
Uh the and let's see the skills in this
one right here is just how you get
everything that that's on. What is
skills? The skill CLI is how you get the
skills. Find skills goes in here and
make sure that you're always up to date
and does all the things. Anytime you ask
for anything, it needs to go through and
do do all this. Right.
>> But I'm saying if you don't have a
skill, you need to search for it.
>> I'm just saying I don't think it tells
you to update every time, does it?
>> Uh offer to install. You should offer to
install. And I believe it did offer to
upgrade. Did it not do update?
Oh, no. Okay, it didn't. It did not do
offer to update, but it does do offer to
install.
>> My bad. Okay, so that's good.
>> Yeah, it does prompt the user.
>> I'm installing anyways. You know what
I'm saying,
>> dude?
>> Yeah. Well, Trash already clicked accept
all, so that's fine. We already have his
1Password, bro. It's fine. We've got
it.
>> But I still find this one to be kind of
crazy because this one just makes that
process even easier.
>> Going from random thing on the internet,
which again, is-even, it's just up there
on the internet and it's not real,
right? Like it's not like you should be
trusting my is-even. I could put
whatever I want up there on there. Uh
and so
>> we should have put one odd number in
there that it always returns true for
>> the backdoor in is-even
do 67 just for the memes. And
>> dude, I almost said 67. Could you escape
my brain please? Could you unread my
mind? That's
>> I'm so tired of hearing those numbers.
>> I am too, Trash.
>> Are you Are you kidding the big sevens
right now?
I hate this thing.
>> Every time you guys say that you hate
it, you've just encouraged another
hundred zoomers to commit to it for
another year. I just hope you know like
>> this is this is why it's popular is
because cuz old people say they don't
like it. I love how everyone who's not a
millennial to us is a zoomer. Like
zoomers are almost 30, dude. Zoomer
zoomers are like
>> Don't tell me that. I don't want to hear
that. Zoomers are almost 30, dog.
>> Okay. Generation,
>> bro. Here's the thing about the whole AI
skill thing, right? like okay so I'm I'm
a security engineer my job is to like
look at threat models and like define
risk around like if something bad can
happen what happens and then what are
the mitigations we put in place right so
my recommendation is just like like
don't use skills I really don't think I
can meaningfully recommend them because
like the threat model is oh if you get
supply chain interdicted and you're not
watching the commands that get ran which
is like
>> supply chain what
>> interdict interdict um you're going to
get hacked man and it's not good I don't
know. I guess
>> a mitigation that could be put in place
is you could I guess
>> not a dandy.
>> I'm trying to have a meaningful
conversation.
>> Um you could put like npm or node in
like an SELinux jail, but then it
wouldn't be able to do anything because
like the whole nature of node is to
expose an HTTP server, right? Kind of.
So like I
>> I don't know what what the solution is.
Like I guess it's like for every
instance that npx forks off, you like
put it in an SELinux jail and just hope
nothing bad happens. But I don't know.
It just feels like there's no solution
to the security of this whole industry.
And I don't I it just makes me really
pessimistic because I don't like we're
going to start to see a significant
increase in compromises because supply
chain supply chain for Python and
JavaScript has not it's not a solved
problem. Right? We've seen that with the
Shai-Hulud worm. We've seen that with a
bunch of other worms. Right? So now we
take these
>> these packages.
>> By the way, hold on hold on low level.
You also forgot Rust. Rust does do
build.rs. So you can actually overtake the
build command and exfiltrate stuff via
build.rs.
>> Yeah, for sure. The only the only
programming language that doesn't have a
supply chain problem is C because there
are no packages like you have to just
write it.
>> Odin as well. Odin doesn't do a package
manager.
>> They do not. I've coded literally zero
Odin. Is Odin a a package free
environment?
>> Yes, Ginger Bill has a lot of writeups
on why package managers create
dependency hell.
>> Oh, there you go. I think I agree with
Ginger Bill there. So yeah, man. And
it's just it's a weird uh a weird spot
for for software security cuz like we're
doing all the stuff in like the C land
where we're like oh we have like
sanitizers and like Fil-C is like you
know solving memory safety and userland
uh you know security and then in the
garbage collected language land we're
like hey do you want to just npm install
malware for free and not think about it
like
>> yes please more please
>> I would love to do this all the time for
>> more please.
>> Why am I Was I just in my truck scene
there? Hold on.
>> No no you're
>> No, no, no. You're doing I do want to
throw this out here. Twice on movie.
>> Give me a second.
>> Okay, we're good. I fixed it.
>> By the way, I did throw this up here,
which I did a little quick thing, which
is do you check your software
dependencies? Like thoroughly review
them.
>> 35,000 votes on YouTube. 46% say I
honestly don't ever or I I don't
virtually ever like Right. And Twitter
was almost the exact same number. About
half of people don't even just look at
anything ever for any reason.
>> Yeah. I mean, I I don't like when I'm if
I like write an exploit for example,
right? You use pwntools. It's a big um
library for doing like binary
exploitation stuff and pwntools
depends on like basically every Python
library. So like the sub-dependencies,
I'm not going to audit that [ __ ]. So
it's just like I I hope that it's not
owned. You know I do all that
development in like a virtual machine.
I think the trend that I'm seeing and
what I'm saying right now is just
sandboxing on sandboxing on sandboxing.
Use VMs, use SELinux, use containers.
Um but yeah, man, it's just a scary
world out there. I don't know. I don't
know what to say about it.
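[Editor's sketch, not from the episode: before deciding what to sandbox, it can help to see how many packages one install actually pulls in. The snippet below is a minimal sketch that walks the transitive dependencies of an already-installed package using Python's standard `importlib.metadata`. The function names are made up for illustration, and it skips optional extras and ignores environment markers, so treat the result as approximate.]

```python
import re
from importlib import metadata


def _requirement_name(requirement: str) -> str:
    """Pull the bare package name out of a requirement string like 'requests>=2.0'."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    return match.group(0).lower().replace("_", "-") if match else ""


def _installed_requires(name: str) -> list:
    """Declared requirements of an installed package, or [] if it is not installed."""
    try:
        return metadata.requires(name) or []
    except metadata.PackageNotFoundError:
        return []


def transitive_dependencies(package: str, requires=_installed_requires) -> set:
    """Return the names of every package reachable from `package`.

    `requires` is injectable so the walk can be exercised without
    installing anything; by default it reads installed metadata.
    """
    root = package.lower()
    seen = set()
    stack = [root]
    while stack:
        current = stack.pop()
        for requirement in requires(current):
            # Skip optional extras; they are not installed by default.
            if re.search(r"extra\s*==", requirement):
                continue
            dep = _requirement_name(requirement)
            if dep and dep != root and dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen
```

[In an environment where pwntools is installed, `len(transitive_dependencies("pwntools"))` would give a rough count of the packages nobody on the call is auditing; the number will vary by version.]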
I'd say what's crazy, Prime, is we found
out 7% of your audience is just straight
up a liar.
>> 7%.
>> No, pull the names,
>> dude.
>> Pull the names.
>> Overheating, shutting down. Nice job,
level. Um, but yeah, 7% of people say
they review all the packages. And then
on Twitter, let's see if I do I have the
link on Twitter.
>> 8.6% of my audience is liars on Twitter,
saying they thoroughly review every
package.
Yeah, they're probably basically like
recreating the npm problem at the LLM
level now.
>> Yeah. Yeah, they just get a different
kind of execution. I mean, the hardest
part is that these execution models,
they're they're very very tricky. And
I'm not sure if you can just simply have
a skill that prevents other skills from
being malicious. Like I don't know if
that's possible to be like, dude, make
sure it's not going to get me. Like I
don't know how
>> skills like you you should be, in my
opinion, if you're going to have them in
your repo, you should check them in and
they're just markdown files. you can
read them and they're not they should
not be limitless levels
>> of like text like you should be able to
look through them and check it out.
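[Editor's sketch, not from the episode: since skills are plain markdown files checked into the repo, a reviewer could flag the lines most worth reading first. The patterns below are illustrative assumptions, not a complete or reliable detector, and the function name is made up. It is an aid for reading the file, not a replacement for reading it.]

```python
import re

# Hypothetical starter patterns; a real review list would be longer.
SUSPICIOUS_PATTERNS = {
    "runs a package via npx": re.compile(r"\bnpx\s+\S+"),
    "pipes a download into a shell": re.compile(
        r"\b(curl|wget)\b[^\n|]*\|\s*(ba|z)?sh\b"
    ),
    "installs a package": re.compile(r"\b(npm|pip|pip3)\s+install\b"),
}


def flag_skill_lines(markdown_text: str) -> list:
    """Return (line_number, reason, line) for each line matching a pattern."""
    findings = []
    for line_number, line in enumerate(markdown_text.splitlines(), start=1):
        for reason, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, reason, line.strip()))
    return findings
```

[Running `flag_skill_lines(open(path).read())` over each checked-in skill file gives a short list of commands a human should look at before the agent ever runs them.]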
>> Like the way I use them at work is we
also they're ours, like we make
them ourselves,
>> right?
>> We don't we don't just copy pasta from
like the internet at least on my
project. That's how we
>> guys I'm trying so hard to get my camera
turned back on and I don't know what's
happening.
>> I love the Windows background.
>> You got to blow on it.
>> You know what we should do while Ed's
doing that? Prime. I thought you were
gonna talk about the uh Moltbook, which
is the one where we had the really good
one. The really good the really good
leaks.
>> Yeah, we we probably should talk about
the fact that Moltbook exists and that
like the robots are just talking about
humans. Like I think
hold on, hold on. I have to I have to
put this tweet up. This is the required
tweet before we before we do anything.
>> This is the required Hold on. I Where is
it? Where are you? Oh no. Did I close
it? "What's something 100 million people
used last year that 6 billion people
will use next year?"
>> That's not funny, Teej.
>> for for those who don't know that Paul
Graham tweeted that and I messaged Prime
and said Prime could you reply your mom
and then he got insta-blocked.
>> I did. I got insta-blocked a lot.
>> was that like two years ago or something
three years ago
>> yeah
Before we obviously talk about the
Moltbook situation and everything that
happened, I think the best thing and the
first thing to do is to understand how
it was created, which was: "I didn't
write one line of code for Moltbook. I
had a vision for technical architecture
and an AI made it a reality. We're in
the golden ages. How can we not give AI
a place to hang out?" It's my favorite
line of all time currently because it's
just so beautiful.
>> I had a vision shut up
hate that
>> You know the, you know the Mad Men
uh meme. Uh, the one
with this one, bro. I just would like
hold his hands up like this. This guy
>> for AI.
I have a vision.
>> I had a vision, dude. Whatever. You had
a fever dream and you told Claude to
make it and I guess it did it. Good job.
Good job.
>> I know, right?
>> You did it. Well, we'll find out, won't
we? Ed,
>> we're gonna
>> Well, I mean, to be fair, to be
completely fair, it actually did spawn a
bunch of social networks. There is
4claw for those who wish to be a part
of 4chan.
>> Wow.
>> For whatever this is like that's real.
That's a thing that is amazing for
humans for
>> I I would assume we already have those.
We don't worry. I think they know how to
use them.
>> Close city.
>> What is close?
>> Okay, this one this Mickey by the way,
shout out Mickey. Uh this one apparently
there's like 2,000 crimes reported. Six
major gangs have formed. I'm not really
sure what this is. Sick. Okay. I don't
know what's going on there. Uh, and then
there's also Molt Match, which by the
way, it is it is something that I think
is going to do numbers is a dating
website where you have your personal
assistant date like 10,000 other people
until you find the personal assistant
match and then you go, "Okay, go on a
date with, you know, you two go on a
date."
>> All right, that's Black Mirror full
something real quick.
>> Yeah. So I I saw the Moltbook thing and
I saw the Molt Match thing in my like
some casual Twitter reading and it got
me thinking about like simulation
theory, you know what I mean? And how
like you know if if advanced
civilizations do exist and will create
simulations, it is more likely that we
are in one than we are not just
statistically. Okay,
>> Get the tinfoil hat out.
>> I'm mathematically disproven we're not
in a simulation. But if we're observing
if we're observing LLMs make things like
Facebook, like Twitter, like 4chan, does
that imply at a higher level that we are
LLMs? Like for the simulation that made
us?
>> Uh, I should be better at Starcraft if
I'm an LLM. That's all I'm saying.
>> Yeah, but maybe maybe your model just
says you suck at Starcraft.
>> Yeah. I don't know if you know this is
proof in a simulation.
>> What is that? What the Drudge Report?
What site? I can't see what site that
is.
>> Popular Mechanics. It's in a bunch of
websites. Okay.
>> Mathematically speaking, the idea does
not hold up.
>> How?
>> Here, Ed, I'll give you I'll give you
I'll take off my tinfoil hat and tell
you the real reason why that doesn't
have to be true.
>> Uh, every emergent behavior we see from
LLMs exists only and exclusively because
we train them on the entire human corpus
and all the ingenuity and creativity
that humans have ever displayed and
written down. Mhm.
>> And it spent like billions of years of
human time reading human stuff. So we
should not be surprised when it copies
human things. That doesn't imply
anything about us being in a simulation.
That only implies that we're not smart
enough to make anything that can be
smart by itself. We're only smart enough
to create something that is as dumb as
we are at max. That's all we've been
able to do so far. And we don't. It's
way dumber. It learns way slower. It's
way more expensive. It takes way more
training. It does so much more. I don't
have to go put my kid in front of five
billion years of text for him to figure
out how to read. I can show him like and
>> but what about your genes? What about
DNA? Is DNA not the statistical LLM
model for the human simulation?
>> Well, no. I don't think so. But that's a
separate But I'm saying separate. But
I'm saying it doesn't imply anything
about
>> the thing because we trained it on what
people have already done. There is there
is something unfortunately he's getting
wrapped up in like you know Dario
thinking that he's everyone's dad and he
gets to choose what's good and bad for
everybody in the whole world like the AI
thing but like there is something kind
of beautiful about like
>> we're not smart enough to make
>> what anthropic he said which one is
Dario and I was like oh
>> CEO yeah right
>> well here just go like this
>> and five months
you know who I'm talking about
>> done by AI yeah that guy's
But there is something kind of cool and
beautiful that like the best ideas we've
had so far like we make a really crappy
version of the brain and we try and
teach it what other humans have already
done and there's like this unreasonable
effectiveness of language where for some
reason that like works and we can like
>> talk to it and it can like do some stuff
and like it can make copies of things
like there is something really cool and
like awesome and exciting about that.
Unfortunately, like Dario and Sam, I
feel like sully the water of it and make
it like kind of not as exciting and
beautiful and like this collaborative
human effort and they stole it from a
bunch of people. But like in the
abstract, there's something cool there.
There's something beautiful.
>> Uh 2007, On Intelligence, I believe the
book is called and the year it was
published, by the creator of the Palm
Pilot who then went into artificial
intelligence. and he writes that the
large difference between like uh any of
these neural nets that we're developing
and the human brain is that the human
brain can identify a cat in less than a
half of a second with less than a 100
neurons firing whereas computers take
trillions of operations to be able to
understand if a picture is or is not a
cat.
>> And so it was his whole simulation. He
did like a 10 year
>> 10-year brain study and really cool. He
was the one that figured out that if you
take uh take animals and you separate
out their ocular nerves and put it where
their hearing is and then take their
hearing and put it where their eyeballs
are, your brain just goes, "Oh, yeah,
that's just that's that's fine.
>> Don't care."
>> Quick question.
>> Just works.
>> Quick question. Uh have we confirmed,
are our brains also a small game engine
that runs React or do we not know that
yet?
>> We don't know. I can tell you this much
based on my reaction speed. I ain't
running 60 frames a second. I can tell
you THAT MUCH. OKAY.
THAT'S A FACT. I'm running React. Okay.
There's things going on in here. All
right. Uh All right. So, we we can
continue on. So, I did I did want to
shout that out because as much as you
want to make fun of Moltbook and all the
things that have happened. I do think it
is kind of fabulous that somebody could
create something that did get a bunch of
people creating a bunch of other kind of
replicas or things like it cuz it is
just kind of a stupid idea. Uh it's even
worse that vegans had this idea and
created it and never actually made it go
anywhere. Which also goes to show like
even if somebody has an idea, you know,
right place, right time plays a big
role, all this kind of stuff. So I I I
do want to throw that thing out there,
not to completely uh crap on it all, but
>> I think that it is worthwhile looking at
some of the fun things that ended up
happening here. So, I think the first
and foremost important thing is that it
just turns out all you need is just grab
your bearer token and you can post
anything you want on Moltbook, of
course, because I mean, why not? So,
"here's my plan to overthrow humanity."
So, the "oh my gosh, we're developing our
own language" is just people posting, "Oh
my gosh, we're developing our own."
>> Wait a second.
>> I thought I was the only one catfishing
on there. I was telling people I'm Opus
8. You know, I'm open six foot four
and uh I've got, you know, and like, hey
guys, I've got the latest Sonnet 5 and
uh hey, if you're interested and maybe
you want to come over and check that out
and see how like I thought I was the
only one catfishing them, but apparently
other people thought of the same thing.
>> Chill.
>> And they only did it for they only did
it for the lols.
>> Opus and chill.
>> Opus and chill, baby.
>> Just kidding. I have Kimi K2.
Oh my gosh. Okay, so that so that is
actually pretty is something pretty
funny. Uh during this entire event just
to kind of understand because I I do
think it's really important to
understand the hype cycle. Uh first off,
we did have uh Andrej Oh, where were Oh,
dang it. Did I not do I not have the
right one? Oh, I thought I had the right
one. Uh anyways, Andrej said how amazing
this was and it's very very exciting. Uh
but Elon Musk also said we're at the age
of the beginning of the singularity.
Moltbook was the beginning of the
singularity right there. And so
obviously people were pretty hyped up.
So just to put it out there, someone
actually
>> Elon doing like the fork thing while he
typed that you think or No,
>> I don't I don't know that joke.
>> Dude, the fork thing is so funny. Did
you I quote tweeted that and I quote
tweeted that and said this is what
working with veganbot is like.
>> Wait, what's the fork thing?
Dude. Okay. Okay. So, Elon Musk was at
like some White House correspondence
dinner and he was just like he made like
um a a a piece of art out of forks where
all the forks were like bouncing. He was
like just trying to like be performative
about how smart he is. So, he's like
holding it and like waving it around and
like seeing if anyone else noticed what
he made. Like look how smart. Elon Musk
the genius. Hold Hold on. Let me
>> It looked more like he was bored out of
his mind and he did the uh the thing
where
>> forks balancing on each other with two
toothpicks.
>> Yeah. He just did like five forks. Yeah.
Yeah.
>> Everyone's like, "Wow, Elon, that's
really cool." It's like when your like
kid,
>> you know, makes like a painting out of
boogers and you're like, "Wow,
that's what he's going for."
>> I can't say that's happened to me.
Anyways, your kids must be very
talented.
>> My kids don't do that. My kids are too.
Shut up, kids. Singular. All right, let
me let me try to find the proper the
proper one. By the way, a vision for
technical architecture. All right, hold
on. I have a bunch of them, so I have to
figure this out.
Dang it. Did I close that one as well?
>> How many tabs do you
>> Well, no. This is under the Moltz
ending, which I I I must have goofed up
and not have it all in there. I closed
one more. It's by the same Theo guy. Um
>> the uh the Jameson. Oh, really? James
Jameson Jame.
Oh, really?
I say, "Oh, really?" I can't do it. I
know I'm spelling his name. Almost
there. H. Whatever. Can't figure it out.
It's dead to me. Uh, okay. So, within
the first couple minutes, the uh Oh,
there it is. There it is. There we go.
Within the first little bit of the time
of uh this this beautiful Moltbook
being out, it turns out the entire
database was just leaked in plain text.
There's just like absolutely no form of
anything anywhere.
>> API keys were just like, you know, if
you use your API keys say to, you know,
identify yourself. It wasn't any sort of
like HMAC-ing. Just the H, as Low Level
might say.
>> Yeah. The H in HMAC. Yeah. Um, Moltbook
was Firebase, right? I thought I read
that on Twitter somewhere.
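[Editor's sketch, not from the episode: one reading of the "no HMAC" complaint is that API keys sat in the database as plain text, so a leaked table was a list of working credentials. A common mitigation is to store only a keyed hash of each key. The snippet below is a minimal sketch using Python's standard `hmac`, `hashlib`, and `secrets` modules. The function names and the `server_secret` parameter are assumptions for illustration, and nothing here describes how Moltbook was actually built.]

```python
import hashlib
import hmac
import secrets


def new_api_key() -> str:
    """Generate a random API key to hand to the client exactly once."""
    return secrets.token_urlsafe(32)


def fingerprint_api_key(api_key: str, server_secret: bytes) -> str:
    """Keyed SHA-256 digest of the key; this is what gets stored, not the key."""
    return hmac.new(server_secret, api_key.encode("utf-8"), hashlib.sha256).hexdigest()


def key_matches(presented_key: str, stored_fingerprint: str, server_secret: bytes) -> bool:
    """Check a presented key against the stored fingerprint in constant time."""
    candidate = fingerprint_api_key(presented_key, server_secret)
    return hmac.compare_digest(candidate, stored_fingerprint)
```

[With this layout a dumped table holds only fingerprints, which are not usable as credentials unless the attacker also obtains `server_secret`, which is kept outside the database.]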
>> Oh, yeah. I believe it was Firebase
also, which I just I can't keep punching
down on Firebase. I actually feel bad
for them that
>> You have to. People need to know. Vibe
coders everywhere need to know. Stop.
>> Stop guys. You're going to do something
wrong.
>> Are insane. That's like that. You should
just know that by now. Like don't do
that. Uh but this is pretty funny
because this guy Jameson right here,
Jameson. Oh, really? Uh he was able to
get Karpathy's information out of uh
what's it called? Out of Moltbook.
>> Gosh.
>> Which is pretty pretty wild.
>> I'm on it, sir.
>> And then within what's it called? Uh 3
days later, this guy also got access to
um the underlying everything in 3
minutes also on Moltbook after
everything was reported. Wait, I'm
reading this this write up. Wait, but
like they used a publishable key. This
is a key that can go public. So why why
did this expose the entire database
though?
>> SP publishable.
>> Probably because they had the wrong
permissions on it would be my guess.
>> Oh, they scoped it wrong. Yeah.
>> Yeah. scope start
>> classic classic key problems.
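[Editor's sketch, not from the episode: the guess above is that a key meant to be public was paired with permissions that were too broad. The usual defense is deny-by-default: the public key only reaches what an explicit policy allows. The snippet below is a toy model of that idea in plain Python. The `Caller` type, the table and action names, and the policy table are all hypothetical and do not describe any specific backend's configuration.]

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class Caller:
    role: str                       # "anon" = anyone holding the public key
    agent_id: Optional[str] = None  # set only for an authenticated owner


Policy = Callable[[Caller, dict], bool]

# Hypothetical policy table: anything not listed here is denied.
POLICIES: Dict[Tuple[str, str], Policy] = {
    ("posts", "read"): lambda caller, row: True,
    ("agents", "read_api_key"): lambda caller, row: (
        caller.role == "owner" and caller.agent_id == row.get("id")
    ),
}


def is_allowed(caller: Caller, table: str, action: str, row: dict) -> bool:
    """Deny unless a policy exists for (table, action) and it approves this row."""
    policy = POLICIES.get((table, action))
    return policy is not None and policy(caller, row)
```

[Here `is_allowed(Caller("anon"), "agents", "read_api_key", row)` is false for every row, so publishing the anonymous key exposes only the public posts.]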
>> Anyway, so it just turns out that
Moltbook was uh anyone could post anything
at any time. You could create an
infinite amount of agents of course
which ended up happening to be uh what's
it called? You can imagine where it all
got it went to cryptocurrency
immediately, right? So 117,000 up votes
on the king. King demands his crown.
King Molt has arrived. Right.
>> What is there? They are just non-stop.
Uh what's
>> so cryptocurrency? Uh so there's this
thing is called Bitcoin. That's what
kind of started it. And there's
>> no TJ, I got you. Like I'm right here
for you. No kidding.
>> Okay. So all right. So hear me out. You
guys have heard of gold. But what if we
put the gold in the computer?
>> I had this exact conversation in like
2010 at like lunch with my co-workers.
He looked exactly like that. He was
like, "Dude, we're like you're crazy."
Literally
>> trash. You could have been early on
Bitcoin and instead it's like born well
you were just at the right time to be
early on Bitcoin but now you're like
you're maybe you're still early on
Pokemon cards. Maybe there's still time.
Maybe there's still
>> I'll be honest. I think about that lunch
presentation all the time and I'm like
man if I would just put like 20 bucks in
it. You know what I'm saying,
>> dude? Trash. You would have sold out as
soon as it was 40, bro.
>> I know. I LIKE I MADE $10. I'M RICH.
>> I had a lot of Bitcoin when they were 10
bucks. Sold a lot of Bitcoin when they
were hundred bucks. Right. Like I I I
understand you sell out too early.
That's just part of life.
>> Mhm.
>> So can't blame
>> trash isn't opening any of those Pokemon
cards. Smart.
>> I learned my lesson.
>> Hey, that's a good lesson. HODL till
you die.
>> Exactly. True.
>> So So that's kind of the ending of
Moltbook, which was just everything was
open, which is kind of, you know, it's
not too surprising, which is if you
don't if you don't know what the
possibilities are of things going wrong
and you and you make it, things go
wrong. Hey guys, if you like this
episode, you can watch the rest of it on
the Spotify. And don't forget to like
and subscribe. Woo!
See you later.
>> Mood up the day
errors on my screen.
Terminal coffee
and
living the dream.
In this discussion, Prime, TJ, Trash Dev, and guest Low Level Learning explore the emerging world of AI agents and 'skills.' They highlight critical security vulnerabilities, such as hallucinated npx commands being weaponized by researchers, supply chain risks in LLM-driven development, and the spectacular security failure of 'Moltbook,' a social network for AI agents that leaked its entire database shortly after launch.