Is Mythos too Dangerous?
Here we are. Claude did it again.
Dropped a new version of itself. Okay.
But this one, it has a very special
name. Okay. It's It's much better. We're
not on the old Sonnet or Opus or Haiku.
No, we've been upgraded to Mythos. The
greatest model to ever be dropped. In
fact, it's so great. It's so fantastic
that you you the per Yeah. You sitting
there. Yeah. You right now. You can't
you can't have you can't have that.
Okay. Hey, you're not allowed to touch
that. Apparently, this model is finding
bugs and uh able to crack out of
sandboxes like nobody's business. We are
talking about able to take down
computers just simply by connecting
them. They're the Chuck Norris, God rest
his soul, of of all of the models, okay?
It's just able just to destroy
everything apparently. Okay, you got to
hide your kids, hide your Raspberry Pis
cuz they're taking everybody out here.
So, let's talk about this new model for
a second. They kind of released a bunch
of stats for it and then they released
the part that would be considered the
scary part. The part that you always see
Anthropic does, right? Because this is
pretty typical of Anthropic is they have
a new model and then what do they do
with it? They're like, "Dude, by the
way, AI super scary. The most scary
ever. So scary. US government. Hey,
government so scary. You better put some
regulation in place and help us control
because man, it's scary." So, first
let's just go with the least interesting
of the items, which honestly I don't
care about any of these numbers cuz
honestly it really means nothing to me.
But here we go. SWE-bench Pro:
Mythos Preview, the new model, 77.8%
versus Opus 4.6 at 53.4%. So, as you can
see, it's dramatically better.
Practically 20% better. Now, what does
that actually mean for you or me? Well,
it doesn't really mean anything because
you're not going to touch this model.
You know, you're not allowed to.
Nobody's allowed to. Only a few people
at Amazon, Google, and Apple, and a
couple other top companies and the US
government are allowed to touch this
model. And you can see the rest of the
benchmarks just seems to perform super,
you know, super much better than Opus
4.6. On the reasoning side, on GPQA
Diamond, Mythos Preview dominates Opus
4.6. Humanity's Last Exam, Mythos Preview
without tools still gets an F, but I
mean, we're we're getting near D
territory. And you know what? D's earn
degrees at some of the places. And
Mythos with tools actually does get a D.
Okay, it is passing some colleges. This
is some serious PhD level intelligence
going on here. The actual interesting
part about the model is security
research. I've already just released a
video about this. How Daniel Stenberg,
the uh maintainer, lead maintainer of
curl has said, "Hey, AI reporting, it's
gotten a lot better. It's actually
starting to show real issues." For a long
time, AI inside the security field has
been a security issue itself because it
just inundates any maintainer with so
many fake reports that it's actually
impossible for maintainers to really be
able to operate on their own repository.
But then a kind of a shift, a big shift
happened with 4.6. We're actually
starting to see AI being actually, oh
wow, no, this is actually serious now.
Now it can seriously find things. But
this new one, Mythos, apparently is real
good. During our testing, we found that
Mythos Preview is capable of identifying
and then exploiting zero-day
vulnerabilities in every major operating
system and every major web browser when
directed by a user to do so. The
vulnerabilities it finds are often
subtle and difficult to detect. Many of
them are 10 or 20 years old with the
oldest we have found so far being a now
patched 27-year-old bug in OpenBSD, an
operating system known primarily for its
security. Mythos Preview wrote a web
browser exploit that chained together
four vulnerabilities, writing a complex
JIT heap spray that escaped both
renderer and OS sandboxes. It
autonomously obtained local privilege
escalation exploits on Linux and other
operating systems by exploiting subtle
race conditions and KASLR bypasses. It
autonomously wrote a remote code execution
exploit on FreeBSD's NFS server that
granted full root access to
unauthenticated users by splitting a
20-gadget ROP chain over multiple packets.
It even found a 16-year-old
vulnerability in FFmpeg, the hand
artisanally crafted library. So if this is
all to be believed and this is actually
what is happening and we are literally
entering into the most impressive era
for AI ever to the point where releasing
the model publicly would result in every
system that has ever existed being
hacked. Well we got ourselves a bit of a
problem now don't we? And that is why
Anthropic has said the following. "We do
not plan to make Claude Mythos Preview
generally available. We plan to launch
new safeguards with an upcoming Claude
Opus model, allowing us to improve and
refine them with a model that does not
pose the same level of risk as Mythos
Preview." So that 20-plus improvement on
SWE-bench, baby, you're never going to
taste that. Okay? You're never going to
get your sweet hands on that one. But
you might get a smarter Claude. Does
that mean we're entering into the nation
of geniuses on a GPU that's stored in a
warehouse in which Anthropic owns and
you are now able to create everything
you've ever wanted just with a simple
quick text description? Well, it doesn't
necessarily sound like it. It sounds
like some people might have it, but I
don't think you're going to have it
anytime soon, and I'm probably not going
to have it anytime soon either. See, the
thing is, they're going to release it to
a few select tech cartel leaders, and
who knows when it's actually going to
happen. So, is it as big of a deal as we
are seeing or is it not? Obviously, we
can see the receipts with FFmpeg saying,
"Hey, thanks for the patch." But some
aren't buying it. You got Boris saying,
"Hey, it's very powerful and should feel
terrifying." Kind of continuing to push
the same narrative, but just never
forget the exact same narrative was
pushed with ChatGPT 2. It is really
dangerous. You got to be super careful.
It's honestly too dangerous to release.
Well, the best we can hope for is that
ChatGPT also happens to have ChatGPT 6
or something, or ChatGPT Cosmos, going to
be coming out and that will force
Anthropic to have to catch up and
release their super powerful model which
is also just a weird place to be in that
we're I what did I just say there? Me
rooting for OpenAI? Oh my gosh, something
got into my head there for a second. But
I think Lowle said it best. They called
it Mythos because no one's ever going to
see it. They're literally trying to rage
bait us right now. I'm feeling it. I'm
feel I'm feeling the baiting. You know,
it's hard not to look at all this and
realize that there's some part of my
skills every year becoming more and more
irrelevant. You know, the ability to
hammer out all those Vim shortcuts. Kind
of a dying skill, right? It's a little
sad. I I mean, I personally think it's
pretty dang sad, but it's an ending
skill. It's a It's a skill that I don't
think the younger kids, them young
fellas, are going to really learn
because they don't really have to learn
it. And it's becoming more and more
apparent that people would rather just
hammer on to a model than actually learn
any of these tasks or these like really
fine difficult things anyways. And so
here we are. So the things that you know
I have defined myself with over the last
20 years. See while you guys went out
smoking with cigarettes, staying up too
late, probably experimenting with
mind-altering drugs. I on the other hand
was sharpening my skills. And now those
skills, maybe they're a little bit more
useless. Every single year, a little bit
more useless. But honestly, I'm okay
with it. I know that might be strange to
say, but I am okay with it. I'm okay if
these things do turn out to be fantastic
that I don't have to be uh I don't have
to identify myself as the greatest Neovim
user of all time. It's cool. I can still
use Neovim and I can still enjoy it, but
it doesn't have to be my identity. And
also I'm just happy I've done all those
years of trying to understand how to
make good software because now even if I
do AI generate something, I can go, "Oh
yeah, this is, here's why it's wrong." I can
just understand things at a level
which people who've never even touched
software have no idea about. So hey, am I
happy about that still? Sure. And maybe,
you know what, one day those skills even
could become invalidated. And if they
are I guess I have to be okay with that.
That's it. I just kind of wanted to yap
about this because, you know, it's it's
been an interesting time and I genuinely
really appreciate that I still have uh
the chance just to yap to yap to you
guys, you know, to kind of talk about
these things cuz I know a lot of people
they feel kind of really unsure about
everything. They feel kind of worried
about everything. Uh especially with
just all of just the crazy talk from the
hype beast being like, "Oh, it's the end
of the universe." Even this report right
here by Anthropic being like it's it
knows how to take advantage of every
single browser, every single operating
system. It's finding bugs 27 years old.
You're absolutely going to get destroyed
if we let this thing out. It's just
constant fear instilling,
you know, just attacks on you at all
times. And you know, I see these things.
I'm like, "Okay, hey, I'm glad, if
it really is that, that Anthropic is making
quote unquote steps towards Amazon and
Google and all this nonsense to be able
to patch all these problems, but at the
same time, I don't want to have to live
under this like intense pressure and
this intense constant barrage of just
negativity." Like I can look at it as
like, wow, I now have the ability to
accomplish things that before would have
taken me a lot longer. They would have
been a lot harder. I would have been
less likely to even start them just
because I can only have so many side
projects. Now I get the benefit to be
able to abandon several side projects.
Like I have been able to abandon more
projects than I've ever done in my
lifetime thanks to the power of AI. And
honestly, that feels pretty amazing.
Hey, the name is ThePrimeagen. Hey, is that
HTTP? Get that out of here. That's not
how we order coffee. We order coffee via
ssh terminal.shop. Yeah, you want a real
experience. You want real coffee. You
want awesome subscriptions so you never
have to remember again. Oh, you want
exclusive blends with exclusive coffee
and exclusive content? Then check out
CRON. You don't know what SSH is?
>> Well, maybe the coffee is not for you.
Living the dream.
Anthropic has released a new, highly advanced AI model named Mythos Preview, which demonstrates significantly improved performance over previous versions like Opus 4.6, particularly in security vulnerability detection. This model is reportedly capable of finding and exploiting zero-day vulnerabilities in major operating systems and browsers, including a 27-year-old bug in OpenBSD. Due to its advanced capabilities and potential risks, Mythos Preview will not be made generally available. Instead, Anthropic plans to incorporate its advancements into future, safer models like Claude Opus, while also developing new safeguards. The release has sparked discussions about the rapid advancement of AI, its potential impact on various skills and professions, and the ongoing debate around AI safety and regulation, with some viewing the model's restricted release as a form of "rage baiting" while others express genuine concern about its power.