The Mythos Situation | TheStandup

Transcript

0:00

So, George Hotz, we've invited Low

0:01

Level on to help us kind of work through

0:03

this because honestly, I would just like

0:05

to say George Hotz sounds like an anime

0:07

villain in this post and it's very

0:09

exciting and it makes me just want to

0:10

high-five him so bad.

0:16

Uh, anyways, sorry. Says the following.

0:18

What if I release a zero day a day until

0:21

a big new model is released? Will this

0:23

finally make OpenAI and Anthropic shut

0:25

up about cyber security risk? Mark like

0:29

these things are not that hard to find

0:31

in most software. I heard something

0:32

about costing 20K in tokens. I'd do it

0:35

for less if it wasn't for some whiny

0:37

bug bounty program. The reason there

0:39

aren't zero days everywhere is cuz

0:41

nobody seriously looks because hacking

0:43

other people's with them is illegal

0:46

and criminals are usually not very

0:47

skilled or they would choose a different

0:50

line of work. Want more zero days to be

0:52

found? Make hacking legal. Until then,

0:55

don't try to claim it's hard. It's just

0:57

not incentivized.

0:58

>> I want to say first off, I don't think

1:01

criminals are dumb or unskilled. Please

1:04

don't hack me. I just want to get that

1:06

out of the way. You guys are smart and

1:08

handsome and you're my favorite people.

1:11

I just want to make sure that that's

1:13

clear, please.

1:14

>> Anyways, Ed, proceed. I do want to say

1:17

one thing too that has nothing to do

1:18

with the the actual content of this

1:20

which which Ed will take over and that's

1:23

just like if I were George Hotz I would

1:26

never have been able to like resist

1:30

naming my uh X feed Hotz Takes

1:33

>> because it's so like you know what I

1:36

mean like I good on him for not going

1:39

there because I would absolutely I'm

1:41

like like I would have prefaced that

1:42

tweet before I typed it with here's

1:45

another Hotz Takes for you. Hotz Take

1:47

for you. Right. It would be so good.

1:49

Anyway,

1:50

>> so good.

1:50

>> Take it away.

1:51

>> Wait, hold on. Hold on. There's one

1:52

there's one more thing before we get

1:53

started. There's just one more small

1:55

thing I want to say. Let me just uh take

1:56

this quick thing and I'm going to put it

1:58

up here

1:59

>> and then it's time for the big reveal.

2:01

>> Low Level responds with, "Holy this is

2:05

the dumbest take I have ever read." I

2:07

just wanted to make sure just in case

2:08

anyone was wondering.

2:11

>> Yeah. Yeah. I mean, I do kind of feel

2:13

that way. Um, so let me just preface

2:15

this. First of all, it was called the

2:17

Cold War because the Cold War was cold.

2:19

Oh, because Russia is cold. Um, it's a

2:20

it's a George Hotz reference if you're

2:22

if you're an OG.

2:22

>> That makes sense, though.

2:24

>> Did you find the errors?

2:25

>> I don't even know what they look. What

2:26

do they even look like?

2:28

>> They're in the phone.

2:29

>> In the phone?

2:31

>> Yeah, they're definitely in there. I

2:33

just don't know how we labeled them.

2:35

>> I got it. Don't worry.

2:36

>> You got to figure it out. We're running

2:38

out of time. Prime, you got to find them

2:40

and meet me at the standup.

2:42

>> Roger.

2:45

They're hit the fo.

2:48

It's so simple.

2:52

Get all the context you need to debug

2:54

your problem because code breaks. So fix

2:56

it faster with Sentry.

3:02

First of all, I have no problem with

3:03

Geohot. This isn't like some weird drama

3:04

fun thing. I want to kind of set the

3:06

table straight with that. Um, but yeah,

3:08

I I think the the argument that Geohot is

3:12

trying to make here is that the only

3:14

reason more zero days are not found is

3:17

because there's no incentive. Um, okay.

3:20

Well, I I don't agree with that. First

3:22

of all, there are plenty of bug bounty

3:24

programs out there that will literally

3:26

pay you to find vulnerabilities. Uh, and

3:28

some of them pay very well. Like for

3:30

example, the the Apple iPhone Zero-Click

3:32

RCE bug bounty will pay you literally $2

3:35

to $3 million if you can find a zero-click

3:38

RCE in the iPhone and then even something

3:40

lower like a Microsoft like I think

3:41

MSRC's payout for like Windows RCE is

3:45

like 250K to 500K right now for like a

3:47

zero click on Windows. So there is money

3:50

to be made in the in the AI or in the in

3:52

the vulnerability research space, right?

3:54

And I think all Geohot is trying to say

3:57

here is something something something.

3:59

Uh the Mythos press release was bad,

4:01

right? It's a it's a marketing campaign.

4:03

Whatever you want to say about it. Um

4:05

and so I I understand why people are are

4:07

making that argument, right? Like you

4:09

know it's very I think bad PR for

4:12

company that sells exquisite tool to

4:14

hold on to exquisite tool and then not

4:16

give access to it and say only special

4:18

people can have our tool because it

4:20

makes you look like an Um, but

4:21

I think regardless of your thoughts on

4:24

the marketing of that, it is important

4:26

to recognize the fact that if you go uh

4:28

prime, can you go to cyberjim.com real

4:30

quick and go to the graph? It's on the

4:31

homepage there.

4:32

>> I'm gone.

4:32

>> While he's doing that, the the ability

4:35

of for AI models to both in closed

4:38

source and open-source software find

4:41

vulnerabilities by literally just giving

4:43

it access to the code and saying, "Hey,

4:44

find me bugs in this code. Go." is

4:47

becoming better and better and better to

4:49

the point where like Mythos I'm very

4:51

close to some people that are like

4:52

actively using Mythos at work and it is

4:54

causing like like issues based on how

4:57

good that is, right? Yeah. So so

4:59

CyberGym basically is is a is a

5:01

collection of bugs that exist in

5:04

software, right? So like bugs and I

5:06

think FFmpeg is one, bugs and curl is

5:08

another. Um and so what CyberGym does is

5:10

it takes a model and with a set of

5:12

prompts says hey go and find bugs in

5:14

this stuff, right? And the the success

5:16

rate is how many of the bugs that are

5:18

known to exist get found by the model in

5:20

this. And you can see a pretty, you

5:22

know, not exponential, but straight line

5:23

curve going up to the Anthropic model

5:25

that recently got previewed by some

5:27

people that it's at an 83% success rate

5:30

of the bugs that are known to exist in

5:31

these code bases. It can find 83% of

5:33

them. Again, we we don't know the cost

5:35

um data in those. We don't know if like

5:37

the models are being like uh backfed the

5:40

information, so they're like training

5:41

themselves on previous CyberGym runs.

5:42

We don't know any of that. Um, but it it

5:45

there is this really weird issue

5:47

happening where like any Joe Schmo with

5:49

not a ton of security research work or

5:51

not a ton of security knowledge can with

5:54

a couple hundred bucks like worst case

5:56

find bugs in software. And I think that

5:59

is like an existential security threat

6:01

to software right now as we know it. So

6:03

I'm kind of curious on your guys' take

6:04

on that. What do you guys think about

6:05

the the Mythos situation? Because I know

6:07

I know how I feel. I'm not sure if I

6:08

actually asked Prime what he thinks

6:10

about that the Mythos thing.

6:12

Oh, I have ideas and I have thoughts

6:14

about it. Oh, yeah. Uh

6:17

oh. So, I guess the first thing is that

6:19

it there's two there's kind of like

6:21

three there's three problems here. First

6:23

problem is is Mythos really as good as

6:25

they say and obviously I have no

6:27

internal information. I've just seen

6:29

some graphs. Uh dirty data is like a

6:31

huge gigantic problem in all benchmarks.

6:33

All benchmarks are being fed back into

6:35

the models. It's really actually hard to

6:37

tell like what does a 20% improvement on

6:39

software engineering bench actually

6:40

mean? Especially when the fact that not

6:42

you could write zero lines of actual

6:44

solution code and get 100% on software

6:46

engineering bench. It turns out there's

6:48

other benches that are also horribly

6:49

inaccurate. There's a whole paper about

6:50

why all the major benches are just

6:52

completely fudgible and made up of bull.

6:54

So it's very hard for me to understand

6:55

from a bench perspective. Uh second,

6:58

>> I guess the middle ground would be like

6:59

so if if if Claude Mythos is as good as

7:01

it is, then yes, that is going to

7:03

inevitably cause problems because we're

7:05

going to go from not too capable to

7:07

hyper capable in a moment. Thus,

7:09

everybody can go through and hack

7:10

everything and thus Dario will be able

7:13

to get his ultimate goal, which is

7:14

regulations. And so, that kind of

7:16

worries me. Pull up the ladder really

7:17

quickly and make sure that humans can't

7:18

code because human coding, that's

7:20

dangerous right there. Uh, and so

7:22

that's, you know, so I think that that's

7:23

true. There's the second one which is

7:24

this is just another C compiler again

7:27

from uh Anthropic where they hype up

7:29

this gigantic thing like oh my gosh it's

7:31

written a C compiler and then you go

7:32

look at the details it's like well it

7:33

can't write a bootloader cuz we didn't

7:35

we could not seem to spend enough tokens

7:37

to convince it to write it within 32k it

7:39

could only write it within like 67k or

7:41

whatever it was to be able to actually

7:43

>> and also we tested it recursive or we we

7:44

iteratively tested it off of like the 30

7:46

years of tests that the GNU C compiler

7:49

already had.

7:49

>> We also gave it all the answers and then

7:51

it figured out all the questions. It was

7:53

crazy. It was like it played Jeopardy

7:54

and it was really good at it. And so

7:56

it's like there's this whole marketing

7:57

buzz which is it's really hard to kind

8:00

of cut through that. And then obviously

8:01

the last one which is they're just

8:02

downright lying. I somehow doubt that

8:04

they're they're downright lying. I think

8:05

they're just overstating it. If they're

8:07

downright lying then you know this is

8:08

just going to be business as usual.

8:09

It'll just be yet another disappointing

8:11

model release and that's that. And so

8:13

for me that's kind of how I I I'm on

8:15

middle ground which is I think it's more

8:17

hype than reality but of course I

8:18

haven't seen it because I just don't

8:20

know cuz they won't let me see it. I'm

8:21

too dangerous to have it.

8:23

>> I think there was a a similar model that

8:25

um chat or OpenAI just just released like

8:27

it's like ChatGPT 54c or something.

8:29

They keep their modeling name naming

8:31

convention.

8:32

>> They're starting to actually align

8:33

though. At least I know like the higher

8:34

the number we're good and good.

8:36

>> Yeah. Right. Right. And they don't add

8:37

like random O to it now. Um but I think

8:40

there is a comparable model that you can

8:42

get access to like just by uploading

8:43

your driver's license if you're into

8:45

that. Um you know proving that you're a

8:46

real person. So there's there's models

8:48

to test out, but yeah, I don't know.

8:49

It's just it is it is concerning because

8:52

we we have kind of two forks we can go

8:55

down. There's a one where everyone gets

8:57

access to it. Everyone can create zero

8:59

days and we kind of enter this like

9:00

really dangerous cyber no man's land. But

9:02

the other side is like Anthropic keeps

9:04

the access to themselves forever and now

9:07

like only this list of like 10 companies

9:09

can make zero like can find zero days in

9:11

the software.

9:11

>> Dude, you forgot the third.

9:12

>> What does that do? They move to the

9:14

Cayman Islands and then they just take

9:16

over every government by hacking all the

9:18

software and Dario finally realizes his

9:20

role as the bad guy. Like that would be

9:22

that I mean super villain is right there

9:24

if this is true.

9:26

>> That's true. Casey, what's your take?

9:27

You saw you were in a chat before.

9:29

>> Uh I'm sorry the chat. What was the chat

9:32

>> that you were going to say something

9:33

before? What's What's your take?

9:34

>> Was I really?

9:35

>> Mhm.

9:36

>> Well uh I definitely could say something

9:39

but I think the thing I would say is

9:40

probably not very interesting. Uh, and

9:42

that is that I think I probably agree

9:45

with both George and Ed at the same time

9:48

here, which should be impossible because

9:50

they're supposed to be disagreeing, but

9:51

I don't know. It kind of sounds similar

9:53

to me. And the reason I

9:54

>> secret third thing,

9:56

>> it's not really a secret third thing.

9:57

It's just like, let me let me offer a

9:59

different interpretation or slightly

10:00

different interpretation, which is to

10:02

say,

10:04

>> um, so I feel like machines are pretty

10:06

good at pattern matching actually. Um,

10:09

and so like I don't think It's like put

10:13

aside whether Claude Mythos is good or

10:15

not because I realize that's hard to

10:17

independently verify this time. But like

10:18

I think it's reasonable to expect that

10:21

at some point because we are spending at

10:23

this point like trillions of dollars

10:25

probably on doing computation for these

10:29

things. At some point they should be

10:31

able to pattern match bugs uh reasonably

10:34

well and at a very high rate. meaning as

10:37

long as you're willing to pay for the

10:38

compute time, we can scan lots of

10:40

software uh for a lot longer than we

10:43

were currently having humans do it,

10:45

right? I think that's a pretty

10:47

reasonable thing to expect. Whether

10:49

Claude Mythos has done it or not

10:51

shouldn't really be the question because

10:53

somebody can do this eventually if we

10:55

keep spending this much money. It should

10:56

get there. Uh among the things that AI

10:59

could eventually do, that one doesn't

11:00

sound that implausible to me. And so,

11:03

um, what I would say is I think it's

11:06

reasonable to expect that that either

11:08

has or will occur.

11:10

Two, I do think humans were doing this

11:13

very well before individual humans like

11:16

some of them, they were finding things

11:17

that probably Claude Mythos still could

11:19

never find. Like, I mean, like things

11:20

like Rowhammer attacks and things like

11:22

that, uh, that are just like way out in

11:24

kind of crazy land. Um, or attacks

11:28

through like old legacy stuff like the

11:30

APIC and things like that. Like so

11:31

humans were actually very good at this

11:33

task but there weren't very many of them

11:35

right and so what I would say is moving

11:38

to something like Claude Mythos or

11:39

whatever that thing happens to be that

11:41

can do this is kind of like what George

11:44

Hotz was saying it's kind of like saying

11:46

hey everybody from now on if you just

11:49

like hack people's bank accounts you get

11:50

the money all the great humans at this

11:53

in the world who are currently doing

11:55

something else would now be incentivized

11:57

to go do this thing and we would have

12:00

found way more zero days. I mean, there

12:03

are so many programmers who if they had

12:05

been raised in some kind of a way in a

12:08

society and a religion where stealing

12:10

people's money was considered virtuous,

12:12

we would have found so many more zero

12:15

days right now than we have. And so, I

12:17

think I'm kind of in a way I think I see

12:20

I think both people's points are

12:21

actually totally valid. Like like I

12:24

think like yeah, we could have found way

12:26

more zero days if we didn't heavily

12:28

disincentivize

12:29

people from like making hundreds or

12:32

billions of dollars off of hacking,

12:34

which is what they could have. And we

12:35

said, nah, you get 50k, 100k. Maybe if

12:38

it's something crazy like an RCE, you

12:40

can actually get a million. It's like,

12:42

come on, guys. That's not equivalent to

12:45

what they could already make working at

12:46

a startup or something like that if

12:47

they're that good, right? Or

12:48

>> Yeah. There's no guarantee that side

12:50

either. like they don't actually get the

12:51

gas like you work at a startup at least

12:53

you get some money

12:54

>> or even just not even a startup just go

12:56

to Google and you get that a stock or

12:58

whatever right or something like this uh

13:00

so anyway in general I would say um I

13:03

see I I can see both I can see both

13:05

points I don't think I I don't really

13:07

think they're in as much tension as it

13:09

would sound if that makes sense

13:12

>> I agree

13:12

>> yeah I thought Geohot was saying more like

13:15

he was making an econ argument about it

13:17

of like we're we put a lot of costs on

13:21

hacking already. So

13:24

that's what's stopping it from

13:26

happening. Like what you're saying,

13:27

Casey, right? In the sense that like

13:29

>> Yeah.

13:29

>> Okay. So now we're going to have another

13:31

way to do it. It also costs money, but

13:32

then we still have the other cost of

13:34

like you could go to jail for doing it.

13:38

Like that's the social cost we impose on

13:40

people doing it, right? I mean, I just I

13:42

just took him to be saying like, "It's

13:43

not that impressive that it found zero

13:44

days because if you gave me, you know,

13:46

if you gave me 50 great programmers who

13:48

are all doing other stuff, we could

13:50

crank out so many zero days, you

13:51

wouldn't even believe it." And I kind of

13:53

and I kind of believe him because, you

13:54

know, you look around the world and

13:56

there are, you know, some really good

13:57

security teams out there and they do

13:58

crank out zero days pretty effing fast

14:00

and they don't even tell us about all of

14:02

them, right?

14:03

>> Uh, North Korea keeps on making money

14:05

like obviously they're they're

14:07

successful.

14:09

>> Yeah. So anyway, I I I I'm not trying to

14:11

say that either person is is 100% right

14:13

and somehow you can marry the two

14:14

completely. I'm just saying there's I

14:16

think there's some merit to both things.

14:17

So I'm I'm actually I'm happy either

14:19

way. I'm happy with either take.

14:21

>> So your your point about um if you got a

14:25

room of 50 good programmers together and

14:27

they'd find zero days is actually kind

14:28

of the the argument that the article um

14:31

"Vulnerability Research Is Cooked" makes

14:33

on sockpuppet.org that I referenced in

14:35

a video and I think Theo did too. Um,

14:37

where one paragraph that he calls out

14:39

basically

14:40

>> the O, sorry, that the O referenced in a

14:42

video. Um,

14:43

>> okay. I don't know what that is.

14:46

>> Spell it. Casey, spell it out in your

14:47

head and it'll make sense.

14:48

>> The O Christ.

14:51

>> Um,

14:52

>> so software security a lot of the times

14:55

can be chalked up to the fact that a lot

14:57

of software just has not had elite

15:00

attention or what is it called? Um, like

15:03

advanced attention. I would say basic

15:05

attention is suffering from many

15:06

software projects

15:08

>> now with black for sure but more more

15:10

complex platforms right so his assertion

15:13

is that like software security has been

15:16

a talent problem for so long where it's

15:18

like it's not that there aren't people

15:19

that know how to find bugs AI isn't

15:21

solving a unique problem the AI is

15:24

solving the scalability problem where

15:26

it's like you can train the AI to do a

15:28

thing that Joe knows how to do and now

15:29

you have a hundred mediocre but 100

15:32

Joes, right? Um, and and that's that's an

15:35

issue for kind of the econ of of cyber

15:37

security for sure. And yeah, I want to

15:38

be very clear like I don't disagree with

15:41

George from the or Geohot from the

15:42

perspective of like more people equals

15:45

more bugs, right? But like obviously

15:48

like that that is the problem that we

15:50

just don't have more smart people. that

15:52

that has been the the entire industry's

15:54

plight for a long time is that like

15:55

there just aren't people who have not

15:58

only security knowledge but knowledge of

16:01

you know uh web server stacks and

16:04

hypervisors and drivers and OSes like you

16:07

get these very niche skill sets and when

16:09

you divide them up into those skill sets

16:11

over and over again you you you're left

16:12

with like 10 or 20 people on planet

16:14

Earth that know how to like attack a

16:16

certain technology so AI you know if you

16:18

know security now you can talk to the AI,

16:21

learn about hypervisors in a week and

16:23

then suddenly you can find bugs in ESXi,

16:25

you know, Hyper-V, etc. Um, so yeah, I

16:28

guess I agree. Like the the dumbest take

16:30

thing was more I was I was mad at Geohot's

16:32

ego because it basically came off

16:35

as like you. I'm so smart. I know

16:38

all the zero days. I could do this

16:39

myself in my sleep. And it's like, dude,

16:40

no you couldn't. Like you're telling me

16:42

you could drop a zero day every day in

16:44

macOS until someone paid you? Like no

16:46

you couldn't. Shut up. Um, but I I hear

16:47

what he's saying.

16:48

>> I really hope he takes this as a

16:49

challenge. I want Geohot a zero day.

16:54

>> Geohot in one week. I will eat a sock on

16:56

stream. Like straight up, I will do it.

16:57

I don't care.

16:58

>> You shouldn't say that. Geohot, you heard

17:01

it here. Ed will eat a sock on stream

17:04

>> if you do a week of zero days.

17:07

>> Okay. All right. A week is a week is

17:08

actually possible. I'm talking a month.

17:10

Uh

17:11

>> okay. A month.

17:11

>> One month. And so yeah, that's my

17:13

>> I would also add like just, you know,

17:15

because I'm I constantly harp on this

17:18

point, but I want to bring it up pretty

17:20

much every time is just that

17:21

>> this is also why AI company behavior

17:26

like is a problem because this is

17:29

generally a good thing. meaning like we

17:32

do actually want the ability for us to

17:36

get 100% coverage for security and we

17:39

know that we can't get enough people to

17:41

do it really right like not in a white

17:43

hat sense right

17:45

>> maybe maybe you could take uh George Hotz's

17:48

suggestion seriously and just go like

17:49

make hacking legal and then we just have

17:51

a crap ton more black hats and that

17:53

eventually sorts it out but I mean

17:55

wouldn't necessarily be

17:57

>> yeah that wouldn't be yeah that's

17:59

exactly they're white hats now

18:00

everyone's a white hat Now, um, so we do

18:03

I think in general this is solving a a

18:06

good, you know, this is this is a way AI

18:08

could solve a problem usefully. If it

18:10

actually can just spit out lists of

18:12

pretty well-curated potential bug places

18:14

that we can go look, that's very

18:16

helpful, right? And so the problem is

18:19

like the only reason they were able to

18:21

make that is lots and lots of extremely

18:24

talented security researchers who are

18:26

getting literally zero dollars from

18:28

Anthropic for this. And that is not

18:30

acceptable. It's just not like I'm

18:34

sorry, but like you know, Ed should be

18:35

getting a check for this or and everyone

18:37

like him. That's just kind of how it is

18:39

because it's like you used their it's

18:42

all of their expertise and all you're

18:44

really doing is very slowly and

18:46

cumbersomely and kind of clumsily

18:48

eventually building a machine that can

18:51

deploy the same analysis somewhat

18:53

reliably uh based on all of their work.

18:56

And like I just don't like it. I don't

18:59

like the fact that they're not getting a

19:00

check and I'm never going to like it.

19:01

You could you can talk to me all day

19:04

long about how someday we're going to

19:05

live in a post scarcity society and Ed

19:08

will be getting a UBI check or something

19:10

like this or whatever it is, right? And

19:13

hopefully I'll be getting one too,

19:14

although I didn't do any security

19:15

research so I don't know, maybe I won't

19:17

be getting that check. I don't know you.

19:19

I don't know how you the U in universal

19:21

basic income is. But like I don't like

19:23

this. they should be getting paid now

19:25

because Claude is, you know, getting

19:27

huge like everyone at everyone in

19:28

Anthropic is getting paid very well. Uh,

19:30

so it's not like there isn't money being

19:32

dispersed whether they're making or

19:34

losing money or anything else you want

19:36

to talk about. It's like money is being

19:37

dispersed to people. It's just not the

19:38

people who did most of the work.

19:39

>> Also, you got to throw

19:41

>> Casey would

19:41

>> you can go. Oh,

19:43

>> I was just going to ask Casey if he was

19:44

going to be happy about it though if

19:46

Anthropic spun out a consumer rack

19:49

business though. Yeah. Now we're talking

19:52

if if they were like AI racks like we

19:54

got racks we got racks for your AI

19:57

server.

19:57

>> Hot AI racks in your local area.

20:00

>> I liked it now.

20:01

>> Yeah, exactly. We will send send you

20:05

some hot racks. Uh, also by the way, not

20:08

only are they taking all, you know, your

20:10

whole argument with them taking and not

20:11

properly attributing or, you know, the

20:13

people who put all the work benefiting

20:14

from it, uh, they're also making it so

20:17

that I can't buy a GPU or RAM or CPUs

20:20

now or anything. I have that

20:22

>> you can't buy a GPU or RAM. And also, I

20:24

believe Ed literally just said he

20:26

doesn't have access to this freaking

20:28

model. So, like a bunch of security

20:30

researchers, I don't know exactly what

20:32

subset, but like a bunch of security

20:34

researchers, many of whom probably did

20:35

some pretty cool stuff, they don't even

20:37

get to use this thing. That's that's how

20:40

ridiculously backwards it is. Like WTF,

20:43

guys.

20:43

>> Yeah, I thought that was why they called

20:45

it Mythos, though.

20:46

>> And yeah, that's why it's called Mythos.

20:48

Um, Anthropic would argue that it is too

20:51

dangerous for little old me to have

20:54

access to it, right? Depending on, you

20:55

know, uh,

20:56

>> who knows what you'll do, man. Who

20:57

knows? I'll find that zero day and I'll

20:59

hack into Dario's phone. No, I don't

21:01

know, man. It's

21:02

>> I I understand where they're coming

21:03

from, but at the same time, I understand

21:05

why it looks like a huge marketing ploy

21:07

and I'm not sure which way to lean,

21:09

honestly.

21:09

>> Yeah. Okay. No, that's true.

21:11

>> I think it just

21:11

>> that's a whole other angle.

21:13

>> I would think that they'd have so much

21:14

more credibility if they just quit uh

21:16

effectively like giving us shaken baby

21:19

syndrome constantly with their

21:20

marketing. It's just like it's

21:22

constantly going back and forth. Every

21:24

single couple months you're getting hit

21:25

with the new, "Hey, we're all out of

21:26

jobs here shortly. Hey, this thing is

21:28

super dangerous." I mean, you got to

21:30

remember that Dario was at Chad GPT or

21:32

OpenAI. I like to call I like to call

21:34

the company Chad GPT. He was at Chad GPT

21:36

during the two days and the official

21:38

language around Chad GPT2 7 years ago

21:40

was Chad GPT2 is too dangerous to

21:43

release to the public.

21:44

>> So like this is not that's what the two

21:46

sto

21:48

>> that we've been on this like roller

21:49

coaster. I think that's one thing that's

21:51

just largely hurting the credibility is

21:52

you can only cry wolf so many times even

21:54

and then when a real wolf happens like

21:55

if this is a real wolf everyone's like

21:57

yeah okay okay C compiler boy tell me

22:00

all about it

22:01

>> but they don't care they don't care

22:03

right they don't care because they're

22:04

the the baby that they're shaking is

22:06

called an investor that's that's who

22:08

they have more money they have to shake

22:10

the money out of the pockets right they

22:11

don't they don't care what we think

22:13

right because we're not going to write

22:15

them the next hundred billion dollars

22:16

that they need to like keep going and

22:19

they're kind of locked in this, you

22:20

know, it's a bitter bitter winner take

22:23

all kind of war for this like core

22:25

technology part, right? And so they have

22:28

to be the last AI company standing

22:31

because whoever is that company takes

22:33

all the money and the other people kind

22:35

of go to zero, right? Like unless unless

22:37

there's some real differentiation soon

22:39

where it's like oh the AIs bifurcate and

22:41

like Claude is only for code and can't

22:44

do anything else anymore and like

22:46

ChatGPT is only for like you know uh the

22:50

humanities or something like good luck

22:52

good luck raising money for that

22:55

40.

22:56

>> Yeah.

22:57

>> Yeah. Uh so maybe that's not true but

22:59

you know what I mean. If there's some

23:00

kind of really severe bifurcation, then

23:03

maybe they could both survive. But you,

23:04

you know, they're in a winner take all

23:06

battle right now. And so they got to

23:08

keep saying this, every release has to

23:10

be the one that's this is the one that

23:12

it will take over the world. And if it

23:13

doesn't quite, well, you know, it'll be

23:15

next.

23:15

>> You know that uh Claude got sorry, just

23:18

one quick thing. Uh do you know that uh

23:19

Red Bull in 2007, was it 2011? No, 2013

23:23

maybe.

23:23

>> Oh, Red Bull was too dangerous to

23:25

release.

23:25

>> No, Red Bull claimed that it gave you

23:26

wings. remember the day that it gave you

23:28

wings? It was sold, it was sued

23:31

successfully, I believe, for $10 million

23:33

because it in fact did not give you

23:34

wings. It was not superior to coffee.

23:37

>> And so I'm pretty sure in college I got

23:39

a check for like $2.30 from that.

23:41

>> Yes. And so I I am curious.

23:43

>> Ed, you sued Red Bull and won, bro. You

23:45

should make a video about it.

23:46

>> Call me the lawyer. Low level. Okay,

23:48

listen.

23:49

>> Uh

23:51

lowle.

23:53

Legal. Let's go. Low legal. Uh, but I'm

23:55

actually curious if if they keep saying

23:58

that and then it doesn't happen, do they

24:00

open themselves up to a false

24:01

advertisement, class action law? Like,

24:03

can you keep saying this and then not

24:04

get like Red Bull made claims and then

24:07

they got sued? Why not why not other

24:09

people? Why can't other people get sued

24:10

for that?

24:11

>> I think the problem with like with Red

24:13

Bull is like the the case was so

24:15

obvious, right? Like Red Bull does not

24:17

give you wings. End of case. Like, okay,

24:19

fine. Like any judge over the age of

24:21

>> I would have liked to hear the defense

24:23

for that one. Yes, it does.

24:25

>> Your honor. Your honor.

24:26

>> The problem is

24:27

>> they they had like these wings like

24:28

strapped to their back and they go like,

24:30

"I drank your Red Bull this morning and

24:32

here are my wings.

24:33

>> We ship you wings." Yeah.

24:35

>> Um but the problem with anything

24:37

technological when it comes to the

24:38

government or legislation or or you know

24:41

judicial process is that like boomers

24:44

and higher run the world right now when

24:47

it comes to these levels of like jur of

24:49

uh of of making um like legal decisions

24:52

and you couldn't explain to anybody at

24:56

that age unfortunately like right now

24:58

just people that are like running these

24:59

processes what it even means to find a

25:02

bug and then and then show them mythos's

25:04

claims and like and make a sound legal

25:06

argument that would like go well in

25:07

court.

25:08

>> You're right. You're right because

25:09

Kamala Harris did actually think

25:10

computing was in the literal clouds and

25:12

so

25:14

>> it's my favorite clip of all time. Yeah,

25:15

there's a clip of her.

25:17

>> Josh,

25:17

>> put the clip in.

25:18

>> So, you're now no longer are you

25:20

necessarily keeping those private files

25:23

in some file cabinet that's locked in

25:25

the basement of the house. It's on your

25:29

laptop and it's then therefore up here

25:32

in this cloud that exists above us,

25:36

right?

25:37

>> She'll have the last laugh though when

25:39

like uh SpaceX is launching uh AI data

25:42

centers into space and come like that's

25:45

what I was talking about. That's what I

25:47

was talking about.

25:48

>> Yeah, it's cloud storage. So, you're

25:49

probably right.

25:50

>> A great clip where she's talking about

25:51

the cloud and she literally points above

25:53

and goes like the cloud it's like above

25:55

us and stuff or something like that.

25:57

It's so good.

25:58

>> She should have known that it wasn't

25:59

there because she would You don't see a

26:01

series of tubes.

26:02

>> There's a series of tubes necessary.

26:04

>> Series of tubes. I learned that

26:05

recently.

26:05

>> It's true.

26:06

>> Um, okay. I got a I got a question for

26:08

you, Ed, like in this in this vein about

26:11

your thoughts on it.

26:13

>> So, right now, I get that there's

26:15

there's basically like the argument

26:17

>> like, okay, I'm a company. I release my

26:19

thing. I run some models as like a

26:22

preventative thing to look for zero

26:23

days. the bad guys run models to try and

26:26

look for zero days. We kind of fight it

26:28

out and it's whatever, right? So, I

26:30

think like everyone's saying like if the

26:32

hackers can use it, I can use it. That's

26:33

fine. But the thing that makes me like a

26:36

little bit more like I don't really know

26:39

is like for the state of like a bunch of

26:41

open source stuff like and I'm an open

26:43

source maintainer and I already can't

26:46

convince a company to send me $100 a

26:49

month to maintain this thing for them.

26:52

There's no chance I'm getting them to

26:55

Well, I'm definitely not going to spend

26:56

20k of compute. Yeah. Every time I

26:59

release something and decide that now

27:02

it's safe, right?

27:04

>> But and like I can't get any companies

27:05

to pay for that and sponsor it. But like

27:07

if I'm a, you know, if I'm the one

27:11

little pin in the excuse XKCD comic

27:14

that's holding up from Nebraska, the the

27:16

bad guys only need to do mine once. So

27:19

I'm wondering like kind of how you see

27:20

that as like the landscape affecting

27:22

open source things like that cuz it

27:24

seems very asymmetric in that way.

27:26

>> I mean I think it's asymmetric for that

27:30

reason right like the reason why you can

27:32

make the argument that anthropic is

27:34

afraid is because you are the linchpin

27:37

on the infrastructure of the internet

27:39

and no one has funded you so far. You

27:41

have had zero security audits or zero

27:43

security work done on your stuff. And so

27:45

like if you give access to these models,

27:48

if you really are the linchpin in the

27:50

internet, you already aren't getting

27:52

money from Netflix, Google, whoever

27:54

that's using your software. And the

27:56

black hats know that you're the

27:58

linchpin keeping the internet up. They're

27:59

going like they're going to make use of

28:00

that model to to do the exploitation,

28:02

right? Um does that answer your

28:05

question? I mean like I think it's just

28:06

like the amount of power that it gives

28:09

to a single organization given the

28:12

current like

28:13

>> state of open source software in

28:14

particular um is very dangerous and to

28:16

be very clear

28:17

>> these models are also doing are also

28:18

very good at doing closed source software

28:20

right like my recommendation to anybody

28:22

interested in this by the way is like go

28:24

take a capture the flag problem from

28:26

like CTFtime or crackmes.one or whatever and

28:29

uh hook up Ghidra to Ghidra MCP and then

28:32

use Claude Code on Ghidra MCP. It will

28:35

reverse engineer and find a bug in that

28:37

in that problem in a matter of minutes.

28:39

Like it is it like like Opus 4.6 is a

28:42

better reverse engineer than I am and

28:43

I've been doing this for like coming on

28:45

14 years. Uh it's honestly terrifying to

28:48

watch it work. So if you're if you're

28:49

even remotely interested in this, go

28:50

give it a shot and you you'll kind of

28:51

see what I'm talking about. It's It's

28:52

scary how fast it moves.

28:55

>> Yeah,

28:55

>> because that so that part that's where

28:56

I'm like, you know, whether it's Mythos

28:58

or not, I feel like right now a bunch of

28:59

stuff you could just maybe it'll cost

29:01

more tokens or it'll take longer or

29:03

something, but like a lot of stuff you

29:04

>> you still could find.

29:06

>> Yeah. And the models also like any model

29:08

does this obviously, but like the the

29:10

current models are really bad about like

29:11

false positives. Like I've done security

29:13

research uh in my free time on like

29:15

Chrome, ESXi, and some other like

29:18

routers that I've like download

29:19

>> regular weekend activity,

29:20

>> classic weekend activity. um and the

29:22

amount of times I've gotten like

29:24

critical finding like buffer overflow in

29:26

like the the RPC handler for this thing

29:28

and it's like okay all right dude like

29:30

write me an ASAN harness that tests that

29:31

and you'll see very quickly oh sorry

29:33

just kidding it's not actually there um

29:35

and so the magic is like if mythos is

29:37

able to make less false positives you

29:39

reduce you increase the the signal to

29:41

noise ratio in this in this process

29:43

which is scary right because it just

29:44

means you need less people to triage the

29:47

uh the reports and ultimately find real

29:49

bugs faster. Uh, so I have another

29:52

question with this mythos thing and and

29:54

maybe I'm curious I'm curious about your

29:56

security expertise. Isn't this whole

29:57

withholding a model kind of like a

29:59

doomed uh proposition to begin with?

30:01

Meaning that if OpenAI has a similarly

30:04

powerful mythos model and they're

30:07

competing for the zero like for the a

30:10

zero-sum game kind of like outcome of who is

30:13

the best model. Doesn't it mean that

30:15

when OpenAI has it, they will just

30:18

release it? Like, and then aren't we

30:20

just forced to go out because whoever

30:22

kind of releases it gets the customers

30:24

and then that by having the customers,

30:25

you win. And so then you just get out

30:26

ahead. Like, doesn't this kind of cause

30:28

like a weird thing where Yeah. we're

30:30

like, "Oh, we can't do this." You know,

30:31

Dario's like saying we can't do it, but

30:33

won't we just kind of fall right into it

30:34

the moment there's two people that have

30:36

it?

30:36

>> Yeah. I mean, that's I'm not like

30:38

on capitalism. I'm just saying

30:40

that's more of like a capitalism problem

30:41

than it is like a security problem,

30:43

right? But yeah, your your point is

30:44

basically like if actor A says thing too

30:47

dangerous but could

30:48

>> model open source model shall we say

30:50

>> and actor B has same thing and wants to

30:53

make money with slightly less ethics

30:55

potentially. Yeah, actor B is going to

30:56

release it or Yeah, exactly. Chinese

30:57

model, Russian model, whatever.

30:59

>> Um

30:59

>> well I mean that's literally what I mean

31:01

Dario quit OpenAI cuz he's like bro

31:04

they keep they keep making models that

31:06

can kill humanity, right? Okay. So, I'm

31:08

starting a company where we make models

31:10

that could kill humanity,

31:12

>> but they're mine. Uh, also Chinese

31:14

models after OpenAI or Anthropic

31:18

releases one. So, I think that that

31:20

might be a little bit difficult. They

31:21

might be a little bit behind.

31:23

>> Has anyone seen

31:26

>> Riverside chat? But yeah, I mean, OpenAI

31:28

literally has a model that they claim

31:30

they haven't made any claims, I don't

31:31

think, about like mythos equivalents,

31:32

right? Um, but they're doing effectively

31:34

the same thing where it's a it's KYC

31:36

know your customer. So you have to like

31:37

upload your ID and like talk about what

31:39

work you do and you get access to GPT54

31:42

cyber which I'm assuming is just a model

31:44

that's trained better on bug patterns

31:47

right? Use-after-free, out-of-bounds reads,

31:49

etc. Um, now if it's actually better

31:51

than mythos who knows right but you know

31:53

it's I think we're all just trending

31:54

regardless of what anthropic wants to

31:55

do. I think we're trending towards every

31:58

person on planet Earth with a couple

31:59

bucks having access to models that are

32:01

very good at bug hunting. Uh, and the

32:03

question is, what does that mean for

32:04

software, right? Does software get more

32:05

secure? Does the world just get more

32:06

scary for a long time and it never

32:08

really like resolves itself? Like, what

32:09

do we do with that information? And

32:10

that's a tough question to answer.

32:12

>> I'm interested to know how expensive

32:13

it's going to be. That's the other

32:14

question.

32:15

>> I mean, this is obviously the question

32:17

kind of that we've been talking about

32:18

for a while on the pod and in life in

32:21

general is what are what are token costs

32:23

going to look like if uh OpenAI and

32:26

Anthropic both get all of the customers

32:28

that they would like to have? Uh,

32:30

because the cost won't be the same. If

32:32

demand 10 or 100 or a thousand X's it

32:35

won't be

32:36

>> so I'm not

32:37

>> the price will not be

32:38

>> I'm not super well read on this. Is it

32:39

true that an inference currently is at a

32:41

loss

32:43

>> like

32:44

>> I've heard I've heard both

32:46

>> both. Okay.

32:46

>> Some people are so confident I I have

32:49

been looking to try and find a

32:50

definitive answer.

32:51

>> I'm the confident one by the way he's

32:53

referencing.

32:54

>> Okay. Oh no no no. I mean well I'm not

32:57

going to reveal my sources.

32:59

I asked ChatGPT and I asked Claude.

33:03

>> They both said, "Of course not."

33:05

>> Yeah. Yeah. Right. I've heard I've heard

33:07

though that some some people are saying

33:08

they are running it at a loss or it's a

33:10

bit complicated because like pretty sure

33:12

Anthropic's probably running some

33:14

percentage of accounts on the $200 plan

33:16

at a loss,

33:17

>> right? Um but like is is API pricing at

33:21

cost or below? And then how do you

33:23

factor in like training and stuff?

33:25

I my my personal take is that

33:27

>> inference itself just looked at in the

33:29

myopic view of just inference it makes a

33:31

lot of money but you also then once you

33:34

zoom out now you start saying hardware

33:35

and all the incidental stuff around it

33:37

probably still makes money but then when

33:39

you zoom out to say like every time you

33:40

release a model you defunct your

33:42

previous model that is going to have

33:44

that has a very large burden and they

33:46

keep on not making money and needing to

33:48

raise more money so I have a sneaking

33:49

suspicion that part of it is very hard

33:51

to make money in the current state uh

33:54

all All right. Well, OpenAI is like

33:56

publicly like losing money, right? But

33:58

is Anthropic also negative or

34:00

>> they just had another big raise as well,

34:02

so I'm assuming I thought they just

34:04

raised like $6 billion or something.

34:06

Could be wrong

34:06

>> about that chat. Fact check me. I know

34:08

OpenAI did 120 billion

34:11

>> uh raise.

34:11

>> So much money.

34:12

>> This is the

34:13

>> Yeah, cash.

34:13

>> This is the one that I actually was

34:15

really curious to see. This is the only

34:16

benchmark that I was super curious to

34:18

see if they're going to uh do well.

34:20

Anthropic Opus 4.6 Max cost approximately

34:23

$9,000 and got 0.5% score on ARC AGI. So

34:28

this is like the the the super test and

34:30

humans get into the high 90s. Uh AIs get

34:34

like uh GPT 4 high cost $5,000 and

34:36

got 2%. Gemini 31 did 4% for $2,200.

34:42

>> And so it's like this really difficult

34:45

uh it's a really difficult test for AIs

34:48

to pass. And so mythos did not add

34:50

itself to this one. So this is the

34:52

reason why I largely think it's more

34:54

like hype marketing than it is anything

34:56

because to me this is like a really

34:58

great indicator at least into some sort

35:00

of better model improvement. And so I

35:03

didn't see it.

35:03

>> Sure.

35:04

>> Uh let me can I can I just give a

35:06

counter point to that though?

35:07

>> Sure. Yeah. Yeah. Yeah. Yeah. Yeah.

35:08

Yeah.

35:10

>> Once again with the huge disclaimer that

35:11

I don't do any AI stuff. So this is just

35:14

off the cuff. But ARC AGI, if I'm not

35:18

mistaken, is a benchmark specifically to

35:22

test how well AIs perform uh on learning

35:26

completely arbitrary new things that

35:29

don't exist anywhere in their training

35:30

data. That's the only thing that it's

35:33

intelligence of this all.

35:35

>> Exactly. And so the only reason I would

35:37

want to point out that I don't think

35:38

that test says very much about this

35:40

particular security thing is security is

35:42

not that true. Like nobody nobody is

35:45

claiming that Claude Mythos came out and

35:47

discovered a whole new set of classes of

35:50

security exploits that no one had ever

35:52

come up with before. What it's saying is

35:54

that it went and found a bunch of the

35:56

exact same kinds of zero days

35:59

>> that someone like Ed would find if they

36:01

went and spent a week on that piece of

36:04

software, right? Like so they're not

36:06

claiming that this thing is somehow more

36:09

intelligent than the predecessor in that

36:11

way. It's claiming that it's got better

36:13

pattern matching

36:14

>> and like stringing things together to

36:17

create exploits, right? That process

36:19

which is well known. And so, so I don't

36:21

think ARC AGI necessarily tells us very

36:23

much about whether it can do those

36:24

things because those things are very

36:27

well-known tasks that security

36:28

researchers know how to do and we kind

36:31

of know the process that you do to do

36:32

them, right? So,

36:34

>> yes, that's okay. I will I will I will

36:36

concede that point most certainly that

36:37

the security at least known and obvious

36:39

security vulnerabilities such as use

36:41

after frees and and all the fun stuff

36:42

like the stuff that happened in ffmpeg

36:44

with jumping ahead somewhere in a buffer

36:46

based on

36:47

>> yeah the these things are very common

36:49

kinds of bugs they're not like unusual

36:52

the things that they've talked about are

36:53

like very very standard and so that

36:56

seems like a more plausible claim like

36:58

hey we just were able to scale up the

37:01

sort of security checking that a

37:02

security researcher would do it can do

37:04

that thing and and find you know

37:07

potential places for that

37:08

>> a lot more plausible than AGI.

37:10

>> Yeah. The thing too for I feel like for

37:13

the security side of it as oppo like as

37:16

opposed to constructing a product or a

37:18

new product or like building a feature

37:20

where you have to get like in some ways

37:23

all the things right for a security

37:25

thing I only need to find one of the

37:27

things that are wrong.

37:28

>> Yeah. which is like a

37:31

like you can test a bunch of the

37:32

scenarios like you're saying that

37:34

already exist and I only need one thing

37:37

to be wrong in the program for then me

37:39

to be able to take control of it.

37:41

>> Well, and it's combinatorial, right?

37:43

Like a lot of what security research is

37:45

doing is like a it's pattern matching

37:47

for these kinds of bugs and then b going

37:50

like okay if I did this one followed by

37:52

this one would that produce an exploit?

37:54

What if I did in the opposite order?

37:56

What if I did this one and then this one

37:57

and then that one? Okay, what if I did

37:59

this one? Right? And again, these are

38:00

things computers are good at like that.

38:02

It's not you don't have to believe in

38:04

some kind of a weird like supernatural

38:06

like AGI achieved internally Sam Altman

38:08

nonsense to believe that this is

38:10

something a computer could do. It's it's

38:12

much more plausible if anything than

38:14

some of the other claims. So that's why

38:15

I I would like say I'm I'm not that like

38:18

when I saw this I wasn't like that's got

38:20

to be false. I was like okay yeah I

38:22

could believe that. Yeah.
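
[Editor's sketch] The "what if I did this one followed by this one, what if I did it in the opposite order" search described above is plain enumeration, which is why it suits a machine. Below is a minimal sketch, assuming a made-up one-slot allocator model. The operation names, the `run` and `search` helpers, and the violation labels are all hypothetical and only illustrate the idea. This is not how any real tool or model does it.

```python
from itertools import permutations


def run(sequence):
    """Replay operations against a toy one-slot allocator.

    Returns the name of the first violation found, or None if the
    sequence is clean. The model is hypothetical and tiny on purpose.
    """
    state = "unallocated"
    for op in sequence:
        if op == "alloc":
            state = "live"
        elif op == "free":
            if state == "freed":
                return "double-free"
            if state == "unallocated":
                return "invalid-free"
            state = "freed"
        elif op == "use":
            if state == "freed":
                return "use-after-free"
            if state == "unallocated":
                return "uninitialized-use"
    return None


def search(ops):
    """Try every ordering of ops and record the ones that violate the model."""
    findings = {}
    for order in permutations(ops):
        violation = run(order)
        if violation is not None:
            findings[order] = violation
    return findings


OPS = ("alloc", "use", "free")
FINDINGS = search(OPS)

for order, violation in sorted(FINDINGS.items()):
    print(" -> ".join(order), ":", violation)
```

Three operations give only six orderings, but the count grows factorially with the number of operations, so a real search needs something smarter than brute force to decide which orderings are worth trying.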

38:22

>> I don't know. Mo most of vuln research is

38:25

like you know take a function that gives

38:27

user input like define your threat model

38:30

and then do source-to-sink analysis on

38:32

some vulnerable function or failure to

38:34

gate a function on like a length check

38:36

and like does user data get there bug

38:39

confirmed and like yeah that's literally

38:41

just pattern matching that we've solved

38:43

a lot of the times previously with like

38:44

satisfiability solvers right like angr

38:46

and like Z3 like take the graph of a

38:49

function turn it into a math problem can

38:50

you solve the math problem cool bug

38:52

confirmed Well, now with AI, it's just

38:54

like that process of doing source to

38:56

sink on like text, it can do incredibly

38:58

fast, right? It's very good. Now,

38:59

obviously, because it's stochastic, it

39:00

creates a lot of false positives, but if

39:02

we can figure out a way to reduce the

39:03

false positives or uh automate the the

39:07

validation of of those false positives,

39:09

then yeah, it's it's crazy. And I think

39:11

what they thought about what's that

39:14

>> have they thought about asking mythos?

39:16

>> I know. Come on. Can you just No

39:18

mistakes, please. Um the thing that

39:20

mythos is set apart differently

39:21

according to the anthropic report is its

39:23

ability to chain together primitives.

39:24

Right? So the scary part from like a

39:27

cyber crime perspective is like you have

39:30

uh gadget A that gives you an arbitrary

39:32

read and gadget B that gives you an

39:34

arbitrary write. Okay. Like those two

39:35

separate things are like not super

39:37

important if they're not used together.

39:38

Well, what Mythos is able to do is out

39:41

of a 100 tests, I think it's like 83% of

39:43

the time, find exploit primitives in a

39:47

vulnerable codebase and chain them

39:48

together to get RCE, right? That's the

39:50

scary part because then that's true like

39:53

end to end exploit creation for a bad

39:55

actor. And that's I think what scares

39:57

anthropic the most.

39:58

>> Um, now I know there's argument where

40:00

like Firefox wasn't in the sandbox for

40:03

that experiment. So like it doesn't

40:04

actually matter. But I mean just apply

40:06

that process to the sandbox and the same

40:08

thing applies. You know, it's just I

40:09

think they wanted to prove a point that

40:10

it could do that.

40:12

>> Well, and also I mean again like as I've

40:15

said many times, I can't stand AI

40:16

companies, so I'm not trying to defend

40:18

them or anything, but I'm just trying to

40:19

point I'm just trying to point out how

40:20

plausible this stuff is to me from a

40:22

neutral observer standpoint.

40:24

>> Classic case defending AI companies.

40:27

>> Yeah, I know, right?

40:28

>> Um

40:29

>> I know. Uh if you think about it, it's

40:31

like look, security researchers who do

40:34

not number that many were already

40:37

cranking out zero days at a much too

40:39

alarming rate for me, right? Like like

40:42

you know there's a hack every other day,

40:45

right? It's not like CVEs are piling up

40:47

like there's no tomorrow and yeah, not all

40:49

of them are actually all that bad or

40:50

whatever, but like it's not like

40:52

security researchers were having trouble

40:54

producing a fair number of of critical

40:56

vulnerabilities even with the limited

40:58

resources that they had. So, it's also

41:01

not weird to think that like if you had

41:05

more automation, you would find a lot

41:07

more of them. It doesn't like there's

41:10

clearly just a lot of bugs, guys. Like

41:12

there's a lot of freaking bugs and it's

41:14

just doesn't seem that unusual that if

41:17

you have more sophisticated pattern

41:19

matching, more sophisticated

41:20

combinatorial checking where the security

41:22

researcher doesn't have to spend a lot of

41:24

time setting up the tool because it can

41:25

just kind of ingest the code and it

41:27

knows what roughly what it means.

41:29

>> Yeah, I mean they're their rates going

41:30

to increase if nothing else existing

41:32

security teams rates of finding

41:34

exploits. It has to. I mean it just has

41:36

to. Unless this thing is just a complete

41:38

pile of crap,

41:39

>> it's got to. The other thing too we've

41:41

been seeing from each like new

41:43

generation of model is that they're

41:45

getting at least from my experience and

41:47

what I'm what I'm reading from people

41:48

and everything they're getting better at

41:50

calling other tools.

41:51

>> So like they call out to stuff more

41:53

regularly

41:54

>> and they can pay attention for longer.

41:57

recompile this code and see you know

41:59

make this exploit and run it against

42:01

this thing or whatever right like those

42:02

are all things that if you automate them

42:04

a security researcher gets much faster

42:06

at finding because they're not having to

42:08

set up the tooling themselves to like go

42:10

work on this exploit like whatever

42:12

whatever those steps were they don't

42:13

have to do them anymore right

42:15

>> right so then if you're like oh now it

42:17

can run instead of like I have to prompt

42:19

it at every stage for the next thing to

42:21

do is I can give it

42:23

>> 10 rough things say try a bunch of

42:25

combinations of these and and it runs

42:26

for 24 hours.

42:28

>> Yeah.

42:28

>> You're just like a lot. It's literally

42:30

like in in my mind some of it is like

42:32

Yeah. Well, we already know fuzzers

42:33

exist. Like we use them all the time and

42:36

they're good,

42:36

>> right? It's like in some ways almost

42:38

like

42:38

>> Yeah. It's like fuzzer. It's like fuzzer

42:40

squared, right? It's like a thing now

42:41

that can like target the fuzzing at

42:43

things specifically. So that things that

42:46

would be very hard for stochastic testing

42:48

to catch

42:49

>> because when you have stochastic testing

42:50

and you have to chain two things

42:52

together, you're never going to randomly

42:53

pick the two things that would have to

42:55

happen for them to work. Here's a thing

42:56

that can like target that specifically

42:58

and go like, "Oh, I think combine these

43:00

two things. Probably let me fuzz that

43:01

specific path." Oh, yep. I got it right.
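
[Editor's sketch] The "fuzzer squared" idea can be made concrete with a toy target whose bug sits behind two chained checks. A blind random fuzzer has to guess both bytes at once, a 1 in 65,536 chance per attempt. A targeted fuzzer that has been told the first condition only has 256 candidates left. Everything below is hypothetical: the `MAGIC` header, the `target` parser, and the `first_byte` hint, which stands in for whatever a model might infer from reading the code.

```python
import random

MAGIC = (0x7F, 0x45)  # hypothetical two-byte header checked by the toy target


def target(data: bytes) -> None:
    """Toy parser: the simulated bug is reachable only if both bytes match."""
    if len(data) >= 2 and data[0] == MAGIC[0]:
        if data[1] == MAGIC[1]:
            raise RuntimeError("simulated crash")


def crashes(data: bytes) -> bool:
    """Return True if the toy target raises on this input."""
    try:
        target(data)
    except RuntimeError:
        return True
    return False


def blind_fuzz(rng, budget):
    """Pick both bytes at random; each attempt hits with probability 1/65536."""
    for attempt in range(1, budget + 1):
        data = bytes([rng.randrange(256), rng.randrange(256)])
        if crashes(data):
            return data, attempt
    return None, budget


def targeted_fuzz(rng, first_byte):
    """Pin the first byte from a hint and try every second byte once,
    in shuffled order, so a hit is guaranteed within 256 attempts."""
    candidates = list(range(256))
    rng.shuffle(candidates)
    for attempt, second in enumerate(candidates, start=1):
        data = bytes([first_byte, second])
        if crashes(data):
            return data, attempt
    return None, len(candidates)


rng = random.Random(0)
BLIND = blind_fuzz(rng, budget=256)
TARGETED = targeted_fuzz(rng, first_byte=MAGIC[0])

print("blind:", BLIND)
print("targeted:", TARGETED)
```

With the same 256-attempt budget, the targeted run always finds the crashing input, while the blind run almost never does. Real coverage-guided fuzzers narrow this gap on their own, so the sketch overstates the contrast for simple checks like this one.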

43:04

>> That's where it gets crazy is like you

43:05

just have the AI write the fuzzer and

43:07

then like if you can automate that

43:08

process, you win a lot of the time. It's

43:10

it's pretty pretty amazing. Um, I do

43:12

have to go though. I have a meeting in 3

43:13

minutes, so I got to I got to rip. Um

43:15

>> Oh, hopefully you get Mythos access.

43:16

Congrats.

43:17

>> That'd be neat. No, it's not going to

43:18

happen.

43:19

>> Come on, guys. Give him

43:23

>> Have a good one, man.

43:24

>> I like you guys, but it looks like it's

43:26

the end of our show, unfortunately.

43:28

>> Yeah.

43:30

>> True.

43:30

>> All right.

43:30

>> Thank you everybody. I would just like

43:32

to say that uh I would I would just like

43:34

to say that Casey and TJ and obviously

43:37

Temu Casey that just left commonly

43:39

known as Low Level Learning uh you guys you

43:41

know you make the show magic and

43:45

and now I'm just going to go about being

43:47

lonely again. Kind of sad.

43:49

>> Oh, Prime.

43:52

I knew that was coming. I thought I I

43:54

thought I was going to get booed, but I

43:56

I just assumed something was going to

43:58

happen. All right. Um, the real the the

44:00

good news is is that you can enjoy full

44:02

episodes of the standup now on YouTube.

44:05

If you go to the standup pod full, which

44:08

I'm going to try to rename hopefully at

44:10

some point, we're trying to work some

44:11

things out to get it a better name. But

44:12

right now, YouTube, am I right? Um,

44:16

>> if you go to the website, if you go to

44:18

our website, will it have links to

44:20

these?

44:21

>> Yeah, it will. It will. And it'll have

44:23

it spelled out. Uh we'll we'll make it

44:25

more clear once we figure everything out

44:27

over the next week. Maybe by the time

44:29

you're listening to this on YouTube, by

44:32

the time you're listening to this on

44:32

YouTube, uh we're going to upload all of

44:34

the backlog to that channel as well. So

44:37

we should have every episode on YouTube

44:38

in one spot, very easy to see, etc.

44:42

Obviously, you always can, you know, RSS

44:44

download the audio directly. Don't press

44:47

the red button on that site. Of course,

44:49

>> teach, what is that web address that

44:50

people should go to for

44:51

>> the standup pod? Hey,

44:53

>> the stand. Go to the standup pod.com.

44:54

All the links will be there. All the

44:56

episodes will be there. You want

44:57

YouTube, you want Spotify, you want

44:59

downloads, you want RSS, you got it.

45:03

>> The standup pod.com.

45:04

>> Yeah. Yeah. Check this out. I'm just

45:06

going to do something for the audience.

45:07

Look at this. If you go here, you click

45:09

Trash made a black mirror app, you can

45:11

go and you can listen to it right on the

45:13

website. You can have all the

45:16

information right here. You can trashes

45:17

app right there. You can go in here.

45:19

>> We don't even look at this. We don't

45:21

even charge you.

45:22

>> You can play on Spotify. You can

45:24

download and just have personally for

45:26

you to do whatever you do.

45:30

>> That's for you.

45:31

>> Now that we're And then I'll make it

45:32

I'll make it so it links to the YouTube

45:34

there later as well now that we're going

45:36

to have a dedicated YouTube channel for

45:38

that too. So for all of you out there,

45:39

you know,

45:40

>> the AI companies claim that you're going

45:42

to get UBI, but we're actually giving

45:44

you universal basic podcast.

45:48

>> You just get it for free. UBP.

45:50

>> UBP. You know me.

45:51

>> UBP. Yeah,

45:52

>> UBP.

45:53

>> Yeah,

45:54

>> I was going to say, well, I don't know

45:55

what I was going to say. That's fine. We

45:57

should just

45:58

>> We should really just end this episode.

45:59

>> Stick a fork in it, guys. It's done.

46:01

>> All right. Thanks.

46:02

>> Good seeing everybody.

46:03

>> Thanks, YouTube. Thanks again, uh,

46:06

whatever your name is. Tee, you're

46:08

pretty neat. Boot up the day.

46:12

V coating errors on my screen.

46:16

Terminal coffee

46:19

and

46:21

living the dream.

Interactive Summary

The video features a panel discussion centered on George Hotz's provocative take regarding cyber security and the release of Anthropic's 'Mythos' model. The participants explore whether the limited discovery of zero-day vulnerabilities is truly a lack of skill among hackers or, as Hotz suggests, a lack of financial incentive. They analyze the potential impact of AI tools in vulnerability research, the hype versus reality of AI model benchmarks, and the broader existential security implications of AI models being capable of identifying and chaining exploits. The conversation also touches on the ethical concerns surrounding AI companies withholding access to powerful tools and the lack of fair compensation for the security researchers whose expertise informs these AI advancements.
