we're so back

Transcript

0:00

software engineering and development for

0:02

the last few years has just been on this

0:04

we're so back and oh my gosh it's so

0:06

over just kind of hype cycle every

0:09

single model that's come out the

0:11

headlines do not stop it's over software

0:14

engineering is so over people who've

0:16

never written a line of software trying

0:17

to flex on you about their to-do

0:19

application telling you that your job is

0:21

over and you're just a whiny little baby

0:23

and then inevitably the highs come back

0:25

right we start seeing posts like this

0:27

such as AI actually reporting fake

0:30

figures for multiple months and just

0:32

completely bamboozling a company. And

0:34

us, we all laugh by ourselves.

0:38

Those peasants, they don't even know the

0:40

power that I have, let alone the power

0:42

that they think they can wield. But

0:44

unfortunately, today, this story that

0:45

I'm going to be making on this old

0:47

YouTube video is an "it's over"

0:49

kind of moment, okay? And it comes

0:50

courtesy of Mitchell Hashimoto. If you

0:52

do not know Mitchell Hashimoto, his most

0:54

recent and popular work, of course, is

0:57

Ghostty. I'm using Ghostty right here.

0:58

This is actually how I edit the old Vim

1:01

right here is inside of a Ghostty

1:02

terminal. I find the uh terminal to be

1:05

quite nice. And if you do not know

1:06

anything about Mitchell, he tends to

1:08

tweet things that are very insightful

1:10

and he takes his time, it appears at

1:11

least to kind of communicate through his

1:13

thoughts. And every single time I see a

1:15

longer post by him, I always go, "Oh,

1:17

I'm excited. This is something I just

1:19

want to go and watch and read through

1:21

because whatever he's talking about

1:24

likely, I'm going to find to be very,

1:25

very interesting." And in this case,

1:28

this was very, very interesting. It was

1:30

about a very difficult bug that he and

1:33

his team had quite some time to try to

1:35

figure out and could not figure it out.

1:36

And then in a very short period of time,

1:39

the AI, it figured it out. Ah, Codex 5.3

1:42

extra high with a vague prompt just

1:44

solved a bug that I and others have been

1:46

struggling to fix for over 6 months. So

1:48

obviously when he says 6 months, he does

1:50

not mean 6 months full-time heads down

1:52

and couldn't do it. Other full reasoning

1:54

levels with Codex failed. Opus 4.6

1:56

failed. Cost $4.14

1:59

minutes full trace plus includes

2:01

original issue right here. So here we

2:03

go. So we're going to walk through a bit

2:05

of what he did and then we're going to

2:06

kind of come to some conclusions because

2:08

I guess I drew slightly different

2:10

conclusions than he did, but I do think

2:12

that this is a very interesting thing to

2:14

kind of watch through. Use GitHub to

2:17

look up this specific issue and

2:18

determine a fix. And so the AI of course

2:21

went through and used GitHub and kind of

2:23

walked through all the different

2:24

talking. The issue of course was this

2:25

right here. Every single time they made

2:27

a split, the screen would flash a little

2:29

bit. Flash. You see that how there's

2:30

something behind it and then it goes and

2:33

it flashed for a quick second. One more

2:34

time for the you know, you can see that.

2:37

Okay. So that's bad. That's not good. We

2:39

don't want that. So right away the AI

2:40

obviously went, "Okay, this is a GTK

2:42

split update. I'm going to go take a

2:44

look at this." All right. Right. It

2:45

found kind of the exact commit that was

2:47

touching uh this exact kind of area in

2:49

which it's looking into. It then starts

2:51

going through all the patches just going

2:53

all the way through every last piece of

2:55

code and how it's been changed. Then it

2:57

starts looking at each one of the

2:58

commits, you know, according to this one

3:00

specific issue that's been associated

3:02

with the issue. It starts narrowing down

3:04

on a specific operation, this split tree

3:07

split. But then it takes an odd turn

3:09

about partway through it just goes off

3:11

and starts looking at some GNOME GTK C.

3:15

Now, if you didn't catch any of this,

3:16

this right here is Zig. This is a

3:18

fantastic language. If you've never used

3:20

Zig, Zig has to be probably one of my

3:22

personal favorite languages to tool

3:23

around in. Though I haven't written it

3:25

in quite some time, it still is

3:27

something that I just I draw a lot of

3:28

just I have a lot of warm feelings

3:30

about. Now, if you go back to the tweet,

3:32

you'll see that Mitchell says, "The best

3:33

thing that Codex did was eventually

3:35

start reading the GTK4 source code."

3:38

That's where I ended up seeing my GitHub

3:39

issue, and I knew the answer was

3:41

somewhere in there, but I didn't have

3:42

the time or the motivation to do it

3:44

myself. The other models never went

3:46

there. The lower reasoning efforts with

3:48

5.3 didn't go there either. Only extra

3:50

high went there. I think it was the

3:52

critical difference. And yes, it

3:53

actually did make the difference. It

3:55

started making these kind of code

3:56

changes saying, "Hey, this is actually

3:58

what's happening. I'm going to delete a

3:59

lot of the comments, delete a lot of the

4:00

stuff happening. This is actually what

4:02

it should look like." And then Mitchell

4:04

went, "Okay, hey, don't run the full

4:05

test, by the way." Super Chad move.

4:08

Absolutely what I wanted to see. And

4:10

then just says, "Hey yo, just explain to

4:12

me what you fixed in a lot of detail."

4:14

And it goes through and it actually

4:15

talks exactly about what exactly happened.

4:18

And the general thing was that the

4:20

little leaf nodes inside these uh kind

4:22

of surfaces that are created in GTK were

4:25

being completely destroyed and new ones

4:26

were being created. Hence like the blah.
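
The difference between destroying every leaf on each split and keeping the leaves that did not change can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the actual Zig/GTK code from the fix: `Widget`, `rebuild_all`, and `reconcile` are hypothetical names, and a dict keyed by leaf name stands in for the real split tree.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical id source so we can tell "same widget" from "new widget".
_ids = count(1)


@dataclass
class Widget:
    """Hypothetical stand-in for the toolkit widget behind one split leaf."""
    leaf: str
    widget_id: int = field(default_factory=lambda: next(_ids))


def rebuild_all(old: dict[str, Widget], new_leaves: list[str]) -> dict[str, Widget]:
    """Naive update: discard every existing widget and create fresh ones."""
    return {leaf: Widget(leaf) for leaf in new_leaves}


def reconcile(old: dict[str, Widget], new_leaves: list[str]) -> dict[str, Widget]:
    """Keep the widget for any leaf that survives; create only the new ones."""
    return {
        leaf: old[leaf] if leaf in old else Widget(leaf)
        for leaf in new_leaves
    }


before = {"left": Widget("left")}
after = reconcile(before, ["left", "right"])
print(after["left"] is before["left"])  # True: the surviving leaf is reused
```

With `rebuild_all`, the "left" widget would be a brand-new object after the split, which is the kind of teardown and recreation the explanation blames for the flash.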

4:28

Instead, it's going to try to reuse the

4:30

ones that aren't changing. Gives a nice

4:32

little ex, you know, answer about why

4:34

this reduces flicker. But here's the

4:36

thing that kind of makes a big

4:37

difference. Now, the average vibe coder,

4:40

shall we call him, the average AI Andy

4:42

from San Francisco, typically at this

4:43

point, they put the code in the bag.

4:45

Okay, just put the code in the bag. And

4:48

you see this a lot. You see a lot of

4:49

people that when something starts

4:51

working, they just kind of move on. They

4:53

kind of call it a day and they go,

4:54

"Okay, there we go." And I especially

4:56

have seen this a lot with the younger

4:58

the younger folks. There just seems to

4:59

be a lot less rigor because there's a

5:01

lot more idea that hey, the AI is super

5:03

correct. It's super fast. It can code

5:05

much faster than me. And I have a whole

5:07

bunch of tests, so it's probably just

5:08

okay. It just feels so much different

5:10

than how I would operate. And this has

5:12

honestly been one of my big sticking

5:14

points about diving headfirst into this

5:16

Dario take the wheel style of

5:18

programming where you just let all the

5:20

code be written by AI is that I don't

5:22

really like how it does things. I have

5:24

to like constantly correct and say,

5:26

"Hey, I just don't think these things

5:27

are right." And of course, you see the

5:28

same thing with Mitchell right here as

5:29

he goes in, he does a thorough code

5:31

review and asks why this some specific

5:33

thing was created and then asks even

5:35

more questions about, "Hey, we need to

5:36

change how we're doing these failure

5:38

modes." This led to even more changes

5:40

going on that this almost represents

5:42

what happens in the PR. Then he ends up

5:44

doing some manual changes, some manual

5:45

cleanups, and boom, it's over. an issue

5:48

that would have taken so much time for

5:50

an individual to kind of comb through

5:52

all the GTK source code and all the

5:54

source code inside the Zig and try to

5:56

marry what is exactly happening was able

5:58

to just be quickly identified by the AI

6:01

and I think this is where a lot of

6:02

people just see that it's so over moment

6:04

but I kind of I guess I I I don't know I

6:06

take a lot of encouragement from it from

6:08

for multiple reasons not and it's not

6:10

just the fact that he did some manual

6:11

changes which if you have an opinion

6:13

about software it's really hard to make

6:15

an AI code exactly the way you would

6:18

code it. I have been trying a lot and

6:20

I've been really trying to get it to

6:22

generate code mostly the way I do it. It

6:24

hasn't quite got there. But more so,

6:26

you'll notice just this like really

6:27

intense reviewing process. Obviously,

6:29

that took some time. You couldn't just

6:31

flick through the file and go, "Okay, I

6:33

get what's happening. Whatever. That

6:34

mostly looks okay." No, there had to be a

6:36

lot of thought and a lot of time that

6:38

went into it. And so the reviewing and

6:40

manual cleanup phase took a decent

6:42

amount of time, but it took so little

6:44

time in comparison to figuring out what

6:46

the actual root bug was. And once that

6:49

was figured out, the next step was well,

6:51

it was relatively easy. And so on one

6:53

hand, I can totally see why Mitchell

6:55

says it's so over moment, right? It's

6:57

because it synthesized so much data and

7:01

was able to kind of just produce a, you

7:03

know, 50 60 lines and you went, "Oh,

7:05

that's the reason why." And at the exact

7:07

same time when you see that it does feel

7:09

amazing. He just made his product much

7:11

better. He saved a whole bunch of time

7:13

and that's really really good. And so

7:15

this whole thing just kind of got me

7:17

thinking and realizing that I kind of

7:19

operate in one of two ways which is I

7:21

mostly write code by hand or I pretty

7:24

much just Sam Altman take the wheel and I

7:26

will just put in a bunch of testing and

7:28

I just hope that everything works out. I

7:30

don't often use it this way enough,

7:33

which is like, okay, here's something

7:34

that I know can take a lot of time. It's

7:36

going to take a lot of research. You go

7:38

get me 80% of the way there. I will do the

7:41

other 20%. And for those that are

7:42

feeling that it's so over and feeling

7:44

really weighed down about it, I just

7:45

love this little interaction right here.

7:46

By the way, I like Ryan Florence. I'm

7:48

not trying to dunk on him. 5.3 is so

7:49

good. I finally don't even care to look

7:51

at most the code it produces. The next

7:53

models are likely going to be the nail

7:54

in the coffin for writing code by hand

7:56

at all. In this case, it produced two

7:58

bugs, though. And before people jump on

8:00

and be like, "Oh yeah, see but it

8:01

produced bugs." It's like, "Yeah, it

8:02

fixed a really, really difficult one and

8:04

then produced two simple ones." And the

8:06

two simple ones were pretty quickly kind

8:08

of smoothed over and called it a day.

8:10

Anyways, I just wanted to talk about

8:11

this because I just think this stuff is

8:12

so interesting because I've had these

8:14

exact same kind of bugs in my life

8:16

fixing things that just take so much

8:19

time. These are the ones that just make

8:21

me more want to give up is on a very

8:23

large project that just has so much

8:26

momentum behind it. There's these little

8:27

small bugs that end up cropping up that

8:29

are just some deep difference

8:32

undocumented thing that you just have to

8:34

comb through source code to try to

8:35

figure out what is the underlying root

8:37

cause and to be able to have something

8:39

like AI kind of solve that part.

8:42

Honestly, it feels amazing. I did a

8:44

video on Honey and kind of confirming

8:46

their behavior. I went through what

8:48

MegaLag did and kind of went, okay, can

8:50

I see this in the code? And more

8:51

importantly, can I see it over time?

8:53

Have they made engineering decisions on

8:55

the code to actually make it better,

8:58

more resilient, and kind of prove the

9:00

point that they know about the code,

9:02

they know about the behavior, and

9:03

they're trying to make it more robust.

9:04

And of course, me going through minified

9:07

JavaScript code that would take months

9:09

of effort. Instead, I'm just like, yo,

9:11

AI, here's a file. This is approximately

9:14

what I'm looking for. Can you like

9:16

return to me some results that could be

9:18

it? And it's just like, here, I found

9:19

three things that are likely it. And you

9:21

just go through like, yep, no. Yep.

9:23

Okay. Yep. This is the one right here.

9:25

It's just like, wow, that just saved me

9:27

hours of time. I love this stuff. I

9:29

think that, you know, these are the

9:30

things that make me so excited about AI

9:32

because this is truly stripping away the

9:35

things in programming that I find to be

9:37

the least pleasurable parts of it. It's

9:39

no longer problem solving, right? You're

9:41

just actually just trying to swim

9:43

through just a just a river of

9:46

you're just like, "Oh my gosh, this is

9:47

just, you know, you're just like just so

9:50

fine grain going into something just to

9:52

find the one hidden answer." Which can

9:54

be summed up best as Hyrum's Law, right?

9:57

The observation on software engineering

9:58

to put it succinctly, with a sufficient

10:00

number of users of an API, it does not

10:02

matter what you promise in the contract.
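
The law being quoted here is easier to see with a tiny example. The following is a minimal Python sketch under stated assumptions: `list_users`, `list_users_v2`, and `newest_user` are hypothetical names invented for illustration. The documented contract only promises "all user names", yet a caller quietly relies on the order the first implementation happens to produce.

```python
from typing import Callable


def list_users(db: dict[int, str]) -> list[str]:
    """Contract: return every user name. Ordering is NOT promised."""
    # Implementation detail: names happen to come back sorted by user id.
    return [db[uid] for uid in sorted(db)]


def list_users_v2(db: dict[int, str]) -> list[str]:
    """Same contract, different incidental behavior: alphabetical order."""
    return sorted(db.values())


def newest_user(db: dict[int, str],
                lister: Callable[[dict[int, str]], list[str]]) -> str:
    """Caller that depends on the unpromised 'sorted by id' ordering."""
    return lister(db)[-1]


db = {1: "zoe", 2: "amy"}
print(newest_user(db, list_users))     # amy (highest id comes last)
print(newest_user(db, list_users_v2))  # zoe (caller breaks, contract intact)
```

Both implementations honor the written contract, so the breakage comes entirely from observable behavior that nobody promised. Tracking down that kind of dependency is the source-combing work described above.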

10:04

All observable behaviors of your system

10:07

will be depended on by somebody. Always

10:09

just never loved that much. Anywho, I

10:11

hope you like this. I hope you enjoyed

10:13

this. I I don't really know what the

10:14

purpose was of it other than just yap

10:16

for a while. You know what I mean? Hey,

10:17

the name is ThePrimeagen. Hey,

10:21

do you want to learn how to code? Do you

10:22

want to become a better backend

10:24

engineer? Well, you got to check out

10:25

boot.dev. Now, I personally have made a

10:27

couple courses from them. I have live

10:29

walkthroughs free available on YouTube

10:31

of the whole course. Everything on

10:33

boot.dev you can go through for free.

10:35

But if you want the gamified experience,

10:37

the tracking of your learning and all

10:38

that, then you got to pay up the money.

10:40

But hey, go check them out. It's

10:41

awesome. Many content creators you know

10:43

and you like make courses there.

10:46

boot.dev/prime for 25% off.

Interactive Summary

The video discusses the evolving landscape of software engineering, particularly in the context of AI's increasing capabilities. It highlights a specific instance where an AI (Codex) successfully debugged a complex issue that had eluded human developers for months. The creator contrasts the initial hype around AI making software engineering obsolete with the reality of AI as a powerful tool that requires human oversight, review, and refinement. The discussion touches upon the nature of AI-generated code, the importance of rigorous code review, and how AI can automate tedious aspects of development, freeing up engineers for more complex problem-solving. The creator shares personal insights into their own programming workflows and emphasizes that while AI is transformative, human expertise remains crucial in the development process.