we're so back

Transcript

0:00

software engineering and development for

0:02

the last few years has just been on this

0:04

we're so back and oh my gosh it's so

0:06

over just kind of hype cycle every

0:09

single model that's come out the

0:11

headlines do not stop it's over software

0:14

engineering is so over people who've

0:16

never written a line of software trying

0:17

to flex on you about their to-do

0:19

application telling you that your job is

0:21

over and you're just a whiny little baby

0:23

and then inevitably the highs come back

0:25

right we start seeing posts like this

0:27

such as AI actually reporting fake

0:30

figures for multiple months and just

0:32

completely bamboozling a company. And

0:34

us, we all laugh by ourselves.

0:38

Those peasants, they don't even know the

0:40

power that I have, let alone the power

0:42

that they think they can wield. But

0:44

unfortunately, today, this story that

0:45

I'm going to be making on this old

0:47

YouTube video is an "it's over"

0:49

kind of moment, okay? And it comes

0:50

courtesy of Mitchell Hashimoto. If you

0:52

do not know Mitchell Hashimoto, his most

0:54

recent and popular work, of course, is

0:57

Ghostty. I'm using Ghostty right here.

0:58

This is actually how I edit the old Vim

1:01

right here is inside of a Ghostty

1:02

terminal. I find the uh terminal to be

1:05

quite nice. And if you do not know

1:06

anything about Mitchell, he tends to

1:08

tweet things that are very insightful

1:10

and he takes his time, it appears at

1:11

least to kind of communicate through his

1:13

thoughts. And every single time I see a

1:15

longer post by him, I always go, "Oh,

1:17

I'm excited. This is something I just

1:19

want to go and watch and read through

1:21

because whatever he's talking about

1:24

likely, I'm going to find to be very,

1:25

very interesting." And in this case,

1:28

this was very, very interesting. It was

1:30

about a very difficult bug that he and

1:33

his team had quite some time to try to

1:35

figure out and could not figure it out.

1:36

And then in a very short period of time,

1:39

the AI, it figured it out. Ah, Codex 5.3

1:42

extra high with a vague prompt just

1:44

solved a bug that I and others have been

1:46

struggling to fix for over 6 months. So

1:48

obviously when he says 6 months, he does

1:50

not mean 6 months full-time, heads down

1:52

and couldn't do it. Other full reasoning

1:54

levels with Codex failed. Opus 4.6

1:56

failed. Cost $4.14

1:59

minutes full trace plus includes

2:01

original issue right here. So here we

2:03

go. So we're going to walk through a bit

2:05

of what he did and then we're going to

2:06

kind of come to some conclusions because

2:08

I I guess I drew slightly different

2:10

conclusions than he did, but I do think

2:12

that this is a very interesting thing to

2:14

kind of watch through. Use GitHub to

2:17

look up this specific issue and

2:18

determine a fix. And so the AI of course

2:21

went through and used GitHub and kind of

2:23

walked through all the different

2:24

talking. The issue of course was this

2:25

right here. Every single time they made

2:27

a split, the screen would flash a little

2:29

bit. Flash. You see that how there's

2:30

something behind it and then it goes and

2:33

it flashed for a quick second. One more

2:34

time for the you know, you can see that.

2:37

Okay. So that's bad. That's not good. We

2:39

don't want that. So right away the AI

2:40

obviously went, "Okay, this is a GTK

2:42

split update. I'm going to go take a

2:44

look at this." All right. Right. It

2:45

found kind of the exact commit that was

2:47

touching uh this exact kind of area in

2:49

which it's looking into. It then starts

2:51

going through all the patches just going

2:53

all the way through every last piece of

2:55

code and how it's been changed. Then it

2:57

starts looking at each one of the

2:58

commits, you know, according to this one

3:00

specific issue that's been associated

3:02

with the issue. It starts narrowing down

3:04

on a specific operation, this split tree

3:07

split. But then it takes an odd turn

3:09

about partway through it just goes off

3:11

and starts looking at some GNOME GTK C.

3:15

Now, if you didn't catch any of this,

3:16

this right here is Zig. This is a

3:18

fantastic language. If you've never used

3:20

Zig, Zig has to be probably one of my

3:22

personal favorite languages to tool

3:23

around in. Though I haven't written it

3:25

in quite some time, it still is

3:27

something that I just I draw a lot of

3:28

just I have a lot of warm feelings

3:30

about. Now, if you go back to the tweet,

3:32

you'll see that Mitchell says, "The best

3:33

thing that Codex did was eventually

3:35

start reading the GTK4 source code."

3:38

That's where I ended up seeing my GitHub

3:39

issue, and I knew the answer was

3:41

somewhere in there, but I didn't have

3:42

the time or the motivation to do it

3:44

myself. The other models never went

3:46

there. The lower reasoning efforts with

3:48

5.3 didn't go there either. Only extra

3:50

high went there. I think it was the

3:52

critical difference. And yes, it

3:53

actually did make the difference. It

3:55

started making these kind of code

3:56

changes saying, "Hey, this is actually

3:58

what's happening. I'm going to delete a

3:59

lot of the comments, delete a lot of the

4:00

stuff happening. This is actually what

4:02

it should look like." And then Mitchell

4:04

went, "Okay, hey, don't run the full

4:05

test, by the way." Super Chad move.

4:08

Absolutely what I wanted to see. And

4:10

then just says, "Hey yo, just explain to

4:12

me what you fixed in a lot of detail."

4:14

And it goes through and it actually

4:15

talks exactly about what exactly happened.

4:18

And the general thing was that the

4:20

little leaf nodes inside these uh kind

4:22

of surfaces that are created in GTK were

4:25

being completely destroyed and new ones

4:26

were being created. Hence like the blah.

4:28

Instead, it's going to try to reuse the

4:30

ones that aren't changing. Gives a nice

4:32

little ex, you know, answer about why

4:34

this reduces flicker. But here's the

4:36

thing that kind of makes a big

4:37

difference. Now, the average vibe coder,

4:40

shall we call him, the average AI Andy

4:42

from San Francisco, typically at this

4:43

point, they put the code in the bag.

4:45

Okay, just put the code in the bag. And

4:48

you see this a lot. You see a lot of

4:49

people that when something starts

4:51

working, they just kind of move on. They

4:53

kind of call it a day and they go,

4:54

"Okay, there we go." And I especially

4:56

have seen this a lot with the younger

4:58

the younger folks. There just seems to

4:59

be a lot less rigor because there's a

5:01

lot more idea that hey, the AI is super

5:03

correct. It's super fast. It can code

5:05

much faster than me. And I have a whole

5:07

bunch of tests, so it's probably just

5:08

okay. It just feels so much different

5:10

than how I would operate. And this has

5:12

honestly been one of my big sticking

5:14

points about diving headfirst into this

5:16

"Dario, take the wheel" style of

5:18

programming where you just let all the

5:20

code be written by AI is that I don't

5:22

really like how it does things. I have

5:24

to like constantly correct and say,

5:26

"Hey, I just don't think these things

5:27

are right." And of course, you see the

5:28

same thing with Mitchell right here as

5:29

he goes in, he does a thorough code

5:31

review and asks why this specific

5:33

thing was created and then asks even

5:35

Interactive Summary

The video discusses the evolving landscape of software engineering, particularly in the context of AI's increasing capabilities. It highlights a specific instance where an AI (Codex) successfully debugged a complex issue that had eluded human developers for months. The creator contrasts the initial hype around AI making software engineering obsolete with the reality of AI as a powerful tool that requires human oversight, review, and refinement. The discussion touches upon the nature of AI-generated code, the importance of rigorous code review, and how AI can automate tedious aspects of development, freeing up engineers for more complex problem-solving. The creator shares personal insights into their own programming workflows and emphasizes that while AI is transformative, human expertise remains crucial in the development process.