Did the AI Job Apocalypse Just Begin? (Hint: No.) | AI Reality Check

Did the AI Job Apocalypse Just Begin? (Hint: No.) | AI Reality Check | Cal Newport

Transcript

0:00

Did the fintech company Block just lay

0:03

off 40% of its workforce due to AI

0:06

automation?

0:08

Can the best AI models pass a freshman

0:11

computer science class? Programmers love

0:14

Agentic AI, but how exactly are they

0:18

using these tools? For those of you who

0:20

followed the tech news this past week,

0:22

these are all pressing questions, and

0:25

we're going to try to find some answers.

0:28

I'm Cal Newport and this is the AI

0:32

reality check. Now, I want to do a quick

0:35

aside before we get into this week's

0:37

stories because this is a new format

0:39

for my podcast feed. I want to give you

0:41

a quick explanation. More and more on

0:44

the main Monday episode of this show,

0:46

I've been reacting to the latest AI news

0:49

where I put on my computer science hat

0:50

and I try to push back on hype and vibe

0:52

reporting and surface the deeper trends

0:54

in these topics that I think really

0:56

matter. But not everyone who listens to

0:58

that Monday episode wants to hear about

1:00

this. So I decided I would move the AI

1:02

discussion to its own mini episodes on

1:06

Thursdays. Uh this is an experiment.

1:09

Maybe I'll move it back. Maybe I'll move

1:11

it to its own feed. Maybe I won't do it

1:13

every week. So uh just bear with me. But

1:15

keep in mind if you want to share any of

1:16

these episodes, we're also putting them

1:18

up on YouTube so you can send the video

1:19

link to someone who might need to hear

1:22

some of this reality checking. All

1:23

right, that's enough logistics. Let's

1:25

get into our first story of the week.

1:29

All right. Late last week, Jack Dorsey,

1:32

the CEO of the fintech company Block,

1:34

you know, they're responsible for uh

1:36

Square and Cash App among some other

1:38

products, posted a note on X, announcing

1:42

massive layoffs at his company. Let me

1:45

read you from this note. Dorsey said,

1:49

"Today, we're making one of the hardest

1:51

decisions in the history of our company.

1:53

We're reducing our organization by

1:55

nearly half from over 10,000 people to

1:58

just under 6,000. That means over 4,000

2:01

of you are being asked to leave." All

2:04

right. Later on, he says the following.

2:07

We're not making this decision because

2:09

we're in trouble. Our business is

2:10

strong. Dot dot dot. But something has

2:14

changed. We're already seeing that the

2:16

intelligence tools we're creating and

2:18

using paired with smaller and flatter

2:20

teams are enabling a new way of working

2:22

which fundamentally changes what it

2:24

means to build and run a company and

2:26

it's accelerating rapidly. Can I make a

2:28

quick aside? This is like a hint to

2:30

CEOs. If you are announcing the layoff

2:33

of 40% of your staff, can you use

2:36

capital letters at the beginning of your

2:37

sentences? It really caught my

2:39

attention in this uh tweet that he

2:42

doesn't capitalize any of his words.

2:43

It feels a little disrespectful, but

2:45

let's get back to the actual story here.

2:47

Uh, the traditional media was quick to

2:50

embrace and amplify Dorsey's claim that

2:54

these layoffs were because AI made these

2:56

positions uh redundant or unnecessary.

3:00

Here is the headline, for example, from

3:01

a New York Times article about the

3:03

layoffs. The headline read, "Block cuts

3:06

40% of its workforce because of its

3:08

embrace of AI."

3:10

Here's the subhead from that article.

3:13

About 4,000 workers will lose their jobs

3:15

as the payment company does more work

3:17

with new artificial intelligence tools,

3:19

its top executive said. Another quick

3:22

aside, because this is a

3:24

journalistic thing I began to notice

3:25

more and more, I think really starting

3:27

around the COVID coverage era where you

3:30

have a claim that feels right that you

3:32

want to put in your subhead because

3:34

there's a point you're trying to make,

3:35

but either it's hard to fact check or

3:38

you don't want to fact check it because

3:39

you're not quite sure what you're

3:41

going to find. It'll be complicated. So,

3:42

you just make the claim, then you put a

3:44

comma and attribute it to someone else.

3:46

We didn't used to see attributed claims

3:49

in sub headlines or headlines. But we

3:52

began to see it more. Uh, it's a good

3:54

way of I'm trying to make a point here

3:57

and I don't actually want to go and

3:58

directly verify did they lay off all

4:01

these people because of AI tools. Um, I'll

4:03

just say they laid off people because of AI

4:04

tools said someone. So, you add as a

4:07

comment. So, just keep in mind that sort

4:08

of reporting trick. Um, if we read the

4:10

article itself, the framing makes it

4:12

super clear what they're implying here.

4:14

Here's from the article. The cuts made

4:16

as Block reported strong financial

4:18

results for its most recent quarter are

4:19

perhaps the most striking example so far

4:21

of a technology company making plans

4:23

to eliminate employees because of AI. I

4:25

don't mean to pick on the Times. A lot

4:27

of publications had similar

4:30

coverage. Uh, and the stock price went

4:31

up 20% for Block. This is an important

4:35

article to look at in part because I got

4:37

sent it a lot of times. When I get sent

4:39

an article a lot of times, that means it

4:40

is catching people's attention and is

4:43

either exciting or upsetting them. So,

4:45

it's worth some closer scrutiny. I think

4:48

there's a general vibe that this article

4:50

is trying to verify or validate, which

4:52

is the vibe of something big is

4:55

happening. Yeah, we've been talking

4:56

about AI could get rid of jobs or

4:58

whatever, but now it's happening. See,

5:00

look, this is the first shoe to drop of

5:03

a major crisis. Like, it's the first

5:04

company that laid off almost half of its

5:06

workforce. This is the thing we've been

5:07

warning you about. Major economic

5:09

disruption. It has begun. That is a

5:11

story that is very sticky and very uh

5:14

attention catching.

5:16

But is it true?

5:19

Well, if you dig a little deeper,

5:20

there's a lot of commentators online who

5:21

know this industry sector a little bit

5:23

better who are not at all convinced. Let

5:26

me give you a few bits of contextual

5:30

information about Block and its layoffs.

5:33

Between 2019 and 2025, Block's employee

5:37

count grew from around 4,000 employees

5:40

to over 10,000. So, they had massive

5:44

growth during the pandemic. A lot of

5:46

this growth actually came from

5:48

acquisitions in the crypto and

5:50

blockchain space earlier in the pandemic

5:52

when those things were still hot. Um

5:54

those acquisitions are now of course

5:56

floundering as those technologies

5:57

especially the blockchain based software

5:59

technologies are having a hard time. A

6:01

lot of the startups are really

6:02

struggling

6:03

Despite the fact that the Times said

6:07

that they had quote strong financial

6:09

results, end quote, if you actually read

6:10

the industry analysts who study the

6:14

quarterly reports from Block, they're not

6:17

impressed because the last two quarters

6:19

they actually fell short of their

6:21

earnings target.
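To put the headcount figures above side by side, here is a rough back-of-the-envelope sketch in Python. It assumes the rounded numbers quoted in the episode (about 4,000 employees in 2019, about 10,000 before the layoffs, about 6,000 after); the constant and function names are illustrative and do not come from any Block filing.

```python
# Rounded headcounts as quoted in the episode (assumed values; the exact
# figures are only given as "around", "over", and "just under").
HEADCOUNT_2019 = 4_000
HEADCOUNT_BEFORE_CUTS = 10_000
HEADCOUNT_AFTER_CUTS = 6_000


def percent_change(old: int, new: int) -> float:
    """Return the percentage change from old to new."""
    return (new - old) * 100 / old


# Growth from 2019 to the pre-layoff peak.
growth = percent_change(HEADCOUNT_2019, HEADCOUNT_BEFORE_CUTS)
# Size of the announced cut relative to the peak.
cut = percent_change(HEADCOUNT_BEFORE_CUTS, HEADCOUNT_AFTER_CUTS)
# Where the company lands relative to 2019 after the cut.
net = percent_change(HEADCOUNT_2019, HEADCOUNT_AFTER_CUTS)

print(f"Growth 2019 to peak: {growth:+.0f}%")
print(f"Announced cut:       {cut:+.0f}%")
print(f"Net versus 2019:     {net:+.0f}%")
```

On these rounded figures, the cut undoes roughly two thirds of the headcount added since 2019 and still leaves the company about 50% larger than it was in 2019. That is consistent with a correction after rapid growth, though the arithmetic alone does not prove the cause.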

6:23

So here's an alternative explanation for

6:25

what might be going on here. Like just

6:27

about every major tech company in

6:29

America, Block overhired during the

6:32

pandemic when that industry was booming.

6:34

Also like just about every major tech

6:36

company right now in the last two years.

6:39

They're shedding jobs to try to

6:41

rightsize back because they had

6:43

overhired during the pandemic. We've

6:44

talked about on this show before Amazon

6:46

doing this, Microsoft is doing this.

6:47

This is a common trend in recent years.

6:51

But how do we know it really wasn't AI?

6:53

AI is the reason why they laid off these

6:55

4,000 people.

6:57

Well, there's a couple things going on.

6:59

One, a lack of specificity in Dorsey's

7:01

statement. He just says like, well, we

7:02

have these intelligence tools. And then

7:04

he talks about non-AI things like and we

7:06

have like different types of teams and

7:07

we just uh we don't need as many people

7:09

anymore. No specific reference of this

7:12

particular tool has taken on this role.

7:14

So, we fired we shut down this division

7:16

because we don't need employees there.

7:18

or in this division what we did is we

7:20

laid off the entire entry-level class

7:22

because the managers can now get by with

7:23

less. It's very vague what he said. Two,

7:26

as we'll hear later in today's episode,

7:29

though there are major changes happening

7:31

in computer programming because of new

7:33

agentic AI tools. Basically, every

7:35

serious commentator who is studying this

7:37

industry says, "Yeah, we're not yet we

7:39

haven't figured out the companies

7:40

haven't figured out exactly what this

7:41

means. We're certainly not laying off

7:44

ready to lay off half of our workforce

7:45

yet. These tools are very new. the

7:47

versions that people are getting excited

7:50

about. But maybe the most telling uh

7:53

reason why we know this is not AI is

7:55

that Ethan Mollick didn't buy this claim.

7:58

Ethan Mollick from Penn is a respected

8:01

AI commentator who is very much on the

8:03

booster side. He's very AI is going to

8:06

change everything. And even he didn't

8:09

buy this idea that AI was responsible

8:11

for the layoffs at Block. On a LinkedIn

8:14

post, Ethan Mollick said the following,

8:17

referring to the layoffs. This isn't

8:18

about AI, but that is a smart way to

8:22

sell it if you want to see your stock

8:23

jump 20%. Then on X, Ethan Mollick said

8:27

the following in response to Dorsey's

8:29

tweet. Two things. One, given that

8:33

effective AI tools are very new and we

8:35

have little sense of how to organize

8:36

work around them, it is hard to imagine

8:38

a firmwide sudden 50% efficiency gain.

8:42

Two, CEOs with vision who hired well

8:45

should also use AI for expansion and

8:47

augmentation, not decimation. I'll just

8:50

say as an aside,

8:52

uh I've been hearing this from the

8:53

managers and programmers I've been

8:55

talking to in the last couple weeks

8:56

about how they're using agentic

8:58

programming. I am much more likely to

9:00

see the effect to be I mean I haven't

9:02

had any of them say we're laying people

9:04

off, but I have heard a lot of people

9:05

say, like Mollick implies here, the

9:07

reaction to these tools uh at a lot of

9:09

these startups has been do more work.

9:11

Great. Now we can do more work with the

9:12

same people. Let's make more money out

9:14

of the same people, not let's lay people

9:15

off. All right. Uh we have another voice

9:18

of skepticism here. This one comes from

9:20

Ron Shevlin, who is an

9:23

industry analyst who specializes in the

9:25

fintech sector. So he specializes in the

9:27

sector where Block is and he writes and

9:28

covers Block professionally as a

9:30

financial journalist. He wrote a column

9:32

right after this that was titled the

9:34

following. Block lays off 40% of staff

9:36

and blames it on AI. Don't buy the

9:40

excuse. And he goes on to say, "Yeah,

9:42

they overacquired. They made some

9:43

bad acquisitions. They need to

9:45

rightsize." And they're blaming AI

9:47

because it sounds better than saying,

9:49

"Yeah, we uh we made some bad calls

9:52

during the pandemic and now we have to

9:54

adjust to it." All right. So, what's the

9:56

bottom line here in terms of reality

9:58

checking this story?

10:00

AI will have an impact on jobs.

10:04

I'm not one of these skeptics that says

10:05

this is a fad that's going to go away,

10:08

that this is going to be like uh

10:10

blockchain based software that really

10:12

just failed to catch on.

10:14

But we're not really there yet outside

10:16

of some narrow instances. The tools

10:18

have not matured to the phase where we

10:20

really understand what's going on the

10:22

where we're really seeing major changes

10:24

to the way companies are structuring

10:26

themselves. Most of the commentators I

10:27

can find who follow this closely say,

10:29

"Yeah, sure. This is probably there is

10:31

going to be things happen with jobs. We

10:33

don't know if it's going to lead to

10:34

expansions or contractions or what

10:35

sectors get hit more than yet, but we're

10:37

not there yet. There is a tendency I

10:39

think among coverage right now to lean

10:42

into that vibe that AI is going to

10:44

affect jobs and try to keep making the

10:46

claim it's happening right now. And

10:47

what's happening is the CEOs of these

10:50

companies, especially tech companies, so

10:51

CEOs like Jack Dorsey are seeing the

10:53

tendency towards that vibe reporting.

10:55

This is very tempting for journalists.

10:56

And so they're trying to uh there's a

10:59

term Annie Lowrey introduced. I think it

11:00

was something like AI washing. They're

11:02

trying to justify layoffs that are due

11:04

to things like pandemic overhiring by

11:06

saying, "Well, AI, we're being smart so

11:08

they look better uh like better decision

11:10

makers and like they're more forward

11:12

thinking." It's important that we cover

11:14

AI's impact on jobs accurately so that

11:16

when real impacts come,

11:20

we can see them with clear eyes

11:23

and react to them honestly. uh and hold

11:27

to account the actual ch. Why are you

11:29

firing these people? Do we what's

11:31

happening here? What's what leaders

11:32

doing this? We really do need to cover

11:34

that accurately. So, we have to stop the

11:36

vibe reporting on the AI job apocalypse.

11:38

It's not here yet, and we don't know if

11:40

it's going to come at all, but the best

11:41

we can do is try to be accurate about

11:43

what we're saying. All right, second

11:46

story.

11:47

Um, this one's kind of a fun one. All

11:50

right. So, Anthropic CEO Dario Amodei

11:52

famously said in recent I guess this is

11:54

all this last last uh year famously said

11:57

that their LLM products have the

11:59

intelligence of someone with a doctorate

12:01

that before like well it was as smart as

12:04

a high school student then as smart as a

12:05

college student now it's as smart as

12:07

someone with a doctorate. He described

12:08

this product deploying this product like

12:10

having a, quote, army of PhDs, end quote, in

12:13

your data center. Last month he used a

12:15

related terminology. He said uh we can

12:17

offer you a country of geniuses

12:21

in a data center. Well, I was thinking

12:23

about this this approach of sort of

12:25

describing AI with human education

12:28

levels when I came across an interesting

12:30

video that was posted in January which

12:33

did a really cool experiment. A TA for

12:37

Cornell University's freshman computer

12:40

science course CS uh 2112, they probably

12:43

call it 2112. This is their sort of

12:46

advanced

12:48

freshman fall CS course. So if you come

12:50

into the CS program there uh as a pretty

12:53

advanced student, this would be the the

12:54

course you would take. But it's for

12:55

freshmen in their first semester. He was

12:58

TAing it. So he said, "Here's what I'm

12:59

going to do. I'm going to take the three

13:02

leading AI models and I'm going to give

13:06

them every graded thing we do in this

13:09

class. I will give to the models and

13:11

then I will grade their results at the

13:13

same time I'm grading the real students

13:15

in the class using the exact same

13:16

rubrics and then at the end I will you

13:18

know, weight the grades, just treat them

13:20

like a student in my in this class and

13:22

see how they do. Let me play you a quick

13:24

clip here. This is the intro

13:29

uh to that video. Can AI pass a first

13:32

semester freshman CS class? To answer

13:35

this question, I ran every single

13:36

assignment, every exam, every quiz,

13:38

every graded interaction the students

13:40

got this semester through the three best

13:42

models I could get my hands on from

13:43

ChatGPT, Claude, and Gemini. Then I

13:46

graded each result with the exact same

13:48

rubric we use on students so that I

13:49

could give each AI the most accurate

13:52

possible grade in the class. All right,

13:54

so this was a very entertaining video if

13:55

you watched the whole thing because he

13:56

goes through specific assignments. He's

13:58

like, "Wa, look, this is really cool. Oh

13:59

my god, look at this crazy thing it did.

14:01

Um, it's well edited. Uh, I thought it

14:03

was really cool." In the end, they have

14:04

a competition in the class where you

14:07

create these like critters that evolve

14:09

and they uh they had the AI models

14:11

critters compete with the critters from

14:13

the class. Uh a couple things I noticed

14:15

from the videos. Sometimes these models

14:17

did very well on assignments. Sometimes

14:19

they really struggled. Sometimes they

14:22

made very revealing, baffling mistakes

14:24

like in an early assignment where they

14:27

were doing some simple string

14:28

concatenation. The assignment had you

14:29

write a program that was going to output

14:31

the word. You're going to create a

14:32

string concatenation. But basically,

14:33

you're going to output the word hello is

14:35

what it asked you to do on the screen.

14:37

Uh, and Claude's submission output

14:40

hello world. Because what's going on

14:43

here is there's a lot of AI assignments

14:45

out there. I mean, CS assignments out

14:47

there that famously say, hey, write

14:49

hello world as the first thing you do

14:50

when you're using a new programming

14:52

environment. And clearly, it was just

14:54

trying to statistically grow out its

14:56

answer. It's like, well, if I'm printing

14:57

hello in an assignment, I got to I got

14:58

to print hello world. And then added

15:00

another world just to be safe. Um, but

15:02

how did they end up grade-wise? Okay, so

15:04

I have the grades in front of me here.

15:06

They used the latest greatest models

15:08

from ChatGPT, Claude, and Gemini.

15:11

They actually upgraded during the fall.

15:13

They did this last fall when they

15:15

were using the very most expensive

15:16

version of uh the Claude LLM available.

15:18

I forgot which one. And then they when a

15:20

new one came out, they upgraded to that

15:22

new one. Um, on some assignments these

15:25

these things did pretty well, especially

15:27

the early assignments. We got like on

15:29

the first assignment, ChatGPT got a 102

15:32

out of 104. Claude got a 99 out of 104.

15:35

Gemini got a 101 out of 104. They

15:38

also did well on the final exam because

15:40

this was an in-class final exam where

15:42

you're just writing answers, right? So

15:43

like you just have to use the knowledge

15:44

in your head. Um, that's a good setup

15:47

again for LLMs. And so, like, ChatGPT

15:50

got a 93 out of 100. Gemini got an

15:52

84. There's other assignments where they

15:54

they really uh struggled. Assignment

15:56

six, ChatGPT got 32 out of 100. Claude

15:59

got 20 out of 100. Gemini got 13 out of

16:02

100. On assignment five, ChatGPT got 60

16:05

out of 100. Claude got six out of 100.

16:08

Gemini got 67 out of 100. There's a lot

16:10

of issues it had with uh hallucinating.

16:12

um it had a hard time if you watch this

16:14

video where you would the assignment

16:16

would give you multiple you know some

16:18

rules for what to do in the assignment

16:19

and it would just sort of skip some of

16:21

the rules sometimes I think in the

16:24

example where Claude got six out of 100

16:25

it just kind of made up its own

16:26

assignment and solved that one instead. So

16:29

it's sort of a mixed bag in terms of its

16:31

final grades two of the models Claude

16:34

and Gemini ended up getting a C+ in the

16:37

class this is a freshman computer

16:39

science you need a 2.5 to declare in your

16:42

in the initial classes you need a 2.5 GPA

16:45

at Cornell to declare yourself as a

16:46

computer science major. Uh, a C+ is like

16:48

a 2.3 something. So they weren't doing

16:51

well enough to actually even major in

16:53

computer science. ChatGPT did better with

16:56

the B+. It was below the median for the

16:58

class, but uh it did somewhat better.
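The scores quoted above are easier to compare as percentages. Here is a minimal Python sketch that uses only the numbers read out in the episode. The grade-point value for a B+ (3.3) is an assumption based on the common 4.0 scale; the roughly 2.3 for a C+ and the 2.5 threshold are as the host describes them. All names are illustrative.

```python
# Scores as (earned, possible), exactly as read out in the episode.
SCORES = {
    "Assignment 1": {"ChatGPT": (102, 104), "Claude": (99, 104), "Gemini": (101, 104)},
    "Assignment 5": {"ChatGPT": (60, 100), "Claude": (6, 100), "Gemini": (67, 100)},
    "Assignment 6": {"ChatGPT": (32, 100), "Claude": (20, 100), "Gemini": (13, 100)},
}

# Final letter grades reported for each model.
FINAL_GRADES = {"ChatGPT": "B+", "Claude": "C+", "Gemini": "C+"}

# Grade points: C+ is about 2.3 per the host; B+ = 3.3 is an assumption.
GRADE_POINTS = {"B+": 3.3, "C+": 2.3}

# GPA the host says is needed to declare the major.
MAJOR_GPA_THRESHOLD = 2.5


def percent(earned: int, possible: int) -> float:
    """Return a score as a percentage, rounded to one decimal place."""
    return round(earned * 100 / possible, 1)


def meets_major_threshold(letter: str) -> bool:
    """Check whether a letter grade clears the stated GPA threshold."""
    return GRADE_POINTS[letter] >= MAJOR_GPA_THRESHOLD


for assignment, results in SCORES.items():
    row = ", ".join(f"{model} {percent(*score)}%" for model, score in results.items())
    print(f"{assignment}: {row}")

for model, letter in FINAL_GRADES.items():
    verdict = "clears" if meets_major_threshold(letter) else "falls short of"
    print(f"{model}: {letter} {verdict} the {MAJOR_GPA_THRESHOLD} threshold")
```

Treating a single course grade as if it were a whole GPA is a simplification, but it mirrors the comparison the host makes.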

17:01

Anyways, here's what's interesting about

17:03

this. I mean there's the kind of the

17:04

catchy thing is like this is an army of

17:06

geniuses. this is a PhD level, whatever.

17:09

They're struggling with the first class

17:10

you take as a freshman in computer

17:12

science, which is the topic that these

17:14

models are best suited for. So, there's

17:16

that sort of like gotcha moment, but

17:17

that's not really what this is about,

17:19

right? Because I'm sure you could get

17:21

these chat bots to get you the right

17:24

answers to these assignments if you're

17:26

willing to be sufficiently interactive

17:28

and hold their hands and get the prompts

17:29

in just the right way and correct them.

17:31

That's not really the right way, the

17:32

right takeaway here. I think the right

17:34

takeaway here was that it was stupid all

17:36

along for Dario Amodei to try to use

17:41

human education levels as a way to

17:43

describe a large language model.

17:46

This is just different. The human brain,

17:48

we we have a a general purpose

17:50

integrated brain that does lots of

17:52

things. The whole person is educated. It

17:54

makes sense to talk about the

17:56

education level of a person, but not

17:57

really a language model. It turns out a

18:00

lot of these claims like when Dario Amodei

18:01

I went back and checked this out excuse

18:03

me why did he originally say that their

18:05

language models were now PhD level it's

18:07

because they had the original time he

18:09

started saying that is that they had

18:11

given it math problems like a problem

18:14

set and it was doing well on the math

18:17

problems from this problem set and one

18:19

of the professors who worked on creating

18:21

those problem sets said those are hard

18:22

problems those are the type of problems

18:25

I would assign to my graduate students

18:26

that's where they originally got the

18:28

claim that this is a PhD level. Right?

18:30

So this idea of just generally talking

18:33

about the intelligence level of language

18:35

models I think is anthropomorphizing and

18:37

is not useful. The reality is these are

18:39

very specialized tools. They tend to get

18:42

tuned for specialized purposes and to

18:45

get their real value. It's a combination

18:47

of the tool and learning as the human

18:49

how best to use and deploy the tool and

18:51

check its work and redeploy it towards

18:53

that particular goal. That is a very

18:56

different tool use scenario. It's a tool

18:58

you use your scenario is very different

18:59

than imagining just an anthropomorphized

19:01

brain that has a general education

19:02

level. So hopefully we can stop using

19:04

terms like having a data center full of

19:06

PhDs. Also that was a clever video. So

19:09

you know kudos to that TA for putting

19:12

that together. It was a hard it was a

19:13

hard CS class. It was definitely harder

19:16

than the intro CS classes

19:18

I took at Dartmouth, but it reminds me

19:20

of the type of classes we had at MIT. So

19:22

you know it was a hard class. All right,

19:23

one final story here. The story actually

19:26

comes from me. Um, obviously there's a

19:29

lot going on in the last four or five

19:31

months with new agentic coding tools

19:34

being enthusiastically embraced by

19:37

computer programmers. A lot of these

19:39

viral essays are going around that just

19:41

keep and and articles that are

19:42

influenced by those essays and podcasts

19:44

where people are talking about, oh my

19:46

god, huge changes are happening in the

19:48

world of computer programming. This is

19:49

this is and this is really going to be

19:51

this is like ground zero for the long

19:54

promised we're about 3 years in now. The

19:55

long promised claim that the language

19:58

model-based tools are going to have

20:00

massive disruptions. But what actually

20:03

is going on? I've been trying to find

20:05

out. As people who subscribe to my

20:07

newsletter at calnewport.com know, a week

20:09

or two ago I put out a call for

20:10

professional computer programmers to

20:12

send me detailed reports about exactly

20:14

how they and their teams use

20:17

language model-based AI tools and how

20:19

this has changed in the recent past. I

20:21

have over 350 such reports in so far.

20:24

I've carefully made my way through a

20:25

hundred. I'm really trying to get my

20:27

brain around what's really happening

20:28

with professional programmers and these

20:31

tools. I thought it would be useful

20:32

today to read you excerpts from two

20:35

responses that I think are uh very

20:38

typical of the type of responses I'm

20:40

reading that try to give you a better

20:42

picture of what exactly does it mean for

20:45

these programmers to be using these new

20:46

tools. Um I'm I cut out details in these

20:49

and have some elision to get rid of

20:51

identifying details. All right, so

20:53

here's my first excerpt. I'm a software

20:55

developer working at a tech startup. Our

20:59

use of AI varies by person at the

21:01

company, but my use has skyrocketed

21:03

starting in the fall of 2025.

21:06

So much so that I don't write any code

21:10

anymore, but I'm still heavily involved

21:12

in oversight and architecture. I used

21:15

Cursor quite a bit last year, but have

21:17

moved on to working directly in the

21:18

terminal with Codex at work. The

21:21

workflow goes something like this. Plan

21:23

a feature or start a discussion about a

21:25

bug fix with AI. discuss until I'm

21:28

satisfied, have it output a plan,

21:31

iterate on the plan, then execute the

21:33

plan. After execution, I verify the

21:36

outcome. I use Git extensively

21:38

throughout this process. Git is a

21:42

repository software for managing code

21:44

that multiple people are working on.

21:46

I've tried the multi-agent approach

21:48

where multiple agents are working on

21:50

different Git worktrees at the same

21:52

time. I can't do it. It's too much

21:55

context switching and I end up just

21:56

accepting things I wouldn't normally

21:58

accept because it's an exhausting

21:59

process. The quality dips dramatically.

22:02

I love my current workflow. I've

22:03

developed things in the past week that

22:05

would have taken me months before. All

22:06

right, let's pause there before I do the

22:08

second excerpt. This, I would say, is

22:10

very typical of what I would call the

22:12

enthusiastic all-in user from among the

22:15

subset of professional programmers. Most

22:18

of the code they're producing is now

22:20

actually being generated by an AI

22:22

agentic tool. Typically it is Claude

22:24

Code, where they switched the model

22:26

behind it. I don't know if it was Opus

22:28

to Sonnet or Sonnet to Opus in the fall,

22:30

and that really seemed to make it

22:31

good enough now that a lot of people

22:33

wanted to use it. Um though I would say

22:35

I also see ChatGPT Codex is also

22:38

commonly used but an interesting thing

22:41

about this or I want to point out two

22:42

things. One there's a lot of just

22:45

chatbot discussion happening in these

22:47

workflows. Remember he talked about

22:49

making a plan iterating on the plan.

22:51

That's all actually like chatbot

22:53

interaction. So, so sort of or related

22:56

to using these tools to produce more

22:58

code. These programmers have entered

23:02

a more interactive way. They, you know,

23:04

they want to talk back and forth. It

23:06

reminds me a lot of the the research I

23:07

did for the New Yorker about how

23:08

students are using chat bots to write

23:10

paper. They find talking back and forth

23:12

with the chatbot as they write is less

23:14

straining. So, that's picking up here.

23:16

But also notice this programmer is not

23:19

really big on the multi-agentic approach

23:22

which is what you see most often told in

23:24

the sort of breathless online articles

23:26

and YouTube videos is this idea of I

23:29

have 20 agents working at the same time

23:31

and this agent checks this agent and

23:33

there's a supervising agent that looks

23:34

at those agents and then it reports over

23:36

here to the hierarchy agent and then

23:37

that agent is on OpenClaw so that it can

23:40

uh it can send recommendations to my

23:41

YouTube channel and then make sure that

23:43

it pays that you know these super

23:45

complicated trees of different agents

23:46

supervising other agents. You really

23:48

aren't seeing that, at least in my study

23:50

here. It's you're not seeing a ton of

23:53

that in professional programmers. You

23:54

tend to see it more in people who are

23:56

like working on their own personal

23:58

bespoke projects and find it really fun.

24:00

But I don't see as much and that's what

24:02

we saw reflected here. All right, let me

24:03

read you one other typical uh

24:08

uh excerpt here from a real professional

24:10

programmer. I think this captures well

24:12

another very common type of

24:15

response which is a little bit more

24:17

reticent but still appreciating the

24:19

power of these new tools. Let me read

24:20

this. I'm a software developer working

24:22

at a tech startup. Our use of AI varies

24:25

by person at the company but my use has

24:27

skyrocketed starting in the fall of

24:29

2025. Oh, wait. That was the last one.

24:32

I'm sorry. This is the new one. I don't

24:34

want to just reread the last one. All

24:35

right. I'm like a language

24:38

model here just sort of randomly

24:39

hallucinating the same answer twice. No,

24:41

no, here's the real second excerpt. I'm

24:44

a staff software engineer at a tech

24:46

startup. The AI models have made the

24:48

easiest tasks even easier. Scaffolding a

24:51

solution, boilerplate code, replacing

24:54

variables, or moving an import.

24:55

Repetitive tasks are good candidates.

24:57

LLMs are also useful as a way to quickly

25:00

investigate the documentation of a tool

25:02

or get a reminder on syntax for

25:04

something I'm trying to do. But the easy

25:06

stuff, the task that AI can do well, was

25:09

never the hardest nor most time-consuming

25:11

part of my job. When actively using

25:13

these coding agents, I found that it

25:15

generally slows me down. Using them

25:17

introduces tasks I didn't have before:

25:19

composing a prompt, checking the output,

25:21

reprompting, manually refactoring when it

25:23

isn't quite right. It also slows down

25:25

the code review process. I'm much more

25:27

detailed in my reviews when I know a

25:29

coworker used an LLM to generate some or

25:31

all of the code. That's also a very

25:33

common response. It's pointing

25:35

out this idea which I think is a fair

25:37

criticism

25:38

of the people like our first excerpt,

25:41

who are doing most of their code

25:42

generation with agentic AI and saying this is

25:45

saving so much time. The

25:47

more reticent users are noting that you are

25:50

downplaying the huge amount of time that

25:53

now surrounds it. Yeah, you don't write the

25:55

code yourself, and that's faster, but now you

25:57

have to do so much other work: all of

25:59

this iteration with the model and the

26:01

prompts and try the prompt again and

26:03

work on your agent's markdown file and

26:05

your skills harness and then all of the

26:07

review on the other side and if it was

26:09

produced with AI you really have to

26:11

review it. And he's like, there's all of

26:12

this other work that's surrounding this

26:14

workflow, none of which is very fun.

26:15

I mean, and this is taking a lot of

26:17

time. Are we sure that this

26:21

is actually producing the best code? So

26:23

there's sort of this tension going on in

26:24

the computer programming world. Here's a

26:26

takeaway from this one. One,

26:29

agentic coding tools

26:33

passed a threshold of usefulness with the

26:35

Claude Code update in the fall that

26:39

has made them much more heavily used. In

26:41

my survey, something like 45% of the

26:43

people I talked to are now producing

26:45

the majority of their code with an

26:47

agentic tool such as Claude Code. All

26:50

right. Two, it's really unclear exactly

26:54

what the best practices for this

26:56

are. There seems to be a spectrum of

26:58

enthusiasm among its users in the

27:00

space, for sure. On one end, there's way

27:03

too much AI interaction going on. This

27:04

can't be the most efficient way to do

27:06

it. On the other end, there's a lot

27:07

of reticence. The reality is going to

27:09

fall somewhere in the middle. We don't

27:11

yet know what the future of computer

27:13

programming looks like. I think by the

27:15

summer, there's going to be some best

27:16

practices. They'll have some clever

27:18

acronyms to go with them. There'll be

27:20

some best practices about how best to

27:23

use these. There will be automatic code

27:25

production. I think we're going to pull

27:27

back a little bit on how much AI

27:30

chatbots should be involved in review as

27:33

well as planning. I think that's a

27:34

little bit of just enthusiasm there. I

27:36

do think a lot of code will still be

27:38

generated but we'll be better at where

27:40

we deploy the code. I think there'll be

27:41

more standardization about planning and

27:43

architecture documents, etc., which will

27:45

have a high overhead at first but it'll

27:46

allow us to deploy these tools better. I

27:48

do not think based on these interviews

27:50

that the hyper multi-agent approach that

27:52

we see most talked about on the internet is

27:54

going to become some sort of standard

27:55

for serious programmers in most places.

27:57

And the vibe coding like you see, you

28:01

know, talked about a lot. Give me this

28:03

app and I come back a week later and

28:05

it's done. That really is in the

28:07

realm of hobbyists and

28:09

personal apps for yourself or people who

28:11

are doing experiments. None of the

28:12

serious programmers I heard from so far

28:15

are doing anything like that for the

28:16

most part. All right. So, there's a lot

28:18

to be done here, but what I'm trying to

28:19

do, and this is why it's a reality check. I am

28:21

not interested

28:23

in breathless accounts of what's

28:25

happening online

28:27

because that's engagement hunting. I'm

28:29

not interested in hearing sort of like

28:32

non-technical reporters who have just

28:34

heard a lot of those accounts and then

28:36

are like, look, I don't know the

28:37

details, but I think we can all agree

28:38

that like there's not going to be

28:40

programmers in the future. I think we

28:41

got to talk to real programmers.

28:44

What is really going on? Something is

28:46

happening. It's more complicated than

28:47

other people make it seem. Let's keep

28:50

listening. I'll read you some more of

28:51

these reports in weeks ahead. Let's

28:53

figure it out the old-fashioned way. Turn

28:55

every page, learn what's going on,

28:57

what's working, what's not, what's hype,

28:58

what's not, and let's try to figure out

29:00

what's actually happening. I think we

29:01

will, and we'll get on it, especially if

29:03

you follow me here. All right, that's

29:04

all the time I have for today. Remember,

29:07

take AI seriously, but not necessarily

29:09

everything you hear about it. I'll be

29:11

back on Monday with the main episode,

29:12

and hopefully I'll do another one of

29:13

these next Thursday. See you then. Hey,

29:15

if you like this video, I think you'll

29:17

really like this one as well.

Interactive Summary

The speaker performs a "reality check" on current AI news, addressing three main stories. First, he debunks the media narrative that fintech company Block's 40% workforce layoff was primarily due to AI automation. He argues it was more likely due to pandemic overhiring and struggling crypto acquisitions, with AI being a convenient justification (AI washing). Second, he reviews an experiment where leading AI models (ChatGPT, Claude, Gemini) took a Cornell freshman computer science course. The models performed inconsistently, with Claude and Gemini earning C+ grades and ChatGPT a B+, demonstrating that they struggled with foundational programming tasks and that anthropomorphizing AI's intelligence level is misleading. Finally, he shares insights from a survey of professional programmers on their use of agentic AI coding tools. While many use these tools to generate a majority of their code, workflows often involve significant human interaction for planning and review, and the much-hyped hyper multi-agent approach is not common among serious professionals, who find it impractical.
