Everyone is Wrong about Tokens

Transcript

0:00

You see this right here? Yeah. That's

0:02

$1.3 million

0:05

spent in OpenAI tokens in the last 30

0:08

days. 603

0:11

billion tokens spent. Now, even if I

0:13

were to try my hardest, I am not

0:15

actually sure it's possible for me to

0:18

spend this amount of money or that

0:20

amount of tokens. I have no idea how we

0:23

accomplished such things. And when I saw

0:25

this, I thought this is just the most

0:27

ridiculous thing and I

0:29

This is so stupid. But then, I started

0:31

thinking about it more and more and I

0:33

realized that there's a future that is

0:36

developing in which I think a lot of

0:37

people are wrong and I think this post

0:40

right here really helps it kind of

0:42

crystallize in my mind

0:45

where things are going. So, I got a lot

0:48

of yapping to do, so I hope you're going

0:49

to, you know, strap down, cuz I think

0:51

that, yeah, you

0:53

probably aren't going to see this one

0:55

coming. I don't think you you understand

0:56

what's going to happen here in the next

0:58

year. And I think I might be right on

0:59

this one. I'm going to In fact, I'm

1:00

going to do something I normally don't

1:01

do. I'm going to make a tech prediction.

1:03

I know.

1:05

Kind of dangerous. I I do just got to

1:07

yap about this for a second, okay? The

1:09

reason why I have to yap about this is

1:11

that whenever a post like this happens,

1:13

there's always the exact same thing that

1:15

happens. There's this entire

1:17

influencer market when it comes to AI and

1:20

I largely think they're just simply

1:21

pull-overs from the crypto days. The

1:23

crypto NFT bros moved over to AI. When

1:26

they see someone make a post like this,

1:27

Papa Pete, of course, they go, "Oh, hey

1:30

bros. Hey bros. Everybody. Uh I don't

1:31

know if you know this. If you aren't

1:33

spending like $100,000 if you A- if

1:36

you're not even hitting 10 billion, if

1:38

you're not even in the B's when it comes

1:40

to token usage a month, you're not going

1:42

to make it. You know how I know that?

1:43

Look at Papa Pete, okay? OpenClaw guy,

1:46

he knows what he's doing. Do you know

1:47

what you're doing? Not going to make it.

1:49

Permanent underclass. Hey, buy my course

1:51

and I'm going to teach you how to do AI

1:53

properly." And it's such a bad takeaway.

1:56

And let me explain it in more simple

1:57

terms. Like, you know, the funny thing

1:58

about history and about tech, they don't

2:01

repeat but they do rhyme. I've heard

2:02

that once and it makes me feel like I'm

2:04

really smart saying that.

2:07

You know what I mean? So, what I mean by

2:09

they they rhyme is that in 2016, 2018,

2:13

2020, if you would see any startup, you

2:16

if you went and talked to any of your

2:17

friends in Silicon Valley, there was

2:19

an entire culture

2:21

that had more microservices and

2:24

Kubernetes usage than they did literal

2:27

customers. I actually had a friend

2:29

lament to me that he was managing 10

2:32

different microservices and he had three

2:34

customers. Unironic, that's not me

2:36

making up or exaggerating things. He had

2:38

triple the services for three customers

2:41

and he was just like, "What the hell

2:43

have I done with my life?" And it's just

2:45

like, "Brother, you have to quit

2:46

listening to Google for how to run a

2:48

company. Just because it works for them

2:51

does not mean it works for you." And

2:53

this is kind of that same vibe. Just

2:55

because this works for Pete, which by

2:57

the way, guess how many dollars he paid

2:59

for those tokens? Yeah, zero. You know

3:02

how much money you're going to pay for

3:03

those tokens? Yeah, full price, okay,

3:05

buddy. It's not going to be cheap, okay?

3:06

You're not spending 603

3:08

billion tokens per

3:10

month. And if you if you are, I mean,

3:12

well, hey, nice to meet you, sir. I did

3:14

not realize you

3:15

I was not aware of your game. And so, I

3:17

just wanted to kind of get that out of

3:18

the way, okay? Now to the future, the

3:20

thing that I think all of you have

3:22

wrong, okay? But first, the bag. You see

3:25

these people walking around with their

3:26

laptops cracked just so their agents

3:29

don't stop running?

3:30

Mine never stop running. When making

3:32

changes with Cloud Agents, you can see

3:33

the diffs inline just like with any

3:35

other agent. It will create a PR and you

3:38

can actually see your CI running live

3:40

within the Cloud Agent. You can see the

3:41

status of the CI when it completes and

3:43

you can even go back and fix the failing

3:45

CI. Not only that, but you can also just

3:47

run live commands in the terminal. That

3:49

is my project right there. This is not

3:52

on my computer. This is in the cloud

3:54

running where I can ask it to do things.

3:57

I can ask for changes. I can ask for

3:59

changes on my phone and see the game

4:01

played via MP4. What's even crazier is I

4:05

can just take over the desktop and I can

4:06

place towers and I can just play the

4:08

game. Start round. And I can watch the

4:11

bats happen. This is my game. Try Cloud

4:14

Agents today. cursor.com/agents

4:17

and never have to worry about your

4:19

laptop being open again.

4:21

Okay, welcome back. Let's talk about the

4:22

future here for a second. So, something

4:24

that you need to kind of keep in mind

4:26

when you see these things is that what

4:28

Pete's entire goal is, it's a research

4:31

project. How far can OpenAI take token

4:34

usage? Cuz remember, they believe the

4:36

future is going to be this token utopia

4:39

where everybody just sits back and

4:42

relaxes like we're in Wall-E and we just

4:44

are able to out anything and you

4:46

have billions upon billions of tokens

4:48

for free cuz everything gets 10x cheaper

4:50

every single year, which by the way,

4:52

that promise is 2 years old and I feel

4:54

like things have never been more

4:55

expensive. I don't know. It feels that

4:58

way to me. Maybe I'm wrong, but things

4:59

kind of feel a little costly.

5:01

Nonetheless, 10x cheaper. Remember that.

5:03

10x cheaper every single year. And so,

5:04

at some distant point in the future, you

5:07

spending 603 billion tokens and every

5:10

last person on Earth doing that, which

5:12

by the way, we don't even have enough.

5:13

Like I don't even think there's enough

5:14

power on Earth to do that currently. We

5:16

might have to 10x all power on Earth and

5:18

only use it to power GPUs to make

5:20

this happen. But again, I digress. If

5:22

this were to come across, this is how

5:24

projects could look. So, I think a lot

5:27

of people look at this and they're like,

5:28

"Oh, well, you know, OpenAI is being

5:29

evil." No, I think they actually just

5:30

believe this, right? Like I think they

5:32

actually believe that every last person

5:34

will be using infinity tokens at all

5:36

times. And yeah, sure. They are the

5:38

benefactors of it. And I mean, it's a

5:39

good future for them, but I actually

5:41

also think they they think this is just

5:43

like how the world should work. This is

5:44

how projects should be run. And so, this

5:47

is a research project which got me to

5:49

think about something for a second. And

5:51

it's kind of this funny conundrum that

5:53

you see. Uh right now, if you go to any

5:56

of the big companies, what are they all

5:58

about? Hey, what's your token spend? I

6:00

mean, there is literal people getting

6:02

fired because they're not using AI

6:05

enough. You've seen this, you've seen

6:07

the articles, you've seen

6:09

potentially these rage posts on Reddit.

6:11

I can never tell if what I'm reading on

6:12

Reddit is real or not, if it's just

6:14

there to rage bait me into a frothy

6:16

mouth just to go off and tweet a story

6:18

that doesn't even exist. But let's

6:19

pretend they do exist. People are

6:21

getting fired for not using enough AI.

6:23

I've read stories about people who

6:24

are interviewing, if they use too much

6:26

AI, people don't like it. If they don't

6:28

use enough AI, people are not liking it.

6:29

Like interviewing sounds like hell.

6:31

Working at companies right now sounds

6:33

pretty awful cuz you're constantly being

6:34

shoved down the throat, you must use

6:35

this. People at Amazon, you better use

6:37

Kiro. Hey, if you're over at Google,

6:38

better use that Gemini, buddy. And just

6:40

keeps on going and going and going,

6:42

right? Well, there's kind of a problem

6:43

there. I don't think people realize what

6:46

the problem is. Because right now it's

6:48

like spend all the money you want,

6:49

right? Okay. Well, let's just rewind

6:52

like 18 months, okay? Not even that long

6:54

ago. Let's just go back a little bit.

6:58

You wanted a new computer.

7:01

Oh,

7:02

you want 32 GB of RAM? Well, we're going

7:05

to have to get a vice president to sign

7:08

off on those $400. Oh, and a chair?

7:11

Yeah, that chair, it's going to be a

7:13

used Herman Miller. Okay. You're I'm

7:15

sorry, but those buns of yours do not

7:17

get the luxury of sitting on brand new

7:19

Herman Miller, okay? You know what?

7:21

We're getting you a lifetime chair.

7:22

That's what you get. Yeah, you. You get

7:24

a lifetime chair and I'm going to go

7:25

grab some patio furniture padding and

7:27

duct tape that right onto your chair.

7:28

That's what you get. That's what you

7:29

deserve, okay? Because let's just face

7:31

it, we can't be bothering our VPs for

7:33

these $50 upcharges. We can't do that,

7:36

okay? Us as a multi-billion dollar

7:38

company, we are very concerned if you

7:41

spend $25. And now, all of a sudden,

7:45

you can spend infinity on tokens. In

7:47

fact, you're even encouraged to do so.

7:49

Going back to this for a second, if you

7:51

really think about that, that means it

7:54

takes $1.3 million a month to run

7:57

OpenClaw. So, how many engineers is

8:00

that? Well, like if you think about

8:02

that, let's just pretend we're a big

8:03

tech Google company. An engineer costs $50,000 a

8:07

month, and you're spending $1.3 million

8:10

a month on just AI agents. To replace

8:12

those with just engineers with Well,

8:14

that kind of math you I mean, it's a

8:16

number. That kind of math you can't just

8:17

do off the top of your head. So, let's

8:18

just say 30 engineers. That's like 30

8:20

engineers worth of people working on

8:22

something. You can't just do this for

8:24

every single project. Your company at

8:27

some point's going to go, "Okay,

8:28

timeout. We've made a mistake. We have

8:31

decided that we let you use all the

8:33

tokens you want. That's bad. We're going

8:35

to go back to the old days. Who's the

8:37

most token efficient? Oh, you're not

8:39

token efficient. You're spending 603

8:42

billion tokens on maintaining a simple

8:44

project? No, we're not going to do

8:46

that."
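
A rough sketch of the back-of-the-envelope math from this segment, using only the figures quoted in the video: $1.3 million, 603 billion tokens, 30 days, and the assumed $50,000-per-month engineer. The function and variable names are made up for illustration, and the "blended price" is just an average across all token types, not a published rate.

```python
# Back-of-the-envelope math using only the figures quoted in the video.
# All names here are illustrative assumptions, not part of any real API.

MONTHLY_SPEND_USD = 1_300_000          # "$1.3 million" over the last 30 days
MONTHLY_TOKENS = 603_000_000_000       # "603 billion tokens"
ENGINEER_COST_USD_PER_MONTH = 50_000   # the video's big-tech assumption
DAYS = 30


def blended_price_per_million(spend_usd: float, tokens: int) -> float:
    """Average dollars per one million tokens, across all token types."""
    return spend_usd / (tokens / 1_000_000)


def engineer_equivalents(spend_usd: float, engineer_cost_usd: float) -> float:
    """How many engineer-months the same spend would buy."""
    return spend_usd / engineer_cost_usd


price = blended_price_per_million(MONTHLY_SPEND_USD, MONTHLY_TOKENS)
engineers = engineer_equivalents(MONTHLY_SPEND_USD, ENGINEER_COST_USD_PER_MONTH)
tokens_per_second = MONTHLY_TOKENS / (DAYS * 24 * 60 * 60)

print(f"Blended price: ${price:.2f} per million tokens")
print(f"Engineer equivalents: {engineers:.0f}")
print(f"Average throughput: {tokens_per_second:,.0f} tokens per second")
```

With the quoted figures, the spend works out to 26 engineer-months, which is close to the "let's just say 30" estimate, and to a little over $2 per million tokens on average.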

8:47

You're gone. There's going to come a

8:49

world where there's an entire consultant

8:51

class going through these companies

8:54

teaching people how to be efficient with

8:56

tokens. No longer will we see this world

8:59

of infinity token usage. Instead, it's

9:01

going to be, "Okay, who's the top

9:03

performers by features and things

9:04

delivered, not just by how much you

9:06

spend." Because in the old world, we

9:08

used to do buy versus build. Do you

9:10

build the thing or do you buy the thing?

9:12

Depending on the cost and the trade-off,

9:14

sometimes it's better to, you know,

9:15

trade the time for the money or the

9:17

money for the time. But now, we kind of

9:19

have a new world. It's like buy versus

9:21

build versus vibe. Do you vibe it? Well,

9:23

vibing takes both time and money. So,

9:26

which is the proper trade-off? And I

9:27

think companies are going to quickly

9:29

snap back to the old way in which

9:31

they've always done things. It's going

9:33

to be, "Okay, who's the most efficient?

9:35

Who knows how to use these things the

9:37

best? It's not going to be the people

9:39

spending infinity. It's not going to be

9:40

the influencers that are telling you

9:42

you need to run 500 agents in the cloud

9:44

at all times or you're not going to make

9:46

it. It's going to be the people that are

9:47

just being engineers. They're the people

9:50

like learning. People that actually want

9:52

to just do good work and use things to

9:54

help speed them up in certain areas. And

9:56

that's my prediction. Yes, I I'm doing a

9:59

prediction. I'm doing an actual

10:00

prediction. I know you're not supposed

10:02

to do predictions. Tech predictions

10:03

are almost always wrong,

10:06

but I do think in the near future we are

10:09

going to see token efficiency as an

10:11

entire argument as opposed to simply

10:13

token maxing. Token maxing is because

10:16

we're just trying to figure out is this

10:17

even viable? And by we, I don't mean me.

10:20

I'm out here still hand coding stuff for

10:23

my video game, okay? This is it's a

10:25

different world. But nonetheless, this

10:26

was very interesting to see. I was very

10:28

happy I got to read about this and kind

10:30

of see the live reaction from everybody

10:33

because people were just, you know,

10:35

instantaneously suspicious. Like, "Oh,

10:37

this is just OpenAI trying to make

10:38

money." Yeah.

10:40

They are they are trying to make money.

10:41

I'll tell you that much. But they also

10:42

this is just like what they think the

10:44

future looks like. You and 100 agents

10:46

non-stop doing stuff. And maybe at some

10:49

point in the future, maybe hey, you know

10:50

what? Maybe in 10 years, some large

10:53

amount of time when we have, you know,

10:55

100x more energy and 1,000x more GPUs.

10:58

Yeah, maybe that future does exist in in

11:00

some far away place. But right now, to

11:02

me at least, the big takeaway here is

11:05

I think you got to start thinking about

11:06

token efficiency. You got to start

11:08

thinking about how you're actually using

11:09

it. Maybe having a kajillion agents does

11:12

work for one person.

11:13

But I'm not sure if this is really a

11:15

sustainable approach for anybody, even

11:18

if the promised 10x is going to happen.

11:20

Okay, sorry. I made a future prediction

11:22

and I'm probably going to be wrong, but

11:23

I you know, honestly, I think I'm right.

11:25

Also, the consulting class, can we all

11:27

just agree that's going to be the most

11:28

annoying people in the universe?

11:29

Honestly,

11:31

I'd almost rather take the crypto bros

11:34

who are going to be like, "Oh, you got to

11:35

token max." than

11:37

the new class of agile coaches that are

11:39

going to be coming out. These agile

11:40

coaches for token efficiency is just

11:43

going to be the worst. Oh my gosh. There

11:45

is actually going to be prompt trainers.

11:47

It's going to be like Pokémon trainers,

11:49

but they're going to be prompt trainers

11:50

and you're going to have to go in there

11:52

and they're going to like one-v-one you

11:53

on prompts. It's going to be so

11:55

ridiculous. It's all horoscopes, baby.

11:57

The name

11:59

is ThePrimeagen.


Interactive Summary

The video discusses the viral revelation of an entity spending $1.3 million on OpenAI tokens in a single month. The speaker critiques the current 'AI influencer' trend of promoting extreme token usage, comparing it to past trends of corporate over-engineering, like excessive microservice usage. The speaker predicts a shift away from 'token maxing' toward a focus on token efficiency as companies realize that infinite AI spending is unsustainable. Ultimately, the speaker anticipates the rise of a new, potentially annoying 'consulting class' focused on optimizing token usage rather than just raw consumption.
