Everyone is Wrong about Tokens
You see this right here? Yeah. That's
$1.3 million
spent in OpenAI tokens in the last 30
days. 603
billion tokens spent. Now, even if I
were to try my hardest, I am not
actually sure it's possible for me to
spend this amount of money or that
amount of tokens. I have no idea how we
accomplished such things. And when I saw
this, I thought this is just the most
ridiculous thing and I
This is so stupid. But then, I started
thinking about it more and more and I
realized that there's a future that is
developing in which I think a lot of
people are wrong and I think this post
right here really helped
crystallize in my mind
where things are going. So, I got a lot
of yapping to do, so I hope you're going
to, you know, strap in, cuz I think
that, yeah, you
probably aren't going to see this one
coming. I don't think you understand
what's going to happen here in the next
year. And I think I might be right on
this one. In fact, I'm
going to do something I normally don't
do. I'm going to make a tech prediction.
I know.
Kind of dangerous. I do just got to
yap about this for a second, okay? The
reason why I have to yap about this is
that whenever a post like this happens,
there's always the exact same thing that
happens. There's this entire
influencer market when it comes to AI and
I largely think they're just simply
holdovers from the crypto days. The
crypto NFT bros moved over to AI. When
they see someone make a post like this,
Papa Pete, of course, they go, "Oh, hey
bros. Hey bros. Everybody. I don't
know if you know this. If you aren't
spending like $100,000, if
you're not even hitting 10 billion, if
you're not even in the B's when it comes
to token usage a month, you're not going
to make it. You know how I know that?
Look at Papa Pete, okay? OpenClaw guy,
he knows what he's doing. Do you know
what you're doing? Not going to make it.
Permanent underclass. Hey, buy my course
and I'm going to teach you how to do AI
properly." And it's such a bad takeaway.
And let me explain it in more simple
terms. Like, you know, the funny thing
about history and about tech, they don't
repeat but they do rhyme. I've heard
that once and it makes me feel like I'm
really smart saying that.
You know what I mean? So, what I mean by
they rhyme is that in 2016, 2018,
2020, if you would see any startup,
if you went and talked to any of your
friends in Silicon Valley, there was
an entire culture
that had more microservices and
Kubernetes usage than they did literal
customers. I actually had a friend
lament to me that he was managing 10
different microservices and he had three
customers. Unironically, that's not me
making up or exaggerating things. He had
triple the services for three customers
and he was just like, "What the hell
have I done with my life?" And it's just
like, "Brother, you have to quit
listening to Google for how to run a
company. Just because it works for them
does not mean it works for you." And
this is kind of that same vibe. Just
because this works for Pete, which by
the way, guess how many dollars he paid
for those tokens? Yeah, zero. You know
how much money you're going to pay for
those tokens? Yeah, full price, okay,
buddy. It's not going to be cheap, okay?
You're not spending 603
billion tokens per
month. And if you are, I mean,
well, hey, nice to meet you, sir. I did
not realize...
I was not aware of your game. And so, I
just wanted to kind of get that out of
the way, okay? Now to the future, the
thing that I think all of you have
wrong, okay? But first, the bag. You see
these people walking around with their
laptops cracked just so their agents
don't stop running?
Mine never stop running. When making
changes with Cloud Agents, you can see
the diffs inline just like with any
other agent. It will create a PR and you
can actually see your CI running live
within the Cloud Agent. You can see the
status of the CI when it completes and
you can even go back and fix the failing
CI. Not only that, but you can also just
run live commands in the terminal. That
is my project right there. This is not
on my computer. This is in the cloud
running where I can ask it to do things.
I can ask for changes. I can ask for
changes on my phone and see the game
played via MP4. What's even crazier is I
can just take over the desktop and I can
place towers and I can just play the
game. Start round. And I can watch the
bats happen. This is my game. Try Cloud
Agents today. cursor.com/agents
and never have to worry about your
laptop being open again.
Okay, welcome back. Let's talk about the
future here for a second. So, something
that you need to kind of keep in mind
when you see these things is that
Pete's entire goal here is a research
project. How far can OpenAI take token
usage? Cuz remember, they believe the
future is going to be this token utopia
where everybody just sits back and
relaxes like we're in Wall-E and we just
are able to out anything and you
have billions upon billions of tokens
for free cuz everything gets 10x cheaper
every single year, which by the way,
that promise is 2 years old and I feel
like things have never been more
expensive. I don't know. It feels that
way to me. Maybe I'm wrong, but things
kind of feel a little costly.
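As a rough sketch of what that promise would imply, here is the same $1.3 million monthly spend projected forward, assuming the 10x-per-year price drop actually held (an assumption the speaker openly doubts). The function name and the default factor are illustrative only and do not come from any real pricing API.

```python
def projected_monthly_cost(cost_today: float, years: int, yearly_factor: float = 10.0) -> float:
    """Cost of the same workload after `years`, if prices fall by `yearly_factor` every year."""
    return cost_today / (yearly_factor ** years)


# $1.3 million a month is the figure quoted in the post.
monthly_spend = 1_300_000

for years in range(5):
    cost = projected_monthly_cost(monthly_spend, years)
    print(f"after {years} year(s): ${cost:,.2f} per month")
```

Under that assumption the same workload would drop to $13,000 a month after two years, which is why the claim matters so much to the token-heavy vision of the future.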
Nonetheless, 10x cheaper. Remember that.
10x cheaper every single year. And so,
at some distant point in the future, you
spending 603 billion tokens and every
last person on Earth doing that, which
by the way, we don't even have enough.
Like I don't even think there's enough
power on Earth to do that currently. We
might have to 10x all power on Earth and
only use it to power GPUs to make
this happen. But again, I digress. If
this were to come to pass, this is how
projects could look. So, I think a lot
of people look at this and they're like,
"Oh, well, you know, OpenAI is being
evil." No, I think they actually just
believe this, right? Like I think they
actually believe that every last person
will be using infinity tokens at all
times. And yeah, sure. They are the
beneficiaries of it. And I mean, it's a
good future for them, but I actually
also think they think this is just
like how the world should work. This is
how projects should be run. And so, this
is a research project which got me to
think about something for a second. And
it's kind of this funny conundrum that
you see. Uh right now, if you go to any
of the big companies, what are they all
about? Hey, what's your token spend? I
mean, there are literally people getting
fired because they're not using AI
enough. You've seen this, you've seen
the articles, you've seen
potentially these rage posts on Reddit.
I can never tell if what I'm reading on
Reddit is real or not, if it's just
there to rage bait me into a frothy
mouth just to go off and tweet a story
that doesn't even exist. But let's
pretend they do exist. People are
getting fired for not using enough AI.
I've read stories about people who
are interviewing, if they use too much
AI, people don't like it. If they don't
use enough AI, people are not liking it.
Like interviewing sounds like hell.
Working at companies right now sounds
pretty awful cuz you're constantly having it
shoved down your throat: you must use
this. People at Amazon, you better use
Kiro. Hey, if you're over at Google,
better use that Gemini, buddy. And it just
keeps on going and going and going,
right? Well, there's kind of a problem
there. I don't think people realize what
the problem is. Because right now it's
like spend all the money you want,
right? Okay. Well, let's just rewind
like 18 months, okay? Not even that long
ago. Let's just go back a little bit.
You wanted a new computer.
Oh,
you want 32 GB of RAM? Well, we're going
to have to get a vice president to sign
off on those $400. Oh, and a chair?
Yeah, that chair, it's going to be a
used Herman Miller. Okay. I'm
sorry, but those buns of yours do not
get the luxury of sitting on a brand new
Herman Miller, okay? You know what?
We're getting you a Lifetime chair.
That's what you get. Yeah, you. You get
a Lifetime chair and I'm going to go
grab some patio furniture padding and
duct tape that right onto your chair.
That's what you get. That's what you
deserve, okay? Because let's just face
it, we can't be bothering our VPs for
these $50 upcharges. We can't do that,
okay? Us as a multi-billion dollar
company, we are very concerned if you
spend $25. And now, all of a sudden,
you can spend infinity on tokens. In
fact, you're even encouraged to do so.
Going back to this for a second, if you
really think about that, that means it
takes $1.3 million a month to run
OpenClaw. So, how many engineers is
that? Well, like if you think about
that, let's just pretend we're a big
tech company like Google. An engineer costs $50,000 a
month, and you're spending $1.3 million
a month on just AI agents. To replace
those with just engineers... Well,
I mean, it's a
number. That kind of math you can't just
do off the top of your head. So, let's
just say 30 engineers. That's like 30
engineers worth of people working on
something. You can't just do this for
every single project. Your company at
some point's going to go, "Okay,
timeout. We've made a mistake. We have
decided that we let you use all the
tokens you want. That's bad. We're going
to go back to the old days. Who's the
most token efficient? Oh, you're not
token efficient. You're spending 603
billion tokens on maintaining a simple
project? No, we're not going to do
that."
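The back-of-the-envelope math here can be sketched as follows, using only the numbers quoted in the video: $1.3 million a month, 603 billion tokens, and $50,000 a month per engineer. The constant and function names are made up for illustration, and the per-token figure is a blended average across whatever models were used, not a published price.

```python
MONTHLY_TOKEN_SPEND = 1_300_000      # dollars per month, from the post
TOKENS_USED = 603_000_000_000        # tokens per month, from the post
ENGINEER_MONTHLY_COST = 50_000       # dollars per month, the speaker's rough figure


def engineer_equivalents(spend: float, cost_per_engineer: float) -> float:
    """How many engineers the same monthly budget would pay for."""
    return spend / cost_per_engineer


def dollars_per_million_tokens(spend: float, tokens: int) -> float:
    """Blended average price paid per one million tokens."""
    return spend / (tokens / 1_000_000)


engineers = engineer_equivalents(MONTHLY_TOKEN_SPEND, ENGINEER_MONTHLY_COST)
blended_price = dollars_per_million_tokens(MONTHLY_TOKEN_SPEND, TOKENS_USED)

print(f"engineer equivalents: {engineers:.0f}")
print(f"blended price: ${blended_price:.2f} per million tokens")
```

With these inputs the spend works out to 26 engineers, which the video rounds up to about 30, and to a little over $2 per million tokens on average.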
You're gone. There's going to come a
world where there's an entire consultant
class going through these companies
teaching people how to be efficient with
tokens. No longer will we see this world
of infinity token usage. Instead, it's
going to be, "Okay, who are the top
performers by features and things
delivered, not just by how much you
spend." Because in the old world, we
used to do buy versus build. Do you
build the thing or do you buy the thing?
Depending on the cost and the trade-off,
sometimes it's better to, you know,
trade the time for the money or the
money for the time. But now, we kind of
have a new world. It's like buy versus
build versus vibe. Do you vibe it? Well,
vibing takes both time and money. So,
which is the proper trade-off? And I
think companies are going to quickly
snap back to the old way in which
they've always done things. It's going
to be, "Okay, who's the most efficient?
Who knows how to use these things the
best?" It's not going to be the people
spending infinity. It's not going to be
the influencers that are telling you
you need to run 500 agents in the cloud
at all times or you're not going to make
it. It's going to be the people that are
just being engineers. They're the people
that like learning. People that actually want
to just do good work and use things to
help speed them up in certain areas. And
that's my prediction. Yes, I'm doing a
prediction. I'm doing an actual
prediction. I know you're not supposed
to do predictions. With tech predictions,
you're almost always wrong,
but I do think in the near future we are
going to see token efficiency as an
entire argument as opposed to simply
token maxing. Token maxing is because
we're just trying to figure out is this
even viable? And by we, I don't mean me.
I'm out here still hand coding stuff for
my video game, okay? It's a
different world. But nonetheless, this
was very interesting to see. I was very
happy I got to read about this and kind
of see the live reaction from everybody
because people were just, you know,
instantaneously suspicious. Like, "Oh,
this is just OpenAI trying to make
money." Yeah.
They are trying to make money.
I'll tell you that much. But also,
this is just like what they think the
future looks like. You and 100 agents
non-stop doing stuff. And maybe at some
point in the future, maybe hey, you know
what? Maybe in 10 years, some large
amount of time when we have, you know,
100x more energy and 1,000x more GPUs.
Yeah, maybe that future does exist in
some far away place. But right now, to
me at least, the big takeaway here is
I think you got to start thinking about
token efficiency. You got to start
thinking about how you're actually using
it. Maybe having a kajillion agents does
work for one person.
But I'm not sure if this is really a
sustainable approach for anybody, even
if the promised 10x is going to happen.
Okay, sorry. I made a future prediction
and I'm probably going to be wrong, but
I, you know, honestly, I think I'm right.
Also, the consulting class, can we all
just agree they're going to be the most
annoying people in the universe?
Honestly,
I'd almost rather take the crypto bros
who are going to be like, "Oh, you gotta
token max," than
the new class of agile coaches that are
going to be coming out. These agile
coaches for token efficiency are just
going to be the worst. Oh my gosh. There
are actually going to be prompt trainers.
Like
it's going to be like Pokémon trainers,
but they're going to be prompt trainers
and you're going to have to go in there
and they're going to like one-v-one you
on prompts. It's going to be so
ridiculous. It's all horoscopes, baby.
The name
is ThePrimeagen.
The video discusses the viral revelation of an entity spending $1.3 million on OpenAI tokens in a single month. The speaker critiques the current 'AI influencer' trend of promoting extreme token usage, comparing it to past trends of corporate over-engineering, like excessive microservice usage. The speaker predicts a shift away from 'token maxing' toward a focus on token efficiency as companies realize that infinite AI spending is unsustainable. Ultimately, the speaker anticipates the rise of a new, potentially annoying 'consulting class' focused on optimizing token usage rather than just raw consumption.