PewDiePie beat ChatGPT?
Anyway, so PewDiePie probably has done
some distillation is my guess.
[laughter]
Is that what we're about to find
out? Pewds is just distilling poor
Anthropic, and Dario right now is just
dying on the inside.
>> DeepSeek 2.5, way bigger model than mine.
Facebook's flagship model, Llama 4
Maverick. [laughter]
>> Does anyone even consider Facebook in
the race anymore? When you're like,
"Facebook's, you know, flagship model,"
I'm just like, oh, okay. Yeah. Like, is
this equivalent to Karpathy's nanoGPT?
Like, what are we looking at right here?
Model B Llama 4 Maverick destroyed. And
most importantly of all, my model
outperforms ChatGPT's 4o [laughter]
in like November or something. That
works. This sounds like impressive, but
it's way less impressive once we
>> No, first off, it depends on what he
actually means by that. But if you do
outcompete 4o, that's actually super
impressive. So, I don't know what's
about to happen in this video, but hey,
I'm in. That's cool. In like November or
something that
sounds like impressive, but it's way
less impressive once we get into it, and
we will get into it. It also sounds way
less impressive considering the fact
that I almost burned down my house twice
for this project.
It also sounds way LESS IMPRESSIVE. I
LOST MY GODDAMN VICINITY FOR THIS. NOW,
I have not created my own AI. I have
merely taken an AI model and trained it.
It's like stealing a child on the street
instead of birthing one myself. It's way
more effective that way. Plus, it would
cost million. What? Okay. I mean, that's
it's true. I guess stealing a child is
way more like time effective than than
creating one yourself. True. True.
Absolutely true. It cost millions and
millions of dollars in infrastructure,
which I do not have yet. Am I talking
about birthing or But it's important for
you to understand just how naive I was
going into this cuz I knew nothing about
machine learning, training, AI, and
coding. Did I mention my model is a
coding model? I can make a coding model.
Sure, Felix. Great idea. But I also know
I wasn't that crazy going into this, and
I'll explain why. But mainly the fact
that I wanted to do this was because I
wanted to learn.
>> Is he doing what I think he's doing?
Whoopsies.
Is
Yeah, look at this.
He put the, he put the little
microphone on his fingy. Snapped it on
to the old finger like most people hold
it. He just snapped it on. The man's a
revolutionary genius. I swear he just
does things and I just I don't even know
what to do with it. He's just uh also
just to be completely fair, him saying
he knows nothing about coding but going
to build a coding model. Uh I mean
technically that makes him on par with
every other AI ML researcher right now.
Great idea. But I also know I wasn't
that crazy going into this and I'll
explain why. But mainly the fact that I
wanted to do this was because I wanted
to learn. The reason I'm here with this
video and the series of events has just
been me approaching this philosophy of
just yes this might be difficult but I
will learn from it step by step. You
know that takes you places. And that's
why I'm so excited to announce today's
sponsor, which is boot.dev. boot.dev
is a website that teaches you how to
code. But for real, I did their Linux
course and it's there is a nonzero
chance I might accidentally show up by
accident in this video. Very exciting.
Okay, very very exciting. Also, by the
way, just so you guys know, my audio
again is kind of goofy. Every time I
pause it, it requires me to like go back
a little bit cuz there's like a few
seconds of silence. So, if you're
wondering why I'm repeating it,
fantastic. I don't know what's
happening. When you actually understand
how something works, it changes things
for real. It's none of this Duolingo
bing, fake learning. Okay, it's fun. It's
engaging and it works. I'm super excited
to do their other courses as well.
Create your own AI agent in Python
course. I'm going to say something that
makes me sound like such a boomer dad,
but I am at this point. So, to heck with
it. Instead of focusing on gaming too
much lately, I just been focusing on
leveling up myself. But it's true. It's
such an amazing feeling and I want you
guys to experience it as well. And I
think boot.dev is an amazing venture to
do so. So check it out in the
description and I will remind you later
as well. Okay. Now, funny enough to
dude, he's right. I hope you took that
little QR code. Do you see that little
QR code? I'm going to give you guys one
more chance. Okay. Well, the problem is
that hopefully that doesn't screw it up.
That hopefully that title [laughter]
>> true. It's such an amazing feeling and I
want you guys to experience it as well.
And I can't believe Pewds is a
better modern philosopher than most of
tech Twitter. "You don't need to learn
[ __ ]." Pewds is like, actually, as you get
older, it turns out it's really
meaningful and awesome to actually
become an expert in something and to
feel like you're actually improving at
something. I can't believe, dude, we
live in just the strangest world. Pewds
is now the modern philosopher. Funny
enough, to train my AI model,
it ironically works very similar to how
you would train yourself on boot.dev.
There's an instruction of a problem. You
get a framework on how to initiate or
what to use, and then there's a validated
answer at the end of it. Basically, I got
around 100,000 samples like that [music]
and then you feed that to the model
which slightly nudges its parameters
like it slightly probably nudges your
brain [music] and bada bing bada boom
you have trained your model. It's kind
of like this. You're AI. Look at this.
Are you learning?
Are you getting this?
>> You will look at these. You will look at
100,000 more. Play. Pay attention.
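A minimal Python sketch of what one such training sample might look like. The field names (`instruction`, `scaffold`, `answer`), the prompt template, and the `nudge` helper are illustrative assumptions, not anything shown in the video. Real fine-tuning updates billions of weights through a training library rather than a single number.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    instruction: str  # the problem statement
    scaffold: str     # starter code or hints on what to use
    answer: str       # the validated solution


def to_training_pair(sample: Sample) -> tuple[str, str]:
    """Render one sample as (prompt, completion) text for fine-tuning."""
    prompt = (
        f"### Instruction\n{sample.instruction}\n\n"
        f"### Starter\n{sample.scaffold}\n\n"
        "### Answer\n"
    )
    return prompt, sample.answer


def nudge(weight: float, gradient: float, learning_rate: float = 1e-5) -> float:
    """One gradient-descent step: move a weight slightly against its gradient."""
    return weight - learning_rate * gradient


sample = Sample(
    instruction="Write a function that returns the absolute value of x.",
    scaffold="def absolute(x):",
    answer="def absolute(x):\n    return -x if x < 0 else x",
)
prompt, completion = to_training_pair(sample)
print(prompt + completion)
```

The model is shown the prompt, its prediction is compared with the validated answer, and the mismatch produces the gradient that each weight is nudged by.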
>> This might take a while. The model. Hey,
to be fair, humans are super efficient.
Like if you really think about it, it
takes a baby from zero to to like one to
figure out how to walk. So it's only
seen however many hours of real life
from zero to one. And and for the first
couple months, its eyeballs didn't even
work. And it doesn't have a developed
nervous system that far into this entire
situation. It didn't even have muscles
and it figured it out. It takes AI like
billions upon billions of hours to be
able to do anything. You know, humanity
hyperefficient when you really think
about it. This little fellow right here,
hyper efficient learner. Okay,
>> hyper efficient. This might take a
while.
>> The model that I used was Qwen 32B,
which is already amazing at coding, but
I needed it to be amazing at coding.
>> He's also doing instruction tuning. By
the way,
>> the whole reason I decided to do this
and where I landed, cuz I discovered
something. There are many ways to
benchmark and test your model. I
discovered there's one called Aider
Polyglot. If you remember last video, I
used the agent Aider to code my own web
UI. This is
>> Am I allowed to show this on Twitch?
What do we What do we do with this? Am I
Am I about to get banned? Is legendary
gooning lord Dan Clancy about to come
in here and ban me? I'm sorry, Dan.
>> This is a respectable benchmark [music]
that has coding in six different
languages. That's six more languages
than I know.
But what was interesting about this is
[music] that state-of-the-art models
like ChatGPT perform like garbage on
this benchmark: 18.2%. And the model
that I was planning to train on performs
8%. [music]
trash but it performs 16% if you use a
different format called whole format
instead of diff format. Basically
there's two different formats but one of
them is important. How do I explain
this?
>> It's basically like this. You draw a
picture. Okay, imagine you draw a
picture and you want to add a cloud. But
instead of just adding the [ __ ]
cloud, you redraw the entire picture
with the cloud. It makes no [ __ ]
sense. But basically, a lot of
>> Did the original one The original one
did not have a
>> the cloud.
>> The original one did not have a a
dingleberry. He added more than just a
cloud in that. Okay. Hey, hey,
there is some bonus content in this one
that I was not I did not think we were
allowed to have bonus content of instead
of just writing the whole thing. It's
just stupid. It wastes compute and it
wastes time and I never used it as well.
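The two edit formats can be sketched in Python as follows. This is an illustration in the spirit of whole-file versus search/replace edits, not Aider's actual parser. The function names and the toy "picture" are assumptions made up for the example.

```python
def apply_whole(_old_source: str, new_source: str) -> str:
    """Whole format: the model returns the entire file, which replaces the old one."""
    return new_source


def apply_search_replace(source: str, search: str, replace: str) -> str:
    """Diff-style format: the model returns only the region that changes."""
    if source.count(search) != 1:
        raise ValueError("search text must match exactly once")
    return source.replace(search, replace, 1)


picture = "sky\nsun\nhouse\ngrass\n"

# Whole format: redraw the entire picture just to add a cloud.
whole_reply = "sky\nsun\ncloud\nhouse\ngrass\n"

# Diff format: only describe where the cloud goes.
diff_search, diff_replace = "sun\n", "sun\ncloud\n"

assert apply_whole(picture, whole_reply) == apply_search_replace(
    picture, diff_search, diff_replace
)
print("characters sent, whole:", len(whole_reply))
print("characters sent, diff:", len(diff_search) + len(diff_replace))
```

Both routes end at the same file. The diff route sends less text, but it fails outright if the search text does not match, which is one reason a model can score differently depending on the format.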
So I thought, hey, if I can just fix the
format, I will make my model 16%, and I
will almost beat ChatGPT. The goal: beat
ChatGPT. Easy, because what I had on my
side at my disposal my arsenal was
Chinese AI research. You'd think China
would be on the more censorship side of
things. At least that's what I thought.
How wrong I was. It's completely the
opposite. DeepSeek, China, Chinese AI,
basically just [music] released their
model for anyone to run and a whole
document with their entire training
process in great just don't ask about
the three T's. You know what I'm
[laughter]
[laughter]
detail? It's amazing. There's so much
information from these research
documents combined with the open source
community. There's so much to try and it
just makes you really want to try it
yourself. At least that's how I saw it.
Even though a lot of it was super
advanced and I didn't understand
anything, eventually I did. I [laughter]
think
it's also hilarious reading these
documents. Chinese AI researchers are
unironically comedians cuz they write
the most unhinged [ __ ]. They say [ __ ] like,
"Oh, we trained our model on 2,048
GPUs." You're like, "What? That's like
$60 million." And you keep reading
and they're like, "We emphasize this is
an economical APPROACH." I'M LIKE,
"ECONOMICAL? WHAT'S THE non-economical
version?" But I think that kind of puts
things in perspective and why a lot of
companies in the west, [music] they
don't want to share information. So,
with Chinese AI research on my side and
a boot.dev course,
>> To be completely fair, 2,048, 2 to the 11,
uh, H200s is an extremely cheap amount
of GPUs to be able to train. And the
reason why is because they uh you know
the whole uh
hey chat GPT
yo dog, how how does quantum mechanics
work? Yo, ain't give me that knowledge.
[laughter]
I that's that's what happens. ready to
do this thing.
Now, what do you need to train AI? Data.
Now, how do you get data without
sacrificing your soul in the process?
Well, there is options, believe it or
not. You can mine The Stack, which is
this 60 TB dataset that you're
[music] allowed to train on. Good thing
I kept my hard drives around. You can
use publicly available data sets. Yep.
You can also mine git, which is a little
more of a gray zone cuz you have to
check for licenses, which yeah, you just
you just hit all the MIT licenses. It's
actually something I did want to see is
like how good could you make nanoGPT if
you just use, uh, MIT licenses, and like,
hey can you make an okay inline
autocomplete from MIT maybe which I'm
sure everyone is doing right [laughter]
these big tech companies they're they're
being super ethical and great about it
I'm sure that's why they're not sharing
any information they can also synthesize
your data glorious synthetic data by the
way, is PewDiePie becoming the next, uh,
Rossmann? Is this what we're Are we in
real time? Oh, hold on. Time out. Time
out. Are we in real time watching the
guy who gave us Ligma? The Ligma Chad
to like Luke Smith pipeline, Vim Diesel.
Is this what we're watching like right
now in live action? Like, where are we
inside the cification of it all? Is Is
this what's happening? I think that's
what's happening. Wow. Don't taste so
good. I tried every single method there
is. It's been a mess. This is scraping
git or enriching my data. This is
scraping for more data. This is running
testing on the data. And this is
augmenting the data. And this is my
eight LLMs. This is the level we've
reached. This project was kind of like I
was like in the middle of a freeway and
there's a bunch of cars that I
had to direct constantly because [ __ ]
had to move for this to finish.
Everything took so much time and my GPUs
had to constantly be cooking and I had
to do all this makeup cosmetic surgery
to the data cuz all these developers
writing [music] poor lazy first of all
why do you even publish some of that
lazy... Whoa, whoa, whoa, whoa, whoa, whoa,
whoa, whoa, whoa, Pewds, what do you
mean? Commit "I don't know," that's, dude,
that's obvious what that one
means. Okay, it means that they went on
Stack Overflow or asked an LLM
about something. A bunch of code was
generated. It worked and they didn't
understand what happened. The second one
is also evidently clear. This guy's
working on CI right now. Do you have any
idea what CI stands for? It stands for
Dante's Inferno. Okay, it's just it's
awful. Okay, that's how you spell Dante.
It starts with a C. It's it's actually
insane. And so he's just praying, please
work. This is my one chance. This is my
one chance to actually have it happen.
And then this last one is also very very
obvious. Okay, some guy started fixing a
bug and then he accidentally had an idea
for a feature and then found three more
bugs along the way. And so a lot of [ __ ]
happened and you can't just like say all
that. So you're just like, yo, there's
just like just just changes. Okay, we
made changes. I don't want to talk about
it. Just a lot of stuff happened. Like
this is really good commit commit
messages. I know what happened on all of
them. Lazy instructions. Don't make a
commit if you're just going to add a
comment. But finally, I had my data, but
I still knew I needed more. And I also
wanted to try out synthetic generation
of data, which is basically you
typically take the strongest model that
you have and you ask it, hey, look at
this. Make more in this format. And my
god, it was a beautiful thing. You get
the perfect data exactly the way you
want it. It's amazing. But the problem
is, and maybe you already know this, AI
is wrong all the [ __ ] time.
[laughter]
>> Okay, it's basically like this. I tell
AI,
>> you know, also what is crazy. Can I just
can I throw something out there? Uh, in
today's world,
why just why oh why if you say AI is
wrong all the time on the internet,
people are like,
uh, hallucin hallucinations have been
largely solved. What are you talking
about? Well, dude, if you're getting No,
no one gets hall. That That was like a
2024 problem. Okay. No, no one gets
hallucinations. Like, what are you
talking about? I get hallucinations all
the time. It just makes up [ __ ]
constantly. Like, what do you what do
you mean?
[laughter]
My drawing skills have really I'm sure
glad I learned how to draw. [laughter]
>> I was just thinking like what happened
to his drawing skills?
by showing him a burger. And then AI
[music] makes what looks like a burger,
but I open the lid and he put razor
blades in there. So my harness is a
burger eater [laughter]
to check that if the real burgers are
made instead of fake burgers. I think I
explained that pretty well. You lied to
me. My synthetic approach, I know that
most people don't care. I used OSS-Instruct
from Magicoder and Evol-Instruct,
which is an amazing method.
It's basically like that cloning dancing
guy technique. You feed it a code
snippet and then you say, hey, make it
into this format and also make it do
another one, but make it more advanced. I
don't want to get this technical. It
doesn't matter. This video would take
forever. But the problem I had was I was
not getting enough of it and I don't
trust
>> also like how do you know like if you're
not a coder, real talk, if you're not a
coder, how do you know you're even
generating the right thing? Is this why
ChatGPT puts like if statements and
stupid guard checks around everything?
I don't know if you guys noticed that,
but Chat Jippity 5.3 Codex is just
like, oh, hey, is that a function that
takes in a number? Hey, I'm going to do
a quick type check. Hey, is that a
number? Are you sure you're a number? Hey,
if you're a number, are you like a whole
number? Oh, uh oh, you're a decimal
number. Hey, if you're a decimal number,
are you at least like greater than a
hundredth? It's just like, dude, what
are you doing? I just asked you to do
the absolute value. You don't we don't
need to do all this. Just just put put
the bags in the put the fries in the
bags, bro. Like, it's not that hard.
It's crazy how much if checks it does.
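The "burger eater" harness mentioned earlier can be sketched in Python roughly like this: run every generated solution against its tests and keep only the samples that pass. The sample layout (a dict with `solution` and `tests` keys) and both function names are assumptions for illustration. The video does not show the real harness, and this sketch does not sandbox the code it runs.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def passes_tests(solution: str, tests: str, timeout_s: float = 10.0) -> bool:
    """Run solution plus tests in a fresh interpreter; True only on a clean exit."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(solution + "\n\n" + tests + "\n", encoding="utf-8")
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return False  # hanging code counts as a failure
    return result.returncode == 0


def filter_samples(samples: list[dict]) -> list[dict]:
    """Keep samples whose tests are non-empty and actually pass."""
    return [
        s for s in samples
        # An empty test string would pass trivially and let garbage through.
        if s["tests"].strip() and passes_tests(s["solution"], s["tests"])
    ]


good = {
    "solution": "def absolute(x):\n    return -x if x < 0 else x",
    "tests": "assert absolute(-3) == 3\nassert absolute(2) == 2",
}
razor_blades = {
    "solution": "def absolute(x):\n    return x",
    "tests": "assert absolute(-3) == 3",
}
kept = filter_samples([good, razor_blades])
print(len(kept), "of 2 samples kept")
```

A harness is only as strict as its tests. Loosening it so that more samples pass is the kind of fix that lets more garbage data through.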
So, I validated it. I kept thinking the
problem was my test harness, so I needed
to FIX THAT. WHAT AN IDIOT.
THAT JUST MADE IT so I passed more
garbage. So when the time came for me to
finally train my model after months,
months, I was so excited. I ran the
benchmark and I had made an AI model
that finally
>> is worse.
>> That was worse. I had made it worse.
I probably should have quit then, but I
am way too stubborn for that. I just
can't. I had I don't have it in me. I
can't do it. Nice. This is when you
realize AI is kind of like magic, right?
But it also is garbage data in, garbage
data out. There was a lot of issues with
my data. When I had fixed my harness, I
had just let more garbage data pass
through. There's also all these other
issues like empty white spaces and
classic coding issues that I just wasn't
aware about. So that's why I had [music]
made it worse. But I gave it another
attempt. And this time, no more
mistakes, no more dillydallying. Lock in
again. The model is worse. [laughter]
Oh my gosh.
Oh, the brother locked in and actually
made it significantly worse.
[laughter]
Oh, that has to hurt a little bit. That
has to hurt a little bit. There's a meme
here. There's there I just feel it. I
just feel like there's a meme here.
Right here.
Something like that. There's something
out there. Okay. Okay, I don't know what
to do with it. I'm just going to let it
go out into the ether. Somebody will
come up with something way funnier, but
we're going to call it We're going to
call it even. Okay, we're going to call
it even. It was my harness yet again.
I don't know why I got so stuck on this
[ __ ] thing. I didn't even need
synthetic data. I just wanted it to
work. I just wanted to work so bad.
Finally, I had fixed the [ __ ] thing.
And the benchmark came in 16%. Sometimes
15, sometimes 14, but 16 was the
highest. The model is not going to solve
the
>> Why not just I'm c I I Is it just
because he wanted synthetic data to win
so bad? Why not just GitHub MIT that
crap? GitHub MIT that just get it. Just
get give me all the MI.
>> Honestly, it's not going to solve the
same problem every time. It it just
doesn't work like that. There is a level
of randomness to it to the benchmark
performance, but 16 was a ceiling. And
that makes sense because that was a
ceiling in the official whole format or
whatever. And I had fixed it. So it's in
the diff format, but I had not made a
[ __ ] difference. I had not made the
model smarter. And remember, I need to
beat 18% to say that I beat ChatGPT.
Otherwise this means nothing. So my plan
of attack was to add reasoning to my
data. Adding reasoning to your data is
basically making it write out some
thinking before it solves the problem.
Instead of doing two plus two in your
head, you go, "Oh, okay. So, let me
reasoning to your Why is that mirrored?
Why is he always trolling us?" I swear
every image in all of his videos,
there's just like some small amount of
trolling. This is not some sort of like
right to left language. This is just
English inverted. Look, T h e, right?
Like just right there. It's just just
messing with it's just he's just trying
to get my brain
>> up there. Some thinking before it solves
the problem. Instead of doing 2 plus 2
in your head, you go, "Oh, okay. So, let
me break this down. So, I'll have two
apples and then if I add two more apples
and then I count them all together, then
I will have four." This is really
effective for complicated issues to
break down the problem into parts and it
really can improve the performance of
the AI but when it's simple questions
and it still does it it can be very very
irritating as well. You've probably seen
this actually if you use a stronger AI
model. A lot of times you ask it
questions, it goes, "Oh, absolutely.
Let's break this down into parts. First,
we're gonna you're like, "That was a
simple yes or no question. What do you
mean?" But it's a really effective way
to make your model solve problems more
accurately. And a lot of these smaller
open source models that I train on
struggle with this because they just
haven't been trained on it enough. So
that was my plan of attack to make my
model smarter. And I found a study that
showed a 10% gain in performance. I'm like,
10%? That's way more than I need, baby.
Let's just clone this git repo and get
going. The only problem was, of course,
the ungodly level of computation
that was needed for this. And that's
when things were starting to smell
funny. I realized my room had this weird
aura to it all of a sudden. I swear it
didn't used to smell like that. So, I
decide to reboot my computer and
Lightning strikes all across it. Smoke
starts pouring out. My whole life
flashed before my eyes. I turn off the
computer, but the damage was done. One
of my GPUs [music] had died. Rest in
peace. I tested each one one by one. And
it seemed like everything else was okay.
It was just this one problem. And
looking at my purchase history, that one
GPU came from a different factory. You
have to understand what a hack job this
setup that I have is. By the way, I love
that he's like, what a hack job his setup
is. And it's it's like the single most
beautiful
home setup I have ever seen. Yeah. It's
like my huge hack job. Oh, by the way,
you just excelled beyond everything I
have ever seen anybody just do by
themselves. Like that is beautiful.
That's a beautiful case. Does a great
job.
Okay. I mean, that one's less that Okay,
that's less of a good job.
That one's not as nice. But I mean, but
this one right here, that one's really
good.
It's very nice. [laughter]
Not so nice. That's not so nice. That's
not so nice. I don't know about that
one.
>> Bifurcated. It's undervolted to death.
These GPUs run on 450 watt. I run it on
175 watt just so my house wouldn't
[ __ ] crash every time. And then these
are hacked Chinese 4090 GPUs. This whole
thing is held together by prayers.
Literally in Japan, they sell these
prayer infused it badges. Where is it?
It's in front of my computer. Official
Japanese priests.
I probably have now blessed my computer
and I was ready to give it another shot
only for the smell to appear again.
After sniffing my GPUs like a maniac, I
concluded my GPUs was not the problem
this time. And eventually I found it. I
don't think it's supposed to look like
that. Again, I was using cable that was
rated for 1,500 watts. I was running
over 2,000. Change cable. We were good
to go. You know, I was planning on uh I
I had we had to do some power work at my
house and so I actually had uh some
extra wattage brought over to this room
cuz I was like, "Okay, you know, I
should probably maybe I should think
about building my own GPU rack."
I I don't know, bro.
I don't know. I don't know if I want all
of this. This kind of It just kind of
worries me about the whole the whole
situation. I don't really know what I'm
doing. Uh I don't want to end up
smelling I don't want to end up smelling
GPUs. You don't want that smoke. I don't
know if I want that smoke. Burning down
your house is really dangerous. True.
True. It's true.
It's actually really true. Like the the
more you burn down your house, like the
more likely you are to die. Okay. Hey, I
just want to throw that out there.
That's like it's it's you can't escape
it. Kept crashing still.
>> Oh, it's still thought it crashed. It
has crashed. Oh, [ __ ] What a pile of
You'd think training would be just this
straightforward thing, but it's really
not. It's taking too long to train. I
need more computer. Okay, what am I
going to do? It's not my fault. I know
what you're thinking, Felix. Are you
building another computer?
>> No, I am just extending the one I have.
[laughter] Of course. Of course. I'm an
epic minimalist.
>> He's Dude, Felix, I sorry, Mr. Mr. uh
Mr. Pie. Uh, by the end of this video, I
swear he's going to be renting his own
place, being like, I actually had power
issues and I needed to be able to get
more power, so therefore I had to go and
rent my own facility cuz there's no way
I'm going to be able to rent this in my
in my like standard housing unit. I need
I need all the powers I can get. I feel
like that's where we're going. My
bathroom and I drilled a hole.
It's super heavy on your computer. And
every time it crashes, I have to [ __ ]
defibrillate it back. Bring it back from
a coma. It's not pleasant. It's And with
everything that had happened in the
past, extremely stressful. And I really,
really, really started doubting what the
[ __ ] I'm doing here for these 2%. I
started calling upon the DeepSeek API
because it's practically free. And
eventually I had 15,000 samples, which is
way less than I had aimed for, but these
were the top of the top, the crème de
la crème samples. So hold on, let me get this
straight. Okay. Okay guys we're we're
going to run we're going to run some
numbers here. Okay let me jump over here
really quickly. Let's open up [ __ ] GNU
image manipulation program. Uh in which
some people are offended by its name.
You can shut up if you are. Uh so with
[ __ ] let me get this straight. There's
something called OpenCode. Uh, or sorry,
not OpenCode. Sorry, sorry, sorry,
sorry. I got Dax on the mind. I don't
know what happened there. There's
OpenAI. Then there's something called
DeepSeek. Right now, what DeepSeek did is
that they went and requested a bunch of
synthetic examples from OpenAI, stole
it,
and then used that to train their model.
So now Pewds, Mr. Mr. Pie over here, all
right, is doing the exact same thing to
DeepSeek. And now I'm no mathematician,
but if a equals b and b equals c,
therefore a equals c, which in this case
means that PewDiePie is actually taking
from OpenAI.
okay
brain draining one of the great American
companies. Okay they're stealing. Okay.
Hey PewDiePie. You can't steal. Stealing
is wrong. Okay. Here's the thing. Okay,
hold on. So, if you just go on GitHub or
you take some books, that's one thing.
Okay, but if you go to a company that
has stolen the entire internet and then
proceed to steal their content, that is
unjust. They spent money on the on the
new stuff, therefore that's bad. I'm
just throwing that out there and that's
unethical. Beautifully crafted
step-by-step reasoning the world has
ever seen. I did my supervised
finetuning three epochs and I ran the
benchmark and it scored
>> 17.
Are you [ __ ] kidding me?
>> But as I mentioned, the performance is
kind of random. So I kept running the
benchmark. I had eventually given up,
but I had done one more just for the
[ __ ] You know what's so funny about
what he's saying that people aren't
putting together? How much do you want
to bet the benchmarks that OpenAI also
published? You want to guess? You want
to guess what they look like? This is
the Opus, uh, 4.6 tracker.
Yo, dog. Why does it do really good and
then the next day it goes from 60% to
51%. 60% 55%. 58 54 52 54. And this is
all in the exact same benchmark.
Nothing's even changing.
Look at the look at the variance.
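Some of that wobble is expected from sample size alone. As a rough Python sketch, suppose the benchmark has 225 exercises (an assumed count) and treat each exercise as an independent pass/fail. That is a crude model, and the helper names here are made up for the example. The three scores fed in are the 16, 15, and 14 quoted earlier in the video.

```python
import math
import statistics


def pass_rate_standard_error(pass_rate: float, n_exercises: int) -> float:
    """Binomial standard error of a pass rate measured over n_exercises."""
    return math.sqrt(pass_rate * (1.0 - pass_rate) / n_exercises)


def summarize(scores: list[float]) -> tuple[float, float, float]:
    """Return (mean, sample standard deviation, best) for repeated runs."""
    return statistics.mean(scores), statistics.stdev(scores), max(scores)


N_EXERCISES = 225  # assumed benchmark size, not taken from the video
se = pass_rate_standard_error(0.16, N_EXERCISES)
mean, spread, best = summarize([16.0, 15.0, 14.0])

print(f"standard error at 16%: about {se * 100:.1f} points")
print(f"mean {mean:.1f}, stdev {spread:.1f}, best {best:.1f}")
```

Under these assumptions a score near 16% carries an uncertainty of roughly two and a half points, so 14, 16, and 17 are hard to tell apart. Reporting only the best of many runs flatters the model, which is why the mean is the more honest number.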
And then, oh my gosh, you jump over to
Codex.
You know what's so funny is everybody
last week on Twitter was just like,
"Dude, Codex sucks."
You go here and you're like, "Oh, that's
because it's been sitting at like a 60s
pretty much average the entire time."
And then proceeds to go down to a 45%
and everybody on Twitter talks about it.
Crazy. It's actually crazy that you can
go, "Oh, yeah. I see that. Look at the
weekly trend." [ __ ] the bed. [laughter]
It's nuts that like I can actually line
this up to reality.
I know. Anyways, holy [ __ ] This one is
having like a god run. It's at 40%. What
is happening? It drops to 30. I'm like,
"Hold, please hold." It drops to 25. It
drops to 20. And it finally finishes
all the exercises in the benchmark. It
is done at 19.6%.
I have beat chat.
It felt so goddamn good. STOP. STOP.
DON'T LISTEN to this guy. The benchmark
is invalid. I did not check for
contamination before the benchmark.
Basically, if you grab data all across
the internet, there's a high chance that
you're going to grab data that might
already exist in a benchmark somewhere.
So, you have to check for contamination.
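One common way to check is n-gram overlap: drop any training sample that shares a long enough run of tokens with a benchmark problem. The Python sketch below assumes whitespace tokenization and a hypothetical `decontaminate` helper. The video does not say which method was actually used, and real pipelines usually normalize the text first.

```python
def ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """All runs of n consecutive whitespace-separated tokens in text."""
    tokens = text.split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def decontaminate(
    train_samples: list[str], benchmark_texts: list[str], n: int = 8
) -> list[str]:
    """Drop every training sample that shares an n-gram with any benchmark text."""
    benchmark_grams: set[tuple[str, ...]] = set()
    for text in benchmark_texts:
        benchmark_grams |= ngrams(text, n)
    # Samples shorter than n tokens produce no n-grams and are always kept.
    return [s for s in train_samples if ngrams(s, n).isdisjoint(benchmark_grams)]


benchmark = ["def two_fer(name='you'): return f'One for {name}, one for me.'"]
train = [
    "def add(a, b): return a + b  # unrelated helper function for adding numbers",
    "# copied\ndef two_fer(name='you'): return f'One for {name}, one for me.'",
]
clean = decontaminate(train, benchmark, n=5)
print(len(train) - len(clean), "contaminated sample(s) removed")
```

A smaller `n` catches more near-copies but also throws away innocent samples that happen to share boilerplate, so the threshold is a judgment call.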
I didn't check for contamination. I
didn't want to check for contamination,
but my stupid conscience was like, I
should probably do it.
>> I was backing up, going through my data
for the gazillionth time, and I realized
there was some contamination. It's not a
huge deal, but it's like if I just am
honest, whatever. To me, it's not good
enough. And I want to clear out my data
and I'm going to retrain again,
benchmark again. Oh my gosh, it's
painful watching this because I know for
a fact that
PewDiePie is more ethical than the guys
running those state-of-the-art models.
Like, oh my gosh, what are we doing? Oh,
why? Why is ligma man more ethical? This
is just my whole reality is breaking
before me. It's just it's hurting. Okay,
I'm in physical pain. All right.
>> And I think it will still give me the
same result cuz it's such [music] small
contamination. I'm kind of worried cuz
I'm running out of time. [laughter]
>> Have I done [ __ ] all this entire time?
Have I achieved nothing this entire
time? And the whole thing was a hocus
pocus? That's what I thought. So this
time I went all out. I trained on my
entire data set. Previously, I trained
on a small subset of what I thought was
my best data because it takes forever to
train. It takes forever. It takes days,
weeks. And since I reached the score of
19.6, I was like, uh, I'M DONE. BUT NOT.
THEN I made another discovery. I was
training on the wrong model. A major
update. I'm watching I'm watching my
video. this guy. I'm giving feedback to
my editor and I'm like, "Hold on, hold
on. What is that?
That is not the coder version."
>> YEAH, DUDE. OH MY GOSH. I NOTICED THAT.
THAT WAS EARLIER in the video and I was
just like, "Oh, weird. It's Qwen 2.
It's Qwen whatever." And I was like,
"Huh, okay. That's interesting that he's
using Qwen whatever. Not a big deal."
AND THEN HE HE'S TALKING ABOUT IT. OH, I
WISH I WOULD HAVE SET OUT. IT WOULD have
been so many IQ units. Oh, no.
>> Oh, I [laughter] the bag so hard.
>> Have I been training on the wrong
version this entire time?
>> Oh,
So maybe this has all been a blessing in
disguise, because I feel like with these
two things in mind, I should
beat ChatGPT. I should. But we
will know in a couple days. Did I beat
ChatGPT?
Well, I trained on the coding model and
the first score
4.4%.
I don't know what the [ __ ] happened. I
can probably think of five things, but I
don't care to figure it all out. It's
going to take forever. I just retrained
again and the model scored 25%, baby.
I have not just beat ChatGPT. I beat
ChatGPT twice. They're
that's really good.
That's a big improvement. Wow. Okay.
>> August shitty version as well. I was so
relieved. I was like, "Thank God."
>> You know, there is one problem though
that he's not saying, which I I I feel,
you know, I feel like especially a guy
with such uh, you know, such integrity
as him is that he's not mentioning it at
all,
which is Jippity 4o, though being worse,
will make you feel incredible. I mean,
the sycophancy of 4o was so good that
like you could ask it anything and it's
going to love you and it's going to make
you feel loved and seen and it's going
to cause an entire group of people on
Twitter that, if you say that 4o is
ridiculous, and people that are addicted
to AI and are causing their entire life
to be ruined will go after you and your
family and will have absolutely insane
mental breakdowns live on the internet.
That's how good that model is. Okay? You
can't tell me that a model's you can't
you just can't you cannot convince me
that there has ever been such a great
model as 4o. No one has created a
stronger group of sycophants than
GPT-4o. It was incredible. I was like,
"Thank God." But I made another
discovery. Okay, it's not over. A third
of the benchmark was not even running.
C++ and JavaScript just weren't being
tested properly. Oh, that's okay. Okay.
No, nobody likes those languages
anyways. Okay. Nobody even cares about
it. Honestly, Pew, I think you're I
think you're fine. I think we can all I
think I can speak for everybody when I
say that if you never touch JavaScript
or or C++, you're living a good life.
Okay. A good good life. I run it again
and the final score
36% baby.
This means I also beat Google Schmoogles
thing as well. And I think I beat I beat
4.1 or something. I don't know. ChatGPT,
it's a massacre out here. You're being
destroyed. It's embarrassing. OpenAI
stock just demolished. Just quit
already. This is what pops the AI
bubble. This [laughter]
That's actually really impressive.
Generated a bunch of synthetic data and
did that well.
That's really good.
That's like really really good. Is
Pewds about to start
accepting investments and creating the
first Japanese-based AI company with
Japanese hardware prayer tags on every
GPU?
Is that where we're going? Is this the
universe we're about to land in? I was
like, there are still issues. I'm going
to do some post training. 1,500 samples,
splinky blinky, five epochs, and I ran
the benchmark again. Pure
decontaminated, let it be known. Purer
than the fountain of youth. 39.1%
baby.
Another destroy. I think this one I beat
Google Smoogle. I Where's the stupid
benchmark? Yeah, Gemini. Gemini Pro. You
guys pay for that. I did not think this
would even ever come close. But here's
the embarrassing part about all of this.
>> That does have to hurt Google a little
bit. Okay, this guy right here very
disappointed.
He has spent so much time.
Now look at him. Now look at him.
Glasses entirely too blue.
Doesn't even know what to do with his
life anymore. He's just very very upset.
Five. But Qwen 3 is out and it scores
40%.
So unless I beat 40% this means nothing.
[laughter]
And I'm I don't I'm out of time. I don't
have [ __ ] time. I need to send this
video to my editor. Yes, there was one
more thing. A model being good at one
benchmark is stupid as [ __ ] Okay, I
need to test this model on other
benchmarks as well. I want to test it on
SWE-bench, all these other coding
benchmarks. Unless you're improving that
it see future and said what benchmark he
was going to say and pulled it out.
>> I got that future sight.
>> I don't even know. I am just out of
time. Okay, I don't have time to do it,
but I will. And if this model ever gets
to a point where I feel like it's good
enough, I would love to share it, but I
think if anything, I might just move on
to a different project in in the
background because it takes a long time.
And uh I'm kind of tired. That being
said, you've seen me fail a lot in this
video. I have become so accustomed to
failure, you have no idea. I've almost
given up on this project so many times.
There are so many times where I'm just
like, I don't know what I'm doing. This
is the stupidest thing ever. I have
graveyards full of just garbage,
debunkle, schmunkle, deformed data that
I have generated thinking this is the
best. [laughter]
I have gone through the whole alphabet
of failures. I was just so way in over
my head on this project. But I think the
number one thing I've learned, how do I
explain this? When you install Linux,
here's what happens. Linus, the creator,
becomes your godfather inevitably. And I
was watching one of a random video of
him talking and he was talking about how
he's doing this project and he was
failing. But that's okay because that's
how you learn. Uh some people think that
failure is a bad thing and I happen to
be one of those people who actually
enjoy doing
things I'm not good at
because it's how you learn.
>> Damn, Twitter could really use that
advice, dude. The modern AI crowd
that's just like
it's all about taste now. You can taste
these nuts. Um so it's just like man
this is so good. I again I cannot
believe we're living in a timeline where
like the CEO of Y Combinator feels
incredibly less wise than PewDiePie.
Like what the he what is this world
we're living in? This can't even be
real, right? Am I Am I wrong here? Like,
this it's just No, no, no, no. I refuse
it. But yet, here we are. I don't know,
man.
I love the fact that Pewds is going on
this journey and doing this because
honestly, there are so many people that
just need to be encouraged to see like
failure is actually really good.
You should be measuring your growth not
in just successes, but more importantly
in how many things you've been failing
at. Like honestly, people don't realize
like just how valuable that fail fail
fail succeed path is. It's incredible.
It's just oh my god.
>> I'm like he's speaking to me right now.
Oh my god. But I really feel like that's
the main thing I've learned from all
this because there's so much to learn
from failure. Learn from it and iterate
and keep working. I think if you have
expectations of how things should go for
yourself, you're just going to get
disappointed and you're going to want to
give up. So, expect to fail, embrace
failing. That's the message I want to
send out to you kids. Last thing, this
is a coding model and I think a lot of
people are looking at coding models like
are they going to replace and I saw
Linus say this as well and he was
basically saying coding models if
anything it will just bring in more
people interested in learning how to
code again. That was not Linus Tech
Tips. Very, very good message. Absolutely
loved it. He said that the, you know,
the genie's out of the bottle.
Pandora's opened up the box. They're
here to stay. But does he think No. He
just thinks it's going to end up
[clears throat] aligning closer to a
tool.
Good take. Good take, Linus. Good take,
Linus. Like, I would never have learned
wanting to learn how to code if it
wasn't for AI coming into the picture.
So, on the final note, learn something
new. Check out boot.dev. I highly
recommend it. It's a great course. Pick
out whatever you want. Challenge
yourself into something difficult that
may be above what you think you're
capable of. That's it.
>> Now, scram.
>> I'm just kidding, bro.
This month, we're traveling a lot. And
what always happens whenever we travel?
The inevitable connecting to public
Wi-Fi. Free Wi-Fi is a trap. Get the
reference. If you connect to someone
else's Wi-Fi, you might as well just
give up your credit card and banking
[music] information and all your
information they ever have. You should
always use a VPN. NordVPN. Say it with
me. Nordmeos
board connect always. I'm even connected
right now.
>> Now that we're going that direction,
>> I made a little module for me.
>> I wasn't ready.
>> Cuz whatever I do online, if I want to
download is my goddamn business. It's my
[clears throat]
It's my goddamn right. So, thank you
NordVPN for making it possible to free
yourself, protect yourself online. If
you go to nordvpn.com/piepie, you get a
huge deal on a 2-year plan, plus bonus
extra nordvpn.com
online privacy. Seriously, this is the
best deal for NordVPN you're going to
find. So, take advantage. Thank you,
Nord, for sponsoring this video. That's
nordvpn.com/pie.
>> Very lovely. Hey, that was really good.
again. I can't believe how good, dude.
He's just so good. It's just so good.
Absolutely good. The name
is ThePrimeagen.
The video details the arduous process of training an AI model, focusing on the challenges and unexpected hurdles encountered. The creator initially aimed to outperform ChatGPT-4 with their own model, but faced numerous setbacks including nearly burning down their house. Key challenges involved data acquisition, cleaning, and formatting, with significant time spent on iterating and refining the training data. The creator also highlighted the surprising performance of Chinese AI research and the ethical considerations of data sourcing. Despite numerous failures and a steep learning curve, the project ultimately led to valuable lessons about embracing failure as a learning opportunity and the importance of persistence. The video also touches upon the role of AI in coding education and the evolving landscape of AI development, with sponsored segments promoting learning platforms and VPN services.