Why Your AI Gets Dumber After 10 Minutes

Transcript

0:00

Your AI is literally suffocating right

0:03

now. I'm not even joking. Let me show you

0:07

something

0:08

that's actually going to change how you

0:10

use Claude Code and Cursor. If you've been

0:12

using

0:12

ChatGPT as well, any AI system can feel

0:15

like it's getting dumber after just

0:18

10 minutes

0:19

of coding. And what I've basically done

0:20

is created this nice little visualizer to

0:22

kind

0:23

of show us how this stuff works. All you

0:26

do is just run a simple query and you have

0:28

like a 200k token context window.

0:45

The agent is going to start to create more

0:50

and then it does some tool calling as

0:51

well and then comes back and actually does

0:53

some more thinking to make sure it's the

0:55

right thing and then produces some code

0:56

output.

0:57

And so as it does code output, depending

0:59

on whatever type of code you're doing,

1:01

let's just say it puts a bunch of tokens

1:02

here.

1:03

Then you say, oh, can you go ahead and do

1:05

this type of thing here? It's like, yeah, go

1:06

ahead. Cool.

1:07

And can you also grab these other files

1:09

and do these other things?

1:10

So now that it's grabbing files from your

1:12

code base, that starts to fill up.

1:14

and so on.

1:15

And then it starts to fill up.

1:16

And so the agent does another 8,000

1:18

tokens worth of thinking, does a couple of

1:20

tool calls

1:20

and stuff and does some more code output.

1:23

And then it starts to output the code or

1:25

output the changes.

1:26

Now this token window starts to actually

1:28

fill up pretty fast.

1:29

And we're basically at 96,000 tokens out

1:32

of 200.

1:33

And you'll probably see a little thing

1:35

that says, yeah, you're using 48%.

1:36

You're like, okay, what if I just want to do

1:38

another quick request?

1:39

So then you start to grab some more stuff

1:42

from your database.

1:43

You start to now connect all the front

1:45

end and the back end.

1:46

Your agent does some thinking.

1:47

You say, I want it to think ultra hard.

1:49

So then you basically pass in the most

1:51

amount of tokens, like this.

1:53

And that's just one of three different levels, you

1:54

know — if you're thinking ultra hard,

1:56

you're going

1:56

to use 32,000 tokens.

1:59

So that's 8, 16, 24, and then one more

2:02

like this.

2:03

And then it does some tool calls to grab

2:05

some additional things.

2:06

And then it's going to try to do the code

2:08

output and boom, just that, you know,

2:10

with a couple more requests, now

2:12

you're basically hitting almost 98% right

2:14

at the very end.

2:15

And as you can kind of see, this type of

2:17

workflow starts to add up super fast, and

2:20

we've only done a couple of turns.
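The token arithmetic being walked through here can be sketched in a few lines. All of the per-step costs below are the illustrative ballpark numbers from the talk (an 8k-token thinking pass, a 200k window), not measurements of any real model.

```python
# Illustrative only: per-step token costs are the ballpark numbers from the
# talk, not measurements of any real model or tool.
CONTEXT_WINDOW = 200_000

def percent_used(tokens_used: int, window: int = CONTEXT_WINDOW) -> float:
    """Fraction of the context window consumed, as a percentage."""
    return 100 * tokens_used / window

# One "turn" like the one described: thinking + tool calls + files pulled
# into context + code output.
turn = {
    "thinking": 8_000,
    "tool_calls": 4_000,
    "files_read": 20_000,
    "code_output": 16_000,
}

used = 0
for n in range(1, 5):
    used += sum(turn.values())
    print(f"turn {n}: {used:,} tokens ({percent_used(used):.0f}%)")
```

With these numbers, two turns put you at the 96,000-token / 48% mark from the demo, and two more turns land near the ceiling — which is why "a couple of turns" is all it takes.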

2:22

And this is the point sometimes where

2:24

maybe Claude Code or Claude Sonnet or

2:26

maybe even Cursor itself will start to

2:28

kind of say,

2:29

Hey, we're going to roll the context

2:31

window. You're like, dude, I just started the

2:32

conversation. Why is the AI getting

2:34

dumber?

2:35

And maybe it's not that the AI is

2:36

getting dumber but I think it's more that

2:38

the

2:38

context windows are kind of limited here

2:41

and that a lot of these apps in the

2:43

background are trying

2:44

to do different things for you to try to

2:46

maybe not take up so much space and on top

2:48

of that

2:49

there's this interesting study that I

2:51

want to go ahead and show you. This was

2:53

produced by ChromaDB,

2:54

and in this study — this is a

2:56

really lengthy report and I definitely

2:58

would

2:58

recommend you check it out — they did

3:00

this comparison called the needle in a

3:02

haystack

3:02

and so on.

3:03

One of these things is you can have these

3:06

large context windows from Google

3:08

or these other companies, millions of

3:10

tokens.

3:10

You're like, okay, if I throw everything in

3:13

there, it should be okay.

3:14

The thing that I learned about this and

3:16

what I really want to share with you that

3:17

I don't

3:18

hear many people talking about other than

3:20

in these really deep reports is that when

3:22

you have a single question and you have a

3:24

single thing for it to search for, that

3:27

accuracy

3:27

is extremely high.

3:29

And so what that means for you and your

3:32

codebase is that maybe if you have one

3:34

specific

3:34

narrow task and you have this giant

3:37

codebase, maybe just giving it that one

3:40

thing is going

3:41

to make it much more accurate.

3:42

And that thesis was actually borne out by

3:45

their research in this paper here.

3:47

And so basically

3:49

there's this similarity bias, and as this

3:52

type

3:53

of thing kind of plays out, it starts to

3:54

get a little more distracted when you

3:56

start to

3:57

give it more tasks.

3:58

This is kind of where we're going to

4:01

approach this thing called subagents and

4:03

kind of why this is such a big movement in

4:05

this realm.

4:06

It's not only about the context window,

4:08

but it's also about accuracy and achieving

4:10

the nice things that you need to do in

4:13

your code.

4:13

So we can kind of unpack this.

4:15

This is going to be a little bit more

4:17

theory type of things, but I kind of want

4:19

to do this a little bit more with visuals

4:21

because there's a lot to unpack here.

4:23

and I feel like, as a live stream, I can answer

4:25

some of your questions back and forth

4:27

maybe some of the

4:27

stuff you've been running into and I'm

4:29

hoping that this can sort of influence and

4:31

change a little bit

4:32

of your behavior for how you prompt the

4:34

models and then how you can get the most

4:36

out of them so that

4:37

that way you could just be really buzzing

4:39

and kind of going. I feel like sometimes, for a lot

4:42

of people, it's

4:42

really easy to use an off-the-shelf

4:44

framework or an off-the-shelf MCP that

4:47

does all these tool calls for you.

4:48

So, what you'll actually end up

4:50

discovering is that maybe some of this

4:52

type of stuff is abstracting it a little

4:54

too much for you, and maybe it's kind of

4:56

going off the deep end like a big old

4:58

machine gun and just spraying and praying,

5:01

as the old gamers used to say.

5:03

So, in this Haystack question, obviously,

5:05

this is kind of what they're saying is, if

5:08

there's a bunch of irrelevant context —

5:11

that's the haystack — and this is the needle.

5:12

and whenever you have something called a

5:13

distractor, something that sounds similar,

5:15

this is kind of where it starts to kind of

5:17

get thrown off a little bit.

5:18

So that's sometimes maybe some of your

5:20

queries, you're not really clear at the

5:22

very beginning about what you want.

5:23

And then if you kind of give two similar

5:25

things, that's kind of where it starts to

5:27

say, maybe he wants this one and maybe he

5:29

also wants this.

5:30

And if you're confused, the AI

5:32

will also just be as confused for the task

5:35

when you hand that off.

5:36

So they say here, the best writing advice

5:38

I got from my college classmate was to

5:40

write every week.

5:40

and then here it says, I think, the

5:43

best writing tip I received from my

5:45

college professor was to write every day.

5:47

See this is the needle and this is

5:49

the distractor. All in the same paragraph

5:51

when you ask, you know, what's

5:53

the advice I got from my college

5:54

professor it's gonna see these two

5:56

things. So it's gonna be like, oh, I've seen

5:58

that in the training set I see

6:00

these two things, and you're like, wait, which

6:01

one should I give back as the

6:02

answer because the college professor

6:03

they're asking about the college

6:04

professor. I'm like, well, they said I should

6:07

write every week but also the

6:09

and so on.

6:10

So, you know, that's the best writing tip

6:12

that I should write every day.

6:13

And so that's, you know, the type of

6:15

thing we call a hallucination that can happen.
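The writing-tip example can be made concrete with a toy scorer: rank candidate sentences against a question by word overlap. This is a stand-in for whatever retrieval or attention a real model does, not the Chroma methodology — the point is just that a distractor lands far closer to the needle than genuinely irrelevant text does.

```python
# Toy illustration of the "distractor" effect: score candidates against a
# question by shared words. Not how an LLM works internally — just a sketch
# of why similar-sounding text is harder to separate than irrelevant text.
def overlap(question: str, candidate: str) -> int:
    q = set(question.lower().split())
    return len(q & set(candidate.lower().split()))

question = "what writing tip did you receive from your college professor"
needle = "the best writing tip I received from my college professor was to write every day"
distractor = "the best writing advice I got from my college classmate was to write every week"
irrelevant = "the stock market closed slightly higher on tuesday"

for name, text in [("needle", needle), ("distractor", distractor), ("irrelevant", irrelevant)]:
    print(name, overlap(question, text))
# The distractor scores much closer to the needle than the irrelevant line,
# so a small scoring error is enough to pick the wrong answer.
```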

6:17

And this is kind of why it's a little bit

6:19

more important.

6:20

And this is kind of why I want to share

6:21

this type of information is because not

6:22

only did

6:23

they do the study, they actually went

6:25

super ham and they're like, how many

6:26

distractors

6:27

can we have?

6:28

And what's the actual accuracy pool,

6:29

right?

6:30

They did this whole study and it's just

6:32

blown my mind right now.

6:33

So check this out.

6:34

If you have basically what they call four

6:37

distractors, the input token length,

6:40

when you start to kind of add it in, look

6:43

how fast it degrades as far as accuracy.

6:45

It goes crazy if you have four or more

6:48

distractors.

6:48

And so think about this in your own code

6:51

base, think of them as distractors, right?

6:53

Whenever you have code that's repeated on

6:55

all these different components and you

6:57

say,

6:58

go and update this type of thing, and

7:00

you just YOLO it — it's going to

7:02

update

7:02

and so on.

7:03

And then you have some type of layout,

7:09

you know, you're supposed to kind of sort

7:15

of have

7:16

some hierarchy and some components you

7:18

reuse.

7:19

But it's like, oh, I'm going to make another

7:20

progress bar over here in the admin panel

7:22

because a user requested it.

7:24

And it's like, well, I have a whole directory

7:26

of components that I use from ShadCN.

7:29

You should just instantiate one of those and

7:31

make it there.

7:31

The AI will be really happy to fulfill your

7:33

request and just starts to actually code

7:35

up the button

7:36

and everything that it wants to do in

7:37

that separate area for the admin panel. So

7:39

now you

7:40

actually have the user side and you have

7:42

the admin side that have literally a copy

7:44

of each other's

7:45

code. And this is what I call a

7:47

distractor. And this is how crazy it is

7:49

and how important it is.

7:50

And Chroma is actually showing that

7:53

the accuracy stays really high if you just

7:55

have one specific thing you're trying to

7:57

achieve — and then you add two, and then you add

8:00

four. And then

8:00

Think about what people are saying for

8:02

AI. This is kind of why people coined that

8:05

term AI slop,

8:05

is because as an engineer, you'll kind of

8:08

start to see these patterns develop.

8:09

You're like,

8:10

well, the AI isn't really good because it

8:12

just keeps rewriting code. And I feel like most

8:14

of

8:14

the time, what I'm doing as an engineer is

8:16

I'm writing more rules to guide the model

8:18

right now.

8:19

There is some good news out of this.

8:21

Every single year, these models have been

8:23

getting more and more

8:24

intelligent, so you don't have to guide

8:26

them as much, with as many rules as I

8:28

previously did

8:29

and so on.

8:30

So, models are probably going to

8:32

get smarter and the tooling is going to

8:35

get a

8:35

lot better to support these types of

8:37

things so that you don't have to worry

8:39

about

8:39

this too much, but as of right now, the

8:41

tooling that we have that I use a lot is

8:43

Claude Code

8:44

and subagents is one of those types of

8:46

keys.

8:46

So I want to show you what that looks like in

8:48

my context visualizer because I think

8:50

that's

8:50

going to be the key here for kicking off

8:52

subagents and trying to basically narrow

8:55

that scope

8:55

and so on.

8:56

So, you can have very defined things that

8:59

you do in your codebase, generating code,

9:02

generating UI components, working with

9:04

your backend, understanding how those

9:05

types of

9:06

things flow together.

9:07

So when I hit reset and I go down, when I

9:10

basically do a code generation task, you

9:12

can

9:12

kind of see how many tokens you take up,

9:14

right?

9:14

44%.

9:16

But then now when you spawn a subagent

9:18

down below, you'll actually see this type

9:19

of thing

9:20

start to take place.

9:21

And so you say, I'm going to go ahead and

9:23

add a request.

9:24

Now that subagent starts to take on that

9:26

request.

9:26

And then if there's another concern that

9:28

that request takes on, you just do add

9:30

request

9:30

as well.

9:31

And you can kind of see how the subagents

9:33

really start to unlock this whole type of

9:34

thing.

9:35

So you're basically getting this main

9:37

request up here, and then you're handing

9:39

off these

9:40

individual components, right?
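The hand-off being described can be sketched as simple token accounting: each subagent burns its tokens in its own context window, and only a short result summary comes back to the parent. Every number below (task sizes, a 1k-token summary) is made up for illustration — it's the shape of the accounting that matters, not the values.

```python
# Sketch of why subagents keep the parent window small: each subagent spends
# tokens in ITS OWN context, and the parent only pays for the summary that
# comes back. All numbers are invented for illustration.
CONTEXT_WINDOW = 200_000

def run_subagent(task_tokens: int, summary_tokens: int = 1_000) -> int:
    """The subagent consumes task_tokens in its own window; the parent
    context only grows by the returned summary."""
    assert task_tokens <= CONTEXT_WINDOW
    return summary_tokens

parent_used = 10_000  # the main request and plan

# Hypothetical hand-offs, echoing the ones mentioned in the stream:
tasks = {"design-system": 60_000, "convex-backend": 80_000, "ui-components": 40_000}

# Without subagents, the parent absorbs all of that work directly:
inline_total = parent_used + sum(tasks.values())

# With subagents, the parent only collects the summaries:
with_subagents = parent_used + sum(run_subagent(t) for t in tasks.values())

print(f"inline:    {inline_total:,} tokens ({100 * inline_total / CONTEXT_WINDOW:.1f}%)")
print(f"subagents: {with_subagents:,} tokens ({100 * with_subagents / CONTEXT_WINDOW:.1f}%)")
```

Same work either way — the difference is whether 180k tokens of reading and thinking live in the main conversation or in three disposable side contexts.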

9:41

So in this case, in my situation, I

9:44

started to make one specifically around a

9:47

design system.

9:48

And then I'm going to now try to make a

9:50

subagent for my Convex database, because

9:52

the Convex

9:52

and so on.

9:53

So, I want to make sure that when I do

9:56

make a feature request that it does pick

9:58

these

9:58

types of rules up so it can actually learn

10:01

how to interact with my codebase in

10:03

smaller

10:03

segments so that I can keep that context

10:06

window really tightly focused and that

10:08

just ensures

10:09

higher accuracy.

10:10

Because I'm even noticing that some of

10:12

the code generation, even though I have

10:13

these

10:14

rules files, it may not pick them up and

10:16

it may not pick them up into those

10:17

subagents.

10:18

So, I want to kind of share this as a

10:20

primer for y'all because I feel like this type

10:22

of

10:22

and so on.

10:23

So, you know, that visualization hasn't

10:24

really been talked about too much and it's

10:25

still

10:26

very confusing because I see lots of

10:27

graphs and things.

10:28

But the best thing is literally just this

10:30

type of thing.

10:30

You just do a tool call, the agent does some

10:32

thinking, you do some more input and then

10:34

the code,

10:35

you know, comes out and so forth.

10:36

So this stuff adds up pretty fast, as you

10:38

can kind of see just from a 200k token

10:41

context window.

10:42

And that's why it's important to start

10:43

new chats.

10:44

When you start a new chat and you want to

10:46

do new code generation task, you know,

10:47

this

10:48

is the type of thing you'll do.

10:49

and so forth.

10:54

And that type of stuff happens.

10:56

Now with these thinking models, I'll show

10:57

you real quick.

10:58

After the user

11:00

input, they'll basically do some thinking.

11:03

You may generate some code and the agent

11:05

will come back to do some more thinking to

11:06

understand what it just did.

11:08

And that's sort of the power of Claude

11:10

Code and this is the power of the new

11:11

Sonnet models with the additional thinking

11:13

and the tool calling.

11:14

and this is also very transparent in

11:16

Cursor as well.

11:17

So let me just go ahead and catch up with

11:19

the comments

11:20

because I know a lot of people have been

11:21

trying to catch up here

11:22

and wondering how everyone's doing here.

11:24

So, alright, so let's see what's going

11:28

on.

11:28

I need to dial in my subagents more

11:31

around their tool calling access.

11:33

I've given them a set to inherit all AI

11:35

tools

11:36

even though I only have two MCPs enabled.

11:38

Eating tokens.

11:39

Yeah, that's what's up, man.

11:41

Right, guess what? After some stress and

11:43

work, I need to recharge seeing your fire

11:45

cooking apps and producing some sharing

11:47

contents every day.

11:48

Thank you so much, Alan, for becoming a

11:50

member and also kind of catching up here.

11:51

What if we were to tell the AI to

11:53

reevaluate its solution? Does it still

11:55

hallucinate?

11:56

Yes it can still hallucinate especially

11:59

if you have all of this conversation where

12:02

it can be thrown off. So in this context,

12:05

the conversation is remember that needle

12:09

in the haystack stuff

12:10

So if you're continuing the same

12:12

conversation, if you take some of that

12:15

context and pass

12:16

it on into a new chat, or if you're just

12:18

continuing on, you're going to keep adding

12:21

these distractors.

12:23

And these distractors in your chat, if

12:24

you're telling it to fix itself, are still

12:26

going

12:26

to be in there.

12:27

And so, basically, for every

12:30

single turn in the conversation — so that's

12:33

every single

12:33

"hey, go fix it," "okay, just change this

12:36

one, not this one" — so you've seen that

12:39

vibe coding meme this is basically what's

12:41

happening is you're creating more

12:43

opportunities

12:43

for the accuracy to continue to go down

12:46

so the further you get in your chat the

12:48

more

12:48

it keeps actually going down in accuracy

12:50

and so at that point you just got to

12:52

create a

12:52

new chat and pretty much just focus again

12:54

and try to have it rebuild its context

12:56

because

12:57

At this point, you're basically just

12:59

going to be

13:01

doing

13:01

some complex debugging task, right? You're

13:03

doing this type of thing and you're just:

13:04

okay, user

13:05

input this, the agent does thinking, it

13:07

does some tool calling, some more code

13:09

output. You're

13:09

like, no, try again, think harder. And it's like, okay,

13:12

at this point, bro — anything that you've

13:14

done

13:15

before, you've just introduced massive rot,

13:17

and that's basically where you're at.

13:20

You're just

13:21

at crazy, crazy amounts of rot. But it's

13:23

important to know why maybe

13:26

that's happening. And then at every

13:28

turn of the conversation, what I call a

13:30

distractor is going to keep throwing the

13:32

language model off until you basically

13:34

start a new window.

13:35

In Claude Code you can hit escape twice.

13:37

So hitting escape twice in Claude Code

13:39

will basically let you go back to that

13:42

conversation and truncate things out.

13:44

But I would just sometimes recommend just

13:46

starting a new window. Right now Claude

13:47

Code doesn't have anything visual that

13:49

shows you how much of the context

13:51

window you're taking up, which I feel is

13:53

super important.

13:54

Cursor has recently just introduced that

13:56

feature into their agent, and I

13:58

don't think it's live for everyone yet,

14:00

but you can kind of see down below — it

14:02

shows context usage at 83%, so that is

14:05

super duper valuable because once you

14:08

start passing that mark. I'll show you

14:11

really quick in Chroma's paper here —

14:13

you know, this is performance

14:15

over three models with

14:16

over 500k tokens. So those 500k

14:19

tokens — once you start to pass that

14:21

and so on.

14:22

But the accuracy rate, once you start to

14:24

go up in the percentage, right?

14:25

50% is your golden zone, really.

14:29

But when you have more distractors, that

14:31

goes down really, really fast.

14:32

So you're talking about 30%, right?

14:34

So if you have one dedicated thing that

14:36

you're saying you're going to do — you're like,

14:39

in this admin panel, I want you to do

14:41

this specific thing on this component, on

14:43

this,

14:44

and just add a for loop or something,

14:46

it's going to be really, really good

14:48

because it's

14:49

just that one thing, right?

14:51

and so on.

14:52

So, that's just what it is.

14:53

But once you start to continue the

14:54

conversation — it hasn't been going well,

14:55

"fix yourself and

14:56

try to reimagine and rethink everything

14:59

that you possibly can so that you can be

15:02

the best

15:02

version of you, AI" — then that's not

15:05

going to help.

15:06

That's actually not going to help it at

15:08

all.

15:09

It's been really good to back up some of

15:10

the vibes that we've been feeling with

15:12

actual

15:13

research to better understand what this

15:15

whole vibe shift is really about, because

15:17

I feel like

15:17

that's what's been happening now.

15:19

Yo, how's it going, Vlad? Thank you so

15:21

much for joining on the stream. I've

15:23

started playing

15:24

around with Gemini CLI and not going to

15:26

lie, I'm impressed how good it is at

15:27

reinforcing

15:28

or refactoring. Oh, I'll have to try that

15:30

out again. I'm using Google Code Assist

15:31

plan

15:32

at 20 something dollars a month. Yeah,

15:33

I'm going to have to check that out again

15:35

because

15:35

I do want to kind of compare some of the

15:38

notes as well. How does it compare to

15:40

Claude CLI?

15:41

Yeah, I'm just curious as well. I took

15:43

the Cursor Ultra package, but in 20 days

15:46

limit

15:46

I was not even using Opus. Wow, lots of

15:49

agentic coding probably. I'm just really

15:53

curious how that works.

15:54

So are you able to select other models?

15:57

Would love to know more about that

15:58

workflow and kind of what's going on.

16:00

Yeah. OK, that's interesting. Well,

16:02

there's a lot kind of going on here.

16:04

Have you considered developing any local

16:06

Mac apps too, or are you mostly focused on

16:09

developing on the web for mobile apps?

16:10

Mobile apps. So right now I'm thinking

16:13

about doing a Mac app and a Swift app, but

16:15

at the same time,

16:16

it's gonna be more on demand because

16:18

right now with my app RayTranscribes, for

16:21

those who

16:22

aren't familiar, this is basically my app

16:24

right now. So my app is basically to just

16:27

transcribe

16:27

a lot of my videos and I have a lot of

16:30

videos and a lot of live streams and stuff

16:32

so they can be

16:33

hours and hours long. I found a hack.

16:36

It's basically you can get discovered

16:38

literally

16:39

just by uploading your transcripts and

16:41

putting them into YouTube.

16:42

Because when YouTube processes them, the

16:45

keywords that you say

16:47

will get picked up when people search for

16:49

them. So if I do Claude Limits,

16:53

I bet you my video shows up from

16:55

yesterday or something that. Yep, see?

16:58

There it is.

16:59

So it's the second video, right? And I

17:01

could even do a private mode or something.

17:02

and then you'll see Claude Limits shows

17:06

up it's the second video and so in this

17:10

type of

17:11

thing, my video is only 21 hours old

17:14

and it's at some of the top results, and

17:17

because of you know having those

17:18

transcripts in there it's super important

17:20

but I also have

17:20

timestamps in there as well so that's

17:22

super nice because Google likes to chunk

17:24

out the

17:25

timestamps to help with their AI

17:26

overviews. Basically, you're helping

17:28

their rankings and so

17:29

So any type of reinforcement that you do,

17:31

you basically start to get ranked higher.

17:33

It's sort of a secret sauce.

17:34

I need to write an actual blog post about

17:36

it.

17:37

But my Transcriber app basically does

17:39

this type of thing.

17:40

And right now, this is extremely

17:42

efficient.

17:42

You can have your own custom dictionary.

17:44

So I have these different words that need

17:47

to be in there so that Claude Code, o3 are

17:49

actually in the transcripts specifically

17:51

for these types of keyword things.
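The custom-dictionary idea amounts to a post-processing pass over the transcript: rewrite the ways a speech model typically mangles product names. The mappings below are hypothetical examples I made up for illustration — they are not RayTranscribes' actual dictionary or implementation.

```python
import re

# Sketch of a transcript "custom dictionary": after transcription, rewrite
# common mis-hearings of product names. These mappings are hypothetical
# examples, not the app's real dictionary.
DICTIONARY = {
    r"\bclod code\b": "Claude Code",
    r"\bo three\b": "o3",
    r"\bshad cn\b": "ShadCN",
}

def apply_dictionary(transcript: str) -> str:
    """Apply each pattern -> replacement pair, case-insensitively."""
    for pattern, replacement in DICTIONARY.items():
        transcript = re.sub(pattern, replacement, transcript, flags=re.IGNORECASE)
    return transcript

print(apply_dictionary("i was using clod code and o three today"))
# -> i was using Claude Code and o3 today
```

Word boundaries (`\b`) keep the rewrite from firing inside longer words, which matters once the dictionary grows past a handful of entries.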

17:53

And so, yeah, when it's processing the

17:55

audio, it's doing its stuff here.

17:56

So this is a multi-hour stream.

17:58

I think it's a four hour, three hour live

18:00

stream.

18:01

So it does this stuff pretty fast.

18:03

And you can see all these transcripts

18:05

that you're able to kind of do the

18:05

things.

18:06

So I've done hours and hours and hours of

18:09

stuff.

18:09

And this is basically my little app here.

18:12

And I think if I do want to do this so

18:16

that,

18:18

I'm almost done with my refactor, by the

18:20

way.

18:20

So this started out as a bunch of slop

18:22

and this is kind of why I started to get

18:24

around this context rot is because

18:26

when I'm done with the refactor,

18:27

I can now make the iOS app and because

18:29

I'm using a database by the name of

18:32

Convex, for those who

18:33

don't know you may hear this a lot but I

18:35

just I'm a huge advocate of this company

18:36

because they're

18:37

just amazing. They have really great

18:39

database software. So one of the things

18:41

that they have

18:42

here is: fix the bug and set completed to

18:44

true. Okay, so it's a TypeScript database,

18:46

so

18:47

you literally you're just basically

18:48

playing around and this is the actual

18:50

dashboard for the database.

18:51

So as you make changes here it's

18:53

basically happening in real time on the

18:55

back end. So

18:57

As you add these different things, it's

18:59

all happening. So someone on their iPhone

19:01

can be

19:02

uploading, doing a transcript. Someone on

19:03

the web can be doing that. Someone on an

19:05

Android phone can

19:06

be doing that. And so this is kind of why

19:08

I wanted to make sure that everything was

19:10

kind of settled

19:10

on my backend side. And so when I build a

19:13

Swift app, I could just hit this database.

19:16

I have a

19:16

clerk for my authentication. Clerk has an

19:18

SDK on the iOS side. Everything would just

19:21

be a smooth

19:21

experience and it would just be super

19:23

duper fast to hit my endpoint, send all

19:26

that stuff up,

19:27

and if they're on the web, because I know

19:29

a lot of people who do content creators,

19:31

they have stuff on the phone, right? They

19:33

have a short they just recorded. They want

19:35

to

19:35

be able to do it from their iOS app and

19:37

then they just want to be on the web and

19:38

just copy and paste

19:39

that transcript and put it into YouTube

19:40

or something like that, or whatever other

19:42

workflow

19:42

they do and they can do it because of the

19:44

way I'm setting everything up. So yeah,

19:47

long story short,

19:47

that's kind of the way I was going with

19:50

this. I have yet to try the latest

19:53

versions of Swift that

19:55

and so on.

19:56

So, I'll be spending some time probably

19:58

in the next couple of months studying

20:01

that, figuring

20:02

out what's going on with all the latest

20:03

APIs and animations, because the dust tends to

20:05

settle

20:06

on that right before the fall.

20:09

So yeah, that's kind of why I've been

20:10

waiting a little bit.

20:11

But yeah, I don't know.

20:12

I was thinking about doing some iOS

20:13

stuff, but I've been kind of trying to

20:15

stay away

20:15

from it for now.

20:16

I don't want to get in trouble.

20:18

What's your marketing strategy for

20:21

RayTranscribes?

20:22

I think right now is just literally just

20:24

trying to code with it and then tell

20:26

people about it.

20:27

I want to reach out to more content

20:29

creators to try the app. And if you want

20:31

to try the app right

20:33

now, you can actually try it with the

20:36

code RayCooks. So if you put in the code

20:39

RayCooks,

20:40

what that will enable you to do is when

20:42

you go here to Ray Transcribes in my app,

20:45

let me just put this up again as well. So

20:48

if you start any of the plans today, go

20:50

start today,

20:52

that'll give you basically $15 off. So

20:54

you can get into the beta for $5 basically

20:56

for the first month or just get into the

20:59

starter plan for free for the first month

21:02

and you can start cooking.

21:03

And I give a very generous amount of

21:05

minutes. I mean, there's 6,000 minutes for

21:08

the beta and 3,000, which is way more than

21:10

enough for anyone.

21:12

And then, yes, obviously you have all the

21:14

cool stuff that you have here and so

21:16

forth. So yeah. Oh my gosh, check it out.

21:19

Dennis just became a member.

21:20

Thank you so much, Dennis, for becoming a

21:21

member.

21:23

If you want to become a member, you can

21:25

secretly slide in through my Discord.

21:27

I still have a member post up.

21:29

And right now what's happening is

21:31

basically people are able to slide into my

21:33

Discord for $6.99 right now.

21:35

It's kind of a secret perk, but it's in

21:37

my member post.

21:38

If you go to my member post, there's some

21:39

instructions there that you can send to

21:40

get into the Discord.

21:42

There's now over 100-plus people in my

21:44

Discord because I finally reached my first

21:46

100 members.

21:47

and so my promise then was to double the

21:49

price so that you know we have this

21:51

exclusivity for folks

21:52

who are early. So if you're watching this

21:54

right now it's still early so this is your

21:56

time to sign

21:56

up. Once the price doubles, I'm

21:59

basically going to use Polar.sh,

22:02

and so Polar

22:03

has a nice Discord integration that you

22:06

can basically hook up and it's super easy

22:08

to code

22:09

and a lot of the community stuff will be

22:11

built using Polar and these

22:12

types of

22:13

things and I'll be showing you this stuff

22:14

along the way which would be really really

22:15

cool.

22:16

So yeah, that's kind of where that's at.

22:19

Ray Transcribe is built with Next.js, or

22:21

do you use another?

22:22

No, Next.js.

22:23

So I built my app, Ray Transcribes, with

22:25

Next.js.

22:26

And yeah, it's basically using a route

22:28

handler.

22:29

I use the API from there.

22:31

And I have some client-side work that I

22:33

do.

22:34

So I do some processing on the client

22:36

side.

22:37

And I'm transitioning that to Vercel so

22:39

that everything is going to be handled

22:42

through Convex.

22:43

So a lot of the pre-processing stuff will

22:45

be kind of minimal now on the client side

22:47

and it's going to move a lot of it to the

22:50

back end side.

22:51

So a lot more updates on that. But yeah,

22:53

Next.js as far as the thing, ShadCN for

22:55

some of the frameworks.

22:56

I think I have, let's see, Tailwind CSS

22:59

v4 so that I have this cool tokenizing

23:02

system.

23:03

I'm using Clerk for authentication. The

23:05

back end database is Convex.

23:07

By the way, when I launched the app, I

23:09

had no database.

23:11

It was literally a v0 slap-on — like, hey, I

23:14

want to get some paid users. I sent Stripe

23:17

links.

23:17

People signed up with the Stripe links. I

23:19

went into Clerk and I just added their

23:20

email

23:22

as part of their getting started. I was

23:24

doing it all by hand. And it wasn't until

23:26

I was on

23:27

vacation in Croatia and I lost three

23:29

users who paid. Three users paid for my

23:31

app and I couldn't

23:32

and so on.

23:33

I only got to them a couple of

23:35

hours later because I was remote and in a

23:36

different time

23:37

zone and I was like, man, I didn't want to

23:40

set up the Stripe database, all that

23:43

stuff.

23:44

But I'm glad I did it actually because

23:45

now it's much more scalable and I can

23:47

actually

23:47

do more things, which I plan to do

23:49

anyways, but it's starting to actually get

23:52

traction,

23:52

which is cool.

23:53

So yeah, I'm going to reach out to more

23:54

content creators and people who are like, if

23:56

you do

23:56

You get a lot of great minutes for the

23:59

cost, and I kind of wanted to pass it on

24:03

as well.

24:03

Hi from France, you made me discover Convex,

24:06

thank you. I use it for an upcoming

24:08

personal project, and I think you get, I

24:10

don't know, 20 free projects. Something

24:13

ridiculously cool for free. So it's great.

24:15

I've been using your transcribed

24:16

platform, Ray. Loving it. How is the

24:18

architecture behind it? Using Groq for

24:20

Faster Whisper?

24:20

Yes, Groq is one piece of the story. So

24:22

part of the reason why I was using Groq is

24:25

because it's just super duper fast.

24:27

Two is I was actually setting up this

24:29

cool thing and I prototyped it and that's

24:32

kind of how Ray Transcribes got started.

24:34

I wanted it to be even faster than any

24:37

other transcriber that's out there, and I

24:39

kind of broke the record because I set up

24:42

a web socket, and I would literally stream

24:44

everything as if it was being processed

24:47

and being sent up in chunks to be

24:49

processed.

24:50

And so I had this cool workflow that I

24:52

don't think anyone really optimized, but

24:53

as a nerd, I was like, this is going to be fun

24:55

to do.

24:57

So, yeah, it's a pretty nice, simple type

24:58

of thing that you can set up.

25:00

and that's one piece of it but then I'm

25:03

moving into more features now so I

25:06

realized that speed isn't really the big

25:08

thing it's more about just kind of

25:10

workflows, so I can do more features if

25:13

I'm able to. Maybe fast isn't

25:16

really the big deal; it's more just about

25:18

getting that transcript and

25:20

then doing other cool things with it

25:22

adding timestamps, maybe having some

25:24

stuff where it writes for you and, you know,

25:25

collecting these types of things too.

25:27

I was trying to set up a personal project

25:32

with Groq, but streaming to

25:38

Groq did not work.

25:39

Oh, you may want to check out the AISDK

25:43

from Vercel. That's one of them.

25:46

The other thing too, I think, is with

25:48

Groq there's a limit. You have a 25

25:50

megabyte limit to stream a file up.

25:52

So you'll have to do some chunking with

25:53

the file. That took me a long time to

25:55

figure out.
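The chunk-and-track bookkeeping he describes can be sketched like this (my own illustration; the 25 MB constant matches the limit mentioned, everything else — names, the missing transcribe step — is assumed):

```typescript
// Sketch of chunking a file under an upload size cap and stitching the
// transcripts back together. The actual upload/transcribe call is
// omitted; the point is keeping an index with every chunk.

const MAX_CHUNK_BYTES = 25 * 1024 * 1024;

interface AudioChunk {
  index: number;    // position of this chunk in the original file
  data: Uint8Array; // raw bytes, at most `limit` long
}

// Split an audio buffer into chunks that fit under the upload limit.
function splitIntoChunks(audio: Uint8Array, limit = MAX_CHUNK_BYTES): AudioChunk[] {
  const chunks: AudioChunk[] = [];
  for (let offset = 0, index = 0; offset < audio.length; offset += limit, index++) {
    chunks.push({ index, data: audio.subarray(offset, offset + limit) });
  }
  return chunks;
}

// Results can come back out of order when chunks are uploaded in
// parallel, so sort by index before joining the transcript text.
function reassemble(results: { index: number; text: string }[]): string {
  return results
    .slice()
    .sort((a, b) => a.index - b.index)
    .map((r) => r.text)
    .join(" ");
}
```

Each upload carries its `index`, and `reassemble` puts the pieces back in file order regardless of which response arrives first.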

25:56

and then once I figured it out then I

25:58

kind of understood what to do from there

25:59

but basically you have to keep track of

26:02

those files that you chunk up and then

26:04

when they come back down for the results

26:07

then you'll be able to actually put

26:09

them together as one type of thing. It

26:11

sounds more complicated, but it took a

26:14

long time to figure out, I guess, I don't

26:16

know. Whenever I try to start a web

26:17

project, all these tools, like Lovable and

26:19

v0, generate and suggest Supabase. Is there

26:22

any tool that uses Convex, to let you

26:23

connect the way they integrate with

26:25

Supabase? I think the best tool

26:27

right now is called Chef from Convex. So

26:29

chef.convex.dev. I'm

26:30

going to put this into the chat here and

26:33

I'm going to show you real quick. So if I

26:35

do chef.convex.dev

26:37

I'll show you. And so this is basically

26:39

their Vibe code version. I think they

26:42

forked it from Bolt

26:43

and you can sign in with your thing and

26:45

you can you can make a Slack clone,

26:47

Instagram clone,

26:48

Notion clone, everything you would

26:49

normally do and it hooks it up right into

26:51

your database.

26:52

They have their own auth stack, so Convex

26:55

has their own Convex auth. You can later

26:58

hook that up to Clerk, which I love, and

27:01

it allows you to do way more, in my

27:04

opinion.

27:05

So yeah, if I sign in, I should be able

27:07

to sign in. Cool. Yeah, so yeah, that's

27:09

kind of Chef. I might do some videos on

27:12

Chef, because I do this.

27:13

But actually, I code so much in Cursor

27:15

and Claude Code that I just take the rules

27:18

files and I just find I have more, you

27:20

know, I can do more with it and stuff like

27:22

that.

27:23

There's also a template that you can use

27:25

to get started.

27:26

So GitHub, I forked one and I want to

27:28

show you this and I'll put this in the

27:30

chat so that you guys can have it too.

27:32

So one of them, it's called, what is it

27:35

called?

27:36

I based it off this.

27:37

It is this one.

27:39

So this is from this guy.

27:40

This guy is elite.

27:41

and so on.

27:42

And he basically made the Elite Starter

27:45

Kit.

27:46

And so go ahead and click this one.

27:48

And so this one actually has everything

27:50

set up.

27:51

And this is what we based off the...

27:52

I have a four-hour livestream for members

27:54

only, and we started with this actual

27:56

repo.

27:57

So we started this repo.

27:58

I walk you through getting this stuff set

28:00

up with Clerk, getting the secret keys

28:02

from

28:02

Clerk input in, getting the Convex keys,

28:05

configuring those two, configuring a

28:07

webhook.

28:08

and so on.

28:09

But afterwards, what we actually got out

28:10

of it was the Stripe stuff.

28:12

So you can actually sign up with Stripe

28:13

and it's actually built right in.

28:15

It syncs to the database.

28:17

And then it has a dashboard which you can

28:19

actually use to, you know, this logs in

28:20

and

28:20

all that stuff that you can use to kind

28:22

of just add your features and various

28:24

things.

28:25

So yeah, this is a great starter kit.

28:26

This is pretty much my stack as well.

28:28

And when I saw this, I was like, bro, this is

28:30

exactly my stack.

28:31

I want to keep building stuff off of

28:33

this.

28:33

and so I basically forked this and I'm

28:35

going to update my own and add my own

28:37

rules and everything that that I normally

28:39

would use.

28:39

And then I'm going to try to see if I can

28:42

make my own fork have my actual Claude

28:44

Code subagents, the design enforcer.

28:46

I want to have it so that has this convex

28:48

rules type of thing. So my subagents will

28:51

be more accurate because the goal here is

28:53

really for me to just start a repo,

28:54

not have to create these rules files all

28:56

the times. And I have this nice groundwork

28:58

to start from and just start to just say,

29:00

OK, I want to add this feature.

29:01

I want to do this. Okay, I want to

29:03

connect to the new images API and just add

29:05

something so that it generates images, like,

29:07

you know, what I was doing for my

29:08

thumbnail.

29:09

So I want to do stuff that and I want to

29:11

I need to have a good base set.

29:12

And so this would have it all hooked up

29:15

and I'm just I just go.

29:16

So that's the end goal.

29:18

It's going to start with my members first

29:20

content and then eventually I'll roll that

29:21

out to the general public.

29:23

My members are kind of helping me shape

29:24

the feedback, which is really great.

29:26

And so right now you can join in as a

29:27

member, which is super awesome.

29:29

and it's pretty busy right now in the

29:31

discord it's really nice I have these

29:33

different channels and everyone's

29:34

participating so I definitely appreciate

29:35

that I spent my night watching yeah that

29:38

video yesterday oh it's a four and

29:41

a half hour video but it actually could

29:43

have been a couple of hours but that

29:44

the last half of the video is literally

29:47

taking that 380 page PRD that we

29:49

generated there was some good prompting

29:51

that happened in there and I'll give you

29:53

the long story short but basically the

29:56

director's cut is, in the prompt,

29:58

and so on.

29:59

And I said, make sure to include the UI,

30:02

UX trees for all of the different

30:04

components

30:05

that we're going to make.

30:06

So when you're making this PRD file,

30:08

we'll be able to understand that.

30:09

So then I gave it my sort of prompt of

30:12

what I was doing, and then I gave it this

30:15

prompt to generate a PRD, and that

30:17

generated the 300 something line PRD file.

30:19

We gave that to Claude Code and Claude

30:21

Opus for 33 minutes in that live stream

30:24

generated

30:24

the code.

30:25

So that was the last half of that stream.

30:26

So the last half of that stream is

30:27

literally just us waiting for that whole

30:29

thing to generate

30:30

because we weren't really sure.

30:31

I thought it was going to generate

30:33

everything in one minute.

30:34

It generated the full-blown app from

30:37

everything.

30:37

So we took that template.

30:40

On that template, it built everything on

30:41

top of that, all these different trees and

30:43

everything.

30:44

It spawned off four sub-agents, and each

30:46

of those sub-agents went in this

30:48

procedural order

30:49

from working on the UI, working on the

30:51

backend.

30:51

It referenced all these rules.

30:53

So that was a really deep stream.

30:55

I'm pretty sure there's a just so there's

30:57

so much in there and I feel that's a

30:59

perfect members only stream.

31:01

I did go live to the public and then it

31:02

just went to members only afterwards.

31:04

So I want to do more of those at least

31:06

once a week or once a month for members.

31:08

And, you know, that was actually by

31:10

request from several members who wanted to

31:11

just figure out something to get started.

31:13

So that was our massive thing.

31:16

Don't stop believing. Became a member.

31:17

Thank you so much, man, for becoming a

31:18

member.

31:19

I really appreciate that.

31:20

You guys are the real MVP.

31:22

I really need to have a watch of the 4

31:24

hour videos this weekend.

31:25

Yeah, I'm also, I just hired an editor

31:27

because my goal is actually to get my

31:29

first 100 members

31:30

and so I achieved that goal which is

31:32

really awesome.

31:33

And so you may see Dan in the chat.

31:35

So Dan is going to be cutting up some of

31:37

the videos and I got to speak with them to

31:39

figure

31:39

out what parts we want to kind of, how we

31:41

can trim that video down to make it

31:42

really great.

31:44

My goal might be just, I might just

31:45

re-record the whole thing myself just

31:48

offline and that

31:48

way we get that video.

31:50

So I think today is July 29th and I've

31:52

been streaming every single day from the

31:55

beginning

31:55

of July 1st and the growth has been

31:57

incredible.

31:58

My channel has grown almost 40%.

32:00

So I started off with around 10,000 subs.

32:04

Now we're at almost 14,000 subscribers

32:07

here on YouTube, which is a huge

32:09

milestone.

32:09

This is amazing.

32:10

Thanks to you guys who have been

32:13

joining in, but also thanks to everyone

32:15

showing up

32:15

and the rest.

32:16

So, I'm just going to keep coming on every

32:17

day, making this chat really fun and

32:18

participating

32:18

and helping ask questions to drive some

32:20

of the discussions, which has been great.

32:23

The second part of it is I wanted to have

32:25

my first 100 paying members, and we've

32:26

totally

32:27

blown past that.

32:28

I think we're almost at 135 or 140 paying

32:31

members, so you get access to the Discord,

32:34

and that has actually helped basically

32:36

paying for Dan to do some of the video

32:37

editing, because

32:38

I want to save some of your time as well.

32:40

So the members will get to see those

32:42

videos edited first, and then they'll be

32:44

released

32:45

to the public later on. So yeah, there's

32:46

some really good meaty topics in there.

32:48

There's a lot of stuff I want to cover.

32:50

But today's video was sort of

32:53

a dry run of my recording that I'm going

32:55

to do for a context rot video. And so you

32:58

got to see a little bit of that.

33:00

So for those

33:02

who don't know, I mean, this is how

33:04

serious I take my craft: I started

33:07

this video off with wanting to do a

33:09

context window visualizer. So I went into

33:11

v0 because I originally did this in

33:14

Eraser.io or one of these platforms

33:17

where you literally draw. And I felt it

33:20

didn't really

33:21

do it justice. So I just went into v0

33:23

and just created the app. And it's just

33:25

much easier to have

33:26

a real time app that you can actually do

33:28

and just deploy it to production. And so

33:31

that way,

33:31

you're able to play around with this and

33:33

then I could share the link as well. So

33:34

I'm actually

33:35

going to turn this Vercel app into a real

33:37

website that you can actually go interact

33:39

with and start playing with and actually

33:41

use it to maybe teach other people about

33:43

context windows,

33:43

So hopefully, I think the best form of

33:45

learning is teaching other people. So if

33:47

you can take what I just explained to you

33:49

and explain it to someone else, then

33:51

you've really mastered it.

33:52

It's part of the reason why I like to just

33:54

turn on the camera to try to explain these

33:56

things, because then it helps me solidify

33:58

these thoughts. And that's just so

34:00

awesome.

34:00

So yeah, appreciate y'all, first of all, for

34:02

chilling on the growth here. So I really

34:05

need to watch. Okay, yeah. If you design

34:07

in Claude Code, it will build it. Yeah,

34:09

yeah, yeah, for sure.

34:10

All the Claude Code vids are popular.

34:13

Yeah, I think a lot of people are trying

34:15

to figure out

34:16

kind of how to use it, how to best

34:17

utilize it. And right now, for me, I'm

34:19

taking the less is

34:20

more approach. You can see why this is

34:22

really valuable for the less is more

34:24

approach, because

34:25

any type of tool that you add, each

34:28

tool call will start to eat up the

34:30

window,

34:31

and then you add some thinking tokens,

34:32

and then you add some more code output,

34:34

and you add more

34:35

agent thinking. And yet, look at that,

34:37

we're just filling this thing up fast, you

34:40

know, and

34:40

and let's just say it spawns off all

34:43

these sub-agents.
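The arithmetic behind that filling-up feeling can be sketched with made-up numbers (nothing below is measured from a real session; the categories and proportions just mirror the visualizer):

```typescript
// Toy arithmetic for how a 200k context window fills up. Every token
// count here is invented for illustration.

const WINDOW = 200_000;

const usage = [
  { label: "system prompt + rules files", tokens: 12_000 },
  { label: "agent thinking", tokens: 8_000 },
  { label: "tool calls + file reads", tokens: 30_000 },
  { label: "code output", tokens: 25_000 },
  { label: "follow-up thinking + edits", tokens: 21_000 },
];

const used = usage.reduce((sum, u) => sum + u.tokens, 0);
const pct = Math.round((used / WINDOW) * 100);
console.log(`${used} / ${WINDOW} tokens (${pct}%)`);
// prints: 96000 / 200000 tokens (48%)
```

A handful of ordinary turns already sits at roughly half the window, before any sub-agents spawn.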

34:45

Do you want to be that guy who's the 5%?

34:49

Who's actually eating up all the windows

34:51

and everything that and you're gonna get

34:52

that bill

34:52

from OpenAI or ChatGPT or someone,

34:56

or even Claude Code and be like, yo, give it

34:59

up.

34:59

Pay up or we're not gonna give you any

35:01

more tokens.

35:01

We're like, yo, yo, yo, yo, yo, man, chill,

35:03

chill, chill.

35:03

Chill, chill, chill.

35:05

It's not like that in here, you know what I

35:07

mean?

35:07

But if you are one of those people, we

35:09

are here for you.

35:10

This is the AI Anonymous group and we're

35:12

here to support other people like you. We think

35:15

that

35:15

you're misunderstood just the way we are

35:18

misunderstood. Yeah, I might have a little

35:20

bit of addiction to AI products, but I

35:22

think that's natural, right? This is the

35:24

new age that

35:25

we're in. So if you're in here, smash

35:27

that thumbs up, definitely drop the

35:28

comments because it's

35:30

super helpful for everyone in here to

35:32

support each other. And I really, I'm just

35:34

in love with

35:35

the community that's kind of showing up

35:37

right now. And this is, I like this vibe. This

35:39

This is such a good vibe because we're

35:40

all trying to learn and share best

35:42

practices.

35:42

And I think this is a really great way to

35:44

kind of start on that track, if you know

35:47

what I mean.

35:47

So, yeah, it's been crazy exponential.

35:51

Yeah, I mean, you got something good

35:51

going.

35:52

I'm fixing to join. Yeah, for sure, man.

35:54

You'll see that AI updates itself paper,

35:57

AlphaGo.

35:57

I've seen something like this where it

35:59

rewrites its own weights.

36:01

And I'm really curious to read that

36:02

paper.

36:03

I wish you grow this channel fast and get

36:05

some fun spin off more contents.

36:06

That's the goal right now. I think it's to

36:09

kind of focus in a little bit and start to

36:12

produce some content. And so far, everyone's

36:15

been doing a great job kind of

36:18

participating and joining in, so I really

36:21

appreciate that. A new member here went to

36:24

spam, so I'm going to just go ahead and

36:27

double-check my spam folder. So I just

36:30

reached out to the recent people who

36:32

reached out to me, and they were in that

36:35

spam folder, so please forgive me for that

36:40

as well

36:40

Can you go more in depth on how to make

36:43

Groq streaming work?

36:45

I think many people are looking for this

36:47

info.

36:47

Nobody's talking about Groq, even though

36:49

it's the most insane tech for AI.

36:51

Yes. OK, cool.

36:53

You know what I'm going to do? I'm going

36:54

to do a couple of things.

36:55

I'm going to reach out to them because

36:57

I'm very close to the Groq team

36:58

because they're literally in Mountain

36:59

View, down the street from my house.

37:01

I'm in Los Altos and they're very nearby.

37:04

I'm going to see if they can hook it up

37:05

or do something where I can have a session

37:07

with one of their engineers and we can

37:08

kind of talk a little bit more.

37:10

Last time the Kimi model got released on

37:13

Groq, they hit me up.

37:15

They're like, yo, this thing is ready.

37:16

Can we get an engineer on your stream?

37:19

I was like, hell yeah, let's go.

37:21

And so I want to do more about that

37:23

because I feel ...

37:26

Let me just... man.

37:27

OK, if I get some time at the end of the

37:29

stream, my people who are here

37:31

to clean my room are going to kick me out

37:33

in a couple of minutes.

37:34

Let me just go through the rest of the

37:35

comments

37:35

and let me just show you how I get

37:37

started real quick,

37:37

because you could just literally throw

37:39

the documentation into v0 and have it do a

37:42

quick prototype for you.

37:43

And you'll be halfway

37:45

there.

37:46

And that's how I've coded my app.

37:49

Yeah, that's what's up, man.

37:50

Yo, ChordsMaze, oh my God, with the super

37:52

chat, bro.

37:53

Yesterday, here's some updates.

37:55

OK, this OK, for those who don't know,

37:58

ChordsMaze yesterday dropped in the chat.

38:00

I call ChordsMaze Mr. Three Comma Club.

38:04

If you don't know about the three commas,

38:05

that means you're a billionaire.

38:07

This guy is a billionaire but from a

38:11

tokens perspective. This guy literally

38:15

yesterday dropped 5,703,144,183 total

38:19

tokens used in Claude Code, my friends.

38:24

That's 10K, 10 bands.

38:26

The Boy Cooks. And I also want to say as

38:31

a disclaimer, he's not proxying tokens.

38:36

He's

38:36

not running this stuff overnight. He's

38:38

not doing all the crazy stuff that

38:40

Anthropic says.

38:41

He's literally locked in 1000% and he's

38:45

just up, no touching grass, just

38:48

constantly

38:49

on this thing. I don't know what type of

38:52

app you're shipping, but I told you

38:54

yesterday,

38:54

Make sure you link that below because

38:56

you're on it and you have such a small

38:58

window.

38:58

You have less than 30 days right now.

39:00

The clock is counting down, ChordsMaze.

39:03

This is why I started the AI Anonymous

39:05

group yesterday, because I feel like you're

39:07

misunderstood.

39:08

Anthropic is misunderstanding us.

39:11

They think we're using AI to do all this

39:12

stuff.

39:12

It's like, no, we're just locked in, bro.

39:15

We're locked in this.

39:16

Let's go.

39:17

Need an app engineer to talk about in

39:19

five chips.

39:20

Hey, they're not going to talk.

39:22

PDF on the right.

39:23

CleanYourRoom, NiceFlexRay, love the

39:25

content bro. I'm just living my life. I

39:27

don't know, if you live in the Bay

39:29

Area

39:31

Housing and everything is expensive, but

39:32

engineers are paid very well, right?

39:34

if you I worked at Apple for 12 and a

39:36

half years and I also invested a lot and

39:40

You know did pretty well at Apple

39:42

But at the same time, now I'm kind of like,

39:45

you know, they let me go, which was insane.

39:48

You know, I was dealing with long COVID

39:50

and all that crazy stuff. They're just like,

39:52

bye. I'm like, bro,

39:53

I saved you half a billion dollars, how

39:55

are you gonna do that to me? They're just,

39:57

let

39:57

us know when you get better, we'll be

39:59

happy to hire you back. I'm like, all right,

40:01

I see you.

40:01

And then I see this AI wave, and it's

40:03

literally that meme. It's like, I'm checking out

40:06

with my chick, and it's like, wait, there's this

40:08

AI wave, so I'm now looking this way. I'm like,

40:11

let's go, let's go. So that's why

40:13

I'm here I started this channel and y'all

40:16

are

40:16

showing up and this is such an amazing

40:18

moment. Okay, I love Groq, Kimi K2 is on

40:21

fire. Yeah, Ruben

40:22

just became a member. Thank you so much.

40:24

Y'all are the real MVPs right now in July.

40:26

Congrats on your success. I'm going to

40:28

follow for that four hour stream after

40:29

this. Yeah,

40:30

check out that four hour stream. There's

40:31

so much sauce in there. Also check out

40:33

that members only

40:34

post to get into the discord, ask some

40:36

questions. I have some sections in there.

40:38

A lot of people are

40:39

starting to hop in right now and

40:41

participate. I thought my CC usage of 1700

40:43

as an API estimation

40:44

last month was impressive. No, bro. My

40:47

guy is locked in, locked in back there. I

40:50

think

40:50

I think KordzMaze just became a member.

40:52

Yo, KordzMaze, bro, I'm so excited to see

40:55

you in the Discord right now.

40:56

Like, you gotta pop in and drop that comment

40:59

and just say, yo, Mr.

41:00

And I'm gonna make a tag just for you in

41:03

my Discord saying, like, you're Mr. Three Commas.

41:05

So, like, three commas is gonna be, like, an

41:07

actual Discord tag for anyone who's ever

41:10

broke the billion token barrier, which

41:12

looks like this dude.

41:13

But then he only, he didn't do one

41:15

billion. He did two billion. He did three.

41:16

Bro, bro, like, my guy is locked in, has no

41:21

chill, bro. No chill. Five billions, bro.

41:26

Five billions.

41:28

How do I get into the Discord? There's a

41:30

post that I have. Just go to my YouTube

41:32

channel. You'll see members only.

41:34

That's where you'll see the members only

41:36

content as well on the YouTube channel.

41:38

There's a members only post.

41:39

And then there's just an email that you

41:41

send with your information. YouTube

41:43

doesn't provide emails or anything like that.

41:45

They just show me your name and then I

41:47

think your channel.

41:48

And so if you just provide that in the

41:50

email, that I will be able to link you up

41:52

and stuff.

41:52

Orange hands go right.

41:55

So yeah, yeah.

41:56

AI is doing something to us for sure.

41:59

Day by day, Ray's vibes getting darker.

42:02

I used to go over 38 million tokens alone

42:06

last night, but only 100%.

42:08

Also I think there's somebody else

42:10

asking, I think if you remember, I might

42:12

just do

42:12

and so on.

42:13

So, I'm going to go ahead and show you

42:15

how I prototype because I got to roll out

42:16

in a

42:16

sec, with v0 and stuff like that.

42:19

Hacking Claude Code to get that Opus LLM

42:21

download for sure, man.

42:22

Yo, man, let me see if, let me just kind

42:24

of catch up with the other chats as well.

42:26

Just make sure I didn't miss anything.

42:27

And a lot of people kind of popping in as

42:29

well and making sure we're good because we

42:31

cover quite a bit today and I just wanted

42:33

to give you that quick AI context

42:35

visualizer

42:35

type of thing.

42:37

And I think the app is technically

42:39

deployed live right now.

42:40

I'm just going to go through some

42:41

changes, but you can kind of play around

42:42

with it right now if you're just cooking.

42:44

So, yeah, it may get taken down when I

42:47

redeploy the project, so just FYI.

42:50

But this is a great way to kind of

42:51

visualize and teach your friends.

42:53

I feel like if you can teach this to other

42:54

engineering friends just the way I'm

42:55

teaching you,

42:56

it's going to help you go a long way in

42:58

terms of visualizing these things.

42:59

So the biggest takeaway I'd say is, with

43:02

your context, try to think about,

43:04

if you don't know and it's not really

43:06

clear what you're trying to solve,

43:08

I'm coining this thing called a discovery

43:10

prompt, right? Literally just say, I don't

43:12

know where to go with this. Can you help

43:14

me? Like, I need to break this down so I don't

43:16

eat up the token context window. Just tell

43:18

it what your problem is, right? Literally,

43:20

it's the confession booth. I think AI is

43:22

our confession booth. So you sit down, you

43:24

tell it, and it's going to be like, okay, let

43:26

me use my AI smart brain to kind of break

43:28

this apart into different pieces.

43:30

Once it comes back to you with that, then

43:32

start the new chat window, start the new

43:34

Claude Code window, start the new cursor

43:36

window.

43:36

Take one of those concerns and just pop

43:38

it in and just cook.

43:40

Just let it cook.

43:41

Say, in your first request, before you

43:44

pop it into these new windows, you should

43:46

also tell it the tooling you have

43:48

available.

43:48

So say you have access to subagents which

43:51

can take some of these requests or these

43:53

concerns,

43:54

researching, access to the web, searching

43:57

through codebases.

43:58

They can do these tasks in parallel. So

44:01

the more you describe what that scenario

44:03

looks like, and then you can just transcribe

44:06

over, you know, what your problem is or

44:08

what feature you want to ship, the better

44:11

results you're going to get.
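As one concrete illustration of such a discovery prompt (my own wording, not a template from the stream; the angle-bracket part is for you to fill in):

```text
I don't know where to go with this yet, and I don't want to eat up the
context window. Here's my problem: <describe the problem or feature>.

Break this down into separate concerns I can take to fresh chat windows.
For context, you have subagents available: they can research, access the
web, search the codebase, and run these tasks in parallel.
```

Each concern then goes into a new Claude Code or Cursor window, so the planning tokens from this first chat don't carry over.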

44:13

So Opus will then take that whole thing

44:15

and then write basically what it's going

44:17

to write to the subagents for you,

44:19

basically. And so that is kind of what

44:21

you'll take.

44:22

and you can just save that to a markdown

44:24

file so that you can either hand that off

44:26

to Claude Code.

44:27

You can see how this kind of keeps

44:29

compounding, but this is going to, in a

44:31

year from now,

44:32

this is not going to be like that. In a year

44:33

from now, you'll just say whatever you

44:35

want. It's going to figure all that stuff

44:36

out for you.

44:37

But this is where we're at right now. We

44:38

have to do these little things piecemeal

44:40

because the tooling is currently just not

44:41

there.

44:42

And everyone's just learning as we go.

44:44

These studies that were done by Chroma were

44:46

recently just published.

44:47

and so on.

44:48

So, everyone had a feeling that this was

44:50

the vibe around AI, but they couldn't

44:51

really pinpoint

44:52

why.

44:53

And so, now that we have this good data,

44:55

the best practices, the engineering

44:57

efforts, everything,

44:58

every single company is going to add

45:00

tooling around this.

45:01

It only gets better for us.

45:02

So that's what's up.

45:05

Let's see.

45:06

I'm just kind of catching up some more

45:08

here.

45:09

So I'm struggling with Claude Code in the

45:11

Windows terminal.

45:11

Can't paste screenshots.

45:12

Does anyone know how to do it?

45:13

I'm on the Mac.

45:14

I can do it.

45:15

Are you using it in Cursor? Because when

45:17

I use Cursor, as soon as I drag it, I hold

45:19

shift.

45:20

But another person said you can copy and

45:22

then paste it in. And basically, as long

45:24

as you have

45:24

the directory, it'll go to that directory

45:27

and scan it. And it'll even ask saying,

45:29

hey, I'm

45:29

checking out your desktop. Is that okay?

45:32

So it will put it in single quotes. It'll

45:34

say this,

45:35

this, this, and then it should be okay. I

45:37

think when you drag it in the terminal,

45:39

it'll even put those quotes in there for

45:41

you. And so play around with that. But as

45:44

long as you

45:44

You put whatever the directory path for

45:47

your screenshot in quotes, it should be

45:49

able to

45:50

know what that means and do something

45:51

about it.

45:52

Claude Code needs an interface to uncheck

45:54

context you don't need.

45:56

That'd be cool.

45:57

But at that point, I think the whole goal

45:59

of it is to be more hands off, which is

46:00

kind

46:01

of why they exposed subagents to us, so

46:03

that they can try to intelligently pull

46:05

that out

46:06

from whatever we're saying.

46:08

But I'd be interested.

46:10

You got to paste it into Claude Code by

46:12

dragging the file from the directory

46:13

view.

46:14

Really hope Claude ends up in the IDE.

46:17

Yeah, that'd be so fire. How many years

46:20

you code?

46:22

Let's see. I mean, I've been in the

46:24

industry, shipping software, probably a

46:27

decade plus.

46:28

And as far as coding, a lot of that was

46:31

more early on. So I think initially I

46:33

started with,

46:34

what was it? C? Yeah, I did C code and

46:38

then C++ and then got into some Swifty

46:42

things, Python,

46:44

I did a lot of shell scripting and then I

46:46

got into Ruby on Rails, I got into PHP

46:48

because I needed tooling and dashboards

46:51

for internal tools.

46:52

And then less of that was... I started

46:54

getting more responsibility around

46:57

shipping software and then quality and

46:59

stuff.

47:00

So then there became less coding, more

47:02

understanding the business of apps and

47:05

quality and output and how that affects

47:08

users.

47:09

So that was kind of where I started to

47:11

spend more time in that arena.

47:13

and then later on was because I knew so

47:15

much of the stack all over the company

47:18

it was just fighting fires I would be the

47:20

person that would go to to figure out

47:22

what the heck happened so debugging was

47:24

my biggest skill it was okay I know what's

47:27

going on at this part of the stack you

47:28

know you know pulling all the people in

47:30

the room to figure

47:31

this stuff out as well so yeah so just a

47:33

wide range of experiences there's some

47:35

really

47:36

nice terminal windows that are better

47:37

than the standard Windows terminal but I'm

47:39

old school.

47:40

I just like the standard zsh shell. Maybe

47:42

throw some PowerShell or something on

47:44

there, but I'm good at that.

47:46

How do I check my token usage in Claude?

47:48

You just run bunx ccusage if you have

47:51

bun, or npx.

47:52

So ccusage is the command, and then bunx

47:56

ccusage is the one I do as well.

47:59

And that will do that. Or you can just do

48:01

npx instead of bunx if you're like that.

48:04

I gotta head on out and I will see you on

48:06

tomorrow's stream. Thank you very much.

48:09

Peace out.

48:10

Support.

Interactive Summary

The video discusses why AI models, especially in coding, often seem to "suffocate" or get "dumber" quickly due to the rapid filling of their context windows with tokens from code, tool calls, and thinking processes. It highlights a ChromaDB study demonstrating how "distractors" or irrelevant/similar information drastically reduce AI accuracy, particularly when multiple distractors are present. The speaker emphasizes the importance of using subagents to narrow the scope of tasks, keep context windows focused, and improve accuracy. Practical advice is given on managing AI interactions, such as starting new chats for distinct tasks, using "discovery prompts," and leveraging subagents for specialized concerns. The speaker also showcases their app, RayTranscribes, which helps content creators by transcribing videos and generating timestamps to improve searchability on platforms like YouTube.
