
DRAM Shortage Crisis explained..


Transcript


0:00

Samsung is rumored to be raising its memory prices across all memory products by 80%. And if you look at the broad market outside of Samsung, it's not just consumer and server-grade DRAM: the contagion spreads to NAND storage as well when it comes to pricing. For example, if you look at this Corsair Vengeance RAM, it easily tripled in price in the past four months alone. And solid-state drives like the Samsung EVO also had a huge price jump recently. But why? Why is this happening all over the market?

0:31

I think the first culprit most people look at is AI. If you follow the huge spike in Nvidia's revenue, mostly driven by data-center-grade graphics cards, it may lead you to say this is why DRAM prices are going up like crazy. But on closer look, DRAM prices have mostly been going up since around October 2025, while demand for AI chips has been building for years, dating all the way back to 2024. So how does all of this make sense? Today, we're going to look at the DRAM shortage crisis that's happening globally, understand what DRAM really is in the first place, and look at the broad market impact it's causing. Welcome to Kobe's Code, where every second counts. Quick shout-out to Zo Computer; more on them later.

1:14

Let's first look at the impact that AI is having on the memory industry. We know that modern computers have a hierarchy of memory, starting from the CPU with its L1, L2, and L3 caches. For example, my computer at home uses an AMD Ryzen 5 that has 384 kilobytes of L1, 3 megabytes of L2 cache, and 32 megabytes of L3 cache.
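To make that hierarchy concrete, here is a minimal sketch that lays out the tiers using the capacities quoted above; the bandwidth column is an added assumption with rough, order-of-magnitude figures that vary widely by platform.

```python
# Rough memory-hierarchy sketch: capacities from the Ryzen 5 example above,
# bandwidths are ballpark assumptions that vary a lot by platform.
hierarchy = [
    # (tier,         capacity,   approx. bandwidth)
    ("L1 cache",     "384 KB",   "~1-2 TB/s"),
    ("L2 cache",     "3 MB",     "~1 TB/s"),
    ("L3 cache",     "32 MB",    "~500 GB/s"),
    ("DRAM (DDR)",   "8-32 GB",  "~50-60 GB/s"),
    ("NVMe SSD",     "1 TB+",    "~3-7 GB/s"),
]

for tier, capacity, bandwidth in hierarchy:
    print(f"{tier:<12} {capacity:<10} {bandwidth}")
```

The pattern to notice is that each step down buys orders of magnitude more capacity at the cost of orders of magnitude less bandwidth.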

1:37

And the real estate of these memories would look something like this in physical form. But we all know that we can't just run an AI model directly on my CPU. Modern large language models like DeepSeek R1, which is 671 billion parameters in size, need a lot more memory than what my CPU can provide from L1 to L3 cache. The DeepSeek R1 model, for example, even at a low 1.58-bit quantization, needs at least 130 GB of memory to run.
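That 130 GB figure lines up with a quick back-of-the-envelope check from the parameter count and quantization width alone (ignoring activation memory and KV cache, which add more on top); a minimal sketch:

```python
# Back-of-the-envelope weight footprint for DeepSeek R1 at 1.58-bit quantization.
params = 671e9          # 671 billion parameters
bits_per_param = 1.58   # average bits per weight in the quantized model

weight_bytes = params * bits_per_param / 8
print(f"Quantized weights: {weight_bytes / 1e9:.0f} GB")   # ~133 GB

# At full 16-bit precision the same weights would need roughly:
print(f"FP16 weights:      {params * 2 / 1e9:.0f} GB")     # ~1342 GB
```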

2:09

And since my CPU can't offer that kind of memory, we have to work our way down the memory hierarchy until we find something that can. The next level we look at is RAM. Typical consumer-grade hardware has somewhere around 8 to 32 GB of RAM, which is still nowhere close to being able to hold our DeepSeek R1 model. So we have to go even further down the stack in our memory hierarchy. Solid-state drives offer a huge amount of space, easily in the terabytes, so we could load our 130 GB model there, but the throughput is often slow. As you can see, depending on which level you load the model into, you're limited by the upper bound of memory bandwidth between each level, and the further down the stack you go from the CPU, the slower the communication becomes. But this kind of setup is rather unorthodox, since you'll be lucky to get more than 10 tokens per second like this.
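The reason the ceiling is so low comes from a simple rule of thumb for single-stream generation: every output token has to stream the model's weights through the slowest link, so throughput is roughly bandwidth divided by model size. A rough sketch using the 130 GB model from above; the SSD and RAM bandwidth figures are assumptions for illustration, and real setups that keep some layers on faster memory land somewhere in between.

```python
# Rule of thumb for memory-bound, single-stream decoding:
#   tokens/sec ~ effective bandwidth / bytes read per token (~ model size)
model_gb = 130  # quantized DeepSeek R1 weights, from above

links = {
    "NVMe SSD (offloaded weights)": 7,    # GB/s, assumed PCIe 4.0-class drive
    "DDR system RAM":               60,   # GB/s, matches the ~50-60 GB/s cited later
}

for name, bandwidth_gb_s in links.items():
    print(f"{name:<30} ~{bandwidth_gb_s / model_gb:.2f} tokens/sec")
```

The same arithmetic is what makes HBM-class bandwidth so attractive further down in the video.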

3:06

That's why most AI models run directly on graphics cards instead of the CPU, RAM, and SSD stack. And here's why: graphics cards that you see in servers, like the Nvidia H100, offer speeds of up to 819 gigabytes per second for a single memory stack. This kind of throughput allows an insane amount of speed in comparison to our regular computer stack. For example, the graphics card I use at home, an Nvidia 4080 Super, uses GDDR6X running at 23 Gbps per pin on a 256-bit memory bus, which gives me a throughput of around 736 GB per second.
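That 736 GB/s figure is just the per-pin data rate multiplied across the bus width; a quick check:

```python
# GDDR6X bandwidth on an RTX 4080 Super: per-pin data rate x bus width.
data_rate_gbps = 23    # gigabits per second per pin (GDDR6X effective rate)
bus_width_bits = 256   # memory bus width

bandwidth_gb_s = data_rate_gbps * bus_width_bits / 8  # bits -> bytes
print(f"Peak memory bandwidth: {bandwidth_gb_s:.0f} GB/s")  # 736 GB/s
```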

3:45

But unlike my consumer-grade graphics card, the Nvidia 4080 Super, whose GDDR6X memory chips are placed around the GPU on the PCB, the Nvidia H100 uses what's called HBM, or high-bandwidth memory, which stacks the memory dies vertically. That means our earlier figure of 819 GB per second for a single stack works out to roughly 3.3 terabytes per second for the whole card, even after factoring in overheads and other bottlenecks, which is still huge.
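As a rough sanity check on that aggregate number: assuming the H100's publicly listed five active HBM3 stacks and ~3.35 TB/s spec (assumptions not stated in the video), the per-stack ceiling and the shipped figure bracket the 3.3 TB/s quoted above.

```python
# Rough sanity check: H100 memory bandwidth from per-stack HBM3 figures.
# Stack count and published spec are assumptions based on public H100 data.
hbm3_peak_per_stack_gb_s = 819   # HBM3 per-stack ceiling quoted in the video
active_stacks = 5                # H100 SXM ships with 5 active HBM3 stacks

theoretical = hbm3_peak_per_stack_gb_s * active_stacks
print(f"Theoretical ceiling: {theoretical / 1000:.1f} TB/s")  # ~4.1 TB/s

# The published H100 SXM figure is ~3.35 TB/s, i.e. the stacks run a bit below
# the HBM3 ceiling once real clocks and overheads are factored in.
print("Published spec:      ~3.35 TB/s")
```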

4:16

from 50 to 60 gigabytes per second in

4:19

our CPU/RAMM configuration to now

4:22

upwards of 3 terabytes per second in our

4:24

throughput. And since neural

4:26

network-based applications rely heavily

4:28

on insane matrix multiplications, you

4:31

can see why memory configurations like

4:33

this is the buzz around the AI industry.

4:36

And this kind of difference that we see

4:38

in performance will dictate the user's

4:40

experience that can go from few tokens

4:43

per second to use deep 1 to upwards of

4:46

maybe a 100 tokens per second depending

4:48

on the graphics card. So now that we

4:50

outlined the broad ecosystem, the next

4:52

question is why? Why are pricing for

4:55

consumer grade RAM and NAND memory

4:57

affected? if AI data centers are

5:00

spending hundreds of billions of dollars

5:02

in the HBM based memory instead. Let's

5:04

look at the supply chain of the memory

5:06

industry. But first, let's talk about Zo

5:09

But first, let's talk about Zo Computer. Here's the problem: you have files stored across Gmail, Google Drive, Notion, NotebookLM, and whatever other applications you use, but there's no good way to really unite them in one place. Zo is a private cloud computer that you can own, which means you can store all your data in the cloud and own that instance yourself, and you can also leverage an AI agent on top of it, so you can use AI to manage your files, build automations, build apps, and store your code there. Unlike traditional applications that lock your data away, Zo gives you a persistent workspace where everything is stored, which means you have control over your own apps and files. And because it's a cloud computer, you can access it anywhere, including sending and receiving text messages, which is one of my favorite features. There are other features like email and, of course, the ability to vibe code directly on the machine and have custom-built personas so you can make your computer more personalized. Try out Zo today and their always-on computer that you can use to simplify your computing needs. Link in the description below.

6:12

Consumers like OpenAI, Anthropic, Google, xAI, and Meta create huge pressure on the supply chain, so much so that Nvidia and AMD have seen huge rises in revenue in the past few years. We also have Google's TPU in the mix, as well as Amazon's own proprietary chips, like their application-specific integrated circuit for inference. And let's not forget Groq and Cerebras. As you can see, the pressure that AI puts on the supply chain points straight from graphics cards to their memory requirements. We just saw how important a role interconnect bandwidth plays in the user experience when it comes to serving a model. So while the bottleneck of being compute-bound can largely be overcome by just throwing more graphics cards at the problem, serving a model for faster inference is more bandwidth-bound.
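One way to see why generation ends up bandwidth-bound rather than compute-bound is to compare a model's arithmetic intensity (useful FLOPs per byte of weights moved) against the accelerator's compute-to-bandwidth ratio. The sketch below uses illustrative numbers rather than figures from the video: a hypothetical 70B-parameter dense model and an H100-class part.

```python
# Why batch-1 generation is bandwidth-bound: compare the model's arithmetic
# intensity (FLOPs per byte moved) to the GPU's compute/bandwidth ratio.
params = 70e9              # illustrative dense model size (assumption)
bytes_per_param = 2        # FP16 weights

flops_per_token = 2 * params                    # ~2 FLOPs per parameter per token
bytes_per_token = params * bytes_per_param      # weights streamed once per token
intensity = flops_per_token / bytes_per_token   # = 1 FLOP/byte at FP16

gpu_flops = 1000e12        # ~1 PFLOP/s FP16-class accelerator (assumption)
gpu_bandwidth = 3.35e12    # ~3.35 TB/s HBM

balance_point = gpu_flops / gpu_bandwidth       # ~300 FLOPs/byte
print(f"Model arithmetic intensity: {intensity:.0f} FLOP/byte")
print(f"GPU balance point:          {balance_point:.0f} FLOP/byte")
# Intensity is far below the balance point, so single-stream decode speed is set
# by memory bandwidth, not raw compute; batching requests raises intensity.
```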

7:04

And frontier labs aren't the only players putting huge demand on memory. We also have neoclouds serving auxiliary demand from frontier labs as well as researchers, hobbyists, and sovereign clouds, like CoreWeave and Nebius, plus the emerging neoclouds listed here in the SemiAnalysis diagram. All of this puts a strain on the manufacturing and production of memory in the supply chain.

7:27

Memory manufacturers like Samsung, SK Hynix, Micron, Western Digital, and more all serve various kinds of memory, which means serving this kind of demand from AI requires them to either expand their operations, shift their operations toward AI-focused memory, or both. And we're actually seeing both happen at memory manufacturers like Samsung, SK Hynix, and Micron. But the problem with expansion is that fabs take anywhere between two and five years to build, which creates a huge delay in manufacturing, let alone the amount of investment that needs to go into them. And despite this constraint, OpenAI secured up to 900,000 DRAM wafers per month from Samsung and SK Hynix to meet their target of constructing the Stargate facility, and some project that this could be around 40% of the global output for memory.
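Taking those two figures at face value, you can back out what they imply about total global output:

```python
# If 900,000 wafers/month is ~40% of global DRAM output, the implied total is:
openai_wafers_per_month = 900_000
share_of_global = 0.40

implied_global = openai_wafers_per_month / share_of_global
print(f"Implied global DRAM output: ~{implied_global:,.0f} wafers/month")  # ~2,250,000
```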

8:21

In other words, OpenAI is buying up nearly half of the world's DRAM production, which deepens the shortage even more. So, blending all the demand spikes from frontier labs, hyperscalers like Microsoft, Oracle, and Amazon, inference chips like Groq and Cerebras, the neoclouds, and the side deals being made by OpenAI, the level of shortage this creates is causing prices on consumer-grade memory to spike as well, across desktops, laptops, smartphones, storage, RAM, and really anything that uses memory of some type. Impacts like this are why we're seeing Samsung recently raise pricing by 80% across all memory products.

9:02

Another angle worth mentioning here is the memory industry itself, because it doesn't have a great track record when it comes to price-fixing scandals. Back in 2016, the Department of Justice charged Samsung executives who pleaded guilty and each paid a quarter of a million dollars in fines for price fixing. And many other allegations have been made throughout history, given how few companies actually supply such large demand. The entire memory industry is also known to be more cyclical than others. Going back to 2017, we had a boom cycle largely driven by falling inventory and a lack of production from the manufacturers. And of course, coming out of 2020, we had the supply chain shortage as well, after which manufacturers overproduced, building up so much inventory that prices actually fell by a large degree. And now inventory is tightening again, which explains the more recent spike in DRAM prices, paired with the AI demand, that started around October 2025.

10:01

As you can see, even though we've had many DRAM super cycles in the past, what we're experiencing here is somewhat different. SK Hynix is now making 10% less NAND memory, likely to ramp up its HBM production instead; Micron is taking a conservative stance on production; and Kioxia is reducing its output from 4.8 million last year to 4.7 million. All of this seems to point to an extended period of time before we start to see normal DRAM pricing again.

10:29

China, on the other hand, is still three to five years behind the rest of the industry in terms of technology. One of their biggest fabs, CXMT, is targeting mass production of RAM at $138, which is decently cheap on face value, but their production line is still on DDR4. They are targeting an IPO at $4.2 billion soon, though, which could fuel their growth and innovation. What do you think about the DRAM shortage? Do you think we will soon see DRAM prices returning to normal, or will prices continue to go up even further?

Summary

The video explains the global DRAM shortage and the significant price hikes across all memory products, including consumer RAM and SSDs. While AI demand is a primary driver, the timing of price increases (Oct 2025) vs. AI demand (years) shows a nuanced connection. Large AI models require immense amounts of high-bandwidth memory (HBM) primarily found in specialized GPUs, not traditional CPU/RAM/SSD setups. The colossal demand from major AI labs, hyperscalers, and emerging cloud providers, intensified by OpenAI securing an estimated 40% of global DRAM output for its Stargate facility, is straining the supply chain. Memory manufacturers are expanding or shifting production to HBM, but fab construction takes years. This, coupled with the memory industry's cyclical nature and current inventory deceleration, results in a substantial shortage and higher prices for all memory types, with no immediate return to normal expected.
