3D NAND: The Most Scalable Semiconductor
In the beginning, 2D NAND had only one way to scale. Making the cells smaller.
That eventually hit a technical wall. Now what? It is all over, right? WRONG! We go UP, baby!
The results are astonishing. 3D NAND is perhaps the most
scalable semiconductor product ever mass produced.
In today's video, we talk 3D NAND.
## Beginnings
I previously did a history about flash memory and NAND.
And I recommend you watch it. But if you haven't ...
The fundamental building blocks of NAND are memory cells.
For the first two decades after its commercialization, NAND was
2D. 2D NAND memory cells are transistors with a source, drain, gate, channel, all that jazz.
But NAND cells also have another special polysilicon gate called
the "floating gate". It is called that because it is surrounded by insulating
oxide with no connection to the rest of the transistor.
For all intents and purposes, it is isolated. Like a brain in a vat. Screaming into the void. Just like me.
During programming, electrons quantum tunnel from the channel through the tunnel oxide
into the floating gate. They stay there until erasure, when they quantum tunnel back out.
While the floating gate holds a charge,
the transistor's threshold voltage rises and we can map that to a bit.
NAND cells are networked together in strings of 16 to 128 cells, with each cell's source connected to its neighbor's drain. This lets us pack cells very closely together, but it also means no random access. We can only manipulate data in blocks or pages.
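To make that access constraint concrete, here is a minimal Python sketch of a hypothetical NAND block. The class name, page count, and page size are illustrative, not taken from any real device: pages can be programmed individually, but erasure only happens a whole block at a time, so there is no rewriting a page in place.

```python
# Minimal sketch of NAND's program/erase granularity (illustrative values only).
class NandBlock:
    def __init__(self, pages=128, page_size=4096):
        self.page_size = page_size
        self.pages = [None] * pages          # None means the page is erased

    def program_page(self, index, data):
        if self.pages[index] is not None:
            raise RuntimeError("page must be erased before reprogramming")
        if len(data) > self.page_size:
            raise ValueError("data exceeds page size")
        self.pages[index] = bytes(data)

    def read_page(self, index):
        return self.pages[index]

    def erase_block(self):
        # Erase is all-or-nothing: every page in the block is wiped together.
        self.pages = [None] * len(self.pages)

block = NandBlock()
block.program_page(0, b"hello")
print(block.read_page(0))      # b'hello'
block.erase_block()            # the only way to "overwrite" page 0 later
```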
## NAND's Big Business
When NAND first hit the market around 1995, a different architecture called NOR dominated.
NOR has similar memory cells but wires them together more like DRAM does,
giving faster data access at the cost of lower density. Today,
vendors use them to store crucial hardware boot code like BIOS.
But NAND's more compressed architecture made it possible to create a solid-state
mass storage product competitive with a hard disk drive. After Apple adopted
NAND storage for its new iPods in 2005, the market took off - eventually eclipsing NOR.
The dynamics of the NAND industry are like DRAM’s. It is a commodity with a short product life cycle.
So companies must continually invest in R&D to achieve leadership in metrics like cost per bit.
They then must spend billions more on semiconductor fab capacity to
harvest profits from their product leadership while they still have it.
This dynamic creates booms and busts. So when times are hard,
NAND companies know that the best way to survive is to be
the lowest cost provider. Or have other profitable businesses to subsidize them.
## 2D Scaling
In the 2D NAND era, reaching the cheapest per-bit cost relied on two levers.
The first lever was to follow Moore's Law - shrinking the
transistor's physical dimensions so that we can stuff more cells together.
The second lever - commercialized in the early 2000s - was the multi-level cell, capable of storing two, three, or even four bits inside a single memory cell by segmenting the threshold voltage into buckets.
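As a rough illustration of the multi-level idea, the toy decoder below splits a hypothetical threshold-voltage window into 2^n buckets and maps a read-back voltage to its stored bits. The voltage range and bucket boundaries are made up for the example, not taken from any datasheet.

```python
# Toy decoder: map a cell's threshold voltage to stored bits.
# With n bits per cell, the voltage window is split into 2**n buckets.
# All numbers here are arbitrary illustrations, not datasheet values.

def decode_cell(v_threshold, bits_per_cell=3, v_min=0.0, v_max=4.0):
    levels = 2 ** bits_per_cell
    bucket_width = (v_max - v_min) / levels
    level = min(int((v_threshold - v_min) / bucket_width), levels - 1)
    return format(level, f"0{bits_per_cell}b")   # e.g. '101'

# A 3-bit (TLC-style) read: 8 levels, each only 0.5 V wide in this toy model.
print(decode_cell(2.3))                    # -> '100'
print(decode_cell(2.3, bits_per_cell=1))   # -> '1', with just two wide levels
```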
In 1995, a 470-nanometer process yielded a 32-megabit NAND flash chip. Almost twenty years later, leading-edge NAND chips produced on the 16-nanometer process node were reaching 128-gigabit capacities. Such advancements
were made possible by increasingly sophisticated lithography machines.
## The End of 2D Scaling
This is incredible scaling. But after twenty years, these two major levers were petering out.
The cost of patterning these smaller NAND cells was getting out of hand. With EUV delayed, fabs turned to 193-nanometer immersion lithography and then double patterning to produce the 2D layers.
Moreover, the cells were getting too small. Smaller cells hold fewer electrons inside
their floating gates. A 43-nanometer memory cell can store a few hundred. A 25-nanometer memory cell stores just about a hundred. Which is kind of wild if you think about it.
Multi-level cells have it even worse. Their hundred or so electrons have to be divided amongst the cell's multiple bit levels, and each division is now only a third or half the previous size.
This not only makes it more likely that degradation via leakage occurs sooner,
but also lowers the signal-to-noise ratio. Now losing just ten electrons
materially affects the cell's contents.
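Some quick back-of-the-envelope arithmetic, assuming the roughly one hundred stored electrons mentioned above, shows why those divisions get so tight:

```python
# Rough arithmetic: electrons separating adjacent threshold levels in a ~25 nm
# cell holding about 100 electrons total (the approximate figure quoted above).

total_electrons = 100
for bits in (1, 2, 3):
    levels = 2 ** bits
    per_gap = total_electrons / (levels - 1)
    print(f"{bits} bit(s)/cell: ~{per_gap:.0f} electrons between adjacent levels")

# 1 bit(s)/cell: ~100 electrons between adjacent levels
# 2 bit(s)/cell: ~33 electrons between adjacent levels
# 3 bit(s)/cell: ~14 electrons between adjacent levels
```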
At the same time, these small cells are now so densely packed together - and their insulating
oxide layers so thin - that the cells begin electrically interfering with each other.
This increasing technical difficulty along with several brutal industry downturns triggered a
cycle of consolidation that reduced the number of NAND firms from nine in 2008 to just five in 2014.
Finally, the industry came to recognize that the 16/15-nanometer nodes would be the end of planar NAND scaling as we know it.
This impending deadline spurred research into new forms of non-volatile memory like MRAM or ReRAM.
But to quote a famous historical text, NAND wasn't dead yet. There was still another
dimension to go. Vertical. If we can't grow bit density by shrinking the cells,
then let's grow it by stacking more cells on top of each other.
## 3D Stacked Architectures
The first attempts at 3D NAND simply fabricated
layers of 2D memory cells on top of each other and connected them.
Each NAND layer is its own self-contained array.
So imagine you are building a multi-story Asian mall: you completely finish each floor of the mall before moving on up to the next.
One advantage of this 3D stacked architecture is that it carries over the older 2D NAND
processes. So fabs are familiar with it and you don't have to buy as much new equipment.
Moreover you can test for bad memory cells before putting them into the stack.
But there are also serious issues. Just ask yourself what happens to the finished floors below while you are still building above them. Fabricating each new layer affects the completed layers lower down, the most concerning effect being heat damage.
Moreover, stacking layers of 2D NAND means repeating expensive lithography
steps and inserting duplicative support transistors for each
layer. So you are just multiplying complexity in a way that raises costs faster than bits.
The per-bit costs do not scale down as more layers are added to the stack. And beyond three layers, the per-bit costs even start to rise.
Memory-makers are not going to spend billions retooling their lines for this if the benefits
peter out so soon. We needed something that grows the layer count while holding constant the number of expensive steps, i.e. lithography. What out there can do that?
## BiCS
In 2007, Toshiba came out with a different approach that literally
turns the existing paradigm on its side.
They called it Bit Cost Scalable, or BiCS,
though they later retconned that abbreviation to "bit column stacked".
To produce it, Toshiba presented what they called a "stack, punch,
and plug" process. The name is surprisingly descriptive.
It starts with "stacking" layers of conducting polysilicon plates with
layers of dielectric in between to insulate them. The polysilicon
plates will become the cells' wordlines and control gates.
After that, we pattern and then "punch" deep vertical shafts through this inedible
polysilicon lasagna. This is usually done with reactive ion etching or RIE.
An RF field accelerates reactive ions at the wafer, pounding it like a sandblaster. The ions react with the material at the bottom of the shaft, removing it.
Once that is done, BiCS builds the memory cell by "plugging" the shaft with deposited material
layers. Here, Toshiba ditched the venerable floating gate
and brought something new to the world of NAND: The charge trap.
## Charge Traps
Charge traps are kind of like floating gates, but different.
So the floating gate contains electrons inside a conductive
polysilicon layer between two insulating oxides.
Charge traps are different in that they use a layer of trapping oxide
like silicon nitride between two insulating oxide layers.
This seemingly subtle difference in materials changes how these
two devices store electrons. The floating gate has been described as holding electrons like a tub of water, with the electrons sloshing around.
A charge trap on the other hand stores electrons like how a sponge
stores water - trapped in discrete areas inside the material's "pores".
The charge trap has been around for decades,
but it wasn't used in a commercial product until 2002, when AMD and Fujitsu - whose flash joint venture later became Spansion - introduced a NOR flash memory product called MirrorBit.
Samsung made some 2D NAND using charge traps in the mid-2000s, but the approach wasn't popular until Toshiba adopted it for BiCS. Toshiba did so because the charge trap is more manufacturable.
Making a floating gate there requires fabs to pattern and deposit alternating layers of oxide,
polysilicon, and then oxide again at a very specific location in the shaft. Moreover,
those gates have to be electrically isolated from other floating gates in the same shaft.
It is like trying to stack donuts on top of donuts at a specific spot inside a metal
pipe. And to do that consistently at each layer for some 100 layers.
Because the charge trap holds electrons at discrete locations, fabs can just focus on the task of depositing continuous layers of oxide, silicon nitride (the charge-trapping layer), and then oxide along the inside of the shaft.
Which is quite formidable, by the way. These atomically thin layers must be
precisely consistent all through this deep shaft. Keeping that consistency is not easy.
Anyway. The shaft is finally finished off with a pillar
of amorphous polysilicon. This serves as the channel through which electrons will travel.
Since the BiCS memory cells' layers are structured as a sandwich of
silicon-oxide-nitride-oxide-silicon, they called it SONOS.
Has nothing to do with the wireless speaker brand name.
What makes this architecture so brilliant is that no matter how many more layers are added
to the stack, only one expensive litho-and-etch step is needed to pattern and make the shafts.
In fact, since bit capacity no longer depends on the cells' sizes, we no longer need the
most leading-edge lithography to pattern the shafts. We can take a step back on resolution.
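A toy cost model makes the contrast with the stacked-2D approach clearer. Purely for illustration, assume each critical litho-and-etch pass has a fixed cost and that bits scale linearly with layers (this ignores the yield and duplicated peripheral-transistor overheads that make real stacked 2D even worse): repeating the litho for every layer keeps per-bit cost flat, while patterning the shafts once lets it fall as layers are added.

```python
# Toy cost-per-bit model (all numbers are made up for illustration).
# Stacked 2D NAND repeats the critical litho step for every layer;
# a BiCS-style process patterns the vertical shafts once, regardless of layers.

LITHO_COST = 100.0     # cost of one critical litho-and-etch pass (arbitrary units)
OTHER_COST = 40.0      # per-layer deposition and other non-litho costs
BITS_PER_LAYER = 1.0   # bits scale linearly with layers in both cases

def cost_per_bit_stacked_2d(layers):
    return (layers * (LITHO_COST + OTHER_COST)) / (layers * BITS_PER_LAYER)

def cost_per_bit_bics(layers):
    return (LITHO_COST + layers * OTHER_COST) / (layers * BITS_PER_LAYER)

for n in (1, 2, 4, 8, 16):
    print(f"{n:2d} layers: stacked-2D {cost_per_bit_stacked_2d(n):6.1f}"
          f"  vs  BiCS-style {cost_per_bit_bics(n):6.2f}")
# Stacked-2D stays at 140.0 per bit; the BiCS-style cost falls toward OTHER_COST.
```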
## Other Architectures
At the time of Toshiba's 2007 proposal, Samsung was the leading NAND vendor with 50% market share.
Toshiba was second at 30%, and SK Hynix and Micron both at about 15% market share. As the end of
scaling for 2D NAND became more obvious, Toshiba and Samsung rushed to bring 3D NAND to the market.
In 2008 and 2009, Samsung proposed a multitude of architectures - VRAT, TCAT, VSAT,
and VG-NAND. Toshiba announced in 2009 an evolved version of their BiCS, called pipe BiCS or P-BiCS.
P-BiCS is called that because it takes the existing string of 2D NAND cells,
stretches it out at the middle, and then folds it over.
Like a 刈包 (guabao, Taiwanese street food) or pita
bread. The result is a U-shaped pipe that is repeated throughout the die.
And Intel, SK Hynix and Micron tried a few approaches with the floating gate,
perhaps due to the difficulties of getting the charge traps to work. I
think everyone was just trying to explore the space and find something that worked.
## The Polysilicon Problem
BiCS and P-BiCS were not perfect. Several major issues cropped up.
BiCS used polysilicon plates for its control gates. Polysilicon,
even highly doped, has resistance that causes signals to travel more slowly through it.
This degraded the cells' read and write speeds.
Another issue introduced by the polysilicon gate is a smaller
threshold window. When the cell gets programmed - meaning the charge traps
receive electrons - the threshold voltage is supposed to rise an appreciable amount.
But polysilicon gates exert rather weak control over the channel. As the charge trap is being
erased, the weak control causes electrons to tunnel back into the charge traps.
This means the difference between the threshold voltages of the programmed and
erased states is not that high. And that smaller margin makes
it harder to distinguish between the two states, leading to errors.
Metal would be a far better gate material than polysilicon. But Toshiba chose polysilicon for a reason. They needed a material through which they could etch both deeply and cleanly, and metal is definitely not good for that. So what to do?
## TCAT
In the end, NAND leader Samsung won the race.
Toshiba's approach begins with the polysilicon control gate and works
its way from the outside in. So we can call it a "gate first" method.
Samsung figured out how to produce a metal control gate using a "gate last" method,
something akin to the high-k metal gate transition that the guys over in logic semiconductors went through at the end of the 2000s.
Like that gate last method, this one employs the use of a sacrificial
layer, meaning that we deposit it with the intent of removing it later down the line.
It starts with depositing alternating layers of silicon nitride and an oxide dielectric.
Then we pattern and punch holes through this thick stack.
So far, so good. This feels like the aforementioned BiCS except
that silicon nitride is swapped in for polysilicon. This silicon
nitride is purely "sacrificial" and will be removed later on.
But here the process flow changes. Instead of depositing the various charge trap layers,
we fill the shaft with polysilicon to produce the channel. If you recall, BiCS did this last.
The next step will be to create the metal gate surrounding the channel. But if the
shaft is already filled, then how are we going to reach the lower levels?
This is done by digging trenches on the sides of the layer stack. We then selectively etch
away only the silicon nitride, leaving just the oxides and the polysilicon channel standing.
There are now basically empty air gaps between the remaining oxide layers. Into those gaps, we can then deposit
the nitride charge trap layers, insulating dielectric layers, and finally the metal gates.
Samsung called this the Terabit Cell Array Transistor, or TCAT.
TCAT is also what they call the Japanese delivery service company over in Taiwan,
but I reckon the Koreans didn't think about that.
TCAT requires a far more complicated process than BiCS, but the inclusion of metal gates produced
a far better product. So far as I know, they use a variant called Scalable TCAT to this day.
## The 3D NAND Era
Samsung's introduction of 3D NAND, a 24-layer product in 2013, remade the industry landscape.
The decelerating pace of 2D scaling helped the fast-followers catch up. Samsung's lead,
once as large as 14 months, declined to just six.
But 3D NAND reset the table with a disruptively better product. The first
generation was a bit too early. Typical Samsung. But their second generation
VNAND - released in 2014 and made with a ~20 nanometer node - delivered the goods.
TechInsights measured it at 2.6 gigabits per square millimeter. To compare,
Samsung had a 2D NAND made with a 16 nanometer node and it did
740 megabits per square millimeter. So 3D NAND took a step back in process node from
16 to 20ish nanometers but still raised density by 3.5 times. That's incredible.
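The arithmetic behind that comparison, using the figures quoted above:

```python
# Density comparison using the TechInsights figures quoted above.
planar_16nm = 740     # megabits per square millimeter, 2D NAND at 16 nm
vnand_gen2 = 2600     # megabits per square millimeter, i.e. 2.6 Gb/mm^2

print(f"density gain: {vnand_gen2 / planar_16nm:.1f}x")   # -> density gain: 3.5x
```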
This substantially grew Samsung's lead in both market share and profit, and triggered
a race amongst the fast-followers to commercialize their own 3D NAND.
Micron and Intel were second in 2015, bringing a variant that uses the floating
gate. The two companies later split their alliance. Micron switched to charge traps,
and Intel sold their NAND business to SK Hynix.
Toshiba/Western Digital and SK Hynix also joined the club in the 2015 and 2016 time period.
Technical pioneer Toshiba later hit financing problems - Westinghouse, nuclear energy,
long story - which I reckon affected the company's ability to invest like the Korean giants.
## Limitations
Now the path forward is simple. More layers.
Right now the leader Samsung is at 400, SK Hynix in the 300s, and everyone else in the
high 200s. Predictions of future scaling have gone up to 500, 600, and even 1,000.
But more layers bring more challenges, particularly with high aspect ratio contact, or HARC, etching. "Aspect ratio" refers to the shaft's depth relative to its width. We want to keep that ratio high. A 128-layer NAND might have a shaft 100 nanometers wide and 6 micrometers deep, so around 60:1.
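Plugging in those example numbers:

```python
# Aspect ratio of the example shaft quoted above: depth divided by width.
width_nm = 100
depth_nm = 6 * 1000    # 6 micrometers expressed in nanometers
print(f"aspect ratio: {depth_nm / width_nm:.0f}:1")   # -> aspect ratio: 60:1
```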
But keeping such a high aspect ratio gets challenging as the shaft gets longer. It is
harder to get enough of the reactive particles doing the etch down to the bottom of the shaft,
resulting in weird shapes like a cone or even a spiral.
A new technology developed to address this is an etch technique called cryogenic etch. As the name implies, it brings the wafer temperature down to between -70 and -196 degrees Celsius for higher etch speeds with lower environmental impact.
## Conclusion
As always, I need to thank Tanj for his invaluable advice and commentary on NAND as well as memory in general.
NAND can never replace DRAM. DRAM is just faster and does not wear out. But the hard disk drive?
As of this writing, HDDs still have the best cost-per-bit thanks to new innovations like HAMR.
But the SSD is raising density at a far faster rate. In the last ten years, we have gone from 2.6 gigabits per square millimeter to 14 and even as high as 28 gigabits per square millimeter.
This scaling is remarkable, and continues despite challenges.
The rest of the semiconductor industry - struggling with their own scaling
challenges - must be overflowing with envy. They must be thinking:
How can we get that scaling trend for ourselves? What can we do to get as vertical as NAND?