Itanium: Intel’s Great Successor
In June 1994, Intel and Hewlett-Packard - two of Silicon Valley's largest and
most powerful companies - announced an alliance.
From the union of these two giants would spring forth the next generation of CPUs.
The Great Successor. Chosen to unify two architectures under one umbrella.
It was named Itanium and by 2002 Intel had spent $5 billion on it. In today’s video,
we trace one of Intel's most ambitious products.
## Intel and 64-Bits
The x86 instruction set helped turn Intel into a giant.
A massive ecosystem had built up around it. In the 1980s, four out of every five
PCs shipped with an Intel CPU. These huge volumes helped them afford to build big,
advanced semiconductor fabs and produce at the lowest cost.
Why leave it all behind? Well, after shipping the famous Pentium CPU,
powerful voices inside Intel began to assert that the time had come for something
new. The foremost reason was something called 64-bit computing.
The "64-bit" part of that phrase refers to the size of a CPU's "register". At the time,
Intel's CPUs were 32-bit processors, and that limited them in several ways.
The most prominent limit being that a 32-bit computer can only use up to
about 4 gigabytes of working memory: 2 to the power of 32. Less in practice,
because some of that is taken up by the operating system.
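To make the 4-gigabyte figure concrete, here is a minimal C sketch of the arithmetic - purely illustrative, not anything from Intel:

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* A 32-bit register used as an address can name 2^32 distinct bytes. */
    uint64_t addressable_32 = 1ULL << 32;   /* 4,294,967,296 bytes */

    printf("32-bit address space: %llu bytes (~%llu GiB)\n",
           (unsigned long long)addressable_32,
           (unsigned long long)(addressable_32 >> 30));   /* prints ~4 GiB */

    /* A 64-bit register raises the ceiling to 2^64 bytes - roughly
       16 exbibytes - far beyond anything installable at the time. */
    return 0;
}
```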
In the early 1990s, this 4-gigabyte wall was not a big deal for the consumer market,
because PC memories topped out at about 128 megabytes. Who could imagine ordinary
folks ever needing much more than that, at least in the near future?
But it was a big deal for graphics workstations, scientific computers
handling precise calculations, and web servers delivering content over the Internet.
These were powerful, very expensive machines that at the time ran UNIX.
Intel then dominated the PC, but had no presence in that high-end space.
That space was populated by RISC chips like Sun Microsystems' SPARC,
Hewlett-Packard's PA-RISC, or DEC's Alpha. Intel wanted to get into that game.
## Extension versus Blank Sheet
So, a question. Why not just extend the existing 32-bit x86 instruction set so that it can handle 64-bit registers?
After all, that is what Intel did with the prior major transition from 16-bit
to 32-bit. It wasn't easy, but the resulting 386 Intel CPU was a massive
success - powering a generation of PC clones like those from Compaq.
Intel had even tried a similar, very ambitious blank-sheet approach for that 32-bit transition.
The iAPX 432 was Intel's first 32-bit architecture. And to skip a lot of
words - feel free to read the very long Wikipedia article if you care - that product failed.
But kind of like invading Russia, history rhymes. Intel felt that the 64-bit transition
would be different. And that this time, x86's years-old legacy CISC components would hold it
back. A lot of extra tooling and rules had to be followed to preserve that old world.
AMD and the other x86 cloners were a factor too. A history of 64-bit
computing by Matthew Kerner and Neil Padgett interviewed Richard Russell,
who pointed out that AMD's cross-licensing agreements gave them access to Intel's x86 work.
So the way it went was: Intel releases a new x86 chip, then six or twelve months later
AMD releases their version at a cheaper price. This devalued
Intel's R&D and burned away a huge amount of profit for everyone involved.
The ghost of IBM and the PC loomed too. There was no guarantee that Intel would forever control x86.
The day might come when AMD, Cyrix, and the other x86 cloners somehow
pried control of the standard away, like what the PC cloners did to IBM.
So in Intel's eyes, yeah sure they can always extend x86. But the reboot-with-a-clean-sheet
approach could potentially let Intel surge ahead of the competition with an architecture
that it fully owned. And proponents argued that Intel now had enough influence to pull it off.
The debate raged until the late, great Albert Yu - Intel's general manager of microprocessors,
who oversaw development of the 386, 486, and the Pentium - bought in.
But how to achieve it? Kerner and Padgett also interviewed Dileep Bhandarkar,
who was then an Intel director. Bhandarkar recalled the company
doing a small internal 64-bit effort in 1992 while investigating outside opportunities.
The computer company DEC tried to get Intel to take on their RISC chip Alpha - a very
fast chip - which Intel declined. Intel then suggested DEC make the Alpha in Intel's
fabs, which DEC declined because they had just spent half a billion dollars on a new fab.
Then in late 1993, HP came knocking on Intel's door with an exciting new technology.
## A Post-RISC Technology
In 1990, Hewlett-Packard rehired the brilliant Bill Worley to flesh out the future of their proprietary line of chips.
Worley used to work at IBM alongside John Cocke on the legendary IBM 801 project.
801 is widely acknowledged as the driving force that kicked off the RISC revolution.
He then joined HP, where he helped produce one of the earliest RISC instruction sets,
Precision Architecture RISC, or PA-RISC. The architecture became a
growth engine for Hewlett-Packard through the turbulent RISC wars of the 1980s.
Worley then briefly left HP to lead a graphics processor
startup but rejoined in 1990 for a special project. The PA-RISC team
recognized that RISC was on the verge of hitting serious performance limits.
So a new project, initially called "Super Workstation", was formed
to explore new architectures for the post-RISC era. Over time,
Super Workstation's work began to intertwine with that of another team inside Hewlett-Packard:
Fine-grained Architecture and Software Technologies, or FAST - an HP internal
project exploring and evolving a radical concept known as Very Long Instruction Word, or VLIW.
## Meet VLIW
Very Long Instruction Word is a term coined by the brilliant Josh Fisher while he was at Yale.
As he puts it, VLIW describes a design philosophy - a concept or idea
more like RISC, rather than a specific instruction set like ARM or x86.
Its goal is for a CPU to achieve as much Instruction-Level Parallelism, or
ILP, as possible without making the hardware do the work of finding it.
What is ILP? It is a way for a single microprocessor to speed up work by
initiating and executing multiple machine instructions in parallel
so that we can try to get more than one useful operation done per clock cycle.
Traditionally, high levels of ILP were seen as infeasible because programs have so many branching
conditions: If/else statements, loops and annoying dependencies that change the path of the code.
VLIW tries to get around those limits
by scheduling "traces" of the program code. Using heuristics and user-provided data,
the compiler tries to guess how the user's program will progress.
The compiler then aggressively schedules the trace's instructions
for maximum parallelism, moving work across branch boundaries. To handle mistaken guesses,
the compiler adds compensation code to "backtrack" or fix things up.
These scheduled instructions are then packed together and sent to
the hardware in "very long instruction words". Ergo the name.
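As a rough illustration of the kind of work a VLIW compiler hunts for - this is my own toy example in C, not Multiflow or HP code - consider a loop whose operations are independent of one another:

```c
/* Toy example: the four multiply-adds in each pass have no data
   dependencies on one another, so a trace-scheduling VLIW compiler can
   pack them into one very long instruction word and a wide machine can
   execute them in a single cycle. (Assumes n is a multiple of 4.) */
void scale_and_add(float *restrict y, const float *restrict x, float a, int n)
{
    for (int i = 0; i < n; i += 4) {
        y[i]     += a * x[i];
        y[i + 1] += a * x[i + 1];
        y[i + 2] += a * x[i + 2];
        y[i + 3] += a * x[i + 3];
    }
}
```

Branch-heavy business code offers far fewer of these clean, independent bundles - a point that will matter later.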
People initially thought VLIW computers were impossible, because they
require a compiler that can somehow predict a program's future.
The difficulty of producing such compilers is a recurring theme with this technology.
## Fisher and Rau
Wanting to prove the skeptics wrong, Josh Fisher left Yale to found a startup called Multiflow.
In 1987, they produced a line of powerful mini-supercomputers called TRACE. Over the
next two years, they sold and shipped about 100 units to scientific and commercial users.
Multiflow was not the only startup exploring VLIW at the time. There was
another founded by a brilliant Indian-American named Bob Rau.
Rau had led a team at the computer company TRW
studying similar Instruction-Level Parallelism techniques. In the same
year Fisher founded Multiflow, Bob Rau and several colleagues left to found Cydrome.
Cydrome worked on a VLIW-based "departmental supercomputer" called the Cydra 5. And while they
got it to work, it never shipped as a commercial product. The company eventually disbanded.
Multiflow also disbanded. In 1989, the mini-supercomputer market crashed from
over-competition in the category as well as cannibalization by powerful
single-chip RISC workstations called "Killer Micros". Circumstances trumped technology.
## A Radical Idea
After their startups closed down, both Bob Rau and Josh Fisher joined Hewlett
Packard and the FAST project with the goal of evolving the VLIW technology.
At the time, the big thing in the microprocessor world
was an ILP approach called out-of-order superscalar. This approach was arguably
pioneered by the aforementioned John Cocke and Tilak Agerwala.
Roughly speaking, superscalar involves adding multiple independent execution units to the CPU, plus extra
hardware to fetch a lot of instructions, figure out their various dependencies, and dispatch them to
the right units for simultaneous execution. This is all done as the program is running.
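To see what that hardware is doing, here is a small hand-wavy C example - mine, not IBM's or Intel's - of the run-time reordering a superscalar core performs:

```c
/* The load into 'a' may stall on a cache miss. An out-of-order superscalar
   core notices, at run time, that 'b' and 'c' do not depend on it and
   executes them while the miss is outstanding; 'd' consumes the loaded
   value, so it has to wait. All of this scheduling happens in hardware. */
int reorder_demo(const int *p, int x, int y)
{
    int a = p[0];      /* long-latency load                      */
    int b = x * 7;     /* independent - can issue out of order   */
    int c = y + 3;     /* independent - can issue out of order   */
    int d = a + 1;     /* depends on the load - must wait        */
    return b + c + d;
}
```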
Superscalar worked. IBM utilized it then for their high-performing RS/6000 workstation.
Intel would later use it for their Pentium processors. But Rau and Fisher
came to believe - quite controversially - that superscalar was an anchor. An anchor that would
blunt the lift that microprocessors were then getting from Moore's Law.
Superscalar leans heavily on hardware to analyze instructions, figure out their
various dependencies, and sort them into the ideal order as the program runs. Such hardware
is incredibly complex and power-hungry. Rau and Fisher bet that it would not scale.
With their contributions, Super Workstation produced a new architecture called PA-Wide
Word or PA-WW. It performed quite well compared to what existed inside HP.
The next step was to design and produce a chip that implemented this architecture. But here,
there were challenges. Worley realized that PA-WW chips would have to be made
in a leading-edge fab. In a 2001 interview for HP Labs, he explained the ramifications:
> The costs of such a fab implied that the chip volumes would have to be extremely high. High
volumes, as well as the need to attract software from many providers, implied that the architecture
would have to be an industry standard. An industry standard implied that HP could not do it alone.
Thus in July 1992, Worley recommended that HP bring in a manufacturing partner
with both prowess and scale. The obvious partner was Intel.
On Thanksgiving 1993, HP's CEO Lew Platt made a call to Andy Grove, asking whether
Intel might be interested in working with HP to make PA-WW the successor to x86.
Grove said no. HP tried again later, emphasizing that PA-WW would be fully
backwards compatible with both x86 and PA-RISC. This time it worked.
## Intel and HP Team Up
So what did Intel see that got them so interested?
The HP design team included well-respected folks like Josh Fisher, Bob Rau,
and Bill Worley. And that team had already made much progress. In a widely circulated quote,
Intel's John Crawford told the Wall Street Journal:
> When we saw WideWord, we saw a lot of things we had only been looking at doing,
already in their full glory
A PA-Wide Word architect named Rajiv Gupta had this second golden quote - also widely circulated:
> I looked Albert Yu in the eyes and showed him we could run circles around PowerPC [a
competing IBM processor], that we could kill PowerPC, that we could kill the x86. Albert,
he's like a big Buddha. He just smiles and nods.
Intel would be blind if they didn't also notice the competitive dynamics. They can
convert one of their significant RISC rivals onto a technology platform that they control.
And if HP gets on board, then maybe others like Sun and Silicon Graphics will too.
Grove was intrigued and ordered a bake-off between PA-WW and Intel's own internal 64-bit
architecture effort. PA-WW won. So they hammered out a deal, announced in June 1994.
Hewlett-Packard would transfer the PA-WW IP over to Intel. Intel would then design and produce
the first CPUs. HP could then get said CPUs at a discount to produce enterprise system products.
There were no solid products, only a statement of direction towards a future
computer architecture. The first processors were not anticipated to arrive before 1998,
but once delivered, they would carry both companies into the 21st century.
This was going to be a massive project. Albert Yu anticipated it costing between $400 million and
$500 million over its whole life. An underestimate, as it turns out. But Intel
could afford it, and the results were going to be amazing. Albert Yu told the press at the time:
> By combining our skills ... we will offer the marketplace chips and systems
with absolutely unparalleled performance for the future
## Taking Names
Now. I want to pause a bit and talk names. Part of what makes this all so confusing are the names.
There are more names here than you can shake a stick at. And unfortunately
they all come out at different times. I am going to step out of the flow of
time and gather them all together here so that we can keep track.
So we start off with HP’s Super Workstation,
which produces PA-Wide Word. The announced 1994 collaboration with Intel would eventually evolve
PA-WW into a new thing called "Explicitly Parallel Instruction Computing", or EPIC.
EPIC is an architectural philosophy, kind of like how CISC or RISC are philosophies. So
think of it like the philosophy of French cuisine - a style with recommendations on
how to achieve a desired result. EPIC likes parallelism. French cuisine likes sauces.
EPIC is a direct descendant of VLIW. So it still transfers complexity from the hardware
to the software compiler. The compiler still aggressively analyzes the program code for
parallelism opportunities and groups instructions together in big bundles.
But EPIC strikes a more moderate tone, admitting that sometimes
the hardware is in a better position to do certain things at runtime because
it can see live program state. So EPIC accommodates hardware in the CPU for
that - but not so much as to make it as complex as a superscalar chip.
Multiflow and Cydrome's VLIW compilers were also too tightly bound to their
microarchitectures' hardware. EPIC addresses this rigidity with something
called "templates" - which help define which instructions can be bundled together.
Now that is EPIC. The next term to introduce is the IA-64 instruction
set architecture. EPIC is to IA-64 as RISC is to PA-RISC or SPARC: IA-64 is a specific
instruction set implementation of EPIC, defined and owned jointly by Intel and HP.
So to continue the cooking metaphor, you can think of it as like a French cuisine
cookbook - demonstrating various techniques and recipes for cooks to make French dishes.
After that, we go to the individual chips. The French dishes themselves,
as served by the restaurant. Intel expected its first IA-64 chip to hit the market in 1998.
Internally, this first IA-64 chip had the codename Merced after a river in California.
In October 1999, Intel would announce that the chip would be officially named Itanium. Intel said
at the time that the name conveys the processor's unique strengths and power while retaining the
"-ium" word endings for brand consistency. Netizens almost instantly dubbed it the Itanic.
## Reactions
Anyway. Back to 1994 and the flow of time. Outside analysts saw the
collaboration's potential - citing the two companies' talents and capabilities.
Hewlett-Packard was top two in the workstation and server markets,
where Intel was then weak. And of course, Intel was the juggernaut of the PC industry,
trying mightily to get into the workstation and server industries.
Analysts also considered what effect the collaboration might have on IBM, which backed its own PowerPC line
of RISC chips. Andrew Allison of the "Inside the Computer Industry" newsletter told ComputerWorld:
> I would imagine that IBM is not terribly thrilled with it ... It’s probably the only
combination that is virtually guaranteed to have the horsepower to stand up to PowerPC.
Intel didn't outright say it - and they would later deny ever having
implied such a thing - but they also positioned this new family
as the future successor to x86. One VP at a Boston consultancy said:
> "Intel is smart enough to know when it’s time to be at the end of the x86 line."
The Microprocessor Report echoed the notion that the end was now in sight for x86.
This new architecture would supersede both it and
PA-RISC before trickling down to the mass market. They wrote:
> We expect that, in about 10 years, Intel will stop making pure x86 chips in favor
of [the new chips]. Intel will continue to milk the x86 cash cow as long as it can ...
> Intel’s P6, due in late 1995, probably will be the last pure x86 core that Intel develops
## Disagreements
Not everyone agreed with that. Shortly after the announcement,
Nick Tredennick wrote up a dissenting view.
He argued that the two companies had shot themselves in the foot by
transitioning architectures and pursuing the VLIW "technofad".
He pointed out that big architectural shifts require developers to recompile
their software - which they hate doing, because it's never smooth.
And that the complicated hardware would also need extremely complicated
compilers. Neither of which has a good history of on-time delivery.
And that switching away from x86 would be walking the same mistaken and failed path
that IBM did when Big Blue tried to lock down the PC ecosystem with the Micro Channel Architecture.
Add to this boiling bone broth the collaboration's high expectations, which towered over K2.
Robert Colwell is a legendary CPU designer who previously worked at
Multiflow. He then went to Intel in 1990. In his memoirs, he wrote:
> In essence, [the Intel design team in charge of IA-64] were told that their mission was to
jointly conceive the world’s greatest instruction set architecture with HP,
and then realize that architecture in a chip called Merced by 1997,
with performance second to no other processor, for any benchmark you like.
Merced would also do all these things while being fully compatible with
legacy software for both x86 and PA-RISC. This sounds ambitious.
Colwell was not alone in his doubts. Intel's chief of corporate strategy
at the time was David House. While he approved the project,
he would later say that its sheer scale - and I quote - "scared the everloving bejesus out of me".
## Merced
Intel sold chips to HP, but the two had never worked together on this level.
HP is famous for its consensus-based management
style. Intel on the other hand is just as famous for "constructive confrontations",
where people are expected to challenge each other bluntly, promptly and with data.
So the two arm-wrestled over what functions should
be handled by the software or hardware while simultaneously ramping up their teams with new,
relatively inexperienced people. There was tension.
The experience was either so traumatic or so constructive that HP took the sole lead
for the second generation of IA-64 chips. This particular chip project was code-named McKinley.
The original plan was to release Merced in 1998 and fab it on
Intel's 250-nanometer node. But then the chip design was found
to be spilling beyond the limits of what could be fabbed. Like a muffin top.
So the designers took out transistors allocated for memory cache and x86
compatibility. Removing the latter became easier to justify after the much faster
Pentium Pro was released, since Merced's x86 performance was going to look weak relative to that beast anyway.
Even so, there was still spillover. So it was decided to go to the 180-nanometer node
instead. The transistor shrink would let them put the whole design onto a single
die. The cost, however, was a six-month delay, pushing the ship date to 1999.
Things progressed. In October 1997, the two companies introduced EPIC
and IA-64 to 1,500 computer designers at the Microprocessor Forum. They talked about EPIC's
key architectural choices and emphasized its speed relative to existing RISC chips.
Intel also shared a release date for Merced:
1999. They said it would have industry-leading performance, full compatibility
with the old 32-bit architecture, and a complete solution stack at launch.
Several big software developers announced their participation in the IA-64 ecosystem.
Microsoft agreed to have a 64-bit version of its Windows NT operating system available at
release. Sun said it would make their Solaris OS available on Merced chips.
And to raise the hype even more, presenter and Intel Fellow Fred Pollack teased the
second-generation McKinley chip, saying that it was going to "knock your socks off".
## P7
When Colwell arrived at Intel back in 1990,
he helped found the company's second design team in Oregon.
That team - working in friendly competition with a team in Santa Clara - began on a product
code-named P6. It would be released in 1995 as the 32-bit Pentium Pro.
The Pentium Pro was a remarkable chip. Despite being fabbed on the same process
as its predecessor (P5), P6 ran twice as fast thanks to the inclusion of ideas like out-of-order
superscalar - which, to remind you, searches more aggressively for instructions to parallelize.
The Pentium Pro brought Intel's x86 architecture neck and neck with some of the fastest RISC chips.
It also opened the door to the workstation market by enabling the "personal workstation".
Such personal workstations - running Microsoft's Windows NT or Linux - cost
half that of the old-school UNIX-powered workstations. They grew rapidly in 1995,
eating into the low end of the market.
Unfortunately, internal politics interfered with the Oregon team's pursuit of this opportunity.
Colwell remembers being told that IA-64 would eventually replace the 32-bit lines,
so why keep working on the old legacy stuff?
To Colwell however, the Pentium Pro showed that the 32-bit architecture
still had plenty of juice. With no 64-bit killer application on the immediate horizon,
a premature switch might leave the market to AMD and other competitors.
He also argued that Merced had so many new things going on that there was no
chance that it would all work right on the first try. He felt Intel should have
returned the chip to the lab as a long-term research project to iron out its kinks.
In the end, management could not decide on a coherent strategy on how to resolve the conflicts
between the Oregon team working on 32-bit and the Santa Clara team working on 64-bit Merced.
At first, they were content to just stand aside and let the best one rise to the top.
However this backfired, because Merced had to be compatible with the 32-bit stuff. With
Colwell and the Oregon team still working on it, that goal became an ever-moving target. So
the Santa Clara team tried to "freeze" the specification, which Oregon hated.
In the end, management separated the children: 64-bit for the more powerful server chips.
32-bit for everything else including workstations. That’s the strategy Intel would follow henceforth.
By the way, I highly recommend Colwell's book, "The Pentium Chronicles", where he
talks about these worsening dynamics between Santa Clara and Oregon. It is a strong read.
## A Second Delay
Soon after the October 1997 presentation at the Microprocessor Forum, a new problem emerged.
A source told CNET at the time that Intel severely underestimated the
chip's complexity. The Wall Street Journal later reported Intel struggling with various
signals arriving at parts of the CPU at the wrong time, creating speed bottlenecks.
This was amplified by Intel targeting an exceptionally high 800 megahertz clock rate.
Tweaks made to fix bottlenecks in one module caused ripple effects in
other modules, making debugging endlessly tricky.
There are rumors of other things, but I won't go into them. Whatever the cause was,
it was serious. By mid-1998, the company had to announce that it was
pushing Merced's release from late 1999 to mid-2000. Which meant servers would not
reach actual customers until Q4 2000. New CEO Craig Barrett told the press:
> Our best assessment is that the project is a bit bigger and complicated than we assumed
it would be ... we are pleased with progress. There's not a basic problem with the technology.
This second delay meant that Merced was scraping up against the second-generation IA-64 chip - the
one HP was designing, code-named McKinley. It was scheduled to enter mass production in 2001.
Intel finally taped out Merced in the summer of 1999 and
demonstrated it that fall at the 1999 Intel Developer Forum. Shortly afterwards,
the fabs started learning how to produce the new chip, with early versions seeded to developers.
## Transition Plans
Both Intel and Hewlett-Packard - perhaps expecting this might happen - went to their backups.
At the 1998 Microprocessor Forum, Hewlett-Packard unveiled a "transition plan" towards IA-64.
They would continue releasing additional PA-RISC chips for the next five years, until
2003. Customers could choose which chip they wanted in their server.
This was not ideal. A former HP executive remarked that they had to do all sorts of tricks
to extend PA-RISC. The delays and distractions associated with getting out IA-64 allowed rival
Sun Microsystems to leap ahead in the web server market during the wild late 90s internet boom era.
And as for Intel, the chip giant revitalized market revenues of its 32-bit architecture in
1998 with the introduction of the Celeron and Xeon lines. Market segmentation.
The former targeted value-minded consumers who otherwise bought cheaper chips from AMD,
Cyrix and other cloners. The first Celeron flopped
because it basically had no cache but later iterations performed very well.
The latter chip, the Xeon, targeted the medium to high-end server market
with faster clock speeds, larger caches and higher cache bandwidth.
So when Merced was announced to be delayed,
analysts noted that it was not a huge deal and that the Xeon could hold on as
a "placeholder". As we will later see, that turned out to be an understatement.
The delay did give OS-makers like Microsoft and the UNIX vendors time
to port for Merced/Itanium. But even as something like a "race" developed, actual
application developer interest remained tepid. One Wells Fargo system architect said in 1998:
> We have a few applications that could benefit from Merced,
but probably not anytime soon ... first we’ve got to take
care of Year 2000 compliance issues. Maybe in 2001 we can look at Merced
## Itanium in 2001: The Revolution is Here
After 7 years and $5 billion spent, Intel finally launched Itanium in the summer of 2001.
Recognizing that their 32-bit products were still going strong,
Intel tried to position Itanium as a powerful, revolutionary product for the "most demanding
enterprise and high-performance computing applications", as their press release said.
So yes, it might take some additional work at the start, but those who put it in would be
rewarded. Intel commissioned a white paper to identify "sweet spots for early adopters",
which included technical computing, large databases, and complex analytics.
To the press, Intel worked hard to emphasize that this was just the first step of a long
journey, and that ecosystem adoption at this early stage was already pretty impressive.
On the hardware side, they highlighted buy-in from a spectrum of computer manufacturers.
Some 35 Itanium-based models were said to be released by 25 companies like Dell,
Compaq and Silicon Graphics throughout 2001.
Intel also highlighted that Itanium systems can run four compatible operating systems:
Two 64-bit versions of Windows, HP's proprietary UNIX variant HP-UX,
IBM's proprietary UNIX variant, and certain commercial Linux distributions.
With all this backing from the big companies,
people presumed that Itanium would take the market. A 2000 market report from MicroDesign
Resources had predicted that IA-64 chips would have 60% of the server market by 2003.
Unfortunately, Itanium took too long to get to market. Soon after its debut,
it was outshone by several new 64-bit RISC chips like Sun's UltraSPARC III and IBM's Power4.
Microprocessor Report nominated the Itanium for
its Best Workstation/Server Processor award, but wrote:
> But while other high-end server processor designs are moving to glueless multiprocessing,
simultaneous multithreading, chip-level multiprocessing,
and integrated memory controllers, the Itanium system architecture is beginning
to show its age. Perhaps the design has been in development too long and has had too many cooks.
Another major issue was that there was not a lot of Itanium-native software. And while
Itanium can run 32-bit x86 software, it unfortunately did not do it that well.
IA-64 is so different from x86 that emulation means recreating the whole
thing from scratch. The more you try to force the former to act like the latter,
the more you are giving up its own inherent advantages.
Considering the chip's high price, disappointing performance,
and the looming arrival of a faster chip the following year,
it is surprising that the first iteration of the Itanium sold even the few units that it did.
There are a few who say that Itanium did (allegedly) kill one of its big RISC rivals,
the DEC Alpha - once heralded as the world's fastest chip.
After Compaq bought DEC, they wanted to consolidate to a single 64-bit
platform - which was Itanium because that was all that Intel had at the time - and
sold the Alpha IP to Intel. Does that mean Itanium killed Alpha? Not sure.
A few people are nostalgic about the Alpha, but chip R&D is expensive, DEC was not doing
well then, and it was not like the chip was doing all that great before Compaq nixed it.
## AMD
Befitting the fast follower, AMD too wanted to get into the server
business. They had 18% of the 32-bit market but zero in servers. That meant going 64-bit.
Hearing that Intel was doing something brand new for 64-bit, AMD approached Intel for an
early look at the Itanium architecture. But as I mentioned in passing, Intel had
intentionally carved out Itanium from AMD's cross-licensing agreements. They were rebuffed.
So what to do next? Atiq Raza - who had joined AMD as COO through its acquisition of the CPU
company NexGen - explains in his oral history for the Computer History Museum:
> Everybody said we're going to get screwed. Itanium is going to
take over the world. So I said, "Okay. I find it very weird that basically they
have abandoned the x86 and are doing a different instruction set for 64-bit.
We should also consider doing a different instruction set if that's the case."
So AMD investigated various partners - SPARC, MIPS, PowerPC and DEC - to see whether they
could do something together. While those ecosystems had existing user bases that AMD could leverage,
32-bit x86 software did not run well on them in emulation mode.
Eventually, AMD came to believe that Intel had made a mistake.
Developers hate recompiling software, and users hate being forced to adopt
something new unless there is some compelling reason. IA-64 didn't seem to be it.
VLIW's roots were in academia. And while Multiflow did sell well to corporates,
it showed its best colors on numerical and scientific workloads as those programs tend to
have more repeated loops, ILP opportunities and predictable control flows. For more
generalized work like in the business space, VLIW’s gains were not as obvious.
If Intel really did err in going down the route to Itanium, then AMD suddenly had an opportunity.
So Raza went to a brilliant chip designer who joined AMD from DEC named Jim Keller, and said:
> "Jim, life and death for AMD. We do an x86 extension to 64-bit. You
have to write the spec and you have to do it with very little time."
Keller - who cranked on this day and night - is thus one of the major authors of the x86-64
spec, later known as AMD64. Which I would say is by itself a killer legacy, but Keller
has since gone on to do a bunch of legendary stuff at Apple, Tesla and more. Living legend.
In October 1999, AMD announced x86-64 to the world. This was a major divergence from Intel.
AMD assured the market that their 64-bit transition would
be a "simple change", fully compatible with 32-bit x86.
AMD - never one to miss a snarky comment at their rivals - criticized Intel for
"forcing" the Itanium design onto the OEMs.
Ron Curry, Intel's director of marketing for IA-64 products,
responded by insisting that Itanium too would have x86 compatibility.
He then went on to compare AMD's strategy to trying to soup up a Volkswagen with wider
tires and a faster engine. I get where he is driving with this, but still find it amusing.
The looming threat of AMD's entry into the 64-bit space pushed Intel
to double down on its second-generation IA-64 chip, the one codenamed McKinley.
## Itanium 2: This is Ready
In 2002, Intel CEO Craig Barrett rebooted the project.
McKinley was officially announced as the Itanium 2,
made an official member of the Intel lineup, and released in mid-2002.
Intel's Paul Otellini predicted sales of 100,000 units, telling securities analysts:
> At the risk of getting myself in a lot of trouble,
I'm going to declare this the year of Itanium
The Itanium 2 was indeed an improved product. It performed better on benchmarks,
largely because of a larger bus with three times
the data bandwidth of the first Itanium, plus a bigger L3 cache.
For the redux, Intel sharpened the messaging to aim squarely at Sun and their high-end UltraSPARC
III-based server systems - then the market leader in UNIX systems with nearly 30% market share.
Intel's product news release repeatedly compared
it favorably to the UltraSPARC III. Sun could not have been happy about that.
Second, Intel highlighted growth in the software ecosystem. Applications were now
available from Microsoft, Oracle, Reuters, and BEA. They also claimed that the chip
was compatible with more operating systems than any other high-end enterprise server platform.
And Hewlett-Packard was giving it their all. They developed an Itanium 2-based high-end server
called the Superdome that could run a variety of operating systems - not just HP-UX but
also Windows and Linux - to help transition their customers over to this Intel stuff.
But even as early as December 2002, outside analysts were concerned that
HP might be the only server vendor to go so far in adopting the Itanium.
In November 2003, Intel's director of business-critical systems marketing
told CNET that Itanium was on the brink of broader
use and that 2004 was going to be a "very strong watershed year".
## Rise of the Compute Cluster
In an earlier time, I think it might have worked.
I think Intel really did have the market power then to drag people to Itanium. The
growing internet tech giants might have bought plenty of expensive
Itanium-powered computers for web servers, and Itanium could have been on its way.
But times were changing. Tech giants like Google were turning away from big mainframes towards what
are called "compute clusters". Such clusters were powered by cheap commodity hardware and
the open-source Linux OS and then networked with software to act like one chonk computer.
Such compute clusters can be cheaply scaled up to serve billions of people.
Buyers don’t have to pay a tax to a proprietary systems vendor.
Clusters also tend to be more resilient against physical hardware failures.
Big mainframes still have their place but clusters were the future.
That means cheap commodity hardware, which ironically enough meant x86 Pentium - later
Xeon - CPUs. In the end, Itanium's biggest competitor was another Intel product.
## AMD's Victory
In April 2003, AMD released the K8 Sledgehammer core in the Opteron and the Athlon 64.
AMD counted on the 130-nanometer Opteron to help it gain traction in the corporate market.
They held it out as the "evolutionary" path for
companies with legacy code to get to 64-bit computing.
Despite its groundbreaking features and Sun's support, the chip did not
sell as well as anticipated. System sales by mid-2004 did beat Itanium, but lagged
Xeon's by a country mile. Basically none of the big box vendors made an Opteron server.
This sus behavior is part of what led AMD CEO Hector Ruiz to eventually
sue Intel again for alleged anti-competitive practices.
But Opteron and Athlon 64 did force Intel's hand. In early 2004,
Craig Barrett told developers that they were adding 64-bit address
extensions for their server Xeons. Desktop chips, the following year.
Intel ended up adopting AMD's 64-bit extension, with some minor differences. It was the first
time that a company other than Intel made a major addition to the x86 spec.
Isn't it ironic? Intel produced Itanium in part because they were afraid that AMD and the cloners
might pry away control of x86. But making Itanium eventually led to that very thing they feared
happening! Fortunately for them, they still had Xeons (and their special discounts to vendors).
Several analysts had projected Itanium server sales to reach $14 billion by mid-2004. The
actual sales number then was about $600 million. Outside of Hewlett-Packard,
no major systems vendor was selling these Itanium systems at volume.
## Conclusion
Intel leadership long insisted that Itanium was a long-term play. In a 2005 oral history for Stanford University, Albert Yu said:
> I think Itanium has not failed. I think the Itanium chapter has not been written yet.
Later on, he says:
> Itanium was never intended to be a replacement for Intel Architecture. It was never thought of
that way. It's always to be for high end servers. And I think there probably some confusion.
Some people might think that we're going to take Itanium to replace Intel architecture.
That's never been the intention. It will be an additional architecture, and that was the intent.
Chairman Craig Barrett echoed the sentiment when he retired in 2009. He said in an interview:
> You guys have had a lot of fun with it in the past ... The
ultimate verdict is probably going to be 10 years from now
In 2000, the company posted a roadmap document that said that the IA-64 architecture would
last for 25 years. It didn't quite get there, but it got darn close.
A big reason why was Hewlett-Packard, which quickly standardized on the
Itanium. They were its dominant user - buying 85%
of production - and depended on it quite a bit. Their HP-UX OS ran only on PA-RISC and Itanium, not x86.
This became a problem as the Itanium ecosystem declined. So HP went to great lengths to
keep this thing huffing and puffing even as software developers started dropping support
for Itanium systems: Red Hat in 2009, Microsoft in 2010, and more since then.
The biggest drop-out occurred in 2011, when Oracle halted Itanium development and called the product
"end of life". This led to Intel getting huffy and, three months later, a lawsuit from Hewlett-Packard against Oracle.
It is thanks to this lawsuit that we learned that HP had agreed to pay Intel
$690 million across two deals to keep producing Itanium chips until 2017.
HP won the lawsuit, by the way.
Development on Itanium continued - we got an Itanium 9100, 9300, 9500, and 9700 series.
But it was clear to everyone that the bulk of Intel's resources would be spent on the
x86-based Xeon. And today, the Xeon is indeed one of the company's biggest product lines.
But the day had to come. And in 2017 it finally did: Intel announced that the Itanium 9700,
codenamed Kittson, would be the end, with the last chips to be shipped in 2021. And so it was.
With all that being said, I do applaud Intel for having the ambition, balls, and billions
to throw at something like this. For years, CPUs got faster from improvements in both clock speed
and architecture. After going superscalar, what was left in the latter? Itanium's failure locked x86
into the market - and in part paved the way for the stasis to come later in the decade.