The history of servers, the cloud, and what’s next – with Oxide

Transcript

0:00

Can you tell us about the dot-com boom? We

0:02

did much more technically interesting

0:04

work in the bust than we did in the

0:05

boom. There's a degree to which

0:07

innovation requires some level of

0:08

desperation, and in good economic times it's

0:11

kind of hard to summon that desperation.

0:12

How have AI tools changed how you're

0:15

working at Oxide?

0:16

>> Certainly we're using Claude Code a bunch

0:18

and people are doing that but for a lot

0:19

of the work that we're doing it is

0:21

helpful as maybe a polishing tool but

0:24

less at the epicenter of its

0:25

creation. Can you tell me what it

0:27

actually means to design or build a

0:28

computer? Oh, it's very involved. Yeah,

0:30

it's very involved. So, first of all,

0:32

how have servers and cloud

0:33

infrastructure evolved since the

0:34

late 1990s and what is next? Bryan Cantrill

0:37

was a distinguished engineer at Sun

0:39

Microsystems during the dot-com boom

0:41

and dot-com bust, built a small competitor

0:43

to AWS called Joyent, and is now

0:45

the co-founder at Oxide. Today, we go

0:48

into the history of servers and the cloud

0:50

from the late 1990s to today,

0:52

the challenges of building hardware like

0:54

the Oxide computer from scratch. How the

0:56

Oxide team uses AI and why they find it

0:59

practically useless for hardware

1:00

engineering challenges. Why

1:01

Oxide builds everything as open source

1:04

and how they manage to work remotely as

1:05

a hardware startup, and many

1:07

more. If you'd like to understand more

1:08

about how the cloud works, and learn how

1:10

a nimble hardware-plus-software

1:11

startup operates, this episode is for

1:13

you. This podcast episode is presented

1:15

by Statsig, the unified platform for

1:17

flags, analytics, experiments,

1:19

and more. Check out the show notes below

1:20

to learn more about them and our other

1:22

season sponsor. So Bryan,

1:24

welcome to the podcast.

1:25

>> Oh, it's great to be with you. Thanks

1:26

for having me.

1:27

>> I'd love to jump back in time, way back

1:30

in the 1990s because you're someone

1:32

who's been around the block and back

1:33

then you worked at some interesting

1:35

companies including at Sun and if you

1:37

could give us listeners and viewers

1:39

a sense of what it was like in the '90s

1:41

in terms of

1:43

>> software, servers. What was the vibe

1:46

like? Yeah, it was an interesting

1:47

inflection point because I was

1:49

interviewing in 1995. I started in 1996.

1:52

So I would say that the internet,

1:55

and, I mean, HTTP had been developed in

1:58

like '93, '94.

2:00

We had kind of the first web browsers

2:02

but it was still very very very new and

2:06

the internet was just kind of primed for

2:09

takeoff. Java had come out

2:11

in maybe 1995.

2:14

Java had kind of taken off immediately.

2:16

So there was a lot of uh really exciting

2:18

energy, but it was nowhere near what it

2:20

would become. Even a couple

2:22

years later it became very frothy of

2:23

course and it was exciting. Um it was

2:25

very clear to me I went to school

2:28

actually on the East Coast, but just

2:30

coming out here to Silicon Valley the

2:31

energy was extraordinary, and I

2:34

really knew that I wanted to come out

2:35

here for my career. So at Sun those next

2:38

couple of years, I mean, I got very

2:40

lucky really because Sun was in the

2:43

right place at the right time with the

2:45

right technology which you know

2:47

sometimes you only appreciate in

2:48

hindsight um because it was so explosive

2:52

and if you wanted to build a website as

2:55

part of that dot-com boom, you were buying

2:57

Sun servers you were buying Cisco

2:59

switches

3:00

>> Now, why was this the case? Because again,

3:02

just taking myself back just being a bit

3:04

naive, I would assume that, let's say, hey, I'm

3:06

in 1995. I want to build a website.

3:09

Could I not have just used

3:10

a PC and spun up a server. Did it not

3:13

work like that or how did it work?

3:14

>> I mean, a PC, like, maybe, but you

3:18

didn't really have an operating system,

3:19

right? Because Linux is

3:23

very very new. Linux is not

3:25

>> I'll back down.

3:25

>> Oh yeah, definitely. Linux, you know,

3:28

would be like Haiku today, which is an

3:30

operating system you haven't heard of

3:31

for a reason. It's kind of like a

3:32

hobbyist operating system. You know what

3:33

I mean? You'd be like what? No, you

3:35

wouldn't. And you then you kind of had

3:37

the BSDs; FreeBSD was

3:39

certainly out there. Also still very

3:41

much under the shadow though of this

3:42

lawsuit from AT&T. So the Unices are

3:44

kind of... there aren't really open-source

3:47

operating system options. Uh there was

3:49

the um actually this is kind of funny

3:52

because, so where was the GNU option?

3:55

It was going to be the Hurd operating

3:56

system. So Hurd was kind of like the

3:58

Duke Nukem Forever of its time. It was

4:01

the operating system that was constantly

4:03

coming kind of next year and next year

4:05

and next year and it was going to be

4:06

microkernel-based. And so, you know,

4:10

it's kind of amazing but you really

4:12

couldn't do it on PCs because of the

4:15

lack of system software and actually

4:17

part of my attraction to Sun was I had

4:19

used Solaris on SPARC. I knew

4:21

Solaris existed on x86 but I never used

4:23

it. So I was excited to use Solaris on

4:26

x86.

4:26

>> And so what did Sun build? You mentioned

4:28

Solaris. That was the operating system.

4:30

>> Solaris is the operating system. We built

4:31

servers. So we built SPARC-based servers.

4:33

We built desktop machines. So

4:36

Sun was a computer company. It was a

4:38

systems company. So we built desktop

4:40

machines, built some ill-advised

4:41

laptops. So basically desktop machines,

4:43

workstations. But then at that time in

4:46

the 90s, what was really exploding were

4:48

everything from those kind of

4:49

workgroup-style servers up to really

4:51

getting bigger and bigger servers up to

4:54

um the very large machines, machines

4:56

that are physically the same size as

4:57

what Oxide makes today. And I remember

5:00

vividly in what would have been like

5:02

'97, '98 maybe, Greg Papadopoulos, then the

5:04

CTO of Sun, giving a talk to the entire

5:07

company saying here are the top three

5:09

applications for Sun Microsystems:

5:11

databases, databases and databases. So

5:14

that gives you an idea of kind of how it

5:16

was being used. And this is again as

5:19

that kind of, in that knee up

5:22

of that dot-com buildout where, again,

5:26

if you wanted to really build a web

5:28

presence, you were

5:30

going to use Java, you were going to

5:31

do it on Solaris, you were going to do

5:33

it on Sun servers.

5:35

And it was a wild time

5:38

for sure. And can you tell us about the

5:39

dot-com boom? Because, you know, right now I

5:41

know AI is pretty exciting and it feels

5:43

like we're in a special time but what

5:45

what was it like especially working on

5:46

it? It sounds like it was the

5:48

epicenter of it and you know what was

5:50

funny is, it was frenetic in a

5:54

way that was not always positive. So,

5:56

one of the things that is just a

5:58

point of fact, and one can take from it

6:01

what one will: we did much more

6:04

technically interesting work in the bust

6:06

than we did in the boom, because I think that

6:08

when you're in boom times you know

6:11

everyone kind of like secretly believes

6:13

that this is because of me, like, that

6:16

it is because of the thing that I am

6:17

working on. You know,

6:20

one of the early

6:22

technologists behind Java once told me

6:24

with a straight face that

6:25

every server that Sun sells they sell

6:28

because of Java. And I'm like, you know

6:30

what's most amazing? That you

6:32

believe that is actually the more

6:33

interesting fact. I mean, it is, like,

6:35

obviously false, especially with, you know,

6:37

databases, databases, databases being the

6:39

top three applications. But that

6:41

kind of reflects the zeitgeist of the

6:43

time that everyone believes that this is

6:44

you know if I work on the microprocessor

6:46

it's because the

6:47

microprocessor is perfect. If I work on

6:49

the operating system it's because oh

6:51

this is the operating system that people

6:52

are buying the machine for, and

6:54

that doesn't really lend itself to

6:56

real innovation. I

6:59

think there's a degree to which like

7:01

innovation requires some level of

7:02

desperation, and in good economic times

7:06

it's kind of hard to summon that

7:07

desperation sometimes. So, I think that

7:08

during the boom it was and it was just

7:11

it was frothy and it felt like there was

7:14

a period of time where I'm like this

7:15

obviously can't go on forever and you

7:17

know, The Economist is having these very

7:19

like gloomy covers about how this is all

7:21

going to end and it's going to be an

7:22

apocalypse which I believed and then I

7:24

just stopped believing it. And I'm like,

7:26

well, maybe The Economist is right. It

7:27

just went on longer. And you know, one

7:29

of my early life lessons from the boom

7:30

and bust is these things go on longer

7:33

than you think possible.

7:35

>> But when they growth

7:36

>> in terms of the boom, when you're in

7:38

frothy times, that boom will go on

7:40

longer than you think possible.

7:42

>> Mhm.

7:43

>> And when it switches, it will collapse

7:46

faster than you can fathom.

7:48

>> In the boom, do I understand correctly

7:49

that customers were just like wanting to

7:51

buy your servers? They were flying off

7:53

the shelves, all these companies? And in

7:55

day-to-day work, what did

7:57

it mean for you? So I'll tell you like

7:59

in day-to-day, it meant, first of all, it

8:00

meant that traffic was terrible.

8:03

You know, you couldn't get

8:05

housing. You know,

8:07

everything was in short supply.

8:09

Customers, you know, they

8:12

are buying. We had a customer that, you

8:14

know, was going to buy 19,000

8:16

servers, which is obviously a very

8:18

big number

8:18

>> And these were these massive big

8:19

servers, right?

8:20

>> Yeah, well, in that case those were

8:22

actually 1U servers to build out a

8:24

broadband initiative that actually was a

8:26

company called Enron. You know, I remember

8:28

vividly we were at a dinner here

8:32

in the city at a restaurant called

8:34

Aqua, which was a very kind of fancy

8:35

restaurant long since out of business

8:37

and I don't think Aqua survived the bust

8:39

and we were at Aqua with a bank

8:42

who was a customer of Sun's, and they

8:46

were spending a galactic amount of money

8:48

every year with Sun and we were at a

8:51

dinner and I just remember I mean it was

8:54

it was the kind of like 19th century

8:56

Gilded Age kind of dinner. People are

8:58

ordering you know nine courses. What I

9:00

remember is at the end of that having

9:03

Château d'Yquem, which is a Sauternes. So I

9:05

don't know very much, I know very

9:07

little about wine. I know nothing about

9:08

Sauternes. What I did know is there was

9:10

someone who knew wine, and it's like, we

9:13

are all going to drink the 1952 Château

9:15

d'Yquem,

9:17

which is... and I remember

9:20

being like, I'm not much of a

9:22

drinker, but I was, like, too drunk at that

9:24

point to really appreciate it. So I have

9:26

had this Sauternes that, you know,

9:28

oenophiles kind of live their life to

9:31

drink and I'm sad to inform you that

9:33

there's one less bottle of this precious

9:35

vintage because it was poured down the

9:38

gullet of a 20-something dot-commer who really had...

9:41

and I just remember being back in my

9:43

apartment being literally drunk on

9:46

Château d'Yquem thinking to my inro hill

9:49

and remember thinking to myself this

9:51

can't last, this is not sustainable. And I

9:54

swear the dot-com boom turned to a bust, like,

9:58

that night. That is September

10:00

of 2000. So Pets.com had kind of

10:03

busted out, and a bunch of the NASDAQ had

10:05

busted out early in 2000. The traffic

10:07

got lighter early in 2000. Anyone who

10:09

was here would be like, the absolute

10:11

spookiest thing is it went from like

10:13

gridlock to, like, COVID-like traffic in

10:17

the span of like a month

10:18

>> Without COVID happening.

10:19

>> Without COVID happening, with only the

10:21

NASDAQ collapsing. And you're like,

10:23

"Okay, that's very odd." And then 2000

10:26

kind of muddled along, and then

10:28

that dinner was in September of 2000,

10:31

and what really stopped was the

10:33

telco buildout. So there was a lot

10:36

of telco buildout because people are

10:38

like the internet is the future

10:39

>> And telco buildout meaning the towers,

10:42

the servers?

10:42

>> The servers, the infrastructure, and

10:45

then all of the concomitant, the

10:47

fiber. Like, JDS Uniphase was a huge company.

10:49

You had these companies that were,

10:51

you know, Global Crossing and MCI

10:53

WorldCom and all these companies were

10:55

explosive and everyone believed that the

10:58

internet is the future and this is like

10:59

an important thing, and they

11:01

were right. They were right.

11:02

>> Bryan just said how an important lesson

11:04

of the dot-com boom was that people who

11:06

believed the internet would be the future,

11:08

they were right. Today we're in a

11:10

similar stage with AI. It's pretty

11:12

likely that AI will be part of the

11:13

software stack in the future, even if

11:15

timing is harder to predict. The latest

11:18

shift is how AI agents are becoming a

11:19

lot more commonly used for development.

11:21

And this is a great time to talk about

11:22

our season sponsor, Linear, and how they

11:25

think about collaborating with agents.

11:27

Linear has taken an interesting approach

11:29

here. Instead of building one

11:30

proprietary AI assistant and locking you

11:32

into it, they built an open API and SDK

11:34

that lets any agent plug into your issue

11:36

tracker. That means you don't need to

11:38

wait for Linear to build the features

11:40

that you need. You can connect the best

11:41

coding agents on the market like Cursor,

11:43

GitHub Copilot, OpenAI Codex, and

11:45

Devin, or you can build your own agent

11:47

for your team-specific workflow. It's a

11:49

fundamentally different approach from

11:50

most issue tracking and project

11:52

management tools on the market. You get

11:53

optionality and the experience is

11:55

surprisingly natural. You assign an

11:57

issue to an agent the same way you'd

11:59

assign it to a teammate or you can

12:01

simply mention the agent in an issue

12:02

thread. Cursor then can pick up a bug,

12:05

understand the context from the issue,

12:06

open a PR. Codex can explore a fix while

12:09

you're focused on something else,

12:10

Sentry can run root cause analysis when

12:12

something breaks. It's pretty powerful

12:14

what you can get these agents to do. And

12:16

here's what I like. You, the human, stay

12:18

the accountable owner. The agent works

12:21

for you, not instead of you. You review

12:23

the work. You decide when it's good and

12:25

when it ships. If agents are going to be

12:28

a part of the tool set of building

12:29

software, and it feels to me they

12:31

increasingly are, you'll want a system

12:33

that's actually designed for them.

12:35

Linear is a system like this. To learn

12:37

more, head to linear.app/agents. And

12:40

with this, let's get back to the point

12:42

where Bryan was saying how those

12:43

believing the internet will be the

12:44

future back in 2001 were right. This is

12:48

the other thing. It's like they're

12:49

right. And so, like, a very famous impact

12:51

crater from the dot-com boom is Webvan,

12:53

right? Webvan was delivering groceries,

12:56

which many people today are going to get

12:58

their groceries delivered, right?

12:59

Instacart. It's like, they weren't wrong,

13:02

but their timing was off and they lost

13:05

track of the underlying economics

13:07

completely. And so when it busted out,

13:10

so in the fall of 2000, in

13:14

November of 2000 in particular, there

13:16

were zero orders from

13:19

telecoms at Sun. Like it went to zero.

13:21

Wow. And, you know, you're kind

13:23

of used to kind of ups and downs, but

13:24

that's, like, just off a cliff. And

13:27

from that point, you know, going through

13:29

2000 and then 2001, it was

13:32

then very very grim. I would say that

13:34

the thing that happened through the

13:36

bust was layoff after layoff after

13:38

layoff, because companies had kind of

13:40

built themselves and geared themselves

13:41

around these fat times lasting forever

13:44

and now they were gone. And

13:46

as frothy as expectations were during

13:49

the boom, they were that much more negative

13:52

in the bust. People were like, everything

13:54

is... it's the end of days.

13:55

>> And were you a software engineer

13:57

back then?

13:57

>> Yeah, software engineer. Yep. And then

13:59

so as a software engineer like both you

14:01

and also thinking about your

14:03

colleagues back at the time or friends

14:05

how did it impact you? Were you kind of

14:07

just chugging along or

14:08

>> So I would say that, like, lots of people

14:11

left and you had like the statistic of

14:13

you know, the U-Hauls were 10 to 1 out of

14:15

the Bay Area. So people moved away,

14:19

and the thing that I noticed is that

14:22

the people that had moved out to Silicon

14:24

Valley because they really had

14:26

an interest in the technology all

14:28

stayed and were not

14:30

adversely affected, honestly. I mean,

14:32

yes, every one of us, if you had

14:36

equity in your company which of course

14:37

you all did like you try not to

14:39

overthink it right you just try to like

14:41

you try to remind yourself like I never

14:43

had it to begin with so like it's hard

14:44

to, you know... but it's definitely gone. Sun

14:47

lost 98% of its value, so it's, like,

14:50

definitely gone and you know there was

14:51

something and I think it also like a

14:53

boom can get you to care about things

14:56

that you actually don't care about and a

14:58

boom can get you to because in a boom

15:00

everyone is so financially driven that

15:02

it's hard not to become financially

15:04

driven. But it's like that's actually

15:05

not why I got into this. And so during

15:08

the bust, I'm, you know, definitely able

15:10

to put, you know, put a meal on my table

15:13

and a roof over my head. But

15:17

it was really a reminder about like

15:19

what's important. And again, we

15:21

did do better technical work in

15:24

the bust than we did in the boom. And I

15:25

think it's because in the bust it's like

15:27

okay, now we really have to focus.

15:31

We have fewer resources, and the

15:34

fewer resources actually force more

15:36

creativity. So you know all of the

15:39

things that we did certainly speaking at

15:40

Sun and system software, so ZFS and DTrace

15:43

and the Service Management Facility, all

15:45

these things that were really

15:45

revolutionary for the operating system

15:47

all happened in the same kind of

15:49

post-bust period of time. So

15:51

all of those things happened from

15:54

2001 to, say, 2005.

15:56

>> And so what were these specific

15:59

innovations?

16:00

>> So I'd gone to work at Sun to work

16:03

with Jeff Bonwick and as long as I had

16:05

known Jeff, from the mid-'90s, Jeff had

16:07

wanted to rethink file systems and now

16:09

finally, in the early 2000s, he and

16:12

Matt Ahrens were able to really go take

16:14

a clean sheet of paper to the file

16:16

system, and that's ZFS. I had a chip on

16:19

my shoulder about the way we understand

16:21

and debug systems by the way we observe

16:23

systems. So I along with two other

16:26

colleagues did DTrace, which allowed us

16:29

to dynamically instrument running

16:30

systems and you can kind of go down the

16:32

line, and there were a bunch

16:34

of things like this where

16:36

>> we... and I don't know that all of this

16:38

is related to the bust. It's just that

16:40

the timing lined up such that it was all

16:42

happening during the bust and what we

16:44

ended up with was a whole bunch of

16:45

interesting technology coming together

16:48

actually in a single version of the

16:49

operating system and then very I mean

16:51

fortunate for us and I do think this is

16:53

a bit of a consequence of the bust

16:54

because Sun was definitely open to

16:57

new approaches: we open sourced all of the

16:58

operating system so that happened in

17:00

2005, and that was very important to

17:02

give these kind of technologies eternal

17:03

life but I think you know we can never

17:05

predict the future, but to me it is

17:06

pretty positive in this sense that even

17:09

in the bust, hearing the stories,

17:11

innovation did not stop. Sure, you know,

17:13

sounds like it was probably harder to

17:14

get jobs and there might have

17:16

been fewer of them, but you know,

17:17

industry kept innovating. And what

17:19

you said that I didn't expect

17:21

to hear is that it was a bit easier to

17:24

innovate.

17:24

>> It's just less manic. We were able to

17:26

focus more. Now, I mean,

17:29

not that one should necessarily pine

17:31

for a bust because busts are brutal, but

17:34

there is a clarity that you get, too.

17:37

So, I mean, ideally you would like to have,

17:39

like, can we just be, like, normal

17:40

economically? But, like, nope. Apparently

17:42

in high-tech we've got to be like on or

17:44

off. So, bust aside, in the early 2000s,

17:47

leading up to this internet boom,

17:49

you know, most companies went about

17:51

buying Sun servers with Solaris

17:53

installed, and everything, hardware

17:55

and software, came together. It was

17:56

beautiful. It worked well. Again, I

17:58

heard from folks who did it. What

18:00

happened then? Because when I got into

18:01

tech in the 2000s, I did not hear about

18:03

Solaris, and that was not how it

18:05

>> No. Right. So what was the shift?

18:07

>> So the shift was, first of all, open

18:08

source, right? So, you know, we

18:10

said in the mid-'90s Linux was kind of

18:12

still very much a hobby project. Not so

18:15

by the 2000s, right? It grew

18:17

up, absolutely, and it grew up because you

18:19

had a bunch of companies that really

18:20

backed up the truck. And, you know,

18:23

at first IBM and SGI, Data

18:26

General, some other companies, those

18:28

companies were very important because

18:30

they decided to contribute their

18:31

technologies. Like XFS, right? XFS, which many

18:34

people still use today on Linux, that's

18:36

from SGI. XFS was SGI's, on IRIX. That was

18:39

happening in kind of the late '90s,

18:41

and then in the 2000s, I mean, Google was

18:44

always built on Linux, right? And so you

18:46

had kind of the companies that

18:49

became that next boom were all

18:51

built on open source and indeed needed

18:53

to be built on open source. So they

18:54

economically relied on open source to be

18:57

able to build what they built. So then

18:59

it became much more practical to

19:01

certainly run Linux, and I think the

19:04

other BSDs, or, we open sourced

19:06

Solaris. So there were a lot of options

19:08

that were now available. So that

19:11

shifted. I think the other thing that

19:12

that shifted is that, I mean, SPARC

19:15

bluntly lost to x86, and Sun...

19:20

>> And SPARC is a hardware architecture?

19:22

>> SPARC is a microprocessor. Yeah. And

19:25

there was a time in

19:27

the '90s when, if you wanted the fastest

19:30

microprocessor, it was a RISC

19:32

microprocessor. It

19:34

was a SPARC microprocessor, or it was

19:36

MIPS, or it was Alpha. And x86 was a

19:40

commodity, and obviously

19:43

available with a personal computer but

19:44

was not faster than those RISC

19:46

microprocessors. That shifted, that

19:48

shifted in the late '90s, and, you know,

19:52

because we ran the operating system,

19:54

that was Solaris, on both SPARC and

19:56

x86 we could see how fast these x86

19:58

machines were and could see frankly how

20:00

like, you know, you talk to the

20:01

microelectronics folks, they really did not...

20:03

they kind of dismissed x86 and

20:05

dismissed Intel and you shouldn't do

20:07

that. And in particular, Intel was

20:09

very focused and architected their way

20:12

around what was called the memory wall.

20:14

And they were able, in part because

20:17

they used speculative execution, they

20:19

were able to actually make these

20:21

microprocessors that became much

20:23

faster than the RISC microprocessors.

20:25

So by the time say you are in 2004 2005,

20:29

if you want a leading-edge

20:31

microprocessor, it's x86. So that was a

20:35

big and important shift. So by the

20:36

time you're coming up, it's like, okay,

20:37

yeah, if I want this, I'll

20:40

just, like, I don't know, get, like, a

20:41

Dell box or Supermicro box, and then

20:44

I'll put Linux on it or maybe

20:46

FreeBSD, and away I go. Then the

20:49

next kind of big and important shift

20:51

that happened started in 2006. You could

20:53

argue with S3, but then

20:56

especially in those next kind of

20:58

>> '07, '08, '09, with the introduction of EC2

21:01

and now you have, like, the cloud that

21:04

starts to come into play and now like

21:06

people were like why would I even like

21:08

screw around with a server at all? I mean

21:10

it was so great to be able to just spin

21:12

up infrastructure.

21:14

>> Yeah. I remember one of my early

21:15

companies, mid-2000s, we had a server

21:19

room. We had server administrators. The

21:21

server room was always hot. And this was

21:23

a small company, mind you. This was not

21:25

not a big one. Every company needed

21:27

to do that. It's kind of amazing to

21:28

think that every single

21:29

company, no matter if you were a

21:30

website, you had your own server room.

21:32

>> And if you were a dev, you wanted to be

21:34

friends with the server admin because

21:35

when you wanted to deploy your stuff,

21:37

you know, they could do stuff

21:39

for you.

21:39

>> They could do stuff for you.

21:41

That's it, totally. And so I think that

21:43

cloud computing was really important.

21:45

This is not a deep thought, that elastic

21:46

infrastructure was really important but

21:49

the ability to have API-driven

21:50

infrastructure. And so for me

21:53

personally, I was at Sun, and

21:56

then in 2006 I started a storage

21:59

group inside of Sun, which was great. A

22:02

really successful group but so

22:04

successful that it actually attracted

22:06

Oracle as a customer for the first time

22:07

in a long time. This is, like, a

22:09

little bit of residual shame

22:11

that I have: did I attract the

22:14

marine apex predator that ate

22:16

the company?

22:17

>> Because Oracle later bought Sun, right? And then

22:19

they bought Sun, and that

22:22

closed in early 2010. I left shortly

22:25

thereafter because I could see what

22:27

Oracle was.

22:28

>> Well, I never heard a story of your

22:30

potential role here.

22:32

>> That's right. So, yeah,

22:35

Oracle... and maybe a

22:37

year later I gave a talk in 2011 with

22:41

some rather unvarnished opinions about

22:43

Oracle and Larry Ellison. In particular, I

22:44

cautioned people about

22:46

anthropomorphizing Larry Ellison. You

22:48

have to treat Larry Ellison as a

22:49

machine, like a lawn mower: you stick

22:52

your hand in the lawn mower, it'll chop

22:53

it off. Well, so, all right, so

22:55

I'm giving this talk in 2011. Again,

22:57

this is after I've left

22:59

what was then Oracle, and, you know, I

23:02

was just saying things that I felt were

23:03

obvious, but, you know, the

23:05

audience is kind of gasping and you know

23:07

and people are coming up to

23:08

me after the talk, like, do you think

23:09

there's going to be

23:11

retribution from Oracle? No, you're

23:13

misunderstanding. Like, the

23:15

lawn mower is not angry at you. It's a

23:17

machine. It doesn't

23:18

have the mirror neurons

23:20

to be. It would

23:22

almost show me that I'm wrong for

23:25

Oracle to resent what I'm saying about

23:27

the... Anyway, all the videos for

23:29

that conference go up and my video does

23:32

not go up

23:32

>> Oh, right. Okay. And so my colleagues

23:35

were like, "This is an Oracle

23:36

conspiracy." I'm like, "This is not an

23:38

Oracle conspiracy," which it wasn't. It

23:39

wasn't orchestrated by Oracle, but what

23:40

I underestimated was the

23:43

fear of the conference organizers. So

23:45

they themselves were terrified of

23:47

offending Oracle.

23:48

>> Yes. Even though it probably would have

23:50

been fine.

23:50

>> No. So the talk did finally go up.

23:52

Before the talk starts, there is a

23:54

disclaimer. The views in this talk do

23:56

not represent the views of the USENIX

23:58

Association. And you're like, "All

23:59

right, I get it. Like never seen this

24:01

disclaimer before, but fine." Then during

24:03

the talk, you know, the format of the

24:04

talk is you've got a slide and then you've

24:06

got like a little blank script and then

24:08

you've got this talking head in the lower

24:09

right corner. There's like kind of this

24:11

dead space above the speaker. They took

24:13

this disclaimer and they rejustified it

24:16

and they put it above my head the entire

24:19

time I'm speaking. And, I

24:23

mean, maybe in this regard they were

24:25

prescient because to this day, if Ellison

24:28

is mentioned on Hacker News or Oracle is

24:30

mentioned on Hacker News, someone will

24:32

immediately cite minute 33 of this talk

24:35

which is when I go on this kind of

24:37

Oracle

24:38

again I don't view it as a rant I view

24:40

it as just like me describing what is

24:42

obviously true that we all know but

24:43

anyway so I I had left uh I left Oracle

24:47

after after they bought

24:48

>> So we're now around like 2010.

24:51

So cloud has taken off. x86 architecture

24:54

is everywhere. Linux is is now winning

24:56

both for small-time servers but also on

24:58

the cloud. And then what happens? This

25:00

was an interesting time when Google

25:02

started to figure out that hey they

25:04

could do something interesting on their

25:05

cloud, right?

25:06

>> Yeah, that's right. So this is still a

25:07

little bit before that. So this is in

25:09

kind of from I would say from 2010 to

25:11

about 2014

25:13

is a period of relentless

25:16

execution from AWS. AWS is executing so

25:20

extremely well. There are not really

25:22

other public cloud options. There's like

25:24

kind of Azure kind of drifting out

25:25

there.

25:26

>> I think people people forget that that

25:27

you know like GCP on paper has been

25:29

around from 2009 but up to like 2014 it

25:32

was like it was almost like a joke.

25:33

>> It was a joke. I would say before it was

25:35

it like it existed but it was a joke and

25:37

the and in particular at every single

25:40

re:Invent, Amazon would announce a new

25:42

price cut. And if you were a competitor

25:45

to AWS, you were like dreading re:Invent

25:47

because here comes another price cut. If

25:49

you are a partner of AWS, you're

25:51

dreading re:Invent because here comes the

25:52

announcement of a new service that

25:54

competes with what you're making. I

25:55

>> I think people who have not been around

25:56

have forgotten, but it really has

25:57

happened and cuz it's not been the norm

25:59

the last like let's say 5 10 years or

26:01

so.

26:01

>> Well, and in particular, they did a

26:02

couple things that are just like, man, you

26:04

got to tip your hat to just I mean Jeff

26:07

Bezos is the apex predator of

26:08

capitalism. like Larry Ellison may be

26:10

the lawn mower, but Bezos is ultimately

26:13

the apex predator because the thing that

26:14

was so impressive is they were able to

26:18

give people the idea that this was a

26:19

terrible business. So, in particular,

26:21

they did not break out their financials.

26:23

So, everyone's like, "Oh my god, what an

26:26

awful business." Like, they're cutting

26:27

the price every year. Like, you do not

26:29

want to like this is a, you know, a

26:31

classic red ocean. It's bloody. You

26:33

don't want to compete. And so, we were

26:34

at Joyent. We were actually competing

26:35

head-to-head with AWS. So you

26:38

were offering uh

26:39

>> a public cloud. So we public cloud and

26:40

then unlike AWS taking the software that

26:43

we had used to run the public cloud and

26:44

making it available for people that

26:46

wanted to run a cloud on prem on their

26:47

own hardware. So people that would buy

26:49

Dell or HP or Supermicro, they would

26:51

buy our software and they would run it

26:53

on there and get a cloud. So we we ran a

26:56

public cloud and we knew what the

26:58

economics of a public cloud were. Namely

27:00

pretty good. Margins were good. And so

27:02

what we knew that Amazon that Amazon

27:06

wasn't volunteering, but what we knew is

27:09

that AWS S3 was underwriting a war on

27:13

big box retail. S3 was paying for your

27:15

prime shipping. It was a genius move.

27:17

And so

27:18

>> also some some insider information that

27:20

you had because you did your own thing.

27:22

>> Well, we know that the margins are very

27:24

good. And then of course, I mean, we did

27:26

you will be unsurprised to learn that

27:27

several of Joyent's most prominent

27:29

customers were retailers. Retailers,

27:32

this was not lost. Retailers are like,

27:34

"Gee, I wonder what's happening."

27:35

Retailers are like, "If you think I'm

27:36

going to take my dollars and spend them

27:38

on AWS so AWS can I so Amazon can go to

27:42

war with me, like no thank you." There

27:44

was a period of time when I felt like in

27:46

order to be in the cloud, you have to

27:48

implement every AWS API. So there's this

27:50

idea that you had to be API compatible

27:53

with EC2. There's a company called

27:54

Eucalyptus that tried to do this. It was

27:56

just a disaster. And part of the reason

27:58

it was thought that GCP and Azure could

28:00

never compete with AWS because they

28:01

could never be API compatible. And

28:04

so I am convinced that the because what

28:06

changes what changes in like 2015? What

28:08

starts in 2015? Kubernetes. And I think

28:13

that part of that initial attraction to

28:15

Kubernetes is that people wanted to get

28:18

some optionality around their cloud and

28:21

they they felt locked into AWS. They're

28:23

like, I'm not using all this stuff. I'm

28:24

not using Elastic Beanstalk. I'm not

28:26

using Greengrass. I'm not using kind of

28:27

these more... I'm not using Redshift.

28:30

What I actually want is this kind of

28:31

basic infrastructure, and Kubernetes now

28:33

gives me this layer upon which I can

28:36

deploy and get some sort of true cloud

28:39

neutrality. So multicloud didn't really

28:42

exist I would say before Kubernetes and

28:44

I think a lot of that especially early

28:46

momentum behind Kubernetes is around

28:49

this idea of like I need to have some

28:50

optionality in here. I want to have

28:52

actually be able to go to GCP. So I

28:54

think you know and I don't I think it's

28:56

giving Google slightly too much credit

28:58

but only slightly too much credit to say

28:59

it is a masterstroke.

29:00

>> On the podcast I had Kat Cosgrove who's

29:03

uh, a release project manager on

29:05

Kubernetes and you know she's been in

29:07

the project for a long time and I asked

29:08

her she's not she was never a Google

29:09

employee but I asked her why do you

29:11

think Google open sourced Kubernetes

29:13

which you know they have Borg which is

29:14

amazing and they kind of built honestly

29:17

a better version for for the for

29:18

external and they just released it just

29:20

like that. They put a lot of work in it

29:21

and to me it didn't really compute like

29:23

why would Google like what is the

29:25

business reason and she told me that she

29:27

thought again speculation from the

29:29

outsiders. She thought that they

29:30

probably thought that it would help

29:32

Google cloud

29:33

>> that's right

29:34

>> to have the a container which is now

29:36

portable and now you can give the

29:38

promise that if you run this on Azure

29:39

especially AWS you could come over so it

29:42

kind of makes sense. Is this your

29:44

thinking? Yeah, absolutely. But I think

29:46

I think that is definitely the argument

29:48

that Kubernetes proponents would make

29:50

inside of Google

29:51

>> in terms of like why they did it. Nobody

29:53

prevented it. You know what I mean? It's

29:54

like they they kind of open sourced it.

29:56

>> Google was a pretty cool place in the

29:58

sense that it was very bottoms up as I

30:00

understand back then still.

30:02

>> Yeah. And and then I think part of their

30:04

you know, it was Craig McLuckie who really

30:06

pushed for the CNCF the formation of the

30:08

CNCF around Kubernetes to give it kind

30:10

of a foundation home. I did I do

30:12

remember one conversation with Craig was

30:14

that and I were talking early as he's

30:16

contemplating the CNCF and he's like

30:18

well I think this is going to allow

30:20

Kubernetes to get the marketing dollars

30:21

that it needs. I'm like don't you work

30:23

for the most profitable company on earth

30:25

like do you really isn't it just like

30:27

gushing cash over there and you can't

30:29

get like you know a couple million bucks

30:31

for marketing for this thing but no

30:32

apparently you can't. So, but so I I

30:34

think that that the the argument that

30:36

people were making internally was about

30:37

we should be encouraging cloud

30:39

neutrality because we are the ones that

30:41

have something to win and they're right.

30:42

Um and and they did and GCP is now not

30:45

an afterthought. GCP is very important.

30:47

It's a very big business and I think

30:49

that they've got is Kubernetes to thank

30:51

solely? No, but I think it's played an

30:53

important role for sure. And where are

30:54

we today in terms of the the hardware

30:57

and the software stack running

30:59

specifically thinking of these big

31:00

clouds what's happening inside the likes

31:02

of Meta these giants as I understand you

31:04

know they're no longer just like you

31:06

know ordering servers from Dell or or

31:08

wherever

31:08

>> never were never what they do

31:11

>> they [clears throat] so it it's kind of

31:13

funny because for all of these folks

31:14

they took a somewhat similar path they

31:16

never were because in Google's earliest

31:18

days they were assembling machines from

31:21

Fry's, you know, RIP Fry's. Fry's being a

31:23

local electronics shop that has long since

31:25

disappeared, but they were kind of

31:27

famously velcroing machines together and

31:29

finding

31:29

>> so so they bought like the processor,

31:32

the the different networking switch,

31:34

whatever.

31:35

>> And they had this idea that like it

31:36

doesn't matter what junk we run on

31:38

because, you know, our our software is

31:41

going to run as a distributed system. It

31:42

actually doesn't matter. We don't need

31:43

ECC protected memory because it doesn't

31:45

matter if your DIMMs fail. And so it's I

31:47

think they learned well it does matter a

31:49

little bit if your DIMMs have rampant

31:51

data corruption. Like, DIMMs failing,

31:53

that's actually not a problem. DIMMs, your

31:55

memory returning the wrong thing like

31:58

that is a problem. You can actually like

31:59

you turn that like next thing you know

32:00

like your software inserts that into a

32:02

row into a database and like yeah now

32:05

you got

32:05

>> yeah that is correctness is a problem.

32:07

>> Yeah. Yeah. Correctness is a problem.
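
The difference between a DIMM that fails outright and memory that silently returns the wrong bits can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `flip_bit`, `store`, and `load_checked` are hypothetical, and a software CRC-32 stands in for the error-correcting codes that ECC memory implements in hardware. It is not a description of how Google's systems worked.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted, simulating a memory bit flip."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def store(value: bytes) -> tuple[bytes, int]:
    """Keep a checksum next to the value (a software stand-in for ECC bits)."""
    return value, zlib.crc32(value)


def load_checked(value: bytes, checksum: int) -> bytes:
    """Refuse to return data whose checksum no longer matches."""
    if zlib.crc32(value) != checksum:
        raise ValueError("corruption detected")
    return value


if __name__ == "__main__":
    balance, checksum = store((1000).to_bytes(4, "big"))
    damaged = flip_bit(balance, 30)

    # Without a check, the wrong value flows straight into the application.
    print("unchecked read:", int.from_bytes(damaged, "big"))

    # With a check, the corruption becomes a visible failure instead.
    try:
        load_checked(damaged, checksum)
    except ValueError as err:
        print("checked read:", err)
```

A failed read is something a distributed system can route around; an unchecked wrong value is the case that ends up as a bad row in a database.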

32:08

It's like okay overshot the mark. So by

32:11

the time they're like okay we're not

32:12

going to velcro machines together. we're

32:14

not going. But what by that point in

32:16

time, you know, the business was

32:17

established enough that they actually

32:19

did they built the machines that were

32:21

fit for scale. So they have a a great

32:25

book that was written um in the kind of

32:27

the mid 2000s, the warehouse-scale

32:29

computer where they talk about all the

32:30

things they did DC bus bar really

32:32

thinking about power across the entire

32:34

DC. So they kind of they went from from

32:37

being kind of too cheap for kind of Dell

32:39

or even Supermicro to then being much

32:42

better engineered than those systems

32:43

ever were. So they were never really

32:45

meaningful customers. Uh and ditto for

32:47

Facebook Meta. They were they were never

32:50

really meaningful. I mean they they

32:52

kicked them out very early and did their

32:53

own stuff. Brian just talked about how

32:55

Facebook built their own servers because

32:57

off-the-shelf solutions didn't work at

32:59

their scale. And what's interesting is

33:01

that companies like Meta and Google

33:03

didn't just build better hardware. They

33:05

also built incredible internal tools.

33:07

Tools for safe deployments, feature

33:08

flagging, experimentation, debugging,

33:10

analytics, the whole stack that lets

33:12

teams ship fast and with confidence.

33:14

Most companies never get access to this

33:16

level of infrastructure. You either

33:17

build it yourself, which takes years,

33:19

and large engineering teams, or you make do

33:22

with scattered tools that don't talk to

33:23

each other. That's exactly where Statsig

33:25

comes in. Statsig is our presenting

33:27

partner for the season, and they give

33:28

every engineering team access to the

33:30

kind of tooling that only the biggest

33:31

tech companies used to have internally.

33:33

At its core, Statsig is a toolkit for

33:36

safer deployments and experimentation.

33:38

You ship a new feature to 10% of users

33:40

behind a feature gate. You validate that

33:42

it behaves correctly, watch the metrics,

33:44

and expand to the remaining 90% only when

33:46

you're confident. And if something goes

33:48

wrong, you can turn it off instantly,

33:50

long before it affects everyone. And

33:52

safe deployments require visibility.

33:54

Statsig includes analytics, both product

33:56

analytics and infrastructure analytics.

33:58

So you can actually see what your code

34:00

is doing in production, errors,

34:01

performance changes, funnels, user

34:03

behavior, because you cannot ship safely

34:05

if you can't see what's happening.

34:07

Companies like Microsoft and Notion run

34:08

hundreds of experiments per quarter with

34:10

Statsig, velocity that used to require

34:12

entire platform teams to build and

34:14

maintain. This used to be infrastructure

34:16

available to maybe 10 or 15 tech giants.

34:19

Now startups and mid-size teams use

34:21

Statsig to ship quickly without breaking

34:23

things. If you want to give your

34:24

engineering team world-class tooling

34:25

from day one, go to

34:26

statsig.com/pragmatic.

34:28

There's a generous free tier, a

34:30

$50,000 starter program and affordable

34:33

enterprise plans. And now let's get back

34:35

to the conversation about the history of

34:37

computing and what might be coming next.

34:39

>> And and this was independent. So like

34:41

both Google and Meta both came to the

34:42

conclusion of like we should just build

34:43

our own stuff

34:44

>> and and Microsoft and and Amazon all

34:47

came to the independent conclusion

34:49

because the scale at which they needed

34:51

to run was not at all the scale at which

34:54

Supermicro and Dell and HP were geared.

34:56

What they were geared to do was to run

34:58

the servers in your server room where

35:00

you needed to know the devs, right?

35:01

Where it's like I'm going to have a

35:02

little rack. It's going to have six

35:04

servers. Then maybe it's got 12 servers.

35:05

Okay, maybe we grow to 24 servers.

35:08

That's what they were designed to do. If

35:09

you're like, "No, I want to buy servers

35:11

by the thousands because I've got a

35:13

public cloud business." Like, if you

35:15

want to buy servers by the thousands,

35:16

there is no product from those companies

35:18

for you. And in very, very basic ways,

35:20

well, like the DC bus bar at every

35:23

juncture, they've been designed to be a

35:25

personal computer that you happen to be

35:27

slapping many personal computers

35:28

together, but they're not designed to

35:30

actually run infrastructure at scale.

35:32

So, and that was happening inside

35:34

effectively all the hyperscalers. And

35:36

Joyent, meanwhile, was bought by Samsung

35:38

in 2016. Joyent was bought by Samsung

35:40

because their cloud bill was off the

35:42

charts and

35:44

>> they bought they bought you to

35:46

>> bring it in house.

35:47

>> Yeah. And and there was not a product

35:49

they could go buy. So they went to go

35:50

buy a company.

35:52

>> So you're like, "Wow." And it's like,

35:53

"Wow, that's a big AWS bill." It's like,

35:55

yes, very big AWS bill. But then that

35:58

was not a product that or company that

35:59

was available for for you know the next

36:01

S. What does the next Samsung do? like

36:02

well that's one less company available

36:04

to buy. Um so when we were contemplating

36:07

the next thing in 2019 one of the things

36:09

that we had seen is that and we felt we

36:12

earnestly believe that one cloud

36:14

computing is the future of all computing

36:16

not a deep thought that elastic

36:18

infrastructure, API-driven infrastructure,

36:20

that is modernity

36:22

One. Two, you shouldn't be able to only

36:25

rent that you should be able to buy that

36:27

own it run it in your own data center

36:29

why would you want to do that well you

36:30

might want to do that for risk

36:32

management for security or for economics

36:35

because it, you know, if you're at a

36:36

certain scale, you'd rather own it than

36:38

rent it.
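
The rent-versus-own argument comes down to break-even arithmetic, which can be sketched roughly as follows. The function name `break_even_months` and every figure in the example are hypothetical and chosen only to make the arithmetic visible; they are not Oxide, Joyent, or AWS numbers.

```python
import math
from typing import Optional


def break_even_months(purchase_cost: float,
                      monthly_running_cost: float,
                      monthly_cloud_bill: float) -> Optional[int]:
    """Months until owning becomes cheaper than renting, or None if it never does.

    purchase_cost        -- up-front price of the owned hardware
    monthly_running_cost -- power, space, and staff for the owned hardware
    monthly_cloud_bill   -- what the same workload costs to rent
    """
    monthly_saving = monthly_cloud_bill - monthly_running_cost
    if monthly_saving <= 0:
        # Renting is at least as cheap every month, so buying never pays off.
        return None
    return math.ceil(purchase_cost / monthly_saving)


if __name__ == "__main__":
    # Purely illustrative inputs.
    months = break_even_months(purchase_cost=1_200_000,
                               monthly_running_cost=20_000,
                               monthly_cloud_bill=70_000)
    print("months to break even:", months)
```

The larger the monthly bill relative to the running cost, the sooner the purchase pays for itself, which is why the argument strengthens with scale.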

36:39

>> And I think, you know, before Oxide or

36:41

like in 2019 or even in like, you know,

36:43

2020, 2021, if you were like a midsize

36:46

company, you know, like not big enough

36:48

to build out your own custom cloud and

36:50

build everything that the hyperscalers

36:51

did, you could like buy some

36:53

off-the-shelf like HP or Dell, like a

36:56

bunch of them. I think that's what

36:57

Basecamp did. I I think they posted that

36:58

they they bought a bunch of bunch of

37:00

these things. They rented a space in a

37:02

in a one of these shared or or or I

37:04

think two different locations. They put

37:06

in their boxes with all the memory and

37:08

then you know they kind of set it up and

37:10

and put it together. So I guess those

37:11

were the two options, right?

37:13

>> Yeah, those are the two options and I

37:14

think that, you know, Basecamp ended up

37:15

being a real poster child for the

37:17

economic advantage because I mean DHH

37:19

you know, obviously outspoken, and uh the

37:22

economic advantage was really really

37:24

really clear. They're also at a scale

37:26

which is like not the scale that we're

37:28

targeting, right? That the scale we're

37:30

looking at is a much larger scale. And

37:32

so the economic argument is actually

37:35

even more compelling when you're at that

37:36

larger scale. I love it when you know

37:38

the VCs that passed on us because they

37:40

felt there was no market then would send

37:42

me like the DHH blog post. It's like why

37:45

are you sending this to me? I should be

37:46

sending this to you. Like I know this.

37:48

We just knew the economics of it and we

37:50

knew couldn't [clears throat] predict

37:51

exactly what the trends would look like

37:53

but but believed that there would be

37:55

folks that were born on the public cloud

37:57

that would outgrow the economics of the

37:59

public cloud and want to go on prem.

38:00

>> Economics aside, what does it take to

38:02

build one of these things? And I I I saw

38:04

one of these things. We we'll put in a

38:06

picture of it. It's like a proper like,

38:08

you know, like my my 9 ft tall rack.

38:11

It's it's big. It's it it feels like

38:14

you're putting like I don't know like 16

38:16

or 32 of those of those like you know

38:18

Dell things in terms of size just to get

38:20

sense.

38:21

>> Yeah. Yeah. We put 32 compute sleds in

38:22

there. That's right.

38:22

>> And and what does it take what did it

38:24

take to actually build it? What did you

38:26

need to design in terms of hardware and

38:28

then software?

38:29

>> Yeah. So well and we knew this too that

38:31

going into the company we knew we were

38:32

taking a clean sheet of paper right and

38:34

so we were deliberately like no we're

38:36

going to start with a problem. We're not

38:37

going to build it out of Dell, HP, Supermicro.

38:39

We're going to start with a problem and

38:41

how do you best solve the problem? And

38:42

as it turns out, like there were a whole

38:44

bunch of there's a lot of technical debt

38:46

that had been accrued by this kind of PC

38:48

ecosystem. So I mean, you know, God,

38:51

where do you start? Uh just on the

38:52

environmentals like on power, right? The

38:54

fact that you got AC power in each of

38:56

these Dell, HP, Supermicro.

38:57

>> Yeah. So if you like put 16, you have

38:59

like 16 separate AC

39:01

>> times two because you have two power

39:03

supplies per 1U or 2U chassis. Two

39:05

power supplies. By the way, there are

39:06

two fans sitting on those power supplies

39:09

and those and those fans are actually

39:12

what wear out if you go to the like in

39:14

terms of like the whirring fans. It's not

39:15

just coming from the computer, it's

39:16

coming from the power supplies because

39:17

those power supplies are dense. They're

39:19

packed with stuff. So, they've got to

39:21

overcome a huge amount of static

39:22

pressure. So, like that's not the way

39:24

anyone does it at scale. The way people

39:26

do it at scale is you've got a DC bus bar,

39:28

you've got a power shelf that

39:30

is much more efficient

39:32

>> and that rectifies from

39:34

AC to DC and then you run DC up and down

39:36

and then you blind mate into that.

39:37

So we knew we were going to do that.
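
The part-count arithmetic behind that choice can be sketched as below. The two-supplies-per-chassis figure comes from the conversation; `PowerInventory`, `per_server_power`, `shared_shelf_power`, the one-fan-per-supply default, and the shelf's rectifier and feed counts are all hypothetical values picked for illustration, not Oxide's actual design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerInventory:
    ac_cords: int
    power_supplies: int
    psu_fans: int


def per_server_power(servers: int,
                     psus_per_server: int = 2,
                     fans_per_psu: int = 1) -> PowerInventory:
    """Conventional rack: every chassis carries its own AC power supplies."""
    psus = servers * psus_per_server
    return PowerInventory(ac_cords=psus,
                          power_supplies=psus,
                          psu_fans=psus * fans_per_psu)


def shared_shelf_power(rectifiers: int,
                       ac_feeds: int,
                       fans_per_rectifier: int = 1) -> PowerInventory:
    """Rack-level power shelf: rectify AC to DC once, then feed a DC bus bar."""
    return PowerInventory(ac_cords=ac_feeds,
                          power_supplies=rectifiers,
                          psu_fans=rectifiers * fans_per_rectifier)


if __name__ == "__main__":
    # 16 conventional servers, as in the example from the conversation.
    print(per_server_power(16))
    # Hypothetical shelf: 6 rectifiers on 2 AC feeds.
    print(shared_shelf_power(rectifiers=6, ac_feeds=2))
```

The per-server count grows linearly with the number of machines, while the shelf's count depends only on the rack's total power budget.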

39:38

>> That's a little electronics engineering

39:40

right there.

39:40

>> Yeah. Yeah. That Yeah. The power

39:41

engineering for sure and we knew we were

39:43

going to do that. We also knew that by

39:45

taking a clean sheet of paper that we

39:47

would have opportunity made available to

39:49

us that we weren't necessarily thinking

39:51

of and that manifested pretty early. So

39:54

we blind mate into power which is to say

39:57

that when you feed a sled in that power

39:59

connector you don't see it it's at the

40:01

back you you lock the sled in blind

40:02

mates into power and we had assumed that

40:05

we were going to do what Facebook and

40:06

Google and others have done Amazon done

40:08

and had networking out the front in the

40:10

cold aisle but as we were you know

40:12

taking a clean sheet of paper talking to

40:13

some connectivity vendors they asked us

40:16

like why are you wait a minute you guys

40:17

are like taking a clean sheet of paper

40:19

why are you putting cabling in the front

40:21

like why wouldn't you also blind mate in

40:23

the network and the networking

40:25

connection and we were like can you do

40:27

that? They're like oh you can definitely

40:29

do that like well why don't the

40:31

hyperscalers do that? It's like, oh,

40:32

they would all tell you that if they

40:34

could start over today, they would blind

40:36

mate the networking and they're just too

40:37

afraid to do it at this point, which is

40:39

like, I mean, that was like catnip for

40:40

us, you know, like they're too afraid to

40:42

do it. Like, okay, we got to And one of

40:44

the very early holy god, we're going to

40:46

bet the company decisions was

40:48

blind mating networking, because if

40:50

blind mating networking doesn't work,

40:52

you've got nothing. You don't have a

40:54

product.

40:54

>> And and so what is the difference in

40:56

blind mating networking versus

40:57

>> It means there is no cabling in the

40:58

system at all. So when you've got a a

41:00

sled, you are blind mating into a cabled

41:03

backplane. So it's cabled in

41:05

the factory. So the the the operator

41:08

>> So when the box comes in, that's why I

41:09

didn't see any cables. It's it's inside.

41:11

It runs inside.

41:12

>> It runs down the back. And so

41:14

>> versus when I look at the pictures of a

41:16

data center of let's say Google, you you

41:17

see they're very neatly organized. It's

41:19

like I love organization. So it's like

41:21

beautiful, but it's cables everywhere

41:23

and you can see.

41:24

>> So you don't have that.

41:25

>> We don't have that. And in particular,

41:26

so because there's no cabling, there's

41:27

also no miscabling, right? So, so every

41:30

computer is not actually on just one

41:32

network. It actually needs to be on

41:33

three. It's on a power detect a presence

41:36

detect network. It is on a service man a

41:38

service processor network. And then it's

41:40

on that high-speed network that you

41:41

really care about like the actual

41:42

network. In any facility, you you need

41:45

another network for power environmentals

41:47

and so on. It's very easy to have

41:49

miscabling. That's got to go to a

41:50

different router. It's like you there's

41:52

a bunch of of just complexity that we

41:55

eliminate because we do and then part of

41:59

that decision came out of an an arguably

42:01

earlier bet the company decision which

42:03

was we did our own switch. So we also

42:05

did in addition to doing our own compute

42:06

sled we did our own switch

42:07

>> and last time you told me about this and

42:09

in our deep dive we did a little bit

42:10

that like at first you said we did our

42:12

own switch and I was like yeah okay cool

42:14

you did your own switch and then you

42:15

told me that actually like that is a

42:17

second computer to build. Can you can

42:19

you tell me why? And it's funny because

42:21

we went when we went through Sand Hill

42:23

initially raising money for the company,

42:25

nobody asked us.

42:26

>> Sand Hill Road. Exactly. And we were

42:29

definitely so people be like, I've got a

42:30

technical question for you and you're

42:31

like, oh god, here comes switch. It's

42:33

the switch question. But then be no some

42:35

other random asked questions like all

42:36

right, that's not a very good question.

42:38

But nobody was asking us about the

42:39

switch. And we were concerned about the

42:41

switch because we'd already come to the

42:43

conclusion in order to make this thing

42:44

really work, we had to do our own

42:46

switch. And the reason we had to do

42:47

our own switch, if we didn't do our own

42:48

switch, it would be a third-party

42:50

integration nightmare and we wouldn't be

42:52

able to actually solve the problem that

42:53

we're trying to solve, which is when

42:54

this thing shows up in your data center,

42:56

we want this thing to come out of

42:58

the crate. We want you to wheel it up.

43:00

We want you to put in power and

43:01

networking and go. We do not want you to

43:04

have to to cable anything. It should be

43:06

the level of operator

43:08

involvement should be really minimal.

43:10

So, we'd already come to the conclusion

43:11

that in order to make this thing

43:12

operable and manageable, we need to do

43:14

our own switch. And so you're saying

43:16

that like buying cuz a switch to me

43:18

sounds like a somewhat simple component

43:20

and you're you're going to tell me why

43:22

it's not.

43:22

>> Oh yeah, it's definitely not. No, but

43:24

that but that attitude is very

43:25

important. If you want to go build your

43:26

own switch, I encourage you to have that

43:28

attitude as long as you possibly can

43:29

because otherwise you won't go do it.

43:31

>> So So what does your switch what is

43:33

switch being obviously the networking

43:34

switch? What is your networking switch

43:36

do or or that made it so important for

43:38

you to build it as opposed to like going

43:40

to one of the many suppliers and saying,

43:42

you know, let's get your

43:43

>> not many suppliers. Oh,

43:44

>> so if you actually go to the actual

43:46

switching silicon is coming from like it

43:49

was like one and a half providers.

43:51

>> Oh,

43:52

>> it's all Broadcom and so what you're

43:53

actually talking about is Broadcom

43:54

silicon. Um what we discovered is is

43:58

this actually interesting piece of

44:00

actually Intel silicon from a company

44:01

they had bought called Barefoot and we

44:03

found Intel Tofino which allowed us to

44:05

have true programmable networking. So we

44:06

we use Intel Tofino. Intel later killed

44:09

Tofino. So complicated relationship with

44:11

Intel over this. uh we fortunately have

44:13

procured enough Tofino to be able to

44:15

take we bought ourselves the time we

44:17

need to kind of design our next-gen switch

44:19

but that programmability was very very

44:21

important for us um and that we were not

44:23

going to get from Broadcom. Broadcom is a very

44:25

proprietary company we were not going to

44:28

get a bunch of the things that we needed

44:30

in building that switch we were not

44:31

going to get out of Broadcom so it was

44:33

ended up being very important we were

44:35

concerned I mean again another one of

44:37

these kind of bet the company decisions

44:38

very very concerned about about having

44:40

our own switch integrating our own And

44:42

what we found is that was a that was a

44:45

win in so many dimensions. So many

44:48

dimensions that we did not anticipate.

44:50

And as now you can't imagine the company

44:52

without having to sometimes do stuff and

44:54

you might get some wins. that I

44:56

absolutely well I think also like

44:57

whenever you're deliberating something

44:59

big like that you it the fact that it is

45:01

big kind of forces you to really

45:03

deliberate and then once you commit to

45:04

it to taking that big risk you often see

45:08

unexpected dividends like well as long

45:10

as we're going to do this as long as we

45:11

are taking a clean sheet of paper as

45:12

long as we're doing our own switch we

45:13

can blind mate the networking. If we were

45:15

not doing our own switch we really

45:16

couldn't blind mate the networking. We

45:18

really needed to be able to own both

45:20

sides of that in order to be able to do

45:21

our own switch or blind

45:24

>> a lot of us you know listeners, viewers

45:25

are software engineers, so we don't know

45:26

as much about hardware. Obviously, we

45:28

know we we know how the things work, but

45:30

can you tell me a bit on what it

45:32

actually means to design or build a

45:34

computer? Cuz you know, I I'll give you

45:35

the the novice approach, which is

45:37

obviously going to be wrong. But the

45:38

novice approach is like, oh, here's a

45:40

here's a processor, here's a few chips,

45:42

here's a mainboard, I'll just put it on

45:44

there and I'm done. But when I was in

45:46

your lab at Oxide, uh you told me that

45:49

one of the first engineers turned out to

45:51

be a radio frequency engineer. You told

45:53

me how this is great because of all

45:54

the FCC approvals and all these things

45:56

and I was like okay this is way more

45:58

involved than I ever imagined.

46:00

>> Yeah, it's very involved.

46:01

>> How do you build a new computer?

46:03

>> First of all, it's all I mean it would

46:04

be a lot easier if it were all slower,

46:06

right? The problem is it's very fast.

46:09

It's high speed. So the connection to

46:12

memory via now DDR5, double data rate

46:15

memory 5 is ridiculously high throughput

46:19

is very from a signal integrity

46:21

perspective really complicated. These

46:23

boards by the way ultimately this is all

46:25

analog. We think of it as digital and it

46:26

is digital but digital is like a lie

46:28

that the double-Es allow us to tell

46:30

ourselves. It is actually like you are

46:33

talking about signals that are racing

46:35

through a a substrate and the and with a

46:39

PCIe or DDR5 the all of so those signals

46:43

are very complicated to lay out that's

46:46

complicated the actual like how does the

46:49

computer start like this computer is

46:51

like it's like a it's like a a triple 7

46:54

right or you know I a 747 used to be my

46:56

favorite jet to kind of pick on but now

46:58

the 747 is retired so I got to pick

46:59

something else and I'm not going to pick

47:00

another Boeing aircraft I don't think I

47:02

an A380 I guess, right? I should pick an

47:04

Airbus. But you think about like the

47:07

okay, an Airbus doesn't just like come

47:09

by itself like it needs an airport. It

47:12

needs like a runway. It needs it needs

47:14

all the infrastructure to feed it. Well,

47:16

so too for a microprocessor, it it

47:19

doesn't like just the power sequencing

47:21

for those things is very complicated. It

47:24

needs another surround that manages the

47:27

power distribution network that actually

47:29

manages its power on sequencing that

47:31

manages all of its environmentals that

47:33

manage its connection to memory to IO.

47:36

So it is it is just fractally

47:39

complicated uh to the point that people

47:41

often just take reference designs and

47:42

iterate on them. They don't actually

47:44

really innovate on this stuff because

47:45

it's it takes so long. And you told me

47:46

this was really interesting last time

47:48

that uh as I understand reference design

47:50

means correct me if I'm wrong that

47:52

you're an electronics engineer or or

47:55

hardware engineer and you want to build

47:56

a new hardware and you take an existing

47:58

reference that has been tested measure

47:59

it out like it doesn't create accidental

48:02

like all sorts of radio frequency things

48:04

and then you implement that. But you

48:05

told me that this is not what you did.

48:07

You also told me that it's pretty hard

48:08

to find electronics engineers who are

48:11

used to not doing reference design but

48:13

who are brave enough to like

48:15

>> who are brave. Yes. I would say that in

48:18

in in computer design in particular, the

48:21

high-speed designs are so hard. People

48:24

got very accustomed to taking the

48:26

reference designs and it was harder to

48:29

find folks that were willing to take a

48:31

clean sheet of paper and we we

48:33

ultimately found them. I mean, and we've

48:35

got a a double E team that is

48:37

extraordinary

48:38

>> and double E is electronics engineer,

48:40

right?

48:40

>> And yeah, and absolutely fearless. Um,

48:43

and in part because like they're

48:44

actually but they didn't spend their

48:46

careers at Dell and HPE. Like they're

48:49

coming No, they're like coming from like

48:50

GE medical where they worked on CT

48:52

systems.

48:53

>> Wow. H how did that happen?

48:55

>> Uh how did they come to Oxide?

48:58

>> It's it's it's not but it feels like

49:00

such a different field. I would have

49:01

assumed naively that you know if you're

49:03

building a computer you'll you'll try to

49:05

get electronics engineers who have built

49:07

computers

49:07

>> you would think. Um and then we and that

49:10

was probably our thought as well and

49:11

then we discovered that we were

49:13

>> not getting along with those engineers.

49:15

Well, we didn't hire them because we

49:16

were but we were just like finding like

49:17

there's a lot of friction because there

49:20

wasn't a real first principles approach

49:22

from those folks. And this is where you

49:24

get to especially you get to talk to

49:25

folks that like been at Dell for a

49:27

generation and like for any design

49:30

they're used to calling what's called

49:31

the FAE which is the the the the field

49:34

applications engineer for you know the

49:36

for the voltage regulator. It's like

49:38

well the FAE gives me the design. It's

49:40

like all right well how do you know that

49:42

it's the right design? Well no he there.

49:43

So it's like all right so like let's go

49:45

hire that person then let's forget you.

49:47

And we were really just we were

49:50

struggling. I was struggling to get

49:52

outside of my own personal network to

49:54

find um the right engineers. Um and we

49:57

were kind of brainstorming like how can

49:59

we um get people to see the company who

50:03

wouldn't otherwise see it

50:04

>> and specifically for hardware engineers

50:05

like we're talking about.

50:06

>> Yeah. And just in general, but specifically

50:09

Yeah. For double Es it was feeling

50:10

especially acute. One of the thing you

50:12

we're kind of brainstorming as a team

50:14

and uh you know one of our engineers

50:16

said you know I you know the values are

50:18

very important to us at Oxide which they

50:20

are and I relay Oxide's values and our

50:24

principles to people outside of Oxide

50:26

and they're like that's just

50:28

and I explain that like you know

50:30

normally I would agree with you but uh

50:32

it's when I get to the compensation

50:35

people that their heads turn because our

50:37

compensation is transparent and uniform

50:40

and people are like, "Wait, what?" And

50:43

I'm like, "I could write a blog entry on

50:45

it." Like, "Yeah, that'd be like that

50:46

would be great." I'm like, "Okay." And

50:48

so, up until that point, we had not

50:49

talked about it at all. We had not

50:50

talked about it publicly at all. I just

50:52

came up with the idea that like

50:54

compensation is just private. It's just

50:55

not something you talk about with

50:56

people, you know, and you go to Levels.fyi

50:59

or some of or some of the forums

51:01

you're like anonymously asking, people

51:03

are anonymously sharing that. That's how you

51:04

get information.

51:05

>> That's how you get information. And so I

51:06

kind of had this idea that it was that

51:07

it just is not something that you and so

51:10

we wrote this blog entry in March of

51:11

2021 and it sent our hiring nonlinear

51:15

and it wasn't that people were like oh

51:17

my god I want to work for a company

51:18

where everyone's paid the same like that

51:20

is like that's like

51:20

>> yeah cuz your compensation was both the

51:22

same and you also put the number

51:23

specifically I think it was something

51:25

like $200,000 back then.

51:27

>> Uh uh yeah with it was a little bit less

51:29

back then but in a bit more than that

51:31

yeah I the uh now we just got another

51:33

raise so now I've lost track. It was 207

51:35

but now it's more than that. I actually

51:37

don't know because the one thing is when

51:38

compensation is uniform like you don't

51:40

keep total track of like oh did like

51:42

literally people were like wait a minute

51:44

like I got there's an error in my

51:45

paycheck I just got paid more. People

51:46

like, "No, no, we got a raise." Like,

51:47

"When was that?" Like, "No, it was at

51:49

the last all hands." Like, "Oh, you

51:51

know, I did have to go to the bathroom

51:52

like at the end of last all all hands. I

51:54

didn't listen to the recording. I guess

51:55

I missed my raise." Like, "Yeah, yeah,

51:56

you got to pay attention around here."

51:57

But it was more that what what drew

52:00

attention was that people engineers in

52:02

particular were but just in general,

52:05

people drawn to a company that would be

52:07

so nuts as to do that. And it did it

52:11

ultimately like that engineer that made

52:12

the suggestion was absolutely right. It

52:14

was the compensation that convinced

52:16

people that we take our values really

52:18

seriously, that we're a really

52:19

principled company,

52:21

>> which is you're paying everyone the same

52:22

base salary. Exactly the same. Yeah.

52:25

>> They're making the same as as you the

52:27

the electronics engineer, software

52:29

engineer, the whatever other role you

52:32

might have.

52:32

>> That's right. And when and and and I um

52:35

I don't know if uh you should just go

52:38

ahead and say it if you want to, but

52:39

many people are like, would you pay

52:40

support engineers the same amount? It's

52:41

like why do people always like pick on

52:43

support? They they would ask

52:45

>> exactly uh answer to that is yes and the

52:48

answer to that is uh if you do that you

52:49

find superlative support engineers and so

52:51

we have got uh I think we've got the

52:53

best support engineers in the business.

52:55

I think it's we we've got really really

52:58

phenomenal folks in support. I I I heard

53:00

I heard a small company called Gumroad

53:02

do this where where they they they paid

53:04

their support staff really high again

53:06

about same as software engineers and

53:08

then they got support staff who were

53:10

software engineers and they could fix

53:11

the code or like write tools for

53:13

themselves and you get people for whom

53:15

because I mean you know there's a a

53:17

certain thrill in being in a in support

53:21

that because you've got someone with a

53:24

problem. It's technical. You get to come

53:27

up you get to be technical. get to go

53:29

solve a hard problem and then

53:30

immediately the you get such gratitude

53:33

you know and like that's a rush and if

53:37

there are people that are really drawn

53:38

to that like I love helping other people

53:40

I love that feeling that I get when I

53:42

resolve a problem for someone that

53:44

immediacy so one of the things that we

53:46

we've heard repeatedly from from several

53:47

of our support engineers is my heart was

53:51

always in support but but my career path

53:54

was forcing me into a different career

53:56

path and I love the fact that I can get

53:58

back to where my heart is.

53:59

>> Yeah, that that that's nice because now

54:01

like it Yeah, you're not going to make

54:03

more by doing something that you're not

54:05

as into. I I love that. So, going back

54:08

to where we were, which is like you

54:09

build the hardware, you build this like

54:11

really complicated piece and you went

54:13

through electronics engineering, putting

54:14

it together. Let's put a software cuz

54:16

that that's super exciting. What what

54:18

what does it take to build software for

54:20

this? Did you start from sc let's talk

54:23

from from the low level. Did you start

54:24

from scratch from operating system? Did

54:26

you have to or could you use

54:29

>> and and there's kind of different

54:30

answers at different levels of the

54:31

stack. So on our service processor we

54:33

did start from scratch. We did our own

54:34

de novo operating system um in Rust

54:37

appropriately called Hubris because we

54:39

had the hubris to do it. Um the the

54:41

debugger by the way for Hubris is called

54:42

Humility feels like appropriate for a

54:44

debugger. So that was was de novo

54:46

>> and this is open source right

54:47

>> open source. Yeah open source the entire

54:49

stack is open source. Everything we've

54:50

done is open source.

54:51

>> We can go on GitHub and check it you

54:52

know, go on GitHub and check it out. And

54:54

yeah, I mean, we've got God's own

54:56

revenue model because like you're like,

54:57

well, what if somebody like can download

54:59

it, run it on a different computer. It's

55:00

like knock yourself out because, you

55:02

know, we we think the best way to run

55:04

this is on on the machines that we make

55:06

and those are not free. An oxide machine

55:08

is not, you know, that's not free

55:09

downloadable, but all open source. So,

55:12

um that was for the service processor.

55:13

Um for the host CPU, we really had it

55:16

kind of at a quandary like what do we

55:17

want to do in the host CPU? And uh with

55:19

that is say like on the actual like what

55:21

was then AMD Milan now AMD Turin silicon

55:24

we knew that we wanted to do in the

55:26

product we would do our own hypervisor

55:29

and our own control plane. It was very

55:31

so this is not something that you run

55:33

>> the control plane is is that controlling

55:35

multiple like like the whole like you

55:37

have a bunch of processors and memory

55:39

and all that and control plane controls

55:40

all that.

55:41

>> You plug this thing on, you power it on,

55:42

you put in networking. What you get is a

55:45

console that looks a lot like a what

55:47

would like look like AWS if AWS looked

55:49

better. I mean it's it's a console. I

55:51

mean not to I mean look not to disparage

55:53

AWS but like we know that like design is

55:55

not really the strong suit.

55:56

>> We agree with that.

55:56

>> Yeah. Exactly. So it looks gorgeous. Uh

55:59

of course um but it's and it's also got

56:01

you got your API you've got your CLI and

56:04

you're provisioning instances. Where are

56:06

those provision instances provisioned?

56:08

It's the control plane that makes those

56:09

decisions.
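
As a rough illustration of the placement decision described here, the sketch below picks a machine for a new instance without the caller ever naming one. Everything in it is an assumption made for illustration: the `Sled` type, the `place_instance` function, and the "most free memory wins" policy are hypothetical and are not taken from Oxide's control plane.

```python
from dataclasses import dataclass


@dataclass
class Sled:
    """Hypothetical record of one compute node's spare capacity."""
    name: str
    free_cpus: int
    free_mem_gib: int


def place_instance(sleds, cpus, mem_gib):
    """Choose a sled for a new instance and reserve its resources.

    The caller only states what it needs; the control plane decides where.
    """
    candidates = [
        s for s in sleds
        if s.free_cpus >= cpus and s.free_mem_gib >= mem_gib
    ]
    if not candidates:
        raise RuntimeError("no sled has enough free capacity")
    # Assumed policy: prefer the sled with the most free memory.
    chosen = max(candidates, key=lambda s: s.free_mem_gib)
    chosen.free_cpus -= cpus
    chosen.free_mem_gib -= mem_gib
    return chosen.name
```

A caller would write something like `place_instance(sleds, cpus=4, mem_gib=16)` and get back a name it never had to choose. A real scheduler would also weigh storage locality, failure domains, and in-flight updates.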

56:10

>> You are attaching virtual storage those

56:12

instances. Where does that storage live?

56:14

It's the control plane that makes that

56:15

decision. So just like with AWS, you

56:17

don't need to know that stuff. That is

56:18

that's just happening. You you you're

56:21

using Terraform to spin up your cluster.

56:23

You're you're running Kubernetes on it.

56:24

You're knocking yourself out. So I we

56:27

are delivering all of the software from

56:29

that lowest layer that service processor

56:32

what the operating system that's running

56:33

on the host CPU and then that

56:35

distributed system very importantly that

56:36

distributed system um which we called

56:39

Omicron before the Omicron variant of COVID

56:43

which was feeling very like ill-timed for

56:46

a very brief period of time it was

56:48

feeling ill-timed and now I feel like the

56:49

Omicron variant of of COVID is just like

56:51

has just forgotten and now it's a good

56:53

name again so it's like you know we just

56:54

>> it was really short-lived, short-lived.

56:55

Yeah. So we so so you know we we

56:57

we lived longer than the Omicron

56:58

variant of COVID and that is our our

57:01

control plane. Um and um that is a very

57:05

sophisticated body of software. um in

57:08

addition to cuz it's it's not enough to

57:11

just like provision an instance right

57:12

you need to and you need to do that

57:15

robustly you need to do that via API, CLI

57:17

and so on but then you all the software

57:19

that does that and keeps track of your

57:20

instance so on uh it's very important

57:22

that you can actually update that

57:25

software that whole distributed system

57:27

you need to be able to update to a new

57:29

version of the software and this gets

57:32

really thorny right because in in a in a

57:36

public cloud you do that with a runbook,

57:39

right? I mean, even the, you know, we

57:42

don't feature it prominently, but even

57:43

in GCP and AWS, yes, there's a lot of

57:45

automation, but there's also also humans

57:48

involved, and there are humans that are

57:50

taking the responsibility for for

57:51

actually updating software. For sure.

57:54

Really? Yeah. I mean, again,

57:55

>> for the most part.

57:56

>> Yeah. I mean, there's a lot of

57:57

automation involved, but in particular,

57:59

if something goes wrong in an update,

58:01

you know, you've got DevOps that can can

58:03

hop in and figure out what's going on

58:05

and and get it rectified. We are

58:07

shipping a distributed system across an

58:09

air gap in an Oxide rack that's

58:11

potentially running in a secure

58:12

facility. We cannot be there if it goes

58:14

wrong. So, we need when when

58:17

>> especially because a lot of your

58:18

customers are buying it because they

58:20

want to do it themselves.

58:20

>> They want to do themselves. So in many

58:22

ways the thorniest software problem for

58:25

us we had actually several thorny

58:26

problems couldn't pick between them

58:28

because they're all thorny for different

58:29

reasons. One of the very very thorny

58:32

problems was how do we ship a

58:34

distributed system that we can then

58:35

update and one of the things we did that

58:37

was important was like okay because it's

58:39

very easy to paint a road map that is

58:42

very complicated for update. You'll

58:43

never ship anything. So what we needed

58:45

to ship in that first product that we

58:47

shipped when you were back in Emeryville

58:49

2 years ago, we needed the minimum

58:52

viable update. We needed an update where

58:55

the software could be updated even if it

58:57

was painful. So what we did is we have

58:59

this thing called mupdate which is the

59:00

minimum update and mupdate in particular

59:04

required the control plane to be parked.

59:06

So we're going to take this rack that's

59:07

running instances, take it offline,

59:10

we're going to update it and then bring

59:12

it back online. And that was robust. It

59:14

was great and we got that working.

59:15

That's great. That is great and that you

59:18

can update it. But that's actually not

59:19

what you want in a cloud, right? You're

59:20

like, I sorry, I'm like using this thing

59:22

24/7. Like I actually I I I want to these

59:26

instances need to remain up while I

59:28

update it. But that gave us the platform

59:32

to go build that update functionality

59:34

into the software. Extraordinarily

59:36

sophisticated um and really an

59:38

extraordinary body of work. And actually

59:40

just recently um we had at our internal

59:42

meetup the engineer who led the charge

59:45

on that Dave Pacheco gave a presentation

59:47

on looking back of two years of update.

59:50

And I I got to tell you, I think this is

59:51

one of the best single talks on software

59:54

you'll ever see. And we we will link

59:56

this, but can you give me just a short

59:58

overview of like why this update is so

60:00

difficult because like some listeners

60:02

will will be used to just building

60:04

applications for example on the iPhone

60:06

and an update there it what it means

60:08

obviously I know this is way more

60:10

complicated but an update is there's a

60:11

new binary version and it replaces the

60:13

old binary version. Now, of of course,

60:15

you know, you're saying this is an

60:16

operating system update or or you know,

60:18

like with a car and of course you might

60:20

think like, well, you know, you could

60:21

just replace the old version with the

60:23

new version and there's some downtime,

60:25

but where is the complexity that

60:27

actually like puts all this thorn?

60:29

Because I'm sensing this is like

60:32

>> I am missing something something very

60:33

obvious.

60:34

>> So, because it's a distributed system,

60:36

when you've got an app on an iPhone,

60:38

it's not a distributed system. Oh, and

60:39

distributed system, meaning that you've

60:41

got a bunch of different nodes,

60:43

>> components that are going to speak to

60:44

one another. And it's like

60:45

>> those might need updating as well.

60:47

>> Oh, they definitely need updating.

60:48

>> Oh, they all need update.

60:48

>> Yeah, the whole thing needs to be

60:49

updated. You got to be able to update

60:50

all of the software in the rack.

60:52

>> Oh,

60:53

>> this is not just operating updating the

60:55

operating system. This is updating

60:56

absolutely everything.

60:57

>> So, you might need to update some parts

60:59

or all parts.

61:00

>> You need to update the service

61:00

processor, the root of trust, the drive

61:02

firmware, the host operating system, and

61:05

then all of the components that speak to

61:06

one another.

61:07

>> Okay. And then it's like okay so I mean

61:09

this is challenge is fractally

61:11

complicated. I mean one of the very

61:13

basic ways it's complicated is like so

61:15

when we're updating we are moving the

61:17

system from from one version to another

61:19

version in between it's going to kind of

61:22

be in both versions. Like what does that

61:24

mean to have the system that's operable

61:26

while you've got some new components and

61:28

some old components? What if you change

61:29

your database schema from one version to

61:32

the next version which we definitely

61:33

have. Like you have to have a a a method

61:35

of doing that. What if you and for for

61:37

every one of these components, how is it

61:40

updatable? How we got to reason about

61:43

the system when it's in this hybrid

61:44

state and then it needs to be done in a

61:46

way that's very very robust. So the

61:50

first and foremost we had to to develop

61:53

the foundation that allowed us to do

61:55

this absolutely robustly. And so the way

61:58

Dave and team did this is you know with

62:00

that foundation and then very slowly

62:03

lighting up different aspects of the

62:05

system and making it more and more

62:07

automatic over time and you know first

62:10

started running that on what we call our

62:12

dog food rack and did our first

62:15

automatic update on the dog food rack.
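
The hybrid state described above, where some components are already on the new version and others are still on the old one, can be sketched as a rolling update that checks the mixed state after every step. This is a minimal sketch under stated assumptions: the one-version skew rule and the names `is_operable` and `rolling_update` are hypothetical, not how Oxide's update system is actually implemented.

```python
def is_operable(versions, max_skew=1):
    """Assumed rule: the rack keeps working as long as no two
    components are more than max_skew versions apart."""
    return max(versions.values()) - min(versions.values()) <= max_skew


def rolling_update(versions, order, target):
    """Move components to `target` one at a time.

    Returns every intermediate (hybrid) state so each one can be
    reasoned about; refuses a step that would leave an inoperable mix.
    """
    states = []
    for name in order:
        versions = {**versions, name: target}  # copy, never mutate the input
        if not is_operable(versions):
            raise RuntimeError(
                f"updating {name} would leave an inoperable mix: {versions}"
            )
        states.append(versions)
    return states
```

Starting from `{"service_processor": 1, "root_of_trust": 1, "host_os": 1, "control_plane": 1}`, an update to version 2 passes through three mixed states before everything converges, while a jump straight to version 3 is rejected at the first step. Schema changes add a further constraint this sketch leaves out: the database has to stay readable by both old and new components while the mix exists.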

62:17

Uh it was a really great feeling for

62:20

that team because this has been a very

62:22

long software road and it has been one

62:24

that has been very deliberate. Um and

62:26

and ultimately like and you know full

62:29

credit to Dave and team took us about

62:31

the amount of time that we thought it

62:33

would which is kind of very rare for

62:34

software because I think software so

62:35

fractally complicated but that's only

62:38

because they've been very carefully

62:40

managing scope versus schedule making

62:42

and because quality has got to be the

62:44

constraint and Dave's talk goes into

62:46

that in detail in a way that I think is

62:48

just extraordinary. So I I'd like to

62:50

talk about the topic that is, you know,

62:53

a lot of people's mind is is AI

62:55

specifically and and AI tools.

62:56

>> Yeah.

62:57

>> How have AI tools changed how you're

63:01

working at Oxide specifically? Think

63:03

about software engineering, maybe maybe

63:04

even hardware. Are are you using these

63:06

tools? Are you experimenting with them?

63:07

>> For sure. When we've been early on in

63:09

terms of of using them and Yeah. I mean,

63:11

you use them for different for a

63:12

different and people are using them in

63:14

different ways. I mean I I I no part of

63:17

the Oxide stack is vibe coded. I think

63:19

that that is the that that is safe to

63:21

say but we are using it and we're using

63:23

it to and again different people are

63:25

using it different ways. We are you know

63:27

using it to do things that are tedious.

63:29

We're using it to do generate test cases

63:31

you know generate the I use it for

63:34

because I think the thing that is just

63:35

like unmatched at is just document

63:37

comprehension. We've got a very writing

63:39

intensive culture. We've got a lot of

63:41

documents. It is great.

63:42

>> You always had that.

63:43

>> Yeah. always had that and if you've got

63:45

a writing intensive culture like you're

63:47

LLM-ready not to generate those

63:50

documents but to to consume them

63:53

>> and to you know one of the things that

63:55

I've always wanted to do and it's still

63:56

like now is possible I I haven't quite

63:59

found the time to do it early on I

64:01

wanted to make an RFD glossary so RFDs

64:04

are requests for discussion we've got a

64:05

lot of technical terms I wanted to make

64:06

a glossary I tried to do that for like 3

64:09

hours this is like in 2020 and I'm like

64:12

this would This spreads to the horizon.

64:14

This is so just making a glossary is so

64:16

complicated. A glossary is something

64:17

that an LLM could just churn out and the

64:20

so there are lots of things that we are

64:22

we're doing to to use LLMs in particular

64:26

is clearly a very real very very big

64:30

shift in lots of different aspects of

64:32

software engineering. I I think that it

64:34

you know but of course there are people

64:35

that are being kind of reductive about

64:36

it. I am definitely not a doomer. There

64:39

are a lot of doomers that are out there

64:40

and you know I tried to give this talk

64:43

about building the Oxide itself the

64:47

Oxide rack and in particular the

64:50

problems that we had along the way that

64:52

an LLM was never going to be of any

64:54

assistance on. And so and I I the title

64:57

of the talk was intelligence is not

64:59

enough and one of the prominent doomers

65:02

actually did a reaction video to my

65:04

talk. It's like the only time I've ever

65:06

had someone and my daughter who was then

65:08

like 11 was just like thought it was

65:11

hilarious that someone had held their

65:14

own time in such low regard that they

65:16

would spend it recording a reaction

65:19

video to my talk. And so she was like we

65:21

I want to watch this. I'm like oh god I

65:23

do not want to watch this again.

65:24

Ultimately, the thing that was really

65:25

frustrating is this person obviously

65:27

disagrees with what I was saying, but

65:29

then when I was giving these very

65:31

concrete examples of here are the

65:33

specific technical problems that

65:36

required more than intelligence to

65:38

resolve that an LLM was not going to be

65:40

able to resolve. He literally fast

65:42

forwarded through those parts. He's

65:44

like, we just don't need this. This is

65:45

like this is just you're like, bro, this

65:47

is the talk. Like you you can't do this.

65:50

Like you're fast forwarding over the

65:51

actual like meat of the talk. C can you

65:53

give an example of like a problem which

65:55

which you felt was this like even you

65:57

know if if we fast forward to like

66:00

>> the arbitrary future. Yeah. Yeah. Yeah.

66:01

Yeah. So yeah super simple. I mean like

66:03

like the I mean we've had many many

66:05

scary problems but um we had a uh the

66:09

CPU when we did our first bring up of

66:11

our first machine. And then what does a

66:14

bring up mean?

66:14

>> A bring up means taking a board and

66:17

powering it up and trying to get it to

66:19

work for the first time. I think you

66:21

mentioned that the term smoke test comes

66:23

from electronics engineers.

66:25

>> Oh, I they I mean a smoke test I always

66:28

think of a smoke test more from from

66:29

aerodynamic but but yes I mean

66:31

aeronautical engineers but yes I mean

66:32

that you're definitely like smoke is

66:35

definitely a possibility that's a very

66:36

bad you do not want smoke that is bad

66:38

but no smoke please in bring up

66:39

>> so so the bring up

66:40

>> but we are doing bring up and we are

66:43

unable to get the CPU out of reset and

66:47

after 1.25 seconds the CPU would

66:51

reset itself. What's going on? Is the

66:53

power network bad? We're doing all and

66:56

like when you have something like that

66:57

happen, it's like well what's happening?

66:59

It's like I mean it's it's just not

67:02

working. I mean like what do you tell

67:03

your LLM to be like it like it's not

67:06

working. I mean and they can maybe give

67:07

you some suggestions but in this case it

67:09

wouldn't. So we are going deep into this

67:12

understanding like are maybe the power

67:14

network is like marginal. No no no we

67:17

resolve that. No, no. We're We've got a

67:19

and actually we're working with AMD at

67:20

the time and AMD's like, "No, these

67:21

power numbers are amazing. Like your

67:23

margin is very good."

67:24

>> You're measuring it out. You're like

67:25

eliminating that one.

67:26

>> We're eliminating that one. You're going

67:27

through eliminating eliminating

67:28

eliminating. And um couldn't get we and

67:31

this was weeks and you're like we are we

67:33

don't have a company like we're dead. We

67:35

are absolutely dead.

67:36

>> And I feel like this is the kind of

67:38

thing that desperate, you know, you get

67:39

desperate. You're like, we're going to

67:41

try kind of anything. And what we uh the

67:44

engineer who was working on this um

67:46

actually looked at the protocol between

67:48

the CPU uh and the voltage regulator. So

67:51

there's a protocol that it goes back and

67:52

forth says hey I need this voltage and

67:55

you know this is voltage and one of the

67:56

things he notices is that there is no

68:00

acknowledgement packet from the

68:02

regulator. So the CPU asks for a voltage

68:05

to be set to a certain level and he's

68:07

noticing that there's no acknowledgement

68:08

packet back from the regulator

68:10

>> which should come

68:11

>> which should come and the test that

68:13

they've got something called SDLE which

68:14

is this great uh test goober that you

68:17

you take the CPU off you put on the SDLE

68:20

and it will measure the power for you.

68:22

Well the SDLE didn't care whether it got

68:23

an acknowledgement packet or not. The

68:25

CPU definitely did. And the CPU So the

68:28

CPU says I want you to go to 0.9 volts.

68:30

It never gets an acknowledgement back.

68:32

And meanwhile sitting at 0.9 volts and

68:34

it's just like, well, I never got an

68:35

acknowledgement, so we're going to reset

68:37

and I'll do it again. And that was due

68:39

to a firmware bug on the Renesas

68:41

controller. And so they we got a

68:43

firmware update from Renesas and done. And

68:46

I mean, to be fair, the Renesas FAE

68:49

is great. Was like, well, you guys

68:50

really should have reached out a lot

68:51

sooner. Like, yeah, I know. We really

68:52

wanted to make sure that we got like

68:54

everything. Uh, and and that's the kind

68:56

of problem. And there were many many

68:58

problems like this where it's not merely

69:01

intelligence. It's not building a a a

69:04

board is not an IQ test. It's more I

69:06

mean you need to be intelligent to do it

69:08

but intelligence is not enough. You need

69:09

these other kind of characteristics.
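
The failure described above can be modeled in a few lines: the regulator does reach the requested voltage, but its firmware never sends the acknowledgement, so the CPU times out and resets. This is only a toy model. The `Regulator` class and `bring_up` function are hypothetical, and treating the 1.25 seconds quoted in the conversation as the acknowledgement timeout is an assumption.

```python
ACK_TIMEOUT_S = 1.25  # reset interval quoted in the conversation (assumed to be the ACK timeout)


class Regulator:
    """Toy voltage regulator; `sends_ack=False` models the firmware bug."""

    def __init__(self, sends_ack):
        self.sends_ack = sends_ack
        self.volts = 0.0

    def set_voltage(self, volts):
        self.volts = volts  # the rail really does reach the requested level
        return "ACK" if self.sends_ack else None


def bring_up(regulator, target_volts=0.9, max_attempts=3):
    """Return (came_out_of_reset, seconds_spent_waiting)."""
    waited = 0.0
    for _ in range(max_attempts):
        reply = regulator.set_voltage(target_volts)
        if reply == "ACK":
            return True, waited
        waited += ACK_TIMEOUT_S  # no ACK: the CPU gives up, resets, and retries
    return False, waited
```

With `Regulator(sends_ack=False)` the rail sits at 0.9 volts the whole time, so measuring the power network shows nothing wrong. Only watching the protocol itself reveals the missing packet, which is the point of the story.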

69:11

>> Then I feel you also need a team in this

69:13

case, right?

69:13

>> You absolutely need a team. 100% 100%

69:15

you need a team.

69:16

>> Like you're you're going to solve these

69:17

problems with, you know, you had that

69:19

engineer who just like thought of

69:21

measuring this out,

69:22

>> right? Well, an engineer who was

69:23

desperate, you know, because we were all

69:25

getting desperate. Um, and you know, we

69:27

and again, we've had many of these over

69:30

the history of the company. Um, and

69:32

you're right, you absolutely need a

69:33

team. You need you need a team. And you

69:35

see also the value when you have a team.

69:39

People have different ways of

69:41

approaching a problem. That diversity is

69:44

really important because you need and

69:47

actually sometimes this has happened

69:48

more than once with the company where

69:50

somebody kind of like is just kind of

69:52

like walking through the problem and

69:54

like someone's like hey I'm just joining

69:55

you know about a remote company anyone

69:57

joins a you know they joining the Google

69:58

meet yeah I'm just joining because you

70:00

know I think that I'm following along

70:02

and you get someone will be like just

70:04

make an like hey I got like a dumb

70:06

question are those virtual addresses

70:08

like those look like similar virtual

70:09

addresses and you you get something

70:10

where someone's making and you need

70:12

someone to kind of like come and make

70:14

that observation that is maybe less

70:16

grounded in it and people like oh wait a

70:18

minute no that's actually like well

70:19

that's something to go check and so you

70:22

need that that different kind of

70:25

approach um that that is really a team

70:28

kind of uniquely summons

70:29

>> and you know I think you might have

70:31

alluded to it but uh on the previous

70:33

podcast Armin Ronacher mentioned to me

70:35

he's uh the creator of Flask he's he's

70:37

been around the block for for quite a

70:38

while and he's now doing a startup and

70:40

he said that right now It's just him and

70:42

his co-founder and he's got an army of

70:45

AI interns right now. He's prototyping

70:46

him. But he told me, "I'd like to start

70:48

to hire people soon because people bring

70:52

energy and you need energy for a company

70:55

to live and and thrive." And I'm kind of

70:57

sensing the same thing.

70:58

>> Oh, for sure. No, for sure. And I, you

71:00

know, just listened to this great piece

71:02

with Richard Sutton who was the inventor

71:05

of reinforcement learning and and I

71:06

think rightfully I agree with him. It's

71:08

like you guys are conflating an LLM with

71:11

artificial intelligence. It doesn't have

71:13

goals. This is really important. So like

71:16

a prompt is not a goal and guessing the

71:19

next word is not a goal. And but like us

71:22

together as a startup and like wanting

71:24

to make it together, not wanting to die

71:26

here together, that's a goal. And that

71:29

so we can use that creativity. Maybe we,

71:32

you know, we use an LLM certainly as a

71:34

tool to help us achieve our goal, but I

71:38

I I do think that that's a very

71:39

important distinction.

71:40

>> And can you tell me like what kind of

71:42

tools you use and and what are the areas

71:44

that you you find it helpful? I

71:46

understand you're experimenting with

71:47

stuff and you know this is all work in

71:48

progress, but where areas that that and

71:51

you mentioned like the summarizing was

71:52

was one example of glossaries.

71:54

>> Yeah. Oh yeah. I mean and I I mean I use

71:56

LLMs as an editor all the time. Um I find

71:58

it to be a really I mean actually it was

72:00

funny. I had a blog entry that went on

72:01

Hacker News and someone was like, "Oh,

72:03

this is LLM written." I'm like,

72:04

"Actually, it is LLM edited, but the

72:07

only thing that I did based on the LLM is

72:09

I deleted an entire paragraph." So,

72:11

there's a paragraph that like wasn't

72:12

working and the LLM was like, "This

72:14

paragraph is not working." And I'm like,

72:16

"You know what? I'm just going to delete

72:17

the paragraph." So, I was like, I I

72:19

don't know. You want to say that's LLM

72:21

edited? Because like every word there is

72:22

written by me, but there were some words

72:24

that were written by me that an LLM

72:25

suggested I delete there, which I deleted.

72:27

So I mean I use it for um in writing for

72:30

sure. I mean I also like to use and this

72:32

is like a stupid reason stupid thing

72:34

>> but when you're writing Rust and we

72:36

write a lot of Rust there especially

72:38

when you're new to Rust this you you

72:41

wonder like the way I just phrased this

72:43

is this like idiomatic is there a better

72:46

way to do this that that's a great

72:47

little problem for like I got this small

72:49

little snippet of code. Is this an

72:51

idiomatic way of doing this? Is there a

72:52

better way of doing this? And that's a

72:54

great thing for an LLM to be to make a

72:56

suggestion or not or tell you that like

72:58

nope that's that's an idiomatic way of

72:59

doing maybe I would make this small

73:00

adjustment. So I find it really val I

73:02

find LLMs to be more valuable in the

73:04

small than in the large. Um so like

73:08

again this kind of I I my you know hats

73:10

off to people who want to uh spend their

73:12

lives acting as a middle management for

73:14

robots but like that's not necessarily

73:16

for me. Um certainly at Oxide I mean our

73:18

belief is that people take

73:20

responsibility for their own work. So,

73:22

if you want to have an LLM help you out

73:24

on that, that's fine. But ultimately,

73:25

like if there's a bug in this, like you

73:26

can't blame the LLM. The L the LLM broke

73:28

my code is like not interesting. That

73:31

is, LLMs don't have

73:33

accountability. And so, one thing that

73:34

is starting to spread across I think a

73:36

lot of engineering is engineers using

73:38

LLMs either uh inside your IDE with

73:41

autocomplete or or and also kicking off

73:43

now agents. Now, there's more advanced

73:45

ones like Claude Code and Codex

73:47

where it can actually run command

73:49

prompts and run your tests. Are you

73:51

seeing engineers use some of these

73:53

tools? And there's a little bit of back

73:54

and forth as well. You know, like it's

73:56

very clear that when it you're doing

73:58

kind of more boilerplate things that are

74:01

so-called on distribution, which is they

74:03

they've learned like React TypeScript,

74:05

it can spit out a bunch of stuff, but

74:07

you strike me as someone who's doing a

74:08

lot more nuanced things.

74:11

>> Yeah. I mean, you're writing a bunch

74:12

of C code

74:14

in the operating system kernel,

74:16

it is less valuable.

74:17

>> Yeah. Yeah. But so what are you seeing

74:19

across the team in terms of

74:20

>> you know I encourage people to to uh

74:23

experiment and I would say we're seeing

74:25

a a wide variety of experimentation

74:27

certainly we're using Claude

74:29

Code a bunch and people are doing that

74:31

and um but I would you know broadly

74:34

speaking for a lot of the work that

74:36

we're doing um it is helpful as like

74:39

maybe a polishing tool but less as a

74:42

kind of the at the epicenter of its

74:44

creation. It's not true of everything.

74:45

There's some software for

74:46

>> No, but but that that's also nice to

74:47

hear cuz I'm I'm kind of asking you more

74:49

to putting on your CTO hat who's who's

74:51

also very like you know you're very

74:52

hands-on and you know what's going on

74:54

with the industry cuz a lot of non-hands-on

74:56

executives are kind of licking their

74:58

finger and thinking oh we must be 10 or

75:00

20 or 30% more productive but what what

75:03

what I'm hearing is like things are kind

75:04

of the same as before, right?

75:05

>> Yeah. I mean I mean my big belief is

75:07

it's a tool. It's a powerful tool. I

75:09

mean I will say that the thing I you

75:11

know occasionally get people are like

75:12

well I don't want to use it at all. And

75:13

I'm like, you should. So, like,

75:15

>> you should try, right?

75:16

>> Yeah. Like, let me get you off of that

75:18

position and let me, you know, we had

75:20

Simon Willison on our podcast. Simon's

75:22

delightful. And, you know, one of the

75:23

lines that he has that I really love is

75:25

people should run these LLMs on their

75:27

own laptop where they run slowly and

75:29

poorly so they can see the bad output

75:31

that they generate so they can

75:33

understand what some of the limitations

75:34

are. So, I I I definitely I love that. I

75:36

I I do think that that uh people should

75:40

use them enough to know where they are

75:42

valuable. It's a very important tool in

75:44

the toolbox. You want to be aware of it,

75:45

but it's definitely reductive to think

75:47

it's the only tool in the toolbox

75:48

because it isn't.

75:49

>> Now, you're in such an interesting

75:50

company because like, you know, you

75:52

don't just do software, but you do a

75:54

lot of hardware.

75:54

>> Yeah.

75:55

>> Have you found any use?

75:57

>> No.

75:58

>> No.

75:58

>> No. Zero. I mean, okay, zero is a bit

76:01

reductive. I have found it to be useful

76:03

when, for example, you know, you've got

76:05

a waveform of an I2C transaction.

76:07

it actually amazingly you can send that

76:09

to an LLM and have it like interpret

76:11

this like hey what what am I seeing,

76:13

I2C kind of compliant behavior and

76:15

it can help you out on that a little bit

76:17

but it's like absolutely at the edges
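
To make the I2C waveform example concrete, here is a small sketch of what "interpreting" such a capture involves, assuming the waveform has already been reduced to clean digital `(scl, sda)` samples. The function names `decode_i2c` and `encode_i2c` are hypothetical, and a real logic-analyzer capture would also need thresholding and glitch filtering that this omits. It relies only on the standard I2C rules: START is SDA falling while SCL is high, STOP is SDA rising while SCL is high, and data is sampled on rising SCL edges with a ninth ACK bit (0 means acknowledged).

```python
def decode_i2c(samples):
    """Decode (scl, sda) samples into a list of (byte, acked) tuples."""
    frames, bits, active = [], [], False
    prev_scl, prev_sda = samples[0]
    for scl, sda in samples[1:]:
        if prev_scl == 1 and scl == 1 and prev_sda == 1 and sda == 0:
            active, bits = True, []    # START: SDA falls while SCL is high
        elif prev_scl == 1 and scl == 1 and prev_sda == 0 and sda == 1:
            active, bits = False, []   # STOP: SDA rises while SCL is high
        elif active and prev_scl == 0 and scl == 1:
            bits.append(sda)           # data bit sampled on rising SCL edge
            if len(bits) == 9:         # 8 data bits followed by ACK/NACK
                byte = int("".join(map(str, bits[:8])), 2)
                frames.append((byte, bits[8] == 0))
                bits = []
        prev_scl, prev_sda = scl, sda
    return frames


def encode_i2c(frames):
    """Build an idealized sample stream for (byte, acked) frames (demo helper)."""
    samples = [(1, 1), (1, 0)]                   # idle, then START
    for byte, acked in frames:
        bits = [(byte >> i) & 1 for i in range(7, -1, -1)]
        bits.append(0 if acked else 1)
        for bit in bits:
            samples += [(0, bit), (1, bit)]      # set SDA while SCL low, clock high
    samples += [(0, 0), (1, 0), (1, 1)]          # STOP
    return samples
```

Round-tripping a couple of frames, for example `decode_i2c(encode_i2c([(0xA0, True), (0x3C, False)]))`, returns the same frames, which is a quick sanity check on the edge rules.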

76:19

>> Okay, so that's a 0.01.

76:21

>> Also, like I think people don't realize

76:23

like there are already tools for that.

76:26

Like that's what EDA is. You spend a lot

76:28

of money on like we're not laying stuff

76:30

out like by hand with graph paper. Like

76:32

this is like you've got, you know, when

76:34

you do layout for a board, there are a

76:36

bunch of rules that are automatically

76:38

checked for SI, you know, we we've got a

76:41

we do a bunch of simulation work. Like

76:43

we're not doing that by hand. We're not

76:45

we're using software.

76:46

>> Yeah. I saw you have those machines in

76:48

there. Like I I I saw that. I think it's

76:50

a bit reassuring to hear because I think

76:52

it's very clear like maybe we don't

76:54

realize as software engineers but

76:56

programming is such a great use case for

76:58

LLMs. It's a simple grammar, you can

77:00

validate it and I think it's sometimes

77:02

nice to just you know touch sand of like

77:04

an area that is very very different.

77:06

Yes.

77:06

>> But but it's it's cool that you're

77:08

checking and you know you're seeing if

77:09

if if it changes over time I guess you

77:11

always keep checking.

77:12

>> Yeah. And I I for sure and I think that

77:14

like I I it is frustrating to me because

77:18

programming is such a good use case

77:20

for certain kinds of programs. So as a

77:22

result you end up with certain kinds of

77:23

programmers who just in in part because

77:26

of their own self-centric view of the

77:28

universe believe that oh this is just

77:29

going to replace every job and it's like

77:31

no not even close not even close and you

77:34

need to spend more time you need to get

77:35

outside a little bit more.

77:36

>> Yeah. So speaking of getting outside and

77:38

you know meeting different people what I

77:39

noticed when I went to Oxide is just

77:41

like it was great. We had double E's, as

77:44

you say, software engineers, people who used

77:45

to work on virtual reality at at Oculus

77:49

all in the same room. Can you tell me

77:50

about how big is the team? What's the

77:52

composition? Yeah, so we we're on you

77:55

know we've I think you know we got some

77:57

more offers going out tonight. So I

77:59

think we've got on the order we'll be at

78:01

like 85. I should probably keep better I

78:02

should keep better mental track track of

78:04

it. We've got like 85, plus or minus, and

78:06

we you know we've been very blessed by

78:10

uh we've really put a beacon out there.

78:12

We've got a lot of people rooting for

78:14

the company. We've got a lot of people

78:15

and as a result we got a lot of people

78:16

want to work for the company. So um you

78:18

know we as we talked about last time um

78:21

we really put a lot on folks to describe

78:24

you know the work they've done what's

78:26

important to them why they want to work

78:27

for Oxide. I mean a lot of my LLM use is

78:31

I will look at someone's materials. As

78:33

you can imagine, we've started to see

78:34

materials that are heavily LLM authored.

78:36

Potential applicants to Oxide, please do

78:37

not do this. We get people who like who

78:39

who human author their entire materials

78:42

and then they get to the last question.

78:44

Why do you want to work for Oxide? Why

78:45

do you want to work in this role? And

78:47

they have an LLM spit that out and

78:48

you're like, do you think you want to

78:50

work here? Like I'm just like, let's

78:51

leave aside whether this is like, you

78:53

know, is is this right or wrong or

78:55

cheating or not? It's like fine, I

78:58

guess, but like I don't think you want

78:59

to work here. like you're not gonna get

79:01

a job here because I don't think you

79:02

actually want to work here. Put it in

79:03

your own words. But that process

79:07

really has allowed us to attract people

79:10

who themselves are attracted to the

79:11

company and attracted to the the

79:13

culture, the problem, the team, and it's

79:16

just extraordinary. I mean, it's I just

79:18

feel so lucky to be with such an

79:22

unbelievable group of people across more

79:24

and more and more and more disciplines.

79:26

I mean the great thing about our

79:28

approach is it brings people in who are

79:31

you know God it's like I love this

79:32

approach for we talked about support

79:33

engineering we I people who are like god

79:36

I love this approach like finally QA can

79:39

stand on its own two feet I I feel that

79:40

that QA has been kind of subjugated by

79:43

by these other disciplines now QA is

79:46

kind of really thought to be as

79:48

important as anything else in the

79:49

company and it is because at some like

79:51

at some like monetary

79:54

perspective it is as important as

79:55

anything else. Uh

79:56

>> yeah, but but I remember like when I

79:58

worked at Microsoft back like 15 years

80:00

ago or so, the QAs were just on a lower

80:03

pay grade, you know, like the senior QA

80:05

was at the same as like I think software

80:07

engineer 2 or something which just kind

80:09

of implied

80:10

>> Yeah. you're less important.

80:11

>> You're less important. You're just less

80:12

important. And so like if you tell the

80:15

world that we think it's as important,

80:17

do you know who you get? You get people

80:19

who are extraordinary at QA. You get the

80:21

best of the best. And so, um, that has

80:24

been really exciting. And now we've got

80:26

people coming. I mean, I do love how

80:29

many different companies because my

80:31

belief is that like every company has

80:34

something to teach us that there there

80:36

is something positive you can take from

80:39

every company. Now, there are some

80:40

companies, it's like, you're really

80:42

scraping the bottom of the barrel.

80:44

>> Maybe not Oracle. They did buy Sun.

80:46

>> Yeah. Yeah. That's right. That's like

80:48

there are like even Oracle you can find.

80:50

There are that may be a bit of a

80:52

challenge. Let's not do that one. Uh but

80:54

you know what uh the and and at the time

80:58

I thought this was a negative but now

80:59

I'm like I see it. Larry Ellison makes

81:01

every hiring decision at Oracle.

81:02

>> So what's positive about that?

81:04

>> Exactly. Which be like what's I really

81:07

the I really think that the kind of the

81:09

founder mode the Paul Graham essay on

81:11

founder mode is talking about founders

81:13

that lost track of their own hiring. So

81:15

I think now I don't like the way Ellison

81:16

does it. I think that you want to have

81:18

you want to trust a team to make a

81:19

decision, but ultimately I believe that

81:22

the that the CEO of a company bears

81:24

responsibility on every single hire and

81:26

I think should be looking at every

81:28

single hire coming into your company and

81:30

that that is to me that is a very

81:33

important check on these kind of

81:35

companies that that so that is there you

81:36

go something that I've something that

81:38

positive I take from and it's telling

81:40

that your immediate reaction is like

81:42

wait what's positive about that? Yeah.

81:44

I'm I'm not sure like I'm not sure you

81:46

undid that that talk on on Oracle.

81:49

>> Yeah. Fair enough. Fair enough. Exactly.

81:50

Yeah. And there from some companies more

81:52

than others, but I think that there are

81:54

and so I love having all of these

81:56

different experiences present at Oxide

81:58

because I do think that there's so much

82:00

to learn and we're trying, you know, you

82:02

want to take all the positive things cuz

82:03

I also think that every company

82:06

including, you know, people I actually

82:07

one of the questions I love that I got

82:09

once is like, what do you not want to

82:11

emulate from Sun? I'm like, "Oh, thank

82:13

God." Because like people think of

82:14

Oxide as kind of the second coming of

82:16

Sun Microsystems, and like, there

82:18

are lots of things I loved about Sun.

82:20

There are lots of things I did not love

82:21

about Sun that I did not want to emulate

82:23

and so I think for any also any company

82:25

there are things we want to leave behind

82:27

and you know I think when you've got a

82:29

big diverse team you you get to go do

82:31

that. And one thing that really

82:33

surprised me last time I I was at your

82:35

office is turns out that most people

82:38

were not in the office and and they work

82:39

remote and I I would understand for

82:41

software but how do you make that work

82:43

for hardware development where

82:44

physically you do need to you know be at

82:47

the the hardware sometimes I understand

82:48

you need to measure stuff I saw a lot of

82:50

like you know you know units sometimes

82:52

you need to go to like check on

82:53

manufacturing how does that part work

82:55

>> yeah so I mean uh a lot in people's

82:57

basements um so you know fortunately

82:59

we're making you know this is the

83:01

advantage of making a server and not

83:03

making like you know a tractor or like

83:05

you know we're not making like a you

83:06

know I don't know like a wind turbine or

83:08

something you know this is something

83:09

that people can actually model in their

83:11

basement um so that helps but then a lot

83:14

of even hardware engineering is using

83:16

these software tools using EDA tools

83:18

you're using SolidWorks, you're using

83:19

LTM you're kind of putting this thing

83:21

together you when you're doing layout

83:22

for example um which is very important

83:25

task when you're laying out a board all

83:27

of that is that that can be done

83:29

anywhere that's all just software Okay.

83:31

>> And so the the there are things that are

83:33

where that physicality is very

83:34

important. And then when you're doing

83:35

bringup, you actually need to be at your

83:37

manufacturer when you do that. So like

83:39

that is also not in an office.

83:41

>> You would need to travel anyway.

83:42

>> Yeah. You need to travel anyway. And

83:43

anyone coming from the electronics industry is

83:45

like, "Okay, I'm interested in Oxide,

83:46

but please tell me I never have to go

83:47

spend any time in Taipei or Beijing."

83:49

Because you go out there for, you know,

83:51

or Shenzhen or wherever. And you're out

83:52

there for two weeks in a windowless

83:55

office trying to get this thing brought

83:56

up. And um we all of our assembly is

84:00

done here in the United States, in

84:01

Minnesota. So we are all in fact we've

84:02

got a bunch of folks out there this week

84:04

for uh at Benchmark Electronics in

84:06

Rochester. So this is wonderful. And one

84:09

thing that you told me is one of the

84:10

things that's on top of your mind right

84:12

now as Oxide is growing. You still have

84:14

this culture of the the same

84:16

compensation full remote. So like it's

84:18

it's kind of been the same since the

84:19

start. What what will be the challenge

84:21

in in maintaining it? Because again you

84:23

worked at large companies. You've seen

84:24

how it goes. it can get tricky. What

84:27

what are the things that you're seeing

84:28

and what are the things that you're

84:29

trying to do to you know keep this kind

84:31

of startup vibe even as you might

84:33

be just bigger.

84:33

>> Yeah. So I I think that the thing that

84:35

is that is top of mind right now for me

84:38

um is and especially because you know we

84:39

raised a big series B which is great. Um

84:42

I think much more importantly we're

84:45

seeing a lot of customer traction which

84:46

is great. So we've seen paying off.

84:48

Yeah. No it really is. It's really great

84:50

and we kind of knew that was going to

84:51

happen in the abstract. Um, but it's fun

84:54

to actually see it happen and fun to

84:56

actually see um the customers that have,

84:58

you know, like, you know, I bought one

85:00

rack and I mentioned it, but now I want

85:01

to buy a lot more racks. I love what I'm

85:02

seeing and I want, you know, that's

85:03

great. Very, very, very exciting stuff.

85:06

That means we're growing the company a

85:07

bunch. And one of the things that's very

85:09

important to me, because I've seen this

85:10

happen so many times, is companies take

85:12

their eye off the ball when it comes to

85:14

hiring in in particular. And it is very

85:17

important to me that we continue to have

85:19

absolute discipline in the way we hire.

85:21

And uh we we're doing that. And

85:23

fortunately, you know, the nice thing

85:24

about our hiring process is every single

85:26

Oxide employee has gone through it. So

85:27

it's like I'm not having to persuade

85:30

anyone about the importance of our

85:31

process because everybody has gone

85:34

through it and that you know the thing

85:36

that we've got overwhelmingly in our

85:38

favor is because we've used our values

85:40

as a lens for that hiring. Oxide's

85:43

culture is important to every single

85:45

person at Oxide. That's what it takes to

85:47

to really preserve that. And it it

85:50

doesn't mean that it won't change at

85:52

all, but the bones aren't changing. Like

85:54

what what will change is it will be

85:56

bigger and it will be I think you know

85:58

and I love the fact that you know even

86:00

at like 85 we're already so big that you

86:04

know Steve and I know everybody at the

86:05

company but very few other people know

86:07

everybody at the company. So when we get

86:10

everyone together, it's like the best

86:13

party you've ever been to because you

86:15

know when in college I used to throw the

86:17

best parties in college. And the reason

86:18

I threw the best parties in college is

86:20

not because of me. It was because of the

86:22

roommates that I had. So like I was a

86:23

computer science student who played

86:24

ultimate. My roommate was an engineer

86:26

who was on the water polo team. My other

86:28

roommate was a was a history student who

86:30

was in the chorus. That's six different

86:32

demographics that don't normally

86:35

overlap. And then very importantly, we

86:37

made sure that the women's swim team was

86:38

always invited. The women's swim team,

86:39

they were like the foundation water

86:41

player.

86:42

>> Yeah, exactly. Waterfall. You always

86:43

check their calendar to make sure they

86:44

can make it. And people loved the

86:47

parties we have. Why? Because they would

86:48

meet people that they never met before

86:50

who were really interesting and they and

86:53

what I love about Oxide is we've got

86:56

this when when we get the whole team

86:57

together, people get the all these

87:00

delightful surprises. So people take me

87:01

aside and be like, "God, you know, Ry is

87:04

awesome." I'm like, "Yeah, I know. I

87:05

know. I know. I know. You know, too now.

87:06

That's great." But like, you know, or

87:08

you know, whomever it is. It's it's just

87:10

it's really exhilarating. And I think

87:12

that also serves to reinforce how

87:15

important what we've got is. I tell the

87:17

team like, we have lightning in a

87:19

bottle. And we cannot take it for

87:21

granted. And that means that every

87:23

single one of us, we need to rise

87:24

to the moment. We need to do what our

87:26

customers need us to do, but we need to

87:28

do it in a way that protects and

87:30

preserves what got us here. So thinking

87:32

a little bit ahead, let's assume that,

87:34

you know, these AI tools will just get

87:37

better eventually. They'll be able to,

87:39

you know, help more even on on your kind

87:41

of low-level things. You've been in the

87:43

industry for quite a while. You've

87:45

you've seen a lot of shifts. What are

87:46

what do you think are are some of the

87:48

things both in software engineering or

87:50

in hardware engineering or just in

87:51

general engineering that will probably

87:53

not change even if we predict

87:55

>> uh these these things being like more

87:58

capable? Yeah, I think that what we I

88:00

mean I think that that it's certainly a

88:02

revolution. I think it's going to allow

88:05

us all to do more. I do think that we

88:07

are going to hit a point where people

88:10

understand that this is a tool where

88:12

because there's a little bit where we're

88:14

still having this tension of like, oh, is

88:16

this going to be AGI? Is this going to

88:17

replace all jobs? And this is like

88:20

nonsense as far as I'm concerned. And

88:22

it's distracting kind of nonsense. And

88:25

we actually need to get back to putting

88:27

the tools in the toolbox of of the human

88:31

that's building it. Now these tools have

88:33

become much more powerful and I think

88:36

that that's going to be I think that's

88:38

extraordinary. I think it's important. I

88:39

think that also we'll be you know we've

88:41

got a lot of experiments right now we

88:43

humanity that I'm I'm not sure are going

88:46

to make economic sense. So you know we

88:48

we'll be figuring that out as well. Um

88:52

but I think that you know one of the

88:53

things I am a little bit worried about

88:54

is a little bit of despair from younger

88:57

software engineers in particular who are

88:59

like what's the point like an AI can do

89:02

all this well and there's also the news

89:04

even from more experienced software

89:05

engineers in the mainstream media

89:07

there's this news that company X is

89:09

laying off half their workforce because

89:11

of AI and by the way when we look closer

89:12

it's not because of AI but it it is

89:14

coming across and it does give not

89:17

younger people a lot of anxiety tons

89:20

even like mid mid-level folks or even

89:22

some more experienced like it it does

89:23

give a sense of I think it's the first

89:25

time in computer history that most of us

89:27

remember that there is this thing that

89:29

could threaten my job and I I think

89:32

we've just never had to deal with this.

89:34

I think you know there there are

89:35

industries that might have been a bit

89:36

more used to it.

89:37

>> Yeah, I would say that we I mean there

89:39

have been busts before. The dot-com

89:41

bust was a bust like a lot of jobs did

89:44

disappear, right? So I think that we but

89:46

the bus has really come in in what feels

89:48

to be a broader and more permanent way.

89:51

I I I mean my view is like this is an

89:54

opportunity for I mean I think one of

89:56

the things we should be, as a society, really

89:58

encouraging is new company formation

90:00

because now I mean just like you're

90:01

talking to Armin about how you know just

90:03

a small group you know just Armin and

90:05

his co-founder were able to do so much

90:08

together right we should be really

90:10

encouraging that and what are some of

90:12

the gaps that we can all go fill because

90:14

ultimately like we we all need to find a

90:16

livelihood we need to find meaning and

90:18

the way we do that as engineers is we

90:20

build useful things. And so we're like,

90:22

we can now build many more useful

90:24

things. What would we go build? What

90:26

would if you could build anything, what

90:28

would you go build? And that's kind of

90:29

the question that people need to ask

90:30

themselves. It's scarier. It's scarier

90:32

than like go to this school, get

90:35

concentrate in this, and then mama

90:37

Google will hire you and take care of

90:38

you and feed you breakfast. It's like

90:41

no, that's not like that's not what's

90:43

going to happen. And it feels a lot

90:45

scarier because it feels like there's at

90:48

some level like less security, less job

90:50

security. But yeah, that's true that you

90:53

know that and and that that's scarier,

90:56

but there's also a lot more opportunity.

90:57

>> And for a for a college student or or

91:00

some someone in school or or with little

91:02

experience who says like, look, my goal

91:04

would be one day in like 5 years time to

91:07

be as good that I could get a job at a

91:10

place like Oxide. It doesn't need to be

91:11

Oxide but again a place that has a high

91:13

bar they're they often hire experienced

91:16

people but I want to get there and yeah

91:18

there's all this AI stuff as well

91:19

happening what would you advise them in

91:21

terms of what to focus on what what

91:24

areas to study what things to do or how

91:27

to think about like you know like they

91:28

have the the goal is there what advice

91:31

would you have them part with

91:32

>> yeah so I think that they need that that

91:35

you need to have a different mindset and

91:37

that mindset needs to be not around how

91:40

do I create as much as possible, but

91:43

rather how do I get better? How am I

91:46

getting better every day? And I think

91:48

LLMs are a great tool to get better. How

91:50

can I learn about something new? Go

91:53

deeper. Go into something that I

91:55

wouldn't go into before. Get over that

91:57

kind of that fear. And one needs to

92:00

especially if you're coming, you're in

92:01

school now, you want to work at a place

92:03

like Oxide. It's like you you kind of

92:05

have to view it as like all right, like

92:07

you you want to play Major League

92:08

Baseball, that's great. like you're a

92:09

you're a great high school player. You

92:11

want to play Major League Baseball. It's

92:12

really hard. Got to get better every

92:14

single day and you're going to

92:16

need to be really focused on getting

92:19

better and you need to be like really

92:20

realistic about like what I need to go

92:22

do to get better. And it's hard but and

92:25

it's chancy because you might not get

92:27

there but you could get there and the

92:29

and you're certainly not going to get

92:30

there if you don't focus on that kind of

92:31

self-improvement. So I I I really think

92:33

that that it there is a shift in mindset

92:37

that that needs to happen or that one

92:39

needs to have I would put that way. One

92:40

you really got to have a mindset towards

92:43

getting better understanding more. What

92:45

do you not understand? There is lots

92:47

that you don't understand. I mean I

92:49

think one of the the the challenges of

92:50

modernity is that we delude ourselves

92:52

into thinking that we understand it all.

92:54

You don't. I don't. Like one of the

92:56

things that I've learned, I've joked at

92:57

Oxide that like I keep waiting for the

92:59

day that I know how computers work and

93:01

it like

93:03

>> like it wasn't today definitely wasn't

93:05

yesterday. It's not like it's going to

93:06

work.

93:06

>> You understand how

93:07

>> but I mean that earnestly in that the

93:10

the the the amount of of complexity that

93:13

I that I definitely I mean I knew but

93:16

also didn't know. It's like every day I

93:18

feel I'm still learning new facets and

93:21

not just like a computer but actually

93:23

delivering a computer to people. There's

93:24

there's so much to learn out there. So

93:27

many op and and now with the way you've

93:30

got to view LLMs is not like this thing

93:32

is coming for my job. You got to view

93:34

it as like no I've got now this like

93:36

private coach tutor what have you that I

93:39

can ask any question to. It's not going

93:41

to I got to like fact check its answers

93:43

for sure but now you've got the

93:45

opportunity to and you got it is easier

93:48

to get into this domain than it ever has

93:51

been. And that is that's great and it's

93:54

powerful, but it can also be scary.

93:55

>> And as closing, what's a book or two

93:58

books that you would recommend to folks

94:00

and why?

94:01

>> Oh, so many good books. You know, my my

94:03

uh my I've got a I've got a 21-year-old,

94:07

an 18-year-old, and a 13-year-old. And

94:08

when the 18-year-old was in his he's now

94:10

a freshman in college, he's a high

94:12

school senior. He got this assignment,

94:13

great assignment from his his English

94:15

teacher, namely go to someone that you

94:18

that that you know and ask them for

94:21

three books that they would recommend

94:22

that you read and I'm going to assign

94:24

you one of those three books to read and

94:26

you're going to read it and then you're

94:27

going to talk with them about that book.

94:28

And I'm like, "Oh, I love this

94:30

assignment." So he's like, "Dad, I'm

94:32

coming to you." And I'm like, "Oh, you

94:33

have Thank you so and of course my wife

94:35

was like, "Why didn't you come to me?"

94:36

Like, "Hey, look, I'm, you know, I

94:38

sorry, you know, look, uh, it was

94:40

great." So yeah, I I'll give you those

94:42

three books that that that I gave to him

94:44

and I think that each of these is really

94:46

terrific. Uh first is Soul of a New

94:48

Machine by Tracy Kidder. So this one won

94:50

the Pulitzer Prize uh in 1980 or 1981,

94:53

but about the the the building of a new

94:56

computer at Data General and it's a it's

95:00

extraordinarily well written and even

95:04

folks like well I'm not like what do I

95:05

have to do with a computer company in

95:07

the late 70s and early 80s? any engineer

95:10

will see something of themselves in that

95:12

book. It is just masterfully told. Tom

95:14

West who's the the is is is kind of a

95:17

complicated figure but that is soul is

95:19

still I mean it it it it's literature

95:22

for us. So I would absolutely Soul of a

95:23

New Machine every engineer should read

95:25

Soul of a New Machine by Tracy Kidder. Um for

95:27

me personally um very influential was

95:30

Skunk Works by Ben Rich. So about the

95:33

the history of Skunk Works. Um

95:34

Clarence "Kelly" Johnson was the

95:37

kind of the originator of Skunk Works at

95:38

Lockheed Martin. Uh extraordinary story

95:41

about what engineers can do when they

95:44

they kind of task themselves on the

95:46

impossible. Um

95:48

>> it's such a good book.

95:48

>> It's such a good book. Amazing book. And

95:51

then the uh the other one is Steve Jobs

95:54

and the NeXT Big Thing by Randall

95:55

Stross. So um Steve Jobs is kind of like

96:00

lionized by the industry but people

96:03

forget about a very important chapter of

96:04

his life namely NeXT and I believe we

96:07

are it it was just an anniversary maybe

96:09

it was the 30th anniversary it must have

96:10

been of the or maybe the 40th

96:12

anniversary Jesus of the the

96:14

announcement of the NeXT machine. So the

96:18

um Steve Jobs left Apple, was fired from

96:21

Apple, started a computer company called

96:23

NeXT. Uh really interesting company in a

96:25

lot of ways. Was at NeXT for a very long

96:27

time. It's a 13-year journey before NeXT

96:30

was bought by Apple. NeXT is bought by

96:32

Apple. Steve Jobs returns to Apple when

96:34

they buy NeXT. This book, Steve Jobs and

96:36

the NeXT Big Thing, is written before

96:39

Apple buys NeXT. And it is at Steve

96:42

Jobs's lowest moment. It it is not here

96:44

to praise him. It is here to bury him.

96:47

And it is very interesting about all the

96:50

missteps at NeXT and the thing that we

96:53

cannot know because Jobs obviously died

96:56

but I believe having read the book which

96:58

gets basically NeXT gets essentially no

97:00

treatment in the Isaacson biography. NeXT

97:02

is like six pages of glory. It's like

97:04

that's not what it was. Um, but Randall

97:06

Stross's book is is masterful and in

97:10

particular I believe that Jobs's

97:12

failures at NeXT were essential for for

97:16

the resurrection of Apple. And there

97:18

because you look at the way he handled

97:19

himself coming back to Apple was very

97:21

different from the Jobs that got fired

97:23

from Apple. And I think that like when

97:25

people look at Jobs like they don't

97:28

really take him apart. And I think you

97:30

should because I think he's a really

97:31

interesting guy. He's enigmatic. He's

97:33

someone like he did things that I that I

97:35

think are really fascinating and also

97:37

things that I really strongly disagree

97:38

with. So just to be clear, I'm not like

97:40

but I think that he's he's indisputably

97:43

an important figure and that book is by

97:46

far the best book. So Steve Jobs

97:48

>> No, I'm adding that. I actually want to

97:50

read that now.

97:50

>> Oh, it's extraordinary. It's very good.

97:52

>> Well, Bryan, this was such a fun

97:54

discussion.

97:54

>> Oh, my my pleasure. I mean, we knew this

97:56

was going to be long and wide ranging,

97:58

so hopefully it delivered, but uh I I

98:00

really appreciate that we went from the

98:03

'90s all the way to the future.

98:05

>> Awesome. Well, thank you so much for

98:06

having your guy. It was terrific. I've

98:08

got to say Oxide is one of my favorite

98:10

companies, and I say this as someone who

98:13

has zero affiliation with them. [music]

98:14

It's just so rare to find a startup that

98:16

built both hardware and software and are

98:18

world class in doing both of these

98:20

[music]

98:20

and are so open about talking exactly

98:23

how they do it all. Honestly, the only

98:25

downside [music] I can think about Oxide

98:26

is how their server racks are built for

98:28

pretty large companies and are

98:30

definitely out of reach for hobbyist

98:31

devs. In this episode, I really

98:33

appreciated how much of a straight

98:34

shooter Bryan was, especially about the

98:36

impact of AI [music] tools. Yes,

98:38

everyone at Oxide uses them and they do

98:40

find use cases for coding and working

98:42

with documents, but it's eye opening how

98:44

it gives them basically zero help with

98:46

hardware engineering. This is a good

98:47

reminder that LLMs might be the single

98:49

best fit for coding related tasks. And

98:51

as [music] devs, we should know that

98:53

these tools might be more specialized

98:55

than many people think. I hope you

98:56

enjoyed the stories in this episode as

98:58

much as I did. If you'd like to learn

98:59

more about Oxide, I did a two-part deep

99:01

dive about the company, and you can read

99:03

it linked in the show notes below. If

99:05

you enjoy this podcast, please do

99:06

subscribe on your favorite podcast

99:07

platform and on [music] YouTube. This

99:09

helps the podcast a lot. A special thank

99:11

you if you also leave a rating on the

99:13

show. Thanks. And I'll see you in the

99:15

next one in [music] the next

Interactive Summary

Bryan Cantrill, co-founder of Oxide Computer Company, explores the evolution of cloud infrastructure from the 1990s dotcom era to modern hardware startups. He details the technical journey from Sun Microsystems to the creation of the Oxide rack, emphasizing the importance of first-principles design in hardware, networking, and software. The discussion also covers Oxide's unique culture of transparent, uniform compensation and a pragmatic view of AI as a tool that aids software development but struggles with the complexities of physical hardware engineering.
