The history of servers, the cloud, and what’s next – with Oxide
Can you tell us about the dot-com boom? We
did much more technically interesting
work in the bust than we did in the
boom. There's a degree to which
innovation requires some level of
desperation that good economic times are
kind of hard to summon that desperation.
How have AI tools changed how you're
working at Oxide?
>> Certainly we're using Claude Code a bunch
and people are doing that but for a lot
of the work that we're doing it is
helpful as maybe a polishing tool but
less as at the epicenter of its
creation. Can you tell me what it
actually means to design or build a
computer? Oh, it's very involved. Yeah,
it's very involved. So, first of all...
How have servers and cloud
infrastructure evolved since the
late 1990s, and what is next? Bryan Cantrill
was a distinguished engineer at Sun
Microsystems during the dot-com boom
and dot-com bust, built a small competitor
to AWS called Joyent, and is now
the co-founder of Oxide. Today, we go
into the history of servers and the cloud
from the late 1990s to today,
the challenges of building hardware like
the Oxide computer from scratch, how the
Oxide team uses AI and why they find it
practically useless for hardware
engineering challenges, why
Oxide builds everything as open source,
how they manage to work remotely as
a hardware startup, and many
more. If you'd like to understand more
about how the cloud works and learn how
a nimble hardware plus software
startup operates, this episode is for
you. This podcast episode is presented
by Statsig, the unified platform for
flags, analytics, experiments,
and more. Check out the show notes below
to learn more about them and our other
season sponsor. So Bryan,
welcome to the podcast.
>> Oh, it's great to be with you. Thanks
for having me.
>> I'd love to jump back in time, back to
the 1990s, because you're someone
who's been around the block, and back
then you worked at some interesting
companies, including Sun. If you
could give us listeners and viewers
a sense of what it was like in the '90s
in terms of
software and servers, what was the vibe
like?
>> Yeah, it was an interesting
inflection point because I was
interviewing in 1995. I started in 1996.
So, uh, I would say that the internet,
and I mean, HTTP had been developed in
like '93, '94.
We had kind of the first web browsers
but it was still very very very new and
the internet was just kind of primed for
takeoff. Java had been Java had come out
in maybe 1995.
Java had kind of taken off immediately.
So there was a lot of really exciting
energy, but it was nowhere near what it
would become. Even a couple
years later it became very frothy, of
course, and it was exciting. Um, it was
very clear to me I went to school
actually in the east coast but just
coming out here to Silicon Valley the
energy was was extraordinary um and
really knew that I wanted to come out
here for my career. So at Sun those next
couple of years I mean I I got very
lucky really because Sun was in the
right place at the right time with the
right technology which you know
sometimes you only appreciate in
hindsight um because it was so explosive
and if you wanted to build a website as
part of that dot-com boom, you were buying
Sun servers you were buying Cisco
switches
>> Now why was this the case? Because again,
just taking myself back, just being a bit
naive, I would assume that, let's say, hey, I'm
in 1995. I want to build a website.
Could I not have just used
a PC and spun up a server? Did it not
work like that, or how did it work?
>> I mean, a PC, like, maybe, but you
didn't really have an operating system,
right? Because Linux is
very, very new. Linux is not...
>> Oh, back then.
>> Oh yeah, definitely. Linux, you know,
would be like Haiku today, which is an
operating system you haven't heard of
for a reason. It's kind of like a
hobbyist operating system. You know what
I mean? You'd be like, what? No, you
wouldn't. And then you kind of had
the BSDs. FreeBSD was
certainly out there. Also still very
much under the shadow, though, of this
lawsuit from AT&T. So the Unices are
kind of... there's not really open-source
operating system options. Uh, there was
the, um, actually this is kind of funny,
because where was the GNU option? Uh,
it was going to be the Hurd operating
system. So Hurd was kind of like the
Duke Nukem Forever of its time. It was
the operating system that was constantly
coming kind of next year and next year
and next year, and it was going to be
microkernel-based. And so, you know,
it's kind of amazing, but you really
couldn't do it on PCs because of the
lack of system software. And actually
part of my attraction to Sun was I had
used Solaris on SPARC. I knew
Solaris existed on x86 but I had never used
it. So I was excited to use Solaris on
x86.
>> And so what did Sun build? You mentioned
Solaris. That was the operating system.
>> Solaris is the operating system. We built
servers. So we built SPARC-based servers.
Um, we built desktop machines. So
Sun was a computer company. It was a
systems company. So we built desktop
machines, built some ill-advised
laptops. So basically desktop machines,
workstations. But then at that time in
the '90s, what was really exploding was
everything from those kind of
workgroup-style servers up to
bigger and bigger servers, up to
the very large machines, machines
that are physically the same size as
what Oxide makes today. And I remember
vividly, in what would have been like
'97, '98 maybe, Greg Papadopoulos, then
the CTO of Sun, giving a talk to the entire
company saying, here are the top three
applications for Sun Microsystems:
databases, databases, and databases. So
that gives you an idea of kind of how it
was being used. And this is again
in that knee up
of that dot-com buildout where, again,
if you wanted to really build a web
presence, you were going to
use Java, you were going to
do it on Solaris, you were going to do
it on Sun servers. Um, and
it was kind of, it was a wild time
for sure.
>> And can you tell us about the
dot-com boom? Because, you know, right now I
know AI is pretty exciting and it feels
like we're in a special time, but
what was it like, especially working at
what sounds like it was the
epicenter of it?
>> You know what was
funny is it was frenetic in a
way that was not always positive. So,
one of the things that is just a
point of fact, and one can take from it what
one will: we did much more
technically interesting work in the bust
than we did in the boom, because I think that
when you're in boom times, you know,
everyone kind of secretly believes
that this is because of me, that
it is because of the thing that I am
working on. You know,
one of the early
technologists behind Java once told me
with a straight face,
"Every server that Sun sells, they sell
because of Java." And I'm like, you know
what's most amazing? That you
believe that is actually the more
interesting fact. I mean, it is
obviously false, especially with, you know,
databases databases databases being the
top three applications but that that
kind of reflects the zeitgeist of the
time, that everyone believes that this is,
you know, if I work on the microprocessor,
it's because the
microprocessor is perfect; if I work on
the operating system, it's because, oh,
this is the operating system that people
are buying the machine for. And
that doesn't really lend itself
to real innovation.
I think there's a degree to which
innovation requires some level of
desperation, and in good economic times
it's kind of hard to summon that
desperation sometimes. So, I think that
during the boom it was and it was just
it was frothy and it felt like there was
a period of time where I'm like this
obviously can't go on forever, and you
know, The Economist is having these very
gloomy covers about how this is all
going to end and it's going to be an
apocalypse, which I believed, and then I
just stopped believing it. And I'm like,
well, maybe The Economist is right. It
just went on longer. And, you know, one
of my early life lessons from the boom
and bust is these things go on longer
than you think possible.
>> But when they growth
>> in terms of the boom, when you're in
frothy times, that boom will go on
longer than you think possible.
>> Mhm.
>> And when it switches, it will collapse
faster than you can fathom.
>> In the boom, do I understand correctly
that customers were just wanting to
buy your servers? They were flying off
the shelves, all these companies? And in
day-to-day work, what did
it mean for you?
>> So I'll tell you,
day to day it meant, first of all, it
meant that traffic was terrible.
You know, you couldn't get
housing, you couldn't get, you know,
everything was in short supply.
Customers, you know, they
are buying. We had a customer that, you
know, was going to buy 19,000
servers, which is obviously a
very big number.
>> And these were these massive big
servers, right?
>> Yeah, well, in that case those were
actually 1U servers to build out a
broadband initiative. That actually was a
company called Enron. You know, I remember
vividly, we were at a dinner, uh, here
in the city at a restaurant called
Aqua, which is a very kind of fancy
restaurant, long since out of business,
and I don't think Aqua survived the bust.
And we were at Aqua with a bank
who was a customer of Sun's, and they
were spending a galactic amount of money
every year with Sun. And we were at a
dinner, and I just remember, I mean, it was
the kind of like 19th century
Gilded Age kind of dinner. People are
ordering, you know, nine courses. What I
remember is at the end of that having
Château d'Yquem, which is a Sauternes. So I
don't know very much, I know very
little about wine. I know nothing about
Sauternes. What I did know is there was
someone who knew wine, and it's like, we
are all going to drink the 1952 Château
d'Yquem.
And I remember
being like, I'm not much of a
drinker, but I was too drunk at that
point to really appreciate it. So I have
had this Sauternes that, you know,
oenophiles kind of live their life to
drink, and I'm sad to inform you that
there's one less bottle of this precious
vintage because it was poured down the
gullet of a 20-something dot-commer who really had...
And I just remember being back in my
apartment, being literally drunk on
Château d'Yquem, in Potrero Hill,
and remember thinking to myself, this
can't last, this is not sustainable. And I
swear the dot-com boom turned to a bust like
that night. That is September
of 2000. So, uh, Pets.com had kind of
busted out and a bunch of the NASDAQ had
busted out early in 2000. Uh, the traffic
got lighter early in 2000. Anyone who
was here would be like, the absolute
spookiest thing is it went from
gridlock to COVID-like traffic in
the span of like a month.
>> Without COVID happening.
>> Without COVID happening, with only the
NASDAQ collapsing. And you're like,
"Okay, that's very odd." And then 2000
kind of muddled along, and then
that dinner was in September of 2000,
and, uh, what really stopped was the
telco buildout. So there was a lot
of telco buildout because people are
like, the internet is the future.
>> And telco buildout meaning the towers,
the servers?
>> The servers, the infrastructure, and
then all of the concomitant, the
fiber. Like, JDS Uniphase was a huge company.
You had these companies that were,
you know, Global Crossing and MCI
WorldCom, and all these companies were
explosive, and everyone believed that the
internet is the future and this is
an important thing, and they
were right. They were right.
>> Bryan just said how an important lesson
of the dot-com boom was that people who
believed the internet would be the future,
they were right. Today we're in a
similar stage with AI. It's pretty
likely that AI will be part of the
software stack in the future, even if
timing is harder to predict. The latest
shift is how AI agents are becoming a
lot more commonly used for development.
And this is a great time to talk about
our season sponsor, Linear, and how they
think about collaborating with agents.
Linear has taken an interesting approach
here. Instead of building one
proprietary AI assistant and locking you
into it, they built an open API and SDK
that lets any agent plug into your issue
tracker. That means you don't need to
wait for Linear to build the features
that you need. You can connect the best
coding agents on the market, like Cursor,
GitHub Copilot, OpenAI Codex, and
Devin, or you can build your own agent
for your team-specific workflow. It's a
fundamentally different approach from
most issue tracking and project
management tools on the market. You get
optionality and the experience is
surprisingly natural. You assign an
issue to an agent the same way you'd
assign it to a teammate or you can
simply mention the agent in an issue
thread. Cursor can then pick up a bug,
understand the context from the issue, and
open a PR. Codex can explore a fix while
you're focused on something else.
Sentry can run root cause analysis when
something breaks. It's pretty powerful
what you can get these agents to do. And
here's what I like: you, the human, stay
the accountable owner. The agent works
for you, not instead of you. You review
the work. You decide when it's good and
when it ships. If agents are going to be
a part of the tool set of building
software, and it feels to me they
increasingly are, you'll want a system
that's actually designed for them.
Linear is a system like this. To learn
more, head to linear.app/agents. And
with this, let's get back to the point
where Bryan was saying how those
believing the internet will be the
future back in 2001 were right.
>> This is
the other thing. It's like, they're
right. And so a very famous impact
crater from the dot-com boom is Webvan,
right? Webvan was delivering groceries,
which many people today are going to get
their groceries delivered, right?
Instacart. It's like, they weren't wrong,
but their timing was off and they lost
track of the underlying economics
completely. And so when it busted out,
so in the fall of 2000, uh, in
November of 2000 in particular, there
were zero orders from
telecoms at Sun. Like, it went to zero.
>> Wow.
>> And, you know, you're kind
of used to ups and downs, but
that's just like off a cliff. And
from that point, you know, going to
2000 and then 2001, it was
then very, very grim. I would say that
the thing that happened through the
bust was layoff after layoff after
layoff, because companies had kind of
built themselves and geared themselves
around these fat times lasting forever,
and now they were gone. And
as frothy as expectations were during
the boom, they were that much more negative
in the bust. People were like, everything
is, it's the end of days.
>> And were you a software engineer
back then?
>> Yeah, software engineer. Yep.
>> And then,
as a software engineer, both you
and also thinking about your
colleagues back at the time, or friends,
how did it impact you? Were you kind of
just chugging along or...
>> So I would say that lots of people
left, and you had the statistic of,
you know, the U-Hauls were 10 to 1 out of
the Bay Area. So people moved away,
and the thing that I noticed is that
the people that had moved out to Silicon
Valley because they really had
an interest in the technology
all stayed and were not
adversely affected, honestly. I mean,
um, yes, every one of us, if you had
equity in your company, which of course
you all did, you try not to
overthink it, right? You just
try to remind yourself, I never
had it to begin with, so it's hard
to, you know, but it's definitely gone. Sun
lost 98% of its value, um, so it's
definitely gone. And, you know, there was
something and I think it also like a
boom can get you to care about things
that you actually don't care about and a
boom can get you to because in a boom
everyone is so financially driven that
it's hard not to become financially
driven. But it's like that's actually
not why I got into this. And so during
the bust, I'm, you know, definitely able
to put a meal on my table
and a roof over my head. Um, but
it was really a reminder about
what's important. And again, we
did do better technical work in
the bust than we did in the boom. And I
think it's because in the bust it's like,
okay, now we have to focus,
we have fewer resources, and the
fewer resources actually force more
creativity. So, you know, all of the
things that we did, certainly speaking at
Sun and in system software, so ZFS and DTrace
and the Service Management Facility, all
these things that were really
revolutionary for the operating system,
all happened in the same kind of
post-bust period of time. So
all of those things happened from
2001 to, say, 2005.
>> And so what were these specific
innovations?
>> So I'd gone to work at Sun to work
with Jeff Bonwick, and as long as I had
known Jeff, from the mid-'90s, Jeff had
wanted to rethink file systems. And now
finally, in the early 2000s, he and
Matt Ahrens were able to really take
a clean sheet of paper to the file
system, and that's ZFS. I had a chip on
my shoulder about the way we understand
and debug systems, about the way we observe
systems. So I, along with two other
colleagues, did DTrace, which allowed us
to dynamically instrument running
systems. And you can kind of go down the
line, and there were a bunch
of things like this where
we... and I don't know that all of this
is related to the bust. It's just that
the timing lined up such that it was all
happening during the bust and what we
ended up with was a whole bunch of
interesting technology coming together
actually in a single version of the
operating system. And then, very
fortunately for us, and I do think this is
a bit of a consequence of the bust,
because Sun was definitely open to
new approaches, we open sourced
the operating system. So that happened in
2005, and that was very important to
give these kind of technologies eternal
life.
>> But I think, you know, we can never
predict the future, but to me it is
pretty positive in this sense that even
in the bust, hearing the stories,
innovation did not stop. Sure, you know,
sounds like it was probably harder to
get jobs, and there might have
been fewer of them, but, you know, the
industry kept innovating, and
what you said that I didn't expect
to hear is that it was a bit easier to
innovate.
>> It's just less manic. We were able to
focus more. Now, I mean,
not that one should necessarily pine
for a bust, because busts are brutal, but
there is a clarity that you get too. Um,
so I mean, ideally you would like to
just, can we just be normal
economically? But nope. Apparently
in high tech we've got to be on or
off.
>> So, bust aside, in the early 2000s,
leading up to this internet boom, the way
most companies went about it was
buying Sun servers with Solaris
installed, and everything, hardware
and software, came together. It was
beautiful. It worked well. Again, I
heard from folks who did it. What
happened then? Because when I got into
tech in the 2000s, I did not hear about
Solaris, and that was not how it...
>> No. Right.
>> So what was the shift?
>> So the shift was, first of all, open
source, right? So, you know, we
said in the mid-'90s Linux was kind of
still very much a hobby project. Not so
by the 2000s, right?
>> So it grew up.
>> It grew up, absolutely, and it grew up because you
had a bunch of companies that really
backed up the truck. And, you know,
at first IBM and SGI, Data
General, some other companies, those
companies were very important because
they decided to contribute their
technologies, like XFS, right? XFS, many
people still use today on Linux. That's
from SGI. XFS was SGI's, on IRIX. That was
happening in kind of the late '90s,
and then in the 2000s, I mean, Google was
always built on Linux, right? And so you
had kind of the companies that
became that next boom were all
built on open source and indeed needed
to be built on open source. So they
economically relied on open source to be
able to build what they built. So then
it became much more practical to
certainly run Linux, and I think the
other BSDs, and we open sourced
Solaris. So there were a lot of options
that were now available. So that
shifted. I think the other thing
that shifted is that, I mean, SPARC
bluntly lost to x86, and you, Sun...
>> And SPARC is a hardware architecture?
>> SPARC is a microprocessor. Yeah. And
uh, there was a time in
the '90s when, if you wanted the fastest
microprocessor, it was a RISC
microprocessor. It
was a SPARC microprocessor, or it was
MIPS, or it was Alpha. And x86 was a
commodity, and obviously
available with a personal computer, but
was not faster than those RISC
microprocessors. That
shifted in the late '90s, and, you know,
because we ran the operating system,
that was Solaris, on both SPARC and
x86, we could see how fast these x86
machines were, and could see, frankly, how,
you know, you talk to the microelectronics
folks, they
kind of dismissed x86 and
dismissed Intel, and you shouldn't do
that. And in particular, Intel was
very focused and architected their way
around what was called the memory wall.
Um, and they were able, in part because
they used speculative execution,
to actually make
microprocessors that became much
faster than the RISC microprocessors.
So by the time, say, you are in 2004, 2005,
if you want a leading-edge
microprocessor, it's x86. So that was
a big and important shift. So by the
time you're coming up, it's like, okay,
yeah, if I want this, I'll
just, I don't know, get like a
Dell box or Supermicro box and then
I'll put Linux on it, or maybe
FreeBSD, and away I go. Then the
next kind of big and important shift
that happened started in 2006, you
could argue, with S3, but then
especially in those next kind of
'07, '08, '09, with the introduction of EC2.
And now you have the cloud that
starts to come into play, and now
people were like, why would I even
screw around with a server at all? I mean,
it was so great to be able to just spin
up infrastructure.
>> Yeah. I remember one of my early
companies, mid-2000s, we had a server
room. We had server administrators. The
server room was always hot. And this was
a small company, mind you. This was
not a big one.
>> Every company needed
to do that. It's kind of amazing to
think that every single
company, no matter if you were a
website, you had your own server room.
>> And if you were a dev, you wanted to be
friends with the server admin because
when you wanted to deploy your stuff,
you know, they could do stuff
for you.
>> They could do stuff for you.
That's it, totally. And so I think that
cloud computing was really important.
This is not a deep thought, that elastic
infrastructure was really important, but
the ability to have API-driven
infrastructure. Um, and so for me
personally, I was at Sun, and
then in 2006 I started a storage
group inside of Sun, which was great. Um,
really successful group, but so
successful that it actually attracted
Oracle as a customer for the first time
in a long time. This is like a
little bit of residual shame
that I have, that, like, did I attract
the marine apex predator that ate
the company?
>> Because Oracle later bought Sun, right?
>> And then
they bought Sun, and that
closed in early 2010. Um, I left shortly
thereafter because I could see what
Oracle was.
>> Well, I never heard a story of your
potential role here.
>> That's right. So, yeah. And, uh,
maybe a
year later I gave a talk, uh, in 2011, with
some rather unvarnished opinions about
Oracle and Larry Ellison. In particular, I
cautioned people about
anthropomorphizing Larry Ellison. You
have to treat Larry Ellison as a
machine, uh, like a lawn mower. You stick
your hand in the lawn mower, it'll chop
it off. So, all right, so
I'm giving this talk in 2011. Again,
this is after I've left
what was then Oracle, and, uh, you know, I
was just saying things that I felt were
obvious, but, you know, the
audience is kind of gasping, and, you know,
people are coming up to
you after the talk like, do you think
there's going to be
retribution from Oracle? No, you're
misunderstanding. The
lawn mower is not angry at you. It's
a machine. It
doesn't have the mirror neurons
to be. It would
almost show me that I'm wrong for
Oracle to resent what I'm saying.
Anyway, so all the videos for
that conference go up and my video does
not go up.
>> Oh. Right. Okay.
>> And so my colleagues
were like, "This is an Oracle
conspiracy." I'm like, "This is not an
Oracle conspiracy," which it wasn't. It
wasn't orchestrated by Oracle, but
what I underestimated was the
fear of the conference organizers. So
they themselves were terrified of
offending Oracle.
>> Yes. Even though it probably would have
been fine.
>> No. So the talk did finally go up.
Before the talk starts, there is a
disclaimer: the views in this talk do
not represent the views of the USENIX
Association. And you're like, "All
right, I get it. Never seen this
disclaimer before, but fine." Then during
the talk, you know, the format of the
talk is you've got a slide, and then you've
got like a little blank strip, and then
you've got this talking head in the lower
right corner. There's kind of this
dead space above the speaker. They took
this disclaimer and they rejustified it
and they put it above my head the entire
time I'm speaking. And, I
mean, maybe in this regard they were
prescient, because to this day, if Ellison
is mentioned on Hacker News or Oracle's
mentioned on Hacker News, someone will
immediately cite minute 33 of this talk,
which is when I go on this kind of
Oracle rant.
Again, I don't view it as a rant. I view
it as just me describing what is
obviously true, that we all know. But
anyway, so I left Oracle
after they bought Sun.
>> So we're now around like 2010.
So cloud has taken off. x86 architecture
is everywhere. Linux is now winning
both for small-time servers but also on
the cloud. And then what happens? This
was an interesting time when Google
started to figure out that, hey, they
could do something interesting on their
cloud, right?
>> Yeah, that's right. So this is still a
little bit before that. So
from, I would say, 2010 to
about 2014
is a period of relentless
execution from AWS. AWS is executing so
extremely well. There are not really
other public cloud options. There's
kind of Azure kind of drifting out
there.
>> I think people forget that,
you know, GCP on paper has been
around from 2009, but up to like 2014 it
was almost like a joke.
>> It was a joke. I would say
it existed, but it was a joke.
And in particular, at every single
re:Invent, Amazon would announce a new
price cut. And if you were a competitor
to AWS, you were dreading re:Invent
because here comes another price cut. If
you are a partner of AWS, you're
dreading re:Invent because here comes the
announcement of a new service that
competes with what you're making.
>> I think people who have not been around
have forgotten, but it really did
happen, because it's not been the norm
the last, let's say, 5 to 10 years or
so.
>> Well, and in particular, they did a
couple things that are just like, man, you
have to tip your hat. I mean, Jeff
Bezos is the apex predator of
capitalism. Larry Ellison may be
the lawn mower, but Bezos is ultimately
the apex predator, because the thing that
was so impressive is they were able to
give people the idea that this was a
terrible business. So, in particular,
they did not break out their financials.
So, everyone's like, "Oh my god, what an
awful business. They're cutting
the price every year. You do not
want to... this is, you know, a
classic red ocean. It's bloody. You
don't want to compete." And so, we were
at Joyent. We were actually competing
head-to-head with AWS.
>> So you
were offering, uh...
>> A public cloud. So we ran a public cloud, and
then, unlike AWS, we were taking the software that
we had used to run the public cloud and
making it available for people that
wanted to run a cloud on-prem, on their
own hardware. So people that would buy
Dell or HP or Supermicro, they would
buy our software and they would run it
on there and get a cloud. So we ran a
public cloud and we knew what the
economics of a public cloud were. Namely,
pretty good. Margins were good. And so
what we knew, that Amazon
wasn't volunteering, is
that AWS, S3, was underwriting a war on
big box retail. S3 was paying for your
Prime shipping. It was a genius move.
And so...
>> So also some insider information that
you had because you did your own thing.
>> Well, we knew that the margins were very
good. And then of course, I mean,
you will be unsurprised to learn that
several of Joyent's most prominent
customers were retailers.
This was not lost on retailers. Retailers are like,
"Gee, I wonder what's happening."
Retailers are like, "If you think I'm
going to take my dollars and spend them
on AWS so Amazon can go to
war with me, no thank you." There
was a period of time when I felt like, in
order to be in the cloud, you have to
implement every AWS API. So there's this
idea that you had to be API compatible
with EC2. There's a company called
Eucalyptus that tried to do this. It was
just a disaster. And part of the reason
it was thought that GCP and Azure could
never compete with AWS is because they
could never be API compatible. And
so I am convinced that... because what
changes in like 2015? What
starts in 2015? Kubernetes. And I think
that part of that initial attraction to
Kubernetes is that people wanted to get
some optionality around their cloud and
they felt locked into AWS. They're
like, I'm not using all this stuff. I'm
not using Elastic Beanstalk. I'm not
using Greengrass. I'm not
using Redshift.
What I actually want is this kind of
basic infrastructure, and Kubernetes now
gives me this layer upon which I can
deploy and get some sort of true cloud
neutrality. So multicloud didn't really
exist, I would say, before Kubernetes, and
I think a lot of that, especially early,
momentum behind Kubernetes is around
this idea of, I need to have some
optionality in here. I want to
actually be able to go to GCP. So I
think, you know, I think it's
giving Google slightly too much credit,
but only slightly too much credit, to say
it is a masterstroke.
>> On the podcast I had Kat Cosgrove who's
uh released a project manager on
Kubernetes and you know she's been in
the project for a long time and I asked
her she's not she was never a Google
employee but I asked her why do you
think Google open source Kubernetes
which you know they have Borg which is
amazing and they kind of built honestly
a better version for for the for
external and they just released it just
like that. They put a lot of work in it
and to me it didn't really compute like
why would Google like what is the
business reason and she told me that she
thought again speculation from the
outsiders. She thought that they
probably thought that it would help
Google cloud
>> that's right
>> to have a container which is now
portable and now you can give the
promise that if you run this on Azure
especially AWS you could come over so it
kind of makes sense. Is this your
thinking? Yeah, absolutely. But I think
I think that is definitely the argument
that Kubernetes proponents would make
inside of Google
>> in terms of like why they did it. Nobody
prevented it. You know what I mean? It's
like they they kind of open sourced it.
>> Google was a pretty cool place in the
sense that it was very bottoms up as I
understand back then still.
>> Yeah. And and then I think part of their
you know, it was Craig McLuckie who really
pushed for the CNCF, the formation of the
CNCF around Kubernetes, to give it kind
of a foundation home. I do
remember one conversation with Craig: he
and I were talking early, as he's
contemplating the CNCF and he's like
well I think this is going to allow
Kubernetes to get the marketing dollars
that it needs. I'm like don't you work
for the most profitable company on earth
like do you really isn't it just like
gushing cash over there and you can't
get like you know a couple million bucks
for marketing for this thing but no
apparently you can't. So, but so I I
think that that the the argument that
people were making internally was about
we should be encouraging cloud
neutrality because we are the ones that
have something to win and they're right.
Um and and they did and GCP is now not
an afterthought. GCP is very important.
It's a very big business. And I think,
is Kubernetes to thank
solely? No, but I think it's played an
important role for sure.
>> And where are
we today in terms of the the hardware
and the software stack running
specifically thinking of these big
clouds what's happening inside the likes
of Meta these giants as I understand you
know they're no longer just like you
know ordering servers from Dell or or
wherever
>> never were never what they do
>> So it's kind of
funny because for all of these folks
they took a somewhat similar path they
never were because in Google's earliest
days they were assembling machines from
Fry's, you know, RIP Fry's, Fry's being a
local electronics shop that has long since
disappeared, but they were kind of
famously velcroing machines together and
finding
>> so so they bought like the processor,
the the different networking switch,
whatever.
>> And they had this idea that like it
doesn't matter what junk we run on
because, you know, our our software is
going to run as a distributed system. It
actually doesn't matter. We don't need
ECC protected memory because it doesn't
matter if your DIMMs fail. And so I
think they learned, well, it does matter a
little bit if your DIMMs have rampant
data corruption. Like, DIMMs failing,
that's actually not a problem. DIMMs, your
memory returning the wrong thing like
that is a problem. You can actually like
you turn that like next thing you know
like your software inserts that into a
row into a database and like yeah now
you got
>> yeah that is correctness is a problem.
>> Yeah. Yeah. Correctness is a problem.
It's like okay overshot the mark. So by
the time they're like okay we're not
going to velcro machines together. we're
not going. But what by that point in
time, you know, the business was
established enough that they actually
did they built the machines that were
fit for scale. So they have a great
book that was written in kind of
the mid 2000s, on the warehouse-scale
computer, where they talk about all the
things they did: DC bus bar, really
thinking about power across the entire
DC. So they kind of they went from from
being kind of too cheap for kind of Dell
or even Supermicro to then being much
better engineered than those systems
ever were. So they were never really
meaningful customers. Uh and ditto for
Facebook Meta. They were they were never
really meaningful. I mean they they
kicked them out very early and did their
own stuff. Brian just talked about how
Facebook built their own servers because
off-the-shelf solutions didn't work at
their scale. And what's interesting is
that companies like Meta and Google
didn't just build better hardware. They
also built incredible internal tools.
Tools for safe deployments, feature
flagging, experimentation, debugging,
analytics, the whole stack that lets
teams ship fast and with confidence.
Most companies never get access to this
level of infrastructure. You either
build it yourself, which takes years,
and large engineering teams, or you make
do with scattered tools that don't talk to
each other. That's exactly where Statsig
comes in. Statsig is our presenting
partner for the season, and they give
every engineering team access to the
kind of tooling that only the biggest
tech companies used to have internally.
At its core, Statsig is a toolkit for
safer deployments and experimentation.
You ship a new feature to 10% of users
behind a feature gate. You validate that
it behaves correctly, watch the metrics,
and expand to the remaining 90% only when
you're confident. And if something goes
wrong, you can turn it off instantly,
long before it affects everyone. And
safe deployments require visibility.
Statsig includes analytics, both product
analytics and infrastructure analytics.
So you can actually see what your code
is doing in production, errors,
performance changes, funnels, user
behavior, because you cannot ship safely
if you can't see what's happening.
Companies like Microsoft and Notion run
hundreds of experiments per quarter with
Statsig, velocity that used to require
entire platform teams to build and
maintain. This used to be infrastructure
available to maybe 10 or 15 tech giants.
Now startups and mid-size teams use
Statsig to ship quickly without breaking
things. If you want to give your
engineering team world-class tooling
from day one, go to
statsig.com/pragmatic.
There's a generous free tier, a
$50,000 starter program and affordable
enterprise plans. And now let's get back
to the conversation about the history of
computing and what might be coming next.
>> And and this was independent. So like
both Google and Meta both came to the
conclusion of like we should just build
our own stuff
>> and and Microsoft and and Amazon all
came to the independent conclusion
because the scale at which they needed
to run was not at all the scale at which
Supermicro and Dell and HP were geared.
What they were geared to do was to run
the servers in your server room where
you needed to know the devs, right?
Where it's like I'm going to have a
little rack. It's going to have six
servers. Then maybe it's got 12 servers.
Okay, maybe we grow to 24 servers.
That's what they were designed to do. If
you're like, "No, I want to buy servers
by the thousands because I've got a
public cloud business." Like, if you
want to buy servers by the thousands,
there is no product from those companies
for you. And in very, very basic ways,
well, like the DC bus bar at every
juncture, they've been designed to be a
personal computer that you happen to be
slapping many personal computers
together, but they're not designed to
actually run infrastructure at scale.
So, and that was happening inside
effectively all the hyperscalers. And
Joyent, meanwhile, was bought by Samsung
in 2016. Joyent was bought by Samsung
because their cloud bill was off the
charts and
>> they bought they bought you to
>> bring it in house.
>> Yeah. And and there was not a product
they could go buy. So they went to go
buy a company.
>> So you're like, "Wow." And it's like,
"Wow, that's a big AWS bill." It's like,
yes, very big AWS bill. But then that
was not a product or company that
was available for, you know, the next
Samsung. What does the next Samsung do? Like,
well that's one less company available
to buy. Um so when we were contemplating
the next thing in 2019 one of the things
that we had seen, and we
earnestly believe, is that, one, cloud
computing is the future of all computing.
Not a deep thought: elastic
infrastructure, API-driven infrastructure,
that is modernity.
Two, you shouldn't be able to only
rent that. You should be able to buy that,
own it, run it in your own data center.
Why would you want to do that? Well, you
might want to do that for risk
management for security or for economics
because it, you know, if you're at a
certain scale, you'd rather own it than
rent it.
>> And I think, you know, before Oxide or
like in 2019 or even in like, you know,
2020, 2021, if you were like a midsize
company, you know, like not big enough
to build out your own custom cloud and
build everything that the hyperscalers
did, you could like buy some
off-the-shelf like HP or Dell, like a
bunch of them. I think that's what
Basecamp did. I think they posted that
they bought a bunch of
these things. They rented a space in a
in a one of these shared or or or I
think two different locations. They put
in their boxes with all the memory and
then you know they kind of set it up and
and put it together. So I guess those
were the two options, right?
>> Yeah, those are the two options and I
think that, you know, Basecamp ended up
being a real poster child for the
economic advantage because, I mean, DHH is,
you know, obviously outspoken, and the
economic advantage was really really
really clear. They're also at a scale
which is like not the scale that we're
targeting, right? That the scale we're
looking at is a much larger scale. And
so the economic argument is actually
even more compelling when you're at that
larger scale. I love it when you know
the VCs that passed on us because they
felt there was no market then would send
me like the DHH blog post. It's like why
are you sending this to me? I should be
sending this to you. Like I know this.
We just knew the economics of it and we
couldn't predict
exactly what the trends would look like,
but believed that there would be
folks that were born on the public cloud
that would outgrow the economics of the
public cloud and want to go on prem.
>> Economics aside, what does it take to
build one of these things? And I I I saw
one of these things. We we'll put in a
picture of it. It's like a proper like,
you know, like my my 9 ft tall rack.
It's it's big. It's it it feels like
you're putting like I don't know like 16
or 32 of those of those like you know
Dell things in terms of size just to get
sense.
>> Yeah. Yeah. We've got 32 compute sleds in
there. That's right.
>> And and what does it take what did it
take to actually build it? What did you
need to design in terms of hardware and
then software?
>> Yeah. So well and we knew this too that
going into the company we knew we were
taking a clean sheet of paper right and
so we were deliberately like no we're
going to start with a problem. We're not
going to build it out of Dell, HP, Supermicro.
We're going to start with a problem and
how do you best solve the problem? And
as it turns out, there's a lot of technical debt
that had been accrued by this kind of PC
ecosystem. So I mean, you know, God,
where do you start? Uh just on the
environmentals like on power, right? The
fact that you got AC power in each of
these Dell, HP, Supermicro.
>> Yeah. So if you like put 16, you have
like 16 separate AC
>> times two because you have two power
supplies per 1U or 2U chassis. Two
power supplies. By the way, there are
two fans sitting on those power supplies,
and those fans are actually
what wear out. In
terms of the whirring fans, it's not
just coming from the computer, it's
coming from the power supplies because
those power supplies are dense. They're
packed with stuff. So, they've got to
overcome a huge amount of static
pressure. So, like that's not the way
anyone does it at scale. The way people
do it at scale is you've got a DC bus bar,
you've got a power shelf that
is much more efficient
>> and that rectifies from
AC to DC, and then you run DC up and down,
and then you blind mate into that.
So we knew we were going to do that.
>> That's a little electronics engineering
right there.
>> Yeah. Yeah. That Yeah. The power
engineering for sure and we knew we were
going to do that. We also knew that by
taking a clean sheet of paper that we
would have opportunity made available to
us that we weren't necessarily thinking
of and that manifested pretty early. So
we blind mate into power which is to say
that when you feed a sled in, that power
connector, you don't see it. It's at the
back. You lock the sled in, it blind
mates into power. And we had assumed that
we were going to do what Facebook and
Google and Amazon and others have done
and have networking out the front in the
cold aisle but as we were you know
taking a clean sheet of paper talking to
some connectivity vendors they asked us
like why are you wait a minute you guys
are like taking a clean sheet of paper
why are you putting cabling in the front
like why wouldn't you also blind mate in
the network and the networking
connection and we were like can you do
that? They're like oh you can definitely
do that like well why don't the
hyperscalers do that? It's like, oh,
they would all tell you that if they
could start over today, they would blind
mate the networking and they're just too
afraid to do it at this point, which is
like, I mean, that was like catnip for
us, you know, like they're too afraid to
do it. Like, okay, we've got to. And one of
the very early "holy god, we're going to
bet the company" decisions was
blind mating networking, because if
blind mating networking doesn't work,
you've got nothing. You don't have a
product.
>> And and so what is the difference in
blind mating networking versus
>> It means there is no cabling in the
system at all. So when you've got a a
sled, you are blind mating into a cabled
back plane. So it's cabled in
the factory. So the operator
>> So when the box comes in, that's why I
didn't see any cables. It's it's inside.
It runs inside.
>> It runs down the back. And so
>> versus when I look at the pictures of a
data center of let's say Google, you you
see they're very neatly organized. It's
like I love organization. So it's like
beautiful, but it's cables everywhere
and you can see.
>> So you don't have that.
>> We don't have that. And in particular,
so because there's no cabling, there's
also no miscabling, right? So every
computer is not actually on just one
network. It actually needs to be on
three. It's on a presence
detect network. It is on a
service processor network. And then it's
on that high-speed network that you
really care about like the actual
network. In any facility, you need
another network for power, environmentals,
and so on. It's very easy to have
miscabling: that's got to go to a
different router. There's
a bunch of just complexity that we
eliminate because we do that. And then part of
that decision came out of an arguably
earlier bet-the-company decision, which
was we did our own switch. So
in addition to doing our own compute
sled, we did our own switch.
>> and last time you told me about this and
in our deep dive we did a little bit
that like at first you said we did our
own switch and I was like yeah okay cool
you did your own switch and then you
told me that actually like that is a
second computer to build. Can
you tell me why?
>> And it's funny because
when we went through Sand Hill
initially raising money for the company,
nobody asked us.
>> Sand Hill Road.
>> Exactly. And we were
definitely, so people would be like, I've got a
technical question for you, and you're
like, oh god, here comes the switch. It's
the switch question. But then, no, it'd be some
other random question. Like, all
right, that's not a very good question.
But nobody was asking us about the
switch. And we were concerned about the
switch because we'd already come to the
conclusion in order to make this thing
really work, we had to do our own
switch. And the reason we had to do
our own switch: if we didn't do our own
switch, it would be a third-party
integration nightmare and we wouldn't be
able to actually solve the problem that
we're trying to solve, which is when
this thing shows up in your data center,
we want this thing to come out of
the crate. We want you to wheel it up.
We want you to put in power and
networking and go. We do not want you to
have to cable anything. The
level of operator
involvement should be really minimal.
So, we'd already come to the conclusion
that in order to make this thing
operable and manageable, we need to do
our own switch.
>> And so you're saying
that, like, buying, cuz a switch to me
sounds like a somewhat simple component
and you're you're going to tell me why
it's not.
>> Oh yeah, it's definitely not. No, but
that but that attitude is very
important. If you want to go build your
own switch, I encourage you to have that
attitude as long as you possibly can
because otherwise you won't go do it.
>> So what does your switch, switch
being obviously the networking
switch, what does your networking switch
do that made it so important for
you to build it as opposed to like going
to one of the many suppliers and saying,
you know, let's get your
>> not many suppliers. Oh,
>> so if you actually go to the actual
switching silicon is coming from like it
was like one and a half providers.
>> Oh,
>> it's all Broadcom and so what you're
actually talking about is Broadcom
silicon. What we discovered is
this actually interesting piece of
Intel silicon from a company
they had bought called Barefoot, and we
found Intel Tofino, which allowed us to
have true programmable networking. So
we use Intel Tofino. Intel later killed
Tofino. So, complicated relationship with
Intel over this. We fortunately have
procured enough Tofino; we
bought ourselves the time we
need to kind of design our next-gen switch.
But that programmability was very, very
important for us, and that we were not
going to get from Broadcom. Broadcom is a very
proprietary company. A bunch of the
things that we needed
in building that switch we were not
going to get out of Broadcom. So it
ended up being very important. We were
concerned I mean again another one of
these kind of bet the company decisions
very very concerned about about having
our own switch, integrating our own. And
what we found is that was a
win in so many dimensions. So many
dimensions that we did not anticipate.
And as now you can't imagine the company
without having to sometimes do stuff and
you might get some wins. that I
absolutely well I think also like
whenever you're deliberating something
big like that you it the fact that it is
big kind of forces you to really
deliberate and then once you commit to
it to taking that big risk you often see
unexpected dividends like well as long
as we're going to do this as long as we
are taking a clean sheet of paper as
long as we're doing our own switch we
can blind mate the networking. If we were
not doing our own switch, we really
couldn't blind mate the networking. We
really needed to be able to own both
sides of that in order to be able to do
our own switch or blind
>> a lot of us you know listeners, viewers
are software engineers, so we don't know
as much about hardware. Obviously, we
know we we know how the things work, but
can you tell me a bit on what it
actually means to design or build a
computer? Cuz you know, I I'll give you
the the novice approach, which is
obviously going to be wrong. But the
novice approach is like, oh, here's a
here's a processor, here's a few chips,
here's a mainboard, I'll just put it on
there and I'm done. But when I was in
your lab at Oxide, uh you told me that
one of the first engineers turned out to
be a radio frequency engineer. You told
me how this is great because of all
the FCC approvals and all these things
and I was like okay this is way more
involved than I ever imagined.
>> Yeah, it's very involved.
>> How do you build a new computer?
>> First of all, it's all I mean it would
be a lot easier if it were all slower,
right? The problem is it's very fast.
It's high speed. So the connection to
memory, via now DDR5, double data rate
memory 5, is ridiculously high throughput
and, from a signal integrity
perspective, really complicated. These
boards, by the way, ultimately this is all
analog. We think of it as digital, and it
is digital, but digital is like a lie
that double-Es allow us to tell
ourselves. You are actually
talking about signals that are racing
through a substrate, and with
PCIe or DDR5, those signals
are very complicated to lay out. That's
complicated. Then the actual, like, how does the
computer start? This computer is
like a triple seven,
right? Or, you know, a 747 used to be my
favorite jet to kind of pick on, but now
the 747 is retired, so I've got to pick
something else, and I'm not going to pick
another Boeing aircraft, I don't think.
An A380, I guess, right? I should pick an
Airbus. But you think about, like,
okay, an Airbus doesn't just, like, come
by itself like it needs an airport. It
needs like a runway. It needs it needs
all the infrastructure to feed it. Well,
so too for a microprocessor. Just
the power sequencing
for those things is very complicated. It
needs another surround that manages the
power distribution network that actually
manages its power on sequencing that
manages all of its environmentals that
manage its connection to memory to IO.
So it is it is just fractally
complicated uh to the point that people
often just take reference designs and
iterate on them. They don't actually
really innovate on this stuff because
it takes so long.
>> And you told me
this was really interesting last time
that uh as I understand reference design
means correct me if I'm wrong that
you're an electronics engineer or or
hardware engineer and you want to build
a new hardware and you take an existing
reference that has been tested, measured
out, like, it doesn't create accidental,
like, all sorts of radio frequency things,
and then you implement that. But you
told me that this is not what you did.
You also told me that it's pretty hard
to find electronics engineers who are
used to not doing reference design but
who are brave enough to like
>> who are brave. Yes. I would say that
in computer design in particular, the
high-speed designs are so hard. People
got very accustomed to taking the
reference designs and it was harder to
find folks that were willing to take a
clean sheet of paper, and we
ultimately found them. I mean, we've
got a double-E team that is
extraordinary
>> and double E is electronics engineer,
right?
>> And yeah, and absolutely fearless.
And in part because
they didn't spend their
careers at Dell and HPE.
No, they're coming from, like,
GE medical where they worked on CT
systems.
>> Wow. H how did that happen?
>> Uh how did they come to Oxide?
>> It's it's it's not but it feels like
such a different field. I would have
assumed naively that you know if you're
building a computer you'll you'll try to
get electronics engineers who have built
computers
>> You would think. And that
was probably our thought as well, and
then we discovered that we were
>> not getting along with those engineers.
Well, we didn't hire them, but
we were just finding
there's a lot of friction because there
wasn't a real first principles approach
from those folks. And this is where
you get to talk to
folks that have been at Dell for a
generation, and for any design
they're used to calling what's called
the FAE, which is the field
applications engineer, for, you know,
the voltage regulator. It's like,
well the FAE gives me the design. It's
like all right well how do you know that
it's the right design? Well no he there.
So it's like all right so like let's go
hire that person then let's forget you.
And we were really just we were
struggling. I was struggling to get
outside of my own personal network to
find um the right engineers. Um and we
were kind of brainstorming like how can
we um get people to see the company who
wouldn't otherwise see it
>> and specifically for hardware engineers
like we're talking about.
>> Yeah. And just in general, but specifically,
yeah, for double-Es it was feeling
especially acute. So
we're kind of brainstorming as a team
and uh you know one of our engineers
said you know I you know the values are
very important to us at Oxide which they
are and I relay Oxide's values and our
principles to people outside of Oxide
and they're like that's just
and I explain that like you know
normally I would agree with you, but
it's when I get to the compensation
that people's heads turn, because our
compensation is transparent and uniform
and people are like, "Wait, what?" And
I'm like, "I could write a blog entry on
it." Like, "Yeah, that'd be like that
would be great." I'm like, "Okay." And
so, up until that point, we had not
talked about it at all. We had not
talked about it publicly at all. I just
came up with the idea that like
compensation is just private. It's just
not something you talk about with
people, you know, and you go to Levels.fyi
or some of the forums,
you're, like, anonymously asking, people
are anonymously sharing that. That's how you
get information.
>> That's how you get information. And so I
kind of had this idea that it was that
it just is not something that you and so
we wrote this blog entry in March of
2021 and it sent our hiring nonlinear
and it wasn't that people were like oh
my god I want to work for a company
where everyone's paid the same like that
is like that's like
>> Yeah, cuz your compensation was both the
same and you also put the number
specifically I think it was something
like $200,000 back then.
>> Uh, yeah, it was a little bit less
back then, but now a bit more than that.
Yeah, now we just got another
raise so now I've lost track. It was 207
but now it's more than that. I actually
don't know because the one thing is when
compensation is uniform like you don't
keep total track of like oh did like
literally people were like wait a minute
like I got there's an error in my
paycheck I just got paid more. People
like, "No, no, we got a raise." Like,
"When was that?" Like, "No, it was at
the last all hands." Like, "Oh, you
know, I did have to go to the bathroom
like at the end of last all all hands. I
didn't listen to the recording. I guess
I missed my raise." Like, "Yeah, yeah,
you got to pay attention around here."
But it was more that what drew
attention was that people, engineers in
particular but just in general,
were drawn to a company that would be
so nuts as to do that. And
ultimately like that engineer that made
the suggestion was absolutely right. It
was the compensation that convinced
people that we take our values really
seriously, that we're a really
principled company,
>> which is you're paying everyone the same
base salary. Exactly the same. Yeah.
>> They're making the same as as you the
the electronics engineer, software
engineer, the whatever other role you
might have.
>> That's right. And when and and and I um
I don't know if uh you should just go
ahead and say it if you want to, but
many people are like, would you pay
support engineers the same amount? It's
like why do people always like pick on
support? They they would ask
>> Exactly. The answer to that is yes, and the
answer to that is, if you do that, you
find superlative support engineers. And so
I think we've got the
best support engineers in the business.
I think it's we we've got really really
phenomenal folks in support.
>> I heard
a small company called Gumroad
do this, where they paid
their support staff really high, again
about same as software engineers and
then they got support staff who were
software engineers and they could fix
the code or like write tools for
themselves.
>> And you get people for whom
because I mean you know there's a a
certain thrill in being in a in support
that because you've got someone with a
problem. It's technical. You get to come
up you get to be technical. get to go
solve a hard problem and then
immediately you get such gratitude,
you know and like that's a rush and if
there are people that are really drawn
to that like I love helping other people
I love that feeling that I get when I
resolve a problem for someone, that
immediacy. So one of the things that
we've heard repeatedly from several
of our support engineers is my heart was
always in support but but my career path
was forcing me into a different career
path, and I love the fact that I can get
back to where my heart is.
>> Yeah, that that that's nice because now
like it Yeah, you're not going to make
more by doing something that you're not
as into. I I love that. So, going back
to where we were, which is like you
build the hardware, you build this like
really complicated piece and you went
through electronics engineering, putting
it together. Let's talk software, cuz
that's super exciting. What
does it take to build software for
this? Did you start from, let's talk
from the low level. Did you start
from scratch from operating system? Did
you have to or could you use
>> and and there's kind of different
answers at different levels of the
stack. So on our service processor we
did start from scratch. We did our own
de novo operating system in Rust,
appropriately called Hubris, because we
had the hubris to do it. The
debugger for Hubris, by the way, is called
Humility, which feels appropriate for a
debugger. So that was de novo.
>> and this is open source right
>> open source. Yeah open source the entire
stack is open source. Everything we've
done is open source.
>> We can go on GitHub and check it you
know, go on GitHub and check it out. And
yeah, I mean, we've got God's own
revenue model because like you're like,
well, what if somebody like can download
it, run it on a different computer. It's
like knock yourself out because, you
know, we we think the best way to run
this is on the machines that we make,
and those are not free. An Oxide machine
is not, you know, that's not free
downloadable, but all open source. So,
um that was for the service processor.
For the host CPU, we really were
kind of in a quandary: what do we
want to do on the host CPU? Which
is to say, on the actual, what
was then AMD Milan, now AMD Turin, silicon.
We knew that in the
product we would do our own hypervisor
and our own control plane. It was very
so this is not something that you run
>> the control plane is is that controlling
multiple like like the whole like you
have a bunch of processors and memory
and all that and control plane controls
all that.
>> You plug this thing in, you power it on,
you put in networking. What you get is a
console that looks a lot like what
AWS would look like if AWS looked
better. I mean, it's a console. I
mean not to I mean look not to disparage
AWS but like we know that like design is
not really the strong suit.
>> We agree with that.
>> Yeah. Exactly. So it looks gorgeous. Uh
of course. And you've also got
your API, you've got your CLI, and
you're provisioning instances. Where are
those instances provisioned?
It's the control plane that makes those
decisions.
>> You are attaching virtual storage to those
instances. Where does that storage live?
It's the control plane that makes that
decision. So just like with AWS, you
don't need to know that stuff. That is
that's just happening. You you you're
using Terraform to spin up your cluster.
You're you're running Kubernetes on it.
You're knocking yourself out. So we
are delivering all of the software, from
that lowest layer, that service processor,
to the operating system that's running
on the host CPU and then that
distributed system very importantly that
distributed system um which we called
Omicron before the Omicron variant of COVID,
which was feeling very, like, ill-timed for
a very brief period of time. It was
feeling ill-timed, and now I feel like the
Omicron variant of COVID
has just been forgotten and now it's a good
name again. So it's like, you know, we just
>> It was really short-lived.
>> Short-lived. Yeah. So, you know, we
lived longer than the Omicron
variant of COVID, and that is our
control plane. Um and um that is a very
sophisticated body of software. Because
it's not enough to
just, like, provision an instance, right?
You need to do that
robustly, you need to do that via API, CLI,
and so on. But then all the software
that does that and keeps track of your
instances and so on, uh, it's very important
that you can actually update that
software that whole distributed system
you need to be able to update to a new
version of the software and this gets
really thorny right because in in a in a
public cloud you do that with a runbook,
right? I mean, even the, you know, we
don't feature it prominently, but even
in GCP and AWS, yes, there's a lot of
automation, but there are also humans
involved, and there are humans that are
taking the responsibility for
actually updating software. For sure.
Really? Yeah. I mean, again,
>> for the most part.
>> Yeah. I mean, there's a lot of
automation involved, but in particular,
if something goes wrong in an update,
you know, you've got DevOps that can can
hop in and figure out what's going on
and and get it rectified. We are
shipping a distributed system across an
air gap in an Oxide rack that's
potentially running in a secure
facility. We cannot be there if it goes
wrong. So, we need when when
>> especially because a lot of your
customers are buying it because they
want to do it themselves.
>> They want to do it themselves. So in many
ways the thorniest software problem for
us we had actually several thorny
problems couldn't pick between them
because they're all thorny for different
reasons. One of the very very thorny
problems was how do we ship a
distributed system that we can then
update and one of the things we did that
was important was like okay because it's
very easy to paint a road map that is
very complicated for update. You'll
never ship anything. So what we needed
to ship in that first product that we
shipped when you were back in Emeryville
2 years ago, we needed the minimum
viable update. We needed an update where
the software could be updated even if it
was painful. So what we did is we have
this thing called mupdate which is the
minimum update and mupdate in particular
required the control plane to be parked.
So we're going to take this rack that's
running instances, take it offline,
we're going to update it and then bring
it back online. And that was robust. It
was great and we got that working.
That's great. That is great and that you
can update it. But that's actually not
what you want in a cloud, right? You're
like, I sorry, I'm like using this thing
24/7. Like, I actually need these
instances to remain up while I
update it. But that gave us the platform
to go build that update functionality
into the software. Extraordinarily
sophisticated um and really an
extraordinary body of work. And actually
just recently um we had at our internal
meetup the engineer who led the charge
on that, Dave Pacheco, gave a presentation
looking back on two years of update.
And I got to tell you, I think this is
one of the best single talks on software
you'll ever see.
>> And we will link
this, but can you give me just a short
overview of like why this update is so
difficult because like some listeners
will will be used to just building
applications for example on the iPhone
and an update there it what it means
obviously I know this is way more
complicated but an update is there's a
new binary version and it replaces the
old binary version. Now, of of course,
you know, you're saying this is an
operating system update or or you know,
like with a car and of course you might
think like, well, you know, you could
just replace the old version with the
new version and there's some downtime,
but where is the complexity that
actually like puts all this thorn?
Because I'm sensing this is like
>> I am missing something something very
obvious.
>> So, because it's a distributed system,
when you've got an app on an iPhone,
it's not a distributed system. Oh, and
distributed system, meaning that you've
got a bunch of different nodes,
>> components that are going to speak to
one another. And it's like
>> those might need updating as well.
>> Oh, they definitely need updating.
>> Oh, they all need update.
>> Yeah, the whole thing needs to be
updated. You got to be able to update
all of the software in the rack.
>> Oh,
>> This is not just updating the
operating system. This is updating
absolutely everything.
>> So, you might need to update some parts
or all parts.
>> You need to update the service
processor, the root of trust, the drive
firmware, the host operating system, and
then all of the components that speak to
one another.
>> Okay. And then it's like okay so I mean
this challenge is fractally
complicated. I mean one of the very
basic ways it's complicated is like so
when we're updating we are moving the
system from from one version to another
version in between it's going to kind of
be in both versions. Like what does that
mean to have the system that's operable
while you've got some new components and
some old components? What if you change
your database schema from one version to
the next version which we definitely
have. Like you have to have a a a method
of doing that. What if you and for for
every one of these components, how is it
updatable? We've got to reason about
the system when it's in this hybrid
state and then it needs to be done in a
way that's very, very robust. So
first and foremost we had to develop
the foundation that allowed us to do
this absolutely robustly. And so the way
Dave and team did this is you know with
that foundation and then very slowly
lighting up different aspects of the
system and making it more and more
automatic over time and you know first
started running that on what we call our
dog food rack and did our first
automatic update on the dog food rack.
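
As a rough illustration of the "hybrid state" problem described here, the sketch below walks a handful of components from one version to the next, one at a time, and checks after every step that all components can still talk to each other. It is a toy model and not Oxide's update code: the component names, the `MAX_SKEW` rule (versions may differ by at most one), and the function names are assumptions invented for the example.

```python
from dataclasses import dataclass

# Assumed compatibility rule for this sketch only: two components can
# speak to one another if their versions differ by at most MAX_SKEW.
MAX_SKEW = 1


@dataclass
class Component:
    name: str
    version: int


def compatible(a: Component, b: Component) -> bool:
    return abs(a.version - b.version) <= MAX_SKEW


def system_operable(components: list[Component]) -> bool:
    """True if every pair of components can still speak to one another."""
    return all(
        compatible(a, b)
        for i, a in enumerate(components)
        for b in components[i + 1:]
    )


def rolling_update(components: list[Component], target: int) -> list[str]:
    """Update one component at a time, refusing any step that would leave
    the system in an inoperable mixed-version state."""
    log = []
    for c in components:
        previous = c.version
        c.version = target
        if not system_operable(components):
            c.version = previous  # roll back this single step
            raise RuntimeError(
                f"updating {c.name} to v{target} breaks compatibility"
            )
        log.append(f"{c.name}: v{previous} -> v{target}")
    return log


# Hypothetical rack contents, all starting at version 1.
rack = [
    Component("service-processor", 1),
    Component("host-os", 1),
    Component("control-plane", 1),
]
steps = rolling_update(rack, target=2)
```

Between the first and last step the rack is running a mix of old and new components, which is exactly the state that has to stay operable. A real system also has to handle things this sketch ignores, such as database schema changes and failures partway through a step.
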
Uh it was a really great feeling for
that team because this has been a very
long software road and it has been one
that has been very deliberate. Um and
and ultimately like and you know full
credit to Dave and team took us about
the amount of time that we thought it
would which is kind of very rare for
software, because I think software is so
fractally complicated, but that's only
because they've been very carefully
managing scope versus schedule,
because quality has got to be the
constraint, and Dave's talk goes into
that in detail in a way that I think is
just extraordinary.
>> So I'd like to
talk about the topic that is, you know,
on a lot of people's minds: AI
specifically, and AI tools.
>> Yeah.
>> How have AI tools changed how you're
working at Oxide specifically? Think
about software engineering, maybe maybe
even hardware. Are are you using these
tools? Are you experimenting with them?
>> For sure. We've been early on in
terms of using them. And yeah, I mean,
people are using them in
different ways. I mean, no part of
the Oxide stack is vibe coded. I think
that is safe to
say, but we are using it, and
again, different people are
using it different ways. We are, you know,
using it to do things that are tedious.
We're using it to generate test cases.
I use it
because I think the thing that it is just
unmatched at is document
comprehension. We've got a very writing
intensive culture. We've got a lot of
documents. It is great.
>> You always had that.
>> Yeah, always had that. And if you've got
a writing-intensive culture, you're
LLM-ready: not to generate those
documents but to consume them.
>> and to you know one of the things that
I've always wanted to do and it's still
like now is possible I I haven't quite
found the time to do it. Early on I
wanted to make an RFD glossary. So RFDs
are requests for discussion. We've got a
lot of technical terms. I wanted to make
a glossary. I tried to do that for like 3
hours, this is like in 2020, and I'm like,
this spreads to the horizon.
Just making a glossary is so
complicated. A glossary is something
that an LLM could just churn out. And
so there are lots of things that we are
doing to use LLMs. It
is clearly a very real, very big
shift in lots of different aspects of
software engineering. I think that,
you know but of course there are people
that are being kind of reductive about
it. I am definitely not a doomer. There
are a lot of doomers that are out there
and you know I tried to give this talk
about building Oxide itself, the
Oxide rack, and in particular the
problems that we had along the way that
an LLM was never going to be of any
assistance on. And the title
of the talk was "Intelligence Is Not
Enough," and one of the prominent doomers
actually did a reaction video to my
talk. It's like the only time I've ever
had someone do that, and my daughter, who was then
like 11, just thought it was
hilarious that someone had held their
own time in such low regard that they
would spend it recording a reaction
video to my talk. And so she was like we
I want to watch this. I'm like oh god I
do not want to watch this again.
Ultimately, the thing that was really
frustrating is this person obviously
disagrees with what I was saying, but
then when I was giving these very
concrete examples of here are the
specific technical problems that
required more than intelligence to
resolve that an LLM was not going to be
able to resolve. He literally fast
forwarded through those parts. He's
like, we just don't need this. This is
like this is just you're like, bro, this
is the talk. Like you you can't do this.
Like you're fast forwarding over the
actual, like, meat of the talk.
>> Can you
give an example of like a problem which
which you felt was this like even you
know if if we fast forward to like
>> the arbitrary future. Yeah. Yeah. Yeah.
Yeah. So yeah super simple. I mean like
like the I mean we've had many many
scary problems but um we had a uh the
CPU when we did our first bring up of
our first machine.
>> And then what does a
bring up mean?
>> A bring up means taking a board and
powering it up and trying to get it to
work for the first time.
>> I think you
mentioned that the term smoke test comes
from electronics engineers.
>> Oh, I they I mean a smoke test I always
think of a smoke test more as from
aerodynamic, I mean
aeronautical engineers, but yes, I mean,
smoke is
definitely a possibility. That's very
bad. You do not want smoke. That is bad.
No smoke, please, in bring up.
>> so so the bring up
>> but we are doing bring up and we are
unable to get the CPU out of reset and
after 1.25 seconds the CPU would
reset itself. What's going on? Is the
power network bad? We're doing all and
like when you have something like that
happen, it's like well what's happening?
It's like I mean it's it's just not
working. I mean like what do you tell
your LLM to be like it like it's not
working. I mean and they can maybe give
you some suggestions but in this case it
wouldn't. So we are going deep into this
understanding like are maybe the power
network is like marginal. No no no we
resolve that. No, no. We're We've got a
and actually we're working with AMD at
the time and AMD's like, "No, these
power numbers are amazing. Like your
margin is very good."
>> You're measuring it out. You're like
eliminating that one.
>> We're eliminating that one. You're going
through eliminating eliminating
eliminating. And um couldn't get we and
this was weeks and you're like we are we
don't have a company like we're dead. We
are absolutely dead.
>> And I feel like this is the kind of
thing that desperate, you know, you get
desperate. You're like, we're going to
try kind of anything. And the
engineer who was working on this um
actually looked at the protocol between
the CPU uh and the voltage regulator. So
there's a protocol that it goes back and
forth says hey I need this voltage and
you know this is voltage and one of the
things he notices is that there is no
acknowledgement packet from the
regulator. So the CPU asks for a voltage
to be set to a certain level and he's
noticing that there's no acknowledgement
packet back from the regulator
>> which should come
>> which should come and the test that
they've got something called SDLE which
is this great uh test goober that you
you take the CPU off you put on the SDLE
and it will measure the power for you.
Well the SDLE didn't care whether it got
an acknowledgement packet or not. The
CPU definitely did. So the
CPU says, I want you to go to 0.9 volts.
It never gets an acknowledgement back.
And meanwhile it's sitting at 0.9 volts and
it's just like, well, I never got an
acknowledgement, so we're going to reset
and I'll do it again. And that was due
to a firmware bug on the Renesas
controller. And so we got a
firmware update from Renesas and done. And
I mean, to be fair, the Renesas FAE
was great, and was like, well, you guys
really should have reached out a lot
sooner. Like, yeah, I know. We really
wanted to make sure that we got like
everything. Uh, and and that's the kind
of problem. And there were many many
problems like this where it's not merely
intelligence. Building a
board is not an IQ test. It's more, I
mean you need to be intelligent to do it
but intelligence is not enough. You need
these other kind of characteristics.
>> Then I feel you also need a team in this
case, right?
>> You absolutely need a team. 100% 100%
you need a team.
>> Like you're you're going to solve these
problems with, you know, you had that
engineer who just like thought of
measuring this out,
>> right? Well, an engineer who was
desperate, you know, because we were all
getting desperate. Um, and you know, we
and again, we've had many of these over
the history of the company. Um, and
you're right, you absolutely need a
team. You need you need a team. And you
see also the value when you have a team.
People have different ways of
approaching a problem. That diversity is
really important because you need and
actually sometimes this has happened
more than once with the company where
somebody kind of like is just kind of
like walking through the problem and
like someone's like, "Hey, I'm just joining."
You know, at a remote company anyone
joins, you know, they're joining the Google
Meet: "Yeah, I'm just joining because, you
know, I think that I'm following along."
And someone will just
be like, "Hey, I got like a dumb
question: are those virtual addresses?
Like, those look like similar virtual
addresses." And you get something
where someone's making and you need
someone to kind of like come and make
that observation that is maybe less
grounded in it and people like oh wait a
minute no that's actually like well
that's something to go check and so you
need that that different kind of
approach, um, that really a team
kind of uniquely summons.
>> and you know I think you might have
alluded to it but uh on the previous
podcast Armin Ronacher mentioned to me,
he's uh the creator of Flask he's he's
been around the block for for quite a
while and he's now doing a startup and
he said that right now It's just him and
his co-founder and he's got an army of
AI interns right now. He's prototyping
with them. But he told me, "I'd like to start
to hire people soon because people bring
energy, and you need energy for a company
to live and thrive." And I'm kind of
sensing the same thing.
>> Oh, for sure. No, for sure. And I, you
know, just listened to this great piece
with Richard Sutton who was the inventor
of reinforcement learning and and I
think rightfully I agree with him. It's
like you guys are conflating an LLM with
artificial intelligence. It doesn't have
goals. This is really important. So like
a prompt is not a goal and guessing the
next word is not a goal. And but like us
together as a startup and like wanting
to make it together, not wanting to die
here together, that's a goal. And that
so we can use that creativity. Maybe we,
you know, we use an LLM certainly as a
tool to help us achieve our goal, but I
I I do think that that's a very
important distinction.
>> And can you tell me like what kind of
tools you use and and what are the areas
that you you find it helpful? I
understand you're experimenting with
stuff and you know this is all work in
progress, but what are the areas? And
you mentioned that summarizing was
one example, with glossaries.
>> Yeah. Oh yeah. I mean, I use
LLMs as an editor all the time. Um I find
it to be a really I mean actually it was
funny. I had a blog entry that went on
Hacker News and someone was like, "Oh,
this is LLM written." I'm like,
"Actually, it is LLM edited, but the
only thing that I did based on the LLM is
I deleted an entire paragraph." So,
there's a paragraph that like wasn't
working and the LLM was like, "This
paragraph is not working." And I'm like,
"You know what? I'm just going to delete
the paragraph." So, I was like, I
don't know. You want to say that's LLM
edited? Because, like, every word there is
written by me, but there were some words
that were written by me that an LLM
suggested I delete, which I deleted.
So I mean I use it for um in writing for
sure. I mean, I also like to use it, and this
is like a stupid thing,
>> but when you're writing Rust and we
write a lot of Rust, especially
when you're new to Rust, you
wonder, like, the way I just phrased this,
is this idiomatic? Is there a better
way to do this that that's a great
little problem for like I got this small
little snippet of code. Is this an
idiomatic way of doing this? Is there a
better way of doing this? And that's a
great thing for an LLM, to make a
suggestion or not, or tell you that, like,
nope, that's an idiomatic way of
doing it, maybe I would make this small
adjustment. So I
find LLMs to be more valuable in the
small than in the large. Um so like
again, you know, hats
off to people who want to spend their
lives acting as middle management for
robots but like that's not necessarily
for me. Um certainly at Oxide I mean our
belief is that people take
responsibility for their own work. So,
if you want to have an LLM help you out
on that, that's fine. But ultimately,
like if there's a bug in this, like you
can't blame the LLM. "The LLM broke
my code" is, like, not interesting.
LLMs don't have
accountability.
>> And so, one thing that
is starting to spread across, I think, a
lot of engineering is engineers using
LLMs either inside your IDE with
autocomplete, or also kicking off
agents now. Now there are more advanced
ones, like Claude Code and Codex,
where it can actually run command
prompts and run your tests. Are you
seeing engineers use some of these
tools? And there's a little bit of back
and forth as well. You know, like it's
very clear that when you're doing
kind of more boilerplate things that are
so-called on distribution, which is they
they've learned like React TypeScript,
it can spit out a bunch of stuff, but
you strike me as someone who's doing a
lot more nuanced things.
>> Yeah. I mean, when you're
writing a bunch of C code
in the operating system kernel, it
is less valuable.
>> Yeah. Yeah. But so what are you seeing
across the team in terms of
>> you know I encourage people to to uh
experiment and I would say we're seeing
a wide variety of experimentation.
Certainly we're using Claude
Code a bunch, and people are doing that,
but I would say, you know, broadly
speaking, for a lot of the work that
we're doing, it is helpful as, like,
maybe a polishing tool, but less
at the epicenter of its
creation. It's not true of everything.
There's some software for
>> No, but but that that's also nice to
hear, cuz I'm kind of asking you more
to put on your CTO hat, as someone who's
also very, like, you know, you're very
hands-on and you know what's going on
with the industry, cuz a lot of non-hands-on
executives are kind of licking their
finger and thinking, oh, we must be 10 or
20 or 30% more productive. But
what I'm hearing is, like, things are kind
of the same as before, right?
>> Yeah. I mean I mean my big belief is
it's a tool. It's a powerful tool. I
mean, I will say that I, you
know, occasionally get people who are like,
well I don't want to use it at all. And
I'm like, you should. So, like,
>> you should try, right?
>> Yeah. Like, let me get you off of that
position and let me, you know, we had
Simon Willison on our podcast. Simon's
delightful. And, you know, one of the
lines that he has that I really love is
people should run these LLMs on their
own laptop where they run slowly and
poorly so they can see the bad output
that they generate so they can
understand what some of the limitations
are. So, I definitely love that. I
do think that people should
use them enough to know where they are
valuable. It's a very important tool in
the toolbox. You want to be aware of it,
but it's definitely reductive to think
it's the only tool in the toolbox
because it isn't.
>> Now, you're in such an interesting
company because, like, you know, you
don't just do software, but you do a
lot of hardware.
>> Yeah.
>> Have you found any use?
>> No.
>> No.
>> No. Zero. I mean, okay, zero is a bit
reductive. I have found it to be useful
when, for example, you know, you've got
a waveform of an I2C transaction.
Actually, amazingly, you can send that
to an LLM and have it, like, interpret
this, like, hey, am I seeing
I2C-compliant behavior? And
it can help you out on that a little bit
but it's like absolutely at the edges
>> Okay, so that's a 0.01.
>> Also, like I think people don't realize
like there are already tools for that.
Like that's what EDA is. You spend a lot
of money on like we're not laying stuff
out like by hand with graph paper. Like
this is like you've got, you know, when
you do layout for a board, there are a
bunch of rules that are automatically
checked for SI, you know, we we've got a
we do a bunch of simulation work. Like
we're not doing that by hand. We're not
we're using software.
>> Yeah. I saw you have those machines in
there. Like I I I saw that. I think it's
a bit reassuring to hear because I think
it's very clear like maybe we don't
realize as software engineers but
programming is such a great use case for
LLMs. It's a simple grammar, you can
validate it and I think it's sometimes
nice to just you know touch sand of like
an area that is very very different.
Yes.
>> But but it's it's cool that you're
checking and you know you're seeing if
if if it changes over time I guess you
always keep checking.
>> Yeah. And I I for sure and I think that
like, it is frustrating to me because
programming is such a good use case
for certain kinds of programs. So as a
result you end up with certain kinds of
programmers who just in in part because
of their own self-centric view of the
universe believe that oh this is just
going to replace every job and it's like
no not even close not even close and you
need to spend more time you need to get
outside a little bit more.
>> Yeah. So speaking of getting outside and
you know meeting different people what I
noticed when I went to Oxide is just,
like, it was great. You had double Es, as
you say, software engineers, people who used
to work on virtual reality at Oculus,
all in the same room. Can you tell me
about how big is the team? What's the
composition?
>> Yeah, so, you
know, I think we've got some
more offers going out tonight. So I
think we're on the order of, we'll be at
like 85. I should probably
keep better mental track of
it. We've got like 85, plus or minus, and
we you know we've been very blessed by
uh we've really put a beacon out there.
We've got a lot of people rooting for
the company. We've got a lot of people
and as a result we got a lot of people
want to work for the company. So um you
know we as we talked about last time um
we really put a lot on folks to describe
you know the work they've done what's
important to them why they want to work
for Oxide. I mean, a lot of my LLM use is
I will look at someone's materials. As
you can imagine, we've started to see
materials that are heavily LLM authored.
Potential applicants to Oxide, please do
not do this. We get people who
human-author their entire materials
and then they get to the last question.
Why do you want to work for Oxide? Why
do you want to work in this role? And
they have an LLM spit that out and
you're like, do you think you want to
work here? Like I'm just like, let's
leave aside whether this is like, you
know, is is this right or wrong or
cheating or not? It's like fine, I
guess, but like I don't think you want
to work here. like you're not gonna get
a job here because I don't think you
actually want to work here. Put it in
your own words. But that process
really has allowed us to attract people
who themselves are attracted to the
company and attracted to the the
culture, the problem, the team, and it's
just extraordinary. I mean, it's I just
feel so lucky to be with such an
unbelievable group of people across more
and more and more and more disciplines.
I mean the great thing about our
approach is it brings people in who are
you know God it's like I love this
approach for we talked about support
engineering we I people who are like god
I love this approach like finally QA can
stand on its own two feet I I feel that
that QA has been kind of subjugated by
by these other disciplines now QA is
kind of really thought to be as
important as anything else in the
company and it is because at some like
at some like monetary
perspective it is as important as
anything else. Uh
>> yeah, but but I remember like when I
worked at Microsoft back like 15 years
ago or so, the QAs were just on a lower
pay grade, you know, like the senior QA
was at the same as like I think software
engineer 2 or something which just kind
of implied
>> Yeah. you're less important.
>> You're less important. You're just less
important. And so like if you tell the
world that we think it's as important,
do you know who you get? You get people
who are extraordinary at QA. You get the
best of the best. And so, um, that has
been really exciting. And now we've got
people coming. I mean, I do love how
many different companies because my
belief is that like every company has
something to teach us that there there
is something positive you can take from
every company. Now, there are some
companies, it's like, you're really
scraping the bottom of the barrel.
>> Maybe not Oracle. They did buy Sun.
>> Yeah. Yeah. That's right. That's like
there are like even Oracle you can find.
There are that may be a bit of a
challenge. Let's not do that one. Uh but
you know what uh the and and at the time
I thought this was a negative but now
I'm like I see it. Larry Ellison makes
every hiring decision at Oracle.
>> So what's positive about that?
>> Exactly. I really
think that, kind of, the
founder mode the Paul Graham essay on
founder mode is talking about founders
that lost track of their own hiring. So
I think now I don't like the way Ellison
does it. I think that you want to have
you want to trust a team to make a
decision, but ultimately I believe that
the that the CEO of a company bears
responsibility on every single hire and
I think should be looking at every
single hire coming into your company and
that that is to me that is a very
important check on these kind of
companies. So there you
go, that is something
positive I take from Oracle, and it's telling
that your immediate reaction is like,
wait, what's positive about that?
>> Yeah.
I'm not sure you
undid that talk on Oracle.
>> Yeah. Fair enough. Fair enough. Exactly.
Yeah. And there from some companies more
than others, but I think that there are
and so I love having all of these
different experiences present at Oxide
because I do think that there's so much
to learn and we're trying, you know, you
want to take all the positive things cuz
I also think that every company
including, you know, people I actually
one of the questions I love that I got
once is like, what do you not want to
emulate from Sun? I'm like, "Oh, thank
God." Because, like, I think people think of
Oxide as kind of the second coming of
Sun Microsystems, and, like, there
are lots of things I loved about Sun.
There are lots of things I did not love
about Sun that I did not want to emulate
and so I think for any also any company
there are things we want to leave behind
and you know I think when you've got a
big diverse team you you get to go do
that.
>> And one thing that really
surprised me last time I I was at your
office is turns out that most people
were not in the office and and they work
remote and I I would understand for
software but how do you make that work
for hardware development where
physically you do need to you know be at
the the hardware sometimes I understand
you need to measure stuff I saw a lot of
like you know you know units sometimes
you need to go to like check on
manufacturing how does that part work
>> yeah so I mean uh a lot in people's
basements um so you know fortunately
we're making you know this is the
advantage of making a server and not
making like you know a tractor or like
you know we're not making like a you
know I don't know like a wind turbine or
something you know this is something
that people can actually model in their
basement um so that helps but then a lot
of even hardware engineering is using
these software tools using EDA tools
you're using SolidWorks, you're using
LTM you're kind of putting this thing
together you when you're doing layout
for example um which is very important
task when you're laying out a board all
of that is that that can be done
anywhere. That's all just software.
>> Okay.
>> And so there are things
where that physicality is very
important. And then when you're doing
bringup, you actually need to be at your
manufacturer when you do that. So like
that is also not in an office.
>> You would need to travel anyway.
>> Yeah. You need to travel anyway. And
anyone coming from the electronics industry is
like, "Okay, I'm interested in Oxide,
but please tell me I never have to go
spend any time in Taipei or Beijing."
Because you go out there for, you know,
or Shenzhen or wherever. And you're out
there for two weeks in a windowless
office trying to get this thing brought
up. And, um, all of our assembly is
done here in the United States, in
Minnesota. In fact, we've
got a bunch of folks out there this week
at Benchmark Electronics in
Rochester. So this is wonderful. And one
thing that you told me is one of the
things that's on top of your mind right
now as Oxide is growing. You still have
this culture of the the same
compensation full remote. So like it's
it's kind of been the same since the
start. What what will be the challenge
in in maintaining it? Because again you
worked at large companies. You've seen
how it goes. it can get tricky. What
what are the things that you're seeing
and what are the things that you're
trying to do to you know keep this kind
of startup vibe even as you might
be just bigger.
>> Yeah. So I I think that the thing that
is that is top of mind right now for me
um is and especially because you know we
raised a big series B which is great. Um
I think much more importantly we're
seeing a lot of customer traction which
is great. So we've seen paying off.
Yeah. No it really is. It's really great
and we kind of knew that was going to
happen in the abstract. Um, but it's fun
to actually see it happen and fun to
actually see um the customers that have,
you know, like, you know, I bought one
rack and I mentioned it, but now I want
to buy a lot more racks. I love what I'm
seeing and I want, you know, that's
great. Very, very, very exciting stuff.
That means we're growing the company a
bunch. And one of the things that's very
important to me, because I've seen this
happen so many times, is companies take
their eye off the ball when it comes to
hiring in in particular. And it is very
important to me that we continue to have
absolute discipline in the way we hire.
And uh we we're doing that. And
fortunately, you know, the nice thing
about our hiring process is every single
Oxide employee has gone through it. So
it's like I'm not having to persuade
anyone about the importance of our
process because everybody has gone
through it and that you know the thing
that we've got overwhelmingly in our
favor is because we've used our values
as a lens for that hiring. Oxide's
culture is important to every single
person at Oxide. That's what it takes to
to really preserve that. And it it
doesn't mean that it won't change at
all, but the bones aren't changing. Like
what what will change is it will be
bigger and it will be I think you know
and I love the fact that you know even
at like 85 we're already so big that you
know Steve and I know everybody at the
company but very few other people know
everybody at the company. So when we get
everyone together, it's like the best
party you've ever been to because you
know when in college I used to throw the
best parties in college. And the reason
I threw the best parties in college is
not because of me. It was because of the
roommates that I had. So like I was a
computer science student who played
ultimate. My roommate was an engineer
who was on the water polo team. My other
roommate was a was a history student who
was in the chorus. That's six different
demographics that don't normally
overlap. And then very importantly, we
made sure that the women's swim team was
always invited. The women's swim team,
they were like the foundation water
player.
>> Yeah, exactly. Waterfall. You always
check their calendar to make sure they
can make it. And people loved the
parties we had. Why? Because they would
meet people that they never met before
who were really interesting and they and
what I love about Oxide is we've got
this when when we get the whole team
together, people get the all these
delightful surprises. So people take me
aside and be like, "God, you know, Ry is
awesome." I'm like, "Yeah, I know. I
know. I know. I know. You know, too now.
That's great." But like, you know, or
you know, whomever it is. It's it's just
it's really exhilarating. And I think
that also serves to reinforce how
important what we've got is. I tell the
team like, we have lightning in a
bottle. And we cannot take it for
granted. And that means that every
single one of us needs to rise
to the moment. We need to do what our
customers need us to do, but we need to
do it in a way that protects and
preserves what got us here. So thinking
a little bit ahead, let's assume that,
you know, these AI tools will just get
better eventually. They'll be able to,
you know, help more even on on your kind
of low-level things. You've been in the
industry for quite a while. You've
you've seen a lot of shifts. What are
what do you think are are some of the
things both in software engineering or
in hardware engineering or just in
general engineering that will probably
not change even if we predict
>> uh these these things being like more
capable? Yeah, I think that what we I
mean I think that that it's certainly a
revolution. I think it's going to allow
us all to do more. I do think that we
are going to hit a point where people
understand that this is a tool where
because there's a little bit where we
still have this tension of like, oh, is
this going to be AGI? Is this going to
replace all jobs? And this is like
nonsense as far as I'm concerned. And
it's distracting kind of nonsense. And
we actually need to get back to putting
the tools in the toolbox of of the human
that's building it. Now these tools have
become much more powerful and I think
that that's going to be I think that's
extraordinary. I think it's important. I
think that also we'll be you know we've
got a lot of experiments right now we
humanity that I'm I'm not sure are going
to make economic sense. So you know we
we'll be figuring that out as well. Um
but I think that you know one of the
things I am a little bit worried about
is a little bit of despair from younger
software engineers in particular who are
like what's the point like an AI can do
all this.
>> Well, and there's also the news
even from more experienced software
engineers in the mainstream media
there's this news that company X is
laying off half their workforce because
of AI and by the way when we look closer
it's not because of AI but it it is
coming across and it does give not just
younger people a lot of anxiety, even
like mid-level folks or even
some more experienced like it it does
give a sense of I think it's the first
time in computer history that most of us
remember that there is this thing that
could threaten my job and I I think
we've just never had to deal with this.
I think you know there there are
industries that might have been a bit
more used to it.
>> Yeah, I would say that we I mean there
have been busts before. The dot-com
bust was a bust like a lot of jobs did
disappear, right? So I think that we but
this one has really come in in what feels
to be a broader and more permanent way.
I I I mean my view is like this is an
opportunity for I mean I think one of
the things we should be society really
encouraging is new company formation
because now I mean just like you're
talking to Armen about how you know just
a small group you know just Armen and
his co-founder were able to do so much
together right we should be really
encouraging that and what are some of
the gaps that we can all go fill because
ultimately like we we all need to find a
livelihood we need to find meaning and
the way we do that as engineers is we
build useful things. And so we're like,
we can now build many more useful
things. What would we go build? What
would if you could build anything, what
would you go build? And that's kind of
the question that people need to ask
themselves. It's scarier. It's scarier
than like go to this school,
concentrate in this, and then mama
Google will hire you and take care of
you and feed you breakfast. It's like
no, that's not like that's not what's
going to happen. And it feels a lot
scarier because it feels like there's at
some level like less security, less job
security. But yeah, that's true that you
know that and and that that's scarier,
but there's also a lot more opportunity.
>> And for a for a college student or or
some someone in school or or with little
experience who says like, look, my goal
would be one day in like 5 years time to
be as good that I could get a job at a
place like Oxide. It doesn't need to be
Oxide but again a place that has a high
bar they're they often hire experienced
people but I want to get there and yeah
there's all this AI stuff as well
happening what would you advise them in
terms of what to focus on what what
areas to study what things to do or how
to think about like you know like they
have the the goal is there what advice
would you have them part with
>> yeah so I think that they need that that
you need to have a different mindset and
that mindset needs to be not around how
do I create as much as possible, but
rather how do I get better? How am I
getting better every day? And I think
LLMs are a great tool to get better. How
can I learn about something new? Go
deeper. Go into something that I
wouldn't go into before. Get over that
kind of that fear. And one needs to
especially if you're coming, you're in
school now, you want to work at a place
like Oxide. It's like you you kind of
have to view it as like all right, like
you you want to play Major League
Baseball, that's great. like you're a
you're a great high school player. You
want to play Major League Baseball. It's
really hard. Got to get better every
single day and you're going to
need to be really focused on getting
better and you need to be like really
realistic about like what I need to go
do to get better. And it's hard but and
it's chancy because you might not get
there but you could get there and the
and you're certainly not going to get
there if you don't focus on that kind of
self-improvement. So I I I really think
that that it there is a shift in mindset
that that needs to happen or that one
needs to have I would put that way. One
you really got to have a mindset towards
getting better understanding more. What
do you not understand? There is lots
that you don't understand. I mean I
think one of the the the challenges of
modernity is that we delude ourselves
into thinking that we understand it all.
You don't. I don't. Like one of the
things that I've learned, I've joked at
Oxide that like I keep waiting for the
day that I know how computers work and
it like
>> like it wasn't today definitely wasn't
yesterday. It's not like it's going to
work.
>> You understand how
>> but I mean that earnestly in that the
the the the amount of of complexity that
I that I definitely I mean I knew but
also didn't know. It's like every day I
feel I'm still learning new facets and
not just like a computer but actually
delivering a computer to people. There's
there's so much to learn out there. So
many op and and now the way you've
got to view LLMs is not like this thing
is coming for my job. You got to view
it as like no I've got now this like
private coach tutor what have you that I
can ask any question to. It's not going
to I got to like fact check its answers
for sure but now you've got the
opportunity to and you got it is easier
to get into this domain than it ever has
been. And that is that's great and it's
powerful, but it can also be scary.
>> And as closing, what's a book or two
books that you would recommend to folks
and why?
>> Oh, so many good books. You know, my my
uh my I've got a I've got a 21-year-old,
an 18-year-old, and a 13-year-old. And
when the 18-year-old, who's now
a freshman in college, was a high
school senior, he got this assignment,
great assignment from his his English
teacher, namely go to someone that you
that that you know and ask them for
three books that they would recommend
that you read and I'm going to assign
you one of those three books to read and
you're going to read it and then you're
going to talk with them about that book.
And I'm like, "Oh, I love this
assignment." So he's like, "Dad, I'm
coming to you." And I'm like, "Oh, you
have... thank you." So of course my wife
was like, "Why didn't you come to me?"
Like, "Hey, look, I'm, you know, I
sorry, you know, look, uh, it was
great." So yeah, I I'll give you those
three books that that that I gave to him
and I think that each of these is really
terrific. Uh first is Soul of a New
Machine by Tracy Kidder. So this one won
the Pulitzer Prize uh in 1980 or 1981,
but about the the the building of a new
computer at Data General and it's a it's
extraordinarily well written and even
folks like well I'm not like what do I
have to do with a computer company in
the late 70s and early 80s? any engineer
will see something of themselves in that
book. It is just masterfully told. Tom
West who's the the is is is kind of a
complicated figure but Soul is
still I mean it it it it's literature
for us. So absolutely Soul of a
New Machine, every engineer should read
Soul of a New Machine by Tracy Kidder. Um for
me personally um very influential was
Skunk Works by Ben Rich. So about the
the the history of Skunk Works. Um
Clarence "Kelly" Johnson was
kind of the originator of Skunk Works at
Lockheed Martin. Uh extraordinary story
about what engineers can do when they
they kind of task themselves on the
impossible. Um
>> it's such a good book.
>> It's such a good book. Amazing book. And
then the uh the other one is Steve Jobs
and the NeXT Big Thing by Randall
Stross. So um Steve Jobs is kind of like
lionized by the industry but people
forget about a very important chapter of
his life namely NeXT and I believe we
are it it was just an anniversary maybe
it was the 30th anniversary it must have
been of the or maybe the 40th
anniversary Jesus of the the
announcement of the NeXT machine. So the
um Steve Jobs left Apple, was fired from
Apple, started a computer company called
NeXT. Uh really interesting company in a
lot of ways. Was at NeXT for a very long
time. It's a 13-year journey before NeXT
was bought by Apple. NeXT is bought by
Apple. Steve Jobs returns to Apple when
they buy NeXT. This book, Steve Jobs and
the NeXT Big Thing, is written before
Apple buys NeXT. And it is at Steve
Jobs's lowest moment. It it is not here
to praise him. It is here to bury him.
And it is very interesting about all the
missteps at NeXT and the thing that we
cannot know because Jobs obviously died
but I believe having read the book which
gets basically NeXT gets essentially no
treatment in the Isaacson biography. NeXT
is like six pages of glory. It's like
that's not what it was. Um, but Randall
Stross's book is is masterful and in
particular I believe that Jobs's
failures at NeXT were essential for for
the resurrection of Apple. And there
because you look at the way he handled
himself coming back to Apple was very
different from the Jobs that got fired
from Apple. And I think that like when
people look at Jobs like they don't
really take him apart. And I think you
should because I think he's a really
interesting guy. He's enigmatic. He's
someone like he did things that I that I
think are really fascinating and also
things that I really strongly disagree
with. So just to be clear, I'm not like
but I think that he's he's indisputably
an important figure and that book is by
far the best book. So Steve Jobs
>> No, I'm adding that. I actually want to
read that now.
>> Oh, it's extraordinary. It's very good.
>> Well, Brian, this was such a fun
discussion.
>> Oh, my my pleasure. I mean, we knew this
was going to be long and wide ranging,
so hopefully it delivered, but uh I I
really appreciate the went from from the
'90s all the way to the future.
>> Awesome. Well, thank you so much for
having me, Gergely. It was terrific.
>> I've got to say Oxide is one of my favorite
companies, and I say this as someone who
has zero affiliation with them. [music]
It's just so rare to find a startup that
builds both hardware and software and is
world class in doing both of these
[music]
and is so open about talking about exactly
how they do it all. Honestly, the only
downside [music] I can think about Oxide
is how their server racks are built for
pretty large companies and are
definitely out of reach for hobbyist
devs. In this episode, I really
appreciated how much of a straight
shooter Brian was, especially about the
impact of AI [music] tools. Yes,
everyone at Oxide uses them and they do
find use cases for coding and working
with documents, but it's eye opening how
it gives them basically zero help with
hardware engineering. This is a good
reminder that LLMs might be the single
best fit for coding related tasks. And
as [music] devs, we should know that
these tools might be more specialized
than many people think. I hope you
enjoyed the stories in this episode as
much as I did. If you'd like to learn
more about Oxide, I did a two-part deep
dive about the company, and you can read
it linked in the show notes below. If
you enjoy this podcast, please do
subscribe on your favorite podcast
platform and on [music] YouTube. This
helps the podcast a lot. A special thank
you if you also leave a rating on the
show. Thanks. And I'll see you in the
next one. [music]
Brian Cantrill, co-founder of Oxide Computer Company, explores the evolution of cloud infrastructure from the 1990s dotcom era to modern hardware startups. He details the technical journey from Sun Microsystems to the creation of the Oxide rack, emphasizing the importance of first-principles design in hardware, networking, and software. The discussion also covers Oxide's unique culture of transparent, uniform compensation and a pragmatic view of AI as a tool that aids software development but struggles with the complexities of physical hardware engineering.