How AWS S3 is built
AWS S3 is the world's largest cloud storage service, but just how big is it, and how is it engineered to be as reliable as it is at such a massive scale? Mai-Lan is the VP of data and analytics at AWS and has been running S3 for 13 years. Today we discuss the sheer scale of S3 in the data stored and the number of servers it runs on; how, seemingly overnight, AWS went from an eventually consistent data store to a strongly consistent one, and the massive engineering complexity behind this move; what correlated failure, crash consistency, and failure allowances are, and why engineers on S3 live and breathe these concepts; the importance of formal methods to ensure correctness at S3 scale; and much more. A lot of these topics are ones that AWS engineering rarely talks about in public. I hope you enjoy these rare details being shared. If you're interested in how one of the largest systems in the world is built and keeps evolving, this episode is for you. This episode is presented by Statsig, the unified platform for flags, analytics, experiments, and more. Check out the show notes to learn more about them and our other season sponsors. So, Mai-Lan, welcome to the podcast.
>> Thanks for having me.
>> To kick things off, can you tell me the
scale of S3 today?
>> Well, if you want to take a step back
and just think about S3, it is a place
where you put an incredible amount of
data. And so, right now, S3 holds over
500 trillion objects. We have hundreds
of exabytes of data. And we serve
hundreds of millions of transactions per
second worldwide. And if you want
another fun stat, we process over a
quadrillion requests every single year.
And what's under the hood of all that is also pretty amazing scale. If you think about what's underneath the hood of S3, fundamentally we're disks and servers, which sit in racks, and those sit in buildings. And if you try to think about all of the scale of what is under the hood, we manage tens of millions of hard drives across millions of servers. And that is in 120 availability zones across 38 regions, which is pretty amazing if you think about it.
>> So deep down it all starts with hard drives sitting inside servers, sitting inside racks, and then you have a bunch of these racks, and then rows of them, buildings of them, right? That's what you said. So there's tens of millions of hard drives deep down at the bottom of this.
>> That's right. In fact, if you think
about the scale of this, if you imagine
stacking all of our drives one on top of
another, it would go all the way to the
International Space Station and just
about back. And it's kind of a fun visual to have for us who work on the service. But, you know, fundamentally, it's really hard to get your brain around the scale of S3. And so a lot of our customers just assume the scale is there. They assume that all of the drives are always there, and they just focus on what S3 is to them, which is: it just works. It just works for any type of data and all of your data.
>> Yeah, even for me, for the scale: when you talk about exabytes, I actually had to look up exabytes, because I know of petabytes, which is already massive. If a company has one or two or three petabytes of data, it's tons. And an exabyte is, yes, a thousand petabytes, and you told me that you're thinking at that level. It's just hard to fathom.
>> Yeah, I mean, we have individual customers that have exabytes of data in what they call a data lake. Although last week I heard a great term. We had the Sony group CEO talk about what Sony is doing with data, and they refer to it as a data ocean, not a data lake. And so if you have exabytes of data in your data lake, it is in fact a data ocean, and that ocean is kind of fundamentally S3.
>> Can you tell me how S3 started? I did some research, and there was a story about a distinguished engineer sitting in a pub in Seattle. Who knows if it was true or not, but the story I read was that he was a bit frustrated with engineers at Amazon building a lot of infrastructure again and again.
>> Yeah. If you think back, S3 development really started in 2005, and we launched as the first AWS service in 2006. And if you think about the technical problems of 2006, a lot of customers were building things like e-commerce websites, like Amazon.com. And so the engineers at Amazon knew that they had a lot of data that at the time was very unstructured: it was PDFs, it was images, it was backups. And they wanted a place where they could store that at an economic price point that let them not think about the growth of storage. So they built S3, and they really built it for a certain type of storage. The original design of S3 in 2006 was really anchored around eventual consistency. And the idea of eventual consistency is that when you put data in storage for S3, we're not going to give you an ack back on your put unless we actually have your data. So we have your data, but the eventual consistency part is that if you were to list your data, it might not show up, because it's being eventually consistent. It's there, but it might not show up on a list. And we chose that consistency model at the time because we were really optimizing for things like durability and availability. And it worked like a champ for e-commerce sites and things like that, because when a human was interacting with an e-commerce site and an image happened to not show up exactly at the moment where you put the data into storage, it was okay, because a human would just refresh. And when we launched in 2006, here's a fun fact for you: 2006 is actually when Apache Hadoop first began as a community as well. And so we had a set of what I think of as frontier data customers, like Netflix and Pinterest, who took a look at things like Hadoop and put it together with the economics and the attributes of S3, which is unlimited storage with pretty good performance at a great price point. And they decided to build what we first began to call data lakes at the time: they decided to extend the idea of unstructured storage and include things like tabular data. And so the first wave of frontier data customers were adopting quote-unquote data lakes in about 2013 to 2015. Those were the frontier data customers born in the cloud. And around 2015 to, I would say, 2020, we started to see all the enterprises take that same data pattern of "how can I use S3, the home of all the unstructured data on the planet, and extend it to tabular data." And that's when, about five years ago, in 2020, I started to see a ton of exabytes of basically Parquet files. And, you know, I have worked on S3 for a minute: I started working on S3 in, I guess it was, 2013. I'd been at AWS since 2010, so kind of a while. And the rise of Parquet was really interesting, because what people did is they said, "Oh, okay. I like the traits and the attributes of S3, and I want to apply them to a table. So I am going to run my own Parquet data in S3." And then around, I would say, 2019 or 2020, we started to see basically the rise of Iceberg, and Iceberg at this time is incredibly popular: it gives table attributes to the underlying Parquet data, and customers started to adopt it in many of my largest data lakes across different industries and different customers. And so one of the things that we did in 2024 is we introduced S3 Tables.
>> Just for those who don't know what Iceberg is: it's an open-source table format for massive analytics workloads, right?
>> That's right. If I ask our customers with these data oceans why they care so much about Iceberg, it's because they want to be able to have what a lot of customers are calling a decentralized analytics architecture, where they can have lines of business or different teams within their company that pick what type of analytics to use, as long as it's Iceberg-compliant. And so if Iceberg is the common format for tabular data, then you have flexibility and choice for what type of analytics engines you use in a decentralized analytics architecture. And so I think that's one of the reasons why Iceberg has just taken off: it makes it easy to use data at scale, but it also gives a business owner, the chief data officers or the CTOs of the world, future-proofing for analytics. They can replace their analytics, they can change it out, they can adopt new types of analytics and AI, because you have this Iceberg at the bottom turtle of S3. We launched S3 Tables in December 2024. This year we've had over 15 new features that we've added to S3 Tables. And then this year, of course, we launched the preview of S3 Vectors in July, and then last week we became generally available. And so the story of S3 is like a story that our customers have written for data, but it's been super fun to work on all these different evolving attributes.
>> As an engineer, what is the basic architecture and the basic terminology I should know about when I'm starting to work with S3?
>> When we first launched in 2006, the whole goal for S3 was to provide a very simple developer experience, and we've really tried to stick with that. In fact, when the engineers and, you know, when we're sitting around and we're talking about what to build next, we always go back to that idea of how do you make things really simple to use in S3. And so, fundamentally, S3 has a lot of different capabilities now, but it's really about the put and the get: the put of the storage in and the get of the storage out. And where we can do that really well at scale, that is kind of the heart of S3. Now, we have a ton of extra capabilities that we've launched over time, but fundamentally, when customers think about using S3, they think about the put and the get.
>> Yeah. So, put data, get data, and I guess some of the other operations; it's a bit like HTTP, right? There's also delete, list, copy, a few other primitives.
>> There is. And if I think about where we have gone over time, we've added capabilities on top of that just based on what developers are trying to do. Okay, let's just take put. We recently added a set of conditionals to the put capability: last year we did put-if-absent and put-if-match, and this year we did copy-if-absent and delete-if-match. And the core thing for us with conditionals is that we can give developers the capability of doing things like the put, but doing it based on the behaviors of their application.
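A hedged sketch of what these conditionals buy an application, using a toy in-memory store rather than the real S3 API (in actual S3 these are expressed as HTTP preconditions such as `If-None-Match: *` and `If-Match: <etag>`; the class and method names below are invented for illustration):

```python
import hashlib

class TinyStore:
    """Toy in-memory object store illustrating conditional-write semantics."""
    def __init__(self):
        self.objects = {}  # key -> (etag, body)

    def put_if_absent(self, key, body):
        # Succeeds only when no object exists under this key.
        if key in self.objects:
            return None  # conflict: caller sees a precondition failure
        etag = hashlib.md5(body).hexdigest()
        self.objects[key] = (etag, body)
        return etag

    def put_if_match(self, key, body, expected_etag):
        # Succeeds only when the stored ETag matches what the caller last saw.
        current = self.objects.get(key)
        if current is None or current[0] != expected_etag:
            return None  # someone else changed the object in between
        etag = hashlib.md5(body).hexdigest()
        self.objects[key] = (etag, body)
        return etag

store = TinyStore()
etag1 = store.put_if_absent("logs/day-1", b"v1")          # first writer wins
assert store.put_if_absent("logs/day-1", b"v2") is None   # second writer loses
etag2 = store.put_if_match("logs/day-1", b"v2", etag1)    # read-modify-write succeeds
assert store.put_if_match("logs/day-1", b"v3", etag1) is None  # stale ETag rejected
```

Put-if-absent gives applications a "first writer wins" primitive, and put-if-match gives optimistic concurrency for read-modify-write loops, without any server-side locking.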
>> Outside of the get and put, the basic operations, I guess the base terminology that you should just know about is buckets, objects, and keys, right? That's how we think about our data.
>> Yeah. And now it's not just objects. If you think about the two latest primitives, or building blocks, we've introduced as native to S3, one of them is the Iceberg table, with our S3 Tables, and the other one is vectors. And under the hood of an S3 table is a set of Parquet files that we're managing on your behalf. But that's not the case for vectors. A vector is just basically a long string of numbers. And that is a new data structure for us, and it's sitting in S3 just like your objects.
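To make the terminology concrete, here is a toy model of those nouns, with invented names and layout (not the real S3 data model): a bucket is a flat namespace of keys, an object is opaque bytes under a key, and a vector is a list of numbers you can run similarity queries over:

```python
import math

# Toy model of the S3 nouns discussed above; illustrative only.
bucket = {}                                 # a bucket: a flat key -> value namespace
bucket["photos/cat.jpg"] = b"<jpeg bytes>"  # an object: opaque bytes under a key
bucket["vectors/cat"] = [0.9, 0.1, 0.0]     # a vector: "a long string of numbers"
bucket["vectors/dog"] = [0.8, 0.2, 0.1]
bucket["vectors/car"] = [0.0, 0.1, 0.9]

def cosine(a, b):
    """Cosine similarity: 1.0 means the two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    return dot / (norm(a) * norm(b))

# The point of storing vectors: nearest-neighbor queries over the keys.
query = [0.8, 0.2, 0.1]
best = max((k for k in bucket if k.startswith("vectors/")),
           key=lambda k: cosine(bucket[k], query))
assert best == "vectors/dog"
```

The key/object half behaves like a plain blob store; the vector half only becomes useful once the store can answer "which stored vector is most similar to this query," which is the workload S3 Vectors targets.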
>> Mai-Lan was talking about the building blocks of S3, like the put, the get, tables, and vectors. Speaking of primitives for building applications leads nicely to our season sponsor, WorkOS. WorkOS is a set of primitives to make your application enterprise-ready: primitives like single sign-on authentication, directory sync, MCP authentication, and many others. One feature does not make an app enterprise-ready; rather, it's the combination of primitives altogether that solves enterprise needs. When your product grows in scale, you can always reach for new infrastructure building blocks from places like AWS or similar. Similarly, when you need to go upmarket and sell to larger enterprises, WorkOS provides the application-level building blocks that you need for this. WorkOS has seen the edge cases and the enterprise complexity, and solves this for you so you can focus on your core product. One example of such a building block is adding authentication to your MCP server. This is a typical screen when you're about to authenticate with an MCP server. If you had to build it from scratch, it gets pretty complex to set up the OAuth flows behind the scenes. But with WorkOS, it's a few simple steps: add the AuthKit component to your project, configure it via the UI, then direct clients of your MCP server to authorize via AuthKit, verify the response you get via some code, and that's pretty much it. This is the power of well-built primitives. To learn more, head to workos.com. And with this, let's get back to S3 and how it all started.
>> I'd like to go back to the beginning of S3. When it was launched, it was pretty shocking for the broader community, because S3 launched with a pricing of 15 cents per gigabyte per month, which was about a third to a fifth of the price of anything else: the going rate at the time was something like 50 cents or 75 cents. And on the first day, I read that something like 12,000 developers signed up immediately. A lot of companies moved over immediately, or very quickly, and then the surprising thing was that S3 kept cutting prices, which was unheard of before. You were there in the 2010s when some large price cuts happened. Can you tell me what the thinking was inside the S3 team on this unusual pricing? It seemed customers would have been willing to pay more. And the cutting of prices continues even today: I think today it's something like 2 cents or 2.3 cents for the same storage that was 15 cents at launch.
>> Yeah. You know, I think part of this
goes back to what the goal is for S3.
Okay. And so the mission of S3 is to
provide the best storage service on the
planet. Okay. And our goal too is that
if you think about the growth of data,
IDC says that data is growing at a rate
of 27% year-over-year. But I have to
tell you, we have so many customers that
are growing so much faster than that.
>> Yeah, I was about to say it sounds
pretty low.
>> I know. But that's an average across everything. We have a lot of customers that grow at twice or three times that rate. But think about all the data that's being generated from sensors, from applications, from AI, from all these different sources.
>> From just taking photos, I mean, every day, right?
>> Photos, that's right. And if you think about your phone, too, think about how the resolution of the cameras on phones has grown. You just have this, kind of what Sony talked about, the data ocean. And in order to have all that data and to grow it, you have to be able to grow it economically. You have to be able to grow it at a price point where you don't really think, "Okay, what data am I going to delete now, because I'm running out of space?" You don't have that conversation with S3 customers because of two things. One is, you know, we do lower the price of either storage or the capabilities of what we're doing. For example, we lowered the cost of compaction for S3 Tables pretty dramatically within a year after launching S3 Tables. And it's not just that; it's the overall total cost of ownership of your storage. We give you the ability to tier and to archive storage. We give you the ability to do something called intelligent tiering, which is: if you don't touch your data for a month, we'll give you an automatic discount on that data, because we're watching your storage, and if you don't touch it for long, we'll give you up to a 40% discount on that storage. And it's dynamic discounting, so you don't even have to think about it. And so our whole goal is that you can grow the data that you need to grow, because we know that's being used to pre-train models. We know it's being used to fine-tune and do any type of post-training of AI. We know you're using it for analytics. We know you're using it for all these different things, either now or in the future. And so our goal is that you can keep your data and use it in a way that advances whatever the thing is that you're doing, whether it's life sciences or you're an enterprise in manufacturing. Whatever you need, the data should be there, and you should be able to grow it and keep it and use it any way you want.
>> Yeah, I did want to ask you about this part. So there's intelligent tiering, which was launched in 2018, 12 years after S3 was launched. One thing that really got my attention is Amazon Glacier, which was launched in 2012, so a long time ago. You can store data that you don't need immediate access to; you're okay waiting some time, maybe even hours, to get access to it. When it launched, it was only one cent per gigabyte per month, which, back when the going rate for storage was like 15 cents, was almost 10 times cheaper. How do you do that? What is the architecture and thinking behind how you're able to make this trade-off of "look, if you don't need your data quickly, we can store it a lot cheaper"? How could I imagine the kind of trade-offs that you and the engineering team were thinking of making?
>> Well, as you know, you're an engineer yourself, and a lot of engineering is about constraints, right? And that is the fun part about
working on S3: when you think about the constraints that we have for availability, and the constraints that we have around the cost of storage, we start to get really, really creative. And in S3, because we build all the way down to the metal, down to the drives and the capabilities that we have in our hardware, we're able to drive efficiencies at every single part of our stack. And so when our engineers get together and talk about the constraints and the design goals, we'll do something like set a target for the cost of a byte, and we'll drive for that at every single part of the process. And the part of the process that we're also including is the data center: how do our data center technicians operate the service of S3 from a hardware and a data center perspective, the physical buildings, just like we do the same thing for the software and the layers of S3 itself. And when you have that ability to run across the whole stack, all the way down to the physical buildings, and you're thinking so deeply about the cost and the lifetime of every byte, you're able to do things like Glacier.
>> You mentioned
something really interesting: that when S3 started, it was eventually consistent, which means that data eventually arrives; it might not be there yet, and you might be behind. There's a lot you can do with this, and it gives you some constraints. But you mentioned that the reason the team launched this way was because durability and availability were more important, and I assume cost as well. During those initial phases, while S3 was eventually consistent, what kind of benefits did eventual consistency give? Is it a cost constraint? Is it just easier to build highly available systems from an engineering perspective?
>> Well, from an engineering perspective, the main optimization was availability. It was not necessarily durability; it was availability. So, let's take a step back and look at the original design of S3: we were focused very hard on availability. When you talk about consistency, it's the property where the object retrieval, the get, reflects the most recent put to that same object. And if you think about what parts of the S3 system that really hits, a lot of it starts with our indexing subsystem. The indexing subsystem in S3 holds all of your object metadata: its name, its tags, its creation time. And our index is accessed on every single get or put or list or head or delete, any API call like that. And so every single data
plane request where you go back into our storage system to get an object goes through our index. And if you think about it, more requests go through our index than our storage system, because, for example, it's serving things like head requests and list requests that don't actually end up going back into our storage system at all. Those are metadata, or index, requests. So, if you think about our indexing system, we have a storage system in there. And that is a really central concept: a storage system in the middle of our indexing system.
>> So you need a storage system for the index in your indexing system, right?
>> That's right. And so we have to configure and size that system to deliver on our design promise for both availability and durability. The data in our index system is stored across a set of replicas, and it uses what's basically a quorum-based algorithm. A quorum-based algorithm tends to be very forgiving of failures. And if you think about how we implemented quorum in our index system, we start first from servers that are running in separate availability zones. And the reason we do that is that it lets us avoid correlation on a single fault domain. And since the failure of a single disk, a server, a rack, a zone only affects a subset of data, it never affects all of the data for a single object, or even a majority of the data for a single object, which we have sharded across a wide spread of servers. So this core of availability for us is this idea that we spread everything. And so when a read comes in, it's coming into the S3 front end, and we heavily cache objects across our systems. When a read comes in,
>> it could route at random, and you could create a situation where you're creating an inconsistent read.
>> And so when we have quorum at the index storage layer, we can see reads and writes overlap, but in the cache they don't, because we're optimizing for availability.
>> So, just so I understand the first part, the eventual consistency, correct me if I'm wrong: you can just write to all these distributed nodes, and you ask one of them, and if it doesn't have it, no problem, because it will be eventually consistent. You now have high availability because you don't need to worry about all of them being in the same, correct, state. And that's phase one of AWS: it gives you availability. And now you're explaining how you're able to, behind the scenes, turn this into a strongly consistent system. Strong consistency means that it's guaranteed to reflect the whole system's state, which is hard to do, because you could have distributed failures, et cetera.
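The trade-off being summarized here can be sketched in a few lines. This is a deliberately simplified model with invented function names (three replicas, majority quorums), not S3's actual implementation: a write is acknowledged once a majority has it, a single-replica read may be stale, and a majority read always overlaps the write set:

```python
import random

replicas = [{} for _ in range(3)]  # key -> (version, value), one dict per replica
QUORUM = 2                         # a majority of 3

def quorum_write(key, version, value):
    # Ack after a majority of replicas store the write; one replica may lag.
    for r in random.sample(replicas, QUORUM):
        r[key] = (version, value)

def read_one(key):
    # Eventually consistent read: ask one random replica; it may miss the write.
    return random.choice(replicas).get(key)

def read_quorum(key):
    # Strongly consistent read: ask a majority and keep the highest version.
    # Any read majority intersects the write majority, so the latest
    # acknowledged write is always among the answers.
    answers = [r[key] for r in random.sample(replicas, QUORUM) if key in r]
    return max(answers) if answers else None

quorum_write("photo.jpg", 1, "v1")
# read_one("photo.jpg") may return None here (the lagging replica): that is
# the "it's there, but might not show up on a list" behavior of 2006-era S3.
# read_quorum("photo.jpg") always returns (1, "v1").
```

The single-replica read is what makes the eventually consistent system highly available (any one node can answer); the majority read is the simplest way to buy consistency back, at the cost of extra requests, which is why the real S3 design went further than this sketch.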
>> And this replicated journal, you know, it took us a while to build, I won't lie. We don't talk about this stuff very much, because this is kind of the secret sauce of S3. But again, our engineers who were in the room were thinking about how to deliver strong consistency without compromising availability. So I go back to constraints: in that case, we were not trading off consistency and availability anymore. And so the engineers had to come up with a new data structure. We do this in S3; vectors, basically, is a new data structure that we came up with as well. But what we had to invent for strong consistency at S3 scale, without relaxing the constraint of availability, is this replicated journal. And the replicated journal is basically a distributed data structure where we're chaining nodes together, so that when a write is coming into the system, it's flowing through the nodes sequentially. A write in this strongly consistent system for S3 flows through these storage nodes in the journal sequentially: every node is forwarding to the next node. And when the storage nodes get written to, they learn the sequence number of the value along with the value itself. And therefore, on a subsequent read, like through our cache, the sequence number can be retrieved and stored. And so now you have this strongly consistent and highly available capability in S3, and the heart of that is actually this replicated journal.
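What's described here closely resembles chain replication: writes flow through the nodes in order, each node stamps the value with its sequence number, and a reader (or cache) can compare sequence numbers to tell fresh from stale. This minimal sketch is a guess at the shape, with invented class and field names, not the real S3 journal:

```python
class JournalNode:
    """One storage node in a chain; writes flow head -> ... -> tail."""
    def __init__(self, next_node=None):
        self.next_node = next_node
        self.store = {}  # key -> (seq, value)

    def write(self, key, seq, value):
        # Record the value with its sequence number, then forward down the
        # chain; the write completes only after every node has applied it.
        self.store[key] = (seq, value)
        if self.next_node is not None:
            self.next_node.write(key, seq, value)

# A three-node journal: head -> middle -> tail.
tail = JournalNode()
head = JournalNode(JournalNode(tail))

head.write("k", seq=1, value="v1")
head.write("k", seq=2, value="v2")

# Every node now stores the same (seq, value); a cache still holding the
# stale (1, "v1") can detect it is behind by comparing sequence numbers.
assert tail.store["k"] == (2, "v2")
```

The sequential flow is what keeps reads simple: because a write is only acknowledged once it has reached the end of the chain, anything read at the tail, or anything carrying the highest sequence number, is guaranteed to be the latest acknowledged value.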
>> Okay, but what's the catch on one end? Because there's always something with trade-offs; you always have something. On one end, you obviously have more complicated business logic. And then I guess the second obvious question is: what about failures? Because in the case of eventual consistency, you don't worry too much about one failure. Clearly, in this case, what if a node in the sequence fails, either the first time or later? How does the system monitor this and recover? Because that's going to be the tricky part, right?
>> There's another piece to this puzzle that we implemented, which is basically a cache coherency protocol. And the idea is that this is where we built what we think of as a failure allowance, where, in this mode, we needed to retain the property that multiple servers can receive requests and some are allowed to fail. And so it's this combination of the replicated journal as a new data structure, plus this new cache coherency protocol that gave us a failure allowance, and those two things working in concert gave us strong consistency. I will say, too, this does come at some actual cost.
>> I was about to say: nothing is free in engineering, right?
>> There's hardware cost in this, because, you can imagine, we've done some more engineering behind the scenes since. But I remember sitting in the room with our engineers on S3, and we had a debate on this. We debated it. We said, you know, there are actual costs to the underlying hardware for this; do we pass them along to customers or not? And we made the explicit decision not to.
>> Really?
>> Yeah. We said that when we launch this, we should launch strong consistency, we should make it free of charge to customers, and it should just work for any request that comes into S3. We shouldn't say it's only available on this bucket type or what have you; this should be true for every request made to S3. And part of that mindset for S3 is: how can we provide these types of capabilities, and how can we make them part of the building blocks of S3, where you shouldn't have to think about the cost of it.
>> This was the very surprising
of it. This was the very surprising
thing of this launch by the way that
suddenly AWS said like okay everything
is strong existent it does not cost you
more latency wise your latencies have
shouldn't have changed significantly I
mean I'm sure when you roll out
initially you do your measurements etc
but but that was the promise and that
was why I I couldn't really believe it
when I I I reread history because it
typically doesn't happen typically
strong consistency does add latency or
it increases cost if it doesn't have
latency. There's always these
trade-offs. And I mean, sounds like you
either swallowed the cost or or cost
caught up, but it's it's very unusual.
So,
>> If I think about that, one of the things that was also very important for us, and we haven't really talked about this as much, but we think about it a lot on the S3 team, is correctness. It's one thing to say that you're strongly consistent on every request; it's another thing to know it. And so when we built this strong consistency, you know, I talked about our new caching protocol, and I talked about this replicated journal as a new data structure. That took a little bit of time to do and to get right. But at S3 scale, we could not say that we were strongly consistent unless we actually knew we were strongly consistent. So what does that mean? How do you do that at S3 scale, when everybody is using it for every last workload? In fact, one of the reasons why people use it is that our scale is such that we're decorrelating workloads, and you can run absolutely anything on S3. But how do you know?
>> Mai-Lan just talked about how strong consistency made it so much easier to trust S3. Trust is something that is just as important when writing code, especially when, with AI, we write more code than before. And this is a good time to talk about our season sponsor, Sonar. What is the impact that AI is having on developers? Let's look at some data. A new report from Sonar, the State of Developer Survey report, found that 82% of developers believe they can code faster with AI. But here's what's interesting: in the same survey, 96% of developers said they do not highly trust the accuracy of AI code. This checks out for me as well. While I write code faster with AI agents, I don't exactly trust the code it produces. This really becomes a problem at the code review stage, where all this AI-generated code must be rigorously verified for security, reliability, and maintainability. SonarQube is precisely built to solve this code verification issue. Sonar has been a leader in the automated code analysis business for over 17 years, analyzing 750 billion lines of code daily; that's over 8 million lines of code per second. I actually first came across Sonar 13 years ago, in 2013, when I was working at Microsoft, and a bunch of teams already used SonarQube to improve the quality of their code. I've been a fan since. Sonar provides an essential and independent verification layer: an automated guardrail that analyzes all code, whether developer- or AI-generated, ensuring it meets your quality and security standards before it ever reaches production. To get started for free, head to sonarsource.com/pragmatic. And with this, let's get back to the importance of strong consistency at AWS.
>> How do you know that you're strongly consistent? And that is why we used automated reasoning.
>> What is automated reasoning, for those of us who are not as familiar with it, which will be most people outside of very few domains like S3?
>> Yeah, S3 uses automated reasoning all over the place. And automated reasoning is a specialized form of computer science. Really, if you think about it, if computer science and math got married and had kids, it would be automated reasoning.
>> Is it formal methods, or based on formal methods?
>> That's exactly right.
>> Oh, yeah. I mean, I studied computer science, so that's fun. So it's actually proper formal methods that you're using?
>> That is right. And we use formal methods
in many different places in S3. But one
of the first places where we adopted it
was for us to feel good that we actually
had delivered strong consistency across
every request. So what we did is we
proved it, right? We basically built a
proof for it, and then we incorporated
our proof, on check-ins, into this index
area that I talked about, right? Where
you have your caching, and then you have
your storage sublayers of the index
capabilities. And so when anybody
is working on our index
subsystem now, and they're checking in
code into the code paths that are
being used for consistency, we are
proving through formal methods that we
haven't regressed our consistency model.
>> And can you give us a rough idea?
Because the formal methods that I
studied were pretty abstract:
things like designing languages, how
to define the different operators, and
of course there is math involved as
well. But what are the
primitives, like servers, network,
etc., and the models being built, the data
flows? How can I imagine a simple
proof of something
inside S3, roughly, at a really high
level?
>> Yeah. I mean, if you go back to
the fundamental notion of a proof, you
are proving something to be correct.
Okay. And so the places that we use
these proofs: we use them in consistency,
where we built a proof across all the
different combinatorics to make sure
that the consistency model is correct.
We use it in cross-region replication, to
prove that a replication of data from
one region to another arrived, and we use
it in different places within S3 to
prove the correctness of APIs. In all of
these cases, you know, we talk about
durability, we talk about availability,
we talk about cost, but just as strong
a principle, a design principle for
us across S3, is correctness. It's the
correctness of, you know, a thing: an
API request, an operation, as
it were. And the key thing for us too is
that you don't want to just
prove it once. You want to prove it on
every single check-in, and you want to
prove it on every single request, so you
can validate and verify
that you are doing in fact what you
say you do. And I think for us, you know,
at a certain scale,
math has to save you, right? Because at a
certain scale you can't do all the
combinatorics of every single edge case,
but math can save you and help you
at S3 scale. And so we use
formal methods in many different
places of S3. We have some research
papers too. I can send you some links to
some research papers where we talk
about this.
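The "prove it on every single check-in" idea can be loosely illustrated with a model-checking-style test: exhaustively explore a tiny model's state space and assert read-after-write consistency in every reachable state. A toy sketch in Python (the model and names are illustrative assumptions, not AWS's actual proof tooling):

```python
import itertools

# Toy model of a replicated index: a put is applied to every
# replica before being acknowledged (strong consistency), so any
# subsequent get from any replica must see the latest value.
class ReplicatedIndex:
    def __init__(self, n_replicas=3):
        self.replicas = [dict() for _ in range(n_replicas)]

    def put(self, key, value):
        # Strongly consistent: apply to all replicas before acking.
        for r in self.replicas:
            r[key] = value

    def get(self, key, replica):
        return self.replicas[replica].get(key)

# Exhaustively check a small state space: after every possible
# sequence of puts, every replica agrees on every key.
def check_consistency():
    keys, values = ["a", "b"], [1, 2]
    for ops in itertools.product(itertools.product(keys, values), repeat=3):
        idx = ReplicatedIndex()
        expected = {}
        for k, v in ops:
            idx.put(k, v)
            expected[k] = v
        for k in keys:
            for r in range(3):
                assert idx.get(k, r) == expected.get(k)
    return True

print(check_consistency())  # True: no sequence violates the model
```

A real formal-methods setup would model failures and message reordering too; the point is only that the check runs mechanically, so it can gate every check-in.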
>> Yeah, please do, and we will put it in
the show notes below so anyone can check
it out, because I think it's really
interesting. I feel formal methods are
not really a thing in a lot of startups,
even infrastructure startups, yet. But it
sounds very reassuring to me to actually
have an ongoing proof of that. And
speaking of which, I want to ask
about one thing that is related to this:
durability. Amazon S3 has very high
durability promises. I think it's 11
nines, which I had to double-check,
because in backend systems, whenever you
say three nines, it's like, hmm. When you
say four nines of availability, and we're
not talking durability but availability,
four nines is already hard to
achieve, and beyond that it just gets
very expensive. And I have never heard of
11 nines of durability. Now, this is
durability and not availability. One
question that I got when I
shared this stat publicly, one
thing people were asking, and I was
also thinking: how can you prove that, not
just in a formal way? You're now
storing, as you said, 500 trillion
objects, which is large enough that
just by this durability promise
you might be losing some of
them. Do you actually validate it on
the actual data as well, outside of
the proof? Because I assume in the proof
you will have assumptions on hardware
failure rates, which might or might not
be true. So my question is:
at Amazon S3 level, when you are
able to look at "are we living up to,
for example, our durability promise,"
how do you go about that, and what are
your findings?
>> Yeah. So we just spent a lot of time
talking about our index subsystem,
because that is the subsystem that is
related to consistency. But when you
think about durability, you think
about it at different
levels of the S3 stack, but we really
think about it in the storage layer. And
if you think about it in the storage
layer, you have this design, this
promise, and underneath that
is a combination of
things. It's software, but it's also the
physical layout of where our data is
across everything that we have in S3.
And one of the things that I
talked about is that we have
disks and servers, which sit in racks,
which sit in buildings, and we have tens
of millions of these hard drives. We
have millions of servers, and we have 120
availability zones across 38 regions.
>> Yeah.
>> And two availability zones are two
physically separate locations, just
physically separate, and sometimes
they're a ways away from each other. And
in some of our regions we have more than
three availability zones. Each gives us
a different fault domain. If I were to
think about durability, I think the most
important thing for us is our auditors.
So if you think about a distributed
system, we talked about the put and the
get. We have many, many
microservices
that are all doing one or two things
very well in the background. Okay? And
so we have many different varieties of
health checks, but we also have
repair systems, and we have auditor
systems. And our auditor systems go and
inspect every single byte across
our whole fleet. And if there are signs
that repair is needed, you know,
a repair system will come into
place. And these are all, in
the world of distributed systems,
microservices working
together, loosely coupled, but
communicating through well-known
interfaces. And that
collection of systems, which is over
200 microservices now, all sits
behind one S3 regional endpoint. And a
fair number of those subsystems, those
microservices, are dedicated to the
notion of durability.
>> So they will go and check and log
and report back. So do I understand
correctly that in any given time frame
at S3, someone, or some people, or some
systems, can actually answer the question
of what our durability was the past week,
month, year, and so on?
>> Yes.
>> Okay. Great. So you can actually
verify your durability promise and
check if the math is mathing.
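As a back-of-the-envelope check of the point above, that at 500 trillion objects even 11 nines implies some loss, here is the arithmetic (treating the design target naively as a per-object annual loss probability, which is not how S3 actually measures durability; the auditors do that):

```python
# Back-of-the-envelope: what 11 nines of durability would imply
# if read naively as a per-object annual loss probability.
objects = 500e12            # ~500 trillion objects, per the episode
durability = 0.99999999999  # "11 nines" annual design durability
loss_prob = 1 - durability  # ~1e-11 per object per year

expected_losses = objects * loss_prob
print(f"{expected_losses:,.0f} objects/year")  # ~5,000
```

That is why the question matters: the design target alone would permit thousands of lost objects a year at this scale, so continuous byte-level auditing is what turns the promise into something verifiable.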
>> Yes. And you know, part of our design is
that at any given moment, even in this
conversation that you and I are having
just today, we're having
servers fail, because servers fail. And
so what we are building, and what we've
built in S3, is an assumption that
servers fail. And so a lot of our
systems are always, first of all,
checking to see where any
failure might hit an individual node.
How does it affect a certain byte? What
repair needs to automatically kick into
place? And so this system is constantly
moving behind the scenes, if you will,
and that is a completely separate
thing from the get and the put. The get
and the put is what the customer sees.
There's this whole universe under the
hood of how we manage the business of
bytes at scale.
>> I'm just thinking, because for a lot of
us engineers who are building moderately
sized systems, I'll say, compared to S3
(they can already be big), a failure is
a big deal, like a machine going
down. Again, I have a small side project,
and my storage filled up and it started
to give errors, and this is a big deal
because it rarely happens to me. This is
the first time it happened in 3 years.
>> Yeah.
>> But I understand in your business, or
when you work at S3 scale, this is just
every day. And the question is not
when, it's just how often, how do you
deal with it? I guess it's a different
world.
>> It is a different world. And the
trick is to really think about
correlated failure. Okay. So if you're
thinking about availability at any
scale, it's the correlated failure
that'll get you.
>> And what is a correlated failure?
>> Okay. So that's super interesting. If
you think about what I talked about
with eventual consistency, we
talked about quorum. Okay? And with
quorum, it's okay for one node to fail,
but if all of
the nodes go south, for example, and
they're in the same availability zone or
on the same rack, then you're really
going to be messing with your
availability of the underlying storage,
okay? You've just lost your failure
allowance that I talked about with the
cache, because they all fail together.
And so a correlated failure is an
incredibly important thing to think
about when you're thinking about
availability. And so when we're
designing around correlated failures,
the thing that we have to think about
is how those workloads are exposed
to different levels of
failure. So when you upload an
object to S3 with a put, we replicate
that object. Okay? We don't just store
one copy of it. We store it many
times. And that
replication is important. It's important
for durability. But what's interesting
is it's also important for
availability, because if any of those
correlated failure domains fail, like if
a whole AZ fails, there's still a copy
somewhere else, and the data is still
available somewhere, even though an
availability zone has failed, or a rack
has failed, or a server has failed, and so
forth. Okay. And so that idea of how you
manage and design around correlated
failures with our physical
infrastructure is super important for S3,
for both availability and durability.
We also do things like thinking about
something called crash consistency. I
mean, Gergely, you can tell I can go on
and on about this, so you just have to
stop me.
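The effect described above, replicas packed into one fault domain versus spread across availability zones, can be sketched with a tiny simulation (illustrative only; the three zones and the placement policy are assumptions, not S3's actual placement logic):

```python
import random

# If one availability zone (fault domain) fails, replicas spread
# across zones survive; replicas packed into one zone fail together.
def data_loss_rate(spread_replicas, trials=50_000, zones=3):
    losses = 0
    for _ in range(trials):
        if spread_replicas:
            placement = list(range(zones))             # one replica per zone
        else:
            placement = [random.randrange(zones)] * 3  # all in one zone
        failed_zone = random.randrange(zones)          # one whole zone fails
        if all(z == failed_zone for z in placement):
            losses += 1  # every replica was in the failed zone: data loss
    return losses / trials

print(data_loss_rate(spread_replicas=False))  # ~0.33: correlated failure
print(data_loss_rate(spread_replicas=True))   # 0.0: survives any single zone
```

Same replica count, same zone failure rate; only decorrelating the placement changes the outcome, which is the whole point of designing around correlated failure.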
>> No, but this is the interesting stuff.
>> All right. So the whole idea of crash
consistency is that any system
that you build should always return
to a consistent state after a fail-stop
failure. And you can do
things like reason about the set of
states that a system can reach in the
presence of failure, and you just always
assume the presence of failure. Then you
can reason about consistency
and availability, and then you design
all of these different microservices to
all work together in an underlying
capability like S3. But that's what our
engineers do. They think about crash
consistency. They think about
correlated failures. They
think about failure allowances and
caches, right? And it's all that
deep distributed system work that
our engineers come in every day to
work on.
>> Can we talk about how you
think about failure allowances? Because,
again, there is a concept of error
budgets at other companies as
well. I feel it's a bit loosely
handled, whereas I feel this is kind of
your bread and butter. So what is a
failure allowance, how do you measure
it, and what do you do if you
overstep it or overspend it?
>> Yeah, I mean, I think the idea of a
failure allowance is, you have to have
it. If you assume
that you'll never have
a failure, you'll actually have
a very bad day for your customer. And so
we account for failure allowances.
But the most important thing,
let's just talk about the failure
allowance in our cache. How do we
manage that? Well, we manage it in such
a way that you'll never experience it,
because we size it, right? And you're
sizing the cache, and you're making sure
that the underlying capabilities and the
hardware are always there, and we have,
like I talked about, those distributed
subsystems, those microservices, that
are all interoperating under the hood.
We have a ton of them that do nothing
but track metrics, right? And
the sizing of our cache is
all related to the metrics and the
size of our underlying system.
>> All the metrics. Yeah.
>> Yeah. That's right. And so one of the
really big benefits of running on S3 is
that because our system is so huge, you
have these massive layers,
right? And the massive layers are all
managing things like correlated failures
and failure allowances. And
because they are so huge at the scale of
S3, any application that's sitting on
top of S3 gets the benefit of it.
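The crash consistency property mentioned a few turns back, that a system should return to a consistent state after a fail-stop failure, is often achieved with the write-to-temp-then-atomic-rename pattern. A minimal sketch (a general filesystem pattern, not S3's internal mechanism):

```python
import json
import os
import tempfile

# Crash-consistent save: if the process crashes at any point,
# readers see either the old state or the new state, never a
# torn half-write.
def save_state(path, state):
    d = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=d)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())   # durable before it becomes visible
        os.replace(tmp, path)      # atomic rename on POSIX filesystems
    except BaseException:
        os.unlink(tmp)
        raise

def load_state(path, default=None):
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default

save_state("state.json", {"epoch": 1})
print(load_state("state.json"))  # {'epoch': 1}
```

The key reasoning step is the one described in the conversation: enumerate the states a crash can leave you in (old file, new file, orphaned temp file) and make sure every one of them is consistent.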
>> Let's take a break for a minute from S3
to talk about a one-of-a-kind event I'm
organizing for the first time: the
Pragmatic Summit, in partnership with
Statsig. Have you ever wanted to meet
standout guests from the Pragmatic
Engineer podcast, plus folks from Kadesh
Tech Companies, and learn about what
works and what doesn't in building
software in this new age of AI? Come
join me 11 February in San Francisco for
a very special one-day event. The
Pragmatic Summit features industry
legends and past podcast guests like
Laura Tacho, Kent Beck, Simon Willison,
Chip Huyen, Martin Fowler, and many
others. We'll also have insider stories
on how engineering teams like Cursor,
Linear, OpenAI, Ramp, and others built
cutting-edge products. We'll also have
roundtables and a carefully curated
audience where there's someone everyone
is interested to meet and chat with,
something I'm hoping will make this
event extra special. Seats are limited,
and you can apply to attend at
pragmaticsummit.com.
Talks will be recorded and shared and
paid subscribers will get early access
afterwards as well as a thank you for
your additional support. I hope to meet
many of you there and I am so excited
about this event. And now let's jump
back to S3 and the massive scale of the
service. To get a sense of what
the reality is like working as an
engineer or an engineering leader inside
an organization like this, I read a
quote from a distinguished engineer, Andy
Warfield, and I'm just
quoting what he said: "Early in
my career, I had this sort of naive view
that what it meant to build large-scale
commercial software, that it was
basically just code. The thing I
realized very quickly working on S3 was
that the code was inseparable from the
organizational memory and the
operational practices and, you know, the
scale of the system."
Since you've now been more than a
decade in S3, how do you think of this
beast, this really complex
system, hundreds of microservices, data
that is hard to fathom unless
you think of the hard drives stacking all
the way to the space station? How do
your engineers kind of wrangle this?
Because it does feel a bit intimidating,
I'm not going to lie.
>> Well, I think so much of this just comes
back to the culture and the commitment
of the team. You know, I've worked on
S3 for a very long time now, and I have
such deep respect for the
engineering community on S3. And,
honestly, this is true for
all of the services in our data and
analytics stack, but we have engineers
in S3, and they come in every single day
with this deep commitment to the
durability and availability and the
consistency of your byte. And so the type
of conversations that we have are so
interesting, because we have people,
and really, you know, these are people who
are early out of school, and there are
people who've been working on S3, we have
engineers who've been working on S3 for
15 years, and everything in between. The
creativity and the invention of S3:
you have this tension, which is, on one
side, you have to be very
conservative with S3, right? We have
this principal engineering tenet called
"respect what came before," and that's an
Amazon engineering tenet, which is, if it
has worked for many, many years, you have
to respect that. But then there's
another tenet, and these two tenets are a
little bit in tension with each other,
which is kind of what makes it so fun.
That Amazon engineering tenet is called
"be technically fearless."
And I believe that the S3 engineers are
just amazing at respecting what
came before, because if we build new
capabilities in S3, we have to maintain
the properties, the traits, of S3, which is
it just works, and you get that
durability, availability, etc. But at the
same time, we have to be technically
fearless, because our ability to go into
the world of conditionals, our ability to
go into the world of native
support for Iceberg or for vectors, means
that we are extending this
foundation of storage in a way that
helps customers build whatever
application they need, now and in the
future. And so that combination of the
two things: when I think
about our S3 engineering team, I think
they come in every day and they embody
that.
>> Now, going back to the evolution of S3
from unstructured to structured data:
you were mentioning how Hadoop, the
data warehouse, was a big use
case, where customers started to use it
on top of S3, and then at S3 you
noticed what a lot of
customers, or some of your biggest
customers, were doing, and you kind of
built it yourself, with more structured
data, and then S3 Tables came along, and
then vectors. Would you mind sharing a
little bit more on how you evolve S3?
Because this was another question that,
when I asked people what
they'd like to know about S3, one of the
questions was: is it done? Is
it finished, or is it still evolving?
Because there is this notion that S3 can
store anything already, right? Like any
object, any blob. What new
thing is there? And yet we have a lot of
new things.
>> Yeah. And if you go back in time
a little bit, you think about
the rise of Parquet. Okay, so the
rise of Parquet data in S3 started about
2020, and we started to see more and
more people store their tabular data in
S3. And if you think about what Iceberg
provided, it provided a replacement for
Hive. Okay, so if you think about Hive
and Hadoop, Hive was basically giving
you file system access into S3's
unstructured storage. Iceberg is giving
you that tabular access,
including the compaction
and all the table maintenance that goes
along with it, into your Parquet data.
And I actually think that the world's
tabular data is going to live,
in the future, in S3. And if you just
think about the launch that, for example,
Supabase did last week: Supabase
announced that their Postgres database
is just going to do
secondary writes directly into an S3
table, just like their Postgres
extension for vectors is going to
integrate directly with S3 Vectors. And
so if the world of databases, if the
world's data at the source, if you will,
goes directly into an S3 table, what
does that mean for the world's data?
Okay, so SQL, as we know, is the lingua
franca of data, and the world's LLMs have
all been trained on decades of SQL, and
therefore...
>> And Python. SQL and Python.
>> Python, and the stuff that's already out
there. And so if you think about this,
you know, we have many, many AWS
customers who know the S3 API pretty
darn well by this point. It's a pretty
simple API. But now you have the ability
to interact with data in S3 through SQL.
And what that means is that you don't
have to be somebody who's
building cloud applications, or know S3.
You just need to know SQL.
>> And this is with S3 Tables, right?
>> With S3 Tables. And so you can just
write SQL into an S3 table, whether
you're an AI agent or a human, right?
You're introducing the lingua franca of
data as a native property of S3 with S3
Tables, and I think you're just going to
see that take off in the upcoming years.
>> And your latest launch is S3 Vectors.
Can you share a little bit of what it
takes to build a new data primitive like
vectors, just behind the scenes: how
long it takes, how the teams come
together, and maybe what are some
engineering challenges of launching
something like this? And again, we're
talking about vectors, right? So
you use embeddings; whenever you have
LLMs, you create an embedding, it's a
vector, you want to store that somewhere,
you will need to do search on it. There
are specialized vector databases, there
are specialized vector extensions, etc.
So I'm assuming this is the functionality
that S3 Vectors supports, very nicely.
>> Yeah. And, you know, today a lot of
customers use vector databases, just like
back in the day a lot of people put
their tabular data in
just databases. Okay? And they just used
the structure of the database in order
to take advantage of being able
to query their data. But they
didn't really need to use a database;
they just put it in a database. And then
S3 came along, and then we introduced
this way, with the help
of open formats like Apache Parquet, of
being able to store that structured data
in S3. That's kind of what we're doing
with vectors right now. Okay. And if you
think about vectors, vectors are
basically a bespoke
data type. A vector at the end of the
day is a very, very long list of numbers.
And vectors have been around for a long
time, and they've been in vector
databases for a while, but they really
took off in people's
data worlds in the last couple of years
with the rise of, as you said, the
embedding models. Okay. And so if you
take a step back and think about one
of the great ironies of data,
it is that you have to know your data to
know your data, right? You have to know
what your schema is. You have to know
what the data types are. You have to
know where it is. And as these data
lakes become data oceans, you have this
situation where it gets harder and
harder to know what's in your data,
right? And the beautiful thing about
embeddings is that embedding models will
understand your data so that you don't
have to understand your data. And the
format in which these embedding models
put this semantic understanding of your
data is, in fact, a vector. And so when
we talk to customers, they're
so excited about how these embedding
models are getting better and better;
they want to apply more and more
semantic understanding to
their underlying data, whether it's
unstructured or structured, that they
have in storage. And so they kind of want
to store billions of vectors.
>> But
just to say, when you say they
want to understand, correct me if I'm
wrong, but hypothetically you have a
bunch of text data, or maybe some image
data, and you're saying that a lot
of people, customers, teams, would
like to write queries to say, hey,
can you find an image that looks like a
puppy, or can you find an article that
contains this or that. And embeddings,
as we know, are great for that, but
then you need to actually create the
embeddings, build the system, etc. Right?
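The kind of query described here ("find an image that looks like a puppy") boils down to nearest-neighbor search over embedding vectors, usually by cosine similarity. A toy sketch with hand-made three-dimensional vectors (real systems use embedding models producing hundreds of dimensions; every value below is made up for illustration):

```python
import math

# Tiny stand-in "embeddings": each document gets a vector whose
# direction encodes its meaning. (Hypothetical values; a real
# embedding model would produce these.)
docs = {
    "puppy photo":     [0.9, 0.1, 0.0],
    "kitten photo":    [0.7, 0.3, 0.0],
    "tax spreadsheet": [0.0, 0.1, 0.9],
}

def cosine(a, b):
    # Cosine similarity: 1.0 means same direction (same "meaning").
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

query = [1.0, 0.0, 0.0]  # stand-in embedding of "a dog picture"
best = max(docs, key=lambda d: cosine(query, docs[d]))
print(best)  # puppy photo
```

No schema or keywords are involved: the query and the documents meet purely in vector space, which is why embeddings let you search data you never labeled.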
>> Yeah, and exactly what you're
saying. I mean, if you think about
what vectors can do: if you think
about all the data that a given company
has, your knowledge across
your business, or your knowledge across
your life, isn't organized into rows and
columns like a database. It's in PDFs.
It's in your phone, right? It's in audio
customer care recordings, which capture
the sentiment of how a customer actually
feels about their interaction with you.
It's whiteboards: by the end of this
day, this whiteboard is totally filled
up with ideas. And it's in documents
across dozens of systems. And so it's
not that you don't have data. You have
tons of data. But understanding what
data you have across all of those
different formats is a real problem. And
it's one that AI models can help you
with. And the capabilities of those
AI models have gotten so much better in
the last 18 to 24 months. But we
needed a place to put billions of
vectors, billions of, you know, these
semantic understandings of relationships,
and that's what we built S3 Vectors for.
The state-of-the-art embedding models
combined with the ability to have
vectors across S3 is a really
important part, and it's not a database.
I mean, it's the cost structure and scale
of just S3, but it's for vector storage.
>> And then, do I understand, did you
need to build new primitives to store
this, like going down to the metal,
figuring out exactly how you do this, or
did you build it on top of your existing
primitives, like blob storage, etc.?
>> It's actually a new primitive. And so,
you know, we had talked about S3 Tables.
S3 Tables is building on objects, because
those individual Parquet files, at the
end of the day, are an object. Vectors
are totally different. So with vectors
we built a new data structure, a new data
type. And it turns out that when
you're building vectors, searching for
the closest vector in a very
high-dimensional space, which is
basically vector space...
>> Yes.
>> It's often really hard to find the
nearest neighbor, and so in
a database you have to essentially
compare every vector in the database, and
that's often super expensive. And
so what we do in S3, because we aren't
storing all of our vectors in memory,
we're storing them on our very large S3
fleet, is we still need to
provide super low latency. And in our
launch last week, we were getting about
100 milliseconds or less for a warm
query to our vector space, which is
actually pretty fast. It's not database
fast, but it's pretty fast. And the way
that we do that is we precompute a bunch
of, think of them as, vector
neighborhoods. Okay? And so it's
basically a bunch of vectors
that are clustered together by
similarity, like a type of dog,
as an example. These vector
neighborhoods, if you will, are
computed ahead of time, offline. They're
computed ahead of time, asynchronously,
so that when you're doing your query it's
not going to impact your query
performance. And then every time a new
vector is inserted into S3, the vector
gets added to one or more of these
vector neighborhoods based on
where it's located. And so when you are
executing a query on S3 Vectors, there's
a much smaller search that's done to
find the nearest neighborhoods. And it's
just the vectors in those vector
neighborhoods that are loaded from S3
into fast memory. That's where we
apply the nearest neighbor algorithm, and
it can result in really good
sub-100-millisecond query times. And so
if you think about the scale: S3
will give you up to two billion vectors
per index. Think about the scale of
an S3 vector bucket, which is up to 20
trillion vectors. And think about
that combined with 100 milliseconds or
less warm query performance. That
just opens up what you can do with
creating a semantic understanding of
your data and how you can query it.
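The precomputed "vector neighborhoods" described above resemble a classic inverted-file (IVF) approximate nearest neighbor index: cluster vectors offline, then at query time probe only the nearest clusters instead of comparing against every vector. A simplified sketch (the cluster count, probe count, and 2-D vectors are assumptions for illustration, not S3's implementation):

```python
import math
import random

random.seed(0)
vectors = [[random.random(), random.random()] for _ in range(1000)]

def nearest_centroid(v, centroids):
    return min(range(len(centroids)), key=lambda i: math.dist(v, centroids[i]))

# Offline, asynchronous step: build the "neighborhoods" with a few
# rounds of k-means, so query time is unaffected by this work.
centroids = random.sample(vectors, 10)
for _ in range(5):
    buckets = [[] for _ in centroids]
    for v in vectors:
        buckets[nearest_centroid(v, centroids)].append(v)
    centroids = [
        [sum(coord) / len(b) for coord in zip(*b)] if b else centroids[i]
        for i, b in enumerate(buckets)
    ]
# Final assignment: each vector joins its nearest neighborhood.
buckets = [[] for _ in centroids]
for v in vectors:
    buckets[nearest_centroid(v, centroids)].append(v)

# Online step: probe only the 2 nearest neighborhoods (~200 vectors)
# rather than scanning all 1000.
def query(q, n_probe=2):
    order = sorted(range(len(centroids)),
                   key=lambda i: math.dist(q, centroids[i]))
    candidates = [v for i in order[:n_probe] for v in buckets[i]]
    return min(candidates, key=lambda v: math.dist(q, v))

print(query([0.5, 0.5]))  # a vector very close to (0.5, 0.5)
```

Inserting a new vector is cheap (append it to its nearest bucket), and the expensive clustering work runs offline, which mirrors the "computed ahead of time, asynchronously" design described in the conversation.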
>> It sounds very interesting, and also
challenging, because you have to build
this for scale from day one. I guess
that's one of the
benefits and curses of working at S3:
everything that you launch, you need
to prepare for what would be extreme
scale elsewhere, but here it's just
Monday.
>> We have S3 service tenets as well. And
one of the tenets, and one phrase that I
use all the time, and our engineers do
too, is "scale is to your advantage." So
if you are an engineer, and you think
about that, and one of your tenets
for anything you build is that scale
must be to your advantage, it just
changes how you design. It means that
you can't actually build something where
the bigger you get, the worse your
performance gets, or the worse some
attribute gets. It has to be constructed
so that the bigger you get, the better
your performance gets. The bigger S3
gets, the more decorrelated the
workloads are that run in S3. That is a
great example of scale is to your
advantage. And so when we built vectors,
just like we built everything in S3, we
asked ourselves: how can we build this
such that scale is to our advantage? How
can we build this such that 100
milliseconds or less is just the start
of the performance that we're going
after? And how can we make sure that the
more vectors we have in storage, the
better the traits of S3 for vectors?
>> I have a different question, about the
limitations of S3. I read that the
largest object you can store in S3 is 50
terabytes. Why is there a limit on
the largest object? I mean, I think we
can imagine this will be spread across
multiple hard drives and so on, but why
did you decide to have a limit? You
know, I'm just interested more in the
thought process of how the team comes up
with, okay, this will be the limit,
and this is why.
>> I mean, I think, first of all, that
limit of 50 terabytes is 10 times
greater than what we launched with. We
launched with five terabytes, and now
we're at 50 terabytes, and sometimes we
sit and tell customers that, and they go,
what am I going to store that's going to
be 50 terabytes? And we're like,
high-resolution video, right? And so, you
know, if I think about...
>> A known customer.
>> Right. And so if you think about this
sort of thing, you know, size limits
generally speaking, we do try to
optimize for certain patterns. And
when you raise the size of an object by
10 times, like we did, we're just
optimizing for the performance and scale
of the underlying systems. It's like we
increased the scale of our batch
operations by 10 times last week, too.
And the idea behind that is that the
underlying systems are just optimizing
for distributions of work that are the
new norm for how people are doing
things. And we'll just keep on
changing. We don't have too many limits,
to be honest, but we'll just keep on
looking at what customers are
doing across a distribution of workloads
and seeing if there's something that
needs to be changed. The big thing
for us, you know, again, we did have
a lot of conversations with customers,
and they're like, "Really? I don't
have that many individual objects that
are that big." But with the increase of
cameras and phones and things
like that, we are seeing more and larger
objects, and we just wanted them to
be able to grow unfettered in S3.
>> And so, how does S3 evolve, and how
has the roadmap changed? Because so far,
everything that you
told me is saying, well, you know, our
customers were doing this or that, and
obviously you live and breathe
data here, so you see the patterns, you
see stats, you see the objects, you also
talk with them. Is it only you talking
with customers, seeing what's happening,
what they're struggling with,
what they're using more of, and then
deciding to improve that, may that be the
limits, may that be figuring out you need
a new data type because they're now
building their own data types on top
of it? Or is there also some kind
of, all right, here's a
vision, here's a roadmap of what we'll
do?
>> It's a great question, and in fact one
of the things that we talk about all the
time is the coherency of S3. There are
certain things that people always expect
from S3: the traits of S3, the
durability and availability attributes
that we talked about. A fair amount of
engineering goes on under the hood for
that, and it's a set of capabilities
that we may or may not have talked about
today. In fact, thinking back to 2020, I
think we've launched over a thousand new
capabilities in S3 since then. Some of
them are what we think of as the 90% of
the road map, which is what people ask
for explicitly. So, for example, some of
our media customers want the bigger
object size, and so we delivered that.
We have other customers that do a lot
with batch operations. But then we have
some things
that we invent because we look at what
customers are doing with the data and we
ask ourselves how we can build that.
Vectors kind of falls into that
category. For vectors, when we looked at
S3 and how S3 is evolving, we told
ourselves, look, we can continue to make
S3 the best repository for data on the
planet. And we will. We have engineers
that come in every day working to make
that so. But there's this other element
of how do you make sure that the data
that you have is in fact usable, and how
do you make sure that it's usable in a
way that's industry standard, like that
Iceberg layer on top of our tabular
data. But it's also usable because AI
models have now gotten so good at
embeddings that you can have AI give you
a semantic understanding of your data,
if only you had the right cost point for
putting billions of vectors into
storage, so you could actually
understand and use your data in a
different way. And so for us, a lot of
it is taking a step back and looking not
just at what customers ask us for. We
want to remove the constraint of the
cost of data, which is what we do in S3,
and we want to remove the constraint of
working with your data, which is what we
do in S3 too. And when we can do both of
those things, if we can make it possible
that your data grows as your business
needs it and you can tap into all the
capabilities that you're getting with AI
and how the world is changing for data,
then we have a shape. We call it a
product shape.
>> Product shape.
>> What's a product shape? It's sort of an
emerging thing. When I think about S3, I
think of it as almost this living,
breathing organism, where the shape of
the product is evolving, but it's
evolving with coherency around what you
expect for the traits of S3, and it's
evolving in a way that lets you steer
into how you want to use data, not just
now but in the future. We will continue
to evolve the product shape of S3 based
on what you want to do with data. And so
in a lot of ways, we're sort of
transcending the boundaries of what
object storage was, or what a database
traditionally was, because now we have
tabular formats, we have conditionals,
and we're evolving into this new shape,
and it is ultimately uniquely S3.
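The "conditionals" mentioned here refer to S3's conditional writes, the `If-None-Match` and `If-Match` preconditions on a put. As a rough illustration of those compare-and-swap semantics, here is a minimal in-memory sketch in Python. This is a toy model, not the real S3 API; the class and method names are made up for illustration:

```python
# Toy sketch of conditional-write (compare-and-swap) semantics similar to
# S3's If-None-Match / If-Match preconditions. In-memory illustration only.
import hashlib


class ToyStore:
    def __init__(self):
        self._objects = {}  # key -> (etag, body)

    def put_if_none_match(self, key, body):
        """Create the object only if the key does not exist yet."""
        if key in self._objects:
            return None  # real S3 would reject with 412 Precondition Failed
        etag = hashlib.md5(body).hexdigest()
        self._objects[key] = (etag, body)
        return etag

    def put_if_match(self, key, body, expected_etag):
        """Overwrite only if the caller saw the current version (its ETag)."""
        current = self._objects.get(key)
        if current is None or current[0] != expected_etag:
            return None  # someone else wrote in between; caller must retry
        etag = hashlib.md5(body).hexdigest()
        self._objects[key] = (etag, body)
        return etag


store = ToyStore()
etag1 = store.put_if_none_match("leader-lock", b"node-a")  # wins the race
etag2 = store.put_if_none_match("leader-lock", b"node-b")  # loses: returns None
etag3 = store.put_if_match("leader-lock", b"node-c", etag1)  # valid handoff
```

Because only one `put_if_none_match` can succeed for a given key, primitives like this are what let systems build leader election, commit logs, and table formats directly on object storage.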
>> It kind of sounds like you have all
these microservices. It's kind of
evolving almost like a plant or a living
organism, no?
>> Yes, I am in fact a former Peace Corps
volunteer in forestry, and so a lot of
times I will go back to the natural
world for my metaphors. And yeah, I
mean, S3 is this living, breathing
repository of data that lets people do
things with data that they never thought
possible.
>> It's just
interesting because I think as engineers
we don't often think to relate the
systems that we build to a living
organism, when in fact, obviously
there's code, but as you said, there's
people, there's servers, there's
failures that now happen at a cadence.
You can probably predict how many hard
drives are failing today at your scale.
Do you think it's because of the scale,
that when things become large enough
they start to have these
characteristics? Because what I find
fascinating talking to you is that the
way engineering works inside of S3 feels
very different to how it works inside a
smaller organization, your kind of
startup, which does, you know, terabytes
of data or maybe even a few petabytes,
but that's kind of it. And you've seen
some of these organizations. What
changes at this large scale? What do you
think makes it feel pretty different,
the world that you and the teams work
in?
>> It does. So in order for us to sustain
the traits of S3 and to evolve it over
time, we have to constantly go back to
simplification. We have a very complex
system with all of our different
microservices, but I keep coming back
to: those microservices have to do one
or two things really well, and we have
to stay true to that. Otherwise, you
know, the complexification of a
distributed system is unmaintainable
over time. And for S3, there's this
concept of, okay, there's a simple in
S3, and the simple in S3 is a couple of
things. One, it's the simplicity of the
user model, where not only do you have a
simple API, but now you have the
simplicity of using SQL with S3, or the
simplicity of being able to leverage
these AI embedding models, which makes
semantic understanding of your data so
much easier than having to annotate a
whole metadata layer. And so that
concept of simplicity is in the user
model of S3, but under the hood, if you
sit in on any of our engineering
meetings, you will hear our engineers
talk about how we make sure that we
implement this capability with the
greatest simplicity that we possibly
can.
>> Speaking of which, what type of
engineers do you typically hire to work
at S3? In terms of what kind of traits,
potentially past experience, do you look
for?
>> Well, we hire all kinds of engineers.
You know, we have a lot of engineers on
S3 who are early career; they're
straight out of school, out of undergrad
or graduate school. And like I said, we
have a ton of engineers who have been on
S3 for a long time, and everything in
between. I think there's a really strong
element in our teams that work on data
around ownership. People feel this
personal sense of commitment. I feel it
every day I come in: a personal sense of
commitment to your byte, to the
preservation of your byte, to the
usefulness of your byte, to the ability
for you to think about what your
application does next and not the types
of storage that you need or how you grow
it. And that deep sense of ownership and
that deep sense of commitment is a very,
very common thread across our data
teams, because we know that at the end
of the day, every modern business is a
data business, and everything that
people are trying to do, with
traditional systems, AI, whatever, is
based on your data shaping the core of
your application experience. And so that
data is our responsibility, and we feel
it very deeply.
>> And what would your advice be to, let's
say, a mid-career software engineer,
someone who has a few years of
experience working at different places,
who after listening to this gets really
enthusiastic and decides, one day I'd
love to work on a deep, strong
infrastructure team like S3? For more
experienced folks like that, what are
the experiences or activities that you
might look for, that might help you
consider them?
>> There's a strong value in relentless
curiosity. You know, I talked a little
bit about coloring within the lines, and
how when you work on S3, or a
large-scale distributed system which
continues to reinvent what storage
means, you're not really coloring within
the lines. You're taking a step back and
saying, I will draw what the lines are
today, and I know that I might have to
rub those out and draw new lines in the
future, for wherever things go. And so,
you know, I have three kids: two in
university and one in grad school. And
one thing that I think is really
important is to always take a step back
and look at the latest research. Some of
the papers that I'll share with you are
around how we either took formal methods
and brought them into storage systems,
or thought about failure in a different
way. That relentless curiosity, that
creativity with engineering, I don't
think you can go wrong with that. I
think the next generation of software,
no matter if it's built on S3 or
elsewhere, is all driven by the
creativity of the engineering mind, and
it is in all of us. We just have to
unlock it and unleash it, and we will
build amazing things like S3.
>> And I also love that with S3, not only
has S3 created something that did not
exist, and I think it was just
unimaginable because it didn't exist,
but now I'm hearing about startups that
are building on top of S3. I think
Turbopuffer is a good example. They're
building innovation because now they
have a base layer, and I feel there are
different levels of innovation. You
decide where you want to innovate: at
the very lowest level, one level higher,
and so on, and you just use the right
primitives, right? In your case, this is
doing hardware and storage better than
anyone. In the other layers, it will be
using the right primitives better than
anyone.
>> Yeah, it's very exciting for us to see
so many different types of
infrastructure built on S3.
>> Now, as a closing question, what is a
book or a paper that you would recommend
reading, that you enjoyed, and why?
>> I read a lot of different papers. I am
fascinated by how quickly the evolution
of embedding models is coming along now,
and in particular, a field of science
that I'm quite interested in is
multimodal embedding models. Because, as
you know, the world that we experience
is multimodal, and therefore the
understanding that we have of data
should be multimodal as well. There's
this whole field of science that's
emerging quite rapidly around multimodal
embedding models, and that is something
that I encourage people who are working
in the field of data to look at, because
I think that is the next generation of
data. If you think about the next world
of data lakes, I think it's actually
going to be about metadata. It's going
to be about the semantic understanding
of our data, understanding how that is
created through vectors and how it's
searched across multiple modalities. I
think that is an important area of both
research and advancement, and so that's
what I would encourage people to look at
in the world of data. I think vectors
are going to be quite big, particularly
at the price point that we've introduced
for S3 storage for vectors. I'm excited
about it. I think we're just getting
started with data and an understanding
of our data, and I can't wait to see
what comes next.
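The semantic understanding described here boils down to comparing embedding vectors by similarity. As a minimal sketch of that idea in plain Python: the three-dimensional vectors below are made up for illustration (real embedding models emit hundreds or thousands of dimensions), but the nearest-neighbor logic is the same:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means
    semantically similar, close to 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Tiny made-up embeddings; a real model would produce these from text/images.
embeddings = {
    "a photo of a dog": [0.9, 0.1, 0.0],
    "a picture of a puppy": [0.7, 0.3, 0.1],
    "quarterly revenue report": [0.0, 0.1, 0.95],
}

# Pretend this is the embedding of the query "cute dog image".
query = [0.85, 0.15, 0.05]

# Semantic search = find the stored vector closest to the query vector.
best = max(embeddings, key=lambda k: cosine_similarity(query, embeddings[k]))
```

A vector store does exactly this lookup, just over billions of vectors with approximate-nearest-neighbor indexes instead of a linear scan, which is why the storage cost per vector matters so much.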
>> Amazing. And do you have any book
recommendations?
>> I will give you a book recommendation,
just in case your readers are
interested. It won't be in the field of
computer science; it will be about the
ecology around us and supporting the
native bees and insects around us. So, a
tiny bit farther afield, but I'll give
you a book recommendation, and if your
readers are interested, they can take a
look at how to support the bees of the
planet.
>> Well, Milan, thank you very much. This
was fascinating, and it was very
interesting to get a peek into this
massive world of scale and data, and
into respecting the byte, treating it
well, and making sure that it's durable.
>> It was great talking to you, and thank
you both to yourself, I know you're a
fan of S3, and to all of your listeners
who use S3. We quite literally wouldn't
be able to do what we do without the
feedback and the encouragement from
everybody who uses S3 today. So thank
you for that.
>> Just wow. I always suspected there was a
lot of complexity behind a system like
S3, but I just did not realize the scale
of it. Whenever I worked on systems with
even hundreds of virtual machines, the
failure of one machine was a rare event
and not something that we really counted
on. During my conversation with Milan,
she casually mentioned that several
machines had failed during our
conversation, which is something that
the S3 team knows about, prepares for,
and treats as an everyday event. I
personally really liked how AWS has two
conflicting tenets heavily used on the
S3 team: respect what came before, and
be technically fearless. For such a
massive system, it would be easy to say,
let's move conservatively because of how
many companies depend on us. But if they
did so, S3 would fall behind. Finally,
I'm still in awe that AWS put strong
consistency in place, rolled it out to
all customers, and increased neither
pricing nor latency, at S3 scale. This
is an absolutely next-level engineering
achievement; in fact, it was probably
one of the lesser-known engineering
feats of the decade. I hope you found
the episode as fascinating as I did. If
you'd like to learn more about Amazon
and AWS, check out the exclusive deep
dive I did with AWS's incident
management team on how they handle
outages, linked in the show notes below.
In The Pragmatic Engineer, I also did
other deep dives about Amazon and AWS;
they are also linked in the show notes.
If you enjoy this podcast, please do
subscribe on your favorite podcast
platform and on YouTube. A special thank
you if you also leave a rating on the
show.
This video features Milan, the VP of data and analytics at AWS, discussing the immense scale and engineering behind Amazon S3. The conversation covers the service's evolution from its 2005 beginnings to storing over 500 trillion objects today. Key topics include the technical transition from eventual to strong consistency without cost increases, the use of formal methods to ensure 11 nines of durability, and the introduction of new primitives like S3 Tables and S3 Vectors for AI and analytics workloads.