HomeVideos

RollerCoaster Tycoon Optimizations are Insane

Now Playing

RollerCoaster Tycoon Optimizations are Insane

Transcript

260 segments

0:00

Hey, you see this man right here? Okay,

0:02

do you know who it is? Well, guess what,

0:03

buddy? If you don't know who it is, you

0:05

got to you got to go back to school,

0:06

okay? You got to get a little bit of

0:08

respect in you, cuz this right here is

0:09

Chris Sawyer. Now, you may not recognize

0:12

even the name, which hey, I don't blame

0:14

you, but you should. He's the man that

0:16

can handcraft some assembly better than

0:18

pretty much anybody else. In fact, he

0:20

handcrafted one of the greatest games

0:22

ever, Roller Coaster Tycoon. Yes, we all

0:25

know Roller Coaster Tycoon. Okay, that

0:27

game has some serious aura as the as the

0:30

zoomers say. That game came out in 1999,

0:33

6 years after Doom. Doom written in C.

0:37

Chris Sawyer wrote his in assembly.

0:40

Absolutely diabolical, man. The roller

0:42

coaster tycoon optimizations are insane

0:45

and we're going to go over three of

0:46

them. And the third one, it's kind of

0:47

the craziest. It's also just a very,

0:50

very good idea. Besides for being

0:52

written in assembly, one thing about

0:54

Roller Coaster Tycoon is that it has

0:56

insane optimizations. If you've ever

0:58

played it, it was just creamy smooth on

1:01

any system. Now, remember, there's

1:03

thousands of individual NPCs, all the

1:05

little agents running through your park.

1:06

Yes, they were called agents at the

1:08

time. And look at this. The system

1:09

requirements were nothing. 16 megabytes

1:12

of RAM. To put that in perspective, look

1:14

at this. This is Node right here. A

1:16

little wild true loop. I'm going to do

1:17

this. I'm going to hit it with the old

1:19

amperand. I'm going to grab that and go

1:21

VMRSS and go 2 3 5 6 1 BAM. A while loop

1:25

in node

1:27

takes 57.3 mgabytes of RAM. Roller

1:31

coaster tycoon less than 16. Even the

1:34

fabled the bun requires 41.41

1:37

megabytes for a while loop.

1:39

>> Here at Terminal, we love Planet Scale.

1:42

We've been using Planet Scale since day

1:44

one of Terminal.shop and had an amazing

1:47

experience. So if you want a database

1:49

you don't have to worry about, choose

1:51

Planet Scale. Now back to the video.

1:55

So let's go over this gold standard of

1:57

optimization. Now I know a lot of you

1:59

have never actually had to concern

2:00

yourself with a little thing called

2:02

memory. Effectively unlimited and coding

2:04

has largely been solved. So you just

2:06

don't think about these things. But back

2:08

in the day, you had to be a lot more

2:10

careful. Now I want you to really think

2:12

about something when it comes to Roller

2:14

Coaster Tycoon. Every single one of the

2:16

park adventurers had some sort of memory

2:19

associated with them because each

2:20

individual one could become angry, they

2:22

could become nauseated, they could

2:23

become hungry, or they would just want

2:25

to leave the park. So that means every

2:28

single person had to have its own

2:30

memory. And this is our first type of

2:32

optimization. It actually involves the

2:34

size of money. Now, individual little

2:36

park adventurers, they would have a

2:38

little bit less memory associated with

2:40

their money. Okay? Hey, you don't got

2:42

the memory, you don't got the money. And

2:44

that's because the average Park

2:46

Adventure, they didn't need a ton of

2:47

money. They just needed something, say,

2:49

less than 256. So, you could just have

2:51

one bite of memory associated with how

2:54

much money that Park Adventure had. Or

2:56

in this case, it could be two bytes

2:58

associated with money. Now, remember,

3:00

there could be somewhere between 2 to

3:01

4,000 of these little guys walking

3:03

around. And so, a singular bite saved is

3:06

2 to 4K worth of memory. You can see why

3:09

that adds up a lot when you only have 16

3:12

megabytes and that is the system

3:15

requirement which means you still need

3:16

to load up the source code. You still

3:18

need to load up all the graphic tiles.

3:19

You still need to have the operating

3:21

system running somewhere. Therefore,

3:23

every single bite actually mattered. So

3:26

even the money representation, the

3:28

position representation, everything

3:30

would be squashed down such that each

3:33

character could save the maximal amount

3:35

of memory. Now, the next one's a bit

3:37

stranger, but it actually makes a ton of

3:39

sense. You'll see a bunch of these these

3:42

bit shifts of old value, say by two. All

3:44

right, so I know not all of you are

3:46

going to have some binary knowledge out

3:48

there. And let's just face it, teaching

3:50

you base 2 might take a bit of a moment,

3:52

but you can see this right here. Here is

3:54

a value 1 0 1 0 0 0. This value is

3:59

actually technically five. That's

4:01

because this value is four. This value

4:03

is one. You add those two together, bada

4:05

bing, bada boom, you got yourself five.

4:06

Now, a left shift or a shift that's

4:09

represented like this is actually pretty

4:10

easy to understand. If I were to shift

4:12

this by one, you just actually shift

4:14

over all the binary. So, that means the

4:17

original value 1 0 1 would actually

4:19

become 0 1 0 1 because it's been shifted

4:21

over by a singular bit. Now, this value

4:25

right here is actually 10. And that's

4:27

because a single left shift is the

4:29

equivalent of multiplying by two. A

4:31

single right shift is the same thing as

4:33

dividing by two. Now, you're probably

4:35

asking, well, why would you do this? Why

4:36

not just multiply by two? It's so much

4:38

more clear for everybody. Well, first

4:40

off, back then, people knew how to kind

4:42

of read a little bit of that binary. But

4:44

second off, the reason being is actually

4:45

quite simple. A singular multiplication

4:48

could end up taking somewhere between 3

4:50

to eight cycles is what I've read on the

4:52

internet back on those machines. And a

4:54

divided by can take tens of cycles to be

4:57

able to execute. So, if you can center

4:59

your game around these divide by twos

5:02

and multiply by twos, you can save many,

5:04

many cycles, thus making your game feel

5:07

much, much faster. So, in roller coaster

5:08

tycoon land, everything's going to be a

5:10

left shift, a left shift by two, or the

5:12

same thing as multiplying by two,

5:13

multiplying by two, or just multiplying

5:15

by four. Or you're going to see a lot of

5:17

things like right shift three or divided

5:19

by 8. I it is a very different world

5:21

because everything we kind of live with

5:23

today has just almost no bearing on

5:26

what's actually happening underneath the

5:27

hood or the CPU cycles that are being

5:30

taken by individual instructions.

5:31

Instead, you just kind of have these

5:33

really abstracted concepts and you just

5:35

go, okay, hey, I'm going to create a new

5:37

object. I'm going to throw some keys and

5:38

values in there. One of those values is

5:40

going to be a function. I'm going to

5:41

execute the function. you don't actually

5:42

think about all the individual little

5:44

cycles and instructions that are going

5:46

through your CPU at any one point, let

5:48

alone the memory that you're actually

5:49

taking up. Now, the last optimization is

5:51

actually the craziest optimization,

5:53

which is pathf finding. If you ever

5:55

watch one of these little park

5:56

adventurers kind of walk through the

5:58

park, they just wander aimlessly. They

6:01

don't actually attempt to go anywhere.

6:03

They actually choose random directions

6:04

until they find a ride that they're

6:06

interested in. or if they're hungry,

6:08

they actually continue to take random

6:11

directions until they find food. And the

6:13

reason being is that path finding is

6:15

actually really expensive. Let's pretend

6:18

we have the following graph right here.

6:19

And you are right here and you wanted to

6:22

leave this park. This is the exit right

6:23

here. How would you actually leave it?

6:25

For us, it's very easy. We just look

6:27

through and be like, "Dog, just goam

6:28

bam." Well, unfortunately, computers

6:30

can't do that. So instead, you have to

6:32

take every single one of these nodes,

6:34

node being represented by one of these

6:36

little circles, and you have to put it

6:37

into a list. And then you're going to

6:39

have to resolve slowly how to find the

6:42

exit. This is called Dystra's shortest

6:44

path. It's a decently complicated

6:46

algorithm that involves a minimum heap

6:48

and you have to store a bunch of values

6:50

and you have to look up stuff. But for

6:51

the sake of it, you can obviously see

6:53

having to do this for say 2 to 4,000

6:56

individual agents every time they want

6:58

to go somewhere is clearly going to

7:00

cause frame stuttering, especially on a

7:02

90 megahertz CPU. So to put that into

7:04

perspective, that means without any

7:06

instruction optimization. These CPUs

7:09

back then were over 30 times slower than

7:12

the CPUs you have today. Just pure raw

7:15

instruction power. Not to mention

7:17

reading from RAM, significantly slower.

7:20

So doing this kind of thing over and

7:22

over again for every single agent would

7:24

just cause massive frame freezing. But

7:27

pathf finding did actually happen in the

7:28

game. It would happen under two

7:30

conditions. One, if there was a

7:31

mechanic, the mechanic would come in and

7:33

try to find where in the park is a ride

7:36

actually broken at. And they would get

7:38

up to eight paths they could look into

7:40

to go and find. So they'd be able to go

7:42

decently far, but sometimes even the

7:45

mechanic couldn't find the ride it

7:46

needed to go fix. And two, when one of

7:48

the little park adventurers needed to

7:50

leave the park, they would get up to

7:52

five depth to be able to leave the park.

7:54

And if they couldn't find that, you

7:55

would end up seeing a picture like this.

7:57

This very familiar picture. They're

7:58

angry. They can't find the park exit.

8:00

And did you know that if you sold

8:02

individual customers maps, those that

8:04

had maps, they'd get up to a depth of

8:06

seven to be able to search for an exit.

8:08

Meaning you would have happier customers

8:10

not to find the rides. No, no, no, no,

8:12

no. Which was always a little bit That

8:13

is a very surprising fact. They bought

8:15

maps not to find rides, but to find

8:17

exits. And all of this was done just

8:19

simply to be able to make Roller Coaster

8:22

Tycoon run. The money size was to help

8:25

reduce RAM usage. The multiplication and

8:28

division being XNATE in favor of being

8:30

able to multiply or divide by two CPU

8:33

optimizations. Then Pathfinding, the

8:35

ultimate CPU uh optimization was

8:38

throughout the entire game to prevent

8:40

frame stutter. And all of this was done

8:43

not in some highlevel JavaScript

8:45

language, but just straight up in the

8:46

Lord's language, okay? Assembly. All

8:48

right? So, the next time you see this

8:49

face, put a little respect on it, okay,

8:52

buddy? Because this face right here,

8:54

this is the face of a man who made not

8:55

just one roller coaster tycoon, but two

8:58

roller coaster tycoons in assembly. The

9:00

name is the primogen.

Interactive Summary

This video highlights the incredible technical feats of Chris Sawyer, the creator of Roller Coaster Tycoon. It explains how he achieved extraordinary performance and efficiency for the 1999 game by writing it entirely in assembly language, specifically focusing on memory management, bitwise operations for performance, and clever pathfinding limitations to handle thousands of NPCs.

Suggested questions

4 ready-made prompts