E24: I Tested NVIDIA's Self Driving Car... Is Tesla In Trouble?

Transcript

0:00

Today, you're joining me for something

0:01

really special. A real-world, unedited

0:04

1-hour drive through downtown Los

0:06

Angeles in a Mercedes equipped with

0:09

Nvidia's L2++ autonomous driving

0:11

platform. This isn't a highlight reel,

0:13

and it's not a simulation. It's

0:15

continuous footage of the system

0:17

navigating the everyday chaos of LA

0:20

traffic: lane merges, sudden cut-ins,

0:22

construction zones, and unpredictable

0:25

pedestrians. Joining me for the ride is

0:27

Armen Connie, senior product manager for

0:29

autonomous vehicles user experience at

0:32

NVIDIA. I asked him every technical

0:34

question I could think of. So, you'll

0:36

hear him explaining what's really

0:38

happening under the hood, from sensor

0:40

fusion and decision logic to how the

0:42

user experience is designed to keep

0:44

drivers informed. For investors, I hope

0:46

this footage helps you assess how far

0:48

Nvidia's automotive platform has really

0:51

come and how it compares to others in

0:53

the autonomous mobility market. Every

0:55

few minutes, something unexpected

0:57

happens, and how the system handles it

0:59

might just challenge your assumptions

1:01

about who's ahead in full self-driving,

1:03

or at least by how much. Your time is

1:06

valuable. So, let's get right into it.

1:08

>> So, the system's engaged, right? So, you

1:10

can see we'll try to make this uh right

1:12

on red here after uh this car passes by.

1:15

>> Yeah.

1:15

>> Uh so, what you're experiencing here,

1:16

this is what we call our level two plus

1:18

experience. So, this is built on our

1:20

Hyperion architecture, right? So, this

1:22

car is using 10 cameras, five radar, and

1:25

then 12 ultrasonics uh for parking.

1:27

>> No lidar.

1:28

>> No lidar on this car.

1:29

>> Okay.

1:29

>> Right. So, because we still have George

1:31

is still a level two product, right? Uh

1:34

we still have the driver here. Uh and

1:36

it's designed where he can collaborate

1:37

with the car, right? So, uh the car will

1:39

follow all the speed limits. Uh if he

1:41

wanted to increase the speed, he can do

1:43

that from pressing the steering wheel

1:44

button. There's a speed adjustment

1:45

there. He can request lane changes by

1:47

using the turn signal stalks. Uh, or if

1:50

he wants, let's say there's a big

1:51

pothole in the road or something like

1:52

that, he would be able to collaborate

1:54

with the car and actually you adjust the

1:56

steering and then release the steering

1:57

and then the car resumes uh you know the

1:59

driving. So uh but for now we'll let the

2:01

car do everything. So we're approaching

2:02

the stop sign. The car can see the stop

2:04

sign there. It'll know to stop. It knows

2:06

to follow the right of way order right

2:08

before proceeding. So we're going to

2:09

start with a kind of a little section

2:11

here where we're going to get to like a

2:12

shopping area with some uh restaurants

2:15

and stores. So hopefully we'll get some

2:16

people crossing the street and some

2:17

delivery vans and things like that just

2:19

so we can show that we can handle a

2:20

variety of these scenarios.

2:21

>> Sure.

2:23

Uh one of my first questions is going to

2:25

be how did you guys come to the design

2:27

decision not to incorporate lidar?

2:29

>> So we work with uh our partners here to

2:32

determine what sensors right we want to

2:35

use. And for a level two plus product uh

2:38

we felt that we can achieve that with

2:39

just the 10 cameras and the five uh

2:42

radar with ultrasonics as well. Uh but

2:44

for our level three and level four

2:45

initiatives, that's when we'll add the

2:46

additional lidar to it. And then we'll

2:49

also change the actual driving model

2:51

that we're using to a bigger model. So

2:53

it all just scales to what the, you

2:55

know, design intent is for the given

2:56

product.

2:57

>> Yeah. And I assume that's like going to

2:59

be like a more extended version of the

3:01

same stack. So like the people who just

3:03

want L2++, you know, the people who want

3:06

three, four, or eventually five. It's

3:08

just an evolution of that same stack,

3:10

not a different stack completely.

3:11

>> So it's the same principles, right? So

3:13

what we're experiencing here today, this

3:15

is running all on a single Orin. Uh, and

3:17

then this uh experience is about 95% of

3:21

the driving will be done by Alpamayo,

3:22

right? And then we still have that

3:24

classical stack sitting kind of you

3:26

remember those old driver's education

3:28

cars where you have like two sets of

3:29

brake pedals, two gas pedals, two

3:30

steering wheels.

3:31

>> So the way I like to think about it is

3:33

uh, Alpamayo, the end-to-end model, in the

3:34

driver's seat, right? It's doing the

3:35

driving. Yeah,

3:36

>> but we have this classical stack that is

3:38

sitting in that passenger seat with the

3:40

extra set of brake pedal, gas pedal, and

3:42

steering wheel to take over to help

3:44

enforce certain rules if needed to. So,

3:46

that's how we're able to have both the

3:48

safety of that kind of classical stack,

3:50

right? But also the human driving

3:51

behavior of the end-to-end model where

3:53

you get that smooth, comfortable driving

3:55

behavior.
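
The driver's-ed analogy above can be sketched as a small arbiter: the end-to-end model's command goes through unless a rule monitor objects, in which case the classical stack's command is used instead. This is only an illustration of the idea as described in the conversation. The names `Command`, `arbitrate`, and `violates_speed_limit`, and the single speed-limit rule, are hypothetical and are not NVIDIA's API or actual logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    steer_deg: float   # steering angle request, degrees
    accel_mps2: float  # longitudinal acceleration request, m/s^2


def violates_speed_limit(speed_mps: float, limit_mps: float, cmd: Command) -> bool:
    """Hypothetical rule: do not keep accelerating at or above the limit."""
    return speed_mps >= limit_mps and cmd.accel_mps2 > 0.0


def arbitrate(e2e: Command, classical: Command, rule_violated: bool) -> Command:
    """Pass the end-to-end command through unless a rule is violated."""
    return classical if rule_violated else e2e


# Assumed example values, not measurements from the drive.
e2e_cmd = Command(steer_deg=1.5, accel_mps2=0.8)
fallback = Command(steer_deg=1.5, accel_mps2=0.0)

under_limit = arbitrate(e2e_cmd, fallback, violates_speed_limit(10.0, 13.4, e2e_cmd))
over_limit = arbitrate(e2e_cmd, fallback, violates_speed_limit(14.0, 13.4, e2e_cmd))
```

In this sketch the learned model drives in the common case, and the classical command only replaces it when the rule check fires, which mirrors the "passenger seat with an extra set of pedals" description.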

3:56

>> Got it. Um, one thing I'd love to

3:58

understand is, you know, you mentioned

4:01

there's a lot of cameras, there's a lot

4:02

of radars, and there's a lot of

4:04

ultrasonic sensors, right?

4:06

um what do each of those sensors do and

4:08

how do they get combined into this

4:10

larger like 360° view of what's going on

4:13

around the car?

4:14

>> Yeah. So it can take input from all

4:16

those sensors and it creates what we

4:17

call the world model, right? So the car

4:19

can see that all these cars are parked,

4:21

none of them are moving. So we can

4:22

detect the velocities for example,

4:24

right? Based off what the car can see

4:26

with the cameras, we can see the lane

4:27

lines and the lane markings. So we can

4:28

tell that we're in a drivable lane here,

4:30

but to our right is a bike lane, right?

4:32

So the car can label and understand what

4:34

all those are. So then it creates this

4:36

reconstruction of the world, right? And

4:38

it uses that to understand behaviors,

4:40

right? So for example, at an all-way

4:42

stop sign, we can use that to understand

4:44

right-of-way order, right? So we can see

4:46

when the other cars arrive and determine

4:48

when it is our turn to move in terms of

4:50

precedence, right? But then we also have

4:52

to consider there's these people

4:53

walking, right? So they will also impact

4:55

our ability to start. So you can see

4:57

this guy's here stopped, right? You have

4:59

these guys in the crosswalk. It's safe

5:00

for us to proceed, right? We also can

5:01

see that there's this guy on the scooter

5:02

in the bike lane. We don't need to freak

5:04

out, right? We can just keep driving in

5:05

our own lane. No problem, right? We can

5:07

just drive next to this guy,

5:08

>> right?
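
The all-way stop behavior described above (track when each car arrives, go when it is our turn, but also wait for people in the crosswalk) can be sketched as follows. This is a minimal sketch under stated assumptions: arrival times are taken as already known from the world model, and `Arrival`, `precedence_order`, and `may_proceed` are hypothetical names, not part of any NVIDIA software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arrival:
    vehicle_id: str
    stopped_at_s: float  # time at which the vehicle stopped at its line


def precedence_order(arrivals: list[Arrival]) -> list[str]:
    """Vehicles go in the order they stopped (first to stop goes first)."""
    return [a.vehicle_id for a in sorted(arrivals, key=lambda a: a.stopped_at_s)]


def may_proceed(ego_id: str, arrivals: list[Arrival], crosswalk_occupied: bool) -> bool:
    """Ego may go only when it is first in line and the crosswalk is clear."""
    order = precedence_order(arrivals)
    return bool(order) and order[0] == ego_id and not crosswalk_occupied


# Assumed scene: another car stopped one second before we did.
waiting = [Arrival("ego", 2.0), Arrival("other", 1.0)]
```

Once the other car has gone and is dropped from the list, `may_proceed` turns true for the ego vehicle, unless a pedestrian is still crossing.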

5:09

>> So the end-to-end model is using kind of

5:12

the front camera where it can see,

5:13

right? And then it's receiving inputs

5:15

from that world model to see what's also

5:16

going on behind it as well. And then the

5:19

classical stack is using all of them to

5:20

determine where it should go.

5:22

>> And sorry. So it sounds like everything

5:23

you've described so far is coming from

5:25

the cameras, right? That's the

5:27

>> cameras and it's also using the radar as

5:28

well. So

5:28

>> and the radar.

5:29

>> Yep. So it can use both basically to

5:31

understand the velocity of an object

5:33

right or a person that's walking right

5:35

it can tell right this is a car in front

5:37

of us. So it combines actually both

5:39

inputs in order to understand what it's

5:41

looking at and what it can detect with

5:42

the radar.

5:42

>> And then the third type of sensor you

5:44

mentioned I believe was ultrasonic. Is

5:45

that right?

5:46

>> So those are used for parking. Right. So

5:47

when we get really close to curbs and

5:49

things like that that's where we're

5:50

using those ultrasonics. But for driving

5:52

most of it's done with the uh camera and

5:54

the the radars.

5:55

>> Yeah. And and the radars are there

5:56

primarily to do range and speed.

5:58

>> Correct. You've got it.
>> That's very

6:00

cool. How come you can't use um uh

6:03

information from like multiple cameras

6:05

like in stereo vision to determine range

6:08

and speed? What why radar or not?

6:10

>> We do a bit of both, right? So, you

6:11

know, that's a bit of the secret

6:13

sauce, right? But, uh you know, we're

6:14

using both to confirm and understand the

6:16

world around us, right? So, that's where

6:17

you have that redundancy that's helpful

6:19

to understand what is drivable, what

6:22

these other things are doing, right? Are

6:23

these people, right? You know, a radar

6:25

can see these people are crossing the

6:26

street, right? without the camera to

6:28

tell you that hey this is a person right

6:30

I know a person shape right you're able

6:32

to combine those together and understand

6:33

that these are people walking versus

6:34

that's a car or a small scooter right
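
The point about combining the two sensors (the camera says what an object is, the radar says how far away it is and how fast it is moving) can be illustrated with a simple bearing-based association. This is a hedged sketch, not the production fusion method, which the speaker calls part of the "secret sauce". All class names, the `fuse` function, and the 2-degree matching gate are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraDetection:
    label: str          # e.g. "pedestrian", "car", "scooter"
    bearing_deg: float  # direction of the object relative to the car


@dataclass(frozen=True)
class RadarReturn:
    bearing_deg: float
    range_m: float
    radial_speed_mps: float  # negative means the object is closing


@dataclass(frozen=True)
class FusedObject:
    label: str
    range_m: float
    radial_speed_mps: float


def fuse(cams: list[CameraDetection], radars: list[RadarReturn],
         max_gap_deg: float = 2.0) -> list[FusedObject]:
    """Attach the nearest-bearing radar return to each camera detection."""
    fused = []
    for cam in cams:
        candidates = [r for r in radars
                      if abs(r.bearing_deg - cam.bearing_deg) <= max_gap_deg]
        if candidates:
            nearest = min(candidates,
                          key=lambda r: abs(r.bearing_deg - cam.bearing_deg))
            fused.append(FusedObject(cam.label, nearest.range_m,
                                     nearest.radial_speed_mps))
    return fused


# Assumed example: one pedestrian seen by both sensors, one car seen by camera only.
cams = [CameraDetection("pedestrian", 10.0), CameraDetection("car", -5.0)]
radars = [RadarReturn(10.5, 12.0, -1.2), RadarReturn(30.0, 40.0, 0.0)]
objects = fuse(cams, radars)
```

Detections with no radar return inside the gate are simply left out here. A real system would keep them and fall back to camera-only range estimates.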

6:37

>> so certain solutions on the road try to

6:39

approach this from a vision only

6:41

perspective

6:42

>> um how did you guys like can you walk me

6:44

through a little bit of the bigger

6:45

thought process that made the

6:47

determination you know what vision only

6:49

may not be enough especially for like an

6:51

L3 or L4 solution and these other

6:53

sensors need to come into play is that

6:55

from a safety perspective, a regulation

6:58

perspective, a capabilities in general

7:00

perspective. How how did you guys decide

7:02

to use more than vision only in the

7:03

first place?

7:03

>> For our Hyperion architecture, right, we

7:05

wanted the redundancy, right? So for

7:07

like level three and level four, uh

7:09

we'll use to uh to Thor, right? And

7:11

we'll also have the lidar there, right? So

7:13

we get more information, right?

7:15

>> And sorry, can you just briefly explain

7:17

what Thor is and Orin is and

7:19

>> Yep. So these are the different uh

7:21

onboard computing chips that we provide

7:22

to our partners, right? So the Thor has

7:26

more computing power than the Orin. So

7:28

with additional computing power, you can

7:29

use bigger end-to-end models, right? We

7:32

can take inputs from more uh signals,

7:34

right? So we can use more cameras to

7:36

power that model to get more

7:37

information. So like here's a great

7:38

example. We just had that light turn

7:40

yellow right as we're pretty close,

7:42

right? The car has to understand, should

7:44

I, you know, step on the brakes and stop

7:45

before the uh light or should I proceed

7:48

through kind of like a human would,

7:49

right? So it's calculating the distance

7:50

between us and the stop line, how fast

7:53

we're moving, right? to make some of

7:54

those decisions as well. So, sorry, just

7:55

an interesting scenario that we we had.
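
The yellow-light trade-off mentioned above (distance to the stop line versus current speed) can be written as a textbook kinematic check: the constant deceleration needed to stop at the line is v^2 / (2d), and the car stops only if that is comfortable. This is a classical illustration, not the system's actual decision logic. Both function names and the 3.0 m/s^2 comfort threshold are assumptions.

```python
def required_decel(speed_mps: float, distance_m: float) -> float:
    """Constant deceleration needed to stop exactly at the line: v^2 / (2 d)."""
    if distance_m <= 0.0:
        return float("inf")  # already at or past the line
    return speed_mps ** 2 / (2.0 * distance_m)


def should_stop_on_yellow(speed_mps: float, distance_m: float,
                          comfort_decel_mps2: float = 3.0) -> bool:
    """Stop if braking to the line stays within a comfortable deceleration."""
    return required_decel(speed_mps, distance_m) <= comfort_decel_mps2


far_and_slow = should_stop_on_yellow(speed_mps=10.0, distance_m=25.0)    # needs 2.0 m/s^2
close_and_fast = should_stop_on_yellow(speed_mps=15.0, distance_m=10.0)  # needs 11.25 m/s^2
```

With these assumed numbers the far, slow case stops and the close, fast case proceeds, matching the "not slamming on the brakes" intuition.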

7:58

>> Uh, so yes, like coming up ahead, I

8:00

think we're gonna have an unprotected

8:01

left turn, which will also be interesting to

8:03

see, right? You know, we have to

8:04

consider the oncoming cars, right? We

8:06

have to consider if anybody's in the

8:07

crosswalk. So, it's it's it should be

8:10

interesting uh coming up here. But this

8:11

is that shopping area I was saying, where we

8:12

might have some double parked vehicles

8:14

and some pedestrians crossing the

8:16

street. So, uh earlier today, we've been

8:18

lucky. We've got some fun interesting

8:20

scenarios. I'm hoping we get some to

8:21

share with you guys as well.

8:22

>> Yeah. No, I'm looking forward to that.

8:24

I'm I'm noticing this is probably the

8:26

most boring driving job ever because

8:28

it's like you're not touching the

8:29

steering wheel. You're never touching

8:30

either pedal. Like it seems so smooth

8:32

that it's like it's interesting just

8:34

being here almost like a safety backup

8:36

instead of um the primary, right? Like

8:39

you're really being driven instead of

8:40

you driving the car is what I'm just

8:42

noticing for really the first time.

8:44

Yeah.

8:44

>> Yeah. So for for example, I did a test

8:46

where I drove from San Francisco to San

8:47

Diego in one day. So that was about 14

8:49

hours. It's about 1,000 miles, right? So

8:51

it was another driver and I and uh we

8:54

went down and you'd imagine being in the

8:56

driver's seat for that long right you'd

8:57

arrive really tired and you know

9:00

irritated right you sitting in traffic

9:01

throughout the day and things like that

9:03

>> but honestly by using the system and

9:04

letting the car handle a lot of the kind

9:06

of mundane task of you know sitting in

9:08

bumper-to-bumper traffic in LA, right, the

9:10

car handled a lot of that right so there

9:12

I actually arrived even after a long day

9:14

quite refreshed and actually not that

9:16

tired because you release so much of

9:18

that you know processing that you're

9:20

doing right while you're driving that

9:21

you arrive a little bit more refreshed.

9:23

>> Yeah. You you uh eliminate a lot of that

9:25

decision fatigue, right? That decision

9:27

fatigue.

9:32

>> What uh what's going on with the

9:33

displays here? So, like this seems to be

9:35

a pretty static display. Um walk me

9:37

through like what the driver what kind

9:38

of information you're presenting to the

9:40

driver in a situation where they're

9:42

being driven versus them driving

9:43

themselves.

9:43

>> So, here on the center display, right,

9:45

you can see just the navigation, right?

9:47

So again, we just set a route out to

9:49

show you a variety of scenarios, but

9:50

we're just using that purely for

9:51

navigation, right? So we're not getting

9:53

any there's no HD map here. So we're not

9:55

getting any hints about the lane to our

9:57

right is a parking lane, then there's a

9:59

bike lane and there's this lane. So we

10:00

don't get any of that information. Uh

10:02

that's based purely off what the car can see.

10:03

And then George, when you have a second,

10:05

do you want to switch over to the

10:07

confidence view? Right. So we also then

10:09

can provide these inputs to our partners

10:11

where they can choose how they want to

10:12

visualize what

10:14

>> I'm sorry, I'm looking there now, right?

10:15

>> Yes. In front of the driver. So the

10:16

instrument cluster there, right? So we

10:18

can show that, hey, we see we have this

10:19

lead vehicle, right? So the car is able

10:22

to present that information to the

10:24

driver to help communicate a little bit

10:25

more about what it can see, what it's

10:27

doing, right? And we have all those

10:28

inputs that we can share, right? So we

10:29

can share things like traffic lights,

10:31

other vehicles, lane detections, and

10:33

then the partners can choose what they

10:35

want to uh use to display.

10:37

>> And sorry, just just so for clarity for

10:39

me, everything I'm seeing on the screens

10:41

now is purely for humans. None none of

10:44

this is also like information like the

10:46

information that the car itself is using

10:49

to make decisions is completely separate

10:51

from these like

10:52

>> what you're saying. Yeah. The only thing

10:53

that car is taking here is basically the

10:54

route, right? That's the only thing that

10:55

it receives that you can see. So as we

10:57

adjust around and again as we said

10:59

towards the end here, we can update the

11:00

route and set new points that are not on

11:02

the route. No problem, right? Uh at that

11:04

point the car is just getting the

11:05

navigation to turn on this street,

11:07

right? Proceed on that street. That's

11:08

all it's getting from there.

11:09

>> Um how come? Why like why why not give

11:12

it as much information like is is the

11:14

idea that's the sufficient amount of

11:16

information to do the job and you don't

11:18

need to give it more or is there like

11:20

why not give it for example all the

11:21

speed limits uh to these roads is it

11:23

because it's enough to have it determine

11:25

things uh with cameras using looking at

11:28

street signs even in low visibility or

11:30

>> so it's there's always a data quality

11:32

question right so sometimes we can get

11:34

incorrect information from the map or it

11:36

doesn't have information right so we'll

11:39

use what we have if it's there, but if

11:40

it's not there, no problem. We'll go off

11:42

what the car can see from perception,

11:44

right? So, as we see speed limit signs

11:46

change, right, the car then can see that

11:48

and adjust its speed accordingly, right?

11:50

So, you know, if maps don't get updated,

11:52

things like that, we want to make sure

11:54

that the car is always driving based off

11:55

what is most relevant. Like we can see

11:57

we have some construction coming up

11:58

ahead, right? That may not be

12:00

represented in any mapping, right? So,

12:02

the car just needs to be able to handle,

12:03

okay, there's a lane closure coming up,

12:06

right? So, you can see the car stopped.

12:07

Does this guy cut in front? Right? No

12:09

problem. Uh, it's able to stop. And then

12:12

you can see up ahead, right? We'll have

12:14

this lane closure with that big LED

12:16

board, right? So, we can see, you know,

12:18

the the signs there, right? And we know

12:21

we want to make a left turn, so it's

12:23

going to want to get over to the left,

12:24

but it's actually going to change its

12:26

mind and go here, right? No problem. It

12:27

sees there's a guy standing on the road,

12:29

right? So, we can come here and we'll

12:31

drive here with this center lane that's

12:32

closed, right? How often do you see a

12:34

center lane is closed?

12:35

>> Yeah. Right.
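
The map-versus-perception point made a moment earlier (use map data if it is there, but go off what the car can see, because maps can be stale or missing) can be sketched as a simple source-selection rule. The priority order below, with a perceived sign overriding the map value, is one reading of the explanation and is an assumption. `effective_speed_limit` is a hypothetical helper.

```python
from typing import Optional


def effective_speed_limit(perceived_mph: Optional[float],
                          map_mph: Optional[float]) -> Optional[float]:
    """Prefer the limit read from a sign; fall back to the map; else unknown."""
    if perceived_mph is not None:
        return perceived_mph
    return map_mph


sign_and_map = effective_speed_limit(perceived_mph=25.0, map_mph=35.0)
map_only = effective_speed_limit(perceived_mph=None, map_mph=35.0)
neither = effective_speed_limit(perceived_mph=None, map_mph=None)
```

A construction zone with a temporary sign is the case this ordering is meant to cover: the map still says 35, but the car follows the 25 it can see.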

12:37

What's that like for you as the driver,

12:39

if you don't mind me asking? Like, you

12:40

know, someone cuts in front of you, your

12:41

hands close to the your pedals are close

12:43

to the feet, your hands are close to the

12:45

and you do nothing and it just works

12:47

like

12:48

>> um Well, I've been testing the software

12:50

enough, so I know the car can handle

12:52

most of the situations.

12:53

>> Was the first time it was just like,

12:55

"Oh, this is crazy." And then,

12:56

>> No, I've seen crazier stuff.

12:58

>> Okay. Yeah. So, so this it sounds like

13:01

it can really handle truly like outlier

13:03

situations is what I'm getting at.

13:05

>> Yeah. So the, you know, the model's been

13:07

trained enough, right, that we can do a

13:08

lot of the general driving, right? And

13:10

what we're excited about is, you know,

13:11

we'll have, you know, kind of a beta

13:13

release of this, uh, in Q2 of this year,

13:16

right? But we're looking for a, so we

13:17

can see we're trying to get through with

13:18

these two guys here. No problem. Uh,

13:21

we're trying to do a nationwide rollout

13:23

by the end of this year, right? So,

13:25

>> And sorry, when you say nationwide

13:26

rollout, what do you mean? Is that like

13:28

nationwide in Mercedes? Like what what

13:30

is

13:30

>> for customers? Yeah. That are buying

13:32

this car, right? they'd be able to, you

13:34

know, purchase this software, right, and

13:35

use it to drive, you know, from here to,

13:38

you know, Miami or, you know, everything

13:40

in between, right? So, that's really

13:42

exciting. And then from that, right, as

13:44

the, you know, we have this rolled out

13:46

in more and more customer cars, right,

13:47

we start to get, you know, interesting

13:49

data events that we get from those cars

13:50

as well. So, that data will be sanitized

13:52

and sent back to us. So, we can always

13:53

evaluate all the new events that we're

13:55

seeing that maybe our fleet didn't

13:56

catch, right? But someone living in, you

13:58

know, a state where we don't have a test

14:00

car, right? we can then start getting

14:01

that information that way as well and

14:03

use that to enhance the models for

14:04

future releases.

14:06

>> And I so I assume a bunch of things are

14:08

happening at once, right? Like for

14:09

example, the software is being offered

14:11

in more and more automobiles and with

14:13

more and more auto uh manufacturers and

14:16

then separately there's also growing

14:18

more and more capabilities in each one

14:20

of those. So for example, we're in an L2

14:22

plus vehicle, right? Yep.

14:24

>> And then you know eventually level three

14:26

and beyond. Can you speak a little bit

14:28

to All right, you just said a little bit

14:30

about the nationwide rollout, right?

14:32

What's the what's on the road map uh the

14:34

other way, right? Like when do you guys

14:36

expect to sort of reach a level three, a

14:38

level four?

14:38

>> Yeah. So, as you may have seen a lot of

14:40

the news with GTC this year, we

14:42

announced that with Uber, we'll do a

14:43

level four robotaxi in Los Angeles and

14:45

San Francisco starting next year. Right.

14:47

So, you can see we're nice going down

14:48

this nice narrow road, right?
>> And sorry,

14:50

he he actually touched the wheel there

14:52

or was it more about what's what what's

14:54

the distinction like why did why why um

14:58

manually turn there as opposed to let

14:59

the car do it?

15:00

>> Um

15:01

>> I'm genuinely like I'm just trying to

15:03

understand the

15:06

>> I'm just a safety driver. So if I feel

15:08

like sometimes um like uh the car maybe

15:11

is um going to uh get into contact with

15:14

objects then I can collaborate uh I can

15:18

do something called collaborative uh

15:20

steering

15:21

>> but the car still lets me uh handle the

15:23

wheel

15:23

>> and it's like a seamless transition it

15:25

seems like like you you did one tiny

15:27

maneuver hands off and it wasn't like a

15:29

hard intervention like the mode didn't

15:31

change.

15:32

>> Yeah. So the design is so that way for

15:34

level two, George can be involved or he

15:36

can let go, right? It can go either way,

15:38

right? So like I've done that turn six

15:39

times today, right? No issue, right? So

15:41

in this case, right, depending on if

15:43

George wants to get closer or not,

15:44

right? He can help the steering out,

15:46

right? To increase his comfort, right?

15:48

Or increase our comfort, right? In that

15:50

case,

15:50

>> and that hand like I think what I'm more

15:52

commenting on is like I've I've been in

15:55

other, you know, level two assisted cars

15:58

in the past. Um, and when you make a

16:01

manual intervention, from then on,

16:03

you're in manual driving until you

16:05

re-engage a lot of those features. But

16:08

in this case, it seemed so smooth. It

16:10

was like touch the wheel, adjust the

16:12

turn a little bit, you're back to hands

16:14

off, feet off, you know.

16:15

>> Exactly. So,

16:16

>> so in this case, if George can he can

16:18

tap the gas, he can, you know, touch the

16:20

steering wheel, right? The system will

16:22

stay engaged. It's only if he hits the

16:23

brakes that'll disengage completely.
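
The engagement rule just described (steering or tapping the gas keeps the system engaged, braking disengages it) is small enough to write down as a state-transition function. The enum and function names here are hypothetical, and the sketch covers only the three inputs mentioned in the conversation.

```python
from enum import Enum


class DriverInput(Enum):
    STEER = "steer"
    ACCELERATOR = "accelerator"
    BRAKE = "brake"


def engaged_after(engaged: bool, driver_input: DriverInput) -> bool:
    """Return whether the system is still engaged after a driver input."""
    if driver_input is DriverInput.BRAKE:
        return False   # braking disengages completely
    return engaged     # steering or accelerator leaves the state unchanged


after_steer = engaged_after(True, DriverInput.STEER)
after_gas = engaged_after(True, DriverInput.ACCELERATOR)
after_brake = engaged_after(True, DriverInput.BRAKE)
```

This is the difference the interviewer is pointing at: in this design a steering nudge is a collaboration, not a mode change.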

16:24

>> Got it. Uh, super interesting. And

16:27

sorry, so right before that happened, we

16:29

were talking about uh level three and

16:31

level four. Can you just remind me? So

16:33

with Uber, that's what you're talking

16:34

about.

16:34

>> Yeah. So we'll have uh level four will

16:36

start rolling out in LA and San

16:38

Francisco to start, right? And it'll be

16:39

28 cities by the end of 2028 uh with

16:42

them. So we're excited to see that

16:44

coming as well. So we can see this guy

16:45

stopped, right? No problem. We have that

16:48

>> there. Uh so yeah, so we're excited

16:50

again to see how the architecture can

16:52

scale from a level two plus product all

16:54

the way up to that level four, right?

16:56

Where the car would have to do

16:57

everything, right? Where you you don't

16:58

have George here, right? You don't have

16:59

anyone in that seat, right? The car

17:00

would be able to handle all of those

17:02

scenarios.
>> Are you expecting when that

17:04

happens, will there be certain new kinds

17:07

of cars coming out that maybe don't have

17:08

a steering wheel at all? Like how does

17:11

this impact the future of what

17:13

automobiles will even look like?

17:15

>> Yeah, there's a couple of different

17:16

ways, right? I think we can imagine a

17:18

world where yeah for kind of robotaxis

17:21

right where there never will be a driver

17:24

right you can have a design where you

17:25

know there's a car that doesn't have a

17:27

steering wheel right or doesn't or have

17:28

the seats that maybe face inward right

17:30

that's one concept

17:31

>> uh the other one right it can be you

17:33

know consumer grade where you know you

17:35

can buy a car where you know at least

17:38

living in California right I might want

17:39

to go drive on you know highway 1 and

17:41

PCH and go drive you know the beautiful

17:43

scenic countryside roads along the ocean

17:46

but then when I'm sitting in traffic in

17:47

San Francisco, I want the car to do

17:48

everything, right? So, you can have

17:50

different approaches to, you know, when

17:52

you want to drive versus when you don't.

17:53

So, the the as long as you have the

17:56

sensor set, right, the stack is flexible

17:58

enough that you can have a steering

17:59

wheel there and we can design it where

18:01

we want the driver to be part of the

18:02

experience or we can do it where we

18:04

don't want the driver to be part of the

18:05

experience at all.

18:06

>> Sure. No, that makes a lot of sense.

18:07

Speaking of the stack, um since you know

18:09

so much of it is camera based, uh what

18:12

is performance like you know like super

18:14

foggy weather, nighttime, bad rain,

18:17

right?

18:17

>> Yeah, it's a great question. So the

18:19

system has levels of kind of degradation

18:21

that it can accept, right? And we

18:23

also can understand where maybe the

18:25

blockages, right? So let's say there's

18:27

dirt that gets on some of the cameras,

18:28

right? For example, if we're looking in

18:30

front of us, if it's blocking where we

18:32

can't really see the tops of these

18:33

buildings, right? We don't really care

18:35

as much, right? you don't need to

18:36

prioritize things that are up high,

18:37

right? As long as we still can see

18:38

directly in front of us in that type of

18:40

drivable space. So, it's prioritizing

18:43

different areas for each of the cameras

18:44

of what is most important, right? And

18:46

then until they reach, you know, a

18:47

certain degradation level, right? Then

18:49

it may say, "Hey, okay, actually, we

18:51

want the driver to take over in this

18:52

case," right? And then with kind of the

18:54

level threes and level fours, right,

18:55

that's why we want the redundant sensor

18:57

sets there so we can have a couple

18:59

different options in order to help aid

19:02

uh what the car can see, right? and make

19:03

sure that you know it's able to see all

19:05

the objects and all the cars on the

19:06

road.
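
The degradation idea above (a blocked view of building tops matters far less than a blocked view of the road ahead, and a takeover is requested only past some degradation level) can be sketched as a weighted blockage score. The region names, the weights, and the 0.3 threshold are all assumed values for illustration. The transcript does not give the real ones.

```python
def weighted_blockage(blocked_fraction: dict[str, float],
                      weights: dict[str, float]) -> float:
    """Importance-weighted share of the image that is blocked (0.0 to 1.0)."""
    total = sum(weights.values())
    return sum(w * blocked_fraction.get(region, 0.0)
               for region, w in weights.items()) / total


def needs_takeover(blocked_fraction: dict[str, float],
                   weights: dict[str, float],
                   threshold: float = 0.3) -> bool:
    """Ask the driver to take over once weighted blockage reaches the threshold."""
    return weighted_blockage(blocked_fraction, weights) >= threshold


# Hypothetical weights for a front camera: the drivable space dominates.
FRONT_WEIGHTS = {"sky": 0.05, "road_ahead": 0.95}

dirt_on_top = needs_takeover({"sky": 1.0, "road_ahead": 0.0}, FRONT_WEIGHTS)
dirt_on_road = needs_takeover({"sky": 0.0, "road_ahead": 0.5}, FRONT_WEIGHTS)
```

With these weights, a fully blocked sky region scores 0.05 and is tolerated, while half of the road region blocked scores about 0.475 and triggers a takeover request.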

19:08

>> That makes a lot of sense. So, we've

19:10

we've seen a lot of interesting use

19:12

cases already, right? For example, um or

19:14

edge cases, sorry, I should say, like um

19:16

construction in the middle of the road

19:18

where you have to make a left right

19:19

decision, but the construction zone is

19:20

between them. What is some of the like

19:22

craziest edge cases you've seen that

19:25

turn into real practical examples and

19:27

training data for the model?

19:29

>> Yeah, I would say a lot of construction

19:30

ones have been interesting, right? Uh we

19:33

I had one example uh it was actually in

19:35

San Francisco where there was a row of

19:38

cones, right? And there's people working

19:39

in the middle. No problem, right? We see

19:42

that, you know, a million times. No

19:43

issues, right? But then one of the

19:44

construction workers decided to throw a

19:46

cone in front of the car, right? To say,

19:48

"Hey, stop. We have we're going to

19:49

unload some stuff." So it literally just

19:50

throws a cone, right? And the car sees

19:52

this object. Yeah. So then it stops,

19:53

right? But you're like, "Okay, wait,

19:55

what? I've never seen someone do that

19:56

before." And the guy just didn't want to

19:58

wait. So he threw a cone in front of the

19:59

car and then he went and carried, you

20:00

know, a couple boxes across the street

20:02

and then went and picked up the cone and

20:03

then walked out of the way, you know,

20:04

>> and it was like totally fine.

20:05

>> Yeah, it was totally fine, right? But it

20:06

was one of those things you're like,

20:07

"Okay, I've never I've never seen that

20:09

even as a human driver," right? So we can

20:11

see we don't want to block the

20:13

intersection, but then we also have, you

20:16

know, these pedestrians here, right? So

20:18

we want to try to clear the ped the

20:19

intersection as much as we can, right?

20:21

But then we have these guys that are

20:23

walking, right? Cuz the light came to a

20:24

stop. We're in gridlock, right? This is

20:26

a, you know, a nice deep dense traffic

20:28

area in downtown San Francisco.

20:30

>> So, this is a case where it seems to

20:31

have made the decision, uh, even though

20:33

we were already moving at low speeds and

20:35

the light was yellow, uh, then it turned

20:37

red while we were still in the

20:38

intersection. Is is that typical? Like,

20:41

walk me through kind of what just

20:42

happened versus what maybe the average

20:44

person would have expected to happen.

20:46

>> Right. It's, you know, so you have this

20:48

guy that's really close to a spine, so

20:49

that's why we got the rear blind spot

20:51

going off there. Uh, it's interesting,

20:54

right? The, you know, yellow light

20:56

handling is also one of those

20:57

interesting things as a human, right?

20:58

Where you can tell how quickly you're

20:59

moving towards a car or towards the

21:01

intersection and you can kind of gauge,

21:03

okay, I should stop for this one, right?

21:04

I'm really far away. I'm not going that

21:05

fast. I should stop. Uh versus, hey, I'm

21:07

really close to the intersection. I

21:09

should probably proceed to be safer that

21:10

way. You're not slamming on the brakes,

21:11

right? So, it's a very similar approach,

21:13

right? We have enough training data on

21:15

yellow lights, right? Where it's

21:16

learned, okay, in this situation where

21:18

I'm, you know, about this far away at

21:19

this speed, right? I should proceed

21:20

through versus I should not. And then we

21:22

also know in this situation where if we

21:24

end up in a situation like that where we

21:25

get kind of stuck where we've already

21:27

entered the intersection, we're already

21:28

past kind of the wait line, right,

21:30

we should try to clear the intersection,

21:31

right? So that's why we came over to the

21:33

right lane and we're able to kind of

21:34

open up and clear up the intersection so

21:36

that we're not blocking the the cross

21:37

traffic.
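
The stop-or-proceed reasoning described here can be illustrated with a simple kinematic rule. This is only a sketch: the speaker says the behavior is learned from training data, not computed from a formula, and the names `YellowLightState` and `should_stop_for_yellow` and the 3.0 m/s^2 comfort limit are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class YellowLightState:
    distance_to_stop_line_m: float  # distance from the front bumper to the stop line
    speed_mps: float                # current speed in metres per second


def should_stop_for_yellow(state: YellowLightState,
                           comfortable_decel_mps2: float = 3.0) -> bool:
    """Return True when the car can stop before the line without harsh braking.

    The deceleration limit is an assumed comfort value, not a figure from the drive.
    """
    if state.speed_mps <= 0.0:
        return True  # already stationary, so stay stopped
    # Distance needed to stop at constant deceleration: v^2 / (2a).
    stopping_distance_m = state.speed_mps ** 2 / (2.0 * comfortable_decel_mps2)
    return stopping_distance_m <= state.distance_to_stop_line_m
```

Far away and slow, the rule says stop; close to the line at speed, it says proceed, which matches the "you're not slamming on the brakes" intuition.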

21:38

>> Got it. Yeah. As a more selfish driver,

21:40

I would have probably stayed in the

21:41

intersection, not gotten in this lane,

21:43

you know, so it's really interesting

21:44

seeing um the way the car prioritizes

21:48

certain things. Like for example,

21:50

clearing the intersection at the cost of

21:52

our own convenience because now we're in

21:53

sort of a lane people park in versus the

21:56

more human decision of like, oh, I'll

21:57

just wait because I'm about to clear

21:59

this light even though right now I'm

22:00

sticking in the intersection. Right.

22:02

>> Yeah. But then it also saw we had people

22:03

walking. Right. So it stopped. Right. So

22:05

it let those people cross ahead. Right.

22:07

>> Right.

22:08

>> Whereas I probably would have just

22:09

beeped at them.

22:12

>> What are you most looking forward to

22:14

like near-term more on the road map? Is

22:16

there like a specific feature that's

22:18

coming soon? Is it more like the global

22:20

rollout? Like walk me through as

22:21

somebody who lives this sort of

22:22

day-to-day like what's kind of next that

22:24

you're looking forward to?

22:25

>> It's a great question. I think what is

22:27

exciting is it's kind of like watching

22:29

like a 16-year-old learn how to drive

22:30

right as it gets better every day,

22:32

right? So uh as mentioned when I'm not

22:34

uh sharing these experiences with you

22:36

guys, I'm driving the car every day and

22:37

experiencing uh the latest builds.

22:39

>> Uh so it's fun to see the car get better

22:41

at handling those edge cases like the

22:43

construction worker scenario I gave or

22:46

you know the construction in the middle

22:47

lane. Seeing how the car can handle more

22:48

and more complex situations is really

22:51

cool. And then as I mentioned like the

22:52

test down in San Diego, right, taking it

22:54

to different locations, right? Seeing

22:56

how the car can handle, you know, the

22:58

different scenarios, right, is also

23:00

really exciting to see how it can drive

23:02

in different cities, right?

23:04

>> Jeez. Yeah, this is this is tough. I I

23:07

get it. Um and sorry I don't know if you

23:09

said this when we had already started

23:11

recording but can you just please say

23:13

like what your actual role is what you

23:16

actually do at NVIDIA right

23:17

>> so I'm one of the product managers that

23:18

works on our ADAS features right so uh I

23:21

specifically am on our user experience

23:23

team so we try to look after the overall

23:25

driving behavior right so is the car

23:28

comfortable is it being safe right how

23:30

does it feel when the car is driving so

23:32

that's my priority when it comes to uh

23:34

working on the stack

23:35

>> And sorry, when you say, you know, is it

23:37

comfortable? Is it safe? Do you mean

23:38

like individual to individual like this

23:40

experience or do you mean like based on

23:42

the data we're seeing, you know, the

23:44

smoothness of stops, the ease of turns,

23:47

like the more macro level like

23:49

statistics, the overall experience is

23:51

safe, easy, like

23:53

>> both, right? So, we look at both, right?

23:54

So, we have a number of, you know, uh,

23:56

regression tests that we can do where we

23:58

could run the car through, I don't know,

23:59

10,000 left turn events, right? and make

24:01

sure that it always makes sure that it

24:03

clears safely and doesn't collide with

24:05

any traffic or pedestrians in

24:06

simulation. Right? So, we do all of our

24:08

offline testing and then we also do

24:10

on-road testing, right? Because

24:11

ultimately we also want to validate,

24:13

right, the behavior uh in the car as

24:16

well. So, uh I do both, right? And it's

24:18

a lot of fun to actually get in the car

24:20

and actually experience different builds

24:21

and see some models may be more relaxed,

24:24

some models may be more aggressive,

24:25

right? And everything in between. Uh and

24:27

we try to kind of design for you know

24:29

kind of the 80% where you know my mom

24:31

would be happy to get in this car who

24:32

someone who's not technical right

24:34

doesn't you know doesn't want to give up

24:36

control right where we can get to the

24:37

point where someone can feel comfortable

24:39

uh and feel safe using these types of

24:41

software.

>> So, simulation, while we're

24:44

on the subject I'm really interested you

24:46

know you take 10,000 right turns in

24:48

simulation for example right and then

24:50

you look at data from 10,000 right turns

24:52

in real life on the same model how big

24:55

of a variance or like difference I guess

24:57

between what you see in simulation and

24:59

what you see using real data is there

25:01

>> yeah with Cosmos and our physical kind

25:04

of AI simulation right you get the real

25:06

world physics right so it's it actually

25:09

behaves differently right if you run a

25:10

simulation for an area with snow and

25:12

rain right braking distances are longer

25:14

right because in real life they would be

25:16

right so uh it's quite accurate actually

25:18

you you'd be surprised where okay this

25:20

model shows hey this one might uh we

25:23

call like understeer where it turns and

25:24

it drifts into the other lane because it

25:25

can't keep its lane as tightly, right?

25:27

We can see that in simulation and if you

25:29

deploy that model to a car, it follows

25:31

like, oh, actually this car doesn't

25:33

follow the turn trajectory as tightly as

25:34

we'd like it to, right? So, it gives you

25:36

a pretty good uh sense of what the

25:38

performance will look like.
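
One way to picture the simulation-versus-road comparison is to measure how far the car drifts from lane center during the same turn in both settings. This is a minimal sketch under stated assumptions: the function names, the sample offsets, and the 0.1 m tolerance are hypothetical and do not come from NVIDIA's tooling.

```python
def max_abs_offset(lateral_offsets_m):
    """Largest distance from lane center (metres) seen during a turn."""
    return max(abs(offset) for offset in lateral_offsets_m)


def sim_matches_road(sim_offsets_m, road_offsets_m, tolerance_m=0.1):
    """True when the peak drift in simulation and on the road agree within tolerance."""
    difference = abs(max_abs_offset(sim_offsets_m) - max_abs_offset(road_offsets_m))
    return difference <= tolerance_m


# Hypothetical lateral offsets sampled through the same right turn.
sim_turn = [0.0, 0.2, 0.45, 0.3]
road_turn = [0.0, 0.25, 0.5, 0.35]
print(sim_matches_road(sim_turn, road_turn))
```

If a model understeers in simulation, a check like this would show whether the same peak drift appears once the model is deployed to a car.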

25:39

>> It probably also gives you a pretty good

25:41

sense of like how good drivers are,

25:42

right? Like what a car would do versus

25:44

what people decide to do, like how many

25:46

people decide to take control of a turn

25:48

versus let the model go through it, for

25:50

example.

25:50

>> Yep. So, we get that data. We also, it's

25:52

interesting. Obviously, it's trained on

25:54

a lot of human driving data, right? So

25:55

things like uh what we call like the

25:57

California roll where you creep through

25:59

a stop sign and don't come to a complete

26:01

stop, right? So you can see you know

26:03

certain data sets have more of that in

26:04

it. So all of a sudden now we see the

26:06

model thought it stopped but it really

26:07

didn't. It's like okay wait no we need

26:09

to enforce that. So that's where having

26:11

that classical stack underneath is

26:13

really helpful where you can enforce

26:14

certain things like making sure you stop

26:16

completely for stop signs, right? Or

26:18

>> uh for example if you were to go to

26:21

different states where you can't make

26:22

right turns on red, right? you can

26:24

enforce it by location, right? Things

26:26

like that. You're also able to help, you

26:28

know, enforce the behavior.
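
The idea of a rule layer enforcing a full stop on top of a learned model can be sketched as a small guard. This is an assumption-laden illustration, not the actual classical stack: the function names, the action strings, the 0.1 m/s threshold, and the three-sample requirement are all hypothetical.

```python
def has_come_to_full_stop(speed_trace_mps, stop_threshold_mps=0.1, min_samples=3):
    """True if speed stayed below the threshold for min_samples consecutive samples."""
    consecutive = 0
    for speed in speed_trace_mps:
        consecutive = consecutive + 1 if speed < stop_threshold_mps else 0
        if consecutive >= min_samples:
            return True
    return False


def enforce_stop_sign(proposed_action, speed_trace_mps):
    """Override a learned 'proceed' decision until a real full stop has happened."""
    if proposed_action == "proceed" and not has_come_to_full_stop(speed_trace_mps):
        return "hold"
    return proposed_action


# A rolling stop: the car slows but never actually stops.
print(enforce_stop_sign("proceed", [3.0, 1.5, 0.8, 0.6, 1.2]))
```

The design point is that the guard does not need to know why the model wanted to proceed; it only checks a condition that must hold regardless of what the training data contained.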

26:30

>> Can you speak a little bit more to that

26:31

enforcement? Is that like um, you know,

26:33

a rules-based enforcement? Like, hey, in

26:36

this boundary box, which is the state

26:37

line or whatever, the rule is no right

26:39

turn on red, or is it more like a

26:41

training a separate model state, but I

26:43

know not exactly a separate model, but

26:44

like walk me through a little bit.

26:46

>> So, we we do a bit of both, right? So,

26:47

you can have region specific. So, you

26:50

can see this guy cutting in, right? He's

26:51

here. No problem, right? we can kind of

26:53

get over it's all good but then we see

26:55

these people right so we're just being a

26:57

little cautious here. Uh, so you can do

26:59

different data sets right where you can

27:02

have your you know California data set

27:05

Florida data set if you'd like right you

27:06

can have different uh locations in the

27:08

actual model yeah but then you also to

27:10

your point you can also use uh rules to

27:13

enforce certain behaviors like hey when

27:15

in you know a different state right a

27:17

state that doesn't allow right on red

27:18

right you can do that right so you also

27:21

can see uh the sign there that says no

27:23

right on red. Yeah.

27:24

>> Right. So, we'll come to a stop here and

27:26

we'll wait a little bit further. Right.

27:28

We have a couple areas where you can

27:29

make a right on red where the car will

27:31

creep forward and it'll make sure

27:32

there's nobody coming and it'll make

27:33

that turn here.

27:34

>> And sorry, like um how does it determine

27:36

right now? I mean, this is gonna sound

27:39

like a silly question, but like does it

27:41

know it's in California right now

27:43

because it's in some certain latitude

27:44

longitude box or is it told that it's in

27:48

California or like

27:48

>> the car has GPS, so it knows that it's

27:50

uh at least for this.

27:51

>> So, it's just using GPS data on top of

27:52

that. Okay. I didn't know if it was Got

27:54

it. That makes a lot of sense.
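
A location-based rule of the kind discussed here could look like the following sketch, which maps a GPS position to a region and reads the right-on-red rule for it, with a posted sign always taking priority. The `Region` type, the region names, and the bounding-box coordinates are hypothetical; the conversation only confirms that the car uses GPS and can read the sign.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float
    right_on_red_allowed: bool

    def contains(self, lat: float, lon: float) -> bool:
        """True when the position falls inside this latitude/longitude box."""
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


def right_on_red_allowed(lat, lon, regions, sign_says_no=False, default=False):
    """A posted 'no right on red' sign wins; otherwise use the region's rule."""
    if sign_says_no:
        return False
    for region in regions:
        if region.contains(lat, lon):
            return region.right_on_red_allowed
    return default  # unknown location: fall back to the cautious default


# Illustrative boxes only; these are not real jurisdiction boundaries.
REGIONS = [
    Region("region_a", 30.0, 40.0, -125.0, -115.0, True),
    Region("region_b", 40.0, 45.0, -80.0, -70.0, False),
]
print(right_on_red_allowed(37.77, -122.42, REGIONS))
```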

27:56

>> So, so you see we had the green light,

27:58

but this guy decided to go. So, we

27:59

yielded for him, right? So, even though

28:01

humans can be bad actors, right? Now, we

28:03

can go. Nobody else is going. Great. We

28:05

can make this turn.

28:06

>> Yeah.

28:07

I don't know what's worse, me behind the

28:09

driver's seat of a self-driving car or

28:11

me being a pedestrian once self-driving

28:14

cars are on the road. You know,

28:16

>> I'll be a bad actor either way.

28:18

>> That's humans.

28:23

>> Um, what are some typical confusers that

28:27

um sometimes can make the car misbehave?

28:29

So, for here's what I'm specifically

28:31

asking. Right now, we're on an incline

28:33

and there's like low hanging wires

28:35

directly. I know not directly in front

28:37

of the car, but because our

28:39

nose is pointed up, we're seeing things

28:41

that would seem to be unusual to

28:43

sensors, right?

28:44

>> Yeah. So, to kind of the earlier part

28:46

about like camera degradation, right, we

28:48

know to prioritize things that are

28:49

closer to the ground, right? So, what we

28:51

can see where the car actually is going

28:52

to drive, right? So, if we see something

28:54

weird like the wires above, right, we

28:55

can say, "Hey, that probably doesn't

28:57

have anything to do with where we're

28:58

driving, right? We can ignore, you know,

29:00

these cables above us, right? Those

29:02

aren't lane lines, right? Those aren't

29:03

railroad tracks, right? Those are just

29:04

cables," right?

29:05

>> So with enough data, right, you can

29:07

learn to deprioritize certain regions

29:09

of what the car can see and then also

29:11

what it is seeing.

29:12

>> Does the car care like does the car

29:13

understand when it's like how does the

29:16

car understand when it's on an incline

29:18

and it needs to look like further like

29:20

closer to the ground?

29:21

>> So you'll see actually uh on this route

29:24

we also have some of the nice classic

29:25

San Francisco hills, right? So the car

29:27

can tell that there is occlusion there,

29:29

right? So maybe it needs to be a little

29:31

bit more cautious when dealing with

29:32

all-way stop signs in that scenario

29:33

because you might not be able to see

29:34

someone right who's coming. So uh it has

29:37

this understanding of hey I can't see

29:40

right I can tell there's gradient here

29:42

let me be uh a little bit more cautious

29:44

drive a little bit more slowly

29:45

>> right like we also see a section where

29:47

the speed limit will be 25 miles an hour

29:49

right but the car actually will slow

29:51

down because at 25 mph it feels very

29:53

fast on a steep hill in San Francisco

29:55

right so the car will naturally slow

29:56

itself down

29:57

>> and again that also comes with kind of

29:59

the end-to-end model where you get enough

30:01

diverse driving data right it'll learn

30:03

that humans may naturally slow down even

30:06

though the speed limit may be higher,

30:07

right? If it's a narrow road or a steep

30:09

road, we naturally will slow down. The

30:11

car will also learn that behavior.

30:13

>> That makes a lot of sense. And I guess

30:14

one thing I just thought of is some of

30:16

the cameras, I believe, are mounted

30:18

higher than us, right? Like they're

30:20

mounted up here. So things that seem

30:22

invisible to me, the camera might still

30:24

be able to see over it. Right.

30:25

>> Exactly. So Right. And it has multiple

30:27

cameras and it has a radar, right? So it

30:28

can compare the position, what it can

30:30

see from each of those and try to

30:31

determine, okay, is there something

30:33

there? Right. Should I be cautious?

30:34

what's happening here?

30:36

>> What do you do when um there's a

30:37

situation where one sensor says one

30:39

thing and another says another? So, for

30:41

example, a highly reflective surface

30:43

to the radar, but that doesn't

30:45

show up like on the cameras,

30:47

>> you know?

30:47

>> Yeah. So, it'll do a comparison, right?

30:49

And we can assign like confidence

30:50

percentage to different things, right?

30:52

So, we can say, hey, we are not sure

30:54

what this is, right? So, we we have what

30:56

we call multi-sensor fusion, right? And

30:58

then it can choose what to weight and what

31:00

to prioritize based on how confident

31:03

it is on what it's detecting.
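
The weighting idea described here can be sketched as a confidence-weighted average across sensors. This is a deliberately simple stand-in: the function name, the weights, and the example values are assumptions, and the real fusion method is not described in the conversation.

```python
def fuse_confidences(detections):
    """Combine per-sensor detection confidences into one score.

    detections: list of (confidence, weight) pairs, where confidence is in [0, 1]
    and weight reflects how much that sensor is trusted for this kind of object.
    """
    total_weight = sum(weight for _, weight in detections)
    if total_weight <= 0:
        raise ValueError("at least one sensor needs a positive weight")
    return sum(conf * weight for conf, weight in detections) / total_weight


# Hypothetical case: radar reports a strong return from a reflective surface,
# but the cameras, weighted more heavily here, see almost nothing.
radar = (0.9, 1.0)
camera = (0.1, 3.0)
print(round(fuse_confidences([radar, camera]), 3))
```

With these assumed weights the fused score stays low, so the disagreement is resolved in favor of the sensor considered more reliable for that kind of detection.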

31:05

>> Got it. And like does that multi-sensor

31:08

fusion think about like the resolution

31:11

of like cameras versus the radar like

31:14

>> Yeah. So it can it knows obviously we

31:15

know the spec uh of what the camera is,

31:17

what it can see, right? So it's able to

31:19

determine, hey, what is the likelihood

31:21

that I think that there's something

31:22

here, right? And then you can see here

31:24

it looks like maybe it's trying to

31:25

consider a lane change for this guy

31:27

who's stopped who decided to stop. So

31:31

we'll wait for these cars to pass.

31:34

So you can see it's creeping forward.

31:36

All right. And it was able to make that

31:37

lane change to go around all these guys

31:38

that are stopped. Uh one thing that we

31:41

found that's really interesting with the

31:43

model is lane changing actually feels a

31:45

lot more natural. Uh with the classical

31:48

stack, right, you have to identify where

31:49

the gap is in traffic, right? So you're

31:51

calculating the velocity of the lead

31:53

car, the rear car, right? And also have

31:55

to determine where to position your own

31:57

vehicle, right? So getting that kind of

31:59

what we call the speed adapt phase where

32:00

the car is accelerating and slowing

32:02

down, right? You have both lateral and

32:04

longitudinal acceleration that you need

32:05

to consider, right, when making those

32:07

lane changes. Uh with the classical

32:09

stack, we found that sometimes it could

32:10

feel

32:11

>> a little bit more robotic or a little

32:12

bit more jerky, right? Whereas once we

32:14

started training the model on lane

32:15

change data, it felt really smooth,

32:18

right? where it's able to, you know,

32:19

gently slot itself into gaps in traffic

32:22

where it feels much more intuitive and

32:24

it feels much more humanlike because

32:25

it's able to, you know, more naturally

32:27

control both lateral and longitudinal uh

32:29

behavior.
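
The classical approach described above, finding a gap from the positions and speeds of the lead and rear cars, can be sketched as a time-gap check. The `Vehicle` type, the function name, and the 1.5 s and 5 m thresholds are illustrative assumptions, not values from the stack.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    position_m: float  # longitudinal position along the road
    speed_mps: float   # speed in metres per second


def gap_is_acceptable(ego, lead, rear, min_time_gap_s=1.5, min_distance_m=5.0):
    """True when the target-lane gap leaves enough room ahead of and behind the ego car."""
    front_gap_m = lead.position_m - ego.position_m
    rear_gap_m = ego.position_m - rear.position_m
    # Ahead, the ego car's own speed sets how much room it needs.
    front_ok = front_gap_m >= max(min_distance_m, ego.speed_mps * min_time_gap_s)
    # Behind, the following car's speed sets how much room it needs.
    rear_ok = rear_gap_m >= max(min_distance_m, rear.speed_mps * min_time_gap_s)
    return front_ok and rear_ok


ego = Vehicle(position_m=0.0, speed_mps=10.0)
lead = Vehicle(position_m=30.0, speed_mps=10.0)
rear = Vehicle(position_m=-25.0, speed_mps=12.0)
print(gap_is_acceptable(ego, lead, rear))
```

A hard threshold like this is one plausible reason a rule-based lane change can feel robotic: the decision flips the moment a gap crosses the line, whereas a learned model can blend speed adaptation and steering more gradually.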

32:31

Are there any surprises? Like, so that's

32:33

a good example of something where adding

32:35

in that next layer of data made the

32:37

driving experience noticeably and

32:39

significantly better, right? Less jerky,

32:41

more smooth. Um are there other examples

32:43

like that you can share where it's like

32:45

adding that extra layer of um training

32:48

really resulted in a clear change in

32:49

behavior?

32:50

>> Yeah, a big one that was a challenge for

32:51

us is handling like double parked

32:53

vehicles, right? So you have a car

32:55

stopped in front of you. You need to go

32:56

into your oncoming lane, right? And then

32:59

come back into your original lane,

33:00

right? So you have to detect the

33:02

distance between you and the double

33:03

parked car and then you have to also

33:05

check for any oncoming vehicles, right?

33:07

That can be really challenging because

33:09

if it's something narrow like a bicycle

33:11

or motorcycle, right, or something coming

33:13

really quickly, right? That can be a bit

33:14

of a challenge to know when you should

33:16

decide to go versus when not to go,

33:18

right? But giving that uh a big data set

33:21

of human driving, right? The model we

33:23

found was actually uh much more natural

33:26

in its timing, right? So its decision

33:27

to go versus not go felt a lot uh more

33:30

natural and actually the maneuver

33:31

itself, the quality of the maneuver was

33:33

a lot better, right? And so that's

33:34

another one where we're really excited

33:36

to see, okay, here's this really

33:37

challenging scenario, right? But the car

33:40

is able to do it in a way that feels

33:42

natural and humanlike without giving you

33:44

like big jerks or harsh brakes because

33:46

it doesn't want to hit something. So

33:48

that was really nice.

33:49

>> How is the car itself at parallel

33:51

parking? Is that something like

33:52

>> Yeah, we can do parallel parking. We can

33:54

do perpendicular parking, right? Can do

33:56

angled spots, right? So, uh we also have

33:58

parking capabilities. We're looking to

34:00

add kind of ability to park within

34:02

parking structures and things like that.

34:03

So that's other products that we're

34:04

working on. Uh so it's exciting to see

34:08

how quickly it advances, especially with

34:10

uh the end-to-end models, right? We see

34:13

the rate of improvement is pretty quick.

34:15

And because it's sitting on top of that

34:17

classical stack, you still get the

34:19

safety of the classical stack, right?

34:20

But then you get all these improvements

34:21

for things like lane change or double

34:23

parked vehicles, all that sort of stuff,

34:25

you're able to quickly iterate on and

34:27

get the advantage of that end-to-end

34:28

driving behavior and the human-like

34:30

behavior.

34:32

That's really interesting. Oh, yep.

34:36

>> Yeah. So, it sees there's guys there

34:37

with cones, right?

34:39

>> And then it seems to like have

34:40

understood, hey, that cone actually

34:42

isn't in my lane. I'm just going to keep

34:44

going. No problem. Right. That was

34:45

really cool.

34:46

>> Yeah. So, you can see, hey, we'll slow

34:47

down. We see there's a guy. Is it going

34:48

to step in front of us? What's going to

34:49

happen? Right. Okay. No. Okay. I can go

34:52

ahead and proceed.

34:52

>> What is the So, like um how many times a

34:56

second does it make decisions? I know

34:57

it's like continuously, but you only

34:59

get, you know, I'm making this number

35:01

up, 60 frames a second from the cameras,

35:03

let's say.

35:04

>> So, like how often is it processing and

35:06

making those decisions per second?

35:07

>> Uh, I don't know the exact number off

35:09

the top of my head, but I can tell you

35:09

it's generating trajectories basically

35:11

every second, comparing that with the

35:12

the classical stack to determine, hey,

35:15

is this rational and is this safe? So,

35:17

we can follow up after and get you the

35:18

exact number if you

35:19

>> Sure. Yeah. Just generally,

35:20

>> just curious. Yeah. Um, when I I went to

35:24

the Q&A with Jensen and I got to ask him

35:26

a question and I asked him, you know,

35:29

with the advent of like OpenClaw and

35:30

Nemoclaw, what are you most what

35:32

application areas are you most excited

35:34

to see tackled with these new

35:35

technologies? And actually to my

35:36

surprise, he talked a lot about

35:38

self-driving and how agents and agentic AI

35:41

will sort of be infused into cars in the

35:43

future and help with that decision

35:44

stack. Can you speak to as somebody

35:46

who's like a little more on the ground,

35:48

can you speak to are how are you guys

35:50

thinking if you are at this point using

35:52

OpenClaw like how does that fit into

35:54

this larger picture if at all yet?

35:56

>> Yeah. But for you know my role no not

35:58

not yet right but uh some of our other

36:00

team members are using it to help you

36:01

know search for you know certain data

36:03

sets that we need right so if we're

36:05

looking we can use AIs to search for

36:07

construction workers that throw cones in

36:09

front of you right across all of the you

36:11

know data that we collect from all of

36:12

our fleet plus you know data that we get

36:14

from customers and partners right so

36:17

you're able to use it to train you can

36:19

use AI to find the data you need to

36:21

train your model right so

36:23

>> uh I don't work on the model directly

36:24

but, you know, my team members do,

36:26

right? There's different ways that we

36:27

use, you know, AIs in that way in order

36:29

to find the data that we need for those

36:31

corner cases that we're looking for to

36:33

help improve the performance.

36:35

>> Yeah, that makes a lot of So, it's

36:36

sometimes it's more about um using AI to

36:39

go through all this largely unstructured

36:41

data.

36:41

>> Correct. We use it to label the data,

36:43

right? So, we can say, hey, these are

36:44

cars, these are people, these are dogs,

36:45

right? These are cones, right? And then

36:47

we also can use it then to go capture or

36:50

collect the data that we need to train

36:51

the model in a certain scenario.

36:53

>> That makes a lot of sense. How do you um

36:57

determine when you need to take do

37:00

something new in simulation? Like do you

37:02

look at data first and then say hey we

37:05

don't really have enough of this kind of

37:06

data. Let's go simulate this case many

37:08

many more times or like what's the

37:10

process for deciding when to load

37:12

something up in Omniverse and just like

37:14

>> create a new scenario. Yeah. So we have

37:16

this concept we call like a functional

37:18

scenario tree right. So to give you an

37:20

example right let's take stopping at all

37:22

way stop signs right. So, okay, we know

37:25

for an all-way stop sign scenario, you

37:27

can go straight, you can turn left, you

37:28

can turn right. Now, do we have data

37:31

that covers all three of those

37:33

scenarios, right? Uh, yes, but maybe we

37:36

have a little bit less data, right?

37:37

Let's go mine for more data for

37:38

specifically all-way stop signs, right?

37:40

Uh, and then we say, okay, well, we also

37:42

realize that we need to consider, let's

37:44

say, two-way stops, right? So, add

37:46

another node to the the tree, right?

37:47

Okay, now we need two-way stop sign

37:49

scenarios, right? So as you expand upon

37:51

individual scenarios and use cases, you

37:53

then can layer on data on top to support

37:55

and continue to build out all kind of

37:57

the long-tail corner cases, right? So

37:59

>> the driving experience you're seeing

38:00

here, right,

38:01

>> I don't want to speak for you, but it's

38:02

pretty good, right? So the car is able

38:04

to handle construction and pedestrians

38:06

and things like that, right? But again,

38:08

as we go to different geographies,

38:11

different scenarios that maybe we

38:12

haven't encountered, right? We can

38:13

always continue to add more. And if we

38:15

what we do is we look at any new issues

38:17

that we find across the fleet all across

38:19

you know the globe right and we'll say

38:21

hey we've never seen anything like this

38:22

before. We haven't simulated this

38:23

before. Let's go find data that

38:25

supports this now one new issue that we

38:27

found and then we can expand it like

38:29

that. It's like here like we're driving

38:30

with cones in the middle of the road,

38:32

right? We don't know if there's people,

38:33

there's guys reversing, right? Let's

38:35

brake a little bit and confirm, right?

38:37

Okay, now we have this where we can use

38:39

this in the future, right? If you wanted

38:40

to, hey, we need data on driving past

38:42

cones, right? As an example,
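
The functional scenario tree described in this answer can be sketched as a tree whose leaves carry a count of available data clips, so that thin branches stand out as candidates for more data mining or simulation. The `ScenarioNode` type, the helper names, the clip counts, and the 200-clip threshold are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ScenarioNode:
    name: str
    clip_count: int = 0  # number of data clips covering this scenario (leaf nodes only)
    children: list = field(default_factory=list)

    def add(self, child: "ScenarioNode") -> "ScenarioNode":
        """Attach a child scenario and return it so calls can be chained."""
        self.children.append(child)
        return child


def under_covered_leaves(node, min_clips, path=()):
    """Return the paths of leaf scenarios that have fewer than min_clips clips."""
    current = path + (node.name,)
    if not node.children:
        return ["/".join(current)] if node.clip_count < min_clips else []
    found = []
    for child in node.children:
        found.extend(under_covered_leaves(child, min_clips, current))
    return found


def build_example_tree():
    """Build the stop-sign example from the conversation with made-up counts."""
    root = ScenarioNode("stop_signs")
    all_way = root.add(ScenarioNode("all_way"))
    all_way.add(ScenarioNode("straight", clip_count=500))
    all_way.add(ScenarioNode("left", clip_count=120))
    all_way.add(ScenarioNode("right", clip_count=480))
    two_way = root.add(ScenarioNode("two_way"))  # the newly added node
    two_way.add(ScenarioNode("straight", clip_count=0))
    return root


print(under_covered_leaves(build_example_tree(), min_clips=200))
```

Adding a node, such as the two-way stop, immediately surfaces it as under-covered, which mirrors the described loop of expanding the tree and then layering data on top.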

38:44

>> it's it's really interesting, you know,

38:45

like now that I'm really alert and like

38:48

keeping my eyes open for it. Driving's

38:50

hard, man. Like there's a lot of stuff

38:51

going on in the road that you kind of

38:53

take for granted when you're driving in

38:54

the moment because that's all you're

38:55

focusing on. But when you can take a

38:57

step back and just assess and be like,

38:59

"Oh yeah, we've been in a few crazy

39:00

situations already,

39:02

>> you know, it's really funny." And you're

39:04

right, the car has handled it very well.

39:06

Um, good segue into my next question.

39:08

What do you think, you know, is the

39:10

ultimate hard scenario for uh

39:13

self-driving. So like my thought, you

39:16

know, I imagine those videos or images

39:18

uh in India with those giant roundabouts

39:21

where it's just like people are merging

39:22

and weaving through each other. It's

39:23

just like it seems like pure chaos when

39:25

you're observing from above. Um, but

39:28

that's just my thought. What is actually

39:30

the hardest scenario to train for?

39:31

>> I think the way to think about it,

39:32

right, is is very similar, right? What

39:34

do you think would be difficult for a

39:35

human, right? And because a lot of the

39:37

driving behavior is trained off human

39:38

data, right?

39:39

>> What would humans struggle with, right?

39:40

Those are the things that we need to get

39:42

through. So whether it's, you know, very

39:44

dense traffic with scooters and people

39:46

walking in between with no lane lines

39:48

and no road markings, right? uh those

39:50

are the things that I think you know are

39:51

the kind of the longer tail, right,

39:53

solutions that you know we need to get

39:54

to where I think again with enough data

39:57

right I don't see why we wouldn't be

39:58

able to support you know handling

40:00

driving in different countries different

40:02

geographies right different weather

40:03

conditions right because you have enough

40:05

understanding of what good driving

40:07

behavior is right you can reinforce that

40:09

and make it so that way it can handle uh

40:11

better than a good human driver would

40:13

>> yeah no that makes a lot of sense um and

40:15

then I guess as a follow-up to that you

40:17

know one of the things that I'm noticing

40:18

is super important is lane markings,

40:21

right? Um how is this on dirt roads?

40:24

>> So it'll use the context from other cars

40:26

as well, right? To understand where

40:29

other people are driving, right? So it

40:31

doesn't Yeah. Yes. Obviously having lane

40:33

markings is nice, but uh there are

40:35

certain parts even in San Francisco

40:36

where the roads under construction where

40:37

there are no lane markings, right? So

40:39

the car is able to see, hey, this is

40:40

roughly the width of two lanes and I can

40:42

see the other cars are driving here.

40:44

Okay, contextually, this is where I

40:45

should drive.

40:46

>> Yeah, this must be where the lane would

40:47

be. Right.

40:48

>> Exactly. Right. So the the platform

40:50

itself, right, is an HD mapless

40:52

solution, right? So it's able to

40:54

understand context, right, and try to

40:57

figure out, okay, this is where the lane

40:59

should be, right? I should drive here,

41:01

>> right? Um, when you say fully

41:03

contextless, like one of the things I'm

41:05

also imagining is like no cell signal,

41:08

no online, like this can this is a fully

41:10

self-contained solution that doesn't

41:12

reach back over the internet or the

41:13

cloud to anything, correct?

41:14

>> Yeah. So this is all built uh, you know,

41:16

in the car, right? So this is a

41:18

production car. We've just flashed one

41:20

of our latest kind of software builds to

41:22

the car so that way we can enable all

41:23

these features, right? But the physical

41:25

hardware on the car is the same, right?

41:27

And there's no, you know, there is an

41:28

internet connection that we use to

41:29

upload data, but there's no streaming,

41:32

you know, to this car saying, "Hey,

41:34

here's what the latest map is of

41:35

what's going on in San Francisco."

41:37

>> Um, what about like you, this is maybe a

41:40

silly question, but we just passed a

41:42

handicap parking space. If I put a

41:44

handicap parking um you know thing here,

41:48

would it contextually understand, hey,

41:49

now I'm allowed to park in a handicap

41:51

space?

41:51

>> So, not yet. That's not something we

41:52

have yet. So, we're not looking at like

41:54

curb colors or anything quite yet, but

41:56

you know, those are some of the concepts

41:57

that some of my colleagues are working

41:58

on. That's, you know, exciting to see.

42:00

You know, hopefully, you know, we'll

42:01

roll some of those features out in the

42:02

future. Is there a so like broader than

42:05

like even that you know if I had

42:08

handicap put like I guess what I'm

42:09

really asking is like is there a button

42:11

I can push to say hey this vehicle is

42:13

allowed to park in certain special

42:15

spaces that would not otherwise be you

42:18

know handicap in this example but you

42:19

can imagine like any wide variety of

42:22

>> yeah it depends on what the the partner

42:24

is looking for right so you we can adapt

42:26

the stack to do any number of things

42:28

right so if someone wanted us to look

42:30

for you know identifying curb color,

42:32

right? And understanding that yellow

42:34

means temporary parking, but red means

42:36

no parking, right? Those are the types

42:37

of things that we can do and we can work

42:38

on with the partner in order to provide

42:40

that type of capability. Uh, which we

42:42

haven't done that yet here for this

42:43

case.

42:44

>> What about Sorry, I realize I'm asking a

42:46

similar question a few times. Um, you

42:48

know, in the robotaxi case, 15-minute

42:51

parking, I just want to drop

42:53

someone off and leave. Is the car smart

42:55

enough to say I can park there for 15

42:57

minutes? I really only need like one and

42:59

a half minutes, two minutes to drop off

43:01

my passenger. Is it

43:02

>> we can get to that point. Uh again, I

43:04

haven't worked on the robotaxi project

43:06

myself yet, but you know, we can read

43:08

street signs, right? So, we can

43:09

understand things like, you know, like

43:12

the no right on red, for example, right?

43:13

We can see and read the sign that says,

43:14

you know, the arrow through the uh the

43:17

sign and the

43:18

>> be fun. Sorry, not to interrupt you. I

43:20

apologize.

43:20

>> No worries. Right. So, we can see if we

43:22

can fit through this narrow space,

43:24

right? So, we stop. Right. So, that way

43:25

we don't get too close. Right. And you

43:27

can see we'll just kind of slip through.

43:29

>> Wow.

43:31

>> But again, so the double-parked vehicle

43:33

case I talked about, right? Yes. Like

43:34

that case would be really difficult

43:36

before we had these end-to-end models,

43:37

right? Whereas now we saw that she had

43:39

stopped. There's enough space, right? We

43:40

can fit. Okay, great. We'll go ahead and

43:41

take this gap here.

43:42

>> Well, what I'm realizing about myself is

43:44

like I'm a much more conservative driver

43:46

than even this computer is because in

43:49

that situation, I would have just

43:51

stayed, cuz I saw that car coming the

43:53

other way. I would have just stayed. So, I'm

43:56

learning two things. One, it's really

43:57

capable of making really fine

43:59

estimations of like, hey, I think I can

44:01

fit through there. I'm going to try. And

44:03

two, I need to be a much more aggressive

44:05

driver is what I'm really hearing.

44:06

>> The, you know, the cool thing about

44:08

this, right, is we've deployed different

44:10

models, right? So, we see some that are

44:12

very conservative, right? So, we get

44:14

feedback not only from my team, but also

44:15

our drivers and everybody else

44:16

testing the fleet where we say, "Hey,

44:18

this model seems to be getting stuck

44:19

more often, right?" Yeah. Sure. So we

44:21

try to find the balance of, you know,

44:23

getting stuck versus being assertive,

44:25

right? Where it gets to this kind of

44:27

nice middle ground where sometimes we'll

44:28

get stuck, sometimes we'll make the

44:30

pass, right? It just depends on, you

44:32

know, what the scenario is, right?

44:33

>> And is that um like a global

44:36

assertion about the model, hey, it's

44:37

globally aggressive, it's globally

44:39

conservative, or is it like, hey, this

44:41

model is really aggressive when it comes

44:42

to overtaking a vehicle, but it could be

44:44

conservative in other cases. Like, is

44:46

the whole model aggressive, passive,

44:48

conservative? >> You'll find that

44:50

different models will do different

44:52

things, right? So on average, we have

44:54

roughly seven new models that you know

44:56

we generate per day, right? And you'll

44:59

try different ones and not all of them

45:00

end up making the cut.

45:01

>> Sorry. Seven new models you generate per

45:02

day. Can you just speak a little bit to

45:04

the different like why seven? Why

45:07

>> just about what we can generate with uh

45:08

you know the GPU usage that we have and

45:10

everything to come up with new models,

45:11

right? But you can imagine these are

45:12

enormous data sets,

45:14

>> right? So we're always trying to improve

45:16

the driving behavior, right? So some

45:18

models may be more reactive to people

45:21

where it's too conservative, where every

45:22

time we see someone it might want to

45:23

brake, right? Okay, that model to us, right,

45:26

might not be as comfortable, right? Safety

45:28

might be the same, right? But the comfort

45:30

is less because it's oversensitive,

45:32

right? So there are different things like

45:34

that that we can deploy and we can test

45:36

based off different data sets that we

45:38

add to the model and then we can weight

45:40

and prioritize different data sets to

45:41

get the right mix of safety and comfort

45:44

that we're looking for. So like here

45:45

again we have another for our car right

45:47

that car is far away he's pulling over

45:49

we're able to easily drive around that

45:50

no problem right

45:52

>> um so yeah so there's different models

45:54

that we deploy and every day we test

45:56

different variants across our entire

45:58

fleet right where we can see in

45:59

different you know geographies right one

46:01

model might be really good in California

46:02

but struggles in Texas right so that's

46:05

also why we want to test in different

46:07

geographies uh in addition to the

46:09

simulation testing right we also always

46:11

want to do on-road testing as well just

46:12

to compare and make sure that we don't

46:13

see... This guy just ran his stop sign,

46:15

right? And no problem. The car gently

46:17

waited for him and then proceeded.

46:19

>> Yeah. No, got it. Okay. So, it's really

46:21

about generating variants of the same

46:23

model, comparing their outputs. Okay.

46:25

That that was going to be my next

46:27

question is like what are these seven

46:28

models even? But

46:29

>> yeah, so typically you'll see it will

46:31

start from a very similar base, right?

46:33

But over time that base will evolve and

46:35

get more capable, right? So, uh back to

46:37

my kind of 16-year-old analogy, right?

46:39

Uh you know, in the beginning, right,

46:40

the car might be a little jerky or a

46:41

little, you know, wobbly, right? Now you

46:43

can see the drive is quite smooth,

46:44

right? So over time you see the general

46:46

capability improves as it deals with

46:48

more scenarios, right? You can detect

46:49

this person standing here with this open

46:51

trunk, right? So let me slow down, let

46:53

me go around them and let me come back

46:55

to, you know, the center of my lane as

46:56

an example. >> Can you speak a little more

46:58

to that like 16-year-old comment like

47:00

two years ago the the average model was

47:03

like driving like an 11-year-old and two

47:05

years from now we expect it to drive like

47:06

a 25-year-old like you know can you help

47:09

me understand like the pace of evolution

47:10

of how good driving models are on

47:12

average? >> So this project, you can

47:15

say with these types of models it's been

47:17

a little bit over a year

47:18

>> Right, and this model was

47:21

about you know 2,300 or so models that

47:24

we've generated to get to this point
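
As a rough consistency check on the two figures quoted in this exchange (roughly seven new models per day, and about 2,300 models so far), the arithmetic can be sketched as below. Both inputs are approximations as spoken, and the assumption that the daily rate was constant is ours, not Armen's.

```python
# Rough arithmetic on the figures quoted in the conversation.
# Assumption (not stated in the source): a constant generation rate.
MODELS_TOTAL = 2300      # "about 2,300 or so models"
MODELS_PER_DAY = 7       # "roughly seven new models ... per day"

days_at_constant_rate = MODELS_TOTAL / MODELS_PER_DAY
weeks_at_constant_rate = days_at_constant_rate / 7

print(f"{days_at_constant_rate:.0f} days, about {weeks_at_constant_rate:.0f} weeks")
```

At a constant seven per day, 2,300 models would take about 329 days, slightly under a year. That is broadly in line with "a little bit over a year" if the rate was lower early in the project.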

47:26

>> right so

47:27

>> uh you know fortunately we have the you

47:29

know compute in order to generate and

47:31

process this data to create new models

47:32

but you can see today this driving is

47:34

very smooth very capable right it can

47:36

understand you know construction and

47:38

double-parked cars, right? So, you know,

47:41

that's what I'd say is you know pretty

47:43

good, right? It's a pretty good driver,

47:44

right?

47:44

>> For sure.

47:45

>> So, like here, right, we had that yellow

47:46

light flash really quick, right? We were

47:48

able to make it through no problem,

47:50

right? We see this guy coming at us on a

47:51

skateboard, right? There's no harsh

47:53

braking for that scenario, right?

47:55

There's no big swerves, things like that

47:57

where the car has learned to be very

47:59

smooth and predictable with its outputs

48:01

in terms of vehicle motion.

48:04

>> Incredible.

48:05

>> You see a nice unprotected left turn,

48:06

right? And we have this nice downhill,

48:08

right? Where the limit is 25 or 30 here,

48:11

right? But we're not going to

48:12

immediately try to jump up to the, you

48:14

know, the speed limit, right? We can see

48:15

there's a red light. We should just

48:16

gently come to this light here.

48:20

>> Yeah. And so we're on a pretty

48:22

steep incline. So the car understands

48:24

like, hey, even though it's at eye

48:26

level, what I'm looking at out there is

48:28

actually the horizon. It's not useful

48:30

information in the context of driving.

48:32

I'm just going to look at what's right

48:33

in front of me.

48:34

>> Yeah. But it can use that for context,

48:35

right? Like it can see the roofs of the

48:36

cars are disappearing, right? So, okay,

48:38

I know that there's a pretty steep

48:40

grade. let me uh you know kind of come

48:41

over here gently, right? Let me not

48:43

floor it to come over this hill, right?

48:45

Because you don't know someone could be

48:46

stopped there, right?

48:47

>> Um what what do you find are some of the

48:49

most um interesting driver behavior

48:52

changing features? Is it things like

48:54

elevation? Is it things like weather? Is

48:56

there another one like that that

48:58

surprised you like, oh wow, this sort of

49:01

scenario is surprisingly difficult based

49:04

on what I thought? >> Uh yeah, I would say

49:07

those two were definitely two of the

49:08

ones that, you know, surprised me at

49:10

first, right? I think uh when you get

49:12

into like really dense traffic with a

49:14

lot of, you know, bicyclists, people

49:16

weaving in and out and, you know, you

49:18

can imagine, you know, downtown San

49:19

Francisco at rush hour with like

49:20

delivery bikes and things like that,

49:22

>> right? So understanding when to be

49:25

assertive versus when to be

49:26

conservative, right, is really

49:28

interesting. To your earlier point, it

49:29

makes me really reflect on my own

49:31

driving, right? Right. And it's like,

49:33

okay, yeah, I would have done this. Or,

49:34

oh, okay, I didn't even see that guy.

49:36

Like, there's been times where I've been

49:38

at stop signs where I'm like, hey, come

49:39

on, car. Like, let's go ahead. Like,

49:41

let's start going. And then out of

49:43

nowhere, right, I see a guy cross the

49:44

street cuz he was behind a bush and I

49:45

couldn't see him, right? But the car was

49:47

able to see the motion there and say,

49:48

hey, there's someone there. I'm not

49:49

going to go.

49:50

>> Uh, so yeah, it's a lot of fun

49:52

seeing those types of things where,

49:54

okay, here's where I maybe need to

49:56

reconsider how I approach driving.

49:58

>> For sure. Yeah, for sure. Um, one of the

50:00

things that, um, constantly is a

50:03

scenario that I deal with back home is,

50:05

you know, we have ambulances or

50:07

emergency services and you hear them way

50:09

before you see them. And so the culture

50:11

in Florida is, you know, when you kind

50:14

of get the sense of the flashing lights

50:16

around you, you pull over to let them

50:17

pass.

50:18

>> How does that work when you're

50:20

self-driving? Like, does the car

50:21

understand, hey, I hear the siren or is

50:23

it looking for the lights or what's the

50:25

>> No. So, not quite yet. So, that's where

50:26

for like an L2++ product, right? That's

50:28

where the driver can take over and pull

50:29

over in that case,

50:30

>> right? But those are kind of the longer

50:32

tail things that we're looking to add

50:33

for level four, right? Where the car

50:35

needs to understand, hey, I need to pull

50:37

over for safety vehicles and things like

50:39

that. So,

50:40

>> uh that's where we see kind of the the

50:41

next jump coming in in the future.

50:44

>> How does it deal with like um even finer

50:47

like

50:49

I'm using stranger because I just don't

50:50

have a better word to say it, but like

50:52

you know there's a police officer in the

50:53

middle of the road. He's directing

50:55

traffic with his hands, not a sign. very

50:57

small, very hard to see, maybe you know

51:00

the color blends in with maybe the

51:02

background or his uniform or whatever.

51:04

If he waves you on like this, can the

51:06

car see that and understand to do that

51:07

or is that an intervention?

51:09

>> So sometimes we can do that, sometimes

51:11

we can't, right? So that's one where

51:12

you know we still need to refine that

51:13

part of the model behavior, right? So in

51:15

this case, right, you can still leave

51:17

the system engaged but tap the gas,

51:18

right? And the car says, "Okay, the

51:19

driver's giving me this input, right?

51:20

Okay, now I can proceed." Right? So we

51:22

can see that there's a person there. So

51:24

obviously don't drive towards the

51:25

person, right? But because the driver has

51:27

given the confidence of, "Hey, go ahead and

51:29

proceed," right, then the car can

51:30

resume control.

51:32

>> That's interesting. So it's not

51:33

really a the driver took over in the

51:35

sense that he's going to get past that

51:37

scenario. It's simply a I'm tapping the

51:40

gas to let you know it's okay for you to

51:42

move forward and then you're like

51:44

putting the decisions of what to

51:45

actually do back in the car's hands, so

51:47

to speak.

51:47

>> Exactly. Right. So it's kind of like a

51:50

you know you're working together with

51:51

the car, right? So, you know, even let's

51:53

say there's a double parked car and if

51:55

it gets a little too close, a little

51:56

stuck, right? You know, there's times

51:57

where I can just give a little bit of

51:58

steering input to say, "Hey, I can see

52:00

around. It's all good." Right? Then the

52:01

car then will control the accelerator to

52:03

then pull out, right? So, it's very much

52:05

you're working with the model to drive,

52:07

right? And the car said, "Okay, you're

52:09

giving me this input, right? Okay,

52:10

great. Let's go ahead and move forward."
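
The cooperative behavior described in this exchange, where a light pedal or steering input acts as a confirmation rather than a takeover, can be sketched as a small decision function. This is a simplified illustration under our own assumptions: the names `Action`, `next_action`, and the three boolean inputs are hypothetical and do not describe NVIDIA's actual interface or logic.

```python
from enum import Enum


class Action(Enum):
    HOLD = "hold"
    PROCEED = "proceed"


def next_action(person_in_path: bool, model_confident: bool, driver_nudge: bool) -> Action:
    """Hypothetical sketch: the system stays engaged throughout.

    - A person directly in the planned path always means hold
      ("obviously don't drive towards the person").
    - If the model is confident on its own, it proceeds.
    - If the model is unsure, a driver nudge (tap on the accelerator or a
      little steering input) is treated as permission to proceed, after
      which the car keeps planning the motion itself.
    """
    if person_in_path:
        return Action.HOLD
    if model_confident or driver_nudge:
        return Action.PROCEED
    return Action.HOLD


# Officer waving the car on: the model is unsure, so it holds until nudged.
print(next_action(person_in_path=False, model_confident=False, driver_nudge=False))
print(next_action(person_in_path=False, model_confident=False, driver_nudge=True))
```

The design point is that the nudge only resolves uncertainty; it never overrides the check for a person in the path.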

52:12

>> So, that's really interesting. I

52:14

didn't expect it to be so cooperative.

52:17

One of the things that's uh ahead of us

52:18

right now is a school bus. And school

52:20

buses have this magic ability to

52:22

materialize a stop sign out of nowhere,

52:24

right? How does the car react when like

52:26

all of the sudden it sees a stop sign

52:28

that wasn't there just a few seconds

52:30

ago, but it's already made its

52:31

trajectory. It's made its plan.

52:33

>> So, we can identify different types of

52:34

vehicles, right? So, we have like a

52:36

school bus classification. We have

52:37

regular cars, trucks, right? So, we can

52:39

identify different types of vehicles.

52:41

>> Oh, so sorry, not to interrupt you, but

52:43

not just a bus. You can tell that's a

52:45

school bus. >> And part of the school bus

52:46

entity is the stop sign. So then if the

52:48

stop sign's out, we can detect there's a

52:50

stop sign there, right? And we can see

52:52

the flashing lights as well. So using,

52:53

okay, we see a stop sign, we see the

52:55

flashing lights. Okay, we should stop

52:56

and wait here.

52:57

>> Got it. Okay. So it's at that level.

52:59

It's not like, hey, I know what a bus

53:01

is, and then I know some buses are also

53:03

school buses. And one of their magic

53:05

powers, for lack of a better word, is to

53:07

put out a stop sign.

53:08

>> Yeah. So we can see Yeah. We'll classify

53:10

as a bus, but then we see bus plus stop

53:12

sign plus flashing lights. Okay. Yeah.

53:13

Stop.
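
The cue combination Armen describes here (bus, plus stop sign, plus flashing lights, means stop) can be sketched as a simple conjunction over detections. The `Detection` type, its field names, and the class labels are hypothetical placeholders for illustration. The system described in the drive relies on learned models, so this is only a reading aid, not its implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    vehicle_class: str        # hypothetical labels: "car", "truck", "bus", "school_bus"
    stop_sign_visible: bool   # a deployed stop arm is detected on the vehicle
    lights_flashing: bool     # flashing lights are detected on the vehicle


def should_stop_for(d: Detection) -> bool:
    """Stop only when all three cues agree: bus + stop sign + flashing lights."""
    is_bus = d.vehicle_class in {"bus", "school_bus"}
    return is_bus and d.stop_sign_visible and d.lights_flashing


print(should_stop_for(Detection("school_bus", True, True)))    # all cues present
print(should_stop_for(Detection("school_bus", False, False)))  # bus driving normally
```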

53:13

>> Got it. Thanks. That's that's very

53:15

clear. Um there's an object in So we're

53:18

looking at a Waymo, right? Obvious to

53:20

the GoPros and stuff, but what some of

53:22

the sensors might see is like something

53:23

that's spinning in the road. Does it

53:26

care about that? Does it not? It

53:27

understands the larger context. Walk me

53:29

through.

53:29

>> So that it would just see this. It would

53:31

just see this is a car, right? And

53:32

there's things moving on it, right? But

53:34

I can tell this is a car. So either wait

53:36

for it to clear my path, right? Or, you

53:39

know, go ahead and proceed.

53:39

>> Beyond that, it doesn't matter. It's

53:41

like, oh, whatever strange features it

53:42

might have are just strange features.

53:44

>> Yeah. It doesn't behave differently for a

53:45

different take by hand.

53:47

>> Okay. Thank you, George. Yeah. Just to

53:48

take us back to keep us on our schedule

53:49

here. Thank you. Sure.

53:51

>> Yeah. But, uh, as mentioned, right? So,

53:53

you can see George's braking is a little

53:54

bit more firm than, uh,

53:56

>> our car is, right? So,

53:57

>> what else what else should we know about

53:59

like the car, the capabilities, like,

54:02

you know, I tried to ask as many

54:04

questions as I could, but what's

54:06

one thing maybe I missed that you'd love

54:07

to like share with my audience? >> I think

54:09

what's really exciting is that we could

54:11

take this architecture, right? And we

54:12

can scale it up or scale it down, right?

54:14

So this is like our, as I mentioned, our

54:15

level two plus experience, right? But

54:17

we're scaling this up to that level four

54:19

experience, right? So whether that's a

54:21

robotaxi or even consumer-grade level

54:23

four, right? We're flexible enough where

54:25

we can adapt to whatever the partner is

54:27

looking for, right? So I think what I'm

54:29

really excited about is a future where,

54:31

you know, I have my own car that I buy.

54:33

I want to go drive the beautiful twisty

54:35

road. Great. But then when I'm commuting

54:36

home from work, I can let the car do

54:38

everything, right? And have that be a

54:39

level four experience.

54:40

>> And um like no, off the record, no one's

54:43

holding you to it, you know, I'm just

54:44

curious, do you expect to be able to do

54:46

that in 3 years, 10 years, like

54:48

>> Yeah. I think what we see with, you

54:50

know, Uber, right, we're going to launch

54:51

in San Francisco and California by next

54:53

year, right, for uh our robotaxi

54:55

initiatives, right? So we're going to

54:57

have to figure out how to make sure we

54:58

make it work. And I'm excited to see

55:00

that. So I think it's coming much sooner

55:01

than we think.

55:02

>> That's really, really exciting. That's

55:05

awesome. Um, anything else you want to

55:07

say

55:07

>> for me?

55:08

>> Dude, that was really awesome. We saw a

55:10

lot of really cool scenarios that I

55:13

never really expected or really thought

55:15

about how much and how fatiguing driving

55:17

can be, right? Like I imagine if I was

55:19

driving that,

55:19

>> yeah,

55:20

>> I'd be pretty tired. You know what I

55:21

mean? That was a lot of stuff we went

55:23

through.

55:25

>> It's interesting. Uh, so again, like I

55:27

mentioned, uh, yeah, I work with the

55:29

product team here. Uh, you know, you guys

55:31

are some of the first to see it that are

55:32

non-Nvidia employees, right? What did

55:34

you guys think? Right. You know, I'm

55:35

curious.

55:36

>> I think it was really, really smooth.

55:38

I was really impressed with how smooth

55:39

it was. And I was really impressed with

55:41

some of the decisions it made

55:43

>> versus the decisions I would make. I do

55:45

consider myself a very good driver.

55:46

>> Everybody does.

55:47

>> And

55:50

then this really made me reconsider

55:51

like, oh, there are certain situations,

55:53

you know, I made the joke earlier, but I

55:55

can and probably should be more

55:56

aggressive just because that

55:58

actually is the safer option, like to

56:00

clear this lane sooner or whatever, for

56:02

example. And there are situations where

56:04

maybe I would have misjudged it because,

56:06

you know, my two cameras are in a fixed

56:09

position in the middle of the car, but

56:11

this has multiple cameras, including

56:14

blind spots I can't see and above me

56:16

where it can see like horizons, right? I

56:18

think those things put together really

56:21

I'm impressed by the difference between

56:24

its behavior and my behavior. Yeah. You

56:26

know,

56:26

>> it's really fun the first time you drive

56:28

it, right, where you're sitting in

56:30

George's seat, right? Where you're like,

56:31

"Okay, let's see what the car does,

56:32

right?" And after a couple minutes,

56:34

you're like, "Oh, this is as good, if

56:36

not better than I am." Right? And then

56:38

you get to those really interesting

56:38

scenarios like, "Okay, I would have done

56:40

this, right? But the car did this."

56:41

You're like, "Huh, okay. Yeah, that

56:42

actually makes sense." So, it's it's

56:44

kind of fun because it makes you look at

56:45

driving in a very different way.

56:46

>> Yeah, there's a feedback. One

56:48

thing I am curious to see is

56:50

like, "All right, you've rolled out

56:51

self-driving. It's been out for a while.

56:53

Some people obviously still choose to

56:55

manually drive their cars on occasion.

56:57

As you keep collecting that data, are

56:59

people on average becoming better

57:01

drivers because they see computers that

57:03

are doing better things? You know what I

57:05

mean?

57:05

>> It'll be interesting to see, right? The

57:06

next couple years, it'll be really

57:07

interesting to see.

57:08

>> Yeah.

57:08

>> And then I have one question.

57:09

>> Yeah, sure.

57:10

>> So,

57:11

>> is there equal priority or any

57:13

difference in weight in goals of safety

57:17

versus efficiency on the road of like

57:20

what the

57:21

>> goals of automation might be?

57:23

>> Yeah.

57:24

>> Increased safety, increased efficiency,

57:26

all of the above? Equal weighting?

57:27

>> Safety is always the highest weight,

57:29

right? We want to make sure we're not

57:30

driving into people or driving into other

57:32

cars, right? So, that's always

57:34

>> Yeah. Always always takes priority,

57:35

right? And that's why we're really

57:36

excited that we have that classical

57:38

stack there always to make sure the

57:39

model doesn't do anything we don't want

57:40

it to do,

57:41

>> right? But then after that, then comes

57:43

that comfort and efficiency, right?

57:44

Where nobody wants to get stuck in

57:46

traffic, right? Doesn't want to get

57:47

stuck in the lane. So, that's where

57:48

having a big model is really helpful

57:50

because then you're able to get that

57:51

human decision-making. It's like, "Okay,

57:53

let me go into this lane and, you know,

57:55

we can get over here where we're not

57:56

blocking traffic. We're not stuck here,

57:58

right? We can see the lane is closed."

57:59

So, that's where it's really

58:00

powerful.
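
The priority order stated in this answer (safety first, enforced by the classical stack, then comfort and efficiency) can be sketched as a filter followed by a cost comparison. Everything named here is an assumption for illustration: the `Candidate` fields, the `delay_weight` value, and the idea of scoring discrete candidate maneuvers are ours, not a description of NVIDIA's planner.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    passes_safety_check: bool  # hypothetical verdict from the classical safety stack
    discomfort: float          # hypothetical score: higher means harsher braking or swerving
    delay_s: float             # hypothetical estimate of time lost, in seconds


def pick(candidates: Sequence[Candidate], delay_weight: float = 0.1) -> Optional[Candidate]:
    """Safety is a hard filter; comfort and efficiency are traded off only among safe options."""
    safe = [c for c in candidates if c.passes_safety_check]
    if not safe:
        return None  # nothing safe to do: the caller should hold or hand back to the driver
    return min(safe, key=lambda c: c.discomfort + delay_weight * c.delay_s)


options = [
    Candidate("squeeze past too close", passes_safety_check=False, discomfort=0.1, delay_s=0.0),
    Candidate("wait behind closed lane", passes_safety_check=True, discomfort=0.0, delay_s=60.0),
    Candidate("change lanes smoothly", passes_safety_check=True, discomfort=0.5, delay_s=5.0),
]
print(pick(options).name)
```

Because the unsafe option is removed before any scoring, no amount of comfort or time saved can make it win. That mirrors the statement that safety "always takes priority".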

58:01

>> Okay,

58:02

>> this was awesome, man. Thank you so much

58:03

for your time.

58:04

>> Thank you.

58:04

>> Great to meet you guys.

58:05

>> Yeah, thank you so much. I mean, your

58:07

driving was fantastic.

58:10

>> A few things stood out to me after

58:11

spending an hour with Nvidia's L2++

58:14

powered Mercedes. The car had to react

58:16

dynamically to LA traffic. Lane merges,

58:19

sudden cut-ins, construction zones, and

58:22

unpredictable pedestrians. And it

58:24

handled everything with a smooth

58:25

precision that you don't really get with

58:27

human drivers. For investors, the bigger

58:29

takeaways might be in Armen's

58:31

commentary: how Nvidia's Drive OS and

58:34

perception stack are working together,

58:36

the specific in-car capabilities they

58:38

unlock, and how this architecture will

58:40

let OEMs like Hyundai and platforms like

58:43

Uber tailor the driving experience

58:45

across level two, level three, and level

58:47

four autonomy over the next couple

58:49

years. A huge thank you to the entire

58:51

Nvidia team for flying us out to

58:54

California, for supplying us with press

58:56

passes for GTC, and for making this test

58:58

drive possible, and to Armen for

59:01

answering my non-stop questions. And of

59:03

course, thank you for watching and

59:05

supporting the channel. Without you, I

59:07

wouldn't get opportunities like this in

59:09

the first place. And if you want to see

59:11

what else I learned at Nvidia GTC and

59:13

what I'm investing in, check out this

59:15

video next. Or if you want more science

59:18

behind the stocks, then this video is

59:20

for you. Either way, thanks for watching

59:22

and until next time, this is Ticker

59:24

Symbol: YOU. My name is Alex, reminding you

59:27

that the best investment you can make is

59:30

in you.

Interactive Summary

This video features a real-world, unedited one-hour test drive in downtown Los Angeles using a Mercedes vehicle equipped with Nvidia's L2++ autonomous driving platform. The narrator rides with Armen Connie, a senior product manager at Nvidia, to discuss the system's technical design, including the use of cameras, radar, and ultrasonics without LiDAR. They explore how the system handles complex urban scenarios like construction zones, pedestrian interactions, and gridlock, while highlighting the 'world model' that synthesizes sensor data to inform decision-making. The discussion also covers the roadmap from L2++ to L4, the role of end-to-end training models, and the collaboration between the new AI stack and a traditional safety-backup stack.
