
Sergey Brin: Lessons from Google Glass + Why Every Computer Scientist Should be Working on AI


Transcript


[0:00] Interviewer: It's a race. It's a race to develop these systems. Is that why you came back to Google?

[0:06] Sergey Brin: I mean, as a computer scientist, it's a very unique time in history. Honestly, anybody who's a computer scientist should not be retiring right now; they should be working on AI. That's what I would just say. There's just never been a greater problem and opportunity, a greater cusp of technology. So I wouldn't say it's because of the race, although, to clarify, we fully intend that Gemini will be the very first AGI. But to be immersed in this incredible technological revolution, it's unlike anything. You know, I went through sort of the web 1.0 thing, and it was very exciting; we had mobile, we had this, we had that. But I think this is scientifically far more exciting, and I think ultimately the impact on the world is going to be even greater. As much as the web and mobile phones have had a lot of impact, I think AI is going to be vastly more transformative.

[1:16] Interviewer: So what do you do day-to-day?

[1:19] Sergey Brin: I think I torture people like Demis, who's amazing, by the way; he tolerated me crashing this fireside. I'm across the street pretty much every day, and there are people working on the key Gemini text models, on the pre-training, on the post-training, mostly those. I periodically delve into some of the multimodal work, Veo 3, as you've all seen. But I tend to be pretty deep in the technical details, and that's a luxury I really enjoy, fortunately, because guys like Demis are minding the shop. That's just where my scientific interest is: deep in the algorithms and how they can evolve.

[2:12] Interviewer: Okay, let's talk about the products a little bit, some that were introduced recently. I just want to ask you a broad question about agents, Demis, because when I look at other tech companies building agents, what we see in the demos is usually something that's contextually aware, has a disembodied voice, and is something you often interact with on a screen. When I see DeepMind and Google demos, oftentimes it's through the camera; it's very visual. There was an announcement about smart glasses today. So talk a little bit about, if that's the right read, why Google is so interested in having an assistant or companion that sees the world as you see it.

[2:54] Demis Hassabis: Well, it's for several reasons; several threads come together. As we talked about earlier, we've always been interested in agents. That's actually the heritage of DeepMind: we started with agent-based systems in games. We are trying to build AGI, which is a full general intelligence, and clearly that would have to understand the physical environment, the physical world around you. Two of the massive use cases for that, in my opinion, are a truly useful assistant that can come around with you in your daily life, not just stuck on your computer or one device. We want it to be useful in your everyday life for everything, so it needs to come around with you and understand your physical context. And then the other big thing is that I've always felt that for robotics to work, you sort of want what you saw with Astra on a robot. I've always felt that the bottleneck in robotics isn't so much the hardware, although obviously there are many, many companies working on fantastic hardware and we partner with a lot of them; it's actually the software intelligence that I think has always held robotics back. But I think we're in a really exciting moment now where, finally, with these latest versions, especially Gemini 2.5 and more things that we're going to bring in, this kind of Veo technology and other things, I think we're going to have really exciting algorithms to make robotics finally work and realize its potential, which could be enormous. And in the end, AGI needs to be able to do all of those things. So for us, and you can see we always had this in mind, that's why Gemini was built from the beginning, even the earliest versions, to be multimodal. That made it harder at the start, because it's harder to make things multimodal than text only, but in the end I think we're reaping the benefits of those decisions now, and I see many of the Gemini team here in the front row. They were the harder decisions, but we made the right decisions, and now you can see the fruits of that with all of what you've seen today.

[4:45] Interviewer: Actually, Sergey, I've been thinking about whether to ask you a Google Glass question.

[4:49] Sergey Brin: Oh, fire away.

[4:51] Interviewer: What did you learn from Glass that Google might be able to apply today, now that it seems like smart glasses have made a reappearance?

[5:03] Sergey Brin: Wow. Yeah, great question. I learned a lot. I definitely feel like I made a lot of mistakes with Google Glass, I'll be honest. I am still a big believer in the form factor, so I'm glad that we have it now, and now it looks like normal glasses and doesn't have the thing in front. I think there was a technology gap, honestly. Now, in the AI world, the things that these glasses can do to help you out without constantly distracting you, that capability is much higher. There's also the fact that I just didn't know anything about consumer electronics supply chains, really, and how hard it would be to build that, have it be at a reasonable price point, and manage all the manufacturing and so forth. This time we have great partners that are helping us build this, so that's another step forward. What else can I say? I do have to say I miss the airship with the wingsuiting skydivers for the demo. Honestly, it would have been even cooler here at Shoreline Amphitheater than it was up at Moscone back in the day. But maybe we should probably polish the product first this time, ready and available, and then we'll do a really cool demo.

[6:26] Interviewer: So that's probably a smart move.

[6:28] Demis Hassabis: Yeah. What I will say is, I mean, look, we've obviously got an incredible history of Glass devices and smart devices, so we can bring all those learnings to today, and we're very excited about our new glasses, as you saw. But what I've always been talking to our team, and Sham and the team, about, and I don't know if Sergey would agree, is that I feel like the universal assistant is the killer app for smart glasses, and I think that's what's going to make it work, apart from the fact that the hardware technology has also moved on and improved a lot. I feel like this is the actual killer app, the natural killer app, for it.

[7:04] Interviewer: Okay. Briefly, on video generation: I sat in the audience at the keynote today and was fairly blown away by the level of improvement we've seen from these models, and you had filmmakers talking about it in the presentation. I want to ask you, Demis, specifically about model quality. If the internet fills with video that's been made with artificial intelligence, does that then go back into the training and lead to a lower-quality model than if you were training just from human-generated content?

[7:38] Demis Hassabis: Yeah. Well, look, there are a lot of worries about this so-called model collapse. Video is just one thing; it applies in any modality, text as well. There are a few things to say about that. First of all, we're very rigorous with our data quality management and curation. We also, at least for all of our generative models, attach SynthID to them, so there's this invisible "AI actually made this" watermark that is very robust; it has held up now for a year, 18 months, since we released it. All of our images and videos are embedded with this watermark, which we can detect, and we're releasing tools to allow anyone to detect these watermarks and know that something was an AI-generated image or video. Of course that's important to combat deepfakes and misinformation, but you could also use it, if you wanted to, to filter out whatever was in your training data. So I don't actually see that as a big problem.
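The filtering Demis describes, detecting the watermark and excluding flagged items from a training corpus, can be pictured as a simple pre-processing pass. Below is a minimal sketch in Python; the detect_ai_watermark scorer is a hypothetical stand-in, not SynthID's actual API.

```python
from typing import Callable, Iterable, List

def filter_training_corpus(
    items: Iterable[bytes],
    detect_ai_watermark: Callable[[bytes], float],
    threshold: float = 0.5,
) -> List[bytes]:
    """Keep only items that do not appear to carry an AI-generation watermark."""
    kept: List[bytes] = []
    for item in items:
        # Hypothetical detector: returns the probability that `item` was
        # produced by a watermarking generative model (SynthID-style).
        score = detect_ai_watermark(item)
        if score < threshold:
            kept.append(item)
    return kept
```

The same scorer, thresholded the other way, is the kind of check behind the public detection tools Demis mentions, which flag media as AI-generated for end users.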

[8:34] Demis Hassabis: Eventually, we may have video models that are so good you could put them back into the loop as a source of additional data, synthetic data, it's called. There you've just got to be very careful that you're actually creating from the same distribution that you're going to model, that you're not distorting that distribution somehow, and that the quality is high enough. We have some experience of this in a completely different domain with things like AlphaFold, where there wasn't actually enough real experimental data to build the final AlphaFold. So we had to build an earlier version that then predicted about a million protein structures, and since it had a confidence level on each prediction, we selected the top three or four hundred thousand and put them back in the training data. So it's very cutting-edge research to mix synthetic data with real data, and there are ways of doing that. But in terms of the video-generation stuff, you can just exclude it if you want to, at least with our own work, and hopefully other gen media companies follow suit and put robust watermarks in as well, obviously first and foremost to combat deepfakes and misinformation.
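The AlphaFold recipe described above, predicting labels with an earlier model, keeping only the most confident predictions, and folding them back into the training set, is a form of confidence-filtered self-distillation. A minimal sketch under that assumption follows; the model interface and the keep_fraction value are hypothetical illustrations, not DeepMind's actual pipeline.

```python
from typing import List, Tuple

def build_augmented_training_set(
    model,                                  # hypothetical earlier-generation model
    unlabeled_inputs: List[str],
    real_examples: List[Tuple[str, str]],
    keep_fraction: float = 0.35,            # roughly "top ~300-400k of ~1M"
) -> List[Tuple[str, str]]:
    """Label unlabeled inputs with the model, keep only the most confident
    predictions, and mix them back in with the real training examples."""
    scored = []
    for x in unlabeled_inputs:
        # Hypothetical API: returns (prediction, confidence in [0, 1]).
        y_hat, confidence = model.predict_with_confidence(x)
        scored.append((confidence, x, y_hat))

    # Keep only the highest-confidence synthetic examples, mirroring the
    # confidence cut described for AlphaFold above.
    scored.sort(key=lambda t: t[0], reverse=True)
    n_keep = int(len(scored) * keep_fraction)
    synthetic = [(x, y_hat) for _, x, y_hat in scored[:n_keep]]

    # The new training set mixes real data with curated synthetic data.
    return real_examples + synthetic
```

The confidence cut and the upstream quality checks are where Demis's "same distribution" caveat applies: synthetic examples should add coverage without skewing the data the next model learns from.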

[9:38] Interviewer: Okay, we have four minutes and I've got four questions left. We now move to the miscellaneous part of my questions, so let's see how many we can get through, as fast as we can get through them. Let's go to Sergey with this one: what does the web look like in 10 years?

[9:51] Sergey Brin: What does the web look like in 10 years? I mean...

[9:54] Interviewer: Go. One minute.

Sergey Brin: Boy, I think 10 years, because of the rate of progress in AI, is so far beyond anything we can see, and not just the web. I mean, I don't know. I don't think we really know what the world looks like in 10 years.

[10:10] Interviewer: Okay. Demis?

[10:12] Demis Hassabis: Well, I think that's a good answer. I do think that in the nearer term the web is going to change quite a lot. If you think about an agent-first web, it doesn't necessarily need to see renders and things the way we do as humans using the web. So I think things will be pretty different in a few years.

[10:28] Interviewer: Okay. This is kind of an under-over question: AGI before 2030 or after 2030?

[10:37] Sergey Brin: 2030. Boy, you really kind of put it right on that fine line. I'm going to say before.

Interviewer: Before. Yeah. Demis?

[10:44] Demis Hassabis: I'm just after.

Interviewer: Just after. Yeah. Okay. No pressure, Demis.

[10:50] Demis Hassabis: Exactly. But I have to go back and get working harder.

Sergey Brin: I can ask for it; he needs to deliver it.

Demis Hassabis: Exactly.

[10:59] Sergey Brin: Stop sandbagging. We need it next week.

Demis Hassabis: That's true.

Summary

This video discusses the current state and future of Artificial Intelligence, focusing on Google's advancements with Gemini and other AI initiatives. Key topics include the rapid development in AI, the potential for AGI (Artificial General Intelligence), the evolution of user interfaces through AI-powered agents and smart glasses, and the implications of AI-generated content. The speakers emphasize the transformative impact of AI, comparing it to previous technological revolutions like the web and mobile. They also touch upon the challenges and learnings from past projects like Google Glass, and address concerns about model collapse due to AI-generated training data.
