
AGI: The Path Forward – Jason Warner & Eiso Kant, Poolside

Transcript

How many people here know what poolside is and does? Anyone? Anyone? Yeah. So, let's talk about that real quickly. Poolside exists to close the gap between models and human intelligence. That's literally it. That's what we're here to go do. We're building our own models from scratch to do this. We were founded two and a half years ago on the idea that next-token prediction was an amazing technological breakthrough, but that it needed to be paired with reinforcement learning to really make that leap. So that's what we've been doing for the past two and a half years. We're on our second generation of models now, Malibu agent. And instead of walking you through some slides and all that, we just thought, I don't know, let's show you what we're doing here. So, are you there?

>> I got you, Jason.

>> So as I said, you were supposed to see him here today, but our airline system kind of works sometimes, maybe. So he's stuck in California, but we thought we'd just walk you through some demos here today. So what you're looking at here is a very modern programming language that the government uses to run all the world's critical infrastructure, called Ada. Anyone familiar with Ada?

>> Yes. Yes. Okay. So everyone I saw put their hands up for Ada either has no hair or gray hair like me, so that should tell you what's going on here. So Eiso, why don't we figure out what's going on with this codebase?

>> Well, let's start by asking what the codebase is about.

>> That's great. And what you're seeing here is obviously our assistant in Visual Studio Code, backed by poolside agent, a model we trained from scratch using our proprietary techniques. And you can see what's going on here: kind of the stuff you expect from an agent. Obviously the form factors of all of these things are going to change a couple of times over the next couple of years, but people seem to like VS Code, so we're going to show you this demo here today. You can see from this that it went through and told you what this codebase is all about. These things run in our satellites, and I don't know anything about Ada, but I do know a lot about a couple of other programming languages. So, Eiso, what do we want to do here? Why don't we see what this thing might look like in Rust?

>> Let's do it. Let's ask it to convert this codebase to Rust.

>> So obviously you're going to see what's going on here. Again, if you've used other tools, you're not going to expect too much of a difference in what's happening here, except that, again, we're backed by our own model. We're not using OpenAI. We're not using Anthropic. This is poolside. And poolside is a bottom-to-top stack that, right now, no one's touched — and I know no one in this room has touched this — unless you work for a three-letter agency, a defense contractor, or you've sent missiles somewhere that we're not going to talk about in this session. Because that's where we're working. We've been working in high-consequence code environments for the last year, inside the government and the defense sector, as you can see from this demo. So what you see here is it going through, doing the conversions. What you see in the middle pane is something we built to show you, as the streams come through, all the different changes that are happening. One of the tricky parts about working inside the defense sector and things like that is you can't have an agent that's just going to run around and do stuff. I mean, I can't walk into half of these buildings. You can't give an agent access to these data sources and just say, "Hey, go nuts." You need to have the right permissions. You've got to really ratchet these things down to do things inside those environments that they feel comfortable with. So, where are we on this now? What — is it trying to fix itself yet?

>> Yes — it wrote about 1,152 lines of code, and it just popped up a command to start and test it. Excuse me. So we see here all the files on the left-hand side that it created. This is essentially our live diff view that's available. And as we see, it's currently starting to actually test it out.

>> So this is the part where we just sit here and watch this for 3 minutes and I see nothing.

>> No. What you see — the good thing is that this is very fast inference.

>> Yes.

>> So, 1,100 lines of code.

>> Did it — task completed?

>> Do we know if this works yet?

>> Well, let's have a look. So it actually wrote some shell commands to test it. And when we check out the output of those, this actually looks pretty good.

>> We ask it —

>> Can we verify that?

>> — to run it? Let's go verify it. So of course our agent came back and gave us a summary of what it did. But let's just ask it how to run this.

Okay. So, I'm going to open it up now. It says this is how I can run the Ada version, and this is how I can run the Rust version. Let's run the Rust version.

Perfect. Let's have a look at — we might be hitting an actual —

>> An actual demo bug.

>> Let's have a look.

>> Let's see what happens.

>> I know. No, no. Just warnings. Just warnings.

>> Do we have an unwrap in there that we need to take care of? I heard those things are dangerous.
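(The unwrap jab is a real Rust concern: `.unwrap()` panics the moment a `Result` turns out to be an error, so generated code littered with unwraps can crash at runtime instead of reporting failures. A minimal illustrative sketch of the difference — not code from the demo:)

```rust
// Illustrative only: why ".unwrap()" is risky in generated code.
fn parse_id(input: &str) -> Result<u32, std::num::ParseIntError> {
    input.trim().parse() // propagate the error instead of panicking
}

fn main() {
    // let id: u32 = "oops".parse().unwrap(); // this line would panic
    match parse_id("42") {
        Ok(id) => println!("id = {id}"),
        Err(e) => eprintln!("bad id: {e}"), // handled, no crash
    }
}
```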

>> So, right now there's a REPL. Let's hit help and see what we're able to do. So, it looks like we have a set of commands. I'm going to be lazy: I'm going to copy-paste these queries. So, CREATE TABLE users. Okay, so far so good. Let's insert a record. Okay, well, let's find out if it actually did its job: SELECT * FROM users. Okay, we've got a record here.
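(The exchange ran roughly like the session below; the prompt, schema, and values are illustrative guesses, since the demo's actual queries weren't captured in the transcript:)

```
db> CREATE TABLE users (id INT, name TEXT);
ok
db> INSERT INTO users VALUES (1, 'jason');
1 row inserted
db> SELECT * FROM users;
1 | jason
```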

>> That's nice.

>> Now, now I want to actually — you see, if I use the up arrow, it doesn't actually allow me to cycle through commands. Let's ask it to add a feature: allow me to use the up arrow to cycle through. I think it will understand my intent.

>> The one thing we know about Eiso is he actually does know how to read and write, but he can't type. So, all those errors that you're seeing in there — yeah.

>> So it looks like the agent's identified a package that we can use. Let's just quickly look here and compare this to the original one. And it looks like it's adding a library called rustyline and changing the files accordingly. It's built it, and it looks like the build output is successful. There are some warnings; we'll ask it to clean those up later on. And it's now starting to test it. Okay, apparently it works. It wrote itself a little bash script to test the history. It wrote itself a little final demo script. So let's let it — okay. So, it gave us the summary. Well, now how do I rerun this? I do kind of know that, though. So, let's just —

>> You should know that. That was 30 seconds ago.

>> Let's build it. And let's run it again. Okay, let's do a help. And — oh yeah, there's the up arrow. It works.

>> Very nice.
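(The agent reached for the rustyline crate; here's a minimal sketch of how a REPL might wire it in for up-arrow history. The prompt string and query handler are hypothetical stand-ins, it assumes a recent rustyline API, and it is not poolside's generated code:)

```rust
use rustyline::error::ReadlineError;
use rustyline::DefaultEditor;

fn main() -> rustyline::Result<()> {
    let mut rl = DefaultEditor::new()?; // line editor with in-memory history
    loop {
        match rl.readline("db> ") { // hypothetical prompt
            Ok(line) => {
                // Accepted lines go into history; that's what the
                // up/down arrows cycle through.
                let _ = rl.add_history_entry(line.as_str());
                execute_query(&line);
            }
            // Ctrl-C or Ctrl-D ends the session cleanly.
            Err(ReadlineError::Interrupted) | Err(ReadlineError::Eof) => break,
            Err(err) => return Err(err),
        }
    }
    Ok(())
}

// Stand-in for the SQL engine the demo exercised.
fn execute_query(line: &str) {
    println!("(would execute: {line})");
}
```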

>> Now, our models aren't just capable coding agents. They're capable in lots of areas of knowledge work. They're also emotionally intelligent. They're fun. They're great to write bedtime stories with for the kids. So, I'm going to ask it to write me a poem about all these changes — but that's just more for fun.

>> So, as Eiso was saying, this is just one interface into our platform. There are other interfaces into it if you're inside one of those organizations that has adopted poolside. This is the coding interface, but we also have other ways you can interact with it: the web, as well as an agent that you can download onto your machine. But yeah, we don't really tout the poem writing or the songwriting, though I did send this to my wife to see — I have been sending her love letters written by poolside, so I kind of hope that she did not come into this session knowing exactly how I've been doing that for the past 6 months. But yeah, so this is kind of poolside; this is what we've been up to. So, as I said, Malibu agent is our second generation. We've got a ton more compute coming online, and that's when we're training our next generation. That is going to be the one that comes out publicly to everybody very early next year. We're going to have it behind our own API. It'll be on Amazon behind the Bedrock API. Anybody in the world who's building out, on one side of the fence, engineering assistants — the Cursors, Windsurfs, Cognitions, Replits of the world — you can use ours. Or if you're building out on the other side of the fence — the Harveys, the Writers, the whatever-applications of the world — there's going to be a fifth model out there at that level that you can consume. We're dead set on doing this: bringing it out to everybody in the world, advancing the state-of-the-art, and just continuing to push that out. So, that's kind of who we are. And you can find out very little more at our website, since we don't put much out there.

But Eiso, anything else you want to say before you try to go make your flight this time, please?

>> So, I would say that it's been a pretty incredible journey these last two and a half years: starting entirely from scratch, and now getting to a place where we see our models have grown up to become increasingly more intelligent. The missing ingredient we had was compute. Now that it's unlocked for us — with over 40,000 GB300s coming online — we see how we can start scaling up some of those models to get even further in their level of capability in software development and other types of long-horizon knowledge work. What I think is exciting about this conference and this audience is all the work that's happening on evolving the form factor. Right now, what we looked at was this asynchronous way of operating with agents. You know, Jason, you and I have agents running that are doing tasks for hours, and I think in the coming years we can see a world where they're able to do tasks that take days. And so I think the interface will continue to change. We're really focused on the fundamentals: building intelligence and being able to scale it up and serve it. It's why we go full vertical — from our multi-gigawatt campus in West Texas, where we're building out data centers, to building out models. The interface that you saw today is just our version of an expression of that. But I think this audience is going to do an incredible job of building lots of better ways to express that intelligence as actually valuable — economically valuable — work.

>> Couldn't have said it better. Can't wait to see what you guys build on this in the future, when it's publicly available. And if anyone really does want to build a data center campus, we are hiring for that. It is weird to be putting shovels in the ground again like we did in the '90s and early 2000s, but that's what you've got to do to scale intelligence these days. So —

>> I would make one other non-scheduled statement, if you're going to be okay with this one, Jason. As our models are getting more capable, we'd love to also see who wants to build with them. Right now, the vast majority of companies that are doing additional reinforcement learning and fine-tuning on top of models are doing it on what I would consider the current best-in-class open-source models — the Qwens and Kimis and MiniMaxes of the world. And we'd like to start figuring out how we can partner with you with our models — anywhere from an early checkpoint to where we are today — for you to build closer together with us on top of things. We haven't really figured out the approach to it yet, but since we have this audience, it's not a bad place to put it out there, so definitely reach out to us. We think the world to date was built by intelligence; the world in the future will be built on top of intelligence — and so it'd be a great way to partner.

>> Well, thanks, Eiso. Thanks, everybody here. And now we do have 5 minutes left. I don't know if we're supposed to take questions, but I'm happy to — so, if anyone does. But if not, I'm just going to go that way.

>> What was that?

>> Sort of. I mean, I think of him that way. Here's a fun story — here's how I met Eiso. I like to tell this story because Eiso is a fun, fun dude. I met Eiso because of a failed acquisition at GitHub. Back when I joined GitHub in 2017 as CTO, I wanted to take GitHub from a kind of collaborative code host with an open-source bent and turn it into an end-to-end software development platform infused with intelligence. And so the products we launched from 2017 on — GitHub Actions, Packages, alerts, notifications, eventually Codespaces — and then Copilot was the last thing the Office of the CTO did before I left, with Nat Friedman, Oege de Moor, and a couple of other folks in there. But Eiso, in 2017 when I joined — he had working code completion before the transformer architecture had fully landed. He had it on LSTMs. So I quickly tried to acquire his company, and he just said no. He just said no to me. But that was a long, drawn-out process of talking about what we thought neural networks were going to mean for the world. And during that process, which was a lengthy one, we became really good friends, and we stayed in close contact over the years. And then '22 rolled around — obviously ChatGPT comes out, Anthropic's out — and we kind of saw the endgame at play, and we said, "Do we jump back in or not?" And of course, yes, we jumped back in. But I like to tell that story about how he just kept saying no to me, and I just kept asking him questions, and eventually he said, "Yes, we should found a company." Because, by the way, when I asked him if we should do this, he said, "Oh, god damn no." Those were his exact words. He's like, "No, we should just learn how to paint and sail." But here we are. So —

>> Yeah.

>> It's been a great journey together, Jason. I think the reason we ended up doing this is because of our opinionated view on what it was going to take to build more capable intelligence. In the first 18 months of this company, obsessing and focusing on reinforcement learning combined with LLMs felt like one of the most contrarian opinions in the world; today, I think it's absolutely not. And it's super exciting to see the progress that continues to be made. In the coming years, we're going to see the world that started with completions, went to chat, and is now at an agentic approach become increasingly autonomous — and all of it stems, effectively, from the combination of bringing highly capable, constantly evolving models together with real-world problems. I think what we're starting to see now is that we're entering these kind of awkward teenage years ahead of AGI, where everybody in this room, building out incredible companies and applications, is bridging the gap of what it really takes to make intelligence in its raw form actually be valuable. And we want to be a small, humble part of that. We've got a lot of work still ahead of us. The team is growing. But hopefully what you've seen today — what our customers and enterprises have had access to and been seeing for a while — is that we're hard at work at really pushing those capabilities. We also want to make sure we make them available to build together with others.

>> Well, that's it. Thanks, everybody.

Summary

Poolside builds its own AI models to close the gap between models and human intelligence, pairing next-token prediction with reinforcement learning. After two and a half years, the founders showcased their second-generation model, Malibu agent, through a demo of its coding capabilities: understanding a codebase, translating it (Ada to Rust), testing, and refactoring it, including adding new features. Their technology is currently deployed in high-consequence environments within government and defense sectors, which require tightly scoped agent permissions. Poolside plans a public release of its next-generation model early next year via its own API and Amazon Bedrock, aiming to make it available to builders of engineering assistants and other applications. The company is investing heavily in compute infrastructure — building a multi-gigawatt data center campus in West Texas — and is seeking partners to build on its models, from early checkpoints to current releases.
