How to build proactive agents & self-improving company (Fully explained)

Transcript

0:00

Thanks HubSpot for sponsoring this

0:02

video. What if your company got better

0:05

while you sleep? Y Combinator just ran

0:07

a whole session on this. They're calling

0:09

these self-improving companies.

0:11

Companies in the current batch are

0:13

already hitting 5x more revenue per

0:16

employee compared with 18 months ago.

0:18

Their agents have been handling all the

0:20

internal ops and wrote 45 of their own

0:23

tools, all autonomously. We saw a company

0:25

like Pocha raise $30 million to just

0:27

build and run the whole company from

0:29

scratch. All of this points to one

0:31

thing: there's a new AI-native way

0:34

of running a business. My team and many

0:36

others have been experimenting with this AI

0:39

native way of running a company for the past few

0:41

weeks and we've encapsulated our

0:42

learnings and best practices into an open

0:44

source agent skill for long horizon work

0:46

and self-iterating tasks, alongside many

0:48

other useful tools. And this is what I

0:50

want to talk you through today. How does

0:52

this actually work, and how can you set it

0:53

up step by step for your own team? So

0:55

for any company operation before AI,

0:58

humans have been the glue, using multiple

0:59

different systems to get a certain outcome,

1:02

and humans are the ones who prioritize and decide

1:04

what kind of things to do. With the

1:06

recent AI boom, most of us have transitioned

1:08

to this AI-enhanced workflow, which means

1:11

you talk to an agent or an AI workflow

1:13

that completes a task end to end.

1:15

However, there's no feedback loop back

1:17

to the system to inform the improvements

1:19

that it should make. The human is still

1:21

the main driver of prioritization,

1:23

planning, and triggering tasks. What's really

1:25

powerful about all those other use cases

1:27

we're seeing here is this real AI-native

1:30

loop, where an agent can take an input or

1:32

trigger for a certain goal, do

1:34

certain tasks, but most importantly

1:36

actually capture feedback to learn

1:38

what's working and not working, and plan next

1:40

steps to make sure next time it is

1:42

doing things better. Diana from YC

1:45

explained this in a control system setup

1:48

with a comparison of closed loop versus

1:51

open loop where there's no feedback

1:52

back into the system, to a closed loop where

1:55

status, decisions, and outcomes are

1:57

continuously captured and fed back into

1:59

the intelligence layer. In a later video

2:01

it's broken down into five core elements

2:03

for each AI loop: from how the data

2:06

is actually ingested into the system, to the

2:08

policy layer which is like a contract

2:09

about the workflow and SOPs, a tool layer

2:12

that allows the agent to access different

2:14

systems, and the right quality gates in the

2:16

workflow, so either a human or an AI

2:19

evaluator can guard the output quality

2:22

and lastly some sort of mechanism that

2:23

can bring those learnings back to the system

2:25

so that it can improve its own

2:27

operations. This might feel complicated

2:29

and a bit overwhelming when you want to

2:30

automate the whole company but in

2:32

reality it's actually pretty simple and

2:34

easy to get started. Fundamentally, for

2:37

each AI loop, the way I start is to just

2:39

set up a memory layer or environment so

2:42

the agent can keep a system of record of

2:45

the tasks and outcomes, as well as

2:47

different skills and cron jobs for

2:49

the agent to continuously execute and

2:51

monitor the results. Depending on

2:53

the type of task, you might have different

2:55

cadences and skills. Let's take SEO as

2:57

an example. It is a really good use case to

2:59

start with because SEO is kind of a solved

3:01

problem that can be engineered. At a high

3:03

level, a human is basically doing this loop

3:05

of doing research across Google Search

3:07

Console, the internet, and Ahrefs to form a keyword

3:09

strategy and based on that strategy we

3:12

start pumping out different social

3:13

content and web pages, and continuously

3:15

monitoring the performance, updating the

3:17

strategy if needed. To get an agent to

3:19

self-sustain this loop, you can set up a proper

3:22

memory layer. Quite commonly, my memory

3:24

layer will be broken down into two parts.

3:26

One is a temporal log to record what the agent

3:29

did every day or every week and from

3:31

that, continuously forming the latest

3:33

strategy and pumping all the learnings

3:35

into it. This is a simple setup;

3:38

there could be a lot more things to it

3:39

which I will talk about a bit more.

3:40

Meanwhile, you can set up skills with

3:42

CLIs so the agent can run the whole

3:44

loop end to end by itself: things like

3:46

SEO audits, drafting content, publishing, and

3:49

reading data from Google Analytics or Ahrefs.

3:52

Again, there are also a lot of nuances

3:54

and tools that I'm going to cover

3:55

pretty soon. There is one free tool

3:57

that's actually super relevant here,

3:59

which is HubSpot's free AEO Grader. So

4:02

AEO basically means answer engine

4:04

optimization: increasing the chance your

4:06

product and service show up in ChatGPT,

4:08

Perplexity, and Gemini's answers. Fundamentally,

4:11

AEO complements SEO because it covers

4:13

channels like AI answer engines that

4:15

traditional SEO didn't fully measure.

4:17

What kinds of pages are cited when people

4:19

ask relevant questions, and what kinds

4:21

of things were mentioned? So you can

4:23

reverse engineer to find relevant

4:24

authors to outreach and create content

4:26

that is going to fill the gaps that nobody

4:28

is covering. Normally all the tools on

4:30

the market basically help you get

4:32

refreshed answers from different chatbots

4:34

and the top pages AI is sourcing information

4:37

from. They normally charge a good amount

4:38

of money for this type of information.

4:40

That's why HubSpot's free AEO Grader is really

4:42

good. You basically just give it your

4:44

company name. It will analyze how

4:46

ChatGPT, Perplexity, and Gemini

4:48

characterize your brand, and give you scores

4:50

across multiple different dimensions as

4:52

well as growth areas that you or the

4:54

agent can fix. So you can take the audit

4:56

report back to the agent to form the

4:58

right keyword and content strategy or

5:00

even turn it into a skill that the agent can run

5:02

once in a while to get richer data. I

5:04

have put a link in the description below

5:06

so you can go get the audit report for free.

5:08

Now let's get back to the last part of

5:10

building those types of AI loops. Apart from

5:12

memory and skills, that is the cron jobs. You

5:15

can set up cron jobs to get the agent to

5:17

repeatedly execute the actions, and

5:19

also do Claude auto-dreaming types of

5:21

setups by having a weekly planning

5:23

cron job. And these three things

5:25

together allow you to form a closed

5:27

loop. So the agent will continuously

5:29

monitor the results, publish content,

5:32

and update its hypotheses continuously.
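
As a rough illustration of the memory-plus-cron loop described above, here is a minimal Python sketch. It is not the speaker's actual setup: the file names (`log.jsonl`, `strategy.json`), the `keyword` and `clicks` fields, and the "rank keywords by average clicks" rule are all assumptions chosen to keep the example small. It shows the two memory parts mentioned in the transcript, a temporal log and a strategy that is rebuilt from it, with one function standing in for what a scheduled job would call.

```python
import json
from datetime import date
from pathlib import Path


def append_log(memory_dir: Path, entry: dict) -> None:
    """Append one run record to the temporal log (one JSON object per line)."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    with (memory_dir / "log.jsonl").open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")


def read_log(memory_dir: Path) -> list[dict]:
    """Load every past run record, oldest first."""
    path = memory_dir / "log.jsonl"
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines if line]


def update_strategy(memory_dir: Path) -> dict:
    """Rebuild the strategy file from the log.

    Assumed rule for illustration: rank keywords by average clicks.
    """
    memory_dir.mkdir(parents=True, exist_ok=True)
    clicks_by_keyword: dict[str, list[int]] = {}
    for entry in read_log(memory_dir):
        clicks_by_keyword.setdefault(entry["keyword"], []).append(entry["clicks"])
    ranked = sorted(
        clicks_by_keyword,
        key=lambda k: sum(clicks_by_keyword[k]) / len(clicks_by_keyword[k]),
        reverse=True,
    )
    strategy = {"updated": date.today().isoformat(), "keywords_by_priority": ranked}
    (memory_dir / "strategy.json").write_text(
        json.dumps(strategy, indent=2), encoding="utf-8"
    )
    return strategy


def run_once(memory_dir: Path, keyword: str, clicks: int) -> dict:
    """One loop iteration: record the outcome, then refresh the strategy."""
    append_log(
        memory_dir,
        {"date": date.today().isoformat(), "keyword": keyword, "clicks": clicks},
    )
    return update_strategy(memory_dir)
```

In a real loop, `run_once` would be triggered on a schedule (for example by a daily cron entry pointing at a script that wraps it; the script name and schedule would be your own), and the measured outcome would come from an analytics skill rather than being passed in by hand.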

5:35

Ankit from AI buildup also set up

5:37

some similar AI loops for SEO and increased

5:40

traffic by three times in just one to

5:42

two months. He has shared some of the setup

5:44

he had for growth analysis, designing

5:46

website information architecture and

5:48

assignment, and even writing high-quality

5:50

SEO content and blogs. I've also

5:52

included his public repo in the

5:54

description below so you can check it out.

5:55

The same loop setup can be applied to

5:57

many other different scenarios. For example, my

5:59

friend Gio tried this experiment to get an

6:01

agent to autonomously run ads for months

6:04

by applying such an autonomous loop with

6:06

a proper cron and skill setup. He has a

6:07

list of skills from analyzing

6:09

performance, copywriting, image

6:11

generation, and research, and also kept a

6:13

state folder to store all change logs and

6:15

learnings, as well as a JSON file of

6:17

campaign history and live ads. In this

6:19

process, in the first week the agent tested 10

6:21

different ad formats, from whiteboard

6:23

sketch, notebook page, and cardboard sign

6:26

to tweet screenshot. And from this

6:28

process, the agent learned that ugly ad

6:29

assets that look like this actually perform

6:31

better. In the second week, it made a decision

6:33

based on all the learnings including

6:35

the specific asset format to be a

6:36

whiteboard, plus what kind of copy

6:38

to show on the whiteboard, as well as that the

6:40

content itself should be around a free

6:42

skill pack, and generated 243 leads within

6:45

months for a $1.5K budget. So this

6:48

loop really does work, but there are a

6:50

lot of nuances that go into it and really

6:52

determine whether your AI loop is

6:53

actually going to deliver the results as

6:55

well as some tooling that will make the

6:57

setup a lot easier. One learning here

6:59

is the memory setup. Normally there are

7:01

two types of information that need to

7:02

be saved. One is factual

7:04

memory, which normally is a log of

7:07

things the agent has done, so it can

7:09

remember what has been done before and

7:11

review the performance. Another is those

7:13

kind of procedural learnings which you

7:15

can normally turn into a skill. Of

7:17

course you can just prompt the agent to

7:19

save everything as a log, but when there

7:22

is quite messy information or a complex

7:24

structure, it can be very difficult

7:25

to retrieve later. But there are open-source

7:28

memory layers that you can reuse, like

7:30

Garry Tan's GBrain, which is a plug-in that

7:32

you can use with OpenClaw or Claude Code. It

7:35

has instructions to handle the data

7:37

ingestion, so you can access meeting

7:39

scripts, YouTube video transcripts,

7:41

things like that, which would otherwise be

7:42

difficult to extract. Data will be

7:44

saved in a specific format like in your

7:46

brain folder, where there's a list of

7:48

predefined entities. It's kind of like

7:51

Andrej Karpathy's large language model wiki,

7:53

but the wiki is mainly designed for

7:55

consuming different research papers,

7:57

whereas the GBrain setup has been designed

7:59

for logging different entities for

8:01

personal assistant usage like meetings,

8:03

people, program and purchase. Each

8:06

folder has a README file to

8:08

explain what goes into this type of

8:09

entity and each entity will follow a

8:11

markdown structure to log the facts as

8:13

well as a timeline log. Meanwhile, it

8:15

comes with a retrieval pipeline.

8:17

Basically, all the saved knowledge

8:18

will be automatically turned into a

8:20

vector DB, alongside some CLI and MCP

8:23

tools for you to search against. It is

8:26

pretty good from a personal assistant

8:27

point of view that is managing hundreds

8:29

of thousands of different entities for

8:31

people like Garry Tan. However,

8:33

GBrain is, again, still designed for this

8:35

entity-based memory. Stone X on my team

8:38

were actually experimenting with another

8:39

entity memory setup. What we call

8:41

looping is like a company in the loop.

8:43

We optimize this memory layer for those

8:46

long-cycle tasks and for self-learning

8:48

behavior. So the agent has relevant

8:49

cron jobs to output learnings and skill

8:52

proposals. An example of using it

8:54

is that we can just copy this instruction to

8:56

any agent, like Hermes or OpenClaw. It will

8:58

start setting up the memory layer on

9:00

your computer. There are a few predefined

9:02

artifacts that can be used, and it asks you

9:04

questions step by step about what kind

9:06

of AI loops that you want to create or

9:07

what type of missions you want the

9:09

agent to drive. For example, I can say I

9:11

want the agent to autonomously draft social

9:13

content for me daily to drive the

9:14

growth of my Twitter. Then it asks you,

9:16

question by question about what kind of

9:18

API skills you need to create, and

9:20

some information about the procedural

9:22

knowledge, like voice and tone and the

9:24

cadence. You can just go back and forth

9:26

with the agent. Based on the conversation,

9:28

you can just prompt the agent to set up the

9:30

relevant artifact types and the cron

9:32

jobs. In my specific case, it creates

9:34

this post-draft artifact, which is going to

9:36

allow the agent to log what has been drafted

9:38

before, as well as the feedback, alongside a

9:40

relevant cron job. And this cron job

9:42

will have instructions to scan previous

9:44

relevant notes and information, and

9:46

generate a new one. Most importantly,

9:48

during the daily cron, it also has

9:49

instructions to extract learnings as well

9:52

as propose updates to skills. So you can

9:54

log those procedural learnings. I often

9:56

pair this loopony plugin with some

9:58

special data access skills for data

10:00

ingestion, because quite often agents

10:02

out of the box can't really access certain

10:04

special types of data, and you might have

10:06

proprietary data that needs a special CLI to

10:08

access. Having a skill for data access

10:11

detailing the strategy for accessing

10:13

specific types of data is really, really

10:15

helpful. There are a lot of different

10:17

open-source repos that handle this

10:19

problem that I list out here so you can

10:20

go check out using those ones for if you

10:23

build clubs I have included my data

10:24

access skills in the agent skill 101. So

10:27

you can copy and paste them to use. Meanwhile,

10:28

there's also a pretty useful open-source tool

10:30

called printing press. It's trying to

10:32

solve the problem that most APIs

10:34

or MCPs or even official CLIs are not

10:37

actually designed for agents. So they're

10:39

not that token-efficient and also have a

10:41

whole bunch of problems: a tool might

10:43

get into an interactive mode, which

10:44

agents are not great at interacting with;

10:46

error messages might not contain enough

10:48

information for the agent to self-heal;

10:50

and sometimes a CLI can return a big amount

10:52

of data. Trevor actually has an article

10:54

about 10 principles for designing

10:56

agent-native CLIs that's really good, and

10:58

this printing press tool is basically

11:00

encapsulating all those principles into

11:02

a CLI skill, so you can basically ask

11:05

the agent to build any sort of CLI, like

11:07

one to access your internal database or

11:09

some third-party software that doesn't

11:10

have an official MCP or CLI. It will autonomously

11:13

research and build the CLI with those

11:16

principles in mind. So with this you can

11:18

actually build quite sophisticated and

11:19

efficient data ingestion. So that's it

11:21

for today's video. If you find this

11:23

helpful please like and give me a

11:24

subscribe. Thank you, and I'll see you next

11:26

time.
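
To make the agent-native CLI points from the last part of the transcript more concrete (no interactive prompts, error messages with enough detail for the agent to self-heal, and bounded output), here is a small Python sketch. It is only an illustration under assumptions: the `posts` tool, its `--limit` and `--offset` flags, the JSON shapes, and the in-memory `RECORDS` list are all hypothetical, and it is not taken from the printing press tool or from Trevor's article.

```python
import argparse
import json

# Hypothetical in-memory data standing in for an internal database.
RECORDS = [{"id": i, "title": f"post-{i}"} for i in range(1, 51)]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="posts", description="List posts as JSON.")
    # Every input is a flag, so the tool never stops to prompt interactively.
    parser.add_argument("--limit", type=int, default=5, help="maximum records to return")
    parser.add_argument("--offset", type=int, default=0, help="records to skip")
    return parser


def run(argv: list[str]) -> tuple[int, str]:
    """Return (exit_code, json_text) so callers can inspect both."""
    args = build_parser().parse_args(argv)
    if args.limit < 1 or args.offset < 0:
        # Structured error with a hint, so an agent can correct its own call.
        error = {
            "error": "invalid_argument",
            "detail": f"limit={args.limit}, offset={args.offset}",
            "hint": "use --limit >= 1 and --offset >= 0",
        }
        return 2, json.dumps(error)
    # Output is paginated so a single call never floods the agent's context.
    page = RECORDS[args.offset : args.offset + args.limit]
    body = {
        "items": page,
        "total": len(RECORDS),
        "next_offset": args.offset + len(page),
    }
    return 0, json.dumps(body)
```

Calling `run(["--limit", "2"])` returns exit code 0 and a JSON page with two items plus a `next_offset`, while `run(["--limit", "0"])` returns exit code 2 and a JSON error with a hint instead of a bare failure.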

Interactive Summary

The video discusses the concept of 'self-improving' or 'AI-native' companies, where autonomous agents not only perform tasks but also capture feedback to iteratively improve their operations. The speaker explains the transition from human-led AI workflows to closed-loop systems that include memory, skill development, and periodic monitoring. Practical examples are provided, such as automating SEO and running ads, alongside recommendations for tools like HubSpot's free AEO Grader, memory management frameworks, and principles for building agent-native interfaces.