Does Microsoft AI Train Models on my Data and Interactions?

Transcript

0:00

Hi everyone. The question has come up a

0:02

few times recently in my customer

0:03

interactions: does Microsoft train

0:06

models on my data, on my interactions?

0:10

So, I wanted to quickly clarify and it's

0:12

actually pretty easy. And it falls into

0:15

two buckets.

0:18

If I am authenticating

0:20

with a work

0:26

or a school account,

0:29

what that really boils down to is I'm

0:32

logging in with an Entra identity. That

0:35

used to be called Azure AD. So, am I

0:38

logging in with Entra? An identity that

0:42

my company provides.

0:45

If I'm signed in with an Entra identity,

0:49

if I'm using Copilot,

0:51

then the answer is

0:54

no, whether it's the paid or the free

0:56

one. So,

0:58

no.

0:59

There is no training based on your data.

1:02

I can see it

1:04

in the documentation.

1:07

So, here if I go and look, enterprise data

1:09

protection for prompts and responses for

1:11

M365 Copilot and Copilot Chat,

1:16

it basically boils down to look, your

1:17

data isn't used to train

1:20

foundation models.

1:22

It's your data.

1:25

It's

1:26

not used for anything else. Your data is

1:28

private.

1:30

We don't use your data except how you

1:32

instruct.

1:34

The data protection, all of those

1:35

protections, they all apply to

1:38

your data.

1:40

I don't even have a setting

1:43

as part of my profile

1:46

to try and opt out or do anything around

1:49

it. It's just no. That is just the

1:52

default.

1:53

Now, even if you're using a model that

1:55

is a subprocessor, so at time of

1:57

recording, that would be Anthropic, for

2:00

example. I'm using an Opus model.

2:03

Well, because they're a subprocessor,

2:05

they operate within the Microsoft

2:07

product terms,

2:09

within the data protection addendums,

2:12

and are covered by our enterprise data

2:15

protection.

2:17

So, basically, it's still there's going

2:19

to be no training based on your

2:21

interactions,

2:23

on your data.

2:25

It's just no, no, and no. Now, that's

2:28

different from maybe where the

2:29

inferencing happens, and there's some

2:30

special considerations with

2:31

subprocessors, for example, in Europe,

2:33

and that's gone through in the document.

2:34

But, under no circumstances is it using

2:37

your data or your interactions to train.

2:40

So, then you think, "What if I deploy a

2:42

model in Foundry? What if I'm using

2:44

models in Copilot Studio? What if I'm

2:46

using Agent Builder?" So, the answer is

2:48

no,

2:49

no, and

2:52

no.

2:53

It's the same. It's your data. And

2:55

remember, models are stateless. Nothing

2:57

is stored in them anyway. But, once

2:59

again, we can come and look at the docs,

3:02

and it talks about how, in Foundry,

3:04

for example,

3:05

it's just a whole bunch of nos.

3:09

Unless you're specifically telling it to

3:12

go and do some training. For example,

3:13

I'm fine-tuning,

3:15

then no. Your prompts, your completions,

3:18

your embeddings, your data,

3:21

it is yours. It is not used for any

3:24

other purpose. You would have to go and

3:26

say, "Hey, I want to do some training

3:29

because I'm fine-tuning." And if you

3:31

fine-tune a model, it's yours. It's

3:33

pretty simple.

3:34

Okay, so then you get the other bucket.

3:37

And the other bucket is I'm signing in with

3:40

a personal account.

3:46

And the personal account means a

3:48

Microsoft account, an MSA. And this is where it

3:51

is different. So, in this case,

3:55

the default for Copilot is yes,

3:59

it does

4:01

use some of those interactions for

4:02

future training. But,

4:04

you can absolutely

4:08

opt out.

4:10

And I'll show you that. So, if I go

4:12

over,

4:14

firstly,

4:15

the documentation

4:17

controlling how conversations are used

4:19

for model training,

4:21

it tells you how to go ahead and turn

4:23

that off.

4:25

But, if I go into the chat,

4:28

so I'm in the bottom left,

4:31

so I'm signed in with a Microsoft

4:32

account, and I just go to my settings,

4:37

privacy,

4:39

I've turned them both off. I don't want

4:40

it training on conversation activity. I

4:42

don't want it training on voice

4:44

conversations.

4:45

So, you still have the ability to opt

4:48

out of that even when I'm using these

4:50

kind of free Copilot chat

4:53

interactions with a Microsoft account.

4:57

Now, one thing I do want to stress,

5:01

training and personalization

5:04

are very, very different. So, training

5:11

does not equal

5:16

personalization.

5:20

Within the services, you get the option

5:22

to do personalization. For example, in

5:23

the corporate ones, there's explicit

5:26

memory where I tell it, "Hey, remember I

5:28

like to work this way." I can add in

5:30

custom instructions for how I want it to

5:32

behave. There's implicit memory that it

5:34

learns what I do.

5:36

In the personal ones, there are ways to

5:38

add customization as well.

5:40

I have the ability to

5:41

delete some of that learning, control

5:44

the memories.

5:45

But, normally, that's just for me.

5:47

That's just for my experiences. It's

5:48

still not used for anything else,

5:51

but it's good to have that to improve my

5:53

experience. So, I wouldn't generally

5:55

recommend turning off the

5:56

personalization because it's going to

5:57

increase and enhance your interactions

6:00

with the AI, but it's completely

6:02

different from training. So, don't mix

6:04

those two things up.

6:06

And that is it. It's pretty simple. Hey,

6:08

I'm logged in with an Entra account,

6:10

work or school. Hey, it's not training

6:12

on anything. If I'm logged in with a

6:13

personal account, then by default, yes,

6:15

but you can go and turn it off.

6:18

Hope that helps. All the links to the

6:19

documentation I've shown are in the

6:21

description of the video if you want to

6:22

go and check out the detail. But,

6:24

till next video, take care.

Interactive Summary

Microsoft's policy regarding training AI models on user data depends on the account type. For work or school accounts (Entra ID), Microsoft does not train its foundation models on user prompts or data, including when using subprocessors or tools like Copilot Studio. For personal accounts (MSA), training is enabled by default, but users can opt out through privacy settings. The video also clarifies that training is distinct from personalization, which improves individual user experiences without contributing to the base model.
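
As a rough illustration only, the two buckets described in the video can be written as a small decision function. This is a sketch of the rule as the speaker states it at time of recording, not a Microsoft API: the names `AccountType`, `PrivacySettings`, and `may_train_on_interactions` are hypothetical, and the two boolean fields simply stand in for the two privacy toggles shown in the video (conversation activity and voice conversations). Explicit fine-tuning, which the user requests themselves, is deliberately left out of the model.

```python
from dataclasses import dataclass
from enum import Enum


class AccountType(Enum):
    """The two sign-in buckets from the video (hypothetical names)."""
    ENTRA = "work or school account (Entra ID)"
    MSA = "personal Microsoft account (MSA)"


@dataclass(frozen=True)
class PrivacySettings:
    """Stand-ins for the two personal-account privacy toggles.

    Both default to True because the video says training is on by
    default for personal accounts until the user opts out.
    """
    train_on_conversation_activity: bool = True
    train_on_voice_conversations: bool = True


def may_train_on_interactions(
    account: AccountType,
    settings: PrivacySettings = PrivacySettings(),
) -> bool:
    """Return True if interactions may be used for model training."""
    if account is AccountType.ENTRA:
        # Work or school: never, and there is no setting to change.
        return False
    # Personal: only the toggles that are still on allow training.
    return (
        settings.train_on_conversation_activity
        or settings.train_on_voice_conversations
    )


if __name__ == "__main__":
    opted_out = PrivacySettings(False, False)
    print(may_train_on_interactions(AccountType.ENTRA))
    print(may_train_on_interactions(AccountType.MSA))
    print(may_train_on_interactions(AccountType.MSA, opted_out))
```

Personalization (memory, custom instructions) is intentionally absent from this sketch, since the video treats it as separate from training.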
