Small Area Estimation: bringing theory to practice

Transcript

0:12

Good morning, good afternoon,

0:16

good evening, everyone. Thank you for

0:18

joining this side event on small area

0:21

estimation

0:23

bringing theory to practice. We will

0:26

have the next hour and a half,

0:30

an hour and 30 minutes, for

0:33

covering the topics in our side

0:35

event. It is one of the 57th Statistical

0:38

Commission side events, which is being

0:40

held virtually. We have

0:45

distinguished guests here, speakers

0:47

presenting regional aspects of the

0:50

work and country examples, and also

0:54

more from the World

0:57

Bank and UNSD in terms of organizing and

1:00

managing this project so far. So,

1:04

without further ado, I'll pass

1:08

the mic to our colleague Ms. Haoyi Chen,

1:12

who is leading this activity from the

1:16

United Nations Statistics Division, to do the

1:19

opening and overview of the session.

1:21

We have a very tight program

1:25

and lots of participants. We'll try to

1:27

make the most out of it. Haoyi, please

1:30

go ahead.

1:33

>> Thank you so much, Daniel. Thank you.

1:35

Welcome, colleagues. My name is Haoyi Chen,

1:37

coordinator of the Inter-Secretariat Working

1:40

Group on Household Surveys, here at the Statistics Division. You are really

1:44

welcome to this event on small area estimation, which is

1:47

jointly organized by UNSD,

1:50

the World Bank, and the regional commissions from Africa,

1:54

Asia Pacific and Latin America policies

1:57

for the noise behind

2:00

So, as countries start to deliver on

2:02

the Sustainable Development Goals and

2:04

leave no one behind, the demand for

2:07

disaggregated, timely and policy-relevant

2:10

data is growing.

2:12

Many NSOs face constraints in producing

2:15

reliable estimates at subnational, lower

2:19

geographic areas and using different

2:21

survey code.

2:23

Small area estimation really offers a practical and

2:26

methodologically robust pathway to

2:28

bridge this gap by integrating survey

2:31

data with administrative data, geospatial

2:34

and other nontraditional data sources to

2:37

produce more granular insight while

2:40

maintaining traditional results. We, from

2:43

the UNSD side, have been partnering with

2:46

our colleagues here, whom you will

2:49

be hearing from, to advance

2:52

the use of small area estimation

2:55

offices really grateful for their

2:57

partnership

3:00

banking

3:04

materials, guidelines, e-learning courses

3:07

and workshops. We

3:09

will be hearing about their work and also from

3:11

countries

3:13

and the students are literally

3:16

countries that done great work in apply

3:19

as

3:21

We look forward to hearing from

3:26

them and then we also look forward to

3:27

better serving our communities

3:30

in their use of small area estimation. Over to

3:33

you. Thank you so much.

3:37

>> Thank you, Haoyi, for setting the stage. I

3:41

think we had a sound issue from you

3:44

a little bit, but I think most of

3:47

the topics have been

3:50

covered. Just making sure

3:52

that our partners are in the house:

3:55

the regional commissions and the World Bank,

3:57

who are really pushing this forward

4:00

for enabling NSOs to get to work

4:05

with small area estimation where

4:07

it makes sense. So, without further ado,

4:11

I'll switch to the World Bank,

4:13

who's been the anchor for this

4:15

program. Our senior economist, David,

4:19

will take you through why

4:23

small area estimation, and what the Bank

4:25

is doing in supporting countries, and

4:28

some of the outcomes of the

4:31

work they have so far been working

4:34

on. So I will pass it to David. David,

4:38

the floor is yours.

>> Thank you very much,

4:41

Daniel. Just give me a moment,

4:43

perhaps, to get the presentation up.

4:46

Can everyone see the presentation?

4:49

Let me also go into slideshow mode.

4:53

>> Wonderful.

4:54

>> So I hope everyone can hear me well.

4:58

If not, I can speak louder. And thank

5:01

you very much, Daniel and Haoyi, for

5:04

organizing this presentation. It's

5:06

really an honor and a pleasure to be

5:08

here with you and with so many

5:12

colleagues from national statistics

5:14

offices. I'm going to be talking about

5:16

geospatial small area estimation with a

5:18

focus on a recent application that we're

5:20

currently working on for Nigeria.

5:23

This is a nice complement, in my view, to

5:25

a lot of the work that the Bank has done

5:28

traditionally on small area estimation with

5:30

survey and census data, including the

5:32

relatively recently produced guidelines.

5:35

But we are also looking to

5:37

extend this to geospatial

5:40

applications.

5:41

So I think we mostly understand the

5:45

benefits of data integration. Surveys

5:47

measure very important socioeconomic

5:49

indicators but they're expensive and

5:52

because of that they're small and

5:54

because of that we cannot be very

5:56

granular with them. So most surveys

5:58

cannot go below state or district

6:01

levels. And of course we would love more

6:03

granular information. And so to do that

6:06

we can combine surveys with

6:08

comprehensive auxiliary data to increase

6:10

accuracy and precision of estimates.

6:14

And this is potentially useful for

6:17

many things. We can use it to target

6:18

social assistance programs, monitor and

6:21

evaluate programs and do quality control

6:24

for sample surveys. There are

6:26

concerns that sample surveys may not

6:28

always be representative. For example,

6:30

if particular areas in a country are

6:33

affected by conflict, there may be

6:35

some places where enumerators can't go,

6:37

and data integration actually offers a

6:40

method for measuring that or estimating

6:42

that. Traditionally, as I mentioned,

6:44

we've used survey and census data for

6:46

small area estimation. There's a long

6:47

history of this going back to Fay and

6:49

Herriot in the late 70s in the US,

6:51

estimating average income for counties.

6:54

Then, in the Bank, Elbers, Lanjouw and Lanjouw

6:58

kind of popularized the use of survey

7:00

and census small area estimation around the

7:02

world. Jiang and Lahiri, and Molina and

7:05

Rao, also made improvements to the

7:06

methods. And so this is great, but

7:09

censuses are very expensive. They're

7:11

conducted typically once every 10 years

7:13

but in some countries not always that

7:15

frequently. And so there's been, more

7:18

recently, a large and growing literature

7:21

combining survey and geospatial data to

7:23

try to see if we can do small area

7:26

estimation more frequently. Amazingly,

7:28

this started in 1988 with a paper that

7:31

was 30 years ahead of its time. Then

7:34

everybody forgot about the potential of

7:35

geospatial data until 2016, and then a

7:39

relatively recent literature as I

7:41

mentioned has really pushed forward the

7:43

boundaries on it. And there are a couple

7:45

of recent reviews, one of which I

7:49

did, that basically show that this

7:52

stuff works, that the geospatial

7:56

data that is available is predictive

7:58

of many important socioeconomic

8:00

indicators. There's a lot of interest

8:03

in big data. I'm focused on geospatial

8:06

data in particular because it is

8:07

publicly available and we don't have to

8:10

worry about selection bias. I still

8:12

have some concerns with mobile phone

8:13

data. I believe it can be useful for

8:15

some applications, but still not

8:18

everybody has a mobile phone. Also,

8:20

there's been a really revolutionary

8:22

increase in the access to publicly

8:24

available imagery and indicators spurred

8:25

on by platforms like Google Earth Engine

8:27

and Microsoft Planetary Computer.

8:30

And now a lot of research, as I mentioned,

8:32

shows that geospatial

8:34

indicators really excel at predicting

8:37

spatial variation in urbanization. And

8:39

if you think about the things that it

8:40

can measure, like buildings and nighttime

8:42

lights and vegetation and land

8:44

classification, these are all

8:47

good proxies for urbanization and how

8:49

urban a place is, and that in turn is

8:52

correlated with many important

8:54

socioeconomic indicators. For example, almost

8:56

everywhere, on average, rural places

8:59

are poorer than urban places.

9:03

So one area in which this has been

9:06

applied quite frequently is poverty

9:08

estimation. We've done a number of

9:10

tests relative to survey data in about

9:14

eight countries and shown that the

9:15

increase in precision is equivalent

9:18

to expanding the survey data by about a

9:20

factor of 2.5 to 9, depending on the

9:22

context and the measure. Surveys cost

9:26

a huge amount of money to implement, as

9:27

you all know: a million dollars and more.
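As a rough illustration of what "equivalent to expanding the survey" means, the sketch below converts a ratio of standard errors into an implied sample-size factor. It assumes simple random sampling, where variance is inversely proportional to sample size; the function name and the input values are illustrative and are not taken from the studies mentioned.

```python
def equivalent_sample_factor(se_direct: float, se_model: float) -> float:
    """Return the factor by which the sample would have to grow for the
    direct survey estimate to reach the model-based standard error.

    Assumes variance is inversely proportional to sample size
    (simple random sampling), so the factor is the squared ratio
    of the two standard errors.
    """
    if se_direct <= 0 or se_model <= 0:
        raise ValueError("standard errors must be positive")
    return (se_direct / se_model) ** 2


if __name__ == "__main__":
    # Hypothetical standard errors, for illustration only.
    for se_direct, se_model in [(0.10, 0.05), (0.09, 0.03)]:
        factor = equivalent_sample_factor(se_direct, se_model)
        print(f"SE {se_direct} -> {se_model}: about {factor:.1f}x the sample")
```

Under this assumption, halving the standard error corresponds to a fourfold sample, and cutting it to a third corresponds to a ninefold sample, which is the upper end of the range quoted above.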

9:30

So if we can increase the precision of

9:32

these by 2.5 to 9 by using a

9:35

procedure that is essentially free, in

9:38

my book that qualifies as a huge win.

9:41

Although much of the research so far has

9:43

applied this to poverty estimation I

9:45

believe it can be applied to many other

9:46

indicators and in fact we are doing so.

9:49

Not all indicators are sufficiently

9:51

precisely measured to do this. Not all

9:54

indicators are correlated enough with

9:56

population density to do this but many

9:58

are. So, just to give you an example

10:00

of the best-case scenario, this is

10:02

from older work done in Tanzania,

10:05

where we used geospatial small area

10:08

estimation to go basically from the

10:12

district to the commune level.

10:14

The lower level is on the right. The

10:17

higher level is on the left. And

10:20

what's notable about this, which we don't

10:21

show here, is that the average precision,

10:23

as measured by the coefficients of

10:25

variation, is about the same in both of

10:27

these pictures. And so the first one

10:30

did not use geospatial data. The one on

10:32

the right does. And it just shows that in

10:34

this case, which was kind of a

10:36

best-case scenario, we're able

10:39

to go a level lower in terms of

10:42

administrative units with no loss in

10:44

precision.

10:46

So this is great. And then you might

10:48

think, why isn't everybody using it?

10:51

But there are some obstacles,

10:53

I think, that have held back adoption.

10:55

One of them is that there is a variety

10:57

and sophistication of methods that

11:00

have been used for this, and that can be

11:01

confusing. People have used linear

11:05

mixed models, which are relatively simple

11:08

but can still be somewhat complicated, like

11:10

empirical best predictor models. These

11:12

have been used uh traditionally for

11:14

survey and census data. They can also be

11:16

used for geospatial data. There's also

11:18

tree-based machine learning, like extreme

11:20

gradient boosting and now, more recently,

11:22

mixed effects gradient boosting.

11:25

So those are maybe one step up

11:28

from linear mixed models in terms of

11:30

complication and sophistication. But then a

11:32

lot of the literature has been focused

11:34

on deep learning approaches especially

11:37

convolutional neural networks and then

11:39

more recently AI, using foundational

11:42

transformer models. These are

11:45

big AI models that have been trained

11:47

on many thousands of terabytes

11:52

of imagery to recognize features, and

11:55

then they can be fine-tuned with data

11:58

to develop predictions tailored to

12:00

that data. So that's one issue:

12:02

the methods can be sophisticated. A

12:05

second issue is that the EA location

12:08

can be sensitive. Notably, the Demographic

12:11

and Health Surveys do publish jittered

12:13

EA location information, but not all

12:16

surveys come with EA location

12:18

information. And so for public

12:21

researchers, these can sometimes be

12:23

difficult to obtain. This can be

12:27

somewhat complicated to implement

12:29

and requires considerable computing

12:31

capacity and memory. This is

12:33

especially true for the more

12:34

sophisticated deep learning and AI

12:37

methods. And of course, these also

12:39

require very strong technical skills.

12:41

The tools and the documentation are

12:42

still evolving and developing. We're

12:45

trying, but we've frankly not gotten

12:48

as far as we would have liked in terms

12:49

of getting the tools and documentation

12:51

out there. So these are all obstacles

12:54

that are kind of holding back adoption.

12:55

The hope is that through events like

12:58

these, and finishing tools and

13:00

documentation, and doing workshops of the

13:03

sort that we've organized in the

13:05

recent past, the word will get

13:07

out and people will start taking

13:09

advantage of the benefits of publicly

13:11

available geospatial data. So what we're

13:14

trying to do is work on applications

13:17

of generating small area estimates of

13:19

indicators for countries, doing

13:21

additional research and evaluation,

13:23

testing methods using geolocated census

13:25

data. There are a few countries where

13:27

we've been able to obtain geolocated

13:29

census data and those are great for

13:31

testing methods and then developing

13:33

tools. We're working on two R

13:35

packages in particular, one called

13:36

Geolink and one called povmap. The

13:39

first one helps download publicly

13:42

available geospatial indicators. The

13:43

second one helps integrate it with

13:45

survey data. We've also learned

13:48

to utilize Google Earth Engine which is

13:50

a very powerful platform for obtaining

13:53

publicly available geospatial data.

13:55

And of course now, with the development

13:57

of AI and the AI chatbots, you can

13:59

just ask it to write code for you.

14:01

Momentum is accelerating here, but a lot

14:03

of work remains. So I'm going to talk

14:06

here mostly about an application I'm

14:08

working on currently for our country

14:10

team in Nigeria, where I'm doing

14:13

geospatial small area estimation for 10

14:15

indicators. The 10 indicators are

14:18

listed there. I think what's neat about

14:20

this is almost all of the work and sweat

14:23

goes into downloading the geospatial

14:26

indicators. Once you have that, the cost

14:29

of adding new survey indicators is

14:31

essentially negligible. It's just, you

14:32

know, switching in a different dependent

14:34

variable in the same code. So, you

14:37

know, going to 20 indicators would not be

14:39

difficult, if they can be

14:42

predicted by geospatial data. And so,

14:44

just to give you a sense of what we're

14:46

doing, we created a shapefile of

14:49

about a million one-square-kilometer

14:50

grids covering all of Nigeria, and then

14:53

we use that to obtain publicly available

14:55

geospatial features starting with

14:57

population estimates. There are some

14:59

relatively recent population estimates

15:01

produced by WorldPop that we used, and

15:03

when we use that we find that

15:05

about half of the grids in Nigeria are

15:08

populated, nearly 500,000, and so we can

15:11

obtain geospatial data for those

15:13

populated grids, merge those with the

15:16

survey data using enumeration area centroids obtained

15:19

from the survey (we were able to get

15:20

those), then estimate a model using the

15:23

survey data, use that model to predict

15:25

outcomes for each grid, weight it by the

15:28

estimated population, and aggregate it to

15:30

the desired geographic level. We're

15:32

doing both wards and local government

15:34

areas. The most complicated part of this

15:36

is using bootstrap techniques to

15:38

estimate confidence intervals. That

15:40

can be difficult, but our aim

15:45

is to get documentation out

15:47

on how to do it. So, just to

15:50

give you a sense of the geospatial

15:52

features being used, the modeled

15:54

population estimates come from

15:55

WorldPop. Nighttime lights can be

15:58

downloaded from the Colorado School of

16:00

Mines website. There's building data

16:02

from 2018 and 2023.

16:05

Land cover, cropland, open-source

16:08

cell tower location, which may not be

16:10

super reliable, but we can use

16:13

it. It's also not the most

16:16

predictive variable, as I'll show

16:18

you. Vegetation index, average rainfall,

16:21

aerosol and ozone index, which are

16:22

measures of pollution that are local,

16:25

and then conflict events. We also

16:28

experimented with these new and

16:30

interesting Google DeepMind embeddings,

16:32

which are available on Google Earth

16:33

Engine. They work well on their own, but

16:36

when we have all these other variables

16:37

in the model, the embeddings add

16:40

very little. So you can estimate a

16:43

model at the grid level. P-hat here is

16:46

our survey estimate of poverty in a grid

16:48

that's been matched to the

16:50

enumeration area. We use as

16:53

predictors variables at the grid level

16:56

and variables at the target area level,

16:58

which is a ward or an LGA. And there's

17:01

an error term both at the ward level

17:04

and at the grid level. We use an

17:06

arcsine transformation because we're

17:08

estimating proportions and we want to

17:09

keep the estimates bounded between

17:11

zero and one.
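The arcsine transformation and the population-weighted aggregation described above can be sketched as follows. This is a minimal illustration, not the model used for Nigeria: the function names, grid values, and ward labels are hypothetical, and the simple back-transformation shown here ignores any bias correction a production workflow might apply.

```python
import math
from collections import defaultdict


def arcsine_transform(p: float) -> float:
    """Map a proportion in [0, 1] to the arcsine square-root scale [0, pi/2]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    return math.asin(math.sqrt(p))


def back_transform(z: float) -> float:
    """Invert the transform; clamping z first keeps the result in [0, 1]."""
    z = min(max(z, 0.0), math.pi / 2)
    return math.sin(z) ** 2


def aggregate_to_areas(grid_rows):
    """Population-weighted mean of back-transformed grid predictions.

    grid_rows: iterable of (area_id, predicted_z, population), where
    predicted_z is a model prediction on the transformed scale.
    """
    weighted_sum = defaultdict(float)
    population = defaultdict(float)
    for area_id, predicted_z, pop in grid_rows:
        weighted_sum[area_id] += back_transform(predicted_z) * pop
        population[area_id] += pop
    return {a: weighted_sum[a] / population[a]
            for a in weighted_sum if population[a] > 0}


if __name__ == "__main__":
    # Hypothetical grid-level predictions for two wards.
    rows = [
        ("ward_A", arcsine_transform(0.2), 100),
        ("ward_A", arcsine_transform(0.8), 300),
        ("ward_B", arcsine_transform(0.5), 50),
    ]
    print(aggregate_to_areas(rows))
```

Because the model is fitted on the transformed scale, any prediction, however extreme, maps back to a proportion between zero and one before it is weighted and aggregated.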

17:14

Also, in this case we're using

17:15

extreme gradient boosting. I find

17:18

it's widely used and very mature

17:20

software. It's been very well tested.

17:22

We've tested it in-house. It slightly

17:24

outperforms linear models in many cases.

17:27

Not always, of course. But it has a

17:30

very flexible functional form that

17:32

handles nonlinearities and interactions

17:34

quite well and you can measure the

17:36

importance of predictors through Shapley

17:38

decompositions without too much

17:40

difficulty. So it's a little more

17:42

flexible than linear models and a little

17:43

more interpretable than the AI and deep

17:47

learning models. Perhaps not as accurate

17:49

as mixed effects gradient boosting, but

17:51

that's a little more complicated and we

17:53

haven't tested it. And, you know, it's

17:55

probably not as accurate as

17:57

convolutional neural networks or

17:58

transformers. But other times that

18:01

we've tested that, we used

18:03

different features than we do here. And

18:06

so I think with the extra features we

18:08

have here we can get very solid

18:09

estimates even with extreme gradient

18:11

boosting. And indeed the models are

18:14

pretty predictive. If you look at the

18:16

R-squared, these are in-sample R-squareds, but

18:18

there is some regularization. So it's

18:20

probably not overfitting. When it

18:22

comes to XGBoost and just predicting

18:24

these indicators, the R-squareds are

18:26

quite high. They do vary. So then,

18:30

as I mentioned, you can do a Shapley

18:32

decomposition to look at what measures

18:34

are important, and when we look at

18:36

poverty, it looks like it's buildings,

18:40

pollution, land classification and

18:43

nighttime lights that are doing most of

18:44

the work. And I think this makes

18:47

sense. These are all correlated with

18:49

urbanization, as we'd expect.

18:53

And fortunately I think they're pretty

18:54

solid in terms of the quality of the

18:56

data. Obviously not every building

18:58

is going to be correct, but on average

19:00

there's very useful information. And

19:03

I think what's nice is that this, as

19:05

we've seen in the past, leads to large

19:07

increases in precision relative to the

19:09

direct survey estimates. These are

19:11

the mean widths of the confidence

19:13

intervals of the estimates, and

19:16

they fall by over a half. And

19:21

that's really impressive, I

19:23

think, because the mean

19:26

confidence interval widths are basically

19:28

proportional to the standard error and

19:31

the standard error is the square root of

19:33

the variance and the variance is

19:35

inversely proportional to the size of

19:36

the sample. So this is uh an efficiency

19:39

gain approximately equal to expanding

19:41

the sample by a factor of eight or six

19:44

by this measure. So I

19:47

think that's again a big win. They are

19:50

less precise than the, quote,

19:52

representative state-level estimates.

19:55

So there's an average confidence

19:57

interval width of 15 for the estimates

19:59

that are currently published. We would

20:00

go up to 26 even for LGA estimates if

20:03

you use this method. And so I think

20:05

this leads to some philosophical

20:07

questions of how precise is precise

20:09

enough to publish. So I just want to

20:11

give you a sense; these are

20:12

preliminary estimates, but these are what

20:14

the maps look like. They show quite a

20:16

bit of poverty in the northern half of

20:18

Nigeria. But there are pockets, and

20:21

you can see where there are pockets

20:23

where it's redder than other places and

20:26

even in the north and the northeast

20:28

there are pockets where it's notably

20:29

less poor than other places. And also

20:32

in the south you can see

20:33

there's quite a bit of variation even

20:35

within states. These have been

20:37

benchmarked to match the survey

20:39

estimates at the state level which I

20:41

think is important. You can see

20:44

somewhat similar patterns but also some

20:46

differences in multidimensional poverty,

20:48

which is, I think, interesting. And

20:52

also some pockets where you can see

20:54

particular wards. We can look at

20:57

other indicators, like the secondary

20:58

enrollment rate for education. Again,

21:02

different pockets, sort of towards the

21:04

middle of the country, where that's

21:06

weaker. So these are interesting. I

21:09

think for someone who knows Nigeria

21:11

better than I do they would be very

21:12

interesting but I want to again caution

21:14

that these are preliminary. They need to

21:15

be reviewed within the bank and by the

21:17

government. So, you know,

21:20

they're not ready to be used quite yet.
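Benchmarking of the kind mentioned for the maps above can be sketched with a simple ratio adjustment. The talk does not say which benchmarking method was used, so ratio benchmarking is an assumption here, and all names and numbers are hypothetical.

```python
def ratio_benchmark(area_estimates, area_populations, state_estimate):
    """Scale area-level estimates so that their population-weighted mean
    equals the direct survey estimate for the enclosing state.

    area_estimates, area_populations: dicts keyed by area id.
    Note: a ratio adjustment can push a proportion above 1 in extreme
    cases; this sketch does not handle that.
    """
    total_population = sum(area_populations[a] for a in area_estimates)
    if total_population <= 0:
        raise ValueError("total population must be positive")
    weighted_mean = sum(
        area_estimates[a] * area_populations[a] for a in area_estimates
    ) / total_population
    if weighted_mean <= 0:
        raise ValueError("weighted mean of area estimates must be positive")
    factor = state_estimate / weighted_mean
    return {a: est * factor for a, est in area_estimates.items()}


if __name__ == "__main__":
    # Hypothetical ward-level poverty rates and populations in one state.
    wards = {"ward_1": 0.40, "ward_2": 0.60}
    pops = {"ward_1": 100, "ward_2": 300}
    print(ratio_benchmark(wards, pops, state_estimate=0.50))
```

After the adjustment, the wards keep their relative ordering, and their population-weighted average reproduces the state-level survey figure.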

21:23

Anyway, to conclude, I feel that

21:26

geospatial SAE is practical,

21:28

inexpensive and useful. All of this is

21:30

using publicly available data, so

21:33

it's relatively cheap, and for those

21:34

reasons I think it should be used

21:36

routinely. Getting to routine

21:38

use requires tools, knowledge and

21:41

computing power. And I think we

21:44

can help certainly with the tools and

21:45

the knowledge. The research on

21:48

methods, and there's been a lot of

21:49

research on methods, I think is useful,

21:51

but it shouldn't stop applications at

21:53

this stage and in many cases where we've

21:55

compared methods the differences in

21:58

accuracy are really not that major. So

22:00

as an extreme example, there's

22:02

a recent paper that's coming out

22:04

soon that compares different methods

22:06

for combining census and survey data, and

22:09

we found that when we

22:11

simulated a targeting program, the

22:12

preferred method beat the less

22:14

preferred method slightly, and that

22:16

translated to a 0.01 percentage

22:19

point difference in simulated poverty after a

22:22

targeting program. So in my view that's

22:24

kind of small enough that it shouldn't

22:26

cause a lot of debate or hold back

22:28

adoption. You know, there can be

22:32

a lot of discussion about methods, but in

22:33

practical terms the differences are

22:35

probably not that major in many cases.

22:38

Meanwhile, the methods offer

22:41

trade-offs on interpretability,

22:43

simplicity of estimation, and parsimony in

22:45

terms of how easy it is to communicate

22:46

model parameters. And so if the

22:49

differences in accuracy in a practical

22:50

sense are not that huge, maybe one does

22:52

want to go for a method that offers more

22:54

interpretability or is simpler to

22:56

explain. There can be bias in this

22:59

kind of technique. The model-based

23:01

estimates reduce sampling error.

23:03

The reduction in sampling error

23:04

outweighs the introduction of model

23:06

error. So it's a net win. The estimates

23:08

are more accurate on average, but there

23:10

is model error. And so we have to

23:13

think about how to deal with that. And

23:15

maybe this involves some sort of redress

23:17

mechanism, being open to a

23:20

procedure for handling complaints if

23:22

there are complaints. It is possible

23:24

that a place looks less poor in terms of

23:27

its urbanization than it is. And even

23:30

though everything is more

23:32

accurate on average, these cases can

23:35

be important. And then I think we need

23:37

to think a little harder about how to decide

23:39

when estimates are sufficiently precise

23:41

to publish. For survey-based estimates,

23:43

there's often a threshold based on

23:45

the coefficients of variation adopted by

23:47

many national statistics offices.

23:50

Actually I feel this is problematic for

23:52

proportions because it

23:54

varies whether you use a measure or its

23:57

complement. So it's quite possible we

23:59

could publish the non-poverty rate but

24:01

not publish the poverty rate and that

24:03

doesn't really make sense to me. or you

24:05

could publish the in-school rate but not

24:07

the out-of-school rate. So I'm not a

24:09

huge fan of coefficients of variation

24:11

for proportions. They can be useful for

24:13

other things. And then in general,

24:15

you know, these estimates could be

24:17

useful for policy even if they're not

24:18

entirely precise. So thinking about

24:21

what the right

24:24

precision measure is, if any, for publishing

24:26

these, I think, is important. Of

24:29

course quality control regardless is

24:31

crucial. So whenever we do this there

24:33

does need to be some sort of process of

24:36

evaluating the estimates, making sure

24:37

they make sense, etc. But I do believe

24:41

that it is not useful to suppress

24:43

useful estimates, and that these

24:45

techniques can be very widely applied

24:48

and really help provide more

24:51

policy-relevant and useful data worldwide.

24:54

Thank you very much.
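The concern raised above about coefficient-of-variation thresholds for proportions can be illustrated with a short sketch. The 20% threshold, the poverty rate, and the standard error are hypothetical values chosen for illustration; actual publication rules differ across national statistics offices.

```python
def coefficient_of_variation(estimate: float, standard_error: float) -> float:
    """Standard error relative to the size of the estimate."""
    if estimate <= 0:
        raise ValueError("estimate must be positive")
    return standard_error / estimate


def passes_cv_threshold(estimate: float, standard_error: float,
                        max_cv: float = 0.20) -> bool:
    """True if the estimate would be published under a CV rule (max_cv is hypothetical)."""
    return coefficient_of_variation(estimate, standard_error) <= max_cv


if __name__ == "__main__":
    poverty_rate = 0.10           # hypothetical estimate
    standard_error = 0.03         # identical for the rate and its complement
    non_poverty_rate = 1 - poverty_rate

    print("poverty rate publishable:",
          passes_cv_threshold(poverty_rate, standard_error))
    print("non-poverty rate publishable:",
          passes_cv_threshold(non_poverty_rate, standard_error))
```

The two quantities carry exactly the same information and the same standard error, yet only the complement clears the threshold, which is the inconsistency described in the talk.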

24:58

>> Thank you, David, for walking us

25:00

through this impressive example on

25:03

Nigeria. I'm sure everybody is

25:06

looking forward to seeing the published

25:08

results, which would be something that

25:11

the NSOs would be very much

25:14

interested in picking up

25:18

on their side.

25:20

There are a couple of questions,

25:22

but we'll come to the questions

25:24

at the end so that we can answer them

25:27

along the way. I'm jotting some of them

25:29

in the chat, some of them in the Q&A,

25:32

and we have some from the

25:35

submission and registration. So I will

25:37

come back to that but really thank you

25:39

for putting this in perspective, putting the

25:41

research into practice. I mean, that's

25:44

what everybody is looking

25:47

forward to. That said, the

25:51

regional commissions are at the center

25:52

of this, making sure these come to

25:56

practice by doing the

25:59

interlocation work that brings NSOs in

26:02

the region to these advanced

26:06

research methodologies, their use and

26:08

implementation. I will transition

26:12

to our regional colleagues. We have ECA

26:15

and ECLAC, sorry, ESCAP, in this session.

26:20

We'll start with ECA. I'll share your

26:24

presentation, Angela. Angela is our

26:27

lead from UNECA who's pushing

26:31

this work with the region, the African

26:35

region, and a number of activities have

26:37

been happening in the past couple of

26:39

years on SAE. I think it would be

26:44

great to see where they are at and

26:47

what they're pushing towards. So

26:50

I'll try to put up your presentation,

26:52

Angela. In the meantime, you can introduce

26:54

yourself, please.

26:55

>> Thank you so much, Daniel. As he has

27:00

indicated, my name is Angela Chicho and

27:02

I'm a statistician at the African Centre

27:05

for Statistics of the United Nations

27:08

Economic Commission for Africa. I'm happy

27:11

to share with you. Because we've

27:13

each been given five minutes, it's a

27:15

really short presentation we're about

27:17

slide. Yeah. But it's really to

27:20

report on what we've been able to do

27:22

in the year 2025. But maybe as a way of

27:25

background I would like to share that

27:29

we've been doing this since 2023,

27:33

and we have seen, of course, improvements

27:35

based on the experiences that we

27:38

encounter as, you know,

27:42

this course

27:46

is done by the

27:48

participants. Could you kindly go to the

27:50

next slide please? Yeah. So like I

27:53

mentioned, we started in 2023 and,

27:58

in terms of trends, we have seen an

28:01

increase in the number of countries

28:04

that have been sending nominees to

28:07

participate in this e-learning

28:11

small area estimation course. I think

28:14

probably we could share the link in

28:16

the chat for those that don't know about

28:18

it to maybe go there and see, but it

28:22

is a self-paced e-learning

28:25

course and it lasts about 7 weeks.

28:30

It has several modules and so on and so

28:32

forth in terms of materials. So in

28:35

the case of Africa for example in 2023

28:38

we had only nine countries but of course

28:40

also the approach at the time was

28:42

different because the call was just open

28:45

and it was whoever saw the call that,

28:48

you know, self-enrolled, and at least in

28:50

the case of Africa we decided that we'd

28:52

pick like 30 participants

28:55

for a manageable class. In the next

28:58

year the approach was different: we

28:59

reached out to the heads of national

29:01

statistical offices and requested to

29:04

nominate uh but this that time it was

29:07

really focusing on the anglophone

29:09

countries um and not all of them at that

29:13

time though. Uh so we had about 12

29:16

countries and this time round we put out

29:20

a call to all the 54 countries including

29:22

the francophone and uh we had a response

29:26

of about 30 countries. Now specifically

29:29

in 2025 uh we got about 70 nominees as

29:34

uh people proposed by the heads of NSO

29:37

because they understand their people

29:39

that is at least the the basis for

29:42

requesting or having the heads of NSO as

29:45

the entry for us and uh we requested

29:48

them to um register themselves at least

29:51

for the anglophone one and 41 did but of

29:55

course we can see the numbers keep

29:57

reducing and then of course committing

29:59

to uh you know have to do the course

30:02

over the 7 week period. You can see the

30:04

numbers again also keep reducing and

30:07

ultimately those that actually do the

30:09

assignment because if you do not do the

30:10

assignment and pass it then uh you

30:12

cannot say that you have completed or

30:15

done the course or you know uh qualified

30:17

to take a certificate that you at least

30:20

have an idea. Yeah. So that has been uh

30:23

or that was the case with the English or

30:26

the anglophone. uh in the case of the

30:28

francophone we had nine countries that uh

30:33

sent nominees uh of which uh there were

30:37

about 27 uh 17 registered and uh

30:42

unfortunately in the case of the

30:43

francophone we were not able to have the

30:46

course done as per the model that is used

30:50

because there's an issue with some of

30:52

the materials but it is still planned

30:54

that once this is completed then uh

30:57

they will be benefiting from uh um from

31:01

the course. So the course is done in

31:03

such a way that uh for those that enroll

31:07

uh over the seven weeks at least on a

31:09

weekly basis they'll have sessions with

31:12

uh a facilitator a course facilitator

31:15

who takes them through uh what you call

31:17

a synchronous class and uh uh that is of

31:21

course still virtual. uh but during this

31:23

session um the discussions about the

31:26

challenges uh that the participants may

31:29

be experiencing especially with the

31:31

videos or the materials they would have

31:33

interacted with in the course of the

31:34

week and uh um as you can see I think it

31:41

still comes up as an issue a challenge

31:44

the the the attrition uh but we need to

31:47

see what would be a solution for it.

31:50

Now, we also did have uh an in-person um

31:54

training um for sorry, someone is trying

31:58

to call me. Yes. So for an in-person

32:00

training for the the company that uh

32:04

David has just uh taken us through and

32:07

of course the prerequisite was that uh

32:10

the the the participants for the in

32:13

person and this was uh only for for a

32:15

select um set of anglophone countries um

32:20

had to have completed the the e-learning

32:23

course the small area estimation e-learning

32:26

course with proof that they had

32:27

completed it uh because it was forming

32:28

being part of you know the base for what

32:31

was going to be learned in person though

32:34

of course this one is skewed more to uh

32:36

the geospatial uh side as you can see it

32:39

is on earth observation data so it was a

32:42

5-day more or less really intensive

32:44

workshop um in terms of the material

32:48

being uh offered and so on and uh we are

32:52

really grateful to the partnership from

32:55

the World Bank, the East African

32:57

community as well as uh UNSD um that

33:01

supported uh uh for for for the workshop

33:05

to to happen this training workshop. So

33:07

in a nutshell we had about uh 15

33:10

participating countries. Could you just

33:12

go back one sorry? Yeah. So we had 15

33:14

participating countries and those are

33:17

they and uh a total of 26 participants.

33:21

We also ensured at least for the host

33:23

country which is Kenya we invited um the

33:27

data person that uh is in the office of

33:30

the UN RCO that is the coordination or

33:33

the coordinator's office. Next please as

33:37

I conclude I just thought I'd highlight

33:40

some of the challenges that uh continue

33:42

to persist. Um I think of course people

33:46

who start with R and other uh um

33:48

packages usually beg your pardon other

33:52

packages besides R usually have a

33:54

challenge because this course is run in

33:57

R and uh so if one doesn't uh do that

34:01

foundational bit of uh learning R they

34:05

have a challenge um doing the course and

34:08

then of course uh this has really

34:10

persisted especially for the e-learning

34:13

the The completion rate continues to

34:15

remain a challenge but uh we hope that

34:19

along the way even during this webinar

34:21

we can have maybe ideas on how we can

34:24

overcome this one but there do exist

34:26

opportunities at least in the case of

34:28

Africa. We strongly see um support from

34:33

the top management and by top management

34:34

I mean the heads of the national uh

34:37

statistics offices and this is evidenced

34:40

by their responsiveness to a request for

34:42

nominees. Um the other is that uh um the

34:47

interest or for small area estimation uh

34:51

work is also evidenced in the

34:53

application especially by those that

34:55

have you know gained the skills. Of

34:58

course, later during this webinar, a

35:01

colleague from Ghana will be sharing the

35:02

Ghana experience. Um David alluded to

35:05

the Nigeria case but then also as

35:08

individuals some uh uh uh uh

35:10

participants from the course have gone ahead

35:12

to do their own papers. An example is

35:15

some colleagues from Kenya and uh we

35:18

still think that uh at least not to

35:21

leave people behind. So it would be

35:24

great to have the materials you know

35:27

expanded to other packages that people

35:29

are more familiar and more comfortable

35:31

with uh be it Stata or Python. So I'd like

35:35

to thank us for uh thank you all for

35:38

your attention and uh also appreciate

35:42

our partners uh once again and uh thank

35:45

you so much for listening to me.

35:48

>> Thank thank you Angela. Thank you uh

35:50

taking us through the story of the region in

35:52

the past year and I think uh I was um uh

35:56

part of the the in-person training and I

36:00

have witnessed you know the the progress

36:02

even from countries and also the the you

36:06

know the innovative ways of participants

36:09

uh pulling other I think you were

36:11

mentioning about other tools like Stata

36:13

and python we had a number of

36:15

participants who were proficient in

36:17

stata who were able to even pull their

36:20

data from Stata into uh uh this this

36:23

workshop and I think it's it's something

36:26

that we need to think about putting some

36:28

more examples in that in that area.

36:30

Thank you for that. We'll come back to

36:32

questions later on uh at the end. Now

36:35

I'll pass it to uh and you know and

36:37

saving some time I'll pass it to uh Roth

36:40

from uh Bangkok. Uh he's also leading

36:44

the regional uh work in Asia and

36:47

Pacific. Um, back to you. Uh, please go

36:51

ahead.

36:53

>> Thanks so much, Daniel. Um, so give me a

36:56

few second while I share my screen. Um,

37:00

hopefully it's showing up all right.

37:04

Seems to be okay on my side. Can you let

37:08

me know if you see the screen?

37:11

>> Okay, perfect. Uh well thanks thanks

37:13

again Daniel and thanks uh David and

37:16

Angela for um sharing the work that

37:18

you've been doing. Uh good morning good

37:21

afternoon uh evening everyone. So my

37:23

name is Sana Rod. I'm an associate

37:25

statistician from the statistics

37:27

division here at ESCAP. Um and it's a

37:29

pleasure to be with you today and share

37:32

uh ESCAP um well uh go beyond the

37:35

training uh to to share a bit more on

37:36

the capacity development program um once

37:40

that uh that we've uh conducted back in

37:42

20 uh 25.

37:45

Um I should first start off by saying

37:48

that uh ESCAP has been working closely

37:50

with uh UNSD, ECA uh and UN agencies

37:54

such as UNICEF and development partners

37:56

namely the World Bank to implement the

37:58

capacity development program. uh so

38:00

their support was uh very much crucial

38:02

to the success of the 2025 um activities

38:06

and even before uh getting into the

38:08

activities I should also mention that um

38:11

our SAE program built on the decisions

38:14

of our previous committee uh I mean back

38:17

since uh the seventh session uh which

38:19

was in 2020 to prioritize data

38:22

integration and innovation integrate big

38:24

data into official statistics and

38:26

promote a whole of society approach um

38:29

to implement the um our uh declaration

38:33

on navigating policy with data to to

38:36

leave no one behind. So there's history

38:37

to to that and with that foundation we

38:40

carried out the following

38:42

activities

38:43

um back in 2025. Well, first of all, uh

38:47

in collaboration with uh UNSD, ECA and

38:50

the World Bank, we published the uh

38:52

how-to guide on um geospatial uh SAE in

38:57

R. Um it's a practical guide with uh

39:00

runnable codes and real data uh for

39:03

practitioners and policy makers alike

39:05

interested in uh geospatial uh

39:09

uh the guide is available in both HTML

39:11

and GitHub versions as you can um should

39:15

be able to see in a bit. Uh I'll share

39:17

my screen to uh sort of give you a quick

39:19

demo of the guide. Um so this is the HTML

39:24

version. So you can see it's very

39:26

interactive. Uh we also prepared uh a

39:30

short sort of demo for you as well if

39:32

you would like to know how to navigate

39:34

um the guide. I won't play it because we

39:36

we're running out of time. Uh but do but

39:38

do please uh you know watch it and and

39:41

you know navigate the the guide um you

39:44

know in in your spare time. I'll share

39:47

the link um with you in the chat as

39:50

well. Uh there's sub uh components to

39:53

each chapter that you can go through. um

39:55

you know there's uh chapters on setting

39:58

up R; as Angela mentioned, some

40:00

participants in the workshop that I'm

40:02

going to tell you about later on um had

40:05

you know uh trouble uh getting used to

40:08

working in R and we've uh incorporated

40:10

that in here as well um but um overall

40:16

this guide is is uh essentially a

40:19

practical walk through where uh if

40:22

you're interested in in geospatial SAE

40:24

um you can just simply go through the

40:27

chapters um and you'll be able to uh

40:30

you'll be able to um you know not only

40:33

understand the importance and the

40:35

usability of geospatial uh SAE but

40:38

you'll also be able to uh run the codes

40:41

and um you know uh follow along the

40:45

examples and eventually um use uh all

40:49

this knowledge and codes uh for your own

40:52

uh country context or indicator of of

40:54

interest. Um, so let me get back to the

40:59

uh slide,

41:03

right? Um,

41:06

yes. So, um, that's that's the uh how-to

41:11

guide. Uh, and I highly encourage

41:13

everyone to visit it. Uh, the link will

41:15

be shared with you in the chat later on.

41:19

Uh second with support from UNSD and the

41:21

World Bank we organized um our 2025

41:26

Asia-Pacific capacity

41:28

development program on SAE uh and we uh

41:32

combine virtual e-learning uh

41:34

facilitated by um you know an expert

41:37

from the World Bank and uh with

41:40

in-person uh regional workshop in

41:43

Bangkok on geospatial SAE which I will

41:46

uh briefly tell you uh later on as Well,

41:49

so the guided e-learning session started um on

41:52

the 2nd of October um and it concluded

41:56

on the 14th of January because um it it

42:00

includes seven weeks of facilitated

42:02

online classes um and participants went

42:05

through um you know uh sessions where we

42:09

have um the expert uh uh conducted um uh

42:14

tutorials uh essentially uh through the

42:17

e-learning

42:18

uh course uh that we developed and links

42:21

will be shared again uh in the chat box

42:24

uh on the SIAP um the SIAP platform. uh

42:30

SIAP is our Statistical Institute

42:32

for Asia-Pacific. Um and through that

42:35

link uh participants went through uh all

42:38

of the uh materials. Um and all of the

42:42

participants uh were able to complete uh the

42:45

course including the graded assignments

42:49

uh and they were able to um uh attend

42:52

the workshop which was similar to what

42:55

Angela mentioned was the main

42:57

requirement for participants to uh

42:59

participate in the in-person workshops.

43:01

Um there were 20 of them for the 2025

43:04

cohorts and from uh 10 uh countries. So

43:07

two per country. Uh and let me get the

43:11

list for you. They're from Bangladesh,

43:13

India, Indonesia, Malaysia, Pakistan,

43:16

Philippines, Sri Lanka, Tajikistan,

43:17

Thailand, and Vietnam. We're lucky to

43:20

have uh one participant from Malaysia,

43:22

Miss Fisa uh here with us who will be

43:25

able to share her experience later on.

43:28

Um but yeah so this this uh course um

43:33

enable participants to be able to

43:35

prepare themselves for the in-person workshop

43:37

and contribute um as as much as they

43:41

can. Um speaking of the workshop uh it

43:46

happened from the 24th to the 28th of

43:51

November. Um and uh it was attended by

43:54

all uh 20 participants. The focus of the

43:58

workshop was on providing hands-on

44:00

capacity uh building support uh to NSO

44:03

staff essentially on using uh R or a

44:06

little bit Python to conduct geospatial

44:08

SAE for their indicators of interest.

44:12

Uh we had um Dr. Josh Merfeld some of

44:15

you might have uh worked or known him.

44:19

uh he was our facilitator and he taught

44:21

participants on uh working with shape

44:23

files, raster data, uh packages uh like

44:26

the one that um David mentioned earlier

44:29

uh geol uh titer puff map uh and

44:33

estimating models uh such as obviously

44:35

you know pop uh mentioned earlier and

44:38

then uh uh it's gradient uh boosting uh

44:42

as well um participants did bring uh

44:46

their own well many of them brought

44:48

their own data sources uh and they were

44:50

able to estimate their indicator of

44:52

interest using the um the skills uh

44:56

learned from the workshop. Um uh if

44:59

you're interested you can click on the

45:01

link or I will share in the chat box

45:02

again um to learn more about uh the

45:06

workshop itself. Um and the link at the

45:10

bottom there is another link uh in

45:12

reference to the the how-to guide uh

45:15

mentioned earlier. Last but not least,

45:18

uh we wanted to extend our capacity

45:20

building initiatives even further. Uh so

45:24

we organized our Asia-Pacific Stats Café

45:27

on geospatial SAE on the 27th uh of

45:31

January 2026. Um, we invited

45:34

participants from our workshop and

45:36

e-learning program to share their

45:37

experiences uh and lessons learned in a

45:40

panel uh with expert reflections from

45:43

Josh and a resource person uh from uh

45:47

BPS uh Indonesia.

45:50

Uh we also took the opportunity to

45:51

showcase the how-to guide again on

45:54

geospatial SAE. As you can see, we're

45:56

very very proud of that uh with with uh

45:59

in collaboration with a colleague from

46:01

ECA uh UNSD and World Bank. Um and we've

46:05

got uh quite a number of participant uh

46:07

registered 60 plus uh and uh uh the uh

46:12

the panel discussion was lively and then

46:15

everyone was was very engaged uh and

46:17

more information on that can be found

46:19

again in the link that will be shared uh

46:21

in the chat uh later on. So that's all

46:25

from ESCAP in 2025 and we look

46:28

forward to uh to uh 2026 and and what uh

46:32

the year will bring. Um thanks again for

46:35

the opportunity Daniel. Uh back to you.

46:39

>> Thank thank you Ro. I mean it's it's

46:41

really good to see uh what's happening

46:44

in in the African region and in the Asia

46:46

Pacific and uh I'm probably um

46:50

participants have picked up this is

46:52

happening uh for people who are

46:55

motivated because it's not it's not uh

46:58

uh other kind of courses you just go for

47:00

a week. Uh we also have a prerequisite

47:03

of people finishing the 7-week module

47:07

that really uh puts uh the groundwork

47:10

ready for uh the 5-day workshop physical

47:13

workshop at the end. Uh we will be

47:16

sharing all this uh information later on

47:20

all the right links. Um I see a couple

47:22

of questions coming which which link

47:24

where it where the e-learning courses

47:26

and things like that they will be shared

47:29

and the recording of this meeting along

47:32

with the presentation will be shared

47:34

with all the participants who registered

47:36

here. So now with with all the regional

47:39

commission's backing and the UNSD and

47:42

our partners and in the world bank

47:45

countries have been uh participating in

47:47

this uh uh taking the course on online

47:51

and then also doing the physical uh

47:54

course for a week. uh it it is time now

47:58

to switch to countries and see what's

48:01

happening from uh their uh their side of

48:04

uh the the the the work on SAE. Without

48:08

further ado, I will pass this to our

48:12

colleague from Chile from the National

48:14

Institute of Statistics in Chile who

48:17

will be showcasing their uh SAE

48:20

experience and they've been doing a lot

48:22

of experimental statistics using SAE and

48:25

we'll be uh yeah following up our

48:28

presentation. Miss Aier will be

48:30

presenting uh uh the activities in in

48:33

INE. Uh the floor is yours Mr.

48:39

Hello. Uh so well good morning everyone.

48:42

Thank you for the invitation. C can you

48:44

see my presentation?

48:47

>> We saw it and then it's gone. Can you

48:50

reshare again?

48:51

>> Oh

48:55

me

48:57

put it on presentation mode and I think

48:59

it's not working.

49:01

>> Okay.

49:04

>> Huh?

49:05

>> Yeah we can see it. Okay,

49:07

>> thank you.

49:07

>> Okay, thank you.

49:10

So, well, my name is Javier Torres. I'm

49:13

from Chile and we're going to show you

49:16

the results of the implementation of

49:18

small area estimations uh in the

49:21

national victimization survey.

49:24

So a little context uh our victimization

49:28

survey is called national

49:32

and it's one of the longest

49:34

victimization surveys in Latin America.

49:36

It's collected annually since 2005

49:40

and it has a sample of about 24,000

49:43

households providing national and

49:46

regional representativeness.

49:48

So in 2023 we had a major redesign of

49:52

the survey. We redesigned the

49:54

questionnaire and the sampling and with

49:56

that uh we collected the first survey

49:59

with a communal representativeness for

50:02

the urban areas of 136 communes. Uh

50:06

this came from a growing demand from our

50:09

government for better geographical

50:11

disaggregations given the characteristics

50:13

of the phenomenon. Uh however uh having

50:17

uh a communal survey every year it's

50:21

uh expensive and and it's not

50:24

sustainable on the long run. So for 2024

50:27

we use SAE to obtain reliable estimates

50:30

of the proportion of households

50:32

victimized by violent crimes which is

50:35

the main estimate of the survey.

50:38

uh this for the 136 communes of the

50:41

design and the results were just

50:43

published we published in January and

50:45

you can see it on our website.

50:49

So uh regarding SAE we started working

50:52

back in 20 uh 2018 actually with the

50:56

first uh capacity-building phase uh

50:59

this came from assistance from ECLAC. So

51:03

we started working in 2018 with the

51:06

survey uh from that moment which is like

51:08

the old version of the survey and we

51:11

work with the proportion of households

51:12

victimized by high social impact crimes.

51:15

Uh this uh was a working paper that it

51:19

was published in 2024 and it gave us uh

51:23

a lot of lessons mostly that uh it was

51:26

needed to have a like a strong

51:28

theoretical framework and establish a

51:30

strong criteria for evaluating the

51:33

results and that these were consistent

51:35

with the phenomenon.

51:37

So uh well from 2018 to 2022 we mostly

51:44

work in the capacity building in NSO and

51:47

then for 2024 we establish uh SAE

51:51

estimations as official estimations of

51:54

the survey. So uh in 2023 we had an

51:58

exercise with the communal uh survey uh

52:03

which allow us to understand better the

52:06

covariates and the needs of the model.

52:10

So for uh 2024 as I told you we use uh

52:15

the survey we use SAE as an official uh

52:19

estimates or mostly it was uh

52:24

so we use uh uh some methodological

52:27

framework we use aid uh proposed

52:30

framework uh for the specific

52:32

specification phase

52:35

uh we evaluate the user needs uh the

52:37

data availability and the SAE methods

52:40

available. So for the user needs, we

52:43

defined that our estimator was going to

52:45

be the proportion of households

52:47

victimized by violent crimes which is um

52:53

oh here it's a aggregate of seven

52:56

different crimes such as robbery or

52:59

assault. And for the data availability uh we

53:03

put efforts on creating a theoretical

53:05

framework which guided uh the search for

53:09

the covariates. So we establish some

53:12

dimensions like you know socioeconomic

53:13

or socio demographic characteristics but

53:16

we also look for data regarding uh crime

53:19

and victimization like police records

53:22

and infrastructure and environment such

53:24

as satellite images, national uh

53:27

community information and

53:30

and the system and informal settlements

53:33

data.

53:36

Uh this uh this phase we also chose our

53:39

target of estimation which is commune.

53:41

Uh commune is the smallest

53:42

administrative subdivision in Chile and

53:45

it's equivalent to a municipality. And

53:48

finally we decided on using the EBLUP

53:50

based on the Fay-Herriot model, as it has been

53:53

applied to poverty estimations in Chile

53:55

and is quite consolidated as a

53:57

methodology.
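
[Editor's illustration] The EBLUP under a Fay-Herriot (area-level) model combines each commune's direct estimate with a regression-based synthetic estimate, weighting the direct estimate more heavily when its sampling variance is small. The sketch below is a minimal version with an intercept and a single covariate. The function name, the inputs, and the assumption that the model variance `sigma2_u` is already known are illustrative only; the speaker does not describe the actual implementation, and in practice the model variance is estimated (for example by REML) with dedicated SAE software.

```python
def fay_herriot_eblup(direct, sampling_var, covariate, sigma2_u):
    """Area-level EBLUP with an intercept and one covariate.

    direct       : direct survey estimates, one per area
    sampling_var : sampling variance of each direct estimate
    covariate    : one auxiliary variable per area
    sigma2_u     : model (between-area) variance, assumed known here
    """
    # Weighted least squares: weight = 1 / (model variance + sampling variance)
    w = [1.0 / (sigma2_u + psi) for psi in sampling_var]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, covariate)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, direct)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, covariate))
    sxy = sum(wi * (xi - xbar) * (yi - ybar)
              for wi, xi, yi in zip(w, covariate, direct))
    slope = sxy / sxx
    intercept = ybar - slope * xbar

    estimates = []
    for yi, psi, xi in zip(direct, sampling_var, covariate):
        gamma = sigma2_u / (sigma2_u + psi)   # shrinkage factor in [0, 1]
        synthetic = intercept + slope * xi    # regression-based prediction
        estimates.append(gamma * yi + (1.0 - gamma) * synthetic)
    return estimates
```

With a large sampling variance the shrinkage factor approaches 0 and the estimate moves towards the synthetic prediction; with a very small sampling variance it stays close to the direct estimate.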

54:00

Uh after that uh we went on the analysis

54:03

and adaptation phase. uh so to reduce

54:06

the volatility associated with the

54:08

estimates uh the sampling variance is

54:11

modeled using a generalized variance

54:13

function and this gave us a more stable

54:16

and robust measure of variance which is

54:18

subsequently used as an input for the

54:20

model. We also establish a domain

54:23

inclusion criteria which give us uh how

54:26

many communes will have a purely

54:29

synthetic estimation or which ones was

54:31

going to be uh direct and synthe

54:34

synthetic.

54:35

So um part of this criteria were the

54:39

degrees of freedom the number of

54:41

observations and the design effect for

54:43

each domain and then we search uh for

54:46

our final model.
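
[Editor's illustration] A generalized variance function (GVF) replaces noisy direct variance estimates with values predicted from a simple regression, which is what makes the variance input to the model more stable. The sketch below assumes one common log-linear form, log(variance) = a + b * log(sample size). That functional form and the function names are assumptions for illustration; the speaker does not say which GVF specification was used.

```python
import math


def fit_gvf(direct_var, sample_size):
    """Fit log(var) = a + b * log(n) by ordinary least squares."""
    lx = [math.log(n) for n in sample_size]
    ly = [math.log(v) for v in direct_var]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxx = sum((x - mx) ** 2 for x in lx)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def smooth_variance(a, b, n):
    """Smoothed sampling variance predicted for a domain with sample size n."""
    return math.exp(a + b * math.log(n))
```

The smoothed variances, rather than the raw direct variance estimates, would then be passed to the area-level model as the sampling variances.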

54:49

So uh for our model we have a model

54:53

selection algorithm uh

54:56

statistical criteria and conceptual

54:58

validation. I'm going to explain that

55:01

very shortly each one of them.

55:05

So uh we use a baseline model uh

55:08

including regional dummy variables. Uh

55:11

we use a stepwise selection and we did

55:14

the exploration of uh combinational

55:16

variable subsets. So we uh throw all the

55:19

covariates that we have and try to

55:22

simulate different models.

55:25

Uh we also well we look for the

55:28

statistical significance of the

55:30

covariates, the AIC and the BIC. Uh we also

55:33

look for the residual diagnostics and

55:36

multicollinearity

55:38

checks and the prevent benchmarking

55:40

check. And finally for the conceptual

55:43

validation we selected covariates that

55:47

are expected to cover key thematic

55:48

dimensions related to victimization.
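
[Editor's illustration] Comparing candidate covariate subsets by AIC and BIC can be sketched as follows. This uses the ordinary least squares form of the criteria with constant terms dropped, which is an assumption made to keep the example short; an actual Fay-Herriot fit would report AIC and BIC from its own likelihood. The function name is hypothetical.

```python
import math


def gaussian_aic_bic(residuals, n_params):
    """AIC and BIC (up to an additive constant) for a Gaussian linear model.

    residuals : model residuals, one per domain
    n_params  : number of estimated regression coefficients
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    fit_term = n * math.log(rss / n)          # -2 * log-likelihood, constants dropped
    aic = fit_term + 2 * n_params
    bic = fit_term + n_params * math.log(n)   # BIC penalises extra covariates more as n grows
    return aic, bic
```

Lower values indicate a better trade-off between fit and model size, which is only one input alongside the residual diagnostics and the conceptual validation described in the talk.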

55:54

Uh so uh

55:58

for the evaluation finally after we have

56:01

our model so we evaluate its precision

56:03

and uncertainty and to verify the

56:06

assumption of linearity we check the

56:08

residuals looking that there's no

56:10

influential domains. Uh we also look for

56:13

consistency with the regional estimates

56:16

checking that the SAE estimates fall within the

56:18

confidence intervals of the direct

56:20

regional estimates.

56:22

uh in this case uh region is the first

56:24

level administrative division in Chile

56:26

and the communes are part of the region.

56:29

So we expected that you know no uh

56:32

domain were uh above the um confidence

56:36

intervals.
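
[Editor's illustration] One way to run this kind of consistency check is to aggregate the commune-level SAE estimates up to the region, weighting by the number of households, and see whether the result falls inside the confidence interval of the direct regional estimate. The function below is a sketch under that assumption; the weighting scheme, the 95% normal interval and all names are illustrative and are not taken from the presentation.

```python
def regional_consistency(commune_est, commune_households,
                         region_direct, region_se, z=1.96):
    """Check aggregated commune estimates against a direct regional estimate.

    Returns the household-weighted aggregate and True if it lies within
    region_direct +/- z * region_se.
    """
    total = sum(commune_households)
    aggregated = sum(p * h for p, h in zip(commune_est, commune_households)) / total
    lower = region_direct - z * region_se
    upper = region_direct + z * region_se
    return aggregated, lower <= aggregated <= upper
```

A region that fails the check would be a signal to revisit the model or to consider benchmarking the commune estimates to the regional total.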

56:38

Uh we also look for error measures. Uh

56:40

particularly we compare the root mean

56:42

square uh error of the SAE with the

56:45

direct estimations of the survey.

56:48

And this show us uh this is the the

56:50

graphic we have here and this show us

56:52

that um the SAE is significantly more

56:56

efficient than direct estimations

56:58

especially in areas with a smaller

57:00

sample sizes. Uh so in the figure

57:03

we also sort the communes by sample

57:05

size uh which is highly associated with

57:08

the reliability of the estimation

57:10

according to our NSO standards.

57:16

So um we have the point estimations this

57:20

is for 2024

57:22

uh where we can see the EBLUP

57:25

estimations tend to be more conservative

57:27

than direct estimations. uh in blue uh

57:30

we can see the direct estimations

57:33

uh in the in the areas with the smaller

57:36

sample sizes also as we expected uh for

57:39

the communes with bigger sample sizes uh

57:42

both estimators are quite

57:44

similar.

57:46

So this is sorted by sample size. Uh

57:48

this is smallest and the uh biggest on

57:51

the right.

57:55

Uh so uh well this is has been a long

57:59

work for us. We have been working on

58:00

this for the past six years and there is

58:03

a lot of lessons that we have taken from

58:06

the different exercises that we have

58:08

made. So uh first the quality of uh SAE

58:12

estimates depends uh strongly on the

58:15

relevance the coverage and consistency

58:18

of the auxiliary variables which are not

58:20

easy to find. uh regarding the

58:23

limitations of the administrative

58:25

records use we identified in some cases

58:28

coverage limitations. So we worked with

58:31

uh 136 communes from around 300 that

58:34

there is in Chile and it was really hard

58:37

to find data that covered the

58:40

136.

58:42

uh we also find missing data in the

58:44

administrative records and

58:46

inconsistencies which uh made it

58:48

necessary to perform some imputations or

58:51

in some cases to reject the use of the

58:54

data.

58:56

Uh also uh to use an aggregated

58:59

victimization indicator can facilitate

59:02

the explanation of the phenomena but

59:04

introduces uh challenges in generating

59:08

predictive models. As I told you, this

59:10

was seven uh different crimes with seven

59:13

uh different characteristics. So, it's

59:15

it's harder to find uh auxiliary

59:18

variables that can

59:20

work with the seven of these crimes.

59:24

Also, uh models with high predictive

59:27

performance are not always easily

59:29

communicable. So, for public policy

59:32

context, it is essential to balance

59:34

statistical precision with

59:36

interpretability and transparency.

59:39

uh for us also incorporates

59:41

incorporating the SAE requires

59:44

safeguarding compatibility with direct

59:46

estimates and across periods which uh is

59:50

related to

59:54

to the last point. Uh the annual

59:56

frequency of our survey uh requires

59:58

iterating and adjusting and

60:00

progressively validating the models

60:02

strengthening their institutional use

60:04

over time and requires uh distinguish

60:08

distinguishing stages of methodological

60:11

learning, model testing and

60:12

implementation which do not always align

60:14

with the timelines of foring results. So

60:17

we have a very short timeline here. We

60:20

uh gathered the data in the last

60:23

trimester of the year. we publish during

60:25

the second trimester of the year. So, uh

60:28

actually being able to produce SAE

60:31

estimations over uh each year, it's uh a

60:35

serious challenge for us

60:38

and for the future as we already have

60:42

our first version of the SAE work. Uh we

60:45

are looking to expand uh to the use to

60:50

other indicators of interest. for

60:52

example uh the dark figure of crime or

60:55

the perception of insecurity. So it is

60:58

important for us to understand that the

61:00

results depend both on the quality of the

61:02

auxiliary information sources

61:05

and on aspects related to direct

61:07

estimations.

61:09

Uh we also well as the limitations of

61:13

the administrative records uh we

61:15

identified in some cases uh you know

61:18

limitations as as a missing data. So we

61:22

need to we need to search uh we we need

61:26

to search data every year for one uh

61:30

data that working for 2024 might not be

61:33

updated for 2025 and so on. So it's

61:37

a constant work

61:39

and

61:41

finally uh as although it is possible to

61:44

make uh reliable uh commune-level

61:47

estimates for the 2024 period uh

61:50

for us uh

61:53

uh it's it has been a difficulty that we

61:57

had that as we produce uh direct

62:00

estimations for 2023

62:03

and synthetic estimations for 2024. We

62:07

cannot rely on the usual statistical

62:09

test to compare the estimates with the

62:12

previous period and this is going to

62:14

happen to us again in 2025 because we

62:17

have direct estimations again. So this

62:19

is something that we're uh trying to

62:22

research uh how to

62:25

um how to use this data and avoid uh the

62:29

comparisons between the two different

62:32

methods. So that's kind of what we are

62:35

doing right now. Uh thank you

62:40

>> many thanks Navier for an excellent

62:42

presentation and also showcasing you

62:45

know the the this the history of SAE in

62:48

in in the National Institute of Chile.

62:51

It it it shows clearly you know the

62:53

maturity of the work you you're doing.

62:56

Uh I will come back for the Q&A later on

63:00

but I will pass now to uh the other

63:02

continent

63:04

uh to Africa and uh Edward from Ghana

63:07

Statistical Service will be presenting

63:10

their experience in SAE uh will come

63:13

back uh to the questions later on. Uh

63:16

Edward the floor is yours.

63:19

>> Okay. Thank you. Um good afternoon once

63:21

again from Ghana.

63:24

Um I'll be presenting on Ghana's

63:26

experience using SAE and then what we

63:28

have been doing so far so date

63:42

screen is back.

63:44

>> Okay. Please can you see my screen?

63:47

>> Yes. Yes, we do.

63:49

>> Okay. Thank you. So

63:57

is it rolling or is still the same page?

64:02

Okay, sure. So this is going to be the

64:04

outline for the presentation. And I have

64:06

the introduction and then why we are

64:08

into SAE, the entry point where we

64:11

started from the capacity building and

64:13

training from training to practice what

64:15

we have been doing and then what we have

64:17

done so far and then what we are doing

64:20

as GSS when it comes to SAE to

64:22

institutionalize it and then challenges

64:25

encountered so far um and then how we

64:28

are moving forward and then looking at

64:30

what we will do in the future.

64:34

So in Ghana we have 16 regions that's

64:36

admin 2 regions and then we have 261

64:39

metropolitan, municipal and district

64:41

assemblies that's for the admin 3 mostly

64:44

our um surveys are actually at the admin

64:49

level but the demand for data is always

64:52

coming in from the admin level because

64:54

of some of the local government policy

64:56

that they have over there. So what we

64:59

adapted is also part of this SAE

65:02

actually we have to estimate for them

65:04

and mostly during the past we couldn't

65:06

because we didn't have any idea of these

65:09

estimates and then nobody will give them

65:11

the original and then try to make

65:14

assumptions around how the districts

65:15

will be like but for now that we have

65:18

been trained in SAE from

65:21

um

65:22

World Bank, UNSD, UNECA, UNFPA and in

65:26

other regional commission

65:28

we have some ideas of how we can do

65:31

these estimations using the SAE methods

65:33

that we have been trained on. I

65:35

particularly was part of the just recent

65:38

um ended training that we had in Kenya.

65:40

I was present there and then with Daniel

65:43

and then the team from World Bank.

65:46

So we have we have been through this

65:49

training from the past. I remember my

65:52

colleagues also were there some time ago

65:54

and then I also came and then we are

65:56

also training other people in the

65:58

office. What we do is that in most work

66:01

that we take under SAE we try to

66:04

include a particular person or two to

66:07

also be part of the work so that they

66:08

also get some of the experiences

66:11

in SAE so that we all work together and

66:13

then also get more people with some

66:16

skills in SAE to help us work

66:19

from that. We, the Ghana Statistical Service,

66:22

has been able to publish 15 reports

66:24

using SAE and then these reports were

66:27

from the Ghana demographic and health

66:29

survey um data and then the population

66:31

housing census data and then it's it

66:34

ranges from the um exclusive

66:36

breastfeeding childhood immunization

66:38

women's empowerment gender based

66:40

violence excessive alcohol consumption

66:42

double burden of malnutrition we even

66:44

have um um excessive alcohol content and

66:48

all that we have a lot they or 15

66:50

reports that has been published together

66:52

with the UNFPA and the USAID DHS. So in

66:57

this report we try to use different

67:01

methods. We use the ELL methods, we use

67:04

the EBP, we use Fay-Herriot. The the the

67:09

reason why we chose a particular method

67:11

to use sometimes revolve around how we

67:13

interpret the results and then the

67:15

assumptions the estimate methods to the

67:18

policy makers. So that also embrace the

67:20

results that they are seeing. And

67:22

sometimes we test these methods to see

67:24

which one is giving us the best

67:26

estimate. Not just statistical

67:27

assumptions, but we look at what we know

67:30

from these districts, the data we have

67:33

from the past and what the estimates are

67:34

giving us. which one is closest to being

67:37

true or being real regarding the numbers

67:40

we are having and then sometimes we test

67:44

the assumptions like the UNFPA method

67:46

that we are seeing, the ELL. Normally we

67:48

do logistic regression on these

67:49

estimates

67:51

um on these data sets and then we test

67:53

the ROC's that's the area under the

67:55

curve assumptions we test a lot of

67:57

things to see which one is performing

67:58

better before we choose the model to
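
One way to carry out the comparison described here is to score each candidate model by its area under the ROC curve (AUC) and prefer the higher value. The Python sketch below is a minimal illustration, not the GSS workflow: the outcomes and predicted probabilities are made up, and `auc` is a hypothetical helper that uses the pairwise (Mann-Whitney) formulation.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    Counts, over all (positive, negative) pairs, how often the positive case
    receives the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = household has the characteristic) and the
# predicted probabilities from two candidate logistic models.
y = [1, 0, 1, 1, 0, 0]
model_a = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1]
model_b = [0.6, 0.5, 0.4, 0.7, 0.6, 0.2]

auc_a = auc(y, model_a)
auc_b = auc(y, model_b)
better = "A" if auc_a > auc_b else "B"
```

AUC only measures how well a model ranks cases, so it complements rather than replaces the plausibility checks against known district figures mentioned earlier.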

68:06

And then we are also trying, as Ghana Statistical Service,

68:09

to like I said institutionalize

68:12

the G um the SAE method in Ghana. So we

68:16

release a report not just testing these

68:19

estimates or testing these methodologies

68:22

but we release the report to policy

68:24

makers to also use to the district level

68:27

local government policy makers to also

68:29

use sometimes to we invite them to these

68:32

publications so that they get sense of

68:35

whatever we are doing for them. the

68:36

numbers we are given to them, how they

68:38

were generated, how the estimates came

68:40

about so that they know what we are also

68:42

going to use the numbers for, what even

68:44

went into them and there are still

68:46

processes in place to make sure that SAE

68:49

methods become core part of whatever we

68:52

do in GSS most are incorporating it

68:55

apart from the um DHS report that we

68:58

have done currently as I'm sitting here

69:00

we are also working on the MPI reports

69:02

that were released um some months ago I

69:05

think last

69:06

So we are doing that was also at

69:08

regional level and it's like a trend

69:10

analysis from 2022 to 2025. We are also

69:13

trying to run SAE for all these years

69:16

for all the districts that's the 261

69:19

districts that we have in Ghana and it's

69:21

not just the headcount of poverty but we

69:23

also doing it at the indicator level and

69:26

also even running the intensity of

69:28

poverty for all these districts. So with

69:31

all these models there are different

69:33

different methods that we are trying to

69:34

use. Currently we are done with the head

69:36

counts, which was done using the ELL; the

69:40

intensity is um I've done for 2023 and

69:44

then we have we tested the EBP, Fay-Herriot and

69:48

then the Bayes approach. The reason is

69:52

that we can't estimate intensity of

69:53

multi-dimensional poverty for each

69:55

household but it has to be an area level

69:57

indicator so we need an area level model

70:00

to do this so that's why we are trying

70:02

to use the Fay-Herriot, the EBP and then the

70:05

Bayes, that's the Bayesian approach, and then

70:08

the current one that we have that has

70:10

been accepted is based on the EBP so we

70:13

are trying to use the EBP to estimate

70:14

intensity for all the years also so GSS

70:18

is currently doing a lot and I won't say

70:20

we haven't faced some challenges we have

70:23

and then that's also the reason why

70:25

these are also based on purely census

70:28

data sets we haven't actually included

70:30

um

70:32

geospatial data sets yet because sometimes

70:34

it comes with data harmonization between

70:36

the survey, census and even the

70:38

geospatial data sets and then also

70:41

communicating uncertainty and model

70:42

based estimates to nontechnical users

70:44

are also a challenge sometimes. So we

70:47

try to find a way of interpreting them

70:49

in simpler terms or in simple languages

70:52

for them to understand to also cherish

70:55

what we are doing at GSS and then also

70:57

choosing the right auxiliary variables

71:00

also becomes difficult sometimes. I mean

71:04

it's like the MPI that we are doing

71:06

sometime we have to also make sure that

71:07

the indicators that we used to

71:08

estimating the head count poverty

71:10

doesn't also end up in the um the

71:13

auxiliary variables that we are using

71:15

the predictors we are using to estimate

71:17

poverty so these are all difficult

71:19

sometimes you need to bring in other

71:21

external data set but because of how you

71:23

have to harmonize and then you are not

71:25

getting the right variables to me data

71:27

sets choosing the right auxiliary

71:29

variables become a challenge but well we

71:31

are doing our best and Then also the

71:33

right model like a lot of challenges

71:35

goes through when choosing the right

71:37

model. And then we are facing capacity

71:39

constraint when it comes to we were let

71:42

me put that way facing capacity

71:44

constraints when it comes to estimating

71:46

models but we try to put a member who

71:51

have not had a skill in SAE in the team

71:54

so that the person gets something in the

71:57

tips when it comes to SAE. So the next

71:59

time we can also rely on this person to

72:01

help us in SAE models

72:05

and then what help us move forward I've

72:08

mentioned one we put members in the team

72:09

so that they can do it and then we also

72:12

learn from each other so we communicate

72:14

sometime we bring people from external

72:16

to also discuss what we are doing so

72:18

that they also give us their point of

72:19

views and then we put them together and

72:21

then we also had great leadership from

72:23

the part of the GSS management where

72:25

they also accept that okay this is not

72:27

our traditional way of doing things.

72:29

This is something new

72:31

but we have accepted it and we are also

72:33

trying to put them forward and then we

72:34

give them also thanks and then looking

72:37

ahead. So currently there are plans for

72:39

the GSS data science team to also look

72:42

at external data sources and how we can

72:44

incorporate that is also solving our

72:47

challenges when it comes to data

72:48

harmonization and then I mentioned the

72:51

work of poverty multi-dimensional

72:52

poverty that we are working on there's

72:54

labor statuses coming we are working on

72:56

the Ghana Living Standards Survey

72:58

currently we also working on the

72:59

consumption and the poverty line

73:01

estimates which when it's done we are

73:03

also going to run estimates for the

73:05

district levels because that one is also

73:07

representative at the regional level and

73:09

then the data science team is working on

73:12

um call detail records and then I am not

73:15

part of that team so I don't really know

73:17

what is in the data set but they are

73:19

working on how they can get some

73:20

variables to help us estimate these um

73:24

um models at the district levels

73:27

something that will distinguish one

73:29

district from another not just based on

73:31

the human beings living in it but

73:34

something that is peculiar to a

73:35

particular district from the data set so

73:37

That is also work that we are also doing

73:39

at the Ghana statistical service. Thank

73:42

you.
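
The point made in this presentation, that intensity of multidimensional poverty is an area-level quantity rather than something defined for a single household, follows from how it is computed: it is the average deprivation score among the poor households of an area. The Python sketch below follows the usual headcount (H), intensity (A) and MPI = H x A decomposition. The poverty cutoff of 1/3, the district names, the deprivation scores and the weights are all assumptions for illustration, not GSS figures.

```python
def mpi_by_area(households, cutoff=1 / 3):
    """Return {area: (H, A, MPI)} from (area, deprivation_score, weight) rows.

    H   = weighted share of households whose score is at or above the cutoff.
    A   = weighted mean deprivation score among those poor households.
    MPI = H * A.
    """
    totals = {}
    for area, score, weight in households:
        t = totals.setdefault(area, {"w": 0.0, "w_poor": 0.0, "score_poor": 0.0})
        t["w"] += weight
        if score >= cutoff:
            t["w_poor"] += weight
            t["score_poor"] += weight * score
    out = {}
    for area, t in totals.items():
        h = t["w_poor"] / t["w"]
        # Intensity is undefined without poor households; report 0.0 here.
        a = t["score_poor"] / t["w_poor"] if t["w_poor"] > 0 else 0.0
        out[area] = (h, a, h * a)
    return out


# Hypothetical survey rows: (district, deprivation score, household weight).
rows = [
    ("District X", 0.5, 1.0),
    ("District X", 0.1, 1.0),
    ("District X", 0.7, 2.0),
    ("District Y", 0.2, 1.0),
    ("District Y", 0.0, 1.0),
]
result = mpi_by_area(rows)
```

Because A averages only over the poor households in an area, a district with few sampled poor households gives a very unstable direct value, which is one reason an area-level model is a natural fit for it.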

73:44

>> Thank you Edward for uh going through

73:46

Ghana's experience. I mean it's really

73:48

impressive to see uh all this reporting

73:51

is using all this SAE um modeling uh

73:55

which which seems to be a little bit of

73:58

a challenge for many NSOs but uh going

74:02

through these examples and answering the

74:04

questions at the you know local level

74:07

that's where uh decisions are made where

74:10

the impact is much more important uh

74:14

that's that's really uh great to to see

74:18

Yeah, time is um flying and uh we are

74:21

trying to get uh everybody in and we'll

74:25

probably add extra minutes at the end

74:28

for more Q&A. Uh so without further ado,

74:31

I'm I'm now switching to another

74:33

continent going to Asia Pacific. We have

74:36

our colleague FIA from the Department of

74:38

Statistics Malaysia. they will be she

74:40

will be presenting the experience of SAE

74:44

in their uh uh in the in their uh

74:46

department uh FISA the floor is yours

74:49

please go ahead

74:51

>> okay thank you Mr. David, let me share

74:54

my slide first.

75:09

Can you see my slide?

75:11

>> It's coming.

75:14

>> Yes, I can see your slide.

75:17

>> Okay. Assalamu alayikum and hello

75:20

everyone. I am Faizar Rosanti Taj Aaros

75:22

a statistician from department of

75:24

Statistics Malaysia. Today I will share

75:26

my experience attending the SAE capacity

75:29

development program under ESCAP UNSD and

75:32

World Bank with our team use SAE for

75:36

estimate income and poverty at state

75:38

legislative assembly in Kluang, Johor.

75:42

This is the geography of Na and the

75:45

arrow show the location of the study

75:47

which is Kluang, Johor.

75:49

Below are the four maps clearly show the

75:52

different boundaries for each

75:54

administrative level in Kluang, Johor. The

75:57

first map is our administrative district

76:00

of Kluang as admin level two. Then the

76:03

district is divided into three

76:04

parliament as admin level three. Each

76:07

parliament actually consists of two

76:10

SLAs, with a total of six SLAs in admin

76:13

4 and we have 11 census district at

76:16

admin level five. For this study we use

76:19

three set of data. First the population

76:22

and housing census 2020 the data up to

76:26

enumeration block level in shape file

76:28

format. Then we have household income

76:31

and expenditure survey in 2022 with the

76:35

data up to living quarters level in CSV

76:38

format. And we have satellite image data

76:42

2024 in format CSV and TIF

76:47

by using our census shape file data.

76:50

Then we generate the population map

76:52

based on the enumeration block. From

76:55

this map we can see that the most

76:57

population in Kluang administrative

77:00

district are in parliament of Kluang.

77:03

Then we create a new shape file by

77:06

combine the census and survey data and

77:08

the second map shows the

77:10

distribution of average household income

77:14

across the Kluang area

77:16

by using the data from Google Earth

77:19

Engine. We also plot a point of

77:22

nighttime light by using the point of

77:24

coordinate and we change the point at

77:26

the center into a polygon. This map

77:29

display night time light intensity for

77:32

Kluang.

77:36

Then we use satellite image to create a

77:38

simple map for showing how bright each

77:41

area is at night. From the nighttime

77:43

light map, we can conclude that the most

77:46

populated area is also the brightest

77:48

area in this satellite image data. The

77:51

polygon then are colored based on the

77:54

average nighttime light and the area

77:56

with no data are shown in light gray in

77:59

second map. Then we do some treatment

78:01

for the EBs with no satellite data. We

78:04

replace the missing data with the global

78:06

mean.
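
The treatment described here, filling enumeration blocks that have no satellite reading with the overall mean, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the block codes and nighttime-light values are made up, and `impute_global_mean` is a hypothetical helper, not the team's actual code.

```python
def impute_global_mean(values):
    """Replace missing readings (None) with the mean of the observed readings."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to average")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


# Hypothetical average nighttime-light intensity per enumeration block (EB).
ntl = {"EB01": 12.0, "EB02": None, "EB03": 4.0, "EB04": 8.0}
filled = dict(zip(ntl, impute_global_mean(list(ntl.values()))))
```

A global mean keeps every block in the model but pulls the filled blocks toward the average, so it may be worth checking how many blocks were filled before relying on the covariate.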

78:09

After finished with the data cleaning,

78:11

we managed to use the guidelines to SAE

78:14

for poverty mapping published by the World

78:17

bank to identify which method should be

78:19

used for this study. From the decision

78:22

tree on method availability, we choose

78:25

to use an area-level model. This is

78:27

because the census and survey data are

78:30

not conducted in the same time frame.

78:32

the census in 2020 and the data were

78:35

received at EB level while the survey

78:38

data was from 2022 and the data at LQ

78:42

level. So for this study we choose to

78:45

use the Fay-Herriot model. Then we use

78:48

LASSO with the glmnet package in R for

78:52

the variable selection and transform the

78:55

predictors to improve the fit of the

78:57

model. The lambda value was used for

79:01

extracting the variable with nonzero

79:03

coefficients.
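
The selection step described here, keeping only the predictors whose LASSO coefficient is nonzero at the chosen lambda, can be illustrated with a small coordinate-descent sketch. The talk used R; the Python below is a stand-in under stated assumptions, with made-up centred data, a fixed `lam` rather than a cross-validated one, and hypothetical predictor names `ntl` and `noise`.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso(columns, y, lam, n_iter=100):
    """Coordinate-descent LASSO for centred data.

    Minimises (1 / (2n)) * sum((y - X b)^2) + lam * sum(|b_j|).
    `columns` maps a predictor name to its list of values.
    """
    n = len(y)
    names = list(columns)
    beta = {name: 0.0 for name in names}
    for _ in range(n_iter):
        for j in names:
            xj = columns[j]
            # Partial residual: remove every predictor's contribution except j's.
            resid = [
                y[i] - sum(beta[k] * columns[k][i] for k in names if k != j)
                for i in range(n)
            ]
            rho = sum(xj[i] * resid[i] for i in range(n)) / n
            scale = sum(v * v for v in xj) / n
            beta[j] = soft_threshold(rho, lam) / scale
    return beta


# Made-up centred data: the outcome depends on `ntl` only.
columns = {"ntl": [-1.0, -1.0, 1.0, 1.0], "noise": [-1.0, 1.0, -1.0, 1.0]}
y = [-2.0, -2.0, 2.0, 2.0]

beta = lasso(columns, y, lam=0.5)
selected = [name for name, b in beta.items() if b != 0.0]
```

In this toy case the unrelated predictor is dropped and the informative one is kept with a shrunken coefficient, which is the behaviour the selection step relies on.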

79:08

For this study, we use income

79:10

indicators, poverty line information

79:13

and nighttime light from satellite image

79:15

as auxiliary information to help

79:18

stabilize the estimate for this area. We

79:21

cannot use direct survey alone because

79:23

the sample might include only a few

79:25

household. So the numbers will become

79:28

noisy, unstable and sometime misleading.

79:32

We use the Fay-Herriot model to improve

79:34

the estimate by combine two sources of

79:36

the information which is what the survey

79:39

tell us and what we know from other data

79:42

from this small area. If an area has

79:45

strong survey data, the model relies

79:48

mainly on the survey. But if the survey

79:51

data is weak, the model borrows strength

79:54

from the auxiliary information. For this

79:57

model, we group households by EB and

80:00

calculate the direct survey poverty rate

80:02

for each EB. Then the model improve this

80:05

estimate using the auxiliary data

80:08

without any transformation.
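
The idea that the model leans on the survey where it is strong and borrows from auxiliary information where it is weak can be written as a weighted average. The Python sketch below assumes the model variance `sigma2_u` and the regression-based (synthetic) prediction are already available; in practice both are estimated from the data. All numbers are made up and `fay_herriot_estimate` is a hypothetical helper name.

```python
def fay_herriot_estimate(direct, sampling_var, synthetic, sigma2_u):
    """Composite Fay-Herriot estimate for one area.

    gamma is the weight on the direct estimate: close to 1 when the sampling
    variance is small relative to the model variance sigma2_u, close to 0
    when the direct estimate is noisy.
    """
    gamma = sigma2_u / (sigma2_u + sampling_var)
    estimate = gamma * direct + (1.0 - gamma) * synthetic
    return estimate, gamma


SIGMA2_U = 0.004  # assumed model variance, taken as given for illustration

# Two areas with the same direct and synthetic values but different precision.
well_sampled, gamma_well = fay_herriot_estimate(0.20, 0.001, 0.10, SIGMA2_U)
thinly_sampled, gamma_thin = fay_herriot_estimate(0.20, 0.016, 0.10, SIGMA2_U)
```

With these inputs the well-sampled area stays close to its direct estimate, while the thinly sampled area moves most of the way toward the synthetic prediction.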

80:10

The Fay-Herriot model without any

80:12

transformation can be uneven or noisy.

80:16

From the chart, we can see that the

80:18

relationship between the direct survey

80:20

and the model is less clear and the

80:23

results scatter among the point. The

80:25

Brown test results show that the

80:27

correlation between the model predicted

80:29

and the direct survey estimate is 0.68.

80:33

This correlation result show that the

80:35

auxiliary data we use is informative and

80:38

support the model in producing better

80:40

and more stable poverty estimate.

80:45

Next we try fitting the Fay-Herriot model with

80:48

log transformation from the emdi package.

80:52

We apply a log transform to stabilize

80:55

the modeling scale. After do the

80:57

transformation, we convert result back

81:00

to the original poverty rate scale using

81:02

a bias-corrected back-transformation, or bc_sm.

81:06

So they are interpretable. The output

81:08

now include improved estimate for every

81:11

EB plus their uncertainty or MSE with

81:14

bias corrected back transformation to

81:17

the original scale. As a result, the

81:20

relationship become clearer, the data

81:22

points are more orderly and the model

81:25

become more consistent.
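
Why a bias correction is needed when returning from the log scale can be shown with the standard lognormal result: if Z is normal with mean mu and variance s^2, then the mean of exp(Z) is exp(mu + s^2 / 2), not exp(mu). The Python sketch below uses that general result together with a delta-method variance on the log scale. It is an illustration only; the exact variance term used by a particular package option, including the one mentioned in the talk, may differ, and the input numbers are made up.

```python
import math


def to_log_scale(direct, sampling_var):
    """Delta-method move to the log scale: Var(log y) is roughly Var(y) / y^2."""
    return math.log(direct), sampling_var / direct ** 2


def back_transform(log_estimate, log_scale_var):
    """Return (naive, corrected) values on the original scale.

    naive is exp(mu); corrected applies the lognormal mean adjustment
    exp(mu + s^2 / 2).
    """
    naive = math.exp(log_estimate)
    corrected = math.exp(log_estimate + 0.5 * log_scale_var)
    return naive, corrected


# Hypothetical direct poverty rate of 10% with sampling variance 0.0004.
log_rate, log_var = to_log_scale(0.10, 0.0004)
naive, corrected = back_transform(log_rate, log_var)
```

The naive value simply undoes the logarithm, while the corrected value is slightly higher; skipping the adjustment would systematically understate the rate on the original scale.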

81:27

The Brown test results show that the

81:29

correlation between the models with

81:32

transformation and the direct survey

81:34

estimate is better, rising from 0.68 to

81:37

0.79.

81:39

The main model accuracy indicator here

81:42

are the MSE for how confident we are

81:44

with the estimate.

81:47

In summary, the Fay-Herriot model gives more

81:50

stable and reliable poverty estimate for

81:52

each small area. This model improve the

81:55

direct survey result by reducing noise

81:58

and borrow strength from auxiliary

81:59

information to make the estimate more

82:01

accurate. Limitation and way forward for

82:04

this study. For improving the data

82:07

quality, we need stronger administrative

82:09

and geospatial data such as housing

82:12

condition, land use, school attendance,

82:15

clinic visit or we can also incorporate

82:17

with other data source such as mobile

82:20

phone data, digital primary records and

82:23

additional satellite indicators to

82:25

strengthen the model. Beside that, all

82:27

data set must include geographic

82:29

boundaries information so the coordinate

82:31

more accurate and the model become more

82:33

perfect. And last but not least to have

82:36

better partnership by collaboration with

82:39

local authorities, welfare department,

82:41

utility providers and other facilities

82:43

provider is essential to access micro

82:46

data and these data sources can

82:48

significantly enhance the accuracy of

82:51

small area poverty estimate. That's all

82:54

from me. Thank you very much for your

82:56

attention.

82:58

>> Thank you Fisa. That's it's impressive.

83:00

Even I was thinking I'm going into the

83:03

course myself. So it's uh it's really

83:06

refreshing to see you know the the the

83:08

sweat that goes into uh running all this

83:11

uh modelings. Um as as they say in

83:16

Malaysia statics plume in harmony that's

83:18

uh that's the motto. So uh we've now

83:22

come to the conclusion of the

83:24

presentations. We have uh an extra 5 to

83:27

10 minutes for Q&A. Uh I have seen uh uh

83:32

most of the uh uh questions are being

83:35

answered in the chat very much thanks to

83:39

David and Howie Adit. But I will

83:42

probably um open one question for all

83:46

the three country presenters from for

83:48

Edward Fisa and uh Javier. This question

83:51

is about uh I think uh you've alluded it

83:55

to your uh presentation

83:58

question from uh in the Q&A we have a

84:02

question from uh Sanchez. uh he's asking

84:05

about uh if you can say a few words

84:08

about how policy makers and government

84:12

is accepting SAE results uh when you

84:16

present it. So what are uh what do you

84:18

consider what are the things uh that uh

84:22

is you know important to communicate

84:24

this to the policy makers. If you have

84:26

some experience if you can share with

84:29

that with us that would be good

84:32

and you can uh you can jump in and

84:34

answering FA Edward or Mavia V.

84:41

>> Okay. Thank you. Um for Ghana I know

84:45

that this um estimate that we are

84:48

producing is a great deal for policy

84:50

makers especially when we go to the

84:52

local government they use some of the

84:54

estimate that we give them in their um

84:57

district common fund estimates. So when

85:00

they are preparing their um district

85:02

common fund um formula that's how they

85:04

call it here they use some of these

85:06

estimates to guide them in that. So yes,

85:09

it's a great deal for them. When we

85:10

produce them, they come, they sit in and

85:13

then we release the numbers to them.

85:15

Yes. So I know they are using it because

85:18

that's one of the things that they do.

85:20

And I think the government also Yeah.

85:21

Because I've seen the um Ministry of

85:25

Sanitation um some of the other relying

85:28

on these numbers that we put out for the

85:30

districts

85:32

to do some of the policies. I know they

85:34

are revising some of the documents

85:35

because of some of the estimates that we

85:37

put out there. Yeah.

85:39

Thank you Edward. uh any any input from

85:41

uh FISA or miss Javier?

85:45

Okay, for Malaysia we are from Malaysia

85:49

we are currently serious looking for the

85:51

small area statistic. We start publish

85:54

our small area um like parliament and

85:57

SLA statistic and also in our current

86:01

strategic plan for the department. We

86:03

also include the small area estimation

86:05

for our future statistic. That's all

86:08

from me.

86:10

>> Thank you. Thank you. And now to Chile.

86:13

Yes. So promise uh well uh poverty

86:17

estimations with SAE models are already

86:21

in place. So we we do not produce those

86:24

models though. But uh they have been

86:26

working since around two or three years

86:29

maybe a little bit more. And for us with

86:32

victimization it has been um

86:36

uh it has been a lot of work uh to uh

86:39

consider these estimations. As we told

86:41

you, we started working in 2018 and just

86:44

by 2024, we were allowed to produce this

86:48

uh estimations within the um the

86:52

official results of the survey. So uh

86:55

and it's still uh we still publish as uh

86:58

experimental uh statistics. So it's

87:01

still not like official official. So but

87:04

we are working on it. So we're expecting

87:06

after two or three iterations uh we

87:08

might publish the result as official statistics.

87:13

>> Thank you. Thank you very much. I I'll

87:16

just switch back to for extra few

87:18

minutes back to David. I've seen a

87:21

number of questions answered there.

87:23

Maybe if you can uh give us a quick

87:25

highlight of uh what if you want to

87:28

broadcast this to the the whole group on

87:30

some of the questions. most of them are

87:33

on technicalities and type of models

87:35

used but I think it would be good to uh

87:38

also um I've seen a few questions

87:41

requesting support uh for uh doing some

87:45

of the work. So I think if in light of

87:47

uh uh the support that would be uh

87:50

available in the coming year if you

87:52

could uh briefly uh present what we're

87:56

uh going to do from our side uh in terms

87:59

of supporting countries. Yep. Sure.

88:02

So, thank you Daniel. I think for

88:05

supporting countries um usually the

88:07

World Bank has a a contact person um in

88:11

country um and it's best to work through

88:15

that person or a contact person that

88:17

works with NSOs

88:19

um and you know I I can work through

88:21

that person um and support country teams

88:24

as I've been doing with Nigeria and will

88:26

be soon um with Colombia.

88:29

um on some of the the technicalities. Um

88:32

I I think I'm I'm happy my I put my

88:35

email address in the chat. I'm happy to

88:38

address any questions that people may

88:40

have. Um one thing I would point out is

88:42

that cross-sectional small area

88:44

estimation is very different from

88:46

intertemporal survey to survey

88:48

imputation. Um, and the same variables

88:51

that may do well in cross-sectional uh

88:53

small area estimation may not do well when

88:56

uh trying to predict across time. Um,

88:59

simple models with geospatial

89:01

indicators, for example, do not do as

89:03

well predicting across time as they do

89:05

across space. Um, and in general, I feel

89:08

more confident using models that have

89:10

been trained in a cross-section to

89:12

predict across space rather than

89:14

applying them to intertemporal

89:16

uh prediction. though my colleagues have

89:19

been working on that as I put in the

89:21

chat um and have written quite a bit and

89:24

uh I would encourage people to talk to

89:26

them about that. It's just a it's just a

89:28

different thing. Um so uh yeah I I'm

89:32

looking forward to continuing this

89:33

agenda. Um and uh certainly the World

89:37

Bank is still very interested in how to

89:39

use models and data integration to

89:41

produce better data and we're happy to

89:43

support any countries that um are are

89:46

have the same interests. Thank you.

89:49

>> Thank you David. Just back to the last

89:51

for the last 30 seconds or so for our

89:54

regional commission Angela and Roth. I

89:57

think this is uh something that also

89:59

comes to you the support for countries

90:02

and what is a plan for next year. uh if

90:04

you can say a few words on that that

90:07

would be that would be great.

90:12

>> Okay, I'll go first. Thank you uh Daniel

90:14

for the opportunity. Um without a doubt

90:18

definitely these uh techniques are

90:20

really uh important and uh they help us

90:24

really leverage uh the data that we have

90:27

to further produce you know disagregated

90:31

uh estimates and so on and so forth.

90:33

fill gaps and so they are definitely

90:36

important uh as the African region would

90:39

want to continue promoting this among

90:42

the member states and you know really

90:44

exposing them so that at the end of the

90:46

day we can have a critical mass of

90:48

individuals that can actually um you

90:51

know apply these techniques use these

90:53

methods do this modeling to achieve u

90:57

the the ultimate goal and uh definitely

91:01

for our Agenda 2063 as Africa as well as

91:04

the global agenda. We definitely need

91:07

this. So we'll continue to encourage

91:09

other member states to apply this. Thank

91:12

you. Back to you.

91:14

>> Thank you Angela. Back to you Roth for

91:16

the last uh word.

91:19

>> Thanks Daniel. Thanks uh everyone for

91:21

the active contributions. Uh like Angela

91:25

uh ESCAP is really uh keen on uh exploring

91:28

opportunities uh to help countries out.

91:31

Um you know as colleagues mentioned uh

91:34

our doors are open if you're uh

91:36

interested in reaching out and letting

91:37

us know um you know how we can help you

91:40

with your work. Um small area estimation

91:43

for us uh ties in quite nicely

91:46

particularly geospatial uh small area

91:48

estimation is um you know ties in nicely

91:52

to our work um on big data data science

91:56

uh and data integration work. So um in a

91:59

way uh it really is uh part of our plan

92:03

to extend support to countries in our

92:06

region as much as we can whether that be

92:08

through um knowledge management uh

92:11

capacity building uh you know aka

92:14

projects or providing uh hands-on

92:17

training or even um or even uh webinars

92:21

type type of support. Um so yeah do do

92:24

look out for more information from us.

92:27

Uh the best way to do that is through uh

92:30

our website. That's that's the first

92:32

point of contact. Um and you can also

92:34

reach out to me or to our um our generic

92:37

email more more generally as well. Back

92:40

to you Daniel. Thank you.

92:42

>> Thank you. Thank you Ros. Thank you

92:43

everybody for attending the session. I

92:45

just put one slide because there was

92:47

back and forth in the chat. This is a

92:50

number of tools available. I mean you

92:52

don't have to wait for response from us

92:55

or from a specific agency. You can go

92:58

ahead and do a lot of self-paced

93:01

courses are online that that was

93:04

mentioned during the call and some

93:06

exercises are already there but in in in

93:09

in general yes uh UNSD in collaboration

93:12

with partners will be um happy to help

93:17

in this process. Unfortunately, how we

93:19

has to leave but we are from our side

93:21

from UNSD side uh coordinating this

93:24

work with the Inter-Secretariat Working

93:26

Group on Household Surveys as a as a as

93:29

as an item uh that we're working on. So

93:33

we'll keep in touch and thank you for

93:35

your attention and uh making it to the

93:38

last bit. Uh sorry for uh taking extra

93:41

five minutes of your time. I appreciate

93:44

uh your uh your uh patience and uh

93:47

followup. Have a great day.

93:51

>> Thank you, Daniel. Thank you everyone.

93:53

>> Thank you everybody. That's you.

Interactive Summary

This webinar highlights the practical application of Small Area Estimation (SAE) to produce granular, policy-relevant data for the Sustainable Development Goals. It features collaborative efforts between the UNSD, the World Bank, and regional commissions in Africa and Asia-Pacific to bridge data gaps by integrating surveys with geospatial and administrative data. The session showcases methodological advancements, capacity-building initiatives like e-learning and workshops, and real-world case studies from Nigeria, Chile, Ghana, and Malaysia, focusing on poverty, victimization, and health indicators.
