DHS for Poverty Analysis: Sample Weight

photo credit: Jody Art via photopin cc

The Demographic and Health Surveys (DHS) provide useful household datasets, mainly for health-related analysis in developing countries. Once your application is approved, you can access a large amount of household socioeconomic data for free. For this reason, I believe DHS should be used more often and more effectively by policy makers, not only by researchers.

But the DHS guidelines are sometimes a bit complicated, which may be one constraint when busy policy makers use the datasets. Although the official DHS website is the best source for further details on DHS in general, there are few guidelines available for poverty analysis in particular.

As a personal memo, and to make the data easier to use, from now on I will keep and share notes about using DHS for poverty analysis whenever I find something particularly useful. Here is the first post.

General Guideline

“Guide to DHS Statistics” explains DHS features in general.

Introduction to Analysis

If you want to know the process from registration to dataset download, “Using DataSets for Analysis” provides good guidance.

Use of Sample Weight

I was stuck on the use of sample weights for quite a while because I could not find DHS country reports that explain how to use them in detail. After going through the DHS website, I finally found the guideline that solved my problem.

To make sample data representative of the whole population of a country, you need to apply weights. According to the DHS website, almost every summary statistic shown in the country reports is weighted. The name of the sample weight variable seems to be the same across datasets for different countries. You can replicate the summary statistics shown in the reports using the code below.

In the “Households” and “Household Members” datasets, the sample weight variable is “hv005”. The weight is stored without a decimal point, so you need to divide it by 1,000,000 before using it.

In Stata:
generate wgt = hv005/1000000
tab var [iweight=wgt]

In SPSS:
COMPUTE WGT = HV005/1000000.
WEIGHT BY WGT.
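For readers who use neither Stata nor SPSS, here is a minimal Python sketch of the same idea: rescale hv005 by 1,000,000, then use the result as a weight when computing a share. The function names and the four toy records are hypothetical and made up for illustration. They do not come from any DHS dataset, and this is not an official DHS routine.

```python
# Minimal sketch: apply the DHS household weight to a 0/1 indicator.
# The records below are invented for illustration, not real DHS data.

def dhs_weight(hv005):
    """Rescale the stored weight: hv005 has no decimal point, so divide by 1,000,000."""
    return hv005 / 1_000_000

def weighted_share(records):
    """Weighted share of households whose indicator equals 1.

    records: list of (hv005, indicator) pairs, indicator being 0 or 1.
    """
    total = sum(dhs_weight(w) for w, _ in records)
    hits = sum(dhs_weight(w) for w, flag in records if flag == 1)
    return hits / total

def unweighted_share(records):
    """Plain sample share, ignoring the weights, for comparison."""
    return sum(flag for _, flag in records) / len(records)

# Hypothetical households: (hv005 as stored in the file, indicator)
records = [(1500000, 1), (500000, 0), (1000000, 1), (1000000, 0)]

print(unweighted_share(records))  # 0.5
print(weighted_share(records))    # 0.625
```

In this toy example the weighted share (0.625) differs from the plain sample share (0.5) because the households with the indicator carry larger weights. This gap is why unweighted tabulations will generally not match the figures in the country reports.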

See further details on the DHS website.
