Appendix 3: Methods to Estimate Trends in Undernutrition Prevalence: A Review
T. J. Cole
In 1987 the Sub-Committee on Nutrition of the Administrative Committee on Coordination (ACC/SCN) published its First Report on the World Nutrition Situation.1 It was a compilation of information from various United Nations agencies and presented trends over the period 1960 to 1985 for demographic and nutritional indicators in seven groups of developing countries. Its aim was to highlight current levels and recent trends in child nutritional status for different regions of the world, as part of its remit to review the evolution of global nutrition problems.
A supplement published the following year described the methodology used to derive the trends, focusing particularly on the estimation of undernutrition prevalence.2 This was defined as the percentage of children below -2 standard deviations (SDs) weight-for-age using the National Center for Health Statistics/Centers for Disease Control/World Health Organization (NCHS/CDC/WHO) international growth reference. This report was made possible through the availability of information from the Fifth World Food Survey, on the one hand, and an increasing number of national anthropometry surveys on the other.
However, the number of anthropometry surveys was relatively limited at that time, and a method was devised to make the most of them. Undernutrition prevalence for particular countries was predicted from a multiple regression analysis relating prevalence to contemporary national diet and other demographic indicators that were more widely available. Using the regression equation, the likely prevalence in a given country could then be predicted for other years, for which the indicator levels were known.
An update gave demographic and nutritional information for 33 countries,3 and the Second Report on the World Nutrition Situation updated the results, for the seven regions of the First Report.4 Volume II of the Second Report described the statistical methodology used, which was broadly similar to that of the First Report.5 A second update provided extra information for 14 countries,6 and in 1996 a further update on the seven regions appeared.7
All these reports and updates used essentially the same statistical methodology to estimate undernutrition prevalence. However, this changed with the publication of the Third Report, in which a more direct method was used.8 The change was due partly to a substantial increase in the availability of country-specific anthropometric information and partly to personnel changes in the ACC/SCN Secretariat.
The World Health Organization (WHO) prefers the more direct method of estimating trends in undernutrition prevalence, as described in the Third Report. In contrast, UNICEF has, since 1990, used the earlier, more indirect method to report to its Executive Board on achievement of the goals of the World Summit for Children. UNICEF has continued to use this method for the sake of consistency and without the benefit, until now, of a comparison of methods.
The aim of this review is threefold: (1) to describe the statistical methods used by ACC/SCN to estimate undernutrition prevalence in (a) the First and Second Reports and (b) the Third Report; (2) to compare the two methods and make a recommendation as to which is better; and (3) to discuss the comparability of results obtained by the two methods.
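The key ACC/SCN publications are abbreviated here as follows: R1, the First Report; R1S, the Supplement to the First Report; R2ii, Volume II of the Second Report; and R3, the Third Report.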
In each case, the relevant page numbers follow, prefaced by p. For example “R1S p5” indicates page 5 of the Supplement to the First Report.
Nutritional status is measured by three distinct indicators: weight-for-age, height-for-age, and weight-for-height. Low height-for-age, or stunting, is a marker of chronic undernutrition. Low weight-for-height, or wasting, is a sign of acute undernutrition. Low weight-for-age, or underweight, is a combination of stunting and wasting and so is a composite of acute and chronic undernutrition. For all three indicators, “low” is defined as a measurement below -2 SDs on the international growth reference for the child’s age and sex.
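As a minimal sketch of how such a prevalence is computed, assume each child's measurement has already been converted to a z-score (SD score) against the growth reference. The prevalence is then the percentage of z-scores below the cut-off. The function name and the sample z-scores are hypothetical and serve only as illustration.

```python
def prevalence_below_cutoff(z_scores, cutoff=-2.0):
    """Return the percentage of children whose z-score lies strictly below the cut-off."""
    if not z_scores:
        raise ValueError("at least one z-score is required")
    n_low = sum(1 for z in z_scores if z < cutoff)
    return 100.0 * n_low / len(z_scores)


# Hypothetical weight-for-age z-scores for five children (not survey data).
sample_z_scores = [-3.0, -2.5, -1.0, 0.0, -2.0]
underweight_pct = prevalence_below_cutoff(sample_z_scores)
print(underweight_pct)  # 40.0: two of the five children are below -2 SDs
```

The same function applies to height-for-age (stunting) or weight-for-height (wasting) z-scores, since all three indicators share the -2 SD cut-off.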
The prevalences of the indicators were obtained from nationally (or for the largest countries, regionally) representative anthropometry surveys of children aged between birth and 60 months. The actual age ranges varied between surveys. The cut-off used to report underweight prevalence was sometimes 80% of the median rather than -2 SDs, and sometimes references other than those from the NCHS were used. Where necessary the resulting differences in prevalence were adjusted for (R1S p5, R2ii p94-95, R3 p94).
The First and Second Reports used underweight in preference to stunting or wasting as a marker of undernutrition on the practical grounds that it was reported more often (R1S p5). The Third Report switched to stunting as the indicator for three reasons (R3 p3): the increased availability of national height data; the existence of child health goals to reduce the prevalence of stunting; and the belief that stunting is a better cumulative indicator than underweight.
Other national information was available on population, gross national product (GNP), dietary energy supply, infant and child mortality, and several other economic, food, health, and women’s status indicators (R2ii p2). In many cases the data were averaged over several years and were treated as relating to the midpoint of the relevant period.
Comparison of Statistical Methods
The statistical method used in the First and Second Reports was referred to there as an “interpolation model” (R1S p7). In practice it was an indirect method of estimating trends in underweight prevalence using other indicators. The method used in the Third Report (R3 p94-95) focused on observed trends in the prevalence of stunting and so was more direct. The terminology used here reflects this distinction: the indirect method of the First and Second Reports versus the direct method of the Third Report.
The underlying principle of the indirect method was that anthropometry data were sparse but data for other indicators were plentiful. Given sufficient information on some countries within a region, it was possible to relate known underweight prevalence in a particular country for a particular year to other concurrent country indicators and then to use this relationship to predict underweight prevalence in other years.
The method of prediction was multiple linear regression analysis, with underweight prevalence as the outcome measure. For the First Report the analyses were unweighted (R1S p7-17), while for the Second Report the analyses were weighted to reflect differences in the numbers of surveys per country (R2ii p98).
The First Report, based on data from 45 surveys in 36 countries, focused on three independent variables: dietary energy supply (KCAL), the logarithm of GNP (logGNP), and infant mortality rate (IMR). The correlations of KCAL and logGNP with underweight prevalence were high, -0.58 and -0.69, respectively. The regression model included KCAL and IMR, plus indicator (dummy) variables identifying four of the regions from which individual countries came: South Asia, Central America, South-East Asia, and South America. The regression slope between prevalence and IMR differed between South Asia and the other regions, so an interaction term between IMR and South Asia was added. The final regression equation explained 94% of the between-country variation in underweight prevalence (R1 p49, R1S p10-11).
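The structure of this model can be written out as follows. This is a sketch of the form implied by the description above, not the published equation: the symbols are notation introduced here, and no coefficient values are given in this review.

```latex
\begin{align*}
P_i ={}& \beta_0 + \beta_1\,\mathrm{KCAL}_i + \beta_2\,\mathrm{IMR}_i
        + \sum_{r \in R} \gamma_r D_{ri}
        + \delta\,\bigl(\mathrm{IMR}_i \times D_{\mathrm{SA},i}\bigr) + \varepsilon_i, \\
R ={}& \{\text{South Asia (SA)},\ \text{Central America},\ \text{South-East Asia},\ \text{South America}\}
\end{align*}
```

Here $P_i$ is the underweight prevalence from survey $i$, $D_{ri}$ equals 1 if the survey's country lies in region $r$ and 0 otherwise, and $\varepsilon_i$ is the single error term. The slope of prevalence on IMR is $\beta_2$ outside South Asia and $\beta_2 + \delta$ within it.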
The regression equation was used to predict underweight prevalence for each country in the years 1975, 1980, and 1984. The prevalences by country were then weighted by the zero- to four-year-old population and aggregated to give prevalences for seven regions - the above four plus Sub-Saharan Africa, the Near East, and China (R1S p16-18).
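The aggregation step can be sketched as below, assuming each country contributes a predicted prevalence (in percent) and a child population for the year in question. The function name and the figures are hypothetical and are not taken from the reports.

```python
def regional_prevalence(countries):
    """Population-weighted prevalence (%) for one region.

    countries: list of (prevalence_percent, child_population) pairs.
    """
    total_population = sum(pop for _, pop in countries)
    if total_population <= 0:
        raise ValueError("total child population must be positive")
    # Convert each prevalence to an estimated number of cases, then pool.
    cases = sum(prev / 100.0 * pop for prev, pop in countries)
    return 100.0 * cases / total_population


# Two hypothetical countries: the larger one dominates the regional figure.
region = [(40.0, 10_000_000), (20.0, 30_000_000)]
print(regional_prevalence(region))
```

With these made-up inputs, 4 million and 6 million cases out of 40 million children give a regional prevalence of about 25%, closer to the larger country's 20% than to the simple average of 30%.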
In the Second Report the approach was slightly different. More data were available (100 surveys from 66 countries), and this allowed more variables to be included in the model. KCAL was the strongest individual predictor, and it performed even better when the value for the year prior to each nutrition survey was used rather than the value for the same year (correlation -0.49). There was still a need for an interaction term of KCAL with South Asia, as this region had a steeper regression slope with KCAL than the other regions. Three other indicators were included in the model: prevalence of female secondary education, percent of government social support, and the child population under five. Dummy variables were also included for South Asia, South America, and South-East Asia (R2ii p96). The resulting regression model explained 90% of the variation in underweight prevalence. It was used to predict prevalence by region for the years 1975, 1980, 1985, and 1990.
A third variant of the basic regression model was used by Kelly to summarize results for 82 surveys from the WHO anthropometric data bank,9 and a later unpublished version of the same document extended the analysis to 153 surveys. The model included variables for dietary energy supply, infant mortality, and population density, all transformed to logarithms, two (later five) dummy variables for region, and the interaction of South Asia with logKCAL. An important difference here was the recognition of skewness in the distribution of underweight prevalence, requiring a transformation to adjust for it. In 1992 Kelly used the cube root of underweight prevalence as the outcome measure, while his later unpublished report used the logistic transformation. This is equivalent to a form of regression known as logistic regression, which is an analysis with some theoretical benefits compared with ordinary least squares (OLS) regression when the outcome variable is a percentage.
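The two outcome transformations mentioned here can be sketched as follows. These are the standard textbook definitions applied to a prevalence expressed in percent; they are not necessarily the exact computations used in Kelly's reports, and the function names are hypothetical.

```python
import math


def cube_root_transform(prevalence_pct):
    """Cube root of a prevalence given in percent (0 to 100)."""
    return prevalence_pct ** (1.0 / 3.0)


def logit_transform(prevalence_pct):
    """Log odds of a prevalence given in percent; undefined at 0% and 100%."""
    p = prevalence_pct / 100.0
    if not 0.0 < p < 1.0:
        raise ValueError("prevalence must lie strictly between 0% and 100%")
    return math.log(p / (1.0 - p))


def inverse_logit(log_odds):
    """Convert a value on the log-odds scale back to a prevalence in percent."""
    return 100.0 / (1.0 + math.exp(-log_odds))


# A prevalence of 50% corresponds to even odds, so its logit is zero.
print(logit_transform(50.0))                  # 0.0
print(inverse_logit(logit_transform(20.0)))   # back to roughly 20
```

Predictions made on the transformed scale always map back into the 0% to 100% range under the inverse logit, which is not guaranteed for an untransformed linear model.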
With the increasing availability of nutrition surveys, it has become less necessary to rely on indirect estimates of undernutrition prevalence. The WHO Global Database on Child Growth and Malnutrition includes representative survey data for children ages zero to five in 103 out of 147 developing countries, a coverage of 94.5% by population.10
The commentary with the WHO global database gives trends in prevalence for different regions of the world over the period 1975-95. Previous world nutrition situation reports have estimated prevalence rates indirectly, using the regression model based on energy supply, infant mortality, and population density described above. Since 1997, a direct method has been used to obtain prevalence, based on population-weighted averages for countries with nationally representative data. The prevalences of underweight and stunting for each country with at least 75% coverage during the period were multiplied by the estimated child populations aged zero to four years in 1990 and 1995 to give the numbers of cases, and these were aggregated for each region to give regional prevalences.
This analysis has the advantage of simplicity. However, it fails to account for trends in undernutrition prevalence over time within countries, and for this some form of modelling is necessary. The data consist of repeat surveys within countries, and countries within regions. They represent three distinct sources of variation: between regions, between countries within regions, and between surveys over time within countries. Linear regression analysis is not well suited to this form of data structure as it has only one error term. Recently developed regression methods allow for more than one component of variation, here time, country, and region. The analysis goes under various names - restricted maximum likelihood (REML), random effects modelling, or multilevel modelling - but they all rely on the same principle. The multilevel model is simply an extended form of regression, with separate estimates for the three levels of variation, and the total variation is obtained by combining over the three levels.
Multilevel modelling was used for the Third Report’s analysis, based on data for 223 surveys from 95 countries obtained from the WHO global database. The Third Report includes a clear and concise description of the analysis (R3 p94-95), and only a brief summary is given here. The aim was to estimate the time trend in prevalence, which would require at least two surveys per country. Many countries had only one survey, so they could not contribute to the estimate of trend but only to the mean prevalence. For this reason it was decided to estimate a mean prevalence for each country, but to average the trends in prevalence over time across all countries, giving a set of parallel regression lines. Also, though it was possible to analyze all the regions in a single model, it would have been complex to ensure a good fit. So, instead, each region was analyzed separately, which allowed region-specific estimates for the trend in prevalence over time, and country-specific estimates for the mean prevalence. Linear, quadratic, and cubic time trends were tested for.
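The idea of country-specific means with a shared time trend can be illustrated with a much simpler calculation than the one used in the Third Report. The sketch below is a within-country (fixed-effects) least-squares estimate of a common linear slope, not a multilevel (REML) fit, and the survey values are invented for illustration. It shows why a country with a single survey informs its own mean but adds nothing to the slope.

```python
from collections import defaultdict


def fit_parallel_lines(surveys):
    """Fit one slope shared by all countries, with a separate level per country.

    surveys: list of (country, year, prevalence_percent) tuples.
    Returns (slope, means), where means[country] = (mean_year, mean_prevalence).
    """
    by_country = defaultdict(list)
    for country, year, prev in surveys:
        by_country[country].append((year, prev))

    sxy = sxx = 0.0
    means = {}
    for country, rows in by_country.items():
        mean_year = sum(y for y, _ in rows) / len(rows)
        mean_prev = sum(p for _, p in rows) / len(rows)
        means[country] = (mean_year, mean_prev)
        # Deviations from the country's own means; all zero for a single survey.
        for year, prev in rows:
            sxy += (year - mean_year) * (prev - mean_prev)
            sxx += (year - mean_year) ** 2
    if sxx == 0.0:
        raise ValueError("no country has surveys in two different years")
    return sxy / sxx, means


def predict(slope, means, country, year):
    """Predicted prevalence for a country in a given year on its parallel line."""
    mean_year, mean_prev = means[country]
    return mean_prev + slope * (year - mean_year)


# Hypothetical region: countries A and B have two surveys each, C has one.
surveys = [("A", 1980, 40.0), ("A", 1990, 30.0),
           ("B", 1985, 20.0), ("B", 1995, 12.0),
           ("C", 1990, 50.0)]
slope, means = fit_parallel_lines(surveys)
print(slope)                              # about -0.9 percentage points per year
print(predict(slope, means, "C", 1995))   # about 45.5
```

Country C borrows the trend estimated from A and B, which is the sense in which the fitted lines are parallel. A multilevel model would additionally shrink the country levels toward the regional mean, which this sketch does not do.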
The fitted equations were used to predict country-specific prevalences for the years 1980, 1985, 1990, and 1995, and these were aggregated to the regional level, weighted by the relevant country populations (R3 p6). Two versions of the multilevel analysis were carried out: one with the individual survey prevalences unweighted and the other with the data weighted according to the country’s population. The former analysis tests how well the data fit the model, while the latter analysis is more appropriate for assessing regional trends, with the larger countries having a greater influence on the outcome.
The aim of fitting a model is to summarize the data using simplifying assumptions. The particular aim here is to predict prevalence by region and year. So to compare the two available methods we need to compare the validity of their assumptions.
The indirect method assumes (1) that relationships between underweight prevalence and energy supply, infant mortality, and the other indicators are linear and (2) that the slopes of the relationships are the same across all countries in all regions and are constant over time. There is one exception to this - an interaction term allows the slope of IMR or KCAL to differ in South Asia compared with the other regions.
Another assumption concerns regional differences in prevalence. Some regions have a dummy variable fitted, while others do not. For regions with a fitted dummy variable, the prevalence predicted for the mean of their indicators is equal to the region’s observed prevalence. So the prediction is unbiased. Regions without a dummy variable are treated as all one region, and their predicted prevalences for the mean levels of their indicators are, in general, not the same as their observed prevalences. So regional prevalence is well estimated if a dummy variable is fitted, and less so if not.
The direct method makes fewer assumptions. Again the aim is to predict prevalence by region and year, but now both region and year appear in the model. The multilevel model can cater simultaneously for variation between regions, countries, and years, but there are sufficient data to fit it to each region separately. Parallel regression lines of prevalence on year are fitted for each country in the region - that is, random intercepts and a common slope. So the only assumption is that all of a region’s countries have the same time trend. In effect the model provides separate estimates of the mean and trend in prevalence for each region.
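In symbols, a random intercepts model with a common linear slope can be written as below. The notation is introduced here for illustration and is not taken from the Third Report, which also tested quadratic and cubic time terms.

```latex
\begin{align*}
y_{ij} &= \beta_0 + u_j + \beta_1 t_{ij} + e_{ij}, \\
u_j &\sim N(0, \sigma_u^2), \qquad e_{ij} \sim N(0, \sigma_e^2)
\end{align*}
```

Here $y_{ij}$ is the prevalence from survey $i$ in country $j$, $t_{ij}$ is the survey year, $\beta_0 + u_j$ is the level for country $j$, and $\beta_1$ is the time trend shared by all countries in the region. Fitting the model separately to each region gives each region its own $\beta_0$, $\beta_1$, $\sigma_u^2$, and $\sigma_e^2$.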
So, are predictions made by the two models likely to be similar? The main differences are summarized in Table A3.1. They show that the direct model, by making fewer assumptions and estimating more effects, allows for greater regional and temporal variability in predicted prevalence. In addition, the direct model predicts time trends in prevalence directly from the time trends in the data, whereas the indirect method relies on time trends in the indicators acting as proxies for the trends in prevalence.
TABLE A3.1: Comparison of indirect and direct methods of predicting undernutrition prevalence, by region over time
a Except for regions with dummy variable.
Overall the two sets of predictions are likely to be most similar where there is least extrapolation - that is, where the various indicators (including time) are near the mean for the whole data set. The predictions will differ progressively as the indicator values move away from the mean, introducing variation in the slopes for the various indicators.
It would be useful to quantify the differences in prediction under the two models. The indirect model can be made more similar to the direct model by allowing the means and trends in prevalence to differ between regions. This is done by extending the indirect model to include dummy variables for each region and interactions for each region-indicator combination. For 7 regions and 2 indicators (as in the First Report) it would involve 6 dummy variables and 12 interactions, a total of 18 terms altogether (of which 5 were fitted in the published model). The First Report had insufficient data to fit the extra terms, but the more extensive Third Report data could easily be re-analyzed this way. The re-analysis is equivalent to applying the indirect method to each region separately, just as for the direct method. It would still leave important differences in methodology between the two approaches - indirect versus direct modelling and OLS versus multilevel regression - but the results for each region would be more directly comparable.
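The count of extra terms can be checked by enumerating them. In the sketch below the choice of China as the reference region (the one without a dummy variable) is an arbitrary assumption made for illustration, and the term labels are hypothetical.

```python
REGIONS = ["South Asia", "Central America", "South-East Asia", "South America",
           "Sub-Saharan Africa", "Near East", "China"]
INDICATORS = ["KCAL", "IMR"]


def extended_terms(regions, indicators, reference):
    """List the region dummies and region-by-indicator interactions.

    One region serves as the reference and receives no terms of its own.
    """
    if reference not in regions:
        raise ValueError("reference must be one of the regions")
    others = [r for r in regions if r != reference]
    dummies = [f"D[{r}]" for r in others]
    interactions = [f"{ind} x D[{r}]" for r in others for ind in indicators]
    return dummies, interactions


dummies, interactions = extended_terms(REGIONS, INDICATORS, reference="China")
print(len(dummies), len(interactions), len(dummies) + len(interactions))  # 6 12 18
```

The published First Report model corresponds to keeping four of the six dummies and one of the twelve interactions (IMR with South Asia), which gives the 5 fitted terms noted above.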
The introduction of the indirect method in 1987 was an imaginative way of dealing with the then shortage of anthropometry data. However, there is now a substantial body of nutritional status information covering an extended period of time, and the complex proxy nature of the indirect method cannot be justified when compared with the direct method.
This review has so far focused on the differences between the indirect and direct methods. There are, however, aspects of the analyses common to both methods that deserve comment.
Testing of Assumptions
With sufficient data, the assumptions of the two models can be tested directly. The indirect method makes many assumptions, as shown above, but most can be avoided by switching to the direct method. So it is doubly important to validate the direct method, in particular the assumption that all countries in each region have the same time trend in prevalence. This is done by fitting a random coefficients model and seeing if it significantly improves the fit compared with the simpler random intercepts model. This was discussed in the Third Report (R3 p94), where Table 4 (R3 p11) compares the numbers of countries in each region showing rising and falling trends in stunting prevalence. There is clear heterogeneity in trend between the countries in each region, which indicates that a random coefficients model would provide a better fit to the data. However, this would require at least three surveys per country - two degrees of freedom for the intercept and slope, and at least one more for error - and not enough countries had more than two surveys (R3 p94).
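The comparison can be written out as follows, in notation introduced here for illustration. The random coefficients model lets each country $j$ have its own slope as well as its own intercept; the random intercepts model is the special case in which the slope variance is zero.

```latex
\begin{align*}
y_{ij} &= (\beta_0 + u_{0j}) + (\beta_1 + u_{1j})\, t_{ij} + e_{ij}, \\
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
  &\sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
     \begin{pmatrix} \sigma_{u0}^2 & \sigma_{u01} \\ \sigma_{u01} & \sigma_{u1}^2 \end{pmatrix} \right),
  \qquad e_{ij} \sim N(0, \sigma_e^2), \\
D &= -2\,\bigl( \log L_{\text{intercepts}} - \log L_{\text{coefficients}} \bigr)
\end{align*}
```

One common way to judge the improvement in fit is the change in deviance $D$ between the two models, which reflects the two extra parameters $\sigma_{u1}^2$ and $\sigma_{u01}$. Its reference distribution is only approximate, because $\sigma_{u1}^2 = 0$ lies on the boundary of the parameter space.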
Linear Versus Logistic Regression
The outcome measure of interest is the prevalence of undernutrition, defined as either underweight or stunting. Prevalence is a variable taking values between 0% and 100%, and the variability in prevalence depends on its mean value. For example, prevalences near 0% or 100% are intrinsically less variable than those near 50%. The form of analysis that takes this into account is called logistic regression, which was not well developed at the time of the First Report but has become so since. The First and Second Reports used OLS regression, while Kelly used logistic regression latterly. The theoretical advantages of logistic regression are twofold: its weighted form of analysis stabilizes the variance, and the logit transform adjusts for skewness in the distribution of prevalence. Its one disadvantage lies in its complexity - logistic regression coefficients are in units of log odds ratios, which are hard to explain to non-statisticians. They can be converted back to mean regional prevalence figures in the same way as for linear regression, but they pose a barrier to understanding.
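The two properties referred to here can be stated compactly. The formulas below are the standard binomial and logit results, written in notation introduced for illustration, with $\pi$ the true prevalence as a proportion and $n$ the number of children surveyed.

```latex
\begin{align*}
\operatorname{Var}(\hat{\pi}) &= \frac{\pi (1 - \pi)}{n}, \\
\operatorname{logit}(\pi) &= \log\frac{\pi}{1 - \pi} = \eta, \qquad
\pi = \frac{1}{1 + e^{-\eta}}
\end{align*}
```

The variance is largest at $\pi = 0.5$, where it equals $0.25/n$, and falls to $0.0475/n$ at $\pi = 0.05$, which is why prevalences near 0% or 100% are less variable. On the logit scale a coefficient $\beta$ for an indicator means that a one-unit increase in that indicator multiplies the odds $\pi/(1-\pi)$ by $e^{\beta}$, and this is the quantity that is hard to convey to non-statisticians.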
An alternative is to work with a simpler transformation than the logit, such as the logarithmic transformation. This adjusts for skewness but does not stabilize the variance so well. The relative advantages of the two approaches were discussed by Pelletier et al., who opted for the simpler logarithmic transformation on the grounds of transparency.11
Choice of Method
The indirect method was an imaginative solution to a real problem, the shortage of national anthropometry surveys in the 1980s, but it is no longer relevant. As the number of surveys has risen (over 200 in the WHO global database), the case for the indirect method has evaporated. For the future estimation and prediction of trends in malnutrition prevalence, the direct method must be the method of choice. This is a multilevel model fitted to each region separately, with random country intercepts and a common time trend. With more data, an extended model with random country trends will become feasible. To facilitate the comparison of past and future trends in prevalence, the WHO global database should be reanalyzed using the indirect method and the results compared with those for the direct method. This will show to what extent the earlier (First and Second Reports) trends based on the indirect method can be compared with current and future analyses based on the direct method.
Possible Analytic Refinements
Future analyses should, where possible, address the issues raised in the final section, particularly the testing of model assumptions and the comparison of different forms of regression analysis.