Measuring Depression

Prepared by the Columbia Center for Active Life of Minority Elders (CALME), Columbia University

This set of annotated abstracts focuses on methodological issues in assessing depression using the Center for Epidemiologic Studies-Depression (CES-D) scale. The references were selected based on the following criteria: inclusion of older adults (aged 65 and older) in the study sample, inclusion of ethnically diverse groups, and publication from the year 2000 to the present. Seminal works published before 2000 that assess the psychometric properties of the CES-D are also included. Articles that use modern psychometric theory to examine measurement properties, including factorial invariance, metric equivalence, and differential item functioning (DIF) across demographic subgroups, are also represented. The original publication abstract is included for most of the references.

Additional information on the CES-D and other depression scales that may be of interest is available at and at

Cole and colleagues produced one of the first studies examining the effects of sociocultural characteristics on the measurement properties of the Center for Epidemiologic Studies-Depression (CES-D) scale among older adults. The authors examined the CES-D scale used in the New Haven Established Populations for the Epidemiologic Studies of the Elderly (EPESE) (N=2,340) for item bias related to age, gender, and race. The methodology used in their study was an extension of the Mantel-Haenszel (MH) adjustment, which the authors argued was "most appropriate for a medical and public health audience due to the use of proportional odds ratios." The authors found that, after matching on overall depressive symptoms, blacks gave higher responses than whites on the "people are unfriendly" and "people dislike me" items. In addition, women gave higher responses than men on the "crying spells" item, even after matching on overall depressive symptoms. Their data indicate that the CES-D would have greater validity among this diverse group of older men and women after removal of the crying item and the two interpersonal items.
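The matching logic behind a Mantel-Haenszel item-bias check can be sketched in a few lines. This is a minimal illustration of the basic Mantel-Haenszel common odds ratio for a dichotomized item, not the proportional-odds extension the authors used. The function name, the record layout, and the idea of stratifying on a band of the total symptom score are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Iterable, Tuple

# Hypothetical record layout: (score stratum, in focal group?, endorsed the item?)
Record = Tuple[int, bool, bool]


def mantel_haenszel_or(records: Iterable[Record]) -> float:
    """Common odds ratio of endorsing an item, focal vs. reference group,
    pooled over strata of overall symptom level (basic binary MH estimator)."""
    # One 2x2 table per stratum, stored as [a, b, c, d]:
    #   a = focal endorsed,     b = focal not endorsed,
    #   c = reference endorsed, d = reference not endorsed
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for stratum, is_focal, endorsed in records:
        cell = (0 if is_focal else 2) + (0 if endorsed else 1)
        tables[stratum][cell] += 1

    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n

    # Undefined when no stratum contributes to the denominator.
    return numerator / denominator if denominator else float("nan")
```

Under these assumptions, a value near 1 would suggest that the two groups endorse the item at similar rates once overall symptom level is held constant, while a value well above 1 would suggest that the focal group endorses it more often.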

This study determines appropriate cutoffs for the 20-item (CESD-20) and ten-item (CESD-10) versions of the instrument. Data based on simulated scoring were also provided for the diagnostic performance of the scales when using dichotomous instead of four-point rating scales. The ten-item and 20-item versions of the CES-D, regardless of scoring method, produced essentially identical performance indices. The optimal thresholds were 12 and 22 for the CESD-10 and CESD-20, respectively. Based on these thresholds, sensitivity, specificity, positive predictive value, and negative predictive value were 0.76, 0.55, 0.57, and 0.74 for the CESD-10, and 0.75, 0.51, 0.55, and 0.72 for the CESD-20. The authors conclude that the ten-item version can be used in lieu of the 20-item version, and that a dichotomous response format would probably work as well as the original four-point format, which would simplify administration for elderly persons.
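The four performance indices reported above follow from a two-by-two cross-classification of screening result against a reference diagnosis. The sketch below shows that arithmetic under stated assumptions: the function name is hypothetical, a score at or above the cutoff is assumed to count as a positive screen, and no data from the study are reproduced.

```python
from typing import Dict, Sequence


def screening_indices(
    scores: Sequence[int], diagnosed: Sequence[bool], cutoff: int
) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV of a scale at a given cutoff.

    Assumes a score >= cutoff is a positive screen (an illustrative choice).
    """
    tp = fp = tn = fn = 0
    for score, has_diagnosis in zip(scores, diagnosed):
        screen_positive = score >= cutoff
        if screen_positive and has_diagnosis:
            tp += 1
        elif screen_positive:
            fp += 1
        elif has_diagnosis:
            fn += 1
        else:
            tn += 1

    def ratio(numerator: int, denominator: int) -> float:
        # Undefined when the relevant row or column of the table is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # diagnosed cases that screen positive
        "specificity": ratio(tn, tn + fp),  # non-cases that screen negative
        "ppv": ratio(tp, tp + fp),          # positive screens that are cases
        "npv": ratio(tn, tn + fn),          # negative screens that are non-cases
    }
```

Running the function over a range of cutoffs and comparing the resulting indices is one simple way to look for a threshold of the kind the study reports.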

  • Mui AC, Burnette D, Chen LM. Cross-cultural assessment of geriatric depression: A Review of the CES-D and the GDS. Journal of Mental Health and Aging, 2001; 7: 137-164.
This systematic review covers studies published between 1975 and 2001 on the cross-cultural utility and psychometric properties of the CES-D. Differential item functioning in the CES-D attributable to sociocultural and health-related factors was found. A comparison of the factor structures of the CES-D was reported for non-Western cultures (including samples in Asia), older Hispanics, African Americans, and American Indians.

This study is a basic psychometric analysis of the CES-D among a sample of older, community-dwelling African Americans (N=225) in North Carolina. An exploratory factor analysis revealed the following four factors: (1) depressive/somatic; (2) positive; (3) interpersonal; and (4) social well-being. This is the first study to identify the "social well-being" factor, which consisted of three items: appetite, hopeful, and talk.

This study is the first of its kind to present evidence suggesting that cognitive factors are important in the interview mode effect on the CES-D scale. Using item response theory (IRT) methods, this study examined the nature and magnitude of the interview mode effect at the item level. A diverse sample of depressed primary care patients from the Partners-in-Care Study was randomized to receive the CES-D by either phone interview (N=139) or mail survey (N=139). Twelve items manifested differential functioning.

This is the first study to test for strong factorial invariance in the CES-D across black and white men aged 50 and older. Within the confirmatory factor analysis (CFA) framework, this study examines the cross-group invariance of the "somatic and retarded activity" factor of the CES-D. All five items (bothered, restless, get going, appetite, and effort) supported metric invariance across blacks (N=248) and whites (N=2,004) from the National Health and Nutrition Examination Survey (NHANES). Observed means and variances were further compared between the two groups to assess measurement bias in three-, four-, and five-item subsets of the CES-D.

This study used an item response theory-based latent variable conditioning approach to reexamine item response bias in the CES-D by age, gender, and race, as previously examined by Cole et al. The authors used the multiple indicators, multiple causes (MIMIC) model framework to estimate measurement bias in the CES-D responses of participants in the New Haven Established Populations for the Epidemiologic Studies of the Elderly study (N=2,340). Measurement bias attributable to race was significant for the following two CES-D items: people "are unfriendly" and "dislike me." The proportional odds of a higher-category response by blacks relative to whites on these items were 2.35 (95% confidence interval [CI]: 1.65, 3.36) and 3.11 (95% CI: 2.04, 4.76), respectively. The proportional odds of a higher-category response on the CES-D item "crying" were higher among women relative to men (2.03 [95% CI: 1.35, 3.06]). Findings confirm that three items on the CES-D show strong evidence of item response bias. The MIMIC model is preferable to the Mantel-Haenszel approach because it conditions on a latent variable, although the effect estimates can also be interpreted using a proportional odds framework.

  • Radloff, L. S. The CES-D scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1977;1: 385-401.
The Center for Epidemiologic Studies-Depression Scale (CES-D) is a self-report scale designed to measure depressive symptomatology in the general population. The 20 items of the scale are symptoms associated with depression that have been used in previously validated longer scales. The CES-D was tested in household interview surveys and in psychiatric settings and was found to have very high reliability and validity. The factor structure of the CES-D was consistent across a wide variety of demographic characteristics in the general population samples tested. The scale has been widely used in epidemiologic studies of depression.
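For readers unfamiliar with how the 20 items combine into a total, the following is a minimal scoring sketch. It assumes the standard published form of the scale, which is not spelled out in the abstract above: each item is rated 0 to 3, and the four positively worded items (numbers 4, 8, 12, and 16) are reverse-scored, giving a total from 0 to 60. The function and constant names are hypothetical.

```python
from typing import Sequence

# Assumption: positively worded items in the standard 20-item form.
REVERSE_SCORED_ITEMS = {4, 8, 12, 16}


def score_cesd20(responses: Sequence[int]) -> int:
    """Total score from 20 item responses, each coded 0-3 in item order."""
    if len(responses) != 20:
        raise ValueError("expected 20 item responses")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if response not in (0, 1, 2, 3):
            raise ValueError(f"item {item_number}: response must be 0, 1, 2, or 3")
        # Reverse-scored items contribute 3 - response so that higher totals
        # always indicate more depressive symptomatology.
        if item_number in REVERSE_SCORED_ITEMS:
            total += 3 - response
        else:
            total += response
    return total
```

Missing responses are not handled here; how to treat them is a separate design decision that this sketch leaves open.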

Last updated April 2007