
How patients’ sociodemographic characteristics affect comparisons of competing health plans in California on HEDIS® quality measures

Alan M. Zaslavsky, Arnold M. Epstein
DOI: http://dx.doi.org/10.1093/intqhc/mzi005 · pp. 67–74 · First published online: 24 January 2005


Objective. To estimate effects of patient sociodemographic characteristics on differential performance within and between plans in a single market area on the HEDIS® quality of care measures, widely used for purchasing and accreditation decisions in the United States.

Design. Using logistic regression, we modeled associations of age, sex, and zip-code-linked sociodemographic characteristics of health plan members with HEDIS measures of screening and preventive services. We calculated the impact of adjusting for these associations on measures of health plan performance.

Setting. Twenty-two California health plans provided individual-level HEDIS data and zip codes of residence for up to 2 years.

Participants. 110 541 commercially insured health plan members.

Main outcome measures. Ten HEDIS quality-of-care measures.

Results. Performance on quality measures was negatively associated with percent receiving public assistance in the local area (seven out of 10 measures), percent Black (three measures), and percent Hispanic (four measures), and positively associated with percent college educated (six measures), and percent urban (three measures), controlling for plan, while associations with percent Asian were positive for three measures and negative for one (P < 0.05 for six associations, P < 0.01 for four, P < 0.001 for 17). Associations were consistent across plans and over time. Adjustment for these characteristics changed rates for most plans and measures by <5 percentage points.

Conclusions. Adjustment for socioeconomic case mix has little impact on the measured performance of most plans in California, but substantially affects a few. The impact of case mix on indicators should be considered when making comparisons of health plan quality.

  • health care quality indicators
  • information services
  • insurance selection bias
  • managed care programs
  • outcome and process assessment
  • quality of health care
  • socioeconomic factors

The Health Plan Employer Data and Information Set (HEDIS®) has become the most commonly used set of quality performance measures for health plans in the United States. HEDIS data are submitted voluntarily by health plans to the National Committee for Quality Assurance (NCQA), the organization that developed HEDIS. NCQA publishes those data annually in a report called Quality Compass. The most recent release of these data, Quality Compass 2002, includes performance scores for more than 300 health plans representing approximately 75% of the US Health Maintenance Organization (HMO) population [1].

HEDIS measures are particularly salient because they allow both the public and health care providers to compare the performance of health plans on measures of preventive, chronic, and acute care, and to identify high- and low-quality performers. HEDIS measures potentially can influence the decisions of large purchasers of health care, provide aid to consumers choosing between plans, and catalyze quality improvement efforts by health care providers concerned about their relative performance [2].

One critical concern about using HEDIS measures to compare health plans is that performance on the quality indicators may be affected by characteristics of enrollees that differ across plans [3–5]. If it is harder to achieve high-quality performance for certain groups of patients, such as poor and less-educated patients, then health plan scores will partly reflect each plan's patient mix, and plans will have an incentive to avoid enrolling these sociodemographic groups.

In a preliminary study we examined data provided to NCQA from a convenience sample of 10 health plans [6]. Our analyses of these data demonstrated a statistically significant association between sociodemographic characteristics of health plan enrollees (as assessed by the characteristics of the population in the zip code areas where they reside) and the probability that these enrollees would receive services measured by HEDIS. Unfortunately, the convenience sample of health plans in our preliminary work was drawn from a diverse set of geographical areas across the United States. Thus, it was not possible for us to assess the impact of case-mix adjustment on comparisons of health plans that compete in the same market area. Yet in most decisions about health care, large-scale purchasers and consumers choose between plans within a local market.

In this study we examined patterns of performance in relation to the sociodemographic characteristics of enrollees in an important state market, California, and the potential impact of adjustment for patients’ socioeconomic characteristics on health plans’ HEDIS scores. By obtaining several years’ data we were also able to examine the stability of these relationships over time.



Methods

Our goal was to (i) examine the importance of socioeconomic predictors (based on zip code of the patients’ residence) of HEDIS performance; (ii) determine whether the effects of individual socioeconomic predictors varied in different plans or between years; and (iii) gauge the magnitude of change in measured performance that would be caused by adjustment.


Individual-level HEDIS data were obtained from 21 plans in California for 1996 and 19 plans for 1997 as part of the California Cooperative HEDIS Reporting Initiative (CCHRI); altogether 22 plans were represented. For each measure and each plan providing the measure, the data indicated whether each sampled member received the care assessed by the measure, and the age, sex, and zip code of residence of the member. Our data initially included 14 measures drawn from HEDIS 3.0 [7]. Four measures (initiation of prenatal care, and well-child visits at 15 months, 3–6 years, and for adolescents) were only supplied for four or fewer plans in each year and are excluded from our analyses, leaving 10 measures, defined in Table 1.

Table 1

HEDIS® measurement definitions

We matched these data by zip code to 1990 census data. Zip (postal) codes correspond roughly to the area served by a local post office, with a (sample-weighted) mean population of 37 734. Zip-code-linked variables are widely used in studies of population health and health care, and patterns discerned using these variables are often broadly similar to those for the corresponding individual-level variables, although they may differ in magnitude [8,9]. The census data included the percentage of adults in the zip code area in each of the following groups: Hispanics, Blacks, Asians, individuals with at least some college education, and those receiving public assistance income. To minimize the impact on the results of plans with small samples or incomplete data, we then excluded data from the analysis when a plan had fewer than 50 cases for a measure in a year, or when fewer than 70% of the cases had valid data on all analysis variables.


We fitted logistic regression models for each HEDIS outcome; all models controlled for plan, year, and a plan-by-year interaction. We first fitted univariate models, each including a single case-mix predictor variable (age, sex, or a zip code variable). We then assessed the consistency of case-mix effects across plans by testing the significance of interactions of the case-mix variable with plan; similarly, we tested the interactions of each variable with year to assess consistency of case-mix effects over time. Next, we fitted multivariate models (without interactions with plan or year) that included all of the case-mix variables simultaneously. Because of the high correlation between the zip-code-level education and public assistance variables, we also fitted models that excluded one or the other of these variables.

The eligible populations are defined separately for each of the HEDIS measures. For each HEDIS measure, we applied the regression coefficients from these multivariate models to calculate a predicted probability of a positive indication at each plan and year for every individual eligible for that measure in the combined sample. These predictions were calibrated so that the overall predicted rate across the sample at all plans would equal the overall rate observed in the corresponding year. By averaging these predicted probabilities by plan and year, we calculated the directly standardized adjusted score, defined as the predicted rate for each plan if every plan had the same distribution of member characteristics [10]. Mathematically, we fit the logistic regression model logit(p_ik) = x_ik β + γ_i, where p_ik is the probability that eligible member k at plan i receives the indicated service, x_ik is a vector of covariates (characteristics), β is a regression coefficient vector, and γ_i is an intercept for plan i in a given year. The adjusted rating for plan j, defined as its mean predicted probability, was calculated as (1/n) Σ_ik logit⁻¹(x_ik β + γ_j), where the sum is over all n cases at all plans and years. (A SAS macro for this logistic regression adjustment is available from the first author.)
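The direct standardization step can be sketched in a few lines of Python. This is an illustrative sketch with made-up covariates and coefficients, not the authors' SAS macro; the function and variable names (`inv_logit`, `adjusted_rate`, the toy plan intercepts) are our own.

```python
import numpy as np

def inv_logit(z):
    """Inverse logit: maps a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def adjusted_rate(X, beta, gamma_j):
    """Directly standardized rate for plan j: the mean predicted
    probability logit^{-1}(x_ik . beta + gamma_j) over ALL cases at all
    plans, so every plan is scored against the same case mix."""
    return inv_logit(X @ beta + gamma_j).mean()

# Toy illustration (not the study's data): two zip-code covariates,
# e.g. percent college educated and percent on public assistance.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(1000, 2))
beta = np.array([1.7, -1.2])              # hypothetical coefficients
gamma = {"plan_A": -0.3, "plan_B": 0.4}   # hypothetical plan intercepts

rates = {plan: adjusted_rate(X, beta, g) for plan, g in gamma.items()}
```

Because every plan's intercept is applied to the same pooled covariate matrix, differences between the adjusted rates reflect only the plan intercepts, not differences in member characteristics.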

We then summarized the magnitude of the adjustments (differences between adjusted and unadjusted HEDIS scores) by tabulating the mean absolute adjustment as well as the largest adjustments in each direction. To assess the importance of the adjustment relative to random noise in each measure, we also calculated the ratio of the adjustment for each plan-year to the standard error of the corresponding rate.
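The summary statistics described above are straightforward to compute; the sketch below uses hypothetical plan-year rates rather than study data, and the function name is our own.

```python
import numpy as np

def summarize_adjustments(unadjusted, adjusted, se):
    """Summaries used to judge the impact of case-mix adjustment:
    mean absolute adjustment, the largest adjustment in each direction,
    and the ratio of each adjustment to the rate's standard error."""
    adj = np.asarray(adjusted) - np.asarray(unadjusted)
    ratio = adj / np.asarray(se)
    return {
        "mean_abs_adjustment": np.abs(adj).mean(),
        "largest_downward": adj.min(),
        "largest_upward": adj.max(),
        "median_abs_ratio": np.median(np.abs(ratio)),
        "mean_sq_ratio": np.mean(ratio ** 2),
    }

# Hypothetical plan-year rates (percentage points) for one measure.
summary = summarize_adjustments(
    unadjusted=[62.0, 70.5, 55.1, 68.0],
    adjusted=[64.0, 69.0, 58.0, 67.5],
    se=[2.0, 1.8, 2.5, 2.2],
)
```

A mean squared adjustment/SE ratio above 1 would indicate that case mix moves the rates by more than sampling variability does, which is the criterion applied in the Results.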

The study plan was approved by the Institutional Review Board of the Harvard Medical School.


Results

Of 154 795 cases in the original data, 50 359 were excluded because the measure itself was excluded, because of missing data on a case, or because of small sample size or high rates of missing data for a plan-by-measure-by-year unit, leaving 104 436 cases for analysis. (Plans most commonly fell below our minimum sample-size threshold for the beta-blocker measure.) In this analytic set, sample sizes by measure ranged from 1892 for adolescent immunizations to 24 491 for diabetic retinal exam (Table 2). The number of plan-year units ranged from five for adolescent immunizations to 39 for diabetic retinal exams and breast cancer screening. The mean sample size per plan in each year was between 400 and 500 for most measures, consistent with NCQA standards for record-based data collection.

Table 2

Number of patients, number of health plans, and percentage receiving the indicated service by HEDIS measure

The distributions of the socioeconomic variables for the zip code areas of individual members, and of the means by health plan, are summarized in Table 3. Members of the different plans are drawn from areas with different racial and economic compositions, as shown by comparing the minimum and maximum plan means; for example, plans draw members from areas averaging as little as 10.4% Hispanic to as much as 35.2%, or from 20.9% to 39.6% college educated. The differences between plans on these variables are very consistent in the subsamples eligible for each of the HEDIS measures (data not shown). For age and sex, we used individual-level data, not data based on zip codes. The distributions of age and sex differ for each measure because these variables are used to define the eligible populations (e.g. breast cancer screening among women 52–69 years old, or hemoglobin A1c testing among diabetics 18–75 years old); for this reason, their distributions are not displayed in Table 3.

Table 3

Distributions of sociodemographic variables

The associations of patients’ socioeconomic characteristics and HEDIS performance

Table 4 shows the univariate associations of patients’ socioeconomic characteristics as assessed by zip code of residence with performance on the HEDIS measures. In general, residence in a zip code with a higher proportion of persons with high socioeconomic status and lower proportions of Black and Hispanic residents and those receiving public assistance was associated with better HEDIS performance. Among the zip code variables, the strongest and most consistent associations were with percent college educated (significant positive association with six HEDIS quality indicators) and percent receiving public assistance (negative association with seven indicators). Also negatively associated with indicators were percent Hispanic (four indicators) and percent Black (three indicators). Percent urban was positively associated with three indicators. Patterns for percent Asian were less consistent, with three positive associations and one negative (breast cancer screening). Use of beta-blockers was not associated with any zip code variable but had a fairly strong positive association with being male, and a negative association with age. Diabetic retinal exams showed the opposite pattern, with lower rates for males and increasing rates with age. Both childbirth-related indicators (prenatal care and post-delivery check-up) were positively associated with age.

Table 4

Impact of socioeconomic characteristics (by zip code) on HEDIS performance (univariate analysis, showing logistic regression coefficients and odds ratios)

The magnitude of the impact of the variables at the individual level is indicated by the odds ratios, also in Table 4. These compare the predicted probabilities for a person whose socioeconomic characteristics, as reflected by the composition of their residential zip code, are at the third quartile (i.e. in the middle of the top half of the data values) with those for a person at the first quartile (in the middle of the bottom half of the data). We regard this comparison as representing the effect of a moderate difference (the interquartile range or IQR) in the characteristic of interest: OR = exp(β × IQR), where β is the coefficient. For example, members at the third quartile of percent college educated resided in zip codes in which this percentage was 41.3%, while the corresponding percentage at the first quartile was 21.3%. This difference of 20.0 percentage points in college education predicts an odds ratio for receipt of prenatal care of exp(1.687 × 0.20) = 1.40 in favor of the more highly educated group. (We display this odds ratio rather than that for a difference between 0 and 100%, the lowest and highest theoretically possible values for these variables, because no area actually attains the 100% value for any of the sociodemographic variables.)
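The IQR-based odds ratio can be verified directly; the short sketch below reproduces the worked example for percent college educated (the function name is ours, and the quartile values and coefficient are taken from the text).

```python
import math

def iqr_odds_ratio(beta, q1, q3):
    """Odds ratio comparing a member at the third quartile of a zip-code
    characteristic to one at the first quartile: OR = exp(beta * (q3 - q1))."""
    return math.exp(beta * (q3 - q1))

# Worked example from the text: percent college educated,
# Q1 = 21.3%, Q3 = 41.3%, prenatal-care coefficient beta = 1.687.
or_college = iqr_odds_ratio(1.687, 0.213, 0.413)
# rounds to 1.40, matching the reported odds ratio
```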

In the interacted models, we found no significant interactions of these coefficients with either the plan or the year, suggesting that the effects of each of the socioeconomic predictors were fairly consistent across plans and between study years. Further models therefore included no interactions.

In multivariate regression models (Table 5), many of the same effects remained significant with the same sign as in the univariate regressions. Because of the large negative correlation (−0.674) between zip-code level percentages with college education and those receiving public assistance, the coefficients for these variables are not always easy to interpret, but this is not a problem for predictive use of the models [11]. (The education variable also had a strong negative correlation, −0.675, with percent Hispanic.) In models that exclude one or the other of these variables (data not shown), the significant coefficients are similar in sign to those in the univariate models, i.e. positive for percent college education and negative for percent receiving public assistance.

Table 5

Coefficients of multivariate regression including all predictors

Case-mix adjustments

We calculated adjustments for each plan in each year, using coefficients from the regression model that included all of the variables. The magnitude of typical adjustments was small, with mean absolute adjustments ranging across measures from 0.3 percentage points to 4.7 percentage points (Table 6). Most of the adjustments for these 10 measures were less than 5 percentage points. The largest adjustments were for use of beta-blockers (Figure 1), for which the score for the plan with the largest downward adjustment was decreased by 7.1 percentage points and the largest upward adjustment was 10.6 percentage points; thus the difference between these two plans was adjusted by 17.6 percentage points. Similarly the largest relative adjustment for diabetic retinal exams was 10.2 percentage points. Conversely, for check-up after delivery (Figure 2), adjustments were minimal.

Figure 1

Unadjusted and adjusted rates for beta-blockers measure. Each point represents the performance of one plan in 1 year. The diagonal line represents equality of unadjusted and adjusted rates.

Figure 2

Unadjusted and adjusted rates for check-up after delivery measure. (See caption to Figure 1.)

Table 6

HEDIS performance and impact of adjustment for patients’ socioeconomic characteristics in California health plans

For two measures (beta-blockers and retinal exams) the case-mix adjustments were larger than the standard error of the rates for more than half the plans, and the mean square of the adjustment/SE ratio exceeded 1 (2.1 for beta-blockers, 3.0 for retinal exams). Thus, for these measures the variation introduced into the rates by case mix was greater than sampling variability. The median adjustment/SE ratio for other measures ranged from 0.12 for check-up after delivery to 0.46 for prenatal care; thus, across the range of measures, adjustments for case mix were generally smaller than standard errors of estimates, but not negligible.


Discussion

Publicly reported performance measurement is intended both to empower market forces that encourage quality improvement and to stimulate providers’ efforts, motivated by their professional ethos, to improve quality [12]. Both effects depend critically on the perception of a ‘level playing field’. Our study of health plans in California shows that sociodemographic characteristics of health plan enrollees are statistically associated with measured HEDIS performance and that the differences in case mix among health plans in California are sufficiently large that case-mix adjustment would have a meaningful impact on comparisons, at least for some plans and for some indicators.

Many observers have speculated that variation in the sociodemographic mix of enrollees between different health plans may bias measurements of quality of care such as HEDIS [2–5]. The current NCQA policy of reporting HEDIS data by product—commercial, Medicare, Medicaid—is intended in part to reduce the need for case-mix adjustment. In fact, adjustment for case mix using geographically linked variables had little effect on the ratings of most plans in our study. Nevertheless, variation in case mix, even within the commercially insured population, can have a substantial impact on measured performance in some instances. In California, we found that two plans might appear to differ by as much as 17 percentage points in their rating for use of beta-blockers after myocardial infarction when their performance with similar patient populations would have been identical.

Case-mix adjustment of survey-based measures of health plans [13] and hospitals [14] has been a fairly common practice. Whether or not to adjust clinical quality measures for case mix has been controversial, with little data heretofore available on the potential impact of adjustment. Critics worry that adjustment will obscure differences in quality of care between health plans and reduce incentives to raise quality of care for vulnerable populations [15]. Proponents argue that without case-mix adjustment, measured ‘raw’ performance will sometimes be misleading and there will be important incentives to reduce access for disadvantaged populations. Furthermore, unadjusted plan-level measures reveal very little about quality differences affecting vulnerable groups, since those measures confound the effects of member characteristics and overall plan quality. Indeed, our results indicate that sociodemographic quality differences are consistent across all plans, not just those that serve large numbers of members from the disadvantaged groups. Thus to a large extent they are systematic rather than the responsibility of any one plan. These differences are potentially detected through the coefficients estimated in case-mix analyses, which should be reported if disparities are to be addressed. Recent initiatives to reduce racial and ethnic disparities in the quality of care, and proposals to tie providers’ payment rates more directly to quality of care, are both likely to increase these concerns.

Perhaps one acceptable middle ground is to present analyses stratified by population subgroups, an extension of current NCQA policy to publish data by product. This approach might preserve the transparency of unadjusted analyses while reducing incentives to limit access by disadvantaged populations. Because of sample size requirements and administrative costs, however, this approach is likely to be feasible only for common conditions and quality metrics that use administrative data.

Our findings are consistent both with our own earlier work [2] and with previous reports that minority, low-income, and less-educated individuals obtain services measured in HEDIS, such as mammography [16–20], influenza vaccination, Pap smears [16,19–21], and immunizations [22,23], at lower rates than other populations. These differences exist across plans and between individuals with similar commercial insurance; thus they are not merely due to these populations receiving care from lower-quality health plans or being much more likely to be uninsured.

Our study has limitations. We relied on zip code level variables for all sociodemographic factors except for age and sex. Individual personal characteristics might have an even stronger relationship with performance, and the impact of adjustment might have been still greater, especially if health plans selectively enroll members with different characteristics even within the same geographical area [24]. However, if case-mix adjustment for HEDIS does become the norm, it is likely to be done with similar zip code data, at least until other sociodemographic data become routinely available in health care information systems. Moreover, if health plans attempt to avoid enrollment of individuals perceived as likely to reduce quality ratings, they are likely to do so at least in part through marketing and recruitment practices that reduce enrollment of residents in certain geographical areas. Current efforts to include racial/ethnic identifiers and indicators of socioeconomic status in health plan enrollment records might make individual-level data available in the future for more powerful case-mix adjustments [25].

There was a 6–7 year gap between the 1990 census and collection of clinical and administrative data for the HEDIS measures that we studied. The rapidly changing demographic composition of California as a whole, and of many neighborhoods within California, made the census data inherently imprecise, possibly attenuating the relationships we identified. Our data on quality performance are several years old, but there is no reason to believe that the underlying relationships have changed. Our analyses were also based on data for the commercial HMO population in a single state, albeit an important one in terms of population size and penetration of managed care. The relationships we examined might vary regionally. Furthermore, the amount of variation in patient characteristics across plans, and consequently the magnitude of the impact of case-mix adjustment, might also be different in other markets.

In summary, these findings support and extend our previous work. Case mix is related to quality performance and varies across health plans that operate in the same regional area. Our findings suggest that in a market where purchasers and consumers are trying to make decisions based on ‘value’ as well as price, case-mix adjustment would likely have a very modest impact for most health plans and most quality measures but can have a substantial impact for a few.


Acknowledgements

We thank the California Cooperative HEDIS Reporting Initiative for making these data available for our study, John Hochheimer (then at NCQA) for assistance in obtaining the data, Lawrence B. Zaborski for assistance with the data analysis, and Eric C. Schneider for helpful comments. This research was supported by a QSPAN grant from the Agency for Healthcare Research and Quality (HS09473-03).

