Commonwealth Ranking: Are We Really 19th Out of 19?

If you think that the quality of your health care depends on whether your doctor keeps your medical records on an interoperable computer system or on whether the country has national health insurance – even if it means that sick people will die while waiting for care – then Why Not the Best?  Results from the National Scorecard on U.S. Health System Performance, 2008, from the Commonwealth Fund, is the scorecard for you.1

Choosing Nonmedical Benchmarks.  In its latest health system scorecard, the Commonwealth Fund updates about 90 percent of the indicators that were used in its 2006 scorecard.  On roughly a third of the measures, U.S. health care improved.  These indicators are generally the ones that measured some quantity associated with actual medical care.  The rest, which measured a hodge-podge of aggregate economic indicators, opinions of unknown quality and compliance with various ideologically-based policy prescriptions, stayed the same or got worse.  The Fund concludes that the United States as a whole “failed to keep pace with levels of performance attained by leading nations, delivery systems, states, and regions,” that access to health care in the United States “significantly declined,” and that health system efficiency remained low.2

Cherry-Picking the Benchmarks.  The score for each indicator is calculated as the ratio of the U.S. average to the Commonwealth benchmark.  Cherry-picking benchmarks helps Commonwealth obscure good results and highlight bad ones.  Many of the benchmarks are culled from the best results in the top 10 percent of states, regions, health plans (those that pay to be evaluated by the National Committee for Quality Assurance) or hospitals.  Since the U.S. average will always fall below the best performance among the states or regions that make it up, the design of the scorecard ensures that Commonwealth will always be able to report unacceptable performance – even if both the average performance and the 90th percentile benchmark performance continue to rise.
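The arithmetic behind this point can be sketched with made-up numbers.  The Python below is only an illustration, not Commonwealth's actual method: the ten state values are invented, the function name `scorecard_score` is hypothetical, the benchmark is assumed to be the mean of the top 10 percent of values, the U.S. figure is assumed to be a simple unweighted average, and the measure is assumed to be one where higher is better.

```python
def scorecard_score(values, top_share=0.10):
    """Hypothetical score: 100 * simple average / mean of the top `top_share` of values."""
    ranked = sorted(values, reverse=True)
    k = max(1, round(len(ranked) * top_share))   # size of the benchmark group
    benchmark = sum(ranked[:k]) / k              # assumed benchmark: best-decile mean
    average = sum(values) / len(values)          # assumed U.S. figure: unweighted mean
    return 100 * average / benchmark

# Invented performance rates for ten states (not real data).
before = [60, 65, 70, 72, 75, 78, 80, 82, 85, 90]
# Every state improves by five points.
after = [v + 5 for v in before]

score_before = scorecard_score(before)
score_after = scorecard_score(after)
print(f"score before: {score_before:.1f}, score after: {score_after:.1f}")
```

Under these assumptions every state improves, yet the score stays well short of 100 in both periods, because the benchmark moves up along with the average.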

Using Questionable Benchmarks.  The indicators themselves have strong ideological components.  Some, like percent of children with a medical home,3 adults with chronic conditions given self-management plans, and the percent of physicians using electronic medical records, have not been shown to decisively improve health.  Some measures, like the OECD measure of health insurance administration, almost certainly contain measurement error, especially in countries with publicly controlled systems.

Other measures, like U.S. infant mortality rates, are well known to be heavily correlated with demographic characteristics that have little to do with the performance of the health care system.  These correlations are not taken into account in the scorecard rankings.

Equating Low Spending with Efficiency.  The scorecard also indulges in the dangerous practice of equating low spending with efficiency.  This ignores the fact that it costs more to treat patients than to let them suffer.  In many cases, additional spending on new techniques and advanced pharmaceutical treatments correlates with better health.  As results from Canada and the United Kingdom make clear, lower spending, which invariably improves scorecard results, is not necessarily an indicator of improvement.

Using Questionable Measurements.  The scorecard often derives blanket measures that sound reasonable from collections of measures that are not.  In the context of evaluating a health care system, the normal assumption about “unsafe drug use” would be that it measures medication errors, inappropriate prescribing, or unsafe drugs.

In the Commonwealth report, “unsafe drug use” measures three things: (1) An estimate of ambulatory care visits for treating adverse drug effects in 2004 (which leaves out hospital reports of adverse drug reactions and fails to distinguish between health system causes such as inappropriately prescribed drugs, unexpected individual reactions to appropriately prescribed medicines, or problems caused when patients fail to follow instructions); (2) a measure of whether children were prescribed antibiotics for a sore throat without a strep test (something that might be a perfectly reasonable cost saving measure in a family of 5 young children all of whom have sore throats and two of whom test positive for strep); and (3) a 2004 ranking of whether the elderly used 1 of 33 inappropriate drugs.

Confusing “Access” with “Third-Party Insurance.”  The 2008 scorecard summary makes much of the supposed fact that “access to health care significantly declined, while health system efficiency remained low.”  This would clearly be cause for concern if the indicators Commonwealth uses to measure access to care actually measure whether people get the medical care they seek.

As the long waits for health care in Europe, Canada, and the United Kingdom show, having the government promise to pay for health care does not necessarily guarantee that any particular individual will actually receive the medical care he needs.  Commonwealth ignores this fact.  Two of its five access indicators measure aspects of third-party payer coverage.  They are “Adults under 65 insured all year, not underinsured” and families spending less than 10 percent of income (or less than 5 percent of income, if low income) on out-of-pocket medical costs and premiums.

A third access measure asks whether premiums for employer sponsored coverage are less than 15 percent of household median income.  As this access measure is affected by state regulation, employer benefits, unionization, and various demographic variables, it is a poor measure of health system performance.  Its effects overlap with another access measure: the percent of “Adults under 65 with no medical bill problems or medical debt,” a measure that would logically be affected by individual income, individual savings behavior, state health insurance regulation, state and federal assistance programs, and the definition of medical debt.  None of these are aspects of a health care system as conventionally described.

The out-of-pocket spending measures, which make up roughly 40 percent of the access score, are crucially dependent on how one defines underinsured.  In the Commonwealth world, underinsured means a health insurance deductible that is more than 5 percent of income, regardless of assets, or health care expenses that exceed 10 percent of family income.  (The figure is 5 percent for those with incomes below 200 percent of the poverty level, which is $20,800 for a single person in 2008.)  Since the United States spends roughly 16 percent of its national income on health care, meeting this goal is unlikely without extensive substitution of government-controlled, tax-funded spending for the individual purchase of health care and/or mandates requiring higher employer health insurance spending.  Employer mandates will inevitably be funded by lower wages.
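The definition can be written out as a short rule.  The Python below is a rough sketch of the test as it is described in this commentary, not Commonwealth's published algorithm: the function name `is_underinsured` and its parameters are hypothetical, the deductible test is read as "more than 5 percent of income," and the low-income cutoff is passed in by the caller because it depends on household size.

```python
def is_underinsured(income, deductible, out_of_pocket, low_income_cutoff):
    """Sketch of the 'underinsured' test as described in the text (assets are ignored)."""
    # Deductible test: a deductible above 5 percent of income.
    if deductible > 0.05 * income:
        return True
    # Spending test: 10 percent of income, or 5 percent for low-income households.
    limit_share = 0.05 if income < low_income_cutoff else 0.10
    return out_of_pocket > limit_share * income

# Median married-couple income cited in the notes; the cutoff used here is the
# two-person figure quoted in the text and is only an illustrative assumption.
INCOME = 69_716
CUTOFF = 28_000

high_deductible = is_underinsured(INCOME, 5_000, 0, CUTOFF)     # $5,000 deductible
lump_sum_braces = is_underinsured(INCOME, 3_000, 7_000, CUTOFF) # $7,000 paid in one year
modest_spending = is_underinsured(INCOME, 3_000, 2_000, CUTOFF)
print(high_deductible, lump_sum_braces, modest_spending)
```

Note that nothing in the rule looks at savings or other assets, which is why a household with a large health savings account balance can still be classified as underinsured.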

Ignoring Self-Insurance.  By choosing to evaluate the U.S. system using this measure, Commonwealth biases its evaluation in favor of third party payment.  There is a large body of evidence suggesting that third party payment for small or routine expenses does little to improve health or control health care costs.  In fact, once demographic variables are taken into account, there is considerable uncertainty about whether expanding third party coverage in the United States would translate into significant health benefits.4

The 6.1 million people who have purchased health savings account (HSA) plans (which allow more individual self-insurance and less third-party insurance) clearly do not agree with the Commonwealth bias in favor of third party payment.  A family HSA plan must have a minimum deductible of $2,200 in 2008.  If the people buying these plans shared Commonwealth’s view that lower deductibles equaled better health care, they would presumably purchase policies with the lowest possible deductibles.  Instead, the best-selling plans had deductibles of $4,846 in the individual market, $4,356 in the small group market, and $3,998 in the large group market.5

For ease of calculation, assume that the median household income of elderly Medicare beneficiaries, $27,798 in 2006, is now $28,000.  In 2008, Medicare Part B premiums are about $2,300 a year for two people.  Part D premiums are about $720 a year for two people.  This means that the median married couple Medicare household is paying $3,020 a year – more than 10 percent of income – in Medicare premiums.  Supplemental insurance, long-term care insurance, dental expenses, Medicare deductibles, Medicare copays, and uncovered medical visits are extra.  According to Federal poverty level guidelines, 200 percent of the poverty level is $28,000 for a two person household in 2008.  Because the median income sits at that threshold, half of households headed by someone over 65 are underinsured by Commonwealth’s standards if they spend more than 5 percent of income, or $1,400, on health care.  Medicare premiums alone exceed this.  Simple extrapolation suggests that a large fraction of the Medicare population falls into Commonwealth’s underinsured category.
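The Medicare arithmetic can be checked in a few lines.  The Python below simply restates the figures quoted in the text (the rounded $28,000 income and the approximate 2008 premiums for two people); the variable names are hypothetical, and the calculation is a sketch rather than an actuarial estimate.

```python
# Figures quoted in the text for a married-couple Medicare household in 2008.
MEDIAN_INCOME = 28_000   # rounded median income, also 200% of the two-person poverty level
PART_B = 2_300           # approximate annual Part B premiums for two people
PART_D = 720             # approximate annual Part D premiums for two people

premiums = PART_B + PART_D
premium_share = premiums / MEDIAN_INCOME
low_income_limit = 0.05 * MEDIAN_INCOME  # 5 percent spending limit for low-income households

print(f"premiums: ${premiums:,}")
print(f"share of income: {premium_share:.1%}")
print(f"5 percent limit: ${low_income_limit:,.0f}")
```

On these figures, premiums alone are more than double the 5 percent limit, before any supplemental coverage or out-of-pocket costs are counted.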

Ignoring Rationing by Waiting.  The next access measure, “Percent of adults with no access problems due to costs” is also biased in favor of a national health system.  People do not worry about costs in national health systems because they usually have minimal out-of-pocket payments.  What they do worry about is access – waits for care are long and advanced care may be denied, especially if one is elderly, disabled or a severely ill infant.

The Commonwealth international access benchmark does not include a single measure designed to capture either rationing by waiting or rationing by denial of specific types of care.  It compares U.S. health care to that in other countries using the percentage of adults who said that they had no access problem due to costs in the Commonwealth IHP telephone surveys.6, 7

Lacking basic demographic data on those surveyed poses a problem for those who wish to use the survey results to compare access and quality in the U.S. health care system with that in other countries.  The problem is that access and quality may vary significantly between U.S. private sector health care and government-run programs like Medicare, Medicaid, Tricare, and the Veterans Administration.  The survey gives no indication of whether the people responding to its questions are receiving government-run or private sector care, or of how their ages, incomes, and reported illnesses compare across countries.

Blendon et al. reported on the original Commonwealth IHP surveys in the May/June 2003 issue of Health Affairs.8  The Blendon paper makes it clear that delays of scheduled medical procedures and long waits for specialist care are considered much bigger problems in the benchmark countries than in the United States – 5 percent of those surveyed reported canceled medical procedures were a big problem in the United States, compared with 9 to 16 percent in other countries.  These data are not included in the Commonwealth scorecard.

Misusing Statistics.  Nine of Commonwealth’s 37 indicators compare U.S. results with those in other countries.  Six of those nine indicators are derived from the authors’ analysis of data obtained from the Commonwealth Fund IHP Survey.  The rankings assigned to the various countries create the impression of difference when, in some cases, a z-test for equal proportions suggests that the results from the survey provide no basis for assuming that a difference exists.

| Measure | AUS | CAN | GER | NZ | UK | US |
|---|---|---|---|---|---|---|
| Sample size, 2005 survey, sicker adults | 702 | 751 | 1503 | 704 | 1770 | 1527 |
| Percentage reporting medical mistake last 2 years | 13 | 15 | 13 | 14 | 12 | 15 |
| Commonwealth ranking (1 best, 6 worst) | 2.5 | 5.5 | 2.5 | 4 | 1 | 5.5 |
| Z-test for two proportions: is percentage reporting mistake significantly different from that of the U.S.? | No | No | No | No | Yes, but not significantly different from Germany | n/a |
| Rankings based on z-test results | 4.5 | 4.5 | 1.5 | 4.5 | 1.5 | 4.5 |

As the table shows, Commonwealth assigned a ranking from 1 to 6 to each country based on the proportion of people in the IHP survey of sicker people who reported that they had been subject to a medical error in the last two years.  Rankings were averaged in the case of a point estimate tie.  Aside from the question of the accuracy of individual impressions of error, the statistical problem is that those proportions are estimated with error from different sample sizes. Using the sample sizes given in the paper, a simple two-tailed z-test for the equality of percentages from the different countries suggests that there is no statistically significant difference at the .05 level between the proportion of medical errors reported for the United States and New Zealand, the United States and Canada, the United States and Australia, or the United Kingdom and Germany.9
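The test described above can be reproduced approximately in a few lines of Python.  This is a sketch under stated assumptions: it uses the pooled-variance form of the two-proportion z-test (the online calculator cited in note 9 may use a slightly different formula), and it uses the whole-number percentages from the table, so a comparison close to the 0.05 cutoff could shift with unrounded data.  The names `SURVEY`, `two_proportion_z`, and `differs` are hypothetical.

```python
import math

# (sample size, proportion reporting a medical mistake), taken from the table.
SURVEY = {
    "AUS": (702, 0.13),
    "CAN": (751, 0.15),
    "GER": (1503, 0.13),
    "NZ": (704, 0.14),
    "UK": (1770, 0.12),
    "US": (1527, 0.15),
}

def two_proportion_z(n1, p1, n2, p2):
    """Return (z, two-tailed p-value) for H0: the two proportions are equal."""
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-tailed p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

def differs(a, b, alpha=0.05):
    """True if countries a and b differ significantly at level alpha."""
    _, p_value = two_proportion_z(*SURVEY[a], *SURVEY[b])
    return p_value < alpha

for country in ("AUS", "CAN", "GER", "NZ", "UK"):
    z, p = two_proportion_z(*SURVEY[country], *SURVEY["US"])
    print(f"{country} vs US: z = {z:+.2f}, p = {p:.3f}, significant: {differs(country, 'US')}")
print("UK vs GER significant:", differs("UK", "GER"))
```

With these inputs, only the United Kingdom differs significantly from the United States, and the United Kingdom does not differ significantly from Germany, which matches the z-test row of the table.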

The simple test of proportions, and the revised rankings derived from its results, suggest that the rankings assigned by Commonwealth create an artificial impression of lower performance in the United States.  Commonwealth does this by ranking countries from 1 to 6 based on point estimates without considering statistical error.  The scorecard calculation propagates the false impression of difference by using the improperly applied rankings in its calculations.  Unsurprisingly, Commonwealth then concludes that “the U.S. continues to rank last on safe care.”

Where the United States Improved.  For the record, here is the list of Commonwealth Fund indicators that showed improvement between 2006 and 2008.  The improvements may have been judged insufficient by the scorecard depending on the benchmark used for comparison:

  1. Deaths per 100,000 population fell.
  2. Infant mortality fell.
  3. Higher percentages of adults and children received recommended preventive care.
  4. Higher percentages of diabetics and hypertensive adults had better blood sugar and blood pressure control.
  5. A higher percentage of hospital patients received recommended care.
  6. A higher percentage of heart failure patients received written discharge instructions. (This is a process measure; whether it improves outcomes is unknown.)
  7. A slightly lower percentage of patients reported medical, medication, or lab test errors.
  8. A lower percentage of children were prescribed antibiotics for a sore throat without first getting a strep test.
  9. A smaller percentage of short-stay nursing home residents developed pressure sores.
  10. A higher percentage of patients reported that doctor-patient communication was adequate.
  11. A higher percentage of adults reported that they had no cost barriers to health care access.
  12. Actual to expected deaths for hospital mortality ratios fell from 101 to 82.

Notes

  1. The Commonwealth Fund Commission on a High Performance Health System, Why Not the Best? Results from the National Scorecard on U.S. Health System Performance, 2008, the Commonwealth Fund, July 2008.
  2. The Commonwealth Fund Commission on a High Performance Health System, Why Not the Best? the Commonwealth Fund, July 2008, executive summary, abstract and page 10.  Online version accessed July 17, 2008. http://www.commonwealthfund.org/usr_doc/1150_WhyNottheBest_EXEC_SUMM_METHODOLOGY_ONLY.pdf?section=4039
  3. Though current health policy discussions sound as if a medical home should be a national priority, a search of PubMed on the topic yields just 268 references on “medical home.”  There are virtually no good quality empirical studies demonstrating that “medical homes” improve health care except in the case of special needs children who generally already have them.  There is little evidence that children without medical homes need them or that children who need medical homes lack them.  The bulk of the papers are qualitative descriptions of what a medical home should be, how to evaluate it, and so forth.  As one abstract memorably points out, “The Medical Home model for providing services to children with special healthcare needs has strong philosophical foundation, but the science supporting this theoretical model is not as well developed.  The use of logic models and mixed method design provide systematic and rigorous approaches to observation while retaining the complexity, which tends to be lost with research designs intended to control and reduce the number of variables impacting a desired outcome, such as randomized controlled trials.”  In short, there is no generally acceptable evidence on the effect of medical homes, just as there was no generally acceptable evidence that medical gatekeepers, an earlier version of the medical home, could either improve care or lower costs.
  4. For example, see Helen Levy and David Meltzer, What Do We Really Know About Whether Health Insurance Affects Health? ERIU Working Paper 6, Economic Research Initiative on the Uninsured, December 20, 2001.
  5. AHIP Center for Policy and Research, January 2008 Census Shows 6.1 Million People Covered by HSA/High-Deductible Health Plans, America’s Health Insurance Plans, April 2008.

    Ignoring Assets.  Because the Commonwealth measure ignores assets, a married couple household with the 2006 median married couple family income of $69,716 and a $5,000 deductible would be classified as “underinsured” – even if it has $25,000 in a health savings account.  It is “underinsured” if it has been saving for braces for two teenagers and pays $7,000 in a lump sum in order to get a quantity discount.  Commonwealth’s arbitrary limit also ignores the fact that people who choose to substitute leisure for income by working part-time or at jobs with more flexibility and less pay will as a matter of course pay a higher fraction of their income for necessities in general, and for health care in particular.

    Census Bureau estimates of health insurance coverage provide a rough check on the reliability of Commonwealth’s access claims.  For 2006 it estimates that 46,995,000 people did not have private insurance or were not covered by public entitlement programs.  Of those, 541,000 were 65 or older, and 8,661,000 were children under 18. This leaves 37,793,000 uninsured people, about 20 percent of the total adult population of 188,402,570 between 18 and 65.  Without the chimera of “underinsured” the United States would have scored 80 against a benchmark of 100, rather than the 58 it received.

    Applying the Commonwealth Criteria of “Underinsured” to the Medicare Population.  It is fortunate that Commonwealth restricted its access measure to those under 65.  By its standards, an even larger fraction of Medicare beneficiaries are underinsured.

    The median household income for elderly Medicare beneficiaries, primarily U.S. residents over 65, was $27,798 in 2006. (Carmen DeNavas-Walt et al., Income, Poverty, and Health Insurance Coverage in the United States: 2006, Current Population Reports, P60-233, U.S. Census Bureau, 2006.)

  6. Figuring out which survey is used is sometimes tricky.  The appendix references simply say “Commonwealth Fund IHP Survey” and “analysis by authors using survey sample of adults with health problems.”  The benchmark says the U.S. score is compared to the best of 7 countries in the report’s Exhibit 2, but the Commonwealth report on its international surveys says that seven countries were surveyed in Commonwealth’s 2006 survey of primary care physicians.  Adults with health problems were surveyed in 6 countries in 2005.  Sample sizes in the 2005 survey were small, about 700 in Australia, Canada, and New Zealand and 1,500 in Germany, the United States and the United Kingdom. Routine sample statistics were not included with the Commonwealth report updating the survey.
  7. Karen Davis et al., Mirror, Mirror on the Wall: An International Update on the Comparative Performance of American Health Care, the Commonwealth Fund. Online edition (updated version May 16, 2007).  Online version accessed July 23, 2008. http://www.commonwealthfund.org/usr_doc/1027_Davis_mirror_mirror_international_update_final.pdf?section=4039
  8. Robert J. Blendon et al., “Common Concerns Amid Diverse Systems: Health Care Experiences in Five Countries,” Health Affairs, volume 22, number 3, May/June 2003, pages 106-121.
  9. Computing many-one comparisons among the sample proportions is beyond the scope of this commentary. The calculator used for the test is available at http://www.dimensionresearch.com/resources/calculators/ztest.html.  A two-tailed test was run with a 0.05 significance level.

Comments (7)


  1. Sally Pipes says:

    Good job.

  2. Glen Nelson says:

    We may have the best care in the world, but we also have some of the worst because of the inconsistency of our delivery system and thus our “average” results aren’t very impressive. If you aren’t the Prime Minister of Italy or have another way to selectively “cherry pick” our system, your odds of getting the “best” care in the U.S. may not be so appealing!

  3. Alfonso Chiscano says:

    Dear John Thank you for all your info..Is the cost of access to our great medical system the PROBLEM.

  4. Phil Haberstro says:

    John, your thoughts on the numerous reports (including I believe the WHO) that have ranked the health status of the American population as poorer than many countries.

  5. Regina Herzlinger says:

    Very nice work

  6. annie moose says:

    right wing hit piece good job

  7. Larry Wett says:

    Wow, Your tireless diligence to disguise a simple truth is truly amazing. Thanks for making complex, something that is actually quite easy to interpret. We have problems and spending 16% of our GDP on healthcare is preposterous, when other nations are spending 6-10% of GDP.