
Application of patient safety indicators internationally: a pilot study among seven countries

Saskia E. Drösler, Niek S. Klazinga, Patrick S. Romano, Daniel J. Tancredi, Maria A. Gogorcena Aoiz, Moira C. Hewitt, Sarah Scobie, Michael Soop, Eugene Wen, Hude Quan, William A. Ghali, Soeren Mattke, Edward Kelley
DOI: http://dx.doi.org/10.1093/intqhc/mzp018. Pages 272-278. First published online: 24 April 2009


Objective To explore the potential for international comparison of patient safety as part of the Health Care Quality Indicators project of the Organization for Economic Co-operation and Development (OECD) by evaluating patient safety indicators originally published by the US Agency for Healthcare Research and Quality (AHRQ).

Design A retrospective cross-sectional study.

Setting Acute care hospitals in the USA, UK, Sweden, Spain, Germany, Canada and Australia in 2004 and 2005/2006.

Data sources Routine hospitalization-related administrative data from seven countries were analyzed. Using algorithms adapted to the diagnosis and procedure coding systems in place in each country, authorities in each of the participating countries reported summaries of the distribution of hospital-level and overall (national) rates for each AHRQ Patient Safety Indicator to the OECD project secretariat.

Results Each country's vector of national indicator rates and the vector of American patient safety indicator rates published by AHRQ (and re-estimated as part of this study) were highly correlated (0.821–0.966). However, there was substantial systematic variation in rates across countries.

Conclusions This pilot study reveals that AHRQ Patient Safety Indicators can be applied to international hospital data. However, the analyses suggest that certain indicators (e.g. ‘birth trauma’, ‘complications of anesthesia’) may be too unreliable for international comparisons. Data quality varies across countries; undercoding may be a systematic problem in some countries. Efforts at international harmonization of hospital discharge data sets as well as improved accuracy of documentation should facilitate future comparative analyses of routine databases.

  • patient safety
  • quality indicators
  • International Classification of Diseases


International comparisons of health system performance are gaining popularity. Over the past 5 years, the Health Care Quality Indicators (HCQI) project of the Organization for Economic Co-operation and Development (OECD) has, despite several methodological hurdles, made progress in developing and reporting on internationally comparable indicators of quality of care [1, 2].

Patient safety is considered an important aspect of quality of care, and many countries have expressed interest in comparable information to facilitate benchmarking and enhance mutual learning. As a first step in OECD's HCQI project, an international expert panel in 2004 selected potential patient safety indicators from the literature based on three criteria: importance for patient safety, scientific soundness and potential feasibility for international data collection. Through a structured ranking process, the expert panel generated a list of 21 patient safety indicators from 59 candidate indicators [3, 4]. An important source was the patient safety indicator set of the US Agency for Healthcare Research and Quality (AHRQ), which contributed the 12 indicators shown in Table 1. The next step in the HCQI project was to assess the feasibility of using these 12 indicators to study patient safety across multiple countries. Seven OECD countries agreed in 2006 to join such a feasibility study, and the results are described in this article.

Table 1

Overall population patient safety indicator rates (%) for seven OECD member countries

| Patient safety indicator | Country A | Country B | Country C | Country D | Country E | Country F | Country G | Ratio between highest and lowest rate |
|---|---|---|---|---|---|---|---|---|
| Complications of anesthesia (1) | 0.009 | 0.135 | 0.145 | 0.079 | 0.003 | 0.023 | 0.102 | 48.3 |
| Decubitus ulcer (3) | 0.796 | 0.24 | 1.264 | 2.661 | 0.208 | 0.765 | 2.52 | 12.8 |
| Foreign body left during procedure (5) | 0.005 | 0.007 | 0.004 | 0.006 | 0.002 | 0.004 | 0.009 | 4.5 |
| Selected infections due to medical care (7) | 0.146 | 0.095 | 0.073 | 0.281 | 0.029 | 0.089 | 0.251 | 9.7 |
| Postoperative hip fracture (8) | 0.005 | 0.017 | 0.033 | 0.033 | 0.054 | 0.005 | 0.03 | 10.8 |
| Postoperative pulmonary embolism or deep vein thrombosis (12) | 0.261 | 0.101 | 0.617 | 0.333 | 0.129 | 0.116 | 1.079 | 10.7 |
| Postoperative sepsis (13) | 0.418 | 0.377 | 0.317 | 0.96 | 0.052 | 0.246 | 1.151 | 22.1 |
| Accidental puncture or laceration (15) | 0.166 | 0.392 | 0.077 | 0.144 | 0.167 | 0.075 | 0.356 | 5.2 |
| Transfusion reaction (16) | 0.0003 | 0.0009 | NA | 0.0002 | 0.0002 | 0.0001 | 0.0004 | 9 |
| Birth trauma - injury to neonate (17) | 0.521 | 0.192 | 0.151 | 1.448 | 0.132 | 0.891 | 0.261 | 11 |
| Obstetric trauma - vaginal delivery (18) | 1.199 | 4.104 | NA | 2.386 | 4.013 | 2.289 | 4.072 | 3.4 |
| Obstetric trauma - cesarean delivery (20) | 0.281 | 0.783 | NA | 0.199 | 0.161 | 0.096 | 0.436 | 8.2 |

  • Numbers in column 1 refer to AHRQ Patient Safety Indicators. NA, not available.


Patient safety indicators

The AHRQ Patient Safety Indicators were developed on behalf of the AHRQ by a team at the University of California and Stanford University. Precise documentation on indicator definitions and on the process for indicator development, selection and continuous review is available online in the public domain [5, 6]. The indicators exclusively rely on routinely collected hospital data that report the diagnoses, procedures, diagnosis-related groups (DRG) and selected patient-related data elements pertaining to each hospitalization. Since a separate but related set of pediatric quality indicators was created by the AHRQ in 2005, all of the AHRQ Patient Safety Indicators except obstetric injuries have been limited to adults aged 18 or more years at admission. In the USA, selected indicators are used for comparative hospital ranking in several states [7] and country-wide public reporting [8]. The indicator definitions published by the AHRQ are based on the International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM). The AHRQ offers a subset of indicators for application on area-level data to assess the quality of care in certain geographical regions or other defined populations (e.g. members of a specific insurance plan). The area-level indicators are calculated differently from the hospital-level indicators as the definition of numerator events at the area level is broader. In the OECD project, hospital-level indicators are used as there is no information available about geographical regions within the participating countries.

Data collection

The following seven OECD member countries volunteered to participate in the project: Australia, Canada, Germany, Spain, Sweden, the UK (England only) and the USA. The structure of administrative hospital databases is similar in all participating countries: principal diagnosis and secondary diagnoses are precisely defined and coded according to various versions of the ICD. The number of available secondary diagnosis fields ranges from 6 to 50, whereas for procedure coding between 12 and 100 fields are available. Six different procedure classification systems are in use, so only two of the seven participants use the same procedure catalog. Information on each patient's age, admission and discharge status (e.g. discharged home, admitted from or transferred to another acute care facility, deceased) and length of stay is also available. To achieve comparability in indicator calculation, several methodological issues had to be taken into account: (i) the participating countries use different ICD versions and procedure classifications, (ii) documentation rules, such as definitions of principal and secondary diagnoses, vary among countries and (iii) not all participants use a DRG system for hospital payment. To minimize the impact of these differences, the OECD provided a detailed technical manual to facilitate each country's calculation of its patient safety indicator rates [9].

Five of the seven participating countries use ICD-10 (or country-specific modifications thereof) instead of ICD-9-CM. A German project tested the feasibility of patient safety indicator calculation in ICD-10 and found high concordance between indicator rates based on American and German hospital data [10]. A British study applied ICD-10 indicator definitions to English data and confirmed the potential for monitoring safety events [11]. An international research consortium, the International Methodology Consortium for Coded Health Information, evaluated the accuracy of the existing ICD-10 translations and provided an internationally harmonized ICD-10 version for 15 indicators [12, 13]. That harmonized translation was a major prerequisite for performing comparisons across countries and across coding systems.

AHRQ Patient Safety Indicator definitions distinguish between the principal diagnosis and secondary diagnoses. If the critical event is coded as the principal diagnosis, it is assumed, according to American coding standards [14], that the patient was admitted with that condition and that the condition did not occur during the hospitalization. Two participating countries, Canada and Sweden, use a different definition of the principal diagnosis, based on the disease consuming the most resources during the hospitalization. If a hospital-acquired complication turns out to be the principal diagnosis, the case would not be counted as a safety event using the AHRQ logic and would thereby become a false negative. A diagnosis type indicator in the administrative data, indicating whether a condition was present at admission or acquired during hospitalization, can solve this conflict. Canada, using a diagnosis type indicator, was able to reconfigure its administrative database to achieve comparability in calculation of indicator rates.
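The false-negative problem described above can be illustrated with a minimal sketch. The record layout, the function names and the codes `EVT1`, `DX2` and `DX3` are hypothetical placeholders chosen for illustration; they are not real ICD codes and this is not the algorithm from the OECD technical manual.

```python
def flagged_by_position(principal, secondary, event_codes):
    """Position-based logic: count an event only when an event code
    appears among the secondary diagnoses (principal is assumed to be
    present at admission)."""
    return any(code in event_codes for code in secondary)


def flagged_by_type(diagnoses, event_codes):
    """Type-based logic: count an event when any diagnosis, principal or
    secondary, is an event code marked as acquired during the stay."""
    return any(code in event_codes and acquired for code, acquired in diagnoses)


EVENT_CODES = {"EVT1"}  # hypothetical placeholder, not a real ICD code

# Resource-based principal diagnosis: the hospital-acquired complication
# consumed the most resources, so it was coded as the principal diagnosis.
principal = "EVT1"
secondary = ["DX2", "DX3"]
missed = flagged_by_position(principal, secondary, EVENT_CODES)

# Same stay, with a (hypothetical) diagnosis type flag per code:
# True means acquired during hospitalization.
typed = [("EVT1", True), ("DX2", False), ("DX3", False)]
caught = flagged_by_type(typed, EVENT_CODES)

print(missed, caught)
```

Under these assumptions the position-based rule misses the event while the type-based rule counts it, which is the conflict a diagnosis type indicator resolves.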

Although the selected patient safety indicators mainly rely on diagnosis codes, procedure codes are used to help define certain indicators (e.g. postoperative hip fracture or postoperative pulmonary embolism or deep vein thrombosis). As there is no uniform international system for procedure classification, the countries were provided with lists of specific procedures and asked to find the corresponding codes and to perform the calculation accordingly.

In comparison to the original AHRQ definitions of the patient safety indicators, three indicators were slightly modified for this project. ‘Complications of anesthesia’ cases assigned to MDC 14 (Major Diagnostic Category, obstetrics DRG chapter) are excluded here but not in the AHRQ definition. In addition to certain drug intoxications, this indicator is supposed to capture cases in which a failed intubation occurred. The ICD classification carries separate codes to differentiate failed intubations during non-obstetric clinical situations from those that occur during childbirth. The existing ICD codes for intubation failure in obstetric patients are not included in the AHRQ definition. Therefore, it seems inconsistent to include obstetric cases in the denominator of this indicator. Regarding the indicator ‘obstetric trauma―vaginal delivery’, the AHRQ definition distinguishes between deliveries with instrumentation (forceps or vacuum) and deliveries without instrumentation. To avoid conflicts due to different procedure classifications, these indicators were joined together in the OECD project [15]. Again, due to the lack of a uniform procedure classification system, the procedure-based exclusions used to identify cases with operative treatment were omitted for indicator ‘decubitus ulcer’.

The lead agency in each of the participating countries calculated rates after converting the ICD-10-WHO code list to the country-specific version of ICD-10. As the structures of the classification trees are similar, this task was manageable. For each indicator, countries were asked to provide population rates based on overall national numerator and denominator counts, as well as mean hospital rates and their standard deviations. A separate questionnaire explored technical details about the structure of each country's administrative database and country-specific documentation issues (e.g. mean number of secondary diagnoses).


Between-country variations in indicator rates are estimated from ratios between highest and lowest value (displayed in Table 1) and by estimation of the coefficient of variation (CV: ratio between weighted standard deviation and weighted mean, displayed in Table 2). Between-country concordance in the relative magnitude of indicator rates was tested by computing the Pearson's correlation coefficient between each country's vector of indicator rates and the USA vector. The American results were regarded here as the ‘reference standard’ because the USA developed the methods of patient safety indicator calculation and has the longest experience in collecting ICD-coded data for payment and surveillance. Statistical analyses were performed using Release 12.0 of SPSS for Windows (Copyright 1989–2003, Chicago: SPSS Inc.) and Version 9.1 of the SAS System for Windows.
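The two variation measures described above can be sketched as follows. The rates are the ‘complications of anesthesia’ row of Table 1; the equal weights are an assumption made only for illustration, because the national denominators used as weights in the study are not published, so the resulting CV is not expected to reproduce Table 2. The weighted standard deviation is computed here in its population form, which is also an assumption.

```python
import math


def high_low_ratio(rates):
    """Ratio between the highest and lowest reported rate, skipping missing values."""
    present = [r for r in rates if r is not None]
    return max(present) / min(present)


def weighted_cv(rates, weights):
    """Coefficient of variation: weighted standard deviation / weighted mean."""
    pairs = [(r, w) for r, w in zip(rates, weights) if r is not None]
    total = sum(w for _, w in pairs)
    mean = sum(r * w for r, w in pairs) / total
    variance = sum(w * (r - mean) ** 2 for r, w in pairs) / total
    return math.sqrt(variance) / mean


# Table 1, 'complications of anesthesia', Countries A-G (rates in %).
anesthesia = [0.009, 0.135, 0.145, 0.079, 0.003, 0.023, 0.102]

# Assumption: equal weights stand in for the unpublished denominators.
equal_weights = [1.0] * len(anesthesia)

ratio = high_low_ratio(anesthesia)
cv = weighted_cv(anesthesia, equal_weights)

print(round(ratio, 1))   # matches the 48.3 reported in Table 1
print(round(100 * cv))   # CV in %, under the equal-weight assumption
```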

Table 2

Weighted mean population rates (%) and weighted standard deviations for patient safety indicator rates across participating countries

| Patient safety indicator | n (countries) | Rate (%) weighted mean | Rate (%) weighted standard deviation | CV (%) |
|---|---|---|---|---|
| Complications of anaesthesia | 7 | 0.07 | 0.04 | 57 |
| Decubitus ulcer | 7 | 1.9 | 0.9 | 47 |
| Foreign body left during procedure | 7 | 0.007 | 0.002 | 29 |
| Selected infections due to medical care | 7 | 0.2 | 0.1 | 50 |
| Postoperative hip fracture | 7 | 0.02 | 0.01 | 50 |
| Postoperative pulmonary embolism or deep vein thrombosis | 7 | 0.6 | 0.4 | 67 |
| Postoperative sepsis | 7 | 0.8 | 0.4 | 50 |
| Accidental puncture or laceration | 7 | 0.3 | 0.1 | 33 |
| Transfusion reaction | 6 | 0.0003 | 0.0002 | 67 |
| Birth trauma - injury to neonate | 7 | 0.4 | 0.3 | 75 |
| Obstetric trauma - vaginal delivery | 6 | 3.7 | 0.8 | 22 |
| Obstetric trauma - cesarean delivery | 6 | 0.4 | 0.1 | 25 |

Between-country comparisons in the relative magnitudes of the vector of national indicator rates were performed in order to identify countries with unusually low, high or mixed results. The 12 indicator rates reported for each country were log-transformed and then treated as components of a geometric vector in a multi-dimensional space. The Euclidean distance between each country's vector and the weighted centroid of the vectors from the other countries was computed [16]. The formula used for this calculation is:

$$d = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$$

For the ith indicator (i = 1, … , n), the variable p_i represents the log-transformed population rate in the country under investigation. The variable q_i reflects the weighted mean of the log-transformed population rates for the ith indicator in the remaining countries. Population rates were log-transformed before computing Euclidean distances to stabilize variations across the indicators. A relatively large Euclidean distance indicates that a country is unusual compared with other countries in at least one and generally more than one indicator.
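The distance calculation described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the study's computation: it uses only three indicators from Table 1, replaces the weighted centroid with an unweighted mean (the weights are not published) and omits the studentization applied in Fig. 1. The function name is hypothetical.

```python
import math


def log_distance_to_others(rates_by_country, country):
    """Euclidean distance between one country's log rates (p_i) and the
    mean log rates of the remaining countries (q_i), skipping missing values."""
    squared = 0.0
    for i, rate in enumerate(rates_by_country[country]):
        if rate is None:
            continue
        others = [
            values[i]
            for name, values in rates_by_country.items()
            if name != country and values[i] is not None
        ]
        p_i = math.log(rate)
        # Assumption: unweighted mean in place of the weighted centroid.
        q_i = sum(math.log(r) for r in others) / len(others)
        squared += (p_i - q_i) ** 2
    return math.sqrt(squared)


# Table 1 rates (%) for indicators 1, 3 and 5: complications of anesthesia,
# decubitus ulcer, foreign body left during procedure.
rates = {
    "A": [0.009, 0.796, 0.005],
    "B": [0.135, 0.24, 0.007],
    "C": [0.145, 1.264, 0.004],
    "D": [0.079, 2.661, 0.006],
    "E": [0.003, 0.208, 0.002],
    "F": [0.023, 0.765, 0.004],
    "G": [0.102, 2.52, 0.009],
}

distances = {name: log_distance_to_others(rates, name) for name in rates}
for name, d in sorted(distances.items(), key=lambda item: -item[1]):
    print(name, round(d, 2))
```

On this three-indicator subset Country E lies farthest from the other countries; rankings for the remaining countries depend on which indicators and weights are used.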


All seven countries returned indicator rates to the OECD secretariat; however, one country did not use the OECD technical manual but instead used a previous version of the AHRQ manual, and therefore had to omit three indicators due to major differences in definitions.


The sample sizes were estimated from the indicator ‘foreign body left during procedure’, as the denominator of this indicator captures nearly all medical and surgical cases. The term ‘procedure’ in the title of this indicator reflects surgical operations as well as medical or interventional treatments (e.g. endoscopy). Relative to the overall number of hospital discharges [17], five participating countries used more than 70% of available data and one country used 68%. Taking into account that inpatients younger than 18 years and patients treated in psychiatric institutions are excluded from calculation of the indicators, six countries included a majority of adult inpatients.

Table 1 depicts the population rates that countries reported to the OECD secretariat, with countries randomly assigned labels A–G. For privacy reasons, single numerator and denominator counts are not shown. These numbers were used for further descriptive analyses, including the weighted means and weighted standard deviations shown in Table 2.

We found systematic variation in indicator rates across countries (Tables 1 and 2), ranging from a 3.4-fold difference for ‘obstetric trauma - vaginal delivery’ (CV = 22%) and a 4.5-fold difference for ‘foreign body left during procedure’ (CV = 29%) to an 11-fold difference for ‘birth trauma’ (CV = 75%) and a 48-fold difference for ‘complications of anaesthesia’ (CV = 57%). Reliability analyses (not displayed) showed that only one indicator, ‘birth trauma’, was not positively associated with other rates at the national level.

In the concordance analysis for relative magnitudes of each country's vector of indicator rates with the American vector, we found significant correlations (P ≤ 0.01), with Pearson's coefficients ranging from 0.821 to 0.966. These results suggest that the within-country relative magnitudes of the various indicators are concordant with US data.
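A concordance check of this kind can be sketched with a plain Pearson correlation between two countries' indicator vectors. Because the country labels in Table 1 were assigned randomly, the American column cannot be identified; Countries A and C are used here purely as an example pair, and dropping indicators that are missing in either country is an assumption about how missing values were handled.

```python
import math


def pearson(x, y):
    """Pearson correlation over the positions where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy)


# Table 1 columns (rates in %), indicators in table order; None marks NA.
country_a = [0.009, 0.796, 0.005, 0.146, 0.005, 0.261,
             0.418, 0.166, 0.0003, 0.521, 1.199, 0.281]
country_c = [0.145, 1.264, 0.004, 0.073, 0.033, 0.617,
             0.317, 0.077, None, 0.151, None, None]

r = pearson(country_a, country_c)
print(round(r, 3))
```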

Regarding the raw data given in Table 1, the question arises whether there is systematic variation at the country level. For example, is there a country showing overall high or low rates compared with other countries? The raw data reveal that Country G has the highest or second highest rates for eight indicators. Rather low rates were reported by countries E (lowest rates for six indicators) and F (lowest or second lowest rates for eight indicators). The grey columns in Fig. 1 show the studentized Euclidean distances for the participating countries using all indicator rates. Our hypothesis that countries E and F are outliers compared with the other countries is supported, as they show the greatest distances from the centroid. To verify this result, the computation was repeated using divided data with two sets of six indicators each. Set 1 includes indicators with rather high variability (1, 3, 8, 12, 15 and 17), whereas set 2 (5, 7, 13, 16, 18 and 20) contains more stable indicators. The white and dark columns in Fig. 1 represent the Euclidean distances generated from both data sets. For both data sets, the ranking of the Euclidean distances is concordant with the unsplit data. The low Euclidean distances for Country C must be interpreted with caution due to three missing values.

Figure 1

Studentized Euclidean distances of logarithmized patient safety indicator rates (three missing values for Country C).

To understand the variation at the institutional level, countries were asked to report mean hospital rates and their standard deviations. Table 3 shows the means and standard deviations of within-country hospital-level rates for selected indicators. Five of seven countries reported these data; the number of institutions included in this analysis ranged from 237 to 698 acute care hospitals. For all indicators, the standard deviations are relatively large; however, Country E shows the highest standard deviations for most of the indicators. As these calculations were performed without denominator weighting, the results should be interpreted cautiously. Owing to their unweighted calculation, mean hospital rates in Table 3 differ from population rates shown in Table 1.
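Why an unweighted mean of hospital rates differs from the population rate can be seen in a small worked example. The three hospitals and their counts below are hypothetical and chosen only to make the effect visible; they are not study data.

```python
import statistics

# Hypothetical (numerator, denominator) counts for three hospitals.
hospitals = [(2, 10000), (1, 100), (0, 50)]

# Population rate: pool all numerators and denominators first.
total_events = sum(n for n, _ in hospitals)
total_cases = sum(d for _, d in hospitals)
population_rate = 100 * total_events / total_cases

# Unweighted mean hospital rate: average the per-hospital rates,
# so a small hospital counts as much as a large one.
hospital_rates = [100 * n / d for n, d in hospitals]
mean_hospital_rate = statistics.mean(hospital_rates)
sd_hospital_rate = statistics.stdev(hospital_rates)

print(round(population_rate, 3))
print(round(mean_hospital_rate, 2), round(sd_hospital_rate, 2))
```

In this example a single event in a small hospital pulls the unweighted mean far above the pooled rate and inflates the standard deviation, which is one reason such results call for cautious interpretation.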

Table 3

Variation across hospitals (mean hospital rates [%, unweighted] and standard deviations) of selected patient safety indicators from five countries

| Patient safety indicator | Mean | SD | Mean | SD | Mean | SD | Mean | SD | Mean | SD |
|---|---|---|---|---|---|---|---|---|---|---|
| Decubitus ulcer | 1.37 | 2.95 | 0.12 | 0.27 | 1.38 | 1.57 | 0.21 | 14.70 | 0.94 | 1.87 |
| Foreign body left during procedure | 0.004 | 0.008 | 0.027 | 0.036 | 0.004 | 0.009 | 0.001 | 0.780 | 0.002 | 0.010 |
| Selected infections due to medical care | 0. |
| Postoperative pulmonary embolism or deep vein thrombosis | 0.25 | 0.46 | 0.19 | 0.78 | 0.60 | 0.54 | 0.11 | 0.97 | 0.25 | 1.75 |
| Postoperative sepsis | 0.34 | 0.63 | 0.63 | 4.15 | 0.30 | 0.29 | 0.05 | 0.51 | 0.19 | 0.22 |
| Accidental puncture or laceration | 0.13 | 0.12 | 0.34 | 0. |


Although it would be premature to make inferences about the safety of patient care in the participating countries, our study demonstrates that the AHRQ Patient Safety Indicators can be applied to hospital data from multiple countries. One main methodological challenge, the consistent translation of indicator definitions from ICD-9-CM to ICD-10, was overcome through the international research consortium. However, differences in ICD coding guidelines and practices, such as optional versus mandatory use of certain codes to describe external causes of disease, affect the rates of those indicators that are defined using E-codes (e.g. ‘complications of anaesthesia’ and ‘foreign body left during procedure’).

This project selected only patient safety indicators primarily defined on diagnoses, as there is no common procedure classification across countries. If a reference procedure classification issued by WHO were implemented across countries, it would permit electronic linkage between different country-specific procedure classifications using information technology methods such as cross walking. A recent validation study from the USA revealed that those indicators built largely on procedure codes are relatively accurate (e.g. ‘postoperative wound dehiscence’) [18].

Two of the participating countries, the USA and Canada, have implemented a marker in their hospital data sets indicating whether each coded diagnosis was present at admission or occurred during hospitalization. However, the American data used in this analysis predated the implementation of this data element. The marker facilitated the analyses of Canadian data, for which extensive rearrangements had to be performed because Canada uses a different definition of principal diagnosis. More generally, such a marker increases the usefulness of administrative data for performance measurement [19]. To improve the validity of safety event identification using coded hospital discharge data, the present-on-admission indicator should be introduced internationally.

Substantial variations in rates for some indicators could not be elucidated, as access to the underlying databases is restricted to the responsible persons in the participating countries. Hospital reimbursement relies on coding of diagnoses and procedures (e.g. DRGs) in all participating countries except Country E. For this country, underreporting is likely, as hospitals lack any clear financial incentive to code diagnoses thoroughly. This factor could also explain the unusually large variation in rates across hospitals in Country E. The fact that patient safety indicator rates heavily rely on the quality and completeness of ICD coding is supported by Fig. 2. Countries with a higher mean number of secondary diagnoses report increased rates of the indicator ‘postoperative deep vein thrombosis and pulmonary embolism’. The surprisingly high number of secondary diagnoses in Country E is related to its specific documentation rules: any medical condition can be noted, regardless of whether it has an impact on medical treatment. Other factors, such as access to health care or ethnic disparities, might also affect rates within and across countries [20]. To evaluate systematic variations across countries, the method of estimating Euclidean distances is feasible and shows similar trends in repeated measurements.

Figure 2

Dependency between mean number of secondary diagnoses and population rate (%) of indicator postoperative pulmonary embolism (PE) or deep vein thrombosis (DVT).

The participating countries are now interested in investigating the possible causes of lower or higher indicator rates. For example, questions have been raised about differences in documentation, differences in average length of stay (leading to a shorter or longer time at risk) and differences in coding practice (resulting from how ICD codes are used to set hospital payments). The focus of this investigation was simply on establishing the feasibility of performance measurement using administrative hospital data in an international setting. Further analyses using stratified or risk-adjusted data are now underway. Secondary data analysis is a very feasible method for quality reporting within countries, but the quality of the collected data is a major concern. For this reason, validation studies based on medical record reviews are currently underway in the USA and elsewhere [21].

The AHRQ Patient Safety Indicators are designed to monitor in-hospital quality of care; however, the declining length of in-hospital stay must be taken into account in the future. Quality management and concerns about patient safety should not stop at the hospital doors. Data systems should be developed to permit tracking post-hospital complications that are closely related to the preceding hospital stay, such as postoperative pulmonary embolism. The general introduction of a unique patient identifier, already introduced in several Northern European countries, will be essential to permit quality management across settings of patient care.


The results demonstrated feasibility of implementation and quantified the amount of variation in patient safety indicator rates. The next challenge, however, is to determine how to interpret the variation seen, because it relates to either (i) true variation in patient safety, (ii) variation in coding and data quality or (iii) a combination of (i) and (ii). The way forward is to conduct validation studies and to harmonize coding rules and data quality across countries. OECD member countries have expressed high interest in continuing the project and 10 additional countries participated in the 2008 calculation round. In this study, more detailed information on patient populations will be captured to gain greater insight into the safety of hospital care internationally. If this research proves to be successful, the OECD anticipates publishing comparable data on patient safety in future publications such as Health at a Glance.


Funding

Organization for Economic Co-operation and Development (OECD).


This investigation was initiated by the OECD as part of the HCQI project. Furthermore, appropriate health agencies in Australia, Canada, Germany, Spain, Sweden, the UK and the USA supported this research by performing the calculations and providing the data. This article reflects the opinion of the authors and does not represent an official position of the OECD, its member countries or institutions participating in the project. Members of the patient safety indicator interest group of the International Methodology Consortium for Coded Health Information (www.imecchi.org) are acknowledged for providing an internationally harmonized version of indicator definitions in ICD-10.

