International Journal for Quality in Health Care 16:133-140 (2004)
International Journal for Quality in Health Care vol. 16 no. 2 © International Society for Quality in Health Care and Oxford University Press 2004; all rights reserved
A new instrument to measure appropriateness of services in primary care
1 Department of Family and Community Medicine, University of California, San Francisco, CA,
2 Division of General Internal Medicine and Center for Health Services Research in Primary Care, Department of Medicine, University of California at Davis School of Medicine, CA,
3 Permanente Medical Group, Department of Medicine, Sacramento, CA,
4 UC Davis Medical Group, Davis, CA,
5 Department of Medicine, Stanford University School of Medicine, Palo Alto, CA,
6 Department of Medicine, VA Greater Los Angeles and University of California at Los Angeles School of Medicine, CA, USA
Objective. To develop a new instrument for judging the appropriateness of three key services (new prescription, diagnostic test, and referral) as delivered in primary care outpatient visits.
Design. Candidate items were generated by a seven-member expert panel, using a five-step nominal technique, for each of three service categories in primary care: new prescriptions, diagnostic tests, and referrals. Expert panelists and a convenience sample of 95 community-based primary care physicians ranked items for (i) importance and (ii) feasibility of ascertaining from a typical office chart record. Resulting items were used to construct a measure of appropriateness using principles of structured implicit review. Two physician reviewers used this measure to judge the appropriateness of 421 services from 160 outpatient visits.
Setting. Primary care practices in a staff model health maintenance organization and a large preferred provider network.
Measures. Inter-rater agreement was measured using intraclass correlation coefficient (ICC) and kappa statistic.
Results. For overall appropriateness, the ICC and kappa were 0.52 and 0.44 for new medication, 0.35 and 0.32 for diagnostic test, and 0.40 and 0.41 for referral, respectively. Only 3% of services were judged to be inappropriate by either reviewer. The proportion of services judged to be less than definitely appropriate by one or both reviewers was 56% for new medication, 31% for diagnostic test, and 22% for referral.
Conclusions. This new measure of appropriateness of primary care services has fair inter-rater agreement for new medications and referrals, similar to appropriateness measures of other general services, but poor agreement for diagnostic tests. It may be useful as a tool to assess the appropriateness of common primary care services in studies of health care quality, but is not suitable for evaluating performance of individual physicians.
Keywords: diagnostic tests, prescription, process assessment, referral, utilization review
Address reprint requests to David H. Thom, Department of Family and Community Medicine, University of California, San Francisco, San Francisco General Hospital, 1001 Potrero Avenue, Building 80/83, San Francisco, CA 94110, USA. E-mail: dthom{at}itsa.ucsf.edu
Accepted for publication December 4, 2003.
Judging the appropriateness of a medical service (a prescription, a procedure, or a diagnostic test) occurs explicitly and implicitly at every level of our medical care system, from a payer deciding whether to cover a new medical technology for millions of plan members, to a clinician deciding whether to order a particular test for a patient. Researchers are interested in assessing appropriateness when studying the delivery of medical services. Health care delivery systems and health policy makers want to identify and avoid inappropriate services to reduce unnecessary health care costs [1,2].
Most of the previous work on appropriateness has been done on specific procedures or diagnostic technologies for specific indications, or as delivered to well characterized groups of patients. One example of this approach is that developed at RAND and University of California at Los Angeles (UCLA) for assessing the appropriateness of use of major diagnostic and therapeutic procedures for groups of patients with specific indications [3,4]. This approach uses panels of knowledgeable clinicians who are supplied with a summary of findings from the literature and asked to rate the appropriateness of the procedure for all identified indications. According to the RAND/UCLA definition, a procedure is appropriate when "for an average group of patients presenting to an average US physician . . . the expected health benefit exceeds the expected negative consequences by a sufficiently wide margin that the procedure is worth doing . . . excluding monetary cost" [4]. Health benefits are defined to include relief of patient anxiety. Appropriateness measures, using structured implicit review, have also been applied to groups of patients for the purpose of judging appropriateness of hospital stays and quality of hospital care [5-8]. Implicit review utilizes reviewers' best judgments, rather than applying explicit criteria, and is widely used (e.g. in hospital peer review). Explicit review tends to increase reliability by minimizing the opportunity for disagreement, but is limited to the services for which there are accepted, objective standards. Structured implicit review improves over traditional implicit review by providing a common structure for reviewers that improves reliability while preserving the advantages of implicit review. Structured implicit review has been widely used in studies of quality of care, but has not been adopted for identifying instances of inappropriate care in the market place [2,9,10].
Primary care physicians (defined as general and family practice, general internal medicine, and pediatrics) provide the majority of ambulatory care in the United States [11] and in other countries. It is not feasible to assemble expert panels and review the literature to assess the appropriateness of most medical services provided by clinicians during typical primary care office visits. Primary care physicians provide a wide range of services, most of which have not been studied sufficiently to develop explicit criteria for appropriateness. In contrast to in-patient care, the documentation of ambulatory care visits is usually limited to brief office chart notes. For these reasons, we decided to develop a measure of appropriateness for primary care services that could be applied to several services using the information generally available in the office record. Starting a new prescription medication, ordering a diagnostic test, and referring a patient for specialty consultation were chosen based on the frequency with which these services are provided in primary care and their potential financial impact (in contrast to services such as a physical examination or provision of information). Our goal was to use principles of structured implicit review [5-9] to develop an instrument for judging appropriateness that would have sufficient inter-rater agreement to be useful in research studies that include delivery of services in primary care.
Methods
Generation of candidate items by expert panel
The study used the nominal group technique, designed following a study by Cantrill et al. [12]. This approach applies a structured process for gathering information from a group identified as having expertise on the topic of interest. The five steps that constitute the nominal group technique are: (i) formulation and presentation of the question to the group; (ii) silent generation and recording of ideas by the group members; (iii) round robin statement and explanation of the ideas by members to the group, recorded so as to be visible to the group; (iv) group discussion of each idea; and (v) individual voting to rank or prioritize the ideas.
Members of the expert panel were selected by the co-principal investigators (D.T. and R.K.) for the current study, based on networking within their academic and clinical care communities. Recruiting was conducted to achieve a balance between family practice (n = 2), internal medicine (n = 2), and subspecialist (n = 2) physicians with an interest and experience in areas related to appropriateness, such as quality and utilization assessment. The subspecialist physicians were a cardiologist and an endocrinologist. Physicians practiced in academic, health maintenance organization, and community settings. There was also one non-physician: a PhD nurse researcher with expertise in patient education and health services research, making a total of seven panel members.
Panel members were provided with background materials on appropriateness before the meeting. The meeting began with a discussion of the concept of appropriateness and its application to services provided during primary care office visits. For each service (new prescription, diagnostic test, or referral), panel members individually generated and recorded variables they would use to judge appropriateness. Each member then briefly stated and explained each idea to the group. All members then discussed items further, and similar ideas from different members were consolidated into single items by group consensus.
Ranking and selection of items
Due to time limitations, members of the expert panel were not asked to formally rank the items generated during the meeting. Rather, the notes and audiotapes from the meetings were used to generate an inclusive set of items, each stated in the form of a question, to be considered in judging the appropriateness of each of the three types of service. Panel members were then asked, via mail, to rank each question item according to two attributes: (i) the importance of the question in establishing appropriateness for the type of service and (ii) the feasibility of answering that question via office chart review, using a 9-point response scale for each attribute that ranged from 1 = definitely unimportant (or not feasible) to 9 = definitely important (or feasible), with 5 = uncertain. In addition, a convenience sample of 95 family physicians and general internists from the Collaborative Research Network, a regional practice-based research network [13], also ranked the same items. The rankings by members of the expert panel and the sample of community physicians were used to select candidate items considered both important and feasible for inclusion in the appropriateness measure.
In constructing the chart review instrument, items assessing similar concepts were grouped together and reviewers were asked to judge each of these component dimensions before judging the overall appropriateness of the service. For example, reviewers were asked to judge the correctness of the three items related to a new medication (duration, dosage, and simplicity of use) and then asked about this dimension with the summary question "Everything considered, was this medicine correctly prescribed for its intended use?". While reviewers were instructed to consider their responses to each item, they did not mathematically derive an overall appropriateness rating from the items. This approach was formulated based upon the structured implicit review approach used by RAND, which guides the reviewer through consideration of key concepts in assessing appropriateness [6]. Structured implicit review has been reported to improve inter-rater reliability [9,14].
Pilot testing
The resulting instrument, labeled the Appropriateness of Primary Care Services Scales (APCSS), was composed of three separate forms (one for each type of service). The APCSS was first piloted and then tested on patient visits from the Physician Patient Communication Project [15]. Briefly, the project enrolled 906 English-speaking, adult patients of 45 physicians (16 family physicians, 18 general internists, and 11 cardiologists) from two large managed care systems in Sacramento, California: one a traditional health maintenance organization, the other an academic-based physician network. Only patients with a scheduled visit for a new or worsening problem were enrolled, to increase the likelihood that one or more services would be provided during the visit. The participation rate was 80.4% among patients known to be eligible.
Patient visits to a family physician or general internist were reviewed by two of the authors (D.T. and R.K.) to identify visits in which one or more of the target services (new prescription, test, or referral) were provided. The two physician reviewers (S.K-R. and R.S.) reviewed photocopies of the office notes from each study visit, plus the visit before, and up to three visits after the study visit if available (to provide clarification and context for the index visit, and occasionally additional information). Reviewers judged the appropriateness of each service provided during the index visit using the APCSS. During the pilot phase, a training sample of 20 patient visits was used to familiarize the reviewers with the APCSS, and to identify and clarify ambiguities in the wording of the items or the instructions. The reviewers then rated the appropriateness of services provided during 160 patient visits.
Analysis
Data were analyzed using SPSS and SAS software. For measures using a 9-point response scale, Spearman correlation coefficients were calculated, and inter-rater agreement was assessed using the method developed by Shrout and Fleiss for measuring agreement between raters for continuous rating scales using the intra-class correlation coefficient (ICC) type (2,1), which assumes all services to be rated by the same raters, who are a subset of all possible raters [16]. The SAS macro INTRACC.SAS (SAS Institute, Inc.) was used to calculate the ICC treating raters as random effects. Percent agreement and kappa statistics for inter-rater agreement [17] were calculated for dichotomous variables using SPSS software. Based on common convention, kappa values >0.75 indicate excellent agreement, values between 0.40 and 0.75 indicate fair to good agreement, and values <0.40 indicate poor agreement beyond that expected by chance [17].
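As an illustration of the ICC described above, the following is a minimal sketch of the Shrout and Fleiss type (2,1) coefficient computed from two-way analysis of variance mean squares. It is not the SAS macro used in the study; the function name `icc_2_1` and the example ratings are hypothetical and chosen only for demonstration.

```python
def icc_2_1(ratings):
    """ICC type (2,1): two-way random effects, single rater.

    `ratings` is a list of rows, one per rated service; each row holds
    one score per rater, with raters in the same order in every row.
    """
    n = len(ratings)        # number of services (targets)
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for services, raters, and the residual
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                 # between-services mean square
    jms = ss_cols / (k - 1)                 # between-raters mean square
    ems = ss_err / ((n - 1) * (k - 1))      # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Hypothetical 9-point ratings of five services by two reviewers
example = [[9, 8], [7, 7], [5, 6], [8, 9], [6, 4]]
print(round(icc_2_1(example), 2))
```

Because raters are treated as random effects, systematic differences between raters (the between-raters mean square) lower the coefficient, which is why this form is suited to generalizing beyond the two reviewers used here.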
Results
Instrument development
A total of 56 separate criteria were extracted from the data gathered during the meeting of the expert panel: 18 relating to new prescriptions, 19 to diagnostic tests, and 19 to referrals. Each item was rated by expert panel members from 1 to 9 with respect to its importance for judging appropriateness and its feasibility of being determined from the office record. The one non-physician on the panel declined to rate with respect to feasibility due to her unfamiliarity with current outpatient charting practices. For descriptive purposes, ratings from 7 to 9 were considered important or feasible, ratings from 4 to 6 were considered uncertain, and ratings from 1 to 3 were considered not important or not feasible. Panel member ratings for an individual item were considered heterogeneous if the range included at least one rating of 1, 2, or 3 (not important or not feasible) and at least one rating of 7, 8, or 9 (important or feasible). Of the 56 items, seven (13%) were rated heterogeneously with respect to importance and 18 (32%) were heterogeneous with respect to feasibility. In all but one case, the heterogeneity was due to a single member rating one item as not important or not feasible. As expected, almost all items (53 of 56) were considered important, but only 35 were considered feasible. In particular, items referring to patient preferences, understanding, and agreement with the service were considered important but not feasible to ascertain from office charts.
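The descriptive banding and the heterogeneity rule described above can be expressed compactly. The sketch below is illustrative only; the function names and the sample panel ratings are hypothetical, and only the cut-points (1 to 3, 4 to 6, 7 to 9) come from the text.

```python
def rating_band(rating):
    """Map a 1-9 rating to the descriptive band used for panel ratings."""
    if not 1 <= rating <= 9:
        raise ValueError("rating must be between 1 and 9")
    if rating >= 7:
        return "important or feasible"
    if rating >= 4:
        return "uncertain"
    return "not important or not feasible"


def is_heterogeneous(ratings):
    """True if at least one rating is 1-3 and at least one is 7-9."""
    return any(r <= 3 for r in ratings) and any(r >= 7 for r in ratings)


# Hypothetical ratings of one item by seven panel members
panel = [8, 9, 7, 8, 2, 9, 7]
print([rating_band(r) for r in panel])
print(is_heterogeneous(panel))
```

Under this rule a single low rating among otherwise high ratings is enough to flag an item, which matches the observation that most heterogeneity arose from one member's rating.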
Item ratings by the 95 community-based, primary care physicians were similar to panel ratings with respect to importance. Specifically, 49 of the 53 items rated as important (7, 8, or 9) by the expert panel were also rated as important by the community physicians. Community physicians tended to be less optimistic about the feasibility of finding sufficient information to judge the items from the chart. Of the 35 items ranked as feasible by the expert panel, only 16 were considered feasible by the community physicians. Examples of items considered important, but not feasible to ascertain by chart review, are "Did the patient agree to the prescription?", "Is the patient likely to take the prescription as intended?", "Were non-pharmacologic approaches given sufficient consideration?", "Were alternatives to the test considered?", "Is the test sufficiently sensitive and specific?", "Does the patient understand and agree to the referral?", and "Is the referral needed to meet the standard for care?". Based on ratings by the expert panel and 95 physicians, a total of 34 items (including 18 considered feasible by the expert panel but not by community physicians) were selected for a pilot measure of appropriateness (12 for new medication, 12 for test, and 10 for referral). Items were dropped when they were considered to be not feasible by both expert panel members and community physicians. This version was refined during the pilot phase and several items that were judged to be ambiguous were modified (if possible) or dropped. The final form for judging appropriateness of a new medication used 10 items divided into three component dimensions (clinically indicated, correctly prescribed, and best choice among alternatives), each with its own summary item, plus an overall appropriateness item.
Similarly, the final diagnostic test form used seven items divided into two dimensions (clinically indicated and best choice among alternatives), and the final referral form used eight items divided into three dimensions (clinically indicated, correct specialist and timing, and best choice among alternatives). The final forms are included as Supplementary Data (available at IJQHC online). These forms were used by the two physician reviewers to judge 421 services (99 new medications, 268 diagnostic tests, and 54 referrals) provided to 160 patients. Reviewers estimated that it took them an average of 4-8 minutes to complete a service review.
Table 1 provides a description of the patients receiving services judged by the reviewers. The mean age of the patients was 52.7 ± 15.0 years and 60% were female. Patients were predominantly white and well educated, with over one-third having a bachelor's degree or higher. Most patients had been at the clinic for >1 year. Physicians were evenly divided between family practice and internal medicine.
Table 2 presents the mean percent of services judged to be definitely appropriate (mean rating of 7-9) and the percent of services at the ceiling of the scale (i.e. rated as 9 by both reviewers) for each summary measure, and for the overall appropriateness ratings for each service. Overall, 58% of new prescriptions, 80% of diagnostic tests, and 85% of referrals were considered definitely appropriate. Ceiling effects were low for overall measures of appropriateness of new prescriptions (6%), and moderate for diagnostic tests (17%) and referrals (15%).
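The two summary quantities reported in Table 2 can be sketched as follows. This assumes that "definitely appropriate" means the mean of the two reviewers' ratings is at least 7, which is one reading of the definition above and is not confirmed by the source; the function name and sample data are hypothetical.

```python
def summarize_ratings(pairs):
    """Return (% definitely appropriate, % at ceiling) for paired ratings.

    `pairs` is a list of (reviewer_1, reviewer_2) scores on the 9-point
    scale. Assumption: "definitely appropriate" is a mean rating >= 7;
    "ceiling" is a rating of 9 from both reviewers.
    """
    n = len(pairs)
    definitely = sum(1 for a, b in pairs if (a + b) / 2 >= 7)
    ceiling = sum(1 for a, b in pairs if a == 9 and b == 9)
    return 100 * definitely / n, 100 * ceiling / n


# Hypothetical paired ratings for four services
pct_definite, pct_ceiling = summarize_ratings([(9, 9), (8, 6), (7, 6), (5, 4)])
print(pct_definite, pct_ceiling)
```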
Inter-rater agreement for each summary measure and for overall appropriateness by type of service is shown in Table 3. For new medication, inter-rater agreement was fair to good for each of the three-dimension summary items (ICC = 0.47 to 0.58) and for the overall assessment of appropriateness (ICC = 0.52). For the diagnostic test, inter-rater agreement was lower for each of the two-dimension summary items (ICC = 0.38 and 0.43). The agreement on overall appropriateness was lower (ICC = 0.35). For referral, inter-rater agreement was also lower for each of the three-dimension summary items (ICC = 0.38, 0.36, and 0.15). The agreement on overall appropriateness was fair (ICC = 0.40). Collapsing the response scale from the original 9-point to a 3- or 2-point scale did not improve reliability.
To calculate the kappa score as a measure of agreement, each of the 9-point scales was collapsed into a dichotomous measure. Responses of 7, 8, or 9 (definitely clinically indicated, definitely the best choice, etc.) were grouped together, and responses of 1 to 6 (definitely not or possibly clinically indicated, definitely not or possibly the best choice, etc.) were grouped together. This dichotomy was chosen because of the paucity of items judged to be in the lowest range (1, 2, or 3). As shown in Table 3, percent agreements ranged from 73 to 85%, and kappas from 0.20 to 0.44. For overall appropriateness of services, percent agreements and kappas were, respectively, 73% and 0.44 for new medications, 78% and 0.32 for diagnostic tests, and 85% and 0.41 for referrals.
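The dichotomization and agreement statistics described above can be sketched as follows, using the standard two-rater (Cohen) form of kappa. The study computed these in SPSS; the function names and example ratings here are hypothetical.

```python
def dichotomize(score):
    """Collapse a 9-point rating: 7-9 becomes 1, 1-6 becomes 0."""
    return 1 if score >= 7 else 0


def percent_agreement_and_kappa(rater_a, rater_b):
    """Return (observed agreement, kappa) for two raters' 9-point scores."""
    a = [dichotomize(s) for s in rater_a]
    b = [dichotomize(s) for s in rater_b]
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from each rater's marginal proportions
    pa, pb = sum(a) / n, sum(b) / n
    p_exp = pa * pb + (1 - pa) * (1 - pb)
    if p_exp == 1:
        return p_obs, float("nan")  # kappa undefined when no variation
    return p_obs, (p_obs - p_exp) / (1 - p_exp)


# Hypothetical ratings of eight services by two reviewers
agreement, kappa = percent_agreement_and_kappa(
    [9, 8, 7, 3, 5, 9, 2, 8],
    [8, 9, 4, 2, 6, 7, 7, 9],
)
print(agreement, round(kappa, 2))
```

When nearly all services fall in one category, chance-expected agreement is high and kappa can be low despite high percent agreement, a pattern consistent with the referral results reported here.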
Overall, the two reviewers both judged services as possibly or definitely appropriate 93% of the time for new medications, 98% for diagnostic tests, and 98% for referrals. Both reviewers judged 44% of new medications, 69% of tests, and 78% of referrals as definitely appropriate. An additional 7%, 2%, and 2%, respectively, were judged possibly appropriate (rated 4, 5, or 6) by one reviewer, but definitely not appropriate (rated 1, 2, or 3) by the other reviewer. There were no services judged to be definitely not appropriate by both reviewers. Thus, it was not possible to calculate a kappa score comparing a possibly or definitely appropriate service with a service deemed definitely not appropriate.
Discussion
In the current study we developed and tested a new instrument for judging the appropriateness of three key outpatient primary care services using office records. The items included in the instrument are generally similar to those used in previous approaches to assessing appropriateness. For example, in previous studies of appropriateness of prescription medications by Cantrill [12], Hanlon [18], and Lipton [19], items for assessing prescribing appropriateness addressed clinical indications, effectiveness, cost and risk compared with alternatives, and proper or valid dosing and duration. All of these criteria were also generated by the current study and rated as important and feasible.
In addition, consideration or trial of non-pharmacologic alternatives and likelihood of patient adherence were identified as being important in the current study, although it was not feasible to obtain them from the outpatient chart. Patient preferences and agreement, not included in the two previous studies, were considered important in the current study, although it was not feasible to obtain them from office charts. These results suggest two areas for future work. The emergence of sound electronic medical records may allow better ascertainment of medication timing, and the use of patient surveys may be needed to incorporate patient preferences.
Despite the multiple steps used to generate and test items for use in judging appropriateness, and the refinement of forms and instructions, inter-rater agreement was only fair for medications and referrals, and poor for diagnostic tests [17]. It is possible that further iteration of piloting and discussion of the instrument between reviewers might have increased inter-rater reliability. It also may have been possible to achieve a higher level of inter-rater agreement by using expert reviewers focused on specific procedures (e.g. neurologists judging the appropriateness of magnetic resonance imaging in patients with headaches), but doing so would miss our primary goal of developing an instrument to assess the appropriateness of a broad range of services. We also felt that it was important for primary care physicians to judge the appropriateness of primary care services, as previous studies have found that physicians in specialties employing a procedure judge appropriateness differently than physicians in other specialties [20,21]. Using more reviewers would tend to increase the reliability of the measure of appropriateness. Assuming a mean inter-rater reliability of 0.4, using seven reviewers would yield an estimated reliability of 0.82 [22]. While this level of reliability is not high enough to be used in practice, for example in imposing sanctions, it is high enough to be useful in research studies.
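The reliability estimate for seven reviewers is consistent with the Spearman-Brown formula for the reliability of an average of several raters. The sketch below assumes that this is the formula applied (the text cites [22] without naming it); the function name is hypothetical.

```python
def spearman_brown(single_rater_reliability, n_raters):
    """Estimated reliability of the mean rating of `n_raters` raters."""
    r = single_rater_reliability
    return n_raters * r / (1 + (n_raters - 1) * r)


# Single-rater reliability of 0.4 averaged over seven reviewers
print(round(spearman_brown(0.4, 7), 2))  # 0.82
```

With a single-rater reliability of 0.4, the calculation is 7 × 0.4 / (1 + 6 × 0.4) = 2.8 / 3.4, or about 0.82, matching the figure given above.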
Full assessment of appropriateness of a service should ideally include consideration of patient preferences. Such an assessment would require some sort of interview or patient survey to determine preferences, which was beyond the scope of the current study. We also did not attempt to assess whether appropriate services were not delivered. While under-use of services may be as important as over-use [23], measuring underuse in primary care would require a separate instrument and was not the intent of the current study.
While measures of appropriateness developed for specific procedures such as hip joint replacement generally have good to excellent inter-rater reliability [24], reliability tends to be substantially lower when judging the appropriateness of more general services. For example, a study using structured implicit review to assess the quality of in-patient care on a general medical service found a kappa of 0.5 for overall quality; kappas for items such as clinical readiness for discharge and appropriateness of use of ancillary resources were 0.2 [8]. Levels of inter-rater agreement found in the current study were similar to those reported for identifying inappropriate use of the emergency department, where kappas for three different measures ranged from 0.39 to 0.42 [25]. A study of inter-rater reliability of the appropriateness of diagnostic tests found kappa values ranging from 0.33 to 0.44 [26], similar to our study. A study of agreement between general practitioners and specialists on appropriateness of new referrals revealed a kappa value of 0.61 [27]. However, this level of agreement was achieved by review of the initial referral letter from the general practitioner to the specialist, which likely had more clearly presented, clinically relevant information than was typically provided in the charts reviewed in the current study.
The high percentage of services judged as possibly or definitely appropriate by both reviewers in the current study is not entirely surprising, as the study identified all services within the categories of new medications, tests, and referrals, rather than focusing on those considered controversial or problematic. Studies of surgical procedures and invasive diagnostic tests have found higher proportions of inappropriate services. For example, a study of 1302 carotid endarterectomies found that 32% were done for inappropriate reasons [28]. A review by a large health care system found 19% of hysterectomies to be inappropriate; an average of 11% of the nine most common procedures were judged to be inappropriate [29]. Another study reported 19% of coronary revascularization procedures to be inappropriate [30]. Since inappropriateness is usually defined in terms of risks exceeding benefits, and primary care services generally have only small risks, the proportion of primary care services judged inappropriate would be expected to be smaller. We found that <3% of all services were considered definitely not appropriate by one reviewer, and no services were considered definitely not appropriate by both reviewers, suggesting a limited scope for detecting inappropriate use of prescriptions, diagnostic tests, and referrals in a general primary care population. The APCSS may be more useful as an assessment of the degree of appropriateness of services in primary care. In the current study, 56% of new medications, 31% of tests, and 22% of referrals were judged as less than definitely appropriate by at least one reviewer.
The generalizability of our results to other health care settings, including other countries, is not clear. Because judgment of appropriateness depends on information in the medical record, the level of detail and legibility of the chart notes is likely to be important in measuring performance. The current study relied on handwritten notes, usually half to one page in length. Generally, physicians wrote their notes using the SOAP format (subjective, objective, assessment, and plan). More extensive notes or typewritten notes might improve the measure's performance. Conversely, briefer notes could adversely affect performance.
Conclusions
This new measure of appropriateness for a broad range of primary care services has only fair to good inter-rater agreement, similar to other appropriateness measures of general services. While the measure is not sufficiently reliable to be used in judging the appropriateness of specific services to individual patients, it does provide an additional tool for research in which the appropriateness of services is compared between groups of patients. Because the forms used to assess each key service were similar, it may be that this new instrument will prove to be especially useful in quality of care studies that examine the delivery of multiple services during an office visit. We hope that further work in this area can build on our results to create more reliable measures, which in turn will allow assessment of appropriateness for the vast majority of primary care services, which lack explicit, evidence-based criteria for appropriateness.
We would like to acknowledge the contributions of Mahmoud Benbraka, MD, John Chuck, MD, Kate Lorig, RN, DrPH, Mary Patton, MD, and Steven Rose, MD. This study was supported in part by a grant from the Robert Wood Johnson Foundation.
References
1. Brook RH. Using scientific information to improve quality of health care. Ann NY Acad Sci 1993; 703: 74-84.
2. Buetow SA, Sibbald B, Cantrill JA, Halliwell S. Appropriateness in health care: application to prescribing. Soc Sci Med 1997; 45: 261-271.
3. Brook RH, Chassin MR, Fink A, Solomon DH, Kosecoff J, Park RE. A method for the detailed assessment of the appropriateness of medical technologies. Int J Tech Assess and Health Care 1982; 2: 53-63.
4. Hicks NR. Some observations on attempts to measure appropriateness of care. Br Med J 1994; 309: 730-733.
5. Lefevre F, Feinglass J, Yarnold PR, Martin GJ, Webster J. Use of the RAND structured implicit review instrument for quality of care assessment. Am J Med Sci 1993; 305: 222-228.
6. Rubenstein LV, Kahn KL, Reinsch EJ et al. Changes in quality of care for five diseases measured by implicit review, 1981 to 1986. J Am Med Assoc 1990; 264: 1974-1979.
7. Siu AL, Manning WG, Benjamin B. Patient, provider and hospital characteristics associated with inappropriate hospitalization. Am J Publ Health 1990; 80: 1253-1256.
8. Hayward RA, McMahon LF, Bernard AM. Evaluating the care of general medicine inpatients: how good is implicit review? Ann Intern Med 1993; 118: 550-556.
9. Burney RE, Gies ME, Williams D, Connolly KW, McKinney DO. A trial of Structured Implicit Review of randomly selected peer reviews of organization cases. Qual Health Care 1993; 1: 214-218.
10. Shekelle PG. Are appropriateness criteria ready for use in clinical practice? New Engl J Med 2001; 344: 677-678.
11. American Academy of Family Physicians. Family Practice Facts 2001. Table 26. American Academy of Family Physicians, KS: http://www.aafp.org/x794.xml Accessed 1 October 2002.
12. Cantrill JA, Sibbald B, Buetow S. Indicators of the appropriateness of long term prescribing in general practice in the United Kingdom: consensus development, face and content validity, feasibility, and reliability. Qual Health Care 1998; 7: 130-135.
13. Croughan-Minihane MS, Thom DH, Pettiti DB. Practice-based primary care physicians interested in office-based research: physician characteristics, practice characteristics, and areas of interest for physicians in two collaborative research networks. West J Med 1999; 170: 19-24.
14. Rubin HR, Rogers WH, Kahn KI. Watching the doctor watchers: how well do peer review organizations' methods detect hospital quality problems. J Am Med Assoc 1992; 267: 2349-2354.
15. Kravitz RL, Bell RA, Thom DH, Krupat E, Azari R. Antecedents and consequences of request fulfillment in office practice: results from the physician-patient communication project. Med Care 2002; 40: 38-51.
16. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull 1979; 86: 420-428.
17. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977; 33: 159-174.
18. Hanlon JT, Schmader KE, Samsa GP et al. A method for assessing drug therapy appropriateness. J Clin Epidemiol 1992; 45: 1045-1051.
19. Lipton HL, Bird JA, Bero LA, McPhee SS. Assessing the appropriateness of physician prescription for geriatric outpatients. Development and testing of an instrument. J Pharm Tech 1993; 9: 107-113.
20. Leape LL, Park RE, Kahan JP, Brook RH. Group judgments of appropriateness: the effect of panel composition. Qual Assur Health Care 1992; 4: 151-159.
21. Kahan JP, Park RE, Leape LL et al. Variations by specialty in physician ratings of the appropriateness and necessity of indications for procedures. Med Care 1996; 34: 512-523.
22. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika 1951; 16: 297-324.
23. Laouri M, Kravitz RL, French WJ et al. Underuse of coronary revascularization procedures: application of a clinical method. J Am Coll Cardiol 1997; 29: 891-897.
24. Quintana JM, Arostegui I, Azkarate J et al. Evaluation of explicit criteria for total hip joint replacement. J Clin Epidemiol 2000; 53: 1200-1208.
25. O'Brien GM, Shapiro MJ, Woolard RW, O'Sullivan PS, Stein MD. Inappropriate emergency department use: a comparison of three methodologies for identification. Acad Emerg Med 1996; 3: 252-257.
26. Bindels R, Winkens RAG, van Wersch JW, Pop P, Hasman A. Reliability of the assessment of appropriateness of diagnostic test request behavior. Medinfo 2001; 10: 1112-1115.
27. Jenkins RM. Quality of general practitioner referrals to outpatient departments: assessment by specialists and a general practitioner. Br J Gen Pract 1993; 43: 111-113.
28. Winslow CM, Solomon DH, Chassin MR, Kosecoff J, Merrick NJ, Brook RH. The appropriateness of carotid endarterectomy. New Engl J Med 1988; 318: 721-727.
29. Dubois RW. Appropriateness studies [letter]. New Engl J Med 1994; 330: 433.
30. Bernstein SJ, Brorsson B, Aberg T, Emanuelsson H, Brook RH, Werko L. Appropriateness of referral of coronary angiography patients in Sweden. SECOR/SBU Project Group. Heart 1999; 81: 470-477.