Assessing the need to update prevention guidelines: a comparison of two methods
1 Cecil G. Sheps Center for Health Services Research, University of North Carolina at Chapel Hill, 2 Department of Epidemiology, School of Public Health, University of North Carolina at Chapel Hill, 3 RTI International, Research Triangle Park, NC and School of Public Health and Program on Health Outcomes, University of North Carolina at Chapel Hill, 4 Department of Family Medicine, School of Medicine, University of North Carolina at Chapel Hill, 5 Department of Internal Medicine, School of Medicine, University of North Carolina at Chapel Hill, NC, USA
Background. An important concern for developers of clinical practice guidelines is how best to determine when guidelines require updating to ensure they remain current and evidence based. Because of the high costs associated with updating guidelines, recent attention has focused on approaches that can reliably assess the extent of updating required. Recently, Shekelle and colleagues proposed a model of limited literature searches with modest expert involvement as a way to reduce the cost and time requirements for assessing whether a guideline needs updating.
Methods. The main objective of this study was to compare the Shekelle et al. assessment model (review approach) and a conventional process using typical systematic review methods (traditional approach) in terms of comprehensiveness and effort. We modeled the review approach on that by Shekelle and colleagues but refined it iteratively over three phases to achieve greater efficiency. Using both methods independently, we assessed the need to update six topics from the 1996 Guide to Clinical Preventive Services from the US Preventive Services Task Force. Main outcomes included completeness of study identification, importance of missed studies and the effort involved.
Results. Although the review approach identified fewer eligible studies than the traditional approach, none of the studies missed was rated as important by task force members acting as liaisons to the project with respect to whether the topic required an update. On average, the review approach produced substantially fewer citations to review than the traditional approach. The effort involved and potential time saving depended largely on the scope of the topic.
Conclusions. The revised review approach provides an efficient and acceptable method for judging whether a guideline requires updating.
Keywords: clinical practice guidelines, evidence-based practice, methods, prevention, updating
Address reprint requests to Gerald Gartlehner, Cecil G. Sheps Center for Health Services Research, 725 Airport Road, Chapel Hill, NC 27599-7590, USA. E-mail: gartlehner{at}schsr.unc.edu
Accepted for publication April 16, 2004.
Clinical practice guidelines are useful tools to condense an extraordinary volume of medical information into a manageable format for daily clinical use [1,2]. Such clinical applications can lead to higher standards and improved quality of care, insofar as guidelines reflect current best practices [3]. To define and measure quality requires setting standards based on the strongest evidence available or, when little or no sound evidence exists, on expert input. Thus, producing and using robust guidelines and related products make timeliness and updating two fundamental issues in the quest for higher quality care. This need in turn dictates frequent evaluation of the content and validity of guidelines, so that obsolete or misleading guidelines do not inadvertently lead to possible breakdowns in processes of care or unexpectedly poor outcomes for patients.
The current guideline development process, which typically uses systematic evidence reviews as its foundation [4,5], is very different from earlier approaches that relied largely on opinions of experts in the field [6]. This opinion-based approach has been criticized for its insufficient objectivity and rigor [7]. The increased thoroughness afforded by basing guidelines on scientific evidence rather than expert opinion produces better guidelines but at higher costs [8]. The need to update previously developed guidelines may also incur substantial effort and cost [9] depending on research advances and time between reviews; some guidelines will require thorough updating. Finding more efficient approaches to determine the extent of updating required at given intervals has thus taken on considerable urgency in national and international health care.
The importance of international cooperation to improve the quality of health care can only increase. In Berlin in 2002, for example, 14 nations founded the Guidelines International Network (G-I-N) to facilitate the exchange of background information in guideline development to optimize resource allocation (www.guidelines-international.net); as of March 2003, 36 organizations (34 national institutes and two supranational groups) had become founding members [Professor Günter Ollenschläger, German Agency for Quality in Medicine, Köln, Germany, K. N. Lohr, personal communication, 23 September 2003]. Moreover, the AGREE (Appraisal of Guidelines Research and Evaluation) Collaboration (www.agreecollaboration.org) has recently been established to develop compatible approaches to guideline creation, create mechanisms for appraising and monitoring guidelines, define quality criteria relevant to guidelines and diffuse such criteria through international exchanges and collaborative links. Efficient methods to assess periodically the validity and timeliness of guidelines will be crucial for these new institutions to guarantee health care that is of consistently high quality.
In evaluating the validity of evidence-based guidelines over time, Shekelle et al. [10] found that, after 3.6 years, 90% of guidelines from the Agency for Healthcare Research and Quality (AHRQ) were still valid, dropping to 50% at 5.8 years. More frequent updates may well help ensure validity, but the practicality and costs of frequent updates can be questioned. The Shekelle team proposed a model of limited literature search and expert review to reduce the cost and time requirements for assessing updating needs [11]. It assumes that the more important, methodologically sound, clinically relevant and controversial primary research articles will be accompanied by commentaries or editorials and will be cited often in the literature. Identifying these publications, with the assistance of topic experts who know of recent and pertinent evidence, is assumed to inform whether a guideline needs immediate major updating or only minor editing. This review method may be more efficient for determining the need for an update than the typical systematic review methodology.
Between 1997 and early 2003, AHRQ funded the RTI International–University of North Carolina Evidence-based Practice Center (RTI-UNC EPC) to provide evidence reviews to support the US Preventive Services Task Force (USPSTF) guideline development process. Because of the extensive resources required to update these (and other) guidelines periodically, the RTI-UNC EPC in collaboration with AHRQ and USPSTF members designed this study to compare two processes for identifying new and important literature that indicates that a guideline is outdated.
Our study had four objectives. The first was to assess the yield from and effort required to use the model proposed by Shekelle and his colleagues, which we refer to as the review approach. The second was to compare the review approach and a more conventional systematic literature review, which we denote the traditional approach, on these same factors. Third, we compared the articles identified by both approaches to evaluate their relevance with regard to the critical key questions, specifically assessing whether the need to update based on the review approach matched that from the traditional approach, a form of validity testing. Finally, we used an experiential technique to refine the review approach and develop an efficient assessment model. The entire study is diagrammed in Figure 1.
The focus of this work was on six clinical topics from the second edition of the Guide to Clinical Preventive Services [12] that had not, as of 2002, been updated. All chapters related to screening: glaucoma, bladder cancer, asymptomatic bacteriuria, hemoglobinopathies, genital herpes simplex and syphilis.
Methods
Because this was a methodological project, we first discuss the design of the two literature search and review processes. We then describe how we evaluated the review and traditional approaches using a three-phase study design that focused on two clinical topics in each phase.
Design of literature search and review processes
The traditional approach used basic search strategies that the RTI-UNC EPC followed in performing evidence-based reviews for the USPSTF. It relied on searching for studies that answer critical key questions, meet eligibility criteria and are methodologically sound. Critical key questions are those that might motivate the task force to change its mid-1990s conclusions or recommendations, should new and important evidence be uncovered. The intent was to be comprehensive by identifying the full spectrum of potentially relevant articles that addressed the critical key questions and met eligibility criteria.
In Phase 1 (glaucoma, bladder cancer) for the traditional approach, we searched MEDLINE for primary research concerning critical key questions on screening tests, treatments and outcomes. The clinicians on each team, aided by the librarians' use of Medical Subject Headings (MeSH), selected the disease-specific search terms. Because these conditions were very specific and the librarians always use MEDLINE's explode feature, we are assured that both teams used virtually identical disease-specific terms.
The review approach in Phase 1 was based on a search strategy that we slightly modified from the Shekelle et al. model (Table 1). The disease-specific terms for the search strategy were the same as those described above for the traditional approach. The review approach limited the number of journals in the literature search to five key medical journals and three specialty journals. We selected the specialty journals by identifying the five journals referenced most often in the bibliography of the relevant chapter from the 1996 USPSTF Guide and from these five, selecting the top three journals based on the relevance of their citations to the critical key questions. In addition, we searched the National Guideline Clearinghouse (NGC), selected web sites from federal agencies and the Internet in general. We reviewed the bibliographies of relevant review papers and commentaries to ascertain primary research relevant to the critical key questions.
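The two-step journal selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the numeric `relevance_scores` are hypothetical, and in the study the relevance of citations to the critical key questions was judged by the review teams, not computed.

```python
from collections import Counter


def top_cited_journals(bibliography_journals, n=5):
    """Return the n journals cited most often in a chapter bibliography.

    bibliography_journals: one journal name per cited reference.
    """
    counts = Counter(bibliography_journals)
    return [journal for journal, _ in counts.most_common(n)]


def select_specialty_journals(candidates, relevance_scores, k=3):
    """Keep the k candidates with the highest relevance score.

    relevance_scores is a hypothetical stand-in for the reviewers'
    judgment of how relevant each journal's citations were to the
    critical key questions. Journals without a score count as 0.
    """
    ranked = sorted(candidates,
                    key=lambda journal: relevance_scores.get(journal, 0),
                    reverse=True)
    return ranked[:k]


# Made-up example: journal names and scores are placeholders.
cited = ["J1"] * 4 + ["J2"] * 3 + ["J3"] * 2 + ["J4"]
shortlist = top_cited_journals(cited, n=3)
chosen = select_specialty_journals(shortlist, {"J1": 1, "J2": 5, "J3": 3}, k=2)
```

Separating the frequency count from the relevance ranking mirrors the text: the first step is mechanical, whereas the second depends on human judgment and is the part most likely to vary between teams.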
In Phase 2 (asymptomatic bacteriuria, hemoglobinopathies), we modified the review search strategy to increase efficiency (middle column of Table 1); we made the same changes to the traditional approach. We changed the MEDLINE search from the five general and three specialty journals to the Abridged Index Medicus (AIM) journals, which cover the five general journals searched in Phase 1 but also include major specialty journals. As in Phase 1, we searched specialty journals as needed if they were not AIM journals. We made no changes to the search strategy of the review approach for Phase 3 (genital herpes, syphilis).
Evaluation of literature search and review processes
We formed two teams of three persons each: two clinicians and one health services researcher. Each team evaluated the need to update each topic, alternating search strategies between topics (see Figure 1).
To guarantee that both teams had the same understanding of the clinical topic at baseline, each team independently developed an analytic framework and key questions, denoting which questions were priorities for changing recommendations (critical key questions) according to USPSTF methods [13]. The teams compared their analytic frameworks and critical key questions, qualitatively determining the level of agreement (complete, partial, no agreement), and then developed a common analytic framework, set of critical key questions and eligibility criteria. USPSTF liaisons reviewed the consensus results and provided feedback. With the exception of a brief meeting to determine which experts to approach for each topic, the two teams had no communication after the consensus meeting.
The teams identified up to six national or international experts for each topic based on the citations from the literature searches. We sent all experts the 1996 guideline (i.e. chapter from the 1996 Guide [12]), an explanation of the project and our list of critical key questions. We asked them to assess the validity of the 1996 guideline and determine whether updating was necessary. We asked whether any studies published after 1994 might address the critical key questions and inquired about ongoing research on the topic. If our inquiry went unanswered, we contacted the experts again by e-mail or telephone. Both teams incorporated the experts' guidance in their searches.
After each phase, the teams compared findings and discussed use of the review approach. For the first and second phases, we evaluated the efficiency of the review approach qualitatively and adjusted the search strategy accordingly (Table 1). We revised the traditional search strategy in the same way as the review approach (for Phase 2, and carried on to Phase 3). For Phase 3, we revised the review approach to include only those review articles and citations drawn from review articles whose titles indicated relevance to our critical key questions. We also used the bibliographic database to determine whether the potentially eligible citations from the review articles had been excluded during the abstract review phase of the project. This cut down on the number of articles that required retrieval and review.
Two reviewers from each team independently assessed the relevance of abstracts, full-text articles and original studies identified during the reviewing process (Figure 1). Each team included in its final update memo only those studies that the team judged to be eligible and to address a critical key question. We asked the task force liaisons to review each team's update memo, which included the studies that both teams retained and any discrepancies, i.e. studies identified by only one team.
We applied two outcome measures to assess the review approach: completeness of article identification and importance of eligible studies that the review approach may have missed. We considered the traditional approach to be the gold standard for evaluating the validity of the review approach. The two task force liaisons overseeing the topic update provided input using a questionnaire that addressed the relevance and importance of the missed studies as they related to the currentness of the 1996 USPSTF recommendations.
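As a rough illustration of the first outcome measure, completeness can be expressed as the share of the traditional (gold standard) yield that the review approach also found, together with the list of missed studies passed on to the liaisons. The sketch below assumes each eligible study carries a unique identifier; the function name and the proportion itself are illustrative and are not a formula taken from the study.

```python
def completeness(review_ids, traditional_ids):
    """Compare the review approach against the traditional approach.

    Returns (share, missed), where share is the fraction of studies
    from the traditional approach that the review approach also found
    and missed is the sorted list of studies found only by the
    traditional approach.
    """
    review = set(review_ids)
    traditional = set(traditional_ids)
    missed = sorted(traditional - review)
    if not traditional:
        # Nothing to find, so nothing could have been missed.
        return 1.0, missed
    share = len(traditional & review) / len(traditional)
    return share, missed


# Made-up identifiers for illustration only.
share, missed = completeness(["a", "b"], ["a", "b", "c", "d"])
```

The second outcome measure, the importance of the missed studies, is a judgment by the task force liaisons and cannot be reduced to a calculation like this one.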
Results
Validity
Overall, the review approach detected fewer eligible studies than the traditional approach (Table 2); for bladder cancer and syphilis, the two methods identified equal numbers of studies. Because our goal was to compare the yield of the newer method with the standard traditional approach, the task force liaisons assessed only the eligible studies that the review approach missed. They found that no studies missed by the review approach would have influenced their decision to update the current USPSTF recommendation and that all studies critical for the update decision were identified by the review approach. Thus, we conclude that the review approach provides a valid, robust alternative to the traditional approach.
Effort
Because personnel time is so costly, we had planned to use time as a primary outcome for comparing the two approaches. However, our teams had diverse clinical and experience levels and these factors influenced the length of the review process. As a result, we used the number of citations and full papers reviewed to assess the effort expended to complete each update. By using a fixed time period for reviewing a specific number of abstracts or articles, one can determine the labor hours for each approach.
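The conversion from review counts to labor hours mentioned above is simple arithmetic. The sketch below assumes a fixed number of minutes per abstract and per full-text article; the function name and every number in the example are hypothetical and are not figures from this study.

```python
def estimate_labor_hours(n_abstracts, n_full_texts,
                         minutes_per_abstract, minutes_per_full_text):
    """Estimate reviewer hours from counts and fixed per-item times."""
    total_minutes = (n_abstracts * minutes_per_abstract
                     + n_full_texts * minutes_per_full_text)
    return total_minutes / 60.0


# Hypothetical inputs: 120 abstracts at 1 minute each and
# 10 full-text articles at 30 minutes each.
hours = estimate_labor_hours(120, 10, 1, 30)
```

Because the per-item times are fixed, the estimate deliberately ignores differences in reviewer experience, which is the reason counts rather than measured time were used as the effort measure.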
Overall, the search strategy for the review approach resulted in fewer citations needing review than the traditional approach (Table 3). The difference in the number of citations was greater for topics with more ongoing research and critical key questions.
At the beginning of the project, we retrieved more full-text articles for the review approach (chiefly review articles) than for the traditional approach. Primary research articles take more time to read than review articles, as they require reviewers to read the article carefully to critique its methodology. Even though we reviewed the same number of full-text articles for both approaches, the review approach took less time and was further streamlined for Phases 2 and 3 (see Table 3 and Discussion) as the teams became more experienced with the methodology.
As screening for syphilis was the last update topic, it should have been the most informative for comparing the two approaches. However, the search numbers were higher for the review approach than for the traditional approach. The difference was that the team conducting the traditional approach selected three specialty journals that were AIM journals whereas the review team selected three specialty journals that were not AIM journals, which led to more literature for the review approach.
The time and effort not represented in these figures are those of the librarians and personnel retrieving articles from either the abstract or citation review process. The changes in the review approach search strategy between Phases 1 and 2 materially reduced working hours for the librarians (from 34.0 hours to 8.8 hours) but not for the other team members. A substantial remaining inefficiency was that many citations found relevant when reviewing the bibliographies of primary research and review articles were, upon retrieval, noted to have been rejected in the abstract review process. Thus, these articles were not only reviewed twice but were also retrieved unnecessarily. To reduce this needless expenditure of time, we used the bibliographic database (ProCite) that contained the team members' judgments of abstract eligibility for the original MEDLINE literature search to screen the references that seemed eligible after retrieving and reviewing the bibliographies of editorials, commentaries and reviews. By cross-checking the potentially relevant references against the bibliographic database, we were able to reduce our retrievals and immediately omit previously rejected citations from further review.
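The cross-check against earlier abstract decisions can be pictured as a simple lookup. The sketch below is not ProCite's interface; it assumes the earlier decisions have been exported to a plain mapping from citation title to "included" or "excluded", and it matches citations on a normalized title, which is a simplification of real reference matching. All names are hypothetical.

```python
def normalize(title):
    """Lower-case a title and collapse whitespace for matching."""
    return " ".join(title.lower().split())


def screen_references(candidate_titles, prior_decisions):
    """Split bibliography references by earlier abstract decisions.

    prior_decisions maps a normalized title to "included" or
    "excluded", as recorded during the original abstract review.
    Returns (to_retrieve, skipped): references rejected earlier are
    skipped, and everything else still goes forward to retrieval.
    """
    to_retrieve, skipped = [], []
    for title in candidate_titles:
        if prior_decisions.get(normalize(title)) == "excluded":
            skipped.append(title)
        else:
            to_retrieve.append(title)
    return to_retrieve, skipped


# Made-up titles for illustration only.
decisions = {"screening trial a": "excluded", "cohort study b": "included"}
keep, skip = screen_references(
    ["Screening  Trial A", "Cohort Study B", "New Study C"], decisions)
```

References that were never seen in the original search, such as the third title in the example, remain in the retrieval list, so the check only removes duplicate work and does not narrow the search.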
Expert involvement and refinement of the review approach
Including topic experts has been an integral part of the current USPSTF guideline development process, but expert involvement proved not to be very useful for this updating project. The majority of experts did not reply; those who did often failed to evaluate the appropriateness of our critical key questions and the availability of new literature to address these questions (Table 4). Experts frequently saw the need to update disease classifications or prevalence estimates but provided little guidance on issues related to screening, diagnosis or treatment in the primary care setting. Consequently, the experts' opinions on the need for minor or major revisions of guidelines often did not coincide with the assessments of task force liaisons. No expert identified studies relevant to a critical key question that we had not found; only once did an expert identify relevant ongoing research. One expert pointed out new ongoing research that was not directly related to our critical key questions but will be useful for future updates of the topic. The most positive and reassuring point is that we did not miss any critical studies.
An important goal of this study was to refine the review approach so that future assessments of whether guidelines remain current can be done more expeditiously. We based our changes to the search strategies and review methods on experience gained through our iterative process. Our final version of the review approach (Figure 2) incorporates all revisions to this approach implemented in Phase 2 and tested in Phase 3.
Discussion
This study demonstrates that a limited literature search [10] using review articles, commentaries and editorials can be a valid and feasible alternative in assessing whether clinical practice guidelines are adequately up to date. To improve efficiency, we revised the original approach of Shekelle et al. according to our growing experience with this new approach.
The traditional approach provides a more comprehensive view of the literature because the citations are to primary research articles; the review approach captures the 'treetops', addressing only the citations for editorials, commentaries and review articles. Although the review approach often identified fewer citations than the traditional approach, the review approach for Phase 1 was not less work. Because of its limited focus, the review approach seemed tedious, not only because the bibliographies are in small type, but because the review articles required the team to err on the side of retrieving an article to determine its eligibility. Thus, the initial version of the review approach did not meet our expectations of being a time-saving tool.
We overcame most of these issues as we gained experience with the review approach. Omitting the general web search saved the librarians substantial time. Searching AIM journals further streamlined the search method and may have raised the search's sensitivity. Despite refining and streamlining the search methods, we found that the level of experience in conducting systematic reviews played a critical role and accounted for most of the variations in labor hours between the teams.
Expert involvement was not very beneficial for assessing how current a clinical guideline is. Our identification process, which focused on researchers rather than those involved in developing guidelines, may account for this finding. Shekelle and colleagues [10] obtained more information from their experts, who were nominated by panel chairs from previous guideline committees. From our research, we cannot tell whether experts can provide an efficient means of assessing the need for guideline updating; in our study, they did not improve the efficiency of either approach.
Our study has several limitations. Because it dealt only with prevention topics, it is not representative of all guidelines, especially treatment guidelines, for which the research base grows very quickly. Further, we did not randomly select chapters from the 1996 Guide, and our study only addressed six chapters. We do not know how effective the review approach will be for topics involving extensive ongoing research such as acquired immune deficiency syndrome.
Because we used the review approach as an integral part of the benchmark (i.e. the traditional approach), we had limited ability to compare the number of abstracts and full-text articles between the two approaches. Therefore, the requirement that the traditional approach be as comprehensive as possible by including the review approach methodology may have yielded additional citations that further increased the difference between the two methods. Moreover, our searches were limited to MEDLINE; other databases might need to be included for international references. In addition, the experts assessing the importance of the missed studies were not blinded to the search approach. These issues could theoretically introduce bias.
It is particularly timely to address the methods for assessing whether guidelines are current because many of the NGC guidelines will expire soon. By setting an arbitrary time limit, the NGC tries to promote timeliness and meaningful updating. This approach sets a minimum standard that relies on time frames. Ideally, in an evidence-based world, emerging evidence would be a primary means of updating and time frames only secondary. In areas of rapidly changing scientific evidence (e.g. HIV), complete updates will always be necessary. However, in fields having less research, a time- and cost-effective method like the review approach can be a valuable tool to assess the need to update, using evidence itself as a trigger for decision-making.
This article is based on research conducted by the RTI International and University of North Carolina Evidence-based Practice Center under contract to the Agency for Healthcare Research and Quality (contract no. 290-97-0011), Rockville, MD. The authors of this article are responsible for its contents. No statement in this article should be construed as an official position of the Agency for Healthcare Research and Quality or of the US Department of Health and Human Services. Financial support was provided to RTI International (contract 290-97-0011) from the Agency for Healthcare Research and Quality, Rockville, MD, USA.
References
- Field MJ, Lohr KN, eds. Clinical Practice Guidelines: Directions for a New Program. Washington, DC: National Academy Press, 1990.
- Audet AM, Greenfield S, Field M. Medical practice guidelines: current activities and future directions. Ann Intern Med 1990; 113: 709–714.
- Lohr KN. Rating the strength of scientific evidence: relevance for quality improvement programs. Int J Qual Health Care 2004; 16: 9–18.
- Cook DJ, Mulrow CD, Haynes RB. Systematic reviews: synthesis of best evidence for clinical decisions. Ann Intern Med 1997; 126: 376–380.
- Shekelle PG, Woolf SH, Eccles M, Grimshaw J. Clinical guidelines: developing guidelines. Br Med J 1999; 318: 593–596.
- Field MJ, Lohr KN, eds. Guidelines for Clinical Practice: From Development to Use. Washington, DC: National Academy Press, 1992.
- Mulrow CD, Cook D. Systematic Reviews: Synthesis of Best Evidence for Health Care Decisions. Philadelphia, PA: American College of Physicians, 1998.
- Woolf SH, Grol R, Hutchinson A, Eccles M, Grimshaw J. Clinical guidelines: potential benefits, limitations, and harms of clinical guidelines. Br Med J 1999; 318: 527–530.
- Browman GP. Development and aftercare of clinical guidelines: the balance between rigor and pragmatism. J Am Med Assoc 2001; 286: 1509–1511.
- Shekelle PG, Ortiz E, Rhodes S et al. Validity of the Agency for Healthcare Research and Quality clinical practice guidelines: how quickly do guidelines become outdated? J Am Med Assoc 2001; 286: 1461–1467.
- Shekelle P, Eccles MP, Grimshaw JM, Woolf SH. When should clinical guidelines be updated? Br Med J 2001; 323: 155–157.
- US Preventive Services Task Force. Guide to Clinical Preventive Services. 2nd edn. Alexandria, VA: International Medical Publishing, 1996.
- Harris RP, Helfand M, Woolf SH et al. Current methods of the US Preventive Services Task Force: a review of the process. Am J Prev Med 2001; 20: 21–35.