A systematic review of estimates of the minimal clinically important difference and patient acceptable symptom state of the Western Ontario and McMaster Universities Osteoarthritis Index in patients who underwent total hip and total knee replacement
Address correspondence and reprint requests to: C. MacKay, West Park Healthcare Centre, Research and Evaluation, 82 Buttonwood Avenue, Toronto, ON, M6M 2J5, Canada.
Division of Health Care and Outcomes Research, Krembil Research Institute, University Health Network, Toronto, Ontario, Canada; The National University of Ireland, Galway, Ireland
Division of Health Care and Outcomes Research, Krembil Research Institute, University Health Network, Toronto, Ontario, Canada; Institute of Health Policy, Management and Evaluation, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada; Department of Physical Therapy, University of Toronto, Toronto, Ontario, Canada; Rehabilitation Science Institute, University of Toronto, Toronto, Ontario, Canada
To systematically review the minimal clinically important difference (MCID) and patient acceptable symptom state (PASS) estimates in pain and function measured using the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) in patients who underwent primary total knee replacement (TKR) and primary total hip replacement (THR).
Design
The study was carried out following PRISMA recommendations. We searched five electronic databases. Two reviewers independently screened titles, abstracts and full-text papers using a priori inclusion/exclusion criteria. Data were extracted by two independent reviewers. Data were synthesized, with WOMAC values converted to 0–100 scores (0 = best, 100 = worst).
Results
Thirteen studies were included. Research methods used to calculate MCIDs and PASS varied across studies (e.g., using anchor-based or distribution methods, wording of anchor questions within anchor-based methods). Baseline WOMAC scores also varied across studies. Across studies and methods, MCIDs for the WOMAC in patients undergoing TKR ranged from 13.3 to 36.0 for pain and 1.8–33.0 for function; values for WOMAC in THR ranged from 8.3 to 41.0 for pain and from 9.7 to 34.0 for function. PASS cut-offs for TKR ranged from 25.0 to 28.6 for pain and 32.3–36.7 for function, and cut-offs for THR from 15.0 to 30.6 for pain and 28.0–42.0 for function.
Conclusion
Although the WOMAC is a commonly used measure for a single condition, the variability in methods used to calculate MCID and PASS estimates results in a range of values across studies making it unclear whether values reported in the literature can be applied with confidence. Future research is needed to refine methods used to calculate MCIDs and PASS.
Patient reported outcome measures (PROMs) are measures developed to assess health outcomes from the patient's perspective. They are increasingly used to measure the effectiveness of treatments in clinical research, to inform clinical decision making and patient care, and to inform health policies. However, the interpretability of PROMs can be difficult (e.g., interpreting the meaning of a pain reduction of two points). In order to interpret clinically important changes in outcomes, methods have been developed to determine if a medical intervention improves perceived outcomes in patients. The minimal clinically important difference (MCID) is defined according to the patient's perspective of what change constitutes an improvement. The MCID was first defined by Jaeschke as “the smallest difference in score in the domain of interest which patients perceive as beneficial and which would mandate, in the absence of troublesome side effects and excessive cost, a change in the patient's management”. While a number of definitions of an MCID have been documented in the literature, the common thread is that the MCID represents a lower boundary of change that has been defined to be important. MCIDs are calculated using anchor-based methods, which link the change in the outcome to an external anchor that accounts for the patient's perspective, or distribution methods, which are data-driven approaches that define different statistical parameters to assess clinical significance. While there is no consensus on how to develop MCIDs, the primary approach recommended by the US Food and Drug Administration (FDA) and Outcome Measures in Rheumatology (OMERACT) is an empirical anchor-based approach. Another complementary concept, the patient acceptable symptom state (PASS), has been defined as the highest level of symptoms beyond which patients consider themselves well. In other words, the PASS is the symptom state patients consider acceptable. Despite advancement in the development of MCIDs and PASS for PROMs, there have been methodological challenges in defining clinically important change from the patients' perspective.
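As an illustration of a distribution method, the sketch below computes a threshold as half the standard deviation of a set of scores. The half-SD convention, the choice of which scores to use (baseline or change scores), and the function name `half_sd_threshold` are assumptions made for this example; they are not taken from any particular study in this review.

```python
from statistics import stdev


def half_sd_threshold(scores):
    """Distribution-based threshold: 0.5 x sample standard deviation.

    `scores` is a list of WOMAC values on the 0-100 scale (hypothetical data).
    """
    if len(scores) < 2:
        raise ValueError("at least two scores are needed to compute an SD")
    return 0.5 * stdev(scores)


# Hypothetical scores, for illustration only
example_scores = [40, 50, 60]
print(half_sd_threshold(example_scores))
```

Because the threshold depends only on the spread of the sample, two cohorts with different score variability would yield different values even if patients perceived change identically.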
A number of empirical studies across different countries have been conducted to estimate the MCID and PASS of the WOMAC in patients with osteoarthritis (OA). Yet, the variation in MCIDs and PASS between and within methodological approaches and countries is unclear. Earlier work comparing studies of a range of PROMs in rheumatology (focused on MCIDs) suggested that there is wide variation in values across studies. Our objective was to systematically review the evidence regarding reported MCID and PASS estimates in pain and function measured using the WOMAC in patients who underwent primary total knee replacement (TKR) and primary total hip replacement (THR).
Methods
WOMAC
The WOMAC has demonstrated reliability, validity and responsiveness in patients with OA. Scores are converted to a 0–100 scale with higher scores representing worse pain and functional limitations.
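The conversion to a 0–100 scale can be sketched as a simple linear rescaling. The raw maxima used below (20 for pain and 68 for function, as in the 5-point Likert version of the WOMAC) are assumptions for illustration; other WOMAC versions use different raw ranges, and the function name `to_0_100` is hypothetical.

```python
def to_0_100(raw_score, max_raw):
    """Linearly rescale a raw subscale score to 0-100 (0 = best, 100 = worst)."""
    if max_raw <= 0:
        raise ValueError("max_raw must be positive")
    if not 0 <= raw_score <= max_raw:
        raise ValueError("raw_score must lie between 0 and max_raw")
    return 100.0 * raw_score / max_raw


# Assumed Likert-version maxima: pain 5 items x 4 = 20, function 17 items x 4 = 68
print(to_0_100(10, 20))  # pain subscale
print(to_0_100(34, 68))  # function subscale
```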
Search strategy
We followed guidelines for conducting and reporting systematic reviews including the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Statement. The search strategy was developed and performed by an information specialist in collaboration with research team members (AD, RW). In order to ensure a comprehensive search of articles on MCID/PASS, we incorporated keywords used or recommended by authors who have previously published reviews on MCIDs/PASS. Five electronic databases were searched from inception of the databases: Ovid MEDLINE/Ovid MEDLINE In-Process & Other Non-Indexed Citations (1946), Ovid EMBASE (1974), EBSCO CINAHL (1981), Ovid Cochrane Database of Systematic Reviews (2005), and LILACS (unknown year). Three core sets of search terms were included: MCID/PASS, WOMAC and OA/joint arthroplasty with Boolean operators OR/AND used to link search terms within/between core sets, respectively. For example, key words were: (‘minimally/minimal/minimum clinical(ly) important difference(s)’ OR ‘MCID’ OR ‘minimally/minimal/minimum important difference(s)’ OR ‘MID’ OR ‘clinical(ly) important difference(s)’ OR ‘CID’ OR ‘minimally/minimal/minimum clinical(ly) important improvement(s)’ OR ‘MCII’ OR ‘minimally/minimal/minimum clinical(ly) important change(s)’ OR ‘MCIC’ OR ‘minimally/minimal/minimum perceptible change(s)’ OR ‘meaningful change(s)’ OR ‘smallest worthwhile effect(s)’ OR ‘minimally/minimal/minimum clinically relevant state’ OR ‘low disease state’ OR ‘PASS’ OR ‘patient acceptable symptom state’ etc.) AND (‘Western Ontario and McMaster Universities Osteoarthritis Index’ OR ‘WOMAC’ etc.) AND (‘osteoarthritis’ OR ‘hip arthroplasty’ OR ‘knee arthroplasty’ OR ‘hip replacement’ OR ‘knee replacement’ etc.). The full list of key words utilized in each database is reported in the Appendix. The search in databases was limited to English language and publication type only when pertinent (i.e., excluded book chapters/series, conference proceedings, and letters/notes). Due to variation in the search databases, all search terms and limits were tailored to the specific database.
The initial search was conducted on 2 June 2016 and updated to 13 August 2018 to identify additional literature published after the initial search.
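The structure of the search (OR within each core set, AND between core sets) can be sketched as follows. This is a generic illustration only: the helper `build_query` is hypothetical, the term lists are abbreviated, and the output is not the database-specific syntax actually used, which is reported in the Appendix.

```python
def build_query(core_sets):
    """Join terms with OR inside each core set and AND between core sets."""
    groups = []
    for terms in core_sets:
        if not terms:
            raise ValueError("each core set needs at least one term")
        quoted = " OR ".join(f'"{term}"' for term in terms)
        groups.append(f"({quoted})")
    return " AND ".join(groups)


# Abbreviated, illustrative term lists for the three core sets
core_sets = [
    ["MCID", "patient acceptable symptom state"],
    ["WOMAC"],
    ["osteoarthritis", "hip replacement", "knee replacement"],
]
print(build_query(core_sets))
```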
Study screening and selection criteria
Results of the database searches were imported into EndNote X7. Two reviewers (RW, NC/KS) independently screened titles, followed by abstracts, and then full-text papers using a priori inclusion/exclusion criteria. Full text papers were retrieved if they passed the preliminary screen and if the records did not contain sufficient information to establish eligibility. Papers were eligible for inclusion if the following criteria were met: 1) patients had OA of the hip or knee and were undergoing primary THR or TKR and 2) MCID or PASS estimates were calculated for WOMAC pain and function for THR and TKR patients separately. Papers were excluded if they involved: 1) a patient diagnosis other than OA; 2) a patient population other than adults; 3) unicompartmental, bilateral or revision TKR/THR or a surgery other than TKR/THR; 4) a calculated MCID/PASS for an outcome measure other than WOMAC; 5) a calculated MCID/PASS for WOMAC that could not be isolated to TKR or THR patients; 6) a language other than English; or 7) work that was not original research (reviews/systematic reviews, editorials, commentaries, workshop summaries, protocols, etc.). For papers that did not meet eligibility for final inclusion but cited or discussed an MCID/PASS estimate, the reference lists were searched to identify the primary sources of the MCID/PASS values reported and other relevant citations not generated from the database searches. Any discrepancies throughout the screening process were resolved through discussion, and consensus was achieved by consulting a third reviewer (AD) as necessary.
Data abstraction
A data abstraction form was developed and pilot tested by team members (AD, RW, NC) using a randomly selected set of three eligible papers. Two independent reviewers (NC, CM) extracted information including study and patient characteristics, details about the primary outcome of interest (e.g., WOMAC version and scoring), the MCID and PASS definition adopted, and approach used to compute MCID or PASS values (e.g., distribution- versus anchor-based) including anchor properties, approach criteria, etc. where applicable. When a third reviewer (AD) compared the two extractions, minimal inconsistencies were found and were rectified through discussion.
We completed the Interpretability box of the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist for all studies. We added two columns pertaining to whether each study specified anchor or distribution methods in deriving MCID/PASS values (no/yes) and the rationale for the method/cut points used (no/yes). These items refer to the reporting of information to facilitate interpretation of scores, rather than standards to assess risk of bias of a study on interpretability.
The five database searches yielded 13,840 results (Medline 4661, Embase 7421, Cinahl 1382, CDSR 286, and LILACS 90). After duplications were removed, 9196 citations remained for preliminary screening. Screening of titles/abstracts for relevancy identified 430 potential citations from the database searches. An additional 2 records were identified through reference checking. The 432 full-text articles were screened for eligibility leading to the exclusion of 419 papers (i.e., 83 were not original research; 119 did not investigate MCID/PASS or WOMAC was not a primary outcome; 160 only cited or discussed an MCID/PASS but did not derive estimates; 29 derived MCID/PASS estimates but not for WOMAC; and, 28 calculated MCID/PASS for WOMAC but were not exclusively OA patients undergoing THR/TKR). Four studies were excluded as they did not calculate MCID/PASS for TKR and THR separately and one study was excluded as it was a secondary analysis of data from another included article. A total of 13 unique studies from final screening met the inclusion criteria and were kept for this review. The selection process is illustrated in Fig. 1. Data will be presented separately for TKR and THR. While there was some variability in use of terminology (MCID, minimally important difference, clinically important difference), we will use the term MCID.
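The counts reported in the selection process can be checked with simple arithmetic. The sketch below only re-adds the figures stated in the paragraph above; the variable names are illustrative.

```python
# Records retrieved per database, as reported in the text
database_hits = {"Medline": 4661, "Embase": 7421, "Cinahl": 1382, "CDSR": 286, "LILACS": 90}
total_hits = sum(database_hits.values())

# Full-text articles: 430 from title/abstract screening plus 2 from reference checking
full_text_screened = 430 + 2

# Full-text exclusions by reason, as itemized in the text
exclusions = {
    "not original research": 83,
    "no MCID/PASS or WOMAC not a primary outcome": 119,
    "cited or discussed MCID/PASS only": 160,
    "MCID/PASS not for WOMAC": 29,
    "not exclusively OA patients undergoing THR/TKR": 28,
}
total_excluded = sum(exclusions.values())
included = full_text_screened - total_excluded

print(total_hits, full_text_screened, total_excluded, included)
```

The totals agree with the reported 13,840 records, 432 full-text articles, 419 exclusions and 13 included studies.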
Findings from the interpretability section of the COSMIN checklist are in Table I. Of the 13 studies included, seven reported on 1–3 of seven included COSMIN interpretability criteria.
The mean age of individuals was greater than 65 in all studies. Overall mean pre-operative pain scores ranged from 45.1 (17.3) to 64.3 (19.1) and function ranged from 45.7 (17.4) to 65.3 (17.7) (0–100 scale in which 100 is worst). Studies reported on data from TKR cohorts in six countries: Spain (n = 4), Germany (n = 1), Netherlands (n = 1), Switzerland (n = 1), Canada (n = 1) and the United States (n = 2) (some studies included cohorts from more than one country). Nine studies used anchor-based methods.
Table excerpt (TKR study characteristics): assessments before TKR; T+1 (first follow-up after TKR, any time after day 1 to 1 year after T0) and T+2 (second follow-up after TKR, 1 year following T+1); original research using an anchor method; PASS defined as the value beyond which patients consider themselves satisfied with actual OA symptoms (WOMAC pain <32.4, WOMAC disability <31.0); MID defined as 1/2 SD of the difference between change scores.
Studies (n = 9) using anchor-based methods calculated mean change scores or receiver operating characteristic (ROC) curves or both. There was variability in the anchor used in the studies (question, response options), with three studies using a 5-point scale
The time frame (period to compare current health status to prior health status) was also variable: three studies collected data pre-surgery and at 1 year
Using anchor-based methods, the MCID for WOMAC pain calculated using mean change scores was 13.3 (0.5–26.1) in one study conducted in the Netherlands (n = 73; calculated as mean change in score in the subgroup of patients who reported themselves as “a little better” on the anchor). Using the ROC method, the MCIDs for WOMAC pain ranged from 20.5 to 36.0. The MCID for WOMAC function calculated using the mean change method was 1.8 (−8.3 to 11.9) in the Dutch study. MCIDs are presented by method in Table III (anchor-based) and Table IV (distribution).
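The mean change approach described above can be sketched as the average improvement among patients who choose a given anchor response. The data, the function name `mcid_mean_change`, and the default target response are hypothetical; change is computed as baseline minus follow-up because higher scores are worse on the 0–100 scale.

```python
from statistics import mean


def mcid_mean_change(baseline, follow_up, anchor, target="a little better"):
    """Mean improvement (baseline - follow-up) among patients giving the target anchor response."""
    changes = [
        before - after
        for before, after, response in zip(baseline, follow_up, anchor)
        if response == target
    ]
    if not changes:
        raise ValueError("no patients gave the target anchor response")
    return mean(changes)


# Hypothetical WOMAC pain scores (0-100) and anchor responses for four patients
baseline = [60, 50, 70, 40]
follow_up = [45, 40, 30, 38]
anchor = ["a little better", "a little better", "much better", "no change"]
print(mcid_mean_change(baseline, follow_up, anchor))
```

Changing the target response (for example, to “somewhat better”) changes the subgroup and therefore the estimate, which is one source of the between-study variation discussed in this review.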
Table III. Estimates of minimal clinically important difference (MCID) and patient acceptable symptom state (PASS) for WOMAC Pain and Function in Studies of TKR using Anchor-based Methods
PASS cut-offs were calculated using the 75th centile or ROC analysis approach. Some researchers gave patients a range of response options (4 items) to rate their satisfaction with joint replacement and others used a yes/no response option. Escobar et al.
Pre-operative function scores ranged from 48.7 (1.1) to 65.8 (16.97) (0–100 scale with higher scores indicating more pain and worse function). Studies were conducted in individuals undergoing THR in Canada (n = 1), Spain (n = 4), Netherlands (n = 1), Germany (n = 2), Switzerland (n = 1), and multiple countries in Europe (n = 1). Eight studies used anchor-based methods.
Table excerpt (THR study characteristics): Cohort 1: N = 573, 51.8% female, 48.5% > age 70; Cohort 2: N = 333, 48.1% female, 48.8% > age 70; OA; 2 prospective cohorts; 7 teaching hospitals; preoperative and 6 months; anchor questions: “If you had to live the rest of your life with the hip symptoms you have now, how would you feel?” (responses: “totally satisfied”, “slightly satisfied”, “not satisfied”, and “not at all satisfied”); Cohort 1 only: “Compared with status before you had a hip prosthesis how would you rate the status of your hip right now?” (7 responses from “a great deal better” to “a great deal worse”).
In Spain, values ranged from 24.55 to 29.26 in three studies (with one study reporting cut-offs of 15, 23, 36 for 1, 2 and 3 baseline tertile scores, respectively). Using the ROC method, the MCIDs for WOMAC pain ranged from 22.4 to 41.0. For function, the MCID using the mean change method ranged from 20.8 to 26.54. Using the ROC method, the MCID ranged from 18.4 to 34.0. Using distribution methods, MCIDs were calculated as 10.5 for pain and 9.7 for function in one study.
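An ROC-based cut-off can be sketched as the change score that best separates patients who report improvement on the anchor from those who do not. The optimality criterion used here (maximizing sensitivity + specificity − 1, i.e., Youden's index) is an assumption for illustration; the included studies may have used other criteria. The data and the function name `roc_cutoff` are hypothetical.

```python
def roc_cutoff(changes, improved):
    """Return the change-score cut-off that maximizes sensitivity + specificity - 1.

    changes  : improvement scores (baseline - follow-up, 0-100 scale)
    improved : booleans, True if the patient reported improvement on the anchor
    """
    positives = sum(improved)
    negatives = len(improved) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both improved and non-improved patients are required")

    best_cut, best_j = None, -1.0
    for cut in sorted(set(changes)):
        # A patient is classified as a responder if change >= cut
        true_pos = sum(1 for c, y in zip(changes, improved) if y and c >= cut)
        true_neg = sum(1 for c, y in zip(changes, improved) if not y and c < cut)
        j = true_pos / positives + true_neg / negatives - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut


# Hypothetical change scores and anchor classifications
changes = [5, 8, 12, 20, 25, 30]
improved = [False, False, False, True, True, True]
print(roc_cutoff(changes, improved))
```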
PASS cut-offs were calculated using the 75th centile or ROC method. The number of response options on the question of satisfaction varied (4 or 10 options). The cut-off for pain was reported as 20.0 and 30.0 in one study using different methods (percentile and ROC approach, respectively) and 30.9 and 31.2 for function.
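A percentile-based PASS cut-off can be sketched as below, assuming the cut-off is taken as the 75th percentile of follow-up scores among patients who rate their current state as acceptable. Both that definition and the percentile interpolation rule (`method="inclusive"` in Python's `statistics.quantiles`) are assumptions for illustration; the data and the function name `pass_75th_percentile` are hypothetical.

```python
from statistics import quantiles


def pass_75th_percentile(follow_up_scores, acceptable):
    """75th percentile of follow-up scores among patients reporting an acceptable state."""
    selected = [s for s, ok in zip(follow_up_scores, acceptable) if ok]
    if len(selected) < 2:
        raise ValueError("at least two patients with an acceptable state are required")
    # quantiles(..., n=4) returns the three quartile cut points; index 2 is the 75th percentile
    return quantiles(selected, n=4, method="inclusive")[2]


# Hypothetical follow-up WOMAC pain scores (0-100) and satisfaction responses
follow_up_scores = [10, 20, 30, 40, 50, 80]
acceptable = [True, True, True, True, True, False]
print(pass_75th_percentile(follow_up_scores, acceptable))
```

Different interpolation rules can shift the percentile slightly in small samples, which is one reason to report the exact convention used.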
In this systematic review the estimates of the MCIDs and PASS for pain and function were examined for a commonly used measure, the WOMAC, and in the context of a specific procedure for end-stage OA, i.e., primary THR and TKR. The derived MCID and PASS values from the 13 studies included were variable within and across countries. Across studies and methods, MCIDs for the WOMAC in patients undergoing TKR ranged from 13.3 to 36.0 for pain and 1.8–33.0 for function; values for the WOMAC in THR ranged from 8.3 to 41.0 for pain and from 9.7 to 34.0 for function.
Previous studies have reported significant heterogeneity in the calculation methods used for MCIDs in rheumatology. While we found variation in methodological approaches to determining the MCID (anchor and distribution approaches as expected) and the PASS, we also found variation within a given method (e.g., different wording of anchor questions and choice of cut-points). Additionally, there was variation in patient sample characteristics (i.e., baseline WOMAC scores) used to determine values and, given the known influence of baseline scores on MCID and PASS calculations, this likely also contributed to the variability in the values obtained.
Using the COSMIN interpretability questions, we found that seven of 13 studies reported on only 1–3 COSMIN criteria. While interpretability is not a measurement property, these questions are intended to facilitate interpretation of the scores. Lack of reporting in a number of studies highlights limitations in interpreting findings from these studies and points to the need for better reporting in MCID/PASS studies in the future (e.g., percentage of responders with lowest possible score, highest possible score and score in sub-groupings).
It is generally accepted that a PROM like the WOMAC would not have a single MCID and that values for the MCID would vary depending on the intervention and context (e.g., patient group). Interestingly, even within the context of one intervention with a relatively predictable result in a population with end-stage OA, we found that there is heterogeneity in the approaches used to calculate MCIDs for the WOMAC in patients undergoing total joint replacement (TJR) and in the resulting values for the MCID. In this review, more studies used anchor-based methods than distribution methods, as recommended. However, there was variation within methods, which included: 1) variation in the wording of the anchor question and the response scale (e.g., in TKR, anchors ranged from a 4-point scale to a 15-point scale), 2) the time frame studied, and 3) the approach to calculating the MCID (mean change and ROC). First, MCID was more often calculated for patients reported to be “somewhat better” in the anchor response but cutoffs varied (e.g., patients reporting to be “a little better”). While the cutoff is paramount to defining small but important change, there is a lack of evidence and agreement on optimal cutoff levels. Some have suggested that the validity of anchors could be improved by querying about the importance of change rather than using the magnitude of change item. Moreover, patients may need to be involved in specifying the cut-point to be used, rather than the investigator/clinician as often occurs. Variation also included the time frame for the study (e.g., baseline and 6 months, 12 months, 2 years). Since studies have demonstrated that much of the change in pain and function occurs within the first 6 months following TJR, it is likely the studies reporting on 6 and 12 months are most relevant, as change beyond this time period may be influenced by external contextual factors or events. In other research, 36% of patients who underwent THR reported one or more positive impact life events in the year following surgery, while 63% reported one or more negative life events. The number of positive life events was associated with engagement in life activities following THR. Finally, not all studies conducted an ROC analysis. This is despite recommendations that ROC methods be used for values that are to be applied at the individual level. Overall, there appeared to be variability in values regardless of method used.
We found there was significant variation in the baseline WOMAC pain and function scores across studies for the same intervention. While TJR is a procedure for end-stage OA, this is not surprising as there are no agreed upon guidelines on surgical candidacy for TJR. There is evidence of large heterogeneity in patient status at the time of surgery, and pain and function alone do not determine who may be a surgical candidate. Variation in baseline scores has been found across other studies of MCIDs and researchers have highlighted the influence of baseline scores on the value calculated in studies of MCIDs. Previous work showed that failure to consider baseline scores could result in some patients being misclassified as having not benefitted from treatment (some patients could not achieve important improvement due to baseline scores). Some researchers have studied the use of item response theory-informed methods to correct for the variability in MCID based on baseline score, but the limited evidence to date does not provide information on what methods are superior in developing MCIDs.
Studies were included from a range of countries. Studies conducted in Spain were most common and used similar methods (i.e., anchor-based, mean change). While the variability within studies conducted in Spain was less than across methods and countries, there was heterogeneity in results particularly for THR (TKR: 22.6–29.9; THR: 17.67–33.5). These findings further highlight the challenges inherent when determining and interpreting important change.
Fewer studies were included which calculated a PASS for WOMAC pain and function for patients undergoing TJR. Studies used ROC curves or the 75th centile approach (75th percentile of the scores for improvement in patients who report an important improvement by the anchoring question). PASS cut-offs were similar across the studies of TKR (25.0–28.6 for pain; 32.3–36.7 for function). While there were few findings reported for PASS for THR, values ranged from 20.0 to 30.0 for pain and 30.9–31.2 for function. Vogl et al. reported a summary WOMAC score of 15 and interpreted that their low threshold state compared to other studies conducted in THR was because the cohort had lower baseline and follow-up scores. Similar to MCIDs, researchers have suggested that the methodology for identification of PASS may influence the identified cut-off points. The ROC approach has provided estimates that were somewhat lower than the cut-off points identified with the 75th centile approach. This was not evident from the few studies included in this systematic review.
The study has limitations. Publications in languages other than English were not included. Findings are specific to the study population and are nontransferable across patient groups or interventions. Despite a comprehensive search strategy to locate articles, it is possible studies meeting inclusion/exclusion criteria were missed.
The WOMAC is a commonly used PROM in patients undergoing TJR. Yet, our findings highlight variation in methodological approaches used to determine MCIDs and PASS for the WOMAC, variation in approaches within methods, and variation in patient sample characteristics used to determine values. Due to the heterogeneity in the research methods used across studies, we were unable to identify clear patterns (e.g., participant characteristics or methodological approaches) that may explain the heterogeneity in estimates. Given the variability in the values reported for MCID and PASS across studies, it is unclear whether the values reported in the current literature can be applied to new research with confidence. To be able to use PROMs and identify responders to interventions, more standardization of methodological approaches to estimating MCIDs and PASS (including methods within approaches, i.e., anchor questions) may be required. This standardization will be critical in the era of personalized medicine in which therapies are targeted to sub-groups of patients with unique attributes. At present, careful consideration needs to be given to the applicability of a given MCID to the specific context in which the WOMAC will be used. Future research is needed to refine methods used to calculate MCIDs and PASS, including comparative methods research.
Author contributions
Conception and design: Davis.
Data acquisition: Clements, Wong.
Analysis and interpretation of the data: MacKay, Clements, Wong, Davis.
Drafting of the article: MacKay.
Critical revision of the article for important intellectual content: Clements, Wong, Davis.
Final approval of the article: MacKay, Clements, Wong, Davis.
Dr. Davis takes responsibility for the integrity of the work as a whole.
Competing interests
None of the authors have any conflicts of interests in relation to this work.
Role of the funding source
There was no role of a funding agency in this work.
Appendix A. Supplementary data
The following is the supplementary data to this article:
Years lived with disability (YLDs) for 1160 sequelae of 289 diseases and injuries 1990-2010: a systematic analysis for the Global Burden of Disease Study 2010.
Validation study of WOMAC: a health status instrument for measuring clinically important patient relevant outcomes to antirheumatic drug therapy in patients with osteoarthritis of the hip or knee.
The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: an international Delphi study.
The minimal clinically important difference determined using item response theory models: an attempt to solve the issue of the association with baseline score.