Address correspondence and reprint requests to: N. Østerås, National Advisory Unit on Rehabilitation in Rheumatology, Department of Rheumatology, Diakonhjemmet Hospital, PO Box 23 Vinderen, N-0319 Oslo, Norway. Tel: 47-92086465.
Objective
To assess the validity, reliability, responsiveness and interpretability of the revised OsteoArthritis Quality Indicator (OA-QI) questionnaire version 2 (v2), which assesses patient-reported quality of osteoarthritis care.
Methods
The OA-QI v2 (16 items, score range 0–100 (100 = best score)) was included in a longitudinal cohort study. Attendees of a 4.5 h osteoarthritis patient education programme at Diakonhjemmet Hospital, Norway, completed the OA-QI at four time points: 2 weeks before, immediately before, immediately after, and 3 months after the programme. Test-retest reliability and measurement error over a 2-week time period were assessed in those that had not seen health professionals in the interim. Construct validity and responsiveness were assessed with predefined hypotheses. Floor and ceiling effects, smallest detectable change (SDC95%) and minimal important change (MIC) were assessed to evaluate interpretability.
Results
The intraclass correlation coefficient for all 16 items was 0.89. For single items the test-retest kappa estimates ranged 0.38–0.85 and percent agreement 69–92%. Construct validity was acceptable with all six predefined hypotheses confirmed. Responsiveness was acceptable with 33 of 48 and three of four predefined hypotheses confirmed for single items and all items, respectively. There were no floor or ceiling effects. The SDC95% was 29.1 and 3.0 at the individual and group levels, respectively. MIC was 20.4.
Conclusions
The OA-QI v2 had higher reliability estimates compared to v1, showed acceptable validity, and is the recommended version for future use. The results of responsiveness testing further support the use of the OA-QI v2 as an outcome measure in studies aiming to improve osteoarthritis care.
The prevalence of OA is expected to increase because of ageing and obesity, and health care professionals must prepare for a rise in the demand for OA care. Two recent systematic reviews assessing the quality of OA care revealed a gap between evidence-based recommended care and clinical practice. This highlights the need to implement evidence-based treatment modalities in clinical care to optimize the care for people with OA. Quality indicator (QI) sets developed from OA care recommendations may be used to monitor and evaluate the quality of the provided care.
The OsteoArthritis Quality Indicator (OA-QI) questionnaire was developed in 2010 to assess patient-reported quality of OA care. Content validity was deemed satisfactory after patient research partners and expert panel groups judged the OA-QI items to be relevant. Hypothesis testing was used to assess construct validity, with all ten a priori hypotheses being confirmed. Assessed in a test-retest design, the relative reliability for the OA-QI was moderate.
The OA-QI was revised in 2015 following feedback from researchers and patient participants. The process involved two expert panels including patient research partners, statistical analyses and pilot-testing. It resulted in minor wording revisions to the items, additional response category headings and removal of one item considered redundant (details in Supplementary material, S1). The patient partners thought the Norwegian wording in the revised version was clear. For the English version, we harmonized the wording with a similar UK questionnaire developed by patient partners. These revisions mean that the OA-QI should be further assessed for measurement properties. Some properties including absolute reliability, responsiveness and interpretability were not assessed for the original version. Responsiveness should be assessed in order to determine whether the OA-QI can be used as an outcome measure in longitudinal studies. Calculation of the smallest detectable change (SDC) and minimal important change (MIC) of the OA-QI will assist in interpretation of change scores in longitudinal studies.
The aim of the current study was to assess reliability, validity, responsiveness and interpretability of the OA-QI v2. The results will further support the evidence base for use of the OA-QI in national and international QI work.
Methods
Design and ethical considerations
The current study was a longitudinal, observational cohort study with repeated measurements among people with OA attending an OA patient education programme. Testing of reliability, responsiveness and interpretability of the OA-QI v2 followed the COSMIN checklist. The Regional Committee for Medical and Health Research Ethics decided that ethical approval was not required under the Norwegian Act on Medical and Health Research (ref. no: 2016/3 REK south-east C). The study was in accordance with the Personal Data Act and the Personal Health Data Filing System Act as approved by the Data Inspectorate/Data Protection Official (ref. no: 2016/460). All participants provided written consent.
Study sample
OA patients referred by their general practitioner between December 2015 and August 2016 to a multidisciplinary OA patient education programme at Diakonhjemmet Hospital, Oslo, Norway, received a postal request to participate in the present study. Those completing the baseline questionnaire and one or more of three follow-up questionnaires were included in the analyses. People with non-Nordic names were excluded due to potential language problems, and questionnaires with >50% missing items were also excluded.
The OA patient education programme
The 4.5 h multidisciplinary programme is held about twice a month. For each session, 30 people are invited and between 10 and 20 usually attend. The programme includes presentations by a clinical nutritionist, an occupational therapist, a pharmacist, a patient representative, a physiotherapist and a rheumatologist. The nurse coordinator leading the programme facilitates discussion and interaction with the participants. Following this, participants may request future consultations with a rheumatologist, physical therapist and/or occupational therapist. Two group exercise sessions focusing on lower limb OA and/or two group exercise sessions for hand OA are also offered.
Questionnaire management
Participants completed the OA-QI v2 at four time points: 2 weeks before (T1), immediately before (T2), immediately after (T3), and 3 months after the OA patient education programme (T4). Study information, consent letter and the first questionnaire were sent by postal mail about 2.5 weeks prior to the OA education programme. Those who agreed to participate handed in the completed T1 questionnaire and the written consent to the study coordinator when arriving for the OA education programme. They received and completed the T2 questionnaire before the programme started. At the end of the programme, the participants filled in the T3 questionnaire before leaving. The T4 questionnaire was sent by postal mail to the participants after about 3 months. It was completed at home and returned in a prepaid envelope.
Questionnaires
The OA-QI v2 includes 16 QI items with ‘Yes’, ‘No’ and a third response option if the item is not applicable (for example, ‘Not overweight’ for the items on weight management) or the participant does not remember (Fig. 1). Six QIs address patient education and information about OA, treatment alternatives, self-management, physical activity/exercise, weight management and the use of anti-inflammatory medication. Provider assessments are covered in four items. Three items address referrals, and three relate to pharmacological treatment. The OA-QI v2 includes an optional time frame in the introduction text, and in the current study, the participants reported treatment and information received in the past year.
Fig. 1. OsteoArthritis Quality Indicator (OA-QI) questionnaire v2 in English.
An item was considered ‘eligible’ if the participant responded ‘Yes’ or ‘No’ for that item, whereas items were considered ‘not eligible’ and excluded from analysis if there was a missing/ambiguous response or if the participant had responded ‘Don't remember’, ‘Not overweight’, ‘No such problems’, and so on. Hence, the total number of eligible items varied across participants.
On the participant level, achievement of the QI items (i.e., pass rates) was calculated (in percentage) as the total number of items achieved divided by the number of eligible items for each participant, ranging 0–100 with 100 representing the best quality of care score. An individual pass rate of 100 means that the participant had checked ‘Yes’ to all eligible items, whereas a pass rate of 0 implies that the participant had checked ‘No’ to all eligible items. The mean total pass rate was calculated for the whole sample.
At the group level, an item pass rate for each item was equal to the total number of participants who passed (had checked ‘Yes’) divided by the total number of eligible participants (had checked ‘Yes’ or ‘No’) for a particular item. This is presented as an item pass rate for each item, ranging 0–100. An item pass rate of 100 implies that all eligible participants checked ‘Yes’ to that item.
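The two pass-rate definitions above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' scoring script: the response codes (`"yes"`, `"no"`, `"not_applicable"`, `"dont_remember"`, `None` for missing), the function names and the three-participant sample are all hypothetical.

```python
from statistics import mean
from typing import List, Optional

# Assumed coding: only "yes" and "no" make an item eligible; every other
# value (not applicable, don't remember, missing) is excluded.
ELIGIBLE = {"yes", "no"}


def participant_pass_rate(responses: List[Optional[str]]) -> Optional[float]:
    """Percentage of eligible items answered 'yes' for one participant."""
    eligible = [r for r in responses if r in ELIGIBLE]
    if not eligible:
        return None  # no eligible items, so no pass rate can be computed
    return 100.0 * sum(r == "yes" for r in eligible) / len(eligible)


def item_pass_rate(sample: List[List[Optional[str]]], item: int) -> Optional[float]:
    """Percentage of eligible participants answering 'yes' to one item."""
    answers = [p[item] for p in sample if p[item] in ELIGIBLE]
    if not answers:
        return None
    return 100.0 * sum(a == "yes" for a in answers) / len(answers)


# Hypothetical sample: three participants, four items each.
sample = [
    ["yes", "no", "not_applicable", "yes"],
    ["no", "no", "yes", None],
    ["yes", "dont_remember", "not_applicable", "no"],
]

rates = [participant_pass_rate(p) for p in sample]
# Mean total pass rate over participants with at least one eligible item.
mean_total = mean(r for r in rates if r is not None)
```

Note that the denominator differs per participant (and per item), which is why the number of eligible items varies across the sample.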
In addition to the OA-QI, the T1 questionnaire included socio-demographic and disease-related questions. The participants rated their mean pain level in the past week on an eleven-point numeric rating scale (NRS) (0 = no pain, 10 = worst possible pain) and reported the number of visits to general practitioners, medical specialists or physiotherapists in the past year. T1 also included one item from the COOP/WONCA (Dartmouth Primary Care Cooperative Information Project/World Organization of National Colleges, Academies, and Academic Associations of General Practice/Family Physicians) charts: ‘During the past 2 weeks, how much difficulty have you had doing your usual activities or tasks, both inside and outside the house because of your physical and emotional health?’ with five response alternatives ranging from ‘No difficulty at all’ to ‘Could not do’ (Table I). T2 included a question about whether they had visited a general practitioner, medical specialist or physiotherapist in the last 2 weeks (Yes/No). T3 included two additional items. The first asked whether the participants were satisfied with information and advice provided at the education programme (‘Not at all/To a little extent/To some extent/To a large extent/To a very large extent’). The second asked them to evaluate their current knowledge of OA compared with before the programme (‘Much less/Somewhat less/Same level of/Somewhat more/Much more knowledge now’). T4 included the same additional question as T2 but with a time frame of the last 3 months (Yes/No).
Table I (excerpt). COOP/WONCA† item on difficulty with usual activities

| Response alternative | % | % |
|---|---|---|
| No difficulty at all | 15% | 14% |
| A little bit of difficulty | 25% | 29% |
| Some difficulty | 38% | 34% |
| Much difficulty | 21% | 21% |
| Could not do | 1% | 1% |

∗ NRS: Numeric rating scale, 0–10 (0 = no pain, 10 = worst possible pain).
† COOP/WONCA: Dartmouth Primary Care Cooperative Information Project/World Organization of National Colleges, Academies, and Academic Associations of General Practice/Family Physicians. Item: ‘During the past 2 weeks, how much difficulty have you had doing your usual activities or tasks, both inside and outside the house because of your physical and emotional health?’
Test-retest reliability and measurement error were assessed for ‘stable’ participants who had not received any information or advice regarding OA from health professionals in the 2-week period between T1 and T2. Relative reliability was assessed with the intraclass correlation coefficient (ICC2.1; two-way mixed effect model for absolute agreement) for the mean total pass rate and with unweighted Cohen's kappa and percent agreement for single item pass rates (Table II). An ICC of ≥0.70 was considered acceptable. Kappa values of 0.2–0.4 were interpreted as fair, 0.4–0.6 as moderate, 0.6–0.8 as substantial and >0.8 as almost perfect agreement. Absolute reliability was assessed by calculation of measurement error, defined as ‘the systematic and random error of a patient's score that is not attributed to true changes in the construct to be measured’. In the current study, it was assessed by calculating the standard error of the measurement (SEM), with the random error value obtained from the SPSS VARCOMP analysis: $\mathrm{SEM}_{\mathrm{agreement}} = \sqrt{\sigma^2_{o} + \sigma^2_{\mathrm{residual}}}$ ($\sigma^2_{o}$ = variance due to systematic error between observations; $\sigma^2_{\mathrm{residual}}$ = random error). The SDC95% was calculated at individual and group level.
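As a rough illustration of how SEM feeds into the SDC95% at the two levels, the sketch below uses the conventional formulas SDCind = 1.96 × √2 × SEM and SDCgroup = SDCind / √n. These formulas are an assumption here, since the text as extracted does not spell them out; the function names are hypothetical. With the values reported later in the Results (SEM = 10.5, test-retest subsample n = 92), they give figures that match the reported SDC values after rounding.

```python
import math


def sem_agreement(var_systematic: float, var_residual: float) -> float:
    """SEM for agreement from two variance components (e.g. from a VARCOMP-style analysis)."""
    return math.sqrt(var_systematic + var_residual)


def sdc_individual(sem: float) -> float:
    """Smallest detectable change (95%) for one participant; assumed conventional formula."""
    return 1.96 * math.sqrt(2) * sem


def sdc_group(sem: float, n: int) -> float:
    """Smallest detectable change (95%) for a group of n participants; assumed conventional formula."""
    return sdc_individual(sem) / math.sqrt(n)


# Values taken from the Results section of this article.
sem = 10.5
n_retest = 92
print(round(sdc_individual(sem), 1))       # 29.1
print(round(sdc_group(sem, n_retest), 1))  # 3.0
```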
Table II. Test-retest kappa estimates, 95% CI and percent agreement for single OA-QI v2 items

| Item | Kappa | 95% CI | Percent agreement |
|---|---|---|---|
| 1. Have you been given information about osteoarthritis from a health professional? | 0.55 | 0.42, 0.65 | 76 |
| 2. Have you been given information about different treatment alternatives? | 0.67 | 0.50, 0.80 | 86 |
| 3. Have you been given information about how you can self-manage the disease? | 0.45 | 0.33, 0.60 | 80 |
| 4. Have you been given information about the importance of physical activity and exercise? | 0.68 | 0.61, 0.73 | 84 |
| 5. Have you been referred or offered a referral to a health professional who can advise you about physical activity and exercise? | 0.55 | 0.49, 0.66 | 79 |
| 6. Have you been advised to lose weight, if you are overweight? | 0.76 | 0.63, 0.83 | 86 |
| 7. Have you been referred or offered a referral to someone who can help you to lose weight, if you are overweight? | 0.85 | 0.75, 0.96 | 92 |
| 8. If you have problems with daily activities, have these problems been assessed by a health professional? | 0.45 | 0.35, 0.57 | 69 |
| 9. If you have problems with walking, has your need for a walking aid been assessed? (e.g., stick, crutch or walker) | 0.38 | 0.18, 0.51 | 71 |
| 10. If you have problems related to other daily activities, has your need for appliances and aids been assessed? (e.g., splints, assistive technology for cooking or personal hygiene, a special chair) | 0.43 | 0.35, 0.57 | 67 |
| 11. If you have joint pain, has it been assessed by a health professional? | 0.62 | 0.47, 0.68 | 82 |
| 12. If you have joint pain, was paracetamol the first medication that was recommended? | 0.73 | 0.70, 0.80 | 85 |
| 13. If you have prolonged severe joint pain, which is not relieved sufficiently by paracetamol, have you been offered stronger pain killing medications? (e.g., co-codamol, codeine, tramadol, co-proxamol, co-dydramol, dihydrocodeine) | 0.63 | 0.50, 0.68 | 76 |
| 14. If you use anti-inflammatory medications, have you been given information about the effects and possible side-effects of this medication? (e.g., ibuprofen (Nurofen, Brufen), diclofenac (Voltarol), naproxen (Naprosyn), celecoxib (Celebrex)) | 0.64 | 0.61, 0.64 | 76 |
| 15. If you have experienced an acute deterioration of your symptoms, have you been given or offered a steroid injection? | 0.67 | 0.52, 0.72 | 79 |
| 16. If you are severely troubled by your osteoarthritis, and exercise and medication do not help, have you been referred or offered a referral for an assessment for operation? (e.g., joint replacement) | 0.66 | 0.52, 0.72 | 80 |

Item response alternatives: items 1–5: ‘Yes’, ‘No’ or ‘Don't remember’; items 6–7: ‘Yes’, ‘No’ or ‘Not overweight’; items 8–10: ‘Yes’, ‘No’ or ‘No such problems’; items 11–12: ‘Yes’, ‘No’ or ‘No pain’; item 13: ‘Yes’, ‘No’ or ‘No prolonged severe pain’; item 14: ‘Yes’, ‘No’ or ‘Not taking such drugs’; item 15: ‘Yes’, ‘No’ or ‘Not experienced such deterioration’; item 16: ‘Yes’, ‘No’ or ‘Not severely troubled’.
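For readers unfamiliar with the two single-item statistics reported in Table II, the following minimal sketch computes percent agreement and unweighted Cohen's kappa for one item answered at two time points. It is an illustration under stated assumptions, not the analysis code used in the study: the function names and the eight paired responses are hypothetical, and the sketch assumes that only participants with an eligible answer at both time points are compared.

```python
from typing import List


def percent_agreement(t1: List[str], t2: List[str]) -> float:
    """Share of participants giving the same answer at both time points, in percent."""
    return 100.0 * sum(a == b for a, b in zip(t1, t2)) / len(t1)


def cohens_kappa(t1: List[str], t2: List[str]) -> float:
    """Unweighted Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(t1)
    observed = sum(a == b for a, b in zip(t1, t2)) / n
    categories = set(t1) | set(t2)
    # Chance agreement from the marginal proportions at each time point.
    expected = sum((t1.count(c) / n) * (t2.count(c) / n) for c in categories)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical paired answers to one item at T1 and T2.
t1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
t2 = ["yes", "no", "no", "no", "yes", "no", "yes", "yes"]

agreement = percent_agreement(t1, t2)  # 75.0
kappa = cohens_kappa(t1, t2)           # 0.5
```

The example shows why the two statistics can diverge: 75% raw agreement corresponds to a kappa of only 0.5 once the 50% agreement expected by chance is removed.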
Construct validity was assessed with six predefined hypotheses (Table III). We hypothesized that people choosing the third response alternative, ‘Not applicable’, for the items on weight reduction and functional assessments (items 6 and 8–11) would report lower BMI, better function on the COOP/WONCA item on daily activities and lower NRS pain levels compared to those responding ‘Yes’ or ‘No’ to the items. Further, we hypothesized that higher self-reported health care utilization in the past year was associated with higher QI mean total pass rates. Validity was considered acceptable if ≥75% (i.e., ≥5 of 6) of the predefined hypotheses were confirmed.
Table III. Confirmation of predefined hypotheses assessing construct validity of the OA-QI questionnaire v2 (n = 141–146)

| No. | Hypotheses | Mean values (sd)/percentages | Statistical analyses | Hypothesis confirmed? |
|---|---|---|---|---|
| 1 | People responding ‘Not overweight’ have lower BMI than people responding ‘Yes’ or ‘No’ on item 6 (‘Have you been advised to lose weight, if you are overweight?’) | 23 (2.3) vs 30 (5.0) | t = 10.9, P < 0.001; n = 84 vs 61 | Yes |
| 2 | People responding ‘No such problems’ report better function in daily activities∗ than people responding ‘Yes’ or ‘No’ on items 8–10: | | n = 48–98 vs 47–96 | Yes |
| 2a | If you have problems with daily activities, have these problems been assessed by a health professional? | 27/40/29/4/0 vs 8/18/43/29/2 | χ2 = 26.0, P < 0.001 | |
| 2b | If you have problems with walking, has your need for a walking aid been assessed? (e.g., stick, crutch or walker) | 19/32/34/15/0 vs 4/13/47/32/4 | χ2 = 19.3, P < 0.001 | |
| 2c | If you have problems related to other daily activities, has your need for appliances and aids been assessed? (e.g., splints, assistive technology for cooking or personal hygiene, a special chair) | 23/41/27/9/0 vs 7/12/48/31/3 | χ2 = 31.5, P < 0.001 | |
| 3 | People responding ‘No pain’ report lower pain levels (NRS pain) than people responding ‘Yes’ or ‘No’ on item 11 (‘If you have joint pain, has it been assessed by a health professional?’) | 1.5 (1.5) vs 5.1 (2.0) | t = 4.3, P < 0.001; n = 6 vs 140 | Yes |
| 4 | People responding ‘Not severely troubled’ report lower pain levels (NRS pain) than people responding ‘Yes’ or ‘No’ on item 16 (‘If you are severely troubled by your osteoarthritis, and exercise and medication do not help, have you been referred or offered a referral for an assessment for operation? (e.g., joint replacement)’) | 4.0 (1.9) vs 5.7 (2.0) | t = 5.0, P < 0.001; n = 63 vs 81 | Yes |
| 5 | People responding ‘Not severely troubled’ report better function in daily activities∗ than people responding ‘Yes’ or ‘No’ on item 16 (‘If you are severely troubled by your osteoarthritis, and exercise and medication do not help, have you been referred or offered a referral for an assessment for operation? (e.g., joint replacement)’) | … | … | … |
| 6 | …† in the past year is positively associated with the mean total pass rate at T1 | Median visits: 5 (range 1–150) | β = 0.03, P = 0.001; n = 142 | Yes |

BMI = body mass index. Analysed with Student's t-test, Chi-square or linear regression analysis.
∗ Response alternatives for function in daily activities: ‘No difficulty at all’/‘A little bit of difficulty’/‘Some difficulty’/‘Much difficulty’/‘Could not do’.
† Participant-reported total number of visits to general practitioner, medical specialist and physiotherapist in the past year combined (independent variable).
As no gold standard was available, the assessment of responsiveness relied on the construct approach and involved testing hypotheses. Responsiveness was considered acceptable if the change scores were consistent with ≥75% of the predefined hypotheses (Table IV).
Table IV (excerpt). Number of confirmed responsiveness hypotheses for single items

| Item | Hypotheses confirmed |
|---|---|
| …∗ | 1/3 |
| 16. If you are severely troubled by your osteoarthritis, and exercise and medication do not help, have you been referred or offered a referral for an assessment for operation?† | 1/3 |
| 13. If you have prolonged severe joint pain, which is not relieved sufficiently by paracetamol, have you been offered stronger pain killing medications?† | 2/3 |
| In total | 33/48 |

Δ = change. Confirmed hypotheses are in bold. HP = health professional. T1 = 2 weeks before, T2 = immediately before, T3 = immediately after, and T4 = 3 months after the OA patient education programme. Subgroup A: those reporting ‘much more knowledge’ at T3. Subgroup B: those reporting ‘somewhat more’/‘similar’/‘somewhat less’/‘much less knowledge’ at T3. Subgroup C: those reporting being satisfied with information/advice to a ‘very large extent’ at T3. Subgroup D: those reporting being satisfied with information/advice to a ‘large’/‘some’/‘small extent’/‘not at all satisfied’ at T3.
∗ Hypothesized very small (1–19%) change from No (T3) to Yes (T4).
† Hypothesized small (20–49%) change from No (T3) to Yes (T4).
In the four hypotheses for mean total pass rate, we expected the change score from T1 to T4 to be higher than the SDCgroup value. A 20% increase is often regarded as a meaningful or clinically relevant change, and we hypothesized that the mean total pass rate would change >20% for the total sample. Further, we hypothesized that there would be a significant difference in the change scores for participants reporting very large increases in knowledge and those reporting that the information/advice was satisfactory to a very large extent compared to the rest of the participants. We also hypothesized that the mean total pass rate would increase more from T3 to T4 among those reporting visits to health professionals compared to those who had no such visits.
We also evaluated the responsiveness for single item pass rates with three hypotheses for each item, considering responsiveness acceptable if ≥2 of 3 hypotheses were confirmed. Based on evidence for the effectiveness of OA interventions measured with QIs, we hypothesized that the OA patient education programme would affect the change scores from T2 to T3 only for the items related to information or advice (i.e., items no. 1–4, 6 and 14). We expected that pass rates for the items related to referrals, provider assessments and pharmacological treatment (i.e., items no. 5, 7–13, 15 and 16) would be stable from T2 to T3. Further, we hypothesized that change scores from T3 to T4 would be positive on all items among those who had visited health professionals.
Interpretability
Interpretability is not considered a measurement property, but ‘the degree to which one can assign qualitative meaning – that is, clinical or commonly understood connotations – to an instrument's quantitative scores or change in scores’. In the current study, interpretability was examined by floor and ceiling effects, SDC and MIC.
Floor and ceiling effects at T1 were considered to be present if more than 15% of the participants scored the lowest (0) or highest possible score (100). In the current study, a ceiling effect would imply limited scope for improvement in participant scores. The MIC after attending the OA patient education programme was determined with an anchor-based method employing the T3 question on ‘satisfaction with the provided information and advice’. The mean change method was used to determine MIC. Patients who reported being satisfied ‘to a large extent’ were considered to have experienced a meaningful change. For evaluative purposes, the SDC95% should be smaller than the MIC.
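The interpretability checks described above can be sketched as follows. This is a hedged illustration, not the study's analysis: the function names, the ten baseline scores, the four change scores and the anchor labels are hypothetical, and which pair of time points defines the change score is an assumption left open here. Only the 15% threshold, the anchor category ‘to a large extent’ and the SDC-versus-MIC comparison come from the text; the SDC and MIC values in the last two lines are those reported in the Results.

```python
from statistics import mean
from typing import List, Tuple


def floor_ceiling(scores: List[float], threshold: float = 0.15) -> Tuple[bool, bool]:
    """Return (floor effect present, ceiling effect present) for 0-100 scores."""
    n = len(scores)
    floor_share = sum(s == 0 for s in scores) / n
    ceiling_share = sum(s == 100 for s in scores) / n
    return floor_share > threshold, ceiling_share > threshold


def mic_mean_change(changes: List[float], anchors: List[str],
                    important: str = "to a large extent") -> float:
    """Mean change method: average change score in the anchor group deemed meaningfully changed."""
    return mean(c for c, a in zip(changes, anchors) if a == important)


def sdc_below_mic(sdc: float, mic: float) -> bool:
    """True when the smallest detectable change is smaller than the MIC."""
    return sdc < mic


# Hypothetical baseline pass rates and change scores.
baseline = [0, 25, 50, 50, 75, 100, 40, 60, 30, 80]
changes = [10.0, 20.0, 30.0, 40.0]
anchors = ["to some extent", "to a large extent",
           "to a large extent", "to a very large extent"]

effects = floor_ceiling(baseline)         # (False, False): 10% at each extreme
mic = mic_mean_change(changes, anchors)   # 25.0

# Reported values from this article: SDCind = 29.1, SDCgroup = 3.0, MIC = 20.4.
individual_ok = sdc_below_mic(29.1, 20.4)  # False
group_ok = sdc_below_mic(3.0, 20.4)        # True
```

With the reported values, the criterion is met at the group level but not at the individual level.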
The results are presented as mean (sd) or proportions (%). Estimates are given with 95% confidence intervals (CI). The Pearson Chi-square test, Fisher's exact test, Student's t-test for independent samples, and linear regression analysis were used to examine proportions and mean values in the predefined hypotheses for the total sample and subgroups. Statistical analyses were performed with IBM SPSS Statistics V.21 and STATA IC 13.
Results
In total, 472 people were referred by their general practitioner to the OA patient education programme and received an invitation to participate in the study. Of the 257 people attending the programme during the recruitment period, 182 (71%) agreed to participate. Among these, 19 were excluded due to language problems and seven due to more than 50% missing items (Fig. 2). Excluded participants were comparable to those included, except for lower mean age (mean age 58 vs 64, P < 0.05). The number of included participant responses was 156 (T1), 156 (T2), 150 (T3) and 147 (T4). A large majority of the study sample was female (Table I). Levels of missing data at the item level were low (2–3%), and missing items were excluded in analyses of pass rates. Among the 156 participant responses at T1/T2, 64 (41%) were excluded from test-retest analyses because they had visited a general practitioner, medical specialist or physiotherapist in the interim period. The test-retest subsample (n = 92) did not differ from the study sample regarding demographic, disease or treatment characteristics (Table I).
The mean total pass rates (95% CI) at T1, T2, T3 and T4 were 43% (39–46), 43% (39–47), 66% (62–70) and 70% (67–74), respectively. The pass rates for single items at all four time points are shown in Fig. 3. At T3, 91% of participants were satisfied to a large, or very large, extent with the information and advice provided at the OA education programme (Table V). Ninety-three percent reported that they had an increased knowledge level (Table V). At T4, 71 (46%) participants reported visits to one or more health professionals after attending the OA patient education programme. Of these, 38 (26%) had visited a general practitioner, 35 (24%) a medical specialist and 40 (27%) a physiotherapist.
Fig. 3. OA-QI single item pass rates* at T1, T2, T3 and T4 (n = 147–156). Mean pass rates for each item separately. The numerator represented the number of participants who passed (had checked ‘Yes’ to an item), and the denominator represented the number of eligible persons (had checked ‘Yes’ or ‘No’ to an item). T1 = Two weeks before the OA patient programme; T2 = Immediately before; T3 = Immediately after; T4 = Three months after.
The kappa coefficients ranged from 0.38 to 0.85 (Table II). One coefficient was considered to have fair agreement, five moderate, nine substantial and one almost perfect agreement. The percent of agreement ranged from 69% to 92% (Table II). The ICCagreement for the mean total pass rate was 0.89 (95% CI 0.83, 0.93). The SEMagreement was 10.5, SDCind = 29.1, and SDCgroup = 3.0.
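The classification counts above can be checked directly against the kappa column of Table II. The short sketch below applies the agreement bands stated in the Methods; how values exactly on a boundary are assigned, and the label for values at or below 0.2, are assumptions (no Table II value falls on a boundary). The function name is hypothetical.

```python
from collections import Counter


def kappa_band(kappa: float) -> str:
    """Map a kappa estimate to the agreement bands used in the Methods."""
    if kappa > 0.8:
        return "almost perfect"
    if kappa > 0.6:
        return "substantial"
    if kappa > 0.4:
        return "moderate"
    if kappa > 0.2:
        return "fair"
    return "below fair"  # assumed label; not used in the article


# Kappa estimates for items 1-16 as listed in Table II.
table_ii_kappas = [0.55, 0.67, 0.45, 0.68, 0.55, 0.76, 0.85, 0.45,
                   0.38, 0.43, 0.62, 0.73, 0.63, 0.64, 0.67, 0.66]

band_counts = Counter(kappa_band(k) for k in table_ii_kappas)
# fair: 1, moderate: 5, substantial: 9, almost perfect: 1
```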
Validity
Construct validity was considered acceptable with all six predefined hypotheses being confirmed (Table III).
Responsiveness
Responsiveness was considered acceptable with three of the four predefined hypotheses for change scores of the mean total pass rate and more than 2/3 of the hypotheses for change scores of single item pass rates being confirmed (Table IV). Most single items had at least two of three hypotheses confirmed, except for items regarding referral for weight reduction, assessment of walking problems, pain assessment, recommendation of paracetamol use and referral to orthopaedic surgeon (items no. 7, 9, 11, 12 and 16, respectively).
Interpretability
At T1, 2% of the participants had a mean total pass rate of 0 and 2% had a mean total pass rate of 100, indicating no floor or ceiling effects. The MIC after attending an OA patient education programme was 20.4 (Table V).
Discussion
The current study assessed validity, reliability, responsiveness and interpretability of the OA-QI v2. Reliability was considered acceptable for the mean total pass rate and for most single item pass rates. Validity and responsiveness were also considered acceptable as ≥75% of the predefined hypotheses were confirmed. There were no floor or ceiling effects, and the SDCgroup, SDCind and MIC were calculated to aid interpretation of change scores.
The mean total pass rate and single item pass rates at T1 or T2 in the current study were comparable
than three previous Norwegian studies that used v1. This could be related to a combination of the different number of items in the two versions and differences between the study samples. Compared to a previous study in people with only knee OA including subsamples from Denmark, UK and Portugal, the mean total pass rate in the current study was higher than the Danish, but lower than the Portuguese and UK median pass rates. A large proportion of participants in the current study had multisite OA, which may be due to the participants being referred to secondary health care.
Reliability
The ICC for the mean total pass rate was well above the minimum standard and considered acceptable. This implies that the OA-QI v2 is a reliable instrument that can be used in different types of settings, including clinical work, in cross-sectional or intervention studies, to capture patient-reported quality of OA care. Compared with a previous study on QIs for physiotherapy care among people with hip or knee OA, the test-retest reliability (ICC) for the mean total pass rate in the current study was similar.
Comparing the kappa estimates for OA-QI v1 with v2, the v2 estimates were higher and had narrower CIs for all items, except for the item on referral for physical activity/exercise advice (item no. 5), which was 0.58 and 0.55 for v1 and v2, respectively. Percent agreement was also higher for v2 with the exception of the item ‘Have you been given information about osteoarthritis from a health professional?’. The relative magnitude of kappa estimates was similar for items across both versions. The items regarding information about OA, information on self-management and the three items on assessment of functional ability, walking disability and need for other aids had the lowest kappa estimates. These items may be less concrete compared to the items on information about treatment alternatives, physical activity/exercise, weight reduction and pharmacological treatment. Furthermore, participants may be less certain about what is meant by ‘assessments’. Compared to being advised to lose weight or offered pharmacological treatment, patients may be more uncertain about the information they have received and whether a health professional has done a functional assessment or not. Compared to a study of measurement properties of a QI questionnaire in patients with a psychiatric disorder, the absolute agreement values were somewhat lower in the current study, but the kappa estimates were comparable. In contrast, absolute agreement values were similar and the kappa estimates somewhat lower in the current study as compared to a study of measurement properties of a QI questionnaire in informal caregivers of long-term care users.
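To illustrate why percent agreement and kappa can diverge for a single yes/no item, the following minimal sketch computes both for a test-retest pair. The response data are hypothetical and are not taken from the study; the function names are illustrative, and Cohen's kappa is assumed here as the kappa statistic for a dichotomous item.

```python
def percent_agreement(test, retest):
    """Proportion of respondents giving the same answer on both occasions."""
    matches = sum(a == b for a, b in zip(test, retest))
    return matches / len(test)


def cohen_kappa(test, retest):
    """Chance-corrected agreement between two administrations of one item.

    Undefined (division by zero) if expected agreement equals 1,
    i.e. if every respondent gives the same single answer both times.
    """
    n = len(test)
    p_observed = percent_agreement(test, retest)
    categories = set(test) | set(retest)
    # Expected agreement if the two administrations were independent
    p_expected = sum(
        (test.count(c) / n) * (retest.count(c) / n) for c in categories
    )
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical item answered by 20 respondents, mostly 'yes' on both occasions
test = ["yes"] * 16 + ["no"] * 4
retest = ["yes"] * 15 + ["no"] * 1 + ["yes"] * 1 + ["no"] * 3

agreement = percent_agreement(test, retest)
kappa = cohen_kappa(test, retest)
print(f"percent agreement: {agreement:.0%}, kappa: {kappa:.2f}")
```

With these hypothetical data, 18 of 20 answers agree, yet kappa is noticeably lower than the raw agreement because most respondents answer ‘yes’ and much of the agreement is expected by chance. This is one reason why the two statistics are reported side by side.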
Assessment of construct validity showed that the contrasting subgroups responded as anticipated. All six predefined hypotheses for the revised version were confirmed, which supports the construct validity of v2. This is in line with the assessment of construct validity for v1, in which all ten predefined hypotheses were confirmed.
Responsiveness for the OA-QI has not been assessed before, but was confirmed in the current study with >75% of the predefined hypotheses being confirmed. This means that the OA-QI has the ability to detect changes over time in patient-reported quality of OA care. The change scores for most items followed the expected pattern, but not all (i.e., the items on referrals and assessments). It is possible that some participants did not read the T4 questionnaire introduction and, instead of responding according to information/treatment received in the past year, responded according to what they had received since filling in the previous questionnaire (i.e., past 3 months). This may explain why some of the participants responded ‘yes’ at T3, but ‘no’ at T4. The more ‘positive’ reporting at T3 may also be related to an attention bias or an ‘eager to please’ situation as the questionnaire was completed towards the end of the education programme. Further, comorbidity was reported by a large proportion of participants, who may have been uncertain about which information, advice or treatment addressed their OA disease vs their other comorbid conditions. Additionally, the sample size for some of the hypotheses was very small.
Interpretability
There were no floor or ceiling effects for the OA-QI v2 in this study, which provided scope to detect changes in mean total pass rates in this prospective study. The SDCgroup was relatively small considering that the potential change score ranges from 0 to 100, and it was much lower than the MIC after attending a patient education programme. The results showed that change scores for the OA-QI v2 mean total pass rate on a group level should be >3 to exceed the measurement error (SDCgroup), but >20 to be considered an important change. Since MIC was 20 and SDCind 29, change scores between 20 and 29 for individuals should be interpreted with caution since we cannot be sure if the change reflects an important change or measurement error.
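The interpretation rule above can be sketched as a small decision function. The thresholds are the values reported in this study (MIC 20.4, SDC95% 29.1 for individuals and 3.0 for groups); the function name and category labels are illustrative. The back-calculation at the end assumes the commonly used relations SDC_ind = 1.96 × √2 × SEM and SDC_group = SDC_ind / √n, which are an assumption here and are not stated in this section.

```python
import math

# Values reported for the OA-QI v2 mean total pass rate (0-100 scale)
MIC = 20.4        # minimal important change
SDC_IND = 29.1    # smallest detectable change, individual level
SDC_GROUP = 3.0   # smallest detectable change, group level


def interpret_individual_change(change):
    """Classify one person's improvement in pass rate (positive = better).

    Illustrative labels only; negative changes fall into the lowest
    category because the MIC was estimated for improvement.
    """
    if change > SDC_IND:
        return "exceeds both MIC and individual measurement error"
    if change > MIC:
        return "exceeds MIC but not SDC: interpret with caution"
    return "below MIC"


def interpret_group_change(mean_change):
    """Classify a group-level mean improvement in pass rate."""
    if mean_change > MIC:
        return "exceeds measurement error and is an important change"
    if mean_change > SDC_GROUP:
        return "exceeds measurement error but is below MIC"
    return "within measurement error"


# Assumption: SDC_ind = 1.96 * sqrt(2) * SEM and SDC_group = SDC_ind / sqrt(n)
implied_sem = SDC_IND / (1.96 * math.sqrt(2))
implied_n = (SDC_IND / SDC_GROUP) ** 2

for change in (10, 25, 35):
    print(change, "->", interpret_individual_change(change))
print(f"implied SEM: {implied_sem:.1f}, implied n: {implied_n:.0f}")
```

Under the stated assumption, the reported SDC values correspond to an SEM of roughly 10.5 points and a sample of roughly 94 participants; these are back-calculated illustrations, not figures reported in the study.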
Strengths and limitations
Strengths of this study include the sufficiently large study sample and a methodology following the COSMIN checklist. With a large female predominance and a high proportion with multisite OA, the study sample may not be representative of all people with OA. Hence, caution should be applied when using the SDC and MIC from the current study in other OA cohorts. One limitation of this study is that the questionnaires were filled in at different locations at different time points (at home vs at the hospital), and different contexts may have affected the results. Furthermore, we did not collect feedback from participants on the questionnaire. A third limitation is that the T3 anchor question may not be ideal and that there was no anchor question included in the T4 questionnaire. For some of the sub-group analyses of responsiveness hypotheses, the sample size was below the recommended sample size of 50.
Implications
On the basis of the measurement properties of the OA-QI v2 demonstrated in the current study, we recommend the OA-QI v2 for future use. Caution is recommended when comparing mean total pass rates across the two versions since v2 had higher reliability estimates. It follows that v2 may be more precise than v1.
In conclusion, the OA-QI v2 showed acceptable validity and reliability with higher test-retest kappa estimates than v1, and v2 is recommended for future use. Responsiveness was considered acceptable, which confirms that OA-QI v2 can be used as an outcome measure in longitudinal studies. An evaluation of interpretability showed that the change scores for the OA-QI v2 mean total pass rate on a group level should be >3 to exceed SDC and >20 to be considered an important change. Interpretation of change scores between 20 and 29 for individuals should be made with caution since the SDC for individuals was 29.
Author contributions
All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be submitted for publication. Østerås had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
The study was funded by National Advisory Unit on Rehabilitation in Rheumatology, Department of Rheumatology, Diakonhjemmet Hospital. The funder was not involved in the choice of study design, data collection, analysis or interpretation of the data, in writing the manuscript, or in the decision to submit the manuscript for publication.
Conflict of interest
Nina Østerås: none.
Anne Therese Tveter: none.
Odd-Einar Svinøy: none.
Andrew M. Garratt: none.
Ingvild Kjeken: none.
Bård Natvig: none.
Margreth Grotle: none.
Kåre Birger Hagen: none.
Acknowledgment
The authors thank all of the patients who participated in this study.
Appendix A. Supplementary data
The following is the supplementary data related to this article:
References
American College of Rheumatology 2012 recommendations for the use of nonpharmacologic and pharmacologic therapies in osteoarthritis of the hand, hip, and knee.
The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: an international Delphi study.
The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes.
Minimum clinically important improvement and patient acceptable symptom state in pain and function in rheumatoid arthritis, ankylosing spondylitis, chronic back pain, hand osteoarthritis, and hip and knee osteoarthritis: results from a prospective multinational study.
Effect of a model consultation informed by guidelines on recorded quality of care of osteoarthritis (MOSAICS): a cluster randomised controlled trial in primary care.
Feasibility, reliability and validity of a questionnaire on healthcare consumption and productivity loss in patients with a psychiatric disorder (TiC-P).
Sustained informal care: the feasibility, construct validity and test-retest reliability of the CarerQol-instrument to measure the impact of informal care in long-term care.