Kelley's Formula as a Basis for the Assessment of Reliable Change

Published online by Cambridge University Press:  01 January 2025

Gerard H. Maassen*
Affiliation:
Department of Methodology and Statistics, Faculty of Social Sciences, Utrecht University, The Netherlands
* Requests for reprints should be sent to G.H. Maassen, Utrecht University, Faculty of Social Sciences, Department of Methodology and Statistics, Post Box 80140, 3508 TC Utrecht, THE NETHERLANDS. E-mail: g.maassen@fss.uu.nl.

Abstract

In the literature on the measurement of change, reliable change is usually determined by means of a confidence interval around an observed value of a statistic that estimates the true change. In recent literature on the efficacy of psychotherapies, attention has been particularly directed at the improvement of the estimation of the true change. Reliable Change Indices, incorporating the reliability-weighted measure of individual change, also known as Kelley's formula, have been proposed. According to current practice, these indices are defined as the ratio of such an estimator and an intuitively appealing criterion and then regarded as standard normally distributed statistics. However, because the authors fail to adopt an adequate standard error of the estimator, the statistical properties of their indices are unclear. In this article, it is shown that this can lead to paradoxical conclusions. The adjusted standard error is derived.
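
For readers unfamiliar with the quantities named in the abstract, the sketch below shows textbook classical-test-theory forms of Kelley's formula, the standard error of measurement, and a reliable change index of the kind introduced by Jacobson and Truax. It is an illustration under stated assumptions, not the derivation given in the article: the function names, the use of a single reliability and standard deviation for both measurement occasions, and the example values are all hypothetical, and the adjusted standard error the article derives is not reproduced here.

```python
import math


def kelley_true_score(observed, group_mean, reliability):
    """Kelley's regression estimate: shrink the observed score toward the group mean."""
    return reliability * observed + (1.0 - reliability) * group_mean


def standard_error_of_measurement(sd, reliability):
    """Classical standard error of measurement: sd * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def standard_error_of_estimation(sd, reliability):
    """Textbook standard error of the Kelley estimate of a single true score."""
    return sd * math.sqrt(reliability * (1.0 - reliability))


def classical_rc(pre, post, sd, reliability):
    """Observed pre-post difference divided by the standard error of that difference.

    Assumption: the same sd and reliability apply at both occasions.
    """
    se_diff = math.sqrt(2.0) * standard_error_of_measurement(sd, reliability)
    return (post - pre) / se_diff


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    sd, reliability = 10.0, 0.8
    print(kelley_true_score(30.0, 20.0, reliability))
    print(standard_error_of_estimation(sd, reliability))
    print(classical_rc(30.0, 20.0, sd, reliability))
```

The index is conventionally compared with a standard normal critical value such as 1.96. The abstract's concern can be read against this sketch: when the numerator is replaced by a reliability-weighted (shrunken) estimate of change, a denominator built for the raw difference no longer matches it, so the resulting ratio cannot simply be treated as standard normal.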

Type
Original Paper
Copyright
Copyright © 2000 The Psychometric Society
