
 

What Qualifies as Evidence of Effective Practice? Scientific Research

 

John F. Kihlstrom

 

Note: An edited version of this paper was published as Kihlstrom, J.F.  (2005).  What qualifies as evidence of effective practice?  Scientific research.  In J.C. Norcross, L.E. Beutler, & R.F. Levant (Eds.), Evidence-based practices in mental health: Debate and dialogue on the fundamental questions (pp. 23-31, 43-45).  Washington, D.C.: American Psychological Association.

 

Scientific research is the only process by which clinical psychologists and mental-health practitioners should determine what "evidence" guides evidence-based practices.

 

The Background in Evidence-Based Medicine

When the New York Times listed "evidence-based medicine" as one of the breakthrough ideas of 2001, many of its readers probably thought: "As opposed to what? Are there any medical treatments that are not evidence based?" (Hitt, 2001). The simple, straightforward answer to this question is "Yes." Although the medical profession has long cloaked itself in the mantle of science, the fact is that until surprisingly recently physicians had relatively few effective treatments for disease. The available treatments were mostly palliative in nature, intended to ameliorate the patient's symptoms and make the patient comfortable while nature took its course; or else physicians simply removed diseased organs and tissues through surgery. In a very real sense, scientific medicine began only in the latter part of the 19th century (about the same time scientific psychology began), with the laboratory revolution of Claude Bernard and the microbe-hunting of Louis Pasteur and Robert Koch, followed by the successive phases of the pharmaceutical revolution of the 20th century (Magner, 1992; Porter, 1997).

Nevertheless, almost 150 years after Bernard, and more than a century after Pasteur and Koch, the Times article cited a recent estimate that only about 20% of common medical practices were "based on rigorous research evidence", as opposed to being "a kind of folklore" (Hitt, 2001, p. 68). It is only in the last few years that researchers have begun to systematically evaluate medical practices to determine whether they actually work, which ones work better than others, and which are cost-effective (Davidoff, Haynes, Sackett, & Smith, 1995; Evidence-Based Medicine Working Group, 1992; Rosenberg & Donald, 1995; Sackett, Straus, Richardson, Rosenberg, & Haynes, 1997). But now evidence-based medicine -- defined as "the conscientious, explicit and judicious use of current best evidence in making decisions about the care of individual patients" (Sackett, Rosenberg, Muir Gray, Haynes, & Richardson, 1996, p. 71), and more broadly renamed evidence-based practices (EBPs; Institute of Medicine, 2001) -- is the way medicine increasingly does business.

 

Science, Psychotherapy, and Managed Care

We can trace a parallel history in psychology. Clinical psychology owes its professional status, including its autonomy from psychiatry and its eligibility for third-party payments, to the assumption that its procedures for diagnosis, treatment, and prevention are based on a substantial body of scientific evidence. But for a long time after the invention of psychotherapy in the latter part of the 19th century, this assumption simply went unchecked. It must have been a shock when, reviewing the paltry literature then available, Eysenck cast doubt on the proposition that psychotherapy had any positive effect at all, over and above spontaneous remission (Eysenck, 1952). It was certainly not good news for a profession facing competition from the first generation of psychotropic drugs, including lithium (introduced in 1949), the phenothiazines, imipramine, Miltown, and, later, the benzodiazepines. For an embarrassingly long time afterward, the chief counterweight to Eysenck's exposé was the assertion that psychotherapy did have effects after all, but that the negative effects balanced the positive ones, creating an illusion of no change (Bergin, 1966). It took another 25 years, and the development of new meta-analytic techniques -- which not only provided quantitative summaries of data trends but also enabled investigators to aggregate weak effects into strong ones -- for researchers to demonstrate that psychotherapy did, in fact, on average, have a greater positive effect than nothing at all (see also Lipsey & Wilson, 1993; Smith & Glass, 1977; Smith, Glass, & Miller, 1980).
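
To make the aggregation point concrete, here is a minimal sketch of the fixed-effect, inverse-variance pooling at the heart of such meta-analyses. The study values are invented for illustration (they are not from Smith and Glass), and the variance formula is the standard large-sample approximation for Cohen's d; the point is only that several individually unreliable effects can combine into a precise aggregate.

```python
import math

# Hypothetical per-study results: (Cohen's d, treated n, control n).
# These numbers are invented for illustration only.
studies = [
    (0.30, 25, 25),
    (0.45, 40, 38),
    (0.20, 30, 32),
    (0.55, 22, 20),
]

def d_variance(d, n1, n2):
    # Standard large-sample approximation to the sampling variance of d.
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))

# Fixed-effect, inverse-variance pooling: each study is weighted by the
# precision of its estimate, so individually weak effects can aggregate
# into a reliably nonzero overall effect.
weights = [1.0 / d_variance(d, n1, n2) for d, n1, n2 in studies]
pooled = sum(w * d for w, (d, _, _) in zip(weights, studies)) / sum(weights)
se = math.sqrt(1.0 / sum(weights))

print(f"pooled d = {pooled:.2f}, SE = {se:.2f}, z = {pooled / se:.2f}")
```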

One positive legacy of Eysenck's exposé, and Bergin's rejoinders to it, was research intended not only to demonstrate that psychotherapy did work after all, but also to identify conditions and techniques that would magnify the positive outcomes of psychotherapy and minimize the negative ones (Bergin & Strupp, 1970; Fiske et al., 1970; Garfield & Bergin, 1971; Strupp & Bergin, 1969). Following the rise of psychotropic drugs, the professional landscape within psychotherapy became even more competitive with the emergence of behavioral (Wolpe, 1958) and cognitive (Beck, 1970) therapies to rival more traditional psychodynamic and client-centered treatments. The first generation of behavioral and cognitive therapists took clinical psychology's scientific rhetoric seriously, and systematically set about demonstrating the effectiveness of what they did (Yates, 1970). And by the time Smith and Glass did their meta-analysis, it did indeed seem that the cognitive-behavioral therapies were able to deliver the goods in a way that more traditional insight-oriented approaches did not. Although some observers concluded from the Smith and Glass results that "everyone has won and so all must have prizes" (Luborsky, Singer, & Luborsky, 1975), this was not really the case. Even in the Smith and Glass study, the effect sizes associated with cognitive and behavioral therapies were larger than those associated with psychodynamic and humanistic ones (Smith & Glass, 1977; Smith et al., 1980). Over the succeeding years, the cognitive-behavioral therapies have gradually emerged as the standard of psychotherapeutic care.

Still, the analysis of Smith and Glass (Smith & Glass, 1977; Smith et al., 1980) suggested that there was enough success to go around, and that would probably have been enough to permit psychoanalysts, Rogerians, and behavior therapists alike to enjoy good professional livelihoods -- except that the professional landscape changed once again, with the rise of health maintenance organizations and other forms of managed care. Patients and clients can pay for whatever treatment they want out of their own pockets, regardless of whether it works well or efficiently, so long as they believe they are getting some benefit -- or are persuaded that some benefit will ultimately accrue to them. But when third parties foot the bill (patients and therapists are the first and second parties), strong demands for professional accountability come with the package -- and this is no less true for mental health care than it is for the rest of the healthcare industry (Kihlstrom & Kihlstrom, 1998). As a result, the demands of managed care have combined with the rhetoric of science, and with competition from both the cognitive-behavioral therapies and psychotropic drugs, to foster the development of standards for empirically supported treatments or therapies (ESTs; Chambless & Ollendick, 2001; Task Force, 1995) -- or, again, more broadly, evidence-based practices (EBPs) within clinical psychology.

 

"Efficacy" and "Effectiveness"

Viewed from a historical perspective, EBPs are something that clinical psychology should have been striving for, and promoting, all along, and they have a real flavor of historical inevitability to them. Remarkably, though, at least from the standpoint of a profession that takes pride in its scientific base, there has been considerable resistance to the demand for EBPs. As tempting as it might be to dismiss this resistance as coming from private-practice entrepreneurs who simply want to continue doing what they’ve always done and resent any infringements on their livelihoods, I suspect things are more complicated than that. Just as some well-intentioned physicians have bridled at having their clinical judgment checked by managed-care bureaucrats, some well-intentioned psychotherapists argue against any standards or guidelines at all, on the grounds that they should be free to pick whatever treatment they think will be best for the individual patient. But physicians don't have this freedom: they have to conform their practices to the available evidence -- and where evidence is lacking, to the prevailing standard of care. Why should psychotherapists be any different?

Other resisters, including some clinical scientists, believe that the "efficacy" research that provides the basis for EBPs is inappropriate, or at least insufficient, because the studies are conducted under somewhat artificial conditions that do not represent the problems that are encountered in actual practice (e.g., Levant, 2004; Seligman, 1995; Seligman & Levant, 1998; Westen & Morrison, 2001; Westen, Novotny, & Thompson-Brenner, 2004). Instead, they propose that ESTs be based on "effectiveness" research, which they argue is more ecologically valid. But the distinction between efficacy research and effectiveness research seems strained. Research is research. Clinical drug trials are somewhat artificial too, but their artificiality does not prevent physicians from prescribing effective drugs in actual practice, based in large part on carefully controlled studies that show that the drugs in question really do improve the conditions being treated.

To the extent that "effectiveness" research attempts to extend the logic of "efficacy" research to more ecologically valid treatment settings -- studying patients with comorbid conditions, for example, or with diagnoses on Axis II as well as Axis I, or more extended treatments -- there is no essential difference between the two. But to the extent that "effectiveness" research loosens the standards for methodological rigor characteristic of "efficacy" research, "effectiveness" research is a step backwards. In the Consumer Reports (CR) study, for example (Consumer Reports, 1995; Kotkin, Daviet, & Gurin, 1996; Seligman, 1995), the outcome of psychotherapy was measured by patients' self-reported satisfaction with their treatment, instead of by objective evidence of actual improvement; there were no controls for sampling bias, nor any untreated control group -- a particularly egregious problem in the wake of Eysenck's (1952) analysis. Nor did the study ask about the specificity of treatments -- a question that is critical for distinguishing a genuine effect of psychotherapy from a placebo effect, and for evaluating the differential effectiveness of various forms of therapy.

If the CR study is an example of effectiveness research, then effectiveness research is a step backward, not a step forward, in the journey toward evidence-based treatments.  "Efficacy" research, modeled on randomized clinical trials in drug research, is a good place to begin research on psychotherapy outcomes. Any deficiencies that "efficacy" studies might have with respect to ecological validity – deficiencies that might be remedied in the future by properly designed and controlled "effectiveness" studies -- should not be taken as an excuse for discounting them in the meantime.

 

Ratcheting Up the Standards

At present, the standards for evidence-based practice in psychotherapy are roughly modeled on the clinical trials required before drugs are marketed (Chambless & Ollendick, 2001). In order to qualify as "empirically supported" on the list maintained by the Society for a Science of Clinical Psychology (Division 12, Section III, of the APA), a treatment must yield outcomes that are significantly better than those associated with an adequate control (typically, patients who receive no treatment at all) in at least two studies, preferably conducted by independent research groups. These standards are a good start toward putting psychotherapy, at long last, on a firm scientific base. But they are also somewhat minimal, and over time they should be progressively ratcheted up (the opposite of defining them down; Moynihan, 1993) to improve the quality of psychotherapeutic practice.

For example: two studies out of how many? The current EST standard is modeled on Food and Drug Administration standards, which require only two positive trials, regardless of how many negative or inconclusive trials there are -- raising the file-drawer problem and the issue of selective publication of positive results. Just as the medical community is ratcheting up this requirement, by requiring drug companies to pre-register all drug trials as a condition of accepting reports of them for publication (Vedantam, 2004), so we might find a way to register ongoing psychotherapy outcome studies before their results are in. Certainly this is possible for major collaborative studies supported by federal funds.
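
A back-of-the-envelope calculation shows why "two positive trials" is a weak criterion when the total number of trials is unknown. The sketch below (the trial counts are arbitrary) computes the chance of obtaining at least two nominally significant results by chance alone, assuming no true effect and the conventional alpha of .05 per trial.

```python
from math import comb

# Under the null hypothesis of no true effect, each independent trial has
# probability alpha of coming out "positive" by chance alone.
alpha = 0.05

def p_two_or_more(n, p=alpha):
    # P(at least 2 of n trials are significant) under the null, by
    # subtracting the binomial probabilities of 0 and 1 successes.
    return 1.0 - sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in (0, 1))

for n in (2, 10, 20, 40):
    print(f"{n:2d} trials conducted: P(two or more positive) = {p_two_or_more(n):.3f}")
```

With 40 trials in the file drawer, the chance of two spuriously "positive" results exceeds one half -- which is why registration of trials before their results are in matters.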

More substantively, we might wish to drop the no-treatment control as an appropriate comparison group, in favor of either an appropriate placebo or some alternative treatment. It is something to prove that psychotherapy is better than nothing, but surely it is not much. Placebo controls are not easy to implement in psychotherapy research, because it is difficult to keep psychotherapists blind to the treatment they are delivering. In drug research, especially when there are ethical concerns about the use of placebo controls, new medications may be evaluated against the current standard of care instead. If a new drug is not discriminably better than what is already available -- and certainly if it is discriminably worse -- then it is incumbent on its proponents to show that it is a reasonable alternative treatment for some individuals, for whom the currently available medications are ineffective or inappropriate. An example of such a comparison might be the NIMH Treatment of Depression Collaborative Research Program (TDCRP), in which the antidepressant drug imipramine might be construed as the established standard of (medical) care, and psychotherapy as the alternative -- and, for that matter, cognitive-behavior therapy as the alternative to the more established interpersonal therapy (Elkin et al., 1989).

Then there is the matter of how to evaluate the significance of outcomes. Long ago, Jacobson and his colleagues pointed out that a statistically significant change on some criterion measure may not reflect a clinically significant change in terms of the patient’s status (Jacobson, Follette, & Revenstorf, 1984; Jacobson & Revenstorf, 1988). The question is, what are the standards for clinical significance? Although I continue to believe (Kihlstrom, 1998) that the null-hypothesis statistical test is the foundation of principled argument in psychology (Abelson, 1995), psychotherapy outcome is one case where effect sizes really are preferable to tests of statistical significance (Cohen, 1990, 1994). Although even small effects can be practically significant (Rosenthal, 1990), there is no question that big effects are better – and probably more significant clinically as well.
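
The distinction matters because, with enough patients, even a trivially small effect will pass a significance test. A minimal normal-approximation sketch (the d and n values are arbitrary) makes the point:

```python
import math

# For two equal groups of n patients each, the z statistic for a
# standardized mean difference d is roughly d * sqrt(n / 2).
def z_stat(d, n_per_group):
    return d * math.sqrt(n_per_group / 2)

def two_tailed_p(z):
    # Two-tailed p-value from the standard normal, via the error function.
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

for d, n in [(0.80, 20), (0.10, 20), (0.10, 2000)]:
    z = z_stat(d, n)
    print(f"d = {d:.2f}, n per group = {n:4d}: z = {z:5.2f}, p = {two_tailed_p(z):.4f}")
```

A d of 0.10 is clinically negligible, yet with 2,000 patients per group it is "significant" at p < .01; the effect size, not the p-value, carries the clinical information.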

One reasonable standard for clinical significance is that a patient who enters psychotherapy by virtue of receiving a diagnosis of mental disorder should no longer qualify for that diagnosis at the end of treatment. Accordingly, Jacobson and his colleagues suggested that the outcome of psychotherapy be deemed successful if the treated patients' scores on some criterion measure fall within normal limits (e.g., within 2 SD of the population mean), or more than 2 SD from the untreated-patient mean, or preferably both (Jacobson et al., 1984; Jacobson & Revenstorf, 1988). Such standards are occasionally applied to the evaluation of therapeutic outcomes, including those of the TDCRP (Ogles, Lambert, & Sawyer, 1995). Of course, it might turn out that some mental disorders are chronic in nature, meaning that a cure, so defined, is impossible. Even so, clinically relevant standards for evaluating outcome in the treatment of chronic mental disorder might be modeled on evolving procedures for evaluating the management of chronic physical illnesses such as asthma or diabetes (Fox & Fama, 1996).
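
A minimal sketch of the Jacobson-style criteria just described follows. The normative and patient means and SDs are hypothetical placeholders; in practice they would come from published norms for the criterion measure (e.g., the BDI).

```python
# Hypothetical placeholder norms (lower scores = less pathology here).
NORMAL_MEAN, NORMAL_SD = 5.0, 4.0       # functional population
PATIENT_MEAN, PATIENT_SD = 25.0, 6.0    # untreated patient population

def clinically_significant(posttest):
    # Criterion (a): posttest score falls within 2 SD of the normal mean.
    within_normal = abs(posttest - NORMAL_MEAN) <= 2 * NORMAL_SD
    # Criterion (b): posttest score is more than 2 SD below the
    # untreated-patient mean.
    beyond_patients = posttest < PATIENT_MEAN - 2 * PATIENT_SD
    return within_normal, beyond_patients

for score in (4, 12, 20):
    a, b = clinically_significant(score)
    print(f"posttest = {score:2d}: within normal limits {a}, "
          f"beyond patient range {b}, both {a and b}")
```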

Again, this is a start, but one can imagine at least two improvements. One is to assess outcomes in terms of laboratory measures of mental and behavioral functioning, instead of symptoms -- especially self-reported symptoms. In the TDCRP, for example, outcomes were measured by patients' scores on the Beck Depression Inventory (BDI), the Hamilton Rating Scale for Depression (HRSD), and the Hopkins Symptom Checklist. But while the diagnosis of mental disorder (as represented by DSM-IV; American Psychiatric Association, 1994) is based on signs and symptoms, just as it was in the 19th century, in the rest of healthcare the diagnosis of illness and the evaluation of treatment outcome are increasingly based on the results of objective laboratory tests, such as blood tests and radiological scans, interpreted in light of an increasingly sophisticated understanding of normal structure and function. It is long past time (Kihlstrom & Nasby, 1981; Nasby & Kihlstrom, 1986) that psychology began to move away from questionnaires and rating scales and toward a new generation of assessment procedures based on objective laboratory tests of psychopathology (Kihlstrom, 2002b).

The interest of third-party payers in the outcome of both treatment and disease management suggests yet another, more macroscopic approach to the evaluation of outcomes – which is to assess how the treated patient fares in the ordinary course of everyday living. Couples who go through marital therapy might reasonably expect to have happier children than they did before; and employers who pay for their employees to participate in alcohol or drug-abuse treatment programs might reasonably ask if their employees do, in fact, become more productive after treatment. These examples remind us that there are other stakeholders in the treatment process than the patients themselves – and that their evaluation of treatment outcome also counts.

As an example of what might be done, Rosenblatt and Attkisson have proposed a conceptual framework in which outcome evaluation proceeds along three dimensions (Rosenblatt & Attkisson, 1993): the respondent (the patient, family members, social acquaintances, the therapist, or an independent evaluator), the social context (personal, family, work or school, community), and domain (clinical status, functional status, life satisfaction and fulfillment, and safety and welfare). So, for example, in addition to measuring clinical status with scales such as the BDI or HRSD, we could evaluate the degree to which the patients’ families and co-workers notice a difference (Sechrest, McKnight, & McKnight, 1996) after treatment, or the degree to which these "third parties" feel that their own life satisfaction has improved. Such a proposal transcends quibbles about the quantitative threshold for clinical significance, and brings qualitative considerations of ecological validity into the measure of treatment outcome.
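
As a sketch of how that three-dimensional framework might be operationalized, the record type below tags each outcome observation with its respondent, context, and domain. The dimension values are taken from the summary above; the record type and the sample entry are invented for illustration.

```python
from dataclasses import dataclass

# Dimension values from Rosenblatt and Attkisson's framework as
# summarized in the text.
RESPONDENTS = ("patient", "family member", "social acquaintance",
               "therapist", "independent evaluator")
CONTEXTS = ("personal", "family", "work or school", "community")
DOMAINS = ("clinical status", "functional status",
           "life satisfaction and fulfillment", "safety and welfare")

@dataclass
class OutcomeObservation:
    respondent: str
    context: str
    domain: str
    measure: str   # e.g., "BDI" or "co-worker rating" (hypothetical)
    value: float

# Example: a family member's rating of life satisfaction at home.
obs = OutcomeObservation("family member", "family",
                         "life satisfaction and fulfillment",
                         "satisfaction rating (1-7)", 5.0)
assert obs.respondent in RESPONDENTS
assert obs.context in CONTEXTS and obs.domain in DOMAINS
print(obs)
```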

Finally, it should be understood that EBPs include more than treatments: they also include the procedures by which patients are diagnosed, and treatment outcomes are assessed. Many of the assessment techniques traditionally employed by clinical psychologists (Rapaport, Gill, & Schafer, 1968) appear to rest on a surprisingly weak evidentiary base (Wood, Nezworski, Lilienfeld, & Garb, 2003). We need to extend the logic of EBPs to assessment as well as treatment, establishing and improving the validity of our current techniques, and abandoning those that do not pass muster. Moreover, it should go without saying that the logic of EBPs extends beyond clinical psychology to the broader range of professional psychology, including counseling, educational, and industrial/organizational psychology, and other domains where scientific knowledge is put into practice.

 

The Theory Behind the Therapy

Documenting treatment efficacy is not just a purely empirical matter: there are also theoretical considerations in the evaluation of any form of treatment. As my spouse once put it, in a conversation about an innovative treatment: "What made them think that would work?" It is not enough that a treatment prove, empirically, to be efficacious. Just as sound medical treatment is based on a scientific understanding of anatomy and physiology, so sound psychotherapy must be based on a scientifically valid understanding of mental and behavioral processes. Here is where placebos and other controls may have their real value -- not merely in going one step further than showing that psychotherapy is better than nothing, but in evaluating claims concerning the mechanism by which a treatment achieves its effects. If some form of psychotherapy does no better than an appropriate placebo, we can begin to doubt whether that treatment has any specific effects at all. Of course, this assumes that psychotherapy is more than a placebo treatment to begin with (Frank, 1961; Rosenthal & Frank, 1956) -- which, in fact, is my assumption.

Other kinds of controlled therapy outcome research can also evaluate the scientific validity of certain psychotherapeutic practices. For example, Wolpe's (1958) invention of systematic desensitization was predicated on Hullian learning theory; the only problem was that psychology already had grounds to suspect that Hullian learning theory was not correct (Gleitman, Nachmias, & Neisser, 1954). Fortunately, later research (e.g., Wilson & Davison, 1971) showed that exposure was the active ingredient in systematic desensitization -- a conclusion that was consistent with the new, improved, cognitive view of learning that emerged in the 1960s. Along similar lines, a more recent spate of dismantling studies indicates that exposure, not eye movements, is also responsible for the effectiveness of eye-movement desensitization and reprocessing (EMDR; e.g., Lohr, Lilienfeld, Tolin, & Herbert, 1999). Although EMDR may pass the narrowly empirical test for efficacy, claims on its behalf may be undercut by the lack of evidence for its underlying theory.

The point here is that sound treatments are not just those that are empirically supported. Sound treatments are based on scientifically valid theories of mind and behavior. Whenever an innovative therapy is accompanied by a theoretical statement of its underlying mechanism, the therapy should be evaluated not just in terms of whether it works, but in terms of its proponents’ theory of why it works. In this way, we validate the general principles on which the treatment is based, and which can form the basis for other therapeutic innovations as well (Rosen & Davison, 2003). And we avoid the trap of using efficacy research to legitimize proprietary, even trademarked, therapies.

To take an example from the history of hypnosis, Mesmer's animal magnetism was not rejected by the Franklin Commission because it did not work (Kihlstrom, 2002a). Everyone agreed that it did work – and, in fact, Mesmer had previously scored a win for scientific medicine by showing that he could duplicate the effectiveness of exorcisms with a technique that was materialist, rather than supernatural, in nature. Animal magnetism was rejected solely because Mesmer's theory was wrong, and nobody had a good theory to replace it (scientific psychology not having been invented yet). Exorcism might work, empirically, but even if it did medicine would reject it as a legitimate treatment because its underlying theory -- that disease is caused by demon possession -- is inconsistent with everything we know about how the body works.

 

Science as the Basis of Practice

The example of Mesmer and animal magnetism makes it clear that the relation between science and practice is not unidirectional: studies of psychopathology and psychotherapy can alter our understanding of normal mental and behavioral function (Kihlstrom, 1979; Kihlstrom & McGlynn, 1991). But it also underscores the point that we want our evidence-based practices not only to be empirically validated, but to be based on valid scientific principles as well. The scientific method is the best way we have of understanding how the world works, and why. Therefore, it is also the best way we have of knowing which of our practices work (and why). In establishing the validity of our theories and practices, anecdotal evidence, impressionistic clinical observations, and customer-satisfaction ratings simply will not suffice. Enhancing the scientific basis of clinical practice -- determining which practices are scientifically valid, promoting them, and letting the others wither away -- is the best way for clinical psychology to meet the competition from psychiatry and drugs, and to meet the demands of managed care. It is the best way for clinical psychology to promote the public welfare. And it is the only way for clinical psychology to achieve its aspirations.

 

Author Note

I thank Lucy Canter Kihlstrom for many stimulating discussions, which have helped me clarify my ideas about the relations between science and practice.

 

References

Abelson, R. P. (1995). Statistics as principled argument. Hillsdale, N.J.: Erlbaum.

American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, D.C.: American Psychiatric Association.

Beck, A. T. (1970). Cognitive therapy: Nature and relation to behavior therapy. Behavior Therapy, 1, 184-200.

Bergin, A. E. (1966). Some implications of psychotherapy research for therapeutic practice. Journal of Abnormal Psychology, 71, 235-246.

Bergin, A. E., & Strupp, H. H. (1970). New directions in psychotherapy research. Journal of Abnormal Psychology, 76(1), 13-26.

Chambless, D. L., & Ollendick, T. H. (2001). Empirically supported psychological interventions: Controversies and evidence. Annual Review of Psychology, 52, 685-716.

Cohen, J. (1990). Things I have learned (so far). American Psychologist, 45, 1304-1312.

Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997-1003.

Consumer Reports. (1995). Mental health: Does therapy help? Consumer Reports (November), 734-739.

Davidoff, F., Haynes, B., Sackett, D. L., & Smith, R. (1995). Evidence based medicine. British Medical Journal, 310, 1085-1086.

Elkin, I., Shea, M. T., Watkins, J. T., Imber, S. D., Sotsky, S. M., Collins, J. F., Glass, D. R., Pilkonis, P. A., Leber, W. R., Docherty, J. P., Fiester, S. J., & Parloff, M. B. (1989). National Institute of Mental Health Treatment of Depression Collaborative Research Program: General effectiveness of treatments. Archives of General Psychiatry, 46, 971-982.

Evidence-Based Medicine Working Group. (1992). Evidence-based medicine: A new approach to the teaching of medicine. Journal of the American Medical Association, 268, 2420-2425.

Eysenck, H. J. (1952). The effects of psychotherapy: An evaluation. Journal of Consulting Psychology, 16, 319-324.

Fiske, D. W., Hunt, H. F., Luborsky, L., Orne, M. T., Parloff, M. B., Reiser, M. F., & Tuma, A. H. (1970). Planning of research on effectiveness of psychotherapy. Archives of General Psychiatry, 22, 22-32.

Fox, P. D., & Fama, T. (Eds.). (1996). Managed care and chronic illness: Challenges and opportunities. Gaithersburg, Md.: Aspen.

Frank, J. D. (1961). Persuasion and healing. Baltimore, Md.: Johns Hopkins University Press.

Garfield, S. L., & Bergin, A. E. (1971). Therapeutic conditions and outcome. Journal of Abnormal Psychology, 77(2), 108-114.

Gleitman, H., Nachmias, J., & Neisser, U. (1954). The S-R reinforcement theory of extinction. Psychological Review, 61, 23-33.

Hitt, J. (2001, December 9). Evidence-based medicine. New York Times Magazine.

Institute of Medicine. (2001). Crossing the quality chasm: A new health system for the 21st century. Washington, D.C.: Institute of Medicine.

Jacobson, N. S., Follette, W. C., & Revenstorf, D. (1984). Psychotherapy outcome research: Methods for reporting variability and evaluating clinical significance. Behavior Therapy, 15, 336-352.

Jacobson, N. S., & Revenstorf, D. (1988). Statistics for assessing the clinical significance of psychotherapy techniques: Issues, problems, and new developments. Behavioral Assessment, 10, 133-145.

Kihlstrom, J. F. (1979). Hypnosis and psychopathology: Retrospect and prospect. Journal of Abnormal Psychology, 88(5), 459-473.

Kihlstrom, J. F. (1998). If you've got an effect, test its significance: If you've got a weak effect, do a meta-analysis [Commentary on "Precis of Statistical significance: Rationale, validity, and utility" by S.L. Chow]. Behavioral & Brain Sciences, 21, 205-206.

Kihlstrom, J. F. (2002a). Mesmer, the Franklin Commission, and hypnosis: A counterfactual essay. International Journal of Clinical & Experimental Hypnosis, 50, 408-419.

Kihlstrom, J. F. (2002b). To honor Kraepelin...: From symptoms to pathology in the diagnosis of mental illness. In L. E. Beutler & M. L. Malik (Eds.), Alternatives to the DSM (pp. 279-303). Washington, D.C.: American Psychological Association.

Kihlstrom, J. F., & Kihlstrom, L. C. (1998). Integrating science and practice in an environment of managed care. In D. K. Routh & R. J. DeRubeis (Eds.), The science of clinical psychology: Accomplishments and future directions. (pp. 281-293). Washington, D.C.: American Psychological Association.

Kihlstrom, J. F., & McGlynn, S. M. (1991). Experimental research in clinical psychology, The clinical psychology handbook (2nd ed.). (pp. 239-257). New York: Pergamon.

Kihlstrom, J. F., & Nasby, W. (1981). Cognitive tasks in clinical assessment: An exercise in applied psychology. In P. C. Kendall & S. D. Hollon (Eds.), Cognitive-behavioral interventions: Assessment methods (pp. 287-317). New York: Academic.

Kotkin, M., Daviet, C., & Gurin, J. (1996). The Consumer Reports mental health survey. American Psychologist, 51, 1080-1082.

Levant, R. F. (2004). The empirically validated treatments movement: A practitioner/educator perspective. Clinical Psychology: Science & Practice, 11, 219-224.

Lipsey, M. W., & Wilson, D. B. (1993). The efficacy of psychological, educational, and behavioral treatment: Confirmation from meta-analysis. American Psychologist, 48, 1181-1209.

Lohr, J. M., Lilienfeld, S. O., Tolin, D. F., & Herbert, J. D. (1999). Eye movement desensitization and reprocessing: An analysis of specific versus nonspecific treatment factors. Journal of Anxiety Disorders, 13(1-2), 185-207.

Luborsky, L., Singer, B. H., & Luborsky, L. (1975). Comparative studies of psychotherapies: Is it true that "everyone has won and all must have prizes"? Archives of General Psychiatry, 32, 995-1008.

Magner, L. N. (1992). A history of medicine. New York: Dekker.

Moynihan, D. P. (1993). Defining deviancy down. American Scholar, 62(1), 17-30.

Nasby, W., & Kihlstrom, J. F. (1986). Cognitive assessment in personality and psychopathology. In R. E. Ingram (Ed.), Information processing approaches to psychopathology and clinical psychology (pp. 217-239). New York: Academic.

Ogles, B. M., Lambert, M. J., & Sawyer, J. D. (1995). Clinical significance of the National Institute of Mental Health Treatment of Depression Collaborative Research Program data. Journal of Consulting & Clinical Psychology, 63, 321-326.

Porter, R. (1997). The greatest benefit to mankind: A medical history of humanity. New York: Norton.

Rapaport, D., Gill, M. M., & Schafer, R. (1968). Diagnostic psychological testing (Rev. ed., R. R. Holt, Ed.). New York: International Universities Press.

Rosen, G. M., & Davison, G. C. (2003). Psychology should list empirically supported principles of change (ESPs) and not credential trademarked therapies or other treatment packages. Behavior Modification, 27(3), 300-312.

Rosenberg, W., & Donald, A. (1995). Evidence based medicine: An approach to clinical problem-solving. British Medical Journal, 310, 1122-1126.

Rosenblatt, A., & Attkisson, C. C. (1993). Assessing outcomes for sufferers of severe mental disorder: A conceptual framework and review. Evaluation & Program Planning, 16, 347-363.

Rosenthal, D., & Frank, J. D. (1956). Psychotherapy and the placebo effect. Psychological Bulletin, 53, 294-302.

Rosenthal, R. (1990). How are we doing in soft psychology? American Psychologist, 45, 775-777.

Sackett, D. L., Rosenberg, W. M. C., Muir Gray, J. A., Haynes, R. B., & Richardson, W. S. (1996). Evidence based medicine: What it is and what it isn't. British Medical Journal, 312, 71-72.

Sackett, D. L., Straus, S. E., Richardson, W. S., Rosenberg, W., & Haynes, R. B. (1997). Evidence-based medicine: How to practise and teach EBM. Edinburgh: Churchill Livingstone.

Sechrest, L., McKnight, P., & McKnight, K. (1996). Calibration of measures for psychotherapy outcome studies. American Psychologist, 51, 1065-1071.

Seligman, M. E. P. (1995). The effectiveness of psychotherapy: The Consumer Reports study. American Psychologist, 50, 965-974.

Seligman, M. E. P., & Levant, R. F. (1998). Managed care policies rely on inadequate science. Professional Psychology: Research & Practice, 29, 211-212.

Smith, M. L., & Glass, G. V. (1977). Meta-analysis of psychotherapy outcome studies. American Psychologist, 32, 752-760.

Smith, M. L., Glass, G. V., & Miller, R. L. (1980). The benefits of psychotherapy. Baltimore: Johns Hopkins University Press.

Strupp, H. H., & Bergin, A. E. (1969). Some empirical and conceptual bases for coordinated research in psychotherapy: A critical review of issues, trends, and evidence. International Journal of Psychiatry, 7, 18-90.

Task Force. (1995). Training in and dissemination of empirically validated psychological treatments: Report and recommendations [of the Task Force on Promotion and Dissemination of Psychological Procedures of Division 12 (Clinical Psychology) of the American Psychological Association]. Clinical Psychologist, 48, 3-23.

Vedantam, S. (2004, September 9). Journals insist drug manufacturers register all trials. Washington Post, pp. A02.

Westen, D., & Morrison, K. (2001). A multidimensional meta-analysis of treatments for depression, panic, and generalized anxiety disorder: An empirical examination of the status of empirically supported therapies. Journal of Consulting & Clinical Psychology, 69, 875-899.

Westen, D., Novotny, C. M., & Thompson-Brenner, H. (2004). The empirical status of empirically supported psychotherapies: Assumptions, findings, and reporting in controlled clinical trials. Psychological Bulletin, 130(4), 631-663.

Wilson, G. T., & Davison, G. C. (1971). Processes of fear reduction in systematic desensitization: Animal studies. Psychological Bulletin, 76, 1-14.

Wolpe, J. (1958). Psychotherapy by reciprocal inhibition. Stanford, Ca.: Stanford University Press.

Wood, J. M., Nezworski, M. T., Lilienfeld, S. O., & Garb, H. N. (2003). What's wrong with the Rorschach? Science confronts the controversial inkblot test. New York: Jossey-Bass.

Yates, A. J. (1970). Behavior therapy. New York: Wiley.