The NIMH Multimodal Treatment Study for Attention-Deficit Hyperactivity Disorder: Just Say Yes to Drugs Alone?

William E Pelham, Jr, PhD


The outcome, merits, and limitations of the Multimodal Treatment Study for Children With Attention-Deficit Hyperactivity Disorder (ADHD), the MTA study, are described and discussed. Consideration is given to design issues that make the MTA a landmark study for clinicians and researchers working with ADHD children. These include the large, heterogeneous sample, the state-of-the-art treatment, the lengthy treatment period, the extensive documentation of treatment manuals, and the attention paid to treatment fidelity and adherence. Also highlighted are facets of the design that predisposed the study in favour of a differentially positive outcome for pharmacological relative to behavioural treatment. Primary among these is the fact that outcome was measured 4–6 months after the intensive phase of behaviour treatment and after therapeutic contact with the behaviour therapist had ended but while medication treatment was active and in its most intensive phase. Finally, the outcome for the combined treatment condition in the MTA is discussed in the context of the extant literature and directions for future research.

(Can J Psychiatry 1999;44:981–990)

Key Words: attention-deficit hyperactivity disorder, behavioural treatment, medication management, study design

For the past 5 years, a major clinical trial has been undertaken by the National Institute of Mental Health (NIMH) and 6 collaborating academic sites (1–5). Based on the extensive literature documenting the effectiveness of behavioural and pharmacological treatments for attention-deficit hyperactivity disorder (ADHD) as well as the consistent finding in a smaller database of studies that combined interventions were more effective than separate treatments alone, the NIMH selected ADHD as the first childhood mental health disorder for which to mount a large, randomized clinical trial of treatment efficacy. This trial, the Multimodal Treatment Study for Children With ADHD (the MTA study), has focused on studying the relative effectiveness of treatment for ADHD. After an initial call for proposals, the NIMH selected 6 sites (University of Pittsburgh, Universities of California at Irvine and Berkeley, Duke University, Columbia University, and Long Island Jewish Medical Center with McGill University). Lead and collaborating investigators from these sites, the NIMH, and the United States Office of Education comprised the MTA Steering Committee (SC) and spent more than 1 year designing and planning the implementation of the study.

The resulting study is an investigation that compares 4 treatments for ADHD: behavioural treatment (BT), medication management (MM), combined BT and MM, and a community comparison control group. Seven- to 9-year-old children with ADHD were recruited from various community resources (schools, physicians, clinics, newspapers, and parent referral) and randomly assigned to 1 of the 4 treatments. Children and their families were followed intensively, with major assessments at baseline, at 9 and 14 months of treatment, and at follow-up 24 months after baseline. The design of the study and its treatments are described briefly in Table 1.

This article is an overview of the results obtained to date and discusses the MTA from a critical perspective. As 1 of the project’s 6 principal investigators, I have been intimately involved in the conceptualization, rationale, design, implementation, analysis, and dissemination of the results of the MTA. Because this study is the first large, randomized clinical trial for a childhood mental health disorder, it is expected to be widely cited—it was already featured prominently at the recent National Institutes of Health (NIH) Consensus Development Conference on ADHD (November 1998)—and to impact on both practice (for example, practice guidelines, reimbursement) and future research in ADHD. Indeed, the MTA is a landmark study for the field of childhood mental health that will yield a wealth of information regarding ADHD and its treatment for years to come. It is therefore critically important to ensure that reports of the study’s results are complete and accurate and that the findings are understood in the contexts of the study’s design and limitations, the analytic plan, and the extant literature on treatment of ADHD. Unfortunately, rumours are widespread, and descriptions of purported MTA results are circulating, some of them bearing little relationship to the actual findings. This article clarifies misconceptions and discusses the results in the context of caveats that need to be included in descriptions of the MTA results. Thus, both the positive contributions that the study makes to the field and its limitations are discussed.

Primary Questions of the Study and Preliminary Results

The MTA addressed 3 primary questions: 1) What are the relative efficacies of behavioural and pharmacological treatments for ADHD? 2) What is the incremental benefit of combining these treatments over either alone? and 3) How do these evidence-based treatments compare with treatments routinely given in the community? Several of the primary analyses addressing these questions have been completed, with others in progress. Many secondary analyses are underway. As in any study, the MTA’s ability to answer these questions is limited by the nature of the specific design and treatment components employed.

The first 2 peer-reviewed reports of the study results have been accepted by the Archives of General Psychiatry, and they will likely appear this year (6,7). The first paper is an intent-to-treat analysis that addresses the primary questions for a selection of 19 dependent measures (for example, parent or teacher symptom ratings, parent–child relationships, social skills, academic achievement). An intent-to-treat analysis includes all subjects regardless of their status in the study. Thus, subjects who did not accept their assigned treatment are included, with their last available data point used. This is a conservative analysis that ignores questions regarding degree of participation in treatment, child and family characteristics, and other similar factors. The second paper examines a limited number of moderators and mediators (for example, child comorbidity, parent socioeconomic status [SES], acceptance of treatment) of the treatment effects reported in the first paper. It is also an intent-to-treat analysis that includes data from all subjects.
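As a rough illustration of the intent-to-treat logic described above, the following Python sketch keeps every randomized subject in the arm to which he or she was assigned and uses the last available data point as the endpoint. The records, scores, and function names are hypothetical and are not drawn from the MTA data set; the published analyses may have used different procedures.

```python
from collections import defaultdict
from typing import Optional, Sequence


def last_available(scores: Sequence[Optional[float]]) -> Optional[float]:
    """Return the most recent non-missing score, or None if nothing was recorded."""
    for value in reversed(scores):
        if value is not None:
            return value
    return None


def intent_to_treat_means(records):
    """Mean endpoint score per ASSIGNED arm, whatever treatment was actually received."""
    by_arm = defaultdict(list)
    for assigned_arm, scores in records:
        endpoint = last_available(scores)
        if endpoint is not None:
            by_arm[assigned_arm].append(endpoint)
    return {arm: sum(vals) / len(vals) for arm, vals in by_arm.items()}


# Hypothetical symptom ratings at baseline, 9 months, and 14 months (None = missing).
records = [
    ("MM", [2.0, 1.2, 1.0]),
    ("MM", [2.2, 1.4, None]),   # left before 14 months: the 9-month score is used
    ("BT", [2.1, 1.5, 1.4]),
    ("BT", [2.3, None, None]),  # never accepted treatment: the baseline score is used
]
itt_means = intent_to_treat_means(records)
print(itt_means)
```

Because subjects who never improved under treatment they did not receive still count toward their assigned arm, this kind of analysis tends to understate the effect of treatment as delivered, which is the sense in which it is conservative.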

In brief, the major findings at the 14-month endpoint (follow-up data are not yet analyzed) reported in those 2 papers are as follows:

With a few exceptions, these effects were typically not modified by the mediators or moderators examined. The few cases in which the results were influenced by mediators or moderators included the following:

Many other sets of analyses are being conducted or planned for the study. Some of these papers will repeat the intent-to-treat analyses on major dependent measures that have not yet been scored (for example, observed parent–child interactions, peer nominations), others will focus on secondary analyses that examine important questions planned in the original analytic plan (for example, effect of parental psychopathology on child’s treatment response and effect of child’s treatment on parental psychopathology, effect of treatment adherence and compliance on outcome), and still others will focus on follow-up at 24 months (10 months after treatment termination). This data set is expected to produce 100 papers in the next 5 to 10 years.

Design Issues

Positive and Unique Features

As with any study, the results of the MTA need to be understood within the context of the study design. One of the most distinctive and positive aspects of the study relative to the previous literature is its randomized clinical-trial design and large sample (1). Although large numbers of children have been studied in total in both pharmacological and psychosocial studies, previous studies of ADHD treatment have generally been small, with 1 to 20 subjects in each condition (8,9). With 144 subjects in each group, the MTA has considerably more power to detect group differences than these other studies. Further, these group sizes lend themselves to subgroup analyses (for example, the interactive influences of parental psychopathology and child comorbidity on treatment outcome) that have long been called for but often not practised in the field because study samples have not been sufficiently large. Among the first findings of the study is that treatment effects did not interact with child externalizing or internalizing comorbidity. ADHD children who also had oppositional defiant disorder or anxiety nonetheless responded as well as noncomorbid children to stimulant medication. This result confirmed what previous studies have reported regarding comorbid aggressive and oppositional disorders. However, it differs somewhat from what has been widely believed and reported in smaller preliminary studies regarding comorbid anxiety disorders. Thus, the large sample size in the MTA facilitated a resolution of concerns regarding stimulant effects on comorbid internalizing and externalizing difficulties in ADHD: these comorbidities are unimportant with respect to stimulant-medication response in ADHD.
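The gain in statistical power from the larger groups can be sketched with a normal approximation for a two-sided, two-group comparison of means. The standardized effect size of 0.4 used below is purely an assumption chosen for illustration, as are the function and variable names; it is not an estimate taken from the MTA.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group: int, effect_size: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group comparison of means.

    Uses the normal approximation with equal group sizes and a standardized
    effect size (difference in means divided by the common standard deviation).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the test statistic falls in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


ASSUMED_EFFECT_SIZE = 0.4  # hypothetical moderate effect, for illustration only

power_small = approx_power(20, ASSUMED_EFFECT_SIZE)   # upper end of earlier studies
power_mta = approx_power(144, ASSUMED_EFFECT_SIZE)    # group size reported for the MTA
print(f"n=20 per group:  power ~ {power_small:.2f}")
print(f"n=144 per group: power ~ {power_mta:.2f}")
```

Under this assumed effect size, a study with 20 subjects per group would miss a true difference most of the time, whereas groups of 144 would detect it in the large majority of trials.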

A second important aspect of the study design is that treatment was conducted for 14 months—considerably longer than most controlled studies of psychosocial and behavioural treatments and of stimulants (8,9) and more ecologically valid for a chronic disorder than studies lasting days, weeks, or a few months. Thus, the MTA results, showing improvement over time for all treatment groups, provide evidence that the findings of previous short-term studies of behavioural and pharmacological treatments apply for at least 14 months. This result will not surprise clinicians with regard to medication—response to medication remains fairly consistent within a typical school year (even though dose often must be increased)—but it is the first controlled demonstration of this common clinical observation.

A third important aspect of the study design is the comprehensive nature of the assessments conducted (3). Rather than simple outcome measures that characterize many studies in child psychiatry (for example, clinician judgement of improvement), the MTA study selected or devised measures from multiple sources (parents, teachers, child, peers, and objective tests and observations) in multiple domains of functioning (ADHD symptoms, comorbid symptoms, peer relationships, classroom behaviour, academic achievement, and parent–child relationships). The battery used to evaluate subjects at baseline and assess treatment response is far more extensive than any employed in treatment-outcome studies in child psychiatry or psychology. Because the battery is so comprehensive, it will enable examination of an unusually large number of predictors of response to treatment and outcome.

Fourth, the study’s interventions were comprehensive and innovative, incorporating the latest treatment advances in ADHD. Thus the MM condition employed a systematic, blind, randomized, school-based medication-titration trial (2) to determine the initial dose that children would receive in the study (see 10). In addition, the MM group adopted procedures to obtain systematic information on both main effects and side effects not only from parents, as is typically the case in clinical practice, but also from teachers. Given that the primary reason for medicating most ADHD children is school behaviour or performance, the use of systematic teacher information in titrating medication dose both initially and throughout treatment is an important advance of this study.

Similarly, the BT arm of the study employed treatment components that are cutting-edge interventions in child psychology and psychiatry. For example, for parent training, the best available information was selected from several key sources and integrated into a comprehensive protocol that extended traditional parent training to sessions on peer relationships and coping with stress as well as teaching parents how to interface with their children’s schools (5). The Summer Treatment Program (STP) integrated several evidence-based interventions for ADHD into an intensive summer-camp school experience that was itself integrated with parent training (11). Finally, the school intervention combined an intensive, short-term classroom aide to jump-start the classroom intervention in the fall with an ongoing teacher consultation in which teachers were taught classroom management strategies.

Another important contribution to the field of childhood treatment research concerns the extensive manualization of treatment protocols and the numerous measures of treatment adherence and fidelity that were constructed for, and employed in, the MTA. The advantage of having manualized treatments in psychotherapy research has been well documented (12), as has the need for measuring the fidelity with which treatment is delivered (13). More than 1000 pages of treatment manuals were developed that describe in detail the treatment procedures to be followed in every arm of the study. Never before in child psychotherapeutic or psychopharmacologic outcome research have treatment procedures been so extensively manualized. Each of these manuals deals with measuring the fidelity with which the treatment component was delivered by therapists relative to the described treatment procedures. In addition, clinicians regularly evaluated (by both ratings and objective measures) the degree to which parents and teachers implemented the interventions that they had been taught in the BT conditions, and both therapists and parents rated the therapist–parent relationships. Further, parent and teacher adherence to and implementation of the BTs prescribed varied considerably, but the low dropout rate allows us to assess the effects of treatment implementation on outcome, an area that is sorely lacking in existing studies of behavioural treatments. As a result of the extensive emphasis on manualization and fidelity, and in combination with the large sample size, the study will produce an immense data set. Multiple questions can be asked about such issues as the relationship between therapist fidelity to the manual and patient outcome, or between outcome and parent or teacher compliance with implementation, or between the parent–therapist relationship and treatment adherence, for both pharmacological and behavioural treatments. 
Such issues have been addressed on only a very small scale in the extant literature.

Finally, an unusual and exemplary aspect of the study is that data at 14 months are available for a very high percentage of subjects. Dropout from the study was very low, with only 18 subjects completely refusing to participate after they were randomly assigned to treatment. Only a small percentage of subjects assigned to medication treatment failed to participate in the treatment, and the vast majority of those assigned to BT participated in at least part of the treatment. Thus, the questions that can be addressed will not be limited by an absence of data, which is often the case in treatment studies shorter than 14 months.

Design Limitations

In addition to the positive aspects of the MTA design, some limitations need to be discussed.

Timing of Assessments Relative to Treatment Intensity

The 2 major treatment modalities—behavioural and pharmacological—were assessed at different time points relative to the intensive phase of treatment. Specifically, the effects of the pharmacological treatments were assessed at posttreatment while subjects were actively medicated; in contrast, the effects of BT were assessed following fading of therapist involvement. The intensive period of the BT ended in late December or early January, and endpoint measures were typically taken 4 to 6 months later—usually several months after the last planned, face-to-face, therapeutic contact. Thus, the endpoint MTA treatment comparison was for active MM treatment versus withdrawn BT. Whenever MM, combined, or community-medicated outcomes were differentially positive compared with BT alone, or when MM and combined outcomes did not differ, this outcome must be interpreted accordingly. It could be argued that such outcomes were predetermined, given the study design.

This design aspect has numerous implications for interpretation of the findings. For example, we cannot state that the medication (methylphenidate [MPH] for the vast majority of subjects) had long-term effects. Rather, the results simply demonstrate that effects of MPH given steadily for 14 months are the same at the end of that time as at the beginning (indeed, the correlations between drug effects at these 2 points of the study are very high). One cannot answer whether acute medication treatment has long-term effects after the medication is discontinued—in effect, the question that was examined for BT.

Had the study been designed so that medication had been faded while intensive BTs had continued, the results may very likely have been reversed, with BT being superior to MM and community treatment. When the SC was initially considering this aspect of the design, a majority strongly believed that medication should be used chronically with ADHD children and never withdrawn, so assessing the effects of MM with medication discontinued was pointless. These investigators also argued that, since it was well known that the effects of medication would disappear immediately upon discontinuation, the question was neither important nor interesting for either clinical or research purposes. Others believed that without withdrawing medication, no conclusion could be drawn about the possible cumulative effect of medication—such as examining the acute effects of medication on test-taking performance or behaviour versus a possible cumulative effect on academic achievement.

Further, without withdrawing medication, no conclusion could be drawn about whether children in the BT or combined conditions would have an advantage over children in the MM group had medication been withdrawn. Since the vast majority of subjects who take medication discontinue it in less than 1 year (14), that is an important question for most parents of ADHD children. In my experience, most parents would like their ADHD children to take medication to enhance their responsiveness to educational and psychological interventions, but they typically do not want their children to take medication for their entire lives. Two studies that have examined the maintenance of treatment effects following stimulant-medication withdrawal have shown that the effects of the behavioural component remain when medication is ended, highlighting the advantage of combined treatment when or if medication is withdrawn (15,16). It is unfortunate that the MTA, with its large sample size and relatively intensive behavioural treatment, did not address this question.

The fact that endpoint measures were taken while MM was active but BT had been faded makes all the more impressive the absence of a difference on 16 of 19 measures when MM was compared with BT and on 19 of 19 measures when community treatment (mostly medicated children) was compared with BT. Despite being faded, BT was as effective as medication as provided in the community and almost as effective as active medication as administered in the study. For a parent of an ADHD child, BT represents a valid, clear alternative to medication (8). At the same time, combined treatment was superior to BT, presumably reflecting the acute effects of medication. This finding reflects the same outcome as previously reported—active medication typically adds considerably to a baseline of BT (16,17).

What effect does this design aspect have on the comparison of the combined and MM groups? Across all measures in the study, the combined group was moderately superior to the MM, although usually not significantly so. This outcome was true at the group level of analysis, across the 19 dependent measures in the main intent-to-treat paper, as well as on analyses of excellent responders to treatment (18) and on a composite measure of treatment (19). Previous studies of combined treatment (for example, 16,17,20,21) have assessed outcome when both medication and BT have been active, perhaps therefore yielding a larger effect of combined treatment than was obtained in the MTA. Additionally, the large variation in parental and teacher adherence to BT meant that some parents and teachers actively continued BT after the contact ended whereas others did not (22). When differences in outcome between these groups are analyzed, it is likely that combined treatment for children whose parents and teachers continued the behavioural interventions they had been taught will have an outcome superior to MM, while combined treatment for those whose parents and teachers did not continue BT will be equivalent to MM alone (which would not be surprising, as functionally that would be what they were receiving).

To illustrate the impact of this design facet (active MM versus faded BT at endpoint) on the differential outcomes in the study, consider the results of a substudy with the BT and combined-treatment subjects from 3 of the 6 sites compared during the intensive STP phase of BT (23). In contrast to the results at posttreatment, there were no significant differences between the combined and BT groups on 82 of 87 measures of behaviour and academic performance across classroom, recreational, and home settings during the STP. The BT implemented in the STP was likely so effective that there was relatively little room for improvement for the additional treatment (MM). When the baseline treatment is active and very effective, typically little incremental benefit is derived from adding a second treatment in combined-treatment studies when treatment endpoint rather than treatment maintenance or withdrawal is being considered.

Sequencing of Behavioural and Pharmacological Treatments

A second aspect of the study design that may have influenced outcome is the sequencing of BT and MM components of combined treatment. MM and BT were started simultaneously for children in the combined group, so the initial medication-titration trial was conducted before behavioural interventions had been systematically implemented. In contrast to this approach, during the design phase of the study, some of the principal investigators argued that the initiation of medication should follow the initiation of BT in the combined group. They argued this point based on previous research showing that maintenance doses of medication would likely be lower—perhaps as much as a 50% reduction—in combined-treatment subjects if the BT was begun prior to medication titration (see 24 for a review). Further, it was argued that lower medication maintenance doses in a combined-treatment group would be among the main positive outcomes of the combined condition in the study. However, because of the simultaneous titration and behavioural intervention—only a few parent- or teacher-training sessions had been conducted prior to the end of medication titration—the titration trials for the combined and MM groups were comparable. Therefore the starting maintenance doses of medication were equivalent for the 2 groups of subjects, even though the combined group arguably needed less medication. Further, a study rule for the treatment team was that medication dosage could not be reduced for children in combined (or MM) treatment, even if their pharmacotherapist, behavioural therapist, parent, or teacher thought they could do just as well on a reduced dose of medication. A majority of the MTA SC believed that highest-tolerable doses rather than minimal-effective doses should be used for ADHD children. Medication dose could therefore only be reduced for side effects. 
In contrast to these design decisions, which minimized the likelihood that combined-treatment children would receive lower medication doses, in case of deterioration of functioning (assessed in monthly medication visits), the BT team had 3 weeks to alter the behavioural intervention and attempt to improve functioning before medication could be increased.

Interestingly, despite the rule prohibiting dose reduction, the combined-treatment children ended the study on 20% lower doses at endpoint than did the MM group (22). Although both groups started at the same point, the combined group did not have doses increased over the 14 months of treatment, while the MM group had doses increased by 20%—all due to deterioration of functioning at monthly checks. The difference held for both absolute and weight-adjusted dosing. Because the slopes of the increase in dosage over time were substantially steeper for the MM group than for the combined group, the best prediction is that medication would likely increase by 20% yearly for children being treated with medication alone versus those being treated with combined interventions. This difference is a minimal estimate, given that the medication dose could not be lowered for the combined group. At 1 site, the MM group took 50% more medication than the combined group, with similar outcomes. Over time, these patterns would result in considerably higher daily doses of stimulant medication for children treated only with medication. For parents and physicians who believe that relatively lower doses of stimulant medication are preferable to higher doses if symptoms and impairment can be managed equally well, this finding has considerable public health importance.

Interestingly, the MTA SC decided not to have the combined group wait for medication until after BT had begun because, if indeed the combined group ended up taking less medication than the MM group, the failure to find differences between combined and MM treatment could be blamed on difference in medication doses. This has, in fact, occurred. Given that medication causes linear improvement in parent and teacher ratings, it is certainly arguable, if not likely, that the combined group would have been rated as 20% better by parents and teachers relative to the MM group had the combined-treatment children been taking the same dose of medication as the MM children rather than 20% less.

The decision not to lag medication behind BT for the combined group also may have influenced treatment outcome by possibly reducing motivation and consequently the effort that parents and teachers exert in BT when a child is medicated. We have long argued that this may be the case when children are treated with medication before behavioural intervention (25). No literature has addressed this question, however, and because of the extensive data on treatment adherence and fidelity, the MTA is in a unique position to examine this issue. Analyses of compliance (for example, adherence and implementation) to BT in the MTA are currently underway, so we do not know the effect of the timing of medication relative to BT on compliance with BT. At the same time, the MTA has one outcome suggesting that prior medication may reduce parental and perhaps clinician effort in BT. Approximately 25% of the subjects assigned to the behavioural-only treatment crossed over to have stimulant medication added to their BT. This addition was made as a clinical decision by the treatment team or by parent choice and occurred across all sites in the study. The probability of a child’s crossing over to medication was 50% if he or she had been medicated (typically with MPH) prior to the study, while it was only 15% if he or she had no previous medication. Therefore, clinicians should be aware that when medication is used as the first-line treatment with ADHD, presumably successfully for acute symptom remission, a substantial portion of parents is likely to revert to reliance on medication during the course of subsequent behavioural treatment. Whether this effect occurred in our combined-treatment group is currently being examined.
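The crossover figures above can be related to one another with a simple weighted average. The sketch below assumes that about one-third of the BT group had been medicated before the study, which is the proportion reported elsewhere in this article for the MTA sample as a whole; applying it to the BT group specifically is an assumption, and the variable names are hypothetical.

```python
# Crossover probabilities quoted in the text.
P_CROSSOVER_IF_PREVIOUSLY_MEDICATED = 0.50
P_CROSSOVER_IF_NOT_PREVIOUSLY_MEDICATED = 0.15

# Assumption: the BT group mirrors the whole sample, in which about
# one-third of subjects had been medicated before the study.
ASSUMED_SHARE_PREVIOUSLY_MEDICATED = 1 / 3


def overall_crossover_rate(share_previously_medicated: float) -> float:
    """Weighted average of the two conditional crossover probabilities."""
    return (
        share_previously_medicated * P_CROSSOVER_IF_PREVIOUSLY_MEDICATED
        + (1 - share_previously_medicated) * P_CROSSOVER_IF_NOT_PREVIOUSLY_MEDICATED
    )


expected_overall = overall_crossover_rate(ASSUMED_SHARE_PREVIOUSLY_MEDICATED)
print(f"Expected overall crossover rate: {expected_overall:.1%}")
```

Under that assumption the expected overall rate is a little under 27%, which is in line with the approximately 25% reported for the behavioural-only group.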

It has long been suggested that issues such as sequencing of treatments, intensity of the baseline treatment against which combined treatment is contrasted, and treatment withdrawal or fading affect outcomes in studies of combined interventions, but only a handful of studies have addressed these issues. The MTA does not address these issues and employed a design that may have minimized the value of combining medication and behavioural interventions. Studies that investigate these facets of combining psychosocial and pharmacological interventions—not only for ADHD but also for other disorders—are the next frontier of research in child psychiatry and psychology.

Intensity of the Behavioural and Pharmacological Treatments

Another limitation in the MTA design is that only a single level of BT—intensive—was investigated. Similarly, children were titrated to the maximum tolerable dose of medication—optimal dosing, as labelled by the SC (2). Similar results, particularly regarding combined treatment, might have been obtained with less intensive and less expensive behavioural interventions and with lower doses of medication.

In its deliberations during the design phase of the study, the SC decided that the BT had to be comprehensive and intensive to have a fair chance against medication with its large acute effects. While this was a unique and positive aspect of the MTA, it is also a limitation: the possibility that the combination of less intensive behavioural treatments and lower doses of stimulants might yield the same outcome as higher doses of medication alone was not tested. That outcome has been reported in several previous studies of combined treatment (20,21). For example, parent training plus a simple home-school daily report card (DRC) is arguably the most essential component of BT for ADHD and also the most cost efficient. Perhaps a routine course of parent training plus a DRC as a BT, provided along with low dosages of a stimulant, might be sufficient for good long-term outcome for many ADHD children—for example, those without comorbid oppositional disorder, aggression, and peer problems. The more comprehensive and expensive MTA behavioural approach that includes expanded and lengthened parent training, school consultation with a classroom aide, and a summer or Saturday treatment program could be withheld until the effectiveness of a less intensive combined intervention were established for a given child. Considerable cost savings would result from this approach to psychosocial treatment.

Similarly, relatively high doses of medication were employed in the MM group: a mean of 38 mg daily MPH equivalent, which is a relatively high daily dose for young elementary school children and 65% higher than the daily doses prescribed for the children in the community group. Previous studies have failed to find that BT adds to relatively high doses of medication, even though it provides incremental benefit to lower doses (20). For example, had the children in the community-treatment group, with a mean dose of 23 mg MPH daily, received a systematic concurrent behavioural intervention, they might have had an outcome that was equivalent to the MM group with about 40% less medication. Such questions regarding the advantages and relative efficacies of less intensive treatments and their combinations were not addressed in the MTA. Because they hold the promise of less expensive psychosocial treatment and dramatically reduced and presumably safer doses of medication, such issues need to be examined in future research.
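The relative difference between the two mean doses quoted above depends on which dose is taken as the base, and the short sketch below makes the two calculations explicit. Only the 38 mg and 23 mg means come from the text; the function and variable names are illustrative.

```python
def percent_difference(value: float, base: float) -> float:
    """Signed difference of `value` from `base`, expressed as a percentage of `base`."""
    return (value - base) / base * 100


MM_MEAN_DOSE_MG = 38.0         # mean daily MPH-equivalent dose, MM group
COMMUNITY_MEAN_DOSE_MG = 23.0  # mean daily MPH-equivalent dose, community group

# How much higher the MM dose is, relative to the community dose.
mm_vs_community = percent_difference(MM_MEAN_DOSE_MG, COMMUNITY_MEAN_DOSE_MG)
# How much lower the community dose is, relative to the MM dose.
community_vs_mm = percent_difference(COMMUNITY_MEAN_DOSE_MG, MM_MEAN_DOSE_MG)

print(f"MM dose relative to community dose: {mm_vs_community:+.1f}%")
print(f"Community dose relative to MM dose: {community_vs_mm:+.1f}%")
```

The same 15 mg gap is therefore about 65% of the community dose but only about 39% of the MM dose.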

Other Treatments Obtained in the Community Comparison Group

An unexpected aspect of the MTA design that has had a major impact on the analytic plan and results is that large portions of the community group received treatment during the 14 months of the study. For example, nearly 70% of the community group received stimulant medication from their community physicians. The SC was concerned at the outset that if a large proportion of children in that group received treatment, it might be difficult to demonstrate effects of the MTA treatments. However, the SC was convinced by the extant literature on service use that the majority of the community group would not receive systematic treatments from professionals for ADHD during the study period. This was true with respect to effective psychosocial treatment services, which were used at a very low rate by the families. However, prior to our study, no data existed on the rate of new patients receiving stimulant medication in a given year. One-third of the subjects beginning the MTA study had been previously medicated; presumably all or nearly all of them immediately returned to their prestudy medications when they were assigned to the community group. Thus, likely one-half of the remaining community children were medicated during the year following baseline assessment, such that nearly 70% of the community group in total were medicated with a stimulant, at an average dose of 23 mg daily (MPH equivalent). The fact that the BT group was not different from the community group at endpoint is therefore particularly impressive, showing that BT alone is equivalent to active medication provided in the community.
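The estimate that nearly 70% of the community group was medicated follows from the two proportions given above, as this minimal sketch shows. It treats "one-third" and "one-half" as exact fractions, which is a simplification, and the variable names are illustrative.

```python
from fractions import Fraction

# Proportions stated in the text, treated as exact for illustration.
previously_medicated = Fraction(1, 3)      # resumed prestudy medication
newly_medicated_of_rest = Fraction(1, 2)   # share of the remaining children medicated

total_medicated = previously_medicated + (1 - previously_medicated) * newly_medicated_of_rest
print(f"Community group medicated: {total_medicated} (~{float(total_medicated):.0%})")
```

Two-thirds of the group, or roughly 67%, is consistent with the figure of nearly 70% cited in the text.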

A further complication in the community group is that assessments prior to treatment revealed that the use of behavioural interventions by parents and teachers, even in the absence of documented professional involvement, was quite high. For example, at the Pittsburgh site of the study, more than 90% of the parents and teachers reported regularly using behavioural techniques with the referred children prior to study implementation. Therefore, children in the community comparison group were receiving BT even though it was not provided by the study or through ongoing, systematic contacts with professionals.

In other words, the community group that was originally thought of as a “no treatment” control group by the SC received as much treatment as provided in the MTA. This may have contributed to the finding that the community group showed considerable improvement over the 14 months of the study, being surpassed at endpoint only by the actively medicated MTA subjects. This has led to widespread misinterpretation of the MTA results, with many media and professionals concluding that the BT did not work, unaware that the community group was actually treated with both stimulant medication and parent- and teacher-implemented BT. In fact, the data from the Pittsburgh site also show that children in the MM group were receiving behavioural interventions from their parents. So possibly 3 groups—combined, MM, and community—were actually receiving—more or less systematically—the combination of psychostimulant and behavioural treatments. Only the MTA BT group, with only 25% of subjects having crossed over to medication, may have predominantly received a unimodal treatment!

Fading of the Behavioural Treatment

A final design limitation of the study concerns the fading of BT after the intensive treatment phase. For many years (for example, 26), maintenance of treatment effects has been a major focus of BT studies. For at least a decade, many prominent adherents of BT have argued that BT for chronic disorders such as ADHD and other disruptive behaviour disorders may need to be maintained in some form or another for many years before it can be faded (24). For fiscal reasons, the SC reduced the planned study from 2 years in length to 14 months, with the last 4 to 5 months of BT being dramatically truncated. To maintain what were expected to be large gains accruing from the intensive phase of BT, the SC designed several strategies to offset the shorter time period. These consisted primarily of having parents maintain the school intervention by having regular, scripted meetings with the child’s teacher and then having monthly parent-training meetings to discuss their interactions with the children’s teachers and the maintenance of behavioural programs at home. Because a great deal of attention was paid to ensuring that treatments were delivered in the study, attendance at parent training, school intervention, and STP sessions of the MTA was quite good during the active phase of BT. The focus on variability in compliance with BT in the study therefore needed to be on attendance at parent-training and parent–teacher meetings during the maintenance phase and on implementation of the techniques taught to parents and teachers after therapist contact had been faded.

Perhaps not surprisingly, parent and teacher compliance with the plan to maintain the school intervention for their children varied considerably (22). For example, one-quarter of the parents never scheduled any of their meetings with their child’s teacher, while one-quarter attended all of the scheduled meetings. One-quarter never attended any of the 4 maintenance parent-training sessions, while one-third attended them all. For 45% of the sample, teachers continued the DRC that had been established by the behavioural clinicians and parents provided rewards at least 75% of the time. However, only one-quarter of the parents and teachers continued the DRCs almost all the time throughout the study. If we assume that attendance at these sessions and parent and teacher DRC-implementation represent their use of the behavioural techniques that they had been taught, then clearly our plan to fade therapist involvement and turn things over to the parents was not nearly as successful as we would have wished. In an ideal study with a flexible algorithm, evaluations of parent and teacher maintenance would be regularly conducted, with a return to therapist contact or other maintenance strategies when lack of maintenance is detected.

Analyses that examine possible outcomes for children whose parents and teachers did and did not continue DRCs are underway, as are analyses that examine the characteristics of those parents and children in families where BT was differentially implemented. Because of the large sample and the extensive data available from the study, these analyses can be expected to contribute valuable information to the field that can inform future studies on the maintenance of behavioural treatment effects.

Summary

The MTA has numerous design limitations that must be acknowledged to fully understand the results as they have been reported to date in the context of the existing literature. The results show: 1) active medication for ADHD is better than withdrawn BT (on some but not most measures); 2) combined treatment adds modestly to active medication but is superior to behaviour management alone; 3) study treatments that include active medication are better than community treatments that include medication, while BT is comparable to medication as delivered in the community; and 4) concurrent BT results in at least 20% lower and nonincreasing medication dosages relative to treatment with medication alone. The major caveats to note are that these results are contingent on such design issues as timing of assessments (active versus withdrawn treatment), sequencing of treatments, intensity of treatments that are combined, and the baseline against which treatments are being compared. The MTA addressed only 1 aspect of this combination of factors, leaving many questions unanswered.

Misinterpretation of the MTA Results

The results of the MTA study have been widely, prematurely, and inaccurately reported and described. Consider, for example, the headlines using the following phrases on the front-page stories of various International Medical News Group publications (December 1998), widely distributed and read medical newspapers (combined circulation 145 000) for physicians, and appearing on a leading medical Internet site (Medscape.com): “Medication makes the differences in ADHD kids,” “Monitoring medication is the key,” “Psychosocial interventions of no benefit even when used with medication.” The article was written by a medical reporter who attended a symposium at the annual meeting of the American Academy of Child and Adolescent Psychiatry in Anaheim, California, in October 1998, at which preliminary results of the MTA study were presented by the investigators involved in the study. The outcomes of the study were also presented at the NIH Consensus Development Conference on ADHD in late 1998 (27) and are described briefly in the Consensus statement resulting from that conference. These presentations indicated that medication resulted in dramatic improvement in treated subjects, particularly compared with BT, and that BT was of little value compared with medication and had little to no additive value to medication. These findings have also been cited by the MTA’s principal investigators to demonstrate that combined treatment has no value over medication alone (28). Finally, the first major paper of the study (6) concludes that, if “carefully monitored medication” is provided as the first-line treatment for ADHD, “our results suggest that many treated children may not require intensive behavioural interventions.”

Despite BT clearly being efficacious in the MTA, the emphasis in these media articles and papers is on the relative superiority of medication, with all of these sources concluding: 1) the pharmacological interventions were more effective than BT; 2) adding BT to medication resulted in little or no additional benefit; and 3) BT alone was no different from various nonstandardized treatments from community providers.

The consistent message of these articles and presentations has been that only medications are needed to treat ADHD. In fact, the study results are quite complex, are much too extensive to describe in 1 article, and, without caveats, do not lend themselves to the straightforward interpretations that have been argued in these presentations. Eventually, all of the findings will be published in various outlets, but that will not occur for several years, and results will be widely distributed across multiple outlets rather than easily accessible as a single body of work.

Therefore it is worth emphasizing several points regarding the study outcome. First, BT had an effect size improvement from baseline to endpoint across all measures of 0.9 to 1.3, which is very large. BT differed from MM on only 3 of 19 dependent measures and was no different from community treatment as a group on any of 19 measures (and also was not different from the medicated subgroup in the community group). All of these outcomes occurred despite the withdrawal of BT and the continuation of medication. Further, 75% of the children in the BT condition were maintained without medication for 14 months, including one-half of those who were medicated at study entry! Therefore, rather than not working at all, BT, even when faded, worked as well as ongoing stimulant medication provided in the community and nearly as well as active medication provided in the MTA on many measures of functioning.

Second, parents significantly preferred the behavioural and combined treatments over medication alone. To the extent that parent preference influences engagement in and continued use of treatments, this may be a very important advantage of BT when contrasted with or added to medication. Many parents prefer not to have their children treated with medication. The MTA results demonstrate that such parents can expect major improvement in their children if they engage in intensive BT that includes parent training, school interventions, and a summer program.

Third, analyses of “excellent responders” to treatment in the MTA (18) have shown that parents and teachers were more likely to rate children as “normalized” if they were receiving combined treatment rather than medication or BT alone. This finding is consistent with the previous literature on normalization of functioning in ADHD children in combined versus unimodal treatments (15).

Fourth, the comparative equivalence on many measures between the combined group and the medication-alone group was obtained despite the fact that the combined group was receiving 20% less medication at study end. Despite study algorithms that minimized the likelihood of a difference between combined treatment and MM in dose, the difference occurred because dose did not increase over 14 months of treatment in the combined group, while it increased 20% in the MM group. Given that we know little about the long-term sequelae of stimulant drugs (29), prudent physicians will certainly want to minimize the total dose of medication to which they expose their patients. The medication dose likely could have been much lower in the combined group and yielded an outcome equivalent to that of medication alone. On that basis alone, combined pharmacological and behavioural treatments should be recommended as the treatment of choice for ADHD.

Finally, consider the well-known fact that the effects of stimulant medication, though clearly beneficial in the short term (for example, 30,31), do not last beyond medication termination. Most ADHD individuals stop taking stimulant medication during childhood or adolescence. The only condition under which experts in ADHD treatment should be recommending medication alone (as opposed to combined treatments) as sufficient is when they are not concerned with their patient’s long-term outcome, which will be unaffected by medication. BT may have a lasting beneficial effect after its withdrawal, as was the case in the MTA.

The MTA study is an important benchmark for mental health in children. It has produced unique methodological and procedural advances for measuring and implementing response to pharmacological and psychosocial treatments, including manualization of treatments, algorithms for clinical decision-making, and measures of treatment fidelity and adherence. Its results will impact the field for years to come, and it is critical that they be completely presented and understood. They show clearly beneficial short-term effects for stimulant medication, as administered in the study and in the community, BT as delivered in the study, and combined treatments as delivered in the study. Despite the widely reported misinterpretation, its results say more than “yes” to drugs alone.


Clinical Implications

Limitations

Acknowledgements

Dr Pelham was supported by grants from the National Institute of Mental Health (MH50467 and MH53554) and the National Institute on Alcohol Abuse and Alcoholism (AA11873).

References

1. Arnold LE, Abikoff HB, Cantwell DP, Conners CK, Elliott G, Greenhill LL, and others. National Institute of Mental Health collaborative multimodal treatment study of children with ADHD (MTA): design challenges and choices. Arch Gen Psychiatry 1997;54:865–70.

2. Greenhill LL, Abikoff HB, Arnold LE, Cantwell DP, Conners CK, Elliott G, and others. Medication treatment strategies in the MTA study: relevance to clinicians and researchers. J Am Acad Child Adolesc Psychiatry 1996;35:1304–13.

3. Hinshaw SP, March J, Abikoff HB, Arnold LE, Cantwell DP, Conners CK, and others. Comprehensive assessment of childhood attention-deficit hyperactivity disorder in the context of a multisite, multimodal clinical trial. Journal of Attention Disorders 1997;1:217–34.

4. Richters JE, Arnold LE, Jensen PS, Abikoff H, Conners K, Greenhill LL, and others. NIMH collaborative multisite multimodal treatment study of children with ADHD: I. Background and rationale. J Am Acad Child Adolesc Psychiatry 1995;34:987–1000.

5. Wells KC, Pelham WE, Kotkin RA, Swanson JM, Abikoff HB, Arnold LE, and others. NIMH collaborative multimodal treatment study of children with ADHD (MTA): psychosocial treatments. J Abnorm Child Psychol. Forthcoming.

6. MTA Cooperative Group. 14-month randomized clinical trial of treatment strategies for attention deficit hyperactivity disorder. Arch Gen Psychiatry. Forthcoming.

7. MTA Cooperative Group. Effects of comorbid anxiety disorder, family poverty, session attendance, and community medication on treatment outcome for attention-deficit hyperactivity disorder. Arch Gen Psychiatry. Forthcoming.

8. Pelham WE, Wheeler T, Chronis AM. Empirically supported psychosocial treatments for ADHD. J Clin Child Psychol 1998;27:189–204.

9. Swanson JM, McBurnett K, Christian DL, Wigal T. Stimulant medications and the treatment of children with ADHD. In: Ollendick TH, Prinz RJ, editors. Advances in Clinical Child Psychology. New York: Plenum Press; 1995. p 265–322.

10. Pelham WE. Pharmacological treatment for children with attention-deficit hyperactivity disorder. School Psychology Review 1993;22:199–227.

11. Pelham WE, Hoza B. Comprehensive treatment for ADHD: intensive summer treatment programs and follow-up. In: Hibbs ED, Jensen PS, editors. Psychosocial treatments for child and adolescent disorders. Washington (DC): American Psychological Association; 1996. p 311–40.

12. Wilson GT. Manual-based treatments: the clinical application of research findings. Behav Res Ther 1996;34:295–314.

13. Waltz J, Addis ME, Koerner K, Jacobson NS. Testing the integrity of a psychotherapy protocol: assessment of adherence and competence. J Consult Clin Psychol 1993;61:620–30.

14. Sherman M, Hertzig ME. Prescribing practices of Ritalin: the Suffolk County, New York study. In: Greenhill LL, Osman BB, editors. Ritalin: theory and patient management. New York: Mary Ann Liebert; 1991. p 187–93.

15. Klein RG, Abikoff H. Behavior therapy and methylphenidate in the treatment of children with ADHD. Journal of Attention Disorders 1997;2:89–114.

16. Pelham WE, Schnedler RW, Bender ME, Miller J, Nilsson D, Budrow M, and others. The combination of behavior therapy and methylphenidate in the treatment of hyperactivity: a therapy outcome study. In: Bloomingdale LL, editor. Attention deficit disorders. Vol 3. London: Pergamon; 1988. p 29–48.

17. Pelham WE, Carlson C, Sams SE, Vallano G, Dixon MJ, Hoza B. Separate and combined effects of methylphenidate and behavior modification on boys with attention deficit-hyperactivity disorder in the classroom. J Consult Clin Psychol 1993;61:506–15.

18. Swanson J, MTA Cooperative Group. Qualitative outcome in the MTA study based on severity of ADHD and ODD symptoms: non-parametric analyses of excellent response at the end of treatment. J Am Acad Child Adolesc Psychiatry. Forthcoming.

19. Conners CK, MTA Cooperative Group. Multimodal treatment of ADHD (MTA): an alternative outcome analysis. J Am Acad Child Adolesc Psychiatry. Forthcoming.

20. Carlson CL, Pelham WE, Milich R, Dixon MJ. Single and combined effects of methylphenidate and behavior therapy on the classroom performance of children with attention deficit-hyperactivity disorder. J Abnorm Child Psychol 1992;20:213–32.

21. Pelham WE, Schnedler RW, Bologna NC, Contreras JA. Behavioral and stimulant treatment of hyperactive children: a therapy study with methylphenidate probes in a within-subject design. J Appl Behav Anal 1980;13:221–36.

22. Pelham WE. Secondary analyses of psychosocial treatment effects: subgroups and compliance. In: Greenhill L, Chair. The NIMH Multimodal, Multisite Treatment Study for ADHD: posttreatment results. Symposium presented at the annual meeting of the American Psychological Association, Boston, August 1999.

23. Pelham WE, Hinshaw SP, Swanson JM, Gnagy EM, Greiner AR, Hoza B, and others. Behavioral vs. behavioral and pharmacological treatment in ADHD children attending a summer treatment program. J Abnorm Child Psychol. Forthcoming.

24. Pelham WE, Waschbusch DA. Behavioral interventions in attention deficit/hyperactivity disorder. In: Quay H, Hogan A, editors. Handbook of disruptive behavior disorders. New York: Kluwer Academic/Plenum Publishers; 1999. p 255–78.

25. Pelham WE, Murphy A. Attention deficit and conduct disorders. In: Hersen M, editor. Pharmacological and behavioral treatment: an integrative approach. New York: Wiley; 1986. p 108–48.

26. Stokes TF, Baer DM. An implicit technology of generalization. J Appl Behav Anal 1977;10:349.

27. Jensen P. Behavioral and medication treatments for ADHD: comparisons and combinations. Paper presented at the NIH Consensus Development Conference on Diagnosis and Treatment of Attention Deficit Hyperactivity Disorder. Washington DC, November 16–18, 1998.

28. Greenhill LL, Halperin JM, Abikoff H. Stimulant medications. J Am Acad Child Adolesc Psychiatry 1999;38:503–12.

29. NIH Consensus Statement. Diagnosis and treatment of attention deficit hyperactivity disorder. Nov 16–18, 1998;16(2):1–37.

30. Pelham WE, Aronoff HR, Midlam JK, Shapiro CJ, Gnagy EM, Chronis AM, and others. A comparison of Ritalin and Adderall: efficacy and time-course in children with attention-deficit/hyperactivity disorder. Pediatrics 1999;103(4):e43.

31. Pelham WE, Gnagy EM, Burrows-Maclean L, Williams A, Fabiano G, Morrisey SM, and others. Once-a-day OROS® methylphenidate versus t.i.d. methylphenidate in a laboratory setting. Paper to be presented at the annual meeting of the American Academy of Child and Adolescent Psychiatry/Canadian Academy of Child Psychiatry, October 1999.

Appendix.  Design and Treatments of the MTA Study

Study Design

Randomized clinical trial of 4 treatments:

Behavioural Treatments

Parent Training

School Intervention

Child Intervention

Pharmacological Treatment

Community Comparison Treatment