Dosing transcranial magnetic stimulation in major depressive disorder: Relations between number of treatment sessions and effectiveness in a large patient registry

The number of sessions in an acute TMS course for major depressive disorder (MDD) is greater than in the earlier randomized controlled trials. Objective: To compare clinical outcomes in groups that received differing numbers of TMS sessions. Methods: From a registry sample (N = 13,732), data were extracted for 7215 patients treated for MDD with PHQ-9 assessments before and after their TMS course. Groups were defined by number of acute course treatment sessions: 1 – 19 (N = 658), 20 – 29 (N = 616), 30


Introduction
Compared to the original sham-controlled trials that supported regulatory approval of transcranial magnetic stimulation (TMS) in major depressive disorder (MDD) [1,2], response and remission rates are substantially greater in recent reports documenting "real-world" outcomes across diverse clinical settings [3–5]. For example, in both the industry-sponsored O'Reardon et al. trial [1] and the NIMH-sponsored Optimization of TMS study [2], remission rates at the primary endpoints of 4 weeks and 3 weeks, respectively, were approximately 15% with active TMS and 5% with sham in MDD patients withdrawn from antidepressant medication. While this difference supported the efficacy of TMS, the remission rates with active treatment were modest. In contrast, more recent naturalistic and registry studies of TMS treatment of MDD in community settings document response and remission rates of approximately 60% and 30%, respectively [3–5], substantially greater than in the blinded phases of the original sham-controlled trials.
This putative increase in the effectiveness of TMS could be due to multiple factors, including selection or reporting biases in the observational studies or differences in outcome measures, patient characteristics, concomitant treatments, targeting methods, and TMS dosing. There is some evidence that patients who have previously failed to benefit from multiple adequate antidepressant trials are less likely to benefit from TMS than patients with lesser degrees of treatment resistance [6–8]. Samples enrolled in randomized controlled trials (RCTs) that require that pharmacoresistant depression be verified by detailed source documentation likely differ substantially from those receiving routine TMS in community settings [9,10]. Additionally, the randomized sham-controlled TMS trials were mainly conducted in patients who were withdrawn from antidepressant medications prior to starting TMS [1,2,11]. There is evidence that the combination of ECT and an antidepressant medication is more effective than ECT alone [12,13]. When treating MDD, TMS is now routinely combined with ongoing pharmacotherapy and/or psychotherapy, and it is possible that additive or synergistic effects with these treatments enhance effectiveness [14,15].
The protocols used to administer TMS have also evolved over the past two decades. Much of the early controlled research administered TMS at lower stimulus intensities and with fewer pulses per session than is now standard, and there is some evidence that both factors may be related to clinical outcome [3,16,17]. Perhaps most critically, the number of treatment sessions in a course of TMS has changed dramatically. The earliest sham-controlled trials often administered TMS for a period of only one or two weeks [18]. The primary endpoints in the pivotal O'Reardon et al. [1] and Optimization of TMS [2] trials were assessed after 3-4 weeks of TMS with 5 sessions per week (i.e., after 15-20 sessions). In both trials, after the acute phase endpoint, patients could receive up to 30 additional TMS sessions plus 6 spaced taper sessions. In both studies, the additional treatment after the acute endpoint improved efficacy [19,20], leading to the FDA-cleared protocol of 36 total sessions. Similarly, Blumberger et al. [21] randomized MDD participants to intermittent theta burst stimulation (iTBS) or standard 10 Hz TMS for 20 sessions over 4 weeks, with an additional 10 sessions delivered only to those who had a 30% or greater reduction in scores by week 4. In contrast, in a sample of over 5000 participants treated for MDD in the NeuroStar® Advanced Therapy System Clinical Outcomes Registry, the average acute treatment course comprised 32.0 TMS sessions, delivered over 7.5 weeks [3]. Patients now routinely receive longer courses of TMS than in the original RCTs.
There has been surprisingly little documentation of the relations between number of sessions in a TMS course and clinical outcomes in MDD [22,23], and few studies have attempted to characterize the point at which maximal symptom reduction is typically achieved and maximal response and remission rates are typically observed [24,25]. This study addressed this knowledge gap with three main objectives. First, we sought to determine the number of treatment sessions at which symptom reduction peaks for the modal patient. Intolerance, perceived lack of efficacy, worsening depression symptoms, logistical and access challenges, and other factors leading to early termination would be expected to produce relatively short treatment courses with diminished antidepressant effects, while early termination due to rapid remission would have the opposite effect. With progressively longer treatment courses, the antidepressant effect of TMS would be expected to reach a peak. However, whether the maximal antidepressant effect in group data is seen after 20, 30, 40 or more treatment sessions has not been established, yet it is critical in defining what constitutes an adequate TMS course [26,27].
The second objective of this study was to compare patient subgroups, defined by total number of treatment sessions in their acute course, in their trajectories of symptom improvement during the acute course. To inform measurement-based care, many TMS clinical practices have incorporated serial assessment with self-report depression severity scales. In addition to the acute course endpoint, clinical outcome data were available at fixed assessment intervals, i.e., after 10, 20, 30, and 36 TMS sessions. These data provide novel information in a very large real-world sample on the time course of improvement with TMS, both overall and in relation to groups receiving different lengths of treatment.
Our third objective focused on the patients who received especially long courses of TMS (>36 sessions, termed "extended treatment") and examined whether the additional sessions resulted in further meaningful clinical improvement. In some cases, a commercial insurance company will approve a clinician's request to cover an extension of the acute course beyond 36 sessions. This is typically done when the patient has had delayed onset of response or has steadily improved but not yet achieved remission. In other cases, patients self-pay for additional sessions. Specifically, we sought to determine whether extended courses of treatment resulted in meaningful increases in response or remission rates after what is typically considered a full course of treatment. In addressing this third objective, we also determined whether at any time point there was evidence that the effectiveness of TMS diminished or plateaued despite additional treatment or, to the contrary, evidence that TMS exerted additional antidepressant effects regardless of how many treatments were already administered.
This naturalistic, open-label study was conducted with data from the NeuroStar® Advanced Therapy System Clinical Outcomes Registry, involving more than 100 private practice sites [3,28–30]. In community practice, multiple factors determine the number of sessions administered in an acute TMS course besides the rate and extent of clinical improvement (e.g., tolerability, side effects, cost, convenience). Indeed, in the United States third-party insurance policies often pre-authorize a fixed number of TMS treatments, and in recent years 36 sessions has been commonly specified [31–33]. This study can be considered an observational, natural experiment comparing "real-world" outcomes in groups defined by receipt of different lengths of treatment. While identification of dose-response relationships in a naturalistic study of this type raises the possibility that differences in clinical outcomes are determined, at least in part, by dosage effects, it is also possible that such effects reflect reverse causality, since clinical outcomes often determine the duration of exposure to treatments.

Clinical outcomes registry
As described previously [3,28–30], site selection for the NeuroStar® Advanced Therapy System Clinical Outcomes Registry required that the clinical facilities treated at least 24 patients the year before joining the registry, used TrakStar® Cloud software for recording de-identified patient characteristics and treatment parameters, and had a secure link for electronic data transfer. Sites used the Patient Health Questionnaire-9 (PHQ-9) [34] and/or the Clinical Global Impression-Severity scale (CGI-S) [35] to assess the severity of depressive symptoms by self-report and clinician rating, respectively. Once a site joined the registry, all patients treated at the site were included in the database. Data entry in the registry started on May 5, 2016, and this report concerns all data collected until July 7, 2022. Site personnel entered patient demographic information (date of birth, gender), site identifier, primary diagnosis and diagnoses of co-morbid psychiatric conditions, and PHQ-9 and CGI-S scores. Treatment parameters were captured automatically, and included session date, treatment location of stimulation [i.e., left dorsolateral prefrontal cortex (DLPFC), right DLPFC, or both], motor threshold (MT), number of pulses per treatment location or session, treatment level (% device output relative to MT), pulse frequency, duration of pulse trains, intertrain interval (ITI), and the number of treatment sessions during the acute phase treatment course. The acute phase treatment period was retrospectively defined as starting with the patient's first recorded TMS treatment and continuing until there was a period of 8 or more days without any treatment or two consecutive weeks each with a gap of 7 days.
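The retrospective definition of the acute phase can be illustrated with a minimal Python sketch. It implements only the first stopping rule (a gap of 8 or more days) and assumes, for illustration, that the gap is measured as the difference between consecutive session dates; the second rule (two consecutive weeks each with a 7-day gap) is not implemented, and the function name `acute_course` is hypothetical rather than part of the registry software.

```python
from datetime import date


def acute_course(session_dates, max_gap_days=8):
    """Return the session dates that belong to the acute course.

    Assumption (not from the registry software): the course ends at the
    last session before the first interval of `max_gap_days` or more
    between consecutive session dates. The registry's second rule is
    deliberately left out of this sketch.
    """
    ordered = sorted(session_dates)
    if not ordered:
        return []
    course = [ordered[0]]
    for previous, current in zip(ordered, ordered[1:]):
        if (current - previous).days >= max_gap_days:
            break  # first qualifying gap closes the acute course
        course.append(current)
    return course


# Three sessions, a 12-day gap, then two later sessions.
sessions = [date(2020, 1, 6), date(2020, 1, 7), date(2020, 1, 8),
            date(2020, 1, 20), date(2020, 1, 21)]
n_acute_sessions = len(acute_course(sessions))
```

Under these assumptions the two sessions after the gap would be excluded from the acute course and from the session count used to define the groups.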

The registry was maintained in compliance with the Health Insurance Portability and Accountability Act of 1996 (HIPAA). All patient data were de-identified prior to electronic transfer. Collection and analysis of clinical care data in this way does not require local Institutional Review Board approval or informed consent. The study was documented at ClinicalTrials.gov (Identifier: NCT05541302).

Sample definitions
Data were collected on 13,732 patients treated at 110 U.S. sites (see Table 1). These registry participants were all unique individuals who received at least one treatment with the NeuroStar TMS Therapy System. The registry sites were a substantial proportion of the approximately 900 sites using this system in the U.S. and were primarily administered by private practice practitioners (N = 51), larger group practices (N = 12) or private practice TMS Centers (N = 51), with few hospital-based practices (N = 2) or academic institutions (N = 1).
The total study sample excluded patients with age less than 18 years, no MDD diagnosis, or a primary diagnosis other than MDD (Table 1). To ensure that the treatment objective was management of an acute episode of MDD, patients with comorbid psychiatric diagnoses other than generalized anxiety disorder (GAD), panic disorder, and unspecified anxiety disorder were also excluded (e.g., post-traumatic stress disorder, obsessive compulsive disorder, schizophrenia, bipolar disorder, autism, attention-deficit hyperactivity disorder). Patients were excluded who did not have a PHQ-9 assessment within 14 days prior to the first TMS session or a PHQ-9 assessment within 4 days of the final session of the acute course. Individuals were also excluded whose baseline PHQ-9 was less than 10, indicating insufficient severity of baseline depressive symptoms, or who received 3 or more treatments in a day. After these exclusions, the total study sample with PHQ-9 ratings comprised 7215 patients.
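The exclusion rules above can be sketched as a single eligibility check. This is an illustrative Python sketch only: the `PatientRecord` structure and its field names are hypothetical and do not reflect the registry's actual schema, and a missing PHQ-9 within the required window is represented here as `None`.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Comorbid diagnoses that did not lead to exclusion, per the text.
ALLOWED_COMORBIDITIES = {
    "generalized anxiety disorder",
    "panic disorder",
    "unspecified anxiety disorder",
}


@dataclass
class PatientRecord:  # hypothetical structure, not the registry schema
    age: int
    primary_diagnosis: str
    comorbidities: Set[str] = field(default_factory=set)
    baseline_phq9: Optional[int] = None  # None: no PHQ-9 within 14 days before session 1
    endpoint_phq9: Optional[int] = None  # None: no PHQ-9 within 4 days of the final session
    max_treatments_per_day: int = 1


def is_eligible(p: PatientRecord) -> bool:
    """Apply the stated exclusions for the total PHQ-9 sample."""
    if p.age < 18:
        return False
    if p.primary_diagnosis != "MDD":
        return False
    if not p.comorbidities <= ALLOWED_COMORBIDITIES:
        return False  # any comorbidity outside the allowed anxiety diagnoses
    if p.baseline_phq9 is None or p.endpoint_phq9 is None:
        return False
    if p.baseline_phq9 < 10:
        return False  # insufficient baseline severity
    if p.max_treatments_per_day >= 3:
        return False
    return True


example = PatientRecord(age=45, primary_diagnosis="MDD",
                        comorbidities={"panic disorder"},
                        baseline_phq9=18, endpoint_phq9=7)
example_eligible = is_eligible(example)
```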
Within the total sample with PHQ-9 ratings, a subsample was defined of patients treated with protocol-defined high frequency (10 Hz), left dorsolateral prefrontal cortex (DLPFC) TMS throughout the acute treatment course (see Table 1). Patients in this "Left Only" subsample (N = 3805) did not receive more than one treatment protocol in any session, all treatments delivered 10 Hz stimulation to the left DLPFC, the intensity of stimulation (treatment level) did not average less than 100% of MT, and the average number of pulses delivered per session was not less than 2000. Another subsample comprised patients who had CGI-S ratings both before and following the acute TMS treatment course (N = 2303), while otherwise applying the same exclusions as in the total sample with PHQ-9 ratings.
[…] [26,27], and the 1-19 session group was considered especially likely to have received insufficient treatment. Nearly 50% of the sample received exactly 36 sessions, as this corresponded to the number of sessions pre-authorized by most insurance carriers [31–33]. These patients constituted a separate group. The two groups that received either 37-41 sessions or 42 or more sessions were considered as having received extended treatment, longer than typical TMS courses.
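The assignment of patients to the six session-count groups used throughout the analyses can be sketched as follows. This is an illustrative Python sketch; the function name `session_group` is hypothetical, and the labels follow the group names used in the results.

```python
def session_group(n_sessions: int) -> str:
    """Map the number of acute-course sessions to its analysis group."""
    if n_sessions < 1:
        raise ValueError("registry participants received at least one session")
    if n_sessions <= 19:
        return "1-19 Sessions"
    if n_sessions <= 29:
        return "20-29 Sessions"
    if n_sessions <= 35:
        return "30-35 Sessions"
    if n_sessions == 36:
        return "36 Sessions"
    if n_sessions <= 41:
        return "37-41 Sessions"  # extended treatment
    return ">41 Sessions"        # extended treatment


labels = [session_group(n) for n in (5, 20, 35, 36, 37, 42)]
```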

Table 1
Inclusion/exclusion criteria for the total sample and the left only and CGI-S subsamples.

Fig. 1.
Distribution of the number of sessions in the acute TMS course in 7215 patients treated with TMS for major depressive disorder (range = 1 to 102 sessions).

TMS procedures
MT was determined at the first treatment, using single pulse stimulation of the motor cortex area corresponding to the abductor pollicis brevis (or other muscle in a finger of the contralateral hand) and visual observation of the elicited twitch. MT level was defined as the minimum device intensity that induced an observable motor response in 50% of applied pulses, using an iterative algorithm for sequential testing. MT was expressed in Standardized MT units (SMTs) wherein 1.0 SMT corresponds to the average MT level observed in a large patient population (NeuroStar® System Instructions for Use, Rev. F, Apr. 2019).
External coordinates for coil placement over the DLPFC target were calculated by the device (coordinate for a site 5.5 cm anterior to the MT location, along a left superior oblique plane), but practitioners could use other methods to localize the stimulation target. The TrakStar software captured the age, gender, and MT level of each patient, as well as the treatment parameters at each session, including TMS intensity (% relative to MT), pulse frequency, duration of pulse trains, inter-train interval (ITI) duration, and total number of delivered pulses.

Statistical analyses
The six groups defined by number of TMS sessions in the acute course were compared in baseline demographics and treatment parameters, endpoint clinical outcomes, and clinical outcomes at fixed intervals of 10, 20, 30 and 36 sessions. For continuous measures, omnibus one-way analyses of variance (ANOVAs) were conducted testing whether groups differed. Significant effects of group membership were followed by Tukey-Kramer post hoc comparisons identifying the significant pairwise differences among the groups [36,37]. The same strategy was applied with the categorical measures, first conducting an omnibus Pearson chi-square test followed by post hoc comparisons using an adaptation of the Tukey-Kramer method.
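The omnibus test for continuous measures can be illustrated with a minimal pure-Python sketch of the one-way ANOVA F statistic and its degrees of freedom. The analyses themselves were run in SAS; this sketch does not compute P values or the Tukey-Kramer post hoc comparisons, and the function name and toy data are illustrative only.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists of numeric observations, one list per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group size times squared mean deviation.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: squared deviations from each group's mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy data, not registry data.
f_stat, df1, df2 = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

With k groups and N observations in total, the degrees of freedom are k − 1 and N − k, which is the form in which the F tests are reported in the results.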
Clinical outcomes at all time points included the raw PHQ-9 score, the difference from baseline (Pre − Post), and the percent change in score, (Pre − Post)/Pre. Response was defined as ≥50% reduction in PHQ-9 scores at follow-up assessment relative to pre-TMS baseline, and remission was defined as a follow-up PHQ-9 score of less than 5. Percentage change in PHQ-9 score was the primary outcome in each set of comparisons, while response and remission rates were the secondary outcomes. In addition, a metric reflecting the degree of clinical improvement per TMS session was calculated for each group by determining the change in mean percentage improvement in PHQ-9 score over an assessment interval (e.g., 1-10 sessions, 30-35 sessions) divided by the number of sessions in that interval (e.g., 10, 6). This metric also served as a secondary outcome measure to compare the groups in rate of improvement.
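The outcome definitions above translate directly into a few small functions. The following Python sketch is illustrative; the function names are hypothetical, and the per-session example uses the 30.0% mean improvement after 10 sessions reported for the "36 Sessions" group in the results.

```python
def percent_change(pre: float, post: float) -> float:
    """Percentage change in PHQ-9 score, (Pre - Post)/Pre, as a percentage."""
    return 100.0 * (pre - post) / pre


def is_responder(pre: float, post: float) -> bool:
    """Response: at least a 50% reduction from the pre-TMS baseline."""
    return 2 * post <= pre


def is_remitter(post: float) -> bool:
    """Remission: follow-up PHQ-9 score of less than 5."""
    return post < 5


def improvement_per_session(prev_mean_pct: float, mean_pct: float,
                            n_sessions: int) -> float:
    """Change in mean percentage improvement over an interval, per session."""
    return (mean_pct - prev_mean_pct) / n_sessions


# A patient going from 20 to 10 is a responder but not a remitter.
pct = percent_change(20, 10)
# A 30.0% mean improvement over the first 10 sessions is 3.0% per session.
rate_first_10 = improvement_per_session(0.0, 30.0, 10)
```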
The first objective was to determine whether the groups differed in endpoint clinical outcomes and, if so, what length of treatment was associated with peak effectiveness. A between-group, one-way ANOVA was conducted on the primary outcome measure, and chi-square tests and one-way ANOVAs were conducted on the secondary outcome measures.
The second objective was to determine whether the groups differed in the rate or trajectory of improvement over the course of TMS. The six groups were compared in the clinical outcomes calculated following 10, 20, 30, 36 and endpoint TMS sessions, with separate between-group analyses conducted at each assessment time point. PHQ-9 assessments occurred within ±4 days of each assessment time point, and the maximum available sample was identified for each group at each time point. There was little data loss, as the groups varied from 89 to 99% in representation of the total assessable group at the interim time points. To confirm the findings regarding the group differences in degree and rate of improvement over time, a repeated measures mixed model was conducted specifically comparing the three groups that received 36 sessions, 37-41 sessions, and >41 sessions in percentage change in PHQ-9 scores at the assessments following 10, 20, 30, and 36 sessions. This analysis required that participants contributed evaluations at all 4 time points.
To examine the possibility that there was a higher rate of extreme clinical outcomes in the groups with early termination, the four groups that received ≤36 TMS sessions were compared in the variance of the primary outcome measure after 10 and 20 sessions using Levene's F-test for variance equality [38]. These groups were also compared at these assessment occasions in the shape of their relative frequency distributions for percentage change in PHQ-9 scores binned into 10% change intervals.
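The variance comparison can be illustrated with a minimal pure-Python sketch of Levene's statistic. It assumes the classic form based on absolute deviations from each group's mean; variants based on squared deviations or on group medians also exist, and the text does not state which was used. The function name and toy data are illustrative only.

```python
def levene_w(groups):
    """Levene's W statistic using absolute deviations from group means.

    W is the one-way ANOVA F statistic computed on the absolute
    deviations, with k - 1 and N - k degrees of freedom.
    """
    # Absolute deviation of each observation from its own group mean.
    z = [[abs(x - sum(g) / len(g)) for x in g] for g in groups]
    k = len(z)
    n_total = sum(len(g) for g in z)
    z_means = [sum(g) / len(g) for g in z]
    z_grand = sum(sum(g) for g in z) / n_total
    between = sum(len(g) * (m - z_grand) ** 2 for g, m in zip(z, z_means))
    within = sum((v - m) ** 2 for g, m in zip(z, z_means) for v in g)
    return (n_total - k) / (k - 1) * between / within


# Toy data: same spread, different location, so the statistic is zero.
w_equal_spread = levene_w([[1, 2, 3], [11, 12, 13]])
# Toy data: the second group is twice as spread out as the first.
w_unequal_spread = levene_w([[1, 2, 3], [0, 2, 4]])
```

A value of W near zero, as with the first toy example, is what would be expected if early-terminating groups did not have more extreme outcomes than the other groups.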
The third objective was to determine whether the groups who received extended treatment (37-41 sessions and >41 sessions) showed meaningful clinical improvement beyond what they obtained after 30 or 36 sessions. To test for a main effect of time point, a repeated measures mixed model was conducted contrasting the two extended treatment groups in the extent of improvement in PHQ-9 scores at assessments following 30, 36 and endpoint sessions, requiring that participants contributed evaluations at all 3 time points. Paired t-tests were used for comparison of adjacent time points. A repeated measures mixed model was also used to contrast the time points in rates of response and remission. The changes in response and remission rates at the assessments following session 30 and endpoint were also examined in the extended treatment groups.
The analyses of the Left Only and CGI-S subgroups were restricted to retesting the findings regarding the first objective: determining whether the groups differed in outcomes at endpoint and the number of sessions associated with peak effectiveness. Examination of the trajectory of improvement in these subgroups during the acute course was not conducted due to limited sample size at specific assessment time points. On the CGI-S, response corresponded to an endpoint score of 3 ("mildly ill") or less, while remission corresponded to a score of 2 ("borderline mentally ill") or less.
The analyses were conducted using SAS v9.4 (SAS Institute Inc., Cary, NC, USA). Results are reported in tables as mean ± SD and in figures as mean ± SEM. Treatment parameters were averaged over all treatment sessions in the acute course. Significance values are two-sided with an alpha of 0.05.

Group differences in demographics and treatment parameters
Table 2 presents the demographic and TMS parameters for the total sample and the six groups defined by number of TMS sessions in the acute course. The groups did not differ in the distributions of gender or MT. The two extended treatment groups ("37-41 Sessions" and ">41 Sessions") were more likely to be treated with sequential bilateral TMS and consequently received more treatment protocols per session than each of the remaining groups. The extended treatment groups were also exceptional in receiving a larger number of pulses per session, and, on average, they were treated at a higher treatment level relative to MT. Due to the greater use of sequential bilateral TMS in the extended treatment groups, they also differed from the other groups in intertrain interval and pulse frequency.
Each of the groups was treated at an average rate greater than 4 sessions per week. The "1-19 Sessions" group was treated at a higher rate (4.56 sessions/wk) than all other groups, while the ">41 Sessions" group had the lowest rate of treatment (4.06 sessions/wk). The "1-19 Sessions" group was also exceptional in being treated at a lower magnetic intensity relative to MT than all other groups. The low level of treatment intensity in this group likely reflected issues with tolerability.

PHQ-9 endpoint clinical outcomes
Endpoint PHQ-9 clinical outcomes for the groups differing in length of the TMS acute course are documented in Table 3 and Fig. 2a and b.
None of the groups differed from each other in post hoc comparisons of baseline PHQ-9 scores. In contrast, there were marked differences in the post-treatment PHQ-9 score, change from baseline in this score (absolute and percentage change), and rates of response and remission. Effectiveness was poorest in the "1-19 Sessions" group, which had significantly inferior outcomes in all measures relative to every other group. While the "20-29 Sessions" group had superior outcomes compared to the "1-19 Sessions" group, it had inferior outcomes on all measures when compared to each remaining group, except for a lack of difference in some measures with the ">41 Sessions" group. Antidepressant effect increased in the groups that received an increasing number of TMS sessions until it peaked in the "36 Sessions" group (Fig. 2a and b). Except for the adjacent group that received 30-35 sessions, the "36 Sessions" group had significantly superior outcomes on all measures compared with each of the other groups. The "36 Sessions" group had a significantly higher rate of remission than the "30-35 Sessions" group, but the two groups did not differ in other PHQ-9 outcomes. The extended treatment groups ("37-41 Sessions" and ">41 Sessions") showed intermediate efficacy outcomes, superior in most cases to those observed with short treatment courses ("1-19 Sessions" and "20-29 Sessions"), but generally inferior to those of the groups that received 30-35 or 36 sessions.

Table 3
Endpoint PHQ-9 clinical outcomes in groups that received differing durations of treatment with TMS. Values are mean ± SD for continuous variables and percentage of group for categorical variables. One-way analyses of variance were conducted on the continuous variables and chi-square tests on the categorical variables. The letters following each group's value describe the post-hoc comparisons. Groups that differ in all their following letters (e.g., AB vs DE) differed significantly from each other, while groups that shared any letter did not differ (e.g., B vs. BC).

Clinical outcomes at fixed intervals during the acute course
Using the maximal data available at each time point, Table 4 and Fig. 3a-c present the clinical outcomes following assessments after 10, 20, 30, and 36 sessions, as well as at endpoint for the extended treatment groups. The analyses were consistent across time points and outcome measures. At the 10, 20, 30 and 36 sessions time points, the extended treatment groups had inferior antidepressant effects compared to all other groups. Furthermore, in most cases the group with the longest duration of treatment, ">41 Sessions", had inferior outcomes compared to the other extended treatment group, "37-41 Sessions". Thus, the groups that ultimately received especially long treatment courses had poorer antidepressant effects from the earliest assessment time point. Also of note, at the early time points, the groups that received the shortest treatment courses ("1-19 Sessions" and "20-29 Sessions") improved to the same extent as groups that received longer courses ("30-35 Sessions" and "36 Sessions"). Thus, the groups that received the shortest courses of TMS were as responsive in symptomatic change at the early shared time points as the groups that later manifested peak effectiveness.
A repeated measures mixed model was conducted to verify that the extended treatment groups manifested less improvement from early in treatment and subsequently had a slower rate of improvement when restricted to participants with complete data sets. The "36 Sessions" group and the two extended treatment groups were compared in the percentage change from baseline in PHQ-9 score at the assessments following 10, 20, 30, and 36 sessions (Fig. 4). There was a significant main effect of time point, F(3, 616) = 465.04, P < 0.0001, indicating that across the groups PHQ-9 scores improved over sessions. There was a significant main effect of group, F(2, 609) = 67.35, P < 0.0001, indicating that across the time points the groups differed in extent of improvement. There was also a significant interaction between group and time point, F(6, 701) = 5.23, P < 0.0001, indicating that the rate or slope of improvement differed among the groups. As seen in Fig. 4, after 10 sessions, the "36 Sessions" group (N = 2848) averaged a 30.0% improvement (SD = 28.3) in PHQ-9 scores, or 3.0% improvement in PHQ-9 score per session. The "37-41 Sessions" group (N = 511) averaged a 24.9% improvement (SD = 27.1), while the ">41 Sessions" group (N = 285) averaged a 17.1% improvement (SD = 24.8). One-way ANOVA indicated that the groups differed in extent of improvement at this Session 10 time point, F(2, 618) = 37.84, P < 0.0001, with post hoc comparisons indicating the three groups differed from each other. These three groups also differed in rate or slope of improvement between the 10th session assessment and the Session 36 assessment. During this interval the "36 Sessions", "37-41 Sessions", and ">41 Sessions" groups had an average additional improvement in PHQ-9 scores of 29.63% (SD = 29.0), 26.48% (SD = 28.70), and 20.93% (SD = 27.47), respectively. One-way ANOVA indicated that the groups also differed in the extent of improvement between the 10 and 36 Sessions time points, F(2, 3641) = 13.24, P < 0.001, with the post hoc comparisons indicating that the ">41 Sessions" group (20.9% ± 27.5) had significantly less improvement over this interval than either the "36 Sessions" (29.6% ± 29.0) or "37-41 Sessions" (26.5% ± 28.7) groups.
It is possible that the groups that received the shortest TMS courses had more extreme clinical outcomes, with higher rates of both marked benefit and ineffectiveness resulting in earlier termination. This could produce average symptom improvement scores equivalent to other groups, but would result in increased variance or altered peaks in the distribution of scores. Comparison of the variances in PHQ-9 percentage change scores for the groups that had ≤36 sessions did not demonstrate inequality after 10 sessions, F(3, 5517) = 0.92, P = 0.43, or 20 sessions, F(2, 5084) = 0.29, P = 0.75. Supplemental Fig. 1 presents for each group the relative frequency distributions for the percentage change in PHQ-9 scores after 10 and 20 sessions, binned into 10% change intervals. There was no indication that the shape of these distributions differed in the groups receiving shorter or longer TMS courses. Patients who received shorter and longer TMS courses did not differ in the average extent of symptom improvement at early assessments nor in the variance or shape of the distributions of these scores.

Effectiveness associated with extended treatment
Using all available data, Table 4 also presents the clinical outcomes for the extended treatment groups after 30, 36 and endpoint sessions. Supplemental Table 1 presents the clinical outcome data restricted to participants who contributed evaluations at all three time points. The repeated measures mixed model contrasting the two extended groups in percentage change in PHQ-9 scores at these three time points yielded main effects of group, F(1, 571.2) = 17.29, P < 0.001, and time point, F(2, 551.6) = 70.63, P < 0.001, and a significant group by time point interaction, F(2, 551.6) = 5.01, P = 0.007. Both groups showed statistically significant and clinically meaningful improvement at each subsequent time point (Supplemental Table 1). Across time points, the ">41 Sessions" group had inferior outcomes compared to the "37-41 Sessions" group. However, the group by time point interaction reflected the fact that the ">41 Sessions" group had a significantly greater improvement than the "37-41 Sessions" group in the interval between Session 36 and endpoint. During this interval (Session 36 to endpoint), the "37-41 Sessions" group averaged 2.7 (SD = 1.2) additional sessions, whereas the ">41 Sessions" group averaged an additional 12.2 (SD = 7.1) sessions, t(583) = 21.95, P < 0.001.
Around the 30th session, clinicians and patients often anticipate that a pre-authorized limit of 36 sessions will be insufficient and seek authorization for a treatment extension beyond 36 sessions. As documented in Supplemental Table 1, in the "37-41 Sessions" group the response rate increased from 45.9% following the 30th session to 61.6% at the endpoint assessment, and the remission rate increased from 14.7 to 29.7%. In the ">41 Sessions" group the response rate increased from 34.0 to 53.6% and the remission rate increased from 8.5 to 20.9%. These changes reflect clinically meaningful improvement in outcomes.

Time course of improvement with TMS
The foregoing analyses indicated that all patient groups showed their greatest rate of PHQ-9 improvement at the assessment after ten sessions; the rate of improvement then declined at the assessment following 20 sessions and remained relatively stable at subsequent assessments (Table 4). This pattern of symptom improvement was confirmed in within-group analyses requiring complete data at each assessment period (Supplemental Table 2, Fig. 5). As seen in Fig. 5, all but the extended treatment groups averaged approximately 3% improvement per session in percentage change in PHQ-9 score at the assessment following 10 sessions. In all groups, including extended treatment, this rate of improvement dropped to approximately 1.0% PHQ-9 improvement per session at the next assessment interval, 20 sessions, and remained relatively stable thereafter. There was no indication of a plateau in efficacy with extended treatment as the rate of improvement per session appeared unchanged during the extended treatment periods.

Table note: Values are mean ± SD for continuous variables and percentage of group for categorical variables (response and remission rate). Rate of change per session was calculated as the change in the efficacy measure from the previous assessment interval divided by the number of sessions since the last assessment [e.g., (Assessment after 36 sessions − Assessment after 30 sessions)/6]. One-way analyses of variance were conducted on the continuous variables and chi-square tests on the categorical variables. The letters following each group's value describe the post-hoc comparisons. Groups that differ in all their following letters (e.g., AB vs DE) differed significantly from each other, while groups that shared any letter did not differ (e.g., B vs. BC).
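The rate-of-change-per-session definition above can be written out as a small Python function. This is a minimal sketch for illustration, not the authors' analysis code: the function name `rate_per_session` and the example values are hypothetical.

```python
def rate_per_session(prev_value, curr_value, prev_sessions, curr_sessions):
    """Change in an efficacy measure since the previous assessment,
    divided by the number of sessions delivered between the two assessments."""
    n_sessions = curr_sessions - prev_sessions
    if n_sessions <= 0:
        raise ValueError("current assessment must follow the previous one")
    return (curr_value - prev_value) / n_sessions


# Hypothetical example: percent improvement in PHQ-9 of 40% after 30 sessions
# and 46% after 36 sessions gives (46 - 40) / 6 per session.
example_rate = rate_per_session(40.0, 46.0, 30, 36)
print(example_rate)
```

Dividing by the number of intervening sessions makes intervals of different length (10 sessions versus 6 sessions, for instance) directly comparable.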

Left only and CGI-S subsamples
The preceding analyses were conducted without constraint on the TMS treatment protocols administered. The findings also pertained to outcomes based on patient self-report. Therefore, generalizability was tested with respect to the subsample exclusively treated with the classic 10 Hz left DLPFC protocol and the subsample that had CGI-S ratings (see Table 1 for inclusion criteria in the Left Only and CGI-S subsamples). In both subsamples we retested the associations between total number of TMS sessions and endpoint clinical outcomes, as was reported in Table 3 and Fig. 2a and b for the total sample. Supplemental Table 3 reports the findings for the Left Only subsample, while Supplemental Table 4 reports the findings for the CGI-S subsample. The CGI-S findings mirrored those in the total sample with PHQ-9 scores. The groups that received fewer than 30 sessions had the weakest antidepressant effects, effectiveness peaked around 36 sessions, and the extended treatment groups had intermediate outcomes. A similar pattern was also obtained in the PHQ-9 scores of the Left Only subsample, except that the effectiveness peak was broader, as the 30-35 Sessions, 36 Sessions, and 37-41 Sessions groups each manifested strong antidepressant effects that did not differ.

Discussion
This naturalistic study, conducted in a large registry sample, found consistent associations between the number of TMS sessions patients were administered during an acute course of treatment for MDD and the antidepressant effects of the intervention. Groups that received 1-19 sessions or 20-29 sessions had reduced benefit compared to groups that received longer treatment courses. In general, effectiveness peaked in the group that received 36 sessions and was still substantial, but reduced, in the groups that received extended treatment (37-41 sessions and >41 sessions). The extended treatment groups showed less clinical benefit and a slower rate of improvement from early in the treatment course. However, the additional treatments in the extended treatment courses were accompanied by additional and clinically meaningful improvement in symptom severity, with substantial increases in response and remission rates. There was no evidence in any group of a plateau over time in the average rate of improvement. Across the sample, the average trajectory reflected a reduction in symptom severity of about 3% per session over the first 10 sessions, followed by a rate of about 1% with each subsequent treatment. The association between number of sessions and effectiveness was confirmed in analyses restricted to the subsample treated exclusively with classic 10 Hz left DLPFC TMS and in the subsample with clinician-rated CGI-S scores.
This study documented consistent dose-response effects in real-world settings, as the number of TMS sessions strongly covaried with the effectiveness of the intervention. However, observation of robust dose-response effects does not entail demonstration of causal relationships [39,40]. Our findings derive from an observational study and not a prospective, randomized trial, and causality of number of sessions in impacting effectiveness was not tested. Across medicine, scores of important dose-response associations have been detected or confirmed in observational studies [41][42][43][44], with one of the most salient being the association between smoking exposure and cancer risk [45]. However, dose-response effects in observational treatment studies may be due to biases, especially when subgroups with different prognoses receive different dosing [39,40]. In our study, where clinicians were responding to their patients' clinical state in real time, the decision on how many TMS sessions to administer was likely driven by the effectiveness of the intervention. Patients who showed little benefit, plateau in benefit, or symptomatic worsening would be expected to terminate sooner than patients showing progressive improvement. Not all patients had access to, or were offered, the same number of treatment sessions, and extensions beyond 36 sessions might typically be requested from insurance companies for patients when the clinician believes they are likely to benefit further. Thus, our study describes dose-response relations in a large naturalistic sample where multiple factors determined treatment duration, and the extent to which the findings reflect a causal effect of treatment number or potential confounds is uncertain. Only randomized trials in which patients are assigned to different durations of treatment can definitively address the causal role of treatment number in impacting effectiveness [46,47].
The findings regarding the trajectory of symptom change during the TMS course are compatible with the possibility that number of TMS sessions actively contributes to effectiveness, but since they are based on non-randomized, naturalistic data they are not determinative of causality. Patients who received fewer than 30 sessions had diminished clinical benefit at endpoint compared to all other groups, yet on a group level they appeared to be as responsive to TMS after 10 and 20 sessions as the groups that showed peak effects with longer treatment courses. The groups that received shorter TMS courses also did not differ from the groups with longer courses in the variance or shape of the distributions of symptom improvement scores at these assessments. Thus, the reduced endpoint effectiveness in the groups that received fewer than 30 sessions often may have been due to premature termination as opposed to treatment resistance. In contrast, the extended treatment groups, which showed reduced and slower improvement from early in the treatment course, experienced substantial clinical gains with longer courses of treatment. Their outcomes were superior to those that received short courses. This pattern is also compatible with an active effect of treatment number on effectiveness.
At the group level, there was no indication in this study that the antidepressant effects of TMS plateaued with prolonged treatment. This is consistent with the findings from the prospective extension trial that followed the O'Reardon et al. [1] sham-controlled trial; 73 participants without meaningful clinical benefit after 4-6 weeks of active TMS in the blinded RCT received open-label TMS for an additional 6 weeks [20]. The cumulative incidence of sustained response steadily rose over the 12 weeks of continuous (5 sessions/week) treatment. However, Zhang et al. [25] in a 12-week trial of patients randomized to 4 types of TMS reported no change in the magnitude of antidepressant effects after 5 weeks.
In this study, in all groups the average rate of symptomatic improvement was relatively uniform and stable after 10 sessions and until treatment termination, raising the possibility that additional treatment could result in further symptom reduction in patients who have shown incomplete improvement. In this and other TMS studies [3,4,21,48], rates of response are often nearly double rates of remission, and providing longer TMS courses may be a useful strategy to maximize remission rates. It has been recently demonstrated that providing an additional second course of ECT results in marked clinical benefit in patients who were nonresponders to a standard course of ECT treatment [49]. In contrast, there is substantial evidence that the therapeutic effects of antidepressant medications generally plateau despite increasing oral dose [50][51][52] or extending the duration of treatment beyond 6-12 weeks [53][54][55][56].
In addition to the powerful associations between the number of sessions in the acute TMS course and antidepressant effectiveness, we characterized average trajectories of symptom improvement in groups defined by length of the TMS course. In patients who received 36 or fewer sessions, the average rate of improvement in symptom scores was about 3% per session after the first 10 sessions and then a steady 1.0-1.3% thereafter. Patients who received longer than typical courses of TMS showed less benefit after the first 10 sessions and a slower but steady rate of improvement thereafter. These patterns correspond well with the response trajectories empirically derived by Kaster et al. [24] when applying group-based trajectory modeling to serial depression scores of individual patients in a 6-week RCT. Similar to the average pattern of symptom change documented in this study, Kaster et al. [24] identified a group with Rapid Response where marked symptom reduction occurred over the first 2 weeks (10 sessions) followed by a stable and considerably slower rate of improvement. Kaster et al. [24] identified two groups that differed only in pre-TMS depression severity and who had slow and steady improvement from TMS outset. This pattern, labelled as Linear Response, resembles the trajectories we observed in the extended treatment groups, i.e., relatively diminished benefit after the first 10 sessions with a slow and steady rate of improvement thereafter. In addition, Kaster et al. [24] identified a fourth group of non-responsive individuals with minimal symptomatic change over time, which we did not attempt to isolate here. Despite differences in sample selection, outcome measures, TMS procedures, and study designs, our investigation and that of Kaster et al. [24] appear to describe similar rapid and slow patterns of symptom change during TMS for MDD.

Fig. 4. Percent change in PHQ-9 scores for the "36 Sessions", "37-41 Sessions", and ">41 Sessions" groups after 10, 20, 30, and 36 sessions. To be included, patients contributed PHQ-9 scores at all 4 time points.
This study found that exceptionally long treatment duration was associated with additional meaningful clinical benefit, and, in all groups, there was no drop-off in the rate of clinical improvement at the end of the treatment course. Presuming that lengthening the TMS course is a useful strategy to maximize effectiveness, there may be practical impediments with protocols involving a single session per day due to cost and time inefficiencies. Some accelerated protocols, in which multiple TMS sessions are administered in the same day, have shown rapid symptom improvement in MDD [57][58][59]. In studies implementing accelerated protocols, there appears to be a positive association between the total number of sessions and effectiveness in MDD [59], akin to the findings of this study. Another potential implication centers on the observation that after 10 TMS sessions the modal patient has a stable rate of improvement that is about one-third that displayed earlier. Adaptive neurophysiological changes contingent on treatment exposure are commonly observed with pharmacological and other neuromodulatory interventions that, in some cases, may interfere with therapeutic effects [60][61][62]. For example, ECT dynamically raises its own seizure threshold [63,64], and yet electrical dose relative to threshold is a key determinant of both therapeutic and adverse effects [65][66][67]. Speculatively, identifying the adaptive mechanisms responsible for the marked reduction in the rate of improvement during the TMS course and potentially modifying TMS procedures to mitigate their impact could broadly enhance the efficiency and cost-effectiveness of this intervention.
The major strengths of this study, like its major limitations, derive from its observational nature utilizing a large patient registry in real-world settings. A strong association was demonstrated between the number of TMS treatments in an acute course and antidepressant effectiveness. Evidence was also provided that the rate of change in symptom severity during the TMS course has a characteristic pattern, typically declining sharply after 10 sessions and remaining stable thereafter until endpoint. In this study, patients who received exceptionally long TMS courses (>36 sessions) also had different trajectories of improvement, showing less benefit from early in the treatment course. This study also documented that the additional treatments administered in these extended courses were associated with substantial improvements in rates of response and remission.

Disclosures
The NeuroStar® Advanced Therapy System Clinical Outcomes Registry, analysis of the registry data, and the drafting of this manuscript were supported by Neuronetics Inc. (Malvern, PA, USA). While staff at

Fig. 5. Improvement in percentage change in PHQ-9 scores per session (Mean ± SEM) for the six groups during the intervals of 1-10, 11-20, 21-30, 31-36 and 37-endpoint sessions. To be included, patients contributed PHQ-9 scores at all applicable time points.

Fig. 2. Endpoint clinical outcomes for six groups defined by receipt of differing number of sessions in the acute TMS course. (a) Mean ± SEM of percent change in PHQ-9 scores over the treatment course for the six groups. (b) Rates of response and remission in the six groups.

Fig. 3. Clinical outcomes for the six groups after 10, 20, 30 and 36 sessions and at endpoint in the groups with extended treatment (>36 sessions). (a) Percent change in PHQ-9 scores for the six groups at the fixed time points. (b) Response rates for the six groups at the fixed time points. (c) Remission rates for the six groups at the fixed time points.

Table 2
Demographics, treatment schedule, and TMS parameters in groups that received differing durations of treatment with TMS.