International pooled patient-level meta-analysis of ketamine infusion for depression: In search of clinical moderators
This meta-analysis (n=809, 17 studies) finds robust effects of ketamine for relieving depression (at 24 hours and seven days). Two study-level moderators were associated with a larger effect relative to placebo: a higher treatment-resistance threshold for study entry (≥2 failed antidepressant medication trials) and a crossover design (smaller placebo response). Other moderators were found, but all were modest and clinically uninformative (e.g. age and sex did not moderate the treatment effect).
Authors
- Suresh Muthukumaraswamy
- Sanjay Mathew
- Carlos Zarate Jr.
Published
Abstract
Depression is disabling and highly prevalent. Intravenous (IV) ketamine displays rapid-onset antidepressant properties, but little is known regarding which patients are most likely to benefit, limiting personalized prescriptions. We identified randomized controlled trials of IV ketamine that recruited individuals with a relevant psychiatric diagnosis (e.g., unipolar or bipolar depression; post-traumatic stress disorder), included one or more control arms, did not provide any other study-administered treatment in conjunction with ketamine (although clinically prescribed concurrent treatments were allowable), and assessed outcome using either the Montgomery-Åsberg Depression Rating Scale or the Hamilton Rating Scale for Depression (HRSD-17). Individual patient-level data for at least one outcome was obtained from 17 of 25 eligible trials [pooled n = 809]. Rates of participant-level data availability across 33 moderators that were solicited from these 17 studies ranged from 10.8% to 100% (median = 55.6%). After data harmonization, moderators available in at least 40% of the dataset were tested sequentially, as well as with a data-driven, combined moderator approach. Robust main effects of ketamine on acute [~24-hours; β*(95% CI) = 0.58 (0.44, 0.72); p < 0.0001] and post-acute [~7 days; β*(95% CI) = 0.38 (0.23, 0.54); p < 0.0001] depression severity were observed. Two study-level moderators emerged as significant: ketamine effects (relative to placebo) were larger in studies that required a higher degree of previous treatment resistance to federal regulatory agency-approved antidepressant medications (≥2 failed trials) for study entry; and in studies that used a crossover design. A comprehensive data-driven search for combined moderators identified statistically significant, but modest and clinically uninformative, effects (effect size r ≤ 0.29, a small-medium effect). 
Ketamine robustly reduces depressive symptoms in a heterogeneous range of patients, with benefit relative to placebo even greater in patients more resistant to prior medications. In this largest effort to date to apply precision medicine approaches to ketamine treatment, no clinical or demographic patient-level features were detected that could be used to guide ketamine treatment decisions.
Research Summary of 'International pooled patient-level meta-analysis of ketamine infusion for depression: In search of clinical moderators'
Introduction
Ketamine, a glutamatergic anaesthetic, has shown rapid and large antidepressant effects at subanaesthetic intravenous doses in randomized controlled trials, including in treatment-resistant and bipolar depression. However, most ketamine RCTs have small samples that are adequate to detect group-level effects but underpowered to identify moderators—baseline characteristics that predict which patients will benefit most. Previous candidate predictors (clinical, mechanistic, biological) have not been consistently replicated across trials, and study-level meta-analyses have failed to find reliable moderators. This uncertainty limits personalised prescribing and the efficient targeting of ketamine to patients most likely to gain clinically meaningful benefit. Price and colleagues therefore conducted an individual participant data (pooled patient-level or "mega-analytic") meta-analysis of randomized IV ketamine trials recruiting patients with depressive symptoms. The study aimed to (1) quantify ketamine's effect versus control on continuous and dichotomous depression outcomes at two acute timepoints (~24 hours and ~7 days after a single infusion); (2) test patient- and study-level moderators of ketamine's efficacy using sequential (univariate) analyses; and (3) apply a data-driven combined-moderator method to identify weighted combinations of variables that might improve clinical prediction. The focus was on moderators readily available in clinical practice and on trials using clinician-rated MADRS or HRSD-17 depression scales.
Methods
Trials were identified through a pre-registered systematic search of PubMed to 19 January 2021 and supplemental checks of published reviews. Eligible studies were randomized controlled trials in which at least one IV ketamine infusion was administered to adults with unipolar or bipolar depression or another disorder where depressive symptoms are central (for example, PTSD). Trials were excluded if ketamine was given alongside other study-administered treatments such as ECT, but concomitant clinically prescribed medications were allowed. Authors of eligible trials were invited to contribute individual participant data, including drug condition, infusion order for crossover trials, pre- and post-infusion MADRS and HRSD-17 scores, and a prespecified set of 33 potential moderator variables. Target timepoints were "rapid" (≈24 hours post-infusion, allowable window 4 hours to 3 days) and "post-rapid" (≈7 days post-infusion, allowable window 6–14 days). Contributing teams attested to trial methodological details used to assess risk of bias across Cochrane-relevant criteria. Risk of bias was judged low overall except for possible functional unblinding from ketamine-specific side effects. To harmonise outcomes, HRSD-17 scores were converted to MADRS using a published conversion algorithm when necessary, and analyses focused on patients receiving IV ketamine ≥0.5 mg/kg versus placebo (inert or psychoactive). For trials with crossover or repeated-infusion designs, only data from the first infusion were used to avoid within-subject repeated measures complicating moderator tests. Availability of the 33 moderators varied widely across participants and studies. Analyses used linear mixed-effects regression models with a random effect for study to account for between-study heterogeneity; continuous moderators were standardised and dichotomous variables coded 0/1. 
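The variable coding described above (standardised continuous moderators, 0/1 dichotomous variables) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the example values are hypothetical, and the use of the sample standard deviation is an assumption, since the summary does not say which estimator was used.

```python
from statistics import mean, stdev


def standardise(values):
    """Z-score a continuous moderator (sample SD is an assumption here)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def code_binary(labels, positive):
    """Dummy-code a dichotomous moderator: 1 for `positive`, 0 otherwise."""
    return [1 if label == positive else 0 for label in labels]


# Hypothetical moderator values, for illustration only.
bmi = [22.0, 27.5, 31.0, 24.5]
bmi_z = standardise(bmi)
country = code_binary(["US", "non-US", "US"], positive="US")
```

After standardisation, a coefficient on the moderator is read per one standard deviation, which is what makes the reported β* values comparable across moderators measured in different units.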
Completer datasets were used because completion rates were high (≥90%) and imputation across studies was deemed of limited value. Primary outcomes were percentage improvement in MADRS from baseline to rapid and post-rapid timepoints. Response (≥50% MADRS reduction) and remission (MADRS ≤ 9) were reported descriptively but not used in moderator testing. Moderators were examined sequentially (interaction moderator*treatment), with a pre-specified set of nine "Tier 1" variables that were non-redundant and available in ≥99.5% of patients treated as primary; Bonferroni correction was applied across these nine tests. An additional 29 "Tier 2" variables available in ≥40% of patients were treated as exploratory and not adjusted for multiple comparisons. Moderator effect sizes were expressed as Spearman r correlations; the authors considered |r| ≥ 0.3 to be of potential clinical utility. A data-driven combined moderator (M*) approach was also applied. This uses regularised multivariable regression to derive weights for multiple moderators, producing a single composite score that may yield stronger predictive power than individual moderators. Separate M* models were run for rapid and post-rapid outcomes. Tier 1 patient-characteristic variables were included in M* #1; Tier 2 variables were grouped thematically into seven subsets (M* #2a–2g) to maximise sample size within each combined analysis. Bootstrap confidence intervals were used to assess statistical significance of M* estimates.
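The sequential moderator model described above (moderator, treatment, and their interaction, with a random effect for study) can be written compactly. The notation below is assumed for illustration, since the source describes the model only in words; in particular, a random intercept is assumed as the form of the study effect.

```latex
% i indexes patients, j indexes studies (notation assumed for illustration)
% y_{ij}: percentage improvement in MADRS; T_{ij}: treatment (1 = ketamine, 0 = placebo)
% M_{ij}: moderator (standardised if continuous, 0/1 if dichotomous)
\begin{align}
y_{ij} &= \beta_0 + \beta_1 T_{ij} + \beta_2 M_{ij}
          + \beta_3 \,(T_{ij} \times M_{ij}) + u_j + \varepsilon_{ij}, \\
u_j &\sim \mathcal{N}(0, \tau^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2).
\end{align}
```

Under this form, moderation corresponds to the interaction coefficient \beta_3: a non-zero value means the ketamine-versus-placebo difference changes with the moderator.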
Results
Seventeen of 25 eligible trials provided at least one usable outcome variable, yielding pooled data from n = 809 patients; n = 720 patients received one of the ketamine or control conditions specified for inclusion in the primary meta-analyses. After harmonisation, moderator availability across the requested 33 variables ranged from 10.8% to 100% (median 55.6%). Ketamine dosing and administration were largely uniform across included trials, supporting the primary exposure definition of IV ketamine ≥0.5 mg/kg. Main clinical outcomes confirmed robust antidepressant effects at both timepoints. The pooled analyses showed strong reductions in depression severity at the rapid (≈1 day) and post-rapid (≈7 days) assessments, with overall responder and remission peaks reported as 46% and 27% respectively across the pooled sample. These rates were comparable to those reported in clinical settings but lower than those in the earliest RCTs in the literature. Sequential moderator analyses tested 37 potential moderators. Two Tier 1 moderators remained significant after correction for multiple comparisons, both pertaining to study-level design features. First, trials that required a higher treatment-resistant depression (TRD) threshold for enrolment (≥2 failed adequate antidepressant medication trials) showed a larger ketamine effect relative to placebo at the post-rapid timepoint (post-rapid r = 0.108; β* 0.47, 95% CI 0.16 to 0.77; p adjusted = 0.027). The pattern reflected somewhat larger ketamine responses and somewhat lower placebo responses in higher-TRD trials. Second, crossover-design trials displayed greater ketamine-versus-placebo separation at the rapid timepoint (r = 0.132; β* 0.52, 95% CI 0.23 to 0.81; p adjusted = 0.036), an effect driven mainly by lower placebo responses in crossover studies rather than larger ketamine responses. 
A third study-level moderator—trials conducted in the US—showed a larger post-rapid ketamine effect unadjusted (r = 0.089; β* 0.41; p unadjusted = 0.0096) but this did not survive correction. Among Tier 2 (exploratory) moderators, higher baseline systolic blood pressure predicted greater post-rapid benefit from ketamine relative to placebo (r = 0.106; β* 0.23, 95% CI 0.04 to 0.42; p unadjusted = 0.019). Several other variables (placebo type, marital status, Black race, number of failed trials at patient level, number of major depressive episodes, BMI) showed trend-level moderation (p unadjusted < 0.10) in at least one analysis but were not robust. Combined moderator (M*) analyses found that each composite model was statistically significant (bootstrap CIs did not cross 0), and all M* effect sizes exceeded the largest single-moderator effect. Nevertheless, composite effect sizes remained small-to-medium, with r values ranging from 0.12 to 0.29 across models. The largest differential effect was observed in M* #2f, which combined six Tier 1 variables with BMI and smoking status in a dataset of n = 232 patients from seven studies. For the rapid timepoint this model produced r = 0.293 (95% CI 0.175 to 0.415) and for the post-rapid timepoint r = 0.234 (95% CI 0.118 to 0.347). Variables with the largest weights in this combined moderator were study TRD threshold, current MDD diagnosis, study country (US vs outside US), and BMI; the pattern associated with greater ketamine-versus-placebo improvement included greater prior treatment resistance, absence of an MDD diagnosis (for example bipolar disorder or PTSD), US-based study, and higher BMI. Despite statistical significance, the authors characterised these combined effects as modest and of limited clinical utility because a large portion of variance in outcome remained unexplained.
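The multiple-comparison logic behind the US-study result reported above can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the significance threshold of 0.05 is assumed (the summary does not state it), and the function name is hypothetical. The unadjusted p-value and the count of nine Tier 1 tests come from the text.

```python
def bonferroni(p_unadjusted, n_tests):
    """Bonferroni-adjusted p-value: multiply by the number of tests, cap at 1."""
    return min(1.0, p_unadjusted * n_tests)


N_TIER1_TESTS = 9   # nine Tier 1 moderators, per the methods
ALPHA = 0.05        # assumed threshold; not stated in the summary

p_us_unadjusted = 0.0096  # US-study moderator, post-rapid timepoint
p_us_adjusted = bonferroni(p_us_unadjusted, N_TIER1_TESTS)
survives_correction = p_us_adjusted < ALPHA
```

The adjusted value is about 0.086, which is above the assumed 0.05 threshold and consistent with the statement that this moderator did not survive correction.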
Discussion
Price and colleagues interpret the findings as confirmation that a single IV ketamine infusion produces robust, rapid antidepressant effects across a heterogeneous international sample, with clinically meaningful response and remission occurring in a substantial minority of patients. The pooled patient-level approach increased power for moderator testing but nevertheless identified few reliable patient-level clinical or demographic moderators that could usefully guide individualised treatment selection prior to a first infusion. When moderators were detected, they were mainly study-design features: higher TRD thresholds at enrolment and crossover designs were associated with greater ketamine-versus-placebo separation, and an apparent US study effect was observed but did not survive correction. The authors position these results in relation to prior trial-level work by noting that many previously reported candidate moderators (for example age, sex, diagnosis of unipolar versus bipolar depression, concomitant medication status) did not predict differential benefit in this pooled dataset. They suggest that study design choices—particularly TRD inclusion thresholds—can influence observed effect sizes and that reserving ketamine for patients with greater prior treatment resistance may both align with consensus clinical recommendations and enhance trial power. The enhanced separation in crossover trials may reflect expectancy effects or other design-specific influences, given that only first-infusion data were analysed and carry-over was therefore unlikely to explain the finding. Key limitations acknowledged by the investigators include the predominance of single-infusion RCTs in the literature (limiting inferences about serial infusion regimens used in clinical practice), lack of longer-term follow-up data, and a constrained set of harmonisable moderators across trials. 
Some moderators could only be examined at the between-study level, reducing power and limiting the advantages of the pooled participant-level approach. The dataset also lacked diversity in age ranges (no paediatric or geriatric trials met inclusion criteria at the time), had limited racial and ethnic heterogeneity, and under-represented patients with common comorbidities seen in routine clinical settings. The authors note that intranasal esketamine trials were not included owing to fewer published studies and data-sharing constraints, which may limit generalisability to that licensed treatment. In terms of implications, the authors conclude that, because routinely collected clinical and demographic variables have limited predictive power, a pragmatic "fast-fail" clinical strategy—testing a short, time-limited course of one to three infusions to identify responders—remains the most accurate available approach to individualise ketamine treatment. They further suggest that development and integration of mechanistic, treatment-relevant biomarkers (for example neuroimaging or accessible blood-based measures) may be necessary to achieve clinically useful precision in predicting ketamine response, although such measures are currently less available and insufficiently replicated.
Conclusion
This international pooled patient-level meta-analysis validated the rapid and short-term antidepressant efficacy of a single IV ketamine infusion, with overall responder and remission rates of approximately 46% and 27% respectively. Despite exhaustive sequential and data-driven combined moderator searches across 37 variables, the ability to predict which patients will preferentially benefit from ketamine prior to treatment was limited. The investigators therefore recommend that, until more predictive mechanistic markers are established, a time-limited trial of ketamine (a "fast-fail" strategy) remains the most accurate method to identify individual responders, while noting access, cost, and equity issues that may impede this approach in many healthcare contexts.
INTRODUCTION
Ketamine is a glutamatergic agent used routinely for induction and maintenance of anesthesia. In randomized controlled trials (RCTs), subanesthetic (typically, 0.5 mg/kg) intravenous (IV) ketamine exhibits well-replicated, rapid, potent antidepressant effects (i.e., study-level meta-analytic Cohen's d ≥ 1.0, reflecting large effects) in difficult-to-treat conditions such as treatment-resistant depression and bipolar depression. Antidepressant effects are detected within approximately 2 hours post-infusion (after acute dissociative and euphoric side effects subside) and continue far beyond the drug's elimination half-life of 2.5-3 hours. Ketamine is now administered outside of research environments, including in hospital settings and specialized "ketamine therapy" clinics. However, IV ketamine's clinical potential has been limited by practicalities including lack of insurance coverage for this off-label prescribing practice, high out-of-pocket expense to patients in many healthcare systems, burden on patients and the healthcare system due to ketamine's side effect profile and administration routes, and concerns for abuse liability. Such limitations may nevertheless be offset among a subset of patients for whom a strong, rapid response to ketamine administration is highly likely. But to date, there is limited understanding of which patients are likely to experience robust benefit. Because IV ketamine's effect size at a group level is typically large, RCTs have routinely been conducted with small sample sizes. Although such studies are adequately powered to detect ketamine's effects at the group level, individual RCTs are often under-powered for conducting moderator analyses, i.e., analyses of baseline characteristics that can indicate which patients experience more benefit from ketamine relative to a comparator. 
Moderator analyses may yield smaller effect sizes, necessitating larger samples, and rely on sufficient heterogeneity within study participants. Although some predictors of ketamine's antidepressant efficacy, including clinical (e.g., family history of alcohol use disorder; suicide history; body mass index (BMI); benzodiazepine use) and mechanistic (e.g., neuroimaging; cognitive; peripheral blood markers; genetic) variables, have been reported, none have been replicated across more than one RCT. RCT designs are essential to separate specific from nonspecific predictors of outcome, but many predictive analyses have been conducted in ketamine-treated patients alone. Study-level meta-analyses have likewise not identified reliable moderators of effect size across trials. A more powerful meta-analytic approach is therefore needed to guide clinical treatment decisions, ideally focusing on moderators that can be readily measured in clinical settings.
The current study therefore employed a pooled patient-level 'mega-analytic' approach using participant-level data from RCTs of IV ketamine, administered to individuals experiencing depressive symptoms. While preserving the advantages of conventional meta-analysis as a means of aggregating evidence across numerous studies (overcoming certain limitations of individual studies, e.g. small sample size), patient-level 'mega-analysis' (also known as individual participant data meta-analysis) offers unique advantages. These include an order-of-magnitude increase in data points analyzed for each variable (many per study rather than one summary measure per study), which substantially increases statistical power, particularly for testing moderators, and the ability to test hypotheses that could not be adequately tested in the individual original studies. We aimed to clarify the potential role of IV ketamine in the treatment of depression by: (1) characterizing the impact of IV ketamine (vs. control groups) on continuous and dichotomous measures of depression, including clinically meaningful (response/remission) benchmarks; (2) identifying individual patient and study-level characteristics that moderate ketamine's effect on symptoms, in the hopes of suggesting ways to maximize response rates through personalized patient prescriptions; (3) utilizing a data-driven 'combined moderator' approach to identify novel combinations of patient characteristics that together may enhance clinical prediction and decision-making accuracy for use in clinical settings.
STUDY IDENTIFICATION AND SELECTION
The meta-analysis protocol was pre-registered (CRD42021235630). PubMed was searched over the period from inception to 01/19/2021 using the auto-expanding option encompassing all terms and synonyms related to the following search: "ketamine AND (randomized or RCT) AND depress*". Published meta-analyses and reviews were checked for additional relevant studies. Two independent raters assessed eligibility of all records according to inclusion criteria (agreement = 87%), and a third rater (RBP) resolved all discrepant eligibility determinations (n = 70; 13% of abstracts reviewed). Based on a dimensional conceptualization of depression and to promote patient-level diagnostic heterogeneity, all studies retrieved through our systematic literature review (as described above) were considered eligible if they recruited individuals with a unipolar or bipolar depressive disorder or another highly comorbid disorder in which depressive symptoms are central (e.g., post-traumatic stress disorder), and in which depression scores were reported as an outcome. At least one IV ketamine administration was required. Studies giving ketamine in combination with additional study-administered treatments (e.g., ECT) were excluded to improve power for testing mechanistic hypotheses relevant to ketamine specifically; however, studies including patients on stable doses of other concomitant medications prescribed clinically were allowable. An RCT design was required to minimize bias. Allowable control conditions included inert or psychoactive placebo, wait-list, or treatment-as-usual. Finally, to maximize data points while using uniform outcome measures across studies, depression outcome measures were selected as those most frequently reported in ketamine studies. Two outcomes emerged as most prevalent: (1) the Montgomery-Åsberg Depression Rating Scale (MADRS), and (2) the Hamilton Rating Scale for Depression (17-item version; HRSD). 
Both are widely used, well-validated, clinician-rated measures of depression severity. Authors of eligible studies were invited, via email, to contribute data. Repeated attempts were made if no response was received. The following data were requested per-participant, with authors asked to contribute all available variables: drug condition, infusion order (relevant for crossover studies), pre- and post-infusion MADRS and HRSD-17 scores, and 33 potential moderator variables (detailed below). For post-infusion scores, the target timepoints relative to the infusion date were 24-hours ("rapid") and 7 days ("post-rapid") following a single infusion, and this precise protocol was available in 66.7% of contributing studies; however, deviations from these designs in a subset of included studies were allowable if the "rapid" outcome was collected between 4 hours and 3 days after a single infusion (with no additional infusions given in the interim), and if the "post-rapid" outcome was collected between 6 and 14 days following a first infusion, even if subsequent infusions were also given within that interval (see Table for protocol details of all included studies). Anxiety (Hamilton Anxiety Rating Scale) and suicidal ideation (Beck Scale for Suicide Ideation) at baseline and 24-hours were also solicited as potential exploratory outcomes but were provided by too few studies to be considered usable (≤33.3%).
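The allowable assessment windows described above can be expressed as a small classification rule. This is an illustrative sketch only: the function name is hypothetical, treating the window boundaries as inclusive is an assumption, and the rule ignores the additional condition that no further infusion was given before the "rapid" assessment.

```python
def classify_timepoint(hours_since_first_infusion):
    """Map an assessment time to the 'rapid' or 'post-rapid' window, else None.

    Windows follow the text: rapid = 4 hours to 3 days; post-rapid = 6 to 14 days.
    Inclusive boundaries are an assumption made for this sketch.
    """
    h = hours_since_first_infusion
    if 4 <= h <= 3 * 24:
        return "rapid"
    if 6 * 24 <= h <= 14 * 24:
        return "post-rapid"
    return None


# The two target timepoints (24 hours and 7 days) fall inside their windows.
target_rapid = classify_timepoint(24)
target_post_rapid = classify_timepoint(7 * 24)
```

An assessment at, say, 100 hours falls between the two windows and would not be usable for either outcome under this rule.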
QUALITY ASSESSMENT AND DATA EXTRACTION
Each contributing study team was asked to attest to specific methodological details (randomization, allocation concealment, blinding, and missing data). Responses were used to summarize the degree of protection against bias across 5 relevant criteria from the Cochrane Collaboration's risk of bias tool. Risk of bias based on the responses provided was uniformly low, with the exception of some risk of functional unblinding due to ketamine-specific side effects (details in Supplementary-1). No evidence of publication bias was found (Supplementary-1).
DATA HARMONIZATION
As shown in Table, 10 studies collected MADRS only, 3 studies collected HRSD-17 only, and 4 studies collected both MADRS and HRSD-17 scores. Given the higher prevalence of MADRS scores, to harmonize outcome measurement across all studies and maximize sample size for all analyses, a published score-to-score conversion algorithm for depressed patients was utilized to estimate individual MADRS scores (at each timepoint) from HRSD-17 scores. Sensitivity analyses showed that studies where the MADRS was estimated did not significantly differ from other studies in terms of average MADRS scores or ketamine efficacy (Supplementary-1). Owing to the application of consensus guidelines in ketamine clinical research, ketamine dosing, administration, and infusion methods were largely uniform across included studies (Table). Based on the strong preponderance of studies using 0.5 mg/kg ketamine dosing, and prior evidence of dose-response relationships, primary analyses defined each patient's treatment group as either (1) ≥0.5 mg/kg of intravenous ketamine or (2) placebo (inert or psychoactive). Patients receiving other ketamine doses (7.6% of patients), or other potentially active antidepressants (lanicemine; 2.4% of patients), were not included. In the minority of studies that utilized a crossover and/or repeated infusions design, we included only data relating to the first infusion that was given, thereby eliminating additional repeated within-subject measurements uniformly across all studies.
Table notes. a: N = number of unique patients with data provided and used in current primary analyses. Values may differ from those reported in original publications due to the eligible treatment conditions used in the pooled patient-level meta-analysis. For the "rapid" timepoint, the datapoint was approximately 1 day following the first infusion in the sequence of infusions. For the "post-rapid" timepoint, the datapoint that was as close as possible to 7 days after the first infusion was used, even if subsequent serial infusions had been given within the ~7-day post-infusion period. To harmonize outcomes for primary analyses, MADRS scores were estimated from HRSD-17 scores according to a published conversion table. d: Data not included in analyses.
R.B. PRICE ET AL.
The 33 requested moderator variables were selected through consensus among study planners (RBP, EDB, CJZ, STW, SJM) to represent a comprehensive list based on previously reported moderation and prediction findings for ketamine and the study team's knowledge of basic clinical (psychiatric and medical) and demographic information that is routinely collected in ketamine trials or was anticipated to be available in at least a subset of ketamine RCTs. The variables were returned in a range of formats and with highly variable data availability/compliance. For study-level characteristics used in descriptive and moderator analyses, design features were extracted by one rater (AB) and independently verified by a second rater (RBP). A single rater (RBP) then utilized a combination of automated (e.g., text string search) and hand-coding procedures to apply data harmonization techniques and create a uniform final set of dummy-coded (categorical) and continuous variables that maximized the capacity to analyze moderators uniformly across studies, as detailed in Table. In the final set of harmonized moderators (Table), availability of patient-level data ranged from 10.8% of patients to 100%, with a median of 55.6%. A second rater (MLWoody) independently verified all coded variables by cross-referencing the original source data; discrepant values were resolved by consensus.
STATISTICAL ANALYSIS
Analyses were conducted comparing IV ketamine doses of 0.5 mg/kg or greater vs. all placebo conditions, with inert and psychoactive placebo collapsed into one group (type of placebo condition was analyzed as a study-level moderator). Two outcomes were computed as the % improvement in MADRS score from pre-infusion to: (a) "rapid" post-infusion MADRS and (b) "post-rapid" post-infusion MADRS. MADRS response (≥50% decrease from pre-infusion) and remission (MADRS ≤ 9) rates were calculated to provide further descriptive information on the clinical main effects of ketamine vs. placebo, but were not used as outcomes in moderator analyses, given that the goal of these analyses was to explain heterogeneity of outcomes, which is maximally captured by continuous measures. Individual patient data analyses were completed separately for "rapid" and "post-rapid" continuous outcomes using linear mixed effects regression models. All models included a random study effect to control for unobserved study heterogeneity; patient-level data were considered level 1 and study-level data were considered level 2. For interpretability, continuous variables were standardized and dichotomous variables were coded as 0 and 1. All analyses were performed using R version 3.6.3. Completion rates were high in the contributing studies (≥90%) and risk-of-bias assessments (Supplementary-1) suggested low risk of bias from missing data. The novel information obtainable through imputation was expected to be low due to high completion rates, the use of only two assessment points in each analysis, and the inability to impute across studies. Therefore, completer datasets were used for all analyses.
Main effects. We tested the main treatment effect for % improvement, response, and remission at the "rapid" and "post-rapid" time points. Standardized coefficients (β*) or odds ratios (OR) with 95% profile likelihood confidence intervals are reported for these outcomes. Number needed to treat (NNT) is also provided.
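The outcome definitions above translate directly into arithmetic. The sketch below is illustrative rather than the authors' code: the function names and example scores are hypothetical, the NNT formula (reciprocal of the difference in response rates) is the conventional definition rather than something spelled out in the text, and rounding NNT up to a whole patient is a common convention assumed here.

```python
import math


def pct_improvement(pre, post):
    """Percentage improvement in MADRS from pre- to post-infusion."""
    return 100.0 * (pre - post) / pre


def is_response(pre, post):
    """Response: at least a 50% decrease from the pre-infusion score."""
    return pct_improvement(pre, post) >= 50.0


def is_remission(post):
    """Remission: post-infusion MADRS of 9 or lower."""
    return post <= 9


def number_needed_to_treat(rate_treated, rate_control):
    """Conventional NNT: 1 / absolute difference in rates, rounded up."""
    difference = rate_treated - rate_control
    if difference <= 0:
        raise ValueError("NNT is undefined without a benefit over control")
    return math.ceil(1.0 / difference)


# Hypothetical patient: MADRS 30 before infusion, 12 afterwards.
improvement = pct_improvement(30, 12)
responder = is_response(30, 12)
remitter = is_remission(12)

# Hypothetical response rates, chosen only to make the arithmetic easy to follow.
nnt = number_needed_to_treat(0.50, 0.25)
```

In this example the patient improves by 60% and counts as a responder but not a remitter, which shows why the two dichotomous benchmarks can disagree for the same person.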
Sequential moderator analyses. Potential moderators were first tested sequentially. For each of the two outcome variables (% change in MADRS at rapid and post-rapid timepoints), models included the moderator variable, treatment, and their interaction term (moderator*treatment) as independent variables, with study as a random effect. A class of 9 moderator variables were non-redundant and available in ≥99.5% of patients and were therefore considered as primary (labeled "Tier 1"). Two-tailed p-values are reported with Bonferroni correction across these 9 variables; for completeness, unadjusted p-values are also reported. An additional set of 29 moderators were available in a minimum of 40% of patient-level datasets. These "Tier 2" variables, available in 40-82% of patients, were considered exploratory due to lower statistical power and low case counts for some patient features. Thus, Tier 2 p-values are unadjusted to minimize Type II error. The cut-point of ≥40% for inclusion in Tier 2 was determined based on a natural inflection point in the distribution of missingness (see Table), allowing for retention of 78% of all potential moderators, with a minimum of n = 288 patients in each individual moderator analysis. Five continuous moderator variables (Table) showing substantial deviations from normality per Q-Q plot inspection were log-transformed prior to analysis. For each model, we extracted the standardized β (β*) and 95% confidence interval for the interaction term. We also computed the moderator effect size, r, with 95% bootstrap confidence intervals based on 200 samples. These effect sizes are Spearman correlations that indicate the strength with which a potential moderator distinguishes outcome differences between those receiving ketamine versus placebo. 
More positive r values indicate that higher values of an ordered moderator (or endorsing a categorical moderator) are associated with higher percentage improvement in depression scores for ketamine relative to placebo. As a benchmark to guide our interpretation of findings, for both individual and combined moderators, we considered only moderators with medium-to-large effect sizes (|r| ≥ 0.3) to be of sufficient explanatory power to be useful in guiding clinical decision-making.
Combined moderator analyses. A data-driven approach was taken to probe for combinations of moderator variables that jointly (as a weighted combination) predict efficacy of ketamine over placebo. The combined moderator is denoted M*. Its derivation has been described in detail previously and used successfully to identify combined moderators for randomized trials. Briefly, the optimal combined moderator approach uses multivariable regularized regression to simultaneously estimate weights that quantify the extent to which each moderator distinguishes outcome differences between participants who received ketamine versus placebo. These weights are used to compute a new combined moderator, denoted M*. M* incorporates information across multiple potentially weak and/or contradictory moderators, thereby providing a single, stronger indication of the treatment on which an individual is likely to have a preferable outcome. Bootstrap confidence limits for M* were computed and used to determine statistical significance based on whether the CI crossed 0, as this approach to significance testing was robust to the nested study design. As above, two separate models were run for each analysis, using (1) the rapid and (2) the post-rapid timepoints as the outcome variable. Tier 1 M* models included six Tier 1 variables that pertained to patient characteristics (M* #1). 
Two Tier 1 variables (crossover design; placebo type) were excluded from these analyses, because they pertained strictly to research study design features and inferences would not be generalizable to clinical treatment settings; and one additional Tier 1 variable (principal diagnosis) was omitted due to high overlap/redundancy with the Major Depressive Disorder (MDD) diagnosis dummy-coded variable already included. Next, 7 unique subsets of Tier 2 variables (M* #2a-2g) were constructed to organize moderator variables thematically (as shown in Table) while also maximizing the number of retained datapoints within each analysis. Given that each moderator variable in Tier 2 was available within a unique subset of studies, compiling numerous (i.e., ≥3) Tier 2 variables into a single M* analysis would necessitate reducing the total number of patients/studies available for use within that analysis. Thus, we opted to separately analyze the 7 unique moderator variable subsets (M* #2a-2g). Each of these Tier 2 M* analyses retained all six of the Tier 1 patient characteristic variables (the inclusion of these Tier 1 variables never reduced the number of studies/patients available for any analysis, due to >99% availability of each Tier 1 variable across the full dataset, and thus could only increase predictive power for the data-driven approach), while adding between 1 and 3 unique Tier 2 variables (see Table, "Tier/Analysis"). M* analyses in each Tier 2 level included a maximum of n = 632 (Tier 2a) and a minimum of n = 217 patients (Tier 2f). As with the sequential analyses, for each M* we extracted the standardized beta for the interaction term and the moderator effect size r. Non-specific predictor effects. Although our a priori focus was on moderators predicting differential response to ketamine vs. placebo, the non-specific effects (i.e., across ketamine and placebo arms) for each potential moderator variable were also quantified.
This information is included in the full statistical output (Supplementary-1).
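To make the moderator effect size r from the sequential analyses more concrete, here is a minimal sketch of one way such a statistic can be computed. It assumes a pairing construction that the text does not spell out: every ketamine patient is paired with every placebo patient, and r is the Spearman correlation between the pair's average moderator value and the pair's outcome difference (ketamine minus placebo). The function names, the simulated data, and the percentile bootstrap are illustrative assumptions, and the sketch ignores the nesting of patients within studies.

```python
import random


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            out[order[pos]] = mean_rank
        start = end + 1
    return out


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # undefined in principle; treated as 0 for this sketch
    return sxy / (sxx * syy) ** 0.5


def moderator_r(ketamine, placebo):
    """Spearman r over all ketamine-placebo pairs (assumed construction).

    Each argument is a list of (moderator, outcome) tuples.
    """
    avg_m, diff_y = [], []
    for m_k, y_k in ketamine:
        for m_p, y_p in placebo:
            avg_m.append((m_k + m_p) / 2)
            diff_y.append(y_k - y_p)
    return pearson(ranks(avg_m), ranks(diff_y))


def bootstrap_ci(ketamine, placebo, n_boot=200, seed=0):
    """95% percentile bootstrap CI, resampling patients within each arm."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        k = [rng.choice(ketamine) for _ in ketamine]
        p = [rng.choice(placebo) for _ in placebo]
        stats.append(moderator_r(k, p))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


def simulate(n, slope, rng):
    """Simulated (moderator, outcome) pairs; purely illustrative."""
    data = []
    for _ in range(n):
        m = rng.gauss(0, 1)
        data.append((m, slope * m + rng.gauss(0, 1)))
    return data


rng = random.Random(42)
ketamine = simulate(20, 0.8, rng)  # outcome depends on moderator
placebo = simulate(20, 0.0, rng)   # outcome unrelated to moderator
r = moderator_r(ketamine, placebo)
low, high = bootstrap_ci(ketamine, placebo)
print(f"r = {r:.3f}, 95% bootstrap CI = ({low:.3f}, {high:.3f})")
```

Under this construction, a positive r means that pairs with higher moderator values tend to show a larger ketamine-minus-placebo difference, which matches the sign convention described in the text.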
STUDY SELECTION
See Fig. for PRISMA flowchart. At least one usable outcome variable was obtained from 68% of eligible studies (17/25; n = 809 patients). Of these, a total of n = 720 patients received one of the ketamine or control conditions specified for inclusion in meta-analyses. Table presents descriptive characteristics of participating studies; Supplementary-1 presents quality assessments of included studies.
SEQUENTIAL MODERATORS
Of 37 moderators tested sequentially, three significant "Tier 1" moderators were identified pertaining to study-level design features (two that were robust after adjusting for multiple comparisons), and one exploratory "Tier 2" patient-level moderator was significant. Tier 1 moderators. The effect of ketamine, relative to placebo, was greater for studies with a higher treatment-resistant depression (TRD) threshold (≥2 failed antidepressant medication [ADM] trials) as a condition of enrollment. The effect for the rapid timepoint outcome [r = 0.083; β*(95% CI) = 0.32 (0.04, 0.59); p unadjusted = 0.023; p adjusted = 0.207] did not survive multiple comparisons correction, but the effect for the post-rapid timepoint outcome was robust [r = 0.108; β*(95% CI) = 0.47 (0.16, 0.77); p unadjusted = 0.003; p adjusted = 0.027]. These interaction effects were driven jointly by numerically (but not statistically) larger ketamine responses, combined with numerically (but not statistically) lower placebo responses, in studies enrolling patients with greater treatment resistance (Fig.). The effect of ketamine relative to placebo was also greater for studies with a crossover design, but only at the rapid timepoint [r = 0.132; β*(95% CI) = 0.52 (0.23, 0.81); p unadjusted = 0.0004; p adjusted = 0.036; Fig.], and not at the post-rapid timepoint [r = 0.041; β*(95% CI) = 0.16 (-0.15, 0.48); p unadjusted = 0.301; p adjusted = 1.0]. This interaction effect at the rapid timepoint was driven by a significantly lower placebo response in the trials with a crossover design [within placebo-treated patients: β*(95% CI) = -0.48 (-0.86, -0.09); p = 0.020], while the ketamine response in crossover trials was numerically (but not statistically) higher than in parallel-arm studies [within ketamine-treated patients: β*(95% CI) = 0.11 (-0.23, 0.45); p = 0.506].
The effect of ketamine, relative to placebo, was also greater for studies completed in the U.S., but only at the post-rapid timepoint, and this did not survive multiple comparisons correction [r = 0.089; β*(95% CI) = 0.41 (0.10, 0.72); p unadjusted = 0.0096; p adjusted = 0.086]. This pattern was driven jointly by a numerically (but not statistically) lower placebo response and a numerically (but not statistically) higher ketamine response among trials conducted in the U.S. (Fig.). Tier 2 (exploratory) moderators. At the post-rapid timepoint (but not the rapid timepoint), baseline systolic blood pressure moderated response [r = 0.106; β*(95% CI) = 0.23 (0.04, 0.42); p unadjusted = 0.019], such that higher blood pressure at baseline was associated with better post-rapid response to ketamine specifically. See Supplementary-1 for effect sizes and statistics for all (Tier 1 and Tier 2) individual moderators. Six additional moderators [placebo type (inert vs. psychoactive); marital status; Black race; number of failed trials (coded at the patient level); number of major depressive episodes; BMI] exhibited non-significant trend-level (p unadjusted < 0.10) moderation effects in at least one analysis.
Table footnotes:
a Tier 1 variables are those available in >99% of patients in the pooled dataset. These first two Tier 1 variables, which related to study design features rather than to patient-level characteristics, were excluded from all combined moderator analyses, as the purpose was to identify combinations of patient (not study) characteristics that predicted differential improvement following ketamine (relative to placebo).
b Tier 1 variables are those available in >99% of patients in the pooled dataset. These 7 Tier 1 variables were related to patient-level characteristics and thus were included in the Tier 1 M*, and also retained in all M* analyses across all Tiers 2a-f, as their inclusion never reduced the number of studies/patients available for any analysis and could only increase predictive power for the data-driven approach.
c Bipolar I/II diagnosis was highly non-orthogonal and virtually redundant with the MDD diagnosis variable, as bipolar and unipolar depression diagnoses are mutually exclusive. To eliminate this redundancy in hypothesis tests, this variable was not tested as a moderator in any analysis.
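The Bonferroni adjustment applied to the Tier 1 p-values can be illustrated with a short worked example. This is a minimal sketch assuming the standard form of the correction (multiply the unadjusted p-value by the number of tests, here the 9 Tier 1 variables, and cap the result at 1); the function name is hypothetical. The inputs are unadjusted p-values reported in this section for four of the Tier 1 results.

```python
def bonferroni(p_unadjusted: float, n_tests: int = 9) -> float:
    """Bonferroni-adjusted p-value, capped at 1.0."""
    return min(1.0, p_unadjusted * n_tests)


# Unadjusted p-values as reported in the text for four Tier 1 results.
reported_unadjusted = {
    "TRD threshold, rapid": 0.023,
    "TRD threshold, post-rapid": 0.003,
    "Crossover design, post-rapid": 0.301,
    "US study, post-rapid": 0.0096,
}

for label, p in reported_unadjusted.items():
    print(f"{label}: p adjusted = {bonferroni(p):.3f}")
```

A result remains significant after correction only if its adjusted value stays below 0.05, which among these four rows holds only for the TRD threshold at the post-rapid timepoint.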
COMBINED MODERATORS
Full findings for all M* analyses are presented in Supplementary-2. Overall, each M* analysis was statistically significant (95% CI did not cross 0), and all M* effect sizes uniformly exceeded the largest effect size observed for any individual moderator above (i.e., r = 0.11). However, effect size point estimates (r; interpretable as a correlation coefficient) remained small-to-medium (range across all M* analyses: r = 0.12-0.29). M* #2f provided the maximum differential effect size for both the rapid [r (95% CI) = 0.293 (0.175, 0.415)] and post-rapid [r (95% CI) = 0.234 (0.118, 0.347)] outcome timepoints. This model utilized data from n = 232 patients (7 studies) and included six Tier 1 variables [current MDD diagnosis (present/absent), inpatient (vs. outpatient), age, sex, study done in US, study TRD threshold ≥2] plus BMI and smoker status (yes/no). For the rapid timepoint (where the effect size was maximal), study-level TRD threshold, MDD diagnosis, country where the study was conducted (US or outside of US), and BMI contributed the largest weights to the combined moderator, such that participants who had greater treatment resistance, had no diagnosis of MDD (e.g., had bipolar disorder, PTSD), were enrolled in the US, and had a higher BMI tended to have greater improvement in ketamine relative to placebo. Notably, only one of these variables was significant as an individual moderator, but in combination, the variables provide information regarding participants who may benefit from ketamine, with a small-medium combined effect size.
COMMENT
The current analyses were conducted in the largest pooled patient-level dataset of ketamine-treated patients to date, involving patients enrolled in 8 countries (over 4 continents) who were assessed for depression symptoms before and after a single infusion. Results from patient-level data confirmed the robust rapid (approximately 1 day post-infusion) and post-rapid (approximately 7 days post-infusion) impact of IV ketamine on depression symptoms across a wide range of study designs and patient characteristics. Overall response (peak of 46%) and remission (peak of 27%) rates were comparable to those observed retrospectively in clinical settings, but lower than those observed in the earliest published RCTs, consistent with a waning pattern of effect sizes observed across many disciplines as a field of study matures.
Fig. Moderators of the effect of ketamine vs. placebo on standardized % improvement in MADRS scores. In all figures, larger scores on the y-axis = greater improvement from baseline, expressed in standard deviation units relative to the overall sample mean. A: moderation by study's eligibility threshold for the number of previous failed, adequate antidepressant medication trials that were required for study enrollment (post-rapid timepoint); B: moderation by use of a crossover design (rapid timepoint); C: moderation by study performance in the US (post-rapid timepoint). Regression prediction lines based on models predicting MADRS % improvement from baseline (standardized across the full dataset) at post-infusion (rapid or post-rapid) timepoint with a random effect for study. All individual patient-level datapoints are depicted by red triangles (ketamine-treated patients) or black circles (placebo-treated patients). Statistics overlaid on each figure depict the simple effects of the moderator variable within ketamine-treated patients alone and within placebo-treated patients alone.
Despite variability in patient outcomes, an exhaustive search for moderators of outcome across 37 variables (Table) produced very few individual study- or patient-level features that reliably predicted ketamine's benefit over placebo, suggesting ketamine's antidepressant impact is highly uniform across heterogeneous patients. Compiling information across multiple variables simultaneously using a validated, data-driven approach yielded several combined moderators, whereby combining study- and patient-level variables enabled the differential impact of ketamine among some patients relative to others to emerge. Nevertheless, effect sizes remained modest (max effect size of r = 0.29, a small-medium effect), suggesting limited clinical utility for precision medicine applications. Despite modest effect sizes, the few significant moderators that were identified have implications for both research design and clinical applications. The observation of stronger effects among studies utilizing a higher threshold of treatment-resistance for study entry (≥2 failed adequate trials of a federal regulatory agency-approved antidepressant medication) suggests that studies will have improved power to detect separation of ketamine from placebo if such eligibility thresholds are used, and further confirms that the current consensus recommendation to conduct a thorough treatment history assessment and consider reserving ketamine treatment for patients who have not responded to previous adequate trials of first-line depression treatments is well warranted, unless an urgent clinical need (e.g., suicidal crisis; marked deterioration in functioning) is present that justifies an initial (and potentially time-limited) course of ketamine. In practice, specialized ketamine clinics may not uniformly uphold this standard, which raises an ethical concern in light of relatively high out-of-pocket expenses to patients.
A second study design feature, the use of a crossover design, was also associated with enhanced ketamine efficacy. Of note, the effect of crossover study design cannot be explained by carry-over effects, repeated measurements, or the influence of repeated infusions themselves (e.g., increased functional unblinding), since only data from the first infusion each patient received was included in the present analyses. Patient expectancies, a powerful predictor of response, might be differentially impacted in crossover relative to parallel arm studies, given the guarantee of receiving ketamine. Finally, the finding of stronger post-rapid efficacy among U.S. patients, which did not survive multiple comparisons correction, could tentatively be related to cultural features of U.S. patients; features of the U.S. clinical treatment landscape (e.g., private insurance; specific treatment settings and guidelines); and/or study features, including the chronology of data collection, with the initial discoveries of ketamine's antidepressant effects occurring in the U.S. In Tier 1 moderator analyses, which included all patients in the pooled sample, the absence of moderating effects for numerous demographic and clinical features, including age, sex, and unipolar (relative to bipolar) depression, suggests broadly equivalent clinical applicability of ketamine treatment for providing acute relief to heterogeneous adults with depression symptoms. The consistent lack of moderating effects for sex among human patients is important given that such effects have been suggested based on pre-clinical animal models. Likewise, the lack of moderation findings for medication status (presence/absence of concomitant psychiatric medications, as well as number of psychiatric medications) is also notable and relevant in both research and clinical practice.
Similarly, the current analyses did not uphold the reliability of several moderators reported previously in smaller cohorts, such as concurrent benzodiazepine prescriptions and BMI. We leveraged an innovative data-driven "combined moderator" approach to produce optimized weighted combinations of discrete moderator variables, a technique that has been used previously to identify subgroups of patients who will respond beneficially to a treatment, even when each individual moderator, treated in isolation, cannot do so. For instance, although BMI moderated outcome only at a trend level in sequential moderator analyses (Supplementary-1), our combined moderator analyses (M* #2f) for the rapid timepoint suggested that having increased BMI, in combination with living in the US, having no diagnosis of MDD (e.g., bipolar disorder, PTSD), and having greater prior treatment resistance, and when simultaneously accounting for information across 6 additional variables (see Supplementary-2, Tier #2f analyses), did predict differential response to ketamine, to the greatest degree of any of the 8 unique moderator combinations tested within the current analyses. Nevertheless, the maximum effect size remained small by conventional standards (r ≤ 0.29), meaning much of the variance in post-ketamine depression was left unexplained. In previous clinical trials where the current combined moderator approach has been applied, combined moderators have yielded larger effect sizes, reinforcing the conclusion that ketamine's differential impact on depression was particularly challenging to predict from the current set of moderators, whether tested alone or in combination. More broadly, the scarcity of moderation findings in the present analyses suggests that information available routinely in clinical settings (i.e., demographic and clinical features) may have limited utility in guiding precision medicine application of ketamine treatment to individual patients.
Mechanistic moderators assessing treatment-relevant substrates with more costly and/or invasive methods (e.g., neuroimaging; blood tests) may be necessary to explain sufficient variance to guide clinical decision-making, but studies of such response markers are few and findings have yet to be replicated. Enhancing the availability and generalizability of such measures in real-world clinical settings may prove an important longer-term goal.
LIMITATIONS
We were constrained by certain aspects of the available published datasets, including predominant use of single infusion designs within randomized trials, which differs from clinical practice in which serial ketamine infusions are the norm; lack of longer-term follow-up data; and a constrained set of moderators available for harmonization across multiple datasets. Several moderators were available only as between-study indicators, which decreases statistical power to detect moderation and fails to fully leverage the pooled patient-level approach. In M* analyses, comparisons of effect sizes across Tiers 2a-g are complicated by the different subsets of patients and studies available for inclusion in each analysis; however, due to small-to-medium overall effect sizes observed consistently across all tiers, the interpretation of moderator findings as having low overall clinical utility is not impacted. Although previous studies suggest that response to a single, first infusion of ketamine is a fairly robust predictor of response to subsequent, serial infusions, some (but not all) findings suggest enhanced outcomes can be achieved even among first infusion non-responders through sustained treatment. Our analyses cannot account for this possibility. We did not include trials of the FDA-approved compound intranasal esketamine, given relatively fewer published studies with lower clinical heterogeneity within such studies and relevant proprietary restrictions that impacted the availability of patient-level data when attempting to establish institutional data-sharing agreements. Though this might limit the clinical generalizability of our analyses, off-label IV ketamine use remains widespread, and the need for precision medicine tools is even more pressing in these contexts given that the cost of such treatments predominantly rests with the patient.
At the time of the literature review, no published studies that recruited pediatric/adolescent or geriatric patients could be identified meeting other study eligibility criteria, although positive findings in these age groups have been reported in the interim. Similarly, few studies could be identified in patients with non-primary depressive diagnoses that measured pre- and post-infusion depression with standard outcome measures, and most studies excluded patients with psychiatric, substance, and/or medical comorbidities that are commonly present in real-world clinical patients and urgently require novel treatment approaches, as they confer heightened risk of poor outcomes (e.g., suicidal behaviors; protracted course of illness). Finally, despite strong international collaboration, the included datasets had high racial and ethnic homogeneity, both within and across studies. Given the transdiagnostic, cross-developmental relevance of depressive symptoms and clinical interest in a broad range of applications for ketamine within psychiatry, recruitment of heterogeneous patient samples with greater real-world representation, diversity, and key comorbidities (e.g., concurrent depression and substance use disorders) is an important goal for future work.
CONCLUSIONS
The efficacy of IV ketamine for both rapid and post-rapid depression reduction was validated in this international pooled patient-level mega-analysis. Although the clinical response to ketamine treatment showed substantial individual differences and room for improvement (46% overall responder rate and 27% remission), the current, comprehensive search for moderators, involving both sequential/univariate and data-driven combined moderator methods, yielded limited capacity to guide clinical decision-making in advance of a first infusion. Given the rapidity of ketamine's therapeutic onset, a "fast-fail" approach to empirically assess the impact of a time-limited trial of infusions (e.g., between one and three infusions) remains the most accurate method currently available, but in many countries (such as the U.S.), this approach has low accessibility to the vast majority of patients, entailing high out-of-pocket expense and introducing potential concerns regarding risk-to-benefit ratio. Further development of mechanistic measures, particularly those that map onto ketamine's essential impacts on the brain, yet remain clinically accessible and affordable to perform at pre-infusion baseline, may yield an as-yet unrealized capacity for precision ketamine treatment.
filed by the ISMMS for the use of intranasally administered Neuropeptide Y (NPY) for the treatment of mood and anxiety disorders. This intellectual property has not been licensed. DSC is a named co-inventor on a patent application in the US, and several issued patents outside the US filed by the ISMMS related to the use of ketamine for the treatment of post-traumatic stress disorder (PTSD). This intellectual property has not been licensed. DSC is a named co-inventor on a patent application filed by ISMMS for systems and methods for providing a resilience building application to support mental health of subjects. This intellectual property has not been licensed.
In the past 5 years, JWM has provided consultation services and/or served on advisory boards for Boehringer Ingelheim, Clexio Biosciences, Engrail Therapeutics, FSV7, Global Medical Education (GME), Otsuka, and Sage Therapeutics. JWM is named on a patent pending for neuropeptide Y as a treatment for mood and anxiety disorders and on a patent pending for the use of KCNQ channel openers to treat depression and related conditions. The Icahn School of Medicine (employer of JWM) is named on a patent and has entered into a licensing agreement and will receive payments related to the use of ketamine or esketamine for the treatment of depression. The Icahn School of Medicine is also named on a patent related to the use of ketamine for the treatment of PTSD. JWM is not named on these patents and will not receive any payments. JJM receives royalties for commercial use of the C-SSRS from the Research Foundation for Mental Hygiene. SJM is supported through the use of resources and facilities at the Michael E. DeBakey VA Medical Center, Houston, Texas and receives support from The Menninger Clinic, Houston, Texas. SJM has served as a consultant to Allergan, Alkermes, Axsome Therapeutics, BioXcel Therapeutics, Clexio Biosciences, Eleusis, EMA Wellness, Engrail Therapeutics, Greenwich Biosciences, Intra-Cellular Therapies, Janssen, Levo Therapeutics, Perception Neurosciences, Praxis Precision Medicines, Neumora, Neurocrine, Relmada Therapeutics, Sage Therapeutics, Seelos Therapeutics, Signant Health, and Worldwide Clinical Trials. SJM has served as an investigator for clinical trials funded by Janssen, Merck, NeuroRx, and Sage Therapeutics, and has received research support from Biohaven Pharmaceuticals and VistaGen Therapeutics. DMM has received speaker's honoraria from MECTA, Otsuka and Janssen and an honorarium from Janssen for participating in an esketamine advisory board meeting.
GP has served as a consultant for Abbott Laboratories, Acadia Pharmaceuticals, Inc, Alkermes, Inc, Alphasigma USA, Inc, AstraZeneca PLC, Avanir Pharmaceuticals, Axsome Therapeutics, Boston Pharmaceuticals, Inc., Brainsway Ltd, Bristol-Myers Squibb Company, Cala Health, Cephalon Inc., Dey Pharma, L.P., Eleusis health solutions ltd, Eli Lilly Co., Genentech, Inc, Genomind, Inc, GlaxoSmithKline, Evotec AG, H. Lundbeck A/S, Inflabloc Pharmaceuticals, Janssen Global Services LLC, Jazz Pharmaceuticals, Johnson & Johnson Companies, Methylation Sciences Inc, Monopteros Therapeutics, Mylan Inc, Novartis Pharma AG, One Carbon Therapeutics, Inc, Osmotica Pharmaceutical Corp., Otsuka Pharmaceuticals, PAMLAB LLC, Pfizer Inc., Pierre Fabre Laboratories, Ridge Diagnostics (formerly known as Precision Human Biolaboratories), Sage Therapeutics, Shire Pharmaceuticals, Sunovion Pharmaceuticals, Taisho Pharmaceutical Co, Ltd, Takeda Pharmaceutical Company LTD, Theracos, Inc., and Wyeth, Inc. GP has received honoraria (for lectures or consultancy) from Abbott Laboratories, Acadia Pharmaceuticals Inc, Alkermes Inc, Alphasigma USA Inc, Asopharma America Cntral Y Caribe, Astra Zeneca PLC, Avanir Pharmaceuticals, Bristol-Myers Squibb Company, Brainsway Ltd, Cephalon Inc., Dey Pharma, L.P., Eli Lilly Co., Evotec AG, Forest Pharmaceuticals, GlaxoSmithKline, Inflabloc Pharmaceuticals, Grunbiotics Pty LTD, Hypera S.A., Jazz Pharmaceuticals, H. Lundbeck A/S, Medichem Pharmaceuticals, Inc, Meiji Seika Pharma Co. Ltd, Novartis Pharma AG, Otsuka Pharmaceuticals, PAMLAB LLC, Pfizer, Pharma Trade SAS, Pierre Fabre Laboratories, Ridge Diagnostics, Shire Pharmaceuticals, Sunovion Pharmaceuticals, Takeda Pharmaceutical Company LTD, Theracos, Inc., Titan Pharmaceuticals, and Wyeth Inc. 
GP has received research support (paid to hospital) from Alphasigma USA, Inc, AstraZeneca PLC, Bristol-Myers Squibb Company, Cala Health, Forest Pharmaceuticals, the National Institute of Mental Health, Mylan Inc, Neuralstem, Inc, PAMLAB LLC, PCORI, Pfizer Inc., Johnson & Johnson Companies, Ridge Diagnostics (formerly known as Precision Human Biolaboratories), Sunovion Pharmaceuticals, Tal Medical, and Theracos, Inc. GP has served (not currently) on the speaker's bureau for Bristol-Myers Squibb Co and Pfizer, Inc. STW has received contract funding from Janssen, Sage Therapeutics, and Oui Therapeutics for the conduct of clinical trials administered through Yale University; he has received consulting fees from Biohaven Pharmaceuticals, Sage Therapeutics, Janssen, and Oui Therapeutics. No other authors have conflicts of interest to disclose.
Study Details
- Study Type: meta
- Population: humans
- Characteristics: meta analysis