A retrospective analysis of ketamine intravenous therapy for depression in real-world care settings
This retrospective analysis (n=537) assessed the effectiveness of intravenous ketamine therapy in community-based practices (i.e., real-world care settings). Over half of the participants showed a response at 14-31 days post-induction and 28.9% remitted, while 73% of those with baseline suicidal ideation exhibited a reduction in it. However, remission status was weakly inversely correlated with depression severity.
Authors
- Debattista, C.
- Gargeya, R. S.
- Heifets, B. D.
Published
Abstract
Background: Outcomes of ketamine intravenous therapy (KIT) for depression in real-world care settings have been minimally evaluated. We set out to quantify treatment response to KIT in a large sample of patients from community-based practices. Methods: We retrospectively analyzed 9016 depression patients who received KIT between 2016 and 2020 at one of 178 community practices across the United States. Depression symptoms were evaluated using the Patient Health Questionnaire-9 (PHQ-9). The induction phase of KIT was defined to be a series of 4-8 infusions administered over 7 to 28 days. Results: Among the 537 patients who underwent induction and had sufficient data, 53.6% of patients showed a response (≥ 50% reduction in PHQ-9 score) at 14-31 days post-induction and 28.9% remitted (PHQ-9 score drop to < 5). The effect size was d = 1.5. Among patients with baseline suicidal ideation (SI), 73.0% exhibited a reduction in SI. A subset (8.4%) of patients experienced an increase in depressive symptoms after induction while 6.0% of patients reported increased SI. The response rate was uniform across 4 levels of baseline depression severity. However, more severe illness was weakly correlated with a greater drop in scores while remission status was weakly inversely correlated with depression severity. Kaplan-Meier analyses showed that a patient who responds to KIT induction has approximately 80% probability of sustaining response at 4 weeks and approximately 60% probability at 8 weeks, even without maintenance infusions. Conclusion: KIT can elicit a robust antidepressant response in community clinics; however, a small percentage of patients worsened.
Research Summary of 'A retrospective analysis of ketamine intravenous therapy for depression in real-world care settings'
Introduction
Ketamine intravenous therapy (KIT) is established as a rapid treatment for depressive symptoms, but most controlled evidence comes from single-infusion studies conducted at academic centres. Those trials typically show rapid symptomatic improvement within 24 hours and high short-term response rates after a single infusion, but relapse is common within two weeks. As community-based ketamine clinics have proliferated and commonly use repeated-infusion “induction” regimens (typically 4–8 infusions over 2–4 weeks) followed by variable maintenance dosing, there remains limited systematic time-course data using standardised clinical instruments from these real-world settings. McInnes and colleagues set out to quantify outcomes of KIT in community clinics across the United States by analysing a large, de-identified dataset collected via a measurement-based care software platform. The primary aim was to evaluate antidepressant response and remission after a KIT induction series using the Patient Health Questionnaire-9 (PHQ-9) and to estimate the short-term durability of response prior to any maintenance infusions. The analysis focuses on patients treated between 2016 and 2020 and emphasises treatment delivered in independent private practices rather than academic trials.
Methods
This was a retrospective analysis of de-identified patient-reported outcome data captured by a measurement-based care software platform across 178 independent community ketamine practices in 40 US states between 1 January 2016 and 30 December 2020. The full dataset comprised 9016 patients who received KIT; clinical records logged PHQ-9 responses, treatment dates, infusion labels (induction versus maintenance), doses and infusion durations, and free-text treatment notes. Demographic, diagnostic, medication, and detailed medical-history variables were not captured by the dataset used for this analysis. The study received approval from the Stanford University Institutional Review Board. The PHQ-9 (a nine-item self-report depression scale) was the primary outcome instrument; patients were prompted to complete the PHQ-9 electronically every 14 days via text-linked web portal. For this analysis, induction was operationally defined as 4–8 infusions administered over 7–28 days, with each infusion labelled as an induction infusion in the software. Baseline PHQ-9 was required within one month before induction and the post-induction PHQ-9 was required 14–31 days after the final induction infusion and before any maintenance infusion. When patients dropped out prior to completing induction, their last PHQ-9 was used if it fell at least two weeks after their final infusion. Key outcome definitions followed standard convention: response was a ≥50% reduction in total PHQ-9 score from baseline, remission was PHQ-9 < 5, and suicidal ideation (SI) was defined as a score ≥1 on item 9 of the PHQ-9. Baseline severity categories were 0–9 none/mild, 10–14 moderate, 15–19 moderately severe, and 20–27 severe. 
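The outcome definitions above can be expressed compactly in code. The following is a minimal sketch in Python, not the authors' implementation; the function and class names are hypothetical, and treating a post-induction score above baseline as "worsened" follows the description of worsening given in the Results.

```python
from dataclasses import dataclass


def severity_category(total_score: int) -> str:
    """Map a total PHQ-9 score (0-27) to the four baseline severity bands."""
    if not 0 <= total_score <= 27:
        raise ValueError("PHQ-9 total must be between 0 and 27")
    if total_score <= 9:
        return "none/mild"
    if total_score <= 14:
        return "moderate"
    if total_score <= 19:
        return "moderately severe"
    return "severe"


@dataclass
class Outcome:
    responder: bool  # >= 50% reduction in total PHQ-9 from baseline
    remitter: bool   # post-induction PHQ-9 < 5
    worsened: bool   # post-induction PHQ-9 higher than baseline


def classify_outcome(baseline: float, post: float) -> Outcome:
    """Apply the response/remission definitions to one patient's scores."""
    if baseline <= 0:
        raise ValueError("baseline must be positive to compute a percent change")
    reduction = (baseline - post) / baseline
    return Outcome(
        responder=reduction >= 0.5,
        remitter=post < 5,
        worsened=post > baseline,
    )


def has_suicidal_ideation(item9_score: float) -> bool:
    """SI is defined as a score of at least 1 on PHQ-9 item 9."""
    return item9_score >= 1
```

For example, a patient moving from 18 to 9 counts as a responder but not a remitter, because the score halved yet stayed at or above 5.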
Statistical analyses included Chi-square tests for categorical relationships, Cohen’s d for effect size, point-biserial and Pearson/Spearman correlations for associations with baseline severity, and Kaplan–Meier survival analysis to estimate durability of response; significance was set at p < 0.05. Analyses were performed using standard scientific computing libraries and R.
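As a rough check on the effect size, Cohen's d can be recomputed from the summary statistics given in the Results (baseline mean 18.1, SD 5.3; post-induction mean 9.4, SD 6.5). The sketch below assumes the pooled-SD form of Cohen's d for two equally sized samples; the source does not say which variant the authors used, so this is an illustration rather than a reproduction of their calculation.

```python
import math


def cohens_d_from_summary(mean_pre: float, sd_pre: float,
                          mean_post: float, sd_post: float) -> float:
    """Cohen's d using the root-mean-square of the two SDs (equal n assumed)."""
    pooled_sd = math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2)
    return (mean_pre - mean_post) / pooled_sd


# Summary statistics reported for the n = 537 induction cohort.
d = cohens_d_from_summary(18.1, 5.3, 9.4, 6.5)
print(round(d, 2))  # about 1.47, in line with the reported d = 1.5
```

Dividing the mean drop by the SD of the change scores instead (8.7 / 6.6) would give a smaller value, so the reported figure appears more consistent with the pooled-SD form, though that remains an inference.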
Results
Cohort derivation and sample characteristics: From 9016 patients in the measurement-based care platform, 3518 had completed a 4–8 infusion induction within 7–28 days; of these, 537 patients met the additional requirement of having both a baseline PHQ-9 within one month pre-induction and a post-induction PHQ-9 14–31 days after induction, and therefore comprised the analytic induction cohort. The investigators note substantial missing outcome data across the broader sample and present analyses comparing baseline PHQ-9 distributions for excluded versus included patients to assess selection bias. Primary depression outcomes after induction: In the n = 537 cohort the mean baseline PHQ-9 was 18.1 (SD = 5.3, 95% CI = 17.6–18.5) and the mean post-induction PHQ-9 at 14–31 days was 9.4 (SD = 6.5, 95% CI = 8.9–10.0). The mean raw reduction in PHQ-9 score was 8.7 points (SD = 6.6, median = 8.5), corresponding to a mean percent reduction of 47.1% (median 52.0%). The response rate (≥50% reduction) was 53.6% and the remission rate (PHQ-9 < 5) was 28.9% (n = 155/537). The Cohen’s d effect size for induction was reported as 1.5. A minority of patients worsened after induction: 8.4% (45/537) had an increased PHQ-9 score, including nine patients with an increase ≥5 points. Relationships with baseline severity: The magnitude of PHQ-9 score reduction correlated with baseline severity (Pearson r = 0.44, p < 10^-25; Spearman r = 0.42), indicating larger absolute drops for patients with higher baseline scores. However, response rates were similar across baseline severity categories (none/mild 56.1%, moderate 54.1%, moderately severe 52.2%, severe 53.9%; χ2 (8) = 0.24, p > 0.999). By contrast, remission likelihood declined with greater baseline severity (58.5%, 43.9%, 28.7%, 17.8% across increasing severity categories; χ2 (8) = 42.6, p < 10^-5), and remission status showed a negative point-biserial correlation with baseline severity (r = -0.27, p < 10^-9). 
The tendency to worsen did not meaningfully correlate with baseline severity. Suicidal ideation outcomes: Two-thirds of the cohort (66.3%, n = 356) endorsed some SI at baseline (PHQ-9 item 9 > 0). Of these, 73.0% (260/356) reported a reduction in SI after induction and 42.7% (152/356) reported no SI at the end of induction. Improvement rates varied by baseline SI intensity (for example, 79.6% improvement among those averaging item-9 = 3). Overall, 6.0% of the full induction cohort reported an increase in SI after induction; 16 individuals experienced increases in both overall PHQ-9 and SI, and another 16 had an increase in SI without an increase in total PHQ-9. Durability of response and maintenance patterns: A Kaplan–Meier analysis among responders estimated the probability of retaining responder status (loss of response defined as return below the ≥50% reduction threshold). Probability of retaining response was 78.3% at 28 days after induction (95% CI = 73.3–83.7%) and 59.9% at 56 days (95% CI = 51.7–69.4%); patients were censored when they began maintenance or stopped reporting PHQ-9s at least every two weeks. Regarding maintenance practice patterns, 52.5% of patients who completed induction entered maintenance care. Responders who entered maintenance received on average 3.6 maintenance infusions (SD = 4.9), remitters 3.7 (SD = 5.0), and the larger group of 3518 patients who completed induction averaged 2.6 maintenance infusions (SD = 5.0). Median time from final induction infusion to first maintenance infusion was 48 days for remitters and 43 days for responders; across all patients entering maintenance the median interval was 28 days, with considerable variation. The investigators highlight uncertainty about whether treatment discontinuation reflects clinical improvement or financial/access barriers.
Discussion
McInnes and colleagues interpret these findings as evidence that a standard multi-infusion KIT induction delivered in community clinics can produce a robust antidepressant response that is reasonably durable over several weeks without immediate maintenance. They emphasise that the observed response rate of 53.6% and remission rate of 28.9% at 14–31 days post-induction compare favourably with some prior induction studies conducted in academic settings and exceed a previously published small community outpatient report; differences between studies may relate to timing of outcome measurement, patient populations, circadian factors related to infusion timing, or other practice differences. The investigators acknowledge real-world sources of bias that could inflate apparent effectiveness, including lack of blinding and patients paying out-of-pocket who may have strong expectancy effects. They also note that the induction response measured 14–31 days after the final infusion may underestimate peak early effects documented immediately post-infusion in some trials. A key safety and clinical concern raised is that a small subset of patients worsened after induction (8.4%) and 6.0% experienced increased suicidal ideation; potential contributors include dysphoric/anxious ketamine responses, comorbidities, adverse effects leading to intolerance, or psychological reactions when a last-resort treatment does not meet expectations. Several important limitations are highlighted by the authors: only 15.3% of patients who completed induction in the platform had complete pre- and post-induction PHQ-9 data and were therefore included in the primary analysis; the measurement platform did not capture demographics, medical or psychiatric history, concomitant medications, or psychotherapy engagement; dosage data were incomplete; and there is clear economic and access selection bias in privately funded KIT. 
The authors suggest that richer electronic health record datasets that include demographic and clinical history (for example Osmind’s EHR) would permit more informative analyses, including predictors of response, the impact of treatment refractoriness, medication interactions, and integration with psychotherapeutic approaches such as ketamine-assisted psychotherapy. Finally, they call for further long-term outcome data and replication studies to address the limitations inherent to retrospective real-world analyses.
Conclusion
The investigators conclude that KIT induction in community clinics was associated with a substantial antidepressant effect at 14–31 days post-induction (response 53.6%, effect size d = 1.5; remission 28.9%) and that responders had an approximately 80% probability of sustaining response at 4 weeks and approximately 60% at 8 weeks without maintenance infusions. They also note that 42.7% of patients with baseline suicidal ideation no longer endorsed SI after induction, that roughly half of completers elected to continue into maintenance treatment (receiving on average 2–3 maintenance infusions), and that a small but clinically important minority worsened after induction. The authors state that worsening and increased suicidal ideation warrant further study and that better, more comprehensive point-of-care data collection is needed to characterise long-term outcomes and predictors of benefit and harm in real-world KIT practice.
INTRODUCTION
Ketamine intravenous therapy (KIT) is a rapid and effective treatment for depression. However, most efficacy data reflect responses to single infusions of ketamine administered to patients at academic medical centers. In this context, single doses of KIT produce a rapid decrease in depressive symptoms with positive effects peaking at 24 h. Several randomized studies have found that, within the first 3 days of a single ketamine infusion, 50-70% of patients experience a therapeutic response (i.e. > 50% reduction in symptoms on a standardized rating scale), but roughly 90% of all patients relapse within 2 weeks. Despite a lack of conclusive long-term data, ketamine clinics have opened across the United States offering a variety of infusion regimens. A recent analysis of 85 real-world outpatients treated at Massachusetts General Hospital over the course of 13 months suggested that repeated KIT was associated with clinically significant improvement in approximately 20% of patients. However, detailed time course data using standardized clinical instruments are still largely lacking. That retrospective analysis, along with several prospective randomized trials performed at academic centers, employed similar long-term ketamine treatment regimens. Typically, a patient would receive a series of closely spaced (4-8) infusions over a two-week period, known as the induction, followed by maintenance KIT at variable intervals. Prospective studies evaluating the efficacy of a ketamine induction regimen have found antidepressant response and remission rates, measured within days of the final infusion, to be similar to those observed after single ketamine infusions, though one study found that repeated infusions may offer enhanced therapeutic benefit. Additionally, limited data suggest that the multiple-infusion ketamine induction results in an augmented durability of antidepressant response. 
Previous studies also support the safety and efficacy of a post-induction maintenance strategy notwithstanding relatively small samples. However, it is unclear how patients enrolled in prospective, randomized trials compare to those seeking care in private practice. As the majority of patients receiving KIT are treated in community practices, it is crucial to assess KIT outcomes in these settings. Here, we report outcomes on 9016 de-identified real-world outpatients with symptoms of depression who received KIT between 2016 and 2020 at one of 178 independent community ketamine practices across the United States. Treatment providers in participating clinics tracked mental health outcomes using a measurement-based care online platform and the Patient Health Questionnaire-9 (PHQ-9). We determined that most clinicians in participating private practices treated patients with KIT regimens similar to those previously published, including an initial induction comprising between 4 and 8 ketamine infusions over the course of 2 to 4 weeks, and subsequent variably spaced maintenance infusions. We present an analysis of outcomes using the PHQ-9 after KIT induction, as well as an estimate of the durability of response in this real-world population.
SAMPLE
We analyzed a de-identified dataset of 9016 patients who received KIT for depression at one of 178 private practice community clinics between January 1, 2016 and December 30, 2020. The clinics in this study were chosen because they used a measurement-based care software platform with their patients. The primary measure utilized was the PHQ-9, which has demonstrated internal consistency and test-retest reliability. Providers used a web-based interface to schedule the delivery of patient-reported measures. Patients were asked to complete a PHQ-9 electronically every 14 days (the recall period for the instrument). At designated days and times, the system sent text messages to patients' cell phones with reminders to complete the PHQ-9; these messages contained a link to a secure online portal wherein responses were logged. Providers could view scores on their portal as well as record ketamine infusions. The following information was collected from patient records: PHQ-9 responses (overall score as well as responses on individual line items), treatment dates, treatment types (induction or maintenance infusion), treatment notes, patient weight, treatment doses, and infusion durations. Demographic information, medication history, psychiatric history, and medical history were not available for this study. This retrospective analysis was approved by the Stanford University Institutional Review Board.
CLINICAL PROCEDURES
Located throughout 40 states, the 178 independent community practices that adopted the measurement-based care software used their own enrollment criteria and clinical protocols for KIT. There was variability in PHQ-9 administration and response across clinics and patients. The measurement-based care software allowed coding of treatment doses, infusion durations, treatment type (each infusion was labeled as an induction or maintenance infusion), and additional notes including use of adjunctive medications. The starting ketamine dose was usually 0.5 mg/kg infused over 40 min, although this dose varied and was generally increased in subsequent infusions. Though data regarding adjunctive medications used were not uniformly available for our analysis, providers' notes indicate that the most common adjunctive medications were given for hypertension, nausea, and anxiety. There was variability in the number of infusions patients received in both induction and maintenance phases of KIT. The price of a single infusion ranged from $300 to $690, with varying pricing structures including discounts for paying for a series of induction infusions at once. Patients were generally expected to pay out-of-pocket as KIT for depression is at best only partially covered by insurance at the time of this writing.
OUTCOMES AND VARIABLES
The PHQ-9 was used as the primary measure of depression symptoms. Following convention, response to KIT was defined as ≥50% decrease in total PHQ-9 score from pretreatment status. Remission was defined as the PHQ-9 score decreasing to < 5. The presence of suicidal ideation (SI) was defined as a score greater than or equal to 1 on line item 9 of the PHQ-9. Following the validated score cutoffs for the PHQ-9, we defined scores of 0-9 as none or mild depression, 10-14 as moderate depression, 15-19 as moderately severe depression, and 20-27 as severe depression. For the purposes of this study, we define induction to be a series of 4-8 infusions administered over the course of 7 to 28 days (with the additional stipulation that each infusion be labeled as an induction infusion in the software). The lower limit of 4 infusions for an induction is derived from previously published regimens. The upper limit of 8 infusions is supported by the experience of community providers and by the convention used in intranasal esketamine induction during phase III studies. The baseline PHQ-9 was required to be within one month prior to induction. The post-induction PHQ-9 was required to be reported 14-31 days after the final induction infusion and prior to any maintenance infusion. When patients dropped out prior to completion of induction, we were able to use their last reported PHQ-9 provided it was at least two weeks after their final infusion.
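The induction definition and the PHQ-9 timing windows amount to a small set of date checks. A minimal sketch follows, with several assumptions that the text does not settle: the 7 to 28 days are measured from the first to the last induction infusion, "one month" is taken as 31 days before the first infusion, and all function and parameter names are hypothetical.

```python
from datetime import date
from typing import List, Optional


def is_induction(infusion_dates: List[date]) -> bool:
    """4-8 infusions whose first-to-last span is 7-28 days (span is an assumption)."""
    if not 4 <= len(infusion_dates) <= 8:
        return False
    span_days = (max(infusion_dates) - min(infusion_dates)).days
    return 7 <= span_days <= 28


def in_analytic_cohort(infusion_dates: List[date],
                       baseline_phq9_date: date,
                       post_phq9_date: date,
                       first_maintenance_date: Optional[date] = None) -> bool:
    """Check the induction definition plus both PHQ-9 timing windows."""
    if not is_induction(infusion_dates):
        return False
    first, last = min(infusion_dates), max(infusion_dates)
    # Baseline PHQ-9 within one month (assumed 31 days) before induction.
    baseline_ok = 0 <= (first - baseline_phq9_date).days <= 31
    # Post-induction PHQ-9 reported 14-31 days after the final infusion.
    post_ok = 14 <= (post_phq9_date - last).days <= 31
    # The post-induction PHQ-9 must precede any maintenance infusion.
    before_maintenance = (first_maintenance_date is None
                          or post_phq9_date < first_maintenance_date)
    return baseline_ok and post_ok and before_maintenance


# Hypothetical patient: six infusions over 17 days.
infusions = [date(2019, 1, d) for d in (4, 7, 11, 14, 18, 21)]
eligible = in_analytic_cohort(infusions, date(2019, 1, 2), date(2019, 2, 8))
```

In this example the post-induction PHQ-9 falls 18 days after the last infusion, so the hypothetical patient would be included.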
STATISTICAL ANALYSIS
The Chi-Square Test of Independence was used to determine whether there was a relationship between categorical variables (e.g. response rate). Effect size was estimated using Cohen's d statistic. The point biserial correlation test was used to measure the correlation between binary outcomes (e.g. worsening, remitting) and baseline severity. A Kaplan-Meier curve was generated to assess the durability of response after induction and before the initiation of maintenance infusions. A p-value of < 0.05 was considered significant. Analyses were performed using NumPy, SciPy, and R (R Core Team, 2017). Data was plotted using Matplotlib and R. CI refers to confidence interval and SD refers to standard deviation.
SUBJECT CHARACTERISTICS
From our dataset of 9016 patients, we were able to analyze outcomes of KIT induction in a cohort of 537 patients (Fig.). This cohort represents patients who met the following criteria: (1) received KIT induction consisting of 4 to 8 infusions within a 7-28 day period, congruent with previously published regimens as cited above; (2) had baseline PHQ-9 data collected within one month of the initial infusion; and (3) had PHQ-9 data collected 14-31 days after induction and prior to receiving maintenance KIT. Of the 3518 patients with 4-8 induction infusions completed within 7-28 days, 537 patients completed a PHQ-9 both before and after the induction series. Patients most commonly received 6 induction infusions in private practice settings, although there was substantial variability including a subset who did not complete induction (Supplementary Fig.). Supplementary Fig. shows the distribution of induction length for all patients receiving 4-8 infusions. Notably, 56.9% of patients completed their induction within 14 days, consistent with published induction schemata. We evaluated the extent to which the 537 patients included in our analysis may have differed from the overall population by comparing the distribution of baseline PHQ-9 scores in these respective groups. We tested two potential sources of selection bias leading to exclusion of a patient from analysis: dropout prior to the fourth infusion, or incomplete PHQ-9 responses. Patients may have dropped out before the fourth infusion due to lack of antidepressant response, worsening of their depression, or inability to tolerate ketamine. To address this possibility, we compared all available baseline PHQ-9 scores for individuals who failed to complete the induction to those whom we included in our main analysis. 
For the 374 available baseline scores for patients in this group, the mean baseline PHQ-9 was 15.9 (SD = 6.4, 95% CI = 15.3-16.6) and the median was 16.5 (Supplementary Fig.). The n = 537 induction cohort had a mean baseline PHQ-9 score of 18.1 (SD = 5.3, 95% CI = 17.6-18.5). While the confidence intervals do not overlap, the mean and median baseline PHQ-9 scores for the n = 374 cohort fall within the same validated moderately severe illness category (PHQ-9 = 15-19) as the mean and median baseline scores of all patients completing the induction (Fig.). We additionally examined a subset (n = 89) of the 374 patients who dropped out before infusion 4 for whom we had an available PHQ-9 14-31 days after ceasing treatment early. The mean baseline PHQ-9 score in this cohort was 16.0 (SD = 5.9, 95% CI = 14.8-17.2) and the mean score 14-31 days after their final infusion was 10.3 (SD = 6.8, 95% CI = 8.9-11.7). For the n = 537 cohort, the mean post-induction score was 9.4 (SD = 6.5, 95% CI = 8.9-10.0). Here, the mean baseline scores also fall within the same validated moderately severe illness category as the n = 537 population (Fig.), while the post-treatment scores have overlapping confidence intervals and a difference in means of less than 1. Taken together, these data show that while baseline scores for patients who dropped out are statistically different from those of the n = 537 population, the difference is not clinically meaningful. A second source of potential bias is that patients who did not diligently fill out surveys could not be represented in our analysis of treatment response because of missing data. 
To address the possibility that these individuals might have influenced the response rate to produce a different outcome had they been retained in the analysis, we quantified the mean PHQ-9 score 14-31 days after induction for all 381 patients (from within the n = 537 cohort) who received 6 induction infusions, segmented by the number of PHQ-9 questionnaires reported by the patients (Supplementary Fig.). As there is no association between post-induction PHQ-9 score and the number of questionnaire responses (χ2 (26) = 253.0, p > 0.99), we conclude that patients excluded for absent PHQ-9 data were not likely to systematically differ from those patients who regularly provided PHQ-9 data.
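The confidence intervals quoted in this section can be approximated from the reported means, SDs, and sample sizes. The sketch below assumes a normal-approximation interval (mean ± 1.96 × SD / √n); the text does not state how the authors computed their intervals, and the inputs are rounded values, so small discrepancies are expected.

```python
import math


def ci95(mean: float, sd: float, n: int):
    """Normal-approximation 95% confidence interval for a mean."""
    half_width = 1.96 * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


# (label, mean, SD, n, interval reported in the text)
reported = [
    ("induction cohort, baseline", 18.1, 5.3, 537, (17.6, 18.5)),
    ("dropouts, baseline", 15.9, 6.4, 374, (15.3, 16.6)),
    ("dropouts with follow-up, baseline", 16.0, 5.9, 89, (14.8, 17.2)),
    ("dropouts with follow-up, post", 10.3, 6.8, 89, (8.9, 11.7)),
]

recomputed = {}
for label, mean, sd, n, interval in reported:
    low, high = ci95(mean, sd, n)
    recomputed[label] = (low, high)
    print(f"{label}: {low:.2f} to {high:.2f} (reported {interval[0]} to {interval[1]})")
```

The recomputed bounds agree with the reported ones to within rounding, which also illustrates why the baseline intervals for the dropout group and the induction cohort do not overlap.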
DEPRESSION OUTCOMES FOR KIT INDUCTION
All post-induction outcomes were quantified 14-31 days after the final induction infusion. The n = 537 induction cohort had a mean baseline PHQ-9 score of 18.1 (SD = 5.3, 95% CI = 17.6-18.5) and a mean post-induction score of 9.4 (SD = 6.5, 95% CI = 8.9-10.0) (Fig.). Fig. shows the distribution of the drop in PHQ-9 scores. The mean decrease in raw score was 8.7 (SD = 6.6, 95% CI = 8.1-9.2, median = 8.5). The response rate to KIT induction was 53.6% (the median reduction in PHQ-9 score was 52.0%). KIT induction was associated with a Cohen's d effect size of 1.5. The remission rate was 28.9% (n = 155/537). A subset (n = 45/537; 8.4%) of patients experienced an increase in PHQ-9 score during induction (Supplementary Fig. shows their baseline PHQ-9 score distribution). The likelihood of worsening showed a negligible negative correlation with baseline PHQ-9 severity (point biserial correlation r = -0.09, p = 0.03) and this relationship was not clinically meaningful. Supplementary Fig. shows the PHQ-9 score before and after induction for all 45 patients with worsening symptoms, including 9 patients who experienced a PHQ-9 score increase of at least 5 points.
We next examined the relationship between baseline depression severity and response to KIT induction. Fig. shows the drop in PHQ-9 score as a function of baseline score. The Pearson's correlation coefficient between the drop in PHQ-9 scores and baseline PHQ-9 score was 0.44 (p < 10^-25) and the Spearman's correlation coefficient was 0.42 (p < 10^-23). The response rates were 56.1%, 54.1%, 52.2%, and 53.9%, respectively, for the none/mild, moderate, moderately severe, and severe baseline levels of depression (n = 41, 98, 157, 241, respectively). The response rate did not differ based on the level of baseline depression (χ2 (8) = 0.24, p > 0.999). These data suggest a relationship between the magnitude of reduction in PHQ-9 score and baseline symptom severity, as patients with higher baseline PHQ-9 scores would need to achieve larger reductions in score than patients with lower baseline scores to achieve responder status. The likelihood of remission was negatively correlated with the 4 baseline levels of severity at 58.5%, 43.9%, 28.7%, and 17.8%, respectively (χ2 (8) = 42.6, p < 10^-5). Although there was no difference in response rate based on baseline severity, more severe illness was weakly inversely correlated with the likelihood of remission (point biserial correlation r = -0.27, p < 10^-9). The baseline PHQ-9 scores for remitters are shown in Supplementary Fig. The percentages of patients per illness severity category who had a PHQ-9 score increase were 14.6%, 8.2%, 10.8%, and 5.8%, respectively. There was no correlation between baseline symptom severity and tendency to worsen (χ2 (8) = 5.395, p = 0.71). Supplementary Table shows outcomes grouped by the number of infusions received during induction.
Fig. PHQ-9 scores before and after induction for the n = 537 cohort. A) The distribution of baseline PHQ-9 scores before induction. These baseline scores had to be reported within 4 weeks of the first induction infusion. The mean and SD baseline PHQ-9 score was 18.1 ± 5.3. B) The distribution of PHQ-9 scores after induction. The mean and SD post-induction PHQ-9 score was 9.4 ± 6.5. Scores were reported 14-31 days after the final induction infusion. If multiple scores were reported within either the pre- or post-induction interval by the same patient, the mean was taken.
Fig. Change in PHQ-9 scores for KIT induction for the n = 537 cohort. Data to the right of zero indicate an improvement in depressive symptoms. A) The distribution of the drop in PHQ-9 raw scores. The x-axis is the decrease in score. The mean and SD raw drop in score was 8.7 ± 6.6 (median of 8.5). B) The distribution of the percent decrease in PHQ-9 scores. The x-axis is percent decrease. The response rate was 53.6%. The mean percent drop was 47.1% and the median percent drop was 52.0%. 8.4% of patients had a PHQ-9 score that increased.
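The chi-square statistics for response and remission by baseline severity can be approximately reproduced from the reported percentages and group sizes. In the sketch below, the per-group counts are back-calculated from those percentages (an assumption, although they sum to the reported 155 remitters and to the 288 responders implied by the 53.6% response rate), and the Pearson chi-square statistic is computed by hand on a 4×2 table.

```python
def chi_square_stat(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Group sizes for none/mild, moderate, moderately severe, severe.
group_sizes = [41, 98, 157, 241]
# Counts back-calculated from the reported percentages (an assumption).
responders = [23, 53, 82, 130]   # 56.1%, 54.1%, 52.2%, 53.9%
remitters = [24, 43, 45, 43]     # 58.5%, 43.9%, 28.7%, 17.8%

response_table = [[r, n - r] for r, n in zip(responders, group_sizes)]
remission_table = [[r, n - r] for r, n in zip(remitters, group_sizes)]

response_stat, response_dof = chi_square_stat(response_table)
remission_stat, remission_dof = chi_square_stat(remission_table)
print(round(response_stat, 2), round(remission_stat, 1), response_dof)
```

The statistics come out at roughly 0.24 and 42.6, matching the reported values. A 4×2 table has 3 degrees of freedom, whereas the text reports 8, so the authors' tabulation may have differed from the one assumed here.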
SI IN KIT INDUCTION
Within the n = 537 cohort, 66.3% (n = 356) had baseline SI to some degree, defined as a score > 0 on PHQ-9 question 9, and 73.0% of this subset (260/356) reported an improvement in SI after KIT induction. Moreover, 42.7% (152/356) of patients who had SI prior to induction no longer exhibited SI at the end of induction. Among the 356 patients with baseline SI, 79.6% (86/108), 85.6% (83/97), and 60.3% (91/151) of patients with an average baseline PHQ-9 line item 9 score of 3, 2, 1, respectively, exhibited a decrease in SI. The distributions of SI at baseline and after induction are shown in Supplementary Fig. Within the n = 537 cohort, 16 individuals experienced an increase in both SI and overall PHQ-9 score, while 16 individuals experienced an increase in SI but no increase in overall PHQ-9 score. In other words, 35.6% (16/45) of patients who worsened in terms of global depression symptoms also reported an increase in SI, and 6.0% of the overall n = 537 cohort reported an increase in SI.
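The percentages in this section follow directly from the stated counts. As a quick consistency check (a sketch using only numbers given in the text; the variable names are illustrative):

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)


cohort = 537
baseline_si = 356
# Item-9 baseline score -> (patients with decreased SI, patients in stratum)
improved_by_item9 = {3: (86, 108), 2: (83, 97), 1: (91, 151)}

improved_total = sum(improved for improved, _ in improved_by_item9.values())
stratum_total = sum(size for _, size in improved_by_item9.values())

si_prevalence = pct(baseline_si, cohort)       # share of cohort with baseline SI
si_improved = pct(improved_total, baseline_si)  # share with reduced SI
si_resolved = pct(152, baseline_si)            # share with no SI after induction
si_increased = pct(16 + 16, cohort)            # SI increase with or without PHQ-9 increase
worsened_with_si = pct(16, 45)                 # share of worsened patients with SI increase

print(si_prevalence, si_improved, si_resolved, si_increased, worsened_with_si)
```

The three strata sum to the 356 patients with baseline SI and the 260 who improved, so the stratified and overall figures are internally consistent.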
DURABILITY OF RESPONSE TO INDUCTION
For patients who either responded or remitted after induction, we performed a Kaplan-Meier analysis to estimate the probability of relapse over time (Fig.). An event represents a patient losing response (relapse). We performed right censoring for those patients who entered maintenance or were lost to follow-up (defined as failing to submit a PHQ-9 every two weeks). The y-axis represents the probability of retaining responder status and the x-axis is days after the last induction infusion. We observed that the probability of retaining responder status at 28 days after induction is 78.3% (95% CI = 73.3-83.7%) and the probability at 56 days is 59.9% (95% CI = 51.7%-69.4%). Table summarizes the probability of maintaining an antidepressant response and uncertainty estimates up to 91 days post-induction without maintenance infusions.
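To illustrate how right censoring enters such an estimate, here is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python. It is not the authors' code, and the follow-up data are invented purely for illustration: each patient contributes a duration in days and a flag that is 1 for loss of response and 0 for censoring (entering maintenance or ceasing to report).

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each time with at least one event."""
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        losses = 0   # patients losing response at time t
        leaving = 0  # everyone leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            losses += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if losses:
            survival *= 1 - losses / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def survival_at(curve, t):
    """Probability of still retaining response at day t (step function)."""
    prob = 1.0
    for time, value in curve:
        if time > t:
            break
        prob = value
    return prob


# Hypothetical follow-up for eight responders (days, 1 = lost response, 0 = censored).
days = [14, 21, 28, 28, 42, 56, 56, 70]
lost = [1, 0, 1, 0, 1, 1, 0, 0]
curve = kaplan_meier(days, lost)
print(survival_at(curve, 28), survival_at(curve, 56))
```

Censored patients shrink the risk set without lowering the curve, which is why the estimate reflects durability only among responders who remained under observation without maintenance infusions.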
PRACTICE PATTERNS FOR MAINTENANCE KIT
After induction, patients enter the maintenance phase of KIT. However, little is known about how often patients come in for follow-up or for how long they remain in treatment. Details of the timing and number of maintenance infusions can be found in Supplementary Tables. Of the patients who completed induction, 52.5% entered maintenance treatment. On average, patients who completed induction and responded to treatment received 3.6 maintenance infusions (SD = 4.9, 95% CI = 3.0-4.2) while remitters received 3.7 (SD = 5.0, 95% CI = 2.9-4.5). The 3518 patients who completed KIT induction (regardless of whether they reported sufficient PHQ-9 scores to be included in the n = 537 cohort) received 2.6 maintenance infusions (SD = 5.0, 95% CI = 2.5-2.8) on average. The mean length between the final induction infusion and the first maintenance infusion for the 109 patients who completed induction, achieved remission, and entered maintenance was 84.3 days (95% CI = 66.6-102.0, median = 48) and the mean length for the 202 responders who entered maintenance was 75.1 days (95% CI = 63.1-87.2, median = 43). For all 1846 patients who completed induction and entered maintenance, the mean length was 46.3 days (95% CI = 42.2-50.3,
DISCUSSION
This is the largest published analysis to date examining the real-world effectiveness of a standard KIT induction protocol for depression using data from patients treated at community clinics across the United States. From a dataset of 9016 patients, we focused on a cohort of 537 individuals who underwent a KIT induction as defined above, and for whom sufficient outcomes data before and after induction were available. We found that response to KIT induction is both robust and durable. We observed an overall response rate of 53.6% and a remission rate of 28.9% measured at 14-31 days after the last infusion. This contrasts with a recent study of 85 community outpatients by Sakurai et al., who reported a response rate of only 18.5% measured immediately prior to the 6th infusion using the Quick Inventory of Depressive Symptomatology-Self Report scale (QIDS-SR16). Possible reasons for the discrepant findings include the timing of the final QIDS-SR16 (e.g., missing the 6th infusion) and the fact that they administered all their infusions between 5:30 and 8 pm, while most community patients in our study received KIT during daytime business hours. It has been posited that the time of day at which KIT is administered could influence antidepressant response, given that ketamine has an effect on circadian rhythms. While our response rates diverge from those of the Sakurai group, our findings are in line with other reports of response rates between 45 and 59% after an induction protocol. We note that most studies we cite measured induction response just after the final infusion, whereas our measure was taken 14-31 days after the final infusion and thus may underestimate the response rate that would have been observed had our sample been measured at a comparably early post-induction timepoint. It is possible that the response rate we observed was skewed upwards by unblinded clinicians, who want their patients to improve, and by patients, who paid for treatment with hopes of feeling better.
This potential bias is an intrinsic limitation of real-world analyses vis-a-vis clinical trials. On the other hand, Sakurai et al. reported a relatively low response rate from data garnered under similar conditions (patient payment for treatment, real-world setting), suggesting that any possible upward skewing of response rates is unlikely to fully explain our results. We found that response rates did not vary as a function of initial depression severity, though more severely ill patients tended to show a greater drop in PHQ-9 scores while also having a reduced likelihood of remission. As in prior reports, we also found that a small subset of patients (8.4%) who completed induction worsened. There was no relationship between baseline PHQ-9 score and the likelihood of worsening, and only about one third of patients who worsened experienced an increase in SI. Overall, most patients with SI at baseline experienced an improvement in this symptom and 43% of those with SI no longer had SI after induction; however, 6.0% of patients reported an increase in SI after induction. Reasons for worsening of symptoms could include a dysphoric anxious response to ketamine, complications of comorbid diagnoses including affective switching or other complex comorbidities, failure to tolerate ketamine side-effects such as dissociation or nausea, and a fear of what might happen next if a "treatment of last resort" fails. Our findings highlight that practitioners should exercise caution when treating only mildly ill individuals, as KIT is not without potential adverse consequences.
Fig. Kaplan-Meier estimates of the probability that a patient who responded to induction has not lost responder status over time. Loss of response (the 'event' in the survival curve) is defined as the PHQ-9 score increasing to the point that there is no longer a ≥ 50% reduction from baseline. Patients were censored when they began receiving maintenance infusions or when they stopped reporting at least one PHQ-9 every two weeks. Vertical lines indicate censored patients. The solid line is the survival curve while dotted lines represent 95% confidence intervals on the survival curve. The x-axis is the number of days since the end of induction. Of the 288 patients who responded to induction, 274 had sufficient data for this survival analysis. Of these, 76 patients experienced a loss of response, 125 patients were lost to follow-up, and 73 patients entered maintenance treatment. The accompanying table shows the numerical details of this survival analysis.
In addition to the robust response and remission rates, we also observed a strong durability of response. At 4 weeks after induction, there is approximately 80% probability that responders do not lose responder status even without maintenance infusions, and at 8 weeks the probability is approximately 60% (Fig.). The median time to a first maintenance infusion was 48 days for remitters and 43 days for responders, versus 28 days for all 1846 patients who completed induction and entered maintenance. The sustained response to repeated infusions that we observed contrasts with the transient response to single infusions, further supporting the utility of the KIT induction model that has become widely adopted. We again acknowledge that the expense of treatment may lead to an upward bias in duration of response in any real-world treatment dataset. Additionally, these results diverge from the intranasal esketamine model, which requires weekly or biweekly treatments to maintain response after the induction, although intranasal administration of esketamine is not necessarily comparable to KIT using racemic ketamine. At this time, there is insufficient comparative efficacy data to draw further conclusions. We and the authors of a prior study both note that patients entering maintenance received a relatively small number of maintenance infusions (roughly 2.6 in our broad data set and 2.3 in that study), with a median time interval between them of 28 days in our data set (n = 1309).
Thus, most patients tend to exit treatment within 6 months, although there is substantial variation. It is not clear whether patients exit treatment because they feel better or because they can no longer afford the treatment. It has been suggested that response to KIT may be enhanced with psychotherapy during the maintenance phase if patients are able to make healthy new lifestyle changes, and some patients may no longer feel they need maintenance KIT once those changes are implemented. Some community-based practitioners have advocated for a style of ketamine therapy called "Ketamine-Assisted Psychotherapy" (KAP), which incorporates behavioral and psychotherapeutic interventions pre-treatment, on-drug, and during the post-induction period. Use of a measurement-based care platform will facilitate a more rigorous evaluation of response durability with these differing models of ketamine therapy used in community practices.
LIMITATIONS
Our analysis has some major limitations. We report outcomes on a fraction of the total sample due to missing mood survey data. Of the 3518 patients identified who completed KIT induction, only 15.3% had PHQ-9 data available both pre- and post-treatment. This particular measurement-based care software did not collect demographic information, past medical history, or substance use data. We lack data on the chronicity of the episode for which the patient is being treated, number of failed traditional antidepressants, previous electroconvulsive therapy or transcranial magnetic stimulation treatment, and current medications including those used to manage side-effects during KIT. We also do not know who was receiving psychotherapy, which may make ketamine and other psychedelic agents more effective. While it is assumed that patients undergoing induction treatments are ketamine naive, that was not documented explicitly. Dosage information was not available for many treatments, though we believe it is a fair assumption that KIT dosages started at 0.5 mg/kg and increased throughout the induction. There is an economic and racial selection bias for patients who are able to access KIT. There are also inherent limitations with retrospective analyses compared to randomized controlled trials.
Several of the limitations acknowledged here can be addressed by future replication studies that incorporate more detailed data collected in Osmind's electronic health record (EHR) platform. The present analysis utilized a dataset from a separate measurement-based care platform, which was not a full EHR and thus did not capture various data variables that would enhance our understanding of KIT in real-world practice settings.
The Osmind EHR captures demographic data and clinical history, including number and duration of episodes of depression and a full medication history, and also automates outcomes tracking with a patient-facing mobile phone application to facilitate measurement-based care by clinicians. This richer data set can form the basis of predictive models that directly inform clinical care. For example, to what degree does treatment refractoriness affect response and relapse? How do medications, taken daily or as adjuncts to KIT, modify response? Can a simplified set of PHQ-9 items predict early response versus treatment failure? More broadly, point-of-care data collection and personal sensing streams (e.g., actigraphy, or "wearables") can establish links between self-report measures and functional health outcomes. Finally, as KIT practices grow in popularity with substantial variability in care models, it is imperative to define long-term patient outcomes, data that are at present completely absent from the literature.
CONCLUSIONS
We observed a robust and durable antidepressant response to KIT 14-31 days post-induction in 53.6% of those who completed the induction (effect size d = 1.5). Responders to KIT induction have approximately 80% probability of sustaining response at 4 weeks and approximately 60% probability at 8 weeks, even without maintenance infusions. KIT was also associated with remission for 28.9% of patients completing induction. Of the patients who had SI at baseline, 42.7% no longer experienced SI after induction. Of the patients completing induction, 52.5% elected to continue into maintenance treatment, with the average patient receiving 2-3 maintenance infusions. Patients with the most severe illness respond as well as those with less severe illness, but have lower rates of remission. A small percentage of people worsen with KIT, and this should be an area of future research.
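The outcome categories used throughout can be made concrete with a short sketch based on the definitions given in the abstract: response is a ≥ 50% reduction in PHQ-9 score and remission is a drop to a PHQ-9 score below 5. Treating any increase over baseline as "worsened" is an assumption, since the exact threshold is not stated here, and the function name `classify_outcome` is hypothetical. The PHQ-9 total ranges from 0 to 27.

```python
def classify_outcome(baseline: int, followup: int) -> dict:
    """Label one patient's post-induction outcome from two PHQ-9 totals (0-27)."""
    if not (0 <= baseline <= 27 and 0 <= followup <= 27):
        raise ValueError("PHQ-9 totals must lie between 0 and 27")
    return {
        # >= 50% reduction from baseline
        "response": baseline > 0 and followup <= 0.5 * baseline,
        # score drops to below 5 (baseline must have been at least 5 to "drop")
        "remission": baseline >= 5 and followup < 5,
        # assumption: any increase over baseline counts as worsening
        "worsened": followup > baseline,
    }


# Hypothetical patients, not study data.
print(classify_outcome(20, 9))   # response only
print(classify_outcome(18, 4))   # response and remission
print(classify_outcome(12, 15))  # worsened
```

Under these definitions a patient with a high baseline can respond without remitting, which matches the pattern that more severely ill patients respond at similar rates but remit less often.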
ROLE OF FUNDING
Research reported in this publication was supported by the National Center For Advancing Translational Sciences of the National Institutes of Health under Award Number UL1TR003142. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
DECLARATION OF COMPETING INTEREST
LAM is a consultant to Clexio Biosciences and an employee of Osmind Inc. JJQ is an employee of Osmind Inc. RSG is an employee of Osmind Inc. CD is a consultant to Magnus Medical, Corcept Therapeutics, Sage Therapeutics, Pfizer, Alkermes, and Osmind; he has received research support from Sage Therapeutics, Compass Pathways, Relmada Therapeutics, and Janssen. BDH is not supported by, nor maintains any financial interest in, any commercial activity that may be associated with the topic of this article.