Methodological challenges in psychedelic drug trials: Efficacy and safety of psilocybin in treatment-resistant major depression (EPIsoDE) - Rationale and study design
This paper (2022) details the rationale and study design for an upcoming double-blind placebo-controlled trial (n=144) which will assess the safety and efficacy of using psilocybin in a cohort with treatment-resistant depression.
Authors
- Betzler, F.
- Evens, R.
- Gilles, M.
Published
Abstract
Psychedelics such as psilocybin have recently gained remarkable interest in both the specialist literature and the lay press because studies suggest that these substances may have great therapeutic potential in various psychiatric disorders, including major depression. However, clinical trials with psychedelic drugs pose particular methodological challenges to researchers, some of which differ considerably from those with other psychotropic drugs. These include the problem of successful blinding, which can hardly be guaranteed in clinical trials with psychedelic substances and - directly related - the high risk of expectation bias and nocebo effects. Some of these challenges are being addressed in the given clinical trial on the efficacy and safety of psilocybin in treatment-resistant major depression. It is a phase II randomized, double-blind, active placebo-controlled parallel-group trial with 144 patients. The rationale, the study design, and the core features of the study are presented here. The trial (EPIsoDE trial; EudraCT number: 2019-003984-24; NCT04670081) is funded by the German Federal Ministry of Education and Research (BMBF 01EN2006 A/B).
Research Summary of 'Methodological challenges in psychedelic drug trials: Efficacy and safety of psilocybin in treatment-resistant major depression (EPIsoDE) - Rationale and study design'
Introduction
Classical psychedelics, a group of serotonin 2A receptor (5-HT2AR) agonists that includes psilocybin and LSD, have re-emerged in psychiatric research because several recent studies suggest therapeutic potential for major depression, treatment-resistant depression (TRD) and some substance use disorders. Mertens and colleagues note that most published trials to date have important methodological limitations: many are open-label or uncontrolled, sample sizes have generally been small (double digits), and the only available double-blind trial lacked sufficient power and assay sensitivity. The unique subjective effects of psychedelics also create trial-specific problems such as difficulty maintaining blinding and a high risk of expectation bias and nocebo effects in control arms. This paper presents the rationale and protocol for the EPIsoDE trial, a Phase IIb, randomized, double-blind, active placebo-controlled parallel-group study designed to examine the efficacy and safety of oral psilocybin in TRD (EudraCT 2019-003984-24; NCT04670081). The investigators frame the trial to address several of the methodological challenges that complicate clinical assessment of psychedelics, and they propose design features (for example the choice of comparators and repeated dosing) that they suggest could serve as prototypes for future trials in this area.
Methods
EPIsoDE is a bi-centric, prospective, randomized, double-blind, active placebo-controlled parallel-group clinical trial conducted at two German sites (Central Institute of Mental Health, Mannheim; Department of Psychiatry and Psychotherapy, Campus Charité Mitte, Berlin). The core intervention compares a putatively therapeutic oral psilocybin dose (25 mg) against two comparator conditions: a low psilocybin dose (5 mg) and an active placebo (100 mg nicotinamide). Each patient receives two dosing sessions separated by six weeks, accompanied by psychotherapeutic preparatory and integration sessions delivered by two therapists (one female, one male). After informed consent and individualized down‑titration of monoaminergic medication (if applicable), participants are allocated to one of several dosing sequences so that every participant receives at least one 25 mg dose; the paper describes sequences including: nicotinamide→25 mg, 5 mg→25 mg, 25 mg→5 mg, and 25 mg→25 mg. (The extracted text contains some inconsistency about whether this is framed as three or four arms; the protocol’s randomization ratios and dosing sequences indicate four distinct allocation sequences while the primary inferential comparisons focus on three treatment conditions: 25 mg, 5 mg and nicotinamide.) Eligibility criteria specify adults aged 25–65 with moderate to severe TRD (HAM-D total score ≥17; ICD‑10 F32.1/F32.2/F33.1/F33.2) who have failed at least two adequate antidepressant courses (≥6 weeks each, distinct classes). Key inclusion conditions include medical stability, consent and a washout of monoaminergic medication for at least 2 weeks (fluoxetine 5 weeks). 
Exclusions include psychotic-spectrum or bipolar disorders, cluster A or borderline personality disorder, current substance use disorder, first‑degree family history of psychosis/bipolar, recent significant suicidality, recent ECT or ketamine/esketamine, pregnancy/lactation, and substantial prior psychedelic use (past year or >5 lifetime uses). Study procedures comprise a baseline preparatory session and a pre-dose preparation the day before each dosing session, an in‑clinic dosing session with continuous monitoring (vital signs hourly, every 30 minutes for first 2 hours), overnight observation, blood draws for psilocin at 2 and 3 hours post-dose, and structured integration sessions the day after and one week after each dose. Weekly phone check‑ins occur in weeks 2–5 after each dosing for safety monitoring. Follow-up visits are scheduled at six weeks (primary endpoint), 12 weeks (secondary endpoint), and longer-term follow-ups at six and twelve months. The investigational products are GMP-manufactured psilocybin capsules (5 mg and 25 mg formulations) supplied via the Usona Institute/Almac and nicotinamide produced and encapsulated to match the active capsules. Randomization uses Efron’s biased-coin method with dynamic allocation (ratios reported as 2:2:1:1 for the dosing sequences) and allows stratification by centre; allocation is concealed and outcomes are rated by an independent clinician who did not attend dosing sessions to support blinded assessment. Sample size calculation assumed responder rates at six weeks of 50% for 25 mg, 20% for 5 mg and 10% for nicotinamide. Using exact-test–based computations, an initial per‑group sample of n=43 was calculated to achieve the planned power for key comparisons; allowing for 10% dropout inflated the target to 48 per group and a total enrolment target of 144. The primary efficacy endpoint is responder status (≥50% reduction in HAM‑D total score) at six weeks after the first dose. 
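The sample-size arithmetic and allocation ratios above can be sketched in a few lines of Python. This is an illustrative sketch, not the investigators' computation: the summary only says "exact-test-based", so the use of a one-sided Fisher exact test here is an assumption, as is the mapping of the 2:2:1:1 ratio onto the dosing sequences in the order shown. All function and variable names are hypothetical.

```python
import math
from fractions import Fraction

# Dosing sequences as (first dose, second dose) with allocation weights.
# ASSUMPTION: the reported 2:2:1:1 ratio maps onto the sequences in this order.
SEQUENCE_WEIGHTS = {
    ("nicotinamide", "25 mg"): 2,
    ("5 mg", "25 mg"): 2,
    ("25 mg", "5 mg"): 1,
    ("25 mg", "25 mg"): 1,
}


def inflate_for_dropout(n_evaluable, dropout_rate):
    """Smallest enrolment n with n * (1 - dropout_rate) >= n_evaluable."""
    return math.ceil(n_evaluable / (1.0 - dropout_rate))


def first_dose_counts(total_n, weights):
    """Expected number of patients per first-dose condition."""
    weight_sum = sum(weights.values())
    counts = {}
    for (first, _second), weight in weights.items():
        counts[first] = counts.get(first, 0) + Fraction(total_n * weight, weight_sum)
    return counts


def fisher_one_sided_p(x1, x2, n1, n2):
    """One-sided Fisher p-value: P(X1 >= x1 | x1 + x2 responders in total)."""
    total = x1 + x2
    numerator = sum(
        math.comb(n1, k) * math.comb(n2, total - k)
        for k in range(x1, min(n1, total) + 1)
    )
    return numerator / math.comb(n1 + n2, total)


def exact_power(p_treat, p_control, n, alpha=0.025):
    """Exact power of the one-sided Fisher test with n patients per group."""
    pmf_t = [math.comb(n, k) * p_treat**k * (1 - p_treat) ** (n - k) for k in range(n + 1)]
    pmf_c = [math.comb(n, k) * p_control**k * (1 - p_control) ** (n - k) for k in range(n + 1)]
    power = 0.0
    for x1 in range(n + 1):
        for x2 in range(n + 1):
            # Add the probability of every outcome that falls in the rejection region.
            if fisher_one_sided_p(x1, x2, n, n) <= alpha:
                power += pmf_t[x1] * pmf_c[x2]
    return power


n_per_group = inflate_for_dropout(43, 0.10)   # evaluable n=43, 10% dropout
total_enrolment = 3 * n_per_group             # three first-dose conditions
print(n_per_group, total_enrolment)
print(first_dose_counts(total_enrolment, SEQUENCE_WEIGHTS))
print(exact_power(0.50, 0.10, 43), exact_power(0.50, 0.20, 43))
```

Under the assumed mapping, each first-dose condition (nicotinamide, 5 mg, 25 mg) receives one third of the enrolled patients, which matches the per-group target used in the sample-size statement. The power values printed at the end depend on the assumed test and should not be read as the protocol's planned power.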
Confirmatory analysis is planned on an intention‑to‑treat basis using logistic regression with treatment indicators, baseline HAM‑D and centre strata as covariates, embedded in a fixed‑sequence testing procedure: first test superiority of 25 mg versus nicotinamide (one‑sided α=0.025), and, if significant, test 25 mg versus 5 mg. Secondary efficacy analyses use mixed‑effects linear models (REML) for total HAM‑D scores at 1, 6 and 12 weeks with the same covariates and time as fixed effects; nonparametric checks are planned. Safety analyses are descriptive and compared across arms primarily with ANOVA.
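The responder definition and the fixed-sequence (gatekeeping) logic described above can be illustrated as follows. This is a minimal sketch under stated assumptions: the p-values are made-up inputs standing in for the output of the planned logistic regression, the second comparison is assumed to be tested at the same one-sided level as the first, and the function names are hypothetical.

```python
def is_responder(baseline_hamd, week6_hamd):
    """Responder: at least a 50% reduction in HAM-D total score from baseline."""
    return (baseline_hamd - week6_hamd) >= 0.5 * baseline_hamd


def fixed_sequence_test(ordered_p_values, alpha=0.025):
    """Test hypotheses in a pre-specified order, each at the full alpha.

    A hypothesis can only be rejected if every earlier one was rejected;
    after the first non-significant result, all later tests are skipped.
    """
    results = []
    gate_open = True
    for label, p_value in ordered_p_values:
        rejected = gate_open and p_value <= alpha
        results.append((label, rejected))
        gate_open = rejected
    return results


# Illustrative p-values only; they are not trial results.
example = fixed_sequence_test([
    ("25 mg vs nicotinamide", 0.004),
    ("25 mg vs 5 mg", 0.060),
])
print(example)
```

Because the order of the comparisons is fixed in advance, this procedure controls the overall type I error without splitting alpha between the two tests; the price is that the 25 mg versus 5 mg comparison cannot be declared significant if the first comparison fails.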
Results
This paper does not report trial outcome data. It is a rationale and study‑design/protocol report describing the planned Phase IIb trial procedures, sample size calculations and statistical analysis plans. The extracted text details the assumptions underlying the sample‑size computation and the planned analytical methods but contains no efficacy, safety or follow-up results because the trial was ongoing or not yet analysed at the time of writing.
Discussion
Mertens and colleagues frame the trial design around several methodological problems that are particularly salient in psychedelic research: difficulty maintaining blinding, pronounced expectation bias that can generate nocebo effects in comparator arms, and the challenge of selecting an appropriate comparator dose. They argue that expectations influence response rates and dropout and that, because blinding is hard to maintain with classical psychedelics, comparator conditions and trial procedures must be chosen to minimise bias. The paper discusses a recent psilocybin versus escitalopram trial and interprets its escitalopram arm’s small improvement as possibly driven by disappointment (a nocebo effect) when participants inferred they had not received psilocybin before the antidepressant could plausibly have an effect. To mitigate nocebo and ethical concerns, the protocol offers every patient at least one high, putatively therapeutic 25 mg psilocybin dose and schedules the primary endpoint assessment immediately before the second dosing session at six weeks. This approach is intended both to reduce disappointment-driven deterioration in comparator arms and to prevent differential dropout from inflating apparent treatment effects. The investigators also justify using two comparators—an active placebo (nicotinamide) and a low psilocybin dose (5 mg)—to support blinding and to test the feasibility of a low‑dose comparator that may produce mild subjective effects without expected therapeutic efficacy. Nicotinamide was chosen as an active placebo because niacin, which induces flushing, is not available in the EU; nicotinamide is its amide and was encapsulated to match psilocybin preparations. 
The authors highlight unresolved epistemological and methodological questions that the field faces: whether a profound subjective psychedelic experience is necessary for durable therapeutic benefit, what the minimum effective dose is, and how to disentangle subjective experience, set/setting and psychotherapeutic support from pharmacological effects. They note that dose‑ranging trials are necessary but may not fully resolve the problem because the therapeutic mechanism may depend on subjective experience. Finally, the investigators contend that the EPIsoDE design addresses some of these challenges—through its comparator strategy, psychotherapeutic framework, repeated dosing and blinded outcome assessment—and that the trial therefore represents a meaningful methodological contribution to clinical research on psychedelics. The authors acknowledge broader uncertainties about blinding and expectation effects and frame their design choices as both ethically motivated and aimed at improving assay sensitivity in this difficult area.
CONCLUSION
Here we present the rationale and study design of our phase IIb trial on the efficacy and safety of psilocybin in TRD. The study addresses several crucial methodological questions in psychedelic drug trials. Specifically, these are the risk of unblinding and nocebo effects, prevalent "expectation bias", and the choice of the most adequate comparator dose. In clinical trials of psychotropic drugs, the experimental drug is usually compared to a placebo and/or an active comparator with proven efficacy in that same indication. While there has been some debate over the last decade whether the effect size in antidepressant clinical trials against placebo might be exaggerated due to unblinding through the perception of side effects in the verum arm, there is general consensus that classical antidepressant drugs have a superior albeit modest efficacy compared to placebo, and that their efficacy cannot be attributed to the perception of side effects alone. However, the treatment effect in randomized controlled trials (RCTs) is clearly modulated by patients' and therapists' expectations. Response rates are larger and dropout rates smaller in RCTs without a placebo control, and the higher the probability of receiving placebo in a particular trial, the smaller the response rate and the higher the dropout rate. It has been concluded from these findings that the probability of receiving placebo should be considered when interpreting and synthesizing results from RCTs in major depression. The effect of this "expectation bias" by both clinicians and patients on outcomes in clinical trials should not be underestimated; it influences the magnitude of symptom reduction with all applied (not only psychiatric) treatments. In depression trials, the magnitude of improvement follows the pattern of accepted expectations when the trial design is known to the investigators, and it is lower in trials with investigators and staff blinded to the trial design and execution.
When the level of blinding is high and it is difficult for investigators, raters and patients to guess treatment assignment, the differences between the treatment conditions become small. These assumptions are of paramount importance in interpreting psychedelic studies, in which blinding is nearly impossible and expectations of therapy are particularly high. For example, findings from a recent trial by Carhart-Harris et al. comparing psilocybin with the standard antidepressant escitalopram were interpreted as showing that psilocybin was at least on a par with the comparator antidepressant. Apart from the fact that the study was not sufficiently powered to exclude a statistical type II (beta) error, this interpretation is inconsistent with the existing evidence from studies with antidepressants. At assessment of the primary endpoint, at six weeks, the total HAM-D score had decreased on average by 27.7 percent with escitalopram treatment. This represents a smaller change than that usually observed with placebo in escitalopram trials or in antidepressant trials in general. Similar observations hold true for the Montgomery-Asberg Depression Rating Scale (MADRS). Thus, in the trial by Carhart-Harris et al., patients treated with escitalopram had a symptom improvement that was at most on placebo level and fell short by approximately 50 percent of the established magnitude of the antidepressant effect of escitalopram. Unlike in other clinical trials with psychotropic drugs, for which potential patients are approached by investigators, patients usually self-register for trials with psychedelics. In an atmosphere of great public euphoria about these substances, patients associate them with great hopes for an alleviation of their suffering. Patients who wish to be treated with a selective serotonin reuptake inhibitor (SSRI) such as escitalopram do not participate in a clinical trial, but instead consult their psychiatrist or their general practitioner (GP).
Due to the unique subjective effects of classical psychedelics at high doses, blinding of treatment with a classical psychedelic is extremely difficult in an RCT. Hence, the most probable reason that patients treated with escitalopram achieved no more than placebo-level improvement was disappointment about presumably not being in the psilocybin condition. This was likely evident to them (or strongly assumed) after the first dosing session, before the first escitalopram pill had even been taken. This could be called a "nocebo" effect, and it underlines the observation that in clinical trials with antidepressants the condition "treatment as usual" sometimes produces worse results than placebo treatment. These considerations have several implications: First, while placebo-controlled studies with a new experimental drug often require a third arm with an active, established comparator in order to demonstrate the "assay sensitivity" of the study, a "real" placebo condition is required in studies of psychedelics with active comparators for the same reason. Second, in order to minimize "nocebo" effects in comparator arms, treatment with a potentially effective, high dose of the psychedelic should be offered to every randomized patient after assessment of the primary endpoint. This approach of repeated dosing has been used in earlier studies, but we propose it as a methodological standard in clinical trials with psychedelics, not only for reasons of treatment ethics but also to reduce a potential nocebo effect. In our study, the primary endpoint is assessed immediately before the second dosing session. This procedure prevents the primary endpoint measurement in two-thirds of patients from being driven solely by disappointment at not having received the supposedly effective, high dose of psilocybin.
This also significantly reduces the risk of artificially increasing the dropout rate in the treatment arms with patients who have received the supposedly less or ineffective treatment. In studies that are usually analyzed using the Last-Observation-Carried-Forward (LOCF) method, higher dropout rates in these arms would inflate the effectiveness of the supposedly more effective therapy. This might be especially important in studies of psychedelics for relapse prevention in substance use disorders, in which patients may immediately revert to their harmful use patterns after the experience that they have a high likelihood of not having received the potentially effective therapy. But also in depressed patients, whose condition is inherently characterized by negative thinking, depressed mood, hopelessness and even suicidal ideations or behavior, realizing that one has likely not received the high-dose psychedelic in a one-dose trial design, will most likely lead to a symptom deterioration. Finally, for patients who have often made numerous unsuccessful attempts at treatment and who might have down-titrated their previous antidepressant medication solely for their study participation, it is an ethical imperative to offer them the potentially effective therapy. In our study, every patient is administered at least one high, putatively effective dose of psilocybin. Since one-sixth of patients receive the high dose twice, we also expect an indication of whether two high doses are more effective than one. Attempts have been made to address the problem of blinding, which is very difficult to maintain, in studies with psychedelics by administering so-called "active placebos" or low, potentially ineffective doses of the test substance as comparators. 
This has also been recommended by the Food and Drug Administration (FDA) and the National Institutes of Health (NIH) during the annual meeting of the American Society of Clinical Psychopharmacology (ASCP; 2019) (www.psychedelic.support/resources/fda-nih-perspectives-psychedelic-drugdevelopment/). An active placebo, or "concealed placebo" according to its original description in a German article published in 1959, is a placebo that produces subjective side effects that lead the patient to believe that they are receiving active treatment. A compound that has been suggested as an active placebo is niacin. Niacin, or nicotinic acid, a form of vitamin B3, causes peripheral side effects such as flushing, itching or tingling and abdominal discomfort. Since niacin is not approved in the European Union, we used nicotinamide, the amide of niacin, in our study. Nicotinamide, which can be converted to niacin, usually causes the niacin-like side effects only at high doses. Another active placebo that has been used to blind the effects of psilocybin in healthy volunteers is the stimulant methylphenidate. However, healthy subjects and patients usually know the typical effects of psychedelics at least from the literature or the press, and they can relatively easily distinguish them even from psychoactive active placebos (e.g., methylphenidate). An alternative approach that has therefore been proposed is to administer low, supposedly ineffective doses of the experimental compound. One research group administered psilocybin to patients with life-threatening cancer and symptoms of depression and/or anxiety. They compared a high dose (22 or 30 mg/70 kg body weight) with a very low dose (1 or 3 mg/70 kg body weight) and called the latter "placebo-like", still assuming that such low doses are associated with some subjective experience, without having a "therapeutic" effect.
The determination of these dosages was based at least in part on a study published by the same authors in 2011, in which they administered five different doses of psilocybin (0, 5, 10, 20, 30 mg/70 kg body weight) to healthy volunteers. They documented significant acute perceptual and subjective effects already at the 5 mg dose, but only the two highest doses induced "mystical-type" experiences, and only those were associated with long-lasting, persisting effects on attitudes, mood, and behavior. Based on these observations, we chose the 5 mg dose of psilocybin as a supposedly therapeutically ineffective dose, while assuming that subjective effects will be noticeable to most patients. Hence, we implemented two comparator arms in order to support the blinding and to test the feasibility of a 5 mg dose as a comparator condition in psilocybin trials. The questions that are touched on here are central to all treatment research with psychedelics. Is a profound psychedelic experience necessary for a long-term therapeutic effect? If so, what dose is necessary and at what dose is the relationship between effectiveness and side effects and risks optimal? How is the therapeutic outcome modulated by the quality of the acute psychedelic experience, and what role do the setting and psychotherapeutic interventions play? Which acute and subacute effects of psychedelics are to be considered side effects or AEs, and which of them, although maybe unpleasant, are relevant for the therapeutic process and hence beneficial or even required for a treatment response? While dose-ranging studies are necessary to determine the minimum effective dose required by regulators, the methodological challenge is substantial and possibly even unsolvable, because there is no other field in pharmacology in which the therapeutic effect seems to be linked as much to the subjective experience of the effect.
Even innovative substances under development that behave pharmacologically like psychedelic drugs without being associated with a psychedelic subjective effect upon administration to humans will not be able to solve the epistemological problem. If they are therapeutically ineffective, they will not be able to answer the question of whether the psychedelic experience is necessary for the therapeutic effect. On the other hand, if they are effective, they can no longer be called psychedelics. In conclusion, the field of trial methodology faces challenges that are very specific to clinical trials with psychedelic drugs. We are confident that our trial manages to address some of these methodological challenges and hence represents a significant contribution to this rapidly emerging field.
Study Details
- Study Type: individual
- Population: humans
- Characteristics: placebo controlled, double blind
- Journal
- Compound
- Topic