LSD, Psilocybin, Placebo

Self-blinding citizen science to explore psychedelic microdosing

This self-blinding experiment (n=191) finds that the placebo and microdosing groups both experienced similar improvements in self-rated psychological well-being and cognitive function (e.g. mood, energy, creativity) after four weeks. This study provides more evidence that microdosing benefits can be attributed to expectancy (placebo) effects.

Authors

  • Fernando Rosas

Published

eLife
Individual Study

Abstract

Microdosing is the practice of regularly using low doses of psychedelic drugs. Anecdotal reports suggest that microdosing enhances well-being and cognition; however, such accounts are potentially biased by the placebo effect. This study used a 'self-blinding' citizen science initiative, where participants were given online instructions on how to incorporate placebo control into their microdosing routine without clinical supervision. The study was completed by 191 participants, making it the largest placebo-controlled trial on psychedelics to-date. All psychological outcomes improved significantly from baseline to after the 4 weeks long dose period for the microdose group; however, the placebo group also improved and no significant between-groups differences were observed. Acute (emotional state, drug intensity, mood, energy, and creativity) and post-acute (anxiety) scales showed small, but significant microdose vs. placebo differences; however, these results can be explained by participants breaking blind. The findings suggest that anecdotal benefits of microdosing can be explained by the placebo effect.


Research Summary of 'Self-blinding citizen science to explore psychedelic microdosing'

Introduction

There is renewed scientific and clinical interest in psychedelic drugs such as LSD and psilocybin, principally in the context of psychedelics-assisted psychotherapy where a few large doses are given alongside psychotherapy. An alternative pattern of use, "microdosing", has become popular; it generally refers to repeated low doses (commonly 10-20% of a typical full dose) taken one to three times per week. Anecdotal reports and uncontrolled observational studies have suggested benefits for well-being, creativity and cognition, but these designs are vulnerable to expectancy, confirmation-bias and placebo effects, and randomised controlled laboratory studies to date have been small and focused on single acute doses rather than repeated use. The combination of (i) uncontrolled naturalistic evidence and (ii) underpowered controlled trials leaves uncertainty about whether reported benefits of microdosing are pharmacological or placebo-driven. Szigeti and colleagues set out to address these limitations by developing a low-cost, large-sample, placebo-controlled citizen-science approach they call "self-blinding". The study aimed to test whether repeated psychedelic microdosing produces superior accumulative, acute and post-acute effects compared with placebo on psychological state and cognitive function. The primary endpoint was after a core 4-week dosing period (week 5 of a 10-week protocol), and the investigators hypothesised that improvements from baseline would correlate with the number of microdoses taken and that acute/post-acute outcomes would be better on/after microdose days compared with placebo days. The design combined prospective online assessments and cognitive testing with randomisation and participant-implemented blinding to obtain a large, ecologically valid sample at minimal cost.

Methods

The study used a three-arm, participant-randomised, self-blinding design. Participants prepared opaque capsules at home: a set of capsules containing their chosen microdose substance and a matching set of empty placebo capsules. Weekly capsule sets were sealed into envelopes with non-human-readable QR codes; a constrained semi-random draw produced one of three possible sequences corresponding to the three study groups (placebo-only, half/half, or microdose-heavy). Participants opened one envelope per week across a 4-week dose period and took capsules according to the envelope schedule. Scanning the QR code allowed the study team to reconstruct allocation after the fact without revealing it to participants. Adults (>18) with prior psychedelic experience who intended to microdose were recruited online and consented via a study website. Inclusion criteria included English proficiency, prior psychedelic use, and agreement not to use other psychedelics during the study period; no biochemical verification of substances or doses was performed. Participants were free to use any psychedelic and were instructed to use their usual microdose amount; reported average doses were ~13 ± 5.5 µg for LSD/analogues and 0.2 ± 0.12 g for psilocybin-containing mushrooms (psilocybin doses were converted to LSD-equivalents for some analyses). Recruitment resulted in an educated, largely male sample (mean age 33.5 ± 9.4; 70% male). Outcomes were organised by timescale. Accumulative outcomes (baseline, week 5 primary endpoint, optional week 9 follow-up) included Ryff's Psychological Well-being (RPWB), the Cognitive and Affective Mindfulness Scale (CAMS), Satisfaction With Life (SWL), Green Paranoid Thought Scales (GPTS), Big Five personality traits (B5) and a composite cognitive performance score (CPS) derived from six Cambridge Brain Sciences tasks, with learning effects removed. 
Acute outcomes were assessed weekly 2–6 hours after capsule ingestion and included PANAS (positive/negative affect), visual analogue scales (VAS) for drug intensity, mood, energy, creativity, focus and temper, and cognitive tasks. Post-acute outcomes were collected 48–72 hours after the last capsule of a week. Statistical analyses used mixed-effect repeated-measures models (SAS PROC MIXED) for accumulative outcomes with change from baseline as the dependent variable, and mixed linear models for acute/post-acute outcomes with subject as a random effect. Models adjusted for significant baseline covariates tested from a prespecified list (age, sex, education, baseline score, dose, short suggestibility scale, expectation score, psychiatric history, current psychiatric medications, lifetime macrodose experiences and months microdosing). Planned comparisons included within-group change from baseline to week 5 and week 9, and between-group comparisons (placebo vs half/half and placebo vs microdose-heavy). Participants were also asked to guess daily and weekly whether they had taken a microdose or placebo; guess data were incorporated into additional models and a 2x2 stratified analysis (condition x guess) to explore blinding integrity and expectancy effects.
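The dose conversion mentioned above can be illustrated with a small sketch. It assumes the linear factor given in the paper's statistical analysis (0.1 g of dried mushroom is roughly 4.6 µg LSD); the function name is hypothetical and is not taken from the study's own code.

```python
# Conversion factor reported in the paper: 0.1 g dried mushroom ~ 4.6 µg LSD.
UG_LSD_PER_GRAM_MUSHROOM = 4.6 / 0.1  # about 46 µg LSD-equivalent per gram


def mushroom_to_lsd_equivalent(grams: float) -> float:
    """Estimate the LSD-equivalent dose in µg (hypothetical helper).

    Assumes the conversion is linear in mushroom mass, which is a
    simplification: potency varies between mushroom batches.
    """
    if grams < 0:
        raise ValueError("dose must be non-negative")
    return grams * UG_LSD_PER_GRAM_MUSHROOM


# Mean reported mushroom microdose in this sample was 0.2 g.
print(round(mushroom_to_lsd_equivalent(0.2), 1))
```

Under this linear assumption, the mean reported mushroom dose of 0.2 g corresponds to roughly 9.2 µg LSD-equivalent.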

Results

Sample sizes were n = 240 at baseline, n = 191 at the week 5 primary endpoint, and n = 159 at week 9. The sample was predominantly male (70%), middle-aged (mean 33.5 ± 9.4) and pro-psychedelic in attitude. Most participants microdosed LSD (n = 147, 61%) or LSD analogues (n = 33, 14%); 57 participants (24%) used psilocybin-containing mushrooms and three used other psychedelics. Average reported microdoses were ~13 ± 5.5 µg LSD-equivalent for LSD/analogues and 0.2 ± 0.12 g for mushrooms. Randomisation produced no significant baseline differences across groups. Accumulative (within-group) results: In the microdose-heavy group (MD) from baseline to week 5, self-reported psychological outcomes improved significantly: RPWB increased by 4.2 ± 3.9 (p=0.04), CAMS by 2.4 ± 1.1 (p<0.001), SWL by 1.2 ± 1.2 (p=0.04), and GPTS decreased by −5.0 ± 1.7 (p<0.001). Personality changes included reduced neuroticism (−1.3 ± 0.9, p<0.01) and increased openness (0.9 ± 0.8, p=0.03). However, similar improvements (notably in mindfulness and reduced paranoia) were observed in the placebo (PL) and half/half (HH) groups: CAMS increased and GPTS decreased significantly in all groups, and neuroticism also fell in PL (−1.0 ± 1.0, p=0.04). Cognitive performance (CPS) did not improve in the MD group, and between-group comparisons for accumulative outcomes at week 5 showed no significant differences between PL and MD. Acute and post-acute results (initial models without guess): Acute measures taken on dose days showed higher scores under MD compared with PL for PANAS (acute emotional state; adj. mean difference 2.2 ± 1.4, p<0.01), and for VAS items drug intensity (12.5 ± 3.0, p<0.001), mood (4.6 ± 2.9, p<0.001), energy (5.3 ± 2.7, p<0.001) and creativity (4.7 ± 2.6, p<0.001). Effect sizes were small (Cohen's d < 0.3) for all scales except drug intensity (d = 0.58). 
Among post-acute measures, only trait-state anxiety (STAIT) differed (−1.4 ± 1.3, p=0.03) favouring microdose weeks. Blinding integrity and the role of guess: Participants were asked to guess daily and weekly; the overall break-blind rate (proportion of correct guesses) was 0.72 ± 0.18, higher than a simple random-guess expectation (reported comparison: random 0.62). Specificity (true negative rate for placebo) was 0.82 ± 0.16 and sensitivity (true positive rate for microdose) 0.45 ± 0.30. Break-blind rate correlated with reported dose (higher doses associated with better guessing), and an estimated detection threshold where guessing exceeded random was ~12 µg LSD-equivalent. When guess was added to acute/post-acute models, the condition effect (PL vs MD) lost significance for all scales except acute drug intensity (adjusted mean difference 3.4 ± 2.0, p<0.001). The guess*condition interaction was non-significant except for drug intensity. Stratified 2x2 analyses (condition by guess) showed that when guess was held constant there were no significant differences between PL and MD (except drug intensity), whereas when condition was held constant, outcomes were consistently better when participants believed they had taken a microdose (21 of 22 comparisons across acute and post-acute scales). Cognitive tasks and CPS, which are less subjective, showed no effects of either drug condition or guess. These patterns indicate that participants’ beliefs about taking a microdose largely explained the observed improvements, with a small genuine drug-related effect detectable only on subjective drug intensity.
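The blinding metrics reported above (break-blind rate, sensitivity, specificity) can be sketched as follows. This is a minimal illustration under stated assumptions: it pools all capsule days into one table, whereas the reported values appear to be summarised across participants; the function name and the toy data are hypothetical.

```python
def blinding_metrics(conditions, guesses):
    """Compute guess accuracy from paired capsule types and guesses.

    conditions, guesses: equal-length sequences of "MD" or "PL",
    one pair per capsule day (assumed data layout).
    """
    pairs = list(zip(conditions, guesses))
    md_guesses = [g for c, g in pairs if c == "MD"]
    pl_guesses = [g for c, g in pairs if c == "PL"]
    if not md_guesses or not pl_guesses:
        raise ValueError("need at least one MD day and one PL day")
    return {
        # Proportion of days on which the guess matched the capsule.
        "break_blind_rate": sum(c == g for c, g in pairs) / len(pairs),
        # True positive rate: microdose days guessed as microdose.
        "sensitivity": md_guesses.count("MD") / len(md_guesses),
        # True negative rate: placebo days guessed as placebo.
        "specificity": pl_guesses.count("PL") / len(pl_guesses),
    }


# Toy data, invented purely for illustration.
conditions = ["MD", "MD", "PL", "PL", "PL"]
guesses = ["MD", "PL", "PL", "PL", "MD"]
print(blinding_metrics(conditions, guesses))
```

A specificity well above the sensitivity, as reported in the study, would mean placebo days were recognised more reliably than microdose days.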

Discussion

Szigeti and colleagues interpret the findings as supporting the hypothesis that reported psychological benefits of repeated microdosing in a real-world sample are attributable primarily to placebo or expectancy effects rather than to a robust pharmacological action of low-dose psychedelics. Although the microdose group improved from baseline across a range of self-reported measures, parallel improvements occurred in placebo and half/half arms and no significant between-group differences were observed for accumulative outcomes. The initial acute and post-acute differences also disappeared after accounting for participants' guesses about capsule content, except for a small drug-related increase in subjective drug intensity. Cognitive outcomes, which are less vulnerable to subjective expectation, did not improve under microdosing. The authors place these results in the context of prior small controlled studies and uncontrolled observational work, suggesting that uncontrolled designs are especially vulnerable to expectancy bias in this area. They highlight the practical challenge that doses sufficient to produce detectable pharmacological effects may also compromise blinding, and note that participants’ ability to detect active capsules rose with dose, with an estimated psychoactive detection threshold around 12 µg LSD-equivalent in this sample. The discussion stresses that high break-blind rates, combined with positive attitudes toward psychedelics in the recruited sample, likely amplified placebo-like improvements. Key limitations acknowledged by the investigators include the inability to verify substance identity, purity or exact dose (participants sourced their own drugs), the lack of objective confirmation of adherence to the self-blinding procedure or capsule ingestion, and the observational nature of recruitment which limits clinical inference. 
They also note that guess was recorded after assessments, so causal direction between performance and guess cannot be established definitively, although many participants reported thinking about whether they had taken a microdose prior to being asked to guess. The sample was largely healthy with high baseline well-being, reducing scope for improvement; two post-hoc analyses in more vulnerable subsamples (lowest 25% well-being and highest 25% neuroticism) did not reveal between-group benefits. Finally, the authors advocate the self-blinding citizen-science method as a low-cost way to implement randomisation and placebo control at scale, useful as a screening or exploratory tool for trending interventions. They recommend future trials consider active placebos or different dosing strategies and place emphasis on assessing blinding integrity, especially given the interaction between expectancy and reported effects. Despite limitations, the investigators conclude that their largest placebo-controlled psychedelic study to date provides evidence that the positive self-reported effects of naturalistic microdosing are largely explained by expectancy/placebo mechanisms rather than direct pharmacological benefit.


SECTION

There is renewed interest in the medical application of psychedelic drugs, such as lysergic acid diethylamide (LSD) and psilocybin. Contemporary research is predominantly focusing on 'psychedelics assisted psychotherapy', where a few (one to three) large doses of psychedelics are used as adjunct to psychotherapy. Using this paradigm, psychedelics have shown promise in the treatment of conditions such as depression, end-of-life anxiety, addiction, and obsessive-compulsive behaviors. Recently, 'microdosing' has emerged as an alternative paradigm of psychedelic use. Due to its underground origin, microdosing does not have a universally agreed upon definition, and inconsistencies exist in substance, dose, frequency, and duration of use. However, microdosing can be broadly defined as the frequent use (one to three times per week) of low doses of psychedelics (10-20% of a typical 'full' dose, e.g. 10-15 μg LSD or 0.1-0.3 g of dried psilocybin-containing mushrooms). Anecdotal evidence suggests that microdosing may improve well-being, creativity, and cognition, and recent uncontrolled, observational studies have provided some empirical support for these claims. While encouraging, these studies are vulnerable to experimental biases, including confirmation bias and placebo effects, in particular because microdosers are a self-selected sample with optimistic expectations about psychedelics and microdosing. This positivity bias, combined with the low dose and the subjective evaluation of effects, paves the way for a strong placebo response. A few recent double-blind, controlled studies have been conducted on microdosing. All studies used LSD and focused on the acute effects of a single microdose in a small number of healthy subjects. Studies have found large variability in LSD blood concentration after microdosing, along with increased BDNF blood levels. 
No robust evidence was found to support the positive anecdotal claims about microdosing, but some dose-related self-rated subjective effects were detected (e.g. self-ratings of 'feel drug', 'feel high', and 'like drug'), along with concomitant changes in brain function. Two key issues need to be considered when assessing the scientific credibility of microdosing: the lack of placebo control in uncontrolled studies and the small sample size in controlled studies. Uncontrolled, observational studies affirm the anecdotal reports, but by design, these studies cannot provide evidence for beyond-placebo benefits. Lab-based, controlled studies have small samples due to restrictive drug policies that render randomized controlled trials prohibitively expensive, and hence may be statistically underpowered. In the present study we conceived of a novel citizen-science initiative as a solution to this problem, exploiting modern technology and the popularity of microdosing. The key component is a self-blinding setup procedure that enabled self-experimenters, who microdose on their own initiative using their own psychedelic, to implement placebo control and randomization without clinical supervision. To investigate potential changes over the study period, participants were directed to online self-report surveys and cognitive tasks at various timepoints. The strength of this design is that it allowed us to obtain a large sample size while implementing placebo control at minimal logistic and economic costs. The primary objective of the study was to test whether psychedelic microdosing produces superior outcomes compared to placebo on psychological state and cognitive function. We hypothesized that improvements from baseline would be positively correlated with the number of microdoses taken during the dose period and that acute/post-acute outcomes would be better under/after taking a microdose. 
This study had a naturalistic design involving elements of experimental control (self-blinding), prospective data collection and online citizen-science. From baseline to the final endpoint, the study was 10 weeks long (weeks 0-9), including a core 4-week microdosing period. Primary endpoint was at week 5 and there was an optional follow-up at week 9. The self-blinding procedure randomly assigned individuals to three groups, where the groups are defined by the number of weeks taking placebos/microdoses during the dose period.

DESIGN

Individuals took two microdoses during each microdose week, resulting in 0/4/8 total microdoses for the PL/HH/MD groups. Participants had equal probability (1/3) of being assigned to each group; the figure illustrates the experimental timeline and the groups' dose schedule. Outcomes can be organized into three categories capturing the effects of microdosing on different timescales. Accumulative: assessed monthly, first at baseline, then after the completion of the dosing regime at week 5, and finally at the optional long-term follow-up at week 9. Accumulative outcomes were: Ryff's psychological well-being (RPWB), cognitive and affective mindfulness scale (CAMS), satisfaction with life scale (SWL), green paranoid thought scales (GPTS). Acute: assessed weekly during the dose period on Thursdays, when either a microdose or placebo capsule was taken. The testing was carried out 2-6 hr after the ingestion of the capsule, while the potential microdose was active. Acute outcomes were positive and negative affect schedule (PANAS), visual analogue scale items (drug intensity, mood, energy, creativity, focus, and temper) and cognitive performance (see Accumulative above for details). An overview of the outcomes can be found in the table, and a description of each measure is in Appendix 1. See the figure for the experimental timeline and assessment timepoints.
A high-level overview of self-blinding is provided here; for a detailed illustration see the figure. First, two sets of capsules had to be prepared using nontransparent capsules: one set with microdoses inside and another set without anything inside (placebos). Next, these capsules were packaged into weekly sets, which were then placed inside envelopes together with a QR code (Figure). The envelopes were grouped and shuffled. Then, using a semi-random drawing process, four of them were selected (Figure), corresponding to the 4 weeks of the dose period (i.e. each envelope held capsules for 1 week of the dose period). The drawing process was constrained such that only three combinations of the envelopes were possible to draw, matching the three study groups: placebo (four placebo weeks), half/half (two placebo and two microdose weeks), and microdose (four microdose weeks). At this stage, participants were ready to start the experiment. When the dose period started, one envelope was opened per week and the capsules inside were used as scheduled (Figure). Additionally, the QR code from the envelope had to be scanned, which shared a numeric code with our informatics infrastructure. The decryption key (i.e. how capsule types are encoded by the numbers) was not shared with participants, so the numeric code allowed only us to deduce which type of capsule was taken when. In summary, the two key elements of self-blinding are (1) to hide the active components inside opaque capsules while preparing identical-looking placebos and (2) to position non-human-readable QR codes alongside the capsules prior to randomization. With the QR codes in place, it is possible for the experimenter to recover knowledge of capsule types after randomization without revealing that information to participants.
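The allocation logic described above can be sketched in software, although in the study it was carried out physically with shuffled envelopes. The sketch assumes equal 1/3 probability per group and two microdoses per microdose week, both taken from the text; the random ordering of weeks in the half/half group is an assumption made here for illustration, and all function names are hypothetical.

```python
import random

MICRODOSES_PER_MD_WEEK = 2  # from the text: two microdoses per microdose week


def draw_schedule(rng: random.Random) -> tuple[str, list[str]]:
    """Simulate the constrained draw: returns (group, four weekly envelope types)."""
    group = rng.choice(["PL", "HH", "MD"])  # equal probability of 1/3 each
    if group == "PL":
        weeks = ["PL"] * 4
    elif group == "MD":
        weeks = ["MD"] * 4
    else:
        weeks = ["PL", "PL", "MD", "MD"]
        # Assumption: the order of placebo and microdose weeks is random.
        rng.shuffle(weeks)
    return group, weeks


def total_microdoses(weeks: list[str]) -> int:
    """Total microdoses over the 4-week dose period."""
    return MICRODOSES_PER_MD_WEEK * weeks.count("MD")


rng = random.Random(2021)  # fixed seed so the example is reproducible
group, weeks = draw_schedule(rng)
print(group, weeks, total_microdoses(weeks))
```

Whatever the draw, the total comes out as 0, 4 or 8 microdoses, matching the PL/HH/MD groups.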

MICRODOSE PREPARATION

Participants were allowed to use any psychedelic substance to microdose with. The microdose dose, which is the amount of substance to use as a microdose, was not defined for participants; rather, they were instructed to use a microdose dose that they would use outside the study. The rationale for this direction was threefold. First, given that participants typically would source their substance from the black market, the precise microdose dose could not have been known even if instructions requested it. Second, based on community feedback, most experienced microdosers have a preferred dose that they would not have liked to change to participate in the study. Lastly, this study was not a clinical trial, so from a regulatory perspective we could not control or direct drug doses. Psychedelics users were recruited through advertisement on relevant online and offline forums. Individuals could sign up through the study's website, where they could find information about the study, including the study manual and explainer videos, the participant information sheet, and the procedure for declaring informed consent. Once informed consent was given, individuals were able to sign up by providing their email address and planned start date. The inclusion criteria were: >18 years of age, good understanding of English, intention to microdose with psychedelics, previous experience with psychedelics (either micro- or macrodosing), no use of psychedelic drugs from a week before the start until the completion of the post-regime timepoint (other than the study's microdoses), and willingness to follow the study protocol. All the questionnaires were implemented online using the SurveyGizmo platform. For the online assessment of cognitive performance, the Cambridge Brain Sciences service was used. At each timepoint, links to each test were sent in a dedicated email via the Psychedelics Survey service. These links had a personal ID embedded, so each test completion could be matched to individuals.

BLIND BREAKING AND COLLECTION OF GUESS DATA

Participants were asked to guess which type of capsule they had taken that day during the dose period (for days when a capsule was taken). This guess was a forced binary choice between microdose and placebo options. At the end of the post-acute test sessions, participants were asked separately to guess whether the current week was a microdose or a placebo week (Figure). In the discussion of our results, the term 'break blind' indicates that the participant guessed the capsule correctly for the day (acute outcomes) or week (post-acute outcomes). No guess was collected about perceived group allocation at the end of study, because information about group structure was not shared with participants.

STATISTICAL ANALYSIS

Group differences in demographics, recreational drug use, and baseline scores of the accumulative outcomes were assessed with ANOVA and chi-square tests for continuous and categorical variables, respectively. Accumulative outcomes were analyzed with mixed-effect repeated measurement models, using the SAS PROC MIXED method with compound symmetry covariance structure. Models were constructed with change from baseline as the dependent variable, group, time and group*time interaction as factors, and individuals as experimental unit. Models were adjusted for all significant baseline covariates (the following variables were tested as potential covariates: age, sex, education, baseline score, dose, total dose, short suggestibility scale score, expectation score, number of past psychiatric diagnoses, number of current psychiatric medications, number of lifetime macrodose experiences, and number of lifetime months microdosing). To accommodate dose as a potential covariate, psilocybin mushroom mass was converted to an estimated equivalent LSD dose (0.1 g of dried mushroom ~4.6 µg LSD). The following planned comparisons were made: within-group comparisons of change over time from baseline to the primary endpoint at week 5 and from baseline to the final follow-up at week 9. Additionally, between-group comparisons were made (PL vs. HH and PL vs. MD) at week 5 and week 9. To analyze acute and post-acute outcomes, mixed linear models were constructed. Models included score as dependent variable, subject ID as a random effect, and condition as fixed effect, where condition was a binary categorical variable (PL/MD). For acute outcomes, condition was PL/MD when the score was obtained under the influence of a placebo/microdose capsule, while for post-acute outcomes condition was PL/MD when the score was obtained at the end of a placebo/microdose week. Planned comparisons were made between scores obtained under PL and MD conditions. Each participant contributed four scores to these models, corresponding to the four acute/post-acute assessment timepoints during the dose period. All acute/post-acute models were adjusted for all significant baseline covariates (the same variables were tested for significance as for the accumulative outcomes, except baseline score and total dose consumed). To better understand how guess influenced scores, a second set of models was constructed with the addition of guess (binary categorical variable, PL/MD) and guess*condition factors. Using these guess-adjusted models, planned comparisons were made between PL and MD conditions. Finally, the two binary variables (condition and guess) divided the data into 2*2 = 4 strata; post-hoc comparisons were made between the following strata (condition/guess): PL/PL vs. MD/PL, PL/MD vs. MD/MD, PL/PL vs. PL/MD and MD/PL vs. MD/MD. This selection was made such that condition changes while guess remains fixed in the first two comparisons, and guess changes while condition remains fixed in the last two comparisons.
The study only engaged people who planned to microdose through their own initiative with their own psychedelic substance, but who consented to incorporate placebo control to make their self-experimentation compatible with our study. Investigators did not endorse any use of psychedelics, and no financial compensation was offered to participants. Email addresses were the only personally identifiable data collected.
No statistically significant differences were found between the groups in any demographic, recreational drug use or baseline measures, confirming the effectiveness of the randomization (see Supplementary file 1 for details on demographics, Supplementary file 2 for recreational drug use, and Supplementary file 3 for statistical analysis of baseline variables). Completion rate was highly similar across the three groups (χ² (12, N = 240) = 0.64, p=0.99), see Figure. The completion of the 4-week follow-up timepoint was optional. For the most part, the sample consisted of educated, middle-aged (33.5 ± 9.4), healthy males (70% male, 19% female, 1% other) from western countries. As expected, most participants had a positive attitude toward psychedelic drugs, in particular toward medical use: 74% and 90% either agreed or strongly agreed with the statements 'I am an active advocate of psychedelic drug-use' and 'I am an active advocate of the therapeutic use of psychedelics', respectively. See Appendix for details on the sample's expectations/attitude about microdosing and psychedelics. The sample consisted of healthy individuals for the most part: 33% of participants reported having had at least one psychiatric diagnosis in the past; the most frequent past diagnoses were anxiety disorder (13%), depression (13%), and PTSD (7%). Only 7% of the sample had a current mental health diagnosis. Most participants microdosed with LSD (n = 147; 61%) or an LSD analogue (n = 33; 14%), followed by psilocybin-containing mushrooms (n = 57; 24%), and three individuals used other psychedelics (LSA: n = 1; DOB: n = 2). The average reported dose for LSD/LSD analogues was 13 ± 5.5 µg, while for psilocybin mushrooms it was 0.2 ± 0.12 g; see Appendix 1-figure for further details. Accumulative outcomes were first collected at baseline, then at week 5 (i.e. after the completion of the 4-week dose period) and at the optional long-term follow-up timepoint at week 9. 
The following two sets of pre-planned comparisons were made: within-group comparisons of baseline vs. week 5 and baseline vs. week 9 (changes over time), and between-group comparisons at the week 5 and week 9 timepoints. Sample sizes were n = 240/191/159 at baseline, week 5 and week 9, respectively. Data was also analyzed separately for LSD/LSD analogues and psilocybin microdoses; the results from both subgroups matched the results of the combined analysis presented here. For the within-group (change over time) comparison of baseline vs. week 5, all self-reported psychological outcomes improved significantly in the MD group: well-being (RPWB) increased by 4.2 ± 3.9 (adjusted mean estimate ±95% CI; p=0.04*), mindfulness (CAMS) increased by 2.4 ± 1.1 (p<0.001***), life satisfaction (SWL) increased by 1.2 ± 1.2 (p=0.04*), and paranoia (GPTS) decreased (-5.0 ± 1.7, p<0.001***). Personality structure (B5) showed reduced neuroticism trait score (-1.3 ± 0.9, p<0.01**) and increased openness (0.9 ± 0.8, p=0.03*). Significant changes over the same period (from baseline to week 5) were also observed in the PL and HH groups for mindfulness (PL: 1.6 ± 1.1, p<0.01**; HH: 1.3 ± 1.2, p=0.02*) and paranoia (PL: -3.4 ± 1.7, p<0.001***; HH: -4.9 ± 1.9, p<0.001***), but not for well-being or life satisfaction. Neuroticism also decreased in the PL group (-1.0 ± 1.0, p=0.04*). Changes in mindfulness and paranoia were sustained at the week 9 follow-up timepoint for all groups, while decreased neuroticism persisted only in the MD group; see Supplementary file 5 for details. CPS did not change in the MD group (from baseline to week 5), but

MICRODOSES

As a secondary analysis to further examine the role of placebo-like expectation effects in the accumulative outcomes, we performed a post-hoc adjustment by adding the 'number of times microdose capsule was guessed' variable as a covariate to the models (irrespective of whether the guess was correct or not). This variable was significant for some models (RPWB: p<0.01**; CAMS: p=0.02*; B5 agreeableness: p=0.02*; B5 openness: p=0.03*) and further decreased the already small between-group differences on self-reported scales, while it did not affect cognitive outcomes. Specifically, the adjusted treatment difference (±95% CI) at the week 5 timepoint between PL and MD groups without/with the number of MD guesses covariate was: well-being (RPWB): 2.5 ± 5.6 (p=0.37)/0.9 ± 5.7 (p=0.76), mindfulness (CAMS): 0.8 ± 1.5 (p=0.32)/0.4 ± 1.5 (p=0.65), paranoia (GPTS): -1.6 ± 2.5 (p=0.21)/-1.2 ± 2.5 (p=0.36), life satisfaction (SWL): 0.4 ± 1.7 (p=0.67)/0.2 ± 1.8 (p=0.83), B5 intellect: -0.2 ± 1.2 (p=0.80)/-0.2 ± 1.2 (p=0.71), B5 openness: 0.3 ± 1.2 (p=0.57)/0.0 ± 1.2 (p=0.97), B5 neuroticism: -0.3 ± 1.4 (p=0.70)/-0.1 ± 1.4 (p=0.87), B5 extraversion: -0.2 ± 1.2 (p=0.81)/-0.4 ± 1.3 (p=0.52), B5 agreeableness: 0.5 ± 1.1 (p=0.37)/0.2 ± 1.1 (p=0.75), and B5 conscientiousness: 0.8 ± 1.3 (p=0.24)/0.5 ± 1.3 (p=0.44).
First, acute and post-acute outcomes are described without considering the guess component, which is discussed in the next section. Acute outcomes were measured during the dose period while the potential microdose was still active, while post-acute outcomes were measured every Sunday, when no capsule was taken, 48-72 hr after the last placebo/microdose capsule. For psychological measures the average sample size was 857 (between 849 and 884 due to partial completions; participants contributed four scores corresponding to the four acute timepoints, see Materials and methods for details), while for cognitive performance it was 684 (between 678 and 689). 
Data were also analyzed separately for LSD/LSD analogues and psilocybin microdoses, and the results from both subgroups matched the results of the combined analysis presented here. Among acute measures, condition (PL vs. MD) was significant for acute emotional state (PANAS) (adjusted mean estimate ±95% CI: 2.2 ± 1.4, p<0.01**) and for the acute drug intensity (12.5 ± 3.0, p<0.001***), mood (4.6 ± 2.9, p<0.001***), energy (5.3 ± 2.7, p<0.001***), and creativity (4.7 ± 2.6, p<0.001***) VASs, meaning that scores collected on days when a microdose was taken were significantly higher than scores collected on placebo days. Effect sizes, as quantified by Cohen's d, remained small (d < 0.3) on all scales, with the exception of the drug intensity VAS (d = 0.58).
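For readers unfamiliar with Cohen's d, the following is a minimal sketch of the textbook two-group form (mean difference divided by the pooled standard deviation). It is an illustration only: the study used adjusted model estimates with repeated measures, so this is not necessarily how the reported d values were obtained, and the function name and example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical VAS scores (0-100), invented for illustration only.
microdose_days = [62, 55, 70, 48, 66]
placebo_days = [58, 50, 65, 47, 60]
d = cohens_d(microdose_days, placebo_days)
print(f"Cohen's d = {d:.2f}")
```

By the conventional rule of thumb, values around 0.2 are read as small and values around 0.5 as medium, which is the sense in which the text calls d < 0.3 small.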

ACCUMULATIVE OUTCOMES ADJUSTED FOR NUMBER OF MICRODOSE GUESSES

Among post-acute measures, condition was significant only on the anxiety measure (STAIT; -1.4 ± 1.3, p=0.03*), meaning that anxiety was reduced at the end of microdose weeks compared with placebo weeks; see Table for details on both acute and post-acute outcomes. Next, the acute and post-acute results were re-analyzed with the addition of guess into the models. Condition (PL vs. MD) was no longer significant for any scale, except for the acute drug intensity VAS (adjusted mean difference ±95% CI: 3.4 ± 2.0; p<0.001***), which increased under MD (Table). The guess*condition interaction term was non-significant for all scales, except for drug intensity (p<0.01**). To better understand the role of guess, the data were further analyzed by comparing the 2*2 = 4 strata formed by the two binary variables in the models, condition (PL/MD) and guess (PL/MD). For self-reported outcomes, no significant differences were found between microdose and placebo conditions with fixed guess (condition/guess: PL/PL vs. MD/PL and PL/MD vs. MD/MD comparisons), except for the acute drug intensity visual analogue scale, which was higher when a microdose was taken (adj. mean difference ±95% CI; 7.3 ± 3.1, p<0.001***). Conversely, when drug condition was fixed (condition/guess: PL/PL vs. PL/MD and MD/PL vs. MD/MD comparisons), significant differences were found in 21 of the 22 comparisons (= 2 conditions × (4 post-acute + 7 acute scales)), all favoring MD guess. These findings suggest that scores are significantly better when the participant believed they had taken a microdose, irrespective of what was actually taken. Taking an actual microdose was only associated with a significant difference in the drug intensity scale. Figure shows the stratified distribution of selected outcomes; see Supplementary file 8 for all comparisons.
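The logic of the stratification analysis above can be sketched in a few lines: group scores by the four condition/guess strata, then compare strata while holding one variable fixed. The records below are synthetic values invented purely to show the mechanics (they are not study data), and the plain differences of means stand in for the adjusted model estimates used in the study.

```python
from collections import defaultdict
from statistics import mean

# Synthetic records, invented for illustration: (condition, guess, score).
records = [
    ("PL", "PL", 50), ("PL", "PL", 52),
    ("PL", "MD", 60), ("PL", "MD", 62),
    ("MD", "PL", 51), ("MD", "PL", 53),
    ("MD", "MD", 61), ("MD", "MD", 63),
]


def stratum_means(rows):
    """Mean score within each (condition, guess) stratum."""
    groups = defaultdict(list)
    for condition, guess, score in rows:
        groups[(condition, guess)].append(score)
    return {key: mean(values) for key, values in groups.items()}


means = stratum_means(records)

# Guess held fixed, condition varies: MD/g minus PL/g.
fixed_guess = {g: means[("MD", g)] - means[("PL", g)] for g in ("PL", "MD")}

# Condition held fixed, guess varies: c/MD minus c/PL.
fixed_condition = {c: means[(c, "MD")] - means[(c, "PL")] for c in ("PL", "MD")}

# Number of fixed-condition comparisons: 2 conditions x (4 post-acute + 7 acute scales).
n_comparisons = 2 * (4 + 7)

print(fixed_guess, fixed_condition, n_comparisons)
```

In this toy data the fixed-guess differences are small and the fixed-condition differences are large, which is the qualitative pattern the text describes for self-reported outcomes.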
Break blind rate, defined as the proportion of correct capsule guesses (see section Blind breaking and collection of guess data for details), was 0.72 ± 0.18 (M ± SD). Specificity (true negative rate: ratio of true placebo guesses to all placebo guesses) was 0.82 ± 0.16, noticeably higher than sensitivity (true positive rate: ratio of true microdose guesses to all microdose guesses), which was 0.45 ± 0.30, meaning that placebo capsules were guessed correctly at a higher rate than microdoses. Based on knowledge of the ratio of PL/MD capsules (3/1) in the envelopes, which is evident to participants when they prepare the capsules, a 'random guesser' would have a break blind rate of 0.62 with 0.75 specificity and 0.25 sensitivity. The high sensitivity exhibited by participants (0.46 vs. the random guesser's 0.25) suggests that the higher than random break blind rate is mostly due to superior ability to identify microdoses; see Appendix 1-table for details. Break blind rate was positively associated with reported microdose dose (F(1, 237)=7.4, p<0.01**), meaning that the higher the dose was, the more likely participants were to guess their daily condition correctly. For this analysis psilocybin mushroom doses were converted to estimated LSD dose equivalent, see

BLINDING INTEGRITY

Statistical analysis in Materials and methods for details. The estimated 'detection threshold', that is, the dose above which participants guess significantly better than random, was 12 µg.
We employed a novel self-blinding methodology to investigate the acute, post-acute, and long-term, accumulative effects of psychedelic microdosing. To the best of our knowledge, this study is the first one to use a self-blinding methodology, the first placebo-controlled investigation of the accumulative effects of repeated microdosing, and the largest placebo-controlled psychedelic study to-date. When looking at changes over time from baseline to week 5 (accumulative outcomes) in the microdose group alone, results confirmed the psychological benefits reported by anecdotes (Fadiman and Krob, 2017) and observational, uncontrolled studies: significant improvements were observed in the domains of well-being, mindfulness, life satisfaction, and paranoia. However, when looking at the between-group comparisons of the same outcomes, no significant differences were found between the placebo and microdose groups. On the cognitive tests, which are less subjective than the self-reported psychological outcomes, the microdose group did not even improve from baseline to week 5 and the between-groups comparisons were not significant either. Thus, our study validates the positive anecdotal reports about the psychological benefits of microdosing (significant improvements from baseline in a broad range of psychological measures); however, our results also suggest that these improvements are not due to the pharmacological action of microdosing, but are rather explained by the placebo effect (lack of significant between-groups differences). Similar conclusions can be drawn from the examination of the acute and post-acute outcomes as well.
In our initial analysis without incorporation of the guess component, we detected significant effects on post-acute anxiety (STAIT), acute emotional state (PANAS), and mood, energy, creativity, and drug intensity (visual analogue scale items). Effect sizes were small on all scales (Cohen's d < 0.3 except drug intensity); thus, the clinical and practical value of these effects is debatable. Furthermore, when the guess component was added to the models, the already small differences disappeared on all scales, except for acute drug intensity. It can be argued that the addition of the guess variable to the models may undermine the statistical significance of the condition effect due to collinearity between condition and guess. To overcome this potential issue, we conducted the stratification analysis where only one of these variables is changing, while the other remains fixed. No significant differences were observed between placebo and microdose conditions when the guess was fixed (condition/guess; PL/PL vs. MD/PL and PL/MD vs. MD/MD comparisons), except for drug intensity (MD>PL). Conversely, when condition was fixed (PL/PL vs. PL/MD and MD/PL vs. MD/MD comparisons), scores obtained under placebo and microdose guesses were significantly different in 21 out of the 22 comparisons, always favoring the microdose guess; see Figure and Supplementary file 8. Importantly, neither CPS nor any cognitive subtask, the non-self-rated outcomes where beliefs and subjective feelings are likely to be less influential, was significantly different under either guess or drug conditions. In summary, these results strongly suggest that the actual content of capsules did not determine differences between the conditions, but beliefs about their content did.
An important observation was that participants guessed their capsules correctly in 72% of the cases. This break blind rate was higher than random (random: 63% vs. participants: 72%), but not as high as reported in antidepressant studies (around 80%). It is known from a variety of clinical studies that a higher break blind rate is associated with larger between-conditions effect sizes (where placebo is the control condition). This relationship is explained by non-specific treatment factors such as expectation of a benefit and investigator alliance. The influence of such factors is likely to be large for the present study, because of highly positive expectations and favorable attitudes toward psychedelics; see attitude analysis in the Appendix. These factors together suggest that the observed 'significant' acute and post-acute effects may be an artifact of the combination of break blinds and expected benefits. The acute and post-acute results observed could be understood as the difference between the expected benefits when a microdose is perceived (i.e. guessed by participants) versus the absence of expected benefits when placebo is perceived. This difference in expectations could be mistaken for a 'real' drug effect in any study where blinding integrity is not considered during analysis. If this explanation is correct, one prediction for future microdose studies with a similarly pro-psychedelics sample is that they may observe larger effects when break blind rate is higher, or conversely, smaller effects when break blind rate is lower.
What factors account for the blind breaking? Drug intensity was the only outcome that remained significant even after adjusting for guess (3.4 ± 2.0; p<0.001***). This observation suggests that drug intensity is a small, but true drug effect. This increased drug intensity mostly manifested as body and perceptual sensations; see Blind breaking cues in Appendix 1 for details. This finding suggests that in most cases blind breaking was induced by clinically irrelevant side effects, rather than deduced from improvements in outcome variables.
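The random-guesser baseline quoted in the blinding analysis can be reproduced from the 3:1 placebo-to-microdose capsule ratio alone. The sketch below assumes the random guesser answers 'placebo' with the same probability as the share of placebo capsules, independently of the capsule actually taken; this assumption is consistent with the reported 0.75 specificity and 0.25 sensitivity. The function name is hypothetical.

```python
def random_guesser_rates(p_placebo: float):
    """Expected rates for a guesser who says 'placebo' with probability
    p_placebo, independently of the capsule, when a fraction p_placebo
    of all capsules are placebo."""
    p_microdose = 1 - p_placebo
    specificity = p_placebo    # P(guess PL | capsule is PL)
    sensitivity = p_microdose  # P(guess MD | capsule is MD)
    # Overall proportion of correct guesses (break blind rate).
    accuracy = p_placebo * specificity + p_microdose * sensitivity
    return accuracy, specificity, sensitivity


accuracy, specificity, sensitivity = random_guesser_rates(3 / 4)
print(accuracy, specificity, sensitivity)  # 0.625 0.75 0.25
```

The expected break blind rate of 0.625 matches both roundings used in the text (0.62 and 63%), against which the observed 72% is compared.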
We note that according to our data the threshold LSD dose where participants guess better than random is 12 µg, see Figure, which is in line with the 13 µg threshold dose estimated by a recent dose-controlled study. It is worth noting that the current study was designed to protect blinding integrity by including placebos for the microdose group as well, administering the microdose capsules on different days of the week and by including the half-half group. The 3-arm design can be seen as a strength in this regard, adding ambiguity and thus strengthening blinding. Illustrative of the integrity of the blind, we received several emails from participants in the PL group who were in disbelief after opening their unused envelopes containing unused capsules after the conclusion of the study: "I counted the number of cut blotters I had in the left overs: they are 8...so you must be right... Which is incredible […] Some days during the test were really, really focused and colours more vivid. This sensation was really new to me". "I have just checked the remaining envelopes and it appears that I was indeed taking placebos throughout the trial. I'm quite astonished […] It seems I was able to generate a powerful 'altered consciousness' experience based only the expectation around the possibility of a microdose". "An empty pill with strong belief/intentions makes nearly everything. You put spirituality into an empty pill here...wow!"
It is our view that the present part-controlled, part-observational design yields data superior to conventional observational data (inclusion of placebo control), but inferior to controlled clinical trial data (incomplete control over recruitment, screening, assessment, drug administration, etc.). This study does, however, have greater ecological validity than would a fully controlled lab study. A key limitation of the present study is the lack of verification of the nature, purity, and dosage of the psychedelic substance used for microdosing.
Psilocybin-containing mushrooms were used by 23% of the sample, 14% used legal LSD analogues (such as 1P-LSD), whereas 62% sourced their substance from the black market, mostly LSD (61%). According to the Energy Control's drug checking service (Barcelona), LSD blotter adulteration rates were low during the within a given microdose cannot be known with certainty; however, the positive relationship between dose and blind breaking (Figure) and the consistency of the threshold dose for psychoactivity with a recent controlled study (12 µg vs 13 µg; Bershad et al., 2019a) provide some reassurance. Nonetheless, our results should not be understood as clinical evidence; rather, they are representative of 'real life microdosing'.
We could not confirm whether participants accurately followed the self-blinding procedure. Three individuals reported following an invalid sequence of weeks, but these individuals did their setups together, all committing the same mistake (1.3% error rate). Furthermore, we had no way of confirming whether the capsules were taken as instructed during the dose period. Instructions emphasized not to complete assessments planned on dosing days in case the dose schedule could not be followed for any reason, but we could not confirm whether participants adhered to this rule. Our stratification analysis does not allow for a strict determination of a causal relationship between guess and outcome, because guess was recorded after completion of the assessments (the guess was the last question during test sessions). After closing the study, a survey was conducted among participants, where 86% (n = 166) responded that "I was thinking about whether I took a microdose or placebo even before I was asked to guess" (as opposed to "I was not thinking about whether I took a microdose or placebo, except when I was asked to guess"), making a causal interpretation more likely.
We note that the order we chose is consistent with previous work in psychiatric studies; had the guesses been requested prior to the assessments, it could have primed responses. Also, we cannot rule out that performance during the assessments influenced the guess. However, the lack of any feedback from the assessments mitigates this risk. Most participants reported breaking blind due to body and perceptual sensations, rather than improved outcomes; see Blind breaking cues in the Appendix for details.
We cannot rule out the possibility that a study in a clinical population would yield more promising results. In the present healthy sample, where well-being scores are high at baseline, there is less scope for potential improvements, which could have prevented the observation of placebo-microdose differences. Most study participants reported no history of mental health problems; only 7% reported having a current psychiatric diagnosis, and 33% reported having had a psychiatric diagnosis in the past (Supplementary file 1). We conducted two post-hoc analyses for two selective pseudo-depression subsamples: participants with the lowest 25% baseline well-being scores and those with the highest 25% baseline neuroticism scores.
Results in these subsamples were entirely consistent with those from the complete sample: there were no significant differences between conditions for any of the accumulative outcomes (adjusted treatment difference ±95% CI of PL vs MD at week 5 for the lowest 25% baseline well-being subsample: well-being (RPWB) -1.6 ± 13.6 (p=0.81), mindfulness (CAMS) 0.3 ± 3.3 (p=0.85), paranoia (GPTS) -5.1 ± 6.8 (p=0.14), life satisfaction (SWL) 0.3 ± 4.5 (p=0.87), cognition (CPS) 0.1 ± 0.55 (p=0.71); same measures for the highest 25% baseline neuroticism subsample: well-being (RPWB) 4.8 ± 14.3 (p=0.50), mindfulness (CAMS) 1.3 ± 3.7 (p=0.49), paranoia (GPTS) -3.1 ± 8 (p=0.43), life satisfaction (SWL) -1.4 ± 4.6 (p=0.53), cognition (CPS) 0.04 ± 0.67 (p=0.90)). Thus, although not designed as a clinical study, data from this opportunistic naturalistic study do not provide support for clinical effects of microdosing.
Although this was the largest placebo-controlled psychedelic research study published to-date, one could argue that the study was still underpowered to detect a true effect, based on the fact that the MD group did improve more than the PL group on all scales (from baseline to week 5), just not to a statistically significant extent (Figure). On the well-being scale (RPWB), the adjusted PL vs. MD group difference was 2.5 ± 5.6 points. To illustrate this difference in practice, this scale consists of 42 statements that participants rate on a 6-point Likert scale (Strongly disagree to Strongly agree); the full range of scores is thus 0-252, so the 2.5 point mean difference is 1% of the total scale. This difference is equivalent to scoring one item, for example 'I like most aspects of my personality', Strongly agree instead of Slightly agree or Slightly disagree, while responding the same to the remaining 41 items.
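The arithmetic behind the RPWB illustration above can be checked directly. The sketch below assumes items are coded 1 to 6 and that the intermediate response labels follow the usual ordering, with Slightly disagree = 3, Slightly agree = 4 and Strongly agree = 6; the text names only the endpoints and the labels in its example, so this coding is an assumption.

```python
N_ITEMS = 42          # statements in the scale, from the text
MAX_ITEM_SCORE = 6    # 6-point Likert rating, from the text

max_total = N_ITEMS * MAX_ITEM_SCORE          # 252
group_difference = 2.5                        # adjusted PL vs. MD difference
share_of_scale = group_difference / max_total

# Assumed numeric coding of three response options (hypothetical).
SLIGHTLY_DISAGREE, SLIGHTLY_AGREE, STRONGLY_AGREE = 3, 4, 6

# Change in the total score when a single item moves to Strongly agree.
shift_from_slightly_agree = STRONGLY_AGREE - SLIGHTLY_AGREE        # 2 points
shift_from_slightly_disagree = STRONGLY_AGREE - SLIGHTLY_DISAGREE  # 3 points

print(f"{share_of_scale:.1%} of the maximum score")
print(shift_from_slightly_agree, shift_from_slightly_disagree)
```

Under this coding, the 2.5 point group difference falls between the 2 and 3 point shifts produced by changing the answer to one item, which is the comparison the text draws.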
Based on our data, we calculated that the sample size (90% power and alpha of 0.05) required to observe a true between-group difference would be: 1508 for well-being (RPWB), 1638 for mindfulness (CAMS), 4918 for life satisfaction (SWL), 1392 for paranoia (GPTS), and 366 for cognitive performance (CPS). These differences therefore are not clinically meaningful or sufficient to justify the cost of intervention.
The successful execution of this initiative may inspire similar initiatives throughout the world in a broad range of scientific and medical contexts. Controlling for placebo effects is important for trending phenomena, such as cannabidiol (CBD) oils, nootropics, and nutrition, where social pressure, expectancy, positive-test strategies, and confirmation bias can lead to false-positive findings. Self-blinding citizen-science initiatives could be employed in these areas as a cost-efficient screening tool prior to conducting expensive clinical studies.
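To give a sense of how required sample sizes of the kind quoted above scale with effect size, here is a minimal sketch using the standard normal-approximation formula for comparing two group means. The text does not state which method the authors used, so this will not necessarily reproduce the reported numbers; the function name and the example effect sizes are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_d: float, alpha: float = 0.05, power: float = 0.90) -> int:
    """Participants needed per group for a two-sided, two-sample comparison
    of means, using the normal approximation:
        n = 2 * (z_{1 - alpha/2} + z_{power})^2 / d^2
    where d is the standardized (Cohen's d) difference."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size_d ** 2)


# Hypothetical standardized effect sizes, not values from the study.
for d in (0.5, 0.2, 0.1):
    print(d, n_per_group(d))
```

Because the required n grows with 1/d², halving the effect size roughly quadruples the sample, which is why very small between-group differences imply samples in the thousands.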

FUTURE DIRECTIONS

An important feature of the self-blinding methodology is the low cost; we estimate that the current study's costs were about 0.5-1% of an equivalent clinical study. Since the research team is not providing the study drug/placebo and on-site staffing is not required, expenses are similar to a conventional observational study, yet still with incorporation of randomization and placebo control.
Important lessons can be taken from the current study for the design of future microdosing trials. The combination of the lack of detected efficacy in this study and an association between self-reported doses and ability to break blind (see Figure) suggests that selecting dosage is fraught with difficulties: if a low microdose is chosen, efficacy is unlikely if we extrapolate current results, whereas a high microdose could jeopardize the blinding. Randomization to microdose versus an active placebo condition (e.g. niacin, which has been employed in macrodosing studies) and careful assessment of blinding could, in principle, alleviate some of these concerns.
The present study also has implications for full/'macrodose' psychedelic studies, where blinding is impossible due to the intense nature of the experience. It can be hypothesized that the intense hallucinations are essential for therapeutic outcome, questioning the suitability of placebo-controlled trials in this context. The fact that one may be unable to fully extricate belief, or 'context' more broadly, from the direct (e.g. pharmacological) action of a given intervention, raises interesting philosophical and ethical questions with implications for drug development and regulation. One might also hypothesize that the action of microdosing and psychedelics relies on prior and continuously updating belief combining (perhaps synergistically) with a direct drug effect.
Such a positive interaction could, in theory, be tested, and if endorsed, this could be interpreted as implying that belief is an active component of the psychedelic treatment model, rather than a problematic confound.
In summary, here we created a novel, cost-effective, self-blinding, citizen-science methodology that enabled us to conduct the largest placebo-controlled study on psychedelics to-date and the first placebo-controlled examination of repeated psychedelic microdosing. Our findings confirm the anecdotal benefits of microdosing (improvements in a broad range of psychological measures); however, the results also suggest that the improvements are not due to the pharmacological action of microdosing, but are rather explained by the placebo effect (lack of significant between-groups effect).
The adjusted z-score is $z^{adj}_{ind,tp,t} = z_{ind,tp,t} - \frac{1}{N^{PL}_{tp}} \sum_{i \in PL} z_{i,tp,t}$, where $z_{ind,tp,t}$ is the z-score of individual ind at timepoint tp on task t and $N^{PL}_{tp}$ is the number of individuals in the placebo group at timepoint tp. Finally, the CPS is calculated as the average adjusted z-score across the six tasks: $\mathrm{CPS}_{ind,tp} = \frac{1}{6} \sum_{t=1}^{6} z^{adj}_{ind,tp,t}$. In summary, the CPS score is the z-score difference from the average of the placebo group who had the same number of previous opportunities to perform the tasks. Whenever the scores of the individual tasks are presented, the learning effects are always removed from the scores as described above (all steps prior to taking the average across the six subtasks).
DEMS is a set of self-constructed visual analogue scales designed to measure the acute effects of microdosing. Responses were collected on a scale of 0-100. For all VAS items, the slider's default position was the midpoint, but for a valid response the slider had to be moved.
RPWB is a 42-item instrument that consists of six subscales (positive relations, personal growth, autonomy, environmental mastery, purpose in life, and self-acceptance). To quantify well-being as a single outcome, the sum of the six subscales was used during analysis.
The original scale uses a six-step rating (from Strongly disagree to Strongly agree), see Ryff and Keyes, 1995 for details. In our online implementation a seven-step rating was used by accident (Neutral was added as an extra response option). To make our scores comparable with other studies, all RPWB scores have been rescaled by multiplying them by 6/7 and rounding to the nearest whole number.
SWL is a 5-item unidimensional scale designed to measure judgment of one's own life satisfaction. The scale uses a 7-point rating that ranges from Strongly agree to Strongly disagree; the final score is the sum of item scores.
SSS is a 21-item, unidimensional scale that quantifies an individual's tendency to accept messages. Each item is rated on a 5-point scale (Not at all, A little, Somewhat, Quite a bit, and A lot). The sum of the items was used in analysis.
SCS is an 8-item unidimensional scale that captures social belongingness, see Lee and Robbins, 1995 for details. Each item was rated on a 5-point Likert scale; the final score is the sum of item scores.
STAIT is a 20-item scale where each item corresponds to a feeling or mental state (e.g. 'I have disturbing thoughts'); participants rate how often they felt that way on a 4-point scale (Almost never, Sometimes, Often, and Almost always), see Spielberger, 1983 for details. The appropriate sum of item scores (some items reverse scored) was used in analysis.
A 14-item unidimensional scale that covers both the feeling and functional aspects of mental well-being. Each item is rated on a 5-point scale (None of the time, Rarely, Some of the time, Often, and All of the time). The sum of item scores was used in analysis.
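The CPS adjustment described earlier in this section (subtract the placebo-group mean z-score at the same timepoint, then average across tasks) can be sketched as follows. This is a hedged illustration, not the authors' code: the function names, data layout and example values are hypothetical, the z-scores are taken as given since their computation is not described here, and two tasks are used instead of six to keep the example short.

```python
from statistics import mean


def adjusted_z(z_scores, placebo_ids):
    """Remove learning effects for one task at one timepoint by subtracting
    the placebo-group mean. z_scores maps participant id -> z-score."""
    placebo_mean = mean(z_scores[pid] for pid in placebo_ids)
    return {pid: z - placebo_mean for pid, z in z_scores.items()}


def cps(z_by_task, placebo_ids):
    """Average adjusted z-score across tasks at one timepoint.
    z_by_task maps task name -> {participant id -> z-score}."""
    adjusted = [adjusted_z(scores, placebo_ids) for scores in z_by_task.values()]
    participants = adjusted[0].keys()
    return {pid: mean(task[pid] for task in adjusted) for pid in participants}


# Synthetic z-scores, invented for illustration only.
z_by_task = {
    "task_1": {"a": 0.0, "b": 1.0, "c": 2.0},
    "task_2": {"a": 1.0, "b": 1.0, "c": 3.0},
}
placebo_ids = ["a", "b"]  # participants "a" and "b" are in the placebo group

scores = cps(z_by_task, placebo_ids)
print(scores)  # {'a': -0.25, 'b': 0.25, 'c': 1.75}
```

By construction the placebo group averages to zero at every timepoint, so a participant's CPS reads as their distance from placebo participants with the same amount of task practice.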

Study Details
