Predicting changes in substance use following psychedelic experiences: natural language processing of psychedelic session narratives
This quantitative study (n = 1141) applied a machine learning tool to written reports of psychedelic experiences, predicting with roughly 65% accuracy whether participants reduced their substance use following psychedelic use, consistent across three independently trained natural language processing models.
Authors
- Albert Garcia-Romeu
Abstract
Background: Experiences with psychedelic drugs, such as psilocybin or lysergic acid diethylamide (LSD), are sometimes followed by changes in patterns of tobacco, opioid, and alcohol consumption. But, the specific characteristics of psychedelic experiences that lead to changes in drug consumption are unknown.

Objective: Determine whether quantitative descriptions of psychedelic experiences derived using Natural Language Processing (NLP) would allow us to predict who would quit or reduce using drugs following a psychedelic experience.

Methods: We recruited 1141 individuals (247 female, 894 male) from online social media platforms who reported quitting or reducing using alcohol, cannabis, opioids, or stimulants following a psychedelic experience to provide a verbal narrative of the psychedelic experience they attributed as leading to their reduction in drug use. We used NLP to derive topic models that quantitatively described each participant's psychedelic experience narrative. We then used the vector descriptions of each participant's psychedelic experience narrative as input into three different supervised machine learning algorithms to predict long-term drug reduction outcomes.

Results: We found that the topic models derived through NLP led to quantitative descriptions of participant narratives that differed across participants when grouped by the drug class quit as well as the long-term quit/reduction outcomes. Additionally, all three machine learning algorithms led to similar prediction accuracy (~65%, CI = ±0.21%) for long-term quit/reduction outcomes.

Conclusions: Using machine learning to analyze written reports of psychedelic experiences may allow for accurate prediction of quit outcomes and what drug is quit or reduced within psychedelic therapy.
Research Summary of 'Predicting changes in substance use following psychedelic experiences: natural language processing of psychedelic session narratives'
Introduction
Research indicates that psychedelics can produce meaningful reductions in problematic substance use, with open-label and observational studies reporting improvements in tobacco, alcohol, and other substance use following experiences with agents such as psilocybin and LSD. Prior work has linked therapeutic benefits to acute subjective qualities of the psychedelic experience—particularly so-called mystical-type effects characterised by unity, positive mood, and ineffability—but objectively measuring those subjective experiences during sessions is difficult. Automated speech analysis during acute drug effects and post-session narrative analysis have both been proposed as ways to quantify subjective experience, and natural language processing (NLP) offers a potentially efficient, generalisable method to do so. Cox and colleagues set out to determine whether quantitative descriptions of retrospective psychedelic session narratives derived via NLP could (1) distinguish which drug class a person subsequently reduced or quit, (2) distinguish the extent of reduction/quit outcomes, and (3) predict long-term quit/reduction outcomes when those NLP outputs are used as inputs to supervised machine learning (ML) algorithms. The analysis used a large convenience sample of retrospective written narratives from people who reported a psychedelic experience as preceding a reduction or cessation in substance use.
Methods
Participants (N = 1141; 247 female, 894 male) completed an anonymous online survey between September 2013 and May 2014. Recruitment targeted online communities and social media (Facebook, Reddit, Erowid, Shroomery, MAPS) and solicited people who had quit or reduced a drug after a psychedelic experience. Inclusion required being ≥18 years old, English fluency, and reporting a reduction or cessation in use following a 5-HT2A receptor agonist psychedelic (examples listed in the survey included psilocybin mushrooms, LSD, mescaline, peyote, DMT, and ayahuasca). Participants were grouped by the primary drug class they reported reducing or quitting: alcohol (n = 512), cannabis (n = 272), opioids (n = 195), and stimulants (n = 162). The survey collected pre- and post-experience drug use, a categorical outcome for the reference experience (e.g., complete abstinence, persistent reduction, temporary reduction), categorical current-use frequency for those with persistent reduction, and an open-ended narrative of the psychedelic experience (no word limit). Procedures had institutional ethical approval and written informed consent was obtained. Narratives were preprocessed using NLTK for Python: stop-word removal, lemmatisation, and tf-idf transformation. The researchers derived three experimental topic models using latent semantic analysis (LSA) with singular value decomposition and created a fourth control model that simply counted alcohol-related words. The three LSA-based models were: LSA-All (vocabulary from all participants), LSA-Alcohol (vocabulary from all participants but trained only on alcohol-reduction narratives), and LSA-Scrubbed (vocabulary from all participants but with alcohol-related words removed based on prior alcohol language research). For each model, each narrative was represented as a vector of topic weights; the Alcohol Word Count model produced 32 features (31 individual alcohol-related word counts plus a summed alcohol-word count).
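The narrative-to-topic-vector pipeline described above can be sketched as follows. The study used NLTK preprocessing (stop-word removal, lemmatisation) followed by tf-idf and LSA via singular value decomposition; this sketch substitutes scikit-learn's `TfidfVectorizer` and `TruncatedSVD` as stand-ins, and the toy narratives and topic count are illustrative, not the study's data.

```python
# Sketch of the narrative-to-topic-vector pipeline, assuming scikit-learn
# equivalents of the paper's NLTK + LSA steps. Corpus is illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

narratives = [
    "I felt a deep sense of unity and stopped craving a drink afterwards",
    "The experience was ineffable and my urge to smoke simply faded",
    "After the session I felt connected to everything and quit drinking",
    "I saw my habits clearly and reduced my use within a week",
]

# tf-idf with stop-word removal (the paper additionally lemmatised via NLTK)
tfidf = TfidfVectorizer(stop_words="english")
doc_term = tfidf.fit_transform(narratives)

# LSA: truncated SVD of the tf-idf matrix; each narrative becomes a vector
# of topic weights (the study selected 6-9 topics via coherence scores)
n_topics = 2  # tiny toy corpus; the study used 6-9
lsa = TruncatedSVD(n_components=n_topics, random_state=0)
topic_vectors = lsa.fit_transform(doc_term)
print(topic_vectors.shape)  # one topic-weight vector per narrative
```

Each row of `topic_vectors` is the quantitative description of one narrative that the study then fed into its supervised classifiers.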
To test predictive performance, topic-model outputs were used as inputs to three supervised ML algorithms: k-nearest neighbours, a Bernoulli naïve Bayes classifier, and a random forest classifier (all implemented with scikit-learn). Models were trained on a randomly selected 75% of participants and tested on the remaining 25%; this train/test split was repeated 1000 times to characterise the distribution of predictive performance. For inferential comparisons, the team performed 10 planned sets of ANOVAs addressing three questions—whether topic models differed by drug class reduced/quit, whether they differed by quit/reduction outcome categories, and whether topic models differed in their predictive accuracy for quit outcomes—using Bonferroni adjustments within each ANOVA family and follow-up pairwise comparisons where appropriate. Full computational details were reported as supplemental material.
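A minimal sketch of the evaluation protocol above: topic-model outputs as features for k-nearest neighbours, Bernoulli naïve Bayes, and random forest classifiers, scored over repeated random 75/25 splits. The synthetic features and labels below stand in for the study's topic vectors and six outcome categories, and the repeat count is reduced from the study's 1000.

```python
# Sketch of the repeated 75/25 train/test protocol with the three
# scikit-learn classifiers named in the paper. Data are synthetic
# stand-ins, so accuracies here hover around chance (~1/6).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import BernoulliNB
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))      # stand-in for 6 LSA topic weights
y = rng.integers(0, 6, size=200)   # six quit/reduction outcome classes

classifiers = {
    "kNN": KNeighborsClassifier(),
    "BernoulliNB": BernoulliNB(),  # binarises continuous features at 0
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
}

n_repeats = 20  # the study repeated the split 1000 times
accuracies = {name: [] for name in classifiers}
for i in range(n_repeats):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                              random_state=i)
    for name, clf in classifiers.items():
        clf.fit(X_tr, y_tr)
        accuracies[name].append(clf.score(X_te, y_te))

for name, accs in accuracies.items():
    print(name, round(float(np.median(accs)), 3))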
Results
Narrative lengths varied by group: mean (median) words per narrative were 142 (91) for alcohol, 120 (83) for cannabis, 160 (101) for opioids, and 111 (70) for stimulants. Topic-model selection used coherence scores across topic numbers; the team selected 6 topics for LSA-All, 6 for LSA-Scrubbed, and 9 for LSA-Alcohol based on those coherence profiles. Differentiation of drug class reduced/quit: Using the LSA-All model, all six topics showed statistically significant differences across the four drug-class groups, with Topic 2 having the largest effect size (η2 = 0.45). Tukey HSD follow-ups indicated that each pairwise combination of drug classes was differentiated by at least one topic, although opioid and stimulant groups had the fewest differentiating topics. The LSA-Scrubbed model also differentiated drug classes for five of six topics (largest effect η2 = 0.29) and resolved five of six pairwise comparisons; the only undifferentiated pair was alcohol versus stimulants. The Alcohol-Related-Word-Count control model found significant differences in frequency for three alcohol-related words ('alcohol', 'beer', 'drink') and for the summed alcohol-word count; 'alcohol' had the largest effect size (η2 = 0.26). The control model chiefly distinguished participants who reduced/quit alcohol from those who reduced/quit other drug classes. Differentiation of quit/reduction outcome groups: For the LSA-All model, four of six topics differed across the quit/reduction outcome categories (largest effect η2 = 0.039), and the model distinguished 9 of 15 possible pairwise outcome comparisons. The LSA-Scrubbed model showed significant differences (Topic 5 largest, η2 = 0.03) and discriminated 10 of 15 pairwise comparisons. The LSA-Alcohol model did not show topic differences across outcome groups after Bonferroni correction.
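The coherence-based selection of the topic number can be sketched as below. The summary does not specify which coherence measure the authors used, so this sketch scores each candidate topic count with a UMass-style co-occurrence coherence computed from document-level word co-occurrence; the corpus is illustrative.

```python
# Sketch of topic-number selection by coherence, assuming a UMass-style
# score (the authors' exact coherence measure is not specified here).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

narratives = [
    "a profound sense of unity and peace replaced my craving",
    "everything felt connected and the urge to drink vanished",
    "I understood my habits and chose to stop using",
    "an ineffable experience that changed how I saw alcohol",
    "after the trip I reduced my use almost immediately",
    "a feeling of rebirth and release from old patterns",
]

tfidf = TfidfVectorizer(stop_words="english")
doc_term = tfidf.fit_transform(narratives).toarray()
presence = (doc_term > 0).astype(float)  # word-in-document indicators

def umass(topic_word_weights, top_n=3):
    """UMass-style coherence of one topic from its top-weighted words."""
    top = np.argsort(topic_word_weights)[::-1][:top_n]
    score = 0.0
    for i in range(1, len(top)):
        for j in range(i):
            d_j = presence[:, top[j]].sum()  # docs containing word j
            d_ij = (presence[:, top[i]] * presence[:, top[j]]).sum()
            score += np.log((d_ij + 1.0) / d_j)
    return score

# scan candidate topic counts and keep the most coherent model
scores = {}
for k in range(2, 5):  # the study scanned larger ranges, settling on 6-9
    lsa = TruncatedSVD(n_components=k, random_state=0).fit(doc_term)
    scores[k] = float(np.mean([umass(comp) for comp in lsa.components_]))
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

The study applied the same kind of scan per corpus, which is how LSA-All and LSA-Scrubbed ended up with 6 topics and LSA-Alcohol with 9.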
The Alcohol-Related-Word-Count model again showed differences for the words 'alcohol' and 'drink' and for the summed alcohol-word count (η2 values 0.032, 0.026, and 0.027, respectively); participants in the 'reduced greatly' group used these alcohol-related terms more than several other outcome groups. Prediction of quit outcomes with ML: With six possible outcome categories, random guessing would be accurate at approximately 17%. Across the three ML algorithms, all topic-source models produced accuracies substantially above chance. Using k-nearest neighbours, the LSA-Alcohol model achieved a median prediction accuracy of 47% (SD 5%, maximum 63%), while LSA-All, LSA-Scrubbed, and Alcohol-Word-Count had median accuracies of 39%, 38%, and 38% respectively. The Bernoulli naïve Bayes classifier yielded median accuracies of 52% (SD 4%, max 68%) for LSA-Alcohol and approximately 41–42% for the other models. The random forest classifier produced median accuracies of 50% (SD 4%, max 63%) for LSA-Alcohol and about 40–41% for the other models. Across algorithms, LSA-Alcohol consistently gave the highest median and maximum accuracies, while LSA-All and LSA-Scrubbed provided moderate but reliably above-chance performance.
Discussion
Cox and colleagues interpret their findings as evidence that NLP-derived topic models capture meaningful variation in retrospective psychedelic-session narratives that relates to both the drug class a person subsequently reduced or quit and the degree of reduction. The LSA-All and LSA-Scrubbed models best differentiated which drug class participants reduced/quit and most pairwise quit outcomes, indicating that topic models trained on a broader corpus can detect experience features beyond simple mentions of alcohol-related words. In contrast, LSA-Alcohol—the model trained specifically within alcohol-reduction narratives—produced the highest predictive accuracies when used as input to supervised ML, suggesting that within-class models may be most useful when the target substance is known. All three ML algorithms achieved prediction accuracies well above chance, with the Bernoulli naïve Bayes classifier producing the highest reported maximum accuracy (up to 68% in the study's reported runs) and k-nearest neighbours performing similarly despite being the least computationally intensive. The authors therefore note that relatively simple algorithms may suffice in some settings, though they recommend future work to identify scenarios where more complex approaches offer meaningful gains. Key limitations acknowledged include reliance on retrospective self-report, raising the possibility of recall bias or misattribution of features from other experiences; and a self-selected sample of people who had reduced or quit and were motivated to report this, limiting generalisability to individuals who did not change use or who did not volunteer narratives. The authors emphasise that their models were developed on this specific population and may not apply to broader or clinical populations without further validation. They suggest future research should test these NLP/ML procedures in other samples and examine when more complex models are warranted. 
Overall, the investigators conclude that combining NLP of written psychedelic-session narratives with supervised ML offers a potentially useful analytic approach for identifying which individuals may need additional support during psychedelic-assisted interventions for substance use.
Conclusion
The study concludes that NLP can discriminate between which substances people report reducing or quitting after a psychedelic experience, can distinguish the extent of reduction or cessation, and that NLP outputs can be used with supervised ML to predict quit outcomes better than chance. Cox and colleagues propose that these techniques may help researchers and clinicians anticipate the level of support an individual might require during psychedelic treatment for substance use disorder, while noting the need for further research to establish generalisability and robustness.
Study Details
- Study Type: individual
- Population: humans
- Characteristics: observational, interviews