A window into the intoxicated mind? Speech as an index of psychoactive drug effects
This study (2014) demonstrated, using MDMA as an example, that speech analysis can capture subtle differences in mental state between drugged and sober individuals. The authors found that speech produced after MDMA was semantically closer to concepts such as intimacy and empathy than speech produced after placebo.
Authors
- Harriet de Wit
Abstract
Abused drugs can profoundly alter mental states in ways that may motivate drug use. These effects are usually assessed with self-report, an approach that is vulnerable to biases. Analyzing speech during intoxication may present a more direct, objective measure, offering a unique ‘window’ into the mind. Here, we employed computational analyses of speech semantic and topological structure after ±3,4-methylenedioxymethamphetamine (MDMA; ‘ecstasy’) and methamphetamine in 13 ecstasy users. In 4 sessions, participants completed a 10-min speech task after MDMA (0.75 and 1.5 mg/kg), methamphetamine (20 mg), or placebo. Latent Semantic Analyses identified the semantic proximity between speech content and concepts relevant to drug effects. Graph-based analyses identified topological speech characteristics. Group-level drug effects on semantic distances and topology were assessed. Machine-learning analyses (with leave-one-out cross-validation) assessed whether speech characteristics could predict drug condition in the individual subject. Speech after MDMA (1.5 mg/kg) had greater semantic proximity than placebo to the concepts friend, support, intimacy, and rapport. Speech on MDMA (0.75 mg/kg) had greater proximity to empathy than placebo. Conversely, speech on methamphetamine was further from compassion than placebo. Classifiers discriminated between MDMA (1.5 mg/kg) and placebo with 88% accuracy, and MDMA (1.5 mg/kg) and methamphetamine with 84% accuracy. For the two MDMA doses, the classifier performed at chance. These data suggest that automated semantic speech analyses can capture subtle alterations in mental state, accurately discriminating between drugs. The findings also illustrate the potential for automated speech-based approaches to characterize clinically relevant alterations to mental state, including those occurring in psychiatric illness.
Research Summary of 'A window into the intoxicated mind? Speech as an index of psychoactive drug effects'
Introduction
Abused drugs produce characteristic alterations in consciousness and mood that are central to their subjective appeal and to understanding addiction. Standard approaches to measuring these mental-state changes rely on retrospective descriptive reports or repeated standardized self-report measures; both have important limitations including recall bias, constrained response options, and dependence on participants' introspective capacity and motivation. Speech offers an alternative avenue because it is produced during the drug state and, if analysed for content and structure, could provide a more direct and less biased index of altered mental states. Bedi and colleagues set out to test whether automated analyses of free speech can detect drug-induced mental-state alterations and discriminate between drugs at the individual level. They administered two doses of MDMA (0.75 and 1.5 mg/kg), a single methamphetamine dose (20 mg), and placebo in a within-subject, double-blind design. The study combined semantic analysis using Latent Semantic Analysis (LSA), graph-based measures of speech topology, and multivariate machine-learning classification to evaluate whether speech meaning and structure reflect drug-specific subjective effects, with a focus on socioemotional concepts hypothesised to be especially responsive to MDMA.
Methods
Thirteen healthy volunteers aged 18–38 years who reported prior ecstasy use were recruited. Participants underwent medical and psychiatric screening and were excluded for current DSM-IV Axis I diagnoses, medical illness, BMI outside 18.5–30 kg/m², relevant family cardiovascular history, prior adverse ecstasy response, and pregnancy or lactation. The Institutional Review Board at the University of Chicago approved the protocol and all participants provided written informed consent.
A within-subject, double-blind, randomised design comprised four 5-hour sessions per subject, each separated in time. At each session participants received one of four capsules: MDMA 0.75 mg/kg, MDMA 1.5 mg/kg, methamphetamine 20 mg, or placebo. Participants adhered to pre-session abstinence rules verified by urine, saliva, and breath tests; females were tested for pregnancy each session. Behavioural testing began 65 minutes after capsule ingestion, and the free speech task was administered at 130 minutes, a time chosen to coincide with expected peak drug effects.
During the free speech task participants spoke for 10 minutes about a randomly selected ‘person of importance’ from a list they had provided; a different person was discussed in each session. Research assistants trained in active listening used reflective techniques to minimise interviewer influence; the same assistant interviewed each subject across sessions.
Speech was recorded, professionally transcribed blind to drug condition, then preprocessed with the Natural Language Toolkit (NLTK): tokenisation, sentence parsing, part-of-speech tagging, and lemmatisation. Preprocessing produced an ordered list of lemmatised tokens for each interview.
Semantic analysis used Latent Semantic Analysis (LSA) applied to the TASA corpus (37,651 documents, ~12.2 million words). Singular value decomposition reduced dimensionality to 300 components and word vectors were compared via cosine similarity to quantify semantic proximity.
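The cosine-similarity comparison described above can be illustrated with a minimal sketch. It is not the authors' code: the three-dimensional `TOY_VECTORS` stand in for the 300-component LSA vectors, the function names are hypothetical, and averaging the similarity over every in-vocabulary token is an assumption about how a single per-interview value might be obtained.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_proximity(tokens, concept, vectors):
    """Mean cosine similarity between a target concept and each token that has a vector.

    Tokens missing from the vocabulary are skipped (an assumption of this sketch).
    """
    target = vectors[concept]
    sims = [cosine_similarity(vectors[t], target) for t in tokens if t in vectors]
    return sum(sims) / len(sims) if sims else float("nan")


# Hypothetical toy word vectors; a real analysis would use vectors from the LSA space.
TOY_VECTORS = {
    "friend": [0.9, 0.1, 0.0],
    "trust": [0.8, 0.3, 0.1],
    "brother": [0.7, 0.2, 0.2],
    "car": [0.0, 0.2, 0.9],
    "road": [0.1, 0.1, 0.8],
}

social_score = mean_proximity(["brother", "trust", "brother", "unknownword"], "friend", TOY_VECTORS)
neutral_score = mean_proximity(["car", "road", "car"], "friend", TOY_VECTORS)
print(social_score, neutral_score)
```

In this toy example the token list about people lies closer to the concept friend than the list about driving, which is the kind of contrast the semantic-proximity measure is meant to capture.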
The authors selected a priori target concepts related to MDMA's putative socioemotional effects (e.g., affect, empathy, friend, intimacy, rapport, support, compassion) and computed mean proximity values per interview. Verbosity (total token count) was also computed. Group-level effects on semantic proximities and verbosity were tested with repeated-measures ANOVA and planned comparisons with significance at p < 0.05; effect sizes reported in the paper are given as partial η².
For individual-level prediction, the investigators employed a support vector machine (SVM) with leave-subject-out cross-validation for binary classifications, using a small feature set selected by p-value ranking (e.g., rapport, love, support, verbosity). Features were normalised to each subject's mean across their four sessions to control baseline differences. A four-way classifier used linear discriminant analysis (LDA) with analogous cross-validation.
Structural speech analysis used a graph-based approach in which each unique word is a node and directed edges link sequential tokens. Topological features extracted included number of nodes, edges, loop counts (returning to a word after 0–3 intervening words; L1–L4), mean degree (edges per node), and size of the largest connected component. Group differences in these structural metrics were evaluated with repeated-measures ANOVA and planned comparisons.
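The graph construction described above can be sketched as follows. This is an illustrative reading of the summary, not the authors' implementation: the function name `speech_graph_features` is hypothetical, edges are counted once per distinct ordered word pair (an assumption), loops are counted by position in the token sequence, and the largest connected component and any normalisation of the counts are left out.

```python
def speech_graph_features(tokens, max_loop=4):
    """Basic topological features of a word graph built from an ordered token list.

    Nodes are unique words; a directed edge links each token to the next one.
    L_k counts positions where a word recurs with k - 1 intervening words.
    """
    if not tokens:
        raise ValueError("tokens must not be empty")
    nodes = set(tokens)
    # Distinct directed edges between consecutive tokens (assumption: no multi-edges).
    edges = set(zip(tokens, tokens[1:]))
    features = {
        "nodes": len(nodes),
        "edges": len(edges),
        "mean_degree": len(edges) / len(nodes),  # edges per node, as defined in the text
    }
    for k in range(1, max_loop + 1):
        features[f"L{k}"] = sum(
            1 for i in range(len(tokens) - k) if tokens[i] == tokens[i + k]
        )
    return features


example_tokens = "i i love my friend my friend love me".split()
example_features = speech_graph_features(example_tokens)
print(example_features)
```

In the example sentence the immediate repetition "i i" contributes the single 1-loop, and the two returns in "my friend my friend" contribute the 2-loops.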
Results
Thirteen participants (4 female) completed the speech recordings. Mean age was 24.5 years (SD = 5.4). Reported prior ecstasy use averaged 12.6 occasions (SD = 19.1); participants also reported recent marijuana and alcohol use as described in the text.
Semantic analyses showed that MDMA at 1.5 mg/kg (MDMA1.5) increased mean semantic proximity to several socioemotional concepts relative to placebo. Specifically, MDMA1.5 produced greater proximity to friend (F(1,12) = 5.7, p = 0.03, partial η² = 0.32) and support (F(1,12) = 5.3, p = 0.04, partial η² = 0.31), and yielded marginal increases for intimacy (F(1,12) = 4.2, p = 0.062, partial η² = 0.26) and rapport (F(1,12) = 4.1, p = 0.067, partial η² = 0.25). The lower MDMA dose (0.75 mg/kg) increased proximity to empathy relative to placebo (F(1,12) = 10.3, p = 0.007, partial η² = 0.46). Methamphetamine (20 mg) decreased proximity to compassion compared with placebo (F(1,12) = 6.3, p = 0.03, partial η² = 0.35). No significant differences emerged for the other preselected concepts. Verbosity was higher on methamphetamine than on placebo (F(1,12) = 5.8, p = 0.03, partial η² = 0.32).
Graph-based structural analyses revealed a single drug-related difference: methamphetamine reduced the normalised count of 1-loops (returns to the same word without intervening words) relative to placebo (F(1,12) = 6.6, p = 0.03, partial η² = 0.36). No other topological speech features differed across conditions, and MDMA did not alter the measured speech topology.
In machine-learning classification, binary SVMs discriminated MDMA1.5 from placebo with 88% accuracy and MDMA1.5 from methamphetamine with 84% accuracy. Classification accuracy for placebo versus methamphetamine was 69%. Comparisons involving the lower MDMA dose performed at chance. A four-way LDA classifier (distinguishing all four conditions) achieved 59% accuracy (above the 25% chance baseline reported by the authors).
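As a reading aid, the partial eta-squared effect sizes quoted with each F test can be approximately recovered from the F value and its degrees of freedom using the standard identity partial eta-squared = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to some of the values above; the function name is hypothetical, and because the reported F values are rounded, a recomputed effect size can differ from the published one by about 0.01.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values for planned comparisons against placebo, all with df = (1, 12), as reported above.
reported_f = {
    "friend": 5.7,
    "support": 5.3,
    "intimacy": 4.2,
    "rapport": 4.1,
    "empathy": 10.3,
}

recomputed = {
    concept: round(partial_eta_squared(f, 1, 12), 2) for concept, f in reported_f.items()
}
print(recomputed)
```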
Discussion
Bedi and colleagues conclude that automated semantic analysis of free speech can detect drug-induced changes in mental state and, in their sample, that MDMA produces systematic shifts in speech meaning toward prosocial concepts. The pattern of increased semantic proximity to friend, support, intimacy, rapport, and empathy is broadly consistent with the drug's purported socioemotional effects. Most effects were dose dependent, appearing primarily at the higher MDMA dose; the exception was empathy, which increased only at the lower dose. By contrast, methamphetamine decreased proximity to compassion and increased speech quantity, while producing a small reduction in one measure of local repetition (1-loops) in the graph-based analysis.
The authors position their findings as complementary to existing subjective-report methods, arguing that speech-content analysis is less constrained by preselected descriptors and less vulnerable to recall bias. They note that the lack of MDMA effects on speech topology suggests the drug alters meaning rather than the formal structure of discourse, differentiating its effects from those associated with thought disorder in psychosis. Machine-learning results demonstrating high binary classification accuracy for MDMA1.5 versus placebo and versus methamphetamine indicate that multivariate speech features can predict drug condition at the individual level.
Several limitations acknowledged by the investigators temper the conclusions. The sample was small and relatively homogeneous, limiting generalisability and statistical power. Only two MDMA doses and a single methamphetamine dose were tested, so dose–response relationships remain incompletely characterised. The a priori selection of socioemotional concepts emphasised MDMA-relevant dimensions and may have biased classification outcomes; the authors note that data-driven (‘black-box’) approaches could complement hypothesis-driven analyses and include supplementary material illustrating such an analysis. Task selection is another potential confound: participants spoke about a person of importance, a task chosen to be sensitive to prosocial effects and resembling psychotherapy, but task choice, repetition across sessions, interviewer influence, and dialogue versus monologue format could all affect speech content independently of drug effects. The graph-construction method used for topology was one of several possible approaches; although it has sensitivity to psychosis-related disorganisation and detected the methamphetamine effect here, other graph methods might yield different results. Finally, the authors emphasise inherent limits to what speech can reveal about underlying thought: certain phenomena, such as mental imagery, may not be captured.
Nonetheless, they argue that automated semantic and structural speech analyses show promise as objective, efficient adjuncts to traditional measures for characterising drug-induced mental states, with potential application to clinical assessment and to characterising novel psychoactive substances. Further work is recommended to clarify dose effects, psychopharmacological mechanisms (including roles for serotonin, dopamine, norepinephrine, and oxytocin), optimal speech tasks, and the integration of semantic, structural, and acoustic features.
Study Details
- Study Type: individual
- Population: humans
- Characteristics: placebo controlled, double blind, crossover