Measuring the subjective: revisiting the psychometric properties of three rating scales that assess the acute effects of hallucinogens
This study (n=158) examined the psychometric properties of three commonly used rating scales (MEQ, HRS, ARCI). The authors found only sparse agreement between their psychometric analyses and the theoretically proposed models of the scales.
Authors
- José Carlos Bouso
Abstract
Objective: In the present study we explored the psychometric properties of three widely used questionnaires to assess the subjective effects of hallucinogens: the Hallucinogen Rating Scale (HRS), the Mystical Experience Questionnaire (MEQ), and the Addiction Research Center Inventory (ARCI). Methods: These three questionnaires were administered to a sample of 158 subjects (100 men) after taking ayahuasca, a hallucinogen whose main active component is N,N-dimethyltryptamine (DMT). A confirmatory factorial study was conducted to check the adjustment of previous data obtained via theoretical proposals. When this was not possible, we used an exploratory factor analysis without restrictions, based on tetrachoric and polychoric matrices and correlations. Results: Our results sparsely match the theoretical proposals of the authors, perhaps because previous studies have not always employed psychometric methods appropriate to the data obtained. However, these data should be considered preliminary, pending larger samples to confirm or reject the proposed structures obtained. Conclusions: It is crucial that instruments of sufficiently precise measurement are utilized to make sense of the information obtained in the study of the subjective effects of psychedelic drugs. Copyright © 2016 John Wiley & Sons, Ltd.
Research Summary of 'Measuring the subjective: revisiting the psychometric properties of three rating scales that assess the acute effects of hallucinogens'
Introduction
Bouso and colleagues frame the problem by noting that classic hallucinogens produce dramatic alterations in consciousness but that measuring those subjective effects remains challenging. The introduction reviews commonly used instruments in contemporary psychedelic research — including the Hallucinogen Rating Scale (HRS), the Mystical Experience Questionnaire (MEQ), and the Addiction Research Center Inventory (ARCI) — and argues that, while neurobiological and pharmacological methods have advanced, many psychometric tools have not been revisited with modern statistical techniques. The authors highlight that only a few instruments (notably some versions of the OAV/5D-ASC and the MEQ in limited contexts) have undergone rigorous confirmatory psychometric evaluation, whereas factorial structure, item loadings and reliability of HRS and ARCI have been incompletely assessed or analysed with outdated methods. This study therefore set out to re-examine the factor structure and internal consistency of the MEQ, HRS and the 49-item ARCI using data gathered after naturalistic ayahuasca ceremonies in Spain. The investigators aimed to apply confirmatory and exploratory factor-analytic methods appropriate to ordinal and dichotomous data (polychoric and tetrachoric correlations) and to propose tentative alternative structures or reduced versions when the original theoretical models did not fit the empirical data. The authors also translated the MEQ into Spanish and treated the work as a first transcultural psychometric effort in a sample of ayahuasca users.
Methods
Participants were recruited opportunistically at ayahuasca ceremonies in different parts of Spain. Researchers attended ceremonies conducted by a Spanish practitioner for personal-growth (non-religious) purposes and invited attendees to participate at the end of each session. After informed consent, participants completed sociodemographic questions and the Spanish versions of the ARCI (49-item short form), the 71-item HRS (Spanish adaptation), and a Spanish translation of the 30-item MEQ. The Research Ethics Committee of the Autonomous University of Madrid approved the procedures. The MEQ translation used back-translation and multiple bilingual clinicians to prioritise conceptual equivalence. The MEQ items were scored on a six-point Likert scale and results were expressed as proportions of the maximum possible score. The HRS used 71 computable Likert items (0–4) and 28 qualitative items in the Spanish adaptation. The ARCI short form consisted of 49 true/false items across five established subscales (MBG, PCAG, LSD, BG, A); score ranges for each subscale were reported. For the psychometric analyses, the investigators first attempted confirmatory factor analysis (CFA) for each instrument using AMOS 18 and Unweighted Least Squares when multivariate normality could not be assumed. When the theoretical models failed empirical identification or fit, they performed unrestricted exploratory factor analysis (EFA) based on polychoric correlations for ordinal items and tetrachoric correlations for dichotomous items. Multivariate normality was tested with Mardia’s criterion. Parallel analysis (minimum rank factor analysis) determined the number of factors to retain, and a Simplimax rotation (with Promin start and multiple random starts) was applied. Kelley’s criterion assessed residuals. 
Internal consistency was estimated with multiple indices recommended for non-linear data: standardized Cronbach’s alpha, Carmines’ theta, and McDonald’s omega for whole scales, and a multivariate reliability formula plus standardized alpha for subscales. The FACTOR 9.2 package was used for EFA steps. Pearson correlations among derived factor scores were calculated with Bonferroni correction to control type I error.
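As a rough illustration of two of the indices named above, the sketch below computes standardized Cronbach's alpha and Carmines' theta from an item correlation matrix. The function names and the toy matrix are hypothetical. The study computed its indices in FACTOR on polychoric or tetrachoric matrices, whereas this sketch accepts any correlation matrix. McDonald's omega is left out because it requires factor loadings.

```python
import numpy as np


def standardized_alpha(corr):
    """Standardized Cronbach's alpha from an item correlation matrix."""
    corr = np.asarray(corr, dtype=float)
    k = corr.shape[0]
    # Mean of the off-diagonal (inter-item) correlations.
    mean_r = (corr.sum() - np.trace(corr)) / (k * (k - 1))
    return k * mean_r / (1.0 + (k - 1) * mean_r)


def carmines_theta(corr):
    """Carmines' theta: (k / (k - 1)) * (1 - 1 / largest eigenvalue)."""
    corr = np.asarray(corr, dtype=float)
    k = corr.shape[0]
    # eigvalsh returns eigenvalues in ascending order, so take the last one.
    largest = np.linalg.eigvalsh(corr)[-1]
    return (k / (k - 1)) * (1.0 - 1.0 / largest)


# Hypothetical toy matrix: three items, all inter-item correlations 0.5.
toy = np.full((3, 3), 0.5)
np.fill_diagonal(toy, 1.0)
alpha_toy = standardized_alpha(toy)
theta_toy = carmines_theta(toy)
print(round(alpha_toy, 2), round(theta_toy, 2))
```

For this toy matrix both indices come out at 0.75, because the largest eigenvalue is 2 and the mean inter-item correlation is 0.5.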
Results
Sample: Of 167 completed booklets, 158 were usable (100 men). Mean age was 39 years (range 20–60) and average education approximated three years of university. Most participants (134, 84%) had prior psychedelic experience (mean reported 21 lifetime uses, range 0–350). Mean ingested ayahuasca dose was 113 ml (range 50–265); concentrations of DMT and beta-carbolines were unknown. Self-rated perceived dose intensity was reported as low by 19 (12%), medium by 120 (76%), and high by 19 (12%).
MEQ: A CFA testing the previously proposed four-factor model could not be identified with these data, so the authors performed an EFA on a polychoric matrix (KMO = 0.91; Bartlett p < 0.0001). Optimised parallel analysis suggested two factors, and the two-factor solution explained 59.11% of the common variance (Factor 1 eigenvalue = 11.3, 47.75%; Factor 2 eigenvalue = 2.7, 11.37%). Estimated reliabilities were high for the two factors (0.95 and 0.92) and for the whole test (θ = 0.94; ω = 0.94; standardized Cronbach’s α = 0.94). The solution produced few residuals (RMSR = 0.062; expected acceptable RMSR = 0.080) and strong simplicity indices. A Schmid–Leiman transformation identified a second-order factor with high loadings, and the authors note the possibility that the MEQ could be treated as essentially unidimensional in this sample.
HRS: The CFA of the original 71-item, six-factor HRS produced suboptimal fit (GFI = 0.82; AGFI = 0.81; PGFI = 0.77; NFI = 0.76). The investigators removed 12 poorly discriminating items and ran an EFA on the remaining 59 items. A polychoric matrix (KMO = 0.83; Bartlett p < 0.001) and parallel analysis supported retaining six factors. The six-factor solution accounted for 56.5% of the common variance. Multivariate consistency for factors ranged from 0.87 to 0.94; whole-scale indices were θ = 0.94, ω = 0.93 and αs = 0.93. Inter-factor correlations ranged from 0.20 to 0.60. Residuals were low (RMSR = 0.051) and simplicity indices were strong. All items showed acceptable discriminative capacity (>0.30) except for one reverse-scored item (#44), which was retained.
ARCI: The ARCI 49-item structure could not be modelled confirmatorily owing to cross-loading items in the theoretical structure. A tetrachoric correlation matrix was used but required smoothing because it was not positive definite; Mardia’s statistic was reported as 0.08 and KMO = 0.74 (Bartlett p < 0.001). Parallel analysis initially suggested three factors, but this solution explained only 26.0% of common variance and was empirically unsatisfactory: only 18 of 49 items loaded significantly (>0.20). Using those 18 items, fit indices were acceptable (GFI = 0.96; AGFI = 0.95; NFI = 0.93; RFI = 0.92), and three factors showed good multivariate consistency (θ = 0.92, 0.89, 0.86; αs = 0.85, 0.81, 0.79). Inter-factor correlations were near zero. The 18-item solution explained 50.5% of common variance, but the whole-scale internal consistency was reported as precarious (θ = 0.75; ω = 0.51; αs = 0.73).
Correlations among derived subscales: All MEQ subscales correlated positively with each other (p < 0.0004, Bonferroni corrected). Within the HRS, Agitation showed no correlation with Sensorial Distortion, Security/Control, and Visual Distortion. Within ARCI, Activation correlated positively with Euphoria and negatively with Sedation; Euphoria and Sedation did not correlate. Cross-instrument correlations showed a general pattern of association between MEQ and HRS subscales (except Agitation), and Euphoria (ARCI) correlated with MEQ total and MEQ Mystical Ecstasy and with certain HRS subscales (Sensorial Distortion and Security/Control). The authors summarise that Mystical Ecstasy did not correlate with Agitation, Activation or Sedation, whereas Transdimensionality correlated with all HRS subscales but with none of the ARCI factors.
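The ARCI analysis mentions that the tetrachoric matrix had to be smoothed because it was not positive definite. The source does not say which smoothing algorithm FACTOR applied, so the following is only a generic sketch of one common approach (clipping negative eigenvalues and rescaling to a unit diagonal). The function name, the `floor` parameter, and the toy matrix are all hypothetical.

```python
import numpy as np


def smooth_correlation(corr, floor=1e-6):
    """Return a positive definite correlation matrix close to `corr`.

    Generic eigenvalue-clipping approach (an assumption, not necessarily
    the procedure used in the study).
    """
    corr = np.asarray(corr, dtype=float)
    values, vectors = np.linalg.eigh(corr)
    # Raise any eigenvalue below `floor` (including negative ones) to `floor`.
    values = np.clip(values, floor, None)
    rebuilt = (vectors * values) @ vectors.T
    # Rescale so the diagonal is exactly 1 again.
    scale = np.sqrt(np.diag(rebuilt))
    smoothed = rebuilt / np.outer(scale, scale)
    # Remove tiny asymmetries introduced by floating-point arithmetic.
    return (smoothed + smoothed.T) / 2.0


# Hypothetical indefinite "correlation" matrix of the kind that pairwise
# tetrachoric estimation can produce.
bad = np.array([
    [1.0, 0.9, -0.9],
    [0.9, 1.0, 0.9],
    [-0.9, 0.9, 1.0],
])
fixed = smooth_correlation(bad)
print(np.linalg.eigvalsh(bad)[0] < 0, np.linalg.eigvalsh(fixed)[0] > 0)
```

The smoothed matrix keeps a unit diagonal and has only positive eigenvalues, so it can be passed to a factor-extraction routine that requires positive definiteness.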
Discussion
Bouso and colleagues interpret their findings as a partial validation and partial revision of the three questionnaires in the context of ayahuasca use. For the MEQ, the authors did not replicate the previously proposed four-factor model; instead they found a robust two-factor solution that explained a similar proportion of variance to prior reports and exhibited excellent internal consistency. They further show that a higher-order factor may account for both components, suggesting the MEQ could function as a unidimensional measure of mystical experience in this sample. The differences from earlier studies are attributed to sample size, assessment setting (in situ versus retrospective or clinical), substance differences (psilocybin versus ayahuasca), and possible cultural effects; the authors recommend further cross-cultural and cross-substance replication. Regarding the HRS, the original theoretical structure was not supported by CFA. After excluding poorly discriminating items, the investigators derived a six-factor solution from 59 items that accounted for 56.5% of variance and showed good multivariate and univariate reliability. The authors propose tentative labels for those factors (Sensorial distortion, Cognitive distortion, Agitation, Security/Control, Visual distortion, and Quality of the experience) and suggest their reconfiguration may better represent ayahuasca effects than the original model, which was initially based on a very small sample and reported limited psychometric detail. For the ARCI 49-item short form, the authors conclude that the original five-dimension formulation does not hold cleanly in this dataset. They identified an 18-item, three-factor solution (Euphoria, Activation, Sedation) with acceptable subscale reliabilities but weak whole-scale consistency and limited explained variance. 
Consequently, Bouso and colleagues consider ARCI-49 to be of limited feasibility for psychedelic research with ayahuasca and recommend further psychometric work across diverse substances before any definitive structure is accepted. The authors discuss the pattern of correlations across instruments and infer that mystical aspects of the experience (MEQ) are closely related to psychedelic phenomenology as measured by the HRS, typically occurring in the absence of agitation, activation or sedation, and that euphoria (ARCI) may be an important component of the mystical–psychedelic experience. Key limitations acknowledged include the single-substance design, unknown alkaloid concentrations in the ayahuasca preparations, and the need for larger and more diverse samples and for replication with other psychedelics and altered states. The paper concludes that modern psychometric techniques based on polychoric and tetrachoric correlations should be applied more widely in psychopharmacology and offers these preliminary factor solutions as a provisional basis for more rigorous measurement of subjective psychedelic effects.
RESULTS
To start with, we performed a confirmatory factor analysis based on the proposals previously published for each questionnaire. We used the AMOS 18 software, applying the Unweighted Least Squares method when multivariate normality could not be guaranteed and reporting the goodness-of-fit indicators provided by the software. The empirical identifiability of the theoretical models was studied using criteria provided by the software; although some inherent limitations are usually noted with this method, they should not affect the models studied. Because some characteristics of the study variables were previously unclear, the Unweighted Least Squares method was used. When the theoretical structure could not be empirically replicated, we performed an Unrestricted Exploratory Factor Analysis, first computing the polychoric (for the Likert-type questionnaires) or tetrachoric (when the response option was dichotomous) matrices. Mardia's criterion was used to test multivariate normality. After checking the suitability criteria, we performed an optimized parallel analysis based on minimum rank factor analysis to determine the number of factors to retain. Then, an overall factor analysis was performed, fitting the solution to the number of factors obtained by the parallel analysis. A Simplimax rotation was undertaken (using a clever start with Promin, number of random starts = 100, maximum number of iterations = 100; convergence value p < 0.00001) and several criteria were used to guarantee the simplicity of the studied solution. For the study of the residuals, Kelley's criterion was applied. To analyze the internal consistency of each scale we used the standardized Cronbach's alpha, Carmines' theta, and McDonald's omega, as recommended for non-linear data. For estimating the internal consistency of each subscale we used a multivariate reliability formula and, as a univariate measure, the standardized Cronbach's alpha. The Schmid-Leiman solution was applied to study the possible existence of higher-order factors. All the mentioned analyses were performed using the program FACTOR 9.2, which is freely available. Pearson correlations were analyzed between the scores of the factorially derived scales. Bonferroni correction was applied to avoid type I errors.
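To make the factor-retention step more concrete, here is a minimal sketch of classic (Horn-style) parallel analysis on Pearson correlations. It is a deliberate simplification: the study used an optimized parallel analysis based on minimum rank factor analysis and polychoric or tetrachoric matrices, which this sketch does not reproduce. The function name, its parameters, and the simulated two-factor data are hypothetical.

```python
import numpy as np


def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Count leading eigenvalues that exceed those of random normal data."""
    data = np.asarray(data, dtype=float)
    n_obs, n_items = data.shape
    rng = np.random.default_rng(seed)

    # Observed eigenvalues, sorted from largest to smallest.
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    # Eigenvalues of correlation matrices from uncorrelated random data
    # with the same number of observations and items.
    random_eigs = np.empty((n_sims, n_items))
    for i in range(n_sims):
        noise = rng.standard_normal((n_obs, n_items))
        random_eigs[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    threshold = np.percentile(random_eigs, percentile, axis=0)

    # Retain factors only up to the first eigenvalue that fails the test.
    exceeds = observed > threshold
    n_factors = n_items if exceeds.all() else int(np.argmin(exceeds))
    return n_factors, observed, threshold


# Simulated example: 158 respondents, 10 items, two uncorrelated factors.
rng = np.random.default_rng(1)
factors = rng.standard_normal((158, 2))
loadings = np.zeros((2, 10))
loadings[0, :5] = 0.8   # items 1-5 load on the first factor
loadings[1, 5:] = 0.8   # items 6-10 load on the second factor
simulated = factors @ loadings + 0.6 * rng.standard_normal((158, 10))

n_factors, observed, threshold = parallel_analysis(simulated)
print(n_factors)
```

With ordinal or dichotomous items, as in the questionnaires analysed here, the Pearson matrix in this sketch would need to be replaced by a polychoric or tetrachoric matrix.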
CONCLUSION
In this paper we have analyzed the factorial structure and internal consistency of three of the most widely used rating scales designed to measure the subjective effects of psychedelics: the MEQ (MacLean et al., 2012), the HRS, and the 49-item version of the ARCI. Of these three rating scales, only the MEQ has been explored with modern psychometric techniques. The HRS and ARCI have been subjected to psychometric research only to a limited extent, not extensive enough for them to be considered appropriate measurement instruments. Furthermore, the limited psychometric analyses undertaken a few decades ago were based on psychometric approaches that have since been surpassed. Despite these important limitations, the HRS and ARCI are widely used at present. Consequently, there is a need to revise the psychometrics of these rating scales according to modern statistical techniques.
The factorial structure and reliability of the MEQ, a questionnaire that had different names and versions over time and was extensively used in early psychedelic research, and more recently in clinical research under the name States of Consciousness Questionnaire (SCQ), were not examined until very recently. MacLean et al. (2012) explored the factorial structure and reliability of this questionnaire with psilocybin users after retrospective assessment via the Internet and found a solution in which 30 items loaded on four factors, labeled as: Mystical (containing items from the former subscales Internal and external unity, Noetic quality, and Sense of sacredness); Positive mood (including items related to the previous Deeply felt positive mood); Time/Space (including items from the former Transcendence of time and space); and Ineffability (including items of the previous Ineffability and paradoxicality). A recent study confirmed, in a series of clinical studies using psilocybin, the structure found by MacLean et al. (2012). 
We tried to conduct a confirmatory factor analysis with the 30 items grouped in the four-factor structure that MacLean et al. (2012) proposed, but the model could not be identified with our data. In a subsequent exploratory factor analysis we found a two-factor model that explained 59.11% of the common variance, a figure similar to those found by MacLean et al. (2012) (57% and 64%, respectively). Our Factor 1 grouped the items belonging to the MacLean et al. (2012) subscales Mystical and Positive mood, except item 14 (Factor 1) and items 5 and 80 (Factor 2 of MacLean's model); our Factor 2 grouped the subscales Space/Time and Ineffability plus items 5, 14, and 80 of MacLean's model. In this sense, we found a factorial structure very similar to the one obtained in the earlier studies, which increases the suitability of the basic structure found by those authors. Tentatively, we relabeled these two factors as Mystical ecstasy and Transdimensionality. The internal consistency of the whole test, as well as of each of the two factors, was excellent. Previous work found a reliability figure for the 30-item scale (α = .93) similar to ours (multivariate indices: θ = 0.94; ω = 0.94). Furthermore, the two-factor structure we obtained displayed reliability indices for each subscale equal or similar to those found previously, although direct comparison cannot be established because of factor structure differences. Finally, we explored the possibility that the two factors we obtained constituted a suprafactor, or second-order factor, with both components showing high saturations (68.1 and 89.2, respectively), as the 30 items did (>0.30). Thus, the MEQ may function as a unifactorial questionnaire that measures mystical experiences. Conceptually, this finding could be consistent with the conceptions asserting that, under a mystical experience, "all is one". The differences between the MacLean et al. 
(2012) study, the subsequent clinical study, and ours with regard to the MEQ factor structure may have different explanations: (i) differences in the sample size; (ii) differences in the conditions in which the questionnaire is answered (anonymous retrospective assessment via Internet vs. a clinical setting vs. natural conditions in the presence of the researchers just after the experience); (iii) possible dissimilarities in the subjective effects of the substances considered (psilocybin vs. ayahuasca); and (iv) cultural differences between the samples. The MEQ is answered from 0 (minimum score) to 5 (maximum score), so the midpoint of the scale is 2.5. In our sample, only 2 items scored below 2 and 5 items scored between 2 and 2.5; the rest of the items were above the midpoint. The small discrepancies between our model and MacLean's model are therefore probably not attributable to low scores in our sample. Future studies should cover more culturally diverse samples, taking into account different study settings and incorporating a wider range of substances in order to define the most suitable MEQ factor structure. Furthermore, the good indices of internal consistency obtained for the whole scale and the different proposed subscales in both studies seem to indicate that the 30-item MEQ is a reliable measure for the study of the mystical mimetic effects of hallucinogens.
With respect to the HRS, this study is the first to analyze its factorial structure using an EFA. The psychometrics of the original version were developed after performing a principal components factor analysis based on 11 subjects, after they had received four different doses of DMT plus a placebo dose. The original authors did not report the item loadings, the percentage of explained variance, or any reliability indices. 
A subsequent principal components analysis was conducted on the subscales of the 71-item version of the HRS and applied to two different samples (one of them following an ayahuasca experience). The results showed explained variances of 68% and 75%, respectively, for the whole test. That analysis also found good reliability indices for four subscales (Affect, Cognition, Perception, and Somaesthesia), but inadequate indices for Volition and Intensity. The confirmatory factor analysis we conducted with the HRS items did not support the original HRS theoretical model. After applying the pertinent psychometric analyses, we retained 59 items distributed across 6 factors. These 6 factors were composed of a set of items different from those originally proposed. In our study, the percentage of explained variance (56.5%) was lower than that obtained in the earlier analysis. Regarding reliability, while the earlier analysis found two subscales with inadequate univariate internal consistency indices, all the subscales we obtained presented good or excellent reliability parameters. Our study also included calculations of multivariate reliability indices, finding excellent figures for the six subscales. In any case, because the subscales in our proposed version of the HRS are composed of different items from those in the previously described version, direct comparisons between them cannot be established. Because the subscales we developed showed good or excellent internal consistency indices, it seems plausible that the item reconfiguration we propose may constitute a more suitable scale. In this sense, we also calculated the multivariate reliability indices for the whole scale, again finding excellent indices (θ = 0.94; ω = 0.93; αs = 0.93). Despite the relatively low percentage of common variance found, we obtained a factor solution with few residuals and excellent indices of simplicity, indicative of good suitability of the resulting model. 
All the items in our version also showed good discriminative capacity. Considering that the original model was based on only 11 subjects, and that neither data on the item distribution nor figures on the item loadings were reported, we considered our model more appropriate. Furthermore, as a result of recent advances in statistical methodologies that allow for more sophisticated psychometric analyses based on tetrachoric correlation matrices, nowadays considered a more adequate statistical approach to perform EFA, the model we obtained could be considered more appropriate to analyze the HRS scores, at least in the assessment of the effects of ayahuasca. Lastly, we relabeled the 6 factors with the following tentative names: Sensorial distortion (8 items), Cognitive distortion (16 items), Agitation (7 items), Security/Control (10 items), Visual distortion (8 items), and Quality of the experience (10 items). Future studies utilizing larger samples and different psychedelics should confirm or reject these findings.
With regard to the ARCI 49-item short form, as far as we are aware, there are no studies in which its factorial structure has been researched. An earlier study performed a discriminant analysis of the different subscales, finding discriminations among PCAG, MBG, LSD, and BG, but not for A. The reliability analysis showed good indices for PCAG, MBG, and BG, but unacceptable indices for A and LSD. In our study, it was not possible to undertake a confirmatory factor analysis because of the theoretical structure of the ARCI, whose five dimensions contain items belonging to two or more subscales. In the psychometric analysis we conducted, we found an 18-item solution with three factors, which explains 50.5% of the common variance. The reliability indices for the whole test proved to be precarious. At the same time, the three factors were relabeled, each of them obtaining an excellent multivariate internal reliability. 
The tentative names we propose for the three new factors are: Euphoria (8 items), Activation (4 items), and Sedation (6 items). Because the subscales of the original version of the 49-item ARCI are composed of items different from those forming our subscales, it is not possible to establish direct comparisons between their internal consistency indices. To our knowledge, there are no data on the explained variance of the items within the 49-item ARCI. Although we found a low percentage of variance in our 18-item version, and a precarious internal consistency for the whole scale, multivariate indices in each subscale were adequate, so they may have certain heuristic value if used in psychedelic research. Finally, the ARCI questionnaire was developed to assess subjective effects and abuse potential of drugs pertaining to different pharmacological categories. That is why further psychometric studies incorporating other psychoactive drugs are needed before considering that we have a final version of the questionnaire. In any case, and based on our analysis, the ARCI-49 does not seem to be a feasible questionnaire for use in psychedelic research, at least with respect to ayahuasca.
In relation to the correlations obtained between the different subscales, it is very interesting to note that the MEQ subscale Mystical Ecstasy (ME) did not correlate with Agitation (AG), Activation (ACT), or Sedation (SED). The other subscale of the MEQ, Transdimensionality (TD), correlated with all subscales of the HRS, but with none of the ARCI. The total score of the MEQ did not correlate with AG, ACT, or SED but correlated with Euphoria (EUP). 
Thus, it seems that having a full psychedelic experience without agitation (assessed with the HRS), activation, or sedation (assessed with the ARCI), but with euphoria (assessed by the ARCI), may be a good map of the mystical experience achieved by our subjects under the effects of ayahuasca. The correlation analysis between the HRS subscales and the ARCI subscales is also interesting, showing that sedation is not a component of the psychedelic experience, but euphoria may be part of it, because it correlates positively with 2 of the 6 subscales of the HRS. Another interesting result emerging from the correlation analyses refers to the ARCI. We found a single significant positive correlation, between subscales EUP and ACT, a negative correlation between ACT and SED, and no significant correlation between EUP and SED. The ARCI is an instrument used to measure the effects of different pharmacological classes of drugs, so each subscale should ideally measure a different effect and thus reflect effects of different kinds of drugs. The interesting pattern of correlations found in our sample, besides the good reliability indices reached in each subscale, may lead us to consider further explorations of this new version of the ARCI. Finally, the positive and significant correlations between the MEQ subscales and the HRS subscales (except Agitation) may reflect two closely dependent aspects of the psychedelic experience, the psychedelic one and the mystical one, two aspects of the same experience that could be named the mystical psychedelic experience and that tend to occur in the absence of agitation. Because Euphoria also correlates with two of three subscales of the MEQ and with another two subscales of the HRS, it is quite possible that euphoria may also be an essential component of the psychedelic mystical experience.
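The correlation findings discussed above rest on a Bonferroni-corrected significance threshold. As a small illustration of how such a threshold works, the sketch below divides a nominal alpha by the number of tests and flags which p-values survive. The function names and the p-values are hypothetical; the source does not report how many comparisons were made, so none is assumed here.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag each p-value that falls below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Hypothetical p-values from four pairwise correlations.
example_p = [0.0001, 0.004, 0.03, 0.2]
example_threshold = bonferroni_threshold(0.05, len(example_p))
example_flags = significant_after_bonferroni(example_p)
print(example_threshold, example_flags)
```

With four tests the corrected threshold is 0.0125, so only the first two hypothetical p-values remain significant. The correction is conservative, which suits a matrix of many pairwise subscale correlations.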
Study Details
- Study Type: individual
- Population: humans
- Characteristics: observational