Cigna's StressWaves Test Assessment: Reliability and Validity Concerns in AI-Based Stress Management

In a paper published in the journal Scientific Reports, researchers assessed Cigna's stress management toolkit, which includes an artificial intelligence (AI)-based tool known as the Cigna StressWaves Test (CSWT). The study put the claim that the CSWT is a 'clinical grade' assessment through independent validation. The findings showed that the CSWT lacked repeatability and had poor convergent validity. The tool's public availability without adequate validation data raises concerns about the premature deployment of digital health tools for stress and anxiety management.

Study: Cigna's StressWaves Test Assessment: Reliability and Validity Concerns in AI-Based Stress Management. Image credit: Duncan Andison/Shutterstock

Background

The global impact of psychological stress on health, ranging from cardiovascular issues to depression, has long been recognized. Traditional methods for monitoring stress, like the Perceived Stress Scale (PSS), rely on patient-reported questionnaires known for their reliability and validity.

The PSS has been widely used to measure stress and as a benchmark when studying stress indicators, such as cortisol concentration, and evaluating stress management techniques. Recently, AI-based digital tools for assessing stress, depression, and anxiety, like the CSWT, have gained attention. However, despite its wide availability, the CSWT lacks published validation data, a critical gap given that a global health services company has integrated it into its stress management offerings.

Methodology Overview: CSWT Evaluation

This study involved 60 participants aged 18 or above, recruited from Arizona State University. Before the experiment, researchers obtained institutional approval (IRB #00016588) and collected informed consent.

Inclusion criteria were English-speaking individuals aged 18 or older. The experiment used standardized equipment (a Logitech H390 wired headset connected to a Dell computer) in a quiet laboratory setting. Participants must be made aware of their CSWT stress scores throughout the study.

CSWT: The CSWT, positioned as a clinical-grade tool for stress assessment based on speech analysis, prompts participants to select a question and respond for at least 60 seconds. Each participant took the test twice consecutively, choosing from eight prompts each time. Only one participant selected the same prompt for both administrations. The CSWT reports stress levels on both an ordinal and a gradient scale. Participants also completed the 10-question PSS, which was scored numerically and on a three-level ordinal scale. The researchers randomized the order in which the CSWT and PSS were administered.

Statistical Analysis: The primary analysis focused on test–retest reliability, assessed via the intra-class correlation (ICC) between the two CSWT administrations. The secondary analysis evaluated CSWT validity against the PSS by measuring the correlation between PSS scores and the average CSWT score across both administrations. Cohen's kappa measured the repeatability of the ordinal ratings and their validity relative to the PSS. Statistical analyses were run in RStudio with the irr package.
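To make the agreement statistic concrete, here is a minimal Python sketch of unweighted Cohen's kappa. It is an illustration only: the article does not say whether the researchers used a weighted or unweighted kappa, the function name `cohens_kappa` is hypothetical, and the labels below are made up rather than taken from the study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Observed agreement: share of items given the same label both times.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when both lists use one shared label")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical ordinal labels for illustration only, not data from the study.
first_test = ["low", "low", "high", "high"]
second_test = ["low", "high", "high", "high"]
kappa = cohens_kappa(first_test, second_test)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, so a value near zero means the two sets of ratings agree about as often as random labeling with the same category frequencies would.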

Power Analysis: The sample size was chosen with the primary test–retest analysis in mind, assuming an expected ICC of 0.75. A sample of 55 participants, plus 5 more to allow for potential data issues, was calculated to give 80% power to detect correlations of at least 0.33 between the CSWT and PSS at a significance level of 0.05. The researchers set the lower threshold for an acceptable ICC at 0.5 because of the inherent variability in speech-related acoustic features.
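For readers unfamiliar with power calculations, the sketch below shows one common way to estimate the sample size needed to detect a correlation: the Fisher z approximation. This is an assumption chosen for illustration, not necessarily the method the researchers used, and the function name `sample_size_for_correlation` is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_correlation(r, alpha=0.05, power=0.80, two_sided=True):
    """Approximate n needed to detect correlation r, via the Fisher z transform."""
    if not 0 < abs(r) < 1:
        raise ValueError("r must be strictly between -1 and 1 and non-zero")
    normal = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = normal.inv_cdf(1 - tail)   # critical value for the significance level
    z_power = normal.inv_cdf(power)      # quantile matching the desired power
    effect = math.atanh(abs(r))          # Fisher z of the target correlation
    return math.ceil(((z_alpha + z_power) / effect) ** 2 + 3)


n_two_sided = sample_size_for_correlation(0.33)
n_one_sided = sample_size_for_correlation(0.33, two_sided=False)
print(n_two_sided, n_one_sided)
```

Under this approximation, a one-sided test gives 56 participants and a two-sided test gives 70, so the one-sided figure is close to the reported 55 without matching it exactly. The article does not state which test or calculation method produced that number, and exact methods can differ from the approximation by a participant or two.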

CSWT Evaluation: Reliability and Validity

In this study, 60 participants (36 females, 24 males) completed the CSWT twice during a single session to examine its reliability and the PSS once to assess validity. The test–retest analysis showed that the CSWT lacked repeatability, with a non-significant ICC (ICC = −0.106, p > 0.05).
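A negative ICC can look surprising, so the following Python sketch shows how one arises. It implements the one-way random-effects, single-measure form, ICC(1,1); the article does not say which ICC form the researchers computed, so that choice, the function name `icc_one_way`, and the example scores are all assumptions for illustration.

```python
def icc_one_way(scores):
    """One-way random-effects, single-measure ICC(1,1).

    scores: one list per participant, each holding the same number (k >= 2)
    of repeated measurements.
    """
    if len(scores) < 2:
        raise ValueError("need at least two participants")
    n, k = len(scores), len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("each participant needs the same k >= 2 measurements")
    subject_means = [sum(row) / k for row in scores]
    grand_mean = sum(subject_means) / n
    # Between-participant mean square: how much participants differ from each other.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-participant mean square: how much repeat tests differ for one person.
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, subject_means) for x in row
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("all scores are identical; ICC is undefined")
    return (ms_between - ms_within) / denominator


# Hypothetical test-retest scores, not data from the study.
repeatable = [[1, 2], [3, 4], [5, 6]]      # repeat scores stay close per person
unrepeatable = [[1, 5], [5, 1], [3, 3]]    # repeat scores swing widely per person
print(icc_one_way(repeatable), icc_one_way(unrepeatable))
```

The ICC turns negative whenever a person's two scores differ from each other more than different people's averages differ, which is the pattern one would expect from a test whose output is dominated by noise rather than by a stable trait.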

Similarly, the assessment of convergent validity found no significant correlation between the CSWT and the PSS (r = 0.200, p > 0.05). A multiple linear regression using both CSWT administrations to predict the PSS accounted for only 6.9% of the variance in the PSS, further underscoring the CSWT's poor validity relative to the PSS.
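The link between a correlation and "variance accounted for" can be sketched in a few lines of Python. For a single predictor, the share of variance explained is simply r squared, so r = 0.200 corresponds to 4%. For two predictors, as in the regression described above, R squared can be computed from the three pairwise correlations. The function names and example data below are hypothetical and are not drawn from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return s_xy / math.sqrt(s_xx * s_yy)


def r_squared_two_predictors(y, x1, x2):
    """R^2 of an ordinary least squares fit of y on x1 and x2 (with intercept)."""
    r_y1, r_y2, r_12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)
    if abs(r_12) >= 1.0:
        raise ValueError("predictors are perfectly collinear")
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Single predictor: variance explained is r squared.
print(f"{0.200 ** 2:.1%}")

# Hypothetical data: an outcome and two repeat measurements used as predictors.
outcome = [1, 2, 3, 4]
first_test = [-1, 1, -1, 1]
second_test = [-1, -1, 1, 1]
print(r_squared_two_predictors(outcome, first_test, second_test))
```

Adding a second predictor can only raise R squared or leave it unchanged, which is consistent with the two-administration regression explaining somewhat more variance than the single correlation while still explaining very little.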

These findings challenge the claims of the CSWT being a clinically robust tool and question its effectiveness. The poor reliability and validity suggest limited agreement between the CSWT and the established PSS, raising concerns about its utility, especially considering its integration into broader stress management offerings. The extensive availability of such tools through prominent platforms might lead users to rely on them for critical health decisions, potentially resulting in misleading assessments, inappropriate treatments, and unwarranted anxiety or reassurance.

Moreover, beyond its limitations in reliability and validity, the CSWT's attempt to infer trait psychological stress from brief speech samples raises feasibility concerns. This study serves as a cautionary example of deploying AI-driven tools without robust validation data. It points to the need for verification processes as stringent as those used elsewhere in healthcare to ensure the credibility of digital health tools, particularly in mental health assessment.

Additionally, the challenges of developing speech-based health measures, including variability in speech production and limited model transparency, contribute to the limitations of tools like the CSWT. The inherent variability in human speech constrains how accurately complex health states like psychological stress can be predicted directly from speech, so claims made by AI-driven health tools based on speech analysis call for cautious interpretation and verification.

Conclusion

To sum up, evaluating the CSWT against the PSS revealed substantial shortcomings in reliability and validity. These findings raise significant concerns regarding the CSWT's claim of clinical-grade performance and its effectiveness as a reliable stress assessment tool. The study emphasizes the critical need for stringent validation of AI-driven health tools, especially those used for mental health assessment.

Additionally, the challenges associated with speech-based health measures highlight the need for transparent validation and cautious interpretation of the claims made by such tools. The study underscores the importance of robust verification and transparent reporting in ensuring the reliability and accuracy of digital health tools in clinical settings.

Journal reference:
Silpaja Chandrasekar

Written by

Silpaja Chandrasekar

Dr. Silpaja Chandrasekar has a Ph.D. in Computer Science from Anna University, Chennai. Her research expertise lies in analyzing traffic parameters under challenging environmental conditions. Additionally, she has gained valuable exposure to diverse research areas, such as detection, tracking, classification, medical image analysis, cancer cell detection, chemistry, and Hamiltonian walks.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Chandrasekar, Silpaja. (2023, November 21). Cigna's StressWaves Test Assessment: Reliability and Validity Concerns in AI-Based Stress Management. AZoAi. Retrieved on November 21, 2024 from https://www.azoai.com/news/20231121/Cignas-StressWaves-Test-Assessment-Reliability-and-Validity-Concerns-in-AI-Based-Stress-Management.aspx.

  • MLA

    Chandrasekar, Silpaja. "Cigna's StressWaves Test Assessment: Reliability and Validity Concerns in AI-Based Stress Management". AZoAi. 21 November 2024. <https://www.azoai.com/news/20231121/Cignas-StressWaves-Test-Assessment-Reliability-and-Validity-Concerns-in-AI-Based-Stress-Management.aspx>.

  • Chicago

    Chandrasekar, Silpaja. "Cigna's StressWaves Test Assessment: Reliability and Validity Concerns in AI-Based Stress Management". AZoAi. https://www.azoai.com/news/20231121/Cignas-StressWaves-Test-Assessment-Reliability-and-Validity-Concerns-in-AI-Based-Stress-Management.aspx. (accessed November 21, 2024).

  • Harvard

    Chandrasekar, Silpaja. 2023. Cigna's StressWaves Test Assessment: Reliability and Validity Concerns in AI-Based Stress Management. AZoAi, viewed 21 November 2024, https://www.azoai.com/news/20231121/Cignas-StressWaves-Test-Assessment-Reliability-and-Validity-Concerns-in-AI-Based-Stress-Management.aspx.
