In an article published in the journal Scientific Reports, researchers explored the detection of suicidal thoughts and behavior (STB) using digital markers derived from an online pro-choice suicide forum.
By analyzing over 3.2 million interactions among 192 individuals, the study developed a machine-learning model to predict high-risk users (HRUs). Key network features like transitivity and density were crucial in identifying HRUs, suggesting that social interaction patterns could indicate heightened suicide risk.
Background
Suicide is a global health crisis, with over 700,000 annual deaths worldwide, and its study is hampered by its sensitive nature. Previous research has utilized online data to analyze STB, but primarily focused on content analysis, leaving social interaction patterns underexplored. Existing studies have shown potential in using social network structures to understand STB, but they are limited by censorship on mainstream platforms.
This paper addressed these gaps by analyzing uncensored social interactions from a pro-choice suicide forum, Sanctioned Suicide. The study applied network analysis and machine learning to identify patterns that signal heightened suicide risk, providing novel insights into the dynamics of STB in an environment free from the censorship seen on other platforms.
Network Analysis of High-Risk User Behavior
The researchers focused on analyzing user interactions within the pro-choice suicide forum "Sanctioned Suicide." Data was collected from the "Suicide Discussion" subforum, comprising over 600,000 posts from 11,000 users between March 2018 and February 2021.
A custom Python script was used to gather and anonymize the data. The study aimed to identify HRUs, that is, users who were likely to attempt or complete suicide. HRUs were identified based on specific keywords and user activity patterns, leading to the selection of 48 HRUs and 144 control users for comparison.
A network-based approach was used to quantify user interactions within the forum, focusing on thread participation and temporal patterns. Interactions were weighted based on the recency of posts, with the strength of ties decaying over time. This method facilitated the construction of directed weighted graphs for further analysis, helping to identify and understand the behaviors of HRUs within the forum environment.
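The recency weighting described above can be illustrated with a short Python sketch. The article does not state the decay function, so the exponential half-life used here (`half_life_days`), the rule that each post links to every earlier poster in the same thread, and the function name `decayed_edge_weights` are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def decayed_edge_weights(posts, half_life_days=30.0):
    """Return {(source, target): weight} for a directed, recency-weighted graph.

    posts: iterable of (thread_id, user, day) tuples, where day is a number.
    Assumption: an edge runs from a later poster to each earlier poster in
    the same thread, and its weight halves every `half_life_days` days.
    """
    threads = defaultdict(list)
    for thread_id, user, day in posts:
        threads[thread_id].append((day, user))

    weights = defaultdict(float)
    for thread_posts in threads.values():
        thread_posts.sort()  # chronological order within the thread
        for i, (later_day, later_user) in enumerate(thread_posts):
            for earlier_day, earlier_user in thread_posts[:i]:
                if earlier_user == later_user:
                    continue  # ignore a user following up on their own post
                gap = later_day - earlier_day
                weights[(later_user, earlier_user)] += 0.5 ** (gap / half_life_days)
    return dict(weights)


# Hypothetical toy thread: three users posting 30 days apart.
example_posts = [("t1", "a", 0), ("t1", "b", 30), ("t1", "c", 60)]
example_edges = decayed_edge_weights(example_posts)
```

The resulting dictionary could be loaded into a NetworkX `DiGraph` by passing `(source, target, weight)` triples to `add_weighted_edges_from`.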
Egocentric Networks for Risk Prediction
The authors constructed two types of interaction networks using the Python library NetworkX: thread-specific and thread-agnostic. Thread-specific networks focused on interactions within individual threads, while the thread-agnostic network aggregated interactions across all threads.
Egocentric networks, centered around individual users (egos) and their immediate connections (alters), were then extracted. These networks considered both inward and outward-directed edges, with edge weights indicating the strength of connections. Structural features of these egocentric networks, such as centrality measures, were calculated to characterize user engagement.
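As a rough illustration of egocentric feature extraction, the sketch below uses NetworkX to pull out one user's immediate neighborhood and compute a few structural measures. The article does not list the features or how they were computed, so the choice of density, transitivity, in-degree centrality, and weighted in-degree here is an assumption, as are the function name `ego_features` and the toy graph.

```python
import networkx as nx


def ego_features(graph, ego):
    """Compute a few structural features for one user's egocentric network."""
    # Radius-1 neighbourhood, following both incoming and outgoing edges.
    ego_net = nx.ego_graph(graph, ego, radius=1, undirected=True)
    return {
        # Share of possible directed ties among ego and alters that exist.
        "density": nx.density(ego_net),
        # Fraction of connected triples that close into triangles,
        # computed on the undirected version of the ego network.
        "transitivity": nx.transitivity(ego_net.to_undirected()),
        # Normalised count of incoming ties in the full network.
        "in_degree_centrality": nx.in_degree_centrality(graph)[ego],
        # Sum of incoming edge weights (tie strength directed at the ego).
        "in_strength": graph.in_degree(ego, weight="weight"),
    }


# Hypothetical toy network; the edge weights stand in for tie strength.
graph = nx.DiGraph()
graph.add_weighted_edges_from([
    ("a", "ego", 2.0),
    ("b", "ego", 1.0),
    ("a", "b", 1.0),
    ("ego", "c", 1.0),
    ("d", "e", 1.0),  # unrelated pair, outside the ego network
])
features = ego_features(graph, "ego")
```

In this toy graph the ego network contains the ego plus users a, b, and c; users d and e are excluded because neither is directly tied to the ego.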
The resulting 17 features were used in a machine-learning model developed in R using the caret package. The model aimed to classify HRUs in the online suicide forum by analyzing interaction patterns. To address class imbalance, the synthetic minority oversampling technique (SMOTE) was employed.
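The idea behind SMOTE can be sketched in a few lines of Python: each synthetic minority sample is placed at a random point on the line segment between a real minority sample and one of its nearest minority neighbors. This is a simplified illustration only; the study itself used R, and the function name `smote`, the default `k`, and the toy data are assumptions.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate synthetic minority samples by interpolating between neighbours.

    minority: list of equal-length numeric tuples (the minority class).
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (Euclidean).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature minority class.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority_points, n_synthetic=6, k=2)
```

Because every synthetic point is an interpolation between two real minority samples, it always falls within the range spanned by the original minority data. Oversampling of this kind is normally applied to the training data only, so that evaluation still reflects the real class balance.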
The model's performance was validated using cross-validation and evaluated on a held-out test set, with metrics including sensitivity and area under the curve (AUC). Shapley additive explanations (SHAP) were used to interpret the model's predictions, highlighting key features influencing outcomes. The study's code and data were made available for transparency and reproducibility.
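For readers unfamiliar with these metrics, the following Python sketch shows how AUC, sensitivity, and specificity can be computed from predicted risk scores. The labels, scores, and the 0.5 decision threshold are made up for illustration and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    # Count pairwise wins for the positive class; ties count as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example: 1 marks a high-risk user, 0 marks a control user.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
sensitivity, specificity = sensitivity_specificity(labels, scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.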
Key Findings and Implications
The researchers assessed the predictive performance of a machine-learning model that used network-based features to identify HRUs on the "Sanctioned Suicide" forum. The model achieved an AUC of 0.73 in both the cross-validation and test sets, with sensitivity and specificity of around 0.70 in each.
The analysis revealed that egocentric network features, such as lower density, higher transitivity, and lower in-degree centrality, were significant predictors of HRU status. The authors highlighted that users with sparse, triadic networks and low centrality in social interactions were more likely to be at risk. Conversely, those more integrated within the community were less likely to be HRUs.
The SHAP analysis provided insight into the relative importance of these features in predicting suicidal behavior. The findings suggested that social network patterns could serve as critical indicators of suicide risk, though further research is needed to generalize these results across different platforms and consider the context of interactions. The study underscored the potential of incorporating network-based features into suicide prevention efforts on social media.
Conclusion
In conclusion, the researchers demonstrated the potential of using network-based features to predict HRUs on an online pro-choice suicide forum. By analyzing over 3.2 million interactions, they developed a machine-learning model that achieved an AUC of 0.73.
Key findings indicated that HRUs tend to have sparse networks with higher transitivity and lower centrality, suggesting that social interaction patterns could signal heightened suicide risk. These results highlighted the value of incorporating social network analysis into suicide prevention strategies, though further research is necessary to validate these findings across different platforms.
Journal reference:
- Lekkas, D., & Jacobson, N. C. (2024). Breaking the silence: leveraging social interaction data to identify high-risk suicide users online using network analysis and machine learning. Scientific Reports, 14(1). DOI: 10.1038/s41598-024-70282-0, https://www.nature.com/articles/s41598-024-70282-0