In a paper published in the journal Scientific Reports, researchers used explainable AI techniques to analyze the social media behavior of 1,358 users on VKontakte (VK), the largest Russian online social networking service. They examined 753,252 posts and reposts alongside Big Five personality traits and intelligence assessments to understand how psychological attributes manifested in users' behavior.
The study found that the emotional tone of posts was strongly associated with traits such as extraversion and agreeableness, while social engagement metrics correlated with logical thinking. Users with high neuroticism tended to share provocative content, and religion showed complex links with conscientiousness and agreeableness. The findings shed light on the intricate relationship between social media behavior and psychological traits and support a shift toward behavior-based diagnostic models.
Related Work
In previous research, psychologists aimed to diagnose personality traits by analyzing long-term human behavior. Traditional methods like lab experiments and imaging had limitations. Studies on Facebook showed links between personality traits and online behavior, particularly for extroverts. However, differences between online and offline contexts meant expressions of traits varied.
Advances in big data and machine learning enabled predictive models for personality traits using social media data, some achieving high accuracy. VKontakte analyses similarly found correlations between user activity and personal characteristics. As technology evolves, explainable AI becomes crucial for understanding complex relations and reducing biases in mental health research.
Psychological Trait Prediction Study
The study drew on a sample of 1,358 Russian-speaking users of the VK social network, 46.7% of them male, with an average age of 31.1 years. These users underwent psychological testing and consented to provide access to their VK profiles, resulting in a dataset comprising 753,252 posts and reposts. Data were collected through a web service in line with the ethical standards of the Russian Psychological Society and the Declaration of Helsinki, with approval from the Ethics Committee of the Institute of Psychology of the Russian Academy of Sciences.
Researchers conducted psychological diagnostics using the Big Five model, which measures openness to experience, conscientiousness, extraversion, agreeableness, and neuroticism. They additionally administered crystallized and fluid intelligence tests, assessing users' verbal abilities and reasoning skills. VK's API allowed access to profile characteristics such as friend count and subscribers, and to engagement metrics such as likes and comments. Researchers employed natural language processing models to evaluate text characteristics, including the sentiment, emotional tone, and topics of posts.
Feature generation involved aggregating estimates for sentiment, emotional tone, and thematic content of posts and reposts alongside activity indicators like post frequency and word count. Researchers selected profiles based on criteria ensuring sufficient activity for analysis. They employed feature selection techniques to identify relevant characteristics, followed by data augmentation to enhance model generalization and mitigate overfitting. Researchers formed separate datasets for training, testing, and control purposes.
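The aggregation step described above can be sketched roughly as follows. This is a minimal illustration, not the study's pipeline: the record layout, the `min_posts` activity threshold, and the feature names are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-post records: (user_id, sentiment score, word count).
posts = [
    ("u1", 0.8, 12),
    ("u1", -0.2, 30),
    ("u2", 0.1, 5),
    ("u2", 0.3, 7),
    ("u2", 0.5, 9),
    ("u3", 0.9, 40),
]

def aggregate_user_features(posts, min_posts=2):
    """Collapse post-level estimates into one feature dict per user."""
    by_user = defaultdict(list)
    for user_id, sentiment, n_words in posts:
        by_user[user_id].append((sentiment, n_words))

    features = {}
    for user_id, rows in by_user.items():
        # Assumed selection rule: skip profiles with too little activity.
        if len(rows) < min_posts:
            continue
        features[user_id] = {
            "post_count": len(rows),
            "mean_sentiment": mean(s for s, _ in rows),
            "mean_word_count": mean(w for _, w in rows),
        }
    return features

user_features = aggregate_user_features(posts)
```

Here the user `u3` is dropped because a single post falls below the assumed activity threshold, while `u1` and `u2` each receive one row of aggregated features.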
Modeling involved constructing neural network models for predicting psychological traits, with input vectors representing user activity features. The architecture consisted of fully connected linear layers with activation functions, and training used stochastic gradient descent (SGD) optimization. Researchers evaluated learning outcomes using the coefficient of determination (R²) and mean squared error (MSE).
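For readers unfamiliar with the two evaluation metrics, the sketch below computes them from their standard definitions. The sample values are invented for illustration and are not results from the study.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual variance / total variance."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Invented trait scores and model predictions for four users.
y_true = [3.0, 4.0, 5.0, 6.0]
y_pred = [2.5, 4.5, 5.0, 6.0]

mse = mean_squared_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
```

An R² of 1 means the predictions match the test scores exactly, while an R² of 0 means the model does no better than always predicting the mean score.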
Researchers used explainable AI methods to interpret model predictions, employing gradient-based algorithms to determine feature importance. Tools such as Captum and SHapley Additive exPlanations (SHAP) provided insights into how features influenced predictions, enabling analysis across user profiles and trait scales. Combining these methods gave a fuller picture of model behavior and feature significance.
Analyzing Feature Influence in Models
The study examined in depth how various user profile characteristics influenced the neural network models predicting personality traits and intelligence levels. Using explainable artificial intelligence methods, particularly the integrated gradients method, researchers assessed the significance of features across different categories, including activity, frequency, text characteristics, sentiment, emotional evaluation, topic, and personal characteristics.
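The idea behind integrated gradients can be sketched in a few lines. The example below is an illustration under stated assumptions, not the study's implementation: `model` is a toy stand-in for a trained network, the weights and inputs are invented, and gradients are approximated by finite differences rather than automatic differentiation.

```python
import math

def model(x):
    """Toy stand-in for a trained network: tanh of a weighted sum."""
    weights = [0.5, -1.0, 2.0]  # invented weights
    return math.tanh(sum(w * v for w, v in zip(weights, x)))

def numerical_gradient(f, x, eps=1e-5):
    """Central finite-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def integrated_gradients(f, x, baseline, steps=200):
    """Average gradients along the straight path from baseline to x,
    then scale by the input difference (midpoint Riemann sum)."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (v - b) for v, b in zip(x, baseline)]
        grad = numerical_gradient(f, point)
        total = [t + g for t, g in zip(total, grad)]
    return [(v - b) * t / steps for v, b, t in zip(x, baseline, total)]

x = [1.0, 0.5, 0.25]         # invented feature vector for one user
baseline = [0.0, 0.0, 0.0]   # assumed "no activity" reference input
attributions = integrated_gradients(model, x, baseline)
```

A useful sanity check is the completeness property: the attributions should sum, up to approximation error, to `model(x) - model(baseline)`. The sign of each attribution shows whether that feature pushed the prediction up or down relative to the baseline.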
In this analysis, features related to user activity in social networks, the emotional content of posts, and post topics emerged as the most influential across all models predicting Big Five traits, verbal intelligence, and logical thinking. These findings underscore the importance of considering diverse user behaviors and content when interpreting the models' predictions.
Further exploration using the Partial Dependence Plot method provided insights into the direction and magnitude of each feature category's influence on predictions. The results revealed nuanced relationships between feature categories and psychological traits or cognitive abilities. For instance, relationships between activity features and traits such as extraversion and conscientiousness highlighted distinct behavioral patterns among users. Similarly, sentiment and emotional evaluation features showed multifaceted associations with traits like extraversion and neuroticism, reflecting varied emotional expressions and responses within social network interactions.
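A partial dependence curve can be computed with a simple loop: fix one feature at each grid value for every user, predict, and average. The sketch below assumes a hypothetical two-feature model (`posts_per_week`, `mean_sentiment`) with invented coefficients; it illustrates the method only and does not reproduce the study's models.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """For each grid value, overwrite one feature in every row and
    average the model's predictions over the whole dataset."""
    curve = []
    for value in grid:
        preds = []
        for row in rows:
            modified = list(row)
            modified[feature_index] = value
            preds.append(predict(modified))
        curve.append(sum(preds) / len(preds))
    return curve

def predict(row):
    """Hypothetical trait model over [posts_per_week, mean_sentiment]."""
    return 0.25 * row[0] + 2.0 * row[1]

rows = [[1.0, 0.0], [3.0, 0.5], [5.0, 1.0]]  # invented user features
grid = [0.0, 2.0, 4.0]                       # values for posts_per_week

pd_curve = partial_dependence(predict, rows, feature_index=0, grid=grid)
```

A rising curve indicates that higher values of the feature increase the predicted trait score on average, and the slope indicates how strongly.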
An examination of individual cases using Shapley values shed light on how stable the trait assessments were within the neural network models across different trait levels. Clustering the Shapley values showed how consistently features affected predictions and revealed patterns that aligned with trait levels. While traits like extraversion and conscientiousness displayed merged clusters at higher trait levels, models assessing verbal intelligence showed less distinct clustering, possibly because verbal abilities are harder to predict. Nonetheless, the clustering results reflected the models' accuracy and highlighted the key features driving predictions.
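To make the notion of a Shapley value concrete, the sketch below computes exact values for a single prediction by enumerating all feature coalitions. This brute-force approach is an assumption chosen for clarity and only scales to a handful of features; libraries such as SHAP rely on approximations. The model, inputs, and baseline are invented, and the clustering step is not shown.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values: features outside a coalition are replaced
    by their baseline value before the model is evaluated."""
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Weight of coalitions of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis

def predict(row):
    """Hypothetical model with one additive term and one interaction."""
    return 2.0 * row[0] + row[1] * row[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(predict, x, baseline)
```

In this toy case the additive feature receives its full contribution, and the two interacting features split their joint contribution equally. The values always sum to the difference between the prediction for `x` and the prediction for the baseline.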
Conclusion
In summary, the study elucidated how user profile characteristics influenced neural network predictions of personality traits and intelligence levels. By employing explainable AI methods, researchers uncovered the pivotal role of features like user activity and emotional content in shaping predictive outcomes.
Analyzing individual cases using Shapley values provided further insights into trait stability within the models. Overall, this research enhanced the understanding of predictive mechanisms and offered valuable insights for refining neural network models in psychological research and beyond.