In a paper published in the journal Scientific Reports, researchers investigated people's decision-making processes regarding protective actions during earthquakes using explainable machine learning and video data analysis. They identified and annotated environmental changes and behavioral responses by analyzing real-world closed-circuit television (CCTV) footage and social media videos from the 2018 Anchorage earthquake.
They applied extreme gradient boosting (XGBoost) machine learning to model and forecast individuals' protective actions, such as dropping, covering, or evacuating. Using explainable techniques, they unveiled the complex relationships between factors like shaking intensity and the presence of people, shedding light on effective strategies for emergency managers and policymakers to enhance earthquake preparedness and response efforts.
Related Work
Previous research has extensively explored individuals' decision-making during earthquakes, drawing on the protective action decision model (PADM) to examine factors such as environmental and social cues. However, significant gaps persist in both modeling frameworks and data utilization. Past studies often relied on survey data and statistical models with linear structures, while the potential of machine learning and empirical video data from sources like CCTV footage and social media remains largely unexplored.
Earthquake Response Analysis
The methodological framework presented here outlines a comprehensive approach for extracting behavioral insights into earthquake protective-action decision-making. First, researchers collected earthquake-related CCTV footage and social media videos from multiple sources. Annotators then annotated these data using the European Distributed Corpora Project (EUDICO) Linguistic Annotator (ELAN) software, identifying critical environmental and social cues and tracking individuals' behavioral states over time.
The annotated videos and data were made publicly available on an Open Science Framework (OSF) repository and transformed into numeric variables for analysis. The XGBoost model, a tree-structured machine learning approach, was then applied to model protective action decision-making, with predictive performance evaluated using several metrics. Researchers interpreted the model using variable importance and partial dependence plots to glean insights into the effects of different factors on decision-making.
The XGBoost model, renowned for its boosting technique, sequentially builds multiple decision trees to correct errors made by previous trees, ensuring enhanced predictive performance. Researchers employed two state-of-the-art explanation tools to interpret the model: variable importance and partial dependence plots. Variable importance provides insight into the relative contribution of each variable to predictive power, while partial dependence plots reveal the direction of associations between predictors and the target variable.
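The error-correcting idea behind boosting can be illustrated with a deliberately small sketch. The Python below is a toy, not XGBoost and not the study's model. It fits one-split regression stumps to the residuals of the running prediction under squared loss. The data are made up: `xs` stands in for a shaking-intensity score and `ys` for whether a protective action was taken. All names, values, and settings (`n_trees`, `learning_rate`) are assumptions for illustration.

```python
# Toy gradient boosting with one-split regression stumps (squared loss).
# Illustrative only: this is not XGBoost and not the study's model.

def fit_stump(xs, residuals):
    """Find the single threshold that best splits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, n_trees=20, learning_rate=0.5):
    """Each new stump is fitted to the errors left by the stumps before it."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps, errors = [], []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for x, p in zip(xs, preds)]
        errors.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return base, stumps, errors


def predict_boosted(base, stumps, x, learning_rate=0.5):
    pred = base
    for stump in stumps:
        pred += learning_rate * predict_stump(stump, x)
    return pred


# Made-up data: hypothetical shaking score -> protective action taken (1) or not (0).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 1, 0, 1, 1]
base, stumps, errors = fit_boosted(xs, ys)
print("training MSE after first and last tree:", errors[0], errors[-1])
```

The training error never increases from one tree to the next, which is the sequential correction described above. XGBoost builds on the same principle and adds regularization, second-order gradient information, and multi-class objectives that this sketch omits.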
The model training process involved splitting the dataset into training and testing sets, with hyperparameters tuned using grid search and cross-validation. Researchers utilized performance metrics such as accuracy, recall, precision, and F1 score to assess predictive performance. Additionally, the study compared the performance of the XGBoost model with a traditional statistical model, the multinomial logit (MNL) model, to gauge the effectiveness of the machine learning approach.
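For readers unfamiliar with these metrics, the following sketch shows how accuracy and per-class precision, recall, and F1 score can be computed for a multi-class problem. The labels and predictions are invented for illustration and are not the study's data. The function name `evaluate` is likewise an assumption.

```python
# Accuracy plus per-class precision, recall, and F1 for a multi-class task.
# The labels below are invented examples, not results from the study.

def evaluate(y_true, y_pred):
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    per_class = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, per_class


y_true = ["evacuate", "evacuate", "drop", "cover", "hold_on", "cover"]
y_pred = ["evacuate", "drop", "drop", "cover", "cover", "cover"]

accuracy, per_class = evaluate(y_true, y_pred)
print(f"accuracy = {accuracy:.3f}")  # 4 of 6 correct -> 0.667
for label, scores in per_class.items():
    print(label, scores)
```

In this invented example the rare `hold_on` class is never predicted, so its recall and F1 are zero even though overall accuracy looks moderate. That is why per-class metrics are usually reported alongside accuracy when classes are unevenly represented.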
Data collection focused on earthquake-related videos, particularly those depicting human behavior during earthquake shaking, obtained from social media platforms. A multi-phase annotation procedure was employed using ELAN software, with annotations encompassing various aspects such as behavioral states, shaking intensity, presence of alarms, and environmental conditions. The annotated data were then recoded into numeric variables, allowing for detailed analysis. Despite encountering class imbalance issues, particularly in the distribution of protective action behavioral states, the dataset provided valuable insights into earthquake protective action decision-making.
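The recoding step can be pictured with a small sketch. The records below are hand-written stand-ins for time-aligned annotations. The tier names, value labels, numeric codes, and one-row-per-second layout are all assumptions made for illustration, and do not reflect the study's actual coding scheme or ELAN's export format. The inverse-frequency class weights at the end are one common way to handle imbalance, not necessarily the approach the researchers took.

```python
from collections import Counter

# Hypothetical, hand-written annotation records (times in seconds).
annotations = [
    {"tier": "shaking", "start": 0, "end": 4, "value": "weak"},
    {"tier": "shaking", "start": 4, "end": 10, "value": "strong"},
    {"tier": "alarm", "start": 6, "end": 10, "value": "on"},
    {"tier": "behavior", "start": 0, "end": 5, "value": "no_action"},
    {"tier": "behavior", "start": 5, "end": 8, "value": "cover"},
    {"tier": "behavior", "start": 8, "end": 10, "value": "evacuate"},
]

SHAKING_CODES = {"none": 0, "weak": 1, "strong": 2}  # assumed ordinal coding


def value_at(tier, second):
    """Return the annotation value active on a tier at a given second."""
    for a in annotations:
        if a["tier"] == tier and a["start"] <= second < a["end"]:
            return a["value"]
    return None


def to_rows(duration):
    """Recode the annotations into one numeric row per second."""
    rows = []
    for second in range(duration):
        rows.append({
            "time": second,
            "shaking": SHAKING_CODES[value_at("shaking", second) or "none"],
            "alarm": 1 if value_at("alarm", second) == "on" else 0,
            "behavior": value_at("behavior", second),
        })
    return rows


rows = to_rows(10)

# Count how often each behavioral state occurs to expose class imbalance.
class_counts = Counter(row["behavior"] for row in rows)

# Inverse-frequency weights: rarer classes receive larger weights.
class_weights = {
    label: len(rows) / (len(class_counts) * count)
    for label, count in class_counts.items()
}
print(class_counts)
print(class_weights)
```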
Predictive Performance Analysis
Researchers compared the predictive performance of the machine learning model with that of the traditional statistical model. XGBoost achieved an overall prediction accuracy of 95%, well above the 66.9% achieved by MNL. XGBoost was highly accurate in classifying protective action categories, particularly evacuation, drop, and cover, while MNL showed less predictive power, especially for hold-on.
The variable importance analysis illustrates the predictive power of each variable in classifying protective action decision-making. Social cues, such as the presence of a leader, emerge as highly influential factors, along with environmental cues like the number of people and whether it is a public setting. Time-related variables also demonstrate substantial predictive importance, underscoring the dynamic nature of decision-making during earthquake events.
Partial dependence plots reveal nuanced associations between critical variables and protective action decision-making. Drop, cover, and hold-on actions correlate with shaking intensity, while a threshold effect manifests in the relationship between the number of people and protective actions.
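A partial dependence curve is computed by fixing one variable at each value on a grid for every observation, then averaging the model's predictions. The sketch below shows that calculation with a hypothetical stand-in model, `toy_model`. Its built-in jump at five people is arbitrary and was chosen only so that a threshold shows up in the curve. It does not reproduce the threshold or direction found in the study, and the variable names are assumptions.

```python
# Partial dependence by brute force: overwrite one feature with each grid
# value for every row, then average the model's predictions.

def partial_dependence(predict, rows, feature, grid):
    curve = []
    for value in grid:
        modified = [dict(row, **{feature: value}) for row in rows]
        curve.append(sum(predict(r) for r in modified) / len(modified))
    return curve


def toy_model(row):
    """Hypothetical stand-in for a fitted model's predicted probability."""
    probability = 0.1 + 0.2 * row["shaking"]
    if row["n_people"] >= 5:  # arbitrary threshold, for illustration only
        probability += 0.3
    return min(probability, 1.0)


# Invented observations; "shaking" and "n_people" are assumed variable names.
rows = [
    {"shaking": 0, "n_people": 1},
    {"shaking": 1, "n_people": 3},
    {"shaking": 2, "n_people": 8},
    {"shaking": 1, "n_people": 6},
]

grid = [1, 3, 5, 7]
pd_people = partial_dependence(toy_model, rows, "n_people", grid)
for value, average in zip(grid, pd_people):
    print(f"n_people = {value}: average prediction = {average:.2f}")
```

The curve stays flat below the threshold and steps up at it, which is the kind of nonlinear pattern a linear model would smooth over. Because the other variables keep their observed values, the curve reflects an average effect rather than the effect for any single person.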
Decision-makers who act as leaders tend to guide others rather than take protective actions themselves, while distance from egress influences the likelihood of undertaking different protective actions. The results highlight the effectiveness of XGBoost in predicting protective action decision-making during earthquakes and emphasize the importance of both social and environmental cues. Additionally, nonlinear relationships between variables underscore the complexity of emergency decision-making processes, providing valuable insights for disaster preparedness and response efforts.
Conclusion
To sum up, this study used CCTV footage and social media videos from the 2018 Anchorage earthquake to investigate the efficacy of machine learning in modeling protective action decision-making. Compared to a traditional statistical model, the XGBoost model demonstrated superior predictive accuracy and provided insights into complex behavioral dynamics.
While offering promising avenues for enhancing emergency response strategies, the study acknowledges limitations in sample size and data granularity, suggesting opportunities for future research to explore multi-source data integration for a more comprehensive understanding of protective action decision-making processes.