In an article recently submitted to the arXiv* preprint server, researchers spotlight the crucial yet often overlooked challenge of aligning Artificial Intelligence (AI) outputs with human expectations in decision support systems. This challenge has spurred the development of Explainable AI (XAI), which advocates a more human-centered perspective in AI development.
*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.
The article emphasizes the importance of determining not only what information AI should provide but also how to present it to users. To address the lack of standardized terminology for human-AI interactions in decision support, the researchers conducted a systematic review, resulting in a taxonomy of interaction patterns. The findings indicate that current interactions tend to be simplistic, underscoring the need to foster more interactive functionality in AI systems.
XAI and Decision-Making Interactivity
The growing field of XAI aims to enhance AI systems by providing not only recommendations but also justifications for those recommendations. While AI advances have opened new possibilities for supporting human decision-making, there remains a need to understand how and when to communicate information effectively. Interactivity between humans and AI is a pivotal component, and it is essential to consider which types and sequences of interaction are most beneficial. Empirical studies have shown that interactions with AI systems are often limited to simple menu selections, such as clicking buttons.
Scope and Search Criteria
This survey focuses primarily on human-AI interaction paradigms in explicit decision-making tasks, distinguishing them from proxy tasks in which users merely simulate AI outputs. The study covers works published between 2013 and June 2023, searched across five major databases: Association for Computing Machinery (ACM), Institute of Electrical and Electronics Engineers (IEEE) Xplore, Compendex, Scopus, and PubMed.
The researchers constructed search terms encompassing AI systems, human-AI interaction, decision-making tasks, and interaction design to pinpoint articles that evaluated interactions during decision-making tasks. Inclusion criteria comprised complete decision-making processes, implemented user interfaces, varied modes of interaction, and empirical user studies. Exclusions covered studies involving robotics or gaming, as well as non-peer-reviewed papers and review papers.
Evaluation Methods in AI-Assisted Decision-Making
In the study, the researchers analyzed the evaluation methods and measures used across different domains of AI-assisted decision-making. These methods assess the effectiveness and user experience of AI support during decision-making tasks, and the reviewed studies commonly employed both objective and subjective measures to evaluate the interaction between humans and AI systems comprehensively.
Objective measures in these evaluations included several essential constructs. Efficacy, for example, focuses on the actual performance of the human and AI working together, with metrics such as decision accuracy, error rates, and other performance-related indicators. Trust and reliance were also assessed objectively, often through measurements like agreement with AI advice, compliance frequency, over-reliance on AI recommendations, and the weighting of AI advice in assistance and delegation scenarios.
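To make these objective constructs concrete, here is a minimal Python sketch showing how such metrics might be computed from a hypothetical trial log. The `Trial` fields, function names, and the weight-of-advice formula for numeric tasks are illustrative assumptions, not measures taken verbatim from the reviewed papers.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Trial:
    """One decision trial in a hypothetical human-AI study log."""
    ai_advice: str      # option the AI recommended
    human_initial: str  # user's choice before seeing the advice
    human_final: str    # user's choice after seeing the advice
    ground_truth: str   # correct answer for the task

def decision_accuracy(trials: List[Trial]) -> float:
    """Efficacy: fraction of final decisions that are correct."""
    return sum(t.human_final == t.ground_truth for t in trials) / len(trials)

def agreement_rate(trials: List[Trial]) -> float:
    """Reliance: how often the final decision matches the AI's advice."""
    return sum(t.human_final == t.ai_advice for t in trials) / len(trials)

def over_reliance_rate(trials: List[Trial]) -> float:
    """Over-reliance: the user initially disagreed with incorrect AI
    advice but switched to it in the final decision."""
    candidates = [t for t in trials
                  if t.ai_advice != t.ground_truth
                  and t.human_initial != t.ai_advice]
    if not candidates:
        return 0.0
    return sum(t.human_final == t.ai_advice for t in candidates) / len(candidates)

def weight_of_advice(initial: float, advice: float, final: float) -> Optional[float]:
    """Weighting of advice for numeric estimation tasks:
    0 means the advice was ignored, 1 means it was fully adopted."""
    if advice == initial:
        return None  # undefined when the advice equals the initial estimate
    return (final - initial) / (advice - initial)
```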
Efficiency was another fundamental construct measured objectively, typically involving decision time, total task completion time, and users' exploration of task functionalities. Researchers also occasionally assessed users' objective understanding of AI systems, examining how well they comprehended AI behavior and reasoning. To capture users' perceptions and experiences during AI-assisted decision-making, researchers complemented these with subjective measures spanning a range of constructs:
Efficacy: Users' perceptions of how well they and the AI performed on the decision-making task.
Trust and Reliance: Subjective trust in and reliance on the AI, often assessed through standardized questionnaires encompassing dimensions such as competence, reliability, trustworthiness, intention to comply, credibility, and uncertainty (a minimal scoring sketch follows this list).
Usability: Subjective evaluations of system usability, considering factors like usefulness, acceptance, satisfaction, potential for implementation, and system complexity.
Decision Satisfaction: Satisfaction with the decisions made, together with assessments of mental demand, cognitive load, task difficulty, frustration, and workload.
Understanding: Users' subjective comprehension of AI systems, especially in cases where explainability components were involved.
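As a hedged illustration of how subjective questionnaire data such as the trust dimensions above are commonly aggregated, the sketch below averages Likert-scale items per dimension, flipping reverse-coded items first. The dimension names, 7-point scale, and reverse-coding convention are assumptions for illustration, not an instrument drawn from the survey.

```python
from statistics import mean
from typing import Dict, List, Optional

def score_questionnaire(
    responses: Dict[str, List[int]],
    reverse_coded: Optional[Dict[str, bool]] = None,
    scale_max: int = 7,
) -> Dict[str, float]:
    """Average each dimension's Likert items, flipping reverse-coded
    items (x -> scale_max + 1 - x) before aggregating."""
    reverse_coded = reverse_coded or {}
    return {
        dim: mean((scale_max + 1 - v) if reverse_coded.get(dim) else v
                  for v in items)
        for dim, items in responses.items()
    }

# Example: per-dimension means for a made-up participant.
answers = {
    "competence": [6, 5, 6],
    "reliability": [5, 5, 4],
    "intention_to_comply": [7, 6, 6],
}
print(score_questionnaire(answers))  # e.g. competence ≈ 5.67
```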
It is worth noting that the construct of fairness had limited prominence among the subjective measures: because the review excluded works centered on human perceptions of AI outcomes, fairness judgments in particular received little coverage. Metrics that did not fit the defined categories were grouped under an "other" construct. Together, this set of evaluation methods and measures provides valuable insight into the performance and user experience of AI-assisted decision-making across various domains.
Survey Limitations and Considerations
The selection criteria focused exclusively on published manuscripts that conducted empirical evaluations of human-AI interactions, potentially excluding relevant works that did not explicitly mention interaction design in their titles, abstracts, or body text. There is also a possibility of publication bias, which might have led to the omission of relevant research.
Additionally, the analysis was limited to screen-based interfaces for AI assistance; embodied AI, which supports different modes of interaction, was not covered. The strict criteria required studies to encompass complete decision-making tasks to ensure authentic human-AI interactions. Given the wide range of experimental designs and variables in the reviewed papers, the researchers abstracted the interactions to identify patterns across diverse studies.
Conclusion
To sum up, this paper offers a systematic review of human-AI interactions within AI-assisted decision-making tasks, culminating in an interaction-pattern taxonomy. The taxonomy provides a structured framework for understanding these interactions, revealing a predominant focus on either AI-driven or human-led decision processes, with limited attention to fostering genuinely interactive functionality. The researchers underscore the significance of interaction design, advocating purposeful choices in system development to enhance collaboration between humans and AI. The taxonomy is a valuable resource for informing the design and development of AI-based decision support systems, aiming to foster more effective, engaging, and user-centered collaborations.