In an article published in the journal Scientific Reports, researchers from the USA compared the responses of human participants and the Chat Generative Pre-trained Transformer (ChatGPT) on three verbal divergent thinking tasks. They highlighted that the artificial intelligence (AI) generative language model outperformed humans on various tasks measuring divergent thinking, a key aspect of creativity.
Background
AI is a branch of computer science that aims to create machines or systems capable of performing tasks that normally require human intelligence, such as reasoning, learning, and problem-solving. One of its applications is natural language processing (NLP), which enables machines to understand and generate natural language, such as text or speech. NLP models can be used for various purposes, such as translation, summarization, sentiment analysis, and dialogue generation.
OpenAI's ChatGPT is one of the most advanced AI language models specializing in pattern recognition and prediction. Trained on extensive internet text data, it produces coherent and fluent text in various styles and tones. Additionally, it employs reinforcement learning from human feedback (RLHF) to enhance the human-like quality of its responses.
About the Research
In the present paper, the authors aimed to compare the creative potential of humans and ChatGPT on three verbal divergent thinking tasks: the Alternative Uses Task (AUT), the Consequences Task (CT), and the Divergent Associations Task (DAT). Divergent thinking is the ability to generate multiple creative solutions to the same problem; it allows for flexibility and out-of-the-box thinking. It is often considered an indicator of a person's creative ability but not a guarantee of creative achievement. Divergent thinking can be assessed by tasks that require participants to produce as many responses as possible to a given prompt. The responses are typically scored for originality (uniqueness of responses), fluency (number of responses), and elaboration (length/detail of responses).
The researchers recruited 151 human participants via an online platform and matched them with 151 instances of ChatGPT. The participants and ChatGPT were given the same instructions and prompt for each task, and their responses were scored using automated tools that measure semantic distance and word count.
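To make this kind of scoring concrete, the sketch below illustrates one common way such automated measures are computed: originality is approximated as the cosine distance between vector embeddings of the prompt and the response, and elaboration as a simple word count. This is a minimal illustration, not the study's actual tooling; the `embed` helper is a hypothetical placeholder standing in for a pretrained embedding model.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical placeholder: return a vector for `text`.
    A real scoring tool would use a pretrained word or sentence embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(300)

def semantic_distance(prompt: str, response: str) -> float:
    """Originality proxy: cosine distance between prompt and response vectors."""
    a, b = embed(prompt), embed(response)
    cosine_similarity = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return 1.0 - cosine_similarity

def elaboration(response: str) -> int:
    """Elaboration proxy: number of words in the response."""
    return len(response.split())

print(semantic_distance("fork", "comb for untangling spaghetti"))
print(elaboration("comb for untangling spaghetti"))
```

Higher semantic distance indicates a response that is further from the prompt in meaning, which is treated as a proxy for novelty; longer responses receive higher elaboration scores.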
The AUT asked the participants and ChatGPT to produce as many creative uses as possible for two common objects: a fork and a rope. The CT asked them to think of as many creative consequences as possible for two hypothetical scenarios: humans no longer needing sleep and humans walking with their hands. Finally, the DAT asked them to produce 10 nouns that were as different from each other as possible.
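Because the DAT has no single prompt to compare against, scores of this kind are typically derived from how far the submitted nouns are from one another. The hedged sketch below, reusing the hypothetical `semantic_distance` helper from the earlier example, shows that aggregation as an average pairwise distance.

```python
from itertools import combinations

def dat_score(nouns: list[str]) -> float:
    """Average pairwise semantic distance across all noun pairs
    (higher = the words are more semantically spread out).
    Assumes the semantic_distance helper defined in the sketch above."""
    distances = [semantic_distance(a, b) for a, b in combinations(nouns, 2)]
    return sum(distances) / len(distances)

print(dat_score(["cat", "algebra", "volcano", "mercy", "spoon",
                 "tundra", "jazz", "neutron", "harvest", "umbrella"]))
```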
Research Findings
The results showed that ChatGPT exhibited greater originality than humans across all divergent thinking tasks, even after controlling for response fluency. Specifically, it achieved higher semantic distance scores, reflecting the novelty or uniqueness of responses, and higher word count scores, indicating greater response length or detail, than humans on both the AUT and the CT. It also obtained higher semantic distance scores than humans on the DAT, despite humans producing a greater number of single-occurrence words, which highlights the diversity of their responses.
Furthermore, the authors observed that the type of prompt influenced the originality of responses for both humans and ChatGPT. In the AUT, both groups demonstrated higher originality scores for the fork than the rope, although ChatGPT consistently scored higher in originality regardless of the object. In the CT, ChatGPT exhibited higher originality scores for the scenario involving humans walking with their hands compared to the scenario of humans no longer needing sleep, while humans showed similar originality scores for both scenarios.
Applications
The paper has implications for understanding and assessing creativity in humans and AI. The outcomes suggest that current AI language models can demonstrate higher creative potential than human respondents on verbal divergent thinking tasks, which challenges the assumption that creativity is a uniquely human trait. The results also suggest that AI models can generate novel and diverse ideas that could inspire or assist humans in various domains requiring creativity, such as art, science, medicine, and education.
The authors highlighted the need for developing and applying appropriate methods and tools to measure and compare the creativity of humans and AI. They used automated tools relying on semantic distance and word count to score the responses, which may not capture the full complexity and quality of creativity. Moreover, they focused on one aspect of divergent thinking, originality, but did not consider other aspects, such as usefulness or appropriateness, which are also important for evaluating creativity.
Conclusion
In summary, new-age technologies such as ChatGPT can respond effectively to verbal divergent thinking tasks, outperforming humans across the measures used. The authors showed that a state-of-the-art AI language model generated more original and more elaborate responses than human participants on each of the tasks, even when controlling for the number of responses. Moreover, they showed that the type of prompt influences the originality of responses for both humans and AI. The researchers acknowledged limitations and challenges and suggested that future research should use more comprehensive and multidimensional measures of creativity that can account for both human and AI perspectives.