In an article recently published in the Journal of Creativity, researchers compared the creativity of humans and generative artificial intelligence (GAI) chatbots. They found no difference in creativity between the two, although humans and chatbots generated ideas in different ways. The study also discusses whether GAIs can be "truly" creative.
Background
The debate over whether AI can match human creativity is ongoing. While AI has excelled in domains such as chess and Go, many still regard creativity as a distinctively human skill. At the same time, the growing integration of GAI into daily life is transforming many sectors: these technologies support automated decision-making, recognize patterns in data, and improve efficiency. As automation takes over routine tasks, humans are freed to tackle more complex, creative challenges.
The argument over whether machines can possess creativity turns on the definition of creativity itself. Widely accepted definitions emphasize the production of novel and useful outputs within a social context and do not invoke inherently human attributes. Machines therefore need not mimic human emotions or behaviors; it is enough that they carry out the relevant cognitive processes and produce something "new and useful." The question is thus not whether GAI is creative, but what significance its creative output carries.
Recent research shows that collaborating with AI tools such as ChatGPT can boost individual creativity and self-efficacy. A key debate, however, is whether AI genuinely exhibits creativity or merely recombines existing knowledge. Creativity encompasses diverse aspects, including problem formulation, idea generation, selection, and implementation; GAIs generate vast textual and visual outputs from prompts, a process akin to human free-associative thinking. The present study assessed the creative capabilities of six GAI chatbots, challenging the notion that humans inherently surpass AI in creativity.
Materials and Methods
The study involved 100 participants (50 women, 50 men) with an average age of 41 years, recruited via Prolific Academic; all were native English speakers from the USA with work experience. Participants provided informed consent and were compensated at US$9 per hour, with an average completion time of 17 minutes.
Six GAI chatbots were included in the study: Alpa.ai, Copy.ai, Studio, ChatGPT (versions 3 and 4), and YouChat. Participants completed an alternate uses task for a set of given everyday objects: they were instructed to generate as many creative uses as possible for each object within three minutes, with the order of the prompts randomized.
Responses from the GAI chatbots were collected in early February 2023 in adherence to ethical guidelines; each chatbot's responses were limited in length, with the option to ask for more ideas. Six human raters, blinded to the origin of each response, evaluated creativity on a scale of 1-5 using the Consensual Assessment Technique (CAT) for both the human and the chatbot responses. Raters received detailed guidelines on originality and creativity and resolved ambiguities on a subset of 100 responses before evaluating all ideas. In addition, a specifically trained language model scored all responses. Fluency scores were calculated as the number of ideas generated, with slight variations arising from differing judgments of which answers were irrelevant.
Thus, six human raters and a trained AI model independently assessed the originality of responses from both humans and GAI chatbots, unaware of each response's source. To compute originality and fluency scores, responses were averaged separately for each creator and prompt. Data analysis was conducted in early March 2023, and the data and R-code for reproducing the analyses are available at the provided link.
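For readers who want a concrete picture of this aggregation step, a minimal sketch in R is given below. It is an illustration of the procedure as described above, not the authors' released R-code; the data frame `ratings` and its column names (creator, prompt, idea_id, rating) are assumptions.

```r
# Minimal sketch of the scoring aggregation, assuming a long-format data
# frame `ratings` with one row per rater-by-idea judgment. All names here
# are illustrative, not taken from the paper's released R-code.
library(dplyr)

# Originality: average the 1-5 rater scores per idea, then average
# ideas within each creator-and-prompt combination.
originality <- ratings %>%
  group_by(creator, prompt, idea_id) %>%
  summarise(idea_score = mean(rating), .groups = "drop") %>%
  group_by(creator, prompt) %>%
  summarise(originality = mean(idea_score), .groups = "drop")

# Fluency: count the distinct ideas each creator produced per prompt.
fluency <- ratings %>%
  distinct(creator, prompt, idea_id) %>%
  count(creator, prompt, name = "fluency")
```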
Study Findings
The findings revealed that, on average, GAI-generated ideas were as original as those generated by humans, particularly in the context of everyday creativity. However, the study emphasized that GAI chatbots rely on human prompts and lack the ability to initiate creative tasks independently.
Only 9.4% of humans were more creative than ChatGPT 4, the best-performing chatbot, across all given prompts. While GAI chatbots can clearly compete with humans in everyday creativity, their performance on more complex tasks will depend on factors such as domain knowledge, creative thinking, emotions, and cultural background. Although these chatbots excel at knowledge-intensive tasks such as coding, they remain limited in emotional responses.
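A figure like this can be read as a simple comparison of participant-level means against the chatbot's mean. The following R sketch shows one plausible way to reconstruct it; the data frame `scores` (one originality value per creator and prompt, plus a creator_type label) is hypothetical and not drawn from the paper's code.

```r
# Illustrative reconstruction of the comparison behind the 9.4% figure,
# assuming `scores` has columns creator, creator_type, prompt, originality.
library(dplyr)

# Each human participant's average originality across prompts.
human_means <- scores %>%
  filter(creator_type == "human") %>%
  group_by(creator) %>%
  summarise(mean_orig = mean(originality), .groups = "drop")

# ChatGPT 4's average originality across the same prompts.
gpt4_mean <- scores %>%
  filter(creator == "ChatGPT 4") %>%
  summarise(mean_orig = mean(originality)) %>%
  pull(mean_orig)

# Share of human participants whose average exceeds ChatGPT 4's;
# per the study, this comes out to roughly 9.4%.
mean(human_means$mean_orig > gpt4_mean)
```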
The extensive knowledge bases of these chatbots enable broader recombination of ideas, which helps them generate human-level creative output. Although the results show that GAI chatbots such as Copy.ai, ChatGPT 3.5 and 4, and YouChat can generate human-like ideas, the level of creative achievement still depends on the individual, suggesting that human and AI competencies will merge in the future to enable augmented creativity.
Conclusions
This study examined the creativity of humans and GAI chatbots in generating ideas for various prompts, using six human raters and six GAI chatbots, including ChatGPT version 4.
While creativity in GAI chatbots is a complex concept, the findings suggest that chatbots can produce original ideas comparable to those of humans. Some argue that human creativity is superior because it draws on real-world experience, emotion, and inspiration; according to this study, however, GAIs meet the common definition of creativity: producing something new and useful.
Overall, the study results suggested that the future of creativity lies in a synergy between humans and AI, where GAI chatbots can assist in idea generation and knowledge recombination, while humans continue to provide the essential problem-solving, motivation, and evaluative aspects of creativity. GAIs are valuable creative assistants, but further research is needed to understand their potential.