In an article recently published in the journal Scientific Reports, researchers proposed Rainbow, an expandable voice user interface (VUI), and investigated the feasibility of using it as a lab assistant in scientific laboratories.
Background
Process automation, better communication, and intuitive control of software and hardware components are crucial aspects of a modern scientific laboratory, and they have increased the importance of VUIs.
In laboratories, employees document experiments and interact with laboratory equipment and software while simultaneously keeping track of incubation times and protocols. Voice assistants can perform these tasks through voice commands, which allows employees to continue their work without interruption.
However, conventional voice assistants, such as Siri and Alexa, are unsuitable for scientific laboratories, as they do not reliably recognize the specialized vocabulary of the various scientific fields and are not programmed to execute specific laboratory commands.
Thus, commercial voice assistants must be adapted before their adoption in scientific laboratories. Additionally, the diversity of employees working in a laboratory is another significant challenge, as the small target population for VUIs makes the differences in accents and dialects more noticeable.
Developing a single system that can reliably understand every laboratory-specific word in all dialects is difficult. Thus, several aspects, including costs, users’ dialects and accents, data privacy, and training requirements, must be considered while implementing voice control in laboratories.
The Rainbow VUI
In this study, researchers proposed Rainbow, an open-source VUI for scientific laboratories that can be adapted to any Windows personal computer with Internet access. The objective of the study was to investigate the feasibility of using free software components to develop a suitable voice control assistant for scientific laboratories.
Researchers used Google Translate Site (GTS) and the scripting language AutoIt to develop Rainbow. GTS served as the voice input and output system for communicating with the user, while the AutoIt scripts controlled GTS, executed all actions, and formed the VUI itself. The open-source voice assistant can perform device-specific, Microsoft Windows-based, and scientific/lab-specific tasks through voice commands.
Rainbow was developed with AutoIt version 3.3.14.5, the corresponding AutoIt Script Editor version 4.4.6, and AutoIt v3 Window Information, which displays the window details needed for scripting. Supporting materials, including the AutoIt Forum and AutoIt Help v3.3.14.5, were used for scripting.
AutoIt is compatible with Windows versions from Windows XP up to Windows 11, and researchers used a desktop computer running Windows 10 in this study. The key functions of AutoIt within the VUI were voice input processing, menu navigation, and execution of the corresponding actions.
GTS was set to English in the Google Chrome browser, and a Bluetooth-paired Logitech Mono H820e Wireless headset was used for command input and for feedback from the VUI. Although Rainbow allowed every program installed on the desktop to be run by voice command, researchers used only Microsoft Edge, Microsoft Editor, Microsoft Excel, and Microsoft Word. SoftMax Pro Software version 7.0.3 was integrated into the VUI as the analysis software for controlling the SpectraMax iD5 microplate reader.
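The three functions named above (voice input processing, menu navigation, and execution of actions) can be illustrated with a minimal sketch. It is written in Python rather than AutoIt and is not the authors' code; the menu names, command phrases, and feedback strings are all hypothetical and chosen only for illustration.

```python
# Hypothetical menu tree (illustrative names, not Rainbow's actual menus).
# A string value means "switch to that menu"; a callable means "run this action".
MENU = {
    "main": {"timer": "timer_menu", "office": "office_menu"},
    "timer_menu": {"start timer": lambda: "timer started", "back": "main"},
    "office_menu": {"open excel": lambda: "excel opened", "back": "main"},
}


def handle(menu, transcript):
    """Process one recognized phrase and return (next_menu, feedback)."""
    # Voice input processing: normalize case and whitespace of the transcript.
    command = " ".join(transcript.lower().split())
    target = MENU[menu].get(command)
    if target is None:
        # Unknown phrase in this menu: stay where we are and report it.
        return menu, f"command not recognized: {command}"
    if callable(target):
        # Execution: run the action and stay in the current menu.
        return menu, target()
    # Menu navigation: move to the named submenu.
    return target, f"entered {target}"
```

In this sketch a phrase is only valid inside the menu that lists it, so "open excel" spoken in the main menu is rejected until the user has said "office".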
Experimental Evaluation
A total of 38 volunteers (14 male and 24 female) from the scientific field who were unfamiliar with Rainbow were invited to evaluate the VUI for reliability and accuracy. Among them, 18 volunteers were native German speakers with a Swabian dialect, 13 had standard German pronunciation, and the rest had Turkish, Russian, Hungarian, or Spanish pronunciation.
The experiment was performed in a bioanalytical laboratory with background noise representing typical laboratory conditions. Every volunteer wore an FFP2 mask during voice input due to pandemic conditions. All volunteers were asked to perform the test once with GTS alone and once with Rainbow to investigate whether Rainbow improves on the recognition accuracy of GTS alone.
Researchers prepared two sets of commands, one with 35 and one with 36 commands, to cover all Rainbow commands, and asked each volunteer to perform one of the sets. All volunteers wore the headset during the accuracy experiments. Each volunteer first spoke the assigned set of commands using Rainbow and then repeated the process with GTS.
All correctly and incorrectly recognized commands during the accuracy tests were documented in a Microsoft Excel 2016 sheet that listed every command per user together with its outcome. Moreover, a graphical user interface (GUI) was developed that allowed users to associate incorrectly recognized commands with the correct command, so that every user could improve Rainbow's accuracy. The GUI acted as an add-on to Rainbow and was scripted using AutoIt.
The GUI was evaluated in an experiment with one employee. Initially, the employee determined the personal Rainbow accuracy by testing both sets of commands, then spoke both sets of commands three times using GTS and noted the misrecognized commands. Subsequently, the employee recorded the misrecognized commands in Rainbow using the GUI and assessed Rainbow again by repeating both sets of commands three times.
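The idea behind the GUI, associating a misrecognized phrase with the command the user actually meant, can be sketched as a simple lookup table. The following Python sketch is an assumption about how such a correction layer could work, not the authors' AutoIt implementation; the class name, method names, and example phrases are hypothetical.

```python
class CommandCorrector:
    """Maps misrecognized transcripts to known commands (illustrative sketch)."""

    def __init__(self, known_commands):
        # Canonical commands the assistant can execute.
        self.known = {self._norm(c) for c in known_commands}
        # User-taught corrections: misrecognized phrase -> canonical command.
        self.aliases = {}

    @staticmethod
    def _norm(text):
        # Lowercase and collapse whitespace so lookups are insensitive to both.
        return " ".join(text.lower().split())

    def add_alias(self, misrecognized, correct):
        """Record that `misrecognized` should be treated as `correct`."""
        correct = self._norm(correct)
        if correct not in self.known:
            raise ValueError(f"unknown command: {correct}")
        self.aliases[self._norm(misrecognized)] = correct

    def resolve(self, transcript):
        """Return the canonical command, or None if nothing matches."""
        phrase = self._norm(transcript)
        if phrase in self.known:
            return phrase
        return self.aliases.get(phrase)
```

A short usage example, again with made-up phrases:

```python
corrector = CommandCorrector(["open excel", "start timer"])
corrector.add_alias("open axel", "open excel")  # phrase the recognizer produced
corrector.resolve("Open Axel")                  # resolves to "open excel"
```

Because each correction is stored per phrase, a table like this can be filled by individual users to match their own accent or dialect without retraining the underlying speech recognition.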
Significance of the Study
The proposed voice assistant displayed significant flexibility, which allowed its task capabilities to be expanded as required without increasing data security risks. Rainbow was successfully extended with additional features, for instance by integrating the existing AutoIt Timer. Researchers also demonstrated that third-party software such as SoftMax Pro could be effectively integrated into Rainbow to enable voice control of analysis devices.
Overall, Rainbow achieved 91.3% speech recognition accuracy, correctly identifying 1232 of 1349 speech commands, while GTS alone achieved 85.1%, correctly recognizing 1148 of the 1349 commands. This indicates that using Rainbow improves recognition accuracy compared with using GTS alone.
In the GUI evaluation experiment, the personal Rainbow accuracy initially achieved by the employee was 95.8% across both sets of commands, with three of the 71 commands misrecognized. After the employee used the GUI to enter the three misrecognized commands from the personal Rainbow accuracy test, together with nine misrecognized terms identified in the GTS runs, the accuracy rose to 98.6%.
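The accuracy percentages reported above follow directly from the stated command counts. The short Python sketch below reproduces them; only counts given in the text are used, and the function and variable names are illustrative.

```python
def accuracy_percent(correct, total):
    """Share of correctly recognized commands, in percent, to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * correct / total, 1)


# Counts taken from the text: (correctly recognized, total commands).
results = {
    "Rainbow, all volunteers": (1232, 1349),
    "GTS alone, all volunteers": (1148, 1349),
    "Rainbow, employee before GUI corrections": (71 - 3, 71),
}

for label, (correct, total) in results.items():
    print(f"{label}: {accuracy_percent(correct, total)}%")
```

The difference between the two overall results is 84 commands (1232 minus 1148), or about 6.2 percentage points.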
Journal reference:
- Avila Vazquez, M. F., Rupp, N., Ballardt, L., Opara, J., Zuchner, T. (2023). An expandable voice user interface as a lab assistant based on an improved version of Google’s speech recognition. Scientific Reports, 13(1), 1-9. https://doi.org/10.1038/s41598-023-46185-x, https://www.nature.com/articles/s41598-023-46185-x