Discover how cutting-edge NLP models are transforming education and healthcare by personalizing learning and enhancing patient care, all while reshaping how we interact with language and technology.
Shaikh Arifuzzaman is an assistant professor of computer science in UNLV's College of Engineering. His research focuses on AI/machine-learning models and scalable high-performance computing techniques. Credit: Josh Hawkins/UNLV
It's hardly hyperbole to say that ChatGPT changed technology overnight when it was released in November 2022. Since then, generative artificial intelligence (AI) models have sprouted everywhere, creating scripts, photos, and whole podcast episodes. Some are even diagnosing diseases.
While the AI revolution seems promising to some, others fear it may eliminate jobs. Shaikh Arifuzzaman, an assistant professor of computer science in UNLV's College of Engineering, sides with the optimists. He maintains that we can harness the power of AI for good.
"I know many of us are a bit reluctant because this is such disruptive technology," said Arifuzzaman. "But humans can adapt. AI is here to help us. It will make many things easier. It'll close gaps."
Arifuzzaman's research revolves around machine learning (ML) models and scalable, high-performance computing techniques, with particular emphasis on the applications of natural language processing (NLP) and data science. He leverages his expertise to provide a primer on how NLP technology works and where he predicts AI will take us.
How do NLP models like ChatGPT process words?
When an NLP model like ChatGPT "reads" text, it first breaks the passage into smaller units called tokens. Tokens can be whole words, parts of words, or even just characters. Each token is then represented as a vector, which is a high-dimensional array of numbers. These vectors are like points in a multidimensional space, capturing the token's meaning and context.
For example, in the largest GPT-3 model, part of the model family behind the first widely available version of ChatGPT, each token is represented by a vector with 12,288 dimensions. Together, these dimensions encode nuanced relationships between tokens learned from the model's training data.
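The idea of splitting text into tokens and looking up a vector for each one can be sketched in a few lines of Python. This is only a toy illustration: the vocabulary, the greedy longest-match rule, and the hand-picked four-dimensional vectors are all assumptions made for the example, and a real model learns its subword vocabulary and its much larger vectors from data.

```python
import math

# Hypothetical toy vocabulary; real tokenizers learn tens of thousands of pieces.
VOCAB = ["un", "break", "able", "the", "cat", "dog", " "]

# Hypothetical 4-dimensional vectors, hand-picked for illustration only.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0, 0.3],
    "dog": [0.8, 0.2, 0.1, 0.3],
    "the": [0.0, 0.9, 0.1, 0.0],
}


def tokenize(text):
    """Split text by greedy longest match against VOCAB.

    Characters that match no vocabulary piece become single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        match = None
        for piece in sorted(VOCAB, key=len, reverse=True):
            if text.startswith(piece, i):
                match = piece
                break
        if match is None:
            match = text[i]
        tokens.append(match)
        i += len(match)
    return tokens


def cosine(a, b):
    """Cosine similarity: close to 1.0 when two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


print(tokenize("unbreakable"))  # ['un', 'break', 'able']
print(cosine(EMBEDDINGS["cat"], EMBEDDINGS["dog"]))  # similar meanings, higher score
print(cosine(EMBEDDINGS["cat"], EMBEDDINGS["the"]))  # unrelated words, lower score
```

In this toy space, "cat" and "dog" sit close together while "cat" and "the" do not, which is the sense in which vectors act like points that capture meaning.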
Earlier NLP models processed the input passage sequentially, one token at a time. However, language is fluid: a word can mean one thing in one context and something drastically different in another. When text is processed sequentially, the model can lose track of that context, especially across long passages.
ChatGPT went one step further. GPT stands for "generative pre-trained transformer." Broadly speaking, a transformer is an architecture that processes all tokens in the passage simultaneously, allowing the model to relate tokens to each other and detect more subtle aspects of language. This ability to process tokens simultaneously makes transformer-based models, such as ChatGPT, particularly effective at capturing the complexity of natural language.
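The mechanism that lets a transformer relate every token to every other token is called self-attention. The sketch below is a minimal version under stated assumptions: the three-dimensional token vectors are invented for the example, and the query, key, and value projections are left as the identity, whereas a real transformer learns separate matrices for each, uses many attention heads, and (in GPT-style models) masks out later tokens.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections.

    Every token scores every token (itself included), and its output
    is the weighted average of all token vectors.
    """
    d = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        scores = [dot(query, key) / math.sqrt(d) for key in vectors]
        weights = softmax(scores)
        mixed = [sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(d)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical vectors: "bank" overlaps with both "river" and "money".
tokens = ["river", "bank", "money"]
vectors = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 1.0, 0.0]]

outputs, weights = self_attention(vectors)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how much one token draws on the others. Because all rows are computed from the same set of vectors, nothing forces the model to read left to right, which is the property the passage above describes.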
How do you train an NLP model?
Training an NLP model like ChatGPT involves three key stages: data preparation, model training, and evaluation. First, researchers collect a large dataset of text, which is cleaned and then tokenized in the way described earlier.
During training, the model learns to understand and generate language by predicting the next token in a sequence. This process uses a technique called backpropagation, in which the model calculates its prediction errors and adjusts its internal parameters to minimize them. This is done iteratively over the dataset in batches using a computational framework optimized for large-scale parallel processing.
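To make next-token prediction concrete, here is a deliberately tiny sketch. It assumes a nine-word corpus and a model with a single table of scores, so the gradient can be written out directly; in a deep network, backpropagation applies the same idea layer by layer using the chain rule. The corpus, learning rate, batch size, and number of passes are all hypothetical choices for illustration.

```python
import math

# Hypothetical toy corpus; real training sets contain billions of tokens.
corpus = "the cat sat on the mat the cat sat".split()
vocab = sorted(set(corpus))
index = {word: i for i, word in enumerate(vocab)}
# Training examples: (current token, next token) pairs.
pairs = [(index[a], index[b]) for a, b in zip(corpus, corpus[1:])]

V = len(vocab)
# Parameters: one row of next-token scores per current token, starting at zero.
weights = [[0.0] * V for _ in range(V)]


def softmax(scores):
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def average_loss():
    """Cross-entropy: how surprised the model is by the true next token."""
    return sum(-math.log(softmax(weights[p])[n]) for p, n in pairs) / len(pairs)


def train(epochs=50, learning_rate=0.5, batch_size=4):
    for _ in range(epochs):
        for start in range(0, len(pairs), batch_size):
            batch = pairs[start:start + batch_size]
            grads = [[0.0] * V for _ in range(V)]
            for prev, nxt in batch:
                probs = softmax(weights[prev])
                for j in range(V):
                    # Prediction error: predicted probability minus the truth.
                    grads[prev][j] += probs[j] - (1.0 if j == nxt else 0.0)
            # Adjust the parameters a small step in the error-reducing direction.
            for i in range(V):
                for j in range(V):
                    weights[i][j] -= learning_rate * grads[i][j] / len(batch)


def predict_next(word):
    probs = softmax(weights[index[word]])
    return vocab[max(range(V), key=lambda j: probs[j])]


loss_before = average_loss()
train()
loss_after = average_loss()
print(loss_before, loss_after)
print(predict_next("cat"))  # 'sat'
```

The loss falls as the parameters are adjusted batch by batch, and the model ends up predicting "sat" after "cat" because that is the only continuation it ever saw.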
A validation dataset is used throughout training to test the model's performance on unseen data. After training, the model's performance is evaluated on a separate test dataset to ensure it generalizes well.
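Keeping the validation and test data separate from the training data can be sketched as a simple shuffle-and-slice. The 80/10/10 proportions, the fixed random seed, and the placeholder documents below are assumptions for the example, not figures from any particular model's training.

```python
import random


def split_dataset(examples, validation_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle once, then cut into disjoint train, validation, and test sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * validation_fraction)
    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, validation, test


# Hypothetical stand-ins for real text documents.
documents = [f"document {i}" for i in range(100)]
train_set, validation_set, test_set = split_dataset(documents)
print(len(train_set), len(validation_set), len(test_set))  # 80 10 10
```

Because no document appears in more than one set, a good score on the validation or test set reflects performance on text the model has not seen during training.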
Once deployed, the model operates based on its training and does not continuously learn from new data unless retrained. Often, a model like GPT is fine-tuned to specialize it for specific tasks or domains, improving its performance and relevance in those contexts.
Why has there been such a sudden rise in AI development?
That's a great question. Machine learning and AI technologies are essentially based on neural networks, which we've had for over 50 years. We've theorized about ML and AI for that whole time, but two big conditions were met recently that made them possible.
First, computing technology is far more powerful than it was 50 years ago. The cell phones in our pockets are a million times more powerful than the computers of the 1970s. A floppy disk held 1.44 megabytes of information, but your phone's storage probably holds 100,000 times more.
Second, a huge amount of textual information is now available on the Internet. Everything from websites and PDFs to social media posts and tweets is available for anyone to sift through—and it's constantly growing. This provides a vast and diverse dataset for training AI models.
What should people keep in mind when they use generative AI, such as ChatGPT?
The first thing you must consider is the correctness of the information any generative AI model outputs. These models take vast amounts of data and are essentially trained to imitate form. They don't necessarily understand the factual aspect of the output; they just try to generate an answer that looks normal.
Let's say a student researcher is assigned a literature review. If they ask ChatGPT to give them a list of relevant papers and links, ChatGPT will produce a list of plausible article titles, but the articles themselves may not exist. It might list a URL that looks real, but the website doesn't exist. The point is that ChatGPT will give you information, but it's the user's responsibility to double- and triple-check that the information is correct.
Another issue that may arise is bias. Any AI model must be trained on human-created data, and all humans have biases, so it is very possible for bias to be transmitted to the AI model. On a global scale especially, there can be significant cultural differences between Western and Eastern societies. If an AI model's training data isn't diverse enough, the model could very well become biased toward one perspective or against another.
How do you predict NLP technology will be applied in the future?
NLP models have the potential to revolutionize many different sectors. In education, for example, NLP technology could one day give students access to personalized tutoring aids. Khan Academy has already built a GPT that helps students communicate their exact needs and struggles. The GPT can create a study plan complete with specialized learning materials tailored to the student.
And it's not just students; NLP technology can help educators, too. With generative AI tools that can create teaching materials, educators can focus more personal attention on their students.
This technology can also use information synthesis to streamline healthcare. A large part of diagnosing patients is analyzing verbal information, such as when a patient tells a healthcare provider about their symptoms or medical history. One day, NLP models could parse through a patient's words and highlight different possibilities for the healthcare provider.
The other side of information synthesis is seen in legal settings. Many people don't understand all the terms and language in legal documents. NLP technology will provide an easy way to process and summarize these documents and even translate them into other languages if necessary.