Written by Judi Fusco
After reading Stephen Wolfram’s article, What is ChatGPT doing … and why does it work?, Judi Fusco, Director of Emerging Technologies and Learning Sciences at Digital Promise, was inspired to write a brief background and introductory explanation of how ChatGPT works. The goal of this blog post is to start to demystify ChatGPT so that educators can see limitations and learn how to work with it.
Key Ideas
- A background on what ChatGPT is and the meaning of the acronym
- ChatGPT is made up of statistical models that only understand patterns they have gleaned from the data.
- AI systems, like ChatGPT, still need human involvement to ensure accuracy and quality.
ChatGPT Acronym
What does GPT mean? Educator CIRCLS has written about ChatGPT and defines it and the acronym (GPT) as follows:
ChatGPT is a chat-based generative pre-trained transformer:
Generative: can generate responses to questions,
Pretrained: was trained in advance on a large amount of the written material available on the web,
Transformer: processes sentences differently than other types of models. They can take the context of the words into account and understand relationships between the words.
A very simple definition is that it’s a chatbot that allows a user to ask a question in plain language and get a response in a way that’s similar to how a human would reply.
Here are some additional terms that are often used when describing ChatGPT.
Model Information
ChatGPT is a Large Language Model (LLM). An LLM is a type of neural network designed to process language and generate output in natural language. ChatGPT is built on the transformer architecture, which is itself a type of neural network. Two things that are commonly said about LLMs are 1) they can create human-like language and 2) they can generate plausible-sounding false information. A few more facts: GPT-3, the predecessor of the model behind ChatGPT, has 175 billion parameters (most of these are the weights it learned during training). For comparison, GPT-2 has 1.5 billion parameters. OpenAI has not published a parameter count for GPT-4; the widely repeated figure of 100 trillion is an unconfirmed projection. The large number of parameters is part of the reason ChatGPT can write text on such a wide range of topics.
Neural networks are a type of Machine Learning, and networks with many layers are the basis of Deep Learning (see an Artificial Intelligence glossary for more). AI experts often note that the architecture of a neural network was inspired by the interconnections of neurons and synapses in a human brain, and some say that neural networks work like a human brain. However, they work differently. Neural networks learn from data that humans deem important and present to the models in very specific formats. They learn by determining relationships between the items in the data. In language models, this includes learning how likely a word is to appear next after a given word.
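To make the idea of "how likely a word is to appear next" concrete, here is a minimal sketch in Python. It is not how ChatGPT is built: it only counts which word follows which in one made-up sentence, while a real language model learns from billions of words and looks at far more context than the single previous word. The function name `next_word_probabilities` and the sample text are invented for this illustration.

```python
from collections import Counter, defaultdict


def next_word_probabilities(text):
    """Count which word follows which, then turn the counts into probabilities."""
    words = text.lower().split()

    # For every word, keep a tally of the words that came right after it.
    follow_counts = defaultdict(Counter)
    for current_word, following_word in zip(words, words[1:]):
        follow_counts[current_word][following_word] += 1

    # Convert each tally into probabilities that add up to 1.
    probabilities = {}
    for current_word, counts in follow_counts.items():
        total = sum(counts.values())
        probabilities[current_word] = {w: c / total for w, c in counts.items()}
    return probabilities


# A made-up sentence standing in for the training data.
toy_text = "the cat sat on the mat and the cat slept"
model = next_word_probabilities(toy_text)

# Which words has this tiny model seen after "the", and how often?
print(model["the"])
```

In this toy sentence, "the" is followed by "cat" twice and by "mat" once, so the model learns that "cat" is the more likely next word. It knows nothing about cats or mats. It only knows the pattern in the data it was shown.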
ChatGPT was trained with large amounts of written language from web pages and books. Estimates of how many words these texts contain range from ten billion to 300 billion; OpenAI, the company that makes ChatGPT, hasn’t described the training materials in detail. If you want to learn more about neural networks, here’s a 5-minute animated video.
The important takeaway is that ChatGPT is made up of statistical models that only understand patterns they have gleaned from the data. That’s the basis of ChatGPT.
How does ChatGPT write?
ChatGPT takes the question you ask and runs it through its pre-trained models that “understand” language, and then it writes an answer. ChatGPT is designed to write one word at a time: at each step, it picks the next word from the probability distribution that the language model produces for the words that could come next. Some systems simply choose the word with the highest probability, but ChatGPT doesn’t always do that. If it did, it might pick the same word again and again because that word fits in many places, and the text would become repetitive. ChatGPT uses a setting called “temperature” to determine how much freedom the model has in choosing the next word. A low temperature means that the model almost always takes the word with the highest probability, and a high temperature means more randomness in the words chosen from the distribution.
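Here is a small Python sketch of how temperature can change word choice. It is an illustration under stated assumptions, not OpenAI's code: the word scores are made up, the function names `apply_temperature` and `pick_next_word` are invented for this post, and the math shown (dividing scores by the temperature before converting them to probabilities) is the standard textbook way temperature is usually described.

```python
import math
import random


def apply_temperature(scores, temperature):
    """Turn raw word scores into probabilities, sharpened or flattened by temperature."""
    # Dividing by a small temperature exaggerates the gaps between scores;
    # dividing by a large temperature shrinks them.
    scaled = {word: score / temperature for word, score in scores.items()}
    largest = max(scaled.values())  # subtracted below to keep exp() numerically stable
    exps = {word: math.exp(value - largest) for word, value in scaled.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}


def pick_next_word(scores, temperature, rng=random):
    """Pick one word; a temperature of 0 means 'always take the top-scoring word'."""
    if temperature == 0:
        return max(scores, key=scores.get)
    probabilities = apply_temperature(scores, temperature)
    words = list(probabilities)
    weights = [probabilities[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]


# Made-up scores for words that might follow "I took my ... for a walk".
toy_scores = {"dog": 2.0, "cat": 1.5, "banana": 0.1}

cool = apply_temperature(toy_scores, 0.2)  # low temperature: "dog" dominates
hot = apply_temperature(toy_scores, 2.0)   # high temperature: choices even out

print("low temperature: ", cool)
print("high temperature:", hot)
print("one sampled word:", pick_next_word(toy_scores, 2.0))
```

At the low temperature, nearly all of the probability goes to "dog". At the high temperature, even "banana" gets a real chance of being picked, which is where the less predictable, and sometimes less accurate, text comes from.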
While higher temperatures can make the text less boring, if a system is set too hot, the text can stop being factual; ChatGPT calls this being more creative. As I use ChatGPT, I have found that it is sometimes too “loose” with words and that its answers are sometimes incorrect. There are hundreds of articles discussing this issue. As I’ve interacted with ChatGPT, I have come to realize how pervasive the “too creative” problem is. I’m not the first person to say that I might use it for inspiration, but I won’t use it for anything where I need to be certain that the writing is accurate and truthful.
Limitations and How to Work with ChatGPT
As discussed above, one limitation of ChatGPT is that it can produce less than truthful information. Another risk is that students will use ChatGPT to do their work, that the teacher won’t be able to tell, and then students won’t learn. In a recent session we had with tech-savvy educators, they discussed how talk about ChatGPT felt similar to what occurred when Wikipedia emerged in the first decade of the 2000s.
While the conversation has a similar feel, ChatGPT is different from Wikipedia in that it is an AI system that may fundamentally change the way we work. In addition, AI technologies need data to work successfully in ways that technologies of the past did not. ChatGPT has only been around for a few months. Even if it were a non-AI technology, it would still take significant time for educators to determine how it should be used and to develop new commonly used and accepted practices. With new AI systems, we have to reflect on learning and teaching and how the new systems change classroom practice. And, in addition, we have to think about how we protect students, their families, their privacy, and their data with these new systems.
There are so many things for us humans to consider as ChatGPT and other AI technologies come into classrooms. If you’re passionate about AI and how it might be used in learning environments, please sign up for the EngageAI mailing list to join this important conversation.