Generative AI

Brief Glossary of AI-Related Terms

Artificial Intelligence (AI): AI is a branch of computer science. AI systems use hardware, algorithms, and data to create “intelligence” to do things like make decisions, discover patterns, and perform some sort of action. AI is a general term, and there are more specific terms used in the field of AI. AI systems can be built in different ways. Two of the primary ways are: (1) through the use of rules provided by a human (rule-based systems); or (2) with machine learning algorithms. Many newer AI systems use machine learning (see the definition of machine learning below).

Algorithm: Algorithms are the “brains” of an AI system and determine its decisions. In other words, algorithms are the rules for what actions the AI system takes. Algorithms can discover their own rules from data, as machine learning algorithms do (see Machine Learning for more), or be rule-based, where human programmers give the rules.
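The difference between a rule given by a human and a rule discovered from data can be sketched in a few lines of Python. This is only a toy illustration: the study-hours data, the 4-hour rule, and the function names are all invented for the example, and real machine learning algorithms are far more elaborate.

```python
# Hypothetical toy data, invented for illustration: (hours_studied, passed_exam)
examples = [(1, False), (2, False), (3, False), (5, True), (6, True), (8, True)]


def rule_based_predict(hours):
    """Rule-based: a human programmer chose the cutoff of 4 hours."""
    return hours >= 4


def learn_threshold(data):
    """Learned: pick the cutoff that gets the most training examples right."""
    candidates = sorted({hours for hours, _ in data})

    def correct_count(cutoff):
        return sum((hours >= cutoff) == passed for hours, passed in data)

    return max(candidates, key=correct_count)


# The cutoff now comes from the data, not from a programmer.
learned_cutoff = learn_threshold(examples)


def learned_predict(hours):
    return hours >= learned_cutoff


print("human rule says 4 hours passes:", rule_based_predict(4))
print("learned cutoff:", learned_cutoff)
print("learned rule says 4 hours passes:", learned_predict(4))
```

Both approaches end up with a rule of the same shape, but they can disagree on new cases (here, a student who studied 4 hours), because the learned rule depends entirely on the examples it was given.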

Generative AI (GenAI) and chat-based generative pre-trained transformer (ChatGPT) models: A system built with a neural network transformer type of AI model that works well in natural language processing tasks (see the definitions of Transformer models and Natural Language Processing below). In this case, the model: (1) can generate responses to questions (Generative); (2) was trained in advance on a large amount of the written material available on the web (Pre-trained); and (3) can process sentences differently than other types of models (Transformer).

Transformer models: Used in GenAI systems such as GPT models (the T in GPT stands for Transformer), transformer models are a type of language model. They are neural networks and are also classified as deep learning models. They give AI systems the ability to determine and focus on the important parts of the input and output by using a self-attention mechanism.
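The idea behind self-attention can be sketched as follows. This is a heavily simplified illustration, not how any production model is implemented: it assumes each word vector acts as its own query, key, and value (real transformers learn separate projection matrices for these), and the two-number "embeddings" are made up for the example.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with no learned weights.

    Simplifying assumption: each vector is its own query, key, and value.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score how relevant every word is to the current word.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The output is a weighted mix of all the word vectors.
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 2-number "embeddings" for three words
tokens = ["cat", "kitten", "car"]
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(embeddings)

for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how much one word "focuses on" every word in the input. With these made-up vectors, "cat" gives more weight to "kitten" than to "car" because their vectors are more alike.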

Training Data: This is the data used to train the algorithm or machine learning model. It has been generated by humans in their work or in other contexts in the past. While it sounds simple, training data is important because the wrong data can perpetuate systemic biases. If you are training a system to help with hiring people, and you use data from existing companies, you will be training that system to hire the kind of people who are already there. Algorithms take on the biases that are already inside the data. People often think that machines are “fair and unbiased,” but this can be a dangerous perspective. Machines are only as unbiased as the humans who create them and the data that trains them.
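The hiring example can be made concrete with a small sketch. Everything here is hypothetical: the records, the school names, and the 0.5 cutoff are invented, and the "model" is just a table of past hiring rates rather than a real machine learning algorithm. It shows only how a pattern in historical data becomes the system's recommendation.

```python
from collections import defaultdict

# Hypothetical historical hiring records, invented for illustration:
# (school the applicant attended, whether they were hired)
history = [
    ("School A", True), ("School A", True), ("School A", True), ("School A", False),
    ("School B", False), ("School B", False), ("School B", False), ("School B", True),
]


def train(records):
    """'Learn' the past hiring rate for each school."""
    counts = defaultdict(lambda: [0, 0])  # school -> [hired, total]
    for school, hired in records:
        counts[school][0] += int(hired)
        counts[school][1] += 1
    return {school: hired / total for school, (hired, total) in counts.items()}


def recommend(model, school):
    """Recommend an applicant if people like them were usually hired before."""
    return model.get(school, 0.0) >= 0.5


model = train(history)
print(model)
print("Recommend School A applicant:", recommend(model, "School A"))
print("Recommend School B applicant:", recommend(model, "School B"))
```

Nothing in the code mentions an applicant's skills. The system simply repeats whatever pattern the past decisions contained, which is how bias in training data carries over into future decisions.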

Foundation Models: Foundation models are models trained on a large amount of data that can be used as a foundation for developing other models. For example, generative AI systems use large language foundation models. They can be a way to speed up the development of new systems, but there is controversy about using foundation models because, depending on where their data comes from, there are different issues of trustworthiness and bias.

Large Language Model (LLM): "refers to a large general-purpose language model that can be pre-trained and then fine-tuned for specific purposes. They are trained to solve common language problems, such as text classification, question answering, document summaries, and text generation. The models can then be adapted to solve specific problems in different fields using a relatively small size of field datasets via fine-tuning. The ability of LLMs taking the knowledge learnt from one task and applying it to another task is enabled by transfer learning. LLMs predict the probabilities of next word (token), given an input string of text, based on the language in the training data. Besides, instruction tuned language models predict a response to the instructions given in the input. These instructions can be "summarize a text", "generate a poem in the style of X", or "give a list of keywords based on semantic similarity for X". LLMs are large, not only because of their large size of training data, but also their large number of parameters. They display different behaviors from smaller models and have important implications for those who develop and use A.I. systems. To develop effective LLMs, researchers must address complex engineering issues and work alongside engineers or have engineering expertise themselves." (This definition from Penn State University Libraries).
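The phrase "predict the probabilities of next word (token)" can be illustrated with a very small sketch. This is not how an LLM works internally: it simply counts which word follows which in a made-up two-sentence corpus, whereas real LLMs use neural networks with a very large number of parameters. The corpus and the function name are assumptions for the example.

```python
from collections import Counter, defaultdict

# Hypothetical tiny "training corpus", invented for illustration
corpus = "the cat sat on the mat . the cat ate the fish ."
words = corpus.split()

# Count how often each word follows each other word in the training text.
follow_counts = defaultdict(Counter)
for current, following in zip(words, words[1:]):
    follow_counts[current][following] += 1


def next_word_probabilities(word):
    """Estimate the probability of each possible next word from the counts."""
    counts = follow_counts.get(word, Counter())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {candidate: count / total for candidate, count in counts.items()}


probs = next_word_probabilities("the")
print(probs)
```

The probabilities come entirely from the language in the training text, so a word that never appeared there (for example "zebra") gets no prediction at all in this toy version.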

Machine Learning (ML): Machine learning is a field of study with a range of approaches to developing algorithms that can be used in AI systems. AI is a more general term. In ML, an algorithm identifies rules and patterns in the data without a human specifying those rules and patterns. These algorithms build a model for decision making as they go through data. (You will sometimes hear the term machine learning model.) Because they discover their own rules in the data they are given, ML systems can perpetuate biases. Algorithms used in machine learning often require large amounts of data to be trained to make decisions.

It’s important to note that in machine learning, the algorithm is doing the work to improve and does not have the help of a human programmer. It is also important to note three more things. One, in most cases the algorithm is learning an association (when X occurs, it usually means Y) from training data that is from the past. Two, since the data is historical, it may contain biases and assumptions that we do not want to perpetuate. Three, there are many questions about involving humans in the loop with AI systems; when using ML to solve AI problems, a human may not be able to understand the rules the algorithm is creating and using to make decisions. This could be especially problematic if a human learner was harmed by a decision a machine made and there was no way to appeal the decision.

Natural Language Processing (NLP): Natural Language Processing is a field of Linguistics and Computer Science that also overlaps with AI. NLP uses an understanding of the structure, grammar, and meaning in words to help computers “understand and comprehend” language. NLP requires a large corpus of text (usually half a million words).

NLP technologies help in many situations, including scanning documents to turn them into editable text (optical character recognition), speech to text, voice-based computer help systems, grammatical correction (like auto-correct or Grammarly), summarizing texts, and others.

Robots: Robots are embodied mechanical machines that are capable of doing a physical task for humans. “Bots” are typically software agents that perform tasks in a software application (e.g., in an intelligent tutoring system they may offer help). Bots are sometimes called conversational agents. Both robots and bots can contain AI, including machine learning, but do not have to have it. AI can help robots and bots perform tasks in more adaptive and complex ways.

The above information has been pulled from the Glossary of Artificial Intelligence Terms for Educators by Pati Ruiz and Judi Fusco of the Center for Integrative Research in Computing and Learning Sciences (CIRCLS), and is licensed under a Creative Commons Attribution 4.0 International License.