@yuinchien
Last active February 3, 2024 16:31
AI Glossary
TERM DEFINITION SOURCE LINK
AGI. Artificial General Intelligence An AGI could learn to accomplish any intellectual task that human beings or animals can perform. Alternatively, AGI has been defined as an autonomous system that surpasses human capabilities in the majority of economically valuable tasks. Source
Adversarial suffix A string of random-seeming characters appended to a prompt that makes the LLM significantly more likely to return an unfiltered response. Source Demo
AI. Artificial Intelligence Artificial intelligence (AI) is the intelligence of machines or software, as opposed to the intelligence of humans or animals. It is a field of study in computer science which develops and studies intelligent machines. Source
AI Safety An interdisciplinary field concerned with preventing accidents, misuse, or other harmful consequences that could result from artificial intelligence (AI) systems. It encompasses machine ethics and AI alignment, which aim to make AI systems moral and beneficial, as well as technical problems such as monitoring systems for risks and making them highly reliable. Beyond AI research, it involves developing norms and policies that promote safety. Source
Attention A mechanism used in a neural network that indicates the importance of a particular word or part of a word. Attention compresses the amount of information a model needs to predict the next token/word. A typical attention mechanism might consist of a weighted sum over a set of inputs, where the weight for each input is computed by another part of the neural network. Source
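A minimal sketch of the weighted-sum view of attention, assuming NumPy; the scaled dot-product form and the shapes below are illustrative, not a specific model's implementation.

```python
# Minimal scaled dot-product attention as a weighted sum over inputs (NumPy).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # how relevant each key is to each query
    weights = softmax(scores, axis=-1)        # attention weights computed from the inputs
    return weights @ V                        # weighted sum over the inputs (values)

Q = np.random.randn(4, 8)   # 4 query positions, dimension 8
K = np.random.randn(6, 8)   # 6 key/value positions
V = np.random.randn(6, 8)
out = attention(Q, K, V)    # shape (4, 8)
```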
Alignment AI alignment research aims to steer AI systems towards humans' intended goals, preferences, or ethical principles. An AI system is considered aligned if it advances the intended objectives. A misaligned AI system pursues some objectives, but not the intended ones. Source
Prompt Injection Attack An attack that uses carefully crafted prompts to make the model ignore previous instructions or perform unintended actions. Source
Backpropagation A crucial step in a common method used to iteratively train a neural network model. It is used to calculate the parameter adjustments needed to gradually minimize error. Source
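A toy sketch of the underlying idea, assuming NumPy: run a forward pass, propagate the error back to a gradient for each parameter, and adjust the parameters to reduce the error. The single linear neuron and the values below are illustrative.

```python
# Toy backpropagation for a single linear neuron with a squared-error loss (NumPy).
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3))          # inputs
true_w = np.array([1.5, -2.0, 0.5])
y = x @ true_w                         # targets

w = np.zeros(3)                        # parameters to learn
lr = 0.1
for _ in range(200):
    pred = x @ w                       # forward pass
    err = pred - y
    grad_w = x.T @ err / len(x)        # backward pass: gradient of the error w.r.t. w
    w -= lr * grad_w                   # adjust parameters to reduce the error
```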
Bias The idea that machine learning algorithms can be biased when carrying out their programmed tasks, such as analyzing data or producing content. AI is typically biased in ways that uphold harmful beliefs, like race and gender stereotypes. Source
Context Window The “context window” refers to how much text a language model can look back on and reference when attempting to generate text. This is different from the large corpus of data the language model was trained on, and instead represents more of a “working memory” for the model. Source
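An illustrative sketch in plain Python of how an application might keep only the most recent messages that fit inside a fixed context window; the word-count token counter is a stand-in for a real tokenizer.

```python
# Keep only the most recent messages that fit in a fixed context window.
# count_tokens is a stand-in; real systems use the model's own tokenizer.
def fit_to_context(messages, max_tokens, count_tokens=lambda m: len(m.split())):
    kept, total = [], 0
    for msg in reversed(messages):              # walk from newest to oldest
        cost = count_tokens(msg)
        if total + cost > max_tokens:
            break                               # older history falls out of "working memory"
        kept.append(msg)
        total += cost
    return list(reversed(kept))                 # restore chronological order
```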
Data poisoning An artificial intelligence poisoning attack occurs when an AI model's training data is intentionally tampered with, affecting the outcomes of the model's decision-making processes. Despite the black-box nature of AI models, these attacks seek to deceive the AI system into making incorrect or harmful decisions. Source
Deep Learning A method in artificial intelligence (AI) that teaches computers to process data in a way that is inspired by the human brain. Deep learning models can recognize complex patterns in pictures, text, sounds, and other data to produce accurate insights and predictions. Source
Dictionary Learning A method for learning, from training data, a dictionary of basis elements such that each data point can be represented as a sparse combination of those elements. Source
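For a concrete feel, a hedged sketch using scikit-learn's DictionaryLearning on toy data; the data and parameter choices below are illustrative.

```python
# Learn a dictionary of atoms so each sample is a sparse combination of them (scikit-learn).
import numpy as np
from sklearn.decomposition import DictionaryLearning

X = np.random.randn(200, 20)                 # toy data: 200 samples, 20 features
dl = DictionaryLearning(n_components=10, transform_algorithm="lasso_lars", random_state=0)
codes = dl.fit_transform(X)                  # sparse codes, shape (200, 10)
atoms = dl.components_                       # learned dictionary, shape (10, 20)
```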
Feature Engineering The process of using domain knowledge to select and transform the most relevant variables from raw data when creating a predictive model using machine learning or statistical modeling. The goal of feature engineering and selection is to improve the performance of machine learning (ML) algorithms. Source
Generative AI Generative AI enables users to quickly generate new content based on a variety of inputs. Inputs and outputs to these models can include text, images, sounds, animation, 3D models, or other types of data. Source
GAN. Generative Adversarial Network A generative adversarial network (GAN) has two parts: The generator learns to generate plausible data. The generated instances become negative training examples for the discriminator. The discriminator learns to distinguish the generator's fake data from real data. The discriminator penalizes the generator for producing implausible results. When training begins, the generator produces obviously fake data, and the discriminator quickly learns to tell that it's fake. Source
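A compact, hypothetical sketch of the generator/discriminator training loop, assuming PyTorch and a one-dimensional toy data distribution.

```python
# Generator vs. discriminator on a 1-D toy distribution (PyTorch).
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))                 # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())   # discriminator
g_opt = torch.optim.Adam(G.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(1000):
    real = torch.randn(32, 1) * 2 + 5                      # "real" data drawn from N(5, 2)
    fake = G(torch.randn(32, 8))                           # generator's attempt at plausible data

    # Discriminator learns to label real as 1 and fake as 0
    d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator is penalized when the discriminator spots its fakes
    g_loss = bce(D(G(torch.randn(32, 8))), torch.ones(32, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```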
Hallucination AI hallucinations are incorrect or misleading results that AI models generate. These errors can be caused by a variety of factors, including insufficient training data, incorrect assumptions made by the model, or biases in the data used to train the model. AI hallucinations can be a problem for AI systems that are used to make important decisions, such as medical diagnoses or financial trading. Source
Interpretability Models are interpretable when humans can readily understand the reasoning behind predictions and decisions made by the model. The more interpretable the models are, the easier it is for someone to comprehend and trust the model. Models such as deep learning and gradient boosting are not interpretable and are referred to as black-box models because they are too complex for human understanding. It is impossible for a human to comprehend the entire model at once and understand the reasoning behind each decision. Source
LLM. Large Language Model A large language model (LLM) is a type of artificial intelligence (AI) algorithm that uses deep learning techniques and massively large data sets to understand, summarize, generate and predict new content. Source
Machine Learning The study of computer algorithms that improve automatically through experience and by the use of data. Key concepts include supervised, unsupervised, and reinforcement learning. Source
Multimodal Multimodal AI is artificial intelligence that combines multiple types, or modes, of data to create more accurate determinations, draw insightful conclusions or make more precise predictions about real-world problems. Multimodal AI systems train with and use video, audio, speech, images, text and a range of traditional numerical data sets. Most importantly, multimodal AI means numerous data types are used in tandem to help AI establish content and better interpret context, something missing in earlier AI. Source
Neural Networks Neural networks (NNs) or neural nets are a branch of machine learning models that are built using principles of neuronal organization discovered by connectionism in the biological neural networks constituting animal brains. Source
NLP. Natural Language Processing A machine learning technology that gives computers the ability to interpret, manipulate, and comprehend human language. Source
Prompt Engineering The process of structuring text that can be interpreted and understood by a generative AI model. A prompt is natural language text describing the task that an AI should perform. Source
Pre-training The initial phase of training a machine learning model where the model learns general features, patterns, and representations from the data without specific knowledge of the task it will later be applied to. This unsupervised or semi-supervised learning process enables the model to develop a foundational understanding of the underlying data distribution and extract meaningful features that can be leveraged for subsequent fine-tuning on specific tasks. Source
RAG. Retrieval-Augmented Generation A technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources. Source
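A hedged sketch of the retrieve-then-generate pattern in plain Python; the word-overlap retriever and the commented-out generate() call are illustrative placeholders, not a specific framework's API.

```python
# Retrieve relevant documents, then prepend them to the prompt before generation.
def retrieve(query, documents, top_k=2):
    query_words = set(query.lower().split())
    return sorted(documents,
                  key=lambda d: len(query_words & set(d.lower().split())),
                  reverse=True)[:top_k]

def build_prompt(query, documents):
    context = "\n".join(retrieve(query, documents))
    return f"Use the context below to answer.\n\nContext:\n{context}\n\nQuestion: {query}"

# answer = generate(build_prompt("When was the telescope launched?", corpus))
```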
RLHF. Reinforcement Learning from Human Feedback Reinforcement Learning from Human Feedback is a means to take a pretrained language model and encourage it to behave in ways that are consistent with what humans prefer. This can include “helping it to follow instructions” or “helping it to act more like a chat bot”. The human feedback consists of humans ranking sets of two or more example texts, and the reinforcement learning encourages the model to prefer outputs that are similar to the higher-ranked ones. Source
Singularity In the context of AI, the singularity (also known as the technological singularity) refers to a hypothetical future point in time when technological growth becomes uncontrollable and irreversible, leading to unforeseeable changes to human civilization. Source
Transformer A transformer model is a neural network that learns context and thus meaning by tracking relationships in sequential data like the words in this sentence. Source
Temperature Temperature is a parameter that controls the randomness of a model's predictions during generation. Higher temperature leads to more creative samples that enable multiple variations in phrasing (and in the case of fiction, variation in answers as well), while lower temperature leads to more conservative samples that stick to the most-probable phrasing and answer. Adjusting the temperature is a way to encourage a language model to explore rare, uncommon, or surprising next words or sequences, rather than only selecting the most likely predictions. Source
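An illustrative sketch, assuming NumPy, of how temperature rescales the model's next-token logits before sampling; the logit values are made up.

```python
# Temperature-scaled sampling over next-token logits (NumPy; values are illustrative).
import numpy as np

def sample_with_temperature(logits, temperature=1.0, rng=np.random.default_rng()):
    scaled = np.asarray(logits, dtype=float) / temperature   # low T sharpens, high T flattens
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

logits = [2.0, 1.0, 0.1]
# temperature=0.2 almost always returns index 0; temperature=2.0 spreads probability
# mass across the less likely tokens, giving more varied output.
```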
Token In the context of AI, tokens are the basic units of text or code that AI models use to process and generate language. These tokens can be characters, words, subwords, or other segments of text or code, depending on the chosen tokenization method or scheme. Source
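A deliberately simplified sketch in plain Python; production tokenizers use learned subword vocabularies (for example, byte-pair encoding) rather than whitespace splitting.

```python
# Simplified tokenization: a whitespace split illustrates mapping text to the
# integer IDs a model consumes, in place of a real subword tokenizer.
text = "Tokens are the basic units of text."
tokens = text.lower().split()                          # ['tokens', 'are', 'the', ...]
vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}
token_ids = [vocab[tok] for tok in tokens]             # what the model actually processes
```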