AI glossary for content assistants
Plain-English definitions of 13,917 AI terms for branded assistant teams.
Glossary
13,917 terms. Open one for definitions and related concepts.
Intel Gaudi
Intel Gaudi is an AI accelerator processor designed for deep learning training and inference, offering an alternative to NVIDIA GPUs for data center workloads.
AMD Instinct
AMD Instinct is AMD's line of data center GPU accelerators for AI training and inference, powered by the ROCm open software platform.
Cerebras WSE
The Cerebras Wafer-Scale Engine (WSE) is the largest chip ever built, a single wafer-sized processor designed for massive AI model training.
Groq LPU
The Groq Language Processing Unit (LPU) is a specialized AI chip designed for ultra-fast, deterministic inference of large language models.
Apple Neural Engine
The Apple Neural Engine is a dedicated NPU in Apple silicon chips that accelerates on-device machine learning for iPhones, iPads, and Macs.
GPU Memory
GPU memory (VRAM) is the dedicated high-bandwidth memory on a graphics card that stores model weights, activations, and data during AI computation.
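As a rough illustration of why GPU memory capacity matters, the sketch below estimates the memory needed just to hold a model's weights at different precisions. The function name and the 7-billion-parameter model are hypothetical, and the estimate ignores activations, optimizer state, and any KV cache, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights alone, in GiB (hypothetical helper)."""
    return num_params * bytes_per_param / 2**30


if __name__ == "__main__":
    # Assumed example: a 7-billion-parameter model at common precisions.
    for name, nbytes in [("FP32", 4), ("FP16", 2), ("INT8", 1)]:
        print(f"{name}: {weight_memory_gib(7e9, nbytes):.1f} GiB")
```

Halving the bytes per parameter halves the weight footprint, which is why lower-precision formats let larger models fit on the same card.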
VRAM
VRAM (Video Random Access Memory) is the dedicated memory on a GPU that stores data for graphics and AI computation workloads.
HBM
High Bandwidth Memory (HBM) is a high-performance memory technology used in data center GPUs and AI accelerators for maximum memory bandwidth.
HBM3
HBM3 is the third generation of High Bandwidth Memory, offering higher speed and capacity for AI accelerators like the NVIDIA H100.
GDDR6
GDDR6 is the standard graphics memory technology used in consumer GPUs, offering good bandwidth for gaming and moderate AI workloads.
Memory Bandwidth
Memory bandwidth is the rate at which data can be transferred between memory and processors, a critical bottleneck for AI model performance.
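One way to see why bandwidth is a bottleneck: when a language model generates one token at a time, it typically has to read all of its weights from memory for each token. Under that simplifying assumption (single request, no batching, compute and cache traffic ignored), bandwidth divided by model size gives a rough ceiling on tokens per second. The function name and the numbers below are illustrative, not measurements.

```python
def max_decode_tokens_per_s(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough upper bound on single-stream decode speed.

    Assumes every generated token requires reading all weights once,
    so throughput is limited by memory bandwidth alone.
    """
    return bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    # Assumed example: 1000 GB/s of memory bandwidth, 14 GB of weights.
    print(f"{max_decode_tokens_per_s(1000, 14):.0f} tokens/s at most")
```

Real systems land below this bound, but the ratio explains why faster memory often helps inference more than extra compute does.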
NVMe
NVMe (Non-Volatile Memory Express) is a high-speed storage protocol used in AI systems for fast data loading, model checkpointing, and dataset access.
Cloud Computing
Cloud computing provides on-demand access to computing resources including GPUs for AI, without owning physical hardware, through providers like AWS, Azure, and GCP.
Edge Computing
Edge computing processes data near its source rather than in the cloud, enabling real-time AI inference with lower latency and better privacy.
Serverless Computing
Serverless computing automatically manages infrastructure for AI workloads, scaling resources on demand and charging only for actual compute time used.
Distributed Computing
Distributed computing spreads computation across multiple machines, essential for training large AI models that exceed the capacity of any single device.
Parallel Computing
Parallel computing performs many calculations simultaneously, the fundamental principle behind GPU-accelerated AI training and inference.
High-Performance Computing
High-performance computing (HPC) uses supercomputers and computing clusters to solve complex problems, increasingly converging with AI infrastructure.
HPC
HPC stands for high-performance computing: the systems and practices used for large-scale AI and scientific computation.
Supercomputer
A supercomputer is an extremely powerful computing system used for large-scale AI training, scientific simulation, and solving the world's hardest computational problems.
Quantum Computing
Quantum computing uses quantum mechanical phenomena like superposition and entanglement to solve certain problems, such as integer factoring, far faster than the best known classical algorithms.
Quantum Machine Learning
Quantum machine learning combines quantum computing with machine learning algorithms, exploring potential speedups for training, optimization, and feature mapping.
VPU
A Vision Processing Unit (VPU) is a specialized processor optimized for computer vision and image processing tasks at low power.
DPU
A Data Processing Unit (DPU) is a programmable processor that offloads networking, storage, and security tasks from CPUs in data center infrastructure.
IPU
An Intelligence Processing Unit (IPU) is a processor designed by Graphcore specifically for machine learning workloads with a unique bulk synchronous parallel architecture.
Analog AI Chip
An analog AI chip performs neural network computations using continuous analog signals rather than digital logic, offering potential gains in energy efficiency and speed.
Optical Computing
Optical computing uses light (photons) instead of electrical signals to perform computations, offering potential advantages in speed and energy efficiency for AI workloads.
Photonic Computing
Photonic computing uses integrated photonic circuits to process data with light, enabling ultra-fast and energy-efficient AI computations.
cuDNN
cuDNN (CUDA Deep Neural Network library) is a GPU-accelerated library of primitives for deep neural networks, providing optimized implementations of common operations.
CUDA Cores
CUDA cores are the basic parallel processing units within NVIDIA GPUs, each capable of executing one floating-point or integer operation per clock cycle.
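A common back-of-the-envelope use of the core count is estimating peak FP32 throughput. The sketch below assumes each core completes one fused multiply-add per clock, which is conventionally counted as two floating-point operations. The function name is hypothetical, the inputs are illustrative, and sustained performance in practice is lower than this theoretical peak.

```python
def peak_fp32_tflops(cuda_cores: int, clock_ghz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPS.

    Assumes one fused multiply-add (counted as 2 FLOPs) per core per clock.
    """
    return cuda_cores * clock_ghz * 2 / 1000


if __name__ == "__main__":
    # Illustrative inputs: 5120 cores at a 1.53 GHz boost clock.
    print(f"{peak_fp32_tflops(5120, 1.53):.1f} TFLOPS peak FP32")
```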
NVSwitch
NVSwitch is a high-bandwidth switch chip from NVIDIA that enables all-to-all GPU communication within multi-GPU systems at full NVLink bandwidth.
DGX A100
The NVIDIA DGX A100 is a purpose-built AI system featuring eight A100 GPUs connected via NVSwitch, designed for AI training and inference at scale.
DGX H100
The NVIDIA DGX H100 is the successor to the DGX A100, an AI system with eight H100 GPUs connected via NVSwitch that delivers substantially higher AI training and inference performance than its predecessor.
DGX Cloud
DGX Cloud is an AI supercomputing service that provides instant access to NVIDIA DGX systems through cloud providers, eliminating the need to build on-premise infrastructure.
A100 GPU
The NVIDIA A100 is an Ampere-architecture data center GPU designed for AI training and inference, available in 40GB (HBM2) and 80GB (HBM2e) configurations.
H100 GPU
The NVIDIA H100 is a Hopper-architecture data center GPU with fourth-generation Tensor Cores and a Transformer Engine, designed for training and running large language models.
H200 GPU
The NVIDIA H200 is an enhanced Hopper GPU with 141GB of HBM3e memory, nearly double the memory capacity of the H100 and roughly 1.4 times its memory bandwidth, optimized for large language model inference.
B100 GPU
The NVIDIA B100 is a Blackwell-architecture GPU designed as a drop-in upgrade for existing HGX H100 server designs, for data centers seeking next-generation AI performance without infrastructure changes.
B200 GPU
The NVIDIA B200 is the flagship Blackwell-architecture GPU delivering up to 20 petaflops of FP4 AI performance for next-generation training and inference.
L40S
The NVIDIA L40S is a data center GPU optimized for AI inference, video processing, and graphics workloads, offering strong generative AI performance in a standard PCIe form factor.
L4 GPU
The NVIDIA L4 is a low-power data center GPU designed for efficient AI inference and video processing in space-constrained and power-limited environments.
T4 GPU
The NVIDIA T4 is a Turing-architecture data center GPU widely used for cost-effective AI inference, supporting INT8 and FP16 precision with 16GB of GDDR6 memory.
V100 GPU
The NVIDIA V100 is a Volta-architecture data center GPU that introduced Tensor Cores, marking a turning point in GPU-accelerated deep learning.
NVIDIA AI Enterprise
NVIDIA AI Enterprise is a software platform that provides enterprise-grade AI tools, frameworks, and support for deploying AI applications in production environments.
cuBLAS
cuBLAS is a GPU-accelerated library implementing the BLAS (Basic Linear Algebra Subprograms) standard, providing optimized matrix operations fundamental to AI computation.
TPU v5
TPU v5 is a generation of Google Cloud TPUs, available in v5e (efficiency) and v5p (performance) variants for AI training and inference at scale.
Trainium2
Trainium2 is the second generation of AWS custom AI training chips, offering significantly improved performance for training large foundation models on AWS infrastructure.
Inferentia2
Inferentia2 is the second generation of AWS custom AI inference chips, offering high throughput and low cost for serving machine learning models on AWS.
Product FAQ
What is InsertChat?
InsertChat is a white-label AI assistant for your website. Train it, brand it, publish it, and learn from visitor questions.
How does InsertChat use my website content?
Connect approved pages, docs, videos, FAQs, policies, and other sources. InsertChat turns them into source-backed answers and next steps.
Can I control the assistant's tone and sources?
Yes. Choose its sources, tone, welcome message, and prompts so it stays on brand.
How does InsertChat stay accurate?
Answers use approved content and source links. Analytics show unclear or missing answers so you can improve coverage.
Can it collect leads or route support questions?
Yes. InsertChat can collect details, qualify intent, add context, and send chats to the right inbox, CRM, workflow, or person.
Can I control how the assistant behaves?
Yes. Control prompts, model choice, tool access, and the branded assistant experience so behavior stays consistent.
Which AI models can I use?
InsertChat supports multiple model providers. Choose each assistant's model for quality, speed, and cost, or use BYOK.
Can I pick different models for different workflows?
Yes. Use a faster model for common questions and a stronger model for complex reasoning. InsertChat supports that balance per conversation.
Where can I deploy an assistant?
Use a widget, embed, full-page assistant, custom domain, in-app embed, or API. Reuse one setup across surfaces.
Do I need coding skills?
No. Build and deploy AI assistants using our visual builder. The embed code is one line of JavaScript.
Can I customize the branding and UI?
Yes. Customize the assistant name, logo, colors, welcome message, suggested prompts, tone, domain, and white-label presentation.
Can I use my own domain?
Yes. Custom domains are supported, typically via enterprise options.
Does InsertChat support voice?
Yes. Voice dictation and text-to-speech let users speak instead of type.
Does InsertChat support vision?
Yes. Enable vision for assistants when images help clarify a request or context.
What tools and integrations are supported?
Zendesk, HubSpot, Shopify, WooCommerce, calendar booking, web search, Perplexity, and webhooks for your own systems.
Can I control which tools the assistant is allowed to use?
Yes. Tool access is controlled per assistant so you enable only what you need.
Can the agent hand off to a human?
Yes. Configure human handoff so the agent escalates when needed. Full conversation history is passed along.
Do you provide analytics?
Yes. Track chats, leads, feedback, top questions, unanswered questions, most-used sources, and content gaps.
Is it mobile friendly?
Yes. The widget and embeds work well on desktop and mobile with no separate experience needed.
What's the fastest path to a successful deployment?
Start with one assistant and a small set of high-value sources. Iterate using real questions from analytics.
What is the fastest way to get started?
Create an account. Connect one key source. Ask a test question, brand the assistant, then publish it on one page.