AI glossary for content assistants

Plain-English definitions of 13,917 AI terms for branded assistant teams.

13,917 terms. Open one for definitions and related concepts.

Intel Gaudi

Intel Gaudi is an AI accelerator processor designed for deep learning training and inference, offering an alternative to NVIDIA GPUs for data center workloads.

AMD Instinct

AMD Instinct is AMD's line of data center GPU accelerators for AI training and inference, powered by the ROCm open software platform.

Cerebras WSE

The Cerebras Wafer-Scale Engine (WSE) is the largest chip ever built, a single wafer-sized processor designed for massive AI model training.

Groq LPU

The Groq Language Processing Unit (LPU) is a specialized AI chip designed for ultra-fast, deterministic inference of large language models.

Apple Neural Engine

The Apple Neural Engine is a dedicated NPU in Apple silicon chips that accelerates on-device machine learning for iPhones, iPads, and Macs.

GPU Memory

GPU memory (VRAM) is the dedicated high-bandwidth memory on a graphics card that stores model weights, activations, and data during AI computation.
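As a rough illustration of why VRAM capacity matters, the memory needed for model weights alone can be estimated as parameter count times bytes per parameter. The sketch below is a back-of-the-envelope estimate only: the 7-billion-parameter model is hypothetical, and activations, caches, and framework overhead are deliberately ignored, so real usage will be higher.

```python
# Bytes per parameter for common numeric formats (standard sizes).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}


def weight_memory_gb(num_params: int, dtype: str = "fp16") -> float:
    """Approximate memory for model weights alone, in GB (10^9 bytes).

    Ignores activations, caches, and runtime overhead (an assumption
    made to keep the estimate simple).
    """
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Hypothetical 7-billion-parameter model, chosen only for illustration.
for dtype in ("fp32", "fp16", "int8"):
    print(f"{dtype}: {weight_memory_gb(7_000_000_000, dtype):.1f} GB")
# prints:
# fp32: 28.0 GB
# fp16: 14.0 GB
# int8: 7.0 GB
```

Under these assumptions, halving the precision halves the weight footprint, which is one reason lower-precision formats are popular on memory-constrained GPUs.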

VRAM

VRAM (Video Random Access Memory) is the dedicated memory on a GPU that stores data for graphics and AI computation workloads.

HBM

High Bandwidth Memory (HBM) is a high-performance memory technology used in data center GPUs and AI accelerators for maximum memory bandwidth.

HBM3

HBM3 is the third generation of High Bandwidth Memory, offering higher speed and capacity for AI accelerators like the NVIDIA H100.

GDDR6

GDDR6 is the standard graphics memory technology used in consumer GPUs, offering good bandwidth for gaming and moderate AI workloads.

Memory Bandwidth

Memory bandwidth is the rate at which data can be transferred between memory and processors, a critical bottleneck for AI model performance.
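To see why bandwidth can be the bottleneck, consider a simplified model of text generation in which every generated token requires reading all model weights from memory once. Under that assumption, bandwidth divided by weight size gives a rough upper bound on tokens per second. The figures below (1,000 GB/s and 14 GB of weights) are hypothetical, and the model ignores batching, caching, and compute time.

```python
def max_tokens_per_second(bandwidth_gb_s: float, weight_gb: float) -> float:
    """Rough upper bound on single-stream generation speed.

    Assumes each token requires one full read of the weights from memory
    and that nothing else limits throughput (a simplifying assumption).
    """
    return bandwidth_gb_s / weight_gb


# Hypothetical accelerator: 1,000 GB/s of memory bandwidth, 14 GB of weights.
print(f"{max_tokens_per_second(1000, 14):.1f} tokens/s")
# prints: 71.4 tokens/s
```

In this simplified view, doubling memory bandwidth doubles the bound, while adding compute alone would not change it.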

NVMe

NVMe (Non-Volatile Memory Express) is a high-speed storage protocol used in AI systems for fast data loading, model checkpointing, and dataset access.

Cloud Computing

Cloud computing provides on-demand access to computing resources including GPUs for AI, without owning physical hardware, through providers like AWS, Azure, and GCP.

Edge Computing

Edge computing processes data near its source rather than in the cloud, enabling real-time AI inference with lower latency and better privacy.

Serverless Computing

Serverless computing automatically manages infrastructure for AI workloads, scaling resources on demand and charging only for actual compute time used.

Distributed Computing

Distributed computing spreads computation across multiple machines, essential for training large AI models that exceed the capacity of any single device.

Parallel Computing

Parallel computing performs many calculations simultaneously, the fundamental principle behind GPU-accelerated AI training and inference.

High-Performance Computing

High-performance computing (HPC) uses supercomputers and computing clusters to solve complex problems, increasingly converging with AI infrastructure.

HPC

HPC is the abbreviation for high-performance computing: the systems and practices used for large-scale AI and scientific computation.

Supercomputer

A supercomputer is an extremely powerful computing system used for large-scale AI training, scientific simulation, and solving the world's hardest computational problems.

Quantum Computing

Quantum computing uses quantum mechanical phenomena like superposition and entanglement and, for certain problems, promises exponential speedups over classical computers.

Quantum Machine Learning

Quantum machine learning combines quantum computing with machine learning algorithms, exploring potential speedups for training, optimization, and feature mapping.

VPU

A Vision Processing Unit (VPU) is a specialized processor optimized for computer vision and image processing tasks at low power.

DPU

A Data Processing Unit (DPU) is a programmable processor that offloads networking, storage, and security tasks from CPUs in data center infrastructure.

IPU

An Intelligence Processing Unit (IPU) is a processor designed by Graphcore specifically for machine learning workloads, built around a bulk synchronous parallel (BSP) execution model.

Analog AI Chip

An analog AI chip performs neural network computations using continuous analog signals rather than digital logic, offering potential gains in energy efficiency and speed.

Optical Computing

Optical computing uses light (photons) instead of electrical signals to perform computations, offering potential advantages in speed and energy efficiency for AI workloads.

Photonic Computing

Photonic computing uses integrated photonic circuits to process data with light, with the goal of faster and more energy-efficient AI computation.

cuDNN

cuDNN (CUDA Deep Neural Network library) is a GPU-accelerated library of primitives for deep neural networks, providing optimized implementations of common operations.

CUDA Cores

CUDA cores are the basic parallel processing units within NVIDIA GPUs, each capable of executing one floating-point or integer operation per clock cycle.
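A common back-of-the-envelope use of this definition is estimating theoretical peak throughput: cores times clock rate times operations per cycle. The sketch below assumes each core issues one fused multiply-add per cycle and counts it as two floating-point operations, which is a widely used convention rather than something stated above. The core count and clock speed are hypothetical, not the specification of any real GPU.

```python
def peak_tflops(cores: int, clock_ghz: float, flops_per_cycle: int = 2) -> float:
    """Theoretical peak throughput in TFLOPS.

    flops_per_cycle=2 assumes one fused multiply-add per core per cycle,
    counted as two floating-point operations (an assumed convention).
    """
    return cores * clock_ghz * 1e9 * flops_per_cycle / 1e12


# Hypothetical GPU: 10,000 cores at 1.5 GHz.
print(f"{peak_tflops(10_000, 1.5):.1f} TFLOPS")
# prints: 30.0 TFLOPS
```

Real workloads rarely reach this figure, since memory bandwidth and scheduling usually limit sustained throughput.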

NVSwitch

NVSwitch is a high-bandwidth switch chip from NVIDIA that enables all-to-all GPU communication within multi-GPU systems at full NVLink bandwidth.

DGX A100

The NVIDIA DGX A100 is a purpose-built AI system featuring eight A100 GPUs connected via NVSwitch, designed for AI training and inference at scale.

DGX H100

The NVIDIA DGX H100 is the successor to the DGX A100, an AI system with eight H100 GPUs connected via NVSwitch that delivers higher AI training and inference performance than its predecessor.

DGX Cloud

DGX Cloud is an AI supercomputing service that provides instant access to NVIDIA DGX systems through cloud providers, eliminating the need to build on-premise infrastructure.

A100 GPU

The NVIDIA A100 is an Ampere-architecture data center GPU designed for AI training and inference, available in 40GB (HBM2) and 80GB (HBM2e) configurations.

H100 GPU

The NVIDIA H100 is a Hopper-architecture data center GPU with fourth-generation Tensor Cores and a Transformer Engine, designed for training and running large language models.

H200 GPU

The NVIDIA H200 is an enhanced Hopper GPU with 141GB of HBM3e memory, nearly double the memory capacity of the H100 and roughly 1.4x its memory bandwidth, optimized for large language model inference.

B100 GPU

The NVIDIA B100 is a Blackwell-architecture GPU designed as a drop-in option for existing HGX H100 server infrastructure, for data centers seeking next-generation AI performance without infrastructure changes.

B200 GPU

The NVIDIA B200 is the flagship Blackwell-architecture GPU delivering up to 20 petaflops of FP4 AI performance for next-generation training and inference.

L40S

The NVIDIA L40S is a data center GPU optimized for AI inference, video processing, and graphics workloads, offering strong generative AI performance in a standard PCIe form factor.

L4 GPU

The NVIDIA L4 is a low-power data center GPU designed for efficient AI inference and video processing in space-constrained and power-limited environments.

T4 GPU

The NVIDIA T4 is a Turing-architecture data center GPU widely used for cost-effective AI inference, supporting INT8 and FP16 precision with 16GB of GDDR6 memory.

V100 GPU

The NVIDIA V100 is a Volta-architecture data center GPU that introduced Tensor Cores, marking a turning point in GPU-accelerated deep learning.

NVIDIA AI Enterprise

NVIDIA AI Enterprise is a software platform that provides enterprise-grade AI tools, frameworks, and support for deploying AI applications in production environments.

cuBLAS

cuBLAS is a GPU-accelerated library implementing the BLAS (Basic Linear Algebra Subprograms) standard, providing optimized matrix operations fundamental to AI computation.

TPU v5

TPU v5 is the fifth generation of Google Cloud TPUs, available in v5e (efficiency) and v5p (performance) variants for AI training and inference at scale.

Trainium2

Trainium2 is the second generation of AWS custom AI training chips, offering significantly improved performance for training large foundation models on AWS infrastructure.

Inferentia2

Inferentia2 is the second generation of AWS custom AI inference chips, offering high throughput and low cost for serving machine learning models on AWS.

Product FAQ

What is InsertChat?

InsertChat is a white-label AI assistant for your website. Train it, brand it, publish it, and learn from visitor questions.

How does InsertChat use my website content?

Connect approved pages, docs, videos, FAQs, policies, and other sources. InsertChat turns them into source-backed answers and next steps.

Can I control the assistant's tone and sources?

Yes. Choose its sources, tone, welcome message, and prompts so it stays on brand.

How does InsertChat stay accurate?

Answers use approved content and source links. Analytics show unclear or missing answers so you can improve coverage.

Can it collect leads or route support questions?

Yes. InsertChat can collect details, qualify intent, add context, and send chats to the right inbox, CRM, workflow, or person.

Can I control how the assistant behaves?

Yes. Control prompts, model choice, tool access, and the branded assistant experience so behavior stays consistent.

Which AI models can I use?

InsertChat supports multiple model providers. Choose each assistant's model for quality, speed, and cost, or use BYOK.

Can I pick different models for different workflows?

Yes. Use a faster model for common questions and a stronger model for complex reasoning. InsertChat supports that balance per conversation.

Where can I deploy an assistant?

Use a widget, embed, full-page assistant, custom domain, in-app embed, or API. Reuse one setup across surfaces.

Do I need coding skills?

No. Build and deploy AI assistants using our visual builder. The embed code is one line of JavaScript.

Can I customize the branding and UI?

Yes. Customize the assistant name, logo, colors, welcome message, suggested prompts, tone, domain, and white-label presentation.

Can I use my own domain?

Yes. Custom domains are supported, typically via enterprise options.

Does InsertChat support voice?

Yes. Voice dictation and text-to-speech let users speak instead of type.

Does InsertChat support vision?

Yes. Enable vision for assistants when images help clarify a request or context.

What tools and integrations are supported?

Zendesk, HubSpot, Shopify, WooCommerce, calendar booking, web search, Perplexity, and webhooks for your own systems.

Can I control which tools the assistant is allowed to use?

Yes. Tool access is controlled per assistant so you enable only what you need.

Can the assistant hand off to a human?

Yes. Configure human handoff so the assistant escalates when needed. Full conversation history is passed along.

Do you provide analytics?

Yes. Track chats, leads, feedback, top questions, unanswered questions, most-used sources, and content gaps.

Is it mobile friendly?

Yes. The widget and embeds work well on desktop and mobile with no separate experience needed.

What's the fastest path to a successful deployment?

Start with one assistant and a small set of high-value sources. Iterate using real questions from analytics.

What is the fastest way to get started?

Create an account. Connect one key source. Ask a test question, brand the assistant, then publish it on one page.