Glossary

AI glossary for content assistants

Plain-English definitions of 13,917 AI terms for branded assistant teams.

Plain English · RAG · LLMs

Naive RAG

The simplest RAG implementation that retrieves documents and passes them directly to a language model without additional processing or refinement.

Advanced RAG

An enhanced RAG approach that adds pre-retrieval, retrieval, and post-retrieval optimizations such as query rewriting, re-ranking, and answer refinement.

Modular RAG

A flexible RAG architecture composed of interchangeable modules for retrieval, processing, and generation that can be configured for different use cases.

Self-RAG

A RAG variant where the language model decides when to retrieve, evaluates retrieved passages, and critiques its own generation for quality and faithfulness.

Corrective RAG

A RAG approach that evaluates retrieved documents for relevance and triggers corrective actions like web search or query refinement when retrieval quality is poor.

Adaptive RAG

A RAG system that dynamically adjusts its retrieval strategy based on query complexity, answering simple queries without retrieval or with a single retrieval step and routing complex ones through multi-step retrieval.

Iterative RAG

A RAG approach that performs multiple rounds of retrieval and generation, using each round's output to refine subsequent queries and improve answer quality.

Multi-step RAG

A RAG pipeline that breaks complex queries into multiple sub-questions, retrieves information for each, and synthesizes a comprehensive final answer.

Recursive RAG

A RAG approach that recursively retrieves and processes information, using results from one retrieval step to inform the next until sufficient context is gathered.

Agentic RAG

A RAG system where an AI agent orchestrates the retrieval process, dynamically deciding what to search for, when to retrieve, and how to use retrieved information.

Graph RAG

A RAG approach that uses knowledge graphs to structure and retrieve information, capturing entity relationships that flat document retrieval misses.

Structured RAG

A RAG approach that leverages structured data sources like databases, tables, and APIs alongside unstructured text for more precise and comprehensive retrieval.

Multi-modal RAG

A RAG system that retrieves and reasons over multiple data types including text, images, tables, and audio to generate comprehensive answers.

Long-form RAG

A RAG approach optimized for generating extended, well-structured responses such as reports, summaries, or articles from multiple retrieved sources.

FLARE

Forward-Looking Active REtrieval is a technique where the model generates a tentative next sentence and, if that sentence contains low-confidence tokens, retrieves supporting documents and regenerates it.

REPLUG

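A retrieval-augmented framework that treats the language model as a frozen black box and the retriever as a pluggable module, prepending retrieved documents to the model's input. In the REPLUG LSR variant, the retriever is tuned using the language model's output as a supervision signal.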

RETRO

Retrieval-Enhanced Transformer is a model architecture that interleaves retrieval into the transformer layers, retrieving during both training and inference.

Atlas

A retrieval-augmented language model from Meta that jointly pre-trains a retriever and language model, achieving strong few-shot performance on knowledge tasks.

Interleaved Retrieval-Generation

A technique that alternates between generating text and retrieving information, allowing the model to fetch context as needed throughout the generation process.

Chroma

An open-source embedding database designed for simplicity, making it easy to build AI applications with embeddings by providing a developer-friendly API.

Vespa

An open-source serving engine for large-scale data that combines vector search, text search, and structured data processing in a single platform.

HNSW

Hierarchical Navigable Small World is a graph-based indexing algorithm for fast approximate nearest neighbor search, widely used in vector databases.

IVF

Inverted File Index is a vector indexing method that partitions vectors into clusters and searches only the most relevant clusters for faster retrieval.
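
As a rough illustration of the idea, the sketch below assigns each vector to its nearest centroid and then scans only the `nprobe` closest clusters at query time. The centroids are hard-coded here as an assumption; in practice they are usually learned with k-means. The function names are hypothetical and not taken from any library.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_ivf(vectors, centroids):
    """Put each vector id into the inverted list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        lists[nearest].append(vid)
    return lists

def ivf_search(query, vectors, centroids, lists, nprobe=1, k=1):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))[:nprobe]
    candidates = [vid for c in probe for vid in lists[c]]
    return sorted(candidates, key=lambda vid: l2(query, vectors[vid]))[:k]

vectors = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9)]
centroids = [(0.0, 0.0), (5.0, 5.0)]  # assumed for the example, not learned
lists = build_ivf(vectors, centroids)
result = ivf_search((4.9, 5.0), vectors, centroids, lists, nprobe=1, k=1)
```

A larger `nprobe` scans more clusters, which generally trades speed for recall.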

Product Quantization

A vector compression technique that divides high-dimensional vectors into subspaces and quantizes each independently, dramatically reducing memory usage.
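
A minimal sketch of the encode and decode steps, assuming the per-subspace codebooks already exist (they are normally learned with k-means on training vectors). The codebook values and function names below are made up for illustration.

```python
def nearest_code(sub, codebook):
    """Index of the codebook entry closest to the sub-vector (squared L2)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(sub, codebook[i])),
    )

def pq_encode(vec, codebooks):
    """Split vec into len(codebooks) equal sub-vectors and quantize each one."""
    m = len(codebooks)
    size = len(vec) // m
    return [nearest_code(vec[i * size:(i + 1) * size], codebooks[i]) for i in range(m)]

def pq_decode(codes, codebooks):
    """Rebuild an approximate vector by concatenating the chosen centroids."""
    out = []
    for i, code in enumerate(codes):
        out.extend(codebooks[i][code])
    return out

# Two subspaces with two centroids each (hypothetical values).
codebooks = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.0, 1.0), (1.0, 0.0)],
]
codes = pq_encode([0.9, 1.1, 0.1, 0.8], codebooks)  # one small integer per subspace
approx = pq_decode(codes, codebooks)                # lossy reconstruction
```

The memory saving comes from storing only the small integer codes instead of the full floating-point vector.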

Locality-Sensitive Hashing

A hashing technique that maps similar vectors to the same hash buckets with high probability, enabling fast approximate nearest neighbor search through hash lookups.
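
One common LSH family for cosine similarity uses random hyperplanes: each bit of the hash records which side of a hyperplane the vector falls on. The sketch below assumes that variant; the definition above does not name a specific one. The function names are hypothetical.

```python
import random

def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplane normals from a Gaussian."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_bits)]

def lsh_signature(vec, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(v * h for v, h in zip(vec, plane)) >= 0 else 0
        for plane in hyperplanes
    )

def build_buckets(vectors, hyperplanes):
    """Group vector ids by signature; each signature is one hash bucket."""
    buckets = {}
    for vid, vec in enumerate(vectors):
        buckets.setdefault(lsh_signature(vec, hyperplanes), []).append(vid)
    return buckets

planes = make_hyperplanes(num_bits=8, dim=3)
# Vectors pointing in the same direction always share a bucket.
buckets = build_buckets([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]], planes)
```

At query time, only the vectors in the query's bucket (and optionally nearby buckets) need to be compared exactly.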

DiskANN

A graph-based indexing algorithm that stores the index on disk rather than in memory, enabling billion-scale vector search on standard hardware without expensive RAM.

Flat Index

A vector index that stores all vectors without compression or approximation, providing exact nearest neighbor search by comparing against every vector in the database.

Brute Force Search

A search method that compares a query vector against every vector in the database to find exact nearest neighbors, providing perfect accuracy at the cost of speed.
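
Exact search over a flat index can be sketched in a few lines. This version uses Euclidean distance as an assumption; any metric could be substituted. The function name is hypothetical.

```python
import math

def brute_force_search(query, vectors, k=1):
    """Return the ids of the k vectors closest to the query (exact search)."""
    # Every stored vector is compared against the query, so cost grows linearly.
    scored = [(math.dist(query, vec), idx) for idx, vec in enumerate(vectors)]
    scored.sort()
    return [idx for _, idx in scored[:k]]

vectors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
neighbors = brute_force_search((0.9, 1.2), vectors, k=2)
```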

text-embedding-ada-002

OpenAI's second-generation text embedding model that converts text into 1536-dimensional vectors, widely used for semantic search and RAG applications.

text-embedding-3-small

OpenAI's compact third-generation embedding model offering strong performance with flexible dimensions and lower cost than its larger sibling.

text-embedding-3-large

OpenAI's most capable third-generation embedding model, producing up to 3072-dimensional vectors with flexible dimension support for maximum accuracy.

Cohere Embed v3

Cohere's third-generation embedding model that supports over 100 languages and provides specialized search and classification embedding types.

Voyage AI

An embedding model provider specializing in high-quality, domain-specific embeddings for code, legal, finance, and general-purpose retrieval.

BGE

BAAI General Embedding is a family of open-source embedding models from the Beijing Academy of Artificial Intelligence (BAAI) that perform strongly on retrieval benchmarks.

E5

EmbEddings from bidirEctional Encoder rEpresentations is a family of open-source text embedding models from Microsoft known for strong zero-shot retrieval.

CLIP

Contrastive Language-Image Pre-training is an OpenAI model that learns to connect text and images in a shared embedding space, enabling cross-modal search.

Dense Embedding

A vector representation in which most or all dimensions hold non-zero real values, capturing semantic meaning in a compact, continuous numerical space.

Sparse Embedding

A vector representation where most dimensions are zero, with non-zero values corresponding to specific vocabulary terms or features in the input text.

Multi-vector Embedding

A representation approach that produces multiple vectors per text input, one per token or segment, enabling finer-grained matching than single-vector embeddings.

Matryoshka Embedding

An embedding training technique that produces vectors useful at multiple dimensions, allowing you to truncate to shorter lengths while preserving most quality.
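
Truncation itself is simple: keep the leading dimensions and re-normalize so cosine and dot-product scores stay comparable. The sketch below shows only that step and assumes the vector came from a model trained with the Matryoshka objective; truncating an ordinary embedding this way may lose far more quality. The function name is hypothetical.

```python
import math

def truncate_embedding(vec, dims):
    """Keep the first dims values and rescale the result to unit length."""
    head = list(vec[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    # A zero vector cannot be normalized, so it is returned unchanged.
    return [x / norm for x in head] if norm else head

short = truncate_embedding([3.0, 4.0, 12.0], dims=2)
```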

Cosine Distance

The complement of cosine similarity (1 minus cosine similarity), measuring how different two vectors are, where 0 means identical direction and 2 means opposite.
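
A small sketch of the formula, written without external libraries. It assumes neither vector is all zeros, since the cosine is undefined in that case.

```python
import math

def cosine_distance(a, b):
    """1 minus the cosine of the angle between a and b (non-zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

same = cosine_distance([1.0, 0.0], [1.0, 0.0])        # identical direction
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])  # unrelated direction
opposite = cosine_distance([1.0, 0.0], [-1.0, 0.0])   # opposite direction
```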

L2 Distance

Another name for Euclidean distance, computing the straight-line distance between two vectors in high-dimensional space using the L2 norm.

Manhattan Distance

A distance metric that sums the absolute differences across all dimensions, measuring distance along grid lines rather than straight-line distance.
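
To make the contrast with straight-line (L2) distance concrete, the sketch below computes both metrics for the same pair of points. The function names are chosen for this example.

```python
import math

def manhattan_distance(a, b):
    """Sum of absolute per-dimension differences (the L1 norm of a - b)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def l2_distance(a, b):
    """Straight-line (Euclidean) distance, the L2 norm of a - b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

grid = manhattan_distance((0, 0), (3, 4))  # 3 steps across plus 4 steps up
straight = l2_distance((0, 0), (3, 4))     # hypotenuse of a 3-4-5 triangle
```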

Jaccard Similarity

A set-based similarity metric that measures the overlap between two sets by dividing the size of their intersection by the size of their union.
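
A minimal sketch using Python sets. Treating two empty sets as identical (similarity 1.0) is a convention assumed here, since the ratio is otherwise undefined.

```python
def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention for two empty sets
    return len(a & b) / len(a | b)

score = jaccard_similarity(
    {"rag", "vector", "search"},
    {"vector", "search", "index"},
)  # 2 shared terms out of 4 distinct terms
```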

Hamming Distance

A distance metric that counts the number of positions where two equal-length sequences differ, commonly used for comparing binary vectors and hash codes.
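
The count can be sketched for generic sequences, and for integers that hold binary hash codes an XOR followed by a bit count gives the same result. Both helper names are hypothetical.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def hamming_distance_bits(a, b):
    """Hamming distance between two non-negative ints viewed as bit strings."""
    return bin(a ^ b).count("1")

from_strings = hamming_distance("10110", "10011")
from_ints = hamming_distance_bits(0b10110, 0b10011)
```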

Fixed-size Chunking

A text splitting strategy that divides documents into chunks of a predetermined character or token count. It is simple to implement but may break content at arbitrary points.
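
A character-based version can be sketched as follows; the function name is hypothetical. The example output shows how a fixed boundary can land in the middle of a word.

```python
def fixed_size_chunks(text, size):
    """Split text into consecutive chunks of at most size characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [text[i:i + size] for i in range(0, len(text), size)]

chunks = fixed_size_chunks("The quick brown fox", 7)
# chunks == ["The qui", "ck brow", "n fox"]
```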

Token-based Chunking

A chunking method that splits text based on token count rather than character count, ensuring chunks align with how language models process text.
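
The sketch below shows the shape of the approach with a pluggable tokenizer. Whitespace splitting is used only as a stand-in so the example runs without dependencies; it is an assumption, and a real pipeline would pass in the tokenizer of the target model so that counts match what the model sees. The function and parameter names are hypothetical.

```python
def token_chunks(text, max_tokens, tokenize=str.split, detokenize=" ".join):
    """Split text into chunks holding at most max_tokens tokens each."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    tokens = tokenize(text)  # stand-in tokenizer: whitespace split
    return [
        detokenize(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]

chunks = token_chunks("one two three four five", max_tokens=2)
```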

Sentence-based Chunking

A chunking strategy that splits text at sentence boundaries, ensuring each chunk contains complete sentences for more coherent retrieval results.
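
One way to sketch this is to split on sentence-ending punctuation and then pack whole sentences into chunks up to a size limit. The regular expression below is a deliberately naive assumption and will mis-split abbreviations such as "e.g."; a dedicated sentence splitter would be more robust. The function name is hypothetical.

```python
import re

def sentence_chunks(text, max_chars):
    """Group whole sentences into chunks of roughly max_chars characters."""
    # Naive boundary rule: whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)  # close the chunk at a sentence boundary
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

text = "RAG retrieves documents. It then generates an answer. Chunking affects quality."
chunks = sentence_chunks(text, max_chars=55)
```

A single sentence longer than `max_chars` is kept whole rather than cut, so the limit is a target, not a hard cap.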

Page 18 of 290. Showing 48 of 13,917 matching glossary pages.

Product FAQ

What is InsertChat?

InsertChat is a white-label AI assistant for your website. Train it, brand it, publish it, and learn from visitor questions.

How does InsertChat use my website content?

Connect approved pages, docs, videos, FAQs, policies, and other sources. InsertChat turns them into source-backed answers and next steps.

Can I control the assistant's tone and sources?

Yes. Choose its sources, tone, welcome message, and prompts so it stays on brand.

How does InsertChat stay accurate?

Answers use approved content and source links. Analytics show unclear or missing answers so you can improve coverage.

Can it collect leads or route support questions?

Yes. InsertChat can collect details, qualify intent, add context, and send chats to the right inbox, CRM, workflow, or person.

Can I control how the assistant behaves?

Yes. Control prompts, model choice, tool access, and the branded assistant experience so behavior stays consistent.

Which AI models can I use?

InsertChat supports multiple model providers. Choose each assistant's model for quality, speed, and cost, or use BYOK.

Can I pick different models for different workflows?

Yes. Use a faster model for common questions and a stronger model for complex reasoning. InsertChat supports that balance per conversation.

Where can I deploy an assistant?

Use a widget, embed, full-page assistant, custom domain, in-app embed, or API. Reuse one setup across surfaces.

Do I need coding skills?

No. Build and deploy AI assistants using our visual builder. The embed code is one line of JavaScript.

Can I customize the branding and UI?

Yes. Customize the assistant name, logo, colors, welcome message, suggested prompts, tone, domain, and white-label presentation.

Can I use my own domain?

Yes. Custom domains are supported, typically via enterprise options.

Does InsertChat support voice?

Yes. Voice dictation and text-to-speech let users speak instead of type.

Does InsertChat support vision?

Yes. Enable vision for assistants when images help clarify a request or context.

What tools and integrations are supported?

Zendesk, HubSpot, Shopify, WooCommerce, calendar booking, web search, Perplexity, and webhooks for your own systems.

Can I control which tools the assistant is allowed to use?

Yes. Tool access is controlled per assistant so you enable only what you need.

Can the agent hand off to a human?

Yes. Configure human handoff so the agent escalates when needed. Full conversation history is passed along.

Do you provide analytics?

Yes. Track chats, leads, feedback, top questions, unanswered questions, most-used sources, and content gaps.

Is it mobile friendly?

Yes. The widget and embeds work well on desktop and mobile with no separate experience needed.

What's the fastest path to a successful deployment?

Start with one assistant and a small set of high-value sources. Iterate using real questions from analytics.

What is the fastest way to get started?

Create an account. Connect one key source. Ask a test question, brand the assistant, then publish it on one page.