AI glossary for content assistants
Plain-English definitions of 13,917 AI terms for branded assistant teams.
Glossary
13,917 terms. Open one for definitions and related concepts.
Paragraph-based Chunking
A chunking strategy that uses paragraph boundaries as natural split points, preserving topical coherence within each chunk.
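As a rough illustration of the idea, the sketch below treats one or more blank lines as the paragraph boundary. The function name `paragraph_chunks` is hypothetical, and real documents may need a different boundary rule (for example HTML `<p>` tags).

```python
import re

def paragraph_chunks(text):
    """Split text into chunks at blank lines (assumed paragraph boundary)."""
    # One or more blank lines, possibly containing whitespace, end a paragraph.
    pieces = re.split(r"\n\s*\n", text)
    # Drop empty pieces and trim surrounding whitespace from each chunk.
    return [p.strip() for p in pieces if p.strip()]

doc = "Refunds take five days.\nContact support to start one.\n\nShipping is free over $50."
chunks = paragraph_chunks(doc)
```

Each chunk keeps a whole paragraph together, so a chunk's length depends on how the author wrote the document.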
Recursive Character Text Splitting
A LangChain chunking method that recursively splits text by trying different separators in order of preference, from paragraphs down to individual characters.
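The recursion can be sketched in plain Python. This is a simplified illustration of the general idea, not LangChain's implementation: the function name `recursive_split` and the default separator list are assumptions, and the list is assumed to end with an empty string that triggers a hard character cut.

```python
def recursive_split(text, max_len, separators=("\n\n", "\n", " ", "")):
    """Split text into pieces of at most max_len characters.

    Tries the coarsest separator first and falls back to finer ones
    only for pieces that are still too long.
    """
    if len(text) <= max_len:
        return [text] if text else []
    sep, finer = separators[0], separators[1:]
    if sep == "":
        # Last resort: cut at fixed character offsets.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    parts = text.split(sep)
    if len(parts) == 1:
        # Separator not present: try the next, finer one.
        return recursive_split(text, max_len, finer)
    chunks, current = [], ""
    for part in parts:
        candidate = part if not current else current + sep + part
        if len(candidate) <= max_len:
            current = candidate  # keep packing parts into the current chunk
            continue
        if current:
            chunks.append(current)
        if len(part) <= max_len:
            current = part
        else:
            # This part alone is too long: split it with finer separators.
            chunks.extend(recursive_split(part, max_len, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks

pieces = recursive_split("alpha beta\n\ngamma delta epsilon\n\nzeta", max_len=12)
# pieces == ["alpha beta", "gamma delta", "epsilon", "zeta"]
```

Only the middle paragraph is too long for the limit, so it alone is split at spaces while the other paragraphs stay intact.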
Structure-aware Chunking
A chunking approach that uses document structure elements like headings, sections, and tables to create meaningful chunks that respect the document's organization.
Parent-child Chunking
A strategy that creates small chunks for precise retrieval but passes their larger parent chunks to the language model for richer context.
Hierarchical Chunking
A chunking approach that creates multiple levels of chunks reflecting the document's hierarchy, from sections down to paragraphs and sentences.
Chunk Overlap
A technique where consecutive chunks share some overlapping text at their boundaries to prevent important context from being lost at split points.
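A minimal sketch of overlap using a sliding window over a token list. The function name and the `size` and `overlap` parameters are illustrative choices, not taken from any particular library.

```python
def chunk_with_overlap(tokens, size, overlap):
    """Return windows of `size` tokens where neighbours share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    if not tokens:
        return []
    step = size - overlap  # how far the window advances each time
    # Stop once the remaining tokens are already covered by the previous window.
    last_start = max(len(tokens) - overlap, 1)
    return [tokens[i:i + size] for i in range(0, last_start, step)]

windows = chunk_with_overlap(list(range(10)), size=4, overlap=1)
# windows == [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

Token 3 and token 6 each appear in two windows, so a sentence that straddles a boundary is still seen whole by at least one chunk.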
Small-to-big Retrieval
A retrieval strategy that searches over small chunks for precision then expands to larger surrounding context before sending to the language model.
Sentence Window Retrieval
A technique that retrieves individual sentences but returns a window of surrounding sentences as context, balancing retrieval precision with generation context.
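The expansion step might look like the sketch below, assuming the retriever has already returned the index of the matching sentence. The name `sentence_window` and the `window` parameter are hypothetical.

```python
def sentence_window(sentences, hit_index, window=1):
    """Return the matched sentence plus `window` sentences on each side."""
    start = max(0, hit_index - window)                  # clamp at document start
    end = min(len(sentences), hit_index + window + 1)   # clamp at document end
    return " ".join(sentences[start:end])

sentences = [
    "Plans renew monthly.",
    "You can cancel at any time.",
    "Refunds are prorated.",
    "Contact billing for invoices.",
]
context = sentence_window(sentences, hit_index=2, window=1)
```

Here the search matches only "Refunds are prorated.", but the language model receives that sentence together with its two neighbours.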
Auto-merging Retrieval
A technique that automatically merges smaller retrieved chunks into larger parent chunks when enough child chunks from the same parent are retrieved.
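One way to sketch the merge rule, assuming each child chunk ID maps to a parent ID and that "enough" means more than a fixed fraction of a parent's children. The `threshold` value and all names here are assumptions for illustration.

```python
from collections import Counter

def auto_merge(retrieved_children, parent_of, children_per_parent, threshold=0.5):
    """Replace child chunks with their parent when enough siblings were retrieved."""
    hits = Counter(parent_of[c] for c in retrieved_children)
    result, merged = [], set()
    for child in retrieved_children:
        parent = parent_of[child]
        if hits[parent] / children_per_parent[parent] > threshold:
            # Enough siblings retrieved: emit the parent once, in first-hit position.
            if parent not in merged:
                merged.add(parent)
                result.append(parent)
        else:
            result.append(child)
    return result

parent_of = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B", "b3": "B"}
children_per_parent = {"A": 3, "B": 3}
merged = auto_merge(["a1", "b1", "a2"], parent_of, children_per_parent)
# merged == ["A", "b1"]
```

Two of parent A's three children were retrieved, so they collapse into A, while the single hit under B stays a small chunk.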
Re-ranking
A retrieval optimization that applies a more accurate but slower model to re-score and reorder initial search results, improving the final ranking quality.
Cross-encoder Reranking
A re-ranking approach that uses a cross-encoder model to jointly score query-document pairs, providing more accurate relevance judgments than bi-encoder similarity.
Multi-stage Retrieval
A retrieval pipeline with multiple sequential filtering and ranking stages, progressively narrowing and improving results from a broad initial search.
Retrieve-and-rerank
A two-stage search pattern that first retrieves candidates using fast methods, then re-orders them with a more accurate model for better final results.
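The two stages can be sketched with pluggable scoring functions. The scorers below are toy stand-ins chosen so the example runs without a model: `word_overlap` plays the fast first stage and `phrase_match` plays the slower, more accurate reranker. All names are hypothetical.

```python
def retrieve_and_rerank(query, docs, cheap_score, precise_score, k=3, n=1):
    """Stage 1: keep the top-k docs by a cheap score. Stage 2: reorder by a precise score."""
    candidates = sorted(docs, key=lambda d: cheap_score(query, d), reverse=True)[:k]
    return sorted(candidates, key=lambda d: precise_score(query, d), reverse=True)[:n]

def word_overlap(query, doc):
    """Cheap score: number of distinct words shared by query and document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def phrase_match(query, doc):
    """Toy 'precise' score: 1.0 if the query appears as an exact phrase."""
    return 1.0 if query.lower() in doc.lower() else 0.0

docs = [
    "password reset your account never",
    "billing and invoices",
    "reset your password from the login page",
]
top = retrieve_and_rerank("reset your password", docs, word_overlap, phrase_match, k=2, n=1)
```

Both password documents tie in the first stage, and the second stage promotes the one that contains the exact phrase. In a real pipeline the second scorer would typically be a cross-encoder or similar model.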
Query Decomposition
Breaking a complex question into simpler sub-questions that can each be answered independently, then combining the answers for a comprehensive response.
Hypothetical Document Embedding
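The full name for HyDE, a technique that asks a language model to write a hypothetical answer document for the query and then searches with that document's embedding instead of the query's, which often retrieves more relevant passages.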
Multi-query Retrieval
A technique that generates multiple different queries from a single user question and retrieves documents for each, combining results for broader coverage.
Sub-question Decomposition
Breaking a complex question into independent sub-questions that can be individually answered and combined into a comprehensive response.
Ontology
A formal specification of concepts, categories, and relationships within a domain, providing a shared vocabulary and structure for organizing knowledge.
Taxonomy
A hierarchical classification system that organizes concepts into parent-child categories, helping structure knowledge for retrieval and navigation.
Triple
A basic unit of knowledge graph data consisting of a subject, predicate, and object that represents a single fact or relationship between entities.
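A triple can be modeled as a three-field record, and a simple query as a pattern with wildcards. This is an illustrative in-memory sketch, not the API of any graph database; the `Triple` class, the `match` helper, and the sample facts are assumptions.

```python
from typing import List, NamedTuple, Optional

class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str

def match(triples: List[Triple], subject: Optional[str] = None,
          predicate: Optional[str] = None, obj: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]

facts = [
    Triple("Paris", "capital_of", "France"),
    Triple("Berlin", "capital_of", "Germany"),
    Triple("Paris", "located_in", "Europe"),
]
about_paris = match(facts, subject="Paris")
capitals = match(facts, predicate="capital_of")
```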
RDF
Resource Description Framework is a W3C standard for representing knowledge as triples, providing a common format for describing entities and relationships on the web.
Property Graph
A graph data model where both nodes and relationships can have properties (key-value pairs), offering a flexible and intuitive way to model complex domains.
Wikidata
A free, collaboratively edited knowledge base hosted by the Wikimedia Foundation, containing structured data about millions of entities used by AI systems worldwide.
DBpedia
A knowledge base that extracts structured information from Wikipedia articles, making encyclopedia knowledge available as a queryable graph database.
ConceptNet
A commonsense knowledge graph connecting words and phrases with labeled relationships, capturing everyday knowledge that AI systems need to understand language.
Document Loader
A component that ingests documents from various sources and formats, converting them into a standardized format for processing in a RAG pipeline.
PDF Parser
A tool that extracts text, tables, and structure from PDF documents, converting them into processable format for AI knowledge bases.
Web Scraper
A tool that extracts content from web pages by parsing HTML, handling JavaScript rendering, and cleaning the extracted text for AI processing.
Web Crawler
A program that systematically browses websites by following links, discovering pages that can then be scraped and added to an AI knowledge base.
OCR
Optical Character Recognition converts images of text into machine-readable text, enabling AI systems to process scanned documents, photos, and handwritten content.
Table Extraction
The process of identifying and extracting structured tabular data from documents, preserving row-column relationships for accurate AI processing.
Layout Analysis
The process of understanding the visual structure of a document page, identifying regions like text blocks, tables, figures, and headers for proper content extraction.
Document Understanding
The ability of AI to comprehend document content by analyzing both text and visual layout, extracting structured information from complex document formats.
Metadata Extraction
The process of pulling out descriptive information about a document, such as title, author, date, and categories, to enrich knowledge base entries for better retrieval.
RAG Evaluation
The process of measuring how well a RAG system retrieves relevant content and generates accurate, faithful answers from the retrieved context.
Faithfulness
A RAG evaluation metric measuring whether the generated answer accurately represents the information in the retrieved context without adding unsupported claims.
Answer Relevancy
A RAG evaluation metric measuring how well the generated answer addresses the user's original question, regardless of factual accuracy.
Context Precision
A RAG evaluation metric measuring what proportion of the retrieved context is actually relevant to answering the user's question.
Context Recall
A RAG evaluation metric measuring what proportion of the information needed to answer a question was successfully retrieved from the knowledge base.
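Read literally, the Context Precision and Context Recall definitions above are two ratios over sets of chunk IDs. The sketch below assumes you already have labeled sets of relevant and needed chunks; evaluation frameworks often estimate these with a language model and may weight ranks differently, so treat this as a simplified illustration with hypothetical function names.

```python
def context_precision(retrieved, relevant):
    """Share of retrieved chunks that are relevant to the question."""
    if not retrieved:
        return 0.0
    return sum(1 for c in retrieved if c in relevant) / len(retrieved)

def context_recall(retrieved, needed):
    """Share of the needed chunks that were actually retrieved."""
    if not needed:
        return 1.0  # nothing was required, so nothing was missed
    return sum(1 for c in needed if c in retrieved) / len(needed)

retrieved = ["c1", "c2", "c3", "c4"]
needed = {"c1", "c3", "c9"}
precision = context_precision(retrieved, needed)  # 2 of 4 retrieved are relevant
recall = context_recall(retrieved, needed)        # 2 of 3 needed were retrieved
```

Retrieving more chunks tends to raise recall and lower precision, which is why the two are usually reported together.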
Groundedness
A measure of how well an AI response is supported by and traceable to specific source material, closely related to faithfulness in RAG evaluation.
Hallucination Rate
A metric measuring the frequency at which an AI system generates claims not supported by its source material, indicating how often it makes things up.
Noise Robustness
A RAG system's ability to generate accurate answers even when some of the retrieved context is irrelevant, outdated, or contradictory.
Scalar Quantization
A compression technique that reduces the precision of each dimension in a vector from 32-bit floats to smaller representations like 8-bit integers.
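A minimal sketch of 8-bit scalar quantization in plain Python, assuming a simple per-vector min/max range. Vector databases typically calibrate ranges across a whole collection, so the scheme and function names here are illustrative only.

```python
def quantize(vec):
    """Map each float to an integer code in 0..255 using the vector's min/max range."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0  # avoid division by zero for constant vectors
    codes = [round((x - lo) / scale) for x in vec]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Approximate the original floats from the integer codes."""
    return [lo + c * scale for c in codes]

vec = [-1.0, 0.0, 0.5, 1.0]
codes, lo, scale = quantize(vec)
approx = dequantize(codes, lo, scale)
```

Each dimension now needs one byte instead of four, and the reconstruction error per dimension is at most half of `scale`.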
Binary Quantization
An aggressive compression method that represents each vector dimension as a single bit, enabling extremely fast search with minimal memory usage at some cost in accuracy.
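One common way to pick the bit is the sign of each dimension, after which vectors can be compared by Hamming distance. The sketch below assumes that sign-based rule; the function names are hypothetical.

```python
def binarize(vec):
    """Pack a float vector into an int, one bit per dimension (1 if value > 0)."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits

def hamming(a, b):
    """Count the bit positions where two packed vectors differ."""
    return bin(a ^ b).count("1")

q = binarize([0.3, -0.2, 0.9, -0.5])   # bits 1010
d = binarize([0.1, 0.4, 0.8, -0.1])    # bits 1110
distance = hamming(q, d)               # the vectors differ in one dimension's sign
```

The XOR and bit count are cheap integer operations, which is where the speed comes from. Systems that use this approach often rescore the top candidates with the original vectors to recover accuracy.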
Random Projection
A dimensionality reduction technique that projects high-dimensional vectors into a lower-dimensional space using random matrices while approximately preserving distances.
Ball Tree
A tree-based data structure for organizing points in multi-dimensional space, enabling efficient nearest neighbor search by partitioning space into nested hyperspheres.
KD-Tree
A space-partitioning data structure that organizes points by recursively splitting along coordinate axes, efficient for low-dimensional nearest neighbor search.
OpenAI Embedding Ada
OpenAI text-embedding-ada-002, a widely adopted embedding model that produces 1536-dimensional vectors for semantic search and retrieval tasks.
Product FAQ
What is InsertChat?
InsertChat is a white-label AI assistant for your website. Train it, brand it, publish it, and learn from visitor questions.
How does InsertChat use my website content?
Connect approved pages, docs, videos, FAQs, policies, and other sources. InsertChat turns them into source-backed answers and next steps.
Can I control the assistant's tone and sources?
Yes. Choose its sources, tone, welcome message, and prompts so it stays on brand.
How does InsertChat stay accurate?
Answers use approved content and source links. Analytics show unclear or missing answers so you can improve coverage.
Can it collect leads or route support questions?
Yes. InsertChat can collect details, qualify intent, add context, and send chats to the right inbox, CRM, workflow, or person.
Can I control how the assistant behaves?
Yes. Control prompts, model choice, tool access, and the branded assistant experience so behavior stays consistent.
Which AI models can I use?
InsertChat supports multiple model providers. Choose each assistant's model for quality, speed, and cost, or use BYOK.
Can I pick different models for different workflows?
Yes. Use a faster model for common questions and a stronger model for complex reasoning. InsertChat supports that balance per conversation.
Where can I deploy an assistant?
Use a widget, embed, full-page assistant, custom domain, in-app embed, or API. Reuse one setup across surfaces.
Do I need coding skills?
No. Build and deploy AI assistants with the visual builder. The embed code is one line of JavaScript.
Can I customize the branding and UI?
Yes. Customize the assistant name, logo, colors, welcome message, suggested prompts, tone, domain, and white-label presentation.
Can I use my own domain?
Yes. Custom domains are supported, typically via enterprise options.
Does InsertChat support voice?
Yes. Voice dictation and text-to-speech let users speak instead of type.
Does InsertChat support vision?
Yes. Enable vision for assistants when images help clarify a request or context.
What tools and integrations are supported?
Zendesk, HubSpot, Shopify, WooCommerce, calendar booking, web search, Perplexity, and webhooks for your own systems.
Can I control which tools the assistant is allowed to use?
Yes. Tool access is controlled per assistant so you enable only what you need.
Can the agent hand off to a human?
Yes. Configure human handoff so the agent escalates when needed. Full conversation history is passed along.
Do you provide analytics?
Yes. Track chats, leads, feedback, top questions, unanswered questions, most-used sources, and content gaps.
Is it mobile friendly?
Yes. The widget and embeds work well on desktop and mobile with no separate experience needed.
What's the fastest path to a successful deployment?
Start with one assistant and a small set of high-value sources. Iterate using real questions from analytics.
What is the fastest way to get started?
Create an account. Connect one key source. Ask a test question, brand the assistant, then publish it on one page.