Glossary

AI glossary for content assistants

Plain-English definitions of 13,917 AI terms for branded assistant teams.


Model Parallelism

Model parallelism distributes a neural network across multiple GPUs when it is too large to fit on a single GPU, enabling training and inference of very large models.

Pipeline Parallelism

Pipeline parallelism assigns consecutive neural network layers to different GPUs, so that models too deep to fit on one GPU can be trained and run across multiple devices.

Speculative Decoding

Speculative decoding accelerates LLM inference by having a small draft model propose several tokens that the large model then verifies in a single parallel pass, with reported speedups of roughly 2-4x depending on how often the drafts are accepted.
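
The propose-then-verify loop can be sketched in a few lines. This is a simplified greedy variant, not the sampling-based acceptance rule used in production systems; `draft_next` and `target_next` are hypothetical stand-ins for the two models, each mapping a token list to the next token.

```python
def speculative_decode(prompt, draft_next, target_next, k=4, max_new=8):
    """Greedy speculative decoding sketch (hypothetical model callables)."""
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_new:
        # 1. The draft model proposes k tokens, one after another.
        proposal = []
        for _ in range(k):
            proposal.append(draft_next(tokens + proposal))
        # 2. The target model checks each proposed position. A real system
        #    would do this in one batched forward pass; here it is a loop.
        accepted = []
        for tok in proposal:
            expected = target_next(tokens + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)  # replace the first mismatch and stop
                break
        else:
            # Every draft token matched, so the target contributes one more.
            accepted.append(target_next(tokens + accepted))
        tokens.extend(accepted)
    return tokens[: len(prompt) + max_new]
```

Because every kept token is one the target model would have chosen itself, the output matches plain greedy decoding with the target model; the draft only affects how many target passes are needed.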

NVIDIA Triton Inference Server

NVIDIA Triton Inference Server is an open-source platform for deploying AI models at scale, supporting multiple frameworks and optimization backends for production inference.

HBM4

HBM4 is the fourth generation of High Bandwidth Memory, offering higher per-stack bandwidth and capacity than HBM3E, in part by doubling the interface width to 2,048 bits, for next-generation AI accelerators.

Quantization-Aware Training

Quantization-Aware Training (QAT) simulates quantization effects during training, producing models that maintain accuracy under low-precision inference.

RoCE Networking

RoCE (RDMA over Converged Ethernet) is a network protocol enabling direct memory access between servers over standard Ethernet, used as an alternative to InfiniBand in AI clusters.

AI Memory Hierarchy

The AI memory hierarchy describes the layered structure of memory types in AI hardware, from on-chip registers and SRAM to on-package HBM and host DRAM, with speed-capacity tradeoffs at each level.

NVIDIA Jetson

NVIDIA Jetson is a family of edge AI computing modules combining an Arm CPU, an NVIDIA GPU, and (on some modules) dedicated deep learning accelerators on a compact, power-efficient platform for deploying AI at the edge.

Information Retrieval

Information retrieval is the science of searching for and finding relevant documents, data, or information from large collections based on user queries.

Search Engine

A search engine is a system that indexes and retrieves information from large document collections, ranking results by relevance to user queries.

Search Index

A search index is a data structure that enables fast lookup and retrieval of documents, mapping terms or vectors to the documents that contain them.

Indexing

Indexing is the process of analyzing and organizing documents into a search index to enable fast and relevant retrieval in response to queries.

Crawling

Web crawling is the automated process of discovering and downloading web pages or documents for indexing by search engines and AI knowledge systems.

Ranking

Search ranking is the process of ordering search results by relevance, using algorithms that score how well each document matches a user's query and intent.

Relevance

Search relevance measures how well search results match a user's query intent, encompassing both topical match and usefulness of the results.

Query

A search query is the text or expression a user submits to a search system to find relevant information, documents, or answers.

Autocomplete

Search autocomplete predicts and suggests query completions as users type, using prefix matching, query logs, or learned models, speeding up search and guiding users toward effective queries.

Faceted Search

Faceted search allows users to filter search results by multiple attributes or categories, combining free-text search with structured navigation.

Boolean Search

Boolean search uses logical operators (AND, OR, NOT) to combine search terms, giving users precise control over query construction.
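
As a rough illustration, the three operators map directly onto set operations over the documents that contain each term. The postings below are made-up sample data, and `docs` is a hypothetical helper.

```python
# Toy postings: term -> set of document IDs (hypothetical data).
postings = {
    "neural": {1, 2, 4},
    "search": {2, 3, 4},
    "ranking": {4, 5},
}

def docs(term):
    """Return the set of documents containing a term (empty if unseen)."""
    return postings.get(term, set())

and_result = docs("neural") & docs("search")    # neural AND search
or_result = docs("neural") | docs("ranking")    # neural OR ranking
not_result = docs("search") - docs("ranking")   # search AND NOT ranking
```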

Fuzzy Search

Fuzzy search finds approximate matches by tolerating spelling errors, typos, and minor variations in search terms.
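
One common way to tolerate typos is to accept any term within a small edit distance of the query. The sketch below uses Levenshtein distance; the function names and the `max_edits` threshold are illustrative choices, and real engines use faster index-based methods.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row of the DP table at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]

def fuzzy_match(query, vocabulary, max_edits=2):
    """Return vocabulary terms within max_edits of the query, in input order."""
    return [t for t in vocabulary if edit_distance(query, t) <= max_edits]
```

For example, `fuzzy_match("serch", ["search", "ranking"], max_edits=1)` keeps "search" because a single inserted letter turns one into the other.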

BM25

BM25 (Best Matching 25) is a probabilistic ranking algorithm used by search engines to score document relevance based on term frequency, inverse document frequency, and document length.
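
A minimal sketch of the scoring function follows, assuming whitespace tokenization, a non-empty document list, and the commonly used defaults k1 = 1.2 and b = 0.75. The IDF form shown is the smoothed variant popularized by Lucene; other variants exist.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document in docs against query with BM25."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Length normalization: longer-than-average documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

With equal term counts, a shorter document scores higher than a longer one, which is the document-length effect the definition refers to.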

PageRank

PageRank is Google's foundational algorithm that ranks web pages by analyzing the link structure of the web to measure page importance and authority.

Learning to Rank

Learning to rank uses machine learning to train ranking models from relevance data, optimizing search result ordering for user satisfaction.

Neural Ranking

Neural ranking uses deep learning models to assess search result relevance, understanding semantic meaning beyond keyword matching.

Cross-Encoder Ranking

A cross-encoder processes a query and document together through a single model to produce highly accurate relevance scores for search reranking.

Bi-Encoder Ranking

A bi-encoder independently encodes queries and documents into vectors, enabling fast similarity-based retrieval from large collections.
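
The retrieval step can be sketched as a nearest-neighbor lookup over precomputed document vectors. The three-dimensional vectors and document names below are invented for illustration; a real bi-encoder would produce the vectors with a trained model and search them with an approximate nearest-neighbor index.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

# Document vectors are encoded once, offline (hypothetical values).
doc_vectors = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.2],
    "api-keys": [0.0, 0.2, 0.9],
}

def retrieve(query_vector, top_k=2):
    """Rank documents by similarity to a query vector encoded at request time."""
    ranked = sorted(doc_vectors,
                    key=lambda d: cosine(query_vector, doc_vectors[d]),
                    reverse=True)
    return ranked[:top_k]
```

Because documents are encoded independently of the query, only the query needs to be encoded at search time, which is what makes the approach fast on large collections.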

Reciprocal Rank Fusion

Reciprocal Rank Fusion (RRF) combines ranked lists from multiple search methods into a single ranking based on each result's position across lists.
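
In the usual formulation, each document earns 1 / (k + rank) from every list it appears in, and the sums decide the final order. The sketch below assumes the commonly cited constant k = 60 and breaks ties by document ID; the function name is illustrative.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Fusing a keyword ranking `["a", "b", "c"]` with a vector ranking `["b", "c", "d"]` puts "b" first, since it ranks well in both lists, ahead of "a", which tops only one.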

Elasticsearch

Elasticsearch is an open-source distributed search engine widely used for full-text search, log analytics, and increasingly for AI-powered semantic search.

OpenSearch

OpenSearch is an open-source search and analytics engine forked from Elasticsearch by Amazon Web Services in 2021 and now governed by the OpenSearch Software Foundation under the Linux Foundation, with support for vector search and AI capabilities.

Apache Solr

Apache Solr is an open-source enterprise search platform built on Apache Lucene, providing full-text search, faceting, and distributed search capabilities.

Meilisearch

Meilisearch is a fast, open-source search engine written in Rust and designed for developer experience, providing instant search with typo tolerance and easy setup.

Typesense

Typesense is a fast, open-source search engine written in C++ that keeps its index in memory, providing typo-tolerant instant search with simple setup.

Algolia

Algolia is a commercial search-as-a-service platform providing fast, hosted search with AI features, used by thousands of websites and applications.

Inverted Index

An inverted index is the core data structure behind text search engines, mapping every unique term to the list of documents containing that term.
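
A toy version of the structure can be built with a dictionary of sets. This sketch assumes lowercase whitespace tokenization and made-up documents; production indexes also store positions, frequencies, and compressed posting lists.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of document IDs whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

# Hypothetical sample collection.
sample_docs = {1: "fast search engine", 2: "search index", 3: "fast index"}
index = build_inverted_index(sample_docs)
```

Looking up `index["search"]` returns documents 1 and 2 without scanning any document text, which is why lookups stay fast as the collection grows.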

Analyzer

A search analyzer is a text processing pipeline that transforms raw text into normalized tokens for indexing and searching in search engines.
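
A bare-bones pipeline might lowercase the text, split it into alphanumeric tokens, and drop stopwords. The stopword list and function name here are illustrative only; real analyzers typically add stemming, synonym expansion, and language-specific rules.

```python
import re

# Tiny illustrative stopword list, not a standard one.
STOPWORDS = {"a", "an", "the", "of", "and", "in"}

def analyze(text):
    """Lowercase, tokenize on alphanumeric runs, then remove stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]
```

The same function should be applied to documents at index time and to queries at search time so that both sides produce comparable tokens.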

Tokenizer

A tokenizer splits text into individual tokens (words or subwords), a fundamental step in both search indexing and language model processing.

Synonym Filter

A synonym filter expands search to match related terms by defining equivalence between words, improving recall without relying on semantic search.

Neural Search

Neural search uses deep learning models throughout the search pipeline to improve query understanding, document retrieval, and result ranking.

Dense Retrieval

Dense retrieval uses learned dense vector representations to find relevant documents, encoding semantic meaning for similarity-based search.

Hybrid Search

Hybrid search combines keyword-based and semantic vector search to leverage the strengths of both approaches for more comprehensive and relevant results.

Conversational Search

Conversational search enables multi-turn, natural language interactions where users refine searches through dialogue rather than isolated keyword queries.

Recommendation System

A recommendation system uses AI to suggest relevant items to users based on their behavior, preferences, and patterns from similar users.

Collaborative Filtering

Collaborative filtering recommends items based on behavioral patterns from similar users, without needing to understand item content or attributes.
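
A small user-based variant illustrates the idea: find users whose ratings resemble the target user's, then score unseen items by those users' ratings. The ratings, names, and similarity choice (cosine over shared items) are assumptions made for this sketch.

```python
import math

# Hypothetical user -> {item: rating} data.
ratings = {
    "ana": {"A": 5, "B": 4, "C": 1},
    "ben": {"A": 5, "B": 5, "D": 4},
    "cara": {"C": 5, "E": 4},
}

def similarity(u, v):
    """Cosine similarity using only the items both users rated in the dot product."""
    shared = set(ratings[u]) & set(ratings[v])
    if not shared:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in shared)
    norm_u = math.sqrt(sum(r * r for r in ratings[u].values()))
    norm_v = math.sqrt(sum(r * r for r in ratings[v].values()))
    return dot / (norm_u * norm_v)

def recommend(user):
    """Rank items the user has not rated by similarity-weighted ratings."""
    scores = {}
    for other in ratings:
        if other == user:
            continue
        sim = similarity(user, other)
        for item, rating in ratings[other].items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=lambda i: (-scores[i], i))
```

Nothing in the code inspects what the items are; the recommendations come entirely from overlapping behavior.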

Content-Based Filtering

Content-based filtering recommends items similar to those a user has previously liked, based on item attributes, features, and content analysis.

Cold Start Problem

The cold start problem occurs when recommendation systems lack sufficient data about new users or items to make accurate personalized suggestions.

Two-Tower Model

A two-tower model uses separate neural networks for users and items, encoding each into vectors for scalable similarity-based retrieval and recommendation.

Web Crawling

Web crawling is the automated process of systematically browsing the internet to discover, fetch, and catalog web pages for indexing by search engines.

Turn owned content into answers

Use InsertChat to launch a branded assistant visitors can ask directly.

7-day free trial · No card required

Interactive FAQ

Try the FAQ like a visitor.

Open product, pricing, security, integration, and free-tool questions in the same chat your visitors use.

Product FAQ

What is InsertChat?

InsertChat is a white-label AI assistant for your website. Train it, brand it, publish it, and learn from visitor questions.

How does InsertChat use my website content?

Connect approved pages, docs, videos, FAQs, policies, and other sources. InsertChat turns them into source-backed answers and next steps.

Can I control the assistant's tone and sources?

Yes. Choose its sources, tone, welcome message, and prompts so it stays on brand.

How does InsertChat stay accurate?

Answers use approved content and source links. Analytics show unclear or missing answers so you can improve coverage.

Can it collect leads or route support questions?

Yes. InsertChat can collect details, qualify intent, add context, and send chats to the right inbox, CRM, workflow, or person.

Can I control how the assistant behaves?

Yes. Control prompts, model choice, tool access, and the branded assistant experience so behavior stays consistent.

Which AI models can I use?

InsertChat supports multiple model providers. Choose each assistant's model for quality, speed, and cost, or use BYOK.

Can I pick different models for different workflows?

Yes. Use a faster model for common questions and a stronger model for complex reasoning. InsertChat supports that balance per conversation.

Where can I deploy an assistant?

Use a widget, embed, full-page assistant, custom domain, in-app embed, or API. Reuse one setup across surfaces.

Do I need coding skills?

No. Build and deploy AI assistants using the visual builder. The embed code is one line of JavaScript.

Can I customize the branding and UI?

Yes. Customize the assistant name, logo, colors, welcome message, suggested prompts, tone, domain, and white-label presentation.

Can I use my own domain?

Yes. Custom domains are supported, typically via enterprise options.

Does InsertChat support voice?

Yes. Voice dictation and text-to-speech let users speak instead of type.

Does InsertChat support vision?

Yes. Enable vision for assistants when images help clarify a request or context.

What tools and integrations are supported?

Zendesk, HubSpot, Shopify, WooCommerce, calendar booking, web search, Perplexity, and webhooks for your own systems.

Can I control which tools the assistant is allowed to use?

Yes. Tool access is controlled per assistant so you enable only what you need.

Can the assistant hand off to a human?

Yes. Configure human handoff so the assistant escalates when needed. Full conversation history is passed along.

Do you provide analytics?

Yes. Track chats, leads, feedback, top questions, unanswered questions, most-used sources, and content gaps.

Is it mobile friendly?

Yes. The widget and embeds work well on desktop and mobile with no separate experience needed.

What's the fastest path to a successful deployment?

Start with one assistant and a small set of high-value sources. Iterate using real questions from analytics.

What is the fastest way to get started?

Create an account. Connect one key source. Ask a test question, brand the assistant, then publish it on one page.