Glossary

AI glossary for content assistants

Plain-English definitions of 13,917 AI terms for branded assistant teams.

13,917 terms. Open one for definitions and related concepts.

Cosine Similarity

Cosine similarity measures the cosine of the angle between two vectors, ranging from -1 to 1, widely used for comparing embeddings in NLP and recommendation systems.
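As an illustration of the definition above, here is a minimal sketch in plain Python. The function name `cosine_similarity` and the sample vectors are assumptions chosen for this example, not part of the glossary entry.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical example vectors.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])   # close to 1
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])       # 0
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])       # close to -1
```

Because the result depends only on direction, scaling either vector leaves the score unchanged, which is one reason the measure is popular for comparing embeddings of different magnitudes.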

Euclidean Distance

Euclidean distance is the straight-line distance between two points in space, one of the most common distance metrics in machine learning.

Mahalanobis Distance

Mahalanobis distance accounts for correlations between variables by normalizing with the covariance matrix, measuring distance in standard deviations.

Gradient Descent

Gradient descent is an iterative optimization algorithm that adjusts parameters in the direction of steepest decrease of the loss function.

Learning Rate

The learning rate is a hyperparameter controlling the step size of parameter updates during gradient descent optimization.
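To make the role of the learning rate concrete, the sketch below runs plain gradient descent on the one-dimensional function f(x) = (x - 3)^2. The function, the starting point, and the two learning rates are assumptions picked for illustration.

```python
def gradient_descent(grad, x0, learning_rate, steps):
    """Repeatedly step against the gradient, scaled by the learning rate."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x

def grad_f(x):
    # Derivative of the hypothetical loss f(x) = (x - 3)^2, minimized at x = 3.
    return 2.0 * (x - 3.0)

# A small step size shrinks the error by a factor of 0.8 per step here.
x_small = gradient_descent(grad_f, x0=0.0, learning_rate=0.1, steps=100)

# A step size that is too large overshoots the minimum and diverges.
x_large = gradient_descent(grad_f, x0=0.0, learning_rate=1.1, steps=100)
```

In this toy setting `x_small` ends up very close to 3, while `x_large` moves further from the minimum on every step.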

Chain Rule

The chain rule computes the derivative of a composite function, forming the mathematical basis of backpropagation in neural networks.

Partial Derivative

A partial derivative measures how a multi-variable function changes with respect to one variable while holding all others constant.

Taylor Expansion

A Taylor expansion approximates a function locally using a polynomial based on its derivatives, used to analyze optimization landscapes in ML.

Convexity

Convexity is a property of sets and functions. For a convex function over a convex set, any local minimum is also a global minimum, which simplifies optimization analysis.

Stochastic Process

A stochastic process is a collection of random variables indexed by time or space, modeling systems that evolve with inherent randomness.

Monte Carlo Method

Monte Carlo methods use random sampling to estimate mathematical quantities that are difficult or impossible to compute analytically.
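A classic illustration is estimating pi by sampling points in the unit square and counting how many fall inside the quarter circle. The sketch below is one such example; the function name, sample count, and fixed seed are assumptions made so the run is repeatable.

```python
import random

def estimate_pi(num_samples, seed=0):
    """Estimate pi from the fraction of random points inside a quarter circle."""
    rng = random.Random(seed)  # seeded generator for a repeatable estimate
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # Quarter-circle area is pi / 4, so scale the hit fraction by 4.
    return 4.0 * inside / num_samples

pi_estimate = estimate_pi(100_000)
```

The error of such an estimate shrinks roughly with the square root of the number of samples, so each extra digit of accuracy costs about 100 times more draws.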

Matrix Factorization

Matrix factorization decomposes a matrix into a product of smaller matrices, used for dimensionality reduction and recommendation systems.

Softmax Function

The softmax function converts a vector of real numbers into a probability distribution, used as the output layer in neural network classifiers.
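A minimal sketch of softmax in plain Python follows. Subtracting the maximum logit before exponentiating is a common numerical-stability trick and is an addition of this example, not something the entry specifies; the sample logits are hypothetical.

```python
import math

def softmax(logits):
    """Map real-valued scores to probabilities that sum to 1."""
    shift = max(logits)  # shifting by the max avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])      # larger logit, larger probability
big = softmax([1000.0, 1001.0])       # would overflow without the shift
```

The shift does not change the result, because softmax is unaffected by adding the same constant to every input.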

Sigmoid Function

The sigmoid function maps any real number to the range (0, 1), historically used as a neural network activation and for binary classification output.

Logarithm

The logarithm is the inverse of exponentiation, converting products to sums and enabling stable computation of likelihoods in machine learning.

Matrix Calculus

Matrix calculus extends calculus to matrix-valued functions, providing rules for computing gradients of loss functions with respect to weight matrices.

Bayes Optimal Classifier

The Bayes optimal classifier achieves the lowest possible error rate by choosing the class with highest posterior probability for each input.

Bias-Variance Tradeoff

The bias-variance tradeoff is the fundamental tension between model simplicity (high bias) and model flexibility (high variance) in machine learning.

Maximum Entropy Principle

The maximum entropy principle selects the probability distribution with the most uncertainty (highest entropy) among those satisfying known constraints.

Exponential Family

The exponential family is a class of probability distributions with a common mathematical form that includes many of the distributions used in machine learning, such as the Gaussian, Bernoulli, and Poisson.

Sufficient Statistic

A sufficient statistic captures all the information in a dataset relevant to estimating a parameter, enabling efficient data compression without information loss.

Conjugate Prior

A conjugate prior is a prior distribution that, when combined with a particular likelihood, produces a posterior distribution of the same family.
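One standard example is the Beta prior with a binomial likelihood: observing successes and failures simply adds to the Beta parameters, so the posterior is again a Beta distribution. The sketch below assumes that pairing; the function names and the counts are hypothetical.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing binomial data."""
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Start from a uniform prior, Beta(1, 1), then observe 7 successes and 3 failures.
post_alpha, post_beta = beta_binomial_update(1.0, 1.0, successes=7, failures=3)
posterior_mean = beta_mean(post_alpha, post_beta)  # 8 / 12
```

Because the update is just addition, new data can be folded in batch by batch without recomputing anything from scratch.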

Loss Function

A loss function measures the discrepancy between model predictions and true values, providing the objective that training algorithms minimize.

Regularization

Regularization adds constraints or penalties to the optimization objective to prevent overfitting and improve model generalization.

Dimensionality Reduction

Dimensionality reduction projects high-dimensional data into a lower-dimensional space while preserving important structure.

Sampling Methods

Sampling methods generate random draws from probability distributions, enabling Monte Carlo estimation and generative modeling in machine learning.

Convergence

Convergence describes when a sequence of values approaches a limit, applicable to optimization algorithms, statistical estimators, and series in ML.

Moment

A moment is a quantitative measure of the shape of a probability distribution, with the first four moments capturing mean, variance, skewness, and kurtosis.

Kernel Function

A kernel function computes the inner product between data points in a high-dimensional feature space without explicitly mapping them there.

Manifold

A manifold is a low-dimensional surface embedded in a higher-dimensional space, capturing the intrinsic structure of data in machine learning.

Convolution (Mathematics)

Convolution is a mathematical operation combining two functions to produce a third, fundamental to signal processing and convolutional neural networks.

Information Bottleneck

The information bottleneck method finds the optimal tradeoff between compressing input information and preserving information relevant to the target variable.

Bayesian Optimization

Bayesian optimization is a sequential strategy for optimizing expensive black-box functions using a probabilistic surrogate model.

Principal Component Analysis

Principal component analysis (PCA) is a dimensionality reduction technique that finds the directions of maximum variance in data and projects the data onto a lower-dimensional space spanned by those directions.

t-SNE

t-SNE (t-distributed Stochastic Neighbor Embedding) is a nonlinear dimensionality reduction technique that produces 2D or 3D visualizations of high-dimensional data by preserving local neighborhood relationships.

UMAP

UMAP (Uniform Manifold Approximation and Projection) is a fast nonlinear dimensionality reduction technique that preserves local structure and tends to retain more global structure than t-SNE, used for visualization and general-purpose dimensionality reduction.

Information Geometry

Information geometry applies differential geometry to statistical manifolds, treating families of probability distributions as geometric spaces with curvature defined by the Fisher information metric.

Optimal Transport

Optimal transport finds the minimum-cost way to move probability mass from one distribution to another, providing a geometric distance between distributions used in generative AI and domain adaptation.

Causal Inference

Causal inference is the study of cause-and-effect relationships from data, going beyond correlation to determine whether one variable actually causes changes in another.

Tensor Decomposition

Tensor decomposition extends matrix factorization to multi-dimensional arrays, decomposing tensors into simpler components for compression, pattern discovery, and efficient neural network design.

Manifold Learning

Manifold learning discovers the underlying low-dimensional structure in high-dimensional data, assuming data lies on or near a nonlinear manifold embedded in the high-dimensional space.

Kernel Methods

Kernel methods enable learning in implicit high-dimensional or infinite-dimensional feature spaces by using kernel functions to compute inner products without explicitly computing feature representations.

Gaussian Processes

A Gaussian Process is a probability distribution over functions, defined by a mean function and kernel (covariance) function, enabling principled uncertainty quantification in predictions.

Variational Inference

Variational inference approximates intractable posterior distributions by optimizing a simpler variational distribution to be as close as possible to the true posterior, enabling scalable Bayesian learning.

MCMC

Markov Chain Monte Carlo (MCMC) is a class of algorithms for sampling from probability distributions by constructing a Markov chain that has the target distribution as its stationary distribution.

Focal Loss

Focal loss is a modified cross-entropy loss that down-weights easy, well-classified examples and focuses training on hard, misclassified ones, addressing class imbalance in object detection and classification.
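The down-weighting can be sketched for a single example as follows. This assumes the commonly cited form, -(1 - p)^gamma * log(p), where p is the predicted probability of the true class, and it omits the optional class-balancing weight; the parameter names and the default gamma of 2 are assumptions of this example.

```python
import math

def cross_entropy(p_true):
    """Standard cross-entropy for one example, given the true-class probability."""
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must be in (0, 1]")
    return -math.log(p_true)

def focal_loss(p_true, gamma=2.0):
    """Cross-entropy scaled by (1 - p_true)^gamma to down-weight easy examples."""
    return ((1.0 - p_true) ** gamma) * cross_entropy(p_true)

easy_ratio = focal_loss(0.9) / cross_entropy(0.9)   # (1 - 0.9)^2, about 0.01
hard_ratio = focal_loss(0.1) / cross_entropy(0.1)   # (1 - 0.1)^2, about 0.81
```

With gamma set to 0 the scaling factor is 1 and the expression reduces to ordinary cross-entropy.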

Contrastive Loss

Contrastive loss trains models to bring similar examples close together and push dissimilar examples apart in embedding space, enabling representation learning for similarity search and retrieval.

Triplet Loss

Triplet loss trains embedding models using anchor-positive-negative triplets, ensuring the anchor is closer to the positive than to the negative by at least a margin.
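The margin condition can be written as a hinge on the two distances. The sketch below assumes Euclidean distance (some formulations use squared distance instead) and a margin of 1; the function names and the sample points are hypothetical.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is farther than the positive by at least the margin."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

anchor = [0.0, 0.0]
positive = [0.0, 1.0]                                        # distance 1 from the anchor
satisfied = triplet_loss(anchor, positive, [3.0, 4.0])       # negative at distance 5
violated = triplet_loss(anchor, positive, [1.0, 0.0])        # negative at distance 1
```

Only triplets that violate the margin contribute a nonzero loss, which is why training pipelines often spend effort selecting hard negatives.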

Page 115 of 290. Showing 48 of 13,917 matching glossary pages.

Turn owned content into answers

Use InsertChat to launch a branded assistant visitors can ask directly.

7-day free trial · No card required

Interactive FAQ

Try the FAQ like a visitor.

Open product, pricing, security, integration, and free-tool questions in the same chat your visitors use.

Product FAQ

What is InsertChat?

InsertChat is a white-label AI assistant for your website. Train it, brand it, publish it, and learn from visitor questions.

How does InsertChat use my website content?

Connect approved pages, docs, videos, FAQs, policies, and other sources. InsertChat turns them into source-backed answers and next steps.

Can I control the assistant's tone and sources?

Yes. Choose its sources, tone, welcome message, and prompts so it stays on brand.

How does InsertChat stay accurate?

Answers use approved content and source links. Analytics show unclear or missing answers so you can improve coverage.

Can it collect leads or route support questions?

Yes. InsertChat can collect details, qualify intent, add context, and send chats to the right inbox, CRM, workflow, or person.

Can I control how the assistant behaves?

Yes. Control prompts, model choice, tool access, and the branded assistant experience so behavior stays consistent.

Which AI models can I use?

InsertChat supports multiple model providers. Choose each assistant's model for quality, speed, and cost, or use BYOK.

Can I pick different models for different workflows?

Yes. Use a faster model for common questions and a stronger model for complex reasoning. InsertChat supports that balance per conversation.

Where can I deploy an assistant?

Use a widget, embed, full-page assistant, custom domain, in-app embed, or API. Reuse one setup across surfaces.

Do I need coding skills?

No. Build and deploy AI assistants using our visual builder. The embed code is one line of JavaScript.

Can I customize the branding and UI?

Yes. Customize the assistant name, logo, colors, welcome message, suggested prompts, tone, domain, and white-label presentation.

Can I use my own domain?

Yes. Custom domains are supported, typically via enterprise options.

Does InsertChat support voice?

Yes. Voice dictation and text-to-speech let users speak instead of type.

Does InsertChat support vision?

Yes. Enable vision for assistants when images help clarify a request or context.

What tools and integrations are supported?

Zendesk, HubSpot, Shopify, WooCommerce, calendar booking, web search, Perplexity, and webhooks for your own systems.

Can I control which tools the assistant is allowed to use?

Yes. Tool access is controlled per assistant so you enable only what you need.

Can the agent hand off to a human?

Yes. Configure human handoff so the agent escalates when needed. Full conversation history is passed along.

Do you provide analytics?

Yes. Track chats, leads, feedback, top questions, unanswered questions, most-used sources, and content gaps.

Is it mobile friendly?

Yes. The widget and embeds work well on desktop and mobile with no separate experience needed.

What's the fastest path to a successful deployment?

Start with one assistant and a small set of high-value sources. Iterate using real questions from analytics.

What is the fastest way to get started?

Create an account. Connect one key source. Ask a test question, brand the assistant, then publish it on one page.