Artificial Intelligence
Key Terms in the Field of Artificial Intelligence
The most important math concepts for AI and data science, explained.
Posted January 31, 2019
Binary Tree – a tree data structure where each node stores a data element and has at most two children (a left and a right child). The topmost node of the tree is the root node.
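For illustration, a minimal Python sketch of a binary tree (the class and names are my own, not from any particular library):

```python
# A minimal binary tree: each node stores a data element plus
# references to its left and right children.
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None   # left child
        self.right = None  # right child

# Build a small tree with 1 as the root node.
root = Node(1)
root.left = Node(2)
root.right = Node(3)

def inorder(node):
    """Return values from an in-order (left, root, right) traversal."""
    if node is None:
        return []
    return inorder(node.left) + [node.data] + inorder(node.right)

print(inorder(root))  # [2, 1, 3]
```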
Cauchy distribution – named after French mathematician Augustin Cauchy, a continuous probability distribution with heavy tails; notably, its mean and variance are undefined
Combinatorics – field of math consisting of problems of selection, arrangement and operation within a finite or discrete system
Conditional Distributions – a probability distribution for a sub-population
Differential Calculus – the study of the rate of change of functions with respect to their variables through the concepts of derivatives and differentials
Dynamic Programming – branch of math that studies the theory and methods of solution of multi-step problems of optimal control
Bayes’ Theorem – named after 18th century British mathematician Thomas Bayes, it is a formula for determining conditional probability
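A small worked example of Bayes' theorem, P(A|B) = P(B|A)·P(A) / P(B), using hypothetical numbers for a diagnostic test (the figures are assumed for illustration, not from any study):

```python
# Hypothetical diagnostic-test numbers (assumed for illustration):
p_disease = 0.01            # prior P(A): base rate of the disease
p_pos_given_disease = 0.99  # likelihood P(B|A): true-positive rate
p_pos_given_healthy = 0.05  # false-positive rate

# Total probability of a positive test, P(B).
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' theorem: posterior probability of disease given a positive test.
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # 0.167
```

Despite the accurate test, the posterior is only about 17% because the disease is rare, a classic illustration of why the prior matters.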
Derivative – the limit of the ratio of the change in a function to the corresponding change in its independent variable as the latter change approaches zero
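The limit in that definition can be seen numerically: as the step h shrinks, the difference quotient approaches the derivative (here f(x) = x², so f′(3) = 6):

```python
# The derivative as a limit of difference quotients:
# (f(x + h) - f(x)) / h approaches f'(x) as h approaches zero.
def f(x):
    return x ** 2

x = 3.0
for h in (1.0, 0.1, 0.001):
    quotient = (f(x + h) - f(x)) / h  # ratio of the two changes
    print(h, quotient)                # approaches 6 as h shrinks
```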
Eigenvalue – any number such that a given matrix minus that number times the identity matrix has zero determinant.
Eigenvector – a vector which, when operated on by a given operator, gives a scalar multiple of itself.
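Both definitions can be checked with NumPy: for a matrix A, an eigenpair satisfies A·v = λ·v, and λ makes det(A − λI) zero (the matrix here is an arbitrary example of mine):

```python
import numpy as np

# An example diagonal matrix; its eigenvalues are simply 2 and 3.
A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)  # [2. 3.]

# Check the defining property A @ v = lam * v for the first eigenpair.
v = eigenvectors[:, 0]
print(np.allclose(A @ v, eigenvalues[0] * v))  # True
```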
Fourier transform – named after French mathematician Joseph Fourier, it’s a method for converting a time function into one expressed in terms of frequency
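A quick sketch of that time-to-frequency conversion using NumPy's discrete Fourier transform (the 5 Hz sine and 100 Hz sampling rate are my own example values):

```python
import numpy as np

# A pure 5 Hz sine sampled at 100 Hz for one second: its Fourier
# transform should peak at the 5 Hz frequency bin.
fs = 100                 # sampling rate in Hz
t = np.arange(fs) / fs   # one second of sample times
signal = np.sin(2 * np.pi * 5 * t)

spectrum = np.fft.rfft(signal)                  # time -> frequency
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)  # bin frequencies in Hz
peak = freqs[np.argmax(np.abs(spectrum))]
print(peak)  # 5.0
```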
Function – a relation or expression involving one or more variables
Gradient descent – an iterative optimization method that repeatedly steps in the direction of steepest descent of a function; in artificial neural networks it is used to adjust the input weights of neurons toward a local or global minimum
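A minimal gradient-descent sketch (my own toy example, not a neural network): minimize f(w) = (w − 4)² by stepping against its gradient 2(w − 4):

```python
# Gradient descent on f(w) = (w - 4)**2, whose minimum is at w = 4.
w = 0.0             # initial parameter (weight)
learning_rate = 0.1
for _ in range(100):
    gradient = 2 * (w - 4)        # df/dw at the current w
    w -= learning_rate * gradient  # step against the gradient
print(round(w, 4))  # 4.0 — converged to the minimum
```

The learning rate controls the step size; too large a value can overshoot the minimum instead of converging.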
Gram-Schmidt Orthonormalization – also called the Gram-Schmidt process, a procedure which takes a nonorthogonal set of linearly independent functions and constructs an orthogonal basis over an arbitrary interval with respect to an arbitrary weighting function
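A sketch of the process for vectors in Rⁿ (the function-space version in the definition replaces the dot product with a weighted integral): subtract from each vector its projections onto the basis built so far, then normalize.

```python
import numpy as np

def gram_schmidt(vectors):
    """Return an orthonormal basis built from linearly independent vectors."""
    basis = []
    for v in vectors:
        w = v.astype(float).copy()
        for q in basis:
            w -= (w @ q) * q          # remove the component along q
        basis.append(w / np.linalg.norm(w))  # normalize to unit length
    return np.array(basis)

Q = gram_schmidt([np.array([1.0, 1.0]), np.array([1.0, 0.0])])
print(np.round(Q @ Q.T, 6))  # identity matrix: rows are orthonormal
```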
Hashing – generating a value or values from a string of text by use of a math function
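For example, with Python's standard-library `hashlib`, hashing a string of text with SHA-256 always yields the same fixed-length value:

```python
import hashlib

# Hash a string of text: the result is deterministic and fixed-length.
digest = hashlib.sha256("hello".encode("utf-8")).hexdigest()
print(len(digest))  # 64 hex characters (256 bits)
print(digest == hashlib.sha256(b"hello").hexdigest())  # True: same input, same value
```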
Heap – a tree-based data structure where each element is assigned a key value (weight) and each parent's key is ordered relative to its children's keys (e.g., in a min-heap the smallest key sits at the root)
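Python's standard-library `heapq` maintains a list as a min-heap, keeping the smallest key at index 0:

```python
import heapq

# Push keys in arbitrary order; the min-heap keeps the smallest on top.
heap = []
for key in [5, 1, 4, 2]:
    heapq.heappush(heap, key)

print(heap[0])                                  # 1, the minimum key
print([heapq.heappop(heap) for _ in range(4)])  # [1, 2, 4, 5], sorted order
```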
Hessian – named after 19th century German mathematician Ludwig Otto Hesse, tool used in differential geometry that describes the local curvature of a function
Information Theory – the mathematical expression of the conditions and parameters that impact the transmission and processing of information
Integral Calculus – branch of math concerned with the theory and application of integrals and integration, it deals with the total size or value such as lengths, areas and volumes
Joint distributions – the distribution of several random variables on the same probability space
Laplacian distribution (double exponential distribution) – the distribution of differences between two independent variates with identical exponential distributions
Lagrangian – a function that describes the state of a dynamic system in terms of position coordinates and their time derivatives and that is equal to the difference between the kinetic energy and the potential energy
Linear Algebra – a branch of math that is concerned with the mathematical structures closed under the operations of addition and scalar multiplication, includes the theory of systems of linear equations, matrices, determinants, vector spaces and linear transformations
Maximum a Posteriori Estimation (MAP) – a common method of point estimation in Bayesian statistics
Maximum Likelihood Estimation (MLE) – method of finding the value of one or more parameters for a given statistic which makes the known likelihood distribution a maximum
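A simple worked case (my own example): for Bernoulli data such as coin flips, the likelihood is maximized at the sample mean, so the MLE of the heads probability is just the observed fraction of heads.

```python
# MLE for a coin's heads probability p from observed flips (1 = heads).
# For Bernoulli data the likelihood p**h * (1-p)**t is maximized at the
# sample mean, so the estimate is simply heads / total flips.
flips = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
p_hat = sum(flips) / len(flips)
print(p_hat)  # 0.7
```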
Multivariate calculus – integral, differential and vector calculus in relation to functions of several variables
Orthogonal – two lines or curves are orthogonal if they are perpendicular at their point of intersection
Orthogonalization – the process of finding orthogonal vectors that span a particular subspace
Partial Derivatives – derivatives of a function of multiple variables where all but the variable of interest are held fixed during the differentiation
Principal Component Analysis (PCA) – a method for identifying a smaller number of uncorrelated variables, known as principal components, from a larger set of data
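One common way to compute PCA is via the SVD of the centered data; a sketch on synthetic two-variable data of my own (the second variable is nearly a multiple of the first, so one component captures almost all the variance):

```python
import numpy as np

# Synthetic data: y is roughly 2x, so the data is nearly one-dimensional.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
data = np.column_stack([x, 2 * x + rng.normal(scale=0.1, size=200)])

# PCA via SVD of the mean-centered data: rows of vt are the
# principal components; singular values give the variance captured.
centered = data - data.mean(axis=0)
_, s, vt = np.linalg.svd(centered, full_matrices=False)

explained = s**2 / np.sum(s**2)  # fraction of variance per component
print(round(float(explained[0]), 3))  # first component dominates
```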
Probability – a type of ratio that compares how many times an outcome can occur relative to all possible outcomes
QR Decomposition – given a matrix A, the QR decomposition is a matrix decomposition of the form A = QR, where R is an upper triangular matrix and Q is an orthogonal matrix
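NumPy computes this directly (the matrix here is an arbitrary example of mine):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
Q, R = np.linalg.qr(A)

print(np.allclose(Q @ R, A))             # True: the factors reconstruct A
print(np.allclose(Q.T @ Q, np.eye(2)))   # True: Q is orthogonal
print(np.allclose(R, np.triu(R)))        # True: R is upper triangular
```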
Random Variable – a variable whose possible values are outcomes of a random phenomenon
Singular Value Decomposition (SVD) – a factorization of a real or complex matrix
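Concretely, SVD factors A into U·diag(s)·Vᵀ, with nonnegative singular values in descending order (the matrix is my own example):

```python
import numpy as np

A = np.array([[3.0, 0.0],
              [0.0, 2.0]])
U, s, Vt = np.linalg.svd(A)

print(s)  # [3. 2.] — singular values, largest first
print(np.allclose(U @ np.diag(s) @ Vt, A))  # True: factors reconstruct A
```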
Single-Valued Functions – a function that, for each point in the domain, has a unique value in the range
Stack – a linear data structure holding a sequence of objects or elements, where elements are added and removed from the same end (last in, first out)
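In Python, a plain list serves as a stack: `append` pushes onto the top and `pop` removes from the top.

```python
# A list used as a stack: last in, first out.
stack = []
stack.append("a")
stack.append("b")
stack.append("c")

print(stack.pop())  # 'c' — the most recently pushed element
print(stack)        # ['a', 'b']
```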
Standard Deviation – a measure of the dispersion of a dataset relative to its mean, calculated as the square root of the variance
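Computed from the definition (population form, my own sample data):

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(data) / len(data)                              # 5.0
variance = sum((x - mean) ** 2 for x in data) / len(data)  # 4.0
std_dev = math.sqrt(variance)  # square root of the variance
print(std_dev)  # 2.0
```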
Vector – a quantity with magnitude and direction, but not position
Copyright © 2019 Cami Rosso All rights reserved.
References
University of Oxford Mathematical Institute. URL: https://www.maths.ox.ac.uk/
Encyclopedia of Mathematics. URL: http://www.encyclopediaofmath.org/
Encyclopædia Britannica. URL: https://www.britannica.com
Merriam-Webster. URL: https://www.merriam-webster.com/
Investopedia. URL: https://www.investopedia.com/
Techopedia. URL: https://www.techopedia.com
Wolfram MathWorld. URL: http://mathworld.wolfram.com/
Collins Dictionary. URL: https://www.collinsdictionary.com