Restricted model spaces

In words...

In a machine learning problem characterized by a given input space and a given label space, a predictive model is a function that assigns a label to every possible input.

Instead of searching for the model in the set of all such functions, most learning algorithms consider a restricted model space. Such a model space is basically a subset of all functions from the input space to the label space. This is also referred to as a function class.

For instance, one very popular class of functions is the set of all linear functions of the input.

As for the class of linear functions, a function class can often be represented by a fixed model structure and a set of parameters that have to be learned from data. Indeed, in this case, each value of the parameters can be identified with a function of the class.

We also often encounter parameterized function classes for which the class of functions itself is defined up to the value of some parameters (called hyperparameters in this case). A typical example is the class of polynomials parametrized by the degree of the polynomials.

In pictures...

In maths...

Given an input space $\X$ and a label space $\Y$, a predictive model is a function $$ f : \X\rightarrow \Y $$ that maps any input vector $\g x$ to a label $f(\g x)$.

Instead of searching for a model in the set of all such function $\Y^{\X}$, most learning algorithms consider a restricted model space $$ \F \subset \Y^\X, $$ also called a function class.

A popular example of such a function class is the set of all linear functions. With input space $\X\subseteq\R^d$ and label space $\Y\subseteq\R$, this is written as $$ \F = \left\{ f\in\Y^\X \ :\ f(\g x) = \g w^T \g x,\ \g w\in\R^d \right\} . $$ Here, the parameters of the model that have to be learned from data are the components of the parameter vector $\g w$. This means that any vector $\g w\in\R^d$ uniquely defines a model $f$ in the function class $\F$.