# Generalization error bounds

## In words...

Generalization error bounds quantify how accurately the empirical error computed on the training set estimates the true risk.

A first error bound can be easily derived for a single fixed classifier, but, in order to be useful for learning, the bound should hold uniformly over all classifiers in a given class. Such a uniform bound is still not too difficult to derive for finite function classes, but matters become more involved for infinite function classes.
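To see why finite classes remain tractable, here is a standard sketch (assuming a loss bounded in $[0,1]$ so that Hoeffding's inequality applies to each fixed classifier): for a class of $M$ functions, the union bound gives
$$
P^N \left\{ \exists f \in \mathcal{F} : R(f) - R_{emp}(f) \geq \epsilon \right\} \leq \sum_{f \in \mathcal{F}} P^N \left\{ R(f) - R_{emp}(f) \geq \epsilon \right\} \leq M e^{-2N\epsilon^2},
$$
so the price of uniformity is only the factor $M$, which this argument cannot provide when $\mathcal{F}$ is infinite.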

## In pictures...

## In maths...

The generalization error bounds studied in this book will be of a probabilistic nature and will take the form
$$
P^N \left\{ R(f) - R_{emp}(f) \geq \epsilon \right\} \leq \delta,
$$
where $P^N$ stands for the product probability over the training sample of size $N$. Such bounds are typically reformulated as upper bounds on the risk of a model $f$ selected from a function class $\mathcal{F}$, such as
$$
\forall f \in \mathcal{F},\quad R(f) \leq R_{emp}(f) + \epsilon
$$
holding with probability $1-\delta$. Note that we only need one-sided bounds, i.e., upper bounds on $R(f) - R_{emp}(f)$ rather than on $|R(f) - R_{emp}(f)|$, to obtain performance guarantees.
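As a sanity check on a bound of this form for a single fixed classifier, one can simulate the deviation probability and compare it to Hoeffding's inequality, which gives $\delta = e^{-2N\epsilon^2}$ for a loss bounded in $[0,1]$. The sketch below is an illustration only: the true risk $R = 0.3$ and all other numbers are assumptions made up for the example, not values from the text.

```python
import math
import random

# Monte Carlo check of a single-classifier deviation bound.
# Assumption (illustrative): the 0-1 loss of a fixed classifier f on a
# random example is Bernoulli(R), so R_emp is a binomial average.
random.seed(0)

R = 0.3          # true risk of the fixed classifier f (assumed)
N = 200          # training-sample size
eps = 0.05       # deviation threshold epsilon
trials = 20000   # number of simulated training samples

# Count how often the empirical risk underestimates R by at least eps.
exceed = 0
for _ in range(trials):
    errors = sum(random.random() < R for _ in range(N))
    R_emp = errors / N
    if R - R_emp >= eps:
        exceed += 1

p_hat = exceed / trials
delta = math.exp(-2 * N * eps ** 2)  # Hoeffding's bound on this probability
print(f"estimated P(R - R_emp >= eps): {p_hat:.4f}")
print(f"Hoeffding bound delta:         {delta:.4f}")
```

The estimated deviation probability falls well below the Hoeffding value of $e^{-1} \approx 0.368$, illustrating that such bounds are valid but often loose; note that this only checks the bound for one fixed $f$, not uniformly over a class.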