Generalization error bounds

In words...

Generalization error bounds quantify how good an estimate of the risk the empirical error computed on the training set provides.

A first error bound can be derived easily for a single, fixed classifier. However, since the learning algorithm selects its classifier based on the training data, a useful bound must hold for all classifiers in a given class. Deriving such a uniform bound is still not too difficult for finite function classes, but it becomes more involved for infinite function classes.

In pictures...

In maths...

The generalization error bounds studied in this book are of a probabilistic nature and take the form $$ P^N \left\{ R(f) - R_{emp}(f) \geq \epsilon \right\} \leq \delta, $$ where $P^N$ stands for the product probability over the training sample of size $N$. Such bounds are typically formulated as upper bounds on the risk of a model $f$ selected from a function class $\mathcal{F}$, such as $$ \forall f \in \mathcal{F},\quad R(f) \leq R_{emp}(f) + \epsilon $$ holding with probability $1-\delta$. Note that we only need one-sided bounds, i.e., upper bounds on $R(f) - R_{emp}(f)$ and not on $|R(f) - R_{emp}(f)|$, to obtain performance guarantees.
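To make these quantities concrete, here is a minimal sketch, assuming Hoeffding's inequality for a loss bounded in $[0,1]$ and a union bound over a finite class; neither derivation appears above, and the function name `hoeffding_epsilon` is hypothetical. It computes the smallest $\epsilon$ for which $R(f) \leq R_{emp}(f) + \epsilon$ holds with probability $1-\delta$, first for a single fixed classifier and then uniformly over a finite class.

```python
import math

def hoeffding_epsilon(n: int, delta: float, class_size: int = 1) -> float:
    """Smallest epsilon such that, with probability at least 1 - delta,
    R(f) <= R_emp(f) + epsilon holds simultaneously for all f in a finite
    class of `class_size` classifiers, assuming a loss bounded in [0, 1].

    For a single classifier, Hoeffding's inequality gives
        P^N{ R(f) - R_emp(f) >= eps } <= exp(-2 N eps^2);
    setting the right-hand side to delta and solving for eps yields the
    formula below. The union bound multiplies delta by the class size.
    """
    return math.sqrt(math.log(class_size / delta) / (2 * n))

# A single fixed classifier, N = 1000 training examples, delta = 0.05:
print(hoeffding_epsilon(n=1000, delta=0.05))                  # ~0.0387

# Uniform bound over a finite class of 100 classifiers: epsilon grows
# only logarithmically with the class size.
print(hoeffding_epsilon(n=1000, delta=0.05, class_size=100))  # ~0.0617
```

The logarithmic dependence on the class size is what keeps uniform bounds manageable for finite classes; the difficulty mentioned above is that this counting argument is no longer available when $\mathcal{F}$ is infinite.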