## Error Estimation and Model Selection

#### 126 pages, ISBN 3-89601-225-8

A central problem of machine learning is to decide whether a hypothesis just happens to match the available data well, or whether it actually has a high generalization ability. Closely related is the problem of deciding which of several available learning algorithms or hypothesis languages leads to the highest generalization performance. This is referred to as the model selection problem.
This book centers on an analysis of classifier error rates that predicts the expected generalization behavior of a learning algorithm for a given problem. The analysis results in a model selection algorithm which can solve large model selection problems (e.g., feature subset selection) efficiently.
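To make the feature subset selection setting concrete, here is a minimal wrapper-style sketch, not the algorithm developed in the book: it enumerates feature subsets and keeps the one with the lowest estimated error. The `toy_error` function is an invented stand-in for a real error estimate; its exponential exhaustive search is exactly what an efficient model selection algorithm must avoid.

```python
from itertools import combinations

def select_feature_subset(features, error_estimate):
    """Exhaustive wrapper-style model selection: return the feature
    subset with the lowest estimated error. Illustrative only; the
    search visits all 2^n subsets, which is why efficient model
    selection algorithms are needed for large problems."""
    best_subset, best_error = (), float("inf")
    for k in range(len(features) + 1):
        for subset in combinations(features, k):
            err = error_estimate(subset)
            if err < best_error:
                best_subset, best_error = subset, err
    return best_subset, best_error

# Toy error estimate (assumption): features "a" and "c" each reduce the
# error, and every extra feature adds a small penalty, mimicking the
# estimation variance incurred by irrelevant features.
def toy_error(subset):
    err = 0.5
    if "a" in subset:
        err -= 0.2
    if "c" in subset:
        err -= 0.15
    return err + 0.02 * len(subset)

print(select_feature_subset(["a", "b", "c"], toy_error))
```

Under this toy estimate, the search settles on the subset `("a", "c")`, since adding the uninformative feature `"b"` only incurs the penalty.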

Similar analyses can be applied to quantify the generalization performance of a holdout-testing-based model selection algorithm, and to quantify the optimistic bias imposed on the error estimate by running several learners on the same data set and selecting the one with the lowest holdout error rate.
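The optimistic bias mentioned above can be demonstrated with a small simulation, a sketch under simplifying assumptions: all candidate learners are taken to have the same true error rate, and each holdout error is simulated as the misclassification fraction on an independent test sample. Selecting the minimum of several such estimates then systematically underestimates the true error; the constants (`TRUE_ERROR`, `N_TEST`, `K`) are arbitrary choices for illustration, not values from the book.

```python
import random

random.seed(0)

def holdout_error(true_error, n):
    """Simulated holdout error: fraction of n independent test points
    misclassified by a classifier with true error rate `true_error`."""
    return sum(random.random() < true_error for _ in range(n)) / n

TRUE_ERROR = 0.2   # assumed common generalization error of all candidates
N_TEST = 50        # assumed holdout set size
K = 10             # number of learners compared on the same holdout data

biases = []
for _ in range(2000):
    estimates = [holdout_error(TRUE_ERROR, N_TEST) for _ in range(K)]
    # Selecting the learner with the lowest holdout error makes that
    # estimate an optimistically biased predictor of the true error.
    best = min(estimates)
    biases.append(TRUE_ERROR - best)

mean_bias = sum(biases) / len(biases)
print(f"mean optimistic bias of the selected error estimate: {mean_bias:.3f}")
```

The reported mean bias is clearly positive: the winning holdout error looks better than the selected classifier's true error, which is exactly the effect the analyses in the book quantify.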