In previous articles, we defined generalization error in the context of machine learning and showed how to bound it with various inequalities. We also defined overfitting and saw how it can be remedied by using a validation set. Cross-validation let us sidestep the dilemma of choosing a size for the validation set, and it turned out to be an unbiased estimator of E_out(N - 1), the expected out-of-sample error of a model trained on N - 1 points. In this article, we work through practical examples of which inequalities to use with a validation set and with cross-validation.
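As a quick reminder of what such a bound looks like in practice, here is a minimal sketch. It assumes the inequality in question is the one-sided Hoeffding inequality with a union bound over a finite number of candidate hypotheses, and that the loss is bounded in [0, 1]. The function name `hoeffding_validation_bound` and its parameters are hypothetical, chosen only for illustration.

```python
import math


def hoeffding_validation_bound(e_val, n_val, delta=0.05, n_hypotheses=1):
    """Upper bound on E_out that holds with probability at least 1 - delta.

    Assumptions (not taken from the article): the loss lies in [0, 1], the
    validation points are i.i.d. and were not used for training, and the
    final hypothesis was picked from `n_hypotheses` candidates using the
    validation error.
    """
    if n_val <= 0:
        raise ValueError("n_val must be positive")
    if not 0 < delta < 1:
        raise ValueError("delta must lie strictly between 0 and 1")
    if n_hypotheses < 1:
        raise ValueError("n_hypotheses must be at least 1")

    # One-sided Hoeffding: P(E_out - E_val >= eps) <= exp(-2 * n_val * eps^2).
    # The union bound over the candidates replaces delta with delta / n_hypotheses.
    epsilon = math.sqrt(math.log(n_hypotheses / delta) / (2 * n_val))
    return e_val + epsilon


# Hypothetical numbers: 10% validation error measured on 200 held-out points.
single = hoeffding_validation_bound(0.10, 200)
selected = hoeffding_validation_bound(0.10, 200, n_hypotheses=10)
print(f"single hypothesis:  E_out <= {single:.3f}")
print(f"best of 10 models:  E_out <= {selected:.3f}")
```

The bound gets looser as more candidate models compete on the same validation set, and tighter as the validation set grows.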

Example — Validation Set

Imagine that we have a dataset…

Naja Møgeltoft

Data science and Machine Learning student at Copenhagen University.
