Probabilities of discrepancy between minima of cross-validation, Vapnik bounds and true risks

Open Access | Sep 2010

Abstract

Two known approaches to complexity selection are considered: n-fold cross-validation and structural risk minimization. In either approach, the complexity indicated as optimal (the minimum of a generalization error estimate or of a bound) may differ from the genuine minimum of the unknown true risks. In the paper, this problem is posed in a novel quantitative way. We state and prove theorems demonstrating how one can calculate pessimistic probabilities of discrepancy between these minima for given conditions of an experiment. The probabilities are calculated in terms of all relevant constants: the sample size, the number of cross-validation folds, the capacity of the set of approximating functions and bounds on this set. We report experiments carried out to validate the results.
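
As a rough illustration of complexity selection by n-fold cross-validation, the sketch below picks the number of bins of a piecewise-constant regressor on [0, 1) by minimizing the cross-validation error estimate. It is not the paper's experimental setup: the model family, the synthetic data, and the names `fit_bins`, `predict`, `cv_error` and `select_complexity` are all assumptions made for this example. The selected minimum is the quantity that may disagree with the minimum of the true risks.

```python
import math
import random
from statistics import mean

def fit_bins(xs, ys, m):
    """Piecewise-constant fit on [0, 1): per-bin mean of y (global mean if a bin is empty)."""
    overall = mean(ys)
    sums, counts = [0.0] * m, [0] * m
    for x, y in zip(xs, ys):
        b = min(int(x * m), m - 1)
        sums[b] += y
        counts[b] += 1
    return [sums[b] / counts[b] if counts[b] else overall for b in range(m)]

def predict(levels, x):
    """Return the fitted level of the bin that x falls into."""
    m = len(levels)
    return levels[min(int(x * m), m - 1)]

def cv_error(xs, ys, m, n_folds):
    """n-fold cross-validation estimate of the mean squared error for m bins."""
    n = len(xs)
    fold_errors = []
    for k in range(n_folds):
        val = set(range(k, n, n_folds))  # fold k: every n_folds-th point
        train_x = [xs[i] for i in range(n) if i not in val]
        train_y = [ys[i] for i in range(n) if i not in val]
        levels = fit_bins(train_x, train_y, m)
        fold_errors.append(mean((predict(levels, xs[i]) - ys[i]) ** 2 for i in val))
    return mean(fold_errors)

def select_complexity(xs, ys, candidates, n_folds=5):
    """Return the candidate bin count with the smallest cross-validation error."""
    scores = {m: cv_error(xs, ys, m, n_folds) for m in candidates}
    return min(scores, key=scores.get), scores

# Hypothetical synthetic data: a noisy sine on [0, 1).
random.seed(0)
xs = [random.random() for _ in range(100)]
ys = [math.sin(6.0 * x) + random.gauss(0.0, 0.2) for x in xs]
candidates = [1, 2, 4, 8, 16, 32]
best_m, scores = select_complexity(xs, ys, candidates, n_folds=5)
```

Here `best_m` is only the minimizer of an estimate computed from one finite sample; with a different sample or a different number of folds it can change, which is the kind of discrepancy the paper quantifies.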

DOI: https://doi.org/10.2478/v10006-010-0039-x | Journal eISSN: 2083-8492 | Journal ISSN: 1641-876X
Language: English
Page range: 525 - 544
Published on: Sep 27, 2010
Published by: University of Zielona Góra
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2010 Przemysław Klęsk, published by University of Zielona Góra
This work is licensed under the Creative Commons License.

Volume 20 (2010): Issue 3 (September 2010)