The Cult of Universality in Statistical Learning Theory
People often ask why the theory and practice of machine learning diverge so sharply. Glance at any article about classification and chances are you will find symbol upon lemma and equation upon inequality, making claims about bounds on error rates that should putatively guide the engineer in solving her problem. The engineer, however, forewarned by her pragmatic colleagues (or having checked a few bounds herself) that they are vacuous for most realistic problems, skips them altogether in her search for useful nuggets in the article. So why do these oft-ignored analyses persist in a field that is largely composed of engineers? From my brief survey of the literature, it seems that one (but by no means the only) reason is the needless preponderance of worst-case thinking. (Being a Panglossian believer in the purity of science and of the intentions of its workers...