Université Paris 6
Pierre et Marie Curie
Université Paris 7
Denis Diderot

CNRS U.M.R. 7599
``Probabilités et Modèles Aléatoires''

Smooth discrimination analysis

Author(s):

MSC Classification Code(s):

Abstract: Discriminant analysis for two data sets in $\R^d$ with probability densities $f$ and $g$ can be based on estimating the set $G = \{x: f(x) \geq g(x)\}$. We consider applications where it is appropriate to assume that the region $G$ has a smooth boundary or belongs to another nonparametric class of sets. In particular, this assumption makes sense if discrimination is used as a data-analytic tool. We consider decision rules based on minimisation of the empirical risk over the whole class of sets and over sieves, and we obtain their rates of convergence. We show that these rules achieve optimal rates for estimation of $G$ and optimal rates of convergence for Bayes risks. An interesting conclusion is that the optimal rates for Bayes risks can be very fast, in particular faster than the "parametric" root-$n$ rate. These fast rates cannot be guaranteed for plug-in rules.
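
As a rough illustration of empirical risk minimisation over a sieve, the sketch below works in one dimension and is not taken from the paper. It assumes a hypothetical sieve of half-lines $G_t = \{x: x \leq t\}$ indexed by a finite grid of thresholds, and it assumes the empirical risk is the average of the two empirical misclassification frequencies. The function names, the Gaussian samples and the grid are all illustrative choices.

```python
import random


def empirical_risk(xs, ys, in_set):
    """Average of the two empirical error frequencies for a candidate set.

    xs: sample assumed drawn from density f (should fall inside the set).
    ys: sample assumed drawn from density g (should fall outside the set).
    in_set: predicate deciding membership in the candidate set.
    """
    miss_f = sum(1 for x in xs if not in_set(x)) / len(xs)
    miss_g = sum(1 for y in ys if in_set(y)) / len(ys)
    return 0.5 * (miss_f + miss_g)


def erm_over_sieve(xs, ys, thresholds):
    """Pick the threshold t whose half-line {x <= t} minimises empirical risk."""
    best_t = min(
        thresholds,
        key=lambda t: empirical_risk(xs, ys, lambda x: x <= t),
    )
    return best_t, empirical_risk(xs, ys, lambda x: x <= best_t)


# Illustrative data (an assumption): f = N(-1, 1) and g = N(1, 1),
# so that the true set G = {x: f(x) >= g(x)} is the half-line {x <= 0}.
rng = random.Random(0)
xs = [rng.gauss(-1.0, 1.0) for _ in range(500)]
ys = [rng.gauss(1.0, 1.0) for _ in range(500)]

# Hypothetical sieve: thresholds on a grid of step 0.1 over [-3, 3].
thresholds = [k / 10 for k in range(-30, 31)]
t_hat, risk_hat = erm_over_sieve(xs, ys, thresholds)
print(t_hat, risk_hat)
```

With these illustrative samples the selected threshold should lie near 0, the boundary of the true set. A finer grid corresponds to a richer sieve, at the cost of a larger class over which the empirical risk is minimised.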

Keywords: discrimination analysis; optimal rates; empirical risk; Bayes risk; sieves

Date: 1999-06-22

Preprint number: PMA-511