Université Paris 6 Pierre et Marie Curie | Université Paris 7 Denis Diderot

CNRS U.M.R. 7599

"Probabilités et Modèles Aléatoires"

**Author(s):**

**MSC Classification Code(s):**

- 62J02 General nonlinear regression
- 62G07 Curve estimation (nonparametric regression, density estimation, etc.)

**Abstract:** This paper is about Gaussian regression with random design, where the observations are i.i.d. It is known from Le Cam (1973, 1975 and 1986) that the rate of convergence of optimal estimators is closely connected to the metric structure of the parameter space with respect to the Hellinger distance. In particular, this metric structure essentially determines the risk when the loss function is a power of the Hellinger distance. For random design regression, one typically uses as loss function the squared $\Bbb{L}_2$-distance between the estimator and the parameter. If the parameter space is bounded with respect to the $\Bbb{L}_\infty$-norm, the two distances are equivalent. Without this assumption, there may be a large distortion between the two distances, resulting in some unusual rates of convergence for the squared $\Bbb{L}_2$-risk, as noticed by Baraud (2002). We first explain this phenomenon and then show that using the Hellinger distance instead of the $\Bbb{L}_2$-distance allows one to recover the usual rates and to perform model selection in great generality. An extension to the $\Bbb{L}_2$-risk is given under a boundedness assumption similar to the one in Wegkamp (2003).
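The distortion between the two distances can be made concrete with a standard computation (a sketch, assuming homoscedastic Gaussian noise of variance $\sigma^2$ and design distribution $\mu$; these specifics are not stated in the abstract):

```latex
% Hellinger distance between the laws of (X, Y) under regression functions
% f and g, where Y = f(X) + \varepsilon, \varepsilon \sim N(0,\sigma^2), X \sim \mu:
h^2(P_f, P_g) = \int \left( 1 - \exp\!\left( -\frac{(f(x)-g(x))^2}{8\sigma^2} \right) \right) d\mu(x).
% Since 1 - e^{-u} \le u for all u \ge 0, one always has
%   h^2(P_f, P_g) \le \|f - g\|_2^2 / (8\sigma^2),
% while the reverse inequality (up to a constant) requires \|f - g\|_\infty
% to be bounded; without that bound the squared L_2-risk and the Hellinger
% risk can converge at different rates.
```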

**Keywords:** *Random design regression ; model selection ; Hellinger distance ; minimax risk ; Besov spaces*

**Date:** 2002-12-19

**Preprint number:** *PMA-783*

**PDF file:** PMA-783.pdf