$$ \newcommand{\arginf}{\mathrm{arginf}} \newcommand{\argmin}{\mathrm{argmin}} \newcommand{\argmax}{\mathrm{argmax}} \newcommand{\asconv}[1]{\stackrel{#1-a.s.}{\rightarrow}} \newcommand{\Aset}{\mathsf{A}} \newcommand{\b}[1]{{\mathbf{#1}}} \newcommand{\ball}[1]{\mathsf{B}(#1)} \newcommand{\bbQ}{{\mathbb Q}} \newcommand{\bproof}{\textbf{Proof :}\quad} \newcommand{\bmuf}[2]{b_{#1,#2}} \newcommand{\card}{\mathrm{card}} \newcommand{\chunk}[3]{{#1}_{#2:#3}} \newcommand{\condtrans}[3]{p_{#1}(#2|#3)} \newcommand{\convprob}[1]{\stackrel{#1-\text{prob}}{\rightarrow}} \newcommand{\Cov}{\mathbb{C}\mathrm{ov}} \newcommand{\cro}[1]{\langle #1 \rangle} \newcommand{\CPE}[2]{\PE\lr{#1| #2}} \renewcommand{\det}{\mathrm{det}} \newcommand{\dimlabel}{\mathsf{m}} \newcommand{\dimU}{\mathsf{q}} \newcommand{\dimX}{\mathsf{d}} \newcommand{\dimY}{\mathsf{p}} \newcommand{\dlim}{\Rightarrow} \newcommand{\e}[1]{{\left\lfloor #1 \right\rfloor}} \newcommand{\eproof}{\quad \Box} \newcommand{\eremark}{</WRAP>} \newcommand{\eqdef}{:=} \newcommand{\eqlaw}{\stackrel{\mathcal{L}}{=}} \newcommand{\eqsp}{\;} \newcommand{\Eset}{ {\mathsf E}} \newcommand{\esssup}{\mathrm{essup}} \newcommand{\fr}[1]{{\left\langle #1 \right\rangle}} \newcommand{\falph}{f} \renewcommand{\geq}{\geqslant} \newcommand{\hchi}{\hat \chi} \newcommand{\Hset}{\mathsf{H}} \newcommand{\Id}{\mathrm{Id}} \newcommand{\img}{\text{Im}} \newcommand{\indi}[1]{\mathbf{1}_{#1}} \newcommand{\indiacc}[1]{\mathbf{1}_{\{#1\}}} \newcommand{\indin}[1]{\mathbf{1}\{#1\}} \newcommand{\itemm}{\quad \quad \blacktriangleright \;} \newcommand{\jointtrans}[3]{p_{#1}(#2,#3)} \newcommand{\ker}{\text{Ker}} \newcommand{\klbck}[2]{\mathrm{K}\lr{#1||#2}} \newcommand{\law}{\mathcal{L}} \newcommand{\labelinit}{\pi} \newcommand{\labelkernel}{Q} \renewcommand{\leq}{\leqslant} \newcommand{\lone}{\mathsf{L}_1} \newcommand{\lrav}[1]{\left|#1 \right|} \newcommand{\lr}[1]{\left(#1 \right)} \newcommand{\lrb}[1]{\left[#1 \right]} \newcommand{\lrc}[1]{\left\{#1 \right\}} 
\newcommand{\lrcb}[1]{\left\{#1 \right\}} \newcommand{\ltwo}[1]{\PE^{1/2}\lrb{\lrcb{#1}^2}} \newcommand{\Ltwo}{\mathrm{L}^2} \newcommand{\mc}[1]{\mathcal{#1}} \newcommand{\mcbb}{\mathcal B} \newcommand{\mcf}{\mathcal{F}} \newcommand{\meas}[1]{\mathrm{M}_{#1}} \newcommand{\norm}[1]{\left\|#1\right\|} \newcommand{\normmat}[1]{{\left\vert\kern-0.25ex\left\vert\kern-0.25ex\left\vert #1 \right\vert\kern-0.25ex\right\vert\kern-0.25ex\right\vert}} \newcommand{\nset}{\mathbb N} \newcommand{\N}{\mathcal{N}} \newcommand{\one}{\mathsf{1}} \newcommand{\PE}{\mathbb E} \newcommand{\pminfty}{_{-\infty}^\infty} \newcommand{\PP}{\mathbb P} \newcommand{\projorth}[1]{\mathsf{P}^\perp_{#1}} \newcommand{\Psif}{\Psi_f} \newcommand{\pscal}[2]{\langle #1,#2\rangle} \newcommand{\pscal}[2]{\langle #1,#2\rangle} \newcommand{\psconv}{\stackrel{\PP-a.s.}{\rightarrow}} \newcommand{\qset}{\mathbb Q} \newcommand{\revcondtrans}[3]{q_{#1}(#2|#3)} \newcommand{\rmd}{\mathrm d} \newcommand{\rme}{\mathrm e} \newcommand{\rmi}{\mathrm i} \newcommand{\Rset}{\mathbb{R}} \newcommand{\rset}{\mathbb{R}} \newcommand{\rti}{\sigma} \newcommand{\section}[1]{==== #1 ====} \newcommand{\seq}[2]{\lrc{#1\eqsp: \eqsp #2}} \newcommand{\set}[2]{\lrc{#1\eqsp: \eqsp #2}} \newcommand{\sg}{\mathrm{sgn}} \newcommand{\supnorm}[1]{\left\|#1\right\|_{\infty}} \newcommand{\thv}{{\theta_\star}} \newcommand{\tmu}{ {\tilde{\mu}}} \newcommand{\Tset}{ {\mathsf{T}}} \newcommand{\Tsigma}{ {\mathcal{T}}} \newcommand{\ttheta}{{\tilde \theta}} \newcommand{\tv}[1]{\left\|#1\right\|_{\mathrm{TV}}} \newcommand{\unif}{\mathrm{Unif}} \newcommand{\weaklim}[1]{\stackrel{\mathcal{L}_{#1}}{\rightsquigarrow}} \newcommand{\Xset}{{\mathsf X}} \newcommand{\Xsigma}{\mathcal X} \newcommand{\Yset}{{\mathsf Y}} \newcommand{\Ysigma}{\mathcal Y} \newcommand{\Var}{\mathbb{V}\mathrm{ar}} \newcommand{\zset}{\mathbb{Z}} \newcommand{\Zset}{\mathsf{Z}} $$

2017/10/07 23:39 · douc

Statement

Let $\seq{X_n}{n \in \nset}$ be a sequence of iid random vectors taking values in $\rset^p$ and defined on the same probability space $(\Omega,\mcf, \PP)$.

For any $k \in [1:n]$, denote by $X^k_{(n)}$ the $k$-th nearest neighbor of $X_0$ in the set $\set{X_i}{i \in [1:n]}$; that is, the following properties hold:

$\|X_0-X^1_{(n)}\| \leq \|X_0-X^2_{(n)}\| \leq \ldots \leq \|X_0-X^n_{(n)}\|$.
$\{X^1_{(n)}, \ldots, X^n_{(n)}\}=\{X_1,\ldots,X_n\}$.
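
As a small illustration of these two properties (the numerical values are assumptions chosen for this example only), take $p=1$, $n=3$ and suppose that the realizations are $X_0=0$, $X_1=3$, $X_2=-1$, $X_3=2$. Sorting by distance to $X_0$ gives $$ X^1_{(3)}=X_2, \quad X^2_{(3)}=X_3, \quad X^3_{(3)}=X_1, \qquad \|X_0-X^1_{(3)}\|=1 \leq \|X_0-X^2_{(3)}\|=2 \leq \|X_0-X^3_{(3)}\|=3 $$ so that the distances are ordered and $\{X^1_{(3)},X^2_{(3)},X^3_{(3)}\}=\{X_1,X_2,X_3\}$.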

The following proposition establishes the consistency of the $k$-th nearest neighbor (kNN).

Proposition

For any $k \in \nset^*$, we have $\PP$-a.s., $$ \lim_{n \to \infty} X^k_{(n)}=X_0 $$

Proof

Since $n \mapsto \|X_0-X^k_{(n)}\|$ is non-increasing, $\lim_n X^k_{(n)}=X_0$ holds if and only if, for every $\epsilon \in \bbQ_+^*$, the open ball $\ball{X_0,\epsilon}$ contains at least $k$ points among $(X_n)_{n \geq 1}$. Consequently, if $\lim_n X^k_{(n)}=X_0$ does not hold, then for some $\epsilon \in \bbQ_+^*$, the ball $\ball{X_0,\epsilon}$ contains at most $k-1$ points among $(X_n)_{n \geq 1}$, which in turn implies that $\| X_0- X_{m} \| \geq \epsilon$ for all sufficiently large $m$. Hence, $$ \{\lim_{n \to \infty} X^k_{(n)}=X_0\}^c \subset \cup_{(\epsilon,n) \in \bbQ^*_+ \times \nset^*}A_{\epsilon,n} \quad \mbox{where} \quad A_{\epsilon,n}=\{\forall m \geq n, \ \| X_0- X_{m} \| \geq \epsilon\} $$

Since this union is countable, in order to show that $\lim_{n \to \infty} X^k_{(n)}=X_0$, $\PP$-a.s., that is $\PP\lr{\{\lim_{n \to \infty} X^k_{(n)}=X_0\}^c}=0$, it suffices to show that $\PP(A_{\epsilon,n})=0$ for any $ (\epsilon,n) \in \bbQ^*_+ \times \nset^* $. Using that the $(X_m)_{m \geq 0}$ are iid, \begin{align*} \PP(A_{\epsilon,n})&=\PP(A_{\epsilon,1})=\PP(\cap_{m=1}^\infty \{\| X_0- X_{m} \| \geq \epsilon\})=\PE\lrb{\lim_{\ell \to \infty} \PE\lrb{\left. \prod_{m=1}^\ell\indiacc{\| X_0- X_{m} \| \geq \epsilon} \right| X_0 }}= \PE\lrb{\lim_{\ell \to \infty} h(X_0)^\ell} \end{align*} where $h(X_0)=\PP\lr{ \| X_0- X_{1} \| \geq \epsilon | X_0 }$. The last equality holds because, conditionally on $X_0$, the events $\{\| X_0- X_{m} \| \geq \epsilon\}$, $m \geq 1$, are independent with common probability $h(X_0)$. Since $\lim_{\ell \to \infty} h(X_0)^\ell=\indiacc{h(X_0)=1}$, to complete the proof, we only need to show that $h(X_0) \in [0,1)$, $\PP$-a.s. Equivalently, since we already have $h(X_0) \in [0,1]$, we only need to show that \begin{equation} \label{eq:h} \PP\lr{ h(X_0)=1}=0 \end{equation}

Choose a countable $\epsilon/2$-net $(a_n)_{n \in \nset}$ of $\rset^p$, that is, $\rset^p \subset \cup_{n \in \nset} A_n$ where $A_n = \ball{a_n,\epsilon/2}$.
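
One concrete choice of such a covering, given here as an illustration rather than as the construction intended in the original proof: let $(a_n)_{n \in \nset}$ be an enumeration of $\bbQ^p$, which is countable and dense in $\rset^p$. For any $x \in \rset^p$, density yields some index $n$ with $\|x-a_n\| < \epsilon/2$, so that $$ x \in \ball{a_n,\epsilon/2}=A_n \quad \mbox{and hence} \quad \rset^p = \cup_{n \in \nset} A_n $$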

Note that on $\{h(X_0)=1\}\cap \{X_0 \in A_n\} $, we have $\|X_0-a_n\| < \epsilon/2$ and, since $X_1$ is independent of $X_0$, \begin{align*} 1=h(X_0)=\PP\lr{ \| X_0- X_{1} \| \geq \epsilon | X_0 } &\leq \PP\lr{ \| X_0-a_n\| + \| a_n-X_{1} \| \geq \epsilon | X_0 } \\ &\leq \PP\lr{ \epsilon/2 + \| a_n-X_{1} \| \geq \epsilon | X_0 }= \PP\lr{\| a_n-X_{1} \|\geq \epsilon/2 | X_0 }\\ &=\PP\lr{\| a_n-X_{1} \| \geq \epsilon/2 }=\PP(X_1 \notin A_n)=\PP(X_0 \notin A_n) \end{align*} Hence, $\{h(X_0)=1\}\cap \{X_0 \in A_n\} \subset \{1=\PP(X_0 \notin A_n)\}\cap \{X_0 \in A_n\}$, and therefore $$ \PP\lr{h(X_0)=1,X_0 \in A_n} \leq \PP\lr{1 = \PP(X_0 \notin A_n),X_0 \in A_n}=\indiacc{1 = \PP(X_0 \notin A_n)} \PP(X_0 \in A_n) =0 $$ where the last equality holds because $\PP(X_0 \notin A_n)=1$ forces $\PP(X_0 \in A_n)=0$. This implies that

$$ \PP\lr{ h(X_0)=1} = \PP\lr{ h(X_0)=1, X_0 \in \cup_{n \in \nset}A_n } \leq \sum_{n \in \nset} \PP\lr{h(X_0)=1,X_0 \in A_n}=0 $$ which proves \eqref{eq:h} and completes the proof of the Proposition.
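
As a sanity check on a specific case (the uniform distribution is an assumption made for this example only, not part of the statement), take $p=1$, $X_n \sim \unif([0,1])$ and $\epsilon \in \bbQ_+^*$ with $\epsilon \leq 1$. By independence of $X_0$ and $X_1$, we may write $h(x)=\PP\lr{\|x-X_1\| \geq \epsilon}$. For any $x \in [0,1]$, the set $(x-\epsilon,x+\epsilon) \cap [0,1]$ is an interval of length at least $\epsilon$, so that $$ h(x)=1-\PP\lr{\|x-X_1\| < \epsilon} \leq 1-\epsilon <1 \quad \mbox{and} \quad \PP(A_{\epsilon,n})=\PE\lrb{\lim_{\ell \to \infty} h(X_0)^\ell} \leq \lim_{\ell \to \infty} (1-\epsilon)^\ell=0 $$ in agreement with \eqref{eq:h}. The case $\epsilon > 1$ then follows from the inclusion $A_{\epsilon,n} \subset A_{1,n}$.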

Welcome to Randal Douc's wiki
