Fourth recitation

Hanzhe Li

Reading guide
  • This chapter focuses on continuous distributions, multivariate normal vectors, change-of-variables formulas, GOE, and several integral estimates.
  • For multivariate problems, the main objects are densities, linear transformations, covariance matrices, and orthogonal invariance.
  • Jensen's inequality, tail integral formulas, and normal tail bounds will be used later for laws of large numbers and concentration inequalities.

Tip. In normal distribution problems, zero covariance implies independence only when the variables are jointly Gaussian, for example when they are linear combinations of the components of a single Gaussian vector.

Exercise 3.1

Note

For continuous densities and convolution problems, handle normalization, substitutions, Jacobians, and independence separately. Check the support before doing each integral.

Problem

Find the normalizing constants for the following densities:

  1. $f(x)=\frac{1}{C}\frac{1}{\sqrt{x(1-x)}}$, $0<x<1$;
  2. $f(x)=\frac{1}{C}e^{-x-e^{-x}}$, $x\in\mathbb{R}$.
Proof

  1. From $\int_{0}^{1}f(x)\,dx=1$,

$$\frac{1}{C}\int_{0}^{1}\frac{dx}{\sqrt{x(1-x)}}=1.$$

Compute the integral. Let $x=\sin^2 t$ with $0<t<\pi/2$. Then $dx=2\sin t\cos t\,dt$, and

$$\sqrt{x(1-x)}=\sin t\cos t,\quad \int_{0}^{1}\frac{dx}{\sqrt{x(1-x)}}=\int_{0}^{\pi/2}2\,dt=\pi.$$

Thus

$$\frac{\pi}{C}=1 \quad\Rightarrow\quad C=\pi.$$

  2. Since $\frac{d}{dx}e^{-e^{-x}}=e^{-x-e^{-x}}$, we get $\int_{\mathbb{R}}e^{-x-e^{-x}}\,dx=\left[e^{-e^{-x}}\right]_{-\infty}^{+\infty}=1-0=1$, so $C=1$.
Problem

Find the normalizing constant $C$ of the density proportional to $|x_1-x_2|e^{-\frac12(x_1^2+x_2^2)}$ on $\mathbb{R}^2$, that is, compute

$$C=\int_{\mathbb{R}^2} |x_1 - x_2| e^{-\frac{1}{2}(x_1^2 + x_2^2)} \, dx_1\, dx_2.$$
Proof

Let $u=x_1-x_2$ and $v=x_1+x_2$. Then $x_1^2+x_2^2=\frac12(u^2+v^2)$, and the Jacobian of $(x_1,x_2)$ with respect to $(u,v)$ satisfies

$$|J|=\frac12.$$

Therefore

$$C = \int_{\mathbb{R}^2} |u| e^{-\frac{1}{4}(u^2 + v^2)} \cdot \frac{1}{2} \, du\, dv.$$

Hence, using the symmetry of the integrand in $u$ and in $v$,

$$C = 2 \int_{0}^{+\infty} u e^{-\frac{1}{4}u^2} \, du \cdot \int_{0}^{+\infty} e^{-\frac{1}{4}v^2} \, dv = 2\cdot 2\cdot\sqrt{\pi} = 4\sqrt{\pi}.$$

Thus $C=4\sqrt{\pi}$.
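As a cross-check (not part of the original solution), the same constant can be read off from a Gaussian expectation. The sketch below assumes the standard fact $\mathbb{E}|N(0,\sigma^2)|=\sigma\sqrt{2/\pi}$; the variables $G_1,G_2$ are introduced here only for illustration and denote i.i.d. $N(0,1)$ random variables, so that $G_1-G_2\sim N(0,2)$.

```latex
\begin{aligned}
C &= 2\pi \int_{\mathbb{R}^2} |x_1-x_2|\,\frac{1}{2\pi}e^{-\frac12(x_1^2+x_2^2)}\,dx_1\,dx_2
   = 2\pi\,\mathbb{E}|G_1-G_2| \\
  &= 2\pi\cdot\sqrt{2}\cdot\sqrt{\tfrac{2}{\pi}}
   = 2\pi\cdot\frac{2}{\sqrt{\pi}}
   = 4\sqrt{\pi}.
\end{aligned}
```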

Problem

Let $X,Y$ be independent exponential random variables with parameter $1$. Find the density of $U=X+Y$.

Proof

By the convolution formula,

$$f_U(u)=\mathbb{1}_{u\geq 0}\int_0^u e^{-x}e^{x-u}\,dx=u e^{-u}\mathbb{1}_{u\geq 0}.$$
Problem
Let $X,Y,Z$ be independent, and suppose each is uniformly distributed on $[0,1]$.

(1) Find the density of $-\log X$.

(2) Prove that $W=(XY)^Z$ is also uniformly distributed on $[0,1]$.

Proof

(1) For $x\geq 0$,

$$\mathbb{P}(-\log X \leq x)=\mathbb{P}(X\geq e^{-x})=1-e^{-x},$$

so the density is $e^{-x}\mathbb{1}_{x\geq 0}$: $-\log X$ has the exponential distribution with parameter $1$.

(2) Since $W=(XY)^Z$,

$$-\log W=Z(-\log X-\log Y).$$

By part (1), $W$ is uniform on $[0,1]$ exactly when $-\log W$ is exponential with parameter $1$, so it is enough to prove that $Z(-\log X-\log Y)$ has the exponential distribution with parameter $1$. Let

$$T=-\log X-\log Y.$$

By part (1) and the previous problem, $T$ is a sum of two independent exponential variables with parameter $1$, so

$$f_T(t) = te^{-t} \mathbb{1}_{t \geq 0}.$$

Also, since $T$ and $Z$ are independent,

$$f_{T,Z}(t, z) = te^{-t} \mathbb{1}_{t \geq 0} \mathbb{1}_{0 \leq z \leq 1}.$$

It remains to prove that $R=TZ$ has the exponential distribution with parameter $1$.

Use the change of variables $r=tz$, $s=t$, so that $z=r/s$ and $t=s$. Then

$$J = \begin{pmatrix} \frac{\partial z}{\partial r} & \frac{\partial z}{\partial s} \\ \frac{\partial t}{\partial r} & \frac{\partial t}{\partial s} \end{pmatrix} = \begin{pmatrix} \frac{1}{s} & -\frac{r}{s^2} \\ 0 & 1 \end{pmatrix}.$$

Thus

$$|\det(J)| = \frac{1}{s}.$$

By the density transformation formula,

$$f_{R,S}(r,s)=f_{T,Z}\!\left(s,\tfrac{r}{s}\right)\frac{1}{s}=e^{-s}\mathbb{1}_{0\leq r\leq s}.$$

Therefore

$$f_R(r)=\int_{\mathbb{R}}e^{-s}\mathbb{1}_{0\leq r\leq s}\,ds =\mathbb{1}_{r\geq 0}\int_r^{+\infty}e^{-s}\,ds =e^{-r}\mathbb{1}_{r\geq 0}.$$
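The conclusion of part (2) can be checked by simulation. The following is a minimal sketch using only the Python standard library; the helper names `sample_w` and `empirical_cdf`, the seed, and the sample size are illustrative choices, not part of the original notes. If $W$ is uniform on $[0,1]$, the empirical CDF at $t$ should be close to $t$.

```python
import random

def sample_w(rng):
    """Draw one sample of W = (X*Y)**Z with X, Y, Z i.i.d. Uniform(0, 1)."""
    x, y, z = rng.random(), rng.random(), rng.random()
    return (x * y) ** z

def empirical_cdf(samples, t):
    """Fraction of samples that are <= t."""
    return sum(1 for s in samples if s <= t) / len(samples)

rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [sample_w(rng) for _ in range(200_000)]

# For a uniform law the empirical CDF at t should be close to t.
for t in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(t, round(empirical_cdf(samples, t), 3))
```

A simulation of this kind only supports the claim numerically; the change-of-variables argument above is the actual proof.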

Exercise 3.2

Note

There are three common ways to compute moments: direct integration, recursion or integration by parts, and generating functions. It is useful to compare the moments of the normal law and the semicircle law.

Problem

Let $X$ have the standard normal distribution, and let $Y$ have the standard Wigner semicircle law. Find all their moments.

Proof

Both distributions are symmetric about $0$, so all odd moments vanish and we only need the moments of order $2k$.

$$\begin{aligned} \mathbb{E}[X^{2k}] &= \int_{-\infty}^{+\infty} x^{2k} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2} \, dx \\ &= \frac{2^{k+\frac{1}{2}}}{\sqrt{2\pi}} \int_{0}^{+\infty} u^{k-\frac{1}{2}} e^{-u} \, du \quad (\text{let } u = \tfrac{x^2}{2}) \\ &= \frac{2^{k+\frac{1}{2}}}{\sqrt{2\pi}} \, \Gamma\!\left(k + \frac{1}{2}\right) \\ &= \frac{2^{k+\frac{1}{2}}}{\sqrt{2\pi}} \cdot \left( \frac{\sqrt{\pi} \, (2k-1)!!}{2^k} \right) \quad (\text{using } \Gamma(n+1)=n\Gamma(n) \text{ and } \Gamma(\tfrac12)=\sqrt{\pi}) \\ &= (2k-1)!!. \end{aligned}$$

For the semicircle law, whose density is $\frac{1}{2\pi}\sqrt{4-y^2}$ on $[-2,2]$,

$$\begin{aligned} \mathbb{E}[Y^{2k}] &= 2 \int_{0}^{2} \frac{1}{2\pi} y^{2k} \sqrt{4 - y^2} \, dy \\ &\overset{y = 2\sin\theta}{=} \frac{2^{2k+2}}{\pi} \int_{0}^{\frac{\pi}{2}} (\sin\theta)^{2k} \cos^2\theta \, d\theta \\ &= \frac{2^{2k+2}}{\pi} \int_{0}^{\frac{\pi}{2}} \left[ (\sin\theta)^{2k} - (\sin\theta)^{2k+2} \right] \, d\theta. \end{aligned}$$

From real analysis,

$$\int_{0}^{\frac{\pi}{2}} \sin^n x \, dx = \begin{cases} \dfrac{n-1}{n} \cdot \dfrac{n-3}{n-2} \cdots \dfrac{1}{2} \cdot \dfrac{\pi}{2}, & n \text{ positive even}, \\ \dfrac{n-1}{n} \cdot \dfrac{n-3}{n-2} \cdots \dfrac{2}{3} \cdot 1, & n \text{ positive odd}. \end{cases}$$

Substituting $n=2k$ and $n=2k+2$ gives $\int_0^{\pi/2}\left[(\sin\theta)^{2k}-(\sin\theta)^{2k+2}\right]d\theta=\frac{\pi}{2}\cdot\frac{1}{4^k}\binom{2k}{k}\cdot\frac{1}{2k+2}$, and hence the final answer

$$\mathbb{E}[Y^{2k}]=\frac{1}{k+1}\binom{2k}{k}.$$
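The two moment sequences can be compared with exact arithmetic. The sketch below is an illustration under the formulas above: `wallis_even` (a name chosen here) returns the even Wallis integral divided by $\pi/2$, so the factor $\pi$ cancels and the semicircle moment $\frac{2^{2k+2}}{\pi}(I_{2k}-I_{2k+2})$ becomes a rational number that can be compared with the Catalan number.

```python
from fractions import Fraction
from math import comb

def wallis_even(n):
    """Return I_n / (pi/2) for even n >= 0, where I_n = int_0^{pi/2} sin^n x dx."""
    value = Fraction(1)
    for j in range(2, n + 1, 2):
        value *= Fraction(j - 1, j)
    return value

def semicircle_moment(k):
    """E[Y^(2k)] = 2^(2k+2)/pi * (I_{2k} - I_{2k+2}); the factors of pi cancel."""
    return 2 ** (2 * k + 1) * (wallis_even(2 * k) - wallis_even(2 * k + 2))

def normal_moment(k):
    """E[X^(2k)] = (2k-1)!! for X ~ N(0, 1)."""
    value = 1
    for odd in range(1, 2 * k, 2):
        value *= odd
    return value

# Columns: k, normal moment, semicircle moment, Catalan number.
for k in range(6):
    print(k, normal_moment(k), semicircle_moment(k), comb(2 * k, k) // (k + 1))
```

The normal moments $(2k-1)!!$ count all pairings of $2k$ points, while the Catalan numbers count only the non-crossing ones, which is the comparison the note at the start of this exercise points to.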
Problem

The joint density of $(X,Y)$ is

$$f(x, y) = C^{-1} x (y - x) e^{-y}, \quad 0 \leq x \leq y < \infty.$$

Find the constant $C$, the conditional density $f_{X|Y}$, and the conditional expectation $\mathbb{E}[Y|X]$.

Proof

Normalization requires

$$\begin{aligned} 1 &= \int_{0}^{+\infty} \int_{0}^{y} C^{-1} x (y - x)e^{-y} \, dx \, dy \\ &= C^{-1}\int_{0}^{+\infty} e^{-y} \left(\int_{0}^{y} x(y - x) \, dx\right) dy \\ &= C^{-1}\int_{0}^{+\infty} \frac{1}{6} y^3 e^{-y} \, dy \\ &= C^{-1} \cdot \frac{1}{6} \cdot 3! = C^{-1}. \end{aligned}$$

Thus $C=1$. The same computation gives the marginal density $f_Y(y)=\frac16 y^3e^{-y}\mathbb{1}_{y\geq 0}$, so for $y>0$

$$\begin{aligned} f_{X|Y}(x\mid y) &= \frac{f(x, y)}{f_Y(y)} \\ &= \frac{x(y - x)e^{-y} \mathbb{1}_{0 \leq x \leq y}}{\frac{1}{6}y^3 e^{-y}} \\ &= \frac{6x(y - x)}{y^3} \mathbb{1}_{0 \leq x \leq y}. \end{aligned}$$

Similarly, $f_X(x)=\int_x^\infty x(y-x)e^{-y}\,dy=xe^{-x}\mathbb{1}_{x\geq 0}$, so for $x>0$

$$\begin{aligned} f_{Y|X}(y\mid x) &= \frac{f(x, y)}{f_X(x)} \\ &= \frac{x(y - x)e^{-y} \mathbb{1}_{0 \leq x \leq y}}{x e^{-x}} \\ &= (y - x)e^{-(y-x)} \mathbb{1}_{y \geq x}. \end{aligned}$$

Therefore

$$\mathbb{E}[Y \mid X=x] = \int_{x}^{\infty} y (y - x) e^{-(y-x)} \, dy = \int_0^\infty (x+t)\,t e^{-t}\,dt = x+2, \qquad\text{that is,}\qquad \mathbb{E}[Y\mid X]=X+2.$$
Problem
Let $g$ be continuously differentiable on $\mathbb{R}$, and assume that both $g$ and $g'$ are bounded. Prove that
$$X \sim N(\mu, \sigma^2) \implies \mathbb{E}[(X - \mu)g(X)] = \sigma^2 \mathbb{E}[g'(X)].$$
Proof

Writing $X=\mu+\sigma Z$ and applying the standard case to $z\mapsto g(\mu+\sigma z)$, it is enough to prove $\mathbb{E}[g'(Z)]=\mathbb{E}[Zg(Z)]$ for $Z\sim N(0,1)$. We use $e^{-t^2/2}=\int_t^\infty we^{-w^2/2}\,dw=-\int_{-\infty}^t we^{-w^2/2}\,dw$.

$$\begin{aligned} \mathbb{E}[g'(Z)] &= \frac{1}{\sqrt{2 \pi}} \int_{\mathbb{R}} e^{-t^2 / 2} g'(t) \,dt \\ &= \frac{1}{\sqrt{2 \pi}} \int_0^{\infty} g'(t) \int_t^{\infty} w e^{-w^2 / 2} \,dw \,dt-\frac{1}{\sqrt{2 \pi}} \int_{-\infty}^0 g'(t) \int_{-\infty}^t w e^{-w^2 / 2} \,dw \,dt \\ &\overset{\text{Fubini}}{=} \frac{1}{\sqrt{2 \pi}} \int_0^{\infty} w e^{-w^2 / 2}\left[\int_0^w g'(t) \,dt\right] dw-\frac{1}{\sqrt{2 \pi}} \int_{-\infty}^0 w e^{-w^2 / 2}\left[\int_w^0 g'(t) \,dt\right] dw \\ &= \frac{1}{\sqrt{2 \pi}} \int_0^{\infty} w e^{-w^2 / 2}\,\bigl(g(w)-g(0)\bigr) \,dw+\frac{1}{\sqrt{2 \pi}} \int_{-\infty}^0 w e^{-w^2 / 2}\,\bigl(g(w)-g(0)\bigr) \,dw \\ &= \mathbb{E}[Z g(Z)]-g(0)\,\mathbb{E}[Z] = \mathbb{E}[Z g(Z)]. \end{aligned}$$
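The identity $\mathbb{E}[Zg(Z)]=\mathbb{E}[g'(Z)]$ can be sanity-checked numerically. The sketch below is an illustration with the test function $g=\sin$ (a choice made here, satisfying the boundedness assumptions) and a plain trapezoidal rule; `gaussian_expectation`, the truncation width, and the step count are hypothetical helper choices. For this $g$ both sides equal $\mathbb{E}[\cos Z]=e^{-1/2}$, the real part of the characteristic function of $Z$ at $1$.

```python
import math

def gaussian_expectation(func, half_width=10.0, steps=200_000):
    """Approximate E[func(Z)], Z ~ N(0, 1), by the trapezoidal rule on [-half_width, half_width]."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        t = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # endpoint weights of the trapezoidal rule
        total += weight * func(t) * math.exp(-t * t / 2)
    return total * h / math.sqrt(2 * math.pi)

lhs = gaussian_expectation(lambda t: t * math.sin(t))  # E[Z g(Z)] with g = sin
rhs = gaussian_expectation(math.cos)                   # E[g'(Z)] with g' = cos
print(lhs, rhs, math.exp(-0.5))
```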

Conversely, suppose $Z$ satisfies

$$\mathbb{E}[Zg(Z)]=\mathbb{E}[g'(Z)]$$

for all admissible test functions $g$. Consider the Stein equation

$$xf(x)-f'(x)=h(x)-\mathbb{E}[h(G)],$$

where $G\sim N(0,1)$. For a fixed bounded $h$, this is a first-order linear ordinary differential equation in $f$, and it has a unique bounded solution. Take $h(x)=\mathbb{1}_{x\leq z}$. Substituting $x=Z$ and taking expectations gives

$$0=\mathbb{E}[Zf(Z)]-\mathbb{E}[f'(Z)]=\mathbb{P}(Z\leq z)-\mathbb{P}(G\leq z).$$

Since $z$ is arbitrary, $Z\overset{\mathrm{law}}{=}G$.

Problem

Let $\{X_r : 1 \leq r \leq n\}$ be i.i.d. random variables with finite variance. Set

$$\overline{X} = \frac{1}{n} \sum_{k=1}^n X_k.$$

Find $\operatorname{Cov}(\overline{X}, X_k - \overline{X})$.

Proof
$$\begin{aligned} \operatorname{Cov}\left(\overline{X}, X_k-\overline{X}\right) &=\operatorname{Cov}\left(\overline{X}, X_k\right) -\operatorname{Cov}\left(\overline{X}, \overline{X}\right)\\ &=\operatorname{Cov}\left(\frac{X_k}{n}, X_k\right) -\operatorname{Var}(\overline{X}) \qquad \text{(independence of } X_j, X_k \text{ for } j\neq k)\\ &=\frac{\operatorname{Var}(X_k)}{n}-\frac{1}{n^2}\operatorname{Var}\!\left(\sum_{j=1}^n X_j\right)\\ &=\frac{\operatorname{Var}(X_k)}{n}-\frac{1}{n}\operatorname{Var}(X_k)\\ &=0. \end{aligned}$$
Problem

Let $X$ be a nonnegative random variable. Prove that for every $r>0$,

$$\mathbb{E}[X^r] = \int_0^\infty r t^{r-1}\mathbb{P}(X \geq t)\,dt.$$
Proof
$$\begin{aligned} \mathbb{E}[X^r] &= \mathbb{E}\!\left[\int_0^X r t^{r-1}\,dt\right] \\ &= \mathbb{E}\!\left[\int_0^{+\infty} r t^{r-1}\mathbb{1}_{t \leq X}\,dt\right] \\ &= \int_0^{\infty} r t^{r-1}\mathbb{E}[\mathbb{1}_{t \leq X}]\,dt \qquad \text{(Fubini, nonnegative integrand)} \\ &= \int_0^{\infty} r t^{r-1}\mathbb{P}(X \geq t)\,dt. \end{aligned}$$
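As a quick illustration of the formula (an example chosen here, not taken from the original notes), let $X$ be exponential with parameter $1$, so that $\mathbb{P}(X\geq t)=e^{-t}$ for $t\geq 0$. Both sides can then be evaluated with the Gamma function and they agree:

```latex
\begin{aligned}
\int_0^\infty r t^{r-1}\,\mathbb{P}(X\geq t)\,dt
  &= r\int_0^\infty t^{r-1}e^{-t}\,dt = r\,\Gamma(r) = \Gamma(r+1), \\
\mathbb{E}[X^r]
  &= \int_0^\infty x^{r}e^{-x}\,dx = \Gamma(r+1).
\end{aligned}
```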
Problem

For i.i.d. random variables $X$ and $Y$, prove:

(1) $U=X+Y$ and $V=X-Y$ are uncorrelated, but need not be independent.

(2) If $X,Y\sim N(0,1)$, then $U$ and $V$ are independent.

Proof

(1) Assume finite second moments, so that the covariance is defined. Then

$$\mathbb{E}[(X+Y)(X-Y)]=\mathbb{E}[X^2]-\mathbb{E}[Y^2]=0.$$

Also,

$$\mathbb{E}[X+Y]\,\mathbb{E}[X-Y]=\mathbb{E}[X]^2-\mathbb{E}[Y]^2=0.$$

Thus $U$ and $V$ are uncorrelated. If $X,Y$ are independent and both have the Bernoulli distribution $B(1,\frac12)$, then $X+Y$ and $X-Y$ are not independent: for instance $\mathbb{P}(U=2,V=0)=\frac14$, while $\mathbb{P}(U=2)\,\mathbb{P}(V=0)=\frac14\cdot\frac12=\frac18$.
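The Bernoulli counterexample is small enough to enumerate exactly. The sketch below (helper names such as `pmf` and `expect` are illustrative) builds the joint distribution of $(U,V)$ from the four equally likely outcomes of $(X,Y)$, then compares the covariance with zero and one joint probability with the product of its marginals.

```python
from fractions import Fraction
from itertools import product

# Joint pmf of (U, V) = (X + Y, X - Y) for independent Bernoulli(1/2) variables X, Y.
pmf = {}
for x, y in product((0, 1), repeat=2):
    key = (x + y, x - y)
    pmf[key] = pmf.get(key, Fraction(0)) + Fraction(1, 4)

def expect(func):
    """Expectation of func(U, V) under the joint pmf."""
    return sum(p * func(u, v) for (u, v), p in pmf.items())

cov = expect(lambda u, v: u * v) - expect(lambda u, v: u) * expect(lambda u, v: v)
p_u2 = sum(p for (u, v), p in pmf.items() if u == 2)
p_v0 = sum(p for (u, v), p in pmf.items() if v == 0)
p_joint = pmf.get((2, 0), Fraction(0))

# Zero covariance, but the joint probability differs from the product of marginals.
print(cov, p_joint, p_u2 * p_v0)
```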

(2)

The joint density of $(X,Y)$ is

$$f_{X , Y}(x , y)=\frac{1}{2 \pi} e^{-\frac{1}{2} x^2-\frac{1}{2} y^2}.$$

Let $u=x+y$ and $v=x-y$, so that $x=\frac{u+v}{2}$ and $y=\frac{u-v}{2}$. Then

$$J=\begin{pmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{pmatrix}=\begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & -\frac{1}{2} \end{pmatrix}, \qquad |\det J|=\frac{1}{2},$$

and, since $x^2+y^2=\frac12(u^2+v^2)$,

$$\begin{aligned} f_{U ,V}(u , v) & =\frac{1}{4 \pi} e^{-\frac{1}{4} u^2-\frac{1}{4} v^2} \\ & =\frac{1}{\sqrt{2 \pi} \cdot \sqrt{2}} e^{-\frac{1}{2}\left(\frac{u}{\sqrt{2}}\right)^2}\cdot \frac{1}{\sqrt{2 \pi} \cdot \sqrt{2}} e^{-\frac{1}{2}\left(\frac{v}{\sqrt{2}}\right)^2}. \end{aligned}$$

The joint density factorizes, so $U$ and $V$ are independent, each with distribution $N(0,2)$.

Exercise 3.3

Note

Linear transformations of multivariate normal vectors are most cleanly described by covariance matrices. After a linear transformation, first compute the mean and covariance.

Problem

Let $(X,Y)$ have a bivariate standard normal distribution. Find:

(1) the joint density and marginal densities of $X+Y$ and $X-Y$;

(2) $\mathbb{E}[X-Y\mid X+Y]$ and $\operatorname{Var}(X-Y\mid X+Y)$.

Proof

(1) Set $U=X+Y$ and $V=X-Y$, and suppose

$$(X,Y)\sim N(0,\Sigma).$$

For a bivariate standard normal vector with correlation coefficient $\rho$, write

$$\Sigma=\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.$$

Let

$$D=\begin{pmatrix} 1 & 1\\ 1& -1 \end{pmatrix}.$$

Then

$$(U,V)=(X,Y)D.$$

By Theorem 3.3.3 in the notes,

$$(U,V)\sim N(0,D^T\Sigma D)=:N(0,\Sigma').$$

A direct calculation gives

$$\Sigma'=\begin{pmatrix} 2+2\rho & 0 \\ 0& 2-2\rho \end{pmatrix}.$$

Thus $U,V$ are independent, $U\sim N(0,2+2\rho)$, and $V\sim N(0,2-2\rho)$. For $|\rho|<1$ the joint density is the product of the two marginal densities,

$$f_{U,V}(u,v)=\frac{1}{4\pi\sqrt{1-\rho^2}}\exp\left(-\frac{u^2}{4(1+\rho)}-\frac{v^2}{4(1-\rho)}\right).$$

(2) Since $U,V$ are independent,

$$\mathbb{E}[V\mid U]=\mathbb{E}[V]=0,\qquad \operatorname{Var}(V\mid U)=\operatorname{Var}(V)=2-2\rho.$$
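The "direct calculation" of $\Sigma'=D^T\Sigma D$ can be written out in two steps. This is a worked expansion added for illustration, using only the matrices $\Sigma$ and $D$ defined in the proof:

```latex
\begin{aligned}
\Sigma D &=
\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}
\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
= \begin{pmatrix} 1+\rho & 1-\rho \\ 1+\rho & \rho-1 \end{pmatrix}, \\
D^{T}\Sigma D &=
\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
\begin{pmatrix} 1+\rho & 1-\rho \\ 1+\rho & \rho-1 \end{pmatrix}
= \begin{pmatrix} 2+2\rho & 0 \\ 0 & 2-2\rho \end{pmatrix}.
\end{aligned}
```

The vanishing off-diagonal entry is exactly $\operatorname{Cov}(U,V)=\operatorname{Var}(X)-\operatorname{Var}(Y)=0$, which holds for every $\rho$.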
Problem
Let $X=(X_1,X_2,\cdots,X_n)$ have the multivariate normal distribution $N(0,\Sigma)$, where $\Sigma=(\sigma_{ij})_{i,j=1}^n$ is positive definite. Prove that
$$U = \sum_{k=1}^n a_k X_k \text{ and } V = \sum_{k=1}^n b_k X_k \text{ are independent if and only if } \sum_{j,k=1}^n a_j b_k \sigma_{jk} = 0,$$

where $a_1,\cdots,a_n,b_1,\cdots,b_n$ are real numbers. Also find $\mathbb{E}[U\mid V]$ when $b_1,\cdots,b_n$ are not all zero.

Proof

(1) Linear combinations of the components of a multivariate normal vector are again jointly normal. Hence $(U,V)$ is bivariate normal. Clearly,

$$\mathbb{E}[U]=\mathbb{E}[V]=0.$$

Moreover,

$$\begin{aligned} &U,V\text{ independent}\\ \Leftrightarrow\ &\operatorname{Cov}(U,V)=0\\ \Leftrightarrow\ &\mathbb{E}[UV]=0\\ \Leftrightarrow\ &\sum_{j,k=1}^n a_jb_k\mathbb{E}[X_jX_k] =0\\ \Leftrightarrow\ &\sum_{j,k=1}^n a_jb_k\sigma_{jk} =0. \end{aligned}$$

(2) For jointly normal variables, uncorrelatedness is equivalent to independence. Since $\Sigma$ is positive definite and the $b_k$ are not all zero, $\operatorname{Var}(V)=\sum_{j,k}b_jb_k\sigma_{jk}>0$, and

$$\operatorname{Cov}\!\left(U-\frac{\operatorname{Cov}(U, V)}{\operatorname{Var}(V)}V , V\right)=0.$$

Hence $U-\frac{\operatorname{Cov}(U,V)}{\operatorname{Var}(V)}V$ is independent of $V$. Therefore

$$\mathbb{E}\!\left[U-\frac{\operatorname{Cov}(U,V)}{\operatorname{Var}(V)}V\,\Big|\, V\right] =\mathbb{E}\!\left[U-\frac{\operatorname{Cov}(U,V)}{\operatorname{Var}(V)}V\right]=0.$$

By linearity of conditional expectation,

$$\mathbb{E}[U \mid V] = \frac{\operatorname{Cov}(U, V)}{\operatorname{Var}(V)} \, V.$$

Substituting the covariances gives

$$\mathbb{E}[U \mid V] = \frac{\sum_{j,k} a_j b_k \sigma_{jk}}{\sum_{j,k} b_j b_k \sigma_{jk}}\, V.$$
Problem

Let $X_1,X_2,\cdots,X_n$ be i.i.d. $N(\mu,\sigma^2)$ random variables, and let $\overline{X}=\frac{1}{n}\sum_{i=1}^n X_i$. Find $\rho(X_1,\overline{X})$.

Proof

By assumption, $X_1,X_2,\ldots,X_n$ are i.i.d., with $\mathbb{E}(X_i)=\mu$ and $\operatorname{Var}(X_i)=\sigma^2$. Also,

$$\overline{X} = \frac{1}{n} \sum_{j=1}^n X_j.$$

First compute the covariance:

$$\operatorname{Cov}(X_1, \overline{X}) = \operatorname{Cov}\left(X_1, \frac{1}{n} \sum_{j=1}^n X_j\right) = \frac{1}{n} \sum_{j=1}^n \operatorname{Cov}(X_1, X_j).$$

By independence, $\operatorname{Cov}(X_1,X_j)=0$ when $j\ne 1$, while $\operatorname{Cov}(X_1,X_1)=\operatorname{Var}(X_1)=\sigma^2$. Hence

$$\operatorname{Cov}(X_1, \overline{X}) = \frac{\sigma^2}{n}.$$

Also,

$$\operatorname{Var}(\overline{X}) = \frac{1}{n^2} \sum_{j=1}^n \operatorname{Var}(X_j) = \frac{\sigma^2}{n},$$

and

$$\operatorname{Var}(X_1)=\sigma^2.$$

Thus

$$\rho(X_1, \overline{X}) = \frac{\operatorname{Cov}(X_1, \overline{X})}{\sqrt{\operatorname{Var}(X_1) \operatorname{Var}(\overline{X})}} = \frac{\frac{\sigma^2}{n}}{\sqrt{\sigma^2 \cdot \frac{\sigma^2}{n}}} = \frac{1}{\sqrt{n}}.$$
Problem

Let $e$ be a fixed unit vector in $\mathbb{R}^n$ with $n\geq 2$. Let $X\sim N(0,I_n)$, where $I_n$ is the identity matrix. Let $Z$ be the square of the length of the projection of $e$ onto the line spanned by $X$. Find the density of $Z$.

Proof

The squared projection length is

$$Z = \frac{(e^\top X)^2}{\|X\|^2}.$$

Since $X\sim N(0,I_n)$, by an orthogonal transformation we may assume that $e$ is the first coordinate vector $(1,0,\ldots,0)^\top$. This is allowed because the law of $X$ is spherically symmetric. Let $Y=(Y_1,Y_2,\ldots,Y_n)^\top$ be the transformed vector. Then $Y\sim N(0,I_n)$, and

$$e^\top X = Y_1, \quad \|X\|^2 = \sum_{i=1}^n Y_i^2.$$

Thus

$$Z = \frac{Y_1^2}{Y_1^2 + Y_2^2 + \cdots + Y_n^2}.$$

Let $U=Y_1^2\sim \chi^2_1$ and $V=Y_2^2+\cdots+Y_n^2\sim \chi^2_{n-1}$. Then $U$ and $V$ are independent, and

$$Z = \frac{U}{U+V}.$$

If $U\sim\chi^2_a$ and $V\sim\chi^2_b$ are independent, then

$$\frac{U}{U+V} \sim \operatorname{Beta}\left(\frac{a}{2}, \frac{b}{2}\right).$$

Here $a=1$ and $b=n-1$, so

$$Z \sim \operatorname{Beta}\left(\frac{1}{2}, \frac{n-1}{2}\right).$$

Its density is

$$f_Z(z) = \frac{\Gamma\left(\frac{n}{2}\right)}{\Gamma\left(\frac{1}{2}\right) \Gamma\left(\frac{n-1}{2}\right)} z^{\frac{1}{2} - 1} (1-z)^{\frac{n-1}{2} - 1}, \quad 0 < z < 1.$$

Equivalently,

$$f_Z(z) = \frac{\Gamma\left(\frac{n}{2}\right)}{\sqrt{\pi} \, \Gamma\left(\frac{n-1}{2}\right)} z^{-1/2} (1-z)^{\frac{n-3}{2}}, \quad 0 < z < 1,$$

and it is $0$ elsewhere.
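Two small cases make the formula concrete. They are worked out here as an illustration and use only $\Gamma(1)=1$, $\Gamma(\tfrac12)=\sqrt{\pi}$, and $\Gamma(\tfrac32)=\tfrac{\sqrt{\pi}}{2}$:

```latex
\begin{aligned}
n=2:&\quad f_Z(z)=\frac{\Gamma(1)}{\sqrt{\pi}\,\Gamma(\tfrac12)}\,z^{-1/2}(1-z)^{-1/2}
      =\frac{1}{\pi\sqrt{z(1-z)}}, \\
n=3:&\quad f_Z(z)=\frac{\Gamma(\tfrac32)}{\sqrt{\pi}\,\Gamma(1)}\,z^{-1/2}(1-z)^{0}
      =\frac{1}{2\sqrt{z}}, \qquad 0<z<1.
\end{aligned}
```

For $n=2$ this is the arcsine density, the first density of Exercise 3.1 with $C=\pi$; it matches the direct description $Z=\cos^2\Theta$ with $\Theta$ uniform on $[0,2\pi)$.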

Exercise 3.4

Note

Complex Gaussian and GOE problems often use invariance. Find the symmetry first; it is usually cleaner than computing a density directly.

Problem

Let $Z\sim N_{\mathbb{C}}(0,1)$. Prove that

$$\mathbb{E}[Z^k \bar{Z}^l] = \begin{cases} k!, & k = l, \\ 0, & k \neq l. \end{cases}$$
Proof

Using the density $f(z)=\frac{1}{\pi} e^{-|z|^{2}}$ with respect to Lebesgue measure on $\mathbb{C}\cong\mathbb{R}^2$, we have

$$\begin{aligned} \mathbb{E}\left[Z^{k} \bar{Z}^{l}\right] &=\int_{\mathbb{C}} \frac{1}{\pi} e^{-|z|^{2}} z^{k} \bar{z}^{l} \, dz \\ &\overset{z=re^{i\theta}}{=} \int_{0}^{2 \pi} \int_{0}^{+\infty} \frac{1}{\pi} e^{-r^{2}}\left(r e^{i \theta}\right)^{k}\left(r e^{-i \theta}\right)^{l} r \, dr \, d\theta \\ &=\int_{0}^{2 \pi} \int_{0}^{+\infty} \frac{1}{\pi} r^{k+l+1} e^{i(k-l) \theta} e^{-r^{2}} \, dr \, d\theta \\ &=\begin{cases} 2 \int_{0}^{+\infty} r^{2 k+1} e^{-r^{2}} \, dr=\Gamma(k+1)=k!, & k=l, \\ 0, & k \neq l. \end{cases} \end{aligned}$$
Problem

Let $Z_1,Z_2\sim N_{\mathbb{C}}(0,1)$ be independent.

(1) Find the density of $Z_1/Z_2$.

(2) Use Exercise 3.4.1 to compute $\mathbb{E}[|Z_1-Z_2|^{2n}]$ for positive integers $n$.

Proof

(1) Since $Z_1,Z_2\overset{\text{i.i.d.}}{\sim}N_{\mathbb{C}}(0,1)$,

$$f_{Z_1, Z_2}(z_1, z_2) = \frac{1}{\pi^2} e^{-|z_1|^2 - |z_2|^2}, \quad z_1, z_2 \in \mathbb{C}.$$

Let

$$W = \frac{Z_1}{Z_2}, \quad V = Z_2.$$

Then

$$Z_1 = WV, \quad Z_2 = V.$$

The Jacobian determinant of this transformation, viewed as a map of $\mathbb{R}^4$, is $|v|^2$. Hence

$$f_{W,V}(w, v) = f_{Z_1, Z_2}(wv, v) \cdot |v|^2 = \frac{|v|^2}{\pi^2} e^{-|v|^2(1 + |w|^2)}.$$

Integrating out $v$ gives the marginal density of $W$:

$$f_W(w) = \int_{\mathbb{C}} f_{W,V}(w, v) \, dv.$$

Let $r=|v|$. Then $dv=r\,dr\,d\theta$, and the integrand does not depend on $\theta$:

$$f_W(w) = \frac{1}{\pi^2} \int_0^{\infty} \int_0^{2\pi} r^2 e^{-r^2(1+|w|^2)} \cdot r \, d\theta \, dr = \frac{2\pi}{\pi^2} \int_0^{\infty} r^3 e^{-r^2(1+|w|^2)} \, dr.$$

Let $a=1+|w|^2$ and $t=r^2$. Then $r^3\,dr=\frac{t}{2}\,dt$, so

$$\int_0^{\infty} r^3 e^{-a r^2} \,dr = \frac{1}{2} \int_0^{\infty} t e^{-a t} \,dt = \frac{1}{2a^2}.$$

Thus

$$f_W(w) = \frac{2\pi}{\pi^2} \cdot \frac{1}{2(1+|w|^2)^2} = \frac{1}{\pi (1+|w|^2)^2}.$$

(2) Since $Z_1,Z_2\sim N_{\mathbb{C}}(0,1)$ are independent, $Z_1-Z_2\sim N_{\mathbb{C}}(0,2)$. Let

$$Y=\frac{Z_1-Z_2}{\sqrt{2}}\sim N_{\mathbb{C}}(0,1).$$

Then

$$|Z_1 - Z_2|^{2n} = 2^n |Y|^{2n} = 2^n(Y\bar Y)^n.$$

By Exercise 3.4.1 with $k=l=n$,

$$\mathbb{E}[(Y\bar Y)^n]=\mathbb{E}[Y^n\bar Y^n]=n!.$$

Therefore

$$\mathbb{E}[|Z_1-Z_2|^{2n}]=2^n n!.$$
Problem
(Moments of GOE) Let $H$ have the $\mathrm{GOE}_n$ distribution, and set $a_k=\mathbb{E}[\operatorname{tr}(H^k)]$. Compute the first six moments $a_k$, $k=1,\dots,6$.
Proof

Since $H$ and $-H$ have the same distribution, the odd moments vanish: $a_1=a_3=a_5=0$. Only the even moments need to be considered. In random matrix notation, $\operatorname{tr}=\frac1n\operatorname{Tr}$, where $\operatorname{Tr}$ is the usual trace and $n$ is the matrix dimension. The entries satisfy $\mathbb{E}(h_{ij}h_{kl})=\delta_{ik}\delta_{jl}+\delta_{il}\delta_{jk}$. By Wick's formula, with $\mathcal{P}(2k)$ the set of pairings of $\{1,\dots,2k\}$ and indices read cyclically ($i_{2k+1}=i_1$),

$$\begin{aligned} \mathbb{E}\operatorname{Tr}(H^{2k}) &=\sum_{i_1,\cdots , i_{2k}} \mathbb{E}( h_{i_1i_2}h_{i_2i_3}\cdots h_{i_{2k}i_1}) \\ &=\sum_{i_1,\cdots , i_{2k}} \sum_{\pi \in \mathcal{P}(2k)}\prod_{(p,q)\in \pi }\mathbb{E}(h_{i_pi_{p+1}}h_{i_qi_{q+1}})\\ &=\sum_{i_1,\cdots , i_{2k}} \sum_{\pi \in \mathcal{P}(2k)}\prod_{(p,q)\in \pi }\left(\delta_{i_pi_q}\delta_{i_{p+1}i_{q+1}}+\delta_{i_pi_{q+1}}\delta_{i_{p+1}i_q}\right). \end{aligned}$$

For $k=1$,

$$a_2 = \dfrac{1}{n}\sum_{i_1,i_2} \mathbb{E}[h_{i_1 i_2} h_{i_2 i_1}] = \dfrac{1}{n}\sum_{i_1,i_2} \mathbb{E}[h_{i_1 i_2}^2]=\dfrac{1}{n}(2n+(n^2-n))=n+1.$$

For $k=2$,

$$a_4 = \dfrac{1}{n}\sum_{i_1,i_2,i_3,i_4} \mathbb{E}[h_{i_1i_2} h_{i_2i_3} h_{i_3i_4} h_{i_4i_1}].$$

Wick's formula gives three pairings. The contributions listed below are contributions to $\mathbb{E}\operatorname{Tr}(H^4)$, before dividing by $n$:

  • $(1,2)(3,4)$: the term is nonzero only when $i_1=i_3$. If $i_1=i_2=i_3=i_4$, the contribution is $4$; if $i_1=i_2\neq i_4$ or $i_1=i_4\neq i_2$, the contribution is $2$; if $i_1\neq i_2$ and $i_1\neq i_4$, the contribution is $1$. The total contribution is
$$4n+2n(n-1)+2n(n-1)+n(n-1)^2=n^3+2n^2+n.$$
  • $(1,3)(2,4)$: this is the crossing pairing. It is nonzero only when $i_1=i_3$ and $i_2=i_4$; the contribution is $4$ if all four indices coincide and $1$ otherwise. The total contribution is $4n+n(n-1)=n^2+3n$.

  • $(1,4)(2,3)$: this is the same as the first type, and contributes $n^3+2n^2+n$.

Summing the three contributions and dividing by $n$,

$$a_4=\frac1n\left[2(n^3+2n^2+n)+n^2+3n\right]=2n^2+5n+5.$$

For $k=3$, cyclic symmetry groups the $15$ pairings into five classes, represented by the pairings

$$(1,2)(3,4)(5,6),\quad (1,2)(3,6)(4,5),\quad (1,2)(3,5)(4,6),\quad (1,3)(2,5)(4,6),\quad (1,4)(2,5)(3,6).$$

These classes contain $2,3,6,3,1$ pairings respectively. For the first representative, $(1,2)(3,4)(5,6)$, each covariance is a sum of two terms:

$$\mathbb{E}[h_{i_1i_2}h_{i_2i_3}]=\delta_{i_1i_2}\delta_{i_2i_3}+\delta_{i_1i_3},$$

where the first term requires $i_1=i_2=i_3$ and the second requires $i_1=i_3$;

$$\mathbb{E}[h_{i_3i_4}h_{i_4i_5}]=\delta_{i_3i_4}\delta_{i_4i_5}+\delta_{i_3i_5},$$

where the first term requires $i_3=i_4=i_5$ and the second requires $i_3=i_5$;

$$\mathbb{E}[h_{i_5i_6}h_{i_6i_1}]=\delta_{i_5i_6}\delta_{i_6i_1}+\delta_{i_1i_5},$$

where the first term requires $i_5=i_6=i_1$ and the second requires $i_1=i_5$.

Expanding the product gives eight terms, each contributing $1$ for every index tuple that satisfies its relations. For instance, choosing the second relation in all three places gives $i_1=i_3=i_5$ and free indices $i_2,i_4,i_6$, so there are $n^4$ possibilities. Continuing in the same way gives

$$n^4+3n^3+3n^2+n.$$

The other four representative pairings contribute

$$n^4+3n^3+3n^2+n,\quad n^3+4n^2+3n,\quad 3n^2+5n,\quad n^3+4n^2+3n.$$

Weighting each representative by the size of its class, the total is

$$\mathbb{E}\operatorname{Tr}(H^6)=5n^4+22n^3+52n^2+41n,\qquad\text{so}\qquad a_6=5n^3+22n^2+52n+41.$$
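The pairing counts above are easy to get wrong by hand, so a brute-force check is useful. The sketch below sums Wick's formula over all index tuples and all pairings for small $n$, using the covariance $\mathbb{E}(h_{ij}h_{kl})=\delta_{ik}\delta_{jl}+\delta_{il}\delta_{jk}$ (variance $2$ on the diagonal, $1$ off it, as in the computation of $a_2$). The function names are illustrative, and the enumeration is only practical for small $n$ and small powers.

```python
from itertools import product

def pairings(items):
    """Yield all perfect matchings of the list `items` as lists of pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def cov(i, j, k, l):
    """E[h_ij h_kl] for the GOE normalization used here."""
    return int(i == k and j == l) + int(i == l and j == k)

def expected_trace_power(n, m):
    """E[Tr(H^m)] for an n x n GOE matrix via Wick's formula (brute force)."""
    if m % 2:
        return 0  # odd moments vanish by symmetry
    all_pairings = list(pairings(list(range(m))))
    total = 0
    for idx in product(range(n), repeat=m):
        for pairing in all_pairings:
            term = 1
            for p, q in pairing:
                # Factor p is h_{i_p i_{p+1}}, with indices read cyclically.
                term *= cov(idx[p], idx[(p + 1) % m], idx[q], idx[(q + 1) % m])
                if term == 0:
                    break
            total += term
    return total

# Columns: n, E Tr(H^2), E Tr(H^4), E Tr(H^6).
for n in (1, 2, 3):
    print(n, expected_trace_power(n, 2), expected_trace_power(n, 4), expected_trace_power(n, 6))
```

Dividing each value by $n$ gives $a_2$, $a_4$, $a_6$, which can be compared with $n+1$, $2n^2+5n+5$, and $5n^3+22n^2+52n+41$.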

M. Ledoux, in "A recursion formula for the moments of the Gaussian orthogonal ensemble", gives a five-term recursion for the exact GOE moments. With $N$ denoting the matrix dimension, write

$$\mathbb{E}(\operatorname{Tr}(H^{2p}))=b_p^N=\sum_{k\geq 1}\eta_k(p)N^k.$$

Then

$$\begin{aligned} (p + 1)\eta_k(p) &= (8p - 2)\eta_{k-1}(p - 1) - (4p - 1)\eta_k(p - 1) \\ &\quad + p(2p - 3)(10p - 9)\eta_k(p - 2) - 8(2p - 3)\eta_{k-2}(p - 2) \\ &\quad + 8(2p - 3)\eta_{k-1}(p - 2) - 10(2p - 3)(2p - 4)(2p - 5)\eta_{k-1}(p - 3) \\ &\quad + 5(2p - 3)(2p - 4)(2p - 5)\eta_k(p - 3) \\ &\quad - 2(2p - 3)(2p - 4)(2p - 5)(2p - 6)(2p - 7)\eta_k(p - 4). \end{aligned}$$
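The recursion can be run mechanically on the coefficient tables. The sketch below is an illustration under two assumptions that are not stated in the text: the starting values are $b_0^N=N$ and $b_1^N=N^2+N$ (the latter is $n\,a_2$ from the computation above), and $\eta_k(p)$ is treated as $0$ whenever $k\leq 0$ or $p<0$. The helper names `eta` and `ledoux_step` are hypothetical. For $p=2,3$ the output can be compared with the polynomials obtained from Wick's formula.

```python
from fractions import Fraction

def eta(table, p, k):
    """Coefficient of N^k in E Tr(H^(2p)); treated as zero outside the stored range."""
    if p < 0:
        return Fraction(0)
    return table[p].get(k, Fraction(0))

def ledoux_step(table, p):
    """Coefficients of b_p obtained from b_{p-1}, ..., b_{p-4} via the quoted recursion."""
    c3 = (2 * p - 3) * (2 * p - 4) * (2 * p - 5)
    c4 = c3 * (2 * p - 6) * (2 * p - 7)
    coeffs = {}
    for k in range(1, p + 2):  # b_p has degree p + 1 in N
        rhs = ((8 * p - 2) * eta(table, p - 1, k - 1)
               - (4 * p - 1) * eta(table, p - 1, k)
               + p * (2 * p - 3) * (10 * p - 9) * eta(table, p - 2, k)
               - 8 * (2 * p - 3) * eta(table, p - 2, k - 2)
               + 8 * (2 * p - 3) * eta(table, p - 2, k - 1)
               - 10 * c3 * eta(table, p - 3, k - 1)
               + 5 * c3 * eta(table, p - 3, k)
               - 2 * c4 * eta(table, p - 4, k))
        coeffs[k] = rhs / (p + 1)
    return coeffs

# Assumed starting data: b_0 = N and b_1 = N^2 + N.
table = {0: {1: Fraction(1)}, 1: {1: Fraction(1), 2: Fraction(1)}}
for p in (2, 3, 4):
    table[p] = ledoux_step(table, p)
    print(p, {k: str(v) for k, v in sorted(table[p].items())})
```

A further consistency check is $N=1$, where $H$ is a single $N(0,2)$ variable and $b_p^1=2^p(2p-1)!!$; summing the coefficients of `table[p]` evaluates the polynomial at $N=1$.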
Problem

Suppose $\{x_{ij}\}_{i,j=1}^n$ are i.i.d. $N(0,1)$ random variables, and let

$$X_n=(x_{ij})_{i,j=1}^n.$$

Construct the symmetric matrix

$$H=\frac{1}{\sqrt{2}}(X_n+X_n^t).$$

Prove that $H$ has the GOE distribution.

Proof

Since $H=\frac{1}{\sqrt{2}}(X_n+X_n^t)$, its entries are

$$H_{ij}=\frac{1}{\sqrt{2}}(x_{ij}+x_{ji}).$$

Because $\{x_{ij}\}$ are i.i.d. $N(0,1)$, we can compute the distribution of each entry of $H$.

For diagonal entries,

$$H_{ii}=\frac{1}{\sqrt{2}}(x_{ii}+x_{ii})=\sqrt{2}\,x_{ii}.$$

Since $x_{ii}\sim N(0,1)$,

$$H_{ii}\sim N(0,2).$$

For off-diagonal entries with $i<j$,

$$H_{ij}=\frac{1}{\sqrt{2}}(x_{ij}+x_{ji}).$$

Since $x_{ij}$ and $x_{ji}$ are independent and both have distribution $N(0,1)$, their sum has distribution $N(0,2)$. Multiplying by $1/\sqrt2$ gives

$$H_{ij}\sim N(0,1).$$

There are $n$ diagonal entries and $\frac{n(n-1)}2$ entries above the diagonal, for a total of $\frac{n(n+1)}2$ entries $H_{ij}$ with $i\leq j$. They are independent because they are functions of disjoint groups of the independent variables $x_{ij}$.

The joint density is the product of the densities of these independent entries:

$$\begin{aligned} f(H) &= \prod_{i=1}^n \frac{1}{\sqrt{2\pi \cdot 2}} \exp\left(-\frac{H_{ii}^2}{4}\right) \times \prod_{i<j} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{H_{ij}^2}{2}\right) \\ &= 2^{-\frac{n}{2}} (2\pi)^{-\frac{n(n+1)}{4}} \exp\left(-\frac{1}{4}\sum_{i=1}^n H_{ii}^2 - \frac{1}{2}\sum_{i<j} H_{ij}^2\right). \end{aligned}$$

Here and in the next problem $\operatorname{tr}$ denotes the ordinary (unnormalized) trace, so that $\operatorname{tr}(H^2)=\sum_{i,j}H_{ij}^2=\sum_i H_{ii}^2+2\sum_{i<j}H_{ij}^2$. Since

$$\frac{1}{4}\sum_{i=1}^n H_{ii}^2+\frac{1}{2}\sum_{i<j}H_{ij}^2 =\frac14\operatorname{tr}(H^2),$$

we get

$$f(H)=2^{-\frac{n}{2}}(2\pi)^{-\frac{n(n+1)}4}e^{-\frac14\operatorname{tr}(H^2)}.$$
Problem

(Orthogonal invariance of GOE) Let $H$ have the GOE distribution. Prove that for every orthogonal matrix $Q$, the matrix $QHQ^{-1}$ also has the GOE distribution.

Proof

Let $H\sim\mathrm{GOE}_n$. Its joint density is

$$f(H)=C_n e^{-\frac14\operatorname{tr}(H^2)},$$

where

$$C_n=2^{-\frac n2}(2\pi)^{-\frac{n(n+1)}4}.$$

Set

$$\widetilde H=QHQ^{-1}=QHQ^t,$$

where $Q^{-1}=Q^t$ because $Q$ is orthogonal.

First, $\widetilde H$ is still symmetric:

$$\widetilde H^t=(QHQ^t)^t=QH^tQ^t=QHQ^t=\widetilde H.$$

The trace is invariant:

$$\operatorname{tr}(\widetilde H^2) =\operatorname{tr}(QHQ^tQHQ^t) =\operatorname{tr}(QH^2Q^t) =\operatorname{tr}(H^2Q^tQ) =\operatorname{tr}(H^2).$$

Thus the exponential part of the density is unchanged:

$$\exp\left(-\frac14\operatorname{tr}(\widetilde H^2)\right) =\exp\left(-\frac14\operatorname{tr}(H^2)\right).$$

For the Jacobian, $H\mapsto QHQ^t$ is a linear transformation, so there exists a matrix $A$ such that

$$\operatorname{Vec}(\widetilde H)=A\operatorname{Vec}(H),$$

where $\operatorname{Vec}$ stacks the entries of a matrix into an $n^2$-dimensional vector. A linear map is orthogonal exactly when it preserves the Euclidean norm. Here

$$\|\operatorname{Vec}(\widetilde H)\|_2^2=\operatorname{tr}(\widetilde H^2)=\operatorname{tr}(H^2)=\|\operatorname{Vec}(H)\|_2^2.$$

Thus the map is an isometry of the space of symmetric matrices for the norm $\operatorname{tr}(H^2)=\sum_i H_{ii}^2+2\sum_{i<j}H_{ij}^2$. In the coordinates $(H_{ii},\sqrt2\,H_{ij})_{i<j}$ this norm is Euclidean, so the map has determinant $\pm1$ there, and rescaling the off-diagonal coordinates by a constant does not change the determinant. Hence the absolute value of the Jacobian determinant with respect to the independent entries $(H_{ij})_{i\leq j}$ is $1$.

The density of $\widetilde H$ after the change of variables is therefore, with $H=Q^t\widetilde HQ$,

$$f_{\widetilde H}(\widetilde H)=f(H)\,|\det(J)|^{-1}=f(H)=C_n e^{-\frac14\operatorname{tr}(\widetilde H^2)}.$$

Thus $\widetilde H=QHQ^{-1}$ also has the GOE distribution.

Exercise 4.1

Note

This section prepares tail integrals, Jensen's inequality, and moment bounds. Later we will turn them into probability bounds.

Problem

For a nonnegative random variable $X$, prove that

$$\sum_{n=1}^\infty \mathbb{P}(X \geq n) \leq \mathbb{E}[X] \leq \sum_{n=1}^\infty \mathbb{P}(X \geq n) + 1.$$
Proof

For a nonnegative random variable $X$,

$$X=\sum_{i=0}^{\infty}X\mathbb 1_{i\leq X<i+1}.$$

Also,

$$i\mathbb{1}_{i\leq X<i+1}\leq X\mathbb{1}_{i\leq X<i+1}\leq (i+1)\mathbb{1}_{i\leq X<i+1}.$$

Hence

$$\begin{aligned} \mathbb{E}[X] &=\sum_{i=0}^{\infty}\mathbb{E}[X\mathbb{1}_{i\leq X<i+1}]\\ &\leq \sum_{i=0}^{\infty}(i+1)\,\mathbb{P}(i\leq X<i+1)\\ &= 1+\sum_{i=0}^{\infty}i\,\mathbb{P}(i\leq X<i+1)\\ &= 1+\sum_{i=1}^{\infty}\sum_{j=1}^{i}\mathbb{P}(i\leq X<i+1)\\ &= 1+\sum_{j=1}^{\infty}\sum_{i=j}^{\infty}\mathbb{P}(i\leq X<i+1) \qquad \text{(switch the order of summation)}\\ &= 1+\sum_{j=1}^{\infty}\mathbb{P}(X\geq j). \end{aligned}$$

The other side uses the lower bound $X\mathbb{1}_{i\leq X<i+1}\geq i\mathbb{1}_{i\leq X<i+1}$ and the same exchange of summation:

$$\mathbb{E}[X]\geq \sum_{i=0}^{\infty}i\,\mathbb{P}(i\leq X<i+1)=\sum_{j=1}^{\infty}\sum_{i=j}^{\infty}\mathbb{P}(i\leq X<i+1)=\sum_{j=1}^{\infty}\mathbb{P}(X\geq j).$$

Problem

(Jensen's inequality) A function $u:\mathbb{R}\to\mathbb{R}$ is called convex if for every $a\in\mathbb{R}$ there exists $\lambda=\lambda_a$ such that

$$u(x)\geq u(a)+\lambda_a(x-a),\quad \forall x\in\mathbb{R}.$$

A convex function $u$ is called strictly convex if $\lambda_a$ is strictly increasing in $a$.

  1. Prove that if $u$ is convex and $X$ has an expectation, then
$$\mathbb{E}[u(X)]\geq u(\mathbb{E}[X]).$$
  2. Prove that if $u$ is strictly convex and $\mathbb{E}[u(X)]=u(\mathbb{E}[X])$, then $X$ is a constant with probability $1$.
Proof

(1) Take $a=\mathbb{E}[X]$. By convexity, $u(X)\geq u(a)+\lambda_a(X-a)$ pointwise, so

$$\mathbb{E}[u(X)] \geq \mathbb{E}[u(a)+\lambda_a(X-a)] =u(a)+\lambda_a(\mathbb{E}[X]-a) =u(a) =u(\mathbb{E}[X]).$$

(2) Let $a=\mathbb{E}[X]$ and $D=u(X)-u(a)-\lambda_a(X-a)\geq 0$. Equality in (1) means $\mathbb{E}[D]=0$, so $D=0$ almost surely. If $u$ is strictly convex, then $u(x)>u(a)+\lambda_a(x-a)$ for every $x\neq a$: if equality held at some $x\neq a$, then $u$ would coincide with the supporting line at every point $m$ between $a$ and $x$, which forces $\lambda_m=\lambda_a$ and contradicts strict monotonicity. Hence $D=0$ forces $X=\mathbb{E}[X]$ almost surely, and $X$ is a constant with probability $1$.

Problem

Let $X$ be a nonnegative random variable. Prove that for every $r>0$,

$$\mathbb{E}[X^r] = \int_0^\infty r x^{r-1} \mathbb{P}(X > x) \, dx.$$
Proof
$$\begin{aligned} \mathbb{E} [X^{r}] &=\mathbb{E} \int_{0}^{X} rt^{r-1}\,dt \\ &=\mathbb{E} \int_{0}^{\infty} r t^{r-1}\mathbb{1}_{X>t}\,dt \\ &=\int_{0}^{\infty} r t^{r-1}\mathbb{E}[\mathbb{1}_{X>t}]\,dt \qquad \text{(Fubini's theorem, to switch the order of integration)} \\ &=\int_{0}^{\infty} r t^{r-1}\mathbb{P}(X>t)\,dt. \end{aligned}$$
Problem

Fix r>0r>0.

  1. If E[Xr]<\mathbb{E}[|X|^r]<\infty, prove that
limx+xrP(Xx)=0.\lim_{x\to+\infty}x^r\mathbb{P}(|X|\geq x)=0.
  1. If
limx+xrP(Xx)=0,\lim_{x\to+\infty}x^r\mathbb{P}(|X|\geq x)=0,

prove that E[Xs]<\mathbb{E}[|X|^s]<\infty for every s(0,r)s\in(0,r). Does E[Xr]<\mathbb{E}[|X|^r]<\infty necessarily hold? Give a reason or a counterexample.

Proof

(1) For $x>0$ we have

$$x^r\mathbb{P}(|X|\geq x) =x^r\int_{\Omega}\mathbb{1}_{|X|\geq x}\,d\mathbb{P} \leq \int_{\Omega}|X|^r\mathbb{1}_{|X|\geq x}\,d\mathbb{P}.$$

Also, $|X|^r\mathbb{1}_{|X|\geq x}\leq |X|^r$, the function $|X|^r$ is integrable, and $|X|^r\mathbb{1}_{|X|\geq x}\to 0$ pointwise as $x\to+\infty$ because $|X|$ is finite. By the dominated convergence theorem, applied along any sequence $x_n\to+\infty$,

$$\lim_{n\to\infty}\int_{\Omega}|X|^r\mathbb{1}_{|X|\geq x_n}\,d\mathbb{P} =\int_{\Omega}\lim_{n\to\infty}|X|^r\mathbb{1}_{|X|\geq x_n}\,d\mathbb{P} =0.$$

This proves the claim.

(2) Fix $\varepsilon>0$ and choose $M>0$ such that for all $x>M$,

$$x^r\mathbb{P}(|X|\geq x)<\varepsilon.$$

By the tail integral formula and $\mathbb{P}(|X|>t)\leq \mathbb{P}(|X|\geq t)<\varepsilon t^{-r}$ for $t>M$,

$$
\begin{aligned}
\mathbb{E}[|X|^s] &=\int_{0}^{\infty} s t^{s-1}\mathbb{P}(|X|>t)\,dt\\
&=\int_{0}^{M} s t^{s-1}\mathbb{P}(|X|>t)\,dt +\int_{M}^{\infty} s t^{s-1}\mathbb{P}(|X|>t)\,dt\\
&\leq M^s+\int_{M}^{\infty} s t^{s-1}\frac{\varepsilon}{t^r}\,dt\\
&= M^s+\varepsilon\,\frac{s\,M^{s-r}}{r-s}.
\end{aligned}
$$

The last integral converges because $s<r$. Thus $\mathbb{E}[|X|^s]<\infty$.

However, $\mathbb{E}[|X|^r]<\infty$ need not hold. For example, take $X$ with a continuous distribution whose tail satisfies, as $x\to+\infty$,

$$\mathbb{P}(|X|\geq x)\sim \frac{1}{x^r\log x}.$$

Then $x^r\mathbb{P}(|X|\geq x)\sim 1/\log x\to 0$, while $r t^{r-1}\mathbb{P}(|X|>t)\sim r/(t\log t)$ and $\int^{\infty}\frac{dt}{t\log t}=\infty$. The tail integral formula therefore shows that $\mathbb{E}[|X|^r]$ diverges.

Exercise 3.5

Note

The normal tail probability $\mathbb{P}(X\geq x)$ has order $e^{-x^2/2}/x$ as $x\to+\infty$. Integration by parts is the main tool here.

Problem

For $X\sim N(0,1)$ and $x>0$, prove the standard normal tail estimate

$$\dfrac{x}{x^2+1}\dfrac{1}{\sqrt{2\pi }}e^{-x^2/2}\leq \mathbb{P}(X\geq x)\leq \dfrac{1}{x}\dfrac{1}{\sqrt{2\pi }}e^{-x^2/2}.$$
Proof

The upper bound follows from the pointwise inequality $\mathbb{1}_{X\geq x}\leq \frac{X}{x}\mathbb{1}_{X\geq x}$, valid for $x>0$:

$$\mathbb{E}(\mathbb{1}_{X\geq x})\leq \mathbb{E}\left(\frac{X}{x}\mathbb{1}_{X\geq x}\right).$$

Thus

$$\mathbb{P}(X\geq x) \leq \frac{1}{\sqrt{2\pi}}\int_x^\infty \frac{u}{x}e^{-u^2/2}\,du =\frac{1}{x}\frac{1}{\sqrt{2\pi}}e^{-x^2/2}.$$

For the lower bound, define

$$f(x)=xe^{-x^2/2}-(x^2+1)\int_x^\infty e^{-u^2/2}\,du.$$

We have $f(0)<0$, $\lim_{x\to\infty}f(x)=0$, and

$$f'(x)=(1-x^2+x^2+1)e^{-x^2/2} -2x\int_x^\infty e^{-u^2/2}\,du =-2x\left(\int_x^\infty e^{-u^2/2}\,du-\frac{e^{-x^2/2}}{x}\right).$$

For $x>0$, the upper bound already proved gives $\int_x^\infty e^{-u^2/2}\,du\leq e^{-x^2/2}/x$, so $f'(x)\geq 0$. Hence $f$ is nondecreasing on $(0,\infty)$ and tends to $0$ at infinity, which forces $f(x)\leq 0$. Dividing by $\sqrt{2\pi}\,(x^2+1)$ gives the desired lower bound.
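The two bounds can be compared with the exact tail numerically. This is a minimal sketch that evaluates $\mathbb{P}(X\geq x)$ through the standard identity $\mathbb{P}(X\geq x)=\tfrac12\operatorname{erfc}(x/\sqrt2)$; the function names and the sample points are illustrative.

```python
import math

def normal_tail(x):
    """P(X >= x) for X ~ N(0,1), via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def tail_bounds(x):
    """Lower and upper bounds from the tail estimate; requires x > 0."""
    phi = math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
    return x / (x * x + 1.0) * phi, phi / x

for x in (0.5, 1.0, 2.0, 4.0, 8.0):
    lower, upper = tail_bounds(x)
    print(f"x={x}: {lower:.4e} <= {normal_tail(x):.4e} <= {upper:.4e}")
```

The ratio of the lower to the upper bound is $x^2/(x^2+1)$, so the two bounds pinch the tail more tightly as $x$ grows.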

  1. Define functions $H_n$, $n\geq 0$, by $H_0=1$ and $(-1)^nH_n\phi=\phi^{(n)}$. Prove that $H_n(x)$ is a degree $n$ polynomial with leading term $x^n$, and that
$$\int_{-\infty}^{+\infty}H_m(x)H_n(x)\phi(x)\,dx = \begin{cases} m!, & m=n,\\ 0, & m\neq n. \end{cases}$$

Also prove

$$\sum_{n=0}^{\infty}H_n(x)\frac{t^n}{n!}=e^{xt-\frac12t^2}.$$
Solution

By definition,

$$(-1)^nH_n(x)\phi(x)=\phi^{(n)}(x),$$

where

$$\phi(x)=\frac{1}{\sqrt{2\pi}}e^{-x^2/2}.$$

We compute $\phi'=-x\phi$ and $\phi''=-\phi-x\phi'=(x^2-1)\phi$. In general, differentiating $\phi^{(n-1)}=(-1)^{n-1}H_{n-1}\phi$ and using $\phi'=-x\phi$ gives

$$H_n(x)=-H_{n-1}'(x)+xH_{n-1}(x).$$

By induction on $n$, it follows that $H_n$ is monic of degree $n$: the term $xH_{n-1}$ is monic of degree $n$, while $H_{n-1}'$ has degree at most $n-2$.

For orthogonality, assume $m\leq n$; this loses no generality because the integral is symmetric in $m$ and $n$. Then

$$\int H_mH_n\phi\,dx=(-1)^n\int H_m\phi^{(n)}\,dx.$$

Integrating by parts $n$ times, with boundary terms equal to $0$ because a polynomial times a derivative of $\phi$ vanishes at $\pm\infty$, gives

$$(-1)^n\int H_m\phi^{(n)}\,dx=\int H_m^{(n)}\phi\,dx.$$

If $m=n$, then $H_m^{(n)}=n!$ because $H_n$ is monic of degree $n$; if $m<n$, then $H_m^{(n)}=0$ because $H_m$ has degree $m<n$. Since $\int\phi\,dx=1$, the stated relation follows.

Finally, by Taylor expansion,

$$\phi(x-t)=\sum_{n=0}^{\infty}\frac{(-t)^n}{n!}\phi^{(n)}(x).$$

But

$$\phi(x-t)=\frac{1}{\sqrt{2\pi}}e^{-(x-t)^2/2}=\phi(x)e^{xt-t^2/2}.$$

Substituting $\phi^{(n)}(x)=(-1)^nH_n(x)\phi(x)$ and cancelling $\phi(x)$ gives the generating function.
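The recurrence $H_n=xH_{n-1}-H_{n-1}'$ and the orthogonality relation can be checked in exact integer arithmetic. The sketch below stores a polynomial as a coefficient list (lowest degree first) and uses the standard Gaussian moments $\mathbb{E}[X^{k}]=(k-1)!!$ for even $k$ and $0$ for odd $k$; the helper names are hypothetical.

```python
from math import factorial

def hermite(n):
    """Coefficients (lowest degree first) of H_n via H_n = x*H_{n-1} - H_{n-1}'."""
    h = [1]
    for _ in range(n):
        shifted = [0] + h                               # x * H_{n-1}
        deriv = [k * h[k] for k in range(1, len(h))]    # H_{n-1}'
        deriv += [0] * (len(shifted) - len(deriv))
        h = [s - d for s, d in zip(shifted, deriv)]
    return h

def gaussian_moment(k):
    """E[X^k] for X ~ N(0,1): 0 for odd k, (k-1)!! for even k."""
    if k % 2 == 1:
        return 0
    result = 1
    for j in range(1, k, 2):
        result *= j
    return result

def inner(m, n):
    """Integral of H_m * H_n * phi, computed from Gaussian moments."""
    hm, hn = hermite(m), hermite(n)
    return sum(a * b * gaussian_moment(i + j)
               for i, a in enumerate(hm) for j, b in enumerate(hn))

print(hermite(3))                  # coefficients of x^3 - 3x
print(inner(3, 3), inner(2, 4))    # n! on the diagonal, 0 off it
```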

  2. For positive integers $m,n$, compute the correlation coefficient $\rho(H_m(X),H_n(Y))$.
Solution

From the generating function in (1),

$$e^{Xt-\frac{t^2}{2}}=\sum_{i=0}^{\infty}H_i(X)\frac{t^i}{i!}, \quad e^{Ys-\frac{s^2}{2}}=\sum_{j=0}^{\infty}H_j(Y)\frac{s^j}{j!}.$$

First,

$$\mathbb{E}\left(e^{Xt-\frac{t^2}{2}}\right) =e^{-\frac{t^2}{2}}\mathbb{E}(e^{tX}) =1.$$

Comparing coefficients of $t^n$ gives $\mathbb{E}(H_0(X))=1$ and $\mathbb{E}(H_n(X))=0$ for $n\geq 1$.

Next consider the joint generating function:

$$\mathbb{E}\left(e^{Xt-\frac{t^2}{2}}e^{Ys-\frac{s^2}{2}}\right) =e^{-\frac{t^2+s^2}{2}}\mathbb{E}(e^{tX+sY}).$$

Since $(X,Y)$ is standard bivariate normal with correlation coefficient $\rho$, $tX+sY\sim N(0,t^2+s^2+2\rho ts)$, so

$$\mathbb{E}(e^{tX+sY})=e^{\frac12(t^2+s^2+2\rho ts)}.$$

Hence

$$\mathbb{E}\left(e^{Xt-\frac{t^2}{2}}e^{Ys-\frac{s^2}{2}}\right) =e^{\rho ts}.$$

On the other hand, expanding the generating functions gives

$$\mathbb{E}\left(\sum_{i=0}^{\infty}H_i(X)\frac{t^i}{i!} \sum_{j=0}^{\infty}H_j(Y)\frac{s^j}{j!}\right) =\sum_{i,j=0}^{\infty}\mathbb{E}[H_i(X)H_j(Y)]\frac{t^is^j}{i!\,j!}.$$

Since

$$e^{\rho ts}=\sum_{n=0}^{\infty}\frac{(\rho ts)^n}{n!} =\sum_{n=0}^{\infty}\rho^n\frac{t^ns^n}{n!},$$

comparing the coefficients of $t^is^j$ gives

$$\mathbb{E}[H_i(X)H_j(Y)] = \begin{cases} \rho^n n!, & i=j=n,\\ 0, & i\neq j. \end{cases}$$

Together with $\mathbb{E}(H_n(X))=\mathbb{E}(H_n(Y))=0$ for $n\geq 1$ and

$$\operatorname{Var}(H_n(X))=\mathbb{E}[H_n^2(X)]=n!,$$

we get

$$\rho(H_m(X),H_n(Y)) = \begin{cases} \rho^n, & m=n\geq 1,\\ 0, & m\neq n. \end{cases}$$
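The identity $\mathbb{E}[H_m(X)H_n(Y)]=\rho^n n!\,\mathbb{1}_{m=n}$ can be verified exactly for a specific $\rho$. The sketch below assumes the standard representation $Y=\rho X+cZ$ with $Z\sim N(0,1)$ independent of $X$ and $c=\sqrt{1-\rho^2}$, and picks $\rho=3/5$, $c=4/5$ so that all arithmetic stays rational; the helper names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def hermite(n):
    """Coefficients (lowest degree first) of H_n via H_n = x*H_{n-1} - H_{n-1}'."""
    h = [1]
    for _ in range(n):
        shifted = [0] + h
        deriv = [k * h[k] for k in range(1, len(h))]
        deriv += [0] * (len(shifted) - len(deriv))
        h = [s - d for s, d in zip(shifted, deriv)]
    return h

def gaussian_moment(k):
    """E[Z^k] for Z ~ N(0,1): 0 for odd k, (k-1)!! for even k."""
    if k % 2 == 1:
        return 0
    result = 1
    for j in range(1, k, 2):
        result *= j
    return result

def cross_moment(m, n, rho, c):
    """E[H_m(X) H_n(rho*X + c*Z)] with X, Z independent N(0,1)."""
    total = Fraction(0)
    for a, ca in enumerate(hermite(m)):
        for b, cb in enumerate(hermite(n)):
            for j in range(b + 1):  # binomial expansion of (rho*X + c*Z)^b
                total += (ca * cb * comb(b, j) * rho**j * c**(b - j)
                          * gaussian_moment(a + j) * gaussian_moment(b - j))
    return total

rho, c = Fraction(3, 5), Fraction(4, 5)   # rho^2 + c^2 = 1
for n in range(1, 5):
    corr = cross_moment(n, n, rho, c) / factorial(n)  # divide by Var(H_n) = n!
    print(n, corr, rho**n)
```

Dividing by $n!$ is enough to get the correlation because $H_n(X)$ and $H_n(Y)$ both have mean $0$ and variance $n!$ for $n\geq 1$.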
  3. Let $P(x)$ and $Q(y)$ be nonconstant polynomials. Prove that
$$|\rho(P(X),Q(Y))|\leq |\rho(X,Y)|.$$
Solution

Expand $P$ and $Q$ in Hermite polynomials. Constant terms do not affect covariance or variance, so we may drop the $H_0$ terms and write

$$P(x)=\sum_{i=1}^{k}a_iH_i(x), \quad Q(y)=\sum_{j=1}^{l}b_jH_j(y).$$

By (2),

$$\operatorname{Cov}(P(X),Q(Y)) =\sum_{i=1}^{\min(k,l)}a_ib_i\, i!\,\rho^i.$$

Also, by the orthogonality relation in (1),

$$\operatorname{Var}(P(X))=\sum_{i=1}^{k}a_i^2\,i!, \quad \operatorname{Var}(Q(Y))=\sum_{j=1}^{l}b_j^2\,j!.$$

Both variances are positive because $P$ and $Q$ are nonconstant. Writing $d=\min(k,l)$, it remains to prove

$$\left|\sum_{i=1}^{d} a_ib_i\, i!\,\rho^i\right| \leq |\rho|\sqrt{\sum_{i=1}^{k} a_i^2\,i!}\sqrt{\sum_{j=1}^{l} b_j^2\,j!}.$$

By Cauchy-Schwarz and $|\rho|^i\leq |\rho|$ for $i\geq 1$ (since $|\rho|\leq 1$),

$$
\begin{aligned}
\left|\sum_{i=1}^{d} a_ib_i\, i!\,\rho^i\right| &\leq \sum_{i=1}^{d}|a_i||b_i|\,i!\,|\rho|^i\\
&\leq |\rho|\sum_{i=1}^{d}(|a_i|\sqrt{i!})(|b_i|\sqrt{i!})\\
&\leq |\rho|\sqrt{\sum_{i=1}^{d}a_i^2\,i!}\sqrt{\sum_{i=1}^{d}b_i^2\,i!}\\
&\leq |\rho|\sqrt{\sum_{i=1}^{k}a_i^2\,i!}\sqrt{\sum_{j=1}^{l}b_j^2\,j!}.
\end{aligned}
$$

Hence $|\rho(P(X),Q(Y))|\leq |\rho|$, where $|\rho|=|\rho(X,Y)|$.

End-of-chapter check
  • The original problems and solutions in this chapter come from the corresponding TeX source files.
  • You can first read only the problem boxes, write down the main identities, and then open the proof or solution.
  • If a conclusion uses independence, countable additivity, a change-of-variables formula, or a moment condition, it is worth marking that point explicitly.