Bivariate Normal Distribution

DOWNLOAD Mathematica Notebook

The bivariate normal distribution is the statistical distribution with probability density function

 P(x_1,x_2)=1/(2pisigma_1sigma_2sqrt(1-rho^2))exp[-z/(2(1-rho^2))],
(1)

where

 z=((x_1-mu_1)^2)/(sigma_1^2)-(2rho(x_1-mu_1)(x_2-mu_2))/(sigma_1sigma_2)+((x_2-mu_2)^2)/(sigma_2^2),
(2)

and

 rho=cor(x_1,x_2)=(V_(12))/(sigma_1sigma_2)
(3)

is the correlation of x_1 and x_2 (Kenney and Keeping 1951, pp. 92 and 202-205; Whittaker and Robinson 1967, p. 329) and V_(12) is the covariance.

The probability density function of the bivariate normal distribution is implemented as MultinormalDistribution[{mu1, mu2}, {{sigma11, sigma12}, {sigma12, sigma22}}] in the Wolfram Language package MultivariateStatistics` .

The marginal probabilities are then

P(x_1)=int_(-infty)^inftyP(x_1,x_2)dx_2
(4)
=1/(sigma_1sqrt(2pi))e^(-(x_1-mu_1)^2/(2sigma_1^2))
(5)

and

P(x_2)=int_(-infty)^inftyP(x_1,x_2)dx_1
(6)
=1/(sigma_2sqrt(2pi))e^(-(x_2-mu_2)^2/(2sigma_2^2))
(7)

(Kenney and Keeping 1951, p. 202).

Let Z_1 and Z_2 be two independent normal variates with means mu_i=0 and sigma_i^2=1 for i=1, 2. Then the variables a_1 and a_2 defined below are normal bivariates with unit variance and correlation coefficient rho:

a_1=sqrt((1+rho)/2)z_1+sqrt((1-rho)/2)z_2
(8)
a_2=sqrt((1+rho)/2)z_1-sqrt((1-rho)/2)z_2.
(9)

To derive the bivariate normal probability function, let X_1 and X_2 be normally and independently distributed variates with mean 0 and variance 1, then define

Y_1=mu_1+sigma_(11)X_1+sigma_(12)X_2
(10)
Y_2=mu_2+sigma_(21)X_1+sigma_(22)X_2
(11)

(Kenney and Keeping 1951, p. 92). The variates Y_1 and Y_2 are then themselves normally distributed with means mu_1 and mu_2, variances

sigma_1^2=sigma_(11)^2+sigma_(12)^2
(12)
sigma_2^2=sigma_(21)^2+sigma_(22)^2,
(13)

and covariance

 V_(12)=sigma_(11)sigma_(21)+sigma_(12)sigma_(22).
(14)

The covariance matrix is defined by

 V_(ij)=[sigma_1^2 rhosigma_1sigma_2; rhosigma_1sigma_2 sigma_2^2],
(15)

where

 rho=(V_(12))/(sigma_1sigma_2)=(sigma_(11)sigma_(21)+sigma_(12)sigma_(22))/(sigma_1sigma_2).
(16)

Now, the joint probability density function for x_1 and x_2 is

 f(x_1,x_2)dx_1dx_2=1/(2pi)e^(-(x_1^2+x_2^2)/2)dx_1dx_2,
(17)

but from (◇) and (◇), we have

 [y_1-mu_1; y_2-mu_2]=[sigma_(11) sigma_(12); sigma_(21) sigma_(22)][x_1; x_2].
(18)

As long as

 |sigma_(11) sigma_(12); sigma_(21) sigma_(22)|!=0,
(19)

this can be inverted to give

[x_1; x_2]=[sigma_(11) sigma_(12); sigma_(21) sigma_(22)]^(-1)[y_1-mu_1; y_2-mu_2]
(20)
=1/(sigma_(11)sigma_(22)-sigma_(12)sigma_(21))[sigma_(22) -sigma_(12); -sigma_(21) sigma_(11)][y_1-mu_1; y_2-mu_2].
(21)

Therefore,

 x_1^2+x_2^2=([sigma_(22)(y_1-mu_1)-sigma_(12)(y_2-mu_2)]^2)/((sigma_(11)sigma_(22)-sigma_(12)sigma_(21))^2) 
 +([-sigma_(21)(y_1-mu_1)+sigma_(11)(y_2-mu_2)]^2)/((sigma_(11)sigma_(22)-sigma_(12)sigma_(21))^2),
(22)

and expanding the numerator of (22) gives

 sigma_(22)^2(y_1-mu_1)^2-2sigma_(12)sigma_(22)(y_1-mu_1)(y_2-mu_2)+sigma_(12)^2(y_2-mu_2)^2+sigma_(21)^2(y_1-mu_1)^2-2sigma_(11)sigma_(21)(y_1-mu_1)(y_2-mu_2)+sigma_(11)^2(y_2-mu_2)^2,
(23)

so

 (x_1^2+x_2^2)(sigma_(11)sigma_(22)-sigma_(12)sigma_(21))^2 
=(y_1-mu_1)^2(sigma_(21)^2+sigma_(22)^2)-2(y_1-mu_1)(y_2-mu_2)(sigma_(11)sigma_(21)+sigma_(12)sigma_(22))+(y_2-mu_2)^2(sigma_(11)^2+sigma_(12)^2) 
=sigma_2^2(y_1-mu_1)^2-2(y_1-mu_1)(y_2-mu_2)(rhosigma_1sigma_2)+sigma_1^2(y_2-mu_2)^2 
=sigma_1^2sigma_2^2[((y_1-mu_1)^2)/(sigma_1^2)-(2rho(y_1-mu_1)(y_2-mu_2))/(sigma_1sigma_2)+((y_2-mu_2)^2)/(sigma_2^2)].
(24)

Now, the denominator of (◇) is

 sigma_(11)^2sigma_(21)^2+sigma_(11)^2sigma_(22)^2+sigma_(12)^2sigma_(21)^2+sigma_(12)^2sigma_(22)^2-sigma_(11)^2sigma_(21)^2 
 -2sigma_(11)sigma_(12)sigma_(21)sigma_(22)-sigma_(12)^2sigma_(22)^2=(sigma_(11)sigma_(22)-sigma_(12)sigma_(21))^2,
(25)

so

1/(1-rho^2)=1/(1-(V_(12)^2)/(sigma_1^2sigma_2^2))
(26)
=(sigma_1^2sigma_2^2)/(sigma_1^2sigma_2^2-V_(12)^2)
(27)
=(sigma_1^2sigma_2^2)/((sigma_(11)^2+sigma_(12)^2)(sigma_(21)^2+sigma_(22)^2)-(sigma_(11)sigma_(21)+sigma_(12)sigma_(22))^2).
(28)

can be written simply as

 1/(1-rho^2)=(sigma_1^2sigma_2^2)/((sigma_(11)sigma_(22)-sigma_(12)sigma_(21))^2),
(29)

and

 x_1^2+x_2^2=1/(1-rho^2)[((y_1-mu_1)^2)/(sigma_1^2)-(2rho(y_1-mu_1)(y_2-mu_2))/(sigma_1sigma_2)+((y_2-mu_2)^2)/(sigma_2^2)].
(30)

Solving for x_1 and x_2 and defining

 rho^'=(sigma_1sigma_2sqrt(1-rho^2))/(sigma_(11)sigma_(22)-sigma_(12)sigma_(21))
(31)

gives

x_1=(sigma_(22)(y_1-mu_1)-sigma_(12)(y_2-mu_2))/(rho^')
(32)
x_2=(-sigma_(21)(y_1-mu_1)+sigma_(11)(y_2-mu_2))/(rho^').
(33)

But the Jacobian is

J((x_1,x_2)/(y_1,y_2))=|(partialx_1)/(partialy_1) (partialx_1)/(partialy_2); (partialx_2)/(partialy_1) (partialx_2)/(partialy_2)|=|(sigma_(22))/(rho^') -(sigma_(12))/(rho^'); -(sigma_(21))/(rho^') (sigma_(11))/(rho^')|
(34)
=1/(rho^('2))(sigma_(11)sigma_(22)-sigma_(12)sigma_(21))
(35)
=1/(rho^')=1/(sigma_1sigma_2sqrt(1-rho^2)),
(36)

so

 dx_1dx_2=(dy_1dy_2)/(sigma_1sigma_2sqrt(1-rho^2))
(37)

and

 1/(2pi)e^(-(x_1^2+x_2^2)/2)dx_1dx_2=1/(2pisigma_1sigma_2sqrt(1-rho^2))exp[-z/(2(1-rho^2))]dy_1dy_2,
(38)

where

 z=((y_1-mu_1)^2)/(sigma_1^2)-(2rho(y_1-mu_1)(y_2-mu_2))/(sigma_1sigma_2)+((y_2-mu_2)^2)/(sigma_2^2).
(39)

Q.E.D.

The characteristic function of the bivariate normal distribution is given by

phi(t_1,t_2)=int_(-infty)^inftyint_(-infty)^inftye^(i(t_1x_1+t_2x_2))P(x_1,x_2)dx_1dx_2
(40)
=Nint_(-infty)^inftyint_(-infty)^inftye^(i(t_1x_1+t_2x_2))exp[-z/(2(1-rho^2))]dx_1dx_2,
(41)

where

 z=[((x_1-mu_1)^2)/(sigma_1^2)-(2rho(x_1-mu_1)(x_2-mu_2))/(sigma_1sigma_2)+((x_2-mu_2)^2)/(sigma_2^2)]
(42)

and

 N=1/(2pisigma_1sigma_2sqrt(1-rho^2)).
(43)

Now let

u=x_1-mu_1
(44)
w=x_2-mu_2.
(45)

Then

 phi(t_1,t_2)=N^'int_(-infty)^infty(e^(it_2w)exp[-1/(2(1-rho^2))(w^2)/(sigma_2^2)])int_(-infty)^inftye^ve^(t_1u)dudw,
(46)

where

v=-1/(2(1-rho^2))1/(sigma_1^2)[u^2-(2rhosigma_1w)/(sigma_2)u]
(47)
N^'=(e^(i(t_1mu_1+t_2mu_2)))/(2pisigma_1sigma_2sqrt(1-rho^2)).
(48)

Complete the square in the inner integral

 int_(-infty)^inftyexp{-1/(2(1-rho^2))1/(sigma_1^2)[u^2-(2rhosigma_1w)/(sigma_2)u]}e^(t_1u)du 
=int_(-infty)^inftyexp{-1/(2sigma_1^2(1-rho^2))[u-(rho_1sigma_1w)/(sigma^2)]^2}{1/(2sigma_1^2(1-rho^2))((rho_1sigma_1w)/(sigma_2))^2}e^(it_1u)du.
(49)

Rearranging to bring the exponential depending on w outside the inner integral, letting

 v=u-rho(sigma_1w)/(sigma_2),
(50)

and writing

 e^(it_1u)=cos(t_1u)+isin(t_1u)
(51)

gives

 phi(t_1,t_2)=N^'int_(-infty)^inftye^(it_2w)exp[-1/(2sigma_2^2(1-rho^2))w^2]exp[(rho^2)/(2sigma_2^2(1-rho^2))w^2]int_(-infty)^inftyexp[-1/(2sigma_2^2(1-rho^2))v^2]{cos[t_1(v+(rhosigma_1w)/(sigma_2))]+isin[t_1(v+(rhosigma_1w)/(sigma_2))]}dvdw.
(52)

Expanding the term in braces gives

 [cos(t_1v)cos((rhosigma_1wt_1)/(sigma_2))-sin(t_1v)sin((rhosigma_1w)/(sigma_2t_1))]+i[sin(t_1v)cos((rhosigma_1w)/(sigma_2t_1))+cos(t_1v)sin((rhosigma_1wt_1)/(sigma_2))] 
=[cos((rhosigma_1wt_1)/(sigma_2))+isin((rhosigma_1wt_1)/(sigma_2))][cos(t_1v)+isin(t_1v)]=exp((irhosigma_1w)/(sigma_2)t_1)[cos(t_1v)+isin(t_1v)].
(53)

But e^(-ax^2)sin(bx) is odd, so the integral over the sine term vanishes, and we are left with

 phi(t_1,t_2)=N^'int_(-infty)^inftye^(it_2w)exp[-(w^2)/(2sigma_2^2)]exp[(rho^2w^2)/(2sigma_2^2(1-rho^2))]exp[(irhosigma_1wt_1)/(sigma_2)]dwint_(-infty)^inftyexp[-(v^2)/(2sigma_1^2(1-rho^2))]cos(t_1v)dv 
=N^'int_(-infty)^inftyexp[iw(t_2+t_1(rho(sigma_1)/(sigma_2)))]exp[-(w^2)/(2sigma_2^2)]dwint_(-infty)^inftyexp[-(v^2)/(2sigma_1^2(1-rho^2))]cos(t_1v)dv.
(54)

Now evaluate the Gaussian integral

int_(-infty)^inftye^(ikx)e^(-ax^2)dx=int_(-infty)^inftye^(-ax^2)cos(kx)dx
(55)
=sqrt(pi/a)e^(-k^2/4a)
(56)

to obtain the explicit form of the characteristic function,

 phi(t_1,t_2)=(e^(i(t_1mu_1+t_2mu_2)))/(2pisigma_1sigma_2sqrt(1-rho^2)){sigma_2sqrt(2pi)exp[-1/4(t_2+rho(sigma_1)/(sigma_2)t_1)^22sigma_2^2]}{sigma_1sqrt(2pi(1-rho^2))exp[-1/4t_1^22sigma_1^2(1-rho^2)]} 
=e^(i(t_1mu_1+t_2mu_2))exp{-1/2[t_2^2sigma_2^2+2rhosigma_1sigma_2t_1t_2+rho^2sigma_1^2t_1^2+(1-rho^2)sigma_1^2t_1^2]} 
=exp[i(t_1mu_1+t_2mu_2)-1/2(sigma_1^2t_1^2+2rhosigma_1sigma_2t_1t_2+sigma_2^2t_2^2)].
(57)

In the singular case that

 |sigma_(11) sigma_(12); sigma_(21) sigma_(22)|=0
(58)

(Kenney and Keeping 1951, p. 94), it follows that

 sigma_(11)sigma_(22)=sigma_(12)sigma_(21)
(59)
y_1=mu_1+sigma_(11)x_1+sigma_(12)x_2
(60)
y_2=mu_2+(sigma_(12)sigma_(21))/(sigma_(11))x_2
(61)
=mu_2+(sigma_(11)sigma_(21)x_1+sigma_(12)sigma_(21)x_2)/(sigma_(11))
(62)
=mu_2+(sigma_(21))/(sigma_(11))(sigma_(11)x_1+sigma_(12)x_2),
(63)

so

y_1=mu_1+x_3
(64)
y_2=mu_2+(sigma_(21))/(sigma_(11))x_3,
(65)

where

x_3=y_1-mu_1
(66)
=(sigma_(11))/(sigma_(21))(y_2-mu_2).
(67)

The standardized bivariate normal distribution takes sigma_1=sigma_2=1 and mu_1=mu_2=0. The quadrant probability in this special case is then given analytically by

P(x_1<=0,x_2<=0)=P(x_1>=0,x_2>=0)
(68)
=int_(-infty)^0int_(-infty)^0P(x_1,x_2)dx_1dx_2
(69)
=1/4+(sin^(-1)rho)/(2pi)
(70)

(Rose and Smith 1996; Stuart and Ord 1998; Rose and Smith 2002, p. 231). Similarly,

P(x_1<=0,x_2>=0)=P(x_1>=0,x_2<=0)
(71)
=int_(-infty)^0int_0^inftyP(x_1,x_2)dx_1dx_2
(72)
=(cos^(-1)rho)/(2pi).
(73)

Wolfram Web Resources

Mathematica »

The #1 tool for creating Demonstrations and anything technical.

Wolfram|Alpha »

Explore anything with the first computational knowledge engine.

Wolfram Demonstrations Project »

Explore thousands of free applications across science, mathematics, engineering, technology, business, art, finance, social sciences, and more.

Computerbasedmath.org »

Join the initiative for modernizing math education.

Online Integral Calculator »

Solve integrals with Wolfram|Alpha.

Step-by-step Solutions »

Walk through homework problems step-by-step from beginning to end. Hints help you try the next step on your own.

Wolfram Problem Generator »

Unlimited random practice problems and answers with built-in Step-by-step solutions. Practice online or make a printable study sheet.

Wolfram Education Portal »

Collection of teaching and learning tools built by Wolfram education experts: dynamic textbook, lesson plans, widgets, interactive Demonstrations, and more.

Wolfram Language »

Knowledge-based programming for everyone.