Inferential Statistics – Hypothesis Tests
Eva Riccomagno, Maria Piera Rogantin
DIMA – Università di Genova
riccomagno@dima.unige.it rogantin@dima.unige.it
Part F
Hypothesis tests for the equality of two means
a) paired samples
b) two-samples
Review
Exercise (Chicago Tribune). Chicagoland's technology professionals get local technology news from various newspapers and magazines. A marketing company claims that 25% of the IT professionals choose the Chicago Tribune as their primary source for local IT news. A survey was conducted to check this claim. Among a sample of 750 IT professionals in the Chicagoland area, 23.47% prefer the Chicago Tribune. Can we conclude that the claim of the marketing company is true?
The random variable modeling the preference of the Chicago Tribune is X ∼ B(1, p)
Test statistic: P̂ = X̄; sample value: p̂ = 0.2347
Large sample size (n = 750). Using the CLT, P̂ ∼ N(p, p(1 − p)/n) approximately.
H0 : p = 0.25 and H1 : p ≠ 0.25, or H1 : p < 0.25?
p-value computation in R using prop.test
> np=750*0.2347
> prop.test(np,750,0.25)

        1-sample proportions test with continuity correction

data:  np out of 750, null probability 0.25
X-squared = 0.85654, df = 1, p-value = 0.3547
alternative hypothesis: true p is not equal to 0.25
95 percent confidence interval:
 0.2051343 0.2670288
sample estimates:
     p
0.2347
> prop.test(np,750,0.25,"less")

        1-sample proportions test with continuity correction

data:  np out of 750, null probability 0.25
X-squared = 0.85654, df = 1, p-value = 0.1774
alternative hypothesis: true p is less than 0.25
95 percent confidence interval:
 0.0000000 0.2617696
sample estimates:
     p
0.2347
In both cases there is no evidence to reject H0 (p = 0.25)
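The large-sample z-test described above can also be computed by hand. The following is a minimal sketch (without continuity correction, so the p-values differ slightly from the prop.test output):

```r
# Large-sample z-test for a proportion (no continuity correction)
p0    <- 0.25     # null value of p
p_hat <- 0.2347   # observed sample proportion
n     <- 750      # sample size

z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # standardized test statistic
p_two  <- 2 * pnorm(-abs(z))                 # two-sided p-value
p_less <- pnorm(z)                           # one-sided p-value (H1: p < 0.25)
round(c(z = z, p_two = p_two, p_less = p_less), 4)
```

Both p-values are well above 0.05, consistent with the prop.test results.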
Test for the equality of two means
A common application is to test if a new process or treatment is superior to a current process or treatment
The data may either be paired or unpaired
a) Paired samples. When there is a one-to-one correspondence between the values in the two samples: if X1, X2, . . . , Xn and Y1, Y2, . . . , Yn are the two sample variables, then Xi corresponds to Yi
b) Unpaired samples The sample sizes for the two samples may or may not be equal
a) Paired samples
Let X and Y be two random variables modeling a characteristic of the same population
Example. Drinking Water
(from https://onlinecourses.science.psu.edu, Penn State University)
Trace metals in drinking water affect the flavor and an unusually high concentration can pose a health hazard
Ten pairs of data were taken measuring zinc concentration in bottom water and surface water
water
    bottom  surface
1    0.430    0.415
2    0.266    0.238
3    0.567    0.390
4    0.531    0.410
5    0.707    0.605
6    0.716    0.609
7    0.651    0.632
8    0.589    0.523
9    0.469    0.411
10   0.723    0.612
[Scatter plot of surface against bottom zinc concentration, with the reference line surface = bottom in red; all points lie below the line.]
attach(water)
m = min(surface, bottom)
M = max(surface, bottom)
plot(surface ~ bottom, asp=1, pch=16,
     xlim=c(m,M), ylim=c(m,M),
     cex.axis=1.5, cex.lab=1.5)
abline(0, 1, col="red", lwd=2)
Assume X ∼ N(µX, σX²) and Y ∼ N(µY, σY²). Test hypotheses
H0 : µX = µY and H1 : µX ≠ µY
or equivalently
H0 : µX − µY = 0 and H1 : µX − µY ≠ 0
(simple or composite hypotheses – one/two sided)
Let (X1, Y1), . . . , (Xn, Yn) be the n paired sample variables. Consider the sample random variables D1, . . . , Dn with Di = Xi − Yi, and the sample mean of D:
D̄ ∼ N(µD, σD²/n)
with µD = µX − µY and σD² = σX² + σY² − 2 Cov(X, Y), usually unknown and estimated by the unbiased estimator SD²
The test for the equality of the means becomes a Student's t test on µD, with H0 : µD = 0
Example. Drinking Water (continue)
> D=bottom-surface;D
[1] 0.015 0.028 0.177 0.121 0.102 0.107 0.019 0.066 0.058 0.111
• Hypotheses: H0 : µD = 0 and H1 : µD ≠ 0
• Two-sided: R0 = (−∞, c1) ∪ (c2, +∞)
• Sample size: n = 10
• Sample variables: D1, . . . , D10 i.i.d., Di ∼ N(0, σD²), with σD² estimated by SD²
• Test statistic under H0: T = D̄/(SD/√n) ∼ t9
• α = 0.05
The thresholds c1 and c2 of the rejection region are such that
0.025 = P(T < c1 | µD = 0) and 0.025 = P(T > c2 | µD = 0)
Observe that, because of the symmetry of the Student's t density about 0,
c1 = −c2
In the sample: d̄ = 0.0804 and s = 0.0523
The sample value of the test statistic, under H0, is 4.86
> d_m=mean(D);d_m
[1] 0.0804
> s=sd(D);s
[1] 0.05227321
> t=d_m/(s/sqrt(10));t
[1] 4.863813
The rejection region is R0 = (−∞, −2.262) ∪ (2.262, ∞). The p-value is 0.0009
> c1=qt(0.025,9);c1
[1] -2.262157
> 2*(1-pt(t,9))
[1] 0.0008911155
The direct computation in R produces
> t.test(surface,bottom,paired=TRUE)

        Paired t-test

data:  surface and bottom
t = -4.8638, df = 9, p-value = 0.0008911
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.117794 -0.043006
sample estimates:
mean of the differences
                -0.0804
There is experimental evidence to reject H0
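Since the paired test is just a one-sample Student's t test on the differences, the two R calls below (using the data of the example) give identical statistics:

```r
# Zinc concentration in bottom and surface water at 10 paired locations
bottom  <- c(0.430, 0.266, 0.567, 0.531, 0.707, 0.716, 0.651, 0.589, 0.469, 0.723)
surface <- c(0.415, 0.238, 0.390, 0.410, 0.605, 0.609, 0.632, 0.523, 0.411, 0.612)

paired <- t.test(surface, bottom, paired = TRUE)  # paired two-sample call
onesmp <- t.test(surface - bottom, mu = 0)        # one-sample call on the differences
unname(paired$statistic)  # same t statistic in both calls
unname(onesmp$statistic)
paired$p.value            # same p-value as well
```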
b) Unpaired samples
Example. Prey of two species of spiders
(from https://onlinecourses.science.psu.edu, Penn State University)
The feeding habits of two species of net-casting spiders are studied. The two species, deinopis and menneus, coexist in eastern Australia. The following data were obtained on the size, in millimeters, of the prey of random samples of the two species.
The spiders were selected randomly and thus we assume independent measurements.
> d=c(12.9,10.2,7.4,7.0,10.5,11.9,7.1,9.9,14.4,11.3)
> m=c(10.2,6.9,10.9,11.0,10.1,5.3,7.5,10.3,9.2,8.8)
> mean(d);mean(m)
[1] 10.26
[1] 9.02
[Side-by-side boxplots of the prey sizes (mm) for the two samples d and m.]
Normal distribution
Assume the prey sizes of the two populations (denoted by A and B) follow Normal distributions:
XA ∼ N(µA, σA²)   XB ∼ N(µB, σB²)
Let nA and nB be the sizes of the two independent samples of XA and XB. In the example nA = nB = 10.
We want to test
H0 : µA = µB and H1 : µA ≠ µB
or equivalently
H0 : µA − µB = 0 and H1 : µA − µB ≠ 0
The two sample mean random variables are
X̄A ∼ N(µA, σA²/nA)   X̄B ∼ N(µB, σB²/nB)
The difference of the two sample mean random variables follows the Normal distribution
X̄A − X̄B ∼ N(µA − µB, σA²/nA + σB²/nB)
The original test becomes a test on the mean of one Normal random variable
1. The variances σA² and σB² are known. Fixed α, a usual z-test is carried out.
2. The variances σA² and σB² are unknown, assumed equal, and estimated by the unbiased estimators SA² and SB².
An unbiased estimator of the variance of the random variable X̄A − X̄B is
S² = ((nA − 1)SA² + (nB − 1)SB²)/(nA + nB − 2) · (nA + nB)/(nA nB)
(pooled variance). In particular, if nA = nB, then S² = (SA² + SB²)/nA.
The test statistic is
T = (X̄A − X̄B − (µA − µB))/S ∼ t(nA+nB−2)
Fixed α, a usual Student's t test is carried out.
3. The unknown variances σA² and σB² are not equal.
A hypothesis test based on the t distribution, known as Welch's t-test, can be used
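A quick numerical check of the pooled-variance formula above (the function name and the two small samples are illustrative, not part of the example):

```r
# Pooled estimate of Var(Xbar_A - Xbar_B), following the formula above
pooled_var <- function(x, y) {
  nA <- length(x); nB <- length(y)
  s2 <- ((nA - 1) * var(x) + (nB - 1) * var(y)) / (nA + nB - 2)
  s2 * (nA + nB) / (nA * nB)   # i.e. s2 * (1/nA + 1/nB)
}

x <- c(12.9, 10.2, 7.4, 7.0, 10.5)   # illustrative samples of equal size
y <- c(10.2, 6.9, 10.9, 11.0, 10.1)

# With nA = nB the formula reduces to (S_A^2 + S_B^2)/nA
all.equal(pooled_var(x, y), (var(x) + var(y)) / length(x))
```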
Example. Prey of two species of spiders (continue)
• Hypotheses: H0 : µD = µM and H1 : µD ≠ µM
• Two-sided: R0 = (−∞, c1) ∪ (c2, +∞)
• Sample size: nD = nM = 10
• First, assume σD² = σM². Pooled variance estimator: S² = (SD² + SM²)/nD
• Test statistic under H0: T = (X̄D − X̄M)/S ∼ t(2nD−2)
• α = 0.05
The thresholds c1 and c2 of the rejection region are such that
0.025 = P(T < c1 | µD = µM) and 0.025 = P(T > c2 | µD = µM)
The sample means of the two groups are:
x̄D = 10.26 and x̄M = 9.02
The sample difference of means is x̄D − x̄M = 1.24. The sample pooled variance is s² = 0.99.
The sample value of the test statistic, under H0, is 1.25
> diff_m=mean(d)-mean(m);diff_m
[1] 1.24
> s2=(sd(d)^2+sd(m)^2)/10;s2
[1] 0.9915556
> t=diff_m/sqrt(s2);t
[1] 1.245269
The rejection region is R0 = (−∞, −2.1) ∪ (2.1, +∞). The p-value is 0.23
> c1=qt(0.025,18);c1
[1] -2.100922
> 2*(1-pt(t,18)) ## note 2*( ) -- two-sided test
[1] 0.2290008
There is no experimental evidence to reject H0
Can we assume equal variances?
A specific test can be performed, based on the Fisher (F) distribution. Here we do not give the details; compute in R
> var.test(m, d, ratio = 1)

        F test to compare two variances

data:  m and d
F = 0.56936, num df = 9, denom df = 9, p-value = 0.4142
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.1414206 2.2922339
sample estimates:
ratio of variances
         0.5693585
We can assume σD² = σM², although the ratio of the sample variances is 0.57. This apparent inconsistency is due to the small sample sizes
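The var.test output can be reproduced from the ratio of the sample variances and the Fisher distribution with (9, 9) degrees of freedom:

```r
d <- c(12.9, 10.2, 7.4, 7.0, 10.5, 11.9, 7.1, 9.9, 14.4, 11.3)
m <- c(10.2, 6.9, 10.9, 11.0, 10.1, 5.3, 7.5, 10.3, 9.2, 8.8)

F_obs <- var(m) / var(d)   # ratio of sample variances, as in var.test(m, d)
# Two-sided p-value: twice the smaller tail of the F(9, 9) distribution
p_val <- 2 * min(pf(F_obs, 9, 9), 1 - pf(F_obs, 9, 9))
c(F = F_obs, p = p_val)    # matches F = 0.56936, p-value = 0.4142
```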
Direct computation in R of the test
H0 : µD = µM and H1 : µD ≠ µM, assuming σD² = σM²
> t.test(d,m,var.equal=T)

        Two Sample t-test

data:  d and m
t = 1.2453, df = 18, p-value = 0.229
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.8520327  3.3320327
sample estimates:
mean of x mean of y
    10.26      9.02
If the equality of the variances is rejected, we use the Welch Two Sample t-test
In that case the pooled variance s² and the degrees of freedom are computed in a different manner
Compute in R
> t.test(d,m)

        Welch Two Sample t-test

data:  d and m
t = 1.2453, df = 16.74, p-value = 0.2302
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.8633815  3.3433815
sample estimates:
mean of x mean of y
    10.26      9.02
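The df = 16.74 above comes from the Welch–Satterthwaite approximation, which can be checked directly:

```r
d <- c(12.9, 10.2, 7.4, 7.0, 10.5, 11.9, 7.1, 9.9, 14.4, 11.3)
m <- c(10.2, 6.9, 10.9, 11.0, 10.1, 5.3, 7.5, 10.3, 9.2, 8.8)

vA <- var(d) / length(d)   # S_A^2 / n_A
vB <- var(m) / length(m)   # S_B^2 / n_B

# Welch-Satterthwaite approximate degrees of freedom
df_welch <- (vA + vB)^2 /
  (vA^2 / (length(d) - 1) + vB^2 / (length(m) - 1))
round(df_welch, 2)         # 16.74, as reported by t.test(d, m)
```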
The problem of making inference on means when variances are unequal is, in general, quite a difficult one. It is known as the Behrens–Fisher problem
(G. Casella, R.L. Berger, Statistical Inference, 2nd ed., Duxbury, Ex. 8.42)
Notes and generalisations
• The Wald test. If the two random variables are not normally distributed and the sample size is "large", a Wald test can be performed
• Threshold different from zero. In some applications, you may want to adopt a new process or treatment only if it exceeds the current treatment by some threshold. In this case, the difference between the two means is not compared with 0 but with the chosen threshold
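In R the threshold is passed through the mu argument of t.test. For instance, with the spider data, one could test whether the deinopis mean exceeds the menneus mean by more than 1 mm (a threshold chosen purely for illustration):

```r
d <- c(12.9, 10.2, 7.4, 7.0, 10.5, 11.9, 7.1, 9.9, 14.4, 11.3)
m <- c(10.2, 6.9, 10.9, 11.0, 10.1, 5.3, 7.5, 10.3, 9.2, 8.8)

# H0: mu_D - mu_M = 1  vs  H1: mu_D - mu_M > 1
res <- t.test(d, m, mu = 1, alternative = "greater", var.equal = TRUE)
res$p.value   # well above 0.05: no evidence the difference exceeds 1 mm
```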