(1)

Inferential Statistics

Hypothesis tests on a population mean

Eva Riccomagno, Maria Piera Rogantin

DIMA – Università di Genova

riccomagno@dima.unige.it rogantin@dima.unige.it

(2)

Part D

Hypothesis tests on the mean of a population (continued)

1. Test on the mean of a Normal variable – known variance (a)-(b) . . .

(c) The p-value

(d) Sample size, given α and β

2. Test on the mean of a Normal variable – unknown variance

(3)

1. (c) The p-value

Recall

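A statistical hypothesis test on a parameter θ is given by

H₀ : θ ∈ Θ₀    H₁ : θ ∈ Θ₁    test statistic T ∼ F    rejection region R₀

The power function is defined as

$$
P(\theta) = P(T \in R_0 \mid \theta) =
\begin{cases}
\alpha(\theta) & \text{if } \theta \in \Theta_0 \\
1 - \beta(\theta) & \text{if } \theta \in \Theta_1
\end{cases}
=
\begin{cases}
\text{pr. of type I error} & \text{(reject } H_0 \text{ when true)} \\
1 - \text{pr. of type II error} & \text{(reject } H_0 \text{ when false)}
\end{cases}
$$

The size of the test is

$$
\alpha = \sup_{\theta \in \Theta_0} P(\theta)
$$

and it holds α ≥ α(θ) for all θ ∈ Θ₀

A test is said to be of level α if its size is less than or equal to α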

(4)

p-value

We have a family of statistical hypothesis tests

H₀ : θ ∈ Θ₀    H₁ : θ ∈ Θ₁    T ∼ F    t_obs from a sample X

and for all α ∈ (0, 1) we have R^α₀ (a rejection region for a size α test)

The p-value or observed significance level is the smallest level of significance at which H₀ would be rejected, namely

p-value = inf{α ∈ (0, 1) : t_obs ∈ R^α₀}

(5)

• From the book by Wasserman, p. 157

Informally, the p-value is a measure of the evidence against H0: the smaller the p-value, the stronger the evidence against H0

Typically, researchers use the following evidence scale:

p-value        evidence
< .01          very strong evidence against H0
.01 − .05      strong evidence against H0
.05 − .10      weak evidence against H0
> .1           little or no evidence against H0

Warning! A large p-value is not strong evidence in favor of H0. A large p-value can occur for two reasons: (i) H0 is true or (ii) H0 is false but the test has low power.

Warning! Do not confuse the p-value with P (H0|Data). The p-value is not the probability that the null hypothesis is true.

• “evidence” ⇐⇒ “statistically significant”

• The p-value depends on the sample size. If the sample is large, even a small difference can count as “evidence”, that is, a difference that is hard to explain by chance variability alone
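The dependence on the sample size can be seen numerically. The sketch below keeps the observed difference fixed and only changes n; all the numbers (µ₀ = 10000, σ = 2100, an observed mean of 9500, and the three sample sizes) are illustrative assumptions, and the test is taken to be a left-tailed test on a Normal mean with known variance.

```r
# Illustrative values (assumptions chosen for this sketch)
mu0   <- 10000            # mean under H0
sigma <- 2100             # standard deviation, treated as known
xbar  <- 9500             # observed sample mean, kept the same for every n
n     <- c(10, 50, 200)   # increasing sample sizes

# Left-tail p-value for H1: mu < mu0, one value per sample size
p_by_n <- pnorm(xbar, mean = mu0, sd = sigma / sqrt(n))
data.frame(n = n, p_value = signif(p_by_n, 3))
```

The same difference of 500 gives a smaller p-value as n grows, because the standard error σ/√n shrinks.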

(6)

The most common situation

Consider the family of tests with R^α₀ = {T ≥ s_α}, where s_α is computed from

P (T ≥ s_α|H₀) < α

Note that α′ < α″ if and only if s_α′ > s_α″

[Figure: density of T on the t axis, marking t_obs, the thresholds s_α′ and s_α″, and the areas α′ and α″]

$$
p(t_{obs}) = \sup_{\theta \in \Theta_0} P(T > t_{obs} \mid \theta)
$$

The p-value is the level of the test when t_obs is the threshold of the rejection region.

(7)

Running example. Concentration of toxic algal blooms

H₀ : µ ≥ 10000 and H₁ : µ < 10000

Day A: x̄_A = 8500, p-value(8500) = 0.012

stde=2100/sqrt(10)

pnorm(8500,10000,stde)

[1] 0.01194886

Day B: x̄_B = 9500, p-value(9500) = 0.226
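As a check of the definition, the p-value can also be recovered as the smallest α for which the observed mean falls in the rejection region. The sketch below does this on a grid of α values for both days; the grid step and the variable names are assumptions made for the illustration, and the probability is computed at µ = 10000, the boundary of H₀.

```r
stde  <- 2100 / sqrt(10)          # standard error, as on the slide
x_obs <- c(A = 8500, B = 9500)    # observed sample means

# Direct computation: left-tail probability at mu = 10000
p_direct <- pnorm(x_obs, mean = 10000, sd = stde)

# From the definition: smallest alpha on a grid such that the observed mean
# lies in the size-alpha rejection region {xbar < c_alpha}
alpha_grid <- seq(0.0005, 0.9995, by = 0.0005)
c_alpha    <- qnorm(alpha_grid, mean = 10000, sd = stde)
p_grid     <- sapply(x_obs, function(x) min(alpha_grid[x < c_alpha]))

round(rbind(p_direct, p_grid), 4)
```

The two rows agree up to the resolution of the grid.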


(8)

How to compute the p-value?

The p-value (like the rejection region) depends on the “form” of the alternative hypothesis

In practice, the p-value is the level of the test if the threshold of R₀ is the sample value

Consider H₀ : µ = µ₀

Assume µ₀ = 10000 and suppose x̄ = 9000

The p-value in the three different contexts:

H₁          p-value            R code
µ < µ₀      p(9000) = 0.066    pnorm(9000,mu0,std)
µ > µ₀      p(9000) = 0.934    1-pnorm(9000,mu0,std)
µ ≠ µ₀      p(9000) = 0.132    2*pnorm(9000,mu0,std)   [notice: 2*pnorm]
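The three p-values can be reproduced with the sketch below. It assumes that std is the standard error 2100/sqrt(10) of the running example, which is consistent with the value 0.066. For the two-sided case it uses 2*pnorm(-abs(z)) on the standardized value: this equals 2*pnorm(9000,mu0,std) here because the sample mean is below µ₀, and it stays valid when the sample mean is above µ₀.

```r
mu0  <- 10000
std  <- 2100 / sqrt(10)   # assumption: standard error of the running example
xbar <- 9000

p_less    <- pnorm(xbar, mu0, std)        # H1: mu < mu0
p_greater <- 1 - pnorm(xbar, mu0, std)    # H1: mu > mu0
z         <- (xbar - mu0) / std           # standardized sample mean
p_two     <- 2 * pnorm(-abs(z))           # H1: mu != mu0, either sign of z

p_all <- round(c(less = p_less, greater = p_greater, two.sided = p_two), 3)
p_all
```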

(9)

1. (d) Sample size n, given α and β, for a one-sided test

H₀ : µ = µ₀    H₁ : µ = µ₁

α: probability of type I error
β: probability of type II error

To have at most the given probabilities of error, the sample size n should be

$$
n \ge \frac{(z_\alpha + z_\beta)^2 \, \sigma^2}{(\mu_0 - \mu_1)^2}
$$

where z_α and z_β are the α-th and the β-th quantiles of a standard normal random variable Z, Z ∼ N (0, 1)

The sample size n increases when:

- the distance between µ₀ and µ₁ decreases
- the variance increases
- |z_α| and |z_β| increase, equivalently α and β decrease (for α, β < 0.5)
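The bound on n can be evaluated directly. The helper below is a minimal sketch: the function name sample_size and the inputs µ₁ = 9000 and β = 0.10 are assumptions made for illustration, while µ₀ = 10000 and σ = 2100 are taken from the running example.

```r
# Smallest integer n with n >= (z_alpha + z_beta)^2 * sigma^2 / (mu0 - mu1)^2
sample_size <- function(alpha, beta, sigma, mu0, mu1) {
  z_alpha <- qnorm(alpha)   # alpha-th quantile of N(0, 1)
  z_beta  <- qnorm(beta)    # beta-th quantile of N(0, 1)
  ceiling((z_alpha + z_beta)^2 * sigma^2 / (mu0 - mu1)^2)
}

# Illustrative call: mu1 and beta are assumed values
n_req <- sample_size(alpha = 0.05, beta = 0.10, sigma = 2100,
                     mu0 = 10000, mu1 = 9000)
n_req
```

Calling the function with a closer alternative (for example mu1 = 9500) or with smaller error probabilities returns a larger n, in line with the list above.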

(10)

How to compute n?

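Assume µ₁ < µ₀, so that H₀ is rejected when X̄ < s

$$
\alpha = P(\bar X < s \mid \mu = \mu_0)
= P\left( \frac{\bar X - \mu_0}{\sigma/\sqrt{n}} < \frac{s - \mu_0}{\sigma/\sqrt{n}} \right)
= P(Z < z_\alpha)
$$

$$
\beta = P(\bar X > s \mid \mu = \mu_1)
= P\left( \frac{\bar X - \mu_1}{\sigma/\sqrt{n}} > \frac{s - \mu_1}{\sigma/\sqrt{n}} \right)
= P(Z > z_{1-\beta})
$$

From

$$
\frac{s - \mu_0}{\sigma/\sqrt{n}} = z_\alpha
\quad \text{and} \quad
\frac{s - \mu_1}{\sigma/\sqrt{n}} = z_{1-\beta} = -z_\beta
$$

we have, subtracting the second equation from the first, (µ₁ − µ₀)/(σ/√n) = z_α + z_β, and hence

$$
n = \frac{(z_\alpha + z_\beta)^2 \, \sigma^2}{(\mu_0 - \mu_1)^2}
$$

The result is the same if µ₁ > µ₀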
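The derivation can be checked numerically: with the threshold s fixed by α, the type II error should fall to β or below once n reaches the value given by the formula. In this sketch µ₁ = 9000 and β = 0.10 are assumed values, and errors_for is a hypothetical helper name.

```r
mu0 <- 10000; mu1 <- 9000; sigma <- 2100   # mu1 is an assumed alternative
alpha <- 0.05; beta <- 0.10                # beta is an assumed target

errors_for <- function(n) {
  se <- sigma / sqrt(n)
  s  <- mu0 + qnorm(alpha) * se            # threshold fixing the type I error at alpha
  c(n = n, s = s,
    type_I  = pnorm(s, mu0, se),           # P(Xbar < s | mu = mu0)
    type_II = 1 - pnorm(s, mu1, se))       # P(Xbar > s | mu = mu1)
}

# Sample size from the formula, rounded up to an integer
n_star <- ceiling((qnorm(alpha) + qnorm(beta))^2 * sigma^2 / (mu0 - mu1)^2)

# Compare one observation fewer with the computed sample size
res <- rbind(errors_for(n_star - 1), errors_for(n_star))
round(res, 4)
```

With one observation fewer than n_star the type II error is still above β; at n_star it is below.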

(11)

2. Test on the mean of a Normal variable with unknown variance – t test

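Let X ∼ N (µ, σ²) with unknown µ and σ²

The estimator of µ remains X̄.

Remember that, if σ² is known,

$$
\bar X \sim N\left( \mu, \frac{\sigma^2}{n} \right)
\quad \text{or, equivalently,} \quad
\frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)
$$

The unbiased estimator of σ² is (see Part A)

$$
S^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( X_i - \bar X \right)^2
$$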
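In R, var() uses the same n − 1 denominator as S². A small check on a hypothetical sample (the five values below are made up for the illustration):

```r
x    <- c(4.1, 5.3, 6.0, 4.8, 5.6)   # hypothetical sample
n    <- length(x)
xbar <- mean(x)

# S^2 computed from its definition, with denominator n - 1
S2 <- sum((x - xbar)^2) / (n - 1)

c(S2 = S2, var = var(x))             # the two values coincide
```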

(12)

A short aside on probability

The random variable T has Student’s t distribution with n − 1 degrees of freedom:

$$
T = \frac{\bar X - \mu}{S/\sqrt{n}} \sim t_{[n-1]}
$$

The density function and the cumulative distribution function of a t_[n] r.v. are close to those of the standard normal r.v. N (0, 1)

[Figure: curves plotted over the range −3 to 3. Dashed lines: t_[2] and t_[5]; solid line: N (0, 1)]
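How close the t distribution is to the standard normal can also be seen from the quantiles. The sketch below compares the 0.05 quantile for a few degrees of freedom; the choice of 0.05 and of the degrees of freedom is only for illustration.

```r
df  <- c(2, 5, 9, 30, 1000)   # degrees of freedom to compare
q_t <- qt(0.05, df)           # 0.05 quantile of t with df degrees of freedom
q_z <- qnorm(0.05)            # 0.05 quantile of N(0, 1)

round(rbind(df = df, t_quantile = q_t, gap_from_normal = q_t - q_z), 4)
```

The t quantiles are more extreme than the normal one (heavier tails), and the gap shrinks as the degrees of freedom grow.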

(13)

Running example. H₀ : µ ≥ 10000, H₁ : µ < 10000

Suppose σ is unknown and estimated by s = 2000 (using S²)

The form of the rejection region is the same as when σ is known: R₀ = {X̄ < c}

The threshold c is such that

$$
P(\bar X < c \mid \mu = 10000)
= P\left( \frac{\bar X - 10000}{2000/\sqrt{10}} < \frac{c - 10000}{2000/\sqrt{10}} \right)
= P(T < t_\alpha) = \alpha = 0.05
$$

where t_α is the α-th quantile of a t_[9] random variable. Then

$$
\frac{c - 10000}{2000/\sqrt{10}} = t_\alpha
\quad \text{and} \quad
c = 10000 + t_\alpha \, \frac{2000}{\sqrt{10}}
$$

We obtain c = 8841 from R:

t_05 = qt(.05,9)

c=10000+t_05*2000/sqrt(10);c
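To see the effect of estimating σ, the t-based threshold can be set next to the one obtained from the normal quantile with the same numbers. The normal-based value is shown for comparison only: it treats 2000 as if it were the known σ, which is an assumption of this sketch.

```r
s_hat <- 2000; n <- 10; mu0 <- 10000; alpha <- 0.05
se <- s_hat / sqrt(n)                       # estimated standard error

c_t <- mu0 + qt(alpha, df = n - 1) * se     # threshold from t with 9 degrees of freedom
c_z <- mu0 + qnorm(alpha) * se              # comparison: as if sigma = 2000 were known

round(c(c_t = c_t, c_z = c_z))
```

The t-based threshold is farther from 10000: with an estimated variance, a more extreme sample mean is needed to reject H₀ at the same α.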

(14)

Example

How accurate are radon detectors of a type sold to homeowners?

University researchers placed 12 detectors in a chamber that exposed them to 105 pico-curies per liter (pCi/l) of radon

The detector readings were as follows:

91.9,97.8,111.4,122.3,105.4,95.0,103.8,99.6,96.6,119.3,104.8,101.7

Is there convincing evidence that the mean of detector readings differs from the nominal value of 105?

(15)

Model and test

Let X be the random variable modeling the detector reading

Assume X ∼ N (µ, σ²); σ² unknown

H₀ : µ = 105 and H₁ : µ ≠ 105

Test statistic under H₀:

$$
\frac{\bar X - 105}{S/\sqrt{n}}
$$

• p-value computation in R using t.test

Rn=c(91.9,97.8,111.4,122.3,105.4,95.0,103.8,99.6,96.6,119.3,104.8,101.7)
t.test(Rn-105) ### data should be centered at mu_0

	One Sample t-test

data:  Rn - 105
t = -0.31947, df = 11, p-value = 0.7554
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 -6.837503  5.104170
sample estimates:
 mean of x
-0.8666667

p-value=0.76.

The sample test statistic depends on both x̄ and s
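Instead of centering the data by hand, the null value can be passed to t.test through its mu argument. The sketch below runs both forms on the radon readings; the object names centered and direct are chosen for the illustration.

```r
Rn <- c(91.9, 97.8, 111.4, 122.3, 105.4, 95.0,
        103.8, 99.6, 96.6, 119.3, 104.8, 101.7)

centered <- t.test(Rn - 105)      # as on the slide: test mean 0 on centered data
direct   <- t.test(Rn, mu = 105)  # same test, null value given through mu

# Test statistic and p-value are the same in both forms
c(t = unname(direct$statistic), p_value = direct$p.value)

# The confidence interval is now on the scale of the readings
direct$conf.int
```

The only difference is the scale of the reported mean and confidence interval, which are shifted by 105.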

(16)

• Computation of the rejection region R₀

t025=qt(0.025,11)
c1=105+t025*sd(Rn)/sqrt(12)
c2=105-t025*sd(Rn)/sqrt(12)
cbind(c1,c2)

           c1       c2
[1,] 99.02916 110.9708

mean(Rn)

[1] 104.1333

R₀ = {x̄ < 99.0} ∪ {x̄ > 111.0}

Note that R₀ depends on the data; in fact the variance of X̄ is estimated by S²/n

The sample mean is 104.1, which does not belong to R₀

• Direct computation of the p-value

t_obs=(mean(Rn)-105)/(sd(Rn)/sqrt(length(Rn)));t_obs

[1] -0.3194729

pvalue=2*pt(t_obs,(length(Rn)-1));pvalue

[1] 0.7553532

There is no evidence to reject H₀ (µ = 105)
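The rejection region and the p-value lead to the same decision, which can be verified on the radon data. The sketch below assumes a level α = 0.05 and writes the two-sided p-value with -abs(t_obs), so that the expression also works when the observed statistic is positive.

```r
Rn <- c(91.9, 97.8, 111.4, 122.3, 105.4, 95.0,
        103.8, 99.6, 96.6, 119.3, 104.8, 101.7)
alpha <- 0.05                                    # assumed level of the test
n     <- length(Rn)

t_obs   <- (mean(Rn) - 105) / (sd(Rn) / sqrt(n)) # observed test statistic
p_value <- 2 * pt(-abs(t_obs), df = n - 1)       # two-sided, valid for either sign
t_crit  <- qt(1 - alpha / 2, df = n - 1)         # critical value for |T|

# Decision by rejection region and by p-value
c(reject_by_region = abs(t_obs) > t_crit,
  reject_by_p      = p_value < alpha)
```

Both entries are FALSE here: H₀ is not rejected at the 5% level by either criterion.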