Inferential Statistics Hypothesis tests on a population mean

Academic year: 2021


(1)

Inferential Statistics

Hypothesis tests on a population mean

Eva Riccomagno, Maria Piera Rogantin

DIMA – Università di Genova

riccomagno@dima.unige.it rogantin@dima.unige.it

(2)

Part D

Hypothesis tests on the mean of a population (continued)

1. Test on the mean of a Normal variable – known variance (a)-(b) . . .

(c) The p-value

(d) Sample size, given α and β

2. Test on the mean of a Normal variable – unknown variance

(3)

1. (c) The p-value

Recall

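A statistical hypothesis test on a parameter θ is given by the hypotheses H0 : θ ∈ Θ0 and H1 : θ ∈ Θ1, a test statistic T ∼ F and a rejection region R0.

The power function is defined as

$$
P(\theta) = P(T \in R_0 \mid \theta) =
\begin{cases}
\alpha(\theta) & \text{if } \theta \in \Theta_0 \\
1 - \beta(\theta) & \text{if } \theta \in \Theta_1
\end{cases}
=
\begin{cases}
\text{probability of type I error} & \text{(reject } H_0 \text{ when true)} \\
1 - \text{probability of type II error} & \text{(reject } H_0 \text{ when false)}
\end{cases}
$$

The size of the test is $\alpha = \sup_{\theta \in \Theta_0} P(\theta)$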

and it holds α ≥ α(θ) for all θ ∈ Θ0

A test is said to be of level α if its size is less than or equal to α
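The definitions above can be illustrated with a minimal R sketch. It assumes the one-sided test H0 : µ ≥ 10000 against H1 : µ < 10000 with known σ; the values σ = 2100 and n = 10 are taken from the R code of the running example, while α = 0.05 and the names `c_thr` and `power_fun` are assumptions introduced only for illustration.

```r
# Assumed setting: one-sided test on a Normal mean with known sigma
mu0   <- 10000
sigma <- 2100
n     <- 10
alpha <- 0.05                      # assumed size of the test
se    <- sigma / sqrt(n)

# Threshold of the rejection region R0 = {Xbar < c_thr} for a size-alpha test
c_thr <- mu0 + qnorm(alpha) * se

# Power function: probability of rejecting H0 when the true mean is mu
power_fun <- function(mu) pnorm(c_thr, mean = mu, sd = se)

# On Theta0 (mu >= mu0) the power function is the type I error probability;
# it is largest at the boundary mu = mu0, where it equals the size alpha
mu_null  <- seq(mu0, mu0 + 3000, by = 500)
size_hat <- max(power_fun(mu_null))

# On Theta1 (mu < mu0) the power function is 1 - beta(mu)
power_9000 <- power_fun(9000)
```

In this sketch `size_hat` coincides with α because the supremum over Θ0 is attained at µ = µ0.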

(4)

p-value

We have a family of statistical hypothesis tests with hypotheses H0 : θ ∈ Θ0 and H1 : θ ∈ Θ1, a test statistic T ∼ F, and its observed value t_obs computed from a sample X,

and for all α ∈ (0, 1) we have $R_0^\alpha$ (a rejection region for a size α test)

The p-value or observed significance level is the smallest level of significance at which H0 would be rejected, namely

$$ \text{p-value} = \inf\{\alpha \in (0, 1) : t_{obs} \in R_0^\alpha\} $$
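As a rough numerical check of this definition, the following R sketch searches a grid of α values for the smallest one at which the observed value falls in the rejection region, and compares it with the closed-form tail probability. It assumes the one-sided test of the running example (H0 : µ ≥ 10000, known standard error 2100/sqrt(10), observed mean 8500); the grid step and the helper name `rejects` are illustrative choices.

```r
mu0  <- 10000
se   <- 2100 / sqrt(10)
xbar <- 8500                        # observed sample mean (Day A)

# Rejection region of the size-alpha test: {Xbar < mu0 + z_alpha * se}
rejects <- function(alpha) xbar < mu0 + qnorm(alpha) * se

# Smallest alpha on a fine grid for which the observed value is rejected
alphas <- seq(0.0001, 0.9999, by = 0.0001)
p_grid <- min(alphas[rejects(alphas)])

# Closed form: level of the test whose threshold is the observed value
p_exact <- pnorm(xbar, mu0, se)
```

The grid value can only approximate the infimum from above, so `p_grid` and `p_exact` agree up to the grid step.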

(5)

From the book by Wasserman, p. 157

Informally, the p-value is a measure of the evidence against H0: the smaller the p-value, the stronger the evidence against H0

Typically, researchers use the following evidence scale:

p-value        evidence
< .01          very strong evidence against H0
.01 − .05      strong evidence against H0
.05 − .10      weak evidence against H0
> .1           little or no evidence against H0

Warning! A large p-value is not strong evidence in favor of H0. A large p-value can occur for two reasons: (i) H0 is true or (ii) H0 is false but the test has low power.

Warning! Do not confuse the p-value with P (H0|Data). The p-value is not the probability that the null hypothesis is true.

• “evidence” ⇐⇒ “statistically significant”

• The p-value depends on the sample size. If the sample is large, even a small difference can be “evidence”, that is, a difference that is hard to explain by chance variability alone
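The dependence on the sample size can be seen in a short R sketch. It assumes a one-sided test of H0 : µ ≥ 10000 with known σ = 2100; the observed mean 9800 and the three sample sizes are hypothetical values chosen only to keep the difference from µ0 fixed while n grows.

```r
mu0   <- 10000
sigma <- 2100
xbar  <- 9800                       # hypothetical observed mean, 200 below mu0
n     <- c(10, 100, 1000)           # hypothetical sample sizes

# Same observed difference, smaller standard error as n grows
p_values <- pnorm(xbar, mean = mu0, sd = sigma / sqrt(n))
data.frame(n = n, p_value = round(p_values, 4))
```

The same difference of 200 gives a large p-value for n = 10 and a very small one for n = 1000.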

(6)

The most common situation

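Consider the family of tests with $R_0^\alpha = \{T \ge s_\alpha\}$, where $s_\alpha$ is computed from

$P(T \ge s_\alpha \mid H_0) < \alpha$

Note that α′ < α″ if and only if $s_{\alpha'} > s_{\alpha''}$

[Figure: density of T over the t axis, with the thresholds s_α′ and s_α″ and the observed value t_obs marked, together with the tail areas α′ and α″.]

$$ p(t_{obs}) = \sup_{\theta \in \Theta_0} P(T > t_{obs}) $$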

The p-value is the level of the test when tobs is the threshold of the rejection region.

(7)

Running example. Concentration of toxic algal blooms

H0 : µ ≥ 10000 and H1 : µ < 10000

Day A: x̄A = 8500, p-value(8500) = 0.012

stde=2100/sqrt(10)          # standard error of the sample mean
pnorm(8500,10000,stde)
[1] 0.01194886

Day B: x̄B = 9500, p-value(9500) = 0.226

[Figure: density plots over the range from 7000 to 13000 illustrating the two p-values.]
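The Day B value can be reproduced in the same way as Day A. The sketch below reuses the standard error 2100/sqrt(10) from the code above; the level α = 0.05 used for the decision is an assumption.

```r
stde <- 2100 / sqrt(10)             # standard error of the sample mean

# Left-tail probabilities under mu = 10000 (H1: mu < 10000)
p_A <- pnorm(8500, 10000, stde)     # Day A
p_B <- pnorm(9500, 10000, stde)     # Day B
round(c(DayA = p_A, DayB = p_B), 3)

# Decision at an assumed level alpha = 0.05
alpha <- 0.05
c(DayA = p_A < alpha, DayB = p_B < alpha)
```

With this level H0 is rejected on Day A but not on Day B.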

(8)

How to compute the p-value?

The p-value (as the rejection region) depends on the “form” of the alternative hypothesis

In practice, the p-value is the level of the test if the threshold of R0 is the sample value

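Consider H0 : µ = µ0

Assume µ0 = 10000 and suppose x̄ = 9000. The p-value changes with the alternative hypothesis:

H1 : µ < µ0     p(9000) = 0.066     pnorm(9000,mu0,std)
H1 : µ > µ0     p(9000) = 0.934     1-pnorm(9000,mu0,std)
H1 : µ ≠ µ0     p(9000) = 0.132     2*pnorm(9000,mu0,std)   [notice: 2*pnorm]

[Figure: three panels, one per alternative, showing the density centered at 10000 with the observed value 9000 marked; in the two-sided panel the symmetric value 11000 is also marked.]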
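The three cases can be collected in one small function. This is a sketch, not part of the original slides: the helper `p_value_z` is hypothetical, and `std = 2100/sqrt(10)` is an assumption that reproduces the values shown above. The two-sided case uses `-abs(z)` so that it also works when the observed mean lies above µ0.

```r
# Hypothetical helper: p-value for a test on a Normal mean with known sigma
p_value_z <- function(xbar, mu0, se,
                      alternative = c("less", "greater", "two.sided")) {
  alternative <- match.arg(alternative)
  z <- (xbar - mu0) / se
  switch(alternative,
         less      = pnorm(z),            # H1: mu < mu0, left tail
         greater   = 1 - pnorm(z),        # H1: mu > mu0, right tail
         two.sided = 2 * pnorm(-abs(z)))  # H1: mu != mu0, both tails
}

std <- 2100 / sqrt(10)               # assumed standard error
p_all <- sapply(c("less", "greater", "two.sided"),
                function(a) p_value_z(9000, 10000, std, a))
round(p_all, 3)
```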

(9)

1. (d) Sample size n, given α and β, for a one-sided test

H0 : µ = µ0 and H1 : µ = µ1

α: probability of type I error; β: probability of type II error

To have at most the given probabilities of error, the sample size n should be

$$ n \ge \frac{(z_\alpha + z_\beta)^2 \, \sigma^2}{(\mu_0 - \mu_1)^2} $$

where $z_\alpha$ and $z_\beta$ are the α-th and the β-th quantiles of a standard normal random variable Z, Z ∼ N (0, 1)

The sample size n increases when:

- the distance between µ0 and µ1 decreases
- the variance increases
- |zα| and |zβ| increase, equivalently α and β decrease (for α, β < 0.5)
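The bound on n can be evaluated with a short R function. The function name `sample_size` and all numerical inputs (µ0 = 10000, µ1 = 9000 or 9500, σ = 2100, α = 0.05, β = 0.10) are assumptions chosen for illustration; `qnorm` gives the quantiles of the standard normal.

```r
# Hypothetical helper: smallest integer n satisfying the bound
sample_size <- function(mu0, mu1, sigma, alpha, beta) {
  z_alpha <- qnorm(alpha)            # alpha-th quantile of N(0, 1)
  z_beta  <- qnorm(beta)             # beta-th quantile of N(0, 1)
  ceiling((z_alpha + z_beta)^2 * sigma^2 / (mu0 - mu1)^2)
}

n_needed <- sample_size(10000, 9000, 2100, alpha = 0.05, beta = 0.10)
n_closer <- sample_size(10000, 9500, 2100, alpha = 0.05, beta = 0.10)
c(n_needed = n_needed, n_closer = n_closer)
```

Halving the distance between µ0 and µ1 roughly quadruples the required sample size, in line with the first item of the list.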

(10)

How to compute n?

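Assume µ1 < µ0

$$ \alpha = P(\bar X < s \mid \mu = \mu_0) = P\left( \frac{\bar X - \mu_0}{\sigma/\sqrt{n}} < \frac{s - \mu_0}{\sigma/\sqrt{n}} \right) = P(Z < z_\alpha) $$

$$ \beta = P(\bar X > s \mid \mu = \mu_1) = P\left( \frac{\bar X - \mu_1}{\sigma/\sqrt{n}} > \frac{s - \mu_1}{\sigma/\sqrt{n}} \right) = P(Z > z_{1-\beta}) $$

From

$$ \frac{s - \mu_0}{\sigma/\sqrt{n}} = z_\alpha \quad \text{and} \quad \frac{s - \mu_1}{\sigma/\sqrt{n}} = z_{1-\beta} = -z_\beta $$

subtracting the second equation from the first gives $\frac{\mu_1 - \mu_0}{\sigma/\sqrt{n}} = z_\alpha + z_\beta$, and so we have:

$$ n = \frac{(z_\alpha + z_\beta)^2 \, \sigma^2}{(\mu_0 - \mu_1)^2} $$

The result is the same if µ1 > µ0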
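The derivation can be checked numerically: once n is rounded up to an integer, the threshold s is fixed by α, and the resulting type II error probability should not exceed β. The numerical values below (µ0 = 10000, µ1 = 9000, σ = 2100, α = 0.05, β = 0.10) are assumptions used only for this sketch.

```r
mu0 <- 10000; mu1 <- 9000; sigma <- 2100   # assumed values
alpha <- 0.05; beta <- 0.10                # assumed error probabilities

# Sample size from the formula, rounded up to an integer
n  <- ceiling((qnorm(alpha) + qnorm(beta))^2 * sigma^2 / (mu0 - mu1)^2)
se <- sigma / sqrt(n)

# Threshold: reject H0 when Xbar < s
s <- mu0 + qnorm(alpha) * se

alpha_achieved <- pnorm(s, mu0, se)        # P(Xbar < s | mu = mu0)
beta_achieved  <- 1 - pnorm(s, mu1, se)    # P(Xbar > s | mu = mu1)
c(n = n, alpha = alpha_achieved, beta = beta_achieved)
```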

(11)

2. Test on the mean of a Normal variable with unknown variance – t test

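Let X ∼ N (µ, σ2) with unknown µ and σ2. The estimator of µ remains $\bar X$.

Remember that, if σ2 is known,

$$ \bar X \sim N\left(\mu, \frac{\sigma^2}{n}\right) \quad \text{or, equivalently,} \quad \frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0, 1) $$

The unbiased estimator of σ2 is (see Part A)

$$ S^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( X_i - \bar X \right)^2 $$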
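In R the estimator S² is what `var` computes. The sketch below compares a direct computation with the built-in function on a small hypothetical sample (the data vector is invented for illustration).

```r
x <- c(9.1, 10.4, 8.7, 11.2, 10.0)   # hypothetical sample
n <- length(x)

# Unbiased estimator of the variance, computed from the definition
s2_manual  <- sum((x - mean(x))^2) / (n - 1)

# Built-in equivalent
s2_builtin <- var(x)

# Estimated standard error of the sample mean, S / sqrt(n)
se_hat <- sqrt(s2_manual / n)
```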

(12)

A short aside on probability

The random variable T has a Student’s t distribution with n − 1 degrees of freedom:

$$ T = \frac{\bar X - \mu}{S/\sqrt{n}} \sim t_{[n-1]} $$

The density function and the cumulative distribution function of a t[n] random variable are close to those of the standard normal random variable N (0, 1).

[Figure: densities over the range from −3 to 3. Dashed lines: t[2] and t[5]; solid line: N (0, 1).]
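How close the two distributions are can be checked by comparing quantiles. The sketch below uses the 0.975 quantile and a few degrees of freedom; both choices are illustrative assumptions.

```r
df     <- c(2, 5, 30, 100)           # illustrative degrees of freedom
q_t    <- qt(0.975, df)              # 0.975 quantiles of t[df]
q_norm <- qnorm(0.975)               # 0.975 quantile of N(0, 1)

data.frame(df = df,
           t_quantile = round(q_t, 3),
           normal_quantile = round(q_norm, 3))
```

The t quantiles are larger than the normal one (heavier tails) and approach it as the degrees of freedom grow.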

(13)

Running example. H0 : µ ≥ 10000 and H1 : µ < 10000

Suppose σ is unknown and estimated by s = 2000 (using S2)

The form of the rejection region is the same as when σ is known: $R_0 = \{\bar X < c\}$

The threshold c is such that

$$ P(\bar X < c \mid \mu = 10000) = P\left( \frac{\bar X - 10000}{2000/\sqrt{10}} < \frac{c - 10000}{2000/\sqrt{10}} \right) = P(T < t_\alpha) = \alpha = 0.05 $$

where tα is the α-th quantile of a t[9] random variable. Then

$$ \frac{c - 10000}{2000/\sqrt{10}} = t_\alpha \quad \text{and} \quad c = 10000 + t_\alpha \, \frac{2000}{\sqrt{10}} $$

We obtain c = 8841 from R:

t_05 = qt(.05,9)                      # 0.05 quantile of t[9]
c=10000+t_05*2000/sqrt(10);c          # threshold of the rejection region
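The same test can be carried out through the p-value. The sketch below assumes that the Day A sample mean 8500 from the known-variance example is observed together with s = 2000 and n = 10; combining these values is an assumption made only for illustration.

```r
mu0   <- 10000
s_hat <- 2000                        # estimate of sigma
n     <- 10
xbar  <- 8500                        # assumed observed mean (Day A)

# Observed t statistic and left-tail p-value with n - 1 degrees of freedom
t_obs <- (xbar - mu0) / (s_hat / sqrt(n))
p_t   <- pt(t_obs, df = n - 1)

# Threshold of the rejection region at alpha = 0.05 and decision
c_thr  <- mu0 + qt(0.05, n - 1) * s_hat / sqrt(n)
reject <- xbar < c_thr
```

Rejecting when the mean is below `c_thr` is equivalent to rejecting when `p_t` is below 0.05.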

(14)

Example

How accurate are radon detectors of a type sold to homeowners?

University researchers placed 12 detectors in a chamber that exposed them to 105 pico-curies per liter (pCi/l) of radon

The detector readings were as follows:

91.9,97.8,111.4,122.3,105.4,95.0,103.8,99.6,96.6,119.3,104.8,101.7

Is there convincing evidence that the mean of detector readings differs from the nominal value of 105?

(15)

Model and test

Let X be the random variable modeling the detector reading. Assume X ∼ N (µ, σ2), with σ2 unknown

H0 : µ = 105 and H1 : µ ≠ 105

Test statistic under H0: $\frac{\bar X - 105}{S/\sqrt{n}}$

• p-value computation in R using t.test

Rn=c(91.9,97.8,111.4,122.3,105.4,95.0,103.8,99.6,96.6,119.3,104.8,101.7)
t.test(Rn-105) ### data centered at mu_0 (t.test tests mean = 0 by default)

	One Sample t-test

data:  Rn - 105
t = -0.31947, df = 11, p-value = 0.7554
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 -6.837503  5.104170
sample estimates:
 mean of x
-0.8666667

p-value = 0.76.

The sample test statistic depends on both x̄ and s
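Instead of centering the data, the null value can be passed to `t.test` through its `mu` argument. The sketch below is an equivalent call on the same readings; it also extracts the components of the result. In this form the confidence interval refers to µ itself rather than to µ − 105.

```r
Rn <- c(91.9, 97.8, 111.4, 122.3, 105.4, 95.0,
        103.8, 99.6, 96.6, 119.3, 104.8, 101.7)

# Two-sided one-sample t test of H0: mu = 105
res <- t.test(Rn, mu = 105)

res$statistic    # observed t statistic
res$parameter    # degrees of freedom
res$p.value      # two-sided p-value
res$conf.int     # 95% confidence interval for mu
```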

(16)

• Computation of the rejection region R0

t05=qt(0.025,11)                # 0.025 quantile of t[11] (two-sided test, alpha = 0.05)
c1=105+t05*sd(Rn)/sqrt(12)
c2=105-t05*sd(Rn)/sqrt(12)
cbind(c1,c2)
           c1       c2
[1,] 99.02916 110.9708
mean(Rn)
[1] 104.1333

R0 = {x̄ < 99.0} ∪ {x̄ > 111.0}

Note that R0 depends on the data; in fact the variance of X̄ is estimated by S2/n

The sample mean, 104.1, does not belong to R0

• Direct computation of the p-value

t_obs=(mean(Rn)-105)/(sd(Rn)/sqrt(length(Rn)));t_obs
[1] -0.3194729
pvalue=2*pt(-abs(t_obs),(length(Rn)-1));pvalue   # two-sided: both tails
[1] 0.7553532

There is no evidence to reject H0 (µ = 105)
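The two routes, rejection region and p-value, lead to the same decision. The following sketch checks this on the radon readings at level α = 0.05; the variable names are illustrative.

```r
Rn <- c(91.9, 97.8, 111.4, 122.3, 105.4, 95.0,
        103.8, 99.6, 96.6, 119.3, 104.8, 101.7)
alpha <- 0.05
n     <- length(Rn)
se    <- sd(Rn) / sqrt(n)            # estimated standard error of the mean

# Decision through the rejection region: |xbar - 105| beyond the critical distance
t_crit <- qt(1 - alpha / 2, df = n - 1)
in_R0  <- abs(mean(Rn) - 105) > t_crit * se

# Decision through the two-sided p-value
p_val  <- 2 * pt(-abs((mean(Rn) - 105) / se), df = n - 1)

in_R0 == (p_val < alpha)             # the two decisions coincide
```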
