Inferential Statistics Part B

Academic year: 2021


(1)

Inferential Statistics Part B

Eva Riccomagno, Maria Piera Rogantin

DIMA – Università di Genova

http://www.dima.unige.it/~rogantin/UnigeStat/

(2)

Part B

An introduction to hypothesis tests

• B1. Introduction

• B2. A probability aside: the binomial random variable

• B3. Rejection region

• B4. Decision errors

• B5. Formulation of the hypotheses

(3)

B1. Introduction

Example

An experiment has two possible outcomes:

1 for success, and 0 for failure

It is known that the probability of success, p, is either 0.3 or 0.7, and 20 independent trials are performed in exactly the same way.

Aim: infer the true value of p from the outcomes of the 20 trials

The sum of the outcomes is modelled by a binomial random variable X ∼ B(20, p)

Two hypotheses for p:

H0 : p = 0.3   null hypothesis
H1 : p = 0.7   alternative hypothesis

In hypothesis testing, we choose some null hypothesis H0 and we ask if the data provide sufficient evidence to reject it

(4)

B2. A probability aside. Binomial random variable

An experiment that satisfies the following four conditions can be modelled by a Binomial random variable

1. there is a fixed sample size (number of trials, n)

2. on each trial, the event of interest either occurs or does not

3. the probability of occurrence (or not) is the same for each trial (0 < p < 1)

4. trials are independent of one another

Example:

toss an unbalanced coin 10 times where, in a single toss, head has probability p and tail has probability 1 − p

What is the probability of observing the sequence HHT HHT T HHH?

p · p · (1 − p) · p · p · (1 − p) · (1 − p) · p · p · p = p^7 (1 − p)^3

It is equal to the probability of observing any other sequence with 7 heads and 3 tails.

How many sequences with exactly 7 heads?
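The counting question above can be checked numerically. The slides use R; the following pure-Python sketch is only an illustration (the head-probability p = 0.6 is a hypothetical value chosen for the example):

```python
from math import comb

# Number of distinct sequences of 10 tosses with exactly 7 heads:
print(comb(10, 7))  # 120

# Each such sequence has the same probability p^7 * (1 - p)^3,
# so the probability of "exactly 7 heads" is their sum:
p = 0.6  # hypothetical head-probability, not from the slides
print(comb(10, 7) * p**7 * (1 - p)**3)
```

This is exactly the counting argument behind the binomial probability formula that follows.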

(5)

Let X be the random variable modeling the number of successes in n independent trials

The probability that X is equal to k is:

P(X = k) = (n choose k) p^k (1 − p)^(n−k),   for k = 0, 1, . . . , n

where (n choose k) = n! / (k! (n − k)!) is the number of sequences of n elements (success or failure) with k successes

Example n=20; p=0.10; x=seq(0,n);df=dbinom(x,n,p)

k 0 1 2 3 4 5 6 7 8 . . .

P (X = k) 0.122 0.270 0.285 0.190 0.090 0.032 0.009 0.002 0.000 . . .
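The table above can be reproduced without R's dbinom; a minimal pure-Python sketch of the same formula (an illustration, not part of the original slides):

```python
from math import comb

def dbinom_py(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), the analogue of R's dbinom."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 20, 0.10
probs = [round(dbinom_py(k, n, p), 3) for k in range(9)]
print(probs)  # [0.122, 0.27, 0.285, 0.19, 0.09, 0.032, 0.009, 0.002, 0.0]
```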

(6)

Plots of the probability density functions, n = 20 and different p

[Four panels: Binomial n=20 with p = 0.1, p = 0.3, p = 0.5, p = 0.9; x from 0 to 20, probabilities from 0.00 to 0.30]

First plot: n=20; p=0.10; x=seq(0,n); df=dbinom(x,n,p)

k 0 1 2 3 4 5 6 7 8 . . .

P (X = k) 0.122 0.270 0.285 0.190 0.090 0.032 0.009 0.002 0.000 . . .

(7)

Example n=20; p=0.10; x=seq(0,n);df=dbinom(x,n,p)

k 0 1 2 3 4 5 6 7 8 . . .

P (X = k) 0.122 0.270 0.285 0.190 0.090 0.032 0.009 0.002 0.000 . . .

Functions for random variables in R

d<ran-var-name>(x,<other parameters>)   density/probability function at x

p<ran-var-name>(x,<other parameters>)   cumulative distribution function at x

q<ran-var-name>(a,<other parameters>)   a-th quantile

r<ran-var-name>(n,<other parameters>)   random sample of size n
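For readers without R at hand, the four function families can be mimicked in a few lines of Python. This is a hedged sketch (the names dbinom, pbinom, qbinom, rbinom are borrowed from R for readability):

```python
import random
from math import comb

def dbinom(k, n, p):
    """Density: P(X = k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def pbinom(x, n, p):
    """Cumulative distribution: P(X <= x)."""
    return sum(dbinom(k, n, p) for k in range(x + 1))

def qbinom(a, n, p):
    """Quantile: smallest x with P(X <= x) >= a."""
    x = 0
    while pbinom(x, n, p) < a:
        x += 1
    return x

def rbinom(N, n, p):
    """Random sample of size N: each draw is a sum of n Bernoulli trials."""
    return [sum(random.random() < p for _ in range(n)) for _ in range(N)]

print(round(pbinom(9, 20, 0.3), 6))  # 0.952038
print(qbinom(0.95, 20, 0.3))         # 9
```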

(8)

Binomial random variable

dbinom(x,n,p)   pbinom(x,n,p)   qbinom(a,n,p)   rbinom(N,n,p)

n=20; p=0.30
x=seq(0,n)
df=dbinom(x,n,p)
cdf=pbinom(x,n,p)
cbind(x,round(df,6),round(cdf,6))
par(mfrow=c(2,1))
plot(x,df,type="h",col="blue",lwd=3)
plot(x,cdf,type="s",col="blue",lwd=3)
par(mfrow=c(1,1))

       x
 [1,]  0 0.000798 0.000798
 [2,]  1 0.006839 0.007637
 [3,]  2 0.027846 0.035483
 [4,]  3 0.071604 0.107087
 [5,]  4 0.130421 0.237508
 [6,]  5 0.178863 0.416371
 [7,]  6 0.191639 0.608010
 [8,]  7 0.164262 0.772272
 [9,]  8 0.114397 0.886669
[10,]  9 0.065370 0.952038
[11,] 10 0.030817 0.982855
[12,] 11 0.012007 0.994862
[13,] 12 0.003859 0.998721
[14,] 13 0.001018 0.999739
[15,] 14 0.000218 0.999957
[16,] 15 0.000037 0.999994
[17,] 16 0.000005 0.999999
[18,] 17 0.000001 1.000000
[19,] 18 0.000000 1.000000
[20,] 19 0.000000 1.000000
[21,] 20 0.000000 1.000000

[Two panels: df against x (probability mass function) and cdf against x (cumulative distribution function), 0 ≤ x ≤ 20]

(9)

Back to the Hypothesis Tests and to the first example

Experiment with two possible outcomes: 1 for success, and 0 for failure

The probability of success, p, is either 0.3 or 0.7, and 20 independent trials are performed in exactly the same way

Aim: infer the true value of p from the outcomes of the 20 trials

The sum of the outcomes is modelled by a binomial random variable X ∼ B(20, p)

Sometimes such a variable is denoted by Sn

Two hypotheses for p:

H0 : p = 0.3   null hypothesis
H1 : p = 0.7   alternative hypothesis

In hypothesis testing, we choose some null hypothesis H0 and we ask if the data provide sufficient evidence to reject it

(10)

B3. Rejection region

Plots of probability density functions:

- under H0 : p = 0.3 (red)
- under H1 : p = 0.7 (black)

Which of the two hypotheses is more supported by the data?

[Plot: the two probability mass functions over 0–20, probabilities from 0.00 to 0.20]

We choose a threshold s and reject H0 if in the sample there are more 1's than s

Does rejecting H0 mean accepting H1?

(11)

Significance level α

In order to determine s, the researcher typically fixes the significance level of the test, α ∈ (0, 1).

In practice, he/she assumes H0 is true and computes the smallest s such that:

P(X > s | p = 0.3) < α

Typically α = 0.05, 0.01, 0.1.

Meaning: if we obtain a "high" number of successes, we consider it unlikely that the data are the realization of a random variable with p = 0.3
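The search for the smallest s can be written out explicitly. A pure-Python sketch with the example's numbers (n = 20, p = 0.3, α = 0.05; the slides use R's qbinom for the same computation):

```python
from math import comb

def pbinom(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p0, alpha = 20, 0.3, 0.05

# smallest s such that P(X > s | p = 0.3) < alpha
s = 0
while 1 - pbinom(s, n, p0) >= alpha:
    s += 1
print(s)  # 9
```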

(12)

The threshold s is the 1 − α quantile of X

[Two plots under p = 0.3: the probability density function and the cumulative distribution function, with the threshold s marked]

> s=qbinom(0.95,20,0.3);s
[1] 9

Decision rule here: "if in 20 trials we obtain more than 9 successes, we reject H0 (p = 0.3)"

Terminology:

- the rejection region of the test is {10, 11, 12, . . . , 20}

- the test statistic is the number of ones in the sample

(13)

Review and comments

A hypothesis test is formed by:

- a null and an alternative hypothesis

- a test statistic (a function of the sample variables) and its observed value

- a rejection region

The null is a default theory, and we ask whether the data provide sufficient evidence to reject the theory. If not, we retain it. We also say that we fail to reject the null.

In the example the test statistic is Sn and its observed value is the number of occurrences of one in the observed sample.

An observed value of Sn larger than 9
- does not support the null hypothesis
- is evidence that the alternative hypothesis holds.

Exercise. What is the probability of observing Sn > 9 under H0? And under H1? Did we need to compute both of them starting from the distributions of Sn under H0 and H1? Note the

(14)

Rejecting the null does not necessarily imply accepting the alternative. We accept the alternative only when the possible decisions are limited to the null and the alternative.

In order to investigate whether or not the data support the alternative, the test should be reformulated.

Go back to the plots of the probability density functions to see what changes:

- under H0 : p = 0.3 (red)
- under H1 : p = 0.7 (black)

[Plot: the two probability mass functions over 0–20, probabilities from 0.00 to 0.20]

(15)

Behaviour in hypothesis testing

From the book: Larry Wasserman. All of Statistics. Springer. 2010. Chapter 6.1 p. 87

Hypothesis testing is like a legal trial. We assume someone is innocent unless the evidence strongly suggests that he is guilty. Similarly, we retain H0 unless there is strong evidence to reject H0

[we look in the data for evidence against H0]

(16)

B4. Decision errors

The decisions taken are affected by errors. Two types of error:

type I error: rejecting H0 when H0 is true
type II error: retaining H0 when H0 is false

Usually the experimenter sets a maximum allowed probability α for the type I error (α = 0.1, 0.05, 0.01)

In the example with 20 tosses of a biased coin

H0 : p = 0.3    H1 : p = 0.7

R0 = {more than 9 successes}

α was set to 0.05 (sum of the red probabilities in the plot)

[Plot: the two probability mass functions over 0–20, split into "retain H0" and "reject H0" at the threshold]

(17)

The probability of the type II error is denoted by β (the probability of retaining H0 when H1 is true)

It can happen that in the 20 trials you get fewer than 10 successes even if the true probability is 0.7 (sum of the black probabilities in the plot)

[Plot: the two probability mass functions over 0–20, split into "retain H0" and "reject H0" at the threshold]

Here β is the cumulative distribution function of X under H1 evaluated at s:

β = P(X ≤ s | p = 0.7)

In our case: β = 0.017

> b=pbinom(9,20,0.7);round(b,3)
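The same computation in pure Python, mirroring the pbinom call above (a sketch for readers without R):

```python
from math import comb

def pbinom_py(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p), the analogue of R's pbinom."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

beta = pbinom_py(9, 20, 0.7)  # P(X <= s | p = 0.7), with s = 9
print(round(beta, 3))         # 0.017
```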

(18)

Types of error (continue)

DECISION PROBABILITY

H0 retained H0 rejected H0 retained H0 rejected H0 true correct type I

1 − α α

error H0 false type II

correct β 1 − β

error

Cumulative distribution func- tion plots under H0 and H1

The threshold s is indicated.

Exercise: locate α and β

0 5 10 15 20

0.00.20.40.60.81.0

0 5 10 15 20

0.00.20.40.60.81.0

(19)

Simulation in R

H0 : p = 0.3,   H1 : p = 0.7
T : number of successes;   critical value s = 9

Simulate a binomial experiment

assuming H0

> rbinom(1,20,0.3)
[1] 7

Correct decision

assuming H1

> rbinom(1,20,0.7)
[1] 12

Correct decision

Simulate 100 binomial experiments and count how many times the test returns the correct decision.

assuming H0

> a=rbinom(100,20,0.3)
> length(a[a<=9])
[1] 94

Correct decision in 94% of cases

assuming H1

> b=rbinom(100,20,0.7)
> length(b[b>9])
[1] 99

Correct decision in 99% of cases
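The same Monte Carlo check can be sketched in pure Python. The counts vary from run to run; the fractions should land near 1 − P(type I error about the threshold) ≈ 0.95 under H0 and near 1 − β ≈ 0.98 under H1 (the seed is an arbitrary choice for reproducibility):

```python
import random

random.seed(1)  # arbitrary seed, only for reproducibility

def rbinom(N, n, p):
    """N draws from Binomial(n, p), the analogue of R's rbinom."""
    return [sum(random.random() < p for _ in range(n)) for _ in range(N)]

s, N = 9, 1000

a = rbinom(N, 20, 0.3)               # simulate under H0
correct_H0 = sum(x <= s for x in a)  # H0 retained: correct decision

b = rbinom(N, 20, 0.7)               # simulate under H1
correct_H1 = sum(x > s for x in b)   # H0 rejected: correct decision

print(correct_H0 / N, correct_H1 / N)
```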

(20)

B5. Formulation of the hypotheses

Let X1, . . . , Xn ∼ F be a random sample

Example.

Null hypothesis: the disease rate is the same in groups A and B
Alternative hypothesis: the disease rate is higher in group A

The rejection region R0 is an appropriate subset of the outcomes.

Let x = (x1, . . . , xn) be the sample values. If x ∈ R0 we reject the null hypothesis, otherwise we do not reject it

x ∈ R0 ⇒ reject H0

x /∈ R0 ⇒ retain (do not reject) H0

The test statistic T, i.e. an appropriate function of the sample variables, allows us to determine R0

(21)

Examples.

1. A new drug should lower the probability of side effects. The best drug on the market has probability p = 0.2 of side effects

H0: p = 0.2    H1: p < 0.2

2. A new ductile iron should have a mean Brinell hardness greater than 170 megapascal (the ductile irons currently used have a mean hardness less than 170)

H0: µ ≤ 170    H1: µ > 170

3. The proportion of left-handed USA Presidents is not 1/4, the proportion in the general population

H0: p = 1/4    H1: p ≠ 1/4

Reformulate 3. to investigate whether it is higher than in the general population

(22)

One-sided and two-sided tests

- Example 1: one-sided test (left)
  H0: p = 0.2    H1: p < 0.2

- Example 2: one-sided test (right)
  H0: µ ≤ 170    H1: µ > 170

- Example 3: two-sided test
  H0: p = 1/4    H1: p ≠ 1/4

Whether a test is "left" or "right" depends on the test statistic

Simple and composite hypotheses

- H0: p = 1/4    simple

- H0: µ ≤ 170    composite

(23)

The "form" of R0 depends on H1

H0: p = 0.3,   0.05 = P(X ∈ R0 | H0)

[Three plots of the probability mass function under H0: p = 0.3, with the rejection region highlighted for H1: p > 0.3, for H1: p < 0.3, and for H1: p not 0.3]

H1 : p > 0.3   one-sided (right)   R0 = {x > 9}
s=qbinom((1-a),n,p)

H1 : p < 0.3   one-sided (left)   R0 = {x ≤ 2}
s=qbinom(a,n,p)-1

H1 : p ≠ 0.3   two-sided   R0 = {x ≤ 1} ∪ {x > 10}
s1=qbinom((a/2),n,p)-1
s2=qbinom((1-a/2),n,p)
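The three thresholds can be checked with a quantile function written by hand. A pure-Python sketch (the name qbinom is borrowed from R; n = 20, p = 0.3, a = 0.05 as in the slides):

```python
from math import comb

def pbinom(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def qbinom(a, n, p):
    """Smallest x with P(X <= x) >= a, the analogue of R's qbinom."""
    x = 0
    while pbinom(x, n, p) < a:
        x += 1
    return x

n, p, a = 20, 0.3, 0.05

s_right = qbinom(1 - a, n, p)      # H1: p > 0.3  -> R0 = {x > 9}
s_left  = qbinom(a, n, p) - 1      # H1: p < 0.3  -> R0 = {x <= 2}
s1      = qbinom(a / 2, n, p) - 1  # H1: p != 0.3 ->
s2      = qbinom(1 - a / 2, n, p)  #   R0 = {x <= 1} U {x > 10}
print(s_right, s_left, s1, s2)     # 9 2 1 10
```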

(24)

R coding for plots of page 9

n=20; p_0=0.3; p_1=0.7; a=0.05

s=qbinom((1-a),n,p_0)   ## one-sided right test
x=0:n
y_0=dbinom(x,n,p_0)
y_1=dbinom(x,n,p_1)

plot(x+0.1,y_0,xlim=c(0,n),ylim=c(0,max(y_0,y_1)),type="h",
     lwd=3,xlab=" ",ylab=" ",col="red")
par(new=T)
plot(x-0.1,y_1,xlim=c(0,n),ylim=c(0,max(y_0,y_1)),type="h",
     lwd=3,xlab=" ",ylab=" ")
abline(h=0)
abline(v=s+.5, col="blue",lwd=3)   ## one-sided right test

## the bars are drawn slightly shifted
## (x+0.1 and x-0.1) for better visualisation

Exercise. What happens if you set the parameters to:

- n = 20; p0 = 0.3; p1 = 0.5; a = 0.05
- n = 20; p0 = 0.3; p1 = 0.7; a = 0.02
- n = 20; p0 = 0.1; p1 = 0.5; a = 0.05 (shift both p left)
- n = 20; p0 = 0.5; p1 = 0.7; a = 0.05 (swap p0–p1: one-sided left test)

Compute β in each case

(25)

R coding for plots of page 22

n=20; p=0.3; a=0.05
par(mfrow=c(3,1))

s=qbinom((1-a),n,p)                 ## one-sided right
x1=seq(0,s); x2=seq(s+1,n)
y1=dbinom(x1,n,p); y2=dbinom(x2,n,p)
plot(x2,y2,xlim=c(0,n),ylim=c(0,.2),type="h",lwd=3,xlab=" ",ylab=" ",col="blue")
par(new=T)
plot(x1,y1,xlim=c(0,n),ylim=c(0,.2),type="h",lwd=3,
     xlab=" ",ylab=" ",col="red", main="H0: p=0.3 -- H1: p>0.3")
abline(h=0); abline(v=s+.5, col="black",lwd=3)

s=qbinom(a,n,p); s=s-1; s           ## one-sided left
x1=seq(0,s); x2=seq(s+1,n)
y1=dbinom(x1,n,p); y2=dbinom(x2,n,p)
plot(x2,y2,xlim=c(0,n),ylim=c(0,.2),type="h",lwd=3,xlab=" ",ylab=" ",col="red")
par(new=T)
plot(x1,y1,xlim=c(0,n),ylim=c(0,.2),type="h",lwd=3,
     xlab=" ",ylab=" ",col="blue", main="H0: p=0.3 -- H1: p<0.3")
abline(h=0); abline(v=s+.5, col="black",lwd=3)

s1=qbinom((a/2),n,p); s1=s1-1; s1   ## two-sided
s2=qbinom((1-a/2),n,p); s2
x1=seq(0,s1); x2=seq(s1+1,s2); x3=seq(s2+1,20)
y1=dbinom(x1,n,p); y2=dbinom(x2,n,p); y3=dbinom(x3,n,p)
plot(x2,y2,xlim=c(0,n),ylim=c(0,.2),type="h",lwd=3,xlab=" ",ylab=" ",col="red")
par(new=T)
plot(x1,y1,xlim=c(0,n),ylim=c(0,.2),type="h",lwd=3,xlab=" ",ylab=" ",col="blue")
par(new=T)
plot(x3,y3,xlim=c(0,n),ylim=c(0,.2),type="h",lwd=3,
     xlab=" ",ylab=" ",col="blue", main="H0: p=0.3 -- H1: p not 0.3")
