


Statistical models A.Y. 2014/15

Written exam of June 15, 2015.

1. Consider the general linear model $Y = X\beta + E$, where $Y = (y_1, \dots, y_n)$ is a vector in $\mathbb{R}^n$, $X$ is an $n \times p$ matrix and $E = (E_1, \dots, E_n)$ is a vector of random variables with mean 0.

(a) Show that $\hat\beta = (X^t X)^{-1} X^t Y$ is an unbiased estimator of $\beta$. Also compute $V(\hat\beta)$. This choice is generally called the least squares method; what exactly does this mean?

(b) Two possible motivations for the choice of $\hat\beta$ are maximum likelihood estimation and the Gauss-Markov theorem. State the two results precisely (proofs are not necessary but, if time allows, they are welcome), together with the assumptions on the errors $E$ required by each.

(c) Let $\hat\varepsilon = Y - X\hat\beta$ be the observed residuals; prove that $V(\hat\varepsilon_i) = \sigma^2 (1 - H_{ii})$, where $H = X (X^t X)^{-1} X^t$ and $\sigma^2 = V(E_j)$ for each $j$. Explain why, if $H_{ii}$ is close to 1 for some value of $i$, this result leads us to expect that $\hat y_i$ will be close to $y_i$.

(d) Are $\hat\varepsilon_i$ and $\hat\varepsilon_j$ independent for $i \neq j$? [Hint: check the proof of the previous result.]
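As a numerical illustration of the quantities appearing in question 1(c), the following Python sketch computes the diagonal of $H = X (X^t X)^{-1} X^t$ for a simple regression with an intercept. The data are hypothetical (four clustered values of the predictor and one far away), chosen only to show a case where one $H_{ii}$ is close to 1. It is not a proof of the requested result.

```python
# Hypothetical toy data (not from the exam): four clustered x values, one far away.
x = [1.0, 2.0, 3.0, 4.0, 20.0]
n = len(x)

# Design matrix X with an intercept column: row i is (1, x_i), so p = 2.
X = [[1.0, xi] for xi in x]

# X^t X = [[n, sum x], [sum x, sum x^2]] is 2x2, so it can be inverted in closed form.
sx = sum(x)
sxx = sum(xi * xi for xi in x)
det = n * sxx - sx * sx
inv = [[sxx / det, -sx / det],
       [-sx / det, n / det]]


def hat_entry(i, j):
    """Entry H_ij = x_i^t (X^t X)^{-1} x_j, where x_i and x_j are rows of X."""
    ri, rj = X[i], X[j]
    return sum(ri[a] * inv[a][b] * rj[b] for a in range(2) for b in range(2))


# Diagonal entries H_ii.
leverages = [hat_entry(i, i) for i in range(n)]

# Factor multiplying sigma^2 in V(residual_i) = sigma^2 * (1 - H_ii).
resid_var_factor = [1.0 - h for h in leverages]

for xi, h, f in zip(x, leverages, resid_var_factor):
    print(f"x = {xi:5.1f}   H_ii = {h:.3f}   1 - H_ii = {f:.3f}")
```

In this toy example the isolated observation has by far the largest $H_{ii}$, hence the smallest residual variance, and the $H_{ii}$ sum to $p = 2$.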

2. Consider a linear model with response variable $Y$ and two predictor variables: $X_1$, quantitative, and $X_2$, qualitative (with three values, say A, B and C).

(a) Write down (in mathematical form) the assumptions of the additive model (Y ~ X1 + X2 in R).

(b) Give a graphical representation of this model.

(c) What are the (null and alternative) hypotheses that are routinely tested in this model?

(d) Write down (in mathematical form) the assumptions of the model with interaction (Y ~ X1 * X2 in R).

(e) Give a graphical representation of this model.

(f) Does this model differ (and, if so, how) from performing separate linear regressions of $Y$ on $X_1$ (Y ~ X1 in R) for each of the subsets $\{X_2 = A\}$, $\{X_2 = B\}$, $\{X_2 = C\}$?

(g) How can we decide whether to choose the model with or without interaction?
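To make the two R formulas in question 2 concrete, the Python sketch below builds one row of the design matrix for each model. It assumes treatment (dummy) coding with the first level, A, as the baseline, which is R's default for unordered factors. The function name `design_row` and the column order are illustrative choices, not part of the exam.

```python
LEVELS = ["A", "B", "C"]  # assumption: "A" is the baseline level


def design_row(x1, x2, interaction=False):
    """Return one design-matrix row for Y ~ X1 + X2, or Y ~ X1 * X2 if interaction=True.

    Columns: intercept, X1, indicator(X2 == B), indicator(X2 == C),
    then (interaction only) X1 * indicator(B), X1 * indicator(C).
    """
    if x2 not in LEVELS:
        raise ValueError(f"unknown level: {x2!r}")
    # One indicator per non-baseline level.
    dummies = [1.0 if x2 == level else 0.0 for level in LEVELS[1:]]
    row = [1.0, float(x1)] + dummies
    if interaction:
        # Products of X1 with each indicator give level-specific slope terms.
        row += [float(x1) * d for d in dummies]
    return row


print(design_row(2.0, "A"))                    # additive model, baseline level
print(design_row(2.0, "C"))                    # additive model, level C
print(design_row(2.0, "C", interaction=True))  # interaction model, level C
```

The additive row has four columns and the interaction row has six, so the interaction model has two extra coefficients.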

3. On a dataset we perform the regression of ozone concentration on wind speed (Wind) and month (with values from 5 to 9). Using R, we obtain the following regression table:

Coefficients:

             Estimate Std. Error t value Pr(>|t|)
(Intercept)    50.748     15.748   3.223  0.00169 **
Wind           -2.368      1.316  -1.799  0.07484 .
month6        -41.793     31.148  -1.342  0.18253
month7         68.296     20.995   3.253  0.00153 **
month8         82.211     20.314   4.047 9.88e-05 ***
month9         23.439     20.663   1.134  0.25919
Wind:month6     4.051      2.490   1.627  0.10680
Wind:month7    -4.663      2.026  -2.302  0.02329 *
Wind:month8    -6.154      1.923  -3.201  0.00181 **
Wind:month9    -1.874      1.820  -1.029  0.30569

Signif. codes: *** < 0.001 ** < 0.01 * < 0.05 . < 0.1

Residual standard error: 23.12 on 106 degrees of freedom
Multiple R-squared: 0.5473, Adjusted R-squared: 0.5089
F-statistic: 14.24 on 9 and 106 DF, p-value: 7.879e-15

Describe clearly the final model obtained from this analysis. It is advisable to write down the model separately for the observations belonging to each month; in other words, write formulae Ozone = . . . if month = 5, Ozone = . . . if month = 6, and so on.

Describe precisely which tests have been performed (and reported with a p-value); discuss which results appear to be significant, what the potential problems of this analysis are, and how one could proceed.
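The arithmetic behind the month-by-month formulae can be sketched in Python as follows. The sketch assumes that the table uses R's default treatment coding with month 5 as the baseline, which is consistent with the absence of month5 rows. It only reads the intercept and the Wind slope of the fitted model for each month; it does not decide which terms should be kept.

```python
# Estimates copied from the regression table above.
coef = {
    "(Intercept)": 50.748,
    "Wind": -2.368,
    "month6": -41.793,
    "month7": 68.296,
    "month8": 82.211,
    "month9": 23.439,
    "Wind:month6": 4.051,
    "Wind:month7": -4.663,
    "Wind:month8": -6.154,
    "Wind:month9": -1.874,
}


def line_for_month(m):
    """Return (intercept, slope) of the fitted line Ozone = intercept + slope * Wind.

    Assumption: month 5 is the baseline, so it has no extra terms and
    the .get(..., 0.0) defaults apply to it.
    """
    intercept = coef["(Intercept)"] + coef.get(f"month{m}", 0.0)
    slope = coef["Wind"] + coef.get(f"Wind:month{m}", 0.0)
    return intercept, slope


for m in range(5, 10):
    a, b = line_for_month(m)
    print(f"month = {m}: Ozone = {a:.3f} + ({b:.3f}) * Wind")
```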

As can be seen from the table, the variable “month” has been treated as qualitative. It could also have been treated as quantitative; what would the difference in the resulting model have been? (I just want to see the structure of the model, not the numerical values.)
