Academic year: 2021

Course title: Basic statistics for research

Instructors: Eva Riccomagno and Maria Piera Rogantin

Description: The course consists of eight four-hour meetings and covers two main topics, exploratory data analysis and statistical inference. Each meeting includes a theoretical session and a practical session based on the software R. The practical sessions are based on datasets from case studies used to exemplify the theory. Participants are encouraged to provide datasets related to their research interests. Weekly assignments will be set to practice the techniques illustrated. Assignments will include bookwork exercises and data analyses with the software.

Essential probability concepts are scattered throughout the second part.

At the end of the course students will be able to perform the learnt techniques on their own datasets and to decide on the suitability of the learnt techniques for the analysis of their datasets.

Course structure

First part: univariate and multivariate exploratory data analysis

Lecture 1 [October 28th]: Introduction to the course and to the software R: quantitative and qualitative variables, basic data representation including graphical representations, the data matrix. Categorical and qualitative data. Row and column profiles. Barplot. Data structures in R, reading datasets in R, operations with data, for loops.

Lecture 2 [November 4th]: Analysis of univariate data and related computations in R: frequency distributions (percentage distribution, cumulative distribution); centrality measures (mean, median and mode); dispersion measures (range, percentiles and quantiles, variance, standard deviation), also in subgroups. Histograms, dot-plots, box-plots. Correlation and Pearson's r for bivariate data.
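As a rough illustration of the computations listed for Lecture 2, the sketch below uses only base R. The choice of the built-in iris dataset and all variable names are assumptions made for this example; they are not taken from the course material.

```r
# Univariate summaries on the built-in iris data (illustrative choice of dataset)
x <- iris$Sepal.Length

# Frequency distribution over class intervals:
# counts, percentages and cumulative percentages
classes <- cut(x, breaks = seq(4, 8, by = 1))
counts <- table(classes)
pct <- 100 * prop.table(counts)
cum_pct <- cumsum(pct)

# Centrality measures (base R has no mode function, so take the most frequent value)
centre <- c(mean = mean(x), median = median(x))
mode_value <- as.numeric(names(which.max(table(x))))

# Dispersion measures
spread <- c(range = diff(range(x)), var = var(x), sd = sd(x))
quartiles <- quantile(x, probs = c(0.25, 0.5, 0.75))

# The same measure computed within subgroups
mean_by_species <- tapply(x, iris$Species, mean)

# Pearson correlation for a pair of variables
r <- cor(iris$Petal.Length, iris$Petal.Width, method = "pearson")
```

The graphical summaries can be drawn with hist(x) for a histogram, stripchart(x, method = "stack") for a dot-plot, and boxplot(Sepal.Length ~ Species, data = iris) for box-plots by subgroup.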

Lecture 3 [November 11th]: Cluster analysis: distance measures, hierarchical aggregation, aggregation index and dendrogram, decomposition of inertia. R code and output interpretation.
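A minimal sketch of the cluster analysis steps named for Lecture 3, assuming Euclidean distances, Ward aggregation and a cut into three groups. These choices, the iris dataset and the variable names are illustrative assumptions, not the course's prescribed settings.

```r
# Standardise the numeric columns so that no variable dominates the distances
X <- scale(iris[, 1:4])

# Distance matrix and hierarchical aggregation
d <- dist(X, method = "euclidean")
tree <- hclust(d, method = "ward.D2")

# tree$height holds the aggregation index at each merge;
# plot(tree) draws the dendrogram
groups <- cutree(tree, k = 3)

# Decomposition of inertia: total = within-group + between-group sums of squares
centre_all <- colMeans(X)
parts <- lapply(split(as.data.frame(X), groups), as.matrix)
within_ss <- sum(sapply(parts, function(g) sum(sweep(g, 2, colMeans(g))^2)))
between_ss <- sum(sapply(parts, function(g) {
  nrow(g) * sum((colMeans(g) - centre_all)^2)
}))
total_ss <- sum(sweep(X, 2, centre_all)^2)
```

The ratio between_ss / total_ss gives the share of inertia explained by the partition, which is one way to compare cuts of the dendrogram at different heights.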

Second part: statistical inference

Lecture 4 [November 18th]: Fundamental concepts in parametric inference (point estimation, confidence sets and hypothesis testing). Point estimate of mean and proportions. Law of large numbers. The binomial random variable. Introduction to statistical hypothesis testing (formulation of null and alternative hypotheses, choice of the test statistic, significance level and rejection region). Type I and II errors.

Lecture 5 [November 25th]: Hypothesis tests on a population mean. The normal random variable and the central limit theorem. Composite hypothesis, power function and p-value. Sample size. The z-test, t-test, Wald-test. Test on a proportion.
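The tests listed for Lecture 5 can be sketched in base R as follows. The simulated sample, the hypothesised values (mean 10, proportion 0.25), the counts (18 out of 50) and the assumed known standard deviation are all invented for illustration.

```r
set.seed(1)                          # reproducible simulated sample (illustrative data)
x <- rnorm(30, mean = 10.5, sd = 2)

# t-test of H0: mu = 10 against a two-sided alternative (variance unknown)
tt <- t.test(x, mu = 10)

# z-test computed by hand, assuming the standard deviation sigma = 2 is known
sigma <- 2
z <- (mean(x) - 10) / (sigma / sqrt(length(x)))
p_z <- 2 * pnorm(-abs(z))

# Test on a proportion: 18 successes out of 50 trials, H0: p = 0.25
p_hat <- 18 / 50
exact <- binom.test(18, 50, p = 0.25)                    # exact binomial test
z_wald <- (p_hat - 0.25) / sqrt(p_hat * (1 - p_hat) / 50) # Wald statistic: SE at the estimate
p_wald <- 2 * pnorm(-abs(z_wald))

# Sample size giving 80% power to detect a shift of 1 when sd = 2, at the 5% level
n_needed <- power.t.test(delta = 1, sd = 2, sig.level = 0.05, power = 0.80,
                         type = "one.sample")$n
```

Base R has no dedicated z-test function, which is why the statistic is computed directly from its definition here.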

Lecture 6 [December 12th]: Test for the equality of means (paired and non-paired samples). More on normal random variable. Multiple tests. Abuse and misuse of statistical hypothesis testing in scientific research. Confidence intervals (one and two-sided, confidence levels, relationship with tests). Distribution-free hypothesis tests I: example of distribution-free statistics, the sign-test, Wilcoxon-Mann-Whitney test.

Lecture 7 [December 16th]: Distribution-free hypothesis tests II: goodness-of-fit tests (chi-square goodness-of-fit tests, one and two sample Kolmogorov-Smirnov goodness-of-fit tests, warnings on the use of non-parametric testing procedures). Linear models and anova I: introduction, inference on the coefficients, inference on the mean responses, analysis of the residuals.

Lecture 8 [December 20th]: Linear models and anova II: tests of subsets of coefficients, prediction of the response and related error. ANOVA: univariate (one-way, two-way for crossed factors and two-way for nested factors), the Kruskal-Wallis test. MANOVA: multivariate anova with and without repeated measures.
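A short sketch of some linear model and ANOVA steps from Lectures 7 and 8, using base R only. The built-in mtcars and PlantGrowth datasets, the chosen predictors and the new observation are assumptions made for this example.

```r
# Linear model on the built-in mtcars data (illustrative choice)
fit_full <- lm(mpg ~ wt + hp + qsec, data = mtcars)
fit_small <- lm(mpg ~ wt, data = mtcars)

# Inference on the coefficients, and residuals for diagnostic checks
coef_table <- summary(fit_full)$coefficients
res <- residuals(fit_full)

# F-test on the subset of coefficients (hp, qsec) by comparing nested models
subset_test <- anova(fit_small, fit_full)

# Prediction of a new response with a 95% prediction interval
new_car <- data.frame(wt = 3, hp = 110, qsec = 18)
pred <- predict(fit_full, newdata = new_car, interval = "prediction", level = 0.95)

# One-way ANOVA and its distribution-free counterpart, the Kruskal-Wallis test
one_way <- anova(lm(weight ~ group, data = PlantGrowth))
kw <- kruskal.test(weight ~ group, data = PlantGrowth)
```

Comparing one_way with kw on the same data shows how the parametric and rank-based tests can be run side by side before deciding which assumptions are tenable.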

Notes: Slides and datasets are available at http://www.dima.unige.it/ rogantin/IIT/ as well as bibliographical references and some of the assignments.

December 19, 2016
