Introductory Statistics-E2

Numbering Code U-LAS11 10002 LE55 Year/Term 2021 ・ Second semester
Number of Credits 2 Course Type Lecture
Target Year All students Target Student For all majors
Language English Day/Period Tue.2
Instructor name VANDENBON, Alexis (Institute for Frontier Life and Medical Sciences Program-Specific Senior Lecturer)
Outline and Purpose of the Course Statistics is arguably the most important science in the world, because every other field of science depends upon it. Nowadays, science is becoming increasingly driven by large amounts of data. The key problem is how to extract knowledge from this data. Statistical analysis is a necessary step in solving this problem. This course will introduce the theory behind basic statistics and practical applications.
Course Goals Students will learn basic concepts in statistics and how to apply them to real datasets. Students will develop a feeling for critical thinking when faced with data, be able to make hypotheses, and suggest relevant ways to test them.
Schedule and Contents The course will be offered according to the plan below. If face-to-face lectures are not possible because of the pandemic, the course will be online (“on demand”) and I will add new course material (videos and slides) on PandA. I will also hold a weekly Zoom meeting to take questions.

Lectures 1 and 2. Introduction to statistics and data analysis. Statistics in the context of the general process of investigation. Introduction to numerical and categorical data. Simple methods of visual inspection (scatter plots, histograms, etc.) and summary statistics.
Lectures 3 and 4. Probability. Formal introduction to probability, random variables, probability distributions, independent and dependent variables, and conditional, marginal, and joint probability.
Lecture 5. Distributions of random variables. Introduction to the normal distribution and its properties.
Lectures 6 and 7. Foundations for inference. We will discuss the principles of parameter inference, and the reliability of parameter estimates, including standard errors and confidence intervals. We will also introduce hypothesis testing and p-values based on these principles.
Lectures 8 and 9. The Central Limit Theorem and inference for numerical data. Practical applications, and the t-test.
Lectures 10 and 11. Inference for categorical data. We examine proportions, their confidence intervals, hypothesis testing, and comparison.
Lecture 12. Introduction to linear regression. We will cover line fitting, residuals, correlation, and least squares regression. The assumptions, interpretation, and weaknesses of linear regression will be introduced.
Lecture 13. Multiple and logistic regression. We expand the principles of simple linear regression to cases with many predictors (multiple regression), and cases where the outcomes are binary categorical (logistic regression).
Lecture 14. Review of course material.
Lecture 15. Final examination, if the COVID-19 situation allows it. If a face-to-face examination is impossible, the final examination will be replaced by a number of smaller assignments.
Lecture 16. Feedback
Evaluation Methods and Policy Grading will be based on a final examination (50%) and small assignments (50%). If the COVID-19 situation does not allow a face-to-face examination, the grading will be based completely on assignments (100%).
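As an illustration of the weighting described above, the following minimal sketch combines the two components. It assumes both scores are expressed on a 0-100 scale; the function name `course_score` and its parameters are hypothetical and not part of any official grading tool.

```python
def course_score(assignments, final_exam=None):
    """Combine component scores (each assumed to be on a 0-100 scale)."""
    if final_exam is None:
        # No face-to-face examination: assignments count for 100%.
        return float(assignments)
    # Regular case: final examination 50%, small assignments 50%.
    return 0.5 * final_exam + 0.5 * assignments


print(course_score(80, 70))  # regular case
print(course_score(80))      # assignments-only case
```

For example, an assignment average of 80 and an examination score of 70 would combine to 75 in the regular case, while the same assignment average alone would give 80 if no examination is held.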
Course Requirements None
Study outside of Class (preparation and review) The course will follow a textbook. At the end of each lecture I will specify the sections to read before the next lecture.
Textbooks/References OpenIntro Statistics (Fourth Edition), Diez, Cetinkaya-Rundel, and Barr (OpenIntro, Inc.), ISBN: 978-1943450077. The course lectures will follow the content of this textbook. Please note that this textbook is also freely (legally) available for download at