Statistical Genetics II
Numbering Code | G-MED41 8S005 LE87 | Year/Term | 2022 ・ Intensive, Second semester | |
---|---|---|---|---|
Number of Credits | 2 | Course Type | lecture and seminar | |
Target Year | Doctoral students | Target Student | ||
Language | Japanese and English | Day/Period | Intensive | |
Instructor name | YAMADA RYO (Graduate School of Medicine Professor) | |||
Outline and Purpose of the Course |
Days and hours (1st week of February; Mon, Tue, Wed): (1) February 7th 8:45-10:15; (2) February 7th 10:30-12:00; (3) February 7th 13:00-14:30; (4) February 7th 14:45-16:15; (5) February 7th 16:30-18:00; (6) February 8th 8:45-10:15; (7) February 8th 10:30-12:00; (8) February 8th 13:00-14:30; (9) February 8th 14:45-16:15; (10) February 8th 16:30-18:00; (11) February 9th 8:45-10:15; (12) February 9th 10:30-12:00; (13) February 9th 13:00-14:30; (14) February 9th 14:45-16:15; (15) February 9th 16:30-18:00. This course consists of three components for mastering the basics of statistical genetics: (1) basic mathematics, (2) basics of statistics, and (3) application of statistics to genetic studies. The course divides these three components into six modules and offers one module each semester. Basic mathematics A: Linear algebra and graph theory. Basic mathematics B: Calculus and information geometry. Basics of statistics A: Data types and statistical tests. Basics of statistics B: Inference. Application of statistics A: Statistical aspects of Mendelian traits and cancer syndromes. Application of statistics B: Statistical aspects of complex genetic traits and gene expression biomarkers. The schedule plan is as follows. 2021: 1st semester Basic mathematics B, 2nd semester Application A. 2022: 1st semester Basic mathematics A, 2nd semester Basics of statistics A. 2023: 1st semester Basic mathematics B, 2nd semester Application B. 2024: 1st semester Basic mathematics A, 2nd semester Basics of statistics B. In the course, the R language is used for data analysis, simulation, and visualization. This semester: Application A. |
|||
Course Goals |
Basic mathematics A: To understand matrix calculation, least squares, PCA, and the basics of graph theory. Basic mathematics B: To understand calculus for probability density functions, likelihood functions and maximum likelihood estimation, approximation, and the basics of information geometry. Basics of statistics A: To understand data types, statistical tests, asymptotic tests, exact tests, and contingency table tests. Basics of statistics B: To understand point and interval estimates, Bayesian estimates, maximum likelihood estimates, and likelihood functions. Application A: To understand statistical aspects of risk evaluation for Mendelian traits and cancer syndromes. Application B: To understand statistical aspects of risk evaluation for complex genetic traits and expression profiles. In every module, students are expected to master the basics of the R language. |
|||
Schedule and Contents |
Basic mathematics A. The first half: Linear algebra, including matrix calculation, variance-covariance matrices, the least squares method, systems of equations, PCA, and optimization. The second half: Graph theory, including the basics of graph theory, trees, minimum spanning trees, random graphs, and network and graph objects in the R language. Basic mathematics B. The first half: Calculus, including expectation of probability density functions, likelihood functions and maximum likelihood estimates and the calculus for them, calculus for probability density functions, cumulative distribution functions and hazard functions, partial derivatives and HWE, calculus for the least squares method, and Taylor expansion. The second half: Information geometry, including its basics, Fisher information, dual flatness, exponential families, and KL divergence. Basics of statistics A. Data types, including categorical types and the simplex; 2x2 table tests, the chi-square test and the exact test; the HWE test and its exact test; the 2x3 table test and genetic models; the uniform distribution, multiple testing, and Bonferroni's correction. Basics of statistics B. Point and interval estimates; the Bayesian approach with binomial and beta distributions; haplotype frequency estimation, the EM algorithm, and LD blocks. Application A. The first half: Mendelian traits, including pedigrees, genotypes and phenotypes of Mendelian traits, NGS, and disease-responsible variants. The second half: Cancer syndromes, including their basics and risk evaluation, decision-support tools, Bayesian estimation, and Bayesian networks. Application B. The first half: Complex genetic traits, including genetic models, populations and cohorts, 2x3 table association tests, and multiple-locus models. The second half: Transcriptome analysis and expression profiles, including their basics, differential expression analysis, clustering and heatmaps, supervised learning, and validation. |
|||
Evaluation Methods and Policy | Evaluation is based on in-class activities and homework. | |||
Course Requirements |
A background in molecular biology and genetics is desirable but not required for students who are prepared to learn these subjects on their own. Bring a laptop PC with Wi-Fi. Basic computer skills and R programming are necessary; students who lack them should learn them on their own during the course. |
|||
Study outside of Class (preparation and review) |
Students who are not yet proficient in R should learn it on their own by using it in their daily research activities. Homework is assigned every week. |
|||
Textbooks | Textbooks/References | For Basics of statistics A and B, "遺伝統計学の基礎" (ISBN 978-4274068225, in Japanese) and a handout of its English version will be used. | ||