Statistical Genetics I

Numbering Code G-MED41 8S004 LE87 Year/Term 2022 ・ Intensive, First semester
Number of Credits 2 Course Type lecture and seminar
Target Year Doctoral students Target Student
Language Japanese and English Day/Period Intensive
Instructor name YAMADA RYO (Graduate School of Medicine Professor)
Outline and Purpose of the Course Days and hours (1st week of August, Mon, Tue, and Wed)
1 August 2nd 8:45-10:15
2 August 2nd 10:30-12:00
3 August 2nd 13:00-14:30
4 August 2nd 14:45-16:15
5 August 2nd 16:30-18:00
6 August 3rd 8:45-10:15
7 August 3rd 10:30-12:00
8 August 3rd 13:00-14:30
9 August 3rd 14:45-16:15
10 August 3rd 16:30-18:00
11 August 4th 8:45-10:15
12 August 4th 10:30-12:00
13 August 4th 13:00-14:30
14 August 4th 14:45-16:15
15 August 4th 16:30-18:00

This course consists of three components for mastering the basics of statistical genetics: (1) basic mathematics,
(2) basics of statistics, and (3) application of statistics to genetic studies.
These three components are divided into six modules, one of which is offered each semester.
Basic mathematics A: Linear algebra and graph theory
Basic mathematics B: Calculus and information geometry
Basics of statistics A: Data types and statistical tests
Basics of statistics B: Inference
Application of statistics A: Statistical aspects of Mendelian traits and cancer syndromes
Application of statistics B: Statistical aspects of complex genetic traits and gene expression biomarkers
The planned schedule is as follows:
2021 1st semester Basic mathematics B, 2nd semester Application A
2022 1st semester Basic mathematics A, 2nd semester Basics of statistics A
2023 1st semester Basic mathematics B, 2nd semester Application B
2024 1st semester Basic mathematics A, 2nd semester Basics of statistics B
In the course, the language R is used for data analysis, simulation and visualization.
This semester: BASIC MATHEMATICS B.
Course Goals Basic mathematics A: To understand matrix calculation, least squares, PCA, and the basics of graph theory.
Basic mathematics B: To understand calculus for probability density functions, likelihood functions and maximum likelihood estimation, approximation, and the basics of information geometry.
Basics of statistics A: To understand data types, statistical tests, asymptotic tests, exact tests, and contingency table tests.
Basics of statistics B: To understand point and interval estimates, Bayesian estimates, maximum likelihood estimates, and likelihood functions.
Application A: To understand statistical aspects for risk evaluation of Mendelian traits and cancer syndromes.
Application B: To understand statistical aspects of risk evaluation for complex genetic traits and expression profiles.
In every module, students should master the basics of the R language.
Schedule and Contents Basic mathematics A
The first half: Linear algebra, including matrix calculation, variance-covariance matrices, the least squares method, systems of equations, PCA, and optimization.
The second half: Graph theory, including its basics, trees, minimum spanning trees, random graphs, and network and graph objects in the R language.
Basic mathematics B
The first half: Calculus, including expectations of probability density functions, likelihood functions, maximum likelihood estimates and the calculus for them, calculus for probability density functions, cumulative distribution functions and hazard functions, partial derivatives and Hardy-Weinberg equilibrium (HWE), calculus for least squares methods, and Taylor expansion.
The second half: Information geometry, including its basics, Fisher information, dual flatness, exponential families and KL divergence.
Basics of statistics A
Data types, including categorical types and the simplex; 2x2 table tests, the chi-square test, and the exact test; the HWE test and its exact test; 2x3 table tests and genetic models; the uniform distribution, multiple testing, and Bonferroni's correction.
Basics of statistics B
Point and interval estimates; the Bayesian approach and binomial and beta distributions; haplotype frequency estimation, the EM algorithm, and LD blocks.
Application A
The first half: Mendelian traits, including pedigree, genotypes and phenotypes of Mendelian traits, NGS and disease-responsible variants.
The second half: Cancer syndromes, including their basics and risk evaluation, decision-support tools, Bayesian estimation, and Bayesian networks.
Application B
The first half: Complex genetic traits, including genetic models, population and cohort, 2x3 table association tests and multiple-locus model.
The second half: Transcriptome analysis and expression profiles, including their basics, differential expression analysis, clustering and heatmap, supervised learning and validation.
Evaluation Methods and Policy Participation during class hours and homework both count toward the grade.
Course Requirements A background in molecular biology and genetics is desirable, but it is not required if you are prepared to learn these subjects on your own.

Bring a laptop PC with wifi.

Basic computer skills and programming in R are necessary. If you lack them, learn them on your own as the course proceeds.
Study outside of Class (preparation and review) Use R in your non-class daily studies to improve your R skills.
Homework is assigned every day.
Textbooks For Basics of statistics A and B, "遺伝統計学の基礎" (ISBN 978-4274068225, in Japanese) and a handout of its English version will be used.
References, etc. For Basic mathematics, get the handouts at http://statgenet-kyotouniv.wikidot.com/2018 .
For Application A and B, get the handouts at http://statgenet-kyotouniv.wikidot.com/2017 .