Distributed Information Systems

Numbering Code G-INF02 63217 LE11
G-INF02 63217 LE13
G-INF02 63217 LE10
Year/Term 2022 ・ Second semester
Number of Credits 2
Course Type Lecture
Target Year
Target Student
Language English
Day/Period Wed.3
Instructor name YOSHIKAWA MASATOSHI (Graduate School of Informatics, Professor)
MA QIANG (Graduate School of Informatics, Associate Professor)
Outline and Purpose of the Course This course gives an overview of major topics in distributed information systems. It starts with complex data: unlike relational databases, which employ flat tables, modern information systems manage complex data. Students will learn data models with rich expressive power for modeling complex data, and declarative languages for retrieving and updating it. The course also covers highly scalable distributed file systems and databases; the systems covered in lectures include HDFS, MapReduce, and Dremel. Column store technologies are covered as an important storage model for handling OLAP tasks on high-volume data. Blockchain, an emerging technology, is also introduced. The last topic is Web mining and knowledge discovery, for which the fundamental technologies and application systems are introduced. Other contemporary topics may be covered if time allows.
Course Goals Our goal is to introduce students to the principles and techniques of distributed information systems. Students are expected to acquire fundamental knowledge of the representation, management, processing, and mining of large amounts of distributed data.
Schedule and Contents
Distributed and Parallel Information Systems (8 Lectures by Yoshikawa)

Complex Data
. Nested Data, Complex Value, Semi-Structured Data, XML
Highly-Scalable Distributed File Systems and Databases
. Column Store
. Dremel
. HDFS (Hadoop Distributed File System) and MapReduce
Blockchain
Foundation of Semantic Web

Knowledge Discovery (Web Mining) (7 Lectures by Ma)
. Content Mining: Information Extraction, Information Integration (Schema Matching)
. Structure Mining: Link Analysis, Social Network Analysis
. Usage Mining: Log Analysis, Personalization, User Behavior Analysis, HCI
. Sentiment Analysis and Opinion Mining
. Application Systems
Evaluation Methods and Policy Grading method: Grades are based on a written examination and reports.
Course Requirements Basic knowledge of database systems and data mining.
Study outside of Class (preparation and review) Homework is assigned in some lectures. Reviewing the course material after each lecture is highly recommended.
Textbooks Lecture notes and related documents will be distributed in lectures.
References, etc. Several related documents will be introduced in lectures.