Information Systems

Numbering Code U-ENG29 39111 LJ11
Year/Term 2022, Second semester
Number of Credits 2
Course Type Lecture
Target Year
Target Student
Language Japanese
Day/Period Wed.3
Instructor name TAJIMA KEISHI (Professor, Institute for Liberal Arts and Sciences)
Outline and Purpose of the Course The lectures cover the fundamental theory and related techniques for constructing information systems. They focus in particular on the architecture of Web information systems, techniques for processing the structured documents and semi-structured data used in such systems, theories of Web information retrieval and other information retrieval systems, and techniques for graph data analysis.
Course Goals By the end of the course, students should understand the architecture of Web information systems, techniques for processing the structured documents and semi-structured data used in such systems, theories of Web information retrieval and other information retrieval systems, and techniques for graph data analysis.
Schedule and Contents
1. History of information systems: From hypertext to Web services (2 classes)
These lectures give an overview of the history of information systems that support human intellectual work. Specifically, they discuss hypertext (Memex, Dexter model, HyperCard), GUI and hypermedia (Smalltalk development environment, SMIL), structured documents (SGML, HTML, XML), stylesheets, and the architecture of Web information systems (SOAP, REST, Ajax).

2. Structured documents and semi-structured data processing (2 classes)
XML is taken up as an example of a data format used for representing structured documents and semi-structured data. The lectures discuss general-purpose techniques for processing XML data (DOM and SAX) and techniques for querying and transforming it (XPath, XQuery, and XSLT), along with the differences between the paradigms of each method. Local tree grammar, regular tree grammar, and single-type tree grammar are also taken up as examples of tree grammars, which are used to define the schema of tree-structured data. Differences in their expressive power are explained.

3. Information retrieval: Evaluation measures (2 classes)
These lectures give an overview of the fundamental concepts of information retrieval and of the various measures used to evaluate the performance of information retrieval systems (precision, recall, F-measure, mean reciprocal rank (MRR), mean average precision (MAP), normalized discounted cumulative gain (nDCG), average mutual information, correlation coefficient, rank correlation coefficient). The user models that lie behind these measures are also outlined.
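
As an illustration of several of the measures named above, the following is a minimal Python sketch for a single query. The function names, the document IDs, and the choice of the DCG variant with linear gain and a log2(rank + 1) discount are assumptions made here for illustration and are not taken from the lecture notes. MRR and MAP are the means of the reciprocal rank and the average precision over a set of queries.

```python
import math


def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F-measure (F1) for one query."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def reciprocal_rank(ranking, relevant):
    """1 / rank of the first relevant document, or 0 if none is ranked."""
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def average_precision(ranking, relevant):
    """Mean of precision@k taken at each rank k holding a relevant document."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def ndcg(ranking, gains):
    """nDCG with linear gain and log2(rank + 1) discount (one common variant).

    gains maps each judged document to its graded relevance.
    """
    def dcg(values):
        return sum(g / math.log2(i + 2) for i, g in enumerate(values))

    actual = dcg([gains.get(doc, 0) for doc in ranking])
    ideal = dcg(sorted(gains.values(), reverse=True)[:len(ranking)])
    return actual / ideal if ideal > 0 else 0.0


# Hypothetical ranking and relevance judgments for one query.
ranking = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2", "d5"}
graded = {"d1": 3, "d2": 1, "d5": 2}

print(precision_recall_f1(ranking, relevant))
print(reciprocal_rank(ranking, relevant))
print(average_precision(ranking, relevant))
print(ndcg(ranking, graded))
```

The set-based measures ignore the order of results, while reciprocal rank, average precision, and nDCG reward placing relevant documents early, which reflects different assumptions about how far down a result list a user reads.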

4. Information retrieval: Retrieval models (3 classes)
These lectures give an overview of the three representative basic information retrieval models and of their various successor models (Boolean model, fuzzy set model, extended Boolean model, vector space model, latent semantic indexing (LSI), latent Dirichlet allocation (LDA), word2vec, probabilistic model, binary independence model, and query likelihood model).

5. Information retrieval: Other topics (1 class)
Several other topics related to information retrieval are surveyed: techniques for query modification and recommendation, techniques for creating data sets for evaluating information systems, and information recommendation techniques such as collaborative filtering.

6. Web analysis (2 classes)
These lectures describe techniques for analyzing the graph structure of Web data. Representative methods covered include PageRank, Topic-Specific PageRank, TrustRank, HITS, and SimRank.
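
As an illustration of the first of these methods, the following is a minimal Python sketch of PageRank computed by power iteration. The function name, the adjacency-list input format, the damping factor of 0.85, and the handling of pages without out-links (their score is spread uniformly over all pages) are assumptions made here for illustration, not details taken from the lecture notes.

```python
def pagerank(graph, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a directed graph.

    graph maps each node to a list of the nodes it links to.
    Returns a dict mapping each node to its score; scores sum to 1.
    """
    nodes = sorted(set(graph) | {v for targets in graph.values() for v in targets})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}

    for _ in range(max_iter):
        # Score held by nodes with no out-links is redistributed uniformly.
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {u: base for u in nodes}

        # Each node passes an equal share of its score along its out-links.
        for u in nodes:
            targets = graph.get(u, [])
            for v in targets:
                new_rank[v] += damping * rank[u] / len(targets)

        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical four-page Web graph.
web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
for page, score in sorted(pagerank(web).items(), key=lambda kv: -kv[1]):
    print(page, round(score, 4))
```

In this sketch the random jump is uniform over all pages. Replacing that uniform jump with a distribution concentrated on a chosen set of pages is the basic idea behind biased variants such as Topic-Specific PageRank and TrustRank.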

7. Network analysis (2 classes)
These lectures explain the fundamental concepts of network analysis. Specifically, they cover scale-free properties, small-world properties, and cluster properties, as well as analysis methods including the infection model and community extraction methods.

8. Feedback (1 class)
Students' questions about the examination are answered.
Evaluation Methods and Policy Grades are based on the score of the final examination, which tests whether students understand the basics and theories of the technologies for constructing Web information systems, information retrieval systems, graph data analysis, and the processing of structured documents and semi-structured data used in Web information systems.
Course Requirements It is not mandatory, but it is desirable that students have the basic knowledge taught in the following courses: Introduction to Algorithms and Data Structures, Language and Automata, Graph Theory, Databases, and Fundamentals of Statistical Modeling.
Study outside of Class (preparation and review) Students are to use lecture notes to prepare for and review classes. Exercise problems and homework will be assigned in classes, and students are to use these also to prepare for and review classes.
Textbooks/References Lecture notes will be used as teaching materials.
Related URL