Data-intensive computing systems (2017/2018)

Course code
4S001412
Name of lecturer
Damiano Carra
Coordinator
Damiano Carra
Number of ECTS credits allocated
6
Academic sector
INF/01 - INFORMATICS
Language of instruction
Italian
Period
1st semester, from Oct 2, 2017 to Jan 31, 2018.

Learning outcomes

This course provides a broad introduction to the fundamentals of large-scale parallel computing systems that deal with very large data sets.
By the end of the course, students are expected to know and understand how data-intensive analysis systems work, and to be able to evaluate the benefits and limitations of the different solutions.

Syllabus

* Programming frameworks:
-- Distributed filesystems (HDFS);
-- Data and graph processing (MapReduce, Pregel);
-- SQL-like systems (Pig, Hive);
-- NoSQL systems (HBase, Cassandra).

* Algorithms:
-- Design of algorithms for text processing;
-- Indexing algorithms (inverted indexing);
-- Graph analysis (PageRank).
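
As a rough illustration of how the MapReduce model listed above can be applied to inverted indexing, the following sketch simulates the map, shuffle, and reduce phases in memory using plain Python. It is not taken from the course material and does not use Hadoop; the function names (map_phase, shuffle, reduce_phase, build_inverted_index) and the sample documents are assumptions made for this example only.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit one (term, doc_id) pair for every word occurrence in a document."""
    for word in text.lower().split():
        yield word, doc_id


def shuffle(pairs):
    """Group values by key, mimicking what a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(term, doc_ids):
    """Build the posting list for a term: sorted, de-duplicated document ids."""
    return term, sorted(set(doc_ids))


def build_inverted_index(docs):
    """Run the three phases over a dict mapping doc_id to text (hypothetical helper)."""
    pairs = (
        pair
        for doc_id, text in docs.items()
        for pair in map_phase(doc_id, text)
    )
    grouped = shuffle(pairs)
    return dict(reduce_phase(term, ids) for term, ids in grouped.items())


# Tiny made-up corpus, used only to exercise the sketch.
docs = {
    1: "big data systems",
    2: "data processing systems",
    3: "graph processing",
}
index = build_inverted_index(docs)
print(index["data"])        # [1, 2]
print(index["processing"])  # [2, 3]
```

In a real deployment the shuffle step is performed by the framework across machines, so only the map and reduce functions would be written by the programmer.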

* Datacenter architectures:
-- Datacenter organization;
-- Datacenter networking;
-- Failure management.

Reference books
Author | Title | Publisher | Year | ISBN | Note
Jimmy Lin, Chris Dyer | Data-Intensive Text Processing with MapReduce (1st edition) | Morgan & Claypool Publishers | 2010 | 978-1608453429 |
Tom White | Hadoop: The Definitive Guide (3rd edition) | O'Reilly & Associates Inc | 2012 | 978-1449311520 |

Assessment methods and criteria

The examination consists of a project and its accompanying documentation. The project is meant to verify the student's comprehension of the course contents and the ability to apply them to solve a problem. The project topic is agreed with the lecturer and focuses on specific case studies. The project includes a performance evaluation for different input sizes and an evaluation of the implementation alternatives. After the project documentation has been evaluated, the student may take an oral exam in which the details of the project are discussed.

Statistics about transparency requirements (implementation of Art. 2 of Ministerial Decree no. 544 of 31/10/2007)

Data for academic year 2017/2018 are not yet available.