Data-intensive computing systems (2013/2014)

Course code
Name of lecturer
Damiano Carra
Number of ECTS credits allocated
Academic sector
Language of instruction
1st semester, from Oct 1, 2013 to Jan 31, 2014.

Lesson timetable

1st semester
Day       Time               Type        Place             Note
Thursday  4:30 PM - 6:30 PM  laboratory  Laboratory Gamma  from Oct 7, 2013 to Oct 10, 2013
Thursday  4:30 PM - 6:30 PM  laboratory  Laboratory Alfa   from Oct 17, 2013 to Jan 31, 2014
Friday    2:30 PM - 3:30 PM  lesson      Lecture Hall I    from Oct 18, 2013 to Jan 31, 2014
Friday    3:30 PM - 4:30 PM  lesson      Lecture Hall I

Learning outcomes

This course provides a broad introduction to the fundamentals of large-scale parallel computing systems that deal with very large data sets. The course topics cover programming models (MapReduce, Pregel), algorithmic design (text processing, inverted indexing, graph analysis), and system architecture (datacenter topologies, communication, failure management).


- Programming frameworks -
Distributed filesystems (HDFS), NoSQL systems (HBase, Cassandra), data and graph processing (MapReduce, Pregel), SQL-like systems (Pig, Hive);

- Algorithms -
Design of algorithms for text processing, inverted indexing, and graph analysis (PageRank).

- Datacenter architectures -
Topologies (VL2, PortLand, c-Through), communication protocols (spanning tree, ECMP, OpenFlow), failure management.

Reference books
Author                 Title                                                        Publisher                     Year  ISBN            Note
Jimmy Lin, Chris Dyer  Data-Intensive Text Processing with MapReduce (1st edition)  Morgan & Claypool Publishers  2010  978-1608453429
Tom White              Hadoop: The Definitive Guide (3rd edition)                   O'Reilly & Associates Inc     2012  978-1449311520

Assessment methods and criteria

The examination consists of a project and the corresponding documentation.