Mining Massive Datasets (2020/2021)

Course code
4S009068
Name of lecturer
Damiano Carra
Coordinator
Damiano Carra
Number of ECTS credits allocated
6
Academic sector
ING-INF/05 - INFORMATION PROCESSING SYSTEMS
Language of instruction
English
Location
VERONA
Period
Second semester, from Mar 1, 2021 to Jun 11, 2021.

Learning outcomes

The course aims to present the main algorithmic solutions for the analysis and extraction of information from large amounts of data. Particular emphasis is given to distributed approaches and parallel algorithms.

By the end of the course, the student must demonstrate the following skills:
● the knowledge necessary to design algorithms for the analysis of unstructured data and to interpret their results
● the ability to carry out a cost/benefit analysis of the data analysis models developed
● the ability to compare different data analysis techniques, choose the most suitable one for the available computing resources, and design innovative solutions where appropriate
● the foundation needed to continue studying independently in the development of advanced analyses of large amounts of data.

Syllabus

- Introduction to Data Mining
- Finding Similar Items
- Mining Data Streams
- Frequent Itemsets
- Clustering
- Recommendation Systems
- Mining Social-Network Graphs
- Large-Scale Machine Learning

Reference books
Author: Jure Leskovec, Anand Rajaraman, Jeff Ullman
Title: Mining of Massive Datasets (3rd edition)
Publisher: Cambridge University Press
Year: 2020
ISBN: 9781108476348
Note: Book freely available at http://www.mmds.org/

Assessment methods and criteria

The examination consists of a project and its accompanying documentation. The project is meant to verify the student's comprehension of the course contents and the ability to apply them to solve a problem. The project topic is agreed with the teacher and focuses on specific case studies. The project includes a performance evaluation for different input sizes and an evaluation of the implementation alternatives. After the project documentation has been evaluated, the student may take an oral exam in which the details of the project are discussed.