The course aims to present the main algorithmic solutions for the analysis and extraction of information from large amounts of data. Particular emphasis is given to distributed approaches and parallel algorithms.
By the end of the course, students are expected to have acquired the following skills:
● the knowledge necessary to design algorithms for the analysis of unstructured data and to interpret the results
● the ability to carry out a cost/benefit analysis of the data analysis models developed
● the ability to compare different data analysis techniques, to choose the most suitable one given the available computing resources, and to design suitable innovative solutions
● the foundations needed to continue studying independently in the field of advanced analysis of large amounts of data.
- Introduction to Data Mining
- Finding Similar Items
- Mining Data Streams
- Frequent Itemsets
- Recommendation Systems
- Mining Social-Network Graphs
- Large-Scale Machine Learning
|Jure Leskovec, Anand Rajaraman, Jeff Ullman||Mining of Massive Datasets (3rd Edition)||Cambridge University Press||2020||9781108476348||Book freely available at http://www.mmds.org/|
The examination consists of a project and its accompanying documentation. The project is intended to verify the student's understanding of the course contents and the ability to apply them to solve a problem. The project topic is agreed with the teacher and focuses on specific case studies. The project includes a performance evaluation for different input sizes and an evaluation of the implementation alternatives. After the project documentation has been evaluated, the student may take an oral exam in which the details of the project are discussed.