Statistical learning (2020/2021)

Course code: 4S009067
Credits: 6
Coordinator: Alessandro Farinelli
INF/01 - INFORMATICS
Language of instruction: English
Teaching is organised as follows:
Activity | Credits | Period | Academic staff
Theory | 5 | 2nd semester | Alberto Castellini, Alessandro Farinelli, Matteo Garbelli
Laboratory | 1 | 2nd semester | Alessandro Farinelli

Learning outcomes

The course introduces students to the statistical models used in data science. The foundations of statistical learning (supervised and unsupervised) are developed with emphasis on the mathematical basis of the different state-of-the-art methodologies. The course also provides rigorous derivations of the methods currently used in industrial and scientific applications, so that students understand the requirements for using them correctly. Laboratory sessions illustrate the use of fundamental algorithms and present industrial case studies in which students learn to analyze real datasets with Python.

At the end of the course, students must demonstrate the following skills:
● knowledge of the main stages of data analysis and preparation
● ability to use the main regression models
● ability to develop feature selection solutions
● ability to use regularization methods (e.g., ridge regression, LASSO, elastic net, least angle regression) and classification methods
● knowledge of unsupervised methods
● ability to understand and develop algorithms for dimensionality reduction (principal component analysis, PCA), K-means clustering, hierarchical clustering, and cross-validation

Syllabus

-- Linear models for Regression (Linear Regression, Subset Variable Selection, Shrinkage/Regularization)
-- Classification models (Logistic Regression, Linear Discriminant Analysis)
-- Tree Based Methods (Decision Trees, Bagging, Random Forest, Boosting)
-- Unsupervised methods (Principal Component Analysis, K-Means Clustering, Hierarchical Clustering)
-- Model Assessment and Selection (Cross-Validation)
-- Introduction to Neural Networks (Single-Layer Neural Network, Training a Neural Network)
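
As a rough illustration of how two of the syllabus topics fit together (shrinkage and model selection by cross-validation), the sketch below chooses a ridge penalty by k-fold cross-validation. It is not course material: the one-feature model without intercept, the synthetic data, the candidate penalties, and the function names `ridge_slope` and `k_fold_mse` are all assumptions made for the example.

```python
import random

def ridge_slope(xs, ys, lam):
    """Closed-form ridge estimate for a one-feature model y ~ b * x (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def k_fold_mse(xs, ys, lam, k=5):
    """Average held-out mean squared error over k interleaved folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Every k-th observation, starting at index `fold`, is held out.
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        b = ridge_slope(train_x, train_y, lam)
        mse = sum((ys[i] - b * xs[i]) ** 2 for i in test_idx) / len(test_idx)
        fold_errors.append(mse)
    return sum(fold_errors) / k

# Synthetic data (assumed for illustration): true slope 2.0 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(50)]
ys = [2.0 * x + rng.gauss(0, 0.1) for x in xs]

# Hypothetical grid of candidate penalties; pick the one with the lowest CV error.
lambdas = [0.0, 0.1, 1.0, 10.0]
scores = {lam: k_fold_mse(xs, ys, lam) for lam in lambdas}
best_lam = min(scores, key=scores.get)
print(scores, best_lam)
```

A heavy penalty shrinks the slope toward zero and raises the held-out error, which is what the cross-validation scores make visible.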

Lab:
-- Linear regression and related variable selection/shrinkage methods (in Python)
-- Classification with logistic regression (in Python)
-- Clustering with k-means and hierarchical clustering (in Python)
-- Artificial Neural Networks (in Python)
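
To give a flavour of the clustering lab, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is only an illustration under stated assumptions: the toy two-group dataset, the fixed iteration count, and the helper names `sq_dist` and `kmeans` are invented for the example and are not taken from the course material, which may well rely on a library implementation instead.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, n_iter=20, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise with k distinct data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return centroids, labels

# Toy data (assumed): two well-separated groups of ten points each.
group_a = [(0.1 * i, 0.1 * (i % 3)) for i in range(10)]
group_b = [(5 + 0.1 * i, 5 + 0.1 * (i % 3)) for i in range(10)]
points = group_a + group_b

centroids, labels = kmeans(points, k=2)
print(sorted(centroids))
print(labels)
```

The result of k-means depends on the initial centroids; on data less cleanly separated than this toy example, several restarts with different seeds are commonly used.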

Assessment methods and criteria

The exam consists of an oral test and a project that applies statistical learning approaches to a specific case study.