Activity | Credits | Period | Academic staff | Timetable |
---|---|---|---|---|
Theory | 9 | I sem. | Manuele Bicego, Rosalba Giugno | |
Laboratory | 3 | I sem. | Pietro Lovato | |
The course aims to provide the theoretical and applied foundations of Pattern Recognition, a class of automatic methodologies used to recognize and recover information from biological data. In particular, the course presents and discusses the main aspects of this area: representation, classification, clustering and validation. The focus is on describing the methodologies employed rather than on the details of application programs (already covered in other courses).
At the end of the course, students will be able to analyse a biological problem from a Pattern Recognition perspective; they will also have the skills needed to design, develop and implement the different components of a Pattern Recognition system.
The course assumes the standard skills acquired in the courses of the first two years, in particular basic notions of probability, statistics, and mathematical analysis.
The course is divided into three parts:
Part 1. The first part is devoted to the description and analysis of the different methodologies for the representation, classification and clustering of biological data.
Part 2. The second part, more application-oriented, is devoted to the critical analysis of some relevant bioinformatics problems that are typically solved with classification or clustering approaches (e.g. gene expression data analysis, medical image segmentation, protein remote homology detection).
Part 3. The third part (in lab) is devoted to the implementation, using the MATLAB language, of some of the algorithms analysed in the first two parts.
Detailed Program
Theory (72 h):
- Introduction to Pattern Recognition
- Data Representation
- Bayes decision theory
- Generative and discriminative classifiers
- Validation
- Neural Networks
- Hidden Markov Models
- Clustering methods
- Clustering validation
- Applications
Lab (36 h):
- Introduction to MATLAB
- Data representation and standardization
- Principal Component Analysis
- Gaussians and Gaussian classifiers
- Hidden Markov Models
Reference books
R. Duda, P. Hart, D. Stork, Pattern Classification, Wiley, 2001
P. Baldi, S. Brunak, Bioinformatics: The Machine Learning Approach, MIT Press, 2001
A.K. Jain, R.C. Dubes, Algorithms for Clustering Data, Prentice-Hall, 1988
The exam is designed to verify the following skills:
- the ability to describe clearly and concisely the different components of a Pattern Recognition system
- the ability to analyse, understand and describe a Pattern Recognition system (or a given part of it) applied to a biological problem
The exam consists of two parts:
i) a written exam containing questions on topics presented during the course (15 points available). The written part is passed if the grade is greater than or equal to 8.
ii) an oral presentation of a scientific paper published in a relevant bioinformatics journal during 2015. The paper is chosen by the candidate and approved by the instructor (15 points available).
The two parts of the exam can be passed separately: the final grade is the sum of the two grades.
The exam as a whole is passed if the final grade is greater than or equal to 18. Each grade remains valid for the whole academic year.
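As an illustration only, the grading rule can be sketched in a few lines of Python. The function name `final_grade` and the constants are hypothetical. The sketch assumes that the written threshold must be met for the exam as a whole to be passed, and that the oral part has no minimum of its own, since none is stated here.

```python
# Hypothetical sketch of the grading rule; names are illustrative only.
WRITTEN_MAX = 15   # points available in the written part
ORAL_MAX = 15      # points available in the oral presentation
WRITTEN_PASS = 8   # minimum grade to pass the written part
FINAL_PASS = 18    # minimum final grade to pass the exam


def final_grade(written, oral):
    """Return (total, passed) for the two partial grades.

    Assumption: the exam is passed only if the written part reaches its
    own threshold and the sum reaches the final threshold. No minimum is
    enforced for the oral part.
    """
    for name, score, maximum in (("written", written, WRITTEN_MAX),
                                 ("oral", oral, ORAL_MAX)):
        if not 0 <= score <= maximum:
            raise ValueError(f"{name} grade must be between 0 and {maximum}")
    total = written + oral
    passed = written >= WRITTEN_PASS and total >= FINAL_PASS
    return total, passed


print(final_grade(8, 10))   # written part passed, sum reaches 18
print(final_grade(7, 15))   # sum is 22, but the written part is below 8
```

Under these assumptions, a candidate with 7 in the written part does not pass even with full marks in the oral presentation, because the written threshold is not met.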