Fundamental algorithms for Bioinformatics (2019/2020)

Course code
4S004550
Credits
12
Coordinator
Ferdinando Cicalese
Other available courses
Other available courses
INF/01 - INFORMATICS
Language of instruction
English
Teaching is organised as follows:
Activity Credits Period Academic staff Timetable
Algorithm design 6 I semestre Ferdinando Cicalese

Bioinformatics algorithms 6 II semestre Zsuzsanna Liptak

Learning outcomes

Students will acquire a wealth of advanced analytic tools which constitute the foundational basis of the algorithmic solution of important problems in bioinformatics Knowledge and understanding The aim of the course is to provide the student with the necessary skills and know-how for the design and analysis of algorithmic solutions to fundamental bioinformatics problems. Applying knowledge and understanding The students will acquire the ability to design algorithmic solutions for typical problems in bioinformatics and computational biology, e.g., analysis of “omics”-data. Making judgements The students will be able to identify the critical structural elements of a problem and the most appropriate approaches to tackle complex problems in bioinformatics. Communication The students will acquire the ability to describe with appropriate precision and clarity, to both experts and non-specialists: a bioinformatics problem, its mathematical model and the corresponding solution. Lifelong learning skills The students will be able to deepen their know-how in bioinformatics autonomously. Based on the topics studied and the knowledge acquired, they will be able to read, understand, and apply material from advanced text-books and scientific article.

Syllabus

------------------------
MM: Algorithm design
------------------------
Fundamental notions of algorithmic analysis and complexity: Brief recap on graph traversals; shortest path problem; minimum spanning tree algorithms; elements of computational complexity and NP-completeness Models of Genome Rearrangement: (i) approximation algorithms for reversal distance model (sorting unsigned permutations); (ii) the Doble Cut and Join model; (iii) Synteny Distance approximation algorithms Models for Physical Map: (i) The Consecutive Ones Property (C1P); (ii) approximation algorithm for the gap minimisation based on the metric TSP (connections to Hamcycle problems and approximation limits of general TSP; 2-approximation of metric TSP) Models for DNA assembly: (i) The Shortest Common Superstring problem (SCS), connections to maximum cost TSP, approximation of the maximum compression via weighted matching; (ii) Eulerian Cycles based assembly; efficient algorithms for the Eulerian path and Eulerian cycle problem. Models for contig assembly: gap-filling via min-cost flow (flow networks and flow decomposition into edge disjoint paths); min-cost circulation; use of min-cost circulation in SCS (max/min matching in bipartite graphs); Information-theoretic models for biological sequence comparisons: elements of information theory and data compression; LZ-parsing; universal compression distance for clustering and comparison of sequences.
------------------------
MM: Bioinformatics algorithms
------------------------
Here is an overview of the topics that will be covered. The topics in brackets may vary. * Introduction Part I: Pairwise Sequence Comparison * Pairwise sequence alignment * String distances * Pairwise alignment in practice: BLAST, Scoring matrices (* RNA secondary structure prediction) Part II: Multiple sequence alignment * exact DP algorithm (* Carillo-Lipman search space reduction) * approximation algorithms, heuristics Part III: Phyogenetic reconstruction * distance based data: UPGMA, NJ * character based data: Perfect phylogeny (PP) (* character based data: Small Parsimony, Large Parsimony) Part IV: Sequence assembly algorithms (* Shotgun sequencing: SCS) * Sequencing by Hybridization and NGS: de Bruijn graphs, Euler tours

Assessment methods and criteria

------------------------
MM: Algorithm design
------------------------
The exam verifies that the students can master the fundamental tools and techniques for the analysis and design of algorithms and that they understand how these techniques are employed in the solution of some classical computational problems arising in bioinformatics. The exam consists of a written test with open questions. The test includes some mandatory exercises and a set of exercises among which the student can choose what to work on. The mandatory exercises are meant to evaluate the student's knowledge of classical algorithms and analysis tools as seen during the course. "Free-choice" exercises test the ability of students to model "new" toy problems and design and analyse algorithmic solutions for it. The grade for the module Algorithm Design is determined by the result of the written test and the result of homework to be solved periodically during the semester. The overall grade for "Fundamental Algorithms for Bioinformatics" is computed by averaging the grades awarded for the two modules.
------------------------
MM: Bioinformatics algorithms
------------------------
Written exam, followed by oral exam. You are only admitted to the oral if you have passed the written exam. The written exam consists of theoretical questions (problems studied, analysis of algorithms studied, mathematical properties, which algorithms exist for a problem etc.), as well as applications of algorithms to concrete examples (computing a pairwise alignment with the DP algorithm etc.) In the oral exam, the student will explain in detail their solutions to the written exam, and show to what extent they have mastered the topics. Students of the Masters in Molecular and medical biotechnology will have separate questions. (The exam is the same for students who follow the course during the semester and those who do not: frequentanti e no).

 Activity Author Title Publisher Year ISBN Note Algorithm design J. Kleinberg, É. Tardos Algorithm Design (Edizione 1) Addison Wesley 2006 978-0321295354 Algorithm design H.J. Böckenhauer, D. Bongartz Algorithmic Aspects of Bioinformatics Springer 2007 Algorithm design Neil C. Jones, Pavel A. Pevzner An introduction to bioinformatics algorithms (Edizione 1) MIT Press 2004 0-262-10106-8 Algorithm design V. Mäkinen, D. Belazzougui, F. Cunial, and A.I. Tomescu Genome Scale Algorithm Design (Edizione 1) Cambridge University Press 2015 ISBN 978-1-107-07853-6 Algorithm design J.C. Setubal, J. Meidanis Introduction to Computational Biology Pws Pub Co 1997 Bioinformatics algorithms H.J. Böckenhauer, D. Bongartz Algorithmic Aspects of Bioinformatics Springer 2007 Bioinformatics algorithms Enno Ohlebusch Bioinformatics Algorithms 2013 978-3-00-041316-2 Bioinformatics algorithms Veli Mäkinen, Djamal Belazzougui, Fabio Cunial and Alexandru I. Tomescu Genome-Scale Algorithm Design CUP 2015 978-1-107-07853-6 Bioinformatics algorithms Joao Setubal and Joao Meidanis Introduction to Computational Biology 1997