PhD in Computer Science

PhD Course on "Short Tutorial on Biclustering Algorithms for Biological Data Analysis"

5 May - 6 May

Series to which this belongs

29th cycle
30th cycle
31st cycle


Biclustering, the discovery of sets of objects with a coherent pattern across a subset of conditions, is key to studying a wide set of biological problems in which molecular units or patients are meaningfully related to a set of properties. The challenging combinatorial nature of this task has led to the development of several approaches that restrict the allowed type, number and quality of biclusters, that is, subsets of rows exhibiting a coherent pattern over a subset of columns of a data matrix. State-of-the-art biclustering approaches, which rely on efficient string processing and mining techniques in the case of temporal data and on pattern mining in the general case, allow an exhaustive yet efficient exploration of the search space, together with the possibility of discovering flexible structures of biclusters with parameterizable coherency and noise tolerance.
This tutorial introduces the biclustering problem, compares it with the traditional clustering problem, and then tackles the biclustering of temporal data (a meaningful biological restriction) and non-temporal data (the general case). In the case of temporal data, the tutorial focuses on biclustering algorithms for the analysis of gene expression time series obtained from transcriptomics using microarray or RNA-seq technologies. In this context, the ability to monitor changes in expression patterns over time, and to observe the emergence of coherent temporal responses using gene expression time series, is shown to be critical to advancing our understanding of complex biological processes, such as complex diseases. Efficient biclustering algorithms able to effectively unravel coherent coexpression patterns and important aspects of gene regulation, such as anticorrelation and time-lagged relationships, are discussed. Ongoing work on new biclustering algorithms to simultaneously analyse multiple expression time series, together with their application to biomedical problems, is also discussed. In the case of non-temporal data, the tutorial describes recent pattern-based biclustering approaches tailored to analyse both expression data and network data, where the goal is to find putative regulatory modules and network modules, respectively.
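As a small illustration of what "a coherent pattern over a subset of columns" can mean, the sketch below scores a candidate bicluster with the mean squared residue, a classic coherence measure from the biclustering literature. It is not taken from the tutorial materials: the function name, the toy matrix and its values are assumptions made up for this example, and the pattern-based and temporal algorithms covered in the course use their own coherency criteria.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix selected by rows and cols.

    A value of 0 means the submatrix follows a perfectly additive pattern:
    every row equals every other row plus a constant shift.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical expression matrix: rows are genes, columns are conditions.
expression = [
    [1.0, 2.0, 9.0, 4.0],
    [3.0, 4.0, 0.0, 6.0],
    [7.0, 1.0, 5.0, 2.0],
    [6.0, 7.0, 3.0, 9.0],
]

# Genes 0, 1 and 3 differ only by a constant shift over conditions 0, 1 and 3.
coherent = mean_squared_residue(expression, rows=[0, 1, 3], cols=[0, 1, 3])
# The full matrix has no such shared pattern.
whole = mean_squared_residue(expression, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3])
```

Here the selected genes are coherent only on a subset of the conditions, which is exactly what clustering over all columns would miss. The score for that submatrix is zero up to floating-point rounding, while the score for the whole matrix is clearly positive.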


  • pdf   Flyer   (pdf, it, 228 KB, 15/03/16)