|Theory||6||Semester II, Semester I||Rosalba Giugno|
|Laboratory||6||Semester II, Semester I||Rosalba Giugno|
Knowledge and understanding: The course aims to give students knowledge and understanding of the paradigms and advanced programming tools for managing biomedical and bioinformatic data and information.
Applying knowledge and understanding: Students will be able to a) apply these paradigms and advanced programming tools to the analysis of genomic, transcriptomic, and proteomic data; b) analyze code performance, identify critical issues, and optimize them.
Making judgements: Ability to independently propose effective and efficient solutions for the biomedical and bioinformatics application domain; ability to identify critical issues in the treatment of complex bioinformatics problems.
Communication: Students will be able to interact with various interlocutors in a multidisciplinary biomedical and bioinformatics context, to collaborate with colleagues in group work, and to communicate with counterparts in the working or research environment.
Lifelong learning skills: Ability to understand scientific literature when interpreting results or proposed solutions, and to carry out individual and group in-depth studies aimed at tackling problems from research and industry.
Overview and History of R
Workspace and Files
Objects and Data Structures
Sequence of Numbers
Reading Tabular Data
Bash: Scripting Language
Overview of scripting language
Conditional statements and operators
I/O from files
R for Bioinformatics
Overview of BioConductor
Basic BioConductor Data Structures: IRanges and GenomicRanges
Classes and functions for representing biological strings: Biostrings
Classes and functions for representing genomes: BSgenome, GenomicRanges
Annotation functions and overview of annotation web tools
RNA-Seq Data Analysis using R/Python and web tools
Introduction to NGS technologies and experimental design
Data pre-processing, from FASTQ to BAM
Indexing Reference Genome
Mapping reads to a reference genome
Sorting and indexing alignment
Map quality control
Variant discovery and callset refinement
limma, Glimma, edgeR
Practice on coding RNA and ncRNA detection and analysis
Applied Statistics for High-Throughput Data Mining
Introduction to variables and distribution
Linear and generalized linear modeling
Model matrix and model formulae
Analysis of categorical variables, exploratory data analysis, multiple testing
Distance in high dimensions
Principal components analysis and multidimensional scaling
Density-based methods
Advanced Analyses of biological data in R: methods for graphs and networks
Networks in igraph
Edge, vertex, and network attributes
Specific graphs and graph models
Reading network data from files
Turning networks into igraph objects
Plotting networks with igraph
Network and node descriptives
Distances and paths
Subgroups and communities
Assortativity and Homophily
Reconstruction and analysis of co-regulatory and co-expressed networks
The course includes special seminars on advanced topics such as computational methods for the analysis of single-cell data, graph mining, and multilayer networks. Topics are defined each year based on current trends in medical bioinformatics research. Students will have the opportunity to use software related to the chosen topics and to analyze real cases.
The exam consists of a written part (A) and a project (B). In (A), students write an R program on the day of the test to solve a given problem using genomic, transcriptomic, or proteomic data. (B) is a project agreed upon with the teacher: students request it by email and arrange an appointment to work out the specifications (the project remains valid throughout the academic year). Projects have different levels of difficulty, and each level corresponds to a maximum achievable grade.
Parts A and B are each graded out of thirty.
The final grade is calculated as min(31, ((A + B) / 2) + C).
C lies in the interval [-4, +4] and reflects the maturity and scientific autonomy shown during the tests and the project, in the presentation, and in the interpretation of the scientific literature and the scientific context of the project.
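As a worked illustration of the formula above, the following minimal Python sketch computes the final result from A, B, and C. The function name `final_grade` and the input checks are hypothetical and not part of the course rules; in particular, the 0 to 30 range for A and B is an assumption inferred from the statement that both parts are expressed out of thirty.

```python
def final_grade(a: float, b: float, c: float) -> float:
    """Return min(31, (A + B) / 2 + C), as stated in the exam rules.

    Assumptions (not stated explicitly in the syllabus):
    - A and B each lie in [0, 30], since they are expressed out of thirty.
    """
    for name, part in (("A", a), ("B", b)):
        if not 0 <= part <= 30:
            raise ValueError(f"{name} must lie in [0, 30], got {part}")
    # C is stated to lie in the interval [-4, +4].
    if not -4 <= c <= 4:
        raise ValueError(f"C must lie in [-4, +4], got {c}")
    # Average the two parts, add the adjustment C, and cap the result at 31.
    return min(31.0, (a + b) / 2 + c)


if __name__ == "__main__":
    print(final_grade(27, 29, 2))  # (27 + 29) / 2 + 2 = 30.0
    print(final_grade(30, 30, 4))  # 34.0 before the cap, so 31.0
```

For example, A = 27, B = 29, and C = 2 give (27 + 29) / 2 + 2 = 30, while A = B = 30 with C = 4 would give 34 and is therefore capped at 31.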
|Theory||Roger D. Peng||Exploratory Data Analysis with R||https://leanpub.com/exdata||2016|
|Theory||Michael I. Love, Simon Anders, Vladislav Kim, Wolfgang Huber||RNA-Seq workflow: gene-level exploratory analysis and differential expression||https://f1000research.com/articles/4-1070/v1||2015|
|Laboratory||Roger D. Peng||Exploratory Data Analysis with R||https://leanpub.com/exdata||2016|