Master's degree in Medical Bioinformatics

Programming laboratory for bioinformatics (2019/2020)

Course code: 4S004548
Credits: 12
Coordinator: Rosalba Giugno
Academic sector: INF/01 - INFORMATICS
Language of instruction: English

Teaching is organised as follows:
Activity | Credits | Period | Academic staff
Theory | 6 | Semester II, Semester I | Rosalba Giugno
Laboratory | 6 | Semester II, Semester I | Rosalba Giugno

Learning outcomes

Knowledge and understanding: The course aims to give students knowledge and understanding of the paradigms and advanced programming tools used to manage biomedical and bioinformatics data and information.

Applying knowledge and understanding: Students will be able to a) apply these paradigms and advanced programming tools to the analysis of genomic, transcriptomic and proteomic data; b) analyze code performance, identify critical issues, and optimize the code.

Making judgements: Students will be able to independently propose effective and efficient solutions in the biomedical and bioinformatics application domain, and to identify the critical issues in handling complex bioinformatics problems.

Communication: Students will be able to interact with a range of interlocutors in a multidisciplinary biomedical and bioinformatics context, with colleagues during group work, and with others in a working or research environment.

Lifelong learning skills: Students will be able to understand the scientific literature when interpreting results or proposed solutions, and to carry out individual and group in-depth studies aimed at tackling problems from research and industry.

Syllabus

R Programming
Overview and History of R
Workspace and Files
Objects and Data Structures
Missing Values
Sequence of Numbers
Subsetting
Split-Apply-Combine Functions
Simulation
Reading Tabular Data
Logic
Control Structures
I/O operations
Functions
Base Graphics
Advanced Graphics

Bash Scripting Language
Overview of scripting languages
Variables
Indexed arrays
Associative arrays
Conditional statements and operators
Comparison operators
Loops
I/O from files
Functions

R for Bioinformatics
Overview of BioConductor
Basic BioConductor Data Structures: IRanges and GenomicRanges
Classes and functions for representing biological strings: Biostrings
Classes and functions for representing genomes: BSgenome, GenomicRanges
Annotation functions and overview of annotation web tools

RNA-Seq Data Analysis using R/Python and web tools
Introduction to NGS technologies and experimental design
Data Pre-processing, from FASTQ to BAM
Indexing Reference Genome
Mapping reads to a reference genome
Sorting and indexing alignments
Mapping quality control
Variant Discovery and Call Set Refinement
Differential Analysis
limma, Glimma, edgeR
DESeq2
Practice on coding RNA and ncRNA detection and analysis


Applied Statistics for High-Throughput Data Mining
Introduction to variables and distribution
Linear modeling
Linear and generalized linear modeling
Model matrix and model formulae
Analysis of categorical variables, exploratory data analysis, multiple testing
Unsupervised analysis
Distance in high dimensions
Principal components analysis and multidimensional scaling
Unsupervised clustering
Partition Methods
Hierarchical Methods
Density-based Methods
Batch effects

Advanced Analyses of Biological Data in R: Methods for Graphs and Networks
Networks in igraph
Create networks
Edge, vertex, and network attributes
Specific graphs and graph models
Reading network data from files
Turning networks into igraph objects
Plotting networks with igraph
Network and node descriptives
Distances and paths
Subgroups and communities
Assortativity and Homophily
Reconstruction and analysis of co-regulatory and co-expressed networks

The course includes special seminars on advanced topics such as computational methods for the analysis of single-cell data, graph mining, and multilayer networks. Topics are chosen each year based on current trends in medical bioinformatics research. Students will have the opportunity to use software related to the chosen topics and to analyze real cases.

Assessment methods and criteria

The exam consists of a written part (A) and the development of a project (B). For (A), students write, on the day of the test, an R program that solves a given problem using genomic, transcriptomic or proteomic data. (B) is a project agreed upon with the teacher: students request it by email and arrange an appointment to draw up the specifications (the project is valid throughout the academic year). Projects have different levels of difficulty, and each level corresponds to a maximum achievable grade.

Parts A and B are each graded on a scale of thirty.

The final grade is calculated as min(31, ((A + B) / 2) + C).
C lies in the interval [-4, +4] and reflects the maturity and scientific autonomy the student has acquired during the tests and the project, in presenting the work, and in interpreting the scientific literature and the scientific context of the project.
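
The formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not an official grading tool: the function name `final_grade` and the range check on C are assumptions added for the example, and the sample scores are invented.

```python
def final_grade(a: float, b: float, c: float) -> float:
    """Combine the written test (a), the project (b), and the adjustment (c).

    Hypothetical helper: it applies min(31, ((A + B) / 2) + C) as stated
    in the assessment rules and rejects a C outside [-4, +4].
    """
    if not -4 <= c <= 4:
        raise ValueError("c must lie in the interval [-4, +4]")
    return min(31.0, (a + b) / 2 + c)


# Invented sample scores: written test 28, project 30, adjustment +1.
print(final_grade(28, 30, 1))  # 30.0

# The cap applies when the adjusted average exceeds 31.
print(final_grade(30, 30, 4))  # 31.0
```

Because the average of A and B is taken before C is added, a positive C can lift the result above 30, and the cap of 31 then limits it.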

Reference books
Activity | Author | Title | Publisher or URL | Year
Theory | Rafael A. Irizarry and Michael I. Love | Data Analysis for the Life Sciences | https://leanpub.com/dataanalysisforthelifesciences/ | 2015
Theory | Roger D. Peng | Exploratory Data Analysis with R | https://leanpub.com/exdata | 2016
Theory | Michael I. Love, Simon Anders, Vladislav Kim, Wolfgang Huber | RNA-Seq workflow: gene-level exploratory analysis and differential expression | https://f1000research.com/articles/4-1070/v1 | 2015
Theory | Eric D. Kolaczyk, Gábor Csárdi | Statistical Analysis of Network Data with R | Springer | 2014
Laboratory | Rafael A. Irizarry and Michael I. Love | Data Analysis for the Life Sciences | https://leanpub.com/dataanalysisforthelifesciences/ | 2015
Laboratory | Roger D. Peng | Exploratory Data Analysis with R | https://leanpub.com/exdata | 2016
Laboratory | Michael I. Love, Simon Anders, Vladislav Kim, Wolfgang Huber | RNA-Seq workflow: gene-level exploratory analysis and differential expression | https://f1000research.com/articles/4-1070/v1 | 2015
Laboratory | Eric D. Kolaczyk, Gábor Csárdi | Statistical Analysis of Network Data with R | Springer | 2014



