Training courses 2017: Training course on Machine Learning for Biologists
IMPORTANT DATES for this Training course:
- Deadline for applications: 21 July 2017
- Course date: 4-7 September 2017
Selected participants will be notified by 25 July 2017. A maximum of 24 candidates will be accepted to the course.
Course Description
The aim of the course is to provide a practical introduction to the analysis of “omics” data. Topics will range from data visualization/exploration to univariate/multivariate analysis and machine learning. Practical examples and applications will be illustrated by using R and Python.
Course Milestones:
- Data exploration and visualization
- Univariate/Multivariate analysis
- Introduction to machine learning: classifiers, performance measures, diagnostics
- Machine learning tools for the analysis of Gene Expression data
- The Data Analysis Plan (DAP) - intro to unbiased pipelines for (binary) classification
- Performance measures and diagnostic plots - Accuracy, MCC, Stability: theory and graphics
- Differential network analysis – co-expression networks, graph comparison, community detection: theory and examples in R/Python, visualization by the igraph library and use of the ReNette web interface
- Basic application of ML to gene prediction
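As an illustration of the performance measures listed above, the following is a minimal sketch (not taken from the course materials) of how accuracy and the Matthews Correlation Coefficient (MCC) can be computed for a binary classifier from the confusion-matrix counts. The function names and the toy labels are hypothetical and chosen only for this example; returning 0 for MCC when the denominator is zero is a common convention that is assumed here.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient, in [-1, 1].

    Assumption: return 0.0 when the denominator is zero
    (e.g. the classifier predicts a single class only).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy, made-up labels: 9 negatives and 1 positive,
# with a classifier that always predicts the negative class.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

counts = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(*counts))
print("MCC:", mcc(*counts))
```

On unbalanced data such as this toy example, accuracy stays high even though the classifier never detects the positive class, whereas MCC does not reward it. This is one reason the two measures are usually reported together.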
Target audience
Computational biologists, bioinformaticians, and biological data analysts. A maximum of 24 students will be accepted.
Learning objectives/outcomes
Participants will acquire a working knowledge of how to run a full classification/profiling pipeline on omics data (e.g., gene expression). After this course, participants should be able to:
- Develop and implement visualization/exploration tools
- Understand advantages and limitations of univariate and multivariate data analysis strategies in the context of omics research
- Understand principles and applications of ML
- Implement reproducible workflows in data analysis
Course prerequisites
Basics of R, Python, and the *nix (Unix or Linux) shell; basic knowledge of statistics and linear algebra.
FEE:
The fee covers attendance at lectures and practicals, coffee breaks, lunches, and the welcome aperitivo on Monday, 4 September. Participants are expected to pay their own travel and accommodation costs.
- Candidates from a bioinformatics lab from one of the ELIXIR-IIB member institutions (see the list at the bottom): 120 euros.
- All the other candidates: 160 euros
Application Form
VENUE:
Fondazione Edmund Mach, Palazzo della Ricerca e della Conoscenza, Via E. Mach 1, San Michele all’Adige (TN), Italy (zip code 38010).
General Info: http://www.fmach.it/eng/CRI/general-info
The campus: http://www.fmach.it/eng/About-us/Campus
Getting there:
- By train: Trenitalia (national railways) trains stop at Trento or Mezzocorona; from there, take a local train to San Michele all’Adige.
- By car: The venue is a 5-minute drive from the A22 highway (Autostrada del Brennero), exit “San Michele all’Adige/Mezzocorona”.
- By plane: From Verona airport, take the shuttle bus to Verona train station and then the train towards Trento.
SUGGESTED ACCOMMODATION: A limited number of rooms will be available at the venue (cost: 20 EUR/night). Please contact the local organizer (alessandro.cestaro@fmach.it). Here is some information about hotels close to San Michele all’Adige.
Instructors and helpers
- Giuseppe Jurman - Fondazione Bruno Kessler, Trento, IT
- Marco Chierici - Fondazione Bruno Kessler, Trento, IT
- Davide Albanese - Fondazione E. Mach, Trento, IT
- Pietro Franceschi - Fondazione E. Mach, Trento, IT
- Marco Moretto - Fondazione E. Mach, Trento, IT
- Paolo Sonego - Fondazione E. Mach, Trento, IT
- Samantha Riccadonna - Fondazione E. Mach, Trento, IT
- Andrea Cattani - Fondazione E. Mach, Trento, IT
Organisers
- Alessandro Cestaro (Local Organizer, Fondazione E. Mach, Trento, IT)
- Vincenza Colonna (ELIXIR-IIB Training Coordinator Deputy, CNR, IT)
Programme
Course materials are available from the github repository:
https://github.com/ELIXIR-IIB-training/MLB2017
Monday 04 September 2017 - Introduction |
|||
14:00-15:30 | Course opening | Participants’ self-presentations | |
15:30-16:15 | Plenary lecture | P. Franceschi, S. Riccadonna | Introduction to "omics" data. Principles of data exploration and analysis |
16:15-16:45 | Coffee break | ||
16:45-19:00 | Practical | P. Franceschi, S. Riccadonna | Introduction to "omics" data. Principles of data exploration and analysis |
19:30-21:00 | Welcome aperitivo at Cantina storica Istituto Agrario San Michele | ||
Tuesday 05 September 2017 - Machine Learning |
|||
09:30-09:40 | Previously On | Recap of previous lessons by participants | |
09:40-10:30 | Lecture | D. Albanese, P. Franceschi, S. Riccadonna | Univariate and Multivariate analysis |
10:30-11:00 | Coffee break | ||
11:00-13:00 | Practical | D. Albanese, P. Franceschi, S. Riccadonna | Univariate and Multivariate analysis. Practical session with R |
13:00-14:30 | Lunch | ||
14:30-16:15 | Lecture | D. Albanese, P. Franceschi, S. Riccadonna | Machine Learning: introduction and applications to biological data. Classification basics, model selection and prediction |
16:15-16:45 | Coffee break | ||
16:45-18:30 | Practical | D. Albanese, P. Franceschi, S. Riccadonna | Performance measures and diagnostic plots |
Wednesday 06 September 2017 |
|||
09:30-09:40 | Previously On | Recap of previous lessons by participants | |
09:40-10:30 | Lecture | P. Sonego, S. Riccadonna | Analyzing Gene Expression Data |
10:30-11:00 | Coffee break | ||
11:00-13:00 | Practical | P. Sonego, S. Riccadonna | Analyzing Gene Expression Data |
13:00-14:30 | Lunch | ||
14:30-16:15 | Lecture | M. Chierici, G. Jurman | The Data Analysis Plan (DAP) - intro to unbiased pipelines for (binary) classification |
16:15-16:45 | Coffee break | ||
16:45-18:30 | Practical | M. Chierici, G. Jurman | Implementation of a basic DAP in Python (Scikit-Learn) with feature ranking and classification |
Thursday 07 September 2017 |
|||
09:15-09:25 | Previously On | Recap of previous lessons by participants | |
09:25-10:00 | Lecture | M. Moretto, A. Cestaro | Gene prediction methods as an example of ML on genomic data |
10:00-10:30 | Coffee break | ||
10:30-12:30 | Practical | M. Moretto, A. Cestaro | Training a gene prediction method |
12:30-13:00 | Wrap-up and feedback |
ELIXIR-IIB member institutions
- CNR (ELIXIR-IIB coordinator)
- CRS4
- CINECA
- Fondazione Edmund Mach, Trento
- INFN
- GARR
- Sapienza Università di Roma
- Università di Bari
- Università di Bologna
- Università di Firenze
- Università di Milano
- Università di Milano Bicocca
- Università di Padova
- Università di Parma
- Università di Roma "Tor Vergata"
- Università di Salerno
- Università della Tuscia