Training courses: Profiling of microbial communities using targeted and shotgun metagenomics
Course Description
This training course focuses on the study of the microbiota using Next Generation Sequencing (NGS) techniques. The course will introduce DNA metabarcoding and shotgun metagenomics and illustrate the major computational tools for the analysis of metagenomic data. In addition, the course will provide an introduction to Machine Learning methods applied to the analysis of metagenomic data. The course will include both a theoretical introduction to the topics and practical sessions with real data.
Important Dates
- Deadline for applications: 20th May 2025
- Chosen participants will be notified by: 31st May 2025
- Course date: 30 June - 4 July 2025
Venue
University of Bari “Aldo Moro”, Aula 4, “Vecchi Istituti Biologici”, Campus Ernesto Quagliarello, via Orabona 4, 70126, Bari. The closest Campus entrance is on Via Giovanni Amendola 165/A.
Fee
The course fee is 250 Euros for academic attendees and 350 Euros for industry professionals, and covers lunches and coffee breaks. Participants are expected to pay their own travel and accommodation costs (if any).
Selection procedure
A maximum of 25 candidates will be selected on a first-come first-served basis. Selected participants will be notified by 31st May 2025.
(Invited) Speakers
- Duccio Cavalieri, University of Florence
- Eugenio Parente, University of Basilicata
- Antonella Bruno, University of Milan Bicocca
- Edoardo Pasolli, University of Naples
Organizers and Instructors
- Giuseppe Defazio, University of Bari “Aldo Moro”, ELIXIR-IT, Italy
- Claudio Donati, Fondazione Edmund Mach, ELIXIR-IT, Italy
- Bruno Fosso, University of Bari “Aldo Moro”, ELIXIR-IT, Italy
- Paolo Manghi, Fondazione Edmund Mach, ELIXIR-IT, Italy
- Marinella Marzano, Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, IBIOM-CNR, ELIXIR-IT, Italy
- Pierfrancesco Novielli, INFN-Bari, ELIXIR-IT, Italy
- Graziano Pesole, University of Bari “Aldo Moro”, ELIXIR-IT, Italy
- Monica Santamaria, University of Bari “Aldo Moro”, ELIXIR-IT, Italy
- Sabina Tangaro, University of Bari “Aldo Moro”, ELIXIR-IT, Italy
Contacts
For all queries, please contact Claudio Donati at claudio.donati@fmach.it, Bruno Fosso at bruno.fosso@uniba.it or Monica Santamaria at monica.santamaria@uniba.it
Target Audience
The workshop is open to all biological or biomedical PhD and Post-Doc research scientists. The course requires basic knowledge of Unix and the command line (bash shell).
Aims of the workshop
- Introduce the principles of metabarcoding and shotgun metagenomics
- Guide participants through all steps of data analysis, from quality controls to data reduction and visualization
- Provide hands-on experience to develop practical analysis skills and enable the extraction and interpretation of biological insights from metabarcoding and metagenomic data
Resources and tools covered
- MetaPhlAn, StrainPhlAn, HUMAnN, Kraken2, Bracken, MetaBAT2, kMetaShot
- QIIME2, DADA2, BioMaS, ITSoneDB/ITSoneWB, Greengenes2
Learning Outcomes
By the end of this course, the participants will be able to:
- Use the common bioinformatics workflows for DNA-metabarcoding and shotgun metagenomics data analysis;
- Analyse microbial communities using common ecological measures of biodiversity;
- Apply methods of multivariate statistics and dimensionality reduction to microbiome data;
- Use machine learning approaches for microbiota data analysis.
In summary, the course provides a comprehensive overview of modern microbiota analysis techniques, from sample preparation to bioinformatics analysis and machine learning applications.
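As a small illustration of the kind of measures mentioned in the learning outcomes, the sketch below computes the Shannon index, the Bray-Curtis dissimilarity and a centered log-ratio (CLR) transform on toy count vectors. It is not course material: the function names, the toy counts and the pseudocount of 0.5 used to avoid log(0) are assumptions made here for illustration, and the tools covered in the course (for example QIIME2) provide their own implementations.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("samples must have the same number of taxa")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform.

    The pseudocount is an illustrative assumption to handle zero counts;
    other zero-replacement strategies exist.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


if __name__ == "__main__":
    # Hypothetical counts for four taxa in two samples.
    sample_a = [10, 10, 10, 10]
    sample_b = [5, 0, 20, 75]

    print("Shannon A:", shannon_index(sample_a))
    print("Shannon B:", shannon_index(sample_b))
    print("Bray-Curtis A vs B:", bray_curtis(sample_a, sample_b))
    print("CLR of B:", clr(sample_b))
```

A perfectly even community of four taxa gives a Shannon index of ln(4), and CLR-transformed values always sum to zero, which makes both easy sanity checks when trying the functions on other data.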
Prerequisites
Participants are expected to have a basic understanding of the UNIX shell.
Registration
Programme
Day 1
Time | Learning Experience | Topic
10:00-12:00 | Welcome and registration |
12:00-14:00 | Welcome cocktail |
14:00-14:30 | Welcome and short introduction to ELIXIR; introduction to the course |
14:30-17:30 | Lecture | What is metagenomics: metabarcoding vs shotgun; introduction to sequencing technologies (second-generation and third-generation technologies); introduction to DNA metabarcoding (historical notes; applications, from the gut microbiome to food traceability); experimental design
17:30-18:00 | Hands-On | Access to virtual machines and upload/download tests of files and folders
Day 2 - Analysis of microbiome biodiversity through DNA metabarcoding
Time | Learning Experience | Topic
09:00-09:45 | Lecture | Amplicon sequencing; variable regions vs full length
09:45-10:30 | Hands-On | Characteristics of the raw sequencing data: data visualization; data quality (FastQC/MultiQC); data import into QIIME2; data pre-processing
10:30-11:00 | Coffee Break |
11:00-12:00 | Lecture | Denoising vs OTU clustering and chimera removal; taxonomic classification: approaches based on similarity analysis vs Bayesian classifiers
12:00-13:00 | Hands-On | Data denoising; taxonomic classification of data and visualization of relative abundances
13:00-14:00 | Lunch Break |
14:00-14:30 | Questions and answers |
14:30-15:30 | Lecture | Theoretical notes on the concept of diversity and diversity measures; data normalization (rarefaction and CLR); dimensionality reduction approaches (PCoA/PCA) and PERMANOVA tests; statistical tests on alpha and beta diversity metrics; differential abundance analysis
15:30-17:00 | Hands-On | Rarefaction and diversity metrics; statistical comparison; differential abundance analysis
Day 3 - Artificial Intelligence and the Microbiome
Time | Learning Experience | Topic
09:00-10:30 | Lecture | Fundamentals of Machine Learning and AI: introduction to Python for microbiome data analysis; types of Machine Learning (supervised vs unsupervised); microbiome data and their specificity (compositional data and log-ratio transformations: clr, ilr, alr; sparsity and over-dispersion problems; need for normalization and scaling); dimensionality reduction (PCA, t-SNE, UMAP: advantages and limitations in microbiome analysis; considerations on normalization, data balancing and cross-validation)
10:30-11:00 | Coffee Break |
11:00-12:30 | Exercise | Data preprocessing and preparation for ML: data import and visualization; transformations for compositional data; normalization and management of relative abundances; implementation of PCA and t-SNE to explore the structure of microbiome data
12:30-13:30 | Lunch Break |
13:30-15:00 | Lecture | Machine Learning models and interpretability. Machine Learning for the microbiome: classification vs regression; common algorithms (Logistic Regression, Decision Trees, Random Forest, SVM, Gradient Boosting); performance evaluation (accuracy, precision, recall, F1-score, AUC-ROC); confusion matrix and importance of the choice of metric. Explainable AI (XAI) and model interpretability: SHAP and LIME for feature interpretation; importance of features and their role in prediction; XAI applications in the microbiome (identification of relevant microbial markers)
15:00-15:30 | Coffee Break |
15:30-17:00 | Exercise | Construction and interpretation of an ML model on metagenomic data: training a classification model on abundance data; hyperparameter optimization with cross-validation; interpretation of results with SHAP; discussion of the results and comparison with traditional methods
Day 4 - Analysis of the taxonomic and functional composition of the microbiome through shotgun metagenomics
Time | Learning Experience | Topic
09:00-10:30 | Lecture | Sequencing technologies; brief introduction to the main analysis techniques (taxonomic profiling; functional profiling; metagenome assembly and binning); computational tools for metagenomics (operating systems; hardware requirements; software tools such as Nextflow and Docker)
10:30-11:00 | Coffee Break |
11:00-12:30 | Lecture | Taxonomic and functional profiling using shotgun data. Raw data preprocessing: read filtering and removal of host reads; taxonomic profiling (MetaPhlAn, Kraken2/Bracken); functional profiling (HUMAnN); taxonomic profiling beyond the species level, strain-level analysis: the species concept in bacteria; genomic variability within the species (strain, genome, pangenome); strain-level profiling with StrainPhlAn
12:30-13:30 | Lunch Break |
13:30-15:00 | Hands-On | MetaPhlAn; Kraken2/Bracken; HUMAnN
15:00-15:30 | Coffee Break |
15:30-16:00 | Hands-On | StrainPhlAn
16:00-17:00 | Lecture | Metagenome assembly and binning: binning; quality measures for MAGs; dereplication of MAGs; taxonomic classification of MAGs; hands-on with kMetaShot; functional annotation of MAGs
20:30 | Social Dinner |
Day 5 - Case Studies
Time | Learning Experience | Topic
09:00-10:00 | Lecture | The human microbiome and the Holobiont theory of evolution (Duccio Cavalieri, University of Florence)
10:00-11:00 | Lecture | The Food Microbiome (Eugenio Parente, University of Basilicata)
11:00-11:15 | Coffee Break |
11:15-12:15 | Lecture | Living with microbes: The microbiome of the built environment and its implications in human wellbeing (Antonella Bruno, University of Milan Bicocca)
12:15-13:15 | Lecture | Strain-resolved metagenomics of the human and food microbiomes (Edoardo Pasolli, University of Naples)