
Training courses: Profiling of microbial communities using targeted and shotgun metagenomics



Course Description

This training course focuses on the study of the microbiota using Next Generation Sequencing (NGS) techniques. The course will introduce DNA metabarcoding and shotgun metagenomics and illustrate the major computational tools for the analysis of metagenomic data. In addition, the course will provide an introduction to Machine Learning methods applied to the analysis of metagenomic data. The course will include both a theoretical introduction to the topics and practical sessions with real data.



Important Dates

  • Deadline for applications: 20th May 2025
  • Selected participants will be notified by: 31st May 2025
  • Course dates: 30 June - 4 July 2025



Venue

University of Bari “Aldo Moro”, Aula 4, “Vecchi Istituti Biologici”, Campus Ernesto Quagliarello, via Orabona 4, 70126 Bari. The closest campus entrance is at Via Giovanni Amendola 165/A.



Fee

The course fee is 250 euros for academic attendees and 350 euros for industry professionals. It covers lunches and coffee breaks. Participants are expected to pay their own travel and accommodation costs (if any).



Selection procedure

A maximum of 25 candidates will be selected on a first-come, first-served basis. Selected participants will be notified by 31st May 2025.



(Invited) Speakers

  • Duccio Cavalieri, University of Florence
  • Eugenio Parente, University of Basilicata
  • Antonella Bruno, University of Milan Bicocca
  • Edoardo Pasolli, University of Naples



Organizers and Instructors

  • Giuseppe Defazio, University of Bari “Aldo Moro”, ELIXIR-IT, Italy
  • Claudio Donati, Fondazione Edmund Mach, ELIXIR-IT, Italy
  • Bruno Fosso, University of Bari “Aldo Moro”, ELIXIR-IT, Italy
  • Paolo Manghi, Fondazione Edmund Mach, ELIXIR-IT, Italy
  • Marinella Marzano, Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, IBIOM-CNR, ELIXIR-IT, Italy
  • Pierfrancesco Novielli, INFN-Bari, ELIXIR-IT, Italy
  • Graziano Pesole, University of Bari “Aldo Moro”, ELIXIR-IT, Italy
  • Monica Santamaria, University of Bari “Aldo Moro”, ELIXIR-IT, Italy
  • Sabina Tangaro, University of Bari “Aldo Moro”, ELIXIR-IT, Italy



Contacts

For any queries, please contact Claudio Donati (claudio.donati@fmach.it), Bruno Fosso (bruno.fosso@uniba.it) or Monica Santamaria (monica.santamaria@uniba.it).



Target Audience

The workshop is open to all PhD and postdoctoral researchers in the biological or biomedical sciences. The course requires basic knowledge of Unix and the command line (bash shell).



Aims of the workshop

  • Introduce the principles of metabarcoding and shotgun metagenomics
  • Guide participants through all steps of data analysis, from quality control to data reduction and visualization
  • Provide hands-on experience to develop practical analysis skills and enable the extraction and interpretation of biological insights from metabarcoding and metagenomic data



Resources and tools covered

  • MetaPhlAn, StrainPhlAn, HUMAnN, Kraken2, Bracken, MetaBAT2, kMetaShot
  • QIIME2, DADA2, BioMaS, ITSoneDB/ITSoneWB, Greengenes2



Learning Outcomes

By the end of this course, the participants will be able to:

  • Use the common bioinformatics workflows for DNA-metabarcoding and shotgun metagenomics data analysis;
  • Analyse microbial communities using common ecological measures of biodiversity;
  • Apply methods of multivariate statistics and dimensionality reduction to microbiome data;
  • Use machine learning approaches for microbiota data analysis.

In summary, the course provides a comprehensive overview of modern microbiota analysis techniques, from sample preparation to bioinformatics analysis and machine learning applications.



Prerequisites

Participants are expected to have a basic understanding of the UNIX shell.



Registration

Application Form



Programme

Day 1

Time Learning Experience Topic
10:00-12:00 Welcome and registration
12:00-14:00 Welcome cocktail
14:00-14:30 Welcome and short intro on ELIXIR
Introduction to the course
14:30-17:30 Lecture What is metagenomics - metabarcoding vs shotgun
Intro on sequencing technologies
- Second Generation Technologies
- Third Generation Technologies
Introduction to DNA-metabarcoding
- Historical notes
- Applications: from the gut microbiome to food traceability
Experimental design
17:30-18:00 Hands-On Access to virtual machines and upload/download tests of files and folders

Day 2 - Analysis of microbiome biodiversity through DNA metabarcoding

Time Learning Experience Topic
09:00-09:45 Lecture Amplicon sequencing
Variable regions vs full length
09:45-10:30 Hands-On Characteristics of the raw sequencing data
- Data visualization
- Data quality: FastQC/MultiQC
- Data import into QIIME2
- Data pre-processing
10:30-11:00 Coffee Break
11:00-12:00 Lecture Denoising vs OTU-clustering & Chimera removal
Taxonomic classification: Approaches based on similarity analysis vs Bayesian classifiers
12:00-13:00 Hands-On
- Data denoising
- Taxonomic classification of data and visualization of relative abundances
13:00-14:00 Lunch Break
14:00-14:30 Questions and answers
14:30-15:30 Lecture Theoretical notes on the concept of Diversity and Diversity measures
Data normalization: rarefaction and CLR
Dimensionality reduction approaches (PCoA/PCA) and PERMANOVA tests
Statistical tests on alpha diversity and beta diversity metrics
Differential Abundance Analysis
15:30-17:00 Hands-On
- Rarefaction and diversity metrics
- Statistical comparison
- Differential Abundance Analysis

Day 3 - Artificial Intelligence and the Microbiome

Time Learning Experience Topic
09:00-10:30 Lecture Fundamentals of Machine Learning and AI
Introduction to Python for microbiome data analysis
Types of Machine Learning (Supervised vs Unsupervised)
Microbiome data and their specificity
- Compositional data and log-ratio transformations (clr, ilr, alr)
- Sparsity and over-dispersion problems
- Need for normalization and scaling
Dimensionality reduction
- PCA, t-SNE, UMAP (advantages and limitations in microbiome analysis)
- Considerations on normalization, data balancing and cross-validation
10:30-11:00 Coffee Break
11:00-12:30 Exercise Data preprocessing and preparation for ML
- Data import and visualization
- Transformations for compositional data
- Normalization and management of relative abundances
- Implementation of PCA and t-SNE to explore the structure of microbiome data
12:30-13:30 Lunch Break
13:30-15:00 Lecture Machine Learning Models and Interpretability
Machine Learning for the Microbiome
- Classification vs Regression
- Common algorithms: Logistic Regression, Decision Trees, Random Forest, SVM, Gradient Boosting
- Performance evaluation
  - Accuracy, Precision, Recall, F1-score, AUC-ROC
  - Confusion matrix and the importance of the choice of metric
- Explainable AI (XAI) and model interpretability
  - SHAP and LIME for feature interpretation
  - Importance of features and their role in prediction
  - XAI applications in the microbiome: identification of relevant microbial markers
15:00-15:30 Coffee Break
15:30-17:00 Exercise Construction and interpretation of an ML model on metagenomic data
- Training a classification model on abundance data
- Hyperparameter optimization with cross-validation
- Interpretation of results with SHAP
- Discussion of the results and comparison with traditional methods

Day 4 - Analysis of the taxonomic and functional composition of the microbiome through shotgun metagenomics

Time Learning Experience Topic
09:00-10:30 Lecture Sequencing technologies
Brief introduction to the main analysis techniques:
- Taxonomic profiling
- Functional profiling
- Metagenome assembly and binning
Computational tools for metagenomics
- operating systems
- hardware requirements
- software tools (Nextflow, Docker, etc.)
10:30-11:00 Coffee Break
11:00-12:30 Lecture Taxonomic and functional profiling using shotgun data
Raw data preprocessing: read filtering and host elimination
Taxonomic profiling:
- MetaPhlAn
- Kraken2/Bracken
Functional profiling: HUMAnN
Taxonomic profiling beyond the species level:
Strain level analysis
- The species concept in bacteria
- Genomic variability within the species: strain, genome, pangenome
- Strain level profiling: StrainPhlAn
12:30-13:30 Lunch Break
13:30-15:00 Hands-On
- MetaPhlAn
- Kraken2/Bracken
- HUMAnN
15:00-15:30 Coffee Break
15:30-16:00 Hands-On StrainPhlAn
16:00-17:00 Lecture Metagenome assembly and binning
- Binning
- Quality measures for MAGs
- Dereplication of MAGs
- Taxonomic classification of MAGs
- Hands-On: kMetaShot
- Functional annotation of MAGs
20:30 Social Dinner

Day 5 - Case Studies

Time Learning Experience Topic
09:00-10:00 Lecture The human microbiome and the Holobiont theory of evolution
Duccio Cavalieri, University of Florence
10:00-11:00 Lecture The Food Microbiome
Eugenio Parente, University of Basilicata
11:00-11:15 Coffee Break
11:15-12:15 Lecture Living with microbes: The microbiome of the built environment and its implications in human wellbeing
Antonella Bruno, University of Milan Bicocca
12:15-13:15 Lecture Strain-resolved metagenomics of the human and food microbiomes
Edoardo Pasolli, University of Naples