• Home
  • About
    • LM Rodriguez-R photo

      LM Rodriguez-R

      Bioinformatics & Microbial Ecology

    • Learn More
    • Email
    • Twitter
    • LinkedIn
    • Tumblr
    • Medium
    • Github
  • Posts
    • Blog
    • Tags
  • Publications
  • Software
  • CV

EMBO Metagenomics 360° 2025

11 Mar 2025

Reading time ~2 minutes

Genome-Resolved Metagenomics

All the data used here is already available in your Virtual Machines, but you can also obtain it here if you’d like to reproduce these tasks in your own machines:

wget 'https://disc-genomics.uibk.ac.at/data/came24.tar.gz' -O - | tar -zx
cd SLMU

0. Nonpareil

First, let us define where the data is, so we don’t have to type it every time:

DATA=/mnt/data/Rodriguez

Now, we can start by running nonpareil locally:

mkdir -p 00_nonpareil
nonpareil -T kmer -s $DATA/SLMU.1.fastq.gz -f fastq -b 00_nonpareil/SLMU
NonpareilCurves.R --pdf 00_nonpareil/SLMU.pdf --no-observed 00_nonpareil/SLMU.npo
xdg-open 00_nonpareil/SLMU.pdf

1. Assembly

# IMPORTANT: This step might use significant RAM and take a very long time
spades.py --meta -1 $DATA/SLMU.1.fastq.gz -2 $DATA/SLMU.2.fastq.gz -o 01_assembly

If this is taking too long, you can kill the process by using Ctrl + C, and instead copy the scaffolds we provide:

rm -rf 01_assembly
mkdir -p 01_assembly
cp $DATA/SLMU.scaffolds.fasta 01_assembly/scaffolds.fasta

2. Mapping reads to the assembly

mkdir -p 02_mapping
bowtie2-build 01_assembly/scaffolds.fasta 02_mapping/SLMU.idx
bowtie2 -1 $DATA/SLMU.1.fastq.gz -2 $DATA/SLMU.2.fastq.gz \
      -S 02_mapping/SLMU.sam -x 02_mapping/SLMU.idx --no-unal
samtools view -b 02_mapping/SLMU.sam | samtools sort -o 02_mapping/SLMU.bam -
ls 02_mapping

3. Binning

mkdir -p 03_binning
jgi_summarize_bam_contig_depths --outputDepth 03_binning/SLMU.abund \
      02_mapping/SLMU.bam
metabat2 -i 01_assembly/scaffolds.fasta -a 03_binning/SLMU.abund \
      -o 03_binning/SLMU_bin
ls 03_binning/SLMU_bin.*.fa

4. Comparing a MAG against a collection of close relatives

# IMPORTANT: This step is currently causing problems because NCBI has blocked
# the internal network. If if fails, you can use our local copy (see below)
miga new -v -P 04_classification/Sal -t genomes
miga gtdb_get -v -P 04_classification/Sal -T g__Salinibacter --ref

If the genomes cannot be downloaded, you can also use our local copy instead:

# IMPORTANT: This will remove the results from the above commands, so make sure
# you use it only if the above is failing
rm -rf 04_classification
mkdir -p 04_classification
cp -R $DATA/Sal 04_classification/Sal

And finally, you can add your own genomes and process all the data:

miga add -v -P 04_classification/Sal -i assembly -t popgenome \
      03_binning/SLMU_bin.*.fa
miga index_wf -o 04_classification/Sal -v

That last command will take some time, but you can continue to part 5 and revisit the results from MiGA later.

Once the run is complete, you can open the file:

firefox 04_classification/Sal/index.html

If you want the taxonomic table, you can use:

miga ls -P 04_classification/Sal -o 04_classification/Sal/taxonomy.tsv \
      --active -m tax --tab
miga browse -P 04_classification/Sal

And reload the page in Firefox, which should now have a taxonomy link.

5. Mapping reads to the genome(s)

mkdir -p 05_genome_mapping
bowtie2-build 03_binning/SLMU_bin.5.fa 05_genome_mapping/SLMU_bin5.idx
bowtie2 -1 $DATA/SLMU.1.fastq.gz -2 $DATA/SLMU.2.fastq.gz \
      -S 05_genome_mapping/SLMU_bin5.sam \
      -x 05_genome_mapping/SLMU_bin5.idx --no-unal

6. Recruitment plots

mkdir -p 06_recplot
rpe build -d 06_recplot/SLMU_5.db -r 05_genome_mapping/SLMU_bin5.sam \
    -g 03_binning/SLMU_bin.5.fa --mag
rpe plot -d 06_recplot/SLMU_5.db -o 06_recplot/plot

Open the file manager and open the following file to explore some of the results locally:

firefox 06_recplot/plot/SLMU_bin5_sam/SLMU_bin_recruitment_plot.html

Software Used

  • Nonpareil
  • SPADES
  • Bowtie2
  • Metabat2
  • MiGA
  • RecruitPlotEasy

And lots of Software used by those above!



Share Tweet +1