CLUSTERING-BASED METHOD
(16S Illumina: BTW/WSL)
BMP pipeline
​
BTW tutorial: https://github.com/vpylro/BTW/blob/master/README.md
​
References:
Pylro et al. (2014), Data analysis for 16S microbial profiling from different benchtop sequencing platforms. Journal of Microbiological Methods 107:30-37. DOI: 10.1016/j.mimet.2014.08.018.
Morais et al. (2018), BTW—Bioinformatics Through Windows: an easy-to-install package to analyze marker gene data. PeerJ 6:e5299. DOI: 10.7717/peerj.5299.
This example assumes reads in FASTQ format.
​
This page gives a complete pipeline to analyze 16S rRNA gene data. Edit the commands as needed for your reads and file locations (represented here as $PWD/).
Create a working folder containing three empty subfolders: fastq/, fasta/ and demul/.
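The folder layout described above can be created in one command. A minimal sketch, assuming it is run from inside the working folder:

```shell
# Create the three subfolders used by the steps below.
# -p makes the command safe to re-run if a folder already exists.
mkdir -p fastq fasta demul
```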
​
1 - Join the forward and reverse Illumina reads (R1.fastq and R2.fastq files) with the fastq-join method <<<USING QIIME 1.9>>>. Step 2 expects the joined FASTQ files, one per sample, in fastq/.
multiple_join_paired_ends.py -i raw/ -o merged/
2 - Quality filter, truncate to a fixed length, and convert each joined sample to FASTA <<<USING VSEARCH>>>
for i in $(ls fastq/); do vsearch --fastx_filter fastq/$i --fastq_maxee 1.0 --fastq_trunclen 350 --fastaout fasta/${i%.fastq}.fa; done
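The --fastq_maxee 1.0 option discards reads whose expected number of errors exceeds 1.0. The expected error count is the sum of the per-base error probabilities 10^(-Q/10) implied by the Phred quality scores. The sketch below illustrates that calculation for a single quality string. The function name expected_errors is hypothetical, and the code assumes Phred+33 encoded qualities, which you should confirm for your own data.

```shell
# expected_errors QUALITY_STRING
# Prints the sum of per-base error probabilities for one FASTQ quality line.
# Assumption: qualities are Phred+33 encoded.
expected_errors() {
    printf '%s\n' "$1" | awk '
        BEGIN {
            # Lookup table from each printable ASCII character to its code.
            for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i
        }
        {
            ee = 0
            for (j = 1; j <= length($0); j++) {
                q = ord[substr($0, j, 1)] - 33   # Phred score of base j
                ee += 10 ^ (-q / 10)             # error probability of base j
            }
            printf "%.4f\n", ee
        }'
}
```

For example, expected_errors 'IIII' prints 0.0004 (four bases at Q40), so a read like that would pass the filter easily.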
3 - Change the sequence headers to make the files compatible with further steps <<<USING BMP PERL SCRIPT>>>. This script generates the converted FASTA file. Sample names should not contain any special characters, symbols or spaces. We strongly recommend keeping sample names as simple as possible.
for i in $(ls fasta/); do bmp_demultiplexed.pl -i fasta/$i -o demul/${i%.fa}.fa -b ${i%.fa}; done
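The loops in steps 2 and 3 derive each sample name from its file name by removing the suffix with ${i%.fastq} and ${i%.fa}. A minimal sketch for checking names before running step 3 follows. The function names are hypothetical, and accepting only ASCII letters and digits is an assumption made here; it is stricter than the rule stated in step 3.

```shell
# sample_name PATH
# Strips the directory and a trailing .fastq or .fa, mirroring ${i%.fastq}.
sample_name() {
    base=${1##*/}
    base=${base%.fastq}
    printf '%s\n' "${base%.fa}"
}

# is_simple_name NAME
# Succeeds only if NAME is non-empty and made of ASCII letters and digits.
is_simple_name() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[A-Za-z0-9]+$'
}

# Possible use, once fastq/ holds the joined samples:
#   for f in fastq/*.fastq; do
#       is_simple_name "$(sample_name "$f")" || echo "rename before continuing: $f"
#   done
```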
4 - Make a single file containing all your samples
​
cat demul/*.fa > reads.fa
​
5 - Dereplication <<<USING VSEARCH>>>
vsearch --derep_fulllength reads.fa --output derep.fa --sizeout
​
6 - Abundance sort and discard singletons <<<USING VSEARCH>>>
vsearch --sortbysize derep.fa --output sorted.fa --minsize 2
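With --sizeout, dereplication records the abundance of each unique sequence in its header as a size=N annotation, and --minsize 2 then drops every sequence with size 1. The sketch below shows one way to check how many reads survive these steps. The function names are hypothetical.

```shell
# count_seqs FASTA
# Number of sequence records, counted as header lines starting with ">".
count_seqs() {
    grep -c '^>' "$1"
}

# total_size FASTA
# Sum of the size=N abundance annotations found in the FASTA headers.
total_size() {
    awk '/^>/ && match($0, /size=[0-9]+/) {
             total += substr($0, RSTART + 5, RLENGTH - 5)
         }
         END { print total + 0 }' "$1"
}
```

Run on the files from steps 4 to 6, count_seqs reads.fa and total_size derep.fa should agree, and total_size sorted.fa gives the number of reads that remain once singletons are discarded.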
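7 - OTU clustering at 97% identity <<<USING VSEARCH>>>. Note that --cluster_size is VSEARCH's abundance-ordered greedy clustering, not the UPARSE algorithm, which is implemented in USEARCH.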
vsearch --cluster_size sorted.fa --consout otus1.fa --id 0.97
​
8 - Reformat the OTU FASTA file so that each sequence sits on a single line <<<FASTX TOOLKIT SCRIPT>>>
fasta_formatter -i otus1.fa -o formated_otus1.fa
9 - Rename the OTU sequences <<<BMP SCRIPT>>>
bmp-otuName.pl -i formated_otus1.fa -o otus.fa
10 - Map reads back to OTU database <<<VSEARCH>>>
vsearch --usearch_global $PWD/reads.fa --db otus.fa --strand plus --id 0.97 --uc map.txt
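The map.txt file written by --uc is a tab-separated table with one record per read. H records are hits, with the read label in column 9 and the matched OTU label in column 10. N records are reads that matched no OTU at 97% identity. A minimal sketch for summarizing the file, with hypothetical function names:

```shell
# mapping_summary UC_FILE
# Prints how many reads mapped to an OTU (H) and how many did not (N).
mapping_summary() {
    awk -F '\t' '$1 == "H" { hit++ }
                 $1 == "N" { miss++ }
                 END { printf "mapped=%d unmapped=%d\n", hit, miss }' "$1"
}

# otu_read_counts UC_FILE
# Prints "OTU<TAB>reads" for every OTU with at least one hit, sorted by label.
otu_read_counts() {
    awk -F '\t' '$1 == "H" { n[$10]++ }
                 END { for (otu in n) printf "%s\t%d\n", otu, n[otu] }' "$1" |
        LC_ALL=C sort
}
```

A large unmapped count from mapping_summary map.txt is worth investigating before building the OTU table.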
11 - Assign taxonomy to the OTUs using the RDP Classifier in QIIME (use the file "otus.fa" as input)
assign_taxonomy.py -i otus.fa -m rdp -o taxonomy
12 - Convert the UC map to otu_table.txt <<<BMP SCRIPT>>>

bmp-map2qiime.py map.txt > otu_table.txt
​
13 - Convert otu_table.txt to otu_table.biom <<<QIIME SCRIPT>>>

make_otu_table.py -i otu_table.txt -t taxonomy/otus_tax_assignments.txt -o otu_table.biom
​
14 - Check the OTU table in QIIME

biom summarize-table -i $PWD/otu_table.biom -o results_biom_table.txt
​​
The generated .biom OTU table is also fully compatible with MicrobiomeAnalyst, a user-friendly web-based platform for microbiome data analysis and visualization, including taxonomy plots and estimates of α- and β-diversity (http://www.microbiomeanalyst.ca).