A modular tool to aggregate results from bioinformatics analyses across many samples into a single report.
This report has been generated by the Arcadia-Science/metagenomics analysis pipeline using the Illumina workflow.
Report generated on 2023-05-21, 02:28 UTC, based on data in: /tmp/nxf.1x5laC0nuT
General Statistics
Showing 18/18 rows and 12/15 columns.
| Sample Name | N50 (Kbp) | Assembly Length (Mbp) | Error rate | M Non-Primary | M Reads Mapped | % Mapped | % Proper Pairs | M Total seqs | % Duplication | GC content | % PF | % Adapter |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EL12weeks | | | 0.69% | 0.0 | 70.3 | 58.5% | 17.7% | 120.2 | 28.4% | 46.9% | 99.7% | 0.0% |
| EL12weeks.reformatted | 4.6Kbp | 104.9Mbp | | | | | | | | | | |
| EL2weeks | | | 0.86% | 0.0 | 105.1 | 97.8% | 93.3% | 107.5 | 3.6% | 40.7% | 98.9% | 0.4% |
| EL2weeks.reformatted | 32.3Kbp | 88.6Mbp | | | | | | | | | | |
| EL4weeks | | | 0.64% | 0.0 | 140.0 | 98.1% | 92.6% | 142.7 | 6.4% | 42.7% | 99.5% | 0.2% |
| EL4weeks.reformatted | 25.5Kbp | 83.3Mbp | | | | | | | | | | |
| OM2weeks | | | 0.79% | 0.0 | 110.0 | 97.9% | 90.4% | 112.3 | 4.4% | 45.1% | 99.1% | 0.1% |
| OM2weeks.reformatted | 20.4Kbp | 85.5Mbp | | | | | | | | | | |
| OM4weeks | | | 0.77% | 0.0 | 107.6 | 98.0% | 90.6% | 109.8 | 6.4% | 42.3% | 99.3% | 0.1% |
| OM4weeks.reformatted | 5.8Kbp | 83.3Mbp | | | | | | | | | | |
| OM8weeks | | | 0.79% | 0.0 | 111.1 | 97.9% | 90.6% | 113.4 | 4.9% | 45.5% | 99.2% | 0.2% |
| OM8weeks.reformatted | 12.0Kbp | 120.6Mbp | | | | | | | | | | |
| WH1month | | | 1.12% | 0.0 | 116.5 | 97.3% | 92.6% | 119.7 | 3.2% | 58.5% | 98.5% | 0.8% |
| WH1month.reformatted | 10.0Kbp | 133.4Mbp | | | | | | | | | | |
| WH2months | | | 1.13% | 0.0 | 122.2 | 97.8% | 93.6% | 125.0 | 3.1% | 62.0% | 98.4% | 0.8% |
| WH2months.reformatted | 6.6Kbp | 162.9Mbp | | | | | | | | | | |
| WH4months | | | 0.43% | 0.0 | 111.6 | 98.5% | 93.5% | 113.3 | 18.5% | 53.6% | 99.5% | 0.2% |
| WH4months.reformatted | 28.2Kbp | 117.8Mbp | | | | | | | | | | |
fastp
fastp is an ultra-fast all-in-one FASTQ preprocessor (QC, adapters, trimming, filtering, splitting...). DOI: 10.1093/bioinformatics/bty560.
Filtered Reads
Filtering statistics of sampled reads.
Insert Sizes
Insert size estimation of sampled reads.
Sequence Quality
Average sequencing quality over each base of all reads.
GC Content
Average GC content over each base of all reads.
N content
Average N content over each base of all reads.
QUAST
QUAST is a quality assessment tool for genome assemblies, written by the Center for Algorithmic Biotechnology. DOI: 10.1093/bioinformatics/btt086.
Assembly Statistics
| Sample Name | N50 (Kbp) | L50 (K) | Largest contig (Kbp) | Length (Mbp) |
|---|---|---|---|---|
| EL12weeks.reformatted | 4.6Kbp | 3.3K | 270.1Kbp | 104.9Mbp |
| EL2weeks.reformatted | 32.3Kbp | 0.6K | 863.9Kbp | 88.6Mbp |
| EL4weeks.reformatted | 25.5Kbp | 0.7K | 610.0Kbp | 83.3Mbp |
| OM2weeks.reformatted | 20.4Kbp | 0.9K | 1401.5Kbp | 85.5Mbp |
| OM4weeks.reformatted | 5.8Kbp | 2.8K | 266.5Kbp | 83.3Mbp |
| OM8weeks.reformatted | 12.0Kbp | 1.9K | 489.5Kbp | 120.6Mbp |
| WH1month.reformatted | 10.0Kbp | 2.5K | 568.7Kbp | 133.4Mbp |
| WH2months.reformatted | 6.6Kbp | 4.0K | 397.2Kbp | 162.9Mbp |
| WH4months.reformatted | 28.2Kbp | 0.8K | 568.7Kbp | 117.8Mbp |
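For readers unfamiliar with the N50 and L50 columns, the following is a minimal sketch of how these metrics are conventionally computed from a list of contig lengths. It is not QUAST's own code; the function name `n50_l50` and the toy contig lengths are hypothetical and are not taken from this report.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths in bp.

    N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of the assembly;
    L50 is the number of contigs in that set.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("contig_lengths must not be empty")


# Hypothetical toy assembly (lengths in bp), for illustration only.
toy_contigs = [100, 80, 60, 40, 20]
n50, l50 = n50_l50(toy_contigs)
print(n50, l50)  # 80 2
```

A high N50 paired with a low L50 (for example EL2weeks.reformatted in the table) indicates that a small number of long contigs make up half of the assembly.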
Number of Contigs
This plot shows the number of contigs found for each assembly, broken down by length.
Samtools
Samtools is a suite of programs for interacting with high-throughput sequencing data. DOI: 10.1093/bioinformatics/btp352.
Percent Mapped
Alignment metrics from samtools stats; mapped vs. unmapped reads.
For a set of samples that have come from the same multiplexed library, similar numbers of reads for each sample are expected. Large differences in numbers might indicate issues during the library preparation process. Whilst large differences in read numbers may be controlled for in downstream processing (e.g. read count normalisation), you may wish to consider whether the read depths achieved have fallen below the levels recommended for your application.
Low alignment rates could indicate contamination of samples (e.g. adapter sequences), low sequencing quality or other artefacts. These can be further investigated in the sequence level QC (e.g. from FastQC).
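As a rough illustration of the check described above, the sketch below flags samples whose alignment rate falls under a cutoff. The % Mapped values are copied from the General Statistics table; the 90% threshold and the helper name `flag_low_mapping` are assumptions made for this example, not pipeline settings.

```python
# % Mapped values copied from the General Statistics table in this report.
percent_mapped = {
    "EL12weeks": 58.5,
    "EL2weeks": 97.8,
    "EL4weeks": 98.1,
    "OM2weeks": 97.9,
    "OM4weeks": 98.0,
    "OM8weeks": 97.9,
    "WH1month": 97.3,
    "WH2months": 97.8,
    "WH4months": 98.5,
}

# Assumed cutoff for illustration only; choose one suited to your study.
MIN_PERCENT_MAPPED = 90.0


def flag_low_mapping(rates, threshold=MIN_PERCENT_MAPPED):
    """Return the sorted names of samples mapped below the threshold."""
    return sorted(name for name, pct in rates.items() if pct < threshold)


low = flag_low_mapping(percent_mapped)
print(low)  # ['EL12weeks']
```

Under this assumed cutoff only EL12weeks is flagged, which matches its low % Proper Pairs and high % Duplication in the same table.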
Alignment metrics
This module parses the output from samtools stats. All numbers in millions.
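The sketch below shows one way the summary-number (`SN`) lines of a `samtools stats` file could be read and scaled to millions. It assumes the tab-separated `SN` layout that `samtools stats` writes and that every `SN` value is numeric. The input excerpt is hypothetical, the function name `parse_summary_numbers` is illustrative, and computing the mapped percentage as reads mapped over raw total sequences is an assumption rather than a description of how the report derives it.

```python
def parse_summary_numbers(stats_text):
    """Collect the SN (summary numbers) lines into a dict of floats."""
    summary = {}
    for line in stats_text.splitlines():
        if not line.startswith("SN\t"):
            continue  # skip comments and other sections
        fields = line.split("\t")
        key = fields[1].rstrip(":")
        # fields[2] holds the value; a trailing "# ..." comment may follow.
        summary[key] = float(fields[2])
    return summary


# Hypothetical excerpt, not taken from this run.
example = "\n".join([
    "# This file was produced by samtools stats",
    "SN\traw total sequences:\t2000000",
    "SN\treads mapped:\t1500000",
    "SN\terror rate:\t5.0e-03\t# mismatches / bases mapped (cigar)",
])

stats = parse_summary_numbers(example)
reads_mapped_millions = stats["reads mapped"] / 1_000_000
# Assumed definition of the mapped percentage, for illustration.
percent_mapped = 100 * stats["reads mapped"] / stats["raw total sequences"]
print(reads_mapped_millions, percent_mapped)  # 1.5 75.0
```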
Arcadia-Science/metagenomics Software Versions
Software versions are collected at run time from the software output.
| Process Name | Software | Version |
|---|---|---|
| BOWTIE2_ASSEMBLY_ALIGN | bowtie2 | 2.4.2 |
| | pigz | 2.3.4 |
| | samtools | 1.11 |
| BOWTIE2_ASSEMBLY_BUILD | bowtie2 | 2.4.2 |
| CHECK_SAMPLESHEET | python | 3.9.5 |
| CUSTOM_DUMPSOFTWAREVERSIONS | python | 3.10.6 |
| | yaml | 6.0 |
| FASTP | fastp | 0.23.2 |
| METABAT2_JGISUMMARIZEBAMCONTIGDEPTHS | metabat2 | 2.15 |
| METASPADES | metaspades | 3.15.3 |
| | python | 3.9.6 |
| PRODIGAL | pigz | 2.6 |
| | prodigal | 2.6.3 |
| QUAST | quast | 5.2.0 |
| SAMTOOLS_STATS | samtools | 1.16.1 |
| SOURMASH_COMPARE | sourmash | 4.6.1 |
| SOURMASH_GATHER | sourmash | 4.6.1 |
| SOURMASH_SKETCH | sourmash | 4.6.1 |
| SOURMASH_TAXANNOTATE | sourmash | 4.6.1 |
| Workflow | Arcadia-Science/metagenomics | 1.0dev |
| | Nextflow | 23.04.1 |
Arcadia-Science/metagenomics Workflow Summary
This information is collected when the pipeline is started.