Recommended Default Workflows

On this page you will find our recommendations on how to process each data type available in Qiita, along with the minimum processing requirements to make your data public.

For convenience, each preparation has an "Add Default Workflow" button that adds the recommended workflow steps for your raw data, which you can then process. Note, however, that some steps in this default workflow will not work with all raw data; for example, for target gene data the workflow is based on the default Earth Microbiome Project protocol and therefore assumes the uploaded data are multiplexed sequences with the reversed barcodes in your mapping file and an index sequence file (see here for more details). If that protocol does not apply to your data, you can still use the Default Workflow, but you should first manually process your data with the appropriate steps until you reach an artifact the workflow recognizes; in this example, demultiplex your reads. After demultiplexing, the Default Workflow is safe to use with any protocol.

If you have already manually performed some of the processing steps in the Default Workflow pipeline, the "Add Default Workflow" button will not re-select those steps; it will only select the remaining steps that have not been completed. You can also add additional workflows on top of the recommended Default Workflow at any time.

Note that this is not an exhaustive list of the data types accepted by Qiita, only those that have a defined workflow.

Application: 16S, 18S -> NO parameter restrictions
Trimming and Deblur and/or Closed Reference OTU-picking


This workflow will (1) trim your reads to a selected read length, and (2) run either Deblur or closed-reference OTU picking to generate a feature-table.

Note that it starts from a demultiplexed artifact, which means the raw data must already be demultiplexed and/or quality-controlled. More details.

Extra: The Deblur command actually includes a fragment insertion step via SEPP.
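As a rough illustration of the Deblur path, the sketch below builds the kind of command this workflow runs under the hood. The trim length and file paths are hypothetical placeholders, and in practice Qiita invokes Deblur through its plugin system rather than a hand-written command line:

```python
# Hedged sketch of the Deblur step in Qiita's default 16S/18S workflow.
# trim_length and the file paths are illustrative placeholders only.
trim_length = 150  # the "selected read length" from step (1)

deblur_cmd = [
    "deblur", "workflow",
    "--seqs-fp", "demultiplexed_seqs.fna",  # starts from a demultiplexed artifact
    "--output-dir", "deblur_out",
    "-t", str(trim_length),                 # trim reads to the selected length
]
print(" ".join(deblur_cmd))
```

The closed-reference OTU-picking path replaces the Deblur call but starts from the same trimmed, demultiplexed input.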

Application: Genome Isolate -> NO parameter restrictions
Genome Isolate Processing


This workflow can be used for assembling genomes and metagenomes (isolate and/or metagenomic data) using SPAdes v3.15.2 at set k-mer lengths of 21, 33, 55, 77, 99, and 127.

The assembled contigs are stored in per sample FASTA files (originally scaffolds.fna in SPAdes).

The --merge option merges the forward and reverse reads prior to assembly, which is preferable for isolates or metagenomes with high sequencing depth; the non-merge option works well for shallow shotgun data and/or complex environmental communities.

The --meta flag is used to assemble metagenomic datasets.
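Putting the pieces above together, a SPAdes invocation for a metagenomic sample might look like the sketch below. The read file names are placeholders, and Qiita runs this per sample through its plugin system; only the k-mer lengths and the --meta flag come from the description above:

```python
# Hedged sketch of the per-sample SPAdes call described above (v3.15.2).
kmer_lengths = [21, 33, 55, 77, 99, 127]  # the set k-mer lengths of this workflow

cmd = [
    "spades.py",
    "-k", ",".join(map(str, kmer_lengths)),
    "--meta",                     # only when assembling metagenomic datasets
    "-1", "sample_R1.fastq.gz",   # forward reads (placeholder path)
    "-2", "sample_R2.fastq.gz",   # reverse reads (placeholder path)
    "-o", "sample_assembly",      # output directory; contigs end up in per-sample FASTA
]
print(" ".join(cmd))
```

For an isolate, the --meta flag would simply be dropped from the list.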

Application: 16S, 18S -> NO parameter restrictions
Trimming and Deblur and/or Closed Reference OTU-picking [per-sample FASTQ]


This workflow will (1) trim your reads to a selected read length, and (2) run either Deblur or closed-reference OTU picking to generate a feature-table.

Note that it starts from a per-sample FASTQ artifact with Phred offset 33. More details.

Extra: The Deblur command actually includes a fragment insertion step via SEPP.

Application: Metagenomic -> NO parameter restrictions
Adapter & host removal, and Woltka profiling for Metagenomic datasets


This workflow will (1) trim adapter sequences using fastp_known_adapters_formatted.fna, (2) filter reads mapping to a selected host genome (if a host is selected), and (3) run the resulting sequences against WoLr2 with Woltka to generate per-genome and per-gene feature-tables at various taxonomic and functional levels.

This workflow relies on fastp, minimap2, and Woltka. More info

Note that the current recommendation is to run Woltka with both available databases, RS210 and WoLr2, as they give different perspectives on the same data. Additionally, this workflow starts from the rawest data available.

Note that, for human host samples, human sequence removal must be completed independently BEFORE upload to comply with Qiita’s Terms of Use.
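The three steps above can be summarized as a simple ordered pipeline. The sketch below is purely conceptual; it names the tools the workflow relies on but does not reproduce their actual invocations, which Qiita manages internally:

```python
# Conceptual order of the metagenomic default workflow's steps.
# Tool names come from the description above; invocations are not shown.
steps = [
    ("fastp", "adapter trimming with fastp_known_adapters_formatted.fna"),
    ("minimap2", "host-read filtering (skipped if no host genome is selected)"),
    ("woltka", "per-genome / per-gene profiling against WoLr2"),
]
for tool, purpose in steps:
    print(f"{tool}: {purpose}")
```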

Application: Metatranscriptomic -> NO parameter restrictions
Ribosomal RNA filtering, and Woltka profiling for Metatranscriptomic datasets


This workflow will (1) filter reads mapping to rRNA genes, and (2) run the resulting sequences against WoLr2 with Woltka to generate per-genome and per-gene feature-tables at various taxonomic and functional levels.

This workflow relies on SortMeRNA, and Woltka. More info

Note that the current recommendation is to run Woltka with both available databases, RS210 and WoLr2, as they give different perspectives on the same data. Additionally, this workflow starts from the rawest data available.

Note that, for human host samples, human sequence removal must be completed independently BEFORE upload to comply with Qiita’s Terms of Use.

Application: Metagenomic -> parameter restrictions:
  • Sample Information
    • calc_mass_sample_aliquot_input_g: *
  • Prep Information
    • syndna_pool_number: 1
    • mass_syndna_input_ng: *
    • vol_extracted_elution_ul: *
    • extracted_gdna_concentration_ng_ul: *

SynDNA filtering, Woltka profiling and cell counts calculations for Metagenomic datasets


This workflow will (1) run SynDNA filtering, (2) run the resulting sequences against WoLr2 with Woltka to generate per-genome and per-gene feature-tables at various taxonomic and functional levels, and (3) calculate per-genome cell counts using the SynDNA results.

This workflow relies on Woltka. More info.

Note that the current recommendation is to run Woltka with both available databases, RS210 and WoLr2, as they give different perspectives on the same data.

Note that, for human host samples, human sequence removal must be completed independently BEFORE upload to comply with Qiita’s Terms of Use.
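The parameter restrictions listed for this workflow can be expressed as a simple pre-flight check. The function below is an illustrative sketch, not Qiita code: it reads the field names from the restriction list above, treats "*" as "must be present with any value", and assumes syndna_pool_number is stored as the number 1 (metadata systems may store it as a string, so this is a simplification):

```python
# Illustrative check of this workflow's metadata restrictions (not Qiita code).
# "*" fields must merely be present; syndna_pool_number must equal 1.
def meets_restrictions(sample_info, prep_info):
    required_sample = ["calc_mass_sample_aliquot_input_g"]
    required_prep = [
        "mass_syndna_input_ng",
        "vol_extracted_elution_ul",
        "extracted_gdna_concentration_ng_ul",
    ]
    if prep_info.get("syndna_pool_number") != 1:
        return False
    return (all(sample_info.get(k) is not None for k in required_sample)
            and all(prep_info.get(k) is not None for k in required_prep))

# Example with placeholder values for every required field:
ok = meets_restrictions(
    {"calc_mass_sample_aliquot_input_g": 0.25},
    {"syndna_pool_number": 1, "mass_syndna_input_ng": 5.0,
     "vol_extracted_elution_ul": 70, "extracted_gdna_concentration_ng_ul": 12.3},
)
print(ok)  # → True
```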