Recommended Default Workflows

In this page you will find our recommendations on how to process each data type available in Qiita, as well as the minimum processing requirements to make your data public.

For convenience, there is an "Add Default Workflow" button for each preparation that will add the recommended workflow steps for your raw data, which you can then process. Note, however, that some steps in this default workflow will not work with all raw data; for example, for target gene data the workflow is based on the default Earth Microbiome Project protocol and therefore assumes the uploaded data are multiplexed sequences with the reversed barcodes in your mapping file and an index sequence file (see here for more details). If the protocol does not apply to your data you can still use the Default Workflow, but you should first manually process your data using the appropriate steps until you reach a defined step; in this example, demultiplexing your reads. After demultiplexing, the Default Workflow is safe to use with any protocol.

If you have already manually performed one of the processing steps in the Default Workflow pipeline, the "Add Default Workflow" button will not re-select those steps; instead, it will only select the remaining steps that have not been completed. You can also add additional workflows on top of the recommended Default Workflow at any time.

Note that this is not a fully inclusive list of the data types accepted by Qiita, but only those that have a defined workflow.
Hover over the spheres to get more information.

Application: Metatranscriptomic
Adapter & host removal, Ribosomal RNA filtering, and Woltka profiling


This workflow will (1) trim autodetected adapter sequences, (2) filter reads mapping to a selected host genome, (3) filter reads mapping to rRNA genes, and (4) run Woltka on the processed reads to generate per-genome and per-gene feature-tables at various taxonomic levels.

This workflow relies on fastp, minimap2, SortMeRNA, and Woltka; see here for more details.

Note that the current recommendation is to run Woltka with both available databases, Rep200 and WoL, as they give different perspectives on the same data. Additionally, this workflow starts from the rawest data available.
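The four steps above can be sketched as calls to the underlying tools. The file names, reference paths, and exact flags below are illustrative assumptions, not Qiita's internal invocations:

```shell
# (1) Adapter trimming with fastp (adapters autodetected).
fastp -i R1.fastq.gz -I R2.fastq.gz -o trim_R1.fastq.gz -O trim_R2.fastq.gz

# (2) Host filtering: map against a host genome index and keep only
#     read pairs where both mates are unmapped (-f 12).
minimap2 -ax sr host_genome.mmi trim_R1.fastq.gz trim_R2.fastq.gz \
  | samtools fastq -f 12 -1 filt_R1.fastq.gz -2 filt_R2.fastq.gz -

# (3) rRNA filtering with SortMeRNA; non-rRNA reads go to non_rrna*.
sortmerna --ref rrna_db.fasta --reads filt_R1.fastq.gz \
  --reads filt_R2.fastq.gz --other non_rrna --fastx

# (4) Woltka on the alignments of the filtered reads: per-genome table,
#     plus a per-gene table using genome coordinates.
woltka classify -i alignments/ -o per_genome.biom
woltka classify -i alignments/ --coords coords.txt -o per_gene.biom
```

The actual Qiita plugin wraps these tools with its own defaults; this sketch only shows how the four stages chain together.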

Application: 16S, 18S
Trimming and Deblur and/or Closed Reference OTU-picking


This workflow will (1) trim your reads to a selected read length, and (2) run either Deblur or closed-reference OTU picking to generate a feature-table.

Note that this workflow starts from a demultiplexed artifact, which means the raw data needs to be demultiplexed and/or QC-ed first. More details.

Extra: The Deblur command actually includes a fragment insertion step via SEPP.
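As a rough illustration of the trimming-plus-Deblur path, the underlying Deblur CLI call looks something like the following; the input path and the 150 bp trim length are example assumptions, not Qiita's fixed settings:

```shell
# Run Deblur on demultiplexed per-sample sequences, trimming to 150 bp;
# demux_seqs/ is a placeholder for the demultiplexed artifact's files.
deblur workflow --seqs-fp demux_seqs/ --output-dir deblur_out -t 150
```

In Qiita you pick the trim length in the processing dialog rather than on the command line.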

Application: Metagenomic
Adapter & host removal, and Woltka profiling


This workflow will (1) trim autodetected adapter sequences, (2) filter reads mapping to a selected host genome, and (3) run Woltka on the processed reads to generate per-genome and per-gene feature-tables at various taxonomic levels.

This workflow relies on fastp, minimap2, and Woltka; see here for more details.

Note that the current recommendation is to run Woltka with both available databases, Rep200 and WoL, as they give different perspectives on the same data. Additionally, this workflow starts from the rawest data available.
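For instance, profiling against both databases amounts to two classify calls over the respective alignments; the directory and output names below are hypothetical placeholders:

```shell
# Per-genome feature tables from reads mapped against each database
# (alignments_rep200/ and alignments_wol/ are assumed directories of
# alignments against Rep200 and WoL, respectively).
woltka classify -i alignments_rep200/ -o rep200_per_genome.biom
woltka classify -i alignments_wol/ -o wol_per_genome.biom
```

In Qiita this simply means selecting each database once when adding the Woltka step.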

Application: Genome Isolate
Genome Isolate Processing


This workflow can be used for assembling genomes and metagenomes (isolate and/or metagenomic data) using SPAdes v3.15.2 at fixed k-mer lengths of 21, 33, 55, 77, 99, and 127.

The assembled contigs are stored in per-sample FASTA files (originally scaffolds.fna in SPAdes).

The --merge option merges the forward and reverse reads prior to assembly (preferable for isolates or metagenomes with high sequencing depth), while the non-merge option works well for shallow shotgun data and/or complex environmental communities.

The --meta flag is used to assemble metagenomic datasets.
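A sketch of what the assembly step runs underneath; the sample file names are placeholders, while the k-mer series and --meta flag match the description above:

```shell
# Metagenomic assembly with the stated k-mer series; for isolate data,
# drop --meta. Assembled scaffolds are written into sample_assembly/.
spades.py --meta -k 21,33,55,77,99,127 \
  -1 sample_R1.fastq.gz -2 sample_R2.fastq.gz -o sample_assembly
```

With the --merge option, the forward and reverse reads would first be merged and the merged reads fed to SPAdes instead of the raw pair.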