Nextflow Pipeline: Parameters Reference
This document provides a comprehensive reference for all parameters available in the Nextflow pipeline for CAZyme annotation in microbiome data. Parameters are organized by functional category.
Note: Before running the pipeline, make sure you have cloned the repository. See Nextflow Pipeline: Usage for installation instructions. All examples assume you are in the dbcan-nf directory or use the full path to main.nf.
Input/Output Options
| Parameter | Type | Required | Description |
|---|---|---|---|
| --input | string | Yes | Path to comma-separated file containing information about the samples in the experiment. Must have 3 columns (sample, fastq_1, fastq_2) and a header row. See Nextflow Pipeline: Usage for samplesheet format details. |
| --outdir | string | Yes | The output directory where the results will be saved. Use absolute paths for cloud infrastructure. |
|  | string | No | Email address for completion summary. Set this to receive a summary email when the workflow exits. Can be set in |
|  | string | No | Email address for completion summary, only sent when pipeline fails. |
|  | boolean | No | Send plain-text email instead of HTML. Default: |
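The samplesheet requirement above (a header row and the 3 columns sample, fastq_1, fastq_2) can be checked before launching. The following is a minimal sketch, not part of the pipeline: the function name check_samplesheet, the sample names, and the file paths are hypothetical placeholders, and the pipeline's own validation may be stricter.

```shell
# check_samplesheet FILE: succeed only if FILE starts with the expected
# three-column header and every non-blank data row has exactly three
# comma-separated fields.
check_samplesheet() {
    sheet="$1"
    [ -f "$sheet" ] || { echo "missing file: $sheet" >&2; return 1; }
    # Strip a trailing carriage return in case the file was saved on Windows
    header=$(head -n 1 "$sheet" | tr -d '\r')
    if [ "$header" != "sample,fastq_1,fastq_2" ]; then
        echo "unexpected header: $header" >&2
        return 1
    fi
    # NF is the number of comma-separated fields; skip the header and blank lines
    awk -F',' 'NR > 1 && NF > 0 && NF != 3 { bad = 1 } END { exit bad }' "$sheet"
}

# Hypothetical example sheet (sample names and paths are placeholders)
cat > samplesheet.csv <<'EOF'
sample,fastq_1,fastq_2
sample_A,/data/sample_A_R1.fastq.gz,/data/sample_A_R2.fastq.gz
sample_B,/data/sample_B_R1.fastq.gz,/data/sample_B_R2.fastq.gz
EOF

check_samplesheet samplesheet.csv && echo "samplesheet looks OK"
```

The check only looks at the shape of the file; it does not confirm that the FASTQ paths exist.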
Analysis Mode Selection
| Parameter | Type | Default | Description |
|---|---|---|---|
| --type | string |  | Analysis mode to use. Options: |
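A wrapper script can reject a mistyped mode before Nextflow starts. The sketch below assumes the accepted values are shortreads, longreads, and assemfree, which are the values used in the usage examples in this document; the pipeline may accept others. The function names are hypothetical, and the command is only printed, not executed.

```shell
# is_known_type MODE: succeed if MODE is one of the values used in this
# document's examples (an assumption, not the pipeline's authoritative list).
is_known_type() {
    case "$1" in
        shortreads|longreads|assemfree) return 0 ;;
        *) return 1 ;;
    esac
}

# build_launch_command MODE: print the launch command for MODE, or fail
# with a message on stderr when MODE is not recognised.
build_launch_command() {
    mode="$1"
    if ! is_known_type "$mode"; then
        echo "unknown --type value: $mode" >&2
        return 1
    fi
    echo "nextflow run main.nf --input samplesheet.csv --outdir results --type $mode -profile docker"
}

# Dry run: show the command that would be launched
build_launch_command shortreads
```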
Quality Control Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| --skip_fastqc | boolean |  | Skip FastQC quality control analysis. When enabled, FastQC steps are bypassed. |
| --skip_trimming | boolean |  | Skip TrimGalore adapter trimming. When enabled, trimming steps are bypassed. |
Kraken2 Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| --skip_kraken_extraction | boolean |  | Skip Kraken2 taxonomic classification and read extraction. When enabled, all reads are used without filtering. |
| --kraken_db | string |  | Path to custom Kraken2 database directory. If not specified, the pipeline will build a standard database automatically. |
| --kraken_tax | string |  | NCBI taxonomy ID for taxonomic filtering. Default |
Assembly Options (Short Reads Mode)
These parameters are only applicable when --type shortreads is used.
| Parameter | Type | Default | Description |
|---|---|---|---|
| --subsample | boolean |  | Enable subsampling mode. Downsample each sample before assembly to reduce computational requirements. Mutually exclusive with |
| --subsample_size | integer |  | Number of reads per file to retain when subsampling is enabled. Only used when |
|  | boolean |  | Enable co-assembly mode. Combine all samples and perform a single MEGAHIT assembly. Requires at least 2 samples. Mutually exclusive with |
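Because co-assembly requires at least 2 samples, it can help to count the samples in the samplesheet before enabling it. This is a rough sketch under two assumptions: a sample is taken to be a distinct value in the first column of the samplesheet, and the function names and demo files are hypothetical.

```shell
# count_samples FILE: print the number of distinct sample names, i.e. distinct
# values in the first column, ignoring the header row and blank lines.
count_samples() {
    tail -n +2 "$1" | awk -F',' 'NF > 0 { print $1 }' | sort -u | wc -l | tr -d ' '
}

# enough_for_coassembly FILE: succeed when the sheet lists at least 2 samples.
enough_for_coassembly() {
    [ "$(count_samples "$1")" -ge 2 ]
}

# Demo sheets written to a temporary directory (names and paths are placeholders)
demo_dir=$(mktemp -d)
printf 'sample,fastq_1,fastq_2\nsample_A,a_1.fq.gz,a_2.fq.gz\nsample_B,b_1.fq.gz,b_2.fq.gz\n' > "$demo_dir/two.csv"
printf 'sample,fastq_1,fastq_2\nsample_A,a_1.fq.gz,a_2.fq.gz\n' > "$demo_dir/one.csv"

for sheet in "$demo_dir/two.csv" "$demo_dir/one.csv"; do
    if enough_for_coassembly "$sheet"; then
        echo "$(basename "$sheet"): co-assembly is possible"
    else
        echo "$(basename "$sheet"): too few samples for co-assembly"
    fi
done
```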
Long Reads Options
These parameters are only applicable when --type longreads is used.
| Parameter | Type | Default | Description |
|---|---|---|---|
| --flye_mode | string |  | Flye assembly mode. Options include |
Database Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| --dbcan_db | string |  | Path to custom dbCAN database directory. If not specified, the pipeline will download the database automatically. |
MultiQC Options
| Parameter | Type | Default | Description |
|---|---|---|---|
|  | string |  | MultiQC report title. Printed as page header, used for filename if not otherwise specified. |
|  | string |  | Custom config file to supply to MultiQC. |
|  | string |  | Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file. |
|  | string |  | Custom MultiQC YAML file containing HTML including a methods description. |
|  | string |  | File size limit when attaching MultiQC reports to summary emails. |
Generic Options
These options are common to all nf-core pipelines and are typically set in a Nextflow config file.
| Parameter | Type | Default | Description |
|---|---|---|---|
|  | string |  | Method used to save pipeline results to output directory. Options: |
|  | boolean |  | Do not use coloured log outputs. |
|  | string |  | Incoming hook URL for messaging service. Currently, MS Teams and Slack are supported. |
|  | boolean |  | Whether to validate parameters against the schema at runtime. |
Parameter Usage Examples
Basic short reads analysis:
nextflow run main.nf \
    --input samplesheet.csv \
    --outdir results \
    --type shortreads \
    -profile docker
Short reads with subsampling:
nextflow run main.nf \
    --input samplesheet.csv \
    --outdir results \
    --type shortreads \
    --subsample \
    --subsample_size 5000000 \
    -profile docker
Long reads analysis:
nextflow run main.nf \
    --input samplesheet.csv \
    --outdir results \
    --type longreads \
    --flye_mode --nano-raw \
    -profile docker
Assembly-free analysis:
nextflow run main.nf \
    --input samplesheet.csv \
    --outdir results \
    --type assemfree \
    -profile docker
With custom databases:
nextflow run main.nf \
    --input samplesheet.csv \
    --outdir results \
    --type shortreads \
    --dbcan_db /path/to/dbcan_db \
    --kraken_db /path/to/kraken_db \
    -profile docker
Skipping quality control steps:
nextflow run main.nf \
    --input samplesheet.csv \
    --outdir results \
    --type shortreads \
    --skip_fastqc \
    --skip_trimming \
    -profile docker
Using parameter files:
nextflow run main.nf \
    -profile docker \
    -params-file params.yaml
With params.yaml:
input: './samplesheet.csv'
outdir: './results/'
type: 'shortreads'
subsample: true
subsample_size: 5000000
skip_kraken_extraction: false
kraken_tax: '9606'
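When several runs differ only in a few values, the parameter file can be generated from a script. The following is a minimal sketch, assuming the keys shown in the params.yaml example above; the function name write_params and the output file name are hypothetical, and the launch command is only printed, not executed.

```shell
# write_params OUT MODE SIZE: write a params file that enables subsampling.
write_params() {
    out="$1"; mode="$2"; size="$3"
    # subsample_size is documented as an integer, so reject anything
    # that is empty or contains a non-digit character
    case "$size" in
        ''|*[!0-9]*) echo "subsample_size must be a whole number: $size" >&2; return 1 ;;
    esac
    cat > "$out" <<EOF
input: './samplesheet.csv'
outdir: './results/'
type: '$mode'
subsample: true
subsample_size: $size
EOF
}

# Write the file, then show the command that would use it
write_params params.subsample.yaml shortreads 5000000 &&
    echo "nextflow run main.nf -profile docker -params-file params.subsample.yaml"
```

Keeping one generated file per run also leaves a record of the exact parameters used.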