Options

clean


To show all available options and their default values, type in your terminal:

captus_assembly clean --help

Input


-r, --reads

Use this option to provide the location of your raw FASTQ files. There are several ways to list them:

  • Directory: the path to the directory containing your FASTQ files is usually the easiest way to tell Captus which files to analyze. When you provide a directory, Captus searches within all its subdirectories for files with valid FASTQ extensions.

  • List of files: you can also provide the individual path to each of your FASTQ files, separated by spaces. This is useful, for example, if you want to analyze only a couple of samples within a directory that contains many other samples. Lists are also useful when your FASTQ files are located in different directories.

  • UNIX pattern: another easy way to provide lists of files is to use the wildcards * and ?, which match any number of characters or exactly one character, respectively.

This argument is required.

Examples
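
The three ways of passing reads can be sketched as follows. All directory and file names here are hypothetical, and the Captus commands are only printed with echo so that the snippet can be tried without Captus or real data. The wildcard part uses a throwaway directory to show what * and ? actually expand to.

```shell
# 1. Directory (hypothetical path): Captus searches it and its subdirectories
echo "captus_assembly clean --reads ./raw_reads"

# 2. List of files (hypothetical names), separated by spaces
echo "captus_assembly clean --reads ./runA/sampleA_R1.fastq.gz ./runB/sampleB_R1.fastq.gz"

# 3. UNIX pattern: build a throwaway directory with empty, hypothetical files
demo_dir=$(mktemp -d)
touch "$demo_dir/sampleA_R1.fastq.gz" "$demo_dir/sampleA_R2.fastq.gz" \
      "$demo_dir/sampleB_R1.fastq.gz" "$demo_dir/sampleB_R2.fastq.gz"

# '*' matches any number of characters: all four files
all_reads=$(cd "$demo_dir" && echo *.fastq.gz)

# '?' matches exactly one character: only the two sampleA files
pair_a=$(cd "$demo_dir" && echo sampleA_R?.fastq.gz)

echo "captus_assembly clean --reads $all_reads"
echo "captus_assembly clean --reads $pair_a"

# Remove the throwaway directory
rm -rf "$demo_dir"
```

In a real call you would type the pattern directly after --reads and let the shell expand it into the list of matching files.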

Output


-o, --out

With this option you can redirect the output to a directory of your choice. The path will be created if it doesn’t already exist.

This argument is optional, the default is ./01_clean_reads/


--keep_all

Many intermediate files are created during the read cleanup. Some are large (like FASTQ files) while others are small (like temporary logs). Captus deletes all the unnecessary intermediate files unless you enable this flag.


--overwrite

Use this flag with caution: it will replace any previous results within the output directory (for the sample names that match).
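
As a rough sketch of how the output options combine, the snippet below assembles a command that writes to a custom directory and replaces earlier results for matching samples. Both paths are hypothetical, and the command is only printed so it can be reviewed before running.

```shell
reads_dir=./raw_reads        # hypothetical input directory
out_dir=./my_clean_reads     # hypothetical output directory, created if missing

# Append --keep_all as well if you want to inspect the intermediate files
cmd="captus_assembly clean --reads $reads_dir --out $out_dir --overwrite"

# Print the command for review; run it for real with: eval "$cmd"
echo "$cmd"
```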


Adaptor trimming


--adaptor_set

Captus comes bundled with adaptor sequences. These sets are available:

  • Illumina = Adaptor set copied from BBTools.
  • BGI = Including BGISEQ, DNBSEQ, and MGISEQ.
  • ALL = If you are unsure of the technology used for your sequences this combines both sets of adaptors.

This argument is optional, the default is ALL.


--rna

Enable this flag to trim poly-A tails from RNA-Seq reads.


Quality trimming and filtering

Here you can control PHRED quality score thresholds. BBTools uses the PHRED algorithm to trim low-quality bases or to discard low-quality reads.


--trimq

Leading and trailing read regions with average PHRED quality score below this value will be trimmed.

Many people raise this value to 20 or even higher but that usually discards lots of useful data for de novo assembly. In general, unless you have really high sequencing depth, don’t increase this threshold beyond ~16.

This argument is optional, the default is 13.


--maq

Once the trimming of low-quality bases from both ends of the reads has been completed, the average PHRED score of the entire read is recalculated and reads that do not have at least this minimum average quality are discarded.

Again, very high thresholds will throw away useful data. In general, set it to the same value as --trimq or just a couple of points higher.

This argument is optional, the default is 16.
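
To put these thresholds in perspective, a PHRED score Q corresponds to a base-calling error probability of 10^(-Q/10). The sketch below converts the two defaults (13 and 16) and the commonly used value of 20. The helper function phred_to_error is a hypothetical name used for illustration only and is not part of Captus or BBTools.

```shell
# Convert a PHRED quality score into its error probability: P = 10^(-Q/10)
phred_to_error() {
    awk -v q="$1" 'BEGIN { printf "%.4f", 10 ^ (-q / 10) }'
}

for q in 13 16 20; do
    echo "Q$q -> error probability $(phred_to_error "$q")"
done
```

The defaults therefore tolerate roughly 5% and 2.5% expected errors, compared with 1% at Q20.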


--ftl

Trim any base to the left of this position. For example, if you want to remove 4 bases from the left of the reads, set this number to 5.

This argument is optional, the default is 0 (no ftl applied).


--ftr

Trim any base to the right of this position. For example, if you want to truncate your reads to 100 bp, set this number to 100.

This argument is optional, the default is 0 (no ftr applied).
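
The positions described for --ftl and --ftr can be visualized on a toy read with cut, which also counts characters starting at 1. This only mimics the behavior as described above on a made-up 10-base sequence; it is not how Captus or BBTools implement the trimming.

```shell
read_seq="ACGTTGCAAT"   # made-up 10-base read
ftl=5                   # drop everything to the left of position 5
ftr=8                   # drop everything to the right of position 8

# Keep from position 5 onwards: the first 4 bases are removed
after_ftl=$(printf '%s\n' "$read_seq" | cut -c"$ftl"-)

# Keep positions 1 to 8: the read is truncated to 8 bases
after_ftr=$(printf '%s\n' "$read_seq" | cut -c1-"$ftr")

echo "original:   $read_seq"
echo "after ftl:  $after_ftl"
echo "after ftr:  $after_ftr"
```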


QC Statistics


--qc_program

Select the program for obtaining the statistics from your FASTQ files. Both programs should return identical results, but Falco is much faster. Valid options are:

  • Falco
  • FastQC

This argument is optional, the default is Falco.


--skip_qc_stats

This flag disables the Falco or FastQC analysis. Keep in mind that the final HTML report can’t be created without the results from this analysis.


Other


--bbduk_path, --falco_path, --fastqc_path

If you have installed your own copies of bbduk.sh, Falco, or FastQC, you can provide the full path to those copies.

These arguments are optional, the defaults are bbduk.sh, falco, and fastqc respectively.
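
Before deciding whether you need these options, it can help to check which of the default program names are reachable from your PATH. The following is a minimal sketch using the standard command -v lookup; it does not reproduce any check that Captus itself performs.

```shell
# Report, for each default program name, whether the shell can find it
report=$(
    for tool in bbduk.sh falco fastqc; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool found at $(command -v "$tool")"
        else
            echo "$tool not found on PATH, provide its full path instead"
        fi
    done
)
echo "$report"
```

For any program reported as not found, pass the full path of your own copy with the matching option (--bbduk_path, --falco_path, or --fastqc_path).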


--ram, --threads, --concurrent, --debug, --show_less

See Parallelization (and other common options)


Created by Edgardo M. Ortiz (06.08.2021)
Last modified by Edgardo M. Ortiz (29.05.2022)