HTML Report

Concept


The output of the extract module (how many loci are recovered, in how many samples, and to what extent) is the most direct indication of whether your analysis was successful, and is therefore what most users will want to look at first. However, collecting, summarizing, and visualizing this information can be laborious, especially in a phylogenomic project, which typically involves hundreds or even thousands of samples and loci.

Don’t worry, Captus automatically generates an informative report! Open captus-assembly_extract.report.html in your browser (internet connection required) to explore your extraction results at various scales, from the global level down to the level of a single sample or a single locus.

Example


Here is a small example of the report you can play with!
The heatmap shows an extraction result for the Angiosperms353 (Johnson et al., 2019) loci from targeted-capture data of four plant species. The blue bars along the x- and y-axes indicate how many loci are recovered in each sample and how many samples each locus is recovered in, respectively.

Note
  • When your result contains more than one marker type, the report will include separate plots for each marker type.
  • For loci with more than one copy found in a sample, information on the best hit (the hit with the highest weighted score) will be shown.
  • Information on loci with no samples recovered and samples with no loci recovered will not be shown.

Features


1. Hover information

Hover the mouse cursor over the heatmap to see detailed information about each data point.

List of the information to be shown


2. Variable dropdown

Use this dropdown to choose which of the following variables is shown in the heatmap:

| Variable | Description | Unit |
|---|---|---|
| Recovered Length | Percentage of reference sequence length recovered | % |
| Identity | Sequence identity of the recovered sequence to the reference sequence | % |
| Total Hits (Copies) | Number of hits found (Values greater than 1 imply the presence of paralogs) | - |
| Score | Score inspired by Scipio, calculated as (matches - mismatches) / reference sequence length | - |
| Weighted Score | Weighted score to address multiple reference sequences per locus (for details, read Information included in the table) | - |
| Number of Frameshifts | Number of corrected frameshifts in the extracted sequence (always 0 if the reference sequence is in nucleotide) | - |
| Contigs in Best Hit | Number of contigs used to assemble the best hit | - |
| Best Hit L50 | Least number of contigs in best hit that contain 50% of the best hit’s recovered length | - |
| Best Hit L90 | Least number of contigs in best hit that contain 90% of the best hit’s recovered length | - |
| Best Hit LG50 | Least number of contigs in best hit that contain 50% of the reference locus length | - |
| Best Hit LG90 | Least number of contigs in best hit that contain 90% of the reference locus length | - |
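
To make the Score and the L50/L90/LG50/LG90 variables described above more concrete, here is a minimal Python sketch. It is not Captus's own code: the function names `scipio_like_score` and `least_contigs_covering` are hypothetical, and the sketch assumes that the best hit's recovered length equals the sum of its contig lengths and that a threshold which is never reached is reported as `None` (the report may handle that case differently).

```python
def scipio_like_score(matches, mismatches, ref_length):
    """Score as described in the table: (matches - mismatches) / reference length."""
    if ref_length <= 0:
        raise ValueError("ref_length must be positive")
    return (matches - mismatches) / ref_length


def least_contigs_covering(contig_lengths, fraction, target_length=None):
    """Smallest number of contigs whose summed length reaches fraction * target_length.

    ASSUMPTION: with target_length=None the target is the sum of the contig
    lengths (L50/L90 style); pass the reference locus length to get the
    LG50/LG90 style. Returns None if the threshold is never reached.
    """
    if target_length is None:
        target_length = sum(contig_lengths)
    threshold = fraction * target_length
    covered = 0
    # Take the longest contigs first so the count is as small as possible.
    for count, length in enumerate(sorted(contig_lengths, reverse=True), start=1):
        covered += length
        if covered >= threshold:
            return count
    return None


# Hypothetical best hit assembled from three contigs (lengths in bp).
contigs = [600, 250, 150]
reference_length = 1500

print(scipio_like_score(matches=950, mismatches=50, ref_length=1000))
print("L50 :", least_contigs_covering(contigs, 0.5))                    # 1
print("L90 :", least_contigs_covering(contigs, 0.9))                    # 3
print("LG50:", least_contigs_covering(contigs, 0.5, reference_length))  # 2
print("LG90:", least_contigs_covering(contigs, 0.9, reference_length))  # None
```

In this made-up example the longest contig alone covers half of the recovered length (L50 = 1), whereas two contigs are needed to cover half of the longer reference locus (LG50 = 2). Lower values therefore point to a less fragmented best hit.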

3. Sort by Value dropdown

Use this dropdown to change how each axis is sorted, as follows:

| Label | Locus (x-axis) | Sample (y-axis) |
|---|---|---|
| None | Sort by name | Sort by name |
| Mean X | Sort by mean value | Sort by name |
| Mean Y | Sort by name | Sort by mean value |
| Mean Both | Sort by mean value | Sort by mean value |
| Total X | Sort by total value | Sort by name |
| Total Y | Sort by name | Sort by total value |
| Total Both | Sort by total value | Sort by total value |
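
The sorting options above can be illustrated with a small Python sketch. This is only an approximation written for this page, not the code behind the report: the function name `axis_orders` and the sample data are hypothetical, and the sketch assumes that sorting is in descending order and that loci not recovered in a sample (stored as `None`) are ignored when computing the mean.

```python
def axis_orders(values, samples, loci, mode="None"):
    """Return (locus_order, sample_order) for one of the dropdown labels.

    values[sample][locus] holds a number, or None when the locus was not
    recovered in that sample. ASSUMPTIONS: descending order, and missing
    cells are skipped when averaging.
    """
    stat, _, axes = mode.partition(" ")  # "Mean X" -> stat="Mean", axes="X"

    def summarize(cells):
        present = [v for v in cells if v is not None]
        if not present:
            return 0.0
        total = sum(present)
        return total / len(present) if stat == "Mean" else total

    def by_value(labels, cells_for):
        # Highest summary first; ties are broken by name.
        return sorted(labels, key=lambda lab: (-summarize(cells_for(lab)), lab))

    if axes in ("X", "Both"):
        locus_order = by_value(loci, lambda l: [values[s].get(l) for s in samples])
    else:
        locus_order = sorted(loci)

    if axes in ("Y", "Both"):
        sample_order = by_value(samples, lambda s: [values[s].get(l) for l in loci])
    else:
        sample_order = sorted(samples)

    return locus_order, sample_order


# Hypothetical "Recovered Length" values (%) for two samples and three loci.
values = {
    "sampleA": {"locus1": 100.0, "locus2": 50.0, "locus3": None},
    "sampleB": {"locus1": 90.0, "locus2": 80.0, "locus3": 98.0},
}
samples = ["sampleA", "sampleB"]
loci = ["locus1", "locus2", "locus3"]

for mode in ("None", "Mean X", "Total X", "Total Both"):
    print(mode, axis_orders(values, samples, loci, mode))
```

With these made-up values, "Mean X" places locus3 first because its single recovered copy has the highest mean, while "Total X" places it last because it was recovered in only one sample. This is why the two options can give different orders when recovery is patchy.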

Created by Gentaro Shigita (11.08.2021)
Last modified by Gentaro Shigita (16.09.2022)