Your submission: ES_CapTrap_ONT on ONT data

Background

Challenge 3 is evaluated against a de novo genome assembled from long reads (provided by the LRGASP Consortium) and the Lexogen SIRV Set 1 spike-ins, which were added before library preparation.

LRGASP uses SQANTI3 categories to define the features and metrics evaluated in Challenge 3.

Categories defined by SQANTI3

Due to the lack of annotation for the de novo transcriptome, these categories are only assigned to transcripts that map to SIRV reference sequences.

  • Full Splice Match (FSM): Transcripts matching a reference SIRV at all splice junctions.
  • Incomplete Splice Match (ISM): Transcripts matching consecutive, but not all, splice junctions of the reference SIRV.
  • Novel in Catalog (NIC): Transcripts containing new combinations of already annotated splice junctions or novel splice junctions formed from already annotated donors and acceptors.
  • Novel Not in Catalog (NNC): Transcripts using novel donors and/or acceptors.
  • Reference Match (RM): FSM transcript with 5' and 3' ends within 50 nts of the TSS/TTS annotation. This means that the corresponding SIRV was detected perfectly.

The rest of the transcripts will be catalogued as Intergenic.
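The Reference Match criterion in the list above can be sketched in Python as follows. This is only an illustration: the function name, its parameters, and the choice to treat the 50 nt limit as inclusive are assumptions, not part of the LRGASP or SQANTI3 code.

```python
def is_reference_match(category: str, diff_to_tss: int, diff_to_tts: int,
                       max_dist: int = 50) -> bool:
    """Return True for an FSM transcript whose 5' and 3' ends both lie
    within max_dist nucleotides of the annotated TSS and TTS.

    diff_to_tss / diff_to_tts are signed distances in nucleotides
    (hypothetical inputs; the sign only indicates the direction).
    The boundary is treated as inclusive, which is an assumption.
    """
    return (
        category == "FSM"
        and abs(diff_to_tss) <= max_dist
        and abs(diff_to_tts) <= max_dist
    )


# An FSM with ends 10 nt and 50 nt away from the annotation qualifies;
# an ISM never does, whatever its end distances are.
print(is_reference_match("FSM", 10, -50))
print(is_reference_match("ISM", 0, 0))
```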

Evaluation of detected transcripts for Challenge 3

The following metrics are used to evaluate the submitted transcriptome:

Global metrics

Metric                                       Absolute value   Relative value (%)
Number of transcripts                        23487            -
Mapping transcripts                          23486            100
Average length                               1387.79          -
Transcripts with coding potential            17767            75.65
Transcripts with Full Illumina SJ Support    19339            82.34
Non-canonical transcripts                    159              0.68
Transcripts with possible intra-priming      770              3.28
Transcripts with possible RT-switching       1242             5.29
Splice Junctions with short-read coverage    69235            93.83
Non-canonical Splice Junctions               137              0.19
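The transcript-level relative values in this table are consistent with each count divided by the total number of transcripts. A minimal Python sketch under that assumption is shown here; the variable and function names are illustrative. The two splice-junction rows are left out because their denominator (the total number of splice junctions) is not listed in this report.

```python
TOTAL_TRANSCRIPTS = 23487

# Absolute values copied from the Global metrics table.
transcript_counts = {
    "Mapping transcripts": 23486,
    "Transcripts with coding potential": 17767,
    "Transcripts with Full Illumina SJ Support": 19339,
    "Non-canonical transcripts": 159,
    "Transcripts with possible intra-priming": 770,
    "Transcripts with possible RT-switching": 1242,
}


def relative_value(count: int, total: int) -> float:
    """Percentage of `total` represented by `count`, rounded to 2 decimals."""
    return round(100 * count / total, 2)


relative = {
    name: relative_value(count, TOTAL_TRANSCRIPTS)
    for name, count in transcript_counts.items()
}

for name, pct in relative.items():
    print(f"{name}: {pct}")
```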

Length distribution

Length distribution of mapping reads.

Minimum SJ coverage

Coverage distribution of the SJ with the lowest coverage in each detected isoform.

Coverage comparison: canonical vs non-canonical SJ

Coverage comparison by Illumina reads. Splice junctions shared by several transcripts are counted only once.

BUSCO completeness results

BUSCO category                     Absolute value   Relative value (%)
Complete and single-copy BUSCOs    3219             28.32
Complete and duplicated BUSCOs     1581             13.91
Fragmented BUSCOs                  377              3.32
Missing BUSCOs                     6189             54.45
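The BUSCO percentages are consistent with each category count divided by the sum of the four categories. The short Python sketch below assumes that denominator and also adds up the two "complete" categories, a combined figure the report does not list separately. The names are illustrative.

```python
# Absolute values copied from the BUSCO table.
busco_counts = {
    "Complete and single-copy": 3219,
    "Complete and duplicated": 1581,
    "Fragmented": 377,
    "Missing": 6189,
}

# Assumption: the four categories partition the searched BUSCO set.
total_buscos = sum(busco_counts.values())

busco_pct = {
    name: round(100 * count / total_buscos, 2)
    for name, count in busco_counts.items()
}

# Combined completeness (single-copy plus duplicated).
complete = busco_counts["Complete and single-copy"] + busco_counts["Complete and duplicated"]
complete_pct = round(100 * complete / total_buscos, 2)

print(total_buscos, busco_pct, complete_pct)
```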

Transcripts per Locus

Evaluation of Spike-Ins (SIRVs)

The following metrics and definitions apply to SIRV transcripts:

ATTENTION: If all the results in this section of the evaluation are 0, please check whether the reference genome and/or transcriptome used to build your transcript models contains information about the spike-ins.

Metric                                                 Value
SIRV transcripts                                       54
True Positive detections (TP)                          31
SIRV transcripts associated to TP (Reference Match)    31
Partial True Positive detections (PTP)                 3
SIRV transcripts associated to PTP                     4
False Negative (FN)                                    50
False Positive (FP)                                    19
Sensitivity                                            0.37
Precision                                              0.57
Non Redundant Precision                                0.57
Positive Detection Rate                                0.4
False Discovery Rate                                   0.43
Redundancy                                             1.03
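The reported Precision and False Discovery Rate are consistent with the definitions sketched below. The formulas are not stated in this report, so they should be read as an assumption: precision as TP over the number of submitted SIRV transcripts, and FDR as its complement. Sensitivity and Positive Detection Rate are not recomputed because they depend on the number of reference SIRV transcripts, which is not listed here.

```python
# Values copied from the SIRV table.
sirv_transcripts = 54   # submitted transcripts mapping to SIRVs
true_positives = 31     # Reference Match detections

# Assumed definitions (not stated in the report):
#   precision = TP / SIRV transcripts
#   FDR       = (SIRV transcripts - TP) / SIRV transcripts
precision = true_positives / sirv_transcripts
false_discovery_rate = (sirv_transcripts - true_positives) / sirv_transcripts

print(round(precision, 2), round(false_discovery_rate, 2))
```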