Your submission: ES_cDNA_ONT on ONT data

Background

Challenge 3 is evaluated against a de novo genome assembled from long reads (provided by the LRGASP Consortium) and against the Lexogen SIRV Set 1 spike-ins, which were added before library preparation.

LRGASP uses SQANTI3 categories to define the features and metrics evaluated in Challenge 3.

Categories defined by SQANTI3

Due to the lack of annotation for the de novo transcriptome, these categories are only assigned to transcripts that map to SIRV reference sequences.

  • Full Splice Match (FSM): Transcripts matching a reference SIRV at all splice junctions.
  • Incomplete Splice Match (ISM): Transcripts matching consecutive, but not all, splice junctions of the reference SIRV.
  • Novel in Catalog (NIC): Transcripts containing new combinations of already annotated splice junctions or novel splice junctions formed from already annotated donors and acceptors.
  • Novel Not in Catalog (NNC): Transcripts using novel donors and/or acceptors.
  • Reference Match (RM): FSM transcript with 5' and 3' ends within 50 nt of the annotated TSS/TTS. This means that the corresponding SIRV was detected perfectly.
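The category definitions above can be illustrated with a minimal sketch. It is not SQANTI3's implementation: it assumes multi-exon transcripts on a single strand, represents each splice junction as a hypothetical `(donor, acceptor)` coordinate pair, and ignores everything SQANTI3 additionally checks. The function names `classify` and `is_reference_match` are made up for this example.

```python
from typing import List, Tuple

Junction = Tuple[int, int]  # (donor position, acceptor position); illustrative only


def is_contiguous_subchain(query: List[Junction], ref: List[Junction]) -> bool:
    """True if `query` appears as a run of consecutive junctions inside `ref`."""
    n = len(query)
    return any(ref[i:i + n] == query for i in range(len(ref) - n + 1))


def classify(query: List[Junction], references: List[List[Junction]]) -> str:
    """Simplified structural category for a multi-exon transcript."""
    if not query:
        raise ValueError("this sketch only handles multi-exon transcripts")
    # FSM: every splice junction of one reference is matched, in order.
    if any(query == ref for ref in references):
        return "FSM"
    # ISM: consecutive, but not all, junctions of one reference.
    if any(is_contiguous_subchain(query, ref) for ref in references):
        return "ISM"
    # NIC: only annotated donors and acceptors, in a new combination.
    donors = {d for ref in references for d, _ in ref}
    acceptors = {a for ref in references for _, a in ref}
    if all(d in donors and a in acceptors for d, a in query):
        return "NIC"
    # NNC: at least one novel donor or acceptor.
    return "NNC"


def is_reference_match(query: List[Junction], q_start: int, q_end: int,
                       ref: List[Junction], r_start: int, r_end: int,
                       max_dist: int = 50) -> bool:
    """RM: an FSM whose two ends lie within `max_dist` nt of the annotated ends."""
    return (query == ref
            and abs(q_start - r_start) <= max_dist
            and abs(q_end - r_end) <= max_dist)
```

For a single reference with junctions `[(100, 200), (300, 400), (500, 600)]`, a transcript with junctions `[(300, 400), (500, 600)]` would be an ISM, while `[(100, 400), (500, 600)]` reuses an annotated donor and acceptor in a new pairing and would be a NIC.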

All remaining transcripts are classified as Intergenic.

Evaluation of detected transcripts for Challenge 3

These are some definitions used to evaluate the submitted transcriptome:

Global metrics

Metric                                       Absolute value   Relative value (%)
Number of transcripts                        21318            -
Mapping transcripts                          21288            99.86
Average length                               1793.35          -
Transcripts with coding potential            16144            75.73
Transcripts with Full Illumina SJ Support    14981            70.27
Non-canonical transcripts                    2184             10.24
Transcripts with possible intra-priming      599              2.81
Transcripts with possible RT-switching       2205             10.34
Splice Junctions with short-read coverage    70791            90.83
Non-canonical Splice Junctions               2480             3.18

Length distribution

Length distribution of mapping reads.

Minimum SJ coverage

Coverage distribution of the splice junction (SJ) with the lowest coverage in each detected isoform.
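As a rough illustration of the quantity being plotted, the sketch below takes the minimum short-read coverage over the junctions of each isoform. The input layout (a dict of isoform IDs to junction IDs, and a dict of junction IDs to read counts) and the name `min_sj_coverage` are assumptions made for this example, not the format used by the evaluation pipeline.

```python
from typing import Dict, Hashable, List


def min_sj_coverage(isoform_junctions: Dict[str, List[Hashable]],
                    sj_coverage: Dict[Hashable, int]) -> Dict[str, int]:
    """Coverage of the least-covered splice junction of each multi-exon isoform."""
    result = {}
    for isoform, junctions in isoform_junctions.items():
        if not junctions:
            continue  # mono-exon isoforms have no junctions to evaluate
        # Junctions missing from the coverage table count as uncovered (0 reads).
        result[isoform] = min(sj_coverage.get(j, 0) for j in junctions)
    return result
```

An isoform with three junctions covered by 40, 3 and 25 reads would contribute the value 3 to the distribution.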

Coverage comparison: canonical vs non-canonical SJ

Comparison of Illumina read coverage between canonical and non-canonical splice junctions. Splice junctions shared by several transcripts are counted only once.
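Counting each junction once can be sketched as follows. The record layout `(transcript_id, junction_id, is_canonical, coverage)` and the function name `coverage_by_class` are hypothetical and only serve to show the de-duplication step before the two groups are compared.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Record = Tuple[str, Hashable, bool, int]  # (transcript, junction, canonical?, coverage)


def coverage_by_class(records: Iterable[Record]) -> Dict[str, List[int]]:
    """Group junction coverage by canonical status, one entry per unique junction."""
    unique = {}
    for _transcript, junction, canonical, coverage in records:
        # A junction seen in several transcripts overwrites itself, so it is kept once.
        unique[junction] = (canonical, coverage)
    groups: Dict[str, List[int]] = {"canonical": [], "non_canonical": []}
    for canonical, coverage in unique.values():
        groups["canonical" if canonical else "non_canonical"].append(coverage)
    return groups
```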

BUSCO completeness results

Category                           Absolute value   Relative value (%)
Complete and single-copy BUSCOs    3132             27.56
Complete and duplicated BUSCOs     1962             17.26
Fragmented BUSCOs                  300              2.64
Missing BUSCOs                     5972             52.54
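The relative values in this table are consistent with each count being divided by the sum of the four categories (11366). The sketch below reproduces them under that assumption; the report does not state the size of the BUSCO lineage set, and the function name `busco_percentages` is made up for this example.

```python
from typing import Dict


def busco_percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Percentage of each BUSCO category, assuming the categories sum to the total."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


busco_counts = {
    "Complete and single-copy": 3132,
    "Complete and duplicated": 1962,
    "Fragmented": 300,
    "Missing": 5972,
}
busco_relative = busco_percentages(busco_counts)
```

Under this reading, complete BUSCOs (single-copy plus duplicated) account for 5094 of 11366 entries.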

Transcripts per Locus

Evaluation of Spike-Ins (SIRVs)

The following metrics and definitions apply to SIRV transcripts:

ATTENTION: If all the results in this section of the evaluation are 0, please check whether the reference genome and/or transcriptome used to build your transcript models contains the spike-in sequences.

Metric                                                  Value
SIRV transcripts                                        24
True Positive detections (TP)                           16
SIRV transcripts associated with TP (Reference Match)   16
Partial True Positive detections (PTP)                  1
SIRV transcripts associated with PTP                    2
False Negative (FN)                                     67
False Positive (FP)                                     6
Sensitivity                                             0.19
Precision                                               0.67
Non Redundant Precision                                 0.67
Positive Detection Rate                                 0.2
False Discovery Rate                                    0.33
Redundancy                                              1.06
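The report lists the derived metrics without their formulas. The sketch below uses one set of formulas that reproduces the values in the table from the raw counts; these formulas are assumptions inferred from the numbers, not definitions quoted from LRGASP, and the official definitions may differ. In particular, the number of reference SIRVs is assumed to be TP + FN (83), and all parameter names are made up for this example.

```python
from typing import Dict


def sirv_metrics(n_sirv_transcripts: int, tp: int, tp_transcripts: int,
                 ptp: int, ptp_transcripts: int, fn: int) -> Dict[str, float]:
    """Derived SIRV metrics under assumed formulas (see lead-in)."""
    reference_total = tp + fn  # assumption: every reference SIRV is either TP or FN
    return {
        "sensitivity": tp / reference_total,
        "precision": tp_transcripts / n_sirv_transcripts,
        "non_redundant_precision": tp / n_sirv_transcripts,
        "positive_detection_rate": (tp + ptp) / reference_total,
        "false_discovery_rate": (n_sirv_transcripts - tp_transcripts) / n_sirv_transcripts,
        "redundancy": (tp_transcripts + ptp_transcripts) / (tp + ptp),
    }


# Counts taken from the table above.
metrics = sirv_metrics(n_sirv_transcripts=24, tp=16, tp_transcripts=16,
                       ptp=1, ptp_transcripts=2, fn=67)
rounded = {name: round(value, 2) for name, value in metrics.items()}
```

With these assumptions, the 24 submitted SIRV transcripts split into 16 Reference Matches, 2 partial matches and 6 false positives, and the rounded values agree with the table.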