Your submission: ES_cDNA_ONT on ONT data
Background
Challenge 3 is evaluated against a de novo long-read-based genome (provided by the LRGASP Consortium) and the Lexogen SIRV Set 1 spike-ins, which were added before library preparation.
LRGASP uses SQANTI3 categories to define the features and metrics evaluated in Challenge 3.
Categories defined by SQANTI3
Due to the lack of annotation for the de novo transcriptome, these categories are only assigned to transcripts that map to SIRV reference sequences.
- Full Splice Match (FSM): Transcripts matching a reference SIRV at all splice junctions.
- Incomplete Splice Match (ISM): Transcripts matching consecutive, but not all, splice junctions of the reference SIRV.
- Novel in Catalog (NIC): Transcripts containing new combinations of already annotated splice junctions or novel splice junctions formed from already annotated donors and acceptors.
- Novel Not in Catalog (NNC): Transcripts using novel donors and/or acceptors.
- Reference Match (RM): FSM transcript with 5´ and 3´ ends within 50 nt of the annotated TSS/TTS. This means that the corresponding SIRV was detected exactly.
The remaining transcripts are classified as Intergenic.
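The Reference Match criterion above can be sketched as a small check. This is only an illustration, not the SQANTI3 implementation: the function name, its parameters, the inclusive 50 nt threshold, and the assumption that coordinates are already oriented 5´ to 3´ are all hypothetical.

```python
def is_reference_match(is_fsm, tx_5p, tx_3p, ref_tss, ref_tts, max_dist=50):
    """Return True when an FSM transcript has both ends near the annotation.

    Assumptions (not from the report): coordinates are given in transcript
    orientation, and "within 50 nts" is treated as an inclusive threshold.
    """
    if not is_fsm:
        # Only Full Splice Match transcripts can be Reference Matches.
        return False
    return abs(tx_5p - ref_tss) <= max_dist and abs(tx_3p - ref_tts) <= max_dist
```

For example, an FSM transcript whose ends fall 10 nt and 40 nt from the annotated TSS and TTS would count as RM under these assumptions, whereas one whose 3´ end is 60 nt away would remain a plain FSM.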
Evaluation of detected transcripts for Challenge 3
These are some definitions used to evaluate the submitted transcriptome:
- Full Illumina Splice Junction Support: Transcripts with all splice junctions (SJ) supported by at least one Illumina read.
- Non-canonical transcripts: Transcripts with at least one non-canonical junction.
- Intra-priming: Evidence of intra-priming is reported when at least 60% of the 20 bp of genomic sequence immediately downstream of the detected TTS are adenines (A).
- RT-switching: Evidence of RT-switching (see SQANTI ref)
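The intra-priming rule above amounts to a simple base-composition test. The following is a minimal sketch under stated assumptions: the function name and parameters are hypothetical, the input is assumed to be the genomic sequence downstream of the TTS already given in transcript orientation, and a window shorter than 20 bp is evaluated over the bases that are available.

```python
def has_possible_intra_priming(downstream_seq, window=20, min_a_fraction=0.6):
    """Flag a transcript when the downstream window is at least 60% A.

    downstream_seq: genomic sequence immediately downstream of the detected
    TTS, in transcript orientation (an assumption of this sketch).
    """
    window_seq = downstream_seq[:window].upper()
    if not window_seq:
        # No downstream sequence available, so no evidence either way.
        return False
    a_fraction = window_seq.count("A") / len(window_seq)
    return a_fraction >= min_a_fraction
```

With a 20 bp window, 12 or more adenines are enough to flag the transcript.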
Global metrics
Metric | Value | Percentage (%)
Number of transcripts | 21318 | -
Mapping transcripts | 21288 | 99.86
Average length | 1793.35 | -
Transcripts with coding potential | 16144 | 75.73
Transcripts with Full Illumina SJ Support | 14981 | 70.27
Non-canonical transcripts | 2184 | 10.24
Transcripts with possible intra-priming | 599 | 2.81
Transcripts with possible RT-switching | 2205 | 10.34
Splice Junctions with short-read coverage | 70791 | 90.83
Non-canonical Splice Junctions | 2480 | 3.18
Length distribution
Length distribution of mapping reads.

Minimum SJ coverage
Coverage distribution of the SJ with the lowest coverage in each detected isoform.

Coverage comparison: canonical vs non-canonical SJ
Coverage comparison by Illumina reads. Splice junctions shared by several transcripts are counted only once.

BUSCO completeness results
Category | Count | Percentage (%)
Complete and single-copy BUSCOs | 3132 | 27.56
Complete and duplicated BUSCOs | 1962 | 17.26
Fragmented BUSCOs | 300 | 2.64
Missing BUSCOs | 5972 | 52.54
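The BUSCO percentages can be checked against the counts with a few lines of arithmetic. This sketch assumes that the four categories are mutually exclusive and together cover every BUSCO searched, so that their sum is the denominator; the variable names are illustrative only.

```python
# Counts copied from the BUSCO results reported for this submission.
busco_counts = {
    "Complete and single-copy": 3132,
    "Complete and duplicated": 1962,
    "Fragmented": 300,
    "Missing": 5972,
}

# Assumption: the four categories partition the full BUSCO set.
total_buscos = sum(busco_counts.values())

# Percentage of each category, rounded to two decimals as in the report.
busco_pct = {
    name: round(100 * count / total_buscos, 2)
    for name, count in busco_counts.items()
}

for name, pct in busco_pct.items():
    print(f"{name}: {busco_counts[name]} ({pct}%)")
```

Under that assumption the denominator is 11366 BUSCOs, and the two "Complete" categories together account for 5094 of them.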
Transcripts per Locus

Evaluation of Spike-Ins (SIRVs)
The following metrics and definitions apply to SIRV transcripts:
- SIRV transcript: Transcript mapping to a SIRV sequence
- Reference SIRV (rSIRV): Lexogen SIRV model
- True Positive detections (TP): rSIRVs identified as RM
- Partial True Positive detections (PTP): rSIRVs identified as ISM or FSM_non_RM
- False Negative (FN): rSIRVs without FSM or ISM
- False Positive (FP): NIC + NNC + antisense + fusion SIRV_transcripts
- Sensitivity: TP/rSIRVs
- Precision: RM/SIRV_transcripts
- Non_redundant Precision: TP/SIRV_transcripts
- Positive Detection Rate: unique(TP+PTP)/rSIRVs
- False Discovery Rate: (SIRV_transcripts - RM)/SIRV_transcripts
- Redundancy: (FSM + ISM)/unique(TP+PTP)
ATTENTION: If all the results in this part of the evaluation are 0, please check whether the reference genome and/or transcriptome used to build your transcript models contains information about the spike-ins.
Metric | Value
SIRV transcripts | 24
True Positive detections (TP) | 16
SIRV transcripts associated to TP (Reference Match) | 16
Partial True Positive detections (PTP) | 1
SIRV transcripts associated to PTP | 2
False Negative (FN) | 67
False Positive (FP) | 6
Sensitivity | 0.19
Precision | 0.67
Non Redundant Precision | 0.67
Positive Detection Rate | 0.2
False Discovery Rate | 0.33
Redundancy | 1.06
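The SIRV metrics can be recomputed from the counts reported above using the definitions given in this section. The sketch below rests on two assumptions that the report does not state explicitly: the number of reference SIRVs is taken as TP + PTP + FN, and the FSM + ISM transcript count is taken as the transcripts associated with TP plus those associated with PTP. Variable names are illustrative.

```python
# Counts copied from the SIRV results reported for this submission.
sirv_transcripts = 24   # transcripts mapping to a SIRV sequence
tp = 16                 # rSIRVs identified as Reference Match
rm_transcripts = 16     # SIRV transcripts associated to TP
ptp = 1                 # rSIRVs identified as ISM or FSM_non_RM
ptp_transcripts = 2     # SIRV transcripts associated to PTP
fn = 67                 # rSIRVs without FSM or ISM

# Assumption: every rSIRV is exactly one of TP, PTP or FN.
r_sirvs = tp + ptp + fn

# Assumption: all FSM and ISM transcripts are those tied to TP or PTP.
fsm_plus_ism = rm_transcripts + ptp_transcripts

metrics = {
    "Sensitivity": tp / r_sirvs,
    "Precision": rm_transcripts / sirv_transcripts,
    "Non Redundant Precision": tp / sirv_transcripts,
    "Positive Detection Rate": (tp + ptp) / r_sirvs,
    "False Discovery Rate": (sirv_transcripts - rm_transcripts) / sirv_transcripts,
    "Redundancy": fsm_plus_ism / (tp + ptp),
}

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the recomputed values agree with the reported ones after rounding to two decimals, and the remaining 24 - 16 - 2 = 6 SIRV transcripts correspond to the reported False Positives.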