Your submission: ES_CapTrap_PacBio on PacBio data

Background

Challenge 3 is evaluated against a de novo genome assembled from long reads (provided by the LRGASP Consortium) and against the SIRV Lexogen Set 1 spike-ins, which were added before library preparation.

LRGASP uses SQANTI3 categories to define the features and metrics evaluated in Challenge 3.

Categories defined by SQANTI3

Due to the lack of annotation for the de novo transcriptome, these categories are assigned only to transcripts that map to SIRV reference sequences.

  • Full Splice Match (FSM): Transcripts matching a reference SIRV at all splice junctions.
  • Incomplete Splice Match (ISM): Transcripts matching consecutive, but not all, splice junctions of the reference SIRV.
  • Novel in Catalog (NIC): Transcripts containing new combinations of already annotated splice junctions or novel splice junctions formed from already annotated donors and acceptors.
  • Novel Not in Catalog (NNC): Transcripts using novel donors and/or acceptors.
  • Reference Match (RM): FSM transcript with 5′ and 3′ ends within 50 nt of the annotated TSS/TTS. This means that the corresponding SIRV was detected perfectly.

All remaining transcripts are catalogued as Intergenic.
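
The Reference Match rule can be sketched as a small predicate. This is only an illustration, not the evaluation pipeline's code: the function name, its parameters, and the choice to treat the 50 nt limit as inclusive are assumptions.

```python
def is_reference_match(is_fsm, tss_diff, tts_diff, max_dist=50):
    """Return True if an FSM transcript has both ends close to the annotation.

    is_fsm   -- whether the transcript is a Full Splice Match (hypothetical flag)
    tss_diff -- signed distance (nt) between transcript 5' end and annotated TSS
    tts_diff -- signed distance (nt) between transcript 3' end and annotated TTS
    max_dist -- distance limit in nt; assumed inclusive here
    """
    return is_fsm and abs(tss_diff) <= max_dist and abs(tts_diff) <= max_dist


# Hypothetical transcripts, for illustration only.
print(is_reference_match(True, 12, -50))   # FSM, both ends within the limit
print(is_reference_match(True, 51, 0))     # FSM, 5' end too far from the TSS
print(is_reference_match(False, 0, 0))     # not an FSM, so never an RM
```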

Evaluation of detected transcripts for Challenge 3

These are some definitions used to evaluate the submitted transcriptome:

Global metrics

Metric                                        Absolute value   Relative value (%)
Number of transcripts                         15038            -
Mapping transcripts                           NA               NA
Average length                                1794.32          -
Transcripts with coding potential             13324            88.6
Transcripts with Full Illumina SJ Support     14143            94.05
Non-canonical transcripts                     137              0.91
Transcripts with possible intra-priming       262              1.74
Transcripts with possible RT-switching        1054             7.01
Splice Junctions with short-read coverage     67875            98.79
Non-canonical Splice Junctions                102              0.15
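
As a consistency check, the relative values of the transcript-level rows can be recomputed from the absolute values. The sketch below assumes that the denominator is the total number of transcripts (15038); the report does not state this, but the assumption reproduces the listed percentages. The two splice-junction rows use a different denominator that is not shown in the report, so they are left out.

```python
TOTAL_TRANSCRIPTS = 15038  # "Number of transcripts" from the table above

# Absolute values copied from the transcript-level rows of the table.
counts = {
    "Transcripts with coding potential": 13324,
    "Transcripts with Full Illumina SJ Support": 14143,
    "Non-canonical transcripts": 137,
    "Transcripts with possible intra-priming": 262,
    "Transcripts with possible RT-switching": 1054,
}


def relative_value(count, total):
    """Percentage of `total` represented by `count`, rounded to 2 decimals."""
    return round(100 * count / total, 2)


relative = {name: relative_value(c, TOTAL_TRANSCRIPTS) for name, c in counts.items()}

for name, pct in relative.items():
    print(f"{name}: {pct}")
```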

Length distribution

Length distribution of mapping reads.

Minimum SJ coverage

Distribution of the coverage of the least-covered SJ in each detected isoform.
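
A minimal sketch of how the minimum SJ coverage per isoform could be derived, assuming the junctions of each isoform and the short-read coverage of each junction are available as dictionaries. The function name, the data layout, and the example values are hypothetical; junctions with no recorded coverage are assumed to count as 0, and mono-exonic isoforms (no junctions) are skipped.

```python
def min_sj_coverage(isoform_junctions, junction_coverage):
    """Map each multi-exonic isoform to the coverage of its least-covered SJ."""
    return {
        isoform: min(junction_coverage.get(sj, 0) for sj in junctions)
        for isoform, junctions in isoform_junctions.items()
        if junctions  # skip isoforms without splice junctions
    }


# Hypothetical input: junctions are (chrom, start, end) tuples.
isoform_junctions = {
    "iso1": [("chr1", 100, 200), ("chr1", 300, 400)],
    "iso2": [("chr1", 100, 200), ("chr1", 500, 600)],
    "iso3": [],  # mono-exonic
}
junction_coverage = {
    ("chr1", 100, 200): 25,
    ("chr1", 300, 400): 3,
}

result = min_sj_coverage(isoform_junctions, junction_coverage)
print(result)
```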

Coverage comparison: canonical vs non-canonical SJ

Comparison of coverage by Illumina reads. Splice junctions shared by several transcripts are counted only once.

BUSCO completeness results

Category                            Absolute value   Relative value (%)
Complete and single-copy BUSCOs     3462             30.46
Complete and duplicated BUSCOs      1503             13.22
Fragmented BUSCOs                   213              1.87
Missing BUSCOs                      6188             54.44
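
The four BUSCO categories are mutually exclusive, so their sum gives the number of BUSCOs searched and the two "complete" rows can be combined into an overall completeness figure. The sketch below does this arithmetic from the values in the table; the variable names are illustrative, and the combined figure is derived here rather than reported by the evaluation.

```python
# Absolute values copied from the BUSCO table above.
busco = {
    "complete_single_copy": 3462,
    "complete_duplicated": 1503,
    "fragmented": 213,
    "missing": 6188,
}

# Total number of BUSCOs searched, assuming the four categories are exhaustive.
total = sum(busco.values())

# Percentage of each category relative to the total.
percentages = {name: round(100 * n / total, 2) for name, n in busco.items()}

# Overall completeness: single-copy plus duplicated complete BUSCOs.
complete = busco["complete_single_copy"] + busco["complete_duplicated"]
complete_pct = round(100 * complete / total, 2)

print(total, percentages, complete_pct)
```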

Transcripts per Locus

Evaluation of Spike-Ins (SIRVs)

The following metrics and definitions apply to SIRV transcripts:

ATTENTION: If all the results in this part of the evaluation are 0, please check whether the reference genome and/or transcriptome used to build your transcript models contains the spike-in sequences.

Metric                                                 Value
SIRV transcripts                                       16
True Positive detections (TP)                          15
SIRV transcripts associated to TP (Reference Match)    15
Partial True Positive detections (PTP)                 0
SIRV transcripts associated to PTP                     0
False Negative (FN)                                    69
False Positive (FP)                                    1
Sensitivity                                            0.18
Precision                                              0.94
Non Redundant Precision                                0.94
Positive Detection Rate                                0.18
False Discovery Rate                                   0.06
Redundancy                                             1
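
The report does not spell out the formulas behind the ratio metrics. The sketch below uses one plausible set of definitions, all of them assumptions, chosen because they reproduce the values listed above from the raw counts. In particular, the number of reference SIRVs is taken as TP + FN = 84, which is only unambiguous here because PTP is 0.

```python
def sirv_metrics(n_transcripts, tp, ptp, fn, fp, tp_transcripts, ptp_transcripts):
    """Compute ratio metrics from SIRV counts (assumed formulas, see lead-in)."""
    n_reference = tp + fn  # assumed size of the SIRV reference set
    detected = tp + ptp    # reference SIRVs detected fully or partially
    return {
        "sensitivity": tp / n_reference,
        "precision": tp_transcripts / n_transcripts,
        "non_redundant_precision": tp / n_transcripts,
        "positive_detection_rate": detected / n_reference,
        "false_discovery_rate": fp / n_transcripts,
        # Undefined when nothing was detected, so return None in that case.
        "redundancy": (tp_transcripts + ptp_transcripts) / detected if detected else None,
    }


# Counts copied from the SIRV table above.
metrics = sirv_metrics(
    n_transcripts=16, tp=15, ptp=0, fn=69, fp=1,
    tp_transcripts=15, ptp_transcripts=0,
)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the low sensitivity comes from the 69 reference SIRVs that were not detected, while the high precision reflects that 15 of the 16 submitted SIRV transcripts are Reference Matches.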