Your submission: mouse_simulation_dRNA_ONT on ONT data

Background

Challenge 1 is evaluated according to four criteria:

  1. Broad GENCODE Annotation
  2. Subset of manually curated loci selected by GENCODE
  3. SIRV Lexogen Set 4
  4. Simulated data

LRGASP uses SQANTI categories to define the features and metrics evaluated in Challenge 1.

LRGASP Challenge 1 Definitions:

This document shows the performance of your pipeline for criterion 4. Critical data for evaluation according to criteria 2 and 4 will be made available after the closure of the challenge, and therefore pre-evaluation reports cannot be provided. Note that your criterion 1 metrics reported here have been calculated using the GENCODE human v38 and mouse M27 releases, while the final evaluation will use human v39 and mouse M28, to be released after completion of the challenge.

Evaluation of detected transcripts for Challenge 1

Global overview

Value
Number of genes detected 9395
Number of known genes detected 9120
Number of transcripts detected 28035
Number of transcripts associated to a known gene 25719
Number of unique SJ detected 35023
Absolute value Relative value (%)
Novel SJ 2418 6.90
Non-canonical SJ 0 0.00

Evaluation of FSM

Absolute value Relative value (%)
Number of isoforms 3787 -
Reference Match 2158 56.98
5’ reference supported (transcript) 2751 72.64
3’ reference supported (transcript) 2685 70.90
5’ reference supported (gene) 2854 75.36
3’ reference supported (gene) 2784 73.51
Supported Reference Transcript Model (SRTM) 2281 60.23
Reference redundancy Level 1 -
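
The relative values in the FSM table above appear to be each absolute count expressed as a percentage of the number of FSM isoforms (3787). The following Python sketch re-derives them under that assumption. The names `FSM_TOTAL`, `fsm_counts`, and `relative_value` are illustrative and are not part of any LRGASP or SQANTI tool.

```python
# Counts copied from the FSM table in this report.
FSM_TOTAL = 3787  # "Number of isoforms" in the FSM table

fsm_counts = {
    "Reference Match": 2158,
    "5' reference supported (transcript)": 2751,
    "3' reference supported (transcript)": 2685,
    "5' reference supported (gene)": 2854,
    "3' reference supported (gene)": 2784,
    "Supported Reference Transcript Model (SRTM)": 2281,
}


def relative_value(count, total):
    """Return count as a percentage of total, rounded to two decimals.

    Assumption: the report's "Relative value (%)" column is
    100 * count / number of isoforms in the same category.
    """
    return round(100 * count / total, 2)


fsm_relative = {
    label: relative_value(count, FSM_TOTAL)
    for label, count in fsm_counts.items()
}

for label, pct in fsm_relative.items():
    print(f"{label}: {pct:.2f}")
```

Under this assumption the computed percentages agree with the reported column to two decimals, so the same calculation can be used to sanity-check the ISM, NIC, and NNC tables.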

Evaluation of ISM

Absolute value Relative value (%)
Number of isoforms 18701 -
5’ reference supported (transcript) 1480 7.91
3’ reference supported (transcript) 4884 26.12
5’ and 3’ reference supported (gene) 62 0.33
5’ reference supported (gene) 1796 9.60
3’ reference supported (gene) 5877 31.43
Supported Reference Transcript Model (SRTM) 62 0.33
Reference redundancy Level 1.91 -

Evaluation of NIC

Absolute value Relative value (%)
Number of isoforms 1116 -
5’ and 3’ reference supported (gene) 55 4.93
5’ reference supported (gene) 69 6.18
3’ reference supported (gene) 1049 94.00
Intron retention incidence 14 1.25

Evaluation of NNC

Absolute value Relative value (%)
Number of isoforms 2115 -
5’ and 3’ reference supported (gene) 778 36.78
5’ reference supported (gene) 959 45.34
3’ reference supported (gene) 1275 60.28
Non-canonical SJ incidence 0 0.00
Full Illumina SJ support 2115 100.00
RT-switching incidence 75 3.55

Evaluation of Simulation

Simulated transcripts were grouped according to different thresholds and attributes, and metrics were calculated against each of these ground-truth settings. These sets of ground-truth transcripts are:

The following metrics and definitions apply to simulated transcripts.

Evaluation of all simulated transcripts

Value
Number of isoforms simulated 27152
True Positive detections (TP) 2136
Number of transcripts associated to TP (Reference Match) 2136
Partial True Positive detections (PTP) 9729
Number of transcripts associated to PTP 17991
False Negative (FN) 15428
False Positive (FP) 7908
Sensitivity 0.08
Precision 0.08
Non Redundant Precision 0.08
Positive Detection Rate 0.43
False Discovery Rate 0.63
False Detection Rate 0.28
Redundancy 1.72
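
Some of the rates above can be re-derived from the counts in this report. The Python sketch below uses formulas inferred from the reported numbers, not quoted from the LRGASP definitions: sensitivity as TP over simulated isoforms, and precision, False Discovery Rate, and False Detection Rate with the number of detected transcripts (28035, from the global overview) as the denominator. Positive Detection Rate, Non Redundant Precision, and Redundancy are left out because their definitions cannot be recovered from the counts shown here.

```python
# Counts copied from this report.
detected = 28035   # "Number of transcripts detected" (Global overview)
simulated = 27152  # "Number of isoforms simulated"
tp = 2136          # True Positive detections
ptp = 9729         # Partial True Positive detections
fp = 7908          # False Positive

# Assumed formulas: each one reproduces the reported value after
# rounding to two decimals, but they are inferred, not official.
metrics = {
    "Sensitivity": tp / simulated,
    "Precision": tp / detected,
    "False Discovery Rate": (fp + ptp) / detected,
    "False Detection Rate": fp / detected,
}

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, these four values agree with the table: 0.08, 0.08, 0.63, and 0.28.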

Evaluation of all GENCODE simulation

Value
Number of isoforms simulated 20152
True Positive detections (TP) 1199
Number of transcripts associated to TP (Reference Match) 1199
Partial True Positive detections (PTP) 6455
Number of transcripts associated to PTP 10356
False Negative (FN) 12586
False Positive (FP) 16480
Sensitivity 0.06
Positive Detection Rate 0.38
Redundancy 1.53

Evaluation of only GENCODE transcripts simulated

Value
Number of isoforms simulated 15608
True Positive detections (TP) 1038
Number of transcripts associated to TP (Reference Match) 1038
Partial True Positive detections (PTP) 5922
Number of transcripts associated to PTP 9568
False Negative (FN) 8733
False Positive (FP) 17429
Sensitivity 0.07
Positive Detection Rate 0.44
Redundancy 1.54

Evaluation of only GENCODE transcripts simulated with TPM >= 5

Value
Number of isoforms simulated 9428
True Positive detections (TP) 888
Number of transcripts associated to TP (Reference Match) 888
Partial True Positive detections (PTP) 4667
Number of transcripts associated to PTP 7862
False Negative (FN) 3954
False Positive (FP) 19285
Sensitivity 0.09
Positive Detection Rate 0.58
Redundancy 1.6

Evaluation of novelty

Value
Number of isoforms simulated 7000
True Positive detections (TP) 937
Number of transcripts associated to TP (Reference Match) 937
Partial True Positive detections (PTP) 3274
Number of transcripts associated to PTP 7635
False Negative (FN) 2842
False Positive (FP) 19463
Sensitivity 0.13
Positive Detection Rate 0.59
Redundancy 2.06
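
To compare the five ground-truth sets side by side, the sensitivity of each can be recomputed from its own counts. This Python sketch assumes sensitivity is TP divided by the number of simulated isoforms in the set, which matches every reported value after rounding. The dictionary and function names are illustrative only.

```python
# (simulated isoforms, true positives) per ground-truth set,
# copied from the simulation tables in this report.
ground_truth_sets = {
    "All simulated transcripts": (27152, 2136),
    "All GENCODE simulation": (20152, 1199),
    "Only GENCODE transcripts simulated": (15608, 1038),
    "Only GENCODE transcripts simulated, TPM >= 5": (9428, 888),
    "Novelty": (7000, 937),
}


def sensitivity(simulated, true_positives):
    """Assumed definition: fraction of simulated isoforms recovered as TP."""
    return true_positives / simulated


sensitivities = {
    name: round(sensitivity(simulated, tp), 2)
    for name, (simulated, tp) in ground_truth_sets.items()
}

for name, value in sensitivities.items():
    print(f"{name}: {value:.2f}")
```

Within the GENCODE sets, sensitivity rises from 0.06 to 0.09 as the set is restricted to transcripts with TPM >= 5, and the novelty set has the highest value at 0.13.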