Your submission: mouse_sim_drna_ont on ONT data

Background

Challenge 1 is evaluated according to four criteria:

  1. Broad GENCODE Annotation
  2. Subset of manually curated loci selected by GENCODE
  3. SIRV Lexogen Set 4
  4. Simulated data

LRGASP uses SQANTI categories to define the features and metrics evaluated in Challenge 1.

LRGASP Challenge 1 Definitions:

This document shows the performance of your pipeline for criterion 4. Critical data for evaluation according to criteria 2 and 4 will be made available after the closure of the challenge, and therefore pre-evaluation reports cannot be provided. Note that the criterion 1 metrics reported here have been calculated using the GENCODE human v38 and mouse M27 releases, while the final evaluation will use human v39 and mouse M28, to be released after completion of the challenge.

Evaluation of detected transcripts for Challenge 1

Global overview

Value
Number of genes detected 9134
Number of known genes detected 9134
Number of transcripts detected 58801
Number of transcripts associated to a known gene 58789
Number of unique SJ detected 86021

Absolute value Relative value (fraction of unique SJ)
Novel SJ 5115 0.06
Non-canonical SJ 872 0.01
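
As a quick consistency check, the relative values for splice junctions can be reproduced by dividing each count by the number of unique SJ detected. The report does not state the denominator, so this is an assumption, and the helper name `sj_fraction` is illustrative only. Under that assumption the reported 0.06 and 0.01 are fractions, which correspond to roughly 5.95% and 1.01%.

```python
# Counts copied from the global overview table.
UNIQUE_SJ = 86021
NOVEL_SJ = 5115
NON_CANONICAL_SJ = 872


def sj_fraction(count, total=UNIQUE_SJ):
    """Share of unique splice junctions (assumed denominator)."""
    return count / total


for label, count in [("Novel SJ", NOVEL_SJ), ("Non-canonical SJ", NON_CANONICAL_SJ)]:
    frac = sj_fraction(count)
    # Print both forms so the fraction/percent distinction is explicit.
    print(f"{label}: fraction={frac:.2f}, percent={100 * frac:.2f}%")
```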

Evaluation of FSM

Absolute value Relative value (%)
Number of isoforms 11060 -
Reference Match 4414 39.91
5’ reference supported (transcript) 6672 60.33
3’ reference supported (transcript) 5468 49.44
5’ reference supported (gene) 7625 68.94
3’ reference supported (gene) 6006 54.3
Supported Reference Transcript Model (SRTM) 5014 45.33
Reference redundancy Level 1.36 -

Evaluation of ISM

Absolute value Relative value (%)
Number of isoforms 41447 -
5’ reference supported (transcript) 2253 5.44
3’ reference supported (transcript) 3626 8.75
5’ and 3’ reference supported (gene) 150 0.36
5’ reference supported (gene) 3302 7.97
3’ reference supported (gene) 4774 11.52
Supported Reference Transcript Model (SRTM) 150 0.36
Reference redundancy Level 4.29 -
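
The relative values in the structural-category tables appear to be the absolute count divided by the number of isoforms in that category, multiplied by 100. A minimal sketch under that assumption, using the ISM counts above (the helper `pct_of_isoforms` is a hypothetical name, not part of the LRGASP tooling):

```python
ISM_ISOFORMS = 41447  # "Number of isoforms" in the ISM table

# Absolute counts copied from the ISM table.
ism_counts = {
    "5' reference supported (transcript)": 2253,
    "3' reference supported (transcript)": 3626,
    "5' and 3' reference supported (gene)": 150,
    "5' reference supported (gene)": 3302,
    "3' reference supported (gene)": 4774,
}


def pct_of_isoforms(count, n_isoforms):
    """Percentage of the category's isoforms, rounded to two decimals."""
    return round(100 * count / n_isoforms, 2)


ism_pct = {label: pct_of_isoforms(n, ISM_ISOFORMS) for label, n in ism_counts.items()}
for label, pct in ism_pct.items():
    print(f"{label}: {pct}")
```

Applied to the gene-level 5' count of 3302, this rule gives about 7.97.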

Evaluation of NIC

Absolute value Relative value (%)
Number of isoforms 615 -
5’ and 3’ reference supported (gene) 125 20.33
5’ reference supported (gene) 208 33.82
3’ reference supported (gene) 224 36.42
Intron retention incidence 45 7.32

Evaluation of NNC

Absolute value Relative value (%)
Number of isoforms 5667 -
5’ and 3’ reference supported (gene) 2974 52.48
5’ reference supported (gene) 3405 60.08
3’ reference supported (gene) 4106 72.45
Non-canonical SJ incidence 885 15.62
Full Illumina SJ support 5667 100
RT-switching incidence 210 3.71

Evaluation of Simulation

Simulated transcripts were grouped according to different thresholds and attributes, and the metrics were calculated with respect to each of these ground-truth settings. Each set of ground-truth transcripts is evaluated in its own subsection below.

The following metrics and definitions apply to simulated transcripts.

Evaluation of all simulated transcripts

Value
Number of isoforms simulated 27152
True Positive detections (TP) 4174
Number of transcripts associated to TP (Reference Match) 4200
Partial True Positive detections (PTP) 10713
Number of transcripts associated to PTP 44203
False Negative (FN) 13053
False Positive (FP) 10398
Sensitivity 0.15
Precision 0.07
Non Redundant Precision 0.07
Positive Detection Rate 0.52
False Discovery Rate 0.36
False Detection Rate 0.18
Redundancy 3.43
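
The metric definitions are not reproduced in this report, so the formulas in the sketch below are inferred from the numbers and are not taken from the LRGASP documentation. They reproduce the reported Sensitivity, Precision, Positive Detection Rate, False Detection Rate and Redundancy for this table. False Discovery Rate and Non Redundant Precision are omitted because their definitions are harder to infer from a single table. The class and function names are illustrative, and `detected` is the "Number of transcripts detected" from the global overview.

```python
from dataclasses import dataclass


@dataclass
class SimCounts:
    simulated: int        # number of isoforms simulated (ground truth)
    tp: int               # True Positive detections
    tp_transcripts: int   # transcripts associated to TP
    ptp_transcripts: int  # transcripts associated to PTP
    fn: int               # False Negative
    detected: int         # total transcripts in the submission


def sim_metrics(c: SimCounts) -> dict:
    """Inferred formulas (assumptions), rounded to two decimals."""
    found = c.simulated - c.fn  # ground-truth isoforms with a TP or PTP hit
    fp = c.detected - c.tp_transcripts - c.ptp_transcripts
    return {
        "false_positives": fp,
        "sensitivity": round(c.tp / c.simulated, 2),
        "precision": round(c.tp_transcripts / c.detected, 2),
        "positive_detection_rate": round(found / c.simulated, 2),
        "false_detection_rate": round(fp / c.detected, 2),
        "redundancy": round((c.tp_transcripts + c.ptp_transcripts) / found, 2),
    }


all_simulated = SimCounts(
    simulated=27152, tp=4174, tp_transcripts=4200,
    ptp_transcripts=44203, fn=13053, detected=58801,
)
print(sim_metrics(all_simulated))
```

Note that under these assumed formulas TP, PTP and FN do not sum to the number of simulated isoforms (4174 + 10713 + 13053 = 27940), which suggests that some ground-truth isoforms are counted both as TP and as PTP.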

Evaluation of all GENCODE simulation

Value
Number of isoforms simulated 20152
True Positive detections (TP) 3171
Number of transcripts associated to TP (Reference Match) 3191
Partial True Positive detections (PTP) 6430
Number of transcripts associated to PTP 24372
False Negative (FN) 11187
False Positive (FP) 31238
Sensitivity 0.16
Positive Detection Rate 0.44
Redundancy 3.07

Evaluation of only GENCODE transcripts simulated

Value
Number of isoforms simulated 15608
True Positive detections (TP) 2769
Number of transcripts associated to TP (Reference Match) 2789
Partial True Positive detections (PTP) 6092
Number of transcripts associated to PTP 23304
False Negative (FN) 7334
False Positive (FP) 32708
Sensitivity 0.18
Positive Detection Rate 0.53
Redundancy 3.15

Evaluation of only GENCODE transcripts simulated with TPM >= 5

Value
Number of isoforms simulated 9428
True Positive detections (TP) 1919
Number of transcripts associated to TP (Reference Match) 1937
Partial True Positive detections (PTP) 5119
Number of transcripts associated to PTP 20515
False Negative (FN) 2848
False Positive (FP) 36349
Sensitivity 0.2
Positive Detection Rate 0.7
Redundancy 3.41

Evaluation of novelty

Value
Number of isoforms simulated 7000
True Positive detections (TP) 1003
Number of transcripts associated to TP (Reference Match) 1009
Partial True Positive detections (PTP) 4283
Number of transcripts associated to PTP 19831
False Negative (FN) 1866
False Positive (FP) 37961
Sensitivity 0.14
Positive Detection Rate 0.73
Redundancy 4.06
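
The FP values in the subset tables rise as the ground-truth set shrinks. One reading, which is an assumption and not a stated LRGASP definition, is that FP is counted against the whole submission: every detected transcript that is neither a TP nor a PTP hit for the subset counts as a false positive. The sketch below checks that reading against the reported values, using the 58801 detected transcripts from the global overview.

```python
DETECTED = 58801  # "Number of transcripts detected" in the global overview

# (label, transcripts associated to TP, transcripts associated to PTP, reported FP)
subsets = [
    ("all GENCODE simulation", 3191, 24372, 31238),
    ("only GENCODE transcripts", 2789, 23304, 32708),
    ("GENCODE, TPM >= 5", 1937, 20515, 36349),
    ("novelty", 1009, 19831, 37961),
]

rows = []
for label, tp_tx, ptp_tx, reported_fp in subsets:
    # Assumed: FP = all detected transcripts minus those matched (fully or
    # partially) to a ground-truth isoform in this subset.
    derived_fp = DETECTED - tp_tx - ptp_tx
    rows.append((label, derived_fp, reported_fp))
    print(f"{label}: derived FP={derived_fp}, reported FP={reported_fp}")
```

If this reading holds, a transcript that correctly matches an isoform outside a subset is still counted as FP for that subset, so FP values should not be compared directly across the tables.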