Primate dataset

This simulation is intended to be a relatively simple alignment problem, as very little evolution has taken place between the four lineages.

The root genome consisted of hg18 chr20, chr21 and chr22, with annotations populated from the mgcGenes, knownGene, cpgIslandExt and ensGene tracks of the UCSC Table Browser. Details of infile set creation are available in the evolverInfileGeneration project.

The root genome was evolved for a total distance of 1.0 in 100 Evolver steps of 0.01 each; this forms the simulation burn-in. The final genome of the burn-in served as the ancestor for this simulation.

Tree

                ((simGorilla:0.008825,(simHuman:0.0067,simChimp:0.006667)sHuman-sChimp:0.00225)sG-sH-sC:0.00968,simOrang:0.018318);
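
The branch lengths in the tree can be summed to see how little divergence separates each tip from the ancestor. The following is a minimal Python sketch, not part of the dataset tooling. The parser handles only simple Newick strings like the one above, and naming the unlabeled root node `ancestor` is an assumption based on the MRCA naming used in the file listings on this page.

```python
import re

NEWICK = ("((simGorilla:0.008825,(simHuman:0.0067,simChimp:0.006667)"
          "sHuman-sChimp:0.00225)sG-sH-sC:0.00968,simOrang:0.018318);")

PUNCTUATION = {"(", ")", ",", ";"}


def parse_newick(text, root_name="ancestor"):
    """Return {node: (parent, branch_length)} for a simple Newick string.

    root_name is an assumed label for the unlabeled root node.
    """
    tokens = re.findall(r"[(),;]|[^(),;]+", text.strip())
    pos = 0
    parents = {}

    def parse_node():
        nonlocal pos
        children = []
        if tokens[pos] == "(":
            pos += 1  # consume "("
            children.append(parse_node())
            while tokens[pos] == ",":
                pos += 1  # consume ","
                children.append(parse_node())
            pos += 1  # consume ")"
        label = ""
        if tokens[pos] not in PUNCTUATION:
            label = tokens[pos]
            pos += 1
        name, _, length = label.partition(":")
        name = name or root_name
        for child_name, child_length in children:
            parents[child_name] = (name, child_length)
        return name, float(length) if length else 0.0

    parse_node()
    return parents


def distance_to_root(name, parents):
    """Sum branch lengths from a node up to the root of the tree."""
    total = 0.0
    while name in parents:
        name, length = parents[name]
        total += length
    return total


parents = parse_newick(NEWICK)
for tip in ("simHuman", "simChimp", "simGorilla", "simOrang"):
    print(f"{tip}: {distance_to_root(tip, parents):.6f}")
```

Every tip ends up less than 0.02 from the ancestor, which is consistent with the description of this dataset as a relatively simple alignment problem.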
              

Summary Stats

Genome      Chr      Size (bp)
simChimp    A       53,121,445
            B       85,778,862
            C       35,661,804
            D       10,574,168
            Total  185,136,279
simGorilla  A       53,120,926
            B       85,848,133
            C       35,654,756
            D       10,570,608
            Total  185,194,423
simHuman    A       53,106,993
            B       85,835,872
            C       35,630,306
            D       10,572,275
            Total  185,145,446
simOrang    B       85,903,762
            C       35,683,973
            D       10,564,720
            E       37,692,687
            F       15,493,520
            Total  185,338,662
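
As a quick consistency check, the per-chromosome sizes can be summed and compared with the reported totals. This is an illustrative Python sketch; the dictionary names are hypothetical and the numbers are copied by hand from the summary statistics above.

```python
# Chromosome sizes (bp), copied from the summary statistics.
SIZES = {
    "simChimp": {"A": 53_121_445, "B": 85_778_862, "C": 35_661_804, "D": 10_574_168},
    "simGorilla": {"A": 53_120_926, "B": 85_848_133, "C": 35_654_756, "D": 10_570_608},
    "simHuman": {"A": 53_106_993, "B": 85_835_872, "C": 35_630_306, "D": 10_572_275},
    "simOrang": {"B": 85_903_762, "C": 35_683_973, "D": 10_564_720,
                 "E": 37_692_687, "F": 15_493_520},
}

# Totals as reported in the summary statistics.
REPORTED_TOTALS = {
    "simChimp": 185_136_279,
    "simGorilla": 185_194_423,
    "simHuman": 185_145_446,
    "simOrang": 185_338_662,
}


def check_totals(sizes, reported):
    """Return {genome: bool}, True where chromosome sizes sum to the reported total."""
    return {genome: sum(chroms.values()) == reported[genome]
            for genome, chroms in sizes.items()}


for genome, ok in check_totals(SIZES, REPORTED_TOTALS).items():
    status = "OK" if ok else "MISMATCH"
    print(f"{genome}: {sum(SIZES[genome].values()):,} bp ({status})")
```

Note that simOrang carries chromosomes B through F rather than A through D, so any per-chromosome comparison across genomes should not assume a shared chromosome set.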

Files

Script to download and create the correct directory structure: downloadPrimates.sh

An analysis package has the following directory structure:

packagePrimates/
..  README.txt
..  annotations/
..  predictions/
..  regions/
..  sequences/
..  truths/
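
For anyone assembling a package by hand rather than running the download script, the empty layout above could be created along these lines. This is only a sketch in Python: `make_package_skeleton` is a hypothetical helper, not part of downloadPrimates.sh, and it creates empty directories and an empty README.txt without fetching any data.

```python
from pathlib import Path

# Subdirectories shown in the package listing.
SUBDIRS = ["annotations", "predictions", "regions", "sequences", "truths"]


def make_package_skeleton(parent="."):
    """Create an empty packagePrimates/ tree under `parent` and return its path."""
    package = Path(parent) / "packagePrimates"
    for name in SUBDIRS:
        # exist_ok=True makes the call safe to repeat on an existing package.
        (package / name).mkdir(parents=True, exist_ok=True)
    (package / "README.txt").touch(exist_ok=True)
    return package
```

Calling `make_package_skeleton("/some/work/dir")` would leave existing files untouched, since directories and the README are only created when absent.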
            

These directories may be populated with the following:

  • README.txt
  • annotations/
  • predictions/ - place your .maf files in here.
  • regions/ - regional analysis takes place in here.
  • sequences/
  • truths/ - the true mafs are placed in here
    • Alignment to the MRCA, ancestor

      simPrimates.ancestor.maf.gz (143 MB)

      aligns: {simHuman, simChimp, sHuman-sChimp, simGorilla, sG-sH-sC, simOrang, ancestor}

      version: 2

      md5sum: 1e2417d2ae8b4cf2743d5e740b7c5ed3

      sha1sum: fac50cbc86e759a1794208ee6e3f1b7d15e95fe2

    • Alignment to the Root, root

      simPrimates.burnin.maf.gz (581 MB)

      aligns: {simHuman, simChimp, sHuman-sChimp, simGorilla, sG-sH-sC, simOrang, ancestor, root}

      version: 2

      md5sum: 4fb72a9f14cf016c0d7b906d25e4731f

      sha1sum: 6133a7de6788ed71fd2bfad9fea00fb9a439f163

    • Alignments to the MRCA and Root, with no paralogous blocks, ancestor.noparalogies, burnin.noparalogies

      simPrimates.noparalogyMafs.maf.gz (472 MB)

      aligns: {simHuman, simChimp, sHuman-sChimp, simGorilla, sG-sH-sC, simOrang, ancestor, root}

      version: 2

      md5sum: 3fc4fcb8fa64958f2a9d655b992387f7

      sha1sum: c8714fd9adb64e27a615f08d35c900529034b2f7


Tree drawn using phyfi.