TACIT Open Chromatin and Open Chromatin Prediction Tracks (2022-5-10)

Tracks and Descriptions

Methods

For previously published bulk ATAC-seq datasets, we used the irreproducible discovery rate (IDR) optimal peaks. For the mouse snATAC-seq data, we mapped the reads with bowtie2. For the human SNARE-seq2 data, we filtered the snATAC-seq reads with alignmentSieve. For both snATAC-seq datasets, we obtained reproducible peaks with ArchR and used the peaks from the cluster of cells that had been labeled as PV+ interneurons. We obtained enhancers using bedtools and GENCODE transcription start sites mapped to the species in which the data were generated; for species other than mouse and human, we combined these with the RefSeq annotations for that species. We obtained predictions at orthologs in two steps. First, we mapped the motor cortex enhancers to 221 Boreoeutherian mammals in the Zoonomia Cactus alignment using halLiftover and HALPER. Second, we made predictions for the peak summit orthologs +/- 250bp using the machine learning model for the corresponding tissue or cell type.
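The "peak summit ortholog +/- 250bp" step can be illustrated with a minimal Python sketch. This is not the code used to build the tracks. The function name summit_window, the Interval type, the optional chrom_size argument, and the choice of 0-based half-open (BED-style) coordinates are all assumptions made for illustration. Under those assumptions, a summit at position s yields the 500bp interval [s - 250, s + 250), clipped to the ends of the chromosome or scaffold.

```python
from typing import NamedTuple, Optional


class Interval(NamedTuple):
    """A BED-style interval: 0-based start (inclusive), end (exclusive)."""
    chrom: str
    start: int
    end: int


def summit_window(chrom: str, summit: int, flank: int = 250,
                  chrom_size: Optional[int] = None) -> Interval:
    """Return the window of `flank` bp on each side of a peak summit.

    Hypothetical helper: the coordinate convention and the clipping
    behavior are assumptions, not taken from the track methods.
    """
    # Clip at the start of the sequence so coordinates never go negative.
    start = max(0, summit - flank)
    end = summit + flank
    # Clip at the end of the sequence only when its length is known.
    if chrom_size is not None:
        end = min(end, chrom_size)
    return Interval(chrom, start, end)


if __name__ == "__main__":
    # A summit ortholog far from either end of the scaffold.
    print(summit_window("chr1", 1000))
    # A summit ortholog near the start of a short scaffold.
    print(summit_window("scaffold_7", 100, chrom_size=300))
```

Windows that are clipped at a scaffold end are shorter than 500bp. Whether such windows are padded, discarded, or scored as-is by the model is not stated here, so the sketch simply returns the clipped interval.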

Data Access

The data for these tracks are available from the prediction tracks on the TACIT supplement website.

Credits

The Pfenning Lab and the UCSC Genome Browser provided the input data.

References

Srinivasan, Chaitanya, BaDoi N. Phan, Alyssa J. Lawler, Easwaran Ramamurthy, Michael Kleyman, Ashley R. Brown, et al. 2021. "Addiction-Associated Genetic Variants Implicate Brain Cell Type- and Region-Specific Cis-Regulatory Elements in Addiction Neurobiology." Journal of Neuroscience. 41 (43): 9008-9030. https://www.jneurosci.org/content/41/43/9008.long.

Wirthlin, Morgan, Irene M. Kaplow, Alyssa J. Lawler, Jing He, BaDoi N. Phan, et al. 2020. "The Regulatory Evolution of the Primate Fine-Motor System." bioRxiv. https://www.biorxiv.org/content/10.1101/2020.10.27.356733v1.

Wirthlin, Morgan, Xiaomeng Zhang, Irene M. Kaplow, Daniel E. Schaffer, Alyssa J. Lawler, Tobias A. Schmid, et al. "Vocal learning-associated convergent evolution in mammalian regulatory elements and proteins."

Li, Yang E., Sebastian Preissl, Xiaomeng Hou, Ziyang Zhang, Kai Zhang, Yunjiang Qiu, et al. 2021. "An atlas of gene regulatory elements in adult mouse cerebrum." Nature. 598 (7879): 129-136. https://www.nature.com/articles/s41586-021-03604-1.

Bakken, Trygve E., Nikolas L. Jorstad, Qiwen Hu, Blue B. Lake, Wei Tian, Brian E. Kalmbach, et al. 2021. "Comparative cellular analysis of motor cortex in human, marmoset and mouse." Nature. 598 (7879): 111-119. https://www.nature.com/articles/s41586-021-03465-8.

Langmead, Ben and Steven L Salzberg. 2012. “Fast gapped-read alignment with Bowtie 2.” Nature Methods. 9 (4): 357-359. https://www.nature.com/articles/nmeth.1923.

“ENCODE ATAC-seq pipeline.” https://github.com/ENCODE-DCC/atac-seq-pipeline.

Granja, Jeffrey M., M. Ryan Corces, Sarah E. Pierce, S. Tansu Bagdatli, Hani Choudhry, Howard Y. Chang, et al. 2021. “ArchR is a scalable software package for integrative single-cell chromatin accessibility analysis.” Nature Genetics. 53 (3): 403-411. https://www.nature.com/articles/s41588-021-00790-6.

Ramírez, Fidel, Devon P. Ryan, Björn Grüning, Vivek Bhardwaj, Fabian Kilpert, Andreas S. Richter, et al. 2016. “deepTools2: a next generation web server for deep-sequencing data analysis.” Nucleic Acids Research. 44 (W1): W160-W165. https://academic.oup.com/nar/article/44/W1/W160/2499308?login=true.

Kaplow, Irene M., Daniel E. Schaffer, Morgan E. Wirthlin, Alyssa J. Lawler, Ashley R. Brown, Michael Kleyman, et al. 2022. "Inferring mammalian tissue-specific regulatory conservation by predicting tissue-specific differences in open chromatin." BMC Genomics. 23 (1): 291.

Quinlan, Aaron R. and Ira M. Hall. 2010. “BEDTools: a flexible suite of utilities for comparing genomic features.” Bioinformatics 26 (6): 841-842. https://academic.oup.com/bioinformatics/article/26/6/841/244688?login=true.

Frankish, Adam, Mark Diekhans, Anne-Maud Ferreira, Rory Johnson, Irwin Jungreis, Jane Loveland, et al. 2019. “GENCODE reference annotation for the human and mouse genomes.” Nucleic Acids Research 47 (D1): D766-D773. https://academic.oup.com/nar/article/47/D1/D766/5144133?login=true.

O'Leary, Nuala A., Matthew W. Wright, J. Rodney Brister, Stacy Ciufo, Diana Haddad, Rich McVeigh, et al. 2016. “Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation.” Nucleic Acids Research 44 (D1): D733-D745. https://academic.oup.com/nar/article/44/D1/D733/2502674?login=true.

Kaplow, Irene M., Alyssa J. Lawler, Daniel E. Schaffer, Chaitanya Srinivasan, Morgan E. Wirthlin, BaDoi N. Phan, et al. In review. “Relating enhancer genetic variation across mammals to complex phenotypes using machine learning.”

Armstrong, Joel, Glenn Hickey, Mark Diekhans, Ian T. Fiddes, Adam M Novak, Alden Deran, et al. 2020. “Progressive Cactus is a multiple-genome aligner for the thousand-genome era.” Nature 587 (7833): 246-251. https://www.nature.com/articles/s41586-020-2871-y.

Hickey, Glenn, Benedict Paten, Dent Earl, Daniel Zerbino, David Haussler. 2013. “HAL: a hierarchical format for storing and analyzing multiple genome alignments.” Bioinformatics 29 (10): 1341-1342. https://academic.oup.com/bioinformatics/article/29/10/1341/256598?login=true.

Zhang, Xiaoyu, Irene M. Kaplow, Morgan Wirthlin, Tae Yoon Park, Andreas R. Pfenning. 2020. “HALPER facilitates the identification of regulatory element orthologs across species.” Bioinformatics 36 (15): 4339-4340. https://academic.oup.com/bioinformatics/article/36/15/4339/5837107?login=true.

Christmas, Matthew J., Irene M. Kaplow, Diane P. Genereux, Michael X. Dong, Graham M. Hughes, Xue Li, et al. In review. “Evolutionary constraint and innovation across hundreds of placental mammals.”