dREG gateway
Licensed according to this deed.
Category
Published on
Abstract
The dREG gateway allows users to identify the location of promoters and enhancers using PRO-seq, GRO-seq, or ChRO-seq data. Our genomes encode not just protein-coding genes, but also the transcriptional regulatory elements (TREs) that encode the temporal and spatial patterns of gene expression. Understanding the location and activity of both TREs and their target genes is an important challenge in genomics, with broad applications in agriculture and medicine. Recent advances in biochemical methods have enabled researchers to identify the location of RNA polymerase, the enzyme that transcribes DNA into RNA. One such technique, called precision run-on and sequencing (PRO-seq) (Kwak et al. 2013), allows researchers to develop a genome-wide map of the location and orientation of actively transcribing RNA polymerase in a population of cells. Our lab recently developed a bioinformatic method, called dREG, that identifies the location and boundaries of active TREs using maps of primary transcription collected by PRO-seq (Danko et al. 2015). Our approach uses a machine learning technique called support vector regression (SVR) to learn a pattern of RNA polymerase that is associated with active TREs. Once we have trained a model to recognize TREs using PRO-seq data from a handful of cell lines, we can use dREG to identify the location of TREs in any cell or tissue sample for which PRO-seq data is available. To allow our technique to be used broadly by the genomics community, we have implemented an optimized version of dREG that uses graphics processing units (GPUs) to accelerate computation. The optimized package analyzes new data up to 100 times faster than our published version.
cgd24@cornell.edu, zw355@cornell.edu
Site
Users
cyoun
Cite this work
Researchers should cite this work as follows: