Enhancers are segments of DNA that increase the expression of a nearby gene. This post provides an overview of enhancer biology and the detection of enhancers at scale.
This is the first post in a series called Genomics for Statisticians. In this series I will explore some important ideas in modern genomics from the perspective of a non-biologist. When I began research in statistical genomics, I quickly discovered that my knowledge of genomics was not quite up to par. This series is the result of my effort to fill that gap. I hope other researchers in genomics who do not have extensive formal training in biology (e.g., statisticians, computer scientists) will find this series helpful as well. The posts should be reasonably self-contained. My goal is to explore concepts at roughly the level of a college biology class. The first post in the series is on enhancers.
DNA sequence analysis: Enhancers harbor transcription factor binding motifs (i.e., very short stretches of DNA that help bind transcription factors). Furthermore, enhancers are often conserved across species. Thus, we can try to predict whether a region of the genome is an enhancer simply by looking at the corresponding primary sequence.
Biochemical annotations: As we have discussed, several biochemical annotations correlate with enhancer activity. Genome-wide, we can search for (i) histone modifications (e.g., the presence of H3K27ac and H3K4me1), (ii) transcription factor binding, (iii) open chromatin, (iv) DNA methylation, and (v) the initiation of transcription. Many assays, both bulk-tissue and single-cell, provide us with this information (see, for example, DNase-seq, PRO-seq, and ChIP-seq).
eQTL mapping: For a given SNP and a given nearby gene, we can test whether expression of that gene differs significantly across the genotypes at the SNP. If so, the SNP may lie within an enhancer for that gene. A downside of this approach is that it operates at the resolution of linkage disequilibrium blocks: the associated SNP may simply be correlated with a nearby causal variant rather than lie in the enhancer itself.
3D conformation mapping: Assays such as Hi-C probe the 3D conformation of a chromosome. If a given region of DNA is in close spatial proximity to a known promoter, then that region might be an enhancer.
CRISPR-based approaches: See next section.
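To make the eQTL mapping idea above concrete, here is a minimal sketch of a single SNP-gene test. It assumes genotypes are coded as minor-allele dosages (0, 1, 2) and uses a permutation test on the regression slope. The function names (`slope`, `eqtl_permutation_test`) and the toy data are hypothetical, chosen for illustration only. Real eQTL analyses typically also adjust for covariates such as population structure and correct for multiple testing, neither of which is shown here.

```python
import random
from statistics import mean


def slope(genotypes, expression):
    """Least-squares slope of expression regressed on genotype dosage (0, 1, 2)."""
    g_bar, e_bar = mean(genotypes), mean(expression)
    sxy = sum((g - g_bar) * (e - e_bar) for g, e in zip(genotypes, expression))
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    return sxy / sxx  # undefined (division by zero) if every individual has the same genotype


def eqtl_permutation_test(genotypes, expression, n_perm=2000, seed=0):
    """Return the observed slope and a two-sided permutation p-value."""
    rng = random.Random(seed)
    observed = slope(genotypes, expression)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        # Shuffling expression breaks any SNP-expression association,
        # giving a draw from the null distribution of the slope.
        rng.shuffle(shuffled)
        if abs(slope(genotypes, shuffled)) >= abs(observed):
            hits += 1
    # The add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (made up): 12 individuals, 4 per genotype class.
genotypes = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
expression = [5.1, 4.8, 5.3, 5.0, 6.0, 6.2, 5.9, 6.4, 7.1, 6.8, 7.3, 7.0]

effect, p_value = eqtl_permutation_test(genotypes, expression)
print(f"estimated effect per allele: {effect:.2f}, permutation p-value: {p_value:.4f}")
```

A small p-value here says only that the SNP is associated with expression. Because of linkage disequilibrium, it does not by itself show that the SNP sits inside an enhancer.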
CRISPR–Cas9 (nuclease active) makes a single cut inside the enhancer. The cell uses non-homologous end joining to repair the cut. Generally, a few bases are inserted or deleted during this process, altering the sequence and consequently deactivating the targeted enhancer.
CRISPRi (short for CRISPR interference) “turns off” a candidate enhancer without altering its associated DNA sequence.