Ming "Tommy" Tang's Twitter Thread

1/ You think you're mapping transcription factor binding sites, but your assay might be lying to you. Here's what you may miss about ATAC-seq and DNase-seq.

2/ There are multiple ways to identify regulatory elements in the genome: ATAC-seq, DNase-seq, MNase-seq, FAIRE-seq. Each uses a different enzyme or method to probe chromatin structure. They're not interchangeable. And they're not equally accurate.

3/ ATAC-seq became the dominant method because it's fast, requires fewer cells, and the protocol is straightforward. It uses Tn5 transposase to simultaneously cut and tag accessible DNA. But here's the problem: Tn5 doesn't discriminate the way you think it does.

4/ Tn5 cuts BOTH:
→ Nucleosome linker DNA (the spaces between nucleosomes)
→ Open chromatin regions (where transcription factors actually bind)
You get a mixed signal: nucleosome positioning data + TF binding information, all jumbled together.
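One common way to pull the two signals apart is to split paired-end fragments by length. Here is a minimal sketch of that idea. The cutoffs (under 100 bp for nucleosome-free, 180 to 247 bp for mononucleosome) are assumed, commonly used values rather than anything fixed by the assay, and the function names are made up for illustration. Check them against your own fragment-size distribution.

```python
from collections import Counter

# Assumed cutoffs in bp; adjust to your own fragment-size histogram.
NFR_MAX = 100                  # shorter fragments: likely nucleosome-free
MONO_MIN, MONO_MAX = 180, 247  # likely spanning a single nucleosome


def classify_fragment(length: int) -> str:
    """Assign a fragment to a size class based on its length in bp."""
    if length < NFR_MAX:
        return "nucleosome_free"
    if MONO_MIN <= length <= MONO_MAX:
        return "mononucleosome"
    return "other"


def split_by_size(fragment_lengths):
    """Count how many fragments fall into each size class."""
    return Counter(classify_fragment(n) for n in fragment_lengths)


# Toy fragment lengths, not real data.
counts = split_by_size([45, 80, 99, 150, 190, 200, 247, 320])
print(counts["nucleosome_free"], counts["mononucleosome"], counts["other"])
```

Short fragments lean toward TF-accessible DNA and longer ones toward nucleosome positioning, but the split is a heuristic, not a clean separation.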

5/ DNase-seq seems cleaner at first glance. DNase I preferentially cuts open chromatin regions, where DNA is truly accessible. You'd think this gives you a purer signal for TF binding sites. But there's still a critical flaw.

6/ Every enzyme has sequence bias. Tn5 prefers certain DNA sequences over others. So does DNase I. When you're trying to identify the exact location of a transcription factor binding site (the "footprint"), these biases distort your results.
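To make "sequence bias" concrete, here is a rough sketch of one way to estimate it: compare how often each k-mer shows up around cut sites with how often it shows up in the sequence overall. The function names, the choice of k, and the toy sequence are all assumptions for illustration. Published tools use more careful models, and the bias is often estimated from a naked (protein-free) DNA control.

```python
from collections import Counter


def kmer_at(seq, pos, k):
    """Return the k-mer centered on a cut position, or None if it runs off the sequence."""
    start = pos - k // 2
    if start < 0 or start + k > len(seq):
        return None
    return seq[start:start + k]


def cut_bias(seq, cut_positions, k=6):
    """Ratio of each k-mer's frequency at cut sites to its background frequency.

    Values above 1 mean the enzyme cuts that k-mer more often than expected by chance.
    """
    cut_counts = Counter()
    for pos in cut_positions:
        kmer = kmer_at(seq, pos, k)
        if kmer is not None:
            cut_counts[kmer] += 1

    n_cut = sum(cut_counts.values())
    if n_cut == 0:
        return {}

    # Background: every k-mer in the sequence, whether it was cut or not.
    bg_counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    n_bg = sum(bg_counts.values())

    return {
        kmer: (count / n_cut) / (bg_counts[kmer] / n_bg)
        for kmer, count in cut_counts.items()
    }


# Toy example: every cut lands on "AC", so "AC" looks strongly preferred.
bias = cut_bias("ACGTACGTAC", cut_positions=[1, 5, 9], k=2)
print(bias)
```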

7/ The footprint problem is serious: You see a pattern that looks like TF binding, but part of that pattern comes from the enzyme's cutting preferences, not actual protein-DNA interaction. Without correcting for this, you're chasing artifacts.
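A simple way to see the artifact is to compare observed cuts with the cuts you'd expect from sequence bias alone. The sketch below assumes you already have a per-position bias value for the window around a motif (for example from a k-mer model); the function names and the pseudocount are illustrative choices, not a published method.

```python
def expected_cuts(observed, bias):
    """Redistribute the window's total cuts in proportion to per-position bias."""
    if len(observed) != len(bias):
        raise ValueError("observed and bias must have the same length")
    total = sum(observed)
    bias_sum = sum(bias)
    return [total * b / bias_sum for b in bias]


def observed_over_expected(observed, bias, pseudocount=1.0):
    """Per-position observed/expected ratio; values well below 1 suggest real protection."""
    expected = expected_cuts(observed, bias)
    return [(o + pseudocount) / (e + pseudocount) for o, e in zip(observed, expected)]


# Toy window with a dip in the middle (positions 2 and 3). Not real data.
observed = [10, 10, 2, 2, 10, 10]

# Case 1: the enzyme rarely cuts the central sequence, so the dip is explained by bias.
print(observed_over_expected(observed, [1, 1, 0.2, 0.2, 1, 1]))

# Case 2: flat bias, so the same dip stays after correction.
print(observed_over_expected(observed, [1, 1, 1, 1, 1, 1]))
```

In case 1 the ratios sit near 1 everywhere, so the "footprint" is just the enzyme avoiding that sequence. In case 2 the central ratios stay well below 1, which is what a real footprint should look like.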

8/ A 2013 Nature paper demonstrated how bad this can get: The authors showed that sequence-specific cutting biases create false motif footprints. Translation: You might be "discovering" binding sites that don't actually exist.

9/ MNase-seq and FAIRE-seq have their own quirks: MNase cuts linker DNA preferentially and has its own sequence bias. FAIRE-seq relies on crosslinking and phenol-chloroform extraction (messy, variable).

No method is perfect. Each reveals a different slice of chromatin architecture.

10/ So what do you do?
→ Understand what your assay actually measures
→ Use computational tools that correct for enzymatic bias
→ Integrate multiple methods when possible
→ Never trust a single experiment to tell the whole story

11/ Key takeaways:
• ATAC-seq is convenient but mixes nucleosome and accessibility signals
• DNase-seq is more specific but still has cutting bias
• Motif footprints can be artifacts if bias isn't corrected
• Different assays reveal different aspects of chromatin structure
• Always validate findings with orthogonal methods

I hope you've found this post helpful. Follow me for more. Subscribe to my FREE newsletter chatomics to learn bioinformatics
