ChIP-PCR worked, ChIP-seq didn't - (Sep/01/2013 )

Hi,

I have recently done ChIP-PCR for several potentially new target genes of my TF, which worked very well. I then repeated the experiment to use the DNA for Illumina HiSeq, but first tested the ChIP for one of my targets by PCR, again with a good result.

When I sent my DNA to the core lab for sequencing, they said that the amount of DNA was very low (which was to be expected) and they used the NEBNext Ultra library prep kit, which includes a PCR step for the library preparation. Samples were then sent to another core lab for the sequencing.

The results were quite disappointing: we couldn't find any relevant peaks, not even near the genes that I had used for PCR. Instead, there were a whole bunch of reads (about 10% in the input samples and about 30% in the ChIP'd samples) allegedly coming from Plasmodium (my DNA was mouse), although now they believe that these are just repeat sequences coming from the mouse. Also, the input DNA doesn't look like genomic DNA but really has distinct peaks.

My first guess was that they somehow swapped samples, but they say that's impossible. So my questions are: is it possible that some kind of bias was introduced during the library prep? How is it possible that we don't see any peaks near any of the known target genes of this TF, whereas we do see enrichment with PCR? Shouldn't the HiSeq at least pick those up? They say that it may be because with PCR we are looking at very distinct regions, which is not the case with sequencing, but I believe that's just BS.

Any thoughts please?

-roelq-

Hi roelq,

How many reads did you get and how many could be mapped to the genome? Did you use salmon sperm DNA to block non-specific interactions with the beads?

When you do ChIP PCR to see the enrichment, what control region is used for comparison?

We had the same problem when we first tried ChIP-seq. What we got from the first try were mostly repetitive sequences (mainly simple repeats). Also, the number of reads and the number that were mappable were very low. We figured the problem might lie in the use of salmon sperm DNA and in the library preparation. On the second try, we did not use salmon sperm DNA and made sure our libraries were good. Then the sequencing gave us many more reads (>10 million), over 80% of which could be mapped to the genome, and only a small portion were repetitive sequences.
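As a rough illustration of the numbers mentioned above, here is a minimal Python sketch that turns read counts into the fractions worth looking at first. The function name `flag_library` and the default cut-offs are hypothetical: the 10 million reads and 80% mapped figures simply echo what a good run gave us, and the 10% repeat limit is an arbitrary placeholder, not an established standard. The counts themselves would have to come from whatever your aligner or core lab reports.

```python
def flag_library(total_reads, mapped_reads, repeat_reads,
                 min_reads=10_000_000,   # assumption: echoes ">10 million" from a good run
                 min_mapped=0.80,        # assumption: echoes "over 80% mapped"
                 max_repeat=0.10):       # assumption: arbitrary placeholder cut-off
    """Return a list of warnings for a sequenced ChIP or input library."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")

    mapped_fraction = mapped_reads / total_reads
    repeat_fraction = repeat_reads / total_reads

    warnings = []
    if total_reads < min_reads:
        warnings.append(f"low read count: {total_reads}")
    if mapped_fraction < min_mapped:
        warnings.append(f"low mapped fraction: {mapped_fraction:.2f}")
    if repeat_fraction > max_repeat:
        warnings.append(f"high repeat fraction: {repeat_fraction:.2f}")
    return warnings


# Made-up counts for illustration only.
good = flag_library(25_000_000, 21_000_000, 2_000_000)
bad = flag_library(5_000_000, 2_000_000, 1_500_000)
print(good)
print(bad)
```

An empty list only means none of these crude checks fired; it says nothing about whether the IP actually enriched anything.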

Have you had some bioinformatics people look at your data and apply peak calling based on FDR?

-postdoc-

Hi postdoc,

For ChIP we used the Cell Signaling SimpleChIP kit with magnetic beads, which specifically states that it doesn't use a DNA blocking agent such as salmon sperm. The blocking is apparently done with BSA.

As I said, the library prep was done with the NEBNext Ultra kit. According to the core lab, this went fine, but I personally don't really know which parameters to look at to decide whether the library prep was good or not. Could you please enlighten me?

We performed HiSeq with ~25M reads/sample. Peak calling was done with MACS.

Thx a lot already.

-roelq-

Hi roelq!

We've had that same problem (IPs looked like input w/o any discernible peaks) before and never found out what caused it (we are prepping the libraries ourselves with the NEBNext kits and adapters). Besides the fact that I don't trust qPCR a whole lot (we've had great qPCR enrichments with crappy sequencing results, and vice versa), something could always go wrong during the library prep.

-NemaToStella-

Hey NemaToStella,

Well, in our case our inputs look like IPs, with discernible peaks. That's just too strange, wouldn't you say?

We have checked for Plasmodium contamination now, and of course we can't find any. Also redid PCR for one of the targets on the samples before and after library prep, and see a nice band in all samples. So I really don't understand why the sequencing shouldn't be able to pick that up.

I don't really understand why you wouldn't trust PCR? If you include the appropriate controls, I think you can be quite sure that what you see is real. I have for instance used different primer sets for the same gene where I see enrichment with primers spanning the TF binding site and no enrichment with primers further up. That's a good control imo. Besides that, we use input, IgG and no treatment (our TF only binds after treatment) as controls.
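For anyone comparing numbers, here is a minimal Python sketch of two common ways to express ChIP-qPCR enrichment against the input and IgG controls mentioned above. It assumes roughly 100% amplification efficiency (one doubling per cycle), and the function names and Ct values are made up for illustration; they are not from my experiment.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction):
    """Percent-of-input for an IP sample.

    input_fraction is the share of chromatin saved as input (e.g. 0.01 for 1%).
    Assumes perfect doubling per PCR cycle.
    """
    # Shift the input Ct to what 100% of the chromatin would have given.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def fold_over_igg(ct_ip, ct_igg):
    """Fold enrichment of the specific IP over the IgG control at one primer set."""
    return 2.0 ** (ct_igg - ct_ip)


# Hypothetical Ct values: primers spanning the binding site.
print(percent_input(ct_ip=25.0, ct_input=25.0, input_fraction=0.01))
print(fold_over_igg(ct_ip=25.0, ct_igg=28.0))
```

Running the same two calculations for the upstream primer set and for the untreated sample puts all the controls on one scale.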

So with your experience, do you still think it is worth the money to do ChIP-seq if you can't be sure you will get a result? And when you do get a result, do you trust it?

-roelq-

It is not uncommon to see peaks in the input sample, e.g. at transcription start sites, enhancers etc. (i.e. at sites of accessible chromatin). But if your input looks EXACTLY like the IP, then I'd be worried (seen that before, too).

With qPCR, in the past I've had 3-fold enrichment with beautiful sequencing results, and also > 28-fold enrichment with IP samples that looked like input (same for other people in my lab). This wasn't as big a problem, as we routinely barcoded our samples and sequenced 10 per HiSeq lane.

Would multiplexing be an option for you? That'd save you money; if you just wanted to see how well your library prep worked, you could spike in a little of that sample...
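To get a feel for what multiplexing or a small spike-in would give you, here is a minimal Python sketch that splits a lane's reads according to pooling ratios. The lane yield of 150 million reads is a placeholder assumption (ask your core lab for the real figure), and `reads_per_sample` and the sample names are hypothetical.

```python
def reads_per_sample(lane_reads, shares):
    """Expected reads per barcoded library, given relative pooling shares.

    shares maps a sample name to its relative amount in the pool.
    This ignores real-world unevenness between libraries.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("pooling shares must sum to a positive number")
    return {name: lane_reads * share / total for name, share in shares.items()}


LANE_READS = 150_000_000  # assumption: placeholder lane yield, not a measured value

# Ten barcoded samples pooled equally on one lane.
equal_pool = {f"sample{i}": 1.0 for i in range(1, 11)}
print(reads_per_sample(LANE_READS, equal_pool))

# Nine regular samples plus a test library spiked in at a tenth of a share.
spike_pool = {f"sample{i}": 1.0 for i in range(1, 10)}
spike_pool["test_library"] = 0.1
print(reads_per_sample(LANE_READS, spike_pool)["test_library"])
```

Even a small share can be enough to check mapping rate and repeat content before paying for full depth.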

-NemaToStella-