ChIP-Seq with HA-tagged protein - (Aug/18/2011 )

Hi everybody,

I am trying to define site-specific and off-site binding of a DNA binding protein in human cells (at this point HEK-293 and K-562 cells).

The protein of interest is not expressed naturally in human cells, i.e. it has to be expressed heterologously. Furthermore, at this point there is no antibody available for the protein. Therefore I am using an N-terminal HA tag for ChIP with a ChIP-grade a-HA antibody.

Here's a quick rundown of my protocol:
- transfect (HEK-293) or transduce (K-562) cells with the expression vector
- grow for 48 h
- fix 5x10^7 cells with formaldehyde (1% for 10 min), then quench with glycine
- QC: Western blot (WB) with the same a-HA antibody that I use for the ChIP. Protein expression can also be verified under a fluorescence microscope or by FACS, since my protein is linked via a T2A sequence to mCherry.
- cell lysis and sonication on a Covaris AFA instrument
- binding of the HA-tagged protein to the a-HA antibody immobilized on magnetic beads (protein G) overnight (ON) at 4°C
- wash, crosslink reversal and elution
- ProteinaseK and RNase treatment
- QC: size distribution of the sheared DNA
- DNA end repair, addition of A bases, ligation of (barcoded) sequencing adapters and PCR amplification
- QC: amplification is done on a light-cycler using SYBR-green to monitor the emergence of amplification products.
- PAGE purification of the amplified libraries.
- QC of the libraries on an Agilent Bioanalyzer; I aim for fragments of 200-400bp.
- 76bp SE Illumina sequencing

I get 20-60 million reads that I map back to GRCh37/hg19 using bowtie or bwa. Peak calling is done with MACS.
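
For reference, the mapping and peak-calling step could look roughly like the sketch below. It only builds the command lines and does not run anything. The file names, the index prefix `hg19`, the thread count and the q-value cutoff are placeholders, the use of MACS2 (rather than the original MACS) is an assumption, and so is the input/control sample, which is not part of the protocol described above.

```python
import shlex


def bowtie_command(index_prefix, fastq, sam_out, threads=4):
    """Build a Bowtie 1 command for single-end reads.

    -S writes SAM output, -m 1 discards reads with more than one
    reportable alignment, -p sets the number of threads.
    """
    return ["bowtie", "-S", "-m", "1", "-p", str(threads),
            index_prefix, fastq, sam_out]


def macs2_command(chip_sam, control_sam, name, qvalue=0.05):
    """Build a MACS2 callpeak command for a human sample.

    Assumption: a control (e.g. input DNA) SAM file exists; the
    protocol in this thread does not mention one.
    """
    return ["macs2", "callpeak",
            "-t", chip_sam,      # ChIP (treatment) alignments
            "-c", control_sam,   # control alignments (assumed)
            "-f", "SAM",         # input file format
            "-g", "hs",          # effective genome size preset for human
            "-n", name,          # prefix for output files
            "-q", str(qvalue)]   # q-value (FDR) cutoff


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    print(shlex.join(bowtie_command("hg19", "chip.fastq", "chip.sam")))
    print(shlex.join(macs2_command("chip.sam", "input.sam", "ha_chip")))
```

Keeping only uniquely mapping reads (`-m 1`) is one common choice for ChIP-Seq, not the only one, and the cutoff would need tuning for a dataset with this much background.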

My problem is that I get a LOT of background. One reason is the low percentage of cells that express my protein in the first place: based on mCherry expression, only about 1/3 of the cells light up under the microscope. To alleviate this, I've put together a new expression plasmid that has a selection marker and is Tet-inducible, so that I can hopefully increase this fraction. I haven't had a chance to try it out yet, though. Cell sorting is not an option because of the large number of cells required.
On that note, does anybody have a rough idea of the number of molecules/cell of a protein one needs for a reliable ChIP-Seq experiment?

Besides the amount of protein, I suspect that the main problem is the a-HA antibody that I am using for the ChIP. Has anybody already used the HA tag for ChIP?
Which tags might be better choices in this setting? I am considering switching to another tag, namely GST or His. I've seen a study in S. cerevisiae that used a myc-specific antibody for ChIP, but I guess that if I use that in human cells I will pull down all the c-myc targets.

It is not known if and where my protein of interest binds in the human genome - that's exactly what I am trying to figure out. Even being able to say with confidence that there are no binding sites would be fine with me. As a positive control I use a known binding sequence for my protein that I inserted into the HEK-293 genome (I don't have a positive control in K-562 cells).
I do not see this positive control after peak calling. At this point I fear that potential peaks might not be visible, either because of insufficient enrichment (caused by the low number of protein molecules per cell or by inefficient pull-down of bound sequences) or because they are covered by the strong background.
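
One way to look at the positive control independently of the peak caller would be to count reads in a window around the inserted sequence and compare them with a control library, normalized for sequencing depth. The following is a minimal sketch under several assumptions: a control (e.g. input) library exists, read start positions on the relevant chromosome have already been extracted, and the function names, window coordinates and pseudocount are made up for illustration.

```python
def count_in_window(read_starts, start, end):
    """Count reads whose start position falls in the half-open window [start, end)."""
    return sum(1 for pos in read_starts if start <= pos < end)


def fold_enrichment(chip_count, chip_total, control_count, control_total,
                    pseudocount=1.0):
    """Depth-normalized ChIP/control ratio for a single window.

    Counts are converted to reads per million mapped reads (RPM).
    The pseudocount (an arbitrary choice here) avoids division by
    zero when the control has no reads in the window.
    """
    chip_rpm = (chip_count + pseudocount) / chip_total * 1e6
    control_rpm = (control_count + pseudocount) / control_total * 1e6
    return chip_rpm / control_rpm


if __name__ == "__main__":
    # Toy numbers, not real data: read starts near a hypothetical
    # positive-control site spanning positions 1000-1400.
    chip_starts = [990, 1005, 1100, 1210, 1399, 1400, 2500]
    control_starts = [1050, 3000]
    chip_n = count_in_window(chip_starts, 1000, 1400)
    control_n = count_in_window(control_starts, 1000, 1400)
    print(chip_n, control_n)
    print(fold_enrichment(chip_n, 20_000_000, control_n, 20_000_000))
```

A ratio close to 1 at the positive control would point towards the pull-down itself rather than the peak caller; a clearly elevated ratio that MACS still misses would point towards the background or the peak-calling settings.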

I'd appreciate any suggestions,



Try pre-clearing with beads alone, and maybe incubate your lysates with the antibody first O/N and then add the beads in the morning for ~2 hr. Maybe also make an inducible stable cell line, but it sounds like you're already headed down that path.