Analysis of CpG islands and promoters for methylation study - (May/12/2005 )
I want to analyze a gene that I'm working with for methylation. However, I don't know which part of the gene I should look into. I'm aware that the CpG islands involved in transcriptional regulation are located in the 5' region of genes... But the gene I'm studying has one CpG island in the 5' UTR, where the first exon lies, and one further down, where the first coding exon is located. So I'm a bit confused: should I look at both islands, or just one?
Thanks in advance!
Normally CpG islands lie at the 5' UTR and can extend into the first exon.
Which program and parameters did you use to determine your islands?
It will be worth your while to look at both islands.
Thanks for your reply, I used CpG island searcher.
Perhaps I should look at both islands, the 5'UTR is quite long, so the 2 CpG islands are separated by 30 000 bp...
Parameters: I tried both (> 400 bp, GC > 60%, Obs/Exp > 0.6) and (> 500 bp, GC > 55%, Obs/Exp > 0.65) and got similar results.
I've seen different recommendations in publications. Is there a "gold standard" for what to use?
If it's possible, can you post your sequence (5' flanking 1 kb + 1st exon) here so we can have a look.
yes please post your sequence.
Well, what I found out through reviewers is that the "gold standard" for a CpG island is the 500 bp definition described by Takai and Jones. I have not seen any papers subsequent to this, but it has apparently superseded the Gardiner-Garden and Frommer definition of 200 bp, GC = 50%, obs/exp = 0.6.
CpG island searcher is written by Takai so you are on the right track!
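For anyone who wants to check a candidate region by hand, here is a minimal Python sketch of the two sets of criteria quoted in this thread. It assumes the commonly used observed/expected formula (CpG count x length) / (C count x G count), and it treats every threshold as "greater than or equal to", which is an assumption; the function names `cpg_stats` and `meets` are made up for illustration. It is not the CpG island searcher algorithm, only the per-region arithmetic.

```python
def cpg_stats(seq):
    """Return (length, GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides on the given strand
    gc_frac = (c + g) / n if n else 0.0
    # Assumed formula: Obs/Exp = (CpG count * length) / (C count * G count)
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return n, gc_frac, obs_exp


# Thresholds as quoted in this thread: (min length in bp, min GC fraction, min Obs/Exp)
CRITERIA = {
    "Gardiner-Garden & Frommer": (200, 0.50, 0.60),
    "Takai & Jones": (500, 0.55, 0.65),
}


def meets(seq, min_len, min_gc, min_oe):
    """True if the whole region passes all three thresholds (>= is an assumption)."""
    n, gc_frac, obs_exp = cpg_stats(seq)
    return n >= min_len and gc_frac >= min_gc and obs_exp >= min_oe


if __name__ == "__main__":
    region = "CG" * 100  # synthetic 200 bp toy region, not a real promoter
    for name, thresholds in CRITERIA.items():
        print(name, meets(region, *thresholds))
```

Note that a region can pass one definition and fail the other purely on length, which is part of what the rest of this thread argues about.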
I don't think the new standard by Takai and Jones is by any means the 'gold standard'.
The main improvement they claim over the Gardiner-Garden and Frommer standard is that their algorithm can exclude some Alu sequences in the promoter when predicting CpG islands. Let's test it using the human E-cadherin promoter, which contains an Alu element at 374-718. The result returned by their program is a CpG island from 616-1193. It does NOT exclude the Alu but mistakenly includes it.
Actually, E-cadherin has a very short (157 bp) CpG island from 811-967. Methylation mapping demonstrates methylation of this area silences transcription.
A CpG island is a concept defined by humans, so there is no such thing as a 'gold standard' for it.
I wish I had you as a referee for my paper. I agree that the definition of a CpG island is an arbitrary one, and it also depends on your window size.
I will post the reviewers comments when I get to work tomorrow!
I will be happy to read them. The definition of a CpG island is an arbitrary one. It just tells you that an area in the sequence is CpG rich; it is not like the definition of a transcription start site, coding region, etc.
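To make the window-size point concrete, here is a rough Python sketch of a plain sliding-window scan. It is a simplification and not the Takai and Jones algorithm (no window shifting or merging); the function name `scan_windows`, the default thresholds, and the toy sequence are all assumptions for illustration. The toy sequence is synthetic: a 160 bp CpG-rich stretch flanked by AT-rich sequence.

```python
def scan_windows(seq, window, step=1, min_gc=0.55, min_oe=0.65):
    """Return 0-based start positions of windows passing the GC and Obs/Exp cutoffs."""
    seq = seq.upper()
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        w = seq[start:start + window]
        c, g, cpg = w.count("C"), w.count("G"), w.count("CG")
        if c == 0 or g == 0:
            continue  # Obs/Exp is undefined without both C and G
        gc_frac = (c + g) / window
        obs_exp = (cpg * window) / (c * g)
        if gc_frac >= min_gc and obs_exp >= min_oe:
            hits.append(start)
    return hits


# Synthetic example: a short CpG-rich stretch inside AT-rich flanks
toy = "AT" * 100 + "CG" * 80 + "AT" * 100

small = scan_windows(toy, window=100)
large = scan_windows(toy, window=500)
print("100 bp windows passing:", len(small))
print("500 bp windows passing:", len(large))
```

In this toy case the short CpG-rich stretch is picked up with 100 bp windows but diluted below the GC cutoff in every 500 bp window, so a short island of the kind described for E-cadherin could be missed or reported with different boundaries depending on the window you choose.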
Here are some snippets from the reviewers' comments.
forth by Takai and Jones (PNAS, 2002) which is more stringent and
therefore more accurate than the definition of Gardiner-Garden and
1. Definition of CpG islands, definition of CpG islets:
Gardiner-Garden and Frommer (1987) initially defined a CpG island as a 200-bp region of DNA with GC content > 50% and an observed CpG/expected CpG ratio greater than or equal to 0.6. Takai & Jones (2002) refined this to, respectively: 500 bp length, 55% GC content, and 0.65 O/E. This latter definition is in wide use today and has the advantage of excluding most Alu repetitive elements.
Who knows whether the reviewer is from the same group? You can argue your case by giving the E-cadherin example to show how "well" the new definition excludes Alu elements.