How to search and retrieve the promoter sequence of a gene for a methylation study - (Feb/03/2005)
This is the URL to my gene of interest.
http://www.ncbi.nlm.nih.gov/entrez/viewer....nk&val=14579058
I am trying to estimate the promoter region of this gene so that I can do methylation-specific PCR (MSP) on the promoter. The promoter region of this gene has not been positively identified, but the exon information is known. I am new to this field. Could anyone please help me out, preferably with step-by-step directions if possible?
Thank you!!!
Here is what I did to find the 5' flanking sequence of your gene:
1. I searched the NCBI nucleotide database using "gene name + promoter" or "gene name + flanking" and got no hits.
2. Then I went to ensembl.org, found the gene, clicked "View genomic sequence for this gene with exons highlighted", and asked it to export 1 kb of 5' sequence. The new Ensembl website has moved this link to the left side, and the link is now called "Genomic Sequence".
3. I copied the 5' sequence and fed it to MethPrimer to predict CpG islands. Luckily, there is a CpG island close to the first exon (see fig below), which means the 5' sequence is likely the promoter of the gene.
4. I pasted the sequence back into the NCBI blastn program to see if it matches experimentally identified promoter sequences that I might have missed in the gene name text search. In this case I didn't find any, which means the promoter of this gene has not been characterized yet. You can safely assume the sequence I got is the 5' regulatory sequence and go ahead with methylation mapping of the identified CpG island.
Hope that helps.
5' Flanking sequence (red is exon):
GAAGGCCAAGTACCCCAACTCTCCCCGACGCTGCTGAGCCCTTCCCTCCCCTAATCTGAG
AAGTCAGCCTCTTGGCTCCTCAGGCCACCATTTAGGCCTGACTGGGGTAAGAAATGTCGC
TCCACTTTACAGAGGTAGCTGTGGTGTTGAAACACTGGACTTGGATATGGGGTGCTGGGA
TCGATTCCTAGCTTTACCACTAACTAGCTGTGTGGCCTTGAGTAAATCCCGTTACCTCTC
TGAGCCTCGGTTACCCTGTCTGTAAAAAGGGAGGTGAGAATACCTACCTCACGGAACTGT
TGGGAGGCTCAGATGAGATGCTATATGTGAAAACATTCTGTAAGCTTCGTACAAATGTGA
AGTATTAATATTATCGCAGTATTATTGTTGTTATTATTATTGTTATTATTAACAATCTTG
GGTGGGTAGTAGGAGAGCAAAAAGTATGAATGGGATGGAGCTAAGAAGTCTGAATACTTA
ATGAAATGGACTTTTTGGAAAGAAATCAGATGAAGGCATAAAATTTAGTTCTTAGCTCTT
GAACAGAAGCCTAAAATTCCTGGTTCTCTCAGGGCTTCGCCTTCAAGGGTTCTGGAGGAG
GGAAGGGTCTGCAGGTTCCATGGGTGACAGCCTGAGATCTGTCCCTTCAACGGGCTGGGC
TGGGTATGTGCCTACCGATGACAATGTGTAAATAAATGCGTGTTCACACCCACAGCTGGC
TCCGTTACTCTCCTGAGCTGGGGGTGGGGGGATCGGGGCCGGCTTCTCCCCTTGCACGCC
GTGGGGAGATGGCCGCGGACGAAGGGCACCGTCTGCAAAGCACTGAACCATCAGGGCCTT
CCCTGCAGGCCGAAGGCCTGCGCGAATGTCGGCCAGTTACTATGGAGACCGGCCCGCGGT
ATCCCAGCATGCCTCGGGACCAAACTGTCCAACTGTGAGCATCCCCGACGGTCGCCTCCT
GATGACGCACACGGAAGCACCGATAGGCTCTGCCTCCCGAAGAAAAGGGAGCCGCGCAGC
CpG island prediction (blue) and BSP primer design
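For readers who want to see roughly what the CpG island prediction in step 3 involves, here is a minimal Python sketch. It assumes the commonly used criteria (GC content above 50%, observed/expected CpG ratio above 0.6, scanned in 100 bp windows, minimum island length 200 bp). These thresholds and the function names are assumptions for illustration only; MethPrimer's own algorithm and defaults may differ.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def find_cpg_islands(seq, window=100, step=1,
                     min_gc=0.5, min_oe=0.6, min_len=200):
    """Return (start, end) pairs (0-based, end exclusive) of candidate islands.

    Every window that passes both thresholds is recorded, overlapping or
    adjacent passing windows are merged, and merged regions shorter than
    min_len are discarded. All thresholds are assumed defaults.
    """
    seq = seq.upper()
    merged = []
    for i in range(0, len(seq) - window + 1, step):
        w = seq[i:i + window]
        if gc_fraction(w) > min_gc and cpg_obs_exp(w) > min_oe:
            if merged and i <= merged[-1][1]:
                merged[-1][1] = i + window   # extend the current region
            else:
                merged.append([i, i + window])
    return [(s, e) for s, e in merged if e - s >= min_len]


if __name__ == "__main__":
    # Synthetic example: a CpG-dense block flanked by AT-only sequence.
    demo = "AT" * 200 + "CG" * 150 + "AT" * 200
    print(find_cpg_islands(demo))
```

To try it on real data, paste the 5' flanking sequence above into a single string (line breaks removed) and pass it to find_cpg_islands.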
Thank you very much!! I have one more query. Is there a way to do this using the Entrez gateway? The problem is my boss trusts NCBI more than anything else. Thank you very much again!
You can just BLAST the NCBI database using the sequence I got from Ensembl. I am sure this sequence and the mRNA sequence will align to the same genomic sequence. Actually, all genome databases (Ensembl, UCSC and NCBI) use the same data and assembly but present the data differently. Ensembl provides the best genome browser.
Hi pcrman,
How do you determine which sequence is the exon, and whether it is exon 1, exon 2 and so on?
All genome databases provide gene structure information. The best place to look for that is Ensembl.
Hi, PCR man:
I am new to this field. I am trying to design MSP primers for rat c-myc, so I read all of your posts. How did you know that the promoter region is in the 1 kb sequence before the first exon?
Thanks!
Most genes, if not all, have their promoter immediately upstream of the first exon.
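"Immediately upstream of the first exon" depends on the strand the gene is on, which is easy to get wrong when cutting the flank out of a chromosome sequence by hand. Below is a small sketch of the idea. The function name, the 0-based coordinate convention and the toy sequence are assumptions for illustration; genome browsers such as Ensembl do this for you when you export the 5' flank.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A<->T, C<->G)."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_flank(chrom_seq, tss, strand, length=1000):
    """Return up to `length` bases 5' of the first exon, read 5'->3'.

    chrom_seq : forward-strand sequence of the chromosome or contig
    tss       : 0-based forward-strand index of the first base of exon 1
    strand    : "+" or "-"
    """
    if strand == "+":
        # Upstream lies to the left on the forward strand.
        start = max(0, tss - length)
        return chrom_seq[start:tss]
    if strand == "-":
        # Upstream lies to the right on the forward strand, so take that
        # slice and reverse-complement it to read it in gene orientation.
        flank = chrom_seq[tss + 1:tss + 1 + length]
        return reverse_complement(flank)
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    toy = "ACGTTGCAAGGCTA"          # made-up sequence, not real data
    print(upstream_flank(toy, 6, "+", length=4))
    print(upstream_flank(toy, 6, "-", length=4))
```

The slices are simply truncated if the gene sits closer than `length` bases to the end of the sequence.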
Hi, pcrman
I am new here. I want to study the methylation status of the promoter of human GFAP in glioma cells, so I read almost all the posts here. Following your directions step by step, I tried to find CpG islands and to design BSP or MSP primers for this gene, but the results puzzled me. Could you help me out?
At first I searched NCBI and got a sequence: "Homo sapiens glial fibrillary acidic protein (GFAP), mRNA, 3035 bp, NM_002055". Then I fed this 3035 bp sequence to MethPrimer. There is a CpG island from 649 to 1142, and the BSP primers were as follows:
Primer picking results for bisulfite sequencing (or restriction) PCR
   Primer        Start  Size  Tm     GC%    'C's  Sequence
1  Left primer   787    25    51.35  52.00  7     GTTTTAAGTTTGTAGATTTGATAGA
   Right primer  1030   23    57.22  60.87  4     TTAAAACTCTACCCCTCTTCCTC
   Product size: 244, Tm: 73.2, CpGs in product: 23
2  Left primer   787    25    51.35  52.00  7     GTTTTAAGTTTGTAGATTTGATAGA
   Right primer  1031   24    58.14  62.50  4     CTTAAAACTCTACCCCTCTTCCTC
   Product size: 245, Tm: 73.2, CpGs in product: 23
3  Left primer   787    25    51.35  52.00  7     GTTTTAAGTTTGTAGATTTGATAGA
   Right primer  1032   24    59.60  62.50  4     CCTTAAAACTCTACCCCTCTTCCT
   Product size: 246, Tm: 73.3, CpGs in product: 23
4  Left primer   787    25    51.35  52.00  7     GTTTTAAGTTTGTAGATTTGATAGA
   Right primer  1029   23    59.18  65.22  4     TAAAACTCTACCCCTCTTCCTCC
   Product size: 243, Tm: 73.3, CpGs in product: 23
5  Left primer   787    25    51.35  52.00  7     GTTTTAAGTTTGTAGATTTGATAGA
   Right primer  1033   23    57.22  60.87  4     TCCTTAAAACTCTACCCCTCTTC
   Product size: 247, Tm: 73.2, CpGs in product: 23
Then I wanted to verify this result, so I went to ensembl.org, found the gene, clicked "View genomic sequence for this gene with exons highlighted", and asked it to export 1 kb of 5' sequence. The result was as follows:
CAGCCAGCTCATGTGTAACGGCTTTGTGGAGCTGTCAAGGCCTGGTCTCTGGGAGAGAGGCACAGGGAGGCCAGACAAGG
AAGGGGTGACCTGGAGGGACAGATCCAGGGGCTAAAGTCCTGATAAGGCAAGAGAGTGCCGGCCCCCTCTTGCCCTATCAG
GACCTCCACTGCCACATAGAGGCCATGATTGACCCTTAGACAAAGGGCTGGTGTCCAATCCCAGCCCCCAGCCCCAGAACT
CCAGGGAATGAATGGGCAGAGAGCAGGAATGTGGGACATCTGTGTTCAAGGGAAGGACTCCAGGAGTCTGCTGGGAATGAG
GCCTAGTAGGAAATGAGGTGGCCCTTGAGGGTACAGAACAGGTTCATTCTTCGCCAAATTCCCAGCACCTTGCAGGCACTTACAGCTGAGTGAGATAATGCCTGGGTTATGAAATCAAAAAGT
TGGAAAGCAGGTCAGAGGTCATCTGGTACAGCCCTTCCTTCCCTTTTTTTTTTTTTTTTTTGTGAGACAAGGTCTCTCTCT
GTTGCCCAGGCTGGAGTGGCGCAAACACAGCTCACTGCAGCCTCAACCTACTGGGCTCAAGCAATCCTCCAGCCTCAGCCT
CCCAAAGTGCTGGGATTACAAGCATGAGCCACCCCACTCAGCCCTTTCCTTCCTTTTTAATTGATGCATAATAATTGTAAG
TATTCATCATGGTCCAACCAACCCTTTCTTGACCCACCTTCCTAGAGAGAGGGTCCTCTTGCTTCAGCGGTCAGGGCCCCA
GACCCATGGTCTGGCTCCAGGTACCACCTGCCTCATGCAGGAGTTGGCGTGCCCAGGAAGCTCTGCCTCTGGGCACAGTGA
CCTCAGTGGGGTGAGGGGAGCTCTCCCCATAGCTGGGCTGCGGCCCAACCCCACCCCCTCAGGCTATGCCAGGGGGTGTTG
CCAGGGGCACCCGGGCATCGCCAGTCTAGCCCACTCCTTCATAAAGCCCTCGCATCCCAGGAGCGAGCAGAGCCAGAGCAGGATGGAG
I fed this sequence to MethPrimer. However, no CpG island was found, and the BSP primers changed:
   Primer        Start  Size  Tm     GC%    'C's  Sequence
1  Left primer   861    25    59.31  64.00  7     GGTGAGGGGAGTTTTTTTTATAGTT
   Right primer  1020   24    52.67  62.50  4     CTCCATCCTACTCTAACTCTACTC
   Product size: 160, Tm: 71.0, CpGs in product: 5
2  Left primer   862    25    58.66  64.00  7     GTGAGGGGAGTTTTTTTTATAGTTG
   Right primer  1020   24    52.67  62.50  4     CTCCATCCTACTCTAACTCTACTC
   Product size: 159, Tm: 71.1, CpGs in product: 5
3  Left primer   819    24    58.29  66.67  8     GTGTTTAGGAAGTTTTGTTTTTGG
   Right primer  1020   24    52.67  62.50  4     CTCCATCCTACTCTAACTCTACTC
   Product size: 202, Tm: 70.5, CpGs in product: 5
4  Left primer   863    24    57.93  62.50  7     TGAGGGGAGTTTTTTTTATAGTTG
   Right primer  1020   24    52.67  62.50  4     CTCCATCCTACTCTAACTCTACTC
   Product size: 158, Tm: 71.2, CpGs in product: 5
5  Left primer   820    23    57.51  65.22  8     TGTTTAGGAAGTTTTGTTTTTGG
   Right primer  1020   24    52.67  62.50  4     CTCCATCCTACTCTAACTCTACTC
   Product size: 201, Tm: 70.6, CpGs in product: 5
Now I do not know which result is correct. And which of the primer pairs above is best for my study?
Thanks in advance for your help.
wenyaya
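A note for readers comparing the primer lists with the genomic sequence: the BSP primers are written against bisulfite-converted DNA, in which unmethylated C reads as T, so they will not match the genomic sequence directly. The sketch below does the conversion in silico for the top strand. The function name and the all-or-nothing handling of CpG methylation are simplifying assumptions, not MethPrimer's code. The example stretch is taken from the Ensembl 5' sequence in the post above, and converting it reproduces left primer 1 of the second list.

```python
def bisulfite_convert(seq, cpg_methylated=False):
    """In-silico bisulfite conversion of the given (top) strand.

    Every C becomes T, except that a C followed by G (a CpG) is kept as C
    when cpg_methylated is True, since methylated C resists conversion.
    Treating all CpGs as methylated or all as unmethylated is a
    simplification for illustration.
    """
    seq = seq.upper()
    out = []
    for i, base in enumerate(seq):
        if base == "C":
            in_cpg = i + 1 < len(seq) and seq[i + 1] == "G"
            out.append("C" if (cpg_methylated and in_cpg) else "T")
        else:
            out.append(base)
    return "".join(out)


if __name__ == "__main__":
    genomic = "GGTGAGGGGAGCTCTCCCCATAGCT"   # from the Ensembl 5' sequence
    print(bisulfite_convert(genomic))
```

Right primers are the reverse complement of the converted strand, which is why they contain no G rather than no C.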
Hi wenyaya,
Sorry for the late reply.
You are right: there is no CpG island in the 5' flanking region (up to 1 kb) of the GFAP gene, or even in the first exon and intron, although the first exon is relatively GC-rich.
In this case, if I were you, I would not expect its methylation to have an impact on transcription, and thus would not analyze its methylation for disease association.