Find promoter sequence of a gene - (Apr/18/2006 )

Hi,
I have to find the promoter sequence of this gene:

http://www.ncbi.nlm.nih.gov/entrez/viewer....de&val=12642087

Could anyone show me a way to find it?

-athena-

athena, there are a number of things you can do.

Firstly, you have the CDS sequence; any chance of getting the upstream sequence? Once you have it, you can feed it through one of the web programs that search for promoters and transcription factor binding sites.

You could BLAST the sequence against the genome of a model organism and see if there is an annotated promoter close to or upstream of the start site.

You can also feed the sequence through cpgplot, which identifies CpG islands; these are normally associated with promoter regions. All these programs are available on the web, just google them.

good luck!

Nick

-methylnick-
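
For anyone who wants to see what a CpG island scan does under the hood, here is a minimal Python sketch. It is not the cpgplot implementation. It slides a window along the sequence and reports windows where the GC content and the observed/expected CpG ratio both exceed a threshold. The window size, step and thresholds (GC above 50%, ratio above 0.6) are assumed defaults based on commonly cited CpG island criteria, and the function names are made up for illustration. CpG islands are mainly a feature of vertebrate promoters, so this may be less informative for other organisms.

```python
def cpg_stats(window_seq):
    """Return (GC percent, observed/expected CpG ratio) for one window."""
    s = window_seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # observed CpG dinucleotides
    gc_percent = 100.0 * (c + g) / n if n else 0.0
    expected = (c * g) / n if n else 0.0  # expected CpG count from base composition
    obs_exp = cpg / expected if expected else 0.0
    return gc_percent, obs_exp


def scan_cpg(seq, window=100, step=10, min_gc=50.0, min_oe=0.6):
    """List windows (1-based start, end, GC%, obs/exp) passing both thresholds.

    All thresholds are assumptions; adjust them to your organism.
    """
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc, oe = cpg_stats(seq[start:start + window])
        if gc > min_gc and oe > min_oe:
            hits.append((start + 1, start + window, round(gc, 1), round(oe, 2)))
    return hits
```

Calling `scan_cpg(upstream_sequence)` on the upstream region returns the candidate windows; overlapping windows would still need to be merged by hand or with extra code.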

Hi Nick,
The gene is supposed to be on chromosome 3, according to earlier publications. But when I search for it in GenBank, the gene is not there. If I find the contig for this gene, would that be helpful? It is strange that the gene is not at the location where it is supposed to be. What could be the reason for that?

-athena-

The best thing to do is to BLAST the sequence against the database. The gene may have been incorrectly annotated, and this would be reflected in the database entry.

I will move this to the bioinformatics forum as there will be many people there that may be able to help you too.

Nick

-methylnick-

hi-
What you need to do is to align the mRNA or CDS sequence with the genomic sequence, and I think you can do it with BLAST 2 Sequences (bl2seq).
According to those alignments, your transcription start site is at position 1388 of the genomic sequence, so positions 1-1387 are promoter + 5'UTR sequences.
To find the promoter, you usually need to look for a TATA box or CpGs, as Nick said. There is a database called TRANSFAC that you can use to identify promoter motifs (I am sure you can find other databases like that if you do a web search). The motif should be located -50 to -100 bp from the start site, but this varies a lot with species and gene.

Also, your accession says that it is homologous to a telomeric repeat, so maybe it's not included in the genome assembly?

QUOTE
The gene is supposed to be on chromosome 3, according to earlier publications. But when I search for it in GenBank, the gene is not there. If I find the contig for this gene, would that be helpful? It is strange that the gene is not at the location where it is supposed to be. What could be the reason for that?

I don't understand how you can't find it?
here are your megablast hits

# BLASTN 2.2.13 [Nov-27-2005]
# Query:
# Database: nr
# Fields: query id, subject ids, % identity, alignment length, mismatches, gap opens, q. start, q. end, s. start, s. end, evalue, bit score
# 23 hits found
1_25432 gi|12642087|gb|AF207841.1| 100.00 1118 0 0 1 1118 1 1118 0.0 2065
1_25432 gi|12642087|gb|AF207841.1| 100.00 589 0 0 1161 1749 1161 1749 0.0 1088
1_25432 gi|12642087|gb|AF207841.1| 100.00 237 0 0 1827 2063 1827 2063 3e-122 438
1_25432 gi|39978278|ref|XM_370527.1| 98.30 352 6 0 1388 1739 202 553 4e-176 617
1_25432 gi|39978278|ref|XM_370527.1| 100.00 173 0 0 1161 1333 31 203 1e-86 320
1_25432 gi|39978278|ref|XM_370527.1| 100.00 79 0 0 1937 2015 597 675 2e-34 147
1_25432 gi|39943473|ref|XM_361274.1| 94.76 191 7 2 500 687 952 762 7e-79 294
1_25432 gi|39962808|ref|XM_364826.1| 94.54 183 7 2 483 662 201 383 2e-74 279
1_25432 gi|39968518|ref|XM_365650.1| 94.05 185 8 2 481 662 52 236 7e-74 278
1_25432 gi|39972194|ref|XM_367488.1| 93.99 183 8 2 483 662 12 194 9e-73 274
1_25432 gi|39968378|ref|XM_365580.1| 93.99 183 8 2 483 662 36 218 9e-73 274
1_25432 gi|39941829|ref|XM_360452.1| 94.77 172 9 0 483 654 45 216 4e-71 268
1_25432 gi|39969296|ref|XM_366039.1| 95.06 162 8 0 493 654 49 210 3e-67 255
1_25432 gi|1045530|gb|U36923.1|MGU36923 89.27 205 14 7 493 692 202 403 2e-65 250
1_25432 gi|39946779|ref|XM_362927.1| 95.14 144 7 0 500 643 1660 1517 7e-59 228
1_25432 gi|39947500|ref|XM_363018.1| 96.06 127 5 0 483 609 9 135 9e-53 207
1_25432 gi|39946527|ref|XM_362801.1| 94.50 109 2 3 583 687 120 12 6e-40 165
1_25432 gi|39967888|ref|XM_365335.1| 93.58 109 4 2 583 688 5085 4977 3e-38 159
1_25432 gi|39967888|ref|XM_365335.1| 91.45 117 7 2 574 687 4976 4860 1e-37 158
1_25432 gi|39969550|ref|XM_366166.1| 93.52 108 4 2 583 687 1413 1306 1e-37 158
1_25432 gi|39977860|ref|XM_370318.1| 97.70 87 1 1 396 482 798 883 6e-35 148
1_25432 gi|39977860|ref|XM_370318.1| 94.74 38 1 1 269 306 761 797 1e-07 58.4
1_25432 gi|39946567|ref|XM_362821.1| 88.28 128 9 5 568 690 201 75 6e-35 148

The contig accession is NT_086127
and if you align this sequence against the contig sequence by bl2seq you will see that you get good matches however they are not perfect...
hope that helps...

-L_Han-
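
The tabular BLAST output above can also be read programmatically. Here is a minimal Python sketch that parses the 12 whitespace-separated fields named in the "# Fields:" header and lists the aligned blocks for one subject, sorted by query position. The function and field names are my own choices, and the three sample lines are copied from the hits above. Gaps between consecutive blocks on the query may indicate introns, but that is an interpretation to check, not something the code proves.

```python
from collections import namedtuple

# Field names follow the "# Fields:" header line of the BLAST output.
Hit = namedtuple(
    "Hit",
    "query subject identity length mismatches gap_opens "
    "q_start q_end s_start s_end evalue bit_score",
)


def parse_blast_tabular(text):
    """Parse tabular BLAST hits, skipping blank lines and '#' comment lines."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split()
        hits.append(Hit(f[0], f[1], float(f[2]), int(f[3]), int(f[4]), int(f[5]),
                        int(f[6]), int(f[7]), int(f[8]), int(f[9]),
                        float(f[10]), float(f[11])))
    return hits


def blocks_for(hits, subject_name):
    """Return (q_start, q_end, s_start, s_end) blocks for one subject, by query order."""
    blocks = [(h.q_start, h.q_end, h.s_start, h.s_end)
              for h in hits if subject_name in h.subject]
    return sorted(blocks)


SAMPLE = """\
1_25432 gi|39978278|ref|XM_370527.1| 98.30 352 6 0 1388 1739 202 553 4e-176 617
1_25432 gi|39978278|ref|XM_370527.1| 100.00 173 0 0 1161 1333 31 203 1e-86 320
1_25432 gi|39978278|ref|XM_370527.1| 100.00 79 0 0 1937 2015 597 675 2e-34 147
"""

for block in blocks_for(parse_blast_tabular(SAMPLE), "XM_370527.1"):
    print(block)
```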

hi,
TRANSFAC did not find any promoter motif. I have just started my master's and have almost no experience. Please guide me step by step to find this promoter, otherwise I am in trouble.

-athena-

the start codon is at 1134

After a (very) quick scan of the upstream region I can see a likely candidate for the TATAA box at 1085

-John Buckels-
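
A quick scan like this can be reproduced with a few lines of Python. The sketch below looks for a motif in the region upstream of a given start codon position and reports each match as a 1-based position plus its offset relative to the start codon (so a match at 1085 with the start codon at 1134 would have offset -49). The function name and the toy sequence are hypothetical; the default motif "TATAA" is the one mentioned above, and any regular expression can be passed instead.

```python
import re


def find_upstream_motif(seq, start_codon_pos, motif="TATAA"):
    """Find motif matches upstream of a 1-based start codon position.

    Returns a list of (1-based match position, offset relative to start codon).
    A lookahead is used so that overlapping matches are all reported.
    """
    upstream = seq[:start_codon_pos - 1].upper()
    results = []
    for m in re.finditer("(?=(%s))" % motif, upstream):
        pos = m.start() + 1
        results.append((pos, pos - start_codon_pos))
    return results


# Toy sequence for illustration only (not the real gene): ATG starts at position 22.
toy = "GGGC" + "TATAA" + "GGCC" * 3 + "ATGAAA"
print(find_upstream_motif(toy, 22))
```

A match only tells you where the motif occurs; whether it is a functional TATA box still needs experimental support or agreement with other predictions.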

Hmm ok.

try to find CpGs using: http://www.ebi.ac.uk/emboss/cpgplot/
check http://www.ebi.ac.uk/services/ for general overview of services.

-DPK-

athena- check the following page from NCBI
These are the references you need. One of these papers verifies that Avr-Pita is a telomeric gene.
I've just been to a lecture by Barbara Valent, whose lab cloned this gene. They have made GFP constructs with the promoter. If you contact her lab, I am sure somebody can help you with the sequences, regions, etc.
good luck.

-L_Han-

Dear Athena,
I would suggest you use promoter prediction tools such as PromoterInspector, McPromoter or FirstEF.

here are some of the links :

http://rulai.cshl.edu/tools/FirstEF/

genes.mit.edu/McPromoter.html

http://www.123genomics.com/files/analysis.html (you will find a number of tools here).

Make a consensus of the results from the above tools and then manually curate them. You can also BLAST the region later on.

Aamit

-Aamit-
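
One simple way to build such a consensus is to count, for every position, how many tools predicted a promoter there and keep the stretches supported by at least two of them. Here is a minimal Python sketch of that idea. The tool names and coordinates in the example are made up for illustration, and each tool's real output would first have to be converted into (start, end) intervals by hand or with a small parser.

```python
from collections import Counter


def consensus_regions(predictions, min_tools=2):
    """Return regions supported by at least min_tools tools.

    predictions: dict mapping a tool name to a list of (start, end)
    intervals, 1-based and inclusive. Each tool counts once per position,
    even if its own intervals overlap.
    """
    support = Counter()
    for intervals in predictions.values():
        covered = set()
        for start, end in intervals:
            covered.update(range(start, end + 1))
        support.update(covered)

    positions = sorted(p for p, n in support.items() if n >= min_tools)
    regions = []
    for p in positions:
        if regions and p == regions[-1][1] + 1:
            regions[-1][1] = p  # extend the current region
        else:
            regions.append([p, p])  # start a new region
    return [tuple(r) for r in regions]


# Hypothetical predictions, for illustration only.
example = {
    "tool_a": [(100, 300)],
    "tool_b": [(250, 400)],
    "tool_c": [(280, 320), (900, 950)],
}
print(consensus_regions(example))
```

The regions that come out are only candidates for manual curation; a region predicted by a single tool is not necessarily wrong.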
