
HELP! How to identify TSS, exon, and coding region using BLAST / BLAT - (Nov/08/2006)

Good day to all. I obtained my insert using the MCA/RDA method. The sequence is below (150 bp):

GGGAAAGCACGCAGTGTGGCCGCTCTCATCACTCATTCAACGCGAGAGCACGACCCTGGGGAAAGGATGCGGTGTGGCCG
CTCTCGTCACTCGGTCAACGAGAGAGCACAACCCTGGGGAAAGGACGCAGTGTGGCCGCTCTCGTCACTC

When I BLAST it against the human genome at NCBI (http://www.ncbi.nlm.nih.gov/genome/seq/BlastGen/BlastGen.cgi?taxid=9606) I get 25 hits. Each hit aligns at a different nucleotide position on chr 19p13.3. It seems that my insert is a repetitive sequence. Am I right? What is the difference between the Reference Assembly and the Celera Assembly? Which one is more reliable?

After running NCBI BLAST and BLAT, I found that the insert is part of the JMJD2B gene. How do I determine the location of the transcription start site (TSS), exons, coding region and CpG island of this gene?

I wish to construct a diagram like the one shown in Fig. 1A (see the link below). Can anybody tell me what program to use?


Thank you.

-mrpcr-

Seems like your sequence is a tandem repeat; see the attachment. I also ran it through a program called equicktandem, which is part of the EMBOSS package of tools.
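A rough way to see the repeat without EMBOSS is to slide the insert against itself and look for the shift with the highest identity. The sketch below is only an illustration of that idea; it is not the algorithm equicktandem uses, and the function name `best_tandem_period` is made up for this example.

```python
def best_tandem_period(seq, min_period=1, max_period=None):
    """Return (period, identity) for the self-shift with the highest identity.

    Simple sliding self-comparison; an assumption-laden stand-in for a real
    tandem repeat finder, not a reimplementation of equicktandem.
    """
    seq = seq.upper()
    n = len(seq)
    if max_period is None:
        # Need at least two copies of the unit to call it a repeat.
        max_period = n // 2
    best_period, best_identity = 0, 0.0
    for period in range(min_period, max_period + 1):
        overlap = n - period
        matches = sum(1 for i in range(overlap) if seq[i] == seq[i + period])
        identity = matches / overlap
        if identity > best_identity:
            best_period, best_identity = period, identity
    return best_period, best_identity


# The 150 bp insert posted at the top of the thread.
insert = (
    "GGGAAAGCACGCAGTGTGGCCGCTCTCATCACTCATTCAACGCGAGAGCACGACCCTGGGGAAAGGATGCGGTGTGGCCG"
    "CTCTCGTCACTCGGTCAACGAGAGAGCACAACCCTGGGGAAAGGACGCAGTGTGGCCGCTCTCGTCACTC"
)

period, identity = best_tandem_period(insert)
print(f"best period: {period} bp, identity at that shift: {identity:.2f}")
```

On this insert the best shift should come out at a unit of about 58 bp, which fits the insert holding roughly two and a half copies of the same element.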

Your CpG island of interest doesn't seem to be firing off any transcripts and may not be the TSS for the JMJD2 gene. It looks as though it falls within an intron of this gene.

To find all the elements for this gene, click on the gene itself in the Genome Browser; it will link to the PubMed annotations for it.
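If you download the gene's annotation as a table instead of clicking through it, the TSS, exons and coding region fall out of a handful of fields. UCSC gene tables (genePred style) store `txStart`, `txEnd`, `cdsStart`, `cdsEnd`, `exonStarts` and `exonEnds` as 0-based, half-open coordinates. The sketch below assumes that layout; the record in it is invented for illustration and is NOT the real JMJD2B annotation, and the helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GeneModel:
    name: str
    strand: str                    # "+" or "-"
    tx_start: int                  # 0-based, half-open, genomic coordinates
    tx_end: int
    cds_start: int
    cds_end: int
    exons: List[Tuple[int, int]]   # sorted in genomic order


def parse_genepred(name, strand, tx_start, tx_end, cds_start, cds_end,
                   exon_starts, exon_ends):
    """Build a GeneModel from genePred-style fields (comma-separated exon lists)."""
    starts = [int(x) for x in exon_starts.split(",") if x]
    ends = [int(x) for x in exon_ends.split(",") if x]
    return GeneModel(name, strand, tx_start, tx_end, cds_start, cds_end,
                     list(zip(starts, ends)))


def tss(model):
    """0-based position of the first transcribed base."""
    return model.tx_start if model.strand == "+" else model.tx_end - 1


def coding_segments(model):
    """Exon pieces that overlap the coding region."""
    pieces = []
    for start, end in model.exons:
        s, e = max(start, model.cds_start), min(end, model.cds_end)
        if s < e:
            pieces.append((s, e))
    return pieces


def locate(model, pos):
    """Say which exon or intron (numbered in transcript order) contains pos."""
    def in_tx_order(intervals):
        return intervals if model.strand == "+" else intervals[::-1]

    for i, (s, e) in enumerate(in_tx_order(model.exons), 1):
        if s <= pos < e:
            return f"exon {i}"
    gaps = [(a_end, b_start)
            for (_, a_end), (b_start, _) in zip(model.exons, model.exons[1:])]
    for i, (s, e) in enumerate(in_tx_order(gaps), 1):
        if s <= pos < e:
            return f"intron {i}"
    return "outside transcript"


# Hypothetical minus-strand record; coordinates are made up.
model = parse_genepred("HYPOTHETICAL_GENE", "-", 1000, 9000, 1200, 8800,
                       "1000,3000,6000,8500,", "1500,3400,6300,9000,")

print("TSS:", tss(model))
print("coding segments:", coding_segments(model))
print("position 7000 is in:", locate(model, 7000))
```

For a minus-strand gene the TSS sits at the high-coordinate end and exons are numbered from there, which is easy to get backwards when reading the browser by eye.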

As for the difference between RefSeq and the Celera sequence, I think RefSeq has been experimentally validated, i.e. there is strong evidence that it is a gene and has a protein product. The Celera sequence is probably a remnant of the old sequencing effort, in which Celera completed their "draft" sequence well in advance of the public effort. It seems the public effort has now caught up with and in fact surpassed the Celera sequence database.

-methylnick-

As Nick said, it matches the 6th intron of the JMJD2B gene, where there is a CpG island and a predicted promoter and first exon. The link to the hit is here. It could be the promoter of an unknown gene that has not been annotated.

-pcrman-

QUOTE (methylnick @ Nov 12 2006, 06:02 PM)
Seems like your sequence is a tandem repeat; see the attachment. I also ran it through a program called equicktandem, which is part of the EMBOSS package of tools.

Your CpG island of interest doesn't seem to be firing off any transcripts and may not be the TSS for the JMJD2 gene. It looks as though it falls within an intron of this gene.

To find all the elements for this gene, click on the gene itself in the Genome Browser; it will link to the PubMed annotations for it.

As for the difference between RefSeq and the Celera sequence, I think RefSeq has been experimentally validated, i.e. there is strong evidence that it is a gene and has a protein product. The Celera sequence is probably a remnant of the old sequencing effort, in which Celera completed their "draft" sequence well in advance of the public effort. It seems the public effort has now caught up with and in fact surpassed the Celera sequence database.


Thanks methylnick. Could the tandem repeat be due to my short insert sequence? If the insert sequence falls within an intron of a gene, does it mean that it is not worthwhile to study the gene further? Referring to the PDF file that you sent, the direction of the arrow of JMJD2 is the reverse of the insert sequence. What does it mean? Thank you.

-mrpcr-

QUOTE (pcrman @ Nov 13 2006, 06:20 AM)
As Nick said, it matches the 6th intron of the JMJD2B gene, where there is a CpG island and a predicted promoter and first exon. The link to the hit is here. It could be the promoter of an unknown gene that has not been annotated.


Thanks pcrman. From the map that you sent, it seems that there is an unannotated promoter and a first exon within the transcript of a known gene. Am I right? But is this possible? Can you find where the promoter of the JMJD2B gene is located? Thank you.

-mrpcr-

QUOTE (mrpcr @ Nov 13 2006, 03:34 AM)
Referring to the PDF file that you sent, the direction of the arrow of JMJD2 is the reverse of the insert sequence. What does it mean? Thank you.


It just means the sequence you have from your experiment matches the antisense strand of the annotated gene. It doesn't really mean much in your case, unless you sequenced from RNA, which could imply a non-coding RNA or something like that.
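To make the strand point concrete: a hit on the antisense strand simply means the reverse complement of the read, rather than the read itself, is what appears in the reference as written. A minimal sketch, with a toy reference string that is invented for the example:

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def strand_of_match(query, reference):
    """Return '+' if query occurs as written, '-' if only its reverse
    complement occurs, or None if neither is found (exact matching only)."""
    if query in reference:
        return "+"
    if reverse_complement(query) in reference:
        return "-"
    return None


# Toy reference (hypothetical), containing the first 10 bases of the insert.
reference = "TTGGGAAAGCACTT"
read = reverse_complement("GGGAAAGCAC")

print(read, "matches strand", strand_of_match(read, reference))
```

BLAST and BLAT do this for you and report it as Plus/Minus or +/- in the alignment; for genomic DNA from an MCA/RDA insert the orientation is just an accident of cloning.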

Your CpG island of interest may or may not be involved in the regulation of JMJD2. Have a look at the 5' end of the JMJD2 gene and see if there is a CpG island there, which is more likely to be the true promoter of JMJD2. The CpG island you have identified could belong to an as yet unidentified gene, or to a non-coding/antisense transcript that could be involved in the post-transcriptional regulation of the JMJD2 gene.

Nick

-methylnick-

Nick,

QUOTE
Seems like your sequence is a tandem repeat; see the attachment. I also ran it through a program called equicktandem, which is part of the EMBOSS package of tools.

Your CpG island of interest doesn't seem to be firing off any transcripts and may not be the TSS for the JMJD2 gene. It looks as though it falls within an intron of this gene.


Could the tandem repeat be due to my short insert sequence? If the insert sequence falls within an intron of a gene, does it mean that it is not worthwhile to study the gene further? Thank you.

-mrpcr-

QUOTE (mrpcr @ Nov 13 2006, 08:17 PM)
Could the tandem repeat be due to my short insert sequence? If the insert sequence falls within an intron of a gene, does it mean that it is not worthwhile to study the gene further? Thank you.


No, the tandem repeat is not due to your short insert sequence; it is inherent in the CpG island you have located.

I would not say "it is not worth studying the gene because it is in the intron"; rather, have a look at it and determine its role in the regulation of the gene. It may be that this CpG island has one, and in other cases it may not. Because there is no evidence of transcription (in the form of EST clones) surrounding this CpG island, it is unlikely to be functional. But who is to say it isn't encoding some non-coding transcript that in turn regulates the transcription of the JMJD2 gene (by RNAi-mediated DNA methylation), or in fact its post-transcriptional regulation through non-coding RNA?

-methylnick-

QUOTE (mrpcr @ Nov 13 2006, 03:38 AM)
Thanks pcrman. From the map that you sent, it seems that there is an unannotated promoter and a first exon within the transcript of a known gene. Am I right? But is this possible? Can you find where the promoter of the JMJD2B gene is located? Thank you.


JMJD2B's promoter could be here. It contains a CpG island.

The actual 5' flanking sequence is below (1 kb):

CODE
TTTGTGAAGTCACAGGGTCAAATCTTTCTCCGTGCCTCATCCCACTTAATCCTTGCAACA
TTCTCAAGAGGCGGGTAGACGGGGTGTGGCCACTACCATCATATCCATTGCACAGTTGGA
AACAGACTCAGCGAGGTTACTTTGCTGGAGGTCAACCACAGTGAGGCATGGGGCCCGGCT
TAGCTTTGAATCCAGGTCAGAAAGCACCAAAGCTCATTCTCTCCCATACATTTTAGGGAA
CTCGGGGAGAGGGACCCTAACTATTCCGAAGGCCTTGCCCCTTGGCAAAGGTTTGTTGAT
TGTGATGGGAATAGACCTACGGTGGCGGGGTGGAGGCTCTTGCTGCTTTACCCTATGTTC
TCGGTTCTCCTCCTTTGCAAAACGGATCGCTATGTCTCCTTGCTTGGCACTTTGCAAATT
GGGGAACGCTAAGCCCACATTCCGGCCACGGGTTTCGAACCTCCAGCCTGCACGTTCCCG
GCCGGTGCTGATGCTGTGCTGGGTCTCACGTACTCAGCCGCCGCCTGATAACCAGGGCGG
GGCCCGGAGCTTGCGGGCAGTGATTGGCACCTGCCCCAGCCTGTCCGCGCCTCGGGCTGG
CCCCTCAAGCCAATACTGGCTCCTCTCTGTGCGGTCGTCGGGCGGGCCCTGAGGCTCCCT
TGTCAATCCAAGGCCCAGCTGCCGTCGGGTTGGTCGCGGCAGCCTTGGCTGGCGTGCGCC
CCCTCCAATGAGAACAGAGCCGGTCAGCGGGCACGTGGGTGGGCGCCGGCGTGTCCCCGC
CGGTCTGCCAATGAAGAGGCGAGGCCGGCGTTTGTCCCCGCCCAATCGCGGCGCGCGGTG
GGCGGGCCGTGCGGTGTTGATGGGCCCGGCGGAGGGGAGGGGCGGAGCTGTCAGCTCCGG
CCAATGGGCGCTCGGGCGTGGATCCGGCAGCCAATGGCAGTCGGGGCGGAGCTGGCGCGC
GGCCTTATAAGCCCCCCCGCGAGCGCTTGCGGAGGGCTCG

-pcrman-
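Whether a stretch such as the 5' flanking sequence above counts as a CpG island can be checked with the commonly cited criteria: length of at least 200 bp, GC content of at least 50%, and an observed/expected CpG ratio of at least 0.6. The sketch below applies those thresholds to one fixed window; real CpG island tracks scan with a sliding window and differ in details, so treat the function names and the single-window approach as assumptions made for illustration.

```python
def cpg_stats(seq):
    """Return (length, GC fraction, observed/expected CpG ratio) for a sequence."""
    seq = "".join(seq.split()).upper()   # drop line breaks and spaces
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C * G) / N.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return n, gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the length / GC / obs-exp thresholds to a single window."""
    n, gc_fraction, obs_exp = cpg_stats(seq)
    return n >= min_len and gc_fraction >= min_gc and obs_exp >= min_obs_exp


# Last four lines of the 5' flanking sequence posted above (closest to the gene).
flank_3prime_end = """
CGGTCTGCCAATGAAGAGGCGAGGCCGGCGTTTGTCCCCGCCCAATCGCGGCGCGCGGTG
GGCGGGCCGTGCGGTGTTGATGGGCCCGGCGGAGGGGAGGGGCGGAGCTGTCAGCTCCGG
CCAATGGGCGCTCGGGCGTGGATCCGGCAGCCAATGGCAGTCGGGGCGGAGCTGGCGCGC
GGCCTTATAAGCCCCCCCGCGAGCGCTTGCGGAGGGCTCG
"""

length, gc_fraction, obs_exp = cpg_stats(flank_3prime_end)
print(f"length={length} bp  GC={gc_fraction:.2f}  CpG obs/exp={obs_exp:.2f}")
print("passes CpG island thresholds:", looks_like_cpg_island(flank_3prime_end))
```

The same function can be run on the 150 bp insert, but that one is shorter than the 200 bp minimum, so only the GC fraction and obs/exp ratio are informative there.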