Blatting to human genome - (Jun/27/2005 )
I'm trying to clone in silico predictions in the lab and have been able to get a band for a fragment of a novel gene. However, when I BLAT the cloned sequence, it shows only around 90% identity to our prediction and to the human genome. Also, the primers I have been using against the full-length prediction have not yielded any product. I have tried to RACE the ends, without success so far.
The sequence of the cloned band has been confirmed by cloning from different tissues and sequencing (so the differences are not PCR-induced errors), but it still does not BLAT to the human genome cleanly.
What I am really asking is: why wouldn't the sequence BLAT properly?
I would be interested to hear any explanations. I've only been able to come up with two.
1/ The human genome sequence in that area is not accurate (possibly still a working draft with considerable ambiguities)
2/ The gene is actually a pseudogene that has been retrotransposed into a heterochromatic region of the genome
Anyone else got any ideas?
It sounds strange, but to make sure I would suggest you try:
BLAST your sequence against the NCBI "nr" database with all organisms selected, to see if it matches other organisms.
This happened to me four years ago. I cloned an isoform of a human gene from a human cell line. This isoform had an extra exon inserted before the last exon, which truncated the protein translation of the last exon of the wild-type gene but added 6 aa before the stop codon.
I BLASTed the extra exon I cloned against human sequences and found no matches. Even now, when the genome project has finished, I still cannot find a match to human sequence. But when I BLASTed against all organisms, it matched a rat sequence. Strangely enough, it matched the same gene in rat! How could this happen? The primers I used for the cloning only matched human sequence, and the inserted extra exon from rat lies within the PCR product (with no primer matching site). The only explanation I can think of is that the human and rat sequences for the same gene recombined in the cell. Four years later, I still have no clue to the mystery.
Thanks for the input. I thought I had a problem, but inter-species recombination, now that is something that takes some explaining.
I am pretty sure that the sequence is human. I've cloned and sequenced it from two different human tissue sources, so unless my cDNA is contaminated (possible) I think it should be human. Besides, the best hits when I BLAST it at NCBI against all organisms are the predicted human protein and a human BAC clone containing the sequence. My feeling is that the human genome sequence is wrong in that region; maybe it is a difficult region to sequence and therefore under-represented in the analysis. But I am open to suggestions.
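One thing that might help narrow down the explanations discussed above is to check where the roughly 10% of mismatches fall along the alignment, rather than looking only at the overall figure. The sketch below is a minimal illustration, not a feature of BLAT or BLAST: it assumes you have already exported the two aligned sequences as equal-length strings with "-" for gaps, and the function names, the toy sequences, and the window size are all made up for the example. Mismatches concentrated in one stretch might point to a local problem (an extra exon, or a poorly assembled region), while mismatches spread evenly might be more in line with a related copy such as a paralog or pseudogene, though neither pattern is conclusive on its own.

```python
def percent_identity(aligned_query, aligned_target):
    """Percent of alignment columns where both sequences carry the same base.

    Both inputs must be the same length, with '-' marking a gap.
    Gap columns count as non-matching.
    """
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        raise ValueError("aligned sequences must not be empty")
    query = aligned_query.upper()
    target = aligned_target.upper()
    matches = sum(1 for q, t in zip(query, target) if q == t and q != "-")
    return 100.0 * matches / len(query)


def windowed_identity(aligned_query, aligned_target, window=10):
    """Percent identity for consecutive, non-overlapping windows."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        percent_identity(aligned_query[i:i + window],
                         aligned_target[i:i + window])
        for i in range(0, len(aligned_query), window)
    ]


# Toy data (hypothetical): two mismatches, both in the second half.
cloned = "ACGTACGTACGTACGTACGT"
genome = "ACGTACGTACGTACGTTTGT"

overall = percent_identity(cloned, genome)
per_window = windowed_identity(cloned, genome, window=10)
print(overall, per_window)
```

With real data you would paste in the aligned query and genome strings from your alignment output and pick a window size that suits the fragment length, for example 50 or 100 columns.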