mutation annotations - (Sep/16/2012 )
I'm performing a computational analysis of mutations and am currently building a data set of specific mutations that I found in papers.
The first group of mutations is defined by: (1) Gene name (HGNC annotations, such as "LDLR" or "BRCA1") and (2) the cDNA (c.) nomenclature (such as c.2389 G>T or c.313+6T>C).
The second group is defined by: (1) gene name (again HGNC annotations), (2) the base change (such as G>A), (3) intron number and (4) position relative to the intron (such as +6 or -2).
I wish to locate the genomic coordinate of the mutation (hg19 coordinates).
In both cases I run into an ambiguity, because many genes have several transcripts (i.e. more than one cDNA for the gene). Neither paper supplies a transcript ID, only the gene name.
Is there a common way to interpret these mutations (for example, looking only at the longest transcript)? In addition, assuming I have only a single transcript, how do I fetch the exon/intron positions within the transcript/cDNA?
Thanks in advance,
Mutations reported without a reference should be based on the RefSeq sequence at GenBank. However, if these are patient mutations or something similar, it may well be that they were not submitted to GenBank, in which case there isn't much you can do about it. You could email the authors and ask which reference sequence they used.
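For the single-transcript part of the question, here is a minimal Python sketch of how both kinds of annotation could be mapped to a genomic coordinate once a transcript has been chosen. It is an illustration under stated assumptions, not an established tool. The exon table is a made-up toy example (not real LDLR or BRCA1 data), coordinates are assumed to be 1-based and inclusive, `cds_start` is assumed to be the genomic position of c.1 (the A of the start codon), and all function names are hypothetical. Positions in the UTRs (c.-N, c.*N) are not handled. For the second group it assumes intron n lies between exon n and exon n+1 in transcript order, with positive offsets counted from the end of exon n and negative offsets from the start of exon n+1; papers may number introns differently.

```python
import re


def transcript_order(exons, strand):
    """Return exons sorted 5'->3' along the transcript."""
    return sorted(exons, reverse=(strand == "-"))


def parse_c_notation(text):
    """Parse simple substitutions such as 'c.2389 G>T' or 'c.313+6T>C'.

    Returns (coding position, intronic offset, ref base, alt base).
    """
    m = re.fullmatch(r"c\.(\d+)([+-]\d+)?\s*([ACGT])>([ACGT])", text.strip())
    if m is None:
        raise ValueError(f"unsupported notation: {text!r}")
    return int(m.group(1)), int(m.group(2) or 0), m.group(3), m.group(4)


def cdna_to_genomic(exons, strand, cds_start, c_pos, offset=0):
    """Map c.<c_pos><offset> to a genomic coordinate for one transcript."""
    ordered = transcript_order(exons, strand)
    sign = 1 if strand == "+" else -1

    # 1-based index of c.1 within the spliced transcript.
    t_cds, acc = None, 0
    for start, end in ordered:
        if start <= cds_start <= end:
            dist = cds_start - start if strand == "+" else end - cds_start
            t_cds = acc + dist + 1
            break
        acc += end - start + 1
    if t_cds is None:
        raise ValueError("cds_start is not inside any exon")

    # Walk the exons again to find the one containing the target base.
    target, acc = t_cds + c_pos - 1, 0
    for start, end in ordered:
        length = end - start + 1
        if target <= acc + length:
            within = target - acc - 1  # 0-based offset inside this exon
            anchor = start + within if strand == "+" else end - within
            return anchor + sign * offset
        acc += length
    raise ValueError("position lies beyond the end of the transcript")


def intron_offset_to_genomic(exons, strand, intron_number, offset):
    """Map (intron number, +N / -N) to a genomic coordinate."""
    ordered = transcript_order(exons, strand)
    sign = 1 if strand == "+" else -1
    if not 1 <= intron_number < len(ordered) or offset == 0:
        raise ValueError("invalid intron number or offset")
    if offset > 0:   # counted from the last base of the upstream exon
        start, end = ordered[intron_number - 1]
        anchor = end if strand == "+" else start
    else:            # counted from the first base of the downstream exon
        start, end = ordered[intron_number]
        anchor = start if strand == "+" else end
    return anchor + sign * offset


# Toy transcript: three exons on the plus strand, c.1 at genomic position 150.
TOY_EXONS = [(100, 199), (300, 399), (500, 599)]
c_pos, off, ref, alt = parse_c_notation("c.50+6T>C")
print(cdna_to_genomic(TOY_EXONS, "+", 150, c_pos, off))   # 205
print(intron_offset_to_genomic(TOY_EXONS, "+", 1, -2))    # 298
```

In practice the exon table would come from whichever annotation source you settle on for the chosen transcript. Check that source's coordinate convention before plugging values in: UCSC gene tables, for example, store exon starts 0-based. Because the result depends entirely on the transcript chosen, it may be worth computing the coordinate for every candidate transcript and checking whether the reference base at that hg19 position matches the base reported in the paper.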