I recently got my raw data from 23andMe, which actually uses a (now custom) Illumina OmniExpress SNP array.
Since the FDA ban in 2013, they can no longer provide simple, user-friendly health predictions with pretty pictures, but from what I have read in discussions, I think it's better this way.
However, I can and did get a "raw data file", which is pretty basic: just genome coordinates and rs numbers where applicable.
The rs numbers are fine, since those are well defined, but there are many custom identifiers that only come with a genomic coordinate. Also, the coordinates are in the old human genome build (GRCh37) instead of GRCh38, which is currently used on Ensembl and everywhere else.
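To make the situation concrete, here is a minimal sketch of how the custom entries could be pulled out of the file. It assumes the usual layout of the raw data download (tab-separated columns for identifier, chromosome, position and genotype, with comment lines starting with `#`). The function names, the rs number, the custom identifier and the genotypes in the sample are made up for illustration; only the coordinate 4:154611811 is a real example from my file.

```python
import io


def parse_raw_data(handle):
    """Yield (identifier, chromosome, position, genotype) tuples.

    Assumes a tab-separated file with '#' comment lines, which is the
    layout I see in my download; adjust if your file differs.
    """
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        ident, chrom, pos, genotype = line.rstrip("\n").split("\t")
        yield ident, chrom, int(pos), genotype


def needs_coordinate_lookup(ident):
    """Custom identifiers (anything not starting with 'rs') have no rs number to look up."""
    return not ident.startswith("rs")


# Made-up sample rows standing in for the real file.
sample = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t12345\tAA\n"
    "i0000001\t4\t154611811\tCT\n"
)

records = list(parse_raw_data(sample))
# Coordinates (still GRCh37) that have to be annotated by position alone.
custom = [f"{chrom}:{pos}" for ident, chrom, pos, _ in records
          if needs_coordinate_lookup(ident)]
print(custom)
```

With a real file, `io.StringIO(...)` would be replaced by `open(path)`; the resulting coordinates would still need converting to GRCh38 before being looked up on current Ensembl.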
But that is not the main problem, since there is a build converter. The main problem is that I end up with something like 4:154611811 and need to find out quickly what is at that location: the gene annotation and, if it is inside a gene, the position in the coding sequence (or sequences, for multiple transcripts, and which transcripts). I know there are bioinformatic tools that can run this over the whole file (which I will need eventually, but I am not able to write my own client for some API); for now, I would settle for anything that can do this for a single position.
I usually look up a location the other way around, starting from a sequence. Now that I am stuck with only the genomic location, I have failed to find an easy way.
I can put the genomic coordinate into Ensembl to view that region and see that it lies in a transcript, but I can't find out exactly WHERE it is, i.e. what position it has in the CDS.
The same goes for the NCBI and UCSC browsers. They all show me that it is in a transcript, but I need to know the position in that transcript (or actually in the CDS, but that is convertible). I can work the positions out by hand, but that is not doable for many coordinates.
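For clarity, the by-hand calculation I mean is roughly the following. This is only a sketch under my own assumptions: the coding parts of the exons are given as 1-based inclusive genomic intervals, the strand is +1 or -1, and the exon coordinates below are toy values, not a real transcript. The function name `genomic_to_cds` is mine, not from any library.

```python
def genomic_to_cds(pos, cds_exons, strand):
    """Return the 1-based CDS position of genomic coordinate `pos`.

    cds_exons: list of (start, end) genomic intervals, 1-based inclusive,
               covering only the coding part of each exon (assumption).
    strand:    +1 for the forward strand, -1 for the reverse strand.
    Returns None if `pos` is not inside any coding interval.
    """
    # Walk the exons in transcription order: ascending on +, descending on -.
    ordered = sorted(cds_exons, reverse=(strand == -1))
    offset = 0  # coding bases in the exons already passed
    for start, end in ordered:
        if start <= pos <= end:
            within = pos - start if strand == 1 else end - pos
            return offset + within + 1
        offset += end - start + 1
    return None


def codon_number(cds_pos):
    """1-based codon index for a 1-based CDS position."""
    return (cds_pos - 1) // 3 + 1


# Toy coding intervals (hypothetical), 10 + 20 + 6 = 36 coding bases.
toy_exons = [(100, 109), (200, 219), (300, 305)]

forward = genomic_to_cds(205, toy_exons, 1)    # 16
reverse = genomic_to_cds(205, toy_exons, -1)   # 21
intronic = genomic_to_cds(150, toy_exons, 1)   # None
print(forward, codon_number(forward), reverse, codon_number(reverse), intronic)
```

Doing this once is easy; the problem is that for every coordinate I first have to dig the exon and CDS boundaries of each overlapping transcript out of the browser, which is the part I would like a tool to do for me.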
Does anyone have an idea?