How to retrieve miRNA sequences from Ensembl using their coordinates? - (Feb/18/2020)

Dear all, 

I am facing a technical issue with miRNAs. I want to retrieve the sequences of 500 duck miRNAs from the Ensembl database using their coordinates. For example, I have a miRNA genomic location named "KB742382_1_145970_145992", and I want to retrieve the sequence and the name of this miRNA. Note: the duck genome on Ensembl is arranged as scaffolds whose names start with KB. I think there are two ways:

1. Download a FASTA file from Ensembl containing the coordinates and sequences, and then use R to subset it.

2. Download a GFF3 file containing the coordinates and the names. Then, knowing the names of the miRNAs or their transcripts, match these against a downloaded FASTA file that contains names and sequences to get the sequences.
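The two approaches above can be sketched in Python roughly as follows. This is only a minimal sketch under some assumptions: that a location such as "KB742382_1_145970_145992" means scaffold KB742382.1, start 145970, end 145992 (1-based, inclusive), that the FASTA headers begin with the scaffold name, and that the GFF3 attributes carry a Name or ID. The function names and the toy data are made up for illustration and are not part of any Ensembl tool.

```python
import io


def parse_location(loc):
    """Split 'KB742382_1_145970_145992' into ('KB742382.1', 145970, 145992).

    Assumes the layout accession_version_start_end.
    """
    accession, version, start, end = loc.rsplit("_", 3)
    return f"{accession}.{version}", int(start), int(end)


def read_fasta(handle):
    """Return {sequence_id: sequence} from an open FASTA file handle."""
    seqs, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Keep only the first token of the header as the ID.
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def slice_region(seqs, scaffold, start, end):
    """Return the forward-strand bases start..end (1-based, inclusive)."""
    return seqs[scaffold][start - 1:end]


def gff3_overlaps(handle, scaffold, start, end):
    """Return (feature_type, name) for GFF3 features overlapping the region."""
    hits = []
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[0] != scaffold:
            continue
        f_start, f_end = int(cols[3]), int(cols[4])
        if f_start <= end and f_end >= start:
            attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
            hits.append((cols[2], attrs.get("Name", attrs.get("ID"))))
    return hits


# Toy data standing in for the real genome FASTA and GFF3 files.
toy_fasta = io.StringIO(">KB000001.1 toy scaffold\nACGTACGTAG\nCTAGCTAGGA\n")
toy_gff3 = io.StringIO(
    "##gff-version 3\n"
    "KB000001.1\ttoy\tmiRNA\t4\t10\t.\t+\t.\tID=t1;Name=toy-mir-1\n"
)

scaffold, start, end = parse_location("KB000001_1_5_12")
print(slice_region(read_fasta(toy_fasta), scaffold, start, end))
print(gff3_overlaps(toy_gff3, scaffold, start, end))
```

With real files, the StringIO objects would be replaced by open() on the downloaded genome FASTA and GFF3. Note that the location string carries no strand, so a miRNA on the minus strand would still need to be reverse-complemented.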

 

What do you think? Is this available for ducks?

-mohsamir1984-

No idea if this is available for ducks. I think both approaches would work so long as you have the information available to you. From R or Python, you could also potentially do some web scraping and retrieve the information directly.
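As one possible way to retrieve the information directly, here is a minimal Python sketch that uses the Ensembl REST API (its sequence/region endpoint) rather than scraping web pages. The species name "anas_platyrhynchos", the scaffold name "KB742382.1", and whether the current Ensembl release still serves the KB scaffolds are all assumptions that would need checking. The helper names are made up for illustration.

```python
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"


def region_url(species, scaffold, start, end, strand=1):
    """Build a sequence/region request URL for one genomic interval."""
    return f"{ENSEMBL_REST}/sequence/region/{species}/{scaffold}:{start}..{end}:{strand}"


def fetch_sequence(url, timeout=30):
    """Fetch the region as plain text (needs network access)."""
    req = urllib.request.Request(url, headers={"Content-Type": "text/plain"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode().strip()


# Species and scaffold names here are assumptions, not verified values.
url = region_url("anas_platyrhynchos", "KB742382.1", 145970, 145992)
print(url)
```

Calling fetch_sequence(url) would then return the bases for that interval if the species and scaffold names are valid. For 500 miRNAs, the calls would go in a loop with a short pause between requests.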

-bob1-