
Gene cross-checking - (Jul/20/2010 )


What does your gene list look like? If you look at the example I provided above, you'll see that my gene names are, for example:


Are yours similar, or do you have something like:


If yours looks like the second example, remove the "cel:" part from each gene name...
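If the list is long, removing the prefix by hand gets tedious. Here is a minimal Python sketch of the same clean-up. It is not part of the script discussed in this thread, the function names are made up for illustration, and it assumes one gene name per line:

```python
def strip_org_prefix(name, prefix="cel:"):
    """Return the gene name without a leading organism prefix, if present."""
    # strip() also removes stray spaces and Windows/Unix line endings
    name = name.strip()
    if name.startswith(prefix):
        return name[len(prefix):]
    return name


def clean_genelist(lines, prefix="cel:"):
    """Strip the prefix from every non-empty line and return the cleaned names."""
    cleaned = []
    for line in lines:
        name = strip_org_prefix(line, prefix)
        if name:  # skip blank lines
            cleaned.append(name)
    return cleaned
```

To use it, read the lines of your list, pass them to clean_genelist(), and write the result back out one name per line as "genelist.txt".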

To test whether there is some other problem, copy and paste the example 10-gene list I used into a text file, and save it as "genelist.txt". Run the script using that list, and see if it runs without errors...


nemat on Jul 22 2010, 04:20 PM said:

When I downloaded the pep files, I right-clicked, chose "Save target as" and saved each one to my directory as a pep file, which I later turned into a text file. I attached an image of the formatting of the file; perhaps this is the problem?

That sounds okay -- except for the "which I later turned into a text file" part. You didn't need to do anything to these files, just DL them to your target directory by right-clicking. Notepad shows the files that way because Notepad is stupid, and doesn't recognise line endings other than those used by Windows. Mine look that way, too -- if I open them in Notepad. I long ago abandoned Notepad as a text editor in favor of TextPad.

But, that's just an aside -- if you just DL'ed the files from KEGG and kept their original filenames, the script should work fine.


I tried your genelist.txt file and it worked. I think the problem is that I copied my gene list from Excel and the names still carry cell formatting. Any idea how to completely clear the cell formatting so that they are just plain text? I feel as though we are so close!
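One possible way to reduce pasted spreadsheet cells to plain text is sketched below in Python. This is only a guess at the kind of debris involved (surrounding quotes, a byte-order mark, non-breaking spaces, tabs and other control characters); the thread does not say what the actual fix was, and the function name is hypothetical:

```python
import unicodedata


def to_plain_text(cell):
    """Return the cell text with invisible characters and surrounding quotes removed."""
    # Turn non-breaking spaces into ordinary spaces so strip() can remove them
    cell = cell.replace("\u00a0", " ")
    # Drop control and format characters (tabs, CR/LF, byte-order mark, zero-width spaces)
    cell = "".join(ch for ch in cell if unicodedata.category(ch)[0] != "C")
    # Remove outer whitespace, then any double quotes added around the cell, then whitespace again
    return cell.strip().strip('"').strip()
```

Applying to_plain_text() to each pasted line and saving the result from a plain text editor should leave only the visible characters of each gene name.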


Fixed the problem, and it ran very smoothly. Now to analyze the information and prioritize which genes to test. Thank you very, very much! And I'll keep you updated on the results!

Best regards,


Excellent -- good luck!

The script I wrote is pretty basic -- it can be tuned to be more generic (e.g. to compare a list of gene names from any organism against any other organism, or to retain only the single highest-scoring BLAST hit). Note that, as written, any BLAST hit whose E-value is higher than 1e-03 is not retained -- you can tune this up or down by changing the "-e 1e-03" switch in the blastall system call line.
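As a rough illustration of the "single highest scoring hit" idea, here is a Python sketch. It is not the script from this thread. It assumes the hits are available as 12-column tab-separated lines (the layout blastall writes with "-m 8", where column 11 is the E-value and column 12 the bit score); the script itself may use a different output format, and the function names are made up:

```python
def parse_hits(lines):
    """Yield (query, subject, evalue, bitscore) from 12-column tabular BLAST lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        yield fields[0], fields[1], float(fields[10]), float(fields[11])


def best_hits(lines, max_evalue=1e-3):
    """Keep, for each query, the hit with the highest bit score among hits at or below max_evalue."""
    best = {}
    for query, subject, evalue, bitscore in parse_hits(lines):
        if evalue > max_evalue:
            continue  # same cutoff idea as the "-e 1e-03" switch
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

The result is a dictionary keyed by query name, so each gene in the list ends up with at most one match. Raising or lowering max_evalue here only filters hits that BLAST already reported; to make BLAST itself report weaker hits, the "-e" switch still has to be changed.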
