(Previously: A Guide to Molecular Sequence Analysis)

This guide introduces the reader to molecular sequence analysis. In the context of this guide, sequence analysis is the process of trying to find out something about a nucleotide or amino acid sequence using in silico biology techniques. You may have sequenced a gene yourself and wish to learn what the long string of letters representing bases actually codes for. You may want to confirm that you have indeed cloned a gene successfully, or you might want to learn about a sequence of DNA that you know absolutely nothing about. You may want to know whether a worm has a protein similar to a human one. These, and many other situations, require that you employ sequence analysis.

How It's Done

Vast databases of genetic information have been made publicly available. Some information is not made public, and you may have access to such private databases at your place of work. Currently, many international scientific journals require that a new sequence be submitted to a publicly available database before its discovery can be published. Large numbers of sequences have been checked and published, often annotated and cross-referenced. Each record (entry) is curated and maintained in one of the many different databases accessible over the Internet.

Software is freely available that will compare your unknown sequence to all of the sequences in a database and report those that are similar to yours. You may also find some utilities useful, in particular those that predict which sequences are coding, or those that present you with a graphical, three-dimensional image of a macromolecule.
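To make the idea of "comparing an unknown sequence to every sequence in a database" concrete, here is a minimal sketch in Python. It is only a toy: it scores similarity by the fraction of shared three-letter words (k-mers), which is far simpler than the alignment methods and statistics used by real search tools. The function names, the threshold value, and the three database entries are all invented for illustration.

```python
def kmers(seq, k=3):
    """Return the set of all overlapping substrings of length k in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard similarity of the k-mer sets of two sequences (0.0 to 1.0)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0  # at least one sequence is shorter than k
    return len(ka & kb) / len(ka | kb)


def search(query, database, k=3, threshold=0.3):
    """Compare query to every database entry; return hits, best first."""
    hits = [(name, similarity(query, seq, k)) for name, seq in database.items()]
    hits = [(name, score) for name, score in hits if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical toy database: names and sequences are made up for this example.
toy_database = {
    "entry_A": "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG",
    "entry_B": "ATGGCCATTGTTATGGGCCGCTGAAAGGGTGCCCGATAG",
    "entry_C": "TTTTTTTTTTCCCCCCCCCCAAAAAAAAAA",
}

unknown = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

# Report each sufficiently similar entry with its score.
for name, score in search(unknown, toy_database):
    print(f"{name}\t{score:.2f}")
```

In this sketch the identical entry and the entry differing by a single base are reported, while the unrelated entry falls below the threshold and is dropped. Real tools follow the same overall pattern (score every entry, report the best matches) but use proper alignments and statistical significance rather than a simple shared-word fraction.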

What's In This Guide?

In silico biology is an extensive, expanding and complex science. This guide provides an interactive, hands-on introduction for scientists with no working knowledge of molecular sequence analysis.

You will learn the essentials of molecular sequence analysis by performing your own searches of provided "unknown" sequences. Each database has its strengths and weaknesses, and you will learn to choose the most appropriate database for your desired search. You will learn a little about the many options available when performing searches. Detailed explanations of these options, and of the complex statistics applied, are not covered. Advanced users who wish to learn about these aspects can refer to the scientific literature (References). There are also many excellent online guides for advanced users, which can be found through a simple Internet search.

