
(Previously: A Guide to Molecular Sequence Analysis)

Page Overview

  1. Introduction
  2. How It's Done
  3. What's In This Guide?
  4. Access To The Guide

Please visit ETEXT.NET
This site, written and owned by Andrew S Louka, is served to the internet without charge by who publish texts electronically. Many thanks to them for supporting the scientific community in this way.

by Andrew S. Louka

New! EasyTDT is a simple MS Excel sheet for transmission/disequilibrium test analysis, useful for quick by-hand calculations. It's a real time-saver. Drop me an e-mail if you'd like to see a practical tutorial in genetic association studies to include TDT and flavours thereof.


Introduction

This guide will introduce the reader to molecular sequence analysis. In the context of this guide, sequence analysis is the process of trying to find out something about a nucleotide or amino acid sequence using in silico biology techniques. You may have sequenced a gene yourself and wish to learn what the long string of letters representing bases actually codes for. You may want to confirm that you have indeed cloned a gene successfully, or you might want to learn about a sequence of DNA that you know absolutely nothing about. You may want to know whether a worm has a protein similar to a human one. These, and many other situations, require that you employ sequence analysis.
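
As a small illustration of finding out what a string of bases codes for, the sketch below translates a DNA sequence into one-letter amino acid codes. It is not taken from the guide itself: it assumes the standard genetic code, assumes the sequence is the coding strand read from its first base, and the function name `translate` is purely illustrative.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding-strand DNA sequence, stopping at the first stop codon.

    Assumption: reading starts at the first base (frame 1). Codons containing
    letters other than A, C, G, T are reported as "X".
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTAA"))
```

A real analysis would normally consider all six reading frames and, for some organisms or organelles, a different genetic code.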

How It's Done

Vast databases of genetic information have been made publicly available. Some information is not made public, though you may have access to such private databases at your place of work. Many international scientific journals currently require that a new sequence be submitted to a publicly available database before its discovery can be published. Large numbers of sequences have been checked and published, often annotated and cross-referenced. Each record (entry) is curated and maintained in one of the many different databases accessible over the Internet.

Software is freely available that will compare your unknown sequence to all of the sequences in a database. Sequences that are similar to yours are reported. You may also find some utilities useful, in particular those that predict which sequences are coding, or those that present you with a graphical, three-dimensional image of a macromolecule.
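
The idea of comparing an unknown sequence against every record in a database and reporting the similar ones can be sketched in a few lines. This is only a toy: it scores similarity by the fraction of shared three-letter substrings, whereas real search tools use alignment algorithms and statistical scoring. The record names, sequences, function names and the 0.3 threshold are all hypothetical.

```python
def kmers(seq, k=3):
    """Return the set of all overlapping substrings of length k."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard index of the two k-mer sets (shared k-mers / all distinct k-mers)."""
    kmers_a, kmers_b = kmers(a, k), kmers(b, k)
    if not kmers_a or not kmers_b:
        return 0.0
    return len(kmers_a & kmers_b) / len(kmers_a | kmers_b)


def search(query, database, k=3, threshold=0.3):
    """Score the query against every record; return hits, best first."""
    hits = [(name, similarity(query, seq, k)) for name, seq in database.items()]
    hits = [hit for hit in hits if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical stand-in for a sequence database.
toy_database = {
    "record_A": "ATGGCCATTGTAATGGGCCGC",
    "record_B": "ATGGCCATTGTTATGGGCCGC",
    "record_C": "TTTTTTTTTTTTTTTTTTTTT",
}

hits = search("ATGGCCATTGTAATGGGCCGC", toy_database)
for name, score in hits:
    print(f"{name}\t{score:.2f}")
```

Here the query matches `record_A` exactly, is close to `record_B` (which differs by a single base), and shares nothing with `record_C`, which is therefore not reported.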

What's In This Guide?

In silico biology is a thorough, expanding and complex science. This guide provides an interactive working introduction for scientists with no working knowledge of molecular sequence analysis.

You will learn the essentials of molecular sequence analysis by performing your own searches of provided "unknown" sequences. Each database has its strengths and weaknesses, and you will learn to choose the most appropriate database for your desired search. You will learn a little about the many options available when performing searches. A detailed explanation of such options, and of the complex statistics applied, will not be covered. Advanced users who wish to learn about these aspects can refer to the scientific literature (References). There are also many excellent online guides for advanced users, which can be found through a simple Internet search.

If any of the links fail, please send an E-mail to the webmaster immediately.
Comments and suggestions about this guide are also welcome.

Access to the Guide

Some links in this guide will take you to another page at this site (e.g. to the glossary). To return to the referring page, use the back button on your web browser.

Click here to view the Contents Page.

PS... will continue to be a minimal graphics site, aiding access speeds and navigability from around the globe and on different platforms (e.g. PDAs) and browser versions.

I find it flattering to receive job applications, but I am not in a position to employ staff. Thanks for visiting - I hope you find the guide useful!


Author: Andrew S. Louka

(c)1999-2002, Andrew S. Louka.
The entire contents of the guide (all material published electronically under the URL including single-file copies served from this site (e.g. as a PDF format file)) are copyright the author, Andrew S. Louka (except where specifically detailed within the guide). The guide may be used for personal and teaching purposes in electronic or printed form, but no form of payment whatsoever may be charged or accepted for its distribution. Reproduction of the guide intended for publication, including screenshots published in journals, must be approved by the author. The guide may not be served in any form, electronically or otherwise, unless specifically approved by the author in writing (PGP signed e-mail or handwritten approval only). The guide may not be served on the Internet from any other address than (mirrors are not allowed).

** For maximum aesthetic effect, a stylesheet-enabled browser is recommended for this web site. **