
# Frequency of restriction enzyme sites in a genome

5 replies to this topic

member


Posted 18 March 2011 - 06:12 AM

Hi all
I want to find out how many sites, on average, a particular enzyme has in the human or mouse genome. Is there a tool available for that?

### #2 philman


Posted 18 March 2011 - 06:31 AM

Well, you can usually calculate it yourself: take the length of the genome and divide by how often the enzyme cuts on average. For example, EcoRI recognizes a 6 bp site, so the chance of its sequence occurring at any given position is 1 in 4^6, and it therefore cuts about once every 4096 bp on average.

This does mean that in a genome of several billion base pairs, such as the human genome, it will cut hundreds of thousands of times.
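The back-of-the-envelope estimate above can be sketched in a few lines of Python. This is only an illustration: the function name `expected_sites` and the 2.9 billion bp genome length are assumptions chosen for the example, and the model treats all four bases as equally likely and independent.

```python
def expected_sites(genome_length: int, site_length: int) -> float:
    """Expected number of occurrences of one specific recognition site.

    Assumes each base (A, C, G, T) is equally likely and independent,
    so a given site of length n turns up about once every 4**n bp.
    """
    average_spacing = 4 ** site_length  # 4096 for a 6 bp site
    return genome_length / average_spacing


if __name__ == "__main__":
    # Hypothetical genome length of 2.9 billion bp, used only for illustration.
    print(expected_sites(2_900_000_000, 6))
```

For a 6 bp cutter this gives a count in the hundreds of thousands, not millions.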

### #3 Rsm


Posted 18 March 2011 - 06:42 AM

philman, on 18 March 2011 - 06:31 AM, said:

Well, you can usually calculate it yourself: take the length of the genome and divide by how often the enzyme cuts on average. For example, EcoRI recognizes a 6 bp site, so the chance of its sequence occurring at any given position is 1 in 4^6, and it therefore cuts about once every 4096 bp on average.

This does mean that in a genome of several billion base pairs, such as the human genome, it will cut hundreds of thousands of times.

708,007 times, to be exact

member


Posted 18 March 2011 - 06:44 AM

philman, on 18 March 2011 - 06:31 AM, said:

Well, you can usually calculate it yourself: take the length of the genome and divide by how often the enzyme cuts on average. For example, EcoRI recognizes a 6 bp site, so the chance of its sequence occurring at any given position is 1 in 4^6, and it therefore cuts about once every 4096 bp on average.

This does mean that in a genome of several billion base pairs, such as the human genome, it will cut hundreds of thousands of times.

I don't think that's a valid method, because some enzymes cut more frequently than others even when they have the same recognition sequence length. For example, EcoRI cuts 3 times more frequently than MspI.

### #5 bob1


Posted 20 March 2011 - 04:04 PM

As a general rule that method is roughly correct, though it does vary. You can use it to estimate the approximate number of cut sites. If you want to know exactly, then you will have to get the genome sequence and run it through a program like EnzymeX or NEBcutter.
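If you would rather script the exact count yourself, a minimal sketch in plain Python looks like the following. It is not how EnzymeX or NEBcutter work internally; `count_sites` is a hypothetical helper that simply scans a sequence string for the recognition site, counting overlapping matches. For a palindromic site such as GAATTC, scanning one strand is enough, since the site reads the same on the reverse complement.

```python
def count_sites(sequence: str, site: str) -> int:
    """Count occurrences (including overlapping ones) of `site` in `sequence`.

    Case-insensitive. Does not handle ambiguity codes such as N or R;
    this is a sketch, not a replacement for a dedicated tool.
    """
    sequence = sequence.upper()
    site = site.upper()
    count = 0
    position = sequence.find(site)
    while position != -1:
        count += 1
        # Restart one base further along so overlapping matches are found.
        position = sequence.find(site, position + 1)
    return count


if __name__ == "__main__":
    # Toy sequence for illustration only; a real genome would be read
    # chromosome by chromosome from a FASTA file.
    toy = "ttGAATTCacgtacgtGAATTCgg"
    print(count_sites(toy, "GAATTC"))
```

Dividing the sequence length by the returned count gives the observed average spacing, which you can compare against the 4096 bp estimate.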

Edited by bob1, 20 March 2011 - 04:07 PM.

### #6 phage434


Posted 20 March 2011 - 05:38 PM

You can get a much more accurate estimate if you take into account the probabilities of GC and AT pairs separately. If the GC content of the organism is (say) 70%, and the recognition site of the enzyme is GAATTC (EcoRI site), then the probability of its presence at a given position will be (0.35)(0.15)(0.15)(0.15)(0.15)(0.35) = 6.2e-5, since the probability of G is half of the GC content, and the probability of A is half of the AT content.

Instead of the naively calculated 4096 bp between sites on average, the 70% GC content version will have an expected distance of 1/6.2e-5 = 16,129 bp.

This still won't be exact, nor does it account for dinucleotide (digraph) frequencies or special genome sequences.
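The GC-adjusted estimate above can be sketched as follows. The function names are made up for this example, and the model still assumes that bases are independent of their neighbours, so it shares the limitations just mentioned.

```python
def site_probability(site: str, gc_content: float) -> float:
    """Probability that `site` starts at a given position.

    Assumes bases are independent, with P(G) = P(C) = gc_content / 2
    and P(A) = P(T) = (1 - gc_content) / 2.
    """
    p_g_or_c = gc_content / 2
    p_a_or_t = (1 - gc_content) / 2
    probability = 1.0
    for base in site.upper():
        if base in "GC":
            probability *= p_g_or_c
        elif base in "AT":
            probability *= p_a_or_t
        else:
            raise ValueError(f"unsupported base: {base!r}")
    return probability


def expected_spacing(site: str, gc_content: float) -> float:
    """Average distance in bp between sites under the same model."""
    return 1 / site_probability(site, gc_content)


if __name__ == "__main__":
    print(site_probability("GAATTC", 0.70))  # about 6.2e-5
    print(expected_spacing("GAATTC", 0.70))  # roughly 16,000 bp
    print(expected_spacing("GAATTC", 0.50))  # back to the naive 4**6
```

With a GC content of 50% the result reduces to the naive 1 in 4096, which is a quick sanity check on the model.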