
Question on symbols used in microbiology - (Aug/11/2012)

I have a question regarding the use of symbols in bacterial strains.

I found this on a webpage (http://openwetware.o..._coli_genotypes): "dcm = cytosine methylation at the second C of CCWGG sites exists. dam & dcm are the default properties and are always elided, while dam- or dcm- should be declared explicitly."


Reading this, I understand that dcm is present in cells by default, and that a strain lacking it should be noted as dcm-. But the strains I look at always say dcm (without the -). Does this mean that dcm is present in these strains? That would be odd, since the page states it is always present. Or does it mean that it is absent? (That would make more sense, because genotypes only seem to mention "things" when they are absent or mutated.)



But the website also states: "coli B strains are naturally lon- and dcm-". Does this mean that all B strains are dcm-, and thus the dcm- part is no longer mentioned? And that when dcm is mentioned, it is present? This makes it pretty confusing.


When I look at those B strains it says for example: BL21
E. coli B F- dcm ompT hsdS(rB- mB-) gal +>K-12S)

Does this mean dcm is not present, or that it is present? Either way it's confusing, since they don't write dcm-, and since it's a B strain, isn't dcm automatically absent?

Especially because another B strain (that should not have dcm) is given like this: B2155
thrB1004 pro thi strA hsdsS lacZΔM15 (F' lacZΔM15 lacIq traD36 proA+ proB+) ΔdapA::erm (Ermr) pir::RP4 <::kan (Kmr) from SM10>



It does not mention dcm at all. So is dcm present in this strain or not?



A last question: what's the difference between () and <> when used in strain genotypes? Sometimes I see (...) and <...> in a strain, but what do they mean? For example:
E. coli B F- ompT hsdS(rB- mB-) metA::Tn5(kanr) dcm+ Tetr gal λ (DE3) endA Hte <argU proL Camr>

-lyok-

http://ecoliwiki.net/colipedia/index.php/Help:Genetic_nomenclature
http://en.wikipedia.org/wiki/Bacterial_genetic_nomenclature
http://www.sci.sdsu.edu/~smaloy/MicrobialGenetics/topics/mutations/nomenclature-v3.pdf

But nothing compares to the original paper:

http://www.genetics.org/content/54/1/61.full.pdf

Sorry for not replying to all your questions one by one. You can read the material and have even more questions answered:)

Andreea

-ascacioc-

Ok,
thanks for those links.

However, after checking them, I still have a few very specific problems:

1: what is the meaning of <..>? Is it for genes that originally come from a phage or plasmid that has integrated into the genome of the strain? I noticed it can also be used for genes still on a plasmid (e.g. the F plasmid). So how do you know whether the genes are integrated or still on the plasmid? It seems that if they are still on the plasmid, the name of the plasmid is mentioned, for instance the F plasmid followed by <...>.
2: what I understand from the papers is that X::Y means that Y is inserted in X, but following this logic I do not understand the following B2155 strain:
thrB1004 pro thi strA hsdsS lacZΔM15 (F' lacZΔM15 lacIq traD36 proA+ proB+) ΔdapA::erm (Ermr) pir::RP4 <::kan (Kmr) from SM10>
An E. coli strain carrying the pir sequence required for maintenance of plasmids containing R6K ori. Also, this strain is auxotrophic for DAP (diaminopimelic acid - a lysine precursor). The auxotrophy helps in removal of this strain from a bi-parental mating setup after conjugation.


How come this strain still has a working pir sequence when RP4 is inserted in it?
3: same strain, but pir::RP4 <::kan (Kmr) from SM10> doesn't seem to make sense to me.
What does <::...> mean? The genes inside the brackets are from SM10, but are they then inserted in the pir gene?


4: same question as before: "coli B strains are naturally lon- and dcm-" and also "dcm = cytosine methylation at the second C of CCWGG sites exists. dam & dcm are the default properties and are always elided, while dam- or dcm- should be declared explicitly."

Then how come I find B strains where they mention dcm (e.g. BL21
E. coli B F- dcm ompT hsdS(rB- mB-) gal +>K-12S))

while others don't mention dcm or lon?
===> to me it looks like BL21 does NOT have a working dcm gene but does have a lon gene, because lon is not mentioned. But this conflicts with the general statement that B strains don't have the lon gene.
Also (from Novagen, which sells this strain): BL21(DE3) F– ompT hsdSB(rBmB) gal dcm (DE3); why do they mention dcm here and not lon? At the same time they also state: "BL21 is the most widely used host background for protein expression and has the advantage of being deficient in the lon (8) and ompT proteases."

It seems that people are not following the general "rules" of the nomenclature, or are they?
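
To make the ambiguity in point 4 concrete, here is a small Python sketch. It assumes the reading from the quoted page (a gene that is listed is mutant, a trailing + marks explicit wild type) together with the statement that B strains are naturally lon- and dcm-. The names `B_BACKGROUND_DEFICIENT` and `marker_status` are made up for illustration, and the marker lists are simplified from the genotypes above.

```python
# Hypothetical helper, not part of any nomenclature standard.
# Assumption: B strains are deficient in lon and dcm unless stated otherwise.
B_BACKGROUND_DEFICIENT = frozenset({"lon", "dcm"})


def marker_status(gene, listed_markers, background_deficient=frozenset()):
    """Return how `gene` would be read for a strain under the assumed rules."""
    if gene + "+" in listed_markers:
        return "functional (explicitly marked +)"
    if gene in listed_markers or gene + "-" in listed_markers:
        return "deficient (listed in genotype)"
    if gene in background_deficient:
        return "deficient (implied by strain background)"
    return "functional (not listed)"


# Simplified marker lists taken from the genotypes quoted in this thread.
bl21 = ["ompT", "hsdS", "gal", "dcm"]
dcm_plus_strain = ["ompT", "hsdS", "metA::Tn5", "dcm+", "gal", "endA"]

for gene in ("dcm", "lon"):
    print("BL21", gene, "->", marker_status(gene, bl21, B_BACKGROUND_DEFICIENT))
print("dcm+ strain, dcm ->",
      marker_status("dcm", dcm_plus_strain, B_BACKGROUND_DEFICIENT))
```

Under these assumptions, listing dcm for a B strain is redundant rather than contradictory, and leaving lon out does not imply that lon is functional. Whether a given supplier actually follows these rules is exactly the open question.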

-lyok-

I wrote a good portion of that page, and if you think things are unclear after reading it, you should have seen things before it existed. Genotypes are simply a mess. In the next few years we will mostly give up on them and start looking at complete sequence, which will tell what is really happening. Meanwhile, attempting to make more sense of the mess is, and will remain, very challenging. Figure out the properties you care about, and then test them on the strain you have in your hands.

-phage434-

It is difficult to put all the mutations that a strain has into its genotype. Just to show you what mutations appear in two B strains, check this paper:
http://www.ncbi.nlm.nih.gov/pubmed/19765592
Now, how would you imagine writing the genotype? It is difficult.
But to clear up some of the things you asked:
-E. coli B F- dcm ompT hsdS(rB- mB-) gal +>K-12S)) - why don't they mention dcm and lon? Because they mention E. coli B at the beginning, which means, among other things, that the strain is deficient in lon and dcm, as you mention yourself above :)
-the question related to pir::RP4 - it is a bit complicated to explain how pir functions (please google it yourself); in a nutshell: pir needs RP4, otherwise you need a helper strain for the whole thing to function
-<..> - you are right about phage or plasmid; they are inserted when :: is present
-also (..) denotes episomes and plasmids according to Demerec
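
The token-level conventions above can be sketched in a few lines of Python. This is only an illustration of one reading (X::Y means Y inserted into X, a trailing + states wild type, anything else that is listed is read as mutant); `describe_token` is a hypothetical name, and the sketch ignores brackets, phenotype annotations such as (rB- mB-), and special cases like F-.

```python
def describe_token(token):
    """Describe one simple genotype token under the assumed conventions."""
    if "::" in token:
        target, inserted = token.split("::", 1)
        if not target or not inserted:
            # e.g. "::kan" on its own does not say what was interrupted
            return f"{token}: incomplete insertion notation"
        return f"{token}: {inserted} inserted into {target}"
    if token.endswith("+"):
        return f"{token}: {token[:-1]} stated as wild type"
    # A bare gene name, or one with a trailing "-", is read as mutant.
    return f"{token}: {token.rstrip('-')} listed, read as mutant"


for tok in ("metA::Tn5", "pir::RP4", "::kan", "dcm+", "ompT"):
    print(describe_token(tok))
```

A token like ::kan falls through to the "incomplete" branch, which is one way of seeing why the <::kan ...> part of the B2155 genotype is hard to read without knowing the strain's history.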
At the end of the day, phage434 is right: you must learn the characteristics of the strain at hand and live with them. In my few years of cloning and protein expression I used at most 20 strains, for which it was enough to know:
-the chromosomal/plasmid antibiotic resistance
-endA, recA, lon, ompT
-what (DE3) means
-pLysS and similar plasmids
-plasmids containing rare codons e.g. pRARE

At the end of the day, most people use on average 2-3 bacterial strains throughout their life, e.g. DH5alpha and BL21(DE3), unless they are doing hardcore microbiology. And between us, most people in our generation don't even care about genotypes. So don't be scared if you do not get everything.

Andreea

Disclaimer: I did not use italics everywhere it is necessary for gene names :)

-ascacioc-

Thanks

ascacioc on Sun Aug 12 22:35:20 2012 said:

-E. coli B F- dcm ompT hsdS(rB- mB-) gal +>K-12S)) - why don't they mention dcm and lon? Because they mention E. coli B at the beginning, which means, among other things, that the strain is deficient in lon and dcm, as you mention yourself above


Thanks, however "why don't they mention dcm and lon? Because they mention E. coli B in the beginning," ....

I see your point and agree, but they are not consistent about this, because often they do mention dcm, and sometimes they even mention lon too.

So it's very confusing. I don't agree with the reasoning "because it's always like this, we never mention dcm or lon": it's an exception, and it just adds confusion, because some people mention dcm but not lon, and others mention both.

I think, as phage434 said, there should be a general consensus/rule.

-lyok-