Constance L. Cepko *
Shawn Fields-Berry *
John Lin *
* Department of Genetics, Harvard Medical School and Howard Hughes Medical Institute, Boston, MA 02115
1 Department of Biology and Biotechnology, Worcester Polytechnic Institute, Worcester, MA 01609
2 Merck Research Laboratories, West Point, PA 19486
3 Department of Pathology, Children's Hospital of Philadelphia, Philadelphia, PA 19104
Knowledge of the genealogical relationships of cells during development can allow one to gain insight into when and where developmental decisions are being made. Genealogical relationships can be revealed by a variety of methods, all of which involve marking a progenitor cell and/or a group of cells and then following the progeny. We describe the use of replication-incompetent retroviral vectors for the analysis of lineal relationships in developing vertebrate tissues. We give an overview of the relevant aspects of the retroviral life cycle and describe the strategies and current methods in use in our laboratory.
Knowledge of the genealogical relationships of cells during development can allow one to gain insight into when and where developmental decisions are being made. Hypotheses can be ruled in or out concerning the commitment of cells to particular fates. For example, when analyzing the cell types that result from the marking of a single progenitor cell, one can gain insight into whether the progenitor was committed to the production of one or multiple cell types. If multiple cell types are found in a clone, one can conclude that the progenitor that gave rise to these cells was not restricted to the production of only one cell type. Alternatively, if all of the cells that descend from a progenitor are the same type, the hypothesis is supported, but not proven, that the progenitor was committed to making only that cell type. In the latter case, a firm conclusion concerning the commitment of the progenitor can be reached only if the progenitor and/or progeny are exposed to a variety of environments. If only one cell type is produced despite variations in the environment, commitment of the progenitor to production of one cell type is supported. Analyses of clones generated after marking the progenitors of a tissue at various times in development can greatly aid in charting the stages of production of different cell types, allowing one to focus studies concerning cell fate decisions on particular times and places. In addition, analysis of the proliferation and migration patterns exhibited by clones can increase our understanding of the development of a particular area.
The complexity and inaccessibility of many types of embryos have made lineage analysis through direct approaches, such as time lapse microscopy and injection of tracers, almost impossible. A genetic and clonal solution to lineage mapping is through the use of retrovirus vectors. The basis for this technique will be summarized, and the strategies and current methods in use in our laboratory will be detailed.
TRANSDUCTION OF GENES VIA RETROVIRUS VECTORS
A retrovirus vector is an infectious virus that transduces a nonviral gene into mitotic cells in vivo or in vitro. These vectors utilize the same efficient and precise integration machinery as naturally occurring retroviruses to produce a single copy of the viral genome stably integrated into the host chromosome. Those that are useful for lineage analysis have been modified so that they are replication incompetent and thus cannot spread from one infected cell to another. They are, however, faithfully passed on to all daughter cells of the originally infected progenitor cell, making them ideal for lineage analysis.
Retroviruses use RNA as their genome, which is packaged into a membrane-bound protein capsid. They produce a DNA copy of their genome immediately after infection via reverse transcriptase, a product of the viral pol gene which is included in the viral particle. The DNA copy is integrated into the host cell genome and is thereafter referred to as a "provirus". Integration of the genome of most retroviruses requires that the cell go through an M phase, and thus only mitotic cells will serve successfully as hosts for integration of most retroviruses. (However, there is a recent generation of retrovirus vectors based upon HIV, which can integrate into postmitotic cells. As lineage analysis is designed to ask about the fate of daughter cells, infection of postmitotic cells is not desirable.) Most vectors began as proviruses that were cloned from cells infected with a naturally occurring retrovirus. Although extensive deletions of proviruses were made, vectors retain the cis-acting viral sequences necessary for the viral life cycle. These include the packaging sequence (necessary for recognition of the viral RNA for encapsidation into the viral particle), reverse transcription signals, integration signals, viral promoter, enhancer, and polyadenylation sequences. A cDNA can thus be expressed in a vector using the transcription regulatory sequences provided by the virus (although see below for further discussion of this point). Since replication-incompetent retrovirus vectors usually do not encode the structural genes whose products comprise the viral particle, these proteins must be supplied through complementation. The products of the genes gag, pro, pol, and env are typically supplied by "packaging" cell lines or by cotransfection with packaging constructs into highly transfectable cell lines (for review see Cepko and Pear in Ausubel et al., 1997).
Packaging cell lines are stable lines that contain the gag, pro, pol, and env genes as a result of the introduction of these genes by transfection. However, these lines do not contain the packaging sequence on the viral RNA that encodes the structural proteins. Thus, the packaging lines, or cells transfected with packaging constructs, make viral particles that do not contain the genes gag, pro, pol, or env.
Retrovirus vector particles are essentially identical to naturally occurring retrovirus particles. They enter the host cell via interaction of a viral envelope glycoprotein (a product of the viral env gene) with a host cell receptor. The murine viruses have several classes of env glycoprotein which interact with different host cell receptors. The most useful class for lineage analysis of rodents is the ecotropic class. The ecotropic env glycoprotein allows entry only into rat and mouse cells via the ecotropic receptor on these species. It does not allow infection of humans, and thus is considered relatively safe for gene transfer experiments. The first packaging line commonly in use was the ψ2 line. It encodes the ecotropic env gene and makes high titers of vectors. However, it can also lead to the production of helper virus (discussed below). A second generation of ecotropic packaging lines, ψCRE, GP+E-86, and ψE, have not been reported to lead to production of helper virus to date. A third generation of "helper-free" packaging lines, exemplified by the ecotropic lines Bosc23 and Phoenix (Gary Nolan, Stanford University), were made in 293T cells, and have the advantage over the earlier lines of giving high titer stocks transiently after transfection. Similarly, cotransfection of 293T cells with packaging constructs and vectors can lead to the transient production of high titer stocks. The first two generations of packaging lines, which are based upon mouse fibroblasts, require production of stably transduced lines for production of high titer stocks.
For infection of non-rodent species, an envelope glycoprotein other than the ecotropic glycoprotein must be used to allow entry into the host cells. The one that endows the greatest host range is the VSV G glycoprotein, which allows infection of most species, including fish. The G protein apparently also makes for a more stable particle, which allows for greater concentration of the virus preparations. For lineage analysis of avian species, packaging lines and vectors based upon avian retroviruses are available [12-14]. In addition, we recently found that avian retroviruses with the VSV G protein on their surface were more efficient at infecting chick embryos (Figure 1). Such virions gave the same titer as those with the avian A type env protein when they were titered on avian cells in vitro. However, when injected in vivo, the particles carrying VSV G gave an approximately 350-fold more efficient infection, as judged by the number of clones in the retina, than the particles carrying the avian A env protein. Similar increases in efficiency were noted throughout the embryo. We interpret these data to mean that cells within the avian embryo do not express high enough levels of the receptor for the A type env to be readily infected, but are not limited with respect to the ubiquitous phospholipid receptor for the VSV G protein. We did not see this effect of increased efficiency of infection using murine vectors and murine embryos, but it is possible that different mouse strains vary in this regard. Gaiano et al. found that murine virions carrying VSV G gave a slightly different spectrum of cell types following infection of the early mouse brain than virions carrying the murine ecotropic env.
Two other parameters to be considered when choosing a vector for lineage analysis are the reporter gene and the promoter that drives its expression. The reporters that have been used include cytoplasmic lacZ, nuclear lacZ, human placental alkaline phosphatase (PLAP), and avian gag. More recently, vectors encoding green fluorescent protein (GFP) have become available. We have found advantages and disadvantages in the use of each of these reporters. When deciding which reporter to use, we first consider the background activities in the tissue of interest. Although the lacZ gene gives a stable, reliable, and specific signal in most cells using Xgal detection, there is problematic β-galactosidase background in a few tissues. Control staining with Xgal thus should be done to determine if it is a problem in an area of interest. Changes in the fixation and staining conditions can reduce β-galactosidase background. Similarly, PLAP staining is reliable and stable, with heat treatment of the infected tissue rendering most endogenous alkaline phosphatases inactive. However, in some tissues residual alkaline phosphatases cause a background problem. In such cases, inhibitors of endogenous alkaline phosphatases may solve the problem. When using GFP, the signal from the introduced GFP can be weak, and thus background fluorescence in the tissue can be a problem. We have found that tissue sections prepared on a vibratome yield bright GFP+ cells, but the same tissue sectioned on a cryostat gives much dimmer and more ill-defined GFP+ cells.
The second issue to consider is how one wishes to define the cell types expressing the reporter gene. In some cases, the morphology of the infected cells indicates their identity. In such cases, we recommend that one use lacZ or PLAP with histochemical detection, which is the simplest and most rapid way to find the infected cells. Moreover, the Xgal precipitate formed by β-galactosidase and the X-P/NBT product of PLAP are stable for months of storage, allowing one time to analyze many sections. However, there are differences between lacZ and PLAP that will direct your choice of which one to use for morphological identification. LacZ typically does not fill the cell bodies of large cells, such as neurons, as completely as PLAP. Thus, when it is desirable to characterize cells via their morphology, PLAP, which associates with the plasma membrane, is superior to lacZ. PLAP is also the most sensitive of the reporters that we have used, most likely due to the fact that it is a very stable enzyme. However, PLAP can produce such a dense stain that, when clonally related cells are close together, we have been unable to count the number of cells in a clone or distinguish the morphologies of individual cells (see  for an example). In such cases, nuclear lacZ is very useful.
If one cannot use morphological criteria to identify the types of cells carrying the reporter gene, one option is to use immunohistochemistry to detect defining cellular antigens. The Xgal product of lacZ and the reaction product of PLAP make it difficult to detect a fluorescent immunohistochemical product as they absorb fluorescence. Moreover, the Xgal product and the X-P/NBT precipitate produced by PLAP often are too dark to allow simultaneous detection of another colored precipitate produced by immunohistochemical detection of a cellular antigen. However, occasionally, this will work (e.g. see [15, 27]). One can use double immunohistochemical procedures by employing antisera to detect PLAP or lacZ. However, double immunohistochemical procedures are much more time consuming than histochemical procedures, and immunohistochemical detection of lacZ and PLAP is not always as sensitive as the histochemical procedures. For all of these reasons, GFP is a better choice. GFP allows simultaneous detection of the GFP reporter and an immunohistochemical signal (e.g. see ). However, as mentioned above, GFP expression is sometimes weak and, in addition, the GFP signal is not preserved when sections are stored over a long period of time (i.e. months). The newest reporter to be described, β-lactamase, may offer advantages over GFP in terms of sensitivity, but it has not yet been tested in vivo, where it may suffer from leakage of the product from infected cells. If this is a problem, future substrates might overcome this limitation.
The choice of the promoter to drive expression of a reporter gene also requires consideration. In order to see all the progeny of infected cells, a constitutive promoter should be used. We have had success using the LTR of Moloney Murine Leukemia Virus (Mo-MLV) for work in rats and mice and the LTR of Avian Leukosis Virus for work in chicks. We compared several alternative promoters located internal to the Mo-MLV LTR, in the context of a wild type LTR promoter and in the context of an LTR promoter with an enhancer deletion, using infections of murine tissue in vivo [30, 31]. The LTR promoter performed the best of several promoters tested, including the human histone 4, chicken β-actin, CMV immediate early, and SV40 early promoters. However, Gaiano et al. recently reported that the Mo-MLV promoter was relatively inactive in early (E8.5 to E9.5) progenitor cells of the CNS. This problem is reminiscent of the failure of the LTR to express in embryonic stem cells and preimplantation embryos, which appears to be due to inhibition of the LTR in stem cells. Gaiano et al. found that an internal promoter of EF1α or CMV/β-actin resulted in more expression in early progenitor cells as well as in later neurons. These findings suggest that one should test for stable expression in the area of interest, using the infection time and site that will be used for future experiments, before choosing the promoter. However, even after performing such preliminary experiments, and even with the choice of an apparently constitutive promoter, it is important to restrict one's conclusions about lineal relationships to cells that are marked and not to make assumptions about their relationships to cells that are unmarked. We have found, in some clones, that not all cells express a reporter gene. This is true even in control situations, such as in clones of NIH-3T3 fibroblasts infected with either a lacZ or PLAP vector in vitro.
This observation has been made using several different promoters and vector designs.
PRODUCTION OF VIRUS STOCKS FOR LINEAGE ANALYSIS
Any of the aforementioned packaging systems and vectors can be used to produce vector stocks for lineage analysis. All stocks should be assayed for the presence of helper virus. A detailed description of protocols for making stocks, for titering and concentrating them, and for checking for helper virus contamination has been published and will not be given here (see Cepko and Pear in Ausubel et al., 1997  and reference ). The components for producing such stocks are generally available from the laboratories that have constructed them. For example, we have deposited stable lines that produce the murine replication-incompetent vectors that encode the histochemical reporter genes, lacZ and PLAP, at the ATCC in Rockville, Md. The ψ2 and ψCRE producers of BAG, a lacZ virus that we have used for lineage analysis, can be obtained by anyone and are listed as ATCC CRL 1858 (ψCRE BAG) and 9560 (ψ2 BAG). Similarly, ψ2 producers of DAP, a vector encoding PLAP (described further below), are available as CRL 1949. For reasons that are unclear, the DAP line is more reliable for production of high titer stocks. Both of these vectors transcribe the reporter gene from the Mo-MLV LTR promoter and are generally useful for expression of the reporter gene in most tissues (though see discussion of promoters above). For lineage applications it is usually necessary to concentrate virus in order to achieve sufficient titer. This is typically due to a limitation in the volume that can be injected at any one site. Viruses can be concentrated fairly easily by a relatively short centrifugation step. Virions also can be precipitated using polyethylene glycol or ammonium sulfate, and the resulting precipitate collected by centrifugation. Finally, the viral supernatant can be concentrated by centrifugation through a filter that allows only small molecules to pass (e.g. Centricon filters).
Regardless of the protocols used, one must keep in mind that the murine ecotropic and amphotropic retroviral particles, as well as the avian retroviral particles, are fragile, with short half-lives even under optimum conditions. In contrast, if one uses the VSV G protein as the viral envelope protein, the virions are more stable. In order to prepare the highest titered stock for multiple experiments, we usually concentrate several hundred milliliters of producer cell supernatant. These concentrated stocks are titered and tested for helper virus contamination, and can be stored indefinitely at -80°C or in liquid N2 in small (10- to 50-μl) aliquots. If we anticipate storage of over several months, we place a drop of mineral oil (e.g. a fresh tube of PCR oil) on top of the stock to prevent dehydration.
REPLICATION-COMPETENT HELPER VIRUS
Replication-competent virus is sometimes referred to as helper virus as it can complement ("help") a replication-incompetent virus and thus allow it to spread from cell to cell. It can be present in an animal through exogenous infection (e.g. from a viremic animal in the mouse colony), expression of an endogenous retroviral genome (e.g. the akv loci in AKR mice), or through recombination events in an infected cell that occur between 2 viral RNAs encapsidated in retroviral virions produced during packaging. The presence of helper virus is an issue of concern when using replication-incompetent viruses for lineage analysis as it can lead to horizontal spread of the marker virus, creating false lineage relationships. The most likely source of helper virus is the viral stock used for lineage analysis. The genome(s) that supplies the gag, pol, and env genes in packaging lines does not encode the ψ sequence, but can still become packaged, although at a low frequency. If it is co-encapsidated with a vector genome, recombination in the next cycle of reverse transcription can occur. If the recombination allows the ψ− genome to acquire the ψ sequence from the vector genome, a recombinant that is capable of autonomous replication is the result. This recombinant can spread through the entire culture (although slowly due to envelope interference). Once this occurs, it is best to discard the producer clone or stock as there is no convenient way to eliminate the helper virus. As would be expected, recombination giving rise to helper virus occurs with greater frequency in stocks with high titer, with vectors that have retained more of the wild type sequences (i.e. the more homology between the vector and packaging genomes, the more opportunity there is for recombination), and within stocks generated using gagpol and env on a single complementing genome during transfection (as opposed to 2 separate genomes, one for gagpol and one for env).
Note that a murine helper genome itself will not encode a histochemical marker gene as apparently there is not room, or flexibility, within murine viruses that allows them to be both replication-competent and capable of expressing another gene like lacZ. The way that spread would occur is by a cell being infected with both the lacZ virus and a helper virus. Such a doubly-infected cell would then produce both viruses.
When performing lineage analysis, there are several signs that can indicate the presence of helper virus within an individual animal. If one allows an animal to survive for long periods of time after inoculation, particularly if embryos or neonates are infected, the animal is likely to acquire a tumor when helper virus is present. Most naturally occurring replication-competent viruses are leukemogenic, with the disease spectrum being at least in part a property of the viral LTR. Secondly, if one analyzes either short or long term after inoculation, the clone size, clone number, and spectrum of labelled cells may be indicative of helper virus. For example, the eye of a newborn rat or mouse has mitotic progenitors for retinal neurons, as well as mitotic progenitors for astrocytes and endothelial cells. By targeting the infection to the area of progenitors for retinal neurons, we only rarely see infection of a few blood vessels or astrocytes as their progenitors are outside of the immediate area that is inoculated and they only get infected by leakage of the viral inoculum from the targeted area. However, if helper virus were present, we would see infection of a high percentage of astrocytes, blood vessels, and eventually, other eye tissues since virus spread would eventually lead to infection of cells outside of the targeted area. One would expect to see a correlation between the % of such non-targeted cells that are infected and the degree to which their progenitors are mitotically active after inoculation, due to the fact that infection requires a mitotic target cell. If one were to examine tissues other than ocular tissues, one would similarly see evidence of virus spread to cells whose progenitors would be mitotically active during the period of virus spread. In addition, the size and number of "clones" may also appear to be too large for true "clonal" events if helper virus were present. 
This interpretation of course relies upon some knowledge of the area under study.
DETERMINATION OF SIBLING RELATIONSHIPS
When performing lineage analysis, it is critical to unambiguously define cells as descendants of the same progenitor. This can be relatively straightforward when sibling cells remain rather tightly, and reproducibly, grouped. An example of such a straightforward case is the rodent retina, where the descendants of a single progenitor migrate to form a coherent radial array [31, 32]. The 2 analyses described below were applied to the rodent retina, and are applicable in any system where clones are arranged simply and reproducibly. The first assay is to perform a standard virological titration in which a particular viral inoculum is serially diluted and applied to tissue. In the retina, the number of radial arrays, their average size, and their cellular composition were analyzed in a series of animals infected with dilutions that covered a 3 log range. The number of arrays was found to be linearly related to the inoculum size, while the size and composition were unchanged. Such results indicate that the working definition of a clone, in this case a radial array, fulfilled the statistical criteria expected of a single hit event.
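The single-hit criterion can be checked numerically. The following is a minimal sketch, assuming clone counts are available for each dilution: under single-hit kinetics the count scales with the first power of the inoculum, so the slope of log(count) against log(inoculum) should be near 1, whereas a requirement for two particles per array would give a slope near 2. The function name and the example numbers are illustrative placeholders, not data from the retina experiments.

```python
import math

def loglog_slope(inocula, clone_counts):
    """Least-squares slope of log10(clone count) versus log10(relative inoculum)."""
    if len(inocula) != len(clone_counts) or len(inocula) < 2:
        raise ValueError("need at least two paired observations")
    if any(v <= 0 for v in inocula) or any(c <= 0 for c in clone_counts):
        raise ValueError("inocula and counts must be positive to take logarithms")
    xs = [math.log10(v) for v in inocula]
    ys = [math.log10(c) for c in clone_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Placeholder dilution series spanning 3 logs (hypothetical counts).
relative_inoculum = [1.0, 0.1, 0.01, 0.001]
arrays_counted = [1000, 100, 10, 1]
print(round(loglog_slope(relative_inoculum, arrays_counted), 2))
```

A slope close to 1, together with an unchanged clone size and composition across dilutions, is what the titration assay described above is looking for.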
The second assay is to perform a mixed infection using 2 different retroviruses in which the histochemical reporter genes are distinctive. Two such viruses might encode cytoplasmically localized vs. nuclear-localized β-gal. This can work when the cytoplasmically localized β-gal is easily distinguished from the nuclear-localized β-gal [19, 33]. We have found that this is not the case in rodent nervous system cells as the cytoplasmically localized β-gal quite often is restricted to neuronal cell bodies and is therefore difficult to distinguish from nuclear-localized β-gal. In order to overcome this problem, we created the aforementioned DAP virus, which is distinctive from the lacZ-encoding BAG virus. A stock containing BAG and DAP was produced by ψ2 producer cells grown on the same dish. The resulting supernatant was concentrated and used to infect rodent retina. The tissue was then analyzed histochemically for the presence of blue (due to BAG infection) and purple (due to DAP infection) radial arrays. If radial arrays were truly clonal, then each one should be only 1 color. Analysis of approximately 1100 arrays indicated that most were clonal. However, 5 comprised both blue and purple cells. This value will be an underestimate of the true frequency of incorrect assignment of clonal boundaries as sometimes 2 BAG or 2 DAP virions will infect adjacent cells and thus not lead to formation of bi-colored arrays. A closer approximation of the true frequency can be obtained by using the following formula (for derivation, see Fields-Berry et al. 1992):
$$\%\ \text{errors} = \frac{\text{number of bi-colored arrays}}{\text{total number of arrays}} \times \frac{(a+b)^2}{2ab} \times 100$$
where a and b are the relative titers comprising the virus stock. The relative titer of BAG and DAP used in the co-infection was 3:1 and thus the value for percent errors in clonal assignments was 1.2%.
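As a check on the arithmetic, the correction can be sketched in Python. The sketch assumes that the observed fraction of bi-colored arrays is divided by 2ab/(a+b)^2, the probability that two independent virions carry different markers. This form reproduces the 1.2% reported here, but it is a reconstruction rather than a quotation of the published formula, and the function and parameter names are illustrative.

```python
def percent_clonal_errors(bicolored, total, a, b):
    """Estimate the percentage of arrays that arose from more than one infection.

    bicolored: number of arrays containing both markers
    total:     total number of arrays scored
    a, b:      relative titers of the two marker viruses in the stock
    """
    if total <= 0 or a <= 0 or b <= 0:
        raise ValueError("total and relative titers must be positive")
    observed_fraction = bicolored / total
    # Assumed model: two independent hits differ in color with probability 2ab/(a+b)^2,
    # so same-colored double hits are invisible and the observed value is scaled up.
    p_different_markers = 2 * a * b / (a + b) ** 2
    return 100 * observed_fraction / p_different_markers

# Values from the text: 5 bi-colored arrays out of about 1100, BAG:DAP = 3:1.
print(round(percent_clonal_errors(5, 1100, 3, 1), 1))  # 1.2
```

With equal titers the correction factor is 2, its minimum; the more unequal the two titers, the larger the share of double infections that go undetected.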
The value of 1.2% for errors in assignment of clonal boundaries includes errors due to both aggregation and independent virions (e.g. perhaps due to helper-virus mediated spread) infecting adjacent progenitors. The percent errors in other areas of an animal will depend upon the particular circumstances of the injection site, and upon the multiplicity of infection (MOI, the ratio of infectious virions to target cells). Most of the time MOI will be quite low (e.g. in the retina it was approximately 0.01 at the highest concentration of virus injected). Concerning the injection site, injection into a lumen, such as the lateral ventricles, should not promote aggregation nor high local MOI, but injection into solid tissue in which the majority of the inoculum has access to a limited number of cells at the inoculation site, could present problems. By co-injecting BAG and DAP, one can monitor the frequency of these events and thus determine if clonal analysis is feasible.
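To give a feel for why a low MOI keeps coincident infections rare, the sketch below assumes that virions infect target cells independently, so that the number of virions per cell follows a Poisson distribution with mean equal to the MOI. This is an idealization and is not part of the original analysis; in particular, it ignores aggregation and uneven access at the injection site, which are exactly the situations the BAG/DAP co-injection is meant to detect.

```python
import math

def fraction_multiply_infected(moi):
    """Among infected cells, the fraction hit by 2 or more virions (Poisson assumption)."""
    p_zero = math.exp(-moi)        # cell receives no virion
    p_one = moi * math.exp(-moi)   # cell receives exactly one virion
    p_infected = 1.0 - p_zero
    p_multiple = 1.0 - p_zero - p_one
    return p_multiple / p_infected

# MOI of approximately 0.01, as estimated for the retina at the highest
# concentration of virus injected.
retina_fraction = fraction_multiply_infected(0.01)
print(f"{100 * retina_fraction:.2f}% of infected cells")
```

Under this assumption, roughly half a percent of infected cells would receive more than one virion at an MOI of 0.01. A measured error rate well above such a figure would point to aggregation or a high local MOI rather than chance.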
An error rate as small as 1.2% does not affect the interpretation of "clones" that are frequently found in a large data set. However, as with any experimental procedure that relies in some way on statistical analysis, rare associations of cell types must be interpreted with some caution and conclusions cannot be drawn independently of other data.
The above analysis was performed using viruses that were produced on the same dish and concentrated together. This was done as we felt that the most likely way that 2 adjacent progenitors might become infected would be through small aggregates of virions. Aggregation most likely occurs during the concentration step, as one often can see macroscopic aggregates after resuspending pellets of virions. Thus, when the 2-marker approach is used to analyze clonal relationships, it is best to co-concentrate the 2 vectors in order for the assay to be sensitive to aggregation due to this aspect of the procedure. (Although aggregation of virions may frequently occur during concentration, it apparently does not frequently lead to problems in lineage analysis, presumably due to the high ratio of non-infectious particles to infectious particles found in most retrovirus stocks. It is estimated that only 0.1-1.0% of the particles will generate a successful infection. Moreover, most aggregates are probably not efficient as infectious units; it must be difficult for the rare infectious particle(s) within such a clump to gain access to the viral receptors on a target cell.)
METHODS FOR DETERMINING LINEAGE ANALYSIS
In order to determine the ratio of 2 genomes present in a mixed virus stock (e.g. BAG plus DAP), there are several methods that can be used. The first 2 methods are performed in vitro, and are simply an extension of a titration assay. Any virus stock is normally titered on NIH 3T3 cells to determine the amount of virus to inject. The infected NIH 3T3 cells are then either selected for the expression of a selectable marker when the virus encodes such a gene (e.g. neo in BAG and DAP), or are stained directly, histochemically, for β-gal or PLAP activity without prior selection in drugs. If no selection is used, the relative ratio of the two markers can be scored directly by evaluating the number of clones of each color on a dish. Alternatively, selected G418-resistant colonies can be stained histochemically for both enzyme activities and the relative ratio of blue vs. purple G418-resistant colonies computed. A third method of evaluating the ratio of the 2 genomes is to use the values observed from in vivo infections. After animals are infected and processed for both histochemical stains, the ratio of the 2 genomes can be determined by counting the number of clones, or infected cells, of each color. When all the above methods were applied to lineage analysis in mouse retina and rat striatum, the value obtained for the ratio of G418-resistant colonies scored histochemically was almost identical to the ratio observed in vivo. Directly scoring histochemically stained, non-G418-selected NIH 3T3 cells led to an underestimate of the number of BAG-infected colonies, presumably because such cells often are only faint blue, while DAP-infected cells are usually an intense purple. In vivo, this is not generally the case, as BAG-infected cells are usually deep blue.
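Whichever scoring method is chosen, the relative titers reduce to a simple proportion of colony (or clone) counts. The sketch below uses hypothetical counts and an illustrative function name; it is not taken from the original protocol.

```python
def relative_titers(blue_count, purple_count):
    """Return the fractions of BAG (blue) and DAP (purple) events in a mixed stock."""
    total = blue_count + purple_count
    if total == 0:
        raise ValueError("no colonies were scored")
    return blue_count / total, purple_count / total

# Hypothetical counts of G418-resistant colonies stained for both enzymes.
bag_fraction, dap_fraction = relative_titers(blue_count=150, purple_count=50)
print(f"BAG:DAP = {bag_fraction / dap_fraction:.1f}:1")
```

The two fractions can be used directly as the relative titers when correcting the observed frequency of bi-colored arrays. Because faintly stained BAG colonies can be missed on unselected dishes, counts from G418-selected colonies or from in vivo clones are likely the safer inputs.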
The method of injecting 2 distinctive viruses is a straightforward and feasible method of assessing clonal boundaries when they are fairly easy to define. This method does not require circumstances where there is a wide range of dilutions that can be injected to give countable numbers of events, which is required for a reliable dilution analysis. Moreover, it does not rely as critically on controlling the exact volume of injection, as is required in the dilution analysis. The use of a small number of vectors that encode distinctive histochemical products for definition of sibling relationships is only appropriate when there is very little migration of sibling cells. In these cases, the arrangement of cells that will be used to identify clonal relationships can be defined, and then this definition can be tested as described above. The fit of the definition with true clonal relationships will be revealed by the percentage of defined "clones" that are of more than one color. When the error is too great, one can re-evaluate the criteria, make a new definition, and again test it by looking for "clones" that are more than one color. Through trial and error, an accurate definition of sibling relations should be possible when migration is not too great.
When one cannot accurately define clonal relationships with a few distinctive viruses, a much greater number of vectors must be used. One can employ a library of retroviral vectors, each member of which is tagged with a unique small insert of an irrelevant DNA. Each vector is scored using the polymerase chain reaction (PCR). The library/PCR method is tedious but extremely worthwhile when dealing with problematic areas.
Regardless of which method is used to score sibling relationships, one further recommendation to aid in the assignments is to choose an injection site which will allow the inoculum to spread. If one injects into a packed tissue, the viral inoculum will most likely infect cells within the injection tract and it will be very difficult to sort out sibling relationships. For example, a lumen, such as the neural tube, provides an ideal site for injection. Regardless of site, one must inject such that the virus has clear access to the target population; the virus will bind to cells at the injection site and will not gain access to cells that are not directly adjoining that site.
The procedures described below are those that we have used for infection of rodents and chick embryos, histochemical processing of tissue for β-gal and PLAP, and preparation and use of a library for the PCR method of defining clonal relationships.
INFECTION OF RODENTS
Injection of virus in utero in rodents
The following protocols may be used with rats or mice. Note that clean, but not aseptic, technique is used throughout. We routinely soak instruments in 70% EtOH before operations, use the sterile materials noted, and include Penicillin/Streptomycin (final concentrations of 100 units/ml each) in the lavage solution. We have not had difficulty with infection using these techniques.
1. Mix ketamine and xylazine 1:1 in a 1 ml syringe with 27-gauge needle; lift the animal's tail and hindquarters with one hand and with the other inject 0.05 ml (mice) or 0.18 ml (rats) of anesthetic mixture intraperitoneally.
One or more additional doses of ketamine alone (0.05 ml for mice and 0.10 ml for rats) are usually required to induce or maintain anesthesia, particularly if the procedure takes over one hour. Respiratory arrest and spontaneous abortion appear to occur more often if a larger dose of the mixture is given initially, or if any additional doses of xylazine are given.
2. Remove hair over entire abdomen using depilatory agent (any commercially available formulation, such as Nair, works well); shaving of remaining hair with a razor may be necessary. Wash skin several times with water, then with 70% EtOH, and allow to dry.
3. Place animal on its back in support apparatus.
We find that a slab of styrofoam with two additional slabs glued on top to create a trough works well for this purpose. With the trough appropriately narrow, no additional restraint is then needed to hold the anesthetized animal.
4. Make midline incision in skin from xyphoid process to pubis using scalpel, and retract; attaching retractors firmly to styrofoam support will create a stable working field. Stop any bleeding with cotton swabs before carefully retracting fascia and peritoneum, and incising them in the midline with scissors (care is required here not to incise underlying bowel). Continue incision cephalad along midline of fascia (where there are few blood vessels) to expose entire abdominal contents. If necessary to expose the uterus, gently pack the abdomen with cotton balls or swabs to remove the intestines from the operative field, being careful not to lacerate or obstruct the bowel. Fill the peritoneal cavity with lactated Ringer's solution (LR), and lavage until clear if the solution turns at all turbid.
Wide exposure is important to allow the later manipulations. During the remainder of the operation, keep the peritoneal cavity moist and free of blood; dehydration or blood around the uterus increases the rate of post-operative abortion.
5. Elevate the embryos one at a time out of the peritoneal cavity, and transilluminate with a fiberoptic light source to visualize the structure to be injected. For lateral cerebral ventricular injections, for example, the cerebral venous sinuses serve as landmarks.
When deciding on a structure to inject, keep in mind that free diffusion of virus solution throughout a fluid-filled structure lined with mitotic cells is best for ensuring even distribution of viral infection events throughout the tissue being labelled. The neural tube is an example of such a structure; when virus is injected into one lateral ventricle, it is observed to quickly diffuse throughout the entire ventricular system.
6. Using a heat-drawn glass micropipette attached to an automatic microinjector, penetrate the uterine wall, extraembryonic membranes, and the structure to be infected in one rapid thrust; this minimizes trauma and improves survival. Once the pipette is in place, inject the desired volume of virus solution, usually 0.1-1.0 μl.
When injecting through the uterine wall, all embryos may potentially be injected except those most proximal to the cervix on each side (injection of these greatly increases the rate of post-operative abortion). In practice, it is often not advisable to inject all possible embryos if excessive uterine manipulation would be required. At the earliest stages at which this technique is feasible (E12 in the mouse or E13 in the rat), virtually any uterine manipulation may cause abortion, so any embryo which cannot be reached easily should not be injected.
7. Once all animals have been injected, lavage the peritoneal cavity until it is clear of all blood and clots, ensure that all cotton balls and swabs have been removed, and move retractors from the abdominal wall/fascia to the skin.
Filling the peritoneal cavity with LR with Pen/Strep before closing increases survival significantly, probably by preventing maternal dehydration during recovery from anesthesia as well as preventing infection.
8. Using 3-0 Dexon or silk suture material on a curved needle, suture the peritoneum, abdominal musculature, and fascia from each side together, using a continuous locking stitch.
After closing the fascia, again lavage using LR/Pen/Strep.
9. Close skin using surgical staples (such as the Clay-Adams Autoclip) placed 0.5 cm apart.
Sutures may also be used, but these require much more time (often necessitating further anesthesia which increases abortion risk) and are frequently chewed off by the animal, resulting in evisceration.
10. Place animal on its back in the cage and allow anesthesia to wear off.
Ideally, the animal will wake up within one hour of the end of the operation. Increasing time to awakening results in increasing abortion frequency. Food and water on the floor of the cage should be provided for the immediate post-operative period.
11. Mothers may be allowed to deliver vaginally, or the progeny may be harvested by Caesarean section. Maternal and fetal survival are approximately 60% at early embryonic ages of injection, and increase with gestational age to virtually 100% after postnatal injections.
Injection of virus into mice using exo utero surgery
Injections into small or delicate structures (such as the eye) require micropipettes which are too fine to penetrate the uterine wall. In addition, it is impossible to precisely target many structures through the rather opaque uterine wall. These problems can be circumvented, though with a considerable increase in technical difficulty and decrease in survival, by use of the exo utero technique. The procedure is similar to that detailed above, with the following modifications to free the embryos from the uterine cavity:
1. The technique works well in our hands only with outbred, virus-antigen free CD-1 and Swiss-Webster mice, but even these strains may have different embryo survival rates when obtained from different suppliers or different colonies of the same supplier.
This variability presumably results from subclinical infections which may render some animals unable to survive the stress of the operation. We have had no success with this technique in rats.
2. After the uterus is exposed and before filling the peritoneum with LR, incise the uterus longitudinally along its ventral aspect with sharp microscissors. The uterine muscle will contract away from the embryos, causing them to be fully exposed, surrounded by their extraembryonic membranes.
3. Only two embryos in each uterine horn can be safely injected, apparently because of trauma induced by neighboring embryos touching each other. Using a dry sterile cotton swab, scoop out all but 2 embryos, each with its placenta and extraembryonic membranes, and press firmly against the uterus where the placenta was removed for 30-40 seconds to achieve hemostasis.
It is very important to stop all bleeding before proceeding. From this point on, the embryos must be handled extremely gently, as only the placenta is tethering an embryo to the uterus, and it tears easily.
4. Fill the peritoneal cavity with LR, and cushion each embryo to be injected with sterile cotton swabs soaked in LR. Keeping the embryos submerged throughout the remainder of the procedure is essential to their survival.
5. The injection should then be done with a pneumatic microinjector and heat-pulled glass micropipette. This may usually be done by puncturing the extraembryonic membranes first and then the structure to be injected; for some very delicate injections it may be necessary to make an incision in the extraembryonic membranes, which is then closed with 10-0 nylon suture after the injection.
INFECTION OF CHICK EMBRYOS
The following description is an example of an infection protocol used for the chick neural tube. More details of infection protocols for chick embryos can be found in Morgan and Fekete, 1996.
1. Fertilized virus-free White Leghorn chicken eggs were obtained from SPAFAS (Norwich, CT) and kept at 4°C for one week or less until they were transferred to a high-humidity, rocking incubator (Petersime, Gettysburg, Ohio) at 38°C; the time of transfer was designated time 0. Line O eggs are the most desirable hosts as they do not encode any endogenous retroviruses that could lead to helper virus generation.
2. In order to prevent the embryo from sticking to the shell, it is useful to lower the embryo by removing albumin at an early stage. To accomplish this, set the eggs on their sides and rinse with 70% ethanol. Poke a hole in both ends of the egg as it lies on its side using a sharp pair of forceps, scissors, or needle. Use a 5 ml syringe and 21-gauge needle to withdraw 1.5 ml of albumin from the pointed end of the egg. Angle the needle so as not to disrupt the yolk by pointing it down and by not putting it too deep into the egg. Cover both holes with clear tape and return to the incubator. Alternatively, you can locate and stage the embryo by cutting a hole on the top (side facing up as it lies on its side) with curved scissors. Remove shell to make a hole 1/2 to 1" in diameter. Locate the embryo and then enlarge the hole to allow easy access to the embryo. If the embryo is to be used later, cover the hole with clear tape and return to the incubator.
3. At approximately 18 to 42 hours incubation (Hamburger and Hamilton stage 10-17), embryos are injected with 0.1 to 1.0 μl of viral inoculum including 0.25% fast green dye. The inoculum is delivered by injection directly into the ventricular system, which is easily accessed at these early times. For delivery, the inoculum is loaded into a glass micropipette, made as above for rodent injections. A micromanipulator and Stoelting microsyringe pump (catalogue # 51219) are convenient and deliver the maximum volume in about 1 minute. After injection, cover the hole with clear tape and return to the incubator.
Infected chick or rodent embryos can be incubated to any desired point. Chicks can be allowed to hatch and rodents can be delivered by Caesarean section and reared to maturity. Embryonic brains are dissected in PBS followed by overnight fixation in 4% paraformaldehyde in PBS (pH 7.4) at 4°C. Posthatch or postnatal animals are perfused with the same fixative and are incubated overnight in the fixative. They are then washed overnight in three changes of PBS and cryoprotected in 30% sucrose. After cryoprotection, the brains are embedded in OCT medium and cut on a Reichert-Jung 3000 cryostat at 60-90 μm. Sections are histochemically stained for the appropriate marker and are mounted with gelvatol. For details of the histochemical reaction for lacZ or PLAP, see Cepko et al. in .
To prepare the 4% paraformaldehyde fixative: heat H2O to 60°C and add the paraformaldehyde. Add NaOH until the paraformaldehyde goes into solution. Cool to room temperature, add 1/10 volume of 10X PBS, and adjust the pH with HCl. The solution can be stored at 4°C for several weeks.
PREPARATION AND USE OF A RETROVIRAL LIBRARY FOR LINEAGE ANALYSIS USING PCR AND SEQUENCING
We developed a direct approach to address lumping and splitting errors by constructing a library of viruses that was analyzed by PCR. In our first libraries, each virus of the library carried one member from a pool of approximately 100 DNA fragments from Arabidopsis thaliana DNA, in addition to the lacZ or PLAP gene. Infected cells, recognized by their enzyme activity, were mapped and the positive cells cut from cryosections. The A. thaliana DNA was amplified by PCR and characterized by size and restriction enzyme digestion patterns. If the size and restriction digestion pattern of the PCR product from two or more cells was the same, they were considered siblings with a probability calculated on the basis of the number of infections in that brain and the complexity of the library [41, 42]. Lineage analysis using such libraries revealed novel lineal relationships in the rat cerebral cortex [41, 43, 44] and chick diencephalon. However, the limited number of unique members in the library made from A. thaliana DNA restricted the analysis to tissues with low infection rates.
More data could be acquired with each experiment and additional questions could be addressed in the central nervous system and other tissues with a more complex library containing a greater number of DNA tags. We therefore constructed several retroviral vectors, of which CHAPOL (chick alkaline phosphatase with oligonucleotide library) is the prototype, that include degenerate oligonucleotides with a theoretical complexity of 1.7 x 10^7. Studies in the developing nervous system of the chick have been successfully completed using CHAPOL [47, 48].
A summary of the production of CHAPOL and BOLAP (an oligo library in a murine vector) will be given here; a detailed description of the construction of CHAPOL can be found elsewhere. For either avian or murine retroviruses, the overall strategy is the same. A population of double-stranded DNA molecules that includes a short degenerate region, [(G or C)(A or T)]12 (12 repeats of the two-base unit, i.e. 24 degenerate positions), is generated by PCR amplification of a chemically constructed single-stranded oligonucleotide population of the same sequence. The oligo preparation is ligated into a retrovirus vector and a preparation of highly competent E. coli is transformed. The library is then grown as a pool and a preparation of plasmids from the pool is made. The DNA of the pool is transfected into an avian or mammalian packaging cell line to produce a library of virus particles. The library is injected into an area to be mapped. Infected cells are detected histochemically and each infected cell is recovered for PCR amplification. Each PCR product is then sequenced. Two cells with the same sequence are considered siblings, again with a probability derived from an analysis of the frequency of recovery of each genome (see Walsh et al. 1992).
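The stated theoretical complexity follows from the design of the degenerate region: 12 repeats of (G or C)(A or T) give 24 positions with 2 choices each, or 2^24 possible tags, which is about 1.7 x 10^7. The sketch below illustrates this count and shows one way to check whether a recovered sequence fits the design. The function names are illustrative and not part of the published method.

```python
import random

# One entry per degenerate position: (G or C) then (A or T), repeated 12 times.
PATTERN = ("GC", "AT") * 12

def theoretical_complexity():
    """Number of distinct tags allowed by the degenerate design."""
    n = 1
    for choices in PATTERN:
        n *= len(choices)
    return n

def random_tag(rng):
    """Draw one tag uniformly from the design (for simulation only)."""
    return "".join(rng.choice(choices) for choices in PATTERN)

def matches_design(tag):
    """True if the sequence has the right length and allowed base at each position."""
    return len(tag) == len(PATTERN) and all(
        base in choices for base, choices in zip(tag.upper(), PATTERN)
    )

print(theoretical_complexity())          # 2**24
print(random_tag(random.Random(0)))
```

A check such as `matches_design` could help flag sequencing reads in which the tag region was misread, since a base outside the allowed pair at any position cannot have come from the library as designed. The number of tags actually present in a given virus stock may be lower than the theoretical figure.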
PREPARATION OF CHAPOL
The avian replication-incompetent virus CHAP [49, 50], encoding the human placental alkaline phosphatase (PLAP) gene, was modified to accept the oligo inserts. CHAP was linearized, purified, and mixed with the degenerate oligonucleotides in the presence of ligase, and aliquots of the resulting ligation products were used to transform E. coli DH5α. Following transformation, all aliquots were pooled. One hundred μl of the pool was plated at varying dilutions on plates containing ampicillin. The remainder of the pool was divided and added to eight 2 L flasks containing 1 L LB medium with 50 μg/ml ampicillin. The cultures were shaken overnight at 37°C. Plasmid DNA was extracted from these cultures by the Triton lysis procedure and purified on CsCl gradients.
CHAPOL DNA was transfected into the avian virus packaging line Q2bn, and the transiently produced virus was collected and concentrated. Aliquots of CaPO4 precipitates of 100 μg CHAPOL DNA were made in 10 ml of HBS. The precipitate in each aliquot was then distributed equally on ten 10-cm plates of Q2bn, and glycerol shock was carried out for 90 seconds at room temperature 4 hours later. At 24 hours post glycerol shock, the supernatants were collected and pooled. This was repeated at 48 hours. The supernatants from the 24 and 48 hour harvests were pooled and the titer calculated by infection of QT6 cells and assay of the PLAP activity as described (Cepko and Pear in Ausubel et al. 1997). The stock was filtered through a 0.45 μm filter and concentrated by centrifugation in an SW27 rotor at 4°C, 20K, for 2 hours. The concentrated stock was titered on QT6 and tested for helper virus, which proved negative. The titer of CHAPOL was determined to be 1.1 x 10^7 CFU/ml. The same stock has been used for all experiments conducted over a 5 year period and many aliquots remain. We recommend making large stocks and storing them as small aliquots.
USE OF CHAPOL
CHAPOL was used to infect the developing brain of chick embryos using the procedures outlined above. At various times later, the tissue was harvested and stained for AP activity. The outline of each section was drawn by camera lucida and the location and type of cells labeled on each section were recorded. A single cell or cluster of cells with a small group of surrounding cells were removed using a heat pulled glass micropipette (figure 2) and transferred to a 96-well PCR plate for Proteinase K digestion (as below). Following digestion, nested PCR was performed (as below). The product of each PCR was run on a 1.5% agarose gel to determine if a product of the appropriate size had been amplified. The recovery of a PCR product of the proper size occurred from PCR of 30-85% of the picks (the frequency varied depending upon the batch of PCRs and the tissue being studied) using CHAPOL. Sequencing of the oligonucleotide insert (as below) was performed on all reactions which gave the expected product on the agarose gel analysis (e.g. figure 3) and was successful approximately 75% of the time. All sequences were stored in the software program GCG (1991). All common sequences were pulled from the database created in GCG and the corresponding cells labeled. Sections were then aligned to determine the three dimensional boundaries of clonal expansion. Each type of cell (e.g. neuron, glia) was also recorded to determine the variety of cells which can arise from a single progenitor.
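The step of pulling common sequences and labeling the corresponding cells amounts to grouping picks by their tag sequence. The sketch below is a hypothetical stand-in for that bookkeeping: the pick identifiers, tags, and cell types are invented for illustration, and the original work used the GCG package rather than code like this.

```python
from collections import defaultdict

def group_picks_by_tag(picks):
    """Group picks that share a tag sequence; picks without a sequence are skipped.

    picks: iterable of (pick_id, tag_sequence_or_None, cell_type).
    Returns {tag: [(pick_id, cell_type), ...]}.
    """
    groups = defaultdict(list)
    for pick_id, tag, cell_type in picks:
        if tag is not None:  # PCR or sequencing failed for this pick
            groups[tag].append((pick_id, cell_type))
    return dict(groups)

# Hypothetical picks; short placeholder strings stand in for 24-base tags.
picks = [
    ("section3-pick1", "tagA", "neuron"),
    ("section3-pick2", "tagB", "glia"),
    ("section4-pick1", "tagA", "glia"),
    ("section4-pick2", None, "neuron"),
]

clones = group_picks_by_tag(picks)
for tag, members in clones.items():
    cell_types = sorted({cell_type for _, cell_type in members})
    print(tag, len(members), cell_types)
```

In this toy data set, the two picks carrying tagA would be treated as candidate siblings spanning two sections and containing both a neuron and a glial cell. Because picks can include surrounding unlabeled tissue and some reactions fail, the absence of a shared tag does not by itself exclude a sibling relationship.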
The value of using a complex library of vectors is illustrated by the view of more heavily infected brains (figure 4) where closely aggregated AP+ cells would have been lumped into a single clone based on proximity. In addition, even lightly infected brains could give rise to lumping errors (also see Walsh and Cepko, 1992).
Two issues are important for determining the value of this type of library of DNA markers: the number of unique members in the library and the distribution of the library members. If only 2 members exist in the library, for example, then there is a one in two chance that the same tag will be selected in two consecutive picks (if they are present in equal concentrations). If 100 members exist in the library at equal concentrations, the chance that two picks come up with the same member is reduced to 10^-2. The second important variable determining the quality of the library is the distribution of the members within the library. This can be illustrated as follows. Consider a library composed of 10^6 members, with 50% of the library composed of one member. If two neighboring or distant cells are found to carry the overrepresented insert, the probability that the two cells arose from separate clones is still 0.5. CHAPOL was found to have an equal distribution in that each of the inserts picked to date (n > 500) has occurred independently only once. One further issue to consider is the level of difficulty in using the library. We have found in practice that this method of tag identification is in fact easier than our previous method based upon the analysis of the size and restriction digestion pattern of A. thaliana DNA.
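The effect of library size and distribution can be illustrated by computing the chance that two independent infection events carry the same tag, which is the sum of the squared frequencies of the library members. This is a minimal sketch under the assumption that each infection draws a tag independently according to its frequency in the stock; it computes the chance of coincidence only and is not the full probability analysis cited in the text.

```python
def same_tag_probability(abundances):
    """Chance that two independent picks carry the same tag.

    abundances: relative abundance of each library member (any positive scale).
    """
    total = sum(abundances)
    return sum((x / total) ** 2 for x in abundances)

two_equal = same_tag_probability([1, 1])
hundred_equal = same_tag_probability([1] * 100)

# A library of 10**6 members in which one member makes up 50% of the stock
# and the remaining 999,999 members share the other 50% equally.
skewed = same_tag_probability([999_999] + [1] * 999_999)

print(two_equal, hundred_equal, skewed)
```

The first two cases reproduce the one-in-two and 10^-2 figures given above. In the skewed case the chance of coincidence stays near one in four despite the nominal complexity of 10^6, which shows why an even distribution matters as much as the number of members.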
These libraries should be useful for application in a wide range of tissues and species. The host range has been expanded to previously uninfectable hosts, and infection of non-neural tissue with CHAPOL has been observed, as one would expect based upon experience with avian and murine retroviruses.
PCR AND SEQUENCING FROM CHAPOL
Proteinase K digestion: The coverslips were removed from slides by immersion in sterile H2O. Single cells or small clusters of cells containing purple NBT precipitate with surrounding unlabeled tissue (approximately 0.5 to 2 mm tissue fragments) were scraped from the slide using a heat-pulled glass micropipette (figure 2). The cells were transferred to a 96-well PCR plate (Hybaid) with 10 μl of a proteinase K solution (50 mM KCl, 10 mM Tris-HCl pH 7.5, 2.5 mM MgCl2, 0.02% Tween 20, 200 μg/ml proteinase K). Each well was overlaid with 1 drop of light mineral oil (Sigma) and the plates were heated to 60°C for 2 hours, 85°C for 20 minutes, and 95°C for 10 minutes in a Hybaid OmniGene thermocycler.
Nested PCR: The first PCR was accomplished by adding 0.15 μl Taq polymerase (Boehringer Mannheim), 0.15 μl dNTP mix (Boehringer Mannheim), 0.75 μl each of 10 μM oligonucleotide 0 (5'TGTGGCTGCCTGCACCCCAGGAAAG3') and 10 μM oligonucleotide 5 (5'GTGTGCTGTCGAGCCGCCTTCAATG3'), 2 μl PCR buffer with MgCl2 (Boehringer Mannheim) and 16.2