Figure 1. Diagrammatic representation of S. meliloti proteins capable of being analyzed by proteome analysis. The dots represent the theoretical pI and molecular weight of proteins predicted from the S. meliloti chromosome. The box represents the pI and molecular mass window that is capable of being examined by 2-DGE analysis. (Axis label: theoretical pI.)

The procedures used to resolve, stain and quantify proteins in a gel are influenced by the experimental aims and by the type of post-gel analysis employed. Some of the major considerations that need to be understood prior to undertaking proteome analysis are briefly discussed here. S. meliloti strain 1021 encodes over 6200 open reading frames (Galibert et al. 2001) and if 50% of the genome is expressed at any one time, this would generate 3100 proteins of varying abundance. This protein number assumes that the proteins are not subjected to post-translational modifications that would lead to more than one protein product from one gene. Nevertheless, if one continues to work with this assumption, then there is a need to visualize at least 3000 individual proteins in order to obtain an overall perspective of the proteome of this organism and to conduct effective post-gel analysis of the proteins present. High loads of total protein are needed to achieve this aim, although gel resolution is compromised when loads are too high.

Another consideration is the limit of detection for colloidal Coomassie, which is around 1 µg of protein. Since Coomassie-stained protein spots are preferred for peptide mass fingerprint analysis (see below), this means that at least 3 mg of total protein needs to be separated over the entire pH range. We therefore load up to 1 mg of total protein for first-dimensional separation for each pH range and use the sequential colloidal Coomassie staining method prior to gel analysis. Although MALDI-TOF mass spectrometry (Matrix Assisted Laser Desorption Ionization Time-of-Flight MS) is our preferred strategy for protein identification via peptide mass fingerprinting (Natera et al. 2000), we sometimes employ N-terminal sequencing (Chen et al. 2000 a, b; Guerreiro et al. 1997, 1998, 1999; Natera et al. 2000) or Western blotting to identify specific classes of proteins (see Figure 2).
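The loading estimate above can be retraced with a few lines of arithmetic. The following is only a minimal sketch: the function and constant names are hypothetical, and it assumes (as a simplification that is not claimed in the text) that every protein is present at exactly the staining detection limit, so the result is a lower bound rather than a recommended load.

```python
import math

# Figures taken from the text; names are hypothetical.
ORFS = 6200                # open reading frames reported for strain 1021
EXPRESSED_FRACTION = 0.5   # working assumption used in the text
DETECTION_LIMIT_UG = 1.0   # colloidal Coomassie limit per spot, in micrograms
MAX_LOAD_MG = 1.0          # maximum load per first-dimension pH range, in mg


def expressed_proteins(orfs=ORFS, fraction=EXPRESSED_FRACTION):
    """Number of proteins expected if a fixed fraction of ORFs is expressed."""
    return int(orfs * fraction)


def minimum_total_load_mg(n_proteins, limit_ug=DETECTION_LIMIT_UG):
    """Lower bound on total protein (mg) if each spot must reach the limit.

    Assumes equal abundance for all proteins, which real samples do not have.
    """
    return n_proteins * limit_ug / 1000.0


def strips_needed(total_mg, max_load_mg=MAX_LOAD_MG):
    """Number of separate first-dimension loads needed to carry total_mg."""
    return math.ceil(total_mg / max_load_mg)


if __name__ == "__main__":
    n = expressed_proteins()
    load = minimum_total_load_mg(3000)
    print(n, load, strips_needed(load))
```

Because real protein abundances span several orders of magnitude, low-abundance proteins would fall below the detection limit at these loads, which is consistent with the trade-off between load and resolution noted earlier.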

Figure 2. Different strategies for post-gel separation analyses. Proteins from 2-D gels (upper central panel) can be analyzed by Western immunoprobing, Edman sequencing or via mass spectrometry. MALDI-TOF generates peptide mass fingerprints of the peptides generated from protein digestion whereas ESI-MS/MS can generate peptide sequence and identify post-translational modifications (PTMs) of the peptides. (Panel labels: Western immunoprobing; N-terminal sequencing; protease digestion; MALDI-TOF (PMF); ESI-MS/MS (sequence and PTMs).)

Cellular lysis of S. meliloti is achieved by disrupting the cells in reagents that are compatible with 2-DGE (Guerreiro et al. 1997). An overlapping series of IPG strips that can collectively separate proteins between pH 3 and pH 11 is used to establish a proteomic pI "contig". Proteins are then separated according to their size using either vertical or horizontal electrophoresis, with the very thin pre-cast horizontal gels giving the best resolution. Protein visualization is best achieved by staining the gel directly using silver or fluorescent stains for analytical gels or sequential colloidal Coomassie staining (Chen et al. 2000 a, b). Image analysis of gels loaded with equal amounts of protein is achieved using Melanie 3 software and this enables differential protein expression to be quantified. Generally, we run at least three replicates to ensure that the changes in protein levels are due to the experimental regime.

Peptide mass fingerprinting is the cheapest and fastest method of post-gel protein identification. MALDI-TOF mass spectrometry, as stated above, can usually generate effective data when a minimum of 1 µg of protein is present in a stained protein spot. Proteins are first digested with a protease (usually trypsin) and the masses of the resulting peptides are determined with a high degree of accuracy by MALDI-TOF analysis to generate a peptide mass "fingerprint". This fingerprint is then matched against databases of theoretical peptide masses generated from the genomic sequence information. MassLynx or similar software packages are used to determine the likelihood of the match and a diagrammatic example of the matches generated is shown in Figure 3. Alternatively, N-terminal sequencing or Electrospray Ionization Mass Spectrometry (ESI-MS/MS) can be used to generate protein sequence from the peptides generated and to determine many post-translational modifications.
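The matching step described above can be illustrated with a small sketch. It is not the scoring used by MassLynx or any other package: it simply digests candidate sequences in silico with the usual trypsin rule (cleave after K or R, but not before P), computes monoisotopic peptide masses, and counts how many observed masses fall within a tolerance. All function names, the tolerance and the two toy sequences are hypothetical; missed cleavages, modifications and protonation are ignored.

```python
import re

# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_peptides(seq):
    """Split after K or R unless the next residue is P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", seq) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed, seq, tol=0.2):
    """Number of observed masses within tol Da of any theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(seq)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


def best_match(observed, proteins, tol=0.2):
    """Name of the candidate sequence explaining the most observed masses."""
    return max(proteins, key=lambda name: count_matches(observed, proteins[name], tol))


if __name__ == "__main__":
    # Toy database with two made-up sequences.
    database = {"orfA": "MKTAYIAKQR", "orfB": "GGSLLKPEWR"}
    spot = [peptide_mass(p) for p in tryptic_peptides(database["orfA"])]
    print(best_match(spot, database))
```

A real search would weight matches statistically and account for database size; counting matches is only the simplest way to show why accurate peptide masses plus a genome-derived database are sufficient to propose an identity.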

Figure 3. Diagrammatic representation of the process of peptide mass fingerprinting.

3. Results and Discussion

Using these strategies we have examined the proteome of S. meliloti:

(i) at different phases of growth (Guerreiro et al. 1999);

(ii) after the deletion of the pSym (Chen et al. 2000b);

(iii) in the endosymbiotic state compared to culture grown (Natera et al. 2000);

(iv) to examine the effects of a mutation of nolR in strain Rm41 (Chen et al. 2000a).

We have also used proteome analysis to determine the response of R. leguminosarum to flavonoid exposure (Guerreiro et al. 1997), to assess the effects of plasmid curing (Guerreiro et al. 1998) and to examine the unexpected pleiotropic effects of mutation (Guerreiro et al. 2000).

Several major points have resulted from the analysis of these data. First, some products of the nodulation genes can be detected in R. leguminosarum but, thus far, not in S. meliloti. This may reflect the different extent of transcriptional activation of nodulation genes in these two organisms. Second, extensive pleiotropic effects in the proteome can be detected that result from the mutation of either (a) a regulatory gene (nolR) or (b) a structural gene involved in polysaccharide synthesis (pssA). The extent of these changes rivals that seen when large areas of the pSym or other plasmids are deleted from the strains (Guerreiro et al. 2000), even though a large amount of genetic information is missing in the deleted strains. The high number of proteins that were induced by mutation of pssA in two species was unexpected, especially if this gene product possesses only the one function in polysaccharide synthesis. However, the increasing number of multifunctional proteins that are being discovered (Jeffery 1999), combined with the notion that multiple protein-protein interactions can occur in the cell, adds a further level of complexity to the analysis of biological systems and emphasizes that multiple approaches will be necessary to unravel these processes. It is not inconceivable that PssA may interact with other proteins in the cell or indeed possess one or more other functions. Finally, extensive alterations occur in the expression profiles of proteins isolated from bacteroids when compared to culture-grown cells. This result most likely reflects the alterations to metabolism that result from a low-oxygen environment, the switch to the nitrogen-fixing state and the utilization of more specific sources of nutrients that are provided by the plant (Natera et al. 2000).

We are currently undertaking a comprehensive analysis of over 2000 proteins isolated from either cultured cells or bacteroids. This endeavor will go a long way towards defining the major changes that occur as the bacteria make the transition from the cultured state to the bacteroid. We expect that over 85% of the proteins will be identified, far more than in our previous analyses. This is to be expected since our initial analyses were made at a time when the full genome sequence was not available. Nevertheless, even with a 1x shotgun coverage of the genome (supplied by S. Long and M. Barnett, Stanford University), we were able to obtain a significant number of identities for our PMF queries (Natera et al. 2000).

4. References

Chen et al. (2000) Electrophoresis 21, 3833-3842

Chen et al. (2000) Electrophoresis 21, 3823-3832

Djordjevic et al. (1987) Ann. Rev. Phytopathol. 25, 145-168

Freiberg et al. (1997) Nature 387, 394-401

Galibert et al. (2001) Science 293, 668-672

Gottfert et al. (2001) J. Bacteriol. 183, 1405-1412

Guerreiro et al. (1997) Molec. Plant-Microbe Interact. 10, 506-516

Guerreiro et al. (1998) Electrophoresis 19, 1972-1979

Guerreiro et al. (1999) Electrophoresis 20, 818-825

Guerreiro et al. (2000) J. Bacteriol. 182, 4521-4532

Jeffery (1999) Trends Biochem. Sciences 24, 8-11

Natera et al. (2000) Molec. Plant-Microbe Interact. 13, 995-1009

5. Acknowledgements

We acknowledge access to the S. meliloti genome sequence prior to publication and access to the 1x shotgun genome sequence provided by the S. meliloti sequencing consortium. A grant from the Australian Government has facilitated the peptide mass fingerprinting analysis at the Australian Proteome Analysis Facility (Macquarie University, Australia).

COMPARISON OF CHROMOSOMAL GENES FROM M. LOTI AND S. MELILOTI SUGGEST AN ANCESTRAL GENOME

R.A. Morton

Dept of Biology, McMaster University, 1280 Main St. West, Hamilton, ON, L8S 4K1 Canada

1. Introduction

Although the origins of biological fixation of nitrogen are unclear, it is an ancient process, thought to have evolved more than 2 billion years ago, perhaps in response to a decline in abiotic nitrogen fixation as atmospheric CO2 decreased (Navarro-González et al. 2001; Raven, Yin 1998). Biological nitrogen fixation is limited to prokaryotes, but in some cases microbes fix nitrogen in symbiosis with plants. It is not clear when such nitrogen-fixing associations may have originated (Raven, Yin 1998). The most recent are root-nodule symbioses. Nodule-forming plants are closely related, probably belonging to a single clade (Gualtieri, Bisseling 2000). Within this clade, however, only diverse subgroups of plants are nodulated, and comparison of plant and bacterial phylogenies suggests multiple origins of symbiosis (Swensen 1996; Doyle 1998).

The availability of complete genomes of symbiotic, nodulating bacteria allows comprehensive comparison of their genetic systems and the formulation of hypotheses about their evolutionary histories. I have compared genes of M. loti and S. meliloti in order to identify those gene functions that were present in their ancestor and conserved in their present genomes. The results confirm the independent origin of current nitrogen-fixing bacteria and the key role played by horizontal gene transfer.

2. Material and Methods

Orthologous pairs of genes coalesce in the most recent common ancestor of their genomes. A pair of orthologs is a connection between two genes that can be visualized by a genome dot plot. In order to develop a set of orthologous connections, I used a principle of reciprocal similarity: orthologous genes are more similar to each other than to any other gene in either genome. Each gene of one genome was BLASTed against all the genes of the second genome, and this was repeated in the reciprocal direction. When the most similar pair of genes (the BLAST hit with the highest score) was the same in both directions, the pair was classified as orthologs. One difficulty with this method occurs when several target genes are similar to a single origin gene as, for example, is often the case with rRNA-coding genes. Therefore I made an adjustment when BLAST hit a group of genes in the target genome with nearly the same score ("duplications"). Duplications were resolved by comparisons with neighboring genes. The "duplication" (if any) which maintained a continuous set of orthologous pairs was retained as the ortholog.
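The reciprocal-similarity rule can be sketched as follows. This is an illustration only, not the script used in this study: it assumes the BLAST results have already been reduced to dictionaries mapping (query, target) pairs to scores, the function names and gene identifiers are hypothetical, and the neighbor-based resolution of "duplications" is deliberately left out.

```python
def best_hits(scores):
    """Return {query: target} keeping the highest-scoring target per query.

    scores maps (query_gene, target_gene) -> similarity score.
    Ties are broken by whichever pair is seen first (an assumption).
    """
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


if __name__ == "__main__":
    # Toy scores with made-up gene identifiers.
    a_vs_b = {("a1", "b1"): 900, ("a1", "b2"): 300,
              ("a2", "b2"): 500, ("a3", "b2"): 450}
    b_vs_a = {("b1", "a1"): 880, ("b2", "a2"): 510, ("b2", "a3"): 440}
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In the toy data, a3 finds b2 as its best hit but b2 prefers a2, so only the a1/b1 and a2/b2 pairs are retained.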

The Mesorhizobium loti MAFF303099 genome (http://www.kazusa.or.jp/rhizobase/) and a preliminary release of the Sinorhizobium meliloti 1021 genome (http://sequence.toulouse.inra.fr/rhime/Complete/doc/Complete.html) were obtained in February 2001. Orthologous ORFs were aligned with CLUSTALW and the expected substitution rate determined using the PROTDIST or DNADIST algorithms of Phylip (Felsenstein 1994). Distances are expected substitutions per site using the PAM 250 model of evolution for protein or the Kimura 2-parameter model with a transition/transversion ratio of 2 for DNA.

3. Results and Discussion

3.1. Ancestral genome. M. loti and S. meliloti are both nitrogen-fixing, symbiotic rhizobacteria which nodulate different host species. M. loti has a large chromosome (>6750 genes) and two small plasmids (pMLa and pMLb). The S. meliloti chromosome contains about 3400 genes and there are two large "megaplasmids" of 1.35 Mbp (pSymA) and 1.68 Mbp (pSymB) together containing more than 2800 genes. Genes of the M. loti chromosome were compared with those of each of the three S. meliloti replicons (Figure 1).

The results show that there is extensive orthology (2573 chromosomal gene pairs) between chromosomal genes of these two species. Orthologous pairs are scattered throughout their respective genomes. However, a pattern is seen of orthologs that form contiguous groups, sometimes in inverted orientation (Figure 1). These represent conserved chromosomal sequences of genes, or syntenic groups. Genes in these syntenic groups can be identified by virtue of belonging to "runs" of orthologous pairs. When a run length of five or more pairs was used as a cut-off, 984 orthologs (approximately 40% of all chromosomal orthologs) were identified as belonging to a putative ancestral genome. The genome dot plots of M. loti chromosome genes against genes of the two S. meliloti megaplasmids show no evidence of extensive synteny, although there are several regions of limited extent. In the case of pSymA, 620 potential orthologs were identified, but only 61 were in runs of five or more, while for pSymB 121 out of 825 orthologs were in such runs. Thus, there is little support for the hypothesis that the smaller S. meliloti chromosome resulted from the transfer of ancestral genes into the megaplasmids. Rather, genes of the S. meliloti megaplasmids seem to be of divergent origin when compared to the M. loti chromosome (Figure 1).
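One way to extract such "runs" is sketched below. The exact adjacency criterion used in this study is not spelled out, so the sketch makes a strict assumption that is flagged in the code: a run continues only while consecutive genes in one genome map to immediately neighboring genes in the other, in a consistent forward or inverted orientation. Gene positions are represented as integer gene-order indices and the function name is hypothetical.

```python
def syntenic_runs(pairs, min_run=5):
    """Group ortholog pairs into collinear runs of at least min_run pairs.

    pairs is an iterable of (index_in_genome_a, index_in_genome_b).
    Assumption (not from the paper): strict adjacency, i.e. a step of +1 in
    genome A and a constant step of +1 (forward) or -1 (inverted) in genome B.
    """
    runs = []
    current, direction = [], 0
    for pair in sorted(pairs):
        if current:
            step_a = pair[0] - current[-1][0]
            step_b = pair[1] - current[-1][1]
            if step_a == 1 and step_b in (1, -1) and direction in (0, step_b):
                current.append(pair)
                direction = step_b
                continue
            if len(current) >= min_run:
                runs.append(current)
        # Start a new candidate run at this pair.
        current, direction = [pair], 0
    if len(current) >= min_run:
        runs.append(current)
    return runs


if __name__ == "__main__":
    forward = [(i, 10 + i) for i in range(5)]         # collinear block
    inverted = [(10 + i, 30 - i) for i in range(5)]   # inverted block
    scattered = [(20, 50), (21, 90)]                  # isolated orthologs
    for run in syntenic_runs(forward + inverted + scattered):
        print(len(run), run[0], run[-1])
```

Allowing small gaps within a run would recover more pairs at the cost of more false positives; the strict version above is simply the easiest to reason about.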

Figure 1. Genome dot plot comparing annotated genes of the M. loti MAFF303099 chromosome (vertical axis) with those of three replicons of S. meliloti 1021 (horizontal axis: pSymA, pSymB and chromosome). Axes are in megabase pairs from an arbitrary origin. The M. loti "symbiotic island" is identified by "Sym Island" and the extent of several other regions of the M. loti chromosome that appear to be insertions relative to the S. meliloti chromosome are indicated to the right by double arrows.

3.2. Symbiotic insertions. Sullivan and Ronson (1998) showed that M. loti strain ICMP3153 could transfer a large symbiotic element to non-symbiotic strains, which then allowed them to nodulate plants. This element was defined in M. loti MAFF303099 by Takakazu et al. (2000). It is shown on the right axis of the M. loti chromosome in Figure 1 as "Sym Island". Consistent with an origin by horizontal gene transfer, this M. loti symbiotic island was not identified as part of the ancestral chromosome. In addition, there are a number of other regions of the M. loti chromosome which appear to have received large insertions since divergence from the S. meliloti lineage. Chromosomal rearrangements make it difficult to determine all such regions, but a few possible ones have been indicated by double arrows on the right side of Figure 1.

3.3. Gene divergence. Identification of genes descended vertically from an ancestral genome allows study of rates of divergence. The 2526 putative protein-coding orthologs have an average divergence of 0.714 substitutions per amino acid site, while the average divergence of the 967 protein-coding ORFs that were part of orthologous runs of five or more genes is 0.553 (Figure 2). A fraction of those genes which are not part of extensive "runs" are clearly more diverged (distances > 1). A similar analysis of orthologous, ancestral pairs between S. typhi and E. coli gave an average distance of 0.14 expected amino acid substitutions per site (Koski et al. 2001). The average distance (0.55) between the 967 protein-coding genes conservatively identified as descended from an ancestral genome is about four times this value.

Since S. typhi and E. coli are estimated to have diverged approximately 100 million years ago, either there was an ancient separation of the M. loti/S. meliloti lineages (~400 million years ago) or, alternatively, there has been a more rapid rate of divergence of their proteins.
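The dating argument is a simple proportional scaling, sketched below under the assumption of a constant and equal rate of protein divergence in both pairs of lineages (a strict molecular clock). That assumption is exactly what the second alternative in the text calls into question, so the number is an illustration, not an estimate to rely on. The function name is hypothetical.

```python
def scaled_divergence_time(distance, ref_distance, ref_time_myr):
    """Scale a reference divergence time by the ratio of average distances.

    Assumes the same constant substitution rate in both comparisons.
    """
    if ref_distance <= 0:
        raise ValueError("reference distance must be positive")
    return distance / ref_distance * ref_time_myr


if __name__ == "__main__":
    # 0.553: mean distance for the 967 M. loti/S. meliloti orthologs in runs.
    # 0.14 and 100 Myr: E. coli/S. typhi mean distance and divergence time.
    estimate = scaled_divergence_time(0.553, 0.14, 100.0)
    print(f"{estimate:.0f} million years")
```

With the figures quoted in the text, the scaling gives a value close to the roughly 400 million years mentioned above.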

The greater average divergence of M. loti/S. meliloti orthologs relative to E. coli/S. typhi orthologs was confirmed for many individual, homologous genes. Table 1 shows that equivalent genes involved in ammonia assimilation are roughly 5 to 11 times more diverged in the rhizobial species than in E. coli/S. typhi. On the other hand, the 16S rRNA genes are only 1.5 times more diverged in rhizobia.

3.4. Origins of symbiotic nitrogen fixation. The rhizobia have been divided into three distinct groups, Rhizobium, Bradyrhizobium and Azorhizobium, on the basis of 16S rRNA gene sequences (Young and Haukka 1996). These groups are not monophyletic with regard to nitrogen fixation. Non-nodulating bacteria are interspersed on the 16S rRNA tree along with non-nitrogen-fixing species. Although plants that are capable of forming symbiotic relationships form a large monophyletic clade, nodulating groups are scattered among non-nodulating groups. There is little phylogenetic correlation between nitrogen-fixing bacteria and their legume hosts (Doyle 1998). Related groups of bacteria can often nodulate unrelated groups of plants.

Nodulation and nitrogen-fixing genes in Mesorhizobium have been located on "symbiotic islands" and these have been demonstrated to be capable of transmission between bacterial strains (Sullivan and Ronson 1998). Taken together, these results suggest that symbiotic nitrogen fixation has independently evolved and been lost many times.

Young and Haukka (1996) concluded from the 16S rRNA phylogeny that the common ancestor of rhizobia pre-dated the origin of higher plants. Among the rhizobia, Mesorhizobium and Sinorhizobium are distinct groups which cluster separately according to their 16S rRNA sequences. The extent of divergence of their 16S rRNA genes does not indicate as ancient a separation of these two lineages as do their chromosomal orthologs (Table 1). Both, however, are consistent with separation of lineages long before the evolution of symbiotic nitrogen fixation. Thus, nitrogen fixation and nodulation must have been acquired independently, apparently by horizontal transfer of alien genes into the chromosome of M. loti and into the pSymA plasmid of S. meliloti. Genes that have been transmitted vertically on the chromosomes of these two species appear to be those that are required for a free-living lifestyle, similar to that of their presumed progenitor.

Figure 2. Distribution of chromosomal protein-coding ortholog distances between M. loti and S. meliloti. Grey: all orthologs (2526), black: orthologs in runs of 5 or more (967). (Axes: distance against count; the mean distances 0.714 and 0.553 are marked.)

Table 1. Protein distances between homologous gene pairs in E. coli/S. typhi compared to M. loti/S. meliloti

                                     DNA or Protein Distance (sub./site)
Chromosomal Gene                     E. coli/S. typhi   M. loti/S. meliloti   Ratio
glnA Glutamine Synthetase            0.018              0.106                 5.9
gltD Glutamate Synthetase (β)        0.047              0.216                 4.6
gltB Glutamate Synthetase (α)        0.058              0.263                 4.5
amtB Ammonium Transport              0.088              0.789                 9.0
glnK Nitrogen Regulation             0.013              0.148                 11.4
glnB Nitrogen Regulation             0                  0.118                 NA
ntrB (glnL) Nitrogen Regulation      0.067              0.361                 5.4
ntrC (glnG) Nitrogen Regulation      0.048              0.273                 5.7
16S rRNA                             0.031              0.046                 1.5

pSymA (S. meliloti) to Chromosome (M. loti) Gene
nifH Nitrogenase (Fe)                NA                 0.076                 NA
nifD Nitrogenase (FeMo α)            NA                 0.154                 NA
nifK Nitrogenase (FeMo β)            NA                 0.123                 NA

NA = not applicable.

4. References

Doyle JJ (1998) Trends Plant Sci. 3, 473-478

Felsenstein J (1994) PHYLIP Version 3.5, distributed by the author

Gualtieri G, Bisseling T (2000) Plant Mol. Biol. 42, 181-194

Koski LB, Morton RA, Golding GB (2001) Mol. Biol. Evol. 18, 404-412

Navarro-González R, McKay CP, Mvondo DN (2001) Nature 412, 61-64

Sullivan JT, Ronson CW (1998) Proc. Natl. Acad. Sci. USA 95, 5145-5149

Young JP, Haukka KE (1996) New Phytol. 133, 87-94

5. Acknowledgements

I would like to thank Dr Turlough Finan for suggesting this project and providing the opportunity to use the S. meliloti sequences before publication.
