The cyanobacterial genome core and the origin of photosynthesis

Abstract
Comparative analysis of 15 complete cyanobacterial genome sequences, including the "near minimal" genomes of five strains of Prochlorococcus spp., revealed 1,054 protein families [core cyanobacterial clusters of orthologous groups of proteins (core CyOGs)] encoded in at least 14 of the 15 genomes. The majority of the core CyOGs are involved in central cellular functions that are shared with other bacteria; 50 core CyOGs are specific to cyanobacteria, whereas 84 are shared exclusively by cyanobacteria and plants and/or other plastid-carrying eukaryotes, such as diatoms or apicomplexans. The latter group includes 35 families of uncharacterized proteins, which could also be involved in photosynthesis. Only a few components of the cyanobacterial photosynthetic machinery are represented in the genomes of the anoxygenic phototrophic bacteria Chlorobium tepidum, Rhodopseudomonas palustris, Chloroflexus aurantiacus, or Heliobacillus mobilis. These observations, coupled with recent geological data on the properties of the ancient phototrophs, suggest that photosynthesis originated in the cyanobacterial lineage under the selective pressures of UV light and depletion of electron donors. We propose that the first phototrophs were anaerobic ancestors of cyanobacteria ("procyanobacteria") that conducted anoxygenic photosynthesis using a photosystem I-like reaction center, in a manner somewhat similar to the heterocysts of modern filamentous cyanobacteria. From procyanobacteria, photosynthesis spread to other phyla by way of lateral gene transfer.
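
The core criterion stated above (a family counts as core if it is encoded in at least 14 of the 15 genomes) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `core_families`, the genome labels `G1`..`G15`, and the family names `famA`..`famC` are hypothetical toy placeholders, and the sketch assumes that the assignment of proteins to orthologous families has already been done.

```python
from typing import Dict, List, Set


def core_families(presence: Dict[str, Set[str]],
                  n_genomes: int,
                  max_missing: int = 1) -> List[str]:
    """Return families present in at least n_genomes - max_missing genomes.

    presence maps a family name to the set of genomes encoding a member.
    max_missing=1 mirrors the "at least 14 of 15" criterion (an assumption
    about how that criterion generalizes to other genome counts).
    """
    threshold = n_genomes - max_missing
    return sorted(fam for fam, genomes in presence.items()
                  if len(genomes) >= threshold)


# Hypothetical toy data: 15 genome labels and three made-up families.
genomes = [f"G{i}" for i in range(1, 16)]
presence = {
    "famA": set(genomes),       # present in all 15 genomes
    "famB": set(genomes[:14]),  # missing from one genome
    "famC": set(genomes[:10]),  # missing from five genomes
}

core = core_families(presence, n_genomes=len(genomes))
print(core)
```

Allowing one absence per family keeps a family in the core even if a single genome has lost it, which a strict all-genomes requirement (`max_missing=0`) would not.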