Transcription of an Expanded Genetic Alphabet

Abstract
Expansion of the genetic alphabet with a third base pair would have immediate biotechnological applications and would also lay the foundation for a semisynthetic organism with an expanded genetic code. A variety of unnatural base pairs have been shown to form efficiently and selectively during DNA replication. The pairs formed between the unnatural nucleotide d5SICS and either dMMO2 or dNaM are particularly interesting because they have been shown to be replicated with efficiencies and fidelities that are beginning to approach those of a natural base pair. Not only are these unnatural base pairs promising for a range of applications, but they also demonstrate that nucleobase shape and hydrophobicity are sufficient to control replication. While a variety of unnatural base pairs have been shown to be substrates for transcription, none is transcribed in both possible strand contexts, and the transcription of a fully hydrophobic base pair has not been demonstrated. We show here that both of the unnatural base pairs d5SICS:dMMO2 and d5SICS:dNaM are selectively transcribed by T7 RNA polymerase and that the efficiency of d5SICS:dNaM transcription in both possible strand contexts is only marginally reduced relative to that of a natural base pair. Thus, as with replication, we find that hydrogen bonding is not essential for transcription and may be replaced with packing and hydrophobic forces. The results also demonstrate that d5SICS:dNaM is both replicated and transcribed with efficiencies and fidelities that should be sufficient for use as part of an in vitro expanded genetic alphabet.