Abstract
Synthetic DNA is widely used to construct genes for heterologous expression. Adapting the codons to the host organism is necessary to ensure sufficient protein production. The GC content, codon identity, and the mRNA structure near the translation initiation site are also important in the design of the gene construct. This study presents a strategy for designing a synthetic gene encoding the HPV52 L1 protein, together with several analyses at the genetic level, to optimize its expression in the Escherichia coli BL21(DE3) host. The model sequence for codon optimization was determined by collecting 75 HPV52 L1 protein sequences from the NCBI database and analyzing them by global multiple sequence alignment with the Clustal Omega web server. Once the model was determined, codon optimization was performed using OPTIMIZER and the IDT codon optimization web tool, based on E. coli B codon usage. The generated open reading frame (ORF) sequence was analyzed with the Restriction Mapper web server to choose restriction sites that facilitate the cloning stage, adjusted for the pJExpress414 expression vector. To maximize the protein expression level, the mRNA secondary structure around the ribosome binding site (RBS) was analyzed. A slight modification at the 5’-terminal end was carried out to make the RBS more accessible and to increase the mRNA folding free energy. Finally, the synthetic gene construct was checked to confirm that no mutation occurs in the encoded protein and to calculate its Codon Adaptation Index (CAI) and GC content. The above strategy yielded an ORF sequence with an mRNA folding free energy around the RBS of -5.5 kcal/mol, a CAI of 0.787, and a GC content of 49.5%. These values are considerably better than those of the native gene. Theoretically, this synthetic gene construct could produce a high level of protein expression in E. coli BL21(DE3) under the regulation of the T7 promoter.
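The final verification step described above (unchanged protein sequence, GC content, CAI) can be illustrated with a minimal Python sketch. It is not the workflow used in this study, which relied on web tools; the function names, the toy sequences, and the relative adaptiveness weights below are hypothetical and chosen only for illustration. CAI is computed here as the geometric mean of the relative adaptiveness values of the codons for which a weight is supplied.

```python
from itertools import product
from math import exp, log

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codons(orf):
    """Split an ORF (DNA, 5'->3') into codons; length must be a multiple of 3."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return [orf[i:i + 3] for i in range(0, len(orf), 3)]


def translate(orf):
    """Translate an ORF into a one-letter protein string ('*' marks a stop)."""
    return "".join(CODON_TABLE[c] for c in codons(orf))


def gc_content(seq):
    """Return the GC content of a DNA sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def cai(orf, weights):
    """Geometric mean of relative adaptiveness over codons present in `weights`.

    `weights` maps codon -> relative adaptiveness in (0, 1]; codons without a
    weight (e.g. ATG, TGG, stop codons) are skipped.
    """
    logs = [log(weights[c]) for c in codons(orf) if c in weights]
    if not logs:
        raise ValueError("no codon in the ORF has a weight")
    return exp(sum(logs) / len(logs))


if __name__ == "__main__":
    # Toy sequences and weights (hypothetical, NOT real E. coli values).
    native = "ATGAAAGGT"
    optimized = "ATGAAGGGC"
    toy_weights = {"AAA": 1.0, "AAG": 0.25, "GGT": 1.0, "GGC": 0.9}

    # Codon optimization must leave the encoded protein unchanged.
    assert translate(native) == translate(optimized)
    print("GC content (%):", round(gc_content(optimized), 1))
    print("CAI:", round(cai(optimized, toy_weights), 3))
```

In practice the weights would be derived from a reference set of highly expressed genes of the host, and the mRNA folding free energy around the RBS would be obtained from a dedicated secondary structure prediction tool, which this sketch does not attempt.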