Do not use the free codon optimization algorithms provided by some
"leading" companies, such as GeneArt, Genscript, IDT, etc. A
company recently synthesized ~100 genes using GeneArt software or
sequences provided by Genscript and IDT; 90% of the synthetic genes
could not be expressed in E. coli, or were expressed at extremely low
levels (compared to wt).
Save money and do your own optimization for expression of your gene in
1. Analyse wild type DNA sequence using
2. Change rare codons (for example, Arg: AGG, AGA, CGG, and CGA) based on this E. coli codon usage table:
cta -> CTG
ata -> ATC or ATT
aga -> CGC or CGT
cgg -> CGC or CGT
cga -> CGC or CGT
agg -> CGC or CGT
ccc -> CCG
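The substitution list above can be applied mechanically, codon by codon. The following is a minimal sketch, not a complete optimizer: the function name `replace_rare_codons` is hypothetical, the input is assumed to be an in-frame coding sequence, and where the list offers two choices (for example "ATC or ATT") the first one is picked arbitrarily.

```python
# Replacement table copied from the list above. Where two replacements
# are offered, the first is chosen arbitrarily (an assumption of this sketch).
RARE_CODON_REPLACEMENTS = {
    "CTA": "CTG",
    "ATA": "ATC",
    "AGA": "CGC",
    "CGG": "CGC",
    "CGA": "CGC",
    "AGG": "CGC",
    "CCC": "CCG",
}


def replace_rare_codons(cds: str) -> str:
    """Return the coding sequence with each listed rare codon replaced."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Split into in-frame codons, starting at the first base.
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return "".join(RARE_CODON_REPLACEMENTS.get(codon, codon) for codon in codons)


print(replace_rare_codons("ATGCTAATAAGACCC"))  # ATGCTGATCCGCCCG
```

Because the replacements are all synonymous, the encoded protein is unchanged; only in-frame codons are touched, so the same triplet appearing out of frame is left alone.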
3. Change the second amino acid to A (gct, gca), K (aaa), or S (agc, tcc, tct)
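Whether a sequence already satisfies step 3 can be checked before editing it. This is a small sketch under the assumption that the sequence starts at the ATG start codon; the helper name `second_codon_ok` is hypothetical. Note that changing the second codon to one of these alters the protein, so the check only reports and does not modify anything.

```python
# Codons for the second position, copied from step 3 above.
PREFERRED_SECOND_CODONS = {"GCT", "GCA", "AAA", "AGC", "TCC", "TCT"}


def second_codon_ok(cds: str) -> bool:
    """Report whether the codon after the start codon is one listed in step 3."""
    cds = cds.upper()
    if len(cds) < 6:
        raise ValueError("sequence must contain at least two codons")
    return cds[3:6] in PREFERRED_SECOND_CODONS


print(second_codon_ok("ATGGCTAAA"))  # True
print(second_codon_ok("ATGCCCAAA"))  # False
```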
4. Change G and C to A and T at the 5'-end.
A high GC content at the 5'-end of the gene of interest -> formation of secondary structure in the mRNA -> inefficient translation -> low expression
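To see whether step 4 is needed, the GC content of the 5'-end can be measured directly. This is a rough sketch: the function names and the default window of 30 nt are assumptions for illustration, since the text does not say how long the 5'-end region is.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def five_prime_gc(cds: str, window: int = 30) -> float:
    """GC fraction of the first `window` bases (window size is an assumption)."""
    return gc_fraction(cds[:window])


print(gc_fraction("ATGC"))  # 0.5
```

Comparing `five_prime_gc` before and after swapping in A/T-ending synonymous codons gives a quick check that the edits moved the 5'-end in the intended direction.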
5. Remove cis-acting DNA sequences such as internal TATA-boxes, chi-sites, and ribosomal entry sites; AT-rich or GC-rich sequence stretches; repeat sequences; and RNA secondary structures. Remove internal Shine-Dalgarno sequences such as AAGGAG(nnnnn)ATG, GAAGGAGA(nnnnn)ATG, AAGGAGG(nnnnn)ATG, AAGGAGGT(nnnnn)ATG, GGAG, GAGG, and AGGA.
For example, you should change ATAATA to ATCATC; AGAAGA to CGCCGT; ATAAGA to ATCCGT; AGAATA to CGCATT; ATAAGG to ATTCGT; and AGAAGG to CGTCGC.
Remove ATTTA sequences, which are known to destabilize transcripts (Gutiérrez et al. 1999).
Remove mRNA secondary structure
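Several of the motifs named in step 5 are short fixed strings, so they can be located with a plain substring scan. The sketch below covers only a subset (the core Shine-Dalgarno-like motifs and ATTTA); the function name `find_motifs` is hypothetical, and secondary structure, repeats, and the other elements in step 5 need dedicated tools that are not shown here.

```python
# A subset of the motifs named in step 5, copied from the text above.
MOTIFS = ["AAGGAG", "GGAG", "GAGG", "AGGA", "ATTTA"]


def find_motifs(seq: str, motifs=MOTIFS):
    """Return sorted (0-based position, motif) pairs, including overlapping hits."""
    seq = seq.upper()
    hits = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.append((start, motif))
            # Advance by one base so overlapping occurrences are also found.
            start = seq.find(motif, start + 1)
    return sorted(hits)


print(find_motifs("CCATTTACC"))   # [(2, 'ATTTA')]
print(find_motifs("TTAAGGAGTT"))  # [(2, 'AAGGAG'), (3, 'AGGA'), (4, 'GGAG')]
```

Each hit still has to be removed by hand with a synonymous codon change, and the sequence rescanned afterwards, because fixing one motif can create another.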
6. Add two stop codons: TAATAA
7. Append another strong terminator to the end of your DNA sequence
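Step 6 is simple to automate. In this sketch the function name `add_double_stop` is hypothetical, and stripping an existing terminal stop codon first is an assumption made here so the result ends in exactly TAATAA; the terminator from step 7 is not included because the text does not give its sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def add_double_stop(cds: str) -> str:
    """Return the coding sequence ending in the double stop TAATAA."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Assumption: drop one existing terminal stop codon before appending.
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    return cds + "TAATAA"


print(add_double_stop("ATGAAA"))     # ATGAAATAATAA
print(add_double_stop("ATGAAATGA"))  # ATGAAATAATAA
```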