Genetic Code Expansion
The genetic code for all life is based upon four nucleotides, 64 codons, and 20 amino acids. Yet in the past two decades, biologists have expanded the genetic code by reassigning specific codons to encode amino acids beyond the standard 20.
Expanding the Genetic Code
In protein translation, an aminoacyl-tRNA synthetase (aaRS) loads its cognate tRNA with a specific amino acid. The charged tRNA then enters the ribosome, and if the anticodon on the tRNA can bind to the mRNA (that is, the anticodon is complementary to the codon), the amino acid from the tRNA is incorporated into the growing peptide chain.
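The codon-anticodon complementarity described above can be sketched in a few lines of Python. This is a minimal illustration that assumes strict Watson-Crick pairing and ignores wobble pairing and modified bases; the function names `anticodon_for` and `can_decode` are hypothetical and chosen only for this example.

```python
# Watson-Crick pairing for RNA bases (wobble pairing is deliberately ignored).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Return the anticodon (written 5'->3') that pairs with an mRNA codon (5'->3')."""
    codon = codon.upper().replace("T", "U")  # accept DNA-style input such as "TAG"
    if len(codon) != 3 or any(base not in RNA_COMPLEMENT for base in codon):
        raise ValueError(f"not a valid codon: {codon!r}")
    # Pairing is antiparallel, so complement each base and reverse the order.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon))


def can_decode(anticodon: str, codon: str) -> bool:
    """True if the anticodon is the exact Watson-Crick partner of the codon."""
    return anticodon.upper().replace("T", "U") == anticodon_for(codon)


print(anticodon_for("UAG"))      # amber stop codon -> CUA
print(can_decode("CUA", "UAG"))  # True
print(can_decode("CUA", "UAA"))  # False
```

Under this simplified model, a tRNA that reads the amber codon UAG carries the anticodon CUA, which is why amber suppressor tRNAs are often written with a "CUA" subscript.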
To expand the genetic code, modified tRNAs, codons, and tRNA synthetases are introduced into the cell on plasmids, and the new amino acid is supplied in the media. Generally, you will need two plasmids, as depicted in the figure below:
- A plasmid expressing the tRNA and its cognate aminoacyl-tRNA synthetase (aaRS), which has been evolved to incorporate non-canonical amino acids (ncAAs).
- A plasmid containing the gene of interest with the modified codon (typically the amber codon) that is recognized by the cognate charged tRNA.

Once these plasmids have been introduced into the cells, the non-canonical amino acid can be incorporated using the existing protein translation machinery.

To incorporate a non-canonical amino acid into the protein of interest, four major additions to the standard translation machinery are needed:
- The non-canonical amino acid, which is generally introduced in the media.
- A new codon to be allocated to the new amino acid. Because there are no free codons, this can be challenging. In E. coli, the rarest codon is the amber stop codon (UAG), and thus this codon is often used. The gene of interest can be expressed from a plasmid containing a UAG codon at the position where the new amino acid is to be incorporated. Other options, such as four-base codons, have also been utilized.
- A tRNA that recognizes this codon.
- An aminoacyl-tRNA synthetase to load the new amino acid onto the tRNA. The tRNA and synthetase are called an orthogonal set because they should not crosstalk with the endogenous tRNA and synthetase sets. Many of these sets are derived from M. jannaschii, M. barkeri, or E. coli and can be mutated and screened through directed evolution to charge the tRNA with a different amino acid. They are typically expressed from a single plasmid, with multiple copies of the tRNA.
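As a rough illustration of the codon reassignment listed above, the following Python sketch places an amber codon at a chosen residue of a coding sequence and translates it with and without suppression. It assumes the standard genetic code, a DNA coding sequence that is already in frame, and a single-letter placeholder `X` for the ncAA; the helper names `introduce_amber` and `translate` and the toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def introduce_amber(cds: str, residue: int) -> str:
    """Replace the codon for the 1-based residue number with the amber codon TAG."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    start = 3 * (residue - 1)
    if residue < 1 or start + 3 > len(cds):
        raise ValueError("residue is outside the coding sequence")
    return cds[:start] + "TAG" + cds[start + 3:]


def translate(cds: str, amber_suppressed: bool = False) -> str:
    """Translate a CDS; read TAG as 'X' (the ncAA) only when amber_suppressed is True."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3].upper()
        if codon == "TAG" and amber_suppressed:
            protein.append("X")  # orthogonal tRNA decodes the amber codon
            continue
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # unsuppressed stop codon terminates translation
            break
        protein.append(amino_acid)
    return "".join(protein)


wild_type = "ATGGCTAAAGGTTAA"           # toy sequence: M A K G stop
mutant = introduce_amber(wild_type, 3)  # lysine codon replaced by TAG

print(translate(wild_type))                      # MAKG
print(translate(mutant, amber_suppressed=True))  # MAXG
print(translate(mutant))                         # MA (truncated)
```

The truncated product in the last line corresponds to what is expected when the orthogonal pair or the ncAA is missing, so the amber codon is read as a stop.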
Applications
By making small changes to selected amino acids within a protein, researchers can observe the resulting alterations in the protein's structure or function. The introduced amino acid can also be used to intentionally change the activity of a protein (e.g. converting a DNA binding protein to a DNA cleaving enzyme) or to regulate the activity of a protein so that it is responsive to specific stimuli, such as light. On a broader scale, the expanded genetic code can help us understand and evolve proteins for various purposes, from therapeutics to biopolymers.
Tips for Success
Before beginning to reprogram the genetic code, there are several things to consider. If you are working from a previously established protocol, make sure to match the growth medium, ncAA concentration, and cell lines used in that protocol.
Remember that an orthogonal synthetase and tRNA pair that works in one organism may not work in another. The orthogonal synthetase must aminoacylate only the orthogonal tRNA and not endogenous ones, endogenous synthetases must not aminoacylate the orthogonal tRNA, and the orthogonal tRNA must bind to an unallocated codon. Controls are therefore needed to confirm that each of these conditions holds. Always test expression first with a control reporter gene: GFP for E. coli or mCherry-GFP for mammalian cells. You should also express the protein with and without the ncAA in the media to make sure that the full-length protein is made only when the ncAA is included.
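One way to summarise the with/without-ncAA reporter control is to compute two simple ratios from fluorescence readings. The sketch below is only an illustration: the function names, the assumption that readings are background-subtracted and normalised to cell density, and the pass/fail thresholds are all hypothetical and should be replaced by whatever your protocol specifies.

```python
def suppression_metrics(wt_signal: float, tag_plus_ncaa: float, tag_minus_ncaa: float):
    """Summarise a reporter control experiment.

    All inputs are assumed to be background-subtracted fluorescence readings
    normalised to cell density (arbitrary units).
    """
    if wt_signal <= 0 or tag_plus_ncaa <= 0 or tag_minus_ncaa < 0:
        raise ValueError("signals must be positive (tag_minus_ncaa may be zero)")
    # Fraction of the wild-type reporter yield recovered by amber suppression.
    efficiency = tag_plus_ncaa / wt_signal
    # Full-length reporter made WITHOUT the ncAA, relative to WITH the ncAA.
    background = tag_minus_ncaa / tag_plus_ncaa
    return efficiency, background


def passes_controls(efficiency: float, background: float,
                    min_efficiency: float = 0.1, max_background: float = 0.1) -> bool:
    """Apply illustrative thresholds (assumptions, not published cut-offs)."""
    return efficiency >= min_efficiency and background <= max_background


# Made-up example readings: wild-type GFP, GFP-TAG with ncAA, GFP-TAG without ncAA.
efficiency, background = suppression_metrics(1000.0, 400.0, 20.0)
print(f"efficiency={efficiency:.2f}, background={background:.2f}")
print(passes_controls(efficiency, background))
```

A high background ratio would suggest that the synthetase is charging the tRNA with a canonical amino acid, or that the amber codon is being read through by endogenous machinery.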
Browse Synthetase Plasmids
The table below highlights plasmids that contain an aminoacyl-tRNA synthetase for use in E. coli and mammalian cells. Many of the plasmids also contain one or more copies of the cognate tRNA gene.
ID | Plasmid | Synthetase | Origin | ncAA | Organism | Codon | PI | |
---|---|---|---|---|---|---|---|---|
31186 | pEVOL-pAzF | p-azidophenylalanine RS | Methanocaldococcus jannaschii | p-azido-l-phenylalanine | E. coli | TAG | Schultz | |
31190 | pEVOL-pBpF | p-benzoylphenylalanine RS | Methanocaldococcus jannaschii | p-benzoyl-l-phenylalanine | E. coli | TAG | Schultz | |
48215 | pULTRA-CNF | tyrosyl synthetase | Methanocaldococcus jannaschii | para-cyanophenylalanine (pCNPhe) | E. coli | TAG | Schultz | |
48696 | pANAP | AnapRS | E. coli | fluorescent AA, Anap | Mammalian | TAG | Schultz | |
49086 | pDULE-ABK | pyrrolysyl tRNA synthetase | Methanosarcina barkeri | aliphatic diazirine amino acid | E. coli and Mammalian cells | TAG | Schultz | |
50831 | pAcBac2.tR4-OMeYRS/GFP* | tyrosyl-tRNA synthetase | E. coli | various unnatural amino acids | Mammalian | TAG | Schultz | |
50832 | pAcBac1.tR4-MbPyl | pyrrolysyl-tRNA synthetase | Methanosarcina barkeri | variety of unnatural amino acids | Mammalian | TAG | Schultz | |
68292 | SepOTSλ | SepRS9 | E. coli | Sep | E. coli | TAG | Rinehart | |
73546 | pEvol-pAzFRS.2.t1 | pAzFRS.2.t1 | E. coli | p-azido-l-phenylalanine (pAzF) | E. coli | TAG | Isaacs | |
51401 | pAM1 | NLL-MetRS | E. coli | azidonorleucine (Anl) | Y. enterocolitica | ATG | Tirrell | |
63177 | pMarsL274G | methionyl-tRNA synthetase (L274GMmMetRS) | murine mutant | azidonorleucine (Anl) | Mammalian | ATG | Tirrell | |
64915 | pMAH-POLY | tyrosyl synthetase | E. coli | pBof | Mammalian | TAG | Ai | |
73544 | pEvol-pAcFRS.2.t1 | pAcFRS.2.t1 | E. coli | p-acetyl-l-phenylalanine (pAcF) | E. coli | TAG | Isaacs | |
73547 | pEvol-pAzFRS.1.t1 | pAzFRS.1.t1 | E. coli | p-azido-l-phenylalanine (pAzF) | E. coli | TAG | Isaacs | |
89189 | pMaRSC | methionyl-tRNA synthetase (L274GMmMetRS) | murine mutant | azidonorleucine (Anl) | Mammalian | ATG | Tirrell | |
62598 | pKPY514 | phenylalanyl-tRNA synthetase subunit | E. coli | p-azido-L-phenylalanine (Azf) | C. elegans | ATG | Tirrell | |
62599 | pKPY197 | phenylalanyl-tRNA synthetase (CePheRS) | C. elegans | p-azido-L-phenylalanine (Azf) | C. elegans | ATG | Tirrell | |
71403 | pCMV-DnpK | pyrrolysyl-tRNA-synthetase | Methanosarcina barkeri | N6‐(2‐(2,4‐dinitrophenyl)acetyl)lysine (DnpK) | E. coli and Mammalian | TAG | Ai | |
71404 | pMAH2-CageCys | leucyl-tRNA-synthetase | E. coli | photocaged cysteine | Mammalian | TAG | Ai | |
73545 | pEvol-pAcFRS.1.t1 | pAcFRS.1.t1 | E. coli | p-acetyl-l-phenylalanine (pAcF) | E. coli | TAG | Isaacs | |
82417 | pUltra-sY | sY-specific aaRS | Methanocaldococcus jannaschii | sulfotyrosine (sY) | E. coli and rice | TAG | Liu | |
85484 | pDule-tfmF A65V S158A | tri-fluoromethyl-phenylalanine synthetase | Methanocaldococcus jannaschii | family of 19F-UAAs | E. coli | TAG | Mehl | |
85494 | pDule-pCNF | para-cyanophenylalanine synthetase | Methanocaldococcus jannaschii | azidoPhenylalanine | E. coli | TAG | Mehl | |
85495 | pDule2-pCNF | para-cyanophenylalanine synthetase | Methanocaldococcus jannaschii | azidoPhenylalanine | E. coli | TAG | Mehl | |
85496 | pDule-Tet2.0 | Tetrazine2.0 tRNA synthetase | Methanocaldococcus jannaschii | Tetrazine 2.0 | E. coli | TAG | Mehl | |
85497 | pDule2-Tet2.0 | Tetrazine2.0 tRNA synthetase | Methanocaldococcus jannaschii | Tetrazine 2.0 | E. coli | TAG | Mehl | |
85498 | pDule-3-nitroTyrosine (5B) | 3NY (5B) synthetase | Methanocaldococcus jannaschii | 3-nitroTyrosine | E. coli | TAG | Mehl | |
85499 | pDule2-3-nitroTyrosine (5B) | 3NY (5B) synthetase | Methanocaldococcus jannaschii | 3-nitroTyrosine | E. coli | TAG | Mehl | |
85500 | pDule-IBBN (G2) | IBBN (G2) synthetase | Methanocaldococcus jannaschii | 4-(2′-bromoisobutyramido)-phenylalanine (IBBN) and structurally analogous amino acids | E. coli | TAG | Mehl | |
85501 | pDule2-IBBN (G2) | IBBN (G2) synthetase | Methanocaldococcus jannaschii | 4-(2′-bromoisobutyramido)-phenylalanine (IBBN) and structurally analogous amino acids | E. coli | TAG | Mehl | |
85502 | pDule-para-aminoPhe | pAF synthetase | Methanocaldococcus jannaschii | para-aminoPhe | E. coli | TAG | Mehl | |
85503 | pDule2-para-aminoPhe | pAF synthetase | Methanocaldococcus jannaschii | para-aminoPhe | E. coli | TAG | Mehl | |
91705 | pSupAR-MbPylRS(DiZPK) | pyrrolysyl-tRNA synthetase | Methanosarcina barkeri | photocrosslinkers DiZPK, DiZSeK, or DiZHSeC | E. coli | TAG | Chen | |
91706 | pCMV-MbPylRS(DiZPK) | pyrrolysyl-tRNA synthetase | Methanosarcina barkeri | photocrosslinkers DiZPK, DiZSeK, or DiZHSeC | Mammalian | TAG | Chen | |
92047 | pCOTS-pyl-GFP(35TAG) | Pyrrolysyl tRNA synthetase | Methanosarcina mazei | | cyanobacterial | TAG | Alfonta | |
92048 | gCOTS-pyl | Pyrrolysyl tRNA synthetase | Methanosarcina mazei | | cyanobacterial | TAG | Alfonta | |
99222 | pTECH-chPylRS(IPYE) | Pyrrolysyl tRNA synthetase | Chimeric | p-iodo-L-phenylalanine | E. coli | TAG | Soll | |
104069 | pTECH-chAcK3RS(IPYE) | AcK3RS | Chimeric | Nε-acetyl-L-lysine | E. coli | TAG | Liu | |
104070 | pTECH-MbAcK3RS(IPYE) | AcK3RS | Methanosarcina barkeri | Nε-acetyl-L-lysine | E. coli | TAG | Liu | |
104071 | pTECH-MmAcK3RS(IPYE) | AcK3RS | Methanosarcina mazei | Nε-acetyl-L-lysine | E. coli | TAG | Liu | |
104072 | pTECH-MbPylRS(IPYE) | Pyrrolysyl tRNA synthetase | Methanosarcina barkeri | m-iodo-L-phenylalanine | E. coli | TAG | Liu | |
104073 | pTECH-MmPylRS(IPYE) | Pyrrolysyl tRNA synthetase | Methanosarcina mazei | m-iodo-L-phenylalanine | E. coli | TAG | Liu | |
105829 | pIRE4-Azi | Azi-tRNA synthetase (EAziRS) | humanized | p-Azido-phenylalanine (Azi) | Mammalian | TAG | Coin | |
105830 | pNEU-hMbPylRS-4xU6M15 | Pyrrolysyl tRNA synthetase | Methanosarcina barkeri | Pyl-like click amino acids, tRNA M15 | Mammalian | TAG | Coin | |
113644 | pRF0G-Tyr | tyrosine tRNA synthetase | Methanococcus jannaschii | tyrosine | E. coli | TAG | Barrick | |
113645 | pRF0G-IodoY | iodotyrosine tRNA synthetase | Methanococcus jannaschii | iodotyrosine | E. coli | TAG | Barrick | |
122650 | Mm-PylRS-AF/Pyl-tRNACUA | Pyrrolysyl tRNA synthetase | Methanosarcina mazei | trans-cyclooct-2-ene-lysine (TCOK) | Mammalian | TAG | Hang | |
Browse Strains
The table below highlights bacterial strains that have been modified to enhance non-standard amino acid incorporation.
ID | Strain | Description | PI | |
---|---|---|---|---|
48999 | C321 | all TAG sites changed to TAA | Church | |
48998 | C321.ΔA | all TAG sites changed to TAA, RF1 function removed | Church | |
49018 | C321.ΔA.exp | all TAG sites changed to TAA, RF1 function removed, MutS restored to decrease the mutation rate | Church | |
98564 | C321.Ub-UAG-sfGFP | all TAG sites changed to TAA, RF1 function removed, with Ubiquitin-UAG-sfGFP reporter | Church | |
98565 | C321.ΔClpS.Ub-UAG-sfGFP | all TAG sites changed to TAA, RF1 function removed, ClpS inactivated, with Ubiquitin-UAG-sfGFP reporter | Church | |
68306 | C321.ΔA | all TAG sites changed to TAA, RF1 function removed, deletion of SerB to maintain sufficient levels of Sep in the cytoplasm for protein synthesis | Rinehart | |
73581 | C321.deltaA (Isaacs lab) | all TAG sites changed to TAA, RF1 function removed | Isaacs | |
87359 | C321.∆A.opt | all TAG sites changed to TAA, RF1 function removed, improved doubling time | Church | |
69493 | MCJ.559 | TAG sites, RF1 function removed, genomic deletions for improved incorporation | Jewett | |
69495 | rEc.13.delA | 13 amber sites changed to TAA, RF1 function removed | Jewett |
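The "TAG sites changed to TAA" recoding described in the table can be illustrated at the sequence level. The sketch below is a toy model only: it assumes each open reading frame is given as an in-frame DNA string, the helper names and gene sequences are hypothetical, and it says nothing about how the actual strains were engineered.

```python
def recode_amber_stop(orf: str) -> str:
    """Swap a terminal TAG stop codon for the synonymous TAA stop codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if orf.endswith("TAG"):
        return orf[:-3] + "TAA"
    return orf  # ORFs ending in TAA or TGA are left untouched


def count_in_frame_amber(orf: str) -> int:
    """Count TAG codons in the reading frame (out-of-frame 'TAG' strings are ignored)."""
    orf = orf.upper()
    return sum(orf[i:i + 3] == "TAG" for i in range(0, len(orf) - 2, 3))


# Hypothetical mini "genome": geneB contains TAG only out of frame (ATG CTA GGT TGA).
genome = {"geneA": "ATGAAATAG", "geneB": "ATGCTAGGTTGA"}
recoded = {name: recode_amber_stop(seq) for name, seq in genome.items()}

print(recoded["geneA"])  # ATGAAATAA
print(recoded["geneB"])  # ATGCTAGGTTGA (unchanged)
print(sum(count_in_frame_amber(seq) for seq in recoded.values()))  # 0
```

Once no in-frame TAG codons remain, the codon is free to be reassigned, which is the rationale for also removing RF1 function in several of the strains listed above.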
Browse Target Plasmids
The table below highlights plasmids that contain genes with modified codons for unnatural amino acid incorporation.
ID | Plasmid | Gene | Expression Type | PI | |
---|---|---|---|---|---|
105666 | pBad-CA TAG20 | carbonic anhydrase II | Bacterial Expression | Mehl | |
105667 | pBad-CA TAG93 | carbonic anhydrase II | Bacterial Expression | Mehl | |
105668 | pBad-CAts TAG97 | carbonic anhydrase II | Bacterial Expression | Mehl | |
105836 | pBad-CA TAG126 | carbonic anhydrase II | Bacterial Expression | Mehl | |
105837 | pBad-CA TAG186 | carbonic anhydrase II | Bacterial Expression | Mehl | |
105838 | pBad-CA TAG233 | carbonic anhydrase II | Bacterial Expression | Mehl | |
105839 | pBad-HPII | hydroperoxidase II catalase | Bacterial Expression | Mehl | |
105843 | pBad-HPII TAG283 | hydroperoxidase II catalase | Bacterial Expression | Mehl | |
105844 | pBad-HPII TAG348 | hydroperoxidase II catalase | Bacterial Expression | Mehl | |
105845 | pBad-HPII TAG568 | hydroperoxidase II catalase | Bacterial Expression | Mehl | |
105846 | pBad-HPII TAG206 | hydroperoxidase II catalase | Bacterial Expression | Mehl | |
105847 | pBad-HPII TAG415 | hydroperoxidase II catalase | Bacterial Expression | Mehl | |
105848 | pBad-HPII TAG392 | hydroperoxidase II catalase | Bacterial Expression | Mehl | |
85483 | pBad-sfGFP 150TAG | sfGFP 150TAG | Bacterial Expression | Mehl | |
85482 | pBad-sfGFP | sfGFP | Bacterial Expression | Mehl | |
82501 | pGLO-GFP-3UAG | GFP-3UAG | Bacterial Expression | Liu | |
82500 | pGLO-GFP-1UAG | GFP-1UAG | Bacterial Expression | Liu |