AI Enhances Yeast Protein Production Efficiency, Reducing Drug Development Costs

MIT chemical engineers have used artificial intelligence to improve protein manufacturing in industrial yeast, with the aim of reducing the cost of developing and manufacturing drugs. The team used a large language model (LLM) to analyze the genetic code of the yeast Komagataella phaffii, focusing on its codon usage, which varies among organisms.

The model learned codon patterns in K. phaffii and used them to predict codon sequences suited to protein production, improving the manufacturing efficiency of six proteins, including human growth hormone and a monoclonal antibody used to treat cancer. Better predictive tools could shorten the path from an idea to production, saving time and money, according to J. Christopher Love, a chemical engineering professor at MIT and senior author of the study, which was published in the Proceedings of the National Academy of Sciences.

Yeasts such as K. phaffii play a crucial role in the biopharmaceutical industry, where they are used to produce protein drugs and vaccines. To develop a production process, researchers integrate a modified gene from another organism into the yeast’s genome, optimize its DNA sequence, adjust growth conditions, and purify the final product. This development process accounts for a significant portion of drug commercialization costs.

Currently, these steps are labor-intensive, and the MIT team is exploring machine learning to simplify them and make their outcomes more predictable. In this study, they focused on optimizing the codon sequence of the DNA that encodes a protein. Most amino acids can be encoded by more than one codon, and each codon corresponds to a specific transfer RNA (tRNA) molecule.
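
To give a sense of how large that design space is, here is a minimal Python sketch that counts how many distinct DNA sequences encode the same short peptide under the standard genetic code. The peptide "MKLS" and the helper name count_encodings are illustrative assumptions and are not taken from the study.

```python
from itertools import product
from math import prod

# Standard genetic code, listed in the conventional T, C, A, G order
# for each codon position. "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group the 64 codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def count_encodings(protein):
    """Number of distinct DNA sequences that encode the given peptide."""
    return prod(len(SYNONYMS[aa]) for aa in protein)


# Illustrative peptide: Met (1 codon), Lys (2), Leu (6), Ser (6).
print(count_encodings("MKLS"))  # 72
```

The count is a product over positions, so it grows exponentially with protein length. For a full-length protein it is far too large to test exhaustively, which is why codon choice is handled by heuristics or learned models.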

Instead of merely choosing the most frequent codons in the host organism, the MIT researchers used an encoder-decoder LLM to analyze DNA sequences and learn codon relationships within specific genes. They trained this model with publicly available data on amino acid sequences and corresponding DNA sequences from K. phaffii’s naturally produced proteins.
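
For contrast, the simpler baseline mentioned above, picking the most frequent codon for each amino acid, can be sketched in a few lines of Python. This is not the researchers' model. The function names are hypothetical, and the two short coding sequences are made-up toy data, not real K. phaffii genes.

```python
from collections import Counter
from itertools import product

# Standard genetic code in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_usage(coding_sequences):
    """Count how often each codon appears across in-frame coding sequences."""
    usage = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        if len(cds) % 3 != 0:
            raise ValueError("coding sequence length must be a multiple of 3")
        usage.update(cds[i:i + 3] for i in range(0, len(cds), 3))
    return usage


def most_frequent_codons(usage):
    """Map each observed amino acid to its most frequently used codon."""
    best = {}
    for codon, count in usage.items():
        aa = CODON_TO_AA[codon]
        if aa not in best or count > usage[best[aa]]:
            best[aa] = codon
    return best


def back_translate(protein, table):
    """Encode a protein by always choosing the most frequent codon."""
    missing = sorted(set(protein) - set(table))
    if missing:
        raise ValueError(f"no usage data for amino acids: {missing}")
    return "".join(table[aa] for aa in protein)


# Toy host "genome": two made-up coding sequences (assumption, not real data).
toy_cds = ["ATGAAGCTGAAGTAA", "ATGAAACTGAAGTGA"]
table = most_frequent_codons(codon_usage(toy_cds))
print(back_translate("MKL", table))  # ATGAAGCTG
```

This baseline treats every position independently, so it cannot capture relationships between neighboring or distant codons within a gene. That context is what the article says the encoder-decoder model was trained to learn.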

Once trained, the model was used to optimize codon sequences for six proteins, including human serum albumin and trastuzumab, an antibody used to treat cancer. When inserted into K. phaffii cells, these sequences generally outperformed sequences generated by four commercially available codon optimization tools.

This method’s success highlights the model’s ability to understand the biological “language” of codons and optimize protein production. The researchers have made their model available for others to use with K. phaffii or different organisms. Testing on datasets from various species suggests that species-specific models may be necessary for optimal codon predictions.

Interestingly, the model appeared to learn biological principles it wasn’t explicitly taught, such as avoiding DNA sequences that inhibit gene expression. This suggests the model has captured biophysical and biochemical features of the sequences, which lends further support to its predictions.

The research received support from several MIT initiatives and fellowships, underscoring its potential impact on biopharmaceutical manufacturing efficiency and cost-effectiveness.


Source: MIT News
Read Original:
https://news.mit.edu/2026/new-ai-model-could-cut-costs-developing-protein-drugs-0216
