Engineering RNA Therapeutics with Foundry Scale
Today we're talking about R&D for RNA therapeutics. Why do we want Ginkgo scale for engineering RNA? How much can we improve the design of an RNA sequence with AI models trained on large datasets?
Let's take a look at some recent results from the foundry showing improved mRNA stability in a cell-based assay and a mouse model.
Transcript
Ginkgo is an R&D partner for RNA therapeutics. You're trying to discover and optimize new RNA medicines. We want to help you do that.
Here in the Ginkgo foundry, we love biology and we can't get enough of it. We've seen how scale can make biology easier to engineer. So we've invested heavily in assets and capabilities for generating biological big data. We think large datasets, often combined with AI models, are the right way to take on the hardest problems in biotech.
Let's take a look at how this works for a particularly hard problem: stability for an RNA therapeutic. You've got an RNA molecule that needs to deliver a therapeutic payload. You're thinking about safety and efficacy, that therapeutic index, and manufacturing.
RNA stability hits all of these things, so it's probably somewhere on your list of priorities in between "very important" and "existential." The good news is that an RNA molecule offers plenty of opportunities to optimize for stability independently of the payload. So we have the design room to get it right.
Let's take a look at the sequence of a typical mRNA. You've got a coding sequence for your therapeutic payload. You've got a 5' region upstream that controls things like expression strength and tissue specificity. And you've got a 3' region downstream that tends to have a lot of influence on the stability and degradation rate.
With the power of the Ginkgo foundry, we can design and build many RNA sequences in parallel, letting us explore sequence space efficiently and in an unbiased way. So, for this dataset, we built about 200,000 different sequence variants and introduced them into a cell line to measure how long they last.
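A pooled assay like this is typically read out by sequencing variant abundances over a time course and fitting a decay rate to each variant. The analysis details here aren't public; as a rough sketch, assuming first-order decay and hypothetical, pre-normalized count data, a per-variant half-life could be estimated with a log-linear fit:

```python
import math

def estimate_half_life(timepoints_h, norm_counts):
    """Fit log-linear first-order decay; return half-life in hours.

    timepoints_h: sampling times in hours
    norm_counts: variant abundance at each time, normalized to t=0
    """
    # Ordinary least squares on ln(abundance) vs. time: slope = -k
    xs = timepoints_h
    ys = [math.log(c) for c in norm_counts]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / \
            sum((x - x_mean) ** 2 for x in xs)
    k = -slope                 # first-order decay constant (1/h)
    return math.log(2) / k     # t_half = ln(2) / k

# Hypothetical variant decaying with a ~6 h half-life
times = [0, 6, 12, 24]
counts = [1.0, 0.5, 0.25, 0.0625]
print(round(estimate_half_life(times, counts), 1))  # → 6.0
```

Repeating a fit like this across every barcoded variant in the pool is what turns one sequencing experiment into 200,000 stability measurements.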
That data became the input for an AI model, which generated the sequences you see here. Down here is a benchmark sequence that is used commonly for RNA constructs. It degrades quickly, with a half-life of only six hours. We see a lot of constructs with much better stability. The best designs have more than double that half-life.
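To put those half-lives in perspective: under first-order decay, the fraction of RNA remaining at time t is 2^(-t / t_half), so a doubled half-life compounds into much larger differences at later time points. A quick sketch comparing the 6 h benchmark against a hypothetical 12 h design:

```python
def fraction_remaining(t_hours, half_life_hours):
    # First-order decay: N(t) / N0 = 2^(-t / t_half)
    return 2 ** (-t_hours / half_life_hours)

for t in (6, 12, 24, 48):
    benchmark = fraction_remaining(t, 6)   # 6 h half-life benchmark
    design = fraction_remaining(t, 12)     # hypothetical 12 h design
    print(f"{t:>2} h: benchmark {benchmark:.4f}, "
          f"design {design:.4f}, ratio {design / benchmark:.0f}x")
```

By 48 hours the more stable design retains 16x as much RNA. And since cumulative exposure (the area under the decay curve) scales linearly with half-life, doubling the half-life roughly doubles the total payload delivered at a matched dose.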
And this is, honestly, not too surprising. We know the 3' UTR sequence changes stability and we know that many of the standard options have not been optimized. So naturally, when you put this kind of scale into your data generation, you'll find that there's plenty of room for improvement.
Somewhere on a plot like this is the performance threshold for efficacy in your application. RNA molecules that degrade faster will deliver less of your therapeutic payload. That means you might need a higher dose, with the potential for more untargeted, systemic effects and toxicity.
Now you might ask, are we solving the real problem here? Are we optimizing for stability in cell lines when what really matters is stability in patients? In this case, the improved stability does translate into an animal model. The optimized RNAs were packaged into lipid nanoparticles and delivered into mice. The payload was the bioluminescent protein luciferase, which produces a light signal we can quantify over time.
Here's the mouse data. The optimized RNA remained functional longer, with 30-100x more activity depending on the time point, including significantly more function at 7 days. One week after injection, we still see luciferase activity in the Ginkgo Design mice, while the Benchmark mice are blank.
And this was true for three different sequence designs. The large datasets that we generated in cell lines were capturing at least some of the sequence features that are relevant for stability in a live mouse. It's not just about screening a library for a single good clone, it's about building a resource that can generate sequence designs consistently, adapted to the context of your therapeutic payload.
Now, if we can just cut back to the original pooled stability assay for a minute.
You might ask, what if you had worked with a smaller dataset? What if you had been satisfied with Design B, which is already a pretty impressive improvement, only to find that Design A alone had the therapeutic index you needed for the clinic?
Now I don't mean to scare you here. Well OK I do a little bit. Because marketing. And of course there's no way for me to know the stability requirements for your application. Maybe B class designs are good enough for your patients.
But working at Ginkgo and seeing these gigantic datasets combined with AI has really changed the way that I look at the hardest problems in biotech. We are starting to get a sense for how far we can really push biological optimization with scale. Often, it’s much farther than we thought in the era of small batch engineering.