The Golden Age of Protein Thermostability
Enzymes are always amazing but sometimes they can be a little squishy. Let’s ask the glowing green raccoon what we should do about that.
Transcript
Thermostability is underrated. It's underappreciated as a property of a good enzyme. Which is kind of ironic because you could argue that we owe all of modern biotechnology to a thermostable enzyme. I'm talking about Taq polymerase, the thing that made it practical to replicate DNA in a test tube.
If you’ve studied biology, you probably know the story of the invention of PCR. Kary Mullis was tripping on LSD back in the 80s when a glowing green raccoon on an orange motorcycle appeared to him in a vision to reveal the secret [1]. Or something like that. That part's not important.
What’s important is the thermostability of Taq polymerase. Because PCR requires cycling a sample through high temperatures to separate the DNA strands, then lower temperatures to replicate them, it needs an enzyme that can withstand essentially boiling water. An ordinary DNA polymerase is destroyed by that heat, but Taq polymerase can handle it.
The idea of using enzymes to replicate DNA had existed earlier but thermostability made it practical. It brought the power of the polymerase to the real world. Without it, the process needed precise conditions that could only be achieved at a small scale. With it, any high school student can get a tube of Taq and start thermocycling.
Today, we're in a good position to bring that practicality to other enzymes. Almost every application can benefit. In some cases, stabilized enzymes will make possible entirely new products that wouldn't work with flimsier enzymes. Maybe you're running a process that absolutely has to happen at a high temperature. In other cases, thermostability just makes an existing reaction more efficient. Higher temperatures mean faster rates. Stable enzymes mean less money spent replacing enzymes.
So how do we make it happen? It's the combination of AI and lab automation that we bring together in the Ginkgo Bioworks foundry. Even that glowing green raccoon could not have foreseen how much this was going to change the game.
At Ginkgo, we can do thermostable enzyme discovery as a service and at scale. We don't have to mount an expedition to Yellowstone National Park to isolate Thermus aquaticus from a hot spring, like Thomas Brock did in 1966 [2]. We can computationally scan hundreds of enzymes from DNA databases. Our in-house database at Ginkgo has more than 2.7 billion unique genes to help with sourcing.
Here's an example from a recent customer project. This was for an industrial partner doing a high volume process that was cost sensitive. They were already using an enzyme, but it was misfolding at high temperatures and breaking down too early. So they either had to supplement with extra enzyme (i.e. spend more) or settle for lower yields (i.e. less profit).
For this project we scanned billions of naturally occurring enzymes to choose 942 different candidates. These came from bacteria, fungi, plants, and animals. Some of them were 99% identical to the enzyme the customer was already using; others were totally different, with as little as 30% sequence identity. In the Ginkgo foundry, we built and tested all of them at high temperature.
Here's what that looks like with enzymes ranked from lowest to highest activity. The customer's previous enzyme is marked here as a point of reference. Seeing all the data like this, a couple of things jump out. First of all, take a look at how many of the enzymes had improved thermostability. About 25% of the candidates were better than the starting enzyme. You might look at that and think "huh, stable enzymes are actually not that hard to find." And I think that's going to be true in a lot of cases. Particularly when there hasn't been a previous effort to engineer for thermostability.
That means you could have half-assed this problem. You could have screened 12 enzymes instead of 942, and probably found some improvement. But the other interesting feature of this curve is the hockey stick shape. The very best enzymes are a lot better than the second tier enzymes. That's what makes a serious discovery effort worth it. When you have the large datasets to find 942 candidates, you need the large foundry infrastructure to actually test them and find the very best.
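To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. It assumes the 12 enzymes in the small screen are picked at random from the same 942 candidates, and it reads "about 25%" as 235 improved candidates; both are illustrative assumptions, not details from the project. Under those assumptions, a 12-enzyme screen finds some improvement roughly 97% of the time but includes the single best enzyme only about 1.3% of the time.

```python
from math import comb

TOTAL = 942    # candidates built and tested in the project
BETTER = 235   # assumption: "about 25%" of 942, rounded down
SAMPLE = 12    # size of the hypothetical small screen


def p_at_least_one(total: int, hits: int, sample: int) -> float:
    """Chance that a random sample (without replacement) contains at least one hit."""
    # Probability of drawing zero hits is C(total - hits, sample) / C(total, sample).
    return 1 - comb(total - hits, sample) / comb(total, sample)


# Chance the small screen contains any enzyme better than the starting one.
p_any_improvement = p_at_least_one(TOTAL, BETTER, SAMPLE)

# Chance the small screen happens to contain the single top enzyme.
p_single_best = p_at_least_one(TOTAL, 1, SAMPLE)

print(f"Some improvement found: {p_any_improvement:.1%}")
print(f"Very best enzyme found: {p_single_best:.1%}")
```

A real small screen would not be a random draw, since you'd pick candidates with some prior knowledge, so treat this as an illustration of the shape of the argument rather than a prediction.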
In this case, the very best means about 5x more activity at higher temperatures. For our customer, that meant they could use 90% less enzyme in their process and get about 10% more product titer. We estimate this nets out to about 1 million dollars more revenue per production run.
The winning enzyme came from a public DNA database, not from Ginkgo's in-house sequence collection. But it was only about 35% similar to known enzymes of this class and it wasn't annotated. In other words, you never would have found this thing in the pre-AI era. And even if you did, you'd never know how good it really was without being able to test it against the field.
That’s why industrial biotech needs enzyme discovery at scale and as a service. Only you know if your process can benefit from a more stable, more active enzyme. But if you haven't tried a major optimization effort in the last few years, there's probably room for improvement. The golden age of enzyme engineering is driven by AI-guided sequence discovery, AI-guided sequence design, and the ability to build and test large libraries of candidates efficiently.
So if you're an industrial chemist looking to innovate, just think of me as your glowing raccoon. I have appeared on my orange motorcycle to tell you: now is the time to rethink your bioprocess. Go and seek out enzymes with improved thermostability and take your products to the world.
[1] Mullis K. Dancing Naked in the Mind Field. Vintage Books, 2000.
[2] Brock TD, Freeze H. Thermus aquaticus gen. n. and sp. n., a nonsporulating extreme thermophile. J Bacteriol. 1969 Apr;98(1):289-97.