Physics and math were made for each other. Many of the hardest problems in physics have found elegant solutions written in the language of math. Math is also a useful tool for biology, but it has never quite found the same levels of success. We explore why biology is hard to grasp with math and why AI might help humans reach deeper levels of understanding.
Foundry Theory is also available on the Ginkgo Bioworks website
Transcript
Galileo wrote: "[The universe] cannot be read until we have learnt the language and become familiar with the characters in which it is written." [1]
He was talking about mathematics. Galileo, and many others, have noticed a special connection between mathematics and physics. I wonder if there's a similar connection between biology and machine learning?
Physics and math were made for each other: F = ma, E = mc², the Boltzmann distribution. Major advances happen when physicists can look at something that happens in the natural world, describe it with a mathematical equation, and then generalize that equation as a law of nature.
Why is mathematics the language of physics? That's a profound question, and I don't know the answer. But the practical consequences are all around us, because engineers can use those equations to build things.
F = ma by itself isn't enough to land a rocket ship on the moon. But if you know enough math, and how to use it, you unlock all the modern technologies built with physics.
Biology does not seem to work like that. Now I have to be careful about how I say this because mathematical biology totally is a thing and I love it. My PhD is in systems biology and I've spent more than a little time trying to describe biology with math. Sometimes it works, and it can be very beautiful: The Michaelis-Menten model, the Hardy-Weinberg equation.
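For reference, the two results named above are usually written in the following textbook forms. The symbols are the conventional ones rather than anything defined in this talk: v is the reaction rate, [S] the substrate concentration, V_max the maximum rate, K_M the Michaelis constant, and p and q the frequencies of two alleles at one locus.

```latex
% Michaelis-Menten: reaction rate as a function of substrate concentration
v = \frac{V_{\max}\,[S]}{K_M + [S]}

% Hardy-Weinberg: genotype frequencies for two alleles with p + q = 1
p^2 + 2pq + q^2 = 1
```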
But there are large parts of biology that resist being captured in expressions like this. The equations that we do have don't cover the most interesting and important problems. Engineering biology is possible and math is useful for it, but they just don't line up as well as we might like.
Why should this be the case? Why doesn't biology translate easily into the language of math? I can think of three reasons: scale, integration, and variability.
Scale. Math works well for small systems with few interactions. It also works well for large systems, because large numbers give a reliable average. Biological systems often live right in between, with tens to thousands of interacting parts. Too big to be simple, too small to average. This is sometimes called a mesoscale problem.
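The averaging half of this argument can be sketched in a few lines of Python. This is only a toy model under an assumption that is not from the talk: every part is independently "active" with probability 0.5, so the parts do not interact at all. The function names `analytic_cv` and `simulated_cv` are made up for illustration.

```python
import math
import random
import statistics


def analytic_cv(n_parts, p_active=0.5):
    """Relative fluctuation (std / mean) of the number of active parts.

    Toy assumption: each part is independently active with probability
    p_active, so the count is binomial and std / mean = sqrt((1 - p) / (p * N)).
    """
    return math.sqrt((1 - p_active) / (p_active * n_parts))


def simulated_cv(n_parts, p_active=0.5, trials=500, seed=0):
    """Estimate the same quantity by drawing `trials` random systems."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    counts = [
        sum(rng.random() < p_active for _ in range(n_parts))
        for _ in range(trials)
    ]
    return statistics.pstdev(counts) / statistics.mean(counts)


if __name__ == "__main__":
    # Mesoscale sizes: the relative noise is still easy to see.
    for n in (10, 100, 1000):
        print(n, round(analytic_cv(n), 3), round(simulated_cv(n), 3))
    # A macroscopic system: the average is effectively exact.
    print(10**20, analytic_cv(10**20))
```

Under this toy model the relative fluctuation shrinks as 1/sqrt(N): roughly 32% at 10 parts, 10% at 100, and 3% at 1,000, compared with about one part in ten billion at 10^20. Real biological parts interact, so actual systems need not follow this curve. The sketch only shows why averages become reliable at very large N and remain noisy in between.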
Integration. In biology, it often seems like everything interacts with everything else. A living cell has thousands of moving parts with millions of connections between them. Not all of these connections are really important. We can focus on some of them, and disregard others, to find a system that is simple enough to describe with an equation. But it is hard to escape the feeling that we're missing something, that something is lost when we separate a part from the whole.
Variability. Most of what we want to know about biology is subject to evolution and therefore always changing. Ask a biologist for the laws of biology and they'll give you a list of exceptions. I love this about biology and I wouldn't have it any other way. But equations are less useful if you constantly have to rewrite them.
The challenges of scale, integration and variability have annoyed me for my entire career. The metaphor of a language is really perfect for capturing this feeling. Engineering biology with math feels like visiting a world where I don't quite speak the language.
But look how fluently machine learning can handle these challenges. Systems at the scale of 1,000 interacting parts are no problem: input sequences of that size are easy to specify. When biology functions as an integrated network, machine learning models it as an integrated network.
And as for variation, well, biological variation is exactly what we need to train a model. Variation is data. Every individual cell, every species, every DNA sequence is a data point. Evolution and mutation have prepared for us a perfect dataset to learn from.
There is something genuinely beautiful about the way machine learning and biology just kind of fit together. It's our Galileo moment. I feel it and I think other biologists feel it too. It would be easy to get carried away.
I'm not going to get carried away. Biology is still hard. Machine learning isn't magic. Many of us have the opinion that AI is just a black box, something that can solve problems but not really generate insight. Maybe that's true. But then physics is hard too and math isn't magic either.
I don't think there are any biologists yet who really speak AI in the way that the best physicists speak math. In physics, they study math for years before finding that place of insight that Galileo described. Biology is going to be at least that hard. Engineering biology is going to take at least that much work, even if we know the language.
But what if, at last, we really do know the language?
[1] Il Saggiatore, 1623 | wikipedia.org/wiki/The_Assayer