If you open any biology textbook to the section on proteins, you will learn that a protein is made up of a sequence of amino acids, that the sequence determines how the chain of amino acids folds into a compact structure, and that the folded protein’s structure determines its function. In other words, sequence encodes structure, and function derives from structure.
But the textbooks may have to be rewritten. As Rohit Pappu and two colleagues explain in a perspective published Sept. 20 in Science, a large class of proteins doesn’t adhere to the structure-function paradigm. Called intrinsically disordered proteins, these proteins fail to fold, either in whole or in part, and yet they are functional.
We sat down recently with Pappu, PhD, professor of biomedical engineering and director of the Center for Biological Systems Engineering at Washington University in St. Louis, to catch up on the latest science.
When did people realize some proteins violate the rules?
It’s been about 20 years. The earliest clue was that some protein segments didn’t show up in X-ray crystallography or NMR studies, the standard ways of studying protein structure.
By the 1990s, people who studied how proteins interact with DNA had noticed that the proteins often change shape when they bind it. In the absence of DNA, all the standard probes for protein structure reported that the proteins were floppy, and yet when the protein formed a complex with DNA, it had a well-defined three-dimensional structure.
How did you first come to hear about them?
By serendipity. When I was leaving Johns Hopkins University to come to Washington University in 2001, I had a meeting with Keith Dunker of the Indiana University Schools of Medicine and Informatics, one of the founding fathers of this field. It was pure chance.
The meeting started awkwardly because Keith was wondering who I was and I had never heard of him. I was working on a polymer physics description of unfolded proteins, and it turned out he had just written an 80-page review paper on intrinsically disordered proteins.
“Every time you talk to people in the back alleys of protein science,” he said, “they tell you their proteins are very flexible or highly dynamic, and this dynamism is important for function.”
So Keith did two things. He synthesized all of the information then known about these flexible, highly disordered proteins. And, together with his colleague Vladimir Uversky, he asked if it was possible to predict which sequences would be incapable of folding autonomously.
With the help of computer scientists who taught him how to look for patterns in high-dimensional spaces, he learned that 11 out of the 20 amino acids predispose sequences toward being disordered. Today there are about 20 predictors of disorder.
So when I heard this story I thought, “OK, either this is absolutely crackers or it is going to be transformative. I’m going to take a bet on transformative because I find what he’s saying compelling.”
So during my first two years at Washington University I started to devour the literature. I think I scared a lot of people here who weren’t sure they had hired the person they thought they were hiring.
What percentage of proteins are intrinsically disordered?
It goes by kingdoms. In bacteria and other prokaryotic organisms, these numbers are pretty small. They’re about 5 percent of the proteome, the entire set of proteins made by an organism. But if you go to eukaryotes or multicellular organisms, then the numbers get to 30 or 40 percent of the entire proteome.
But if you ask what percentage of sequences that make up the signaling proteome — proteins that are busy passing messages to other proteins — are intrinsically disordered, then the numbers jump up to 60 to 70 percent.
There seems to be a division of responsibilities. Structured proteins take part in catalysis and transport. Intrinsically disordered proteins are important for signaling and regulation.
Why are disordered proteins involved in signaling and regulation?
I think there are two logical reasons. One is that complexes involving intrinsically disordered proteins are short-lived, and the other is that they typically bind many molecules rather than just one.
If a molecule cannot fold except in the context of a complex, then some of the energy used for folding must come from intermolecular interactions. And if the molecule has taken out an energy loan, the complex that forms is not going to be very stable or long-lived.
You’re combining high specificity (because the protein will only fold when it recognizes the molecule with which it forms a complex) with low overall affinity (because the complex is not very stable).
The many-to-one interactions arise because disordered proteins typically function through short amino acid stretches instead of large protein-protein interfaces. So a single polypeptide stretch can interact with multiple targets. One motif talks to one protein, and a second motif talks to another protein, but through the chain they can communicate with each other.
That’s why these molecules happen to be at hubs within networks. They’re trafficking information through networks like the air traffic control tower in an airport hub.
Because most of their functions are carried out by these very short motifs, they are capable of coordinating large amounts of information that are disparate in nature. You get many things happening at the same time.