The majority of functions we associate with living things (growth, reproduction, metabolism, decision-making, cognition, and many others) are made possible by the actions of protein molecules. Proteins, just like cars and other macroscopic machines, need a specific 3D structure in order to perform their functions. Failure to acquire these molecular structures is linked to numerous diseases including Alzheimer’s, Parkinson’s, and various cancers. Yet despite decades of research, we still do not understand how proteins fold up into these structures, or what determines whether the ultimate structure attained is correct or abnormal (as in disease). These questions are not addressed by structure-prediction algorithms such as DeepMind’s AlphaFold. It is crucial that we make progress on these issues if we are to rationally design treatments for misfolding diseases and to predict the evolution of organisms, which is often mediated by changes to protein folding and function (see the section entitled “Physical principles governing evolution of life”).
All proteins are made up of one or more chains of amino acid molecules, which are chemically strung together in sequence. Each amino acid has its own chemical and physical properties, and these properties cause amino acids to interact with one another in various ways (for example, amino acids with opposite charges will attract each other, while “greasy” hydrophobic amino acids will tend to clump together so as to keep out water). It turns out that, for some proteins, these interactions are all that is needed to force the amino acid chain to fold up into the specific 3D structure that allows the protein to function. This astounding process is akin to a string of yarn spontaneously knitting itself into a sweater! But a growing body of research suggests that, for many other proteins, these interactions instead cause the amino acid chain to misfold into non-functional molecular structures, akin to yarn getting tangled up. A major goal of my research is to understand how this conundrum is resolved in the complex cellular environment.
One possible resolution to this issue may lie in the fact that, in addition to folding, a protein molecule needs to be synthesized, that is, its constituent amino acids need to be added one at a time to the protein chain by a machine called the ribosome. It turns out that many proteins can start folding as they are being synthesized, a process known as co-translational folding, and that this can significantly increase the odds that the protein ultimately folds into its correct structure, compared with folding that occurs entirely after synthesis is complete. To understand this, we can go back to our knitting analogy: most knitters unspool a little yarn at a time, rather than unspooling the whole ball of yarn and then beginning to knit, which would lead to a bunch of tangles. Likewise, many proteins benefit by beginning to fold up piece-by-piece as amino acids are being added. In fact, it turns out that there are genetic indicators that affect how quickly these amino acids are added to the protein, which can in turn affect how the protein folds up. This is akin to how dance (analogous to protein folding) is closely linked to rhythm (how quickly amino acids are added). I may have taken this analogy a bit too far and written a musical piece inspired by it (The Dance of the Nascent Chain).
My research aims to develop a detailed molecular picture of this process by combining physics theory, computer simulations, and experiments in vitro and in vivo. For more details, see publications. I am also very interested in how this process is linked to disease and evolution—see the next sections.
Protein misfolding in disease
Creative Commons License: Amyloid plaques alzheimer disease HE stain.jpg by Jensflorian on Wikimedia Commons. https://commons.wikimedia.org/wiki/File:Amyloid_plaques_alzheimer_disease_HE_stain.jpg
As discussed in the previous section, the failure of proteins to fold into their functional structures may cause significant harm to organisms. These misfolded proteins often stick together, or aggregate, to form larger complexes known as oligomers, or insoluble plaques known as amyloids, both of which can be toxic and damage cellular structures as well as tissues and organs. I am very interested in combining simulation and experiment to understand the molecular mechanisms by which misfolding and subsequent aggregation occur. In a recent study (see publications), we focused on the receptor binding domain (RBD) of the SARS-CoV-2 virus (i.e. the coronavirus causing Covid-19). The RBD is the protein that allows the virus to interact with host cell receptors and ultimately enter the cells. We showed that under certain conditions, this protein has a high propensity to fold into an incorrect structure that does not allow it to function (misfolding). Our work generated a detailed atomistic model of what these misfolded states look like, their molecular properties, and how these issues may be resolved if the protein folds co-translationally (see previous section). Future work will examine the possible relevance of this misfolding to the clinical pathology of SARS-CoV-2 infection. These insights are also very important for predicting viral evolution and the emergence of new variants, as viruses are expected to evolve in such a way as to minimize this misfolding in the cell. I also plan to use these simulation and experimental tools to understand the molecular mechanisms underlying other misfolding diseases, such as Alzheimer’s, with the ultimate goal of finding new treatments for these conditions.
Physical principles governing evolution of life
Creative Commons License: PA clan structures 10.png by Thomas Shafee on Wikimedia Commons. https://commons.wikimedia.org/wiki/File:PA_clan_structures_10.png
I am broadly interested in understanding how physics shapes the evolution of life, and its important relationship to disease. All living things emerge through the process of Darwinian evolution by natural selection, in which random genetic changes that happen to improve an organism's fitness get selected for and come to dominate the population. In the case of changes to protein-coding genes, these changes typically affect a protein's folding and thus its function (see the previous sections). Evolution is fundamentally quite different from the process of engineering, as evolution is not goal-directed. All organismal traits (including human intelligence, a leopard's spots, the ability of a virus to infect a cell, the "shapes" of our proteins and DNA molecules, and many others) emerge by chance, and stick around because they benefit the organism's ability to survive and reproduce in its environment. This is quite different from the deliberate planning and design process that underlies the engineering of a city.
But despite this significant difference, evolution and engineering share an important similarity: both take advantage of the laws of physics, and both are constrained by these laws. Many of the same principles that enable the design of buildings, windmills, and other structures are also at the heart of life. For example, Newton's laws and conservation of energy allow us to walk and run, electrical signaling enables us to think and read these sentences, and forces at the microscopic level give our proteins and DNA their functional structures. Likewise, just as physical laws limit the size and weight of a bridge, physics places important constraints on life, which, when violated, can lead to catastrophic failure, i.e. disease.
Through my research, I hope to understand how physics, particularly at the molecular level, constrains and enables the evolution of life, and how these principles crucially relate to human health and disease. A detailed understanding of the physics governing evolution is needed if we are to predict the rise of new coronavirus variants and antibiotic-resistant bacteria, and to tailor cancer chemotherapy treatments to a given patient so as to minimize the risk of resistance and relapse.