Research

My goal is to improve the technology behind computer-aided drug design

Atomistic simulations for drug discovery

Much of my research revolves around atomistic simulations of biomolecules, which include proteins, drugs, solvent (e.g. water), and other small molecules that are of biological interest. The simulations model the forces between atoms to predict configurations and dynamical processes. These in turn can be used to estimate quantities that can be verified (or falsified) experimentally. The simulations are atomistic in resolution, giving us an unprecedented level of detail on the systems in question. We can use them to generate hypotheses or to address questions like ‘how does this particular drug bind?’ and ‘how does this particular protein function?’.

Despite being simplified models of the atomistic world, the simulations are highly complex and time-intensive. For a typically sized protein immersed in water, the simulations have to evaluate the forces between tens of thousands of atoms over hundreds of thousands of iterations. Depending on the system and the questions one is trying to answer, simulations can take days to weeks. Despite the increasing power of computer processors and smarter use of hardware, sampling the important states of the biomolecules remains a challenge.
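To give a flavour of what "evaluating the forces over many iterations" means, here is a toy sketch in Python. It is only an illustration under stated assumptions: three atoms interacting through a Lennard-Jones potential in reduced units, propagated with the velocity Verlet integrator. The function names `lj_forces` and `velocity_verlet` are made up for this example and do not come from any simulation package; production codes use far richer force fields, periodic boundaries, and heavily optimised force evaluation.

```python
import numpy as np


def lj_forces(pos, epsilon=1.0, sigma=1.0):
    """Return (forces, potential energy) for all atom pairs.

    Toy Lennard-Jones model in reduced units; an assumption for
    illustration, not a realistic biomolecular force field.
    """
    i, j = np.triu_indices(len(pos), k=1)        # every unique pair (i < j)
    rij = pos[i] - pos[j]                        # vector from atom j to atom i
    r2 = np.sum(rij ** 2, axis=1)                # squared pair distances
    sr6 = (sigma ** 2 / r2) ** 3                 # (sigma / r)^6
    energy = np.sum(4.0 * epsilon * (sr6 ** 2 - sr6))
    # Force on atom i from atom j: 24*eps*(2*(s/r)^12 - (s/r)^6) / r^2 * rij
    fij = (24.0 * epsilon * (2.0 * sr6 ** 2 - sr6) / r2)[:, None] * rij
    forces = np.zeros_like(pos)
    np.add.at(forces, i, fij)                    # accumulate force on i
    np.add.at(forces, j, -fij)                   # equal and opposite force on j
    return forces, energy


def velocity_verlet(pos, vel, mass=1.0, dt=0.001, n_steps=1000):
    """Propagate positions and velocities with the velocity Verlet scheme."""
    forces, _ = lj_forces(pos)
    for _ in range(n_steps):
        vel = vel + 0.5 * dt * forces / mass     # first half-step velocity update
        pos = pos + dt * vel                     # full-step position update
        forces, _ = lj_forces(pos)               # forces at the new positions
        vel = vel + 0.5 * dt * forces / mass     # second half-step velocity update
    return pos, vel


# Three atoms on a (nearly) equilateral triangle, initially at rest.
start = np.array([[0.0, 0.0, 0.0], [1.2, 0.0, 0.0], [0.6, 1.0392, 0.0]])
final_pos, final_vel = velocity_verlet(start, np.zeros_like(start))
```

Even in this toy, the cost of each iteration grows with the number of atom pairs, which hints at why systems of tens of thousands of atoms are so expensive to simulate.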

Traditionally, the two main problems my field has grappled with are 1) improving the accuracy of the modelled atomistic forces, and 2) improving the thoroughness of the sampling. Much of my research has been on the latter, with a particular focus on enhancing the sampling of water in buried protein cavities, especially in the context of protein-ligand binding free energy calculations. Binding free energies are a measure of how strongly one molecule attaches to another, so they are really important quantities to be able to calculate to aid drug discovery. Experimentally, binding free energies are obtained from measured equilibrium constants and inhibition constants.
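The link between a measured equilibrium constant and a binding free energy can be made concrete with a short calculation. The sketch below assumes the standard thermodynamic relation ΔG° = RT ln(Kd / c°), with a 1 M standard concentration and a temperature of 298.15 K; the function name `binding_free_energy` is hypothetical and chosen only for this example.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature=298.15, standard_conc=1.0):
    """Standard binding free energy (kcal/mol) from a dissociation constant.

    Assumes dG = RT * ln(Kd / c0), with Kd and c0 both in mol/L.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature * math.log(kd_molar / standard_conc)


# A 1 nM binder corresponds to roughly -12.3 kcal/mol at 298.15 K.
dg_nanomolar = binding_free_energy(1e-9)

# A tenfold change in Kd shifts the free energy by about 1.36 kcal/mol.
dg_shift = binding_free_energy(1e-8) - binding_free_energy(1e-9)
```

The second number is why accuracy matters so much here: an error of little more than 1 kcal/mol in a calculated free energy already corresponds to an order of magnitude in predicted affinity.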

I am also interested in reducing the barriers to using atomistic simulations. They require expertise to set up, run, and analyse. One large barrier is the complexity of the software we use, a problem that is compounded by the frequent augmentations and modifications that are made to the software as the science progresses. This in turn raises issues around the sustainability of the software itself. At Schrodinger, I work a lot on the FEP+ package, which aims to be the most accurate software for calculating binding free energies as well as the most robust and user-friendly. Prior to joining Schrodinger, I strove to make all my software easy to use, free, open-source, and written in a readable way. I developed methods for ProtoMS and worked on tools for OpenMM.

Another barrier to using atomistic simulations is knowing how to make reliable inferences from the large and complex data that are produced. It is important to ask the right questions of the data, and to have the best tools to address them. To this end, I'm very interested in statistics (with a Bayesian flavour) and machine learning.

We are entering an exciting phase of computer-aided drug design. Modern calculation tools, like FEP+, have demonstrated that atomistic simulations really can drive drug discovery projects forward and make the process more efficient. The central questions of the field no longer centre on "do these methods work?", but on "how can we make these methods even faster?". The faster we make our methods, the larger the volume of chemical space that can be rapidly assayed in silico. As more and more binding free energy predictions are made, the possibility grows of training machine learning models that can side-step more and more of these expensive simulations.