Computational Happiness, Delayed

We would all like to compute our way to happiness in this business, and for early-stage drug discovery that would mean being able to predict the binding affinities of small molecules to target proteins. Such predictions would allow us to search for hits and leads without actually having to prepare the protein or any of those small molecules until we settle on some good candidates. So you can see the appeal! One roadblock could well be that such computations turn out to be so resource-intensive that it would be easier to do the screening out here in the physical world, but I think a lot of us figure that advances in hardware and software will come to our rescue there, one way or another. But first you have to get this virtual screening to work reliably.

It’s not like we don’t do this already – pretty much everyone does some sort of virtual screening, but it’s an art form with uncertain outcomes. That is, sometimes it provides some really useful results, and sometimes it strikes out – and there’s no way to know up front which of those domains you’re in. Some binding sites are more computationally tractable than others, and for that matter some small molecule ligand candidates are, too. I’ve recently written about some of the challenges here, and this 2018 paper is well worth going back to as well.

One of the things that people outside the field find surprising is that even when we have X-ray crystallography data showing various small molecules bound to a particular site in a protein, you can’t necessarily “eyeball” them and pick out the ones with the best binding constants. Part of that is because you don’t know the entropic gains or losses of that binding event just by looking at the end result in an X-ray structure, and part of it is that we can have trouble evaluating even the enthalpic contributions of the final interactions that we do see. What computational screening gives you, though, is sort of an “imaginary X-ray structure”, a picture of a given ligand sitting in a given binding site, so evaluating how good it is (and doing that reliably a huge number of times as quickly as possible) is the whole game. 

A technique that people are very interested in applying in this area is some sort of machine learning, as you’d imagine (plenty of references coming up below). We’ve recently seen some really impressive advances in predicting protein structure (and even protein-protein binding) through such techniques, using the vast pile of empirical data found in the Protein Data Bank. But as vast as protein space is, small-molecule chemical space is even larger and more varied, and small molecules can partake of interactions (such as halogen bonds, highlighted here the other day) that regular proteins don’t normally use. So although getting protein structure predictions through machine learning was not exactly straightforward, applying those techniques to protein-ligand binding is even less so. But just as protein folding prediction switched over the years from “Let’s compute this from first principles” to “Let’s hunt for patterns in the proteins whose structures we already know”, some of the hope for the small-molecule binding problem has been along those same lines. Can we use the (also quite large, although not really large enough) data set of bound ligands to find useful rules and approximations?

This new paper has some sobering thoughts about that task. It uses the PDBbind database, a well-curated collection of thousands of ligand-bound protein structures, further grouped into categories (such as a set for which several ligands are known for the same protein site). As mentioned, it still isn’t really comprehensive, and there’s room to question its coverage compared to other such data sets, but it’s still one of the best places to try out your machine-learning ideas. But here’s the gauntlet, thrown down:

Despite the strong commitment of data scientists, we believe that drug discovery has not really benefited from the already described models for the major reasons that machine (deep) learning scoring functions still generalize poorly and are not readily applicable to virtual screening of large compound libraries. (32) This major discrepancy does not prevent computer scientists to propose novel deep learning models, almost on a monthly basis, usually focusing on the novelty of the DNN architecture but often omitting to answer three questions: (i) Is the apparent performance biased by either the chosen descriptors (43, 44) or the protein–ligand training space? (29, 45) (ii) Does the model generalize well to external test sets? (iii) Has the model captured the physics of intermolecular interactions and does it achieve good predictions for meaningful reasons?

The authors note that even though the databases used have continued to expand, the accuracy of machine-learning prediction of compound binding seems to have plateaued for a while now – independent of the neural net architecture, the size (and provenance) of the training set, and so on. What’s more, explicitly describing various interactions doesn’t seem to improve the results when compared to simpler approximations, and that’s a flag that something isn’t quite right, either.
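That generalization question can be made concrete with a toy experiment. The sketch below uses purely synthetic data (made-up protein IDs and affinities – nothing here comes from PDBbind or from the paper) to show how a “model” that merely memorizes per-protein affinity averages looks deceptively good under a random train/test split, but falls apart when whole proteins are held out of training:

```python
import random
from statistics import mean

random.seed(0)

# Synthetic "binding data": each affinity is a per-protein base value plus
# noise, so a memorizing model can exploit protein identity. Toy data only.
proteins = {f"prot{i}": random.uniform(4.0, 10.0) for i in range(20)}
data = [(p, base + random.gauss(0, 0.3))
        for p, base in proteins.items() for _ in range(10)]

def fit_predict(train, test):
    """'Model' = predict the mean training affinity for the same protein,
    falling back to the global training mean for unseen proteins.
    Returns mean absolute error on the test set."""
    by_prot = {}
    for p, y in train:
        by_prot.setdefault(p, []).append(y)
    global_mean = mean(y for _, y in train)
    preds = [mean(by_prot[p]) if p in by_prot else global_mean
             for p, _ in test]
    return mean(abs(yhat - y) for yhat, (_, y) in zip(preds, test))

# Random split: nearly every test protein also appears in training.
shuffled = random.sample(data, len(data))
mae_random = fit_predict(shuffled[50:], shuffled[:50])

# Protein-held-out split: test proteins are entirely unseen in training.
held_out = {f"prot{i}" for i in range(5)}
train = [d for d in data if d[0] not in held_out]
test = [d for d in data if d[0] in held_out]
mae_heldout = fit_predict(train, test)

print(f"random-split MAE:     {mae_random:.2f}")
print(f"held-out-protein MAE: {mae_heldout:.2f}")
```

The gap between those two error numbers is the kind of diagnostic the authors are asking model builders to report: a random split lets protein- or ligand-level memorization masquerade as learned binding physics, and only a split that holds out whole proteins (or ligand series) tests whether the model has captured anything transferable.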

A problem with the PDBbind dataset is that it contains relatively few examples of single proteins with multiple different ligands bound to them, and even fewer examples of single ligands that have been found to bind multiple proteins. As you might well imagine, that really affects the results of training: for this reason, the team was unable to escape the protein- and/or ligand-dependent nature of their results. There just aren’t enough data points to work with. Fixing this “. . . will necessitate a coordinated effort from the drug design community and research financing agencies to solve a wide array of protein–ligand structures in which the same target is repeatedly pictured with different ligands of various affinities, and vice-versa.” So not only are we not quite there with machine-learning virtual screening, we probably don’t even have the data yet to get there at all. There’s nothing impossible about fixing this, at either level, but you know what they say: the first step is to admit that you have a problem. . .
