MedChemica Levinthal Lecture at Openeye CUP XX 2020
This lecture originally presented at the Openeye 20th User group meeting in Santa Fe,
New Mexico March 2020
In this talk we’re going to discuss how we can improve the practice of medicinal and computational medicinal chemistry.
In retrospect, the ideas in this lecture started over 15 years ago when my manager (Dr Andy Barker, ACS Hero of Medicinal Chemistry 2011) asked me to look at how we could train the next generation of expert medicinal chemists. That indirectly led to the Medicinal Chemistry Primer (https://www.amazon.co.uk/Medicinal-Chemistry-r=1-1). It also led to an appreciation that I didn’t really know, in any objective sense, how to do medicinal chemistry really well, which in turn led to a secondment into the AstraZeneca computational chemistry group. There I developed matched molecular pair methods and an internal online expert system. When I left AstraZeneca in 2012 we started the development of a cross-company collaboration to mine ADMET rules for medicinal chemistry, details of which were published in 2018 (http://pubs.acs.org/doi/abs/10.1021/acs.jmedchem.7b00935). Since then, our interest at MedChemica has been in how to build such systems and integrate them into medicinal chemistry practice. The issues this involves are explored in two publications that are in press: a chapter in Burger’s Medicinal Chemistry and a paper in J. Med. Chem., both of which address the wider impact of AI technologies on drug-hunting practice and the interpersonal aspects of having automated systems as “part of the team”.
Having had three careers, I see parallels between them; the critical difference, however, is that neither synthetic chemistry nor software engineering has to deal with the uncertainty that medicinal chemistry does. There are two types of uncertainty in medicinal chemistry: experimental uncertainty, where you are uncertain of the value an assay returns, and biological uncertainty, where you are uncertain of even the relevance of the assay, i.e. is it even the right protein or cell, and does it have anything to do with the disease process? Obviously the latter is far more difficult to deal with.
The big idea of this talk is that we don’t have to just follow the hype cycle: we can choose our strategies to modulate it, doing things to reduce the hype and despondency and get to the productive use and benefits of AI tools quicker.
I started out looking at the behaviors of experts. And then, if we consider the marks of general AI systems — not just deep neural networks, but the full requirements of a working AI in a domain — we see these characteristics. Aligning the two, of course, we see a lot of commonalities; the two references are key reading in both fields.
I’ve been a medicinal chemistry mentor and generally we start with somebody who
has a PhD or post-doctoral experience in synthetic chemistry and train them up. To
be fully independent on a project takes about 5 years, but most expert medicinal
chemists would say that it’s not a job that could be done by a machine.
Rather smarter than the average medicinal chemist, John von Neumann recognized that, in principle, if you could adequately specify a problem you could at least consider solving it. There is a corollary, of course: if you can’t describe how you’re doing a job, do you really know what you’re doing? This may be an explicit versus tacit knowledge problem, in that some people would claim that they “can just do it”, but if they can’t explain it, they’ll make poor mentors.
So the idea is: create a field manual and use it both to train upcoming medicinal chemists and to structure our ideas on what a real AI system for medicinal chemistry might need and look like.
So, to the question: what is a field guide? I consulted a friend who has spent 25 years in the British military and is now a management coach, and we came up with these critical contents. Three good examples are shown in the slide: a US combined forces first aid manual; for political balance, the Soviet guerrilla warfare manual (which I don’t own); and the British Mountaineering Council’s guide to navigation in the mountains (which I do).
And an equally mocked-up contents page – but we’ll use this to structure the rest of the talk.
Here we start with what medicinal chemistry is trying to achieve. It’s important to understand that patenting is not negotiable if you actually want to generate new therapeutic agents, as opposed to just biological tools. If you don’t have IP protection then even not-for-profits won’t be able to cover their costs, and you will find it almost impossible to get a compound into clinical development. All the other stuff that you might want is peripheral; you need to park your ego to be effective.
Connecting back to the earlier comments that the essential difference of medicinal chemistry is dealing with both biological and experimental uncertainty, the first rule is: always check your evidence. Whether that’s retesting an important measurement, using an orthogonal assay, or testing in a second in vivo system, the more important the implications of a result (especially if it is surprisingly potent), the more essential it is to check the evidence.
After that, avoiding doing stupid things and focusing on the obvious is a lot harder than it seems. Medicinal chemists are creative people, but usually you have to do the straightforward things first before moving on to the more radical ideas. It doesn’t mean you don’t do them – but you do the ‘bread and butter’ things first.
Working from our navigation metaphor, here is a picture of a stunning mountain on
the Isle of Skye in Scotland.
This is Sgùrr nan Gillean in the Black Cuillin range. It’s the remains of a volcanic cone where the top of the mountain stuck up through the glacier when the whole area was covered in an ice sheet during the last ice age. Consequently the top of the mountain is very ragged and unpredictable. The analogy to much SAR is obvious.
Of course, most people never see the tops of the Cuillin, as the weather is terrible; often the peaks are covered in cloud and storms. Again, like a drug hunting project: you have an indication of where the peak might be, but you can’t see it.
And to make things worse, many of the rocks are magnetic, so your compass won’t work – just like in drug hunting, where the methods of guiding yourself forward are unreliable.
Finally in drug hunting, unlike on the Isle of Skye, there is no map. We have to move
forward, creating the map as we go, with the hope that there is a peak of SAR in the
future. It’s that “map making” attitude that really guides much medicinal chemistry action: often you’re moving across pretty flat SAR; there are local peaks and also gullies that you drop into. Much of the time it may feel like you’re moving through marsh, but at least there are no midges.
So the first important task for a chemist is to provide input into the definition of where the project is going as an objective. Over the last two days we heard about the importance of the Target Product Profile, but it was treated as something ‘handed down’ to the chemist, without the chemist having any input. I take a different view: the chemist should be an active participant in TPP definition, asking questions and helping get the best, most realistic specification possible. We’re going to look at three case studies.
The first is a project we consult on. Here the combination of hospital and geriatric patients drives the route of administration (iv) and the safety profile – patients are on multiple other therapies. The upside is a more relaxed view of clearance: patients are on a drip anyway, so up to tid dosing is probably acceptable. Best of all, there is absolute clarity about the disease model, the linkage from cell and PK data to efficacy, and the security of the target in the clinic. One catch is the solubility, which needs not only to be high but to show the right pH profile – it’s no use having a strong base that will crash out at blood pH (7.4).
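The pH dependence can be sketched with the Henderson–Hasselbalch relationship for a monobasic compound; the numbers below (pKa 9.4, 1 µM intrinsic solubility) are purely illustrative, not from the project:

```python
def base_solubility(intrinsic_s, pka, ph):
    """Total solubility of a monobasic compound via Henderson-Hasselbalch:
    S(pH) = S0 * (1 + 10**(pKa - pH)), valid below any salt solubility limit."""
    return intrinsic_s * (1.0 + 10.0 ** (pka - ph))

# Hypothetical strong base: pKa 9.4, intrinsic (neutral) solubility 1 uM
s_gut = base_solubility(1e-6, 9.4, 4.0)    # upper-GI-like pH
s_blood = base_solubility(1e-6, 9.4, 7.4)  # blood pH
print(f"pH 4.0: {s_gut:.2g} M   pH 7.4: {s_blood:.2g} M")
```

The three-orders-of-magnitude drop between pH 4 and pH 7.4 is exactly the “crash out at blood pH” problem.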
In contrast, here is a TPP for an oncology project targeted at Ras mutations. Here the challenges are different: long duration of therapy means once-daily oral dosing for compliance, and as the agents will be used in combination, no DDIs and a clean safety profile. Again, the real challenges are that lung cancer generates brain metastases, and therefore CNS penetration is needed. Also, the actual linkage from the target through the cell models to the disease is poor, as there is always the possibility of feedback mechanisms and escape pathways: you may inhibit the target, but downstream feedback and upregulation can mean the flux through the pathway actually goes up. Obviously some of the risks for this project can’t really be defrayed until compounds get to the clinic.
And here’s an example of what happens if you just go data mining without thinking about and researching the TPP mechanism well enough. Constructing a knowledge graph, this team identified baricitinib as a potential therapy.
Distressingly, the first contra-indication for baricitinib is for those suffering from respiratory infections, and it is actually a viral reactivator. This is not hidden away in some obscure source; it’s in documents that are required to be in the public domain: the European public assessment report for baricitinib. The message here is that data scientists and chemists can’t work on their own; there has to be a dialogue in translating the clinical and biological needs into chemical objectives.
Referring back to our navigation model, the next thing we need after defining where we are going is some way of navigating. Composite measures are popular, but rather than make up some multiparametric score, the one that makes the most sense is predicted dose. This has a key advantage in that it can be communicated easily outside a project, and is relevant. Obviously there are issues: in many cases the PK-PD relationship is unknown, so you first have to get compounds that have enough exposure, and a PD model that you trust, to develop the relationship, and then use that best estimate to go from potency + distribution + PK to a dose estimate. The PD effect may depend on Cmax, Cmin, time over a threshold concentration, or AUC; all can be tested with well-structured experiments.
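As a sketch of how such a dose estimate can be assembled — assuming a Cavg-driven PD effect and a simple steady-state relationship, with all the numbers below being illustrative assumptions rather than any project’s actual model:

```python
def predicted_daily_dose_mg(cl_l_per_h, ic50_nM, mw, fu, fold_cover, f_oral):
    """Rough daily human dose estimate, assuming efficacy tracks average
    total steady-state concentration (Cavg-driven PD).

    Dose/day = CL * Cavg,ss * 24h / F, where the target total Cavg is the
    free target concentration (fold_cover * IC50) corrected for plasma
    protein binding (fu)."""
    free_target_mg_per_l = fold_cover * ic50_nM * 1e-9 * mw * 1e3  # nM -> mg/L
    total_target_mg_per_l = free_target_mg_per_l / fu
    return cl_l_per_h * total_target_mg_per_l * 24.0 / f_oral

# Hypothetical compound: CL 20 L/h, IC50 10 nM, MW 400, fu 0.1,
# 3x free cover over IC50, oral bioavailability 50%
dose = predicted_daily_dose_mg(20.0, 10.0, 400.0, 0.1, 3.0, 0.5)
print(f"{dose:.0f} mg/day")
```

The value of the exercise is less the number itself than that every input (potency, fu, clearance, F, required cover) is an explicit, checkable source of error.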
All the estimates carry an uncertainty, but at least with dose-estimate tracking you know where the potential sources of error are. I’ve used lipophilic ligand efficiency (LLE) in the past, and it’s useful for ranking hits from an HTS, but outside that both the theoretical background (see Pete Kenny’s paper) and the practical experience (see the Evotec review as cited) suggest it’s a poor metric for tracking. The top flow chart shows how a dose estimation evolves from hit to lead to preclinical development. I would move to physiologically based pharmacokinetic prediction earlier than candidate selection, but that depends on the group. The two graphs at the bottom show a project’s progression: the left-hand one is 1/dose, showing how the project steps up, with the best projected compound easy to identify at all times; the right-hand plot is LLE through the project, which clearly is not ranking compounds correctly.
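For concreteness, LLE is just potency minus lipophilicity, and ranking HTS hits by it (the one use I do endorse) looks like this — the hits below are hypothetical:

```python
def lle(pic50, logd):
    """Lipophilic ligand efficiency: potency paid for per unit lipophilicity."""
    return pic50 - logd

# Hypothetical HTS hits: (name, pIC50, logD)
hits = [("A", 8.0, 4.5), ("B", 7.2, 1.8), ("C", 6.5, 0.5)]
ranked = sorted(hits, key=lambda h: lle(h[1], h[2]), reverse=True)
print([h[0] for h in ranked])  # the most potent hit A ranks last
```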
Much has been said at this meeting about lead finding. Clearly it’s the area where computational methods have made the most impact over the last 5 years, and I think the review by Brown and Bostrom understates the importance of computational methods.
Looking at the crux of medicinal chemistry – how do we actually improve compounds – my tactics are simple, as described in this slide. At the bottom are the four key milestones I see projects actually passing; these are the points where management teams feel more confident, or investors feel that progress has actually been made, and getting between them as efficiently as possible is the job of the medicinal chemist.
The last point is often missed (I’ve done it), and it’s the key feature: in order to assess the safety of a compound in vivo, you need to be able to dose it and achieve 10, 30, or 100 times the projected therapeutic exposure, so you can make the case that you have a safety margin to go into human trials. If you have a compound for which you can only just achieve a therapeutic exposure in a model animal species, you won’t be able to establish a safety margin in preclinical toxicity testing.
Here is a ‘master list’ of medicinal chemistry tactics for exploring a compound’s SAR. Each point has medicinal chemistry logic behind it, aimed at understanding the value of each potential interacting group.
One that is non-obvious to the non-practitioner is chiral centre introduction and inversion. The usual dogma people hear is that ‘chemists try to get rid of chiral centres because they are expensive to produce’, but actually discovering that you can introduce a chiral centre and that the two enantiomers have very different activity is a huge source of relief for a chemist, as you then have some confirmation that the interaction is specific. Conversely, if you introduce a chiral centre in the middle of a molecule and the activities of the enantiomers are identical (within the bounds of experimental variability), that is a red flag that something isn’t right. It may be that the two enantiomers can adopt very similar conformations – but the simpler answer is that your assay isn’t telling you quite what you thought it was.
The references are just really good resources to refer to.
One of the developments we were involved in, trying to improve our medicinal chemistry performance, was to data mine medicinal chemistry ADME rules within AstraZeneca using matched molecular pair analysis (MMPA). When we left AZ, we took this technology forward to enable sharing between three large pharma companies; the results are reported in the paper cited.
An example of the results of the data mining is the relationship between solubility and logD. Each point represents a medicinal chemistry ‘rule’ or transformation, each backed by at least 20 matched pairs. The general assertion of medicinal chemists is ‘increase solubility by reducing logD’, and although this is generally true, the overall trend is poor. The slope of the best-fit line is -0.57, indicating that, on average, for every 1 log decrease in logD one would only expect a 0.57 log increase in solubility. There are, however, cleverer transformations that are efficient, giving >= 1 log unit increase in solubility for every 1 log decrease in logD, and, even better, in the ‘lucky’ sector there are the unexpectedly good transformations, where you can increase logD and increase solubility. Obviously these are very interesting if you have a permeability issue.
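The average trend makes a useful back-of-envelope check: with the -0.57 slope above, a planned logD change translates into an expected solubility change as follows.

```python
SLOPE = -0.57  # average d(logS)/d(logD) across the mined transformations

def expected_dlogS(dlogD, slope=SLOPE):
    """Expected solubility change (log units) for a given logD change,
    using the average trend from the MMPA rule set."""
    return slope * dlogD

# An 'average' transformation dropping logD by 2 buys only ~1.1 log solubility
print(round(expected_dlogS(-2.0), 2))
```

Any transformation doing markedly better than this line is, by definition, one of the ‘efficient’ or ‘lucky’ rules worth knowing about.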
Here are a few examples of transformations that buck the general trends. The middle sulphone-to-secondary-amide change, interesting as an isolipophilic change that results in an increase in solubility, and the bottom removal of a gem-dimethyl group, where the logD goes down but solubility does not improve, are both instructive.
We see a similar, if more extreme, situation with human liver microsomal clearance. Here a reduction of one unit in logD would only be expected, on average, to give a 0.23 log decrease in clearance, and again there are efficient and surprising transformations that can deliver significant changes in clearance while requiring only a minor decrease in logD, or in a few cases even an increase.
Again, here are some examples. The pyrazole case is interesting: when this was explored in detail, it appeared that pyrazoles can share a metabolic route with purines, and adding the ethyl group blocks this pathway, even though the logD increases. A counter-case is the replacement of a linking azetidine with a piperidine, where despite the drop in logD, the piperidine is actually more heavily metabolized.
Here is how we put the knowledge to work. The rules are stored as SMIRKS (the reaction dialect of SMILES); we can take a starting molecule (say, one with bad solubility) and apply all the solubility transformations to it, creating new molecules that have precedent for improved solubility. In the current lingua franca that’s a ‘generative model’, but what’s important is that we take account of the local environment around the point of change (so only the relevant transformations are applied), and we can see the exact evidence supporting each rule.
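As a minimal sketch of applying a SMIRKS-style rule with RDKit: the transformation here (aromatic C–H to C–F, applied to toluene) is a toy placeholder, not one of the mined rules, and the real system additionally matches the local environment and attaches the supporting evidence.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Illustrative SMIRKS-style transformation: aromatic C-H -> C-F
rxn = AllChem.ReactionFromSmarts("[cH:1]>>[c:1]F")

start = Chem.MolFromSmiles("Cc1ccccc1")  # toluene as a toy input
products = set()
for (prod,) in rxn.RunReactants((start,)):
    Chem.SanitizeMol(prod)                  # fix up valences/aromaticity
    products.add(Chem.MolToSmiles(prod))    # de-duplicate via canonical SMILES

print(sorted(products))  # ortho-, meta- and para-fluorotoluene
```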
Here’s an example interface that we supply (firstname.lastname@example.org if you’re interested), designed for the medicinal chemist: draw your structure, select your problem (like solubility) and what you want to achieve (increase, unsurprisingly, in this case), and press submit. You can also apply Lipinski-style filters and even lock in substructures that have to be retained.
And the output is an Excel spreadsheet, with the consistent parts of the molecule kept and the change highlighted on each suggestion, so that you can see what has been applied, along with simple Lipinski calculations and a summary of the direction of each property for which that transformation has enough precedent to comment on. You can also drill down to the actual pairs providing the evidence for that transformation.
Here’s an example drilling down to the solubility data: these are pairs of compounds for the indicated transformation, so a chemist can judge for themselves whether the evidence justifies making the compound. We can also show summary data like the median change and the number of examples where the solubility increases (47) or decreases (21), so it’s clear that there’s no guarantee – but the chemist is given the information to make a professional decision from. Likewise, if they look at the examples and they all come from one series, they might consider that a risk; alternatively, if they all come from different series, then it looks like a more general effect.
Toxicology is different. Usually you run into toxicology issues towards the end of a program, when you have higher in vivo exposure. Under those circumstances the chemistry team is under a lot of pressure to “escape from the toxicity” because the efficacy is interesting. This needs what I’ve called ‘close quarters medicinal chemistry’ thinking: every compound proposed must have a hypothesis behind it. We have a couple of tools that can be useful.
Permutative MMPA is an old idea (essentially we proposed it in our WizePairZ paper, https://dx.doi.org/10.1021/ci100084s, Figure 1). The diagram on the bottom left shows how we can identify a transform t1 in several pairs in a data set (M1→M2, M3→M4) and then apply it to compound M5 to generate a new compound M*. The example on the right is a case we worked on with a client: the compounds in blue are measured compounds, and the compounds in green are suggested compounds with their estimated potencies, with clogP and estimated-potency filters applied.
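The mine-then-apply logic can be sketched in a few lines. Here molecules are reduced to (core, substituent) labels and the potencies are invented for illustration; the real tooling does the fragmentation and pairing on actual structures via MMPA.

```python
from statistics import median

# Toy dataset: molecules as (core, substituent) with measured pIC50
data = {
    ("coreA", "m-OMe"): 6.0, ("coreA", "o-OH"): 7.5,  # pair -> transform t1
    ("coreB", "m-OMe"): 5.5, ("coreB", "o-OH"): 7.1,  # same transform again
    ("coreC", "m-OMe"): 6.8,                          # M5: apply t1 here
}

# 1) Mine transforms from pairs sharing a core
deltas = {}
for (core, sub), p in data.items():
    for (core2, sub2), p2 in data.items():
        if core == core2 and sub != sub2:
            deltas.setdefault((sub, sub2), []).append(p2 - p)

# 2) Apply a mined transform to a molecule it hasn't been tried on
t = ("m-OMe", "o-OH")
est = data[("coreC", "m-OMe")] + median(deltas[t])
print(f"coreC + o-OH: estimated pIC50 = {est:.2f}")
```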
Here’s a worked example using a D2 data set, showing how the permutative MMPA strategy is particularly efficient. Although we can generate huge numbers more compounds by using other transformation rule sets (like the combined D1/D3/D4 set known as the dopamine class, the hit-to-lead set of most commonly used transforms, or the solubility or metabolism data sets), leveraging the SAR already known generates a few very sensible suggestions. In active learning terminology this is a pure ‘exploit’ strategy. In the D2 example, the m-OMe → o-OH transformation, if applied to the propyl compound, gives a 1.6 log increase in potency (a known measured compound not in the training set).
Note that env = 4 means we use only environment-4 transformations from MCPairs, so we transfer only exact SAR rather than generically peppering the compounds with all the substituents – e.g. just m-Cl, not all the chloro variants.
The same technique can of course be used when trying to minimize potency; here are suggestions made to a client’s project (in red) from the measured compounds (in blue).
So, in review, where are the areas AI can contribute to medicinal chemistry practice? Top of my list are hit-to-lead and LO, where retrosynthesis tools could reduce the variability (and poor predictability) of synthesis times, as well as, hopefully in the end, the mean synthesis time (which means more design-make-test-analyse cycles).
Returning to our nominal contents page, we’ll look at some of the other aspects of a medicinal chemist’s job: the critical value of communication, and building and maintaining a team. What can these teach us about implementing AI systems?
First of all, we can’t treat communication as a luxury. Quite reasonably, computational chemists can become frustrated if all they feel they are asked to do is ‘generate a pretty picture for the VP’, but persuading a management team, investors, or clinical collaborators that what you are doing is based on firm understanding and evidence can be a critical stage for a project.
Some of the mnemonics I’ve been taught are captured on this slide. Probably the most critical point is that you have to communicate in the style your audience finds easiest to digest, which follows from the idea that you’re trying to persuade them of something, not just demonstrate your brilliance. That is often a hard message to take on board, especially when we’ve put a lot of work in. We want to show how much we’ve done, but the truth is it’s very hard to make the complex simple; any idiot can make the simple complex…
If that’s what we want from human–human communication, then we need the same from our human–AI communications.
Here is an example of an interface we have built for a toxicity prediction system. All the chemist needs to do is draw a structure and press “submit”.
The summary results are given just as traffic-light colours, and although it grates, the most important button is “export to PowerPoint”. This means that a chemist can draw a structure, press a button, and three minutes later be heading to a project team meeting with ready-to-show slides of a toxicity risk evaluation and enough information to make a professional judgement about the course of action to take.
Here’s an example of one slide from the export to PowerPoint. Predictions are given at the top, with the relevant atoms in the molecule highlighted. The first model is a regression forest built on pharmacophore feature pairs, where we have specially optimized the pharmacophore definitions for acid, base, hydrogen bond donor, and acceptor to make them as accurate as we can. We generate a fingerprint that is used unfolded, so permutative feature importance can be used on the regression forest to identify the relative importance of the features. These are the three structures in the middle: the left-hand red substructure makes the strongest contribution to increasing the potency; the blue substructures contribute to decreasing it. You can also see the median potencies of compounds in the test set with and without the structures, and the number of compounds with and without the structures, all of which enable the chemist to judge whether they want to modify particular structures.
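Because permutative (permutation) feature importance only needs to re-score the model with one feature column shuffled at a time, it is model-agnostic; a from-scratch sketch on a toy model (not our regression forest, and with made-up data) looks like this:

```python
import random

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Permutation feature importance: the mean increase in squared error
    when a single feature column is shuffled, breaking its link to y."""
    rng = random.Random(seed)

    def mse(rows):
        return sum((model(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    baseline = mse(X)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)
            Xp = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            drops.append(mse(Xp) - baseline)
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy model in which feature 0 carries almost all the signal
model = lambda row: 3.0 * row[0] + 0.1 * row[1]
X = [[float(i), float(i % 3)] for i in range(20)]
y = [model(row) for row in X]
imp = permutation_importance(model, X, y)
print(imp)  # feature 0's importance dwarfs feature 1's
```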
Below are the nearest neighbours to the test compound in the training set. Given that there are no close neighbours here, I would ideally test the compound, and perhaps a couple of near neighbours, in a wet assay.
Finally we come to perhaps the most sensitive part of the lecture. What do we know
about building teams and how can we apply it to integrating AI into our teams?
So here are the basics of building and maintaining a human team.
Let’s consider, though, the relationships between medicinal and computational chemists. None of this should be controversial – it is common across many highly trained professions. Chris Argyris was Professor of Education and Organizational Behavior at Harvard, and although chemists of all flavors are often resistant to learning from behavioral scientists, we should show this work some respect.
We can use the previous slide’s ideas to explain where the hype curve comes from. As a defense strategy, the computational chemists try to drive adoption with evidence and hype; this threatens the medicinal chemists, who respond by demanding more evidence (constantly) and putting forward the most challenging projects (so that the methods fail and the medicinal chemists’ embarrassment is saved). The computational chemists then blame the medicinal chemists for not engaging, can claim the new methods were not “given a fair chance”, and so in turn save their own embarrassment about having to fix aspects of systems or algorithms.
This is of course a polarized view – but does it sound familiar…?
But the world is changing: computational chemists can now simply subcontract getting compounds made. As OpenEye have demonstrated with their collaboration with Enamine, you can virtual screen on rented cloud hardware, buy the compounds, and ship them straight to your screeners, bypassing the chemists altogether. The compounds will be relatively sensible if they are in the Enamine library. With retrosynthetic planning software coming on stream, a computational chemist can even get a route evaluated by an automated system and request only the feasible compounds from a synthetic chemistry CRO.
So what’s the way around highly trained professionals’ defence reflexes, and how can we apply those ideas to AI system adoption?
We are trying to move from AI systems supporting medicinal and computational chemists to, maybe in 30 years’ time, supplanting them. It’s going to be a long-term project, which is why we do it stepwise and develop as we go.
What we get people to do are the top four bullet points; so if we consider an AI system to be like another team member and translate those points into what we might need for AI, we get the points in the green box.
As we said earlier, for highly trained and committed professionals the adoption of AI systems will not just be a change in job but a change in identity. There are some steps all actors in this arena need to take. Medicinal chemists need to start using and citing computational methods. For computational chemists, it means getting into the lab to really understand how experiments are done, and the limits and variability of the measurements.
For managers, it’s important to focus on the real rather than the superficial. If you do the superficial and don’t do the real, people notice and it undermines the intent. You can, however, do the real and not the superficial, and people will still get the message.
These ideas come from Schein’s work on how managers can change cultures. However, we need to modulate the message by the culture we’re applying it in; Meyer’s work in the area is really useful. Having worked with US, Chinese, English, Swedish, French, German, and Swiss colleagues, this has been the most useful work I’ve ever read on cross-cultural working – so many of my problems could have been avoided if I’d known what’s in that book.
Here are examples from the literature. On the left is work from AstraZeneca Södertälje, where they introduced compulsory estimation of solubility before synthesis, and the number of compounds with good measured solubility increased. The second paper, from Genentech, discusses the organization and management of their comp chem group and keeping focused on work that can lead to specific project decisions.
Finally there is the recent paper from Darren Green and co-workers at GSK who
describe the fully integrated system they have made available to medicinal chemists
to support design. It’s a paper really worth reading to understand the focus on
building a system that is both robust and medicinal chemist friendly.
So, to finish off: we can take measures to change how AI is adopted. The uncomfortable part, for both medicinal and computational chemistry, is that to be more successful we are going to need to change significant aspects of how we work. That means looking very seriously at how we integrate systems and, most importantly, stressing open, honest, and supportive collaboration between medicinal and computational chemists.
Feature definitions are pairs from Taylor and Cosgrove, with the addition of a halogen class; distances are topological distances, and the fingerprints are binary, not scalar counts of the number of matches.
Feature importance is permutative importance, not impurity-based importance.