Abhishek Pai 1319400
SCHOOL OF CHEMISTRY
CHM3B2 – Literature Project
Literature Review
Fixing Problems and Finding Alternatives to Combinatorial Chemistry to
increase molecular diversity
Abhishek Pai
1319400
April 2016
PROJECT SUPERVISOR: Dr John Wilkie
Abstract
Drug discovery and development is a difficult area of research that consumes a large
amount of resources to produce effective results. Combinatorial synthesis is a technique for
synthesising large numbers of molecules, and its adoption meant that compound libraries
grew dramatically in size, to anywhere from hundreds of thousands to millions of compounds.
Producing compounds in this way saves the pharmaceutical industry much of the resources
that would otherwise be spent discovering new compounds through fieldwork. However, the
technique introduced a new problem: within these large libraries it is hard to find the
compounds that are therapeutically active or diverse in structure and properties. The large
numbers became more of a hindrance than a solution; the best analogy for this problem is a
“needle in a haystack”. Diversity is also very important in drug discovery, as it allows novel
drug treatments to be found. This literature review focusses on two aspects: firstly, solving the
issues related to finding compounds within combinatorial libraries, and secondly, finding
alternative routes of synthesis that may produce a larger range of molecular diversity for
drug discovery and development.
Introduction
Currently within the pharmaceutical industry the most popular method of increasing the
number of available compounds involves the use of combinatorial chemistry. The classical
method of discovering new molecules required fieldwork to obtain new and interesting
compounds that chemists can then manipulate and test.
An example of such fieldwork is the exploration of remote biomes such as rainforests,
deserts and tundras. The rainforest is an incredible breeding ground for compounds that
have never been seen in labs, and its isolation from humans gives it enormous potential for
discovery. Sloths are an example of creatures with a lot of potential to yield new research
material: their fur is believed to be an excellent breeding ground for new species of
bacteria and fungi, which researchers hope to exploit to obtain novel antibacterial and
antifungal compounds. Researchers are looking for the serendipitous discovery of new drug
compounds, since fungi and bacteria are able to evolve new ways of combating problems such
as antibiotic resistance, which is very prevalent around the globe.
But this technique is very time-consuming and cost-inefficient. It requires the investment of
large sums of money, and the returns may never be as fruitful.[i]
This is where combinatorial chemistry comes in, as it is very efficient at producing large
numbers of diverse compounds, some of which may occur naturally in the wild. It doesn’t
require the manpower and resources that fieldwork does. Combinatorial chemistry speeds up
the creation of a large library of compounds by avoiding the conventional route of
chemical synthesis. The conventional method involves reacting compounds A and B to form
the product AB. This process is very slow and produces only one compound, whereas
combinatorial chemistry uses several derivatives or analogues of compounds A and B to
form combinations of products. For example, combining A1, A2, A3 … An with B1, B2, B3 … Bn
forms A1B1, A1B2, A2B1 … AnBn; Figure 1 shows how the process of combinatorial
synthesis works. The diversity of the compounds produced is directly related to the variation
in the initial compounds available for the reaction. The rapid nature of this process enables
pharmaceutical companies to build large libraries of compounds over short periods of time.
It is hypothesised that increasing the number of compounds in the combinatorial library also
makes it more likely that therapeutically active compounds are discovered.[ii] It is important
to note that not all the compounds produced using this method will work in medicinal
chemistry; some may have uses in other fields such as food, cleaning or petrochemicals, and
some may be useless. This factor is explored later in the literature review.
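
The pairing of every analogue of A with every analogue of B can be sketched in a few lines of Python. This is only an illustration of the enumeration idea: the building-block labels are placeholders rather than real reagents, and the function name `enumerate_library` is made up for this sketch.

```python
from itertools import product

def enumerate_library(a_blocks, b_blocks):
    """Return a label for every A/B pairing, e.g. 'A1B1'."""
    return [a + b for a, b in product(a_blocks, b_blocks)]

# Hypothetical analogues of compounds A and B (labels only, no chemistry).
a_blocks = ["A1", "A2", "A3"]
b_blocks = ["B1", "B2", "B3"]

library = enumerate_library(a_blocks, b_blocks)
print(len(library))   # 3 x 3 = 9 products
print(library[:3])    # ['A1B1', 'A1B2', 'A1B3']
```

The library size is the product of the two reagent counts, which is why a modest number of starting compounds quickly yields a very large library.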
Means of testing the large numbers of compounds produced by combinatorial synthesis for
further development and commercial viability were also developed: High Throughput
Screening (HTS) and High Throughput Docking (HTD). HTS uses live cultures of cells, with
the aid of robotics and computerised techniques, to filter large combinatorial libraries from
millions of compounds down to hundreds. The libraries are tested on live cell cultures to
detect any response that shows promise for therapeutic use; since not every compound within
the library has a pharmacological effect, compounds that produce no response are eliminated
from the testing sample.
On the other hand, HTD is a computational simulation that analyses the 3D structures of the
target and the compound to find the ideal conformation for binding. This method is
sometimes considered too rigid, as it can only use the data points inputted into the simulation
to determine the binding ability of the compound. Since research to determine the binding
points of compounds and targets is ongoing, new information is constantly being discovered
about the way a compound binds to a target. This lack of information and flexibility means
that the binding capability of some compounds may be completely ignored, whereas HTS
would have detected this capability due to the flexibility and fluidity of live cell cultures. But
HTD has the ability to use ranking systems to list the compounds that are most likely to bind
and produce a response in cell cultures. As it is a computational method of screening
potential drug candidates, it is very cheap and efficient at producing results. Increasing the
speed of testing is simply a matter of increasing the computing power, which has become
cheaper over the years due to developments in computing technology.[iii]
Figure 1
Review of Literature
Fixing Problems associated with Combinatorial Libraries
The large size of combinatorial libraries has become something of a hindrance for drug
discovery and development. It has become very difficult to assess the viability of all the
compounds and determine their value to researchers. Over the years, analytical
techniques have therefore been developed with the aid of computer software, not only to
search through libraries and find useful compounds but also to determine their relative
molecular diversity and even to aid in the acquisition of company libraries. The techniques
discussed below are currently being used to fix the issues that researchers and companies
face when dealing with combinatorial libraries.
Virtual Docking
Innovative software techniques are being developed to enable more accurate and efficient
docking of compounds to targets.[iv] Testing every single compound in a library using HTS is
very time-consuming and incredibly expensive for researchers. This is where HTD has been
used as an alternative, and open-source software such as AutoDock 4 has been developed as
a means of achieving these goals. The open-source nature of this software allows anyone
around the world to tweak and improve it to suit their needs, and to submit improvements for
everyone to use and share.[v] Docking software has mainly been used to input a large array of
compounds into the simulation to bind to one specific target. One great leap that needs to be
taken is the inclusion of flexibility in target receptors: as biological components, they don’t
have a rigid structure but rather a certain amount of fluidity. Accounting for this increases the
chances of a conformational match for novel compounds that would never otherwise have
been considered as potential candidates for future drug development. It also allows
researchers to find and eliminate possible side effects and adverse reactions before testing
the drug. These can be a major issue, as compounds can have multiple sites of action which
may be unknown to the researchers and hard to detect during the early phases of a drug trial,
so further testing of such compounds can become a waste of resources for pharmaceutical
companies. This technique increases the overall success rate of further drug development
trials.[vi]
Compounds have several different spatial conformations. This can be exploited, as it
increases the diversity of molecular shapes available to interact with and bind to
receptors. But this diversity in conformation isn’t exclusive to compounds; receptors are also
fluid and flexible, and accounting for this is believed to be the next breakthrough in
improving the results obtained by High Throughput Docking. Like drug compounds,
receptors have bonds that rotate and bend to produce different conformations and locations
for drug compounds to interact and bind. It is this relaxed nature of proteins that needs to
be incorporated into molecular docking software to increase the possibility of docking
unconventional and novel compounds. Research into receptor and protein fluidity is
still ongoing and there is much still to be discovered, but including the limited data that
is available is a good start to reducing unwanted interactions and reactions in the body.
Researchers have collected data from multiple crystallographic receptor conformations
and incorporated it into HTD. The reasoning behind this approach is to use HTD
software to test unconventional binding locations for a diverse range of compounds.
Including protein flexibility enables the docking software to predict possible
interactions and binding that have not previously been observed with conventional HTD
methods. This technique increases the number of possible active compounds discovered,
including ones that would usually be considered non-viable. It also increases the
diversity of compounds found to be therapeutically active for further drug
development.[vii]
Compounds that appear very similar in structure can have vastly different interaction and
binding properties. This can be a great hindrance to researchers, as HTD programs that only
analyse the structure of a compound may determine that it is therapeutically active and viable
for further development when, in reality, it cannot bind effectively to its target and produce an
effective response; such compounds only slow the progress of drug development. A newer
development uses bioassays to measure the IC50 of each compound against a range of
proteins, and this data is used to create a bioactivity profile called an affinity fingerprint.
This information, in conjunction with the structural data already available in
combinatorial libraries, can aid HTD immensely. It eliminates compounds that have similar
structures but zero to very low biological activity, and increases the scope of HTD to find
compounds with a diverse range of structures which would otherwise never have been
considered for further development.[viii]
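
As a rough illustration of how an affinity fingerprint could be compared between compounds, the sketch below converts IC50 values measured against a protein panel into pIC50 values and takes the Euclidean distance between the resulting vectors. The IC50 figures, the compound names and the choice of Euclidean distance are all assumptions made for this example; the cited work may use a different representation and metric.

```python
import math

def affinity_fingerprint(ic50_nm):
    """Convert IC50 values in nM (one per panel protein) to pIC50 values."""
    # pIC50 = -log10(IC50 in mol/L) = 9 - log10(IC50 in nM)
    return [9.0 - math.log10(value) for value in ic50_nm]

def fingerprint_distance(fp_a, fp_b):
    """Euclidean distance between two affinity fingerprints of equal length."""
    return math.dist(fp_a, fp_b)

# Hypothetical IC50 values (nM) against the same three-protein panel.
panel_ic50 = {
    "compound_X": [10.0, 5000.0, 20000.0],
    "compound_Y": [12.0, 4500.0, 25000.0],   # activity profile close to X
    "compound_Z": [30000.0, 15.0, 8.0],      # very different activity profile
}
fingerprints = {name: affinity_fingerprint(v) for name, v in panel_ic50.items()}

d_xy = fingerprint_distance(fingerprints["compound_X"], fingerprints["compound_Y"])
d_xz = fingerprint_distance(fingerprints["compound_X"], fingerprints["compound_Z"])
print(d_xy < d_xz)  # True: X and Y behave alike, X and Z do not
```

Two compounds with near-identical structures but a large fingerprint distance would be flagged as behaving differently in the body, which is the case a purely structural comparison misses.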
Inverse Docking
Inverse docking is another technique, which searches through databases of three-dimensional
protein targets to find cavities where a ligand can bind successfully. It is used to help
predict any unwanted binding interactions that may occur. When incorporated into a ranking
system, this process increases the accuracy of protein-binding predictions carried over into
the experimental phase. Docking is conducted by testing each ligand conformation against
the binding sites of the protein. Each cavity within the protein is tested to find the
arrangement in which the least energy is required for the ligand and protein to bind. The
docking software calculates these energy values to enable better ranking of potential drug
candidates. The term “inverse” refers to the software finding proteins, other than the primary
target, that bind to the ligand. Using inverse docking to test a ligand against all variations of
receptors will result in improvements in drug development, increasing efficiency and
reducing costs by eliminating ligands that are not viable.[ix]
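
The ranking step described above can be sketched as follows, assuming the docking software has already produced one binding energy per protein for a given ligand. The energy values, the protein names and the cutoff of -7.0 kcal/mol are invented for illustration and are not taken from any docking package.

```python
def rank_targets(binding_energies, primary_target, cutoff=-7.0):
    """Rank proteins by predicted binding energy (most negative first) and
    list off-targets that bind at least as strongly as the cutoff."""
    ranked = sorted(binding_energies.items(), key=lambda item: item[1])
    off_targets = [
        name for name, energy in ranked
        if name != primary_target and energy <= cutoff
    ]
    return ranked, off_targets

# Hypothetical docking energies (kcal/mol) for one ligand against four proteins.
energies = {"protein_A": -9.2, "protein_B": -4.1, "protein_C": -7.8, "protein_D": -5.5}

ranked, off_targets = rank_targets(energies, primary_target="protein_A")
print([name for name, _ in ranked])  # ['protein_A', 'protein_C', 'protein_D', 'protein_B']
print(off_targets)                   # ['protein_C']
```

A ligand with many strongly binding off-targets would be a candidate for early elimination, since those interactions point to possible side effects.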
Molecular Descriptors
Molecular descriptors are a vital part of pharmaceutical chemistry: they take the chemical
structure and all the information that can be derived from it and convert it into numerical
data. The symbolic representation of a chemical structure is converted into numerical values
such as affinity, efficacy, polarizability, hydrophilicity and lipophilicity using bioassays; this
data is more useful to researchers. When searching for new compounds, it is more effective
to find compounds with differing properties and structures, as too many similarities in
molecular descriptors result in wasted time and resources for the pharmaceutical
company. Differences in certain molecular descriptors, however, point to compounds
that are very diverse in structure and physical properties and that may aid in the discovery
of novel drug treatment options. An activity island is a region on the molecular descriptor
graph that indicates molecules with the potential to be active compounds. The information
gained from the molecular descriptor graph can be used to modify drugs and produce newer,
improved versions. Another parameter is the neighbourhood region: a boundary drawn
around the properties of an existing compound so that the region can be avoided, as
any compound produced within it would have very similar structural and physical properties.
Figure 2 shows the difference between typical and ideal compound libraries and the
distribution of compounds using two molecular descriptors. It also shows the
neighbourhood regions and activity islands that can be explored to produce more diverse
drugs. Researchers choose specific molecular descriptors and plot the data of large
combinatorial libraries to determine the activity islands and neighbourhood regions. They can
then avoid the neighbourhood regions around existing compounds and focus on creating
molecules that fit into an activity island. Avoiding neighbourhood regions reduces the risk of
producing compounds with similar properties and structures, thus increasing molecular
diversity in the process.[x]
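
One simple way to picture the avoidance of neighbourhood regions is a greedy selection in a two-descriptor space: a compound is kept only if it lies outside the neighbourhood radius of every compound already kept. The descriptor values, the radius and the greedy strategy are assumptions for this sketch, not the procedure used in the cited work.

```python
import math

def select_diverse(compounds, radius):
    """Keep compounds whose descriptor point lies outside the neighbourhood
    region (a circle of the given radius) of every compound already kept."""
    selected = []
    for name, point in compounds:
        if all(math.dist(point, kept_point) > radius for _, kept_point in selected):
            selected.append((name, point))
    return [name for name, _ in selected]

# Hypothetical compounds described by two scaled molecular descriptors.
compounds = [
    ("c1", (1.0, 2.0)),
    ("c2", (1.1, 2.1)),    # inside the neighbourhood of c1
    ("c3", (3.0, 4.0)),
    ("c4", (3.05, 3.9)),   # inside the neighbourhood of c3
    ("c5", (0.0, 5.0)),
]

print(select_diverse(compounds, radius=0.5))  # ['c1', 'c3', 'c5']
```

In practice the descriptors would need to be scaled to comparable ranges first, otherwise one descriptor dominates the distance.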
A study of the molecular diversity of chemical databases used molecular descriptors to
compare five different databases. CMC and MDDR contain medicinal compounds, ACD and
SPECS contain commercially available chemicals, and the Wellcome Registry contains
potential biochemical compounds. Using the descriptors, the researchers were able to
identify the super-population of compounds that had very similar properties and structures to
one another. This enabled the discovery of two things: firstly, commercially available
compounds that were also medicinally active, and secondly, outliers with some potential to
be developed into therapeutic drugs. This was only achieved by superimposing the
five databases using specific molecular descriptors and producing a single metric that can
be compared between databases. Using this technique, diverse compounds were
discovered for further development.[xi]
Combining Libraries
The information collected from molecular descriptors, activity islands and neighbourhood
regions can be used very effectively to combine large combinatorial libraries. Since the
advent of combinatorial chemistry the pharmaceutical industry has been churning out new
compounds and increasing the size of its combinatorial libraries, which has resulted in very
large libraries of compounds. As corporations have a vested interest in discovering new drug
compounds, they will make calculated decisions to combine their libraries, increasing the
library size and in turn their chances of making new discoveries. Companies can
continue to use combinatorial synthesis to expand their libraries, but an alternative is to
combine libraries to fill the gaps in chemical space, or the activity islands, that exist within a
company’s own library. The paper “Rendezvous in chemical space? Comparing the small
molecule compound libraries of Bayer and Schering” discusses the combining of the large
combinatorial libraries of two companies and assesses the advantages and disadvantages of
combining these two particular libraries.[xii][xiii]
Figure 2
The technique that was used to test these libraries is called LASSOO. It stands for Library
Acquisition with Simultaneous Scoring to Optimize Ordering, and it is used to determine
whether an external and an internal library are worth combining. It does this by plotting the
molecular descriptors of the compounds and using a scoring method to determine whether
compounds in the external library have desired characteristics that are absent from the
internal library. The external library needs to be diverse enough to fill in the “chemical
spaces” that the pharmaceutical company has within its internal library.
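
The idea of scoring an external library against an internal one can be illustrated with a nearest-neighbour distance in descriptor space. This is not the LASSOO algorithm itself, whose scoring function is not described here; the descriptor points, the distance threshold and the function names are assumptions chosen only to show the principle of rewarding compounds that fall into gaps.

```python
import math

def novelty_scores(internal, external):
    """Distance from each external compound to its nearest internal compound.
    Assumes the internal library is not empty."""
    return {
        name: min(math.dist(point, ref) for ref in internal)
        for name, point in external.items()
    }

def worth_acquiring(internal, external, min_distance):
    """External compounds far enough from everything already held in-house."""
    scores = novelty_scores(internal, external)
    return sorted(name for name, score in scores.items() if score >= min_distance)

# Hypothetical two-descriptor coordinates.
internal = [(1.0, 1.0), (2.0, 2.0)]
external = {"e1": (1.1, 1.0), "e2": (4.0, 4.0), "e3": (2.0, 2.2)}

print(worth_acquiring(internal, external, min_distance=1.0))  # ['e2']
```

Only `e2` lies in a region the internal library does not cover; `e1` and `e3` duplicate compounds that are effectively already present.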
Figure 3 shows a two-dimensional representation of different molecular descriptors for an
external (Target) and an internal (Current) library. The composite is a combination of the
two libraries and shows the areas where combining the libraries would be favourable and
unfavourable. The light groupings indicate compounds that would be favourable additions to
the internal library, while the dark groupings indicate compounds that would be
unfavourable. This scoring method helps researchers determine whether the target library is
worth merging with, on the basis of how different its compounds are from those in the current
library. By using LASSOO, companies can determine the relative distribution and diversity of
molecular descriptors across the two libraries and the benefits associated with combining
them. Figure 3 only utilises two molecular descriptors to produce a two-dimensional graph,
but more descriptors can be included to produce graphs of three, four or more dimensions.
Graphs at higher dimensions are difficult to visualise, but the software can still use its scoring
method to determine the viability of the library merger.[xiv]
Figure 3
The author of the paper analysed the compounds from Bayer and Schering. The analysis
found that structural identity between the two libraries was very low but that their
physico-chemical properties overlapped strongly. The overlap in chemical space was also
very significant, meaning that the companies’ libraries were very complementary to one
another. The decision was taken to keep the libraries separate and in-house, and instead to
exchange information regarding “hit” lists periodically; this was seen as the better option.
Combining two extremely large libraries would require a large amount of resources and
would also increase the difficulty of screening the library. Screening the compounds in each
library independently of one another would offer a better chance of finding hits on lead
compounds for drug development.
Analysing Drug and Non-Drug compounds within Libraries
Another problem that arises frequently in combinatorial chemistry is how to distinguish
between “drug” and “nondrug” compounds. The combinatorial synthesis process uses
biologically active compounds to produce a large number of iterations and derivatives of
those compounds, but the compounds produced won’t necessarily be biologically or
therapeutically active. The large size of libraries makes testing compounds by HTS
resource-intensive. Discriminating between biologically active and inactive
compounds is vital in reducing the redundancy of chemical compounds in the library. By
identifying and eliminating these redundant chemicals from the library, researchers can
increase the speed of screening for viable compounds for drug development.
A software scoring method has been developed to overcome this problem by automatically
and rapidly classifying compounds into “drug” and “nondrug” categories. The researchers
developing this technique used two databases of compounds to validate their approach:
169,331 molecules from the Available Chemicals Directory (ACD) and 38,416 molecules
from the World Drug Index (WDI). They used these publicly available libraries as they were
easily accessible and the molecules were already validated by the pharmaceutical
community and government regulators. The scoring method correctly classified 83% of the
ACD and 77% of the WDI, with the ACD serving as the nondrug set and the WDI as the drug
set. It is important to highlight that the WDI only contains biologically active drugs, so the
scoring scheme should have determined almost all of its compounds to be drugs, allowing for
some false negatives and false positives. There is therefore room for improvement in the
scoring method to increase its effectiveness. One potential use for this scoring method is
testing the large numbers of diverse molecules in libraries to determine whether or not they
can be used as potential drugs. It can also be used when combining two large libraries, to
make sure that the acquired library contains more potential drug compounds than non-drug
compounds; this enables companies to selectively purchase or even test compounds in large
libraries.[xv]
Analysing Molecular Diversity
The main focus of research has been on searching through vast combinatorial libraries of
compounds using HTD or HTS without any regard for molecular similarity or diversity. The
techniques discussed above all use computer software to increase the probability of finding a
compound for drug development by analysing and/or combining libraries. Due to the lack of
data on molecular diversity, researchers have used mass spectrometry as an analytical
technique to produce quantitative data that can be vital in drug development. Mass
spectrometry works by breaking down large compounds into small ionised fragments. The
way a compound breaks down is unique to it and produces a fingerprint in the spectrum,
which helps researchers determine which compound is being tested. The molecular
fingerprint has been used to create a fragment dictionary in which each entry is indexed to a
specific pattern of fragments. Since two compounds with very similar structures produce very
similar fragments, while two compounds with very different structures produce very different
fragments, the spectral information can be used to determine the relative diversity of
compounds. The data created by this technique are called structural keys and hashed
fingerprints. Comparing this information gives a quantitative method of analysing the
relative diversity of compounds within a library.[xvi]
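
A common way to turn two fragment fingerprints into a single similarity number is the Tanimoto coefficient: the number of shared fragments divided by the number of distinct fragments present in either compound. The text above does not name the comparison measure that was used, so the choice of Tanimoto here is an assumption, and the fragment labels are placeholders.

```python
def tanimoto(keys_a, keys_b):
    """Tanimoto similarity between two collections of fragment keys."""
    a, b = set(keys_a), set(keys_b)
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)

# Hypothetical fragment keys for three compounds.
compound_1 = {"C=O", "OH", "benzene", "NH2"}
compound_2 = {"C=O", "OH", "benzene", "Cl"}
compound_3 = {"S=O", "furan"}

print(tanimoto(compound_1, compound_2))  # 0.6 (3 shared out of 5 distinct keys)
print(tanimoto(compound_1, compound_3))  # 0.0 (no shared keys)
```

Taking one minus the similarity gives a distance, so a library whose compound pairs mostly score near zero similarity is a diverse one.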
Alternatives to Combinatorial Synthesis to increase Molecular
Diversity
The main problem with all the techniques discussed above is that they analyse
compounds synthesised using combinatorial synthesis: they all focus on creating larger
libraries, analysing these libraries to find a “hit” and then developing that hit into a lead
compound. Diversity is very important when it comes to discovering a novel treatment for a
particular disease, but combinatorial synthesis doesn’t seem to be able to provide it.
Combinatorial synthesis is very good at producing large numbers of new compounds, but
most of them start from a pool of compounds that may be very similar in structure. This
similarity in structure is carried over from the starting compounds to the final products,
resulting in a lack of diversity. To address this issue, the remainder of this review looks at
methods of synthesis, different from combinatorial synthesis, that can produce compounds
with greater structural diversity.
Structure-based synthesis
Structure-based synthesis, also known as target-oriented synthesis (TOS), is based on
designing a drug with a specific structure rather than discovering it in nature or in a large
library of compounds. Data on the 3-D shape of the target molecule is collected using X-ray
crystallography or NMR spectroscopy, and this information is then used to design drugs that
fit into the target site. Since the target site is already known, drug candidates can be
developed to have higher affinity and selectivity. This can also help reduce the number of
drugs that bind to secondary sites and cause adverse side effects. The main
issue with combinatorial chemistry is the large number of compounds that it produces, most
of which may never be useful as therapeutic drugs.
Combining the structural information of the target sites with the rapid production of large
numbers of compounds by combinatorial chemistry should bring a huge leap forward in the
discovery of molecularly diverse yet target-specific drugs. This technique uses a base
molecule that fits easily into the target and then produces copious numbers of derivatives
that have a much higher probability of working in the target site. Figure 4 shows the diversity
of compounds produced by combinatorial synthesis; although these are very diverse, they
aren’t selective or specific to a target site. The integration of the two methods involves
finding an initial candidate that fits the target site. For example, Figure 5 shows the steroid
backbone that is shared by all the derivatives of a compound; this backbone enables the
compounds to bind to the target site with greater selectivity while having varying affinity,
efficacy and other properties, improving the therapeutic indexes of new drug
treatments.[xvii]
Research conducted by UC Berkeley and UC San Francisco used this integrative concept of
structure-based drug design and combinatorial chemistry. The researchers identified a
protein called Cathepsin D, which doesn’t have any potent inhibitors currently available on
the market. They found the smallest molecule that can inhibit this protein and created a
second-generation library, using combinatorial synthesis to produce new compounds based
on a basic scaffold. This second-generation library can then be searched and filtered to find
compounds that have a specific range of properties. The researchers used potency to
search through both libraries with a diverse range of compounds and receptor-targeted
libraries. It was found that the compounds discovered in the second-generation
receptor-targeted combinatorial library were 5- to 6-fold more potent than the compounds
discovered in the diverse library. The success of this study shows the potential of targeted
structure-based combinatorial chemistry and the great benefits that it holds in producing
compounds which are viable for therapeutic use.[xviii]
Figure 4
Figure 5
Fragment-based synthesis
Fragment-based drug discovery (FBDD) is a technique based on discovering small
molecules that bind weakly to the target site and building them up into large,
complex, diverse molecules with improved properties such as higher affinity, efficacy and
selectivity. The process employs HTS to test small compounds against a variety of
different target sites in vitro. HTS is carried out on compounds smaller than 500 Da from
large combinatorial libraries that may contain anywhere from hundreds of thousands to
millions of compounds. The weight limit significantly reduces the number of molecules that
need to be tested, reducing the time it takes to find a suitable candidate for FBDD. The
strategy of this technique is to build large, complex molecules from small, simple
molecules.[xix]
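
The weight cut-off acts as a simple pre-filter on the library before any screening takes place. A minimal sketch, assuming each library entry is a name paired with a molecular mass in daltons (the entries below are invented), could look like this:

```python
def fragment_candidates(library, max_mass_da=500.0):
    """Return the names of compounds lighter than the mass limit (in Da)."""
    return [name for name, mass in library if mass < max_mass_da]

# Hypothetical library entries: (name, molecular mass in Da).
library = [
    ("cmpd_001", 182.2),
    ("cmpd_002", 743.9),
    ("cmpd_003", 499.9),
    ("cmpd_004", 500.0),
]

print(fragment_candidates(library))  # ['cmpd_001', 'cmpd_003']
```

The 500 Da default mirrors the limit quoted above; a stricter limit can be passed through `max_mass_da` without changing the function.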
Research has been conducted on combining computational chemistry with fragment-based
drug discovery to make the process of discovering a fragment and developing it from
fragment to lead (F2L) much more efficient. Computational chemistry is also used to narrow
down large libraries of compounds into smaller libraries containing fragments that can be
used in FBDD. Creating a focussed library of fragments increases the efficiency of
discovering new potential drug compounds. With the aid of bioassays, NMR and X-ray
crystallography, the structures of compounds and ligands can be determined. This
information is then fed into HTD to determine the relative compatibility of the compound,
called the hit conformation. The compounds are then tested in vitro to determine their
structure-activity relationship (SAR), and this information can be used to develop the
compounds further and change their potency, affinity and other properties. Figure 6 shows
the process of FBDD and how it filters compounds down the flow chart; they are then built up
into larger molecules, resulting in increased molecular diversity. In the pharmaceutical
research market the filing of patents for intellectual property is very crowded, but using FBDD
to start from small compounds and build larger ones means that the potential for conflict is
very low. The compounds produced via FBDD can vary dramatically even when starting from
the same initial fragment. This increases molecular diversity considerably and results in
the production of novel drug compounds for therapeutic use.[xx]
Figure 6
Diversity-oriented synthesis
Combinatorial chemistry has been very useful for producing large numbers of compounds,
resulting in very large libraries, but there is an issue with the compounds that are produced:
they lack diversity. They are too similar in structure, as they all start from the same set of
starting compounds, resulting in iterative derivatives with very similar
structures. Figure 7 shows a compound before and after it was put through the process of
combinatorial synthesis, illustrating the similarity in structure between the starting and
finishing compounds. There is very little change in structure; the only change that does
occur is the addition or removal of functional groups. Target-oriented synthesis (TOS), also
known as structure-based synthesis, is used to analyse a target and find a compound
that fits its structure, after which large numbers of derivatives are produced; these
compounds also lack diversity. Using TOS to synthesise compounds results in very specific
chemical descriptors for the products, decreasing the diversity of the drug compounds
that are produced. The answer to these issues is diversity-oriented synthesis (DOS); this
process intentionally and efficiently produces more than one set of compounds that are
diverse in structure and properties in order to solve complex drug targeting
issues.[xxi][xxii]
Figure 7
DOS was developed to use combinatorial synthesis to produce large libraries of compounds
while fixing the issue of diversity in the process. DOS aims to produce compounds that cover a
broad range of values for specific chemical descriptors, making the chances of producing very
diverse compounds much higher. This is because changing the properties of compounds may
involve changing a large number of functional groups, stereogenic sites etc., which can change
the overall structure of the compound and thus increase diversity. Figure 8 shows how all
three routes of synthesis create compounds and the relative diversity of the compounds that
are produced. This representation shows how much more effective DOS is than conventional
combinatorial synthesis and TOS at producing compounds with very high molecular
diversity [xxiii].
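The idea that a library is "more diverse" when its members are spread across chemical descriptors can be made concrete with a simple score. The sketch below is only an illustration and is not taken from the cited papers: it assumes each compound is reduced to a set of structural features (the feature names are invented), and it uses the mean pairwise Tanimoto distance as one possible diversity measure.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    if not a and not b:
        return 1.0  # two empty feature sets are treated as identical
    return len(a & b) / len(a | b)


def mean_pairwise_distance(library: list) -> float:
    """Average of (1 - Tanimoto similarity) over all pairs; higher means more diverse."""
    pairs = list(combinations(library, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)


# Hypothetical feature sets, invented for illustration only.
# Focused library: one shared scaffold, only the appendage changes.
focused_library = [
    frozenset({"scaffold_A", "amide", "methyl"}),
    frozenset({"scaffold_A", "amide", "ethyl"}),
    frozenset({"scaffold_A", "amide", "chloro"}),
]
# DOS-style library: scaffold, functional group and stereochemistry all vary.
dos_library = [
    frozenset({"scaffold_A", "amide", "methyl"}),
    frozenset({"scaffold_B", "ester", "R_centre"}),
    frozenset({"scaffold_C", "amine", "S_centre"}),
]

focused_score = mean_pairwise_distance(focused_library)
dos_score = mean_pairwise_distance(dos_library)
print(f"focused: {focused_score:.2f}, DOS-style: {dos_score:.2f}")
```

In this toy data the focused library scores 0.50 and the DOS-style library 1.00, because the focused compounds share two of their three features while the DOS-style compounds share none. Real work would use computed fingerprints from a cheminformatics toolkit rather than hand-written feature sets.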
There are four requirements for increasing molecular diversity in drug synthesis: appendage
diversity, stereochemical diversity, functional group diversity and skeletal diversity. The
final requirement is the hardest to achieve, and two strategies have been developed to help
improve the skeletal diversity of novel compounds: folding pathways and branching pathways.
Figure 9 shows a basic schematic of how the two techniques work. The folding pathway uses one
common reagent that folds the structure in a specific way, but applies it to a variety of
different starting materials to produce a large number of diverse compounds. The branching
pathway uses one starting compound and a range of different reagents to convert that one
compound into several different, diverse sets of compounds. At the end of both processes a
large, diverse range of compounds will have been produced.
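The difference between the two pathways is simply which input is held constant and which is varied. The sketch below is bookkeeping only and models no chemistry: the labels S1, R1 and so on are hypothetical placeholders for starting materials and reagents, and each product is recorded as the pair that produced it.

```python
def branching_pathway(substrate: str, reagents: list) -> list:
    """One common starting compound, many reagents: one product per reagent."""
    return [(substrate, reagent) for reagent in reagents]


def folding_pathway(substrates: list, reagent: str) -> list:
    """Many starting materials, one common reagent: one product per substrate."""
    return [(substrate, reagent) for substrate in substrates]


# Hypothetical labels, used only to show which input varies in each pathway.
branching = branching_pathway("S1", ["R1", "R2", "R3"])
folding = folding_pathway(["S1", "S2", "S3"], "R1")

print("branching:", branching)
print("folding:  ", folding)
```

In the branching case every product shares the substrate S1 and differs by reagent; in the folding case every product shares the reagent R1 and differs by substrate.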
Figure 8
Figure 9
Figure 10
When either technique is used, the diversity of the molecules produced increases dramatically.
Figure 10 shows a graph that uses two molecular descriptors to show the distribution of
compounds. The MDDR library contains compounds that are known to be drugs, and when this
database is compared with a DOS library (red) and a focussed library (blue) it is clear that
the DOS library is distributed much more widely than the focussed library, showing the level
of diversity that is achieved when DOS is used to produce new compounds [xxiv].
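One simple way to put a number on how widely a library is distributed in a two-descriptor plot is to divide the plot into a grid and count the cells that contain at least one compound. The sketch below assumes this grid-occupancy measure, which is a common illustration rather than the method used in the cited figure; the descriptor values are invented.

```python
def occupied_cells(points: list, cell_size: float) -> set:
    """Grid cells of a 2D descriptor space that contain at least one compound."""
    return {(int(x // cell_size), int(y // cell_size)) for x, y in points}


# Hypothetical (descriptor 1, descriptor 2) values, invented for illustration only.
focused_points = [(1.0, 1.1), (1.2, 1.3), (1.4, 1.0), (1.1, 1.4)]
dos_points = [(0.5, 3.2), (2.7, 0.4), (4.1, 4.8), (1.3, 1.2)]

focused_cells = occupied_cells(focused_points, cell_size=1.0)
dos_cells = occupied_cells(dos_points, cell_size=1.0)

print("cells covered by focused library:", len(focused_cells))
print("cells covered by DOS library:    ", len(dos_cells))
```

With these made-up values the four focused compounds fall into a single cell, while the four DOS compounds occupy four different cells. The result depends on the chosen cell size, so the measure is only meaningful when libraries are compared on the same grid.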
Conclusion
This literature review focused on two aspects: fixing problems associated with combinatorial
synthesis, and finding alternative routes of synthesis that improve molecular diversity. The
first part covered the improvement of techniques for analysing compounds in libraries and for
combining libraries, with the aid of computational techniques such as virtual docking, inverse
docking, LASSOO, molecular descriptors, flexibility of drug binding and the analysis of drug
and non-drug compounds. These techniques were used to analyse large libraries containing
100,000s to 1,000,000s of compounds to determine their relative diversity and viability as
useful compounds for further drug development.
The second part of the review focused on finding alternative routes of synthesis to
combinatorial synthesis in order to improve molecular diversity. Techniques that can be used
to synthesise new, very diverse compounds include structure-based synthesis, fragment-based
synthesis and diversity-oriented synthesis. All three of these techniques have been modified
and improved to increase the molecular diversity that can be achieved, so that diverse drug
compounds can go from discovery to lead with greater success rates.
The techniques discussed in this literature review highlight the improvements that are being
made to the process of drug discovery. As these areas of research progress further, the ease
with which more diverse compounds are discovered and synthesised will increase.
i
Higginbotham, S., Wong, W., Linington, R., Spadafora, C., Iturrado, L. and Arnold, A. (2014). Sloth Hair as a
Novel Source of Fungi with Potent Anti-Parasitic, Anti-Cancer and Anti-Bacterial Bioactivity. PLoS ONE, 9(1),
p.e84549.
ii
Combinatorial Chemistry Review. (2016). [online] Combichemistry.com. Available at:
http://www.combichemistry.com.
iii
Klon, A., Glick, M., Thoma, M., Acklin, P. and Davies, J. (2004). Finding More Needles in the Haystack: A
Simple and Efficient Method for Improving High-Throughput Docking Results. J. Med. Chem., 47(11), pp.2743-
2749.
iv
Shoichet, B. (2004). Virtual screening of chemical libraries. Nature, 432(7019), pp.862-865.
v
Ellingson, S., Dakshanamurthy, S., Brown, M., Smith, J. and Baudry, J. (2013). Accelerating virtual high-
throughput ligand docking: current technology and case study on a petascale supercomputer. Concurrency and
Computation: Practice and Experience, 26(6), pp.1268-1277.
vi
Hou, X., Li, K., Yu, X., Sun, J. and Fang, H. (2015). Protein Flexibility in Docking-Based Virtual Screening:
Discovery of Novel Lymphoid-Specific Tyrosine Phosphatase Inhibitors Using Multiple Crystal Structures. Journal
of Chemical Information and Modeling, 55(9), pp.1973-1983.
vii
Bottegoni, G., Rocchia, W., Rueda, M., Abagyan, R. and Cavalli, A. (2011). Systematic Exploitation of Multiple
Receptor Conformations for Virtual Ligand Screening. PLoS ONE, 6(5), p.e18845.
viii
Dixon, S. and Villar, H. (1998). Bioactive Diversity and Screening Library Selection via Affinity Fingerprinting.
Journal of Chemical Information and Modeling, 38(6), pp.1192-1203.
ix
Chen, Y. and Ung, C. (2001). Prediction of potential toxicity and side effect protein targets of a small molecule
by a ligand–protein inverse docking approach. Journal of Molecular Graphics and Modelling, 20(3), pp.199-218.
x
Patterson, D., Cramer, R., Ferguson, A., Clark, R. and Weinberger, L. (1996). Neighborhood Behavior: A Useful
Concept for Validation of “Molecular Diversity” Descriptors. J. Med. Chem., 39(16), pp.3049-3059.
xi
Cummins, D., Andrews, C., Bentley, J. and Cory, M. (1996). Molecular Diversity in Chemical Databases:
Comparison of Medicinal Chemistry Knowledge Bases and Databases of Commercially Available Compounds.
Journal of Chemical Information and Modeling, 36(4), pp.750-763.
xii
Schamberger, J., Grimm, M., Steinmeyer, A. and Hillisch, A. (2011). Rendezvous in chemical space? Comparing
the small molecule compound libraries of Bayer and Schering. Drug Discovery Today, 16(13-14), pp.636-641.
xiii
Hassan, M., Bielawski, J., Hempel, J. and Waldman, M. (1996). Optimization and visualization of molecular
diversity of combinatorial libraries. Molecular Diversity, 2(1-2), pp.64-74.
xiv
Koehler, R., Dixon, S. and Villar, H. (1999). LASSOO: A Generalized Directed Diversity Approach to the Design
and Enrichment of Chemical Libraries. J. Med. Chem., 42(22), pp.4695-4704.
xv
Sadowski, J. and Kubinyi, H. (1998). A Scoring Scheme for Discriminating between Drugs and Nondrugs. J.
Med. Chem., 41(18), pp.3325-3329.
xvi
Schoonjans, V., Questier, F., Borosy, A., Walczak, B., Massart, D. and Hudson, B. (2000). Use of mass
spectrometry for assessing similarity/diversity of natural products with unknown chemical structures. Journal of
Pharmaceutical and Biomedical Analysis, 21(6), pp.1197-1214.
xvii
Li, J., Murray, C., Waszkowycz, B. and Young, S. (1998). Targeted molecular diversity in drug discovery:
Integration of structure-based design and combinatorial chemistry. Drug Discovery Today, 3(3), pp.105-112.
xviii
Kick, E., Roe, D., Geoffrey Skillman, A., Liu, G., Ewing, T., Sun, Y., Kuntz, I. and Ellman, J. (1997). Structure-
based design and combinatorial chemistry yield low nanomolar inhibitors of cathepsin D. Chemistry & Biology,
4(4), pp.297-307.
xix
Murray, C. and Rees, D. (2015). Opportunity Knocks: Organic Chemistry for Fragment-Based Drug Discovery
(FBDD). Angewandte Chemie International Edition, 55(2), pp.488-492.
xx
Law, R., Barker, O., Barker, J., Hesterkamp, T., Godemann, R., Andersen, O., Fryatt, T., Courtney, S., Hallett, D.
and Whittaker, M. (2009). The multiple roles of computational chemistry in fragment-based drug design.
Journal of Computer-Aided Molecular Design, 23(8), pp.459-473.
xxi
Spring, D. (2003). Diversity-oriented synthesis; a challenge for synthetic chemists. Organic &
Biomolecular Chemistry, 1(22), p.3867.
xxii
Ma DL, L. (2013). Future Frontiers in Diversity-Oriented Synthesis. Organic Chem Curr Res, 03(01).
xxiii
Fergus, S., Bender, A. and Spring, D. (2005). Assessment of structural diversity in combinatorial synthesis.
Current Opinion in Chemical Biology, 9(3), pp.304-309.
xxiv
Spandl, R., Díaz‐Gavilán, M., O'Connell, K., Thomas, G. and Spring, D. (2008). Diversity‐oriented synthesis.
Chem. Record, 8(3), pp.129-142.

AXP302

  • 1. Abhishek Pai 1319400 1 | P a g e SCHOOL OF CHEMISTRY CHM3B2 – Literature Project Literature Review Fixing Problems and Finding Alternatives to Combinatorial Chemistry to increase molecular diversity Abhishek Pai 1319400 April 2016 PROJECT SUPERVISOR: Dr John Wilkie
  • 2. Abhishek Pai 1319400 2 | P a g e Abstract Drug discovery and development is a very difficult area of research and consumes a large amount of resources to produce effective results. Researchers have found a technique of synthesising large numbers of molecules called combinatorial synthesis. The use of this method meant that libraries increased in size dramatically anywhere from 100,000s to 1,000,000s of compounds. The production of these compounds saves the pharmaceutical industry a lot of resources that would normally be used to discover new compounds through fieldwork. But they discovered a new problem with this synthesis technique; it would produce large libraries but introduced the added disadvantage of having to find compounds that were therapeutically active or diverse in structure and property. The large numbers became more of a hindrance than a solution; the best analogy for this problem is “Needle in a haystack”. The diversity aspect of drug discovery was also very important as it meant that novel drug treatments could be found. This literature review focusses on two aspects, firstly solving the issues related to finding compounds within combinatorial libraries, secondly finding alternative routes of synthesis that may produce a larger range of molecular diversity for drug discovery and development. Introduction Currently within the pharmaceutical industry the most popular method of increasing the number of available compounds involves the use of combinatorial chemistry. The classical method of discovering new molecules required fieldwork to obtain new and interesting compounds that chemists can then manipulate and test. An example of such fieldwork is the exploration of remote biomes such as rain forests, deserts, tundra’s etc. The rainforest is an incredible breeding ground for compounds that have never seen in labs. There is enormous potential for discovery due to their isolation from humans. 
Sloths are an example of creatures that have a lot of potential to produce new research material. Their fur is believed to be an excellent breeding ground for new species of bacteria and fungi. Researchers are hoping to exploit this to obtain novel antibacterial and antifungal compounds. The serendipitous discovery of new drug compounds is what the researchers are looking for, since the fungi and bacteria are able to evolve new ways of combating problems such as antibiotic resistance that is very prevalent around the globe. But this technique is very time-consuming and cost inefficient. It requires the investments of large sums of money and the returns may never be as fruitfuli .
  • 3. Abhishek Pai 1319400 3 | P a g e This is where combinatorial chemistry is comes in as it is very efficient at producing large numbers of diverse compounds which may occur naturally in wild. It doesn’t require the man power and resources like fieldwork. The process of combinatorial chemistry speeds up the process of creating a large library of compounds by avoiding the conventional route of chemical synthesis. The conventional method involves reacting compounds A and B to form the product AB. This process is very slow and only produces one compound whereas combinatorial chemistry uses several derivatives or analogues of compounds A and B to form combinations of products. For example, combining A1, A2, A3 … An and B1, B2, B3 … Bn to form A1B1, A1B2, A2B1 … AnBn, Figure 1 shows how the process of combinatorial synthesis works. The diversity of compounds produced is directly related to the variation of the initial compounds that are available for the reaction. This enables the Pharmaceutical companies to build large libraries of compounds over short periods of time due to the rapid nature of this process. It is hypothesised that increasing the number of compounds in the combinatorial library also makes it more likely that compounds that are therapeutically active are discoveredii . It is important to note that not all the compounds produced using this method will work in medicinal chemistry; it may have uses in other fields such as food, cleaning, petro-chemical etc. it may also be useless, this factor is explored later on in the literature review. Means of testing the large numbers of compounds produced by combinatorial synthesis for further development and commercial viability were also developed called High Throughput Screening (HTS) and High Throughput Docking (HTD). HTS uses live cultures of cells with the aid of robotics and computerised techniques to filter the large combinatorial libraries from millions of compounds down to hundreds of compounds. 
Since not every compound within the library has a pharmacological effect they eliminate them from the testing sample. They test these libraries of compounds on live cell cultures to detect any response that maybe produced that shows promising signs of being used for therapeutic purposes. On the other hand, HTD is a computational simulation that analyses the 3D structure of the target and the compound to find the ideal conformation for binding. This method is sometimes considered too rigid as it can only use data points inputted into the simulation to determine the binding ability of the compound. Since research on compounds and targets in ongoing to determine their binding points new information is constantly being discovered Figure 1
  • 4. Abhishek Pai 1319400 4 | P a g e about the way a compound binds to a target. This lack of information and flexibility means that the binding capability of some compounds may be completely ignored, whereas HTS would’ve detected this capability due to the flexibility and fluidity of the live cell cultures. But HTD has the ability to use ranking systems to list compounds that are the most likely to bind and produce a response in cell cultures. As it is a computational method of screening potential drug candidates it is very cheap and efficient at producing results. Increasing the speed of testing is simply a matter of increasing the computing power which has become cheaper over the years due to developments in computing technologyiii . Review of Literature Fixing Problems associated with Combinatorial Libraries The large size of combinatorial libraries has become a little bit of a hindrance for drug discovery and development. It has become very difficult to assess the viability of all the compounds and determine their value to researchers. So over the years analytical techniques have been developed with the aid of computer software to search through libraries and find useful compounds, but also to determine the relative molecular diversity and even aiding in the acquisitions of company libraries. The techniques discussed below are currently being used to fix the issues that researchers and companies face when dealing with combinatorial libraries. Virtual Docking The development of innovative software techniques is being used to enable more accurate and efficient docking of compounds to targetsiv . Testing each and every single compound in a library using HTS is very time consuming and incredibly expensive for researchers. This is where HTD has been used as an alternative, open source software such as Autodock 4 has been developed as a means of achieving these goals. 
The nature of this software development allows anyone from around the world to tweak and improve the software to their needs. They can also submit improvements for everyone to use and share, due to the open source nature of the softwarev . Current forms of docking software were mainly used to input a large array of compound into the simulation to bind to the one specific target. One great leap that needs to be taken is the use of flexibility in target receptors, as they are biological components and therefore they don’t have a rigid structure but rather a certain amount of fluidity. This increases the chances of a conformational match of novel new compounds that would’ve never been considered as potential candidates for future drug development. It also allows researchers to find and eliminate the possibilities of side effects
  • 5. Abhishek Pai 1319400 5 | P a g e and adverse reactions when testing the drug. This can be a major issue as compounds can have multiple sites of action which may be unknown to the researchers, these may be hard to detect during early phases of the drug trial. Thus further testing of particular compounds can become a waste of resources for pharmaceutical companies. This technique increases the overall success rate of further drug development trialsvi . Compounds have several different spatial conformations, this can be exploited as it increases the diversity of molecular shapes that are available to interact with and bind to receptors. But this diversity in conformation isn’t exclusive to compounds, it also affects receptors, as they are also fluid and flexible, this is believed to be the next breakthrough in improving the results obtained by High Throughput Docking. Similar to drug compounds, receptors have bonds that rotate and bend to produce different conformations and locations for the drug compounds to interact and bind. It is this relaxed nature of proteins that needs to be incorporated into the molecular docking software to increase the possibilities of docking of unconventional and novel compounds. The research into receptor and protein fluidity is still ongoing and there is still much to be discovered. But the inclusion of the limited data that is available is a good start to reducing unwanted interactions and reactions in the body. Researchers have used data collected from multiple crystallographic receptor conformations and incorporated this into the HTD. The reasoning behind such an approach is to use HTD software to test unconventional binding locations for diverse range of compounds. The inclusion of protein flexibility will enable the docking software to predict any possible interactions and binding that has not been previously observed in conventional HTD methods. 
This technique increases the number of active compounds discovered that would usually be considered nonviable. It also increases the diversity of compounds found to be therapeutically active for further drug development [vii].

Compounds that appear very similar in structure can have vastly different interaction and binding properties. This can greatly hinder researchers, because HTD programs that analyse only the structure of a compound may conclude that it is therapeutically active and viable for further development. In reality such compounds may be unable to bind effectively to their targets and produce an effective response, and they only slow the progress of drug development. A further development uses bioassays to measure the IC50 of each compound against a panel of proteins; this data is used to create a bioactivity profile called an affinity fingerprint. Used in conjunction with the structural data already available in combinatorial libraries, this information can aid HTD immensely. It eliminates compounds that have similar structures but little or no biological activity, and it widens the scope of HTD to find
compounds with a diverse range of structures that would otherwise never have been considered for further development [viii].

Inverse Docking
Inverse docking is another technique; it searches through databases of three-dimensional protein targets to find cavities in which the ligand can bind successfully. It is used to help predict unwanted binding interactions that may occur. When incorporated into a ranking system, this process increases the accuracy of the binding predictions carried into the experimental phase. Docking is conducted by testing each ligand conformation against the binding sites of the protein. Each cavity within the protein is tested to find the arrangement in which the least energy is required for the ligand and protein to bind. The docking software calculates these energy values to enable better ranking of potential drug candidates. The term “inverse” refers to the software finding proteins, other than the primary target, that bind to the ligand. Using inverse docking to test a ligand against all available receptor variations will improve drug development, increasing efficiency and reducing costs by eliminating ligands that are not viable [ix].

Molecular Descriptors
Molecular descriptors are a vital part of pharmaceutical chemistry: they take the chemical structure, and all the information that can be derived from it, and convert it into numerical data. The symbolic representation of a chemical structure is converted into numerical properties such as affinity, efficacy, polarizability, hydrophilicity and lipophilicity, some of which are measured using bioassays; this data is more useful to researchers. When searching for new compounds, it is more effective to look for compounds with differing properties and structures, as too many similarities in molecular descriptors will result in wasted time and resources for the pharmaceutical company.
However, differences in certain molecular descriptors will produce compounds that are very diverse in structure and physical properties, which may aid the discovery of novel drug treatments. An activity island is a region on the molecular descriptor graph that indicates molecules likely to be active compounds. The information gained from the molecular descriptor graph can be used to modify drugs and produce improved versions. Another parameter is the neighbourhood region: a boundary drawn around the properties of an existing compound so that the region can be avoided, since any compound produced within it would have very similar structural and physical properties. Figure 2 shows the difference between typical and ideal compound libraries and the distribution of
compounds using two molecular descriptors. It also shows the neighbourhood regions and activity islands that can be explored to produce more diverse drugs. Researchers choose specific molecular descriptors and plot the data from large combinatorial libraries to determine the activity islands and neighbourhood regions. They can then avoid the neighbourhood regions around existing compounds and concentrate on creating molecules that fall within an activity island. Avoiding neighbourhood regions reduces the risk of producing compounds with similar properties and structures, thus increasing molecular diversity [x].

A study was conducted on the molecular diversity of chemical databases, using molecular descriptors to compare five different databases. CMC and MDDR contain medicinal compounds, ACD and SPECS contain commercially available chemicals, and the Wellcome Registry contains potential biochemical compounds. Using the descriptors, the researchers were able to identify the super-population of compounds with very similar properties and structures to one another. This revealed two things: firstly, commercially available compounds that were also medicinally active, and secondly, outliers with some potential to be developed into therapeutic drugs. This was only achieved by superimposing the five databases using specific molecular descriptors and producing a single metric that could be compared between databases. Using this technique, diverse compounds were identified for further development [xi].

Combining Libraries
The information collected from molecular descriptors, activity islands and neighbourhood regions can be used very effectively when combining large combinatorial libraries.
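As a rough illustration of how neighbourhood regions might be applied in practice, the Python sketch below selects compounds so that no chosen compound falls inside the neighbourhood of another. This is a generic sphere-exclusion style selection, not a procedure taken from the cited papers; the two-descriptor vectors, the compound names and the radius are made-up assumptions.

```python
import math
from typing import Dict, List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_outside_neighbourhoods(
    descriptors: Dict[str, Sequence[float]], radius: float
) -> List[str]:
    """Keep a compound only if it lies outside the neighbourhood region
    (a sphere of the given radius) of every compound already kept."""
    selected: List[str] = []
    for name, vector in descriptors.items():
        if all(euclidean(vector, descriptors[s]) > radius for s in selected):
            selected.append(name)
    return selected


if __name__ == "__main__":
    # Made-up descriptor values (for example two scaled properties per compound).
    library = {
        "c1": (0.0, 0.0),
        "c2": (0.1, 0.1),   # inside the neighbourhood of c1
        "c3": (2.0, 2.0),
        "c4": (2.05, 1.9),  # inside the neighbourhood of c3
        "c5": (5.0, 0.0),
    }
    print(select_outside_neighbourhoods(library, radius=0.5))
```

The result depends on the order in which compounds are visited and on the choice of radius, so in practice both would need to be justified for the descriptors in use.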
Since the advent of combinatorial chemistry, the pharmaceutical industry has been churning out new compounds and increasing the size of its combinatorial libraries, resulting in very large libraries of compounds. As corporations have a vested interest in discovering new drug compounds, they may make a calculated decision to combine their libraries, increasing library size and in turn their chances of making new discoveries.

Figure 2

Companies can
continue to use combinatorial synthesis to expand their libraries, but an alternative is to combine libraries in order to fill the chemical spaces or activity islands missing from their own library. The paper “Rendezvous in chemical space? Comparing the small molecule compound libraries of Bayer and Schering” discusses combining the large combinatorial libraries of two companies and assesses the advantages and disadvantages of merging these two particular libraries [xii][xiii].

The technique used to test such libraries is called LASSOO. It stands for Library Acquisition with Simultaneous Scoring to Optimize Ordering and is used to determine whether an external library is worth combining with an internal one. It does this by plotting the molecular descriptors of the compounds and using a scoring method to determine whether compounds in the external library have desired characteristics that are absent from the internal library. The external library needs to be diverse enough to fill the “chemical spaces” that the pharmaceutical company has within its internal library. Figure 3 shows a two-dimensional representation of molecular descriptors for an external (Target) library and an internal (Current) library. The composite combines the two libraries and shows the areas where combining them would be favourable or unfavourable. The light groupings indicate compounds that would be favourable additions to the internal library, while the dark groupings indicate compounds that would be unfavourable. This scoring method helps researchers determine whether the target library is worth merging with, based on how different its compounds are from those in the current library.
By using LASSOO, companies can determine the relative distribution and diversity of molecular descriptors across the two libraries and the benefits of combining them. Figure 3 uses only two molecular descriptors to produce a two-dimensional graph, but more descriptors can be included to produce graphs in three, four or more dimensions. Graphs at higher dimensions are difficult to visualise, but the software can still use its scoring method to determine the viability of the library merger [xiv].

Figure 3
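The idea of scoring external compounds by how well they fill gaps in an internal library can be sketched in a few lines of Python. This is a deliberately simplified stand-in and not the published LASSOO scoring function: here the score is just the distance from each external compound to its nearest internal neighbour in descriptor space, and the compound names, descriptor values and `min_novelty` threshold are assumptions for illustration. The same code works for any number of descriptors, which mirrors the point that scoring remains usable when the data can no longer be plotted.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def novelty_scores(
    internal: Dict[str, Vector], external: Dict[str, Vector]
) -> Dict[str, float]:
    """Score each external compound by the distance to its nearest
    internal compound; a larger score means it fills an emptier region."""
    return {
        name: min(distance(vec, own) for own in internal.values())
        for name, vec in external.items()
    }


def rank_for_acquisition(
    internal: Dict[str, Vector], external: Dict[str, Vector], min_novelty: float
) -> List[Tuple[str, float]]:
    """Return external compounds that are at least min_novelty away from
    everything in the internal library, most novel first."""
    scores = novelty_scores(internal, external)
    keep = [(name, s) for name, s in scores.items() if s >= min_novelty]
    return sorted(keep, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    current = {"i1": (0.0, 0.0), "i2": (1.0, 0.0)}                   # internal library
    target = {"e1": (0.0, 0.1), "e2": (3.0, 4.0), "e3": (1.0, 2.0)}  # external library
    for name, score in rank_for_acquisition(current, target, min_novelty=1.0):
        print(name, round(score, 3))
```

In this toy example e1 is discarded because it sits next to an existing internal compound, which corresponds loosely to the dark (unfavourable) groupings in the composite plot.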
The authors of the paper analysed the compounds from Bayer and Schering. The analysis found that structural identity between the two libraries was very low but that their physico-chemical properties overlapped closely. The overlap in chemical space was also significant, meaning that the two companies’ libraries were highly complementary to one another. The decision was taken to keep the libraries separate and in-house and instead to exchange “hit” lists periodically, which was seen as a better option. Combining two extremely large libraries would require a large amount of resources and would also make screening the library more cumbersome. Screening the compounds in each library independently offers a better chance of finding hits and lead compounds for drug development.

Analysing Drug and Non-Drug compounds within Libraries
Another problem that arises frequently in combinatorial chemistry is how to distinguish between “drug” and “nondrug” compounds. Combinatorial synthesis uses biologically active compounds to produce large numbers of iterations and derivatives of those compounds, but the products will not necessarily be biologically or therapeutically active. The large size of the libraries makes testing compounds by HTS resource intensive. Discriminating between biologically active and inactive compounds is vital in reducing the redundancy of chemical compounds in the library. By testing for and eliminating these redundant chemicals, researchers can increase the speed of screening for viable compounds for drug development. A software scoring method has been developed to overcome this problem by automatically and rapidly classifying compounds into “drug” and “nondrug” categories.
The researchers developing this technique used two databases of compounds to validate their method: 169,331 molecules from the Available Chemicals Directory (ACD) and 38,416 molecules from the World Drug Index (WDI). They used these publicly available libraries because they were easily accessible and the molecules were already validated by the pharmaceutical community and government regulators. The scoring method correctly classified 83% of the ACD compounds and 77% of the WDI compounds. It is important to highlight that the WDI contains only biologically active drugs, so the scoring scheme should ideally have classified nearly all of its compounds as drugs, allowing for false negatives and positives. There is therefore room to improve the effectiveness of the scoring method. One potential use for this scoring method is testing the large numbers of diverse molecules in libraries to determine whether they could be used as potential drugs. It can also be used when combining two large libraries to make sure that the libraries contain more potential drug
compounds than non-drug compounds, enabling researchers to selectively purchase or test compounds in large libraries [xv].

Analysing Molecular Diversity
The main focus of research has been on searching through vast combinatorial libraries of compounds using HTD or HTS without any regard for molecular similarity or diversity. The techniques discussed above all use computer software to increase the probability of finding a compound for drug development by analysing and/or combining libraries. Because data on molecular diversity is lacking, researchers have used mass spectrometry as an analytical technique to produce quantitative data that can be vital in drug development. Mass spectrometry works by breaking large compounds down into small ionised fragments. The way a compound breaks down is characteristic of that compound and produces a fingerprint in its spectrum, which helps researchers determine which compound is being tested. These molecular fingerprints have been used to create a fragment dictionary indexed by specific fragment patterns. Since two very similar structures produce very similar fragments, while two very different structures produce very different fragments, the spectral information can be used to determine the relative diversity of compounds. The data created by this technique are called structural keys and hashed fingerprints. Comparing them gives a quantitative method of analysing the relative diversity of compounds within a library [xvi].
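One common way to compare fingerprints quantitatively is the Tanimoto coefficient, the number of features two fingerprints share divided by the number present in either. The Python sketch below assumes each fingerprint is stored as a set of "on" bit positions and summarises library diversity as one minus the mean pairwise similarity. The choice of Tanimoto, the diversity summary and the example fingerprints are assumptions for illustration and are not claimed to be the measure used in the cited study.

```python
from itertools import combinations
from typing import Dict, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity: shared features divided by features in either."""
    if not a and not b:
        # Convention chosen for this sketch: two empty fingerprints are identical.
        return 1.0
    return len(a & b) / len(a | b)


def library_diversity(fingerprints: Dict[str, Set[int]]) -> float:
    """One minus the mean pairwise Tanimoto similarity (0 = all identical)."""
    pairs = list(combinations(fingerprints.values(), 2))
    if not pairs:
        return 0.0
    mean_similarity = sum(tanimoto(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_similarity


if __name__ == "__main__":
    # Made-up fingerprints: each integer marks a fragment pattern that is present.
    library = {
        "x": {1, 2, 3, 4},
        "y": {3, 4, 5, 6},
        "z": {1, 2, 3, 4},
    }
    print(tanimoto(library["x"], library["y"]))
    print(library_diversity(library))
```

A library of near-duplicates gives a diversity close to zero, while a library whose members share few fragment patterns gives a value close to one.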
Alternatives to Combinatorial Synthesis to increase Molecular Diversity
The main problem with all the techniques discussed above is that they analyse compounds made by combinatorial synthesis: they all focus on creating larger libraries, analysing those libraries to find a “hit”, and then developing that hit into a lead compound. Diversity is very important when it comes to discovering a novel treatment for a particular disease, but combinatorial synthesis does not seem able to provide it. Combinatorial synthesis is very good at producing large numbers of new compounds, but most of them start from a pool of compounds that may be very similar in structure. This similarity is carried over from the starting compounds to the final products, resulting in a lack of diversity. To address this issue, this review looks for methods of synthesis, different from combinatorial synthesis, that can produce compounds with greater structural diversity.
Structure-based synthesis
Structure-based synthesis, also known as target-oriented synthesis (TOS), is based on designing a drug with a specific structure rather than discovering it in nature or in a large library of compounds. Data on the 3-D shape of the target molecule is collected using X-ray crystallography or NMR spectroscopy, and this information is then used to design drugs that fit into the target site. Since the target site is already known, drug candidates can be developed to have higher affinity and selectivity. This can also reduce the number of drugs that bind to secondary sites and cause adverse side effects.

The main issue with combinatorial chemistry is the large number of compounds that it produces, most of which may never be useful as therapeutic drugs. Combining structural information about target sites with the rapid production of large numbers of compounds by combinatorial chemistry promises a large step forward in the discovery of molecularly diverse yet target-specific drugs. This technique starts from a base molecule that fits easily into the target and then produces large numbers of derivatives that have a much higher probability of working at the target site. Figure 4 shows the diversity of compounds produced by combinatorial synthesis; although these are very diverse, they are not selective or specific for a target site. Integrating the two methods involves finding an initial candidate that fits the target site. For example, Figure 5 shows the steroid backbone shared by all the derivatives of a compound. This backbone enables the compounds to bind to the target site with greater selectivity while still varying in affinity, efficacy and other properties, improving the therapeutic indexes of new drug treatments [xvii].
Figure 4
Figure 5

Research conducted by UC Berkeley and UC San Francisco used this integrative concept of structure-based drug design and combinatorial chemistry. The researchers identified a protein called cathepsin D, for which no potent inhibitors were available on the market. They found the smallest
molecule that can inhibit this protein and created a second-generation library, using combinatorial synthesis to produce new compounds built on a basic scaffold. This second-generation library can then be searched and filtered to find compounds with a specific range of properties. The researchers used potency as the property with which to compare a library containing a diverse range of compounds against a receptor-targeted library. The compounds discovered in the second-generation receptor-targeted combinatorial library were found to be 5 to 6 fold more potent than those discovered in the diverse library. The success of this study shows the potential of targeted structure-based combinatorial chemistry and the benefits it holds for producing compounds that are viable for therapeutic use [xviii].

Fragment-based synthesis
Fragment-based drug discovery (FBDD) is a technique based on discovering small molecules that bind weakly to the target site and then building them up into larger, more complex and diverse molecules with improved properties such as higher affinity, efficacy and selectivity. The process employs HTS to test small compounds against a variety of target sites in vitro. HTS is carried out on compounds smaller than 500 Da from large combinatorial libraries that may contain from 100,000s to 1,000,000s of compounds. The weight limit significantly reduces the number of molecules that need to be tested, reducing the time it takes to find a suitable candidate for FBDD. The strategy of this technique is to build large, complex molecules from small, simple ones [xix].

Research has been conducted on combining computational chemistry with fragment-based drug discovery to make the process of discovering a fragment and developing it from fragment to lead (F2L) much more efficient.
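The molecular-weight cut-off described above is the simplest kind of computational filter that can be applied before screening. The Python sketch below shows it under stated assumptions: the `Compound` record, the compound names and the weights are invented for illustration, and the default limit of 500 Da is simply the figure quoted in this review rather than a recommended value.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Compound:
    """Minimal hypothetical record for a library entry."""
    name: str
    mol_weight_da: float


def focussed_fragment_library(
    library: Iterable[Compound], max_weight_da: float = 500.0
) -> List[Compound]:
    """Keep only compounds lighter than the cut-off, lightest first."""
    kept = [c for c in library if c.mol_weight_da < max_weight_da]
    return sorted(kept, key=lambda c: c.mol_weight_da)


if __name__ == "__main__":
    # Made-up entries standing in for a large combinatorial library.
    library = [
        Compound("cmpd_1", 712.4),
        Compound("cmpd_2", 184.2),
        Compound("cmpd_3", 499.9),
        Compound("cmpd_4", 500.0),
        Compound("cmpd_5", 256.3),
    ]
    for compound in focussed_fragment_library(library):
        print(compound.name, compound.mol_weight_da)
```

Further filters, for example on predicted binding or on other descriptors, could be chained in the same way to narrow the library before any in vitro work.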
Computational chemistry is also used to narrow large libraries of compounds down to smaller libraries containing fragments that can be used in FBDD. Creating a focussed library of fragments increases the efficiency of discovering new potential drug compounds.

Figure 6

With the aid of bioassays, NMR and X-ray crystallography, the structures of compounds and ligands can be determined. This information is then fed into HTD to determine the relative compatibility of the compound,
called the hit conformation. The compounds are then tested in vitro to determine their structure-activity relationship (SAR); this information can be used to develop the compounds further and tune their potency, affinity and other properties. Figure 6 shows the process of FBDD and how it filters compounds down the flow chart before they are built up into larger molecules, resulting in increased molecular diversity.

In pharmaceutical research the patent landscape for intellectual property is very crowded, but because FBDD starts from small compounds and builds larger ones, the potential for conflict is very low. The compounds produced via FBDD can vary dramatically even when starting from the same initial fragment. This increases molecular diversity considerably and results in the production of novel drug compounds for therapeutic use [xx].

Diversity-oriented synthesis
Combinatorial chemistry has been very useful for producing large numbers of compounds and very large libraries, but the compounds it produces lack diversity. They are too similar in structure because they all start from the same set of starting compounds, resulting in iterative derivatives with very similar structures. Figure 7 shows a compound before and after combinatorial synthesis, illustrating how similar the structures of the starting and final compounds are. There is very little change in structure; the only change that occurs is the addition or removal of functional groups. Target-oriented synthesis (TOS), also known as structure-based synthesis, has been used to analyse a target and find a compound that fits its structure, after which large numbers of derivatives are produced; these compounds also lack diversity.
Using TOS to synthesise compounds results in very specific chemical descriptors for the products, decreasing the diversity of the drug compounds produced. The answer to these issues is diversity-oriented synthesis (DOS), a process that deliberately and efficiently produces more than one set of compounds, diverse in structure and properties, to solve complex drug targeting problems [xxi][xxii].

Figure 7

DOS was developed to use combinatorial synthesis to produce large libraries of compounds while fixing the issue of diversity in the process. DOS aims to produce compounds with a broad range within specific chemical
descriptors, making the chances of producing very diverse compounds much higher. This is because changing the properties of compounds may involve changing a large number of functional groups, stereogenic sites and so on, which can change the overall structure of the compound and thus increase diversity. Figure 8 shows how all three routes of synthesis create compounds and the relative diversity of the compounds produced. This representation shows how much more effective DOS is than conventional combinatorial synthesis and TOS at producing compounds with very high molecular diversity [xxiii].

There are four requirements for increasing molecular diversity in drug synthesis: appendage diversity, stereochemical diversity, functional group diversity and skeletal diversity. The final requirement is the hardest to achieve, and two strategies have been developed to help improve the skeletal diversity of novel compounds. Folding and branching pathways are the two techniques used to develop a diverse range of drug compounds; Figure 9 shows a basic schematic of how they work. The folding pathway uses one common reagent that folds the structure in a specific way but uses a variety of different starting materials to produce a large number of diverse compounds. The branching pathway uses one starting compound and a range of different reagents to convert that compound into several distinct, diverse compounds. At the end of both processes a large, diverse range of compounds will be produced.

Figure 8
Figure 9
Figure 10
When either technique is used, the diversity of the molecules produced increases dramatically. Figure 10 shows a graph using two molecular descriptors to show the distribution of compounds. The MDDR library contains compounds that are known to be drugs; when this database is compared with the DOS library (red) and the focussed library (blue), it is clear that the DOS library is much more widely distributed than the focussed library, showing the level of diversity achieved when DOS is used to produce new compounds [xxiv].

Conclusion
This literature review focussed on two aspects: fixing problems associated with combinatorial synthesis and finding alternative routes of synthesis that improve molecular diversity. Firstly, fixing the problems associated with combinatorial synthesis involves improving the techniques used to analyse compounds in libraries and to combine libraries, with the aid of computational techniques such as virtual docking, inverse docking, LASSOO, molecular descriptors, flexibility of drug binding and the analysis of drug and non-drug compounds. These techniques were used to analyse large libraries containing 100,000s to 1,000,000s of compounds to determine their relative diversity and viability as useful compounds for further drug development.

The second part of the review focussed on finding alternative routes of synthesis to combinatorial synthesis in order to improve molecular diversity. Techniques that can be used to synthesise new, very diverse compounds include structure-based synthesis, fragment-based synthesis and diversity-oriented synthesis. All three techniques have been modified and improved to increase the molecular diversity that can be achieved, so that diverse drug compounds can go from discovery to lead with greater success rates.
The techniques listed above and discussed in this literature review highlight the improvements that are being made to the process of drug discovery. As these areas of research progress further, it will become easier to discover and synthesise more diverse compounds.

[i] Higginbotham, S., Wong, W., Linington, R., Spadafora, C., Iturrado, L. and Arnold, A. (2014). Sloth Hair as a Novel Source of Fungi with Potent Anti-Parasitic, Anti-Cancer and Anti-Bacterial Bioactivity. PLoS ONE, 9(1), p.e84549.
[ii] Combinatorial Chemistry Review. (2016). [online] Combichemistry.com. Available at: http://www.combichemistry.com.
[iii] Klon, A., Glick, M., Thoma, M., Acklin, P. and Davies, J. (2004). Finding More Needles in the Haystack: A Simple and Efficient Method for Improving High-Throughput Docking Results. J. Med. Chem., 47(11), pp.2743-2749.
[iv] Shoichet, B. (2004). Virtual screening of chemical libraries. Nature, 432(7019), pp.862-865.
[v] Ellingson, S., Dakshanamurthy, S., Brown, M., Smith, J. and Baudry, J. (2013). Accelerating virtual high-throughput ligand docking: current technology and case study on a petascale supercomputer. Concurrency and Computation: Practice and Experience, 26(6), pp.1268-1277.
[vi] Hou, X., Li, K., Yu, X., Sun, J. and Fang, H. (2015). Protein Flexibility in Docking-Based Virtual Screening: Discovery of Novel Lymphoid-Specific Tyrosine Phosphatase Inhibitors Using Multiple Crystal Structures. Journal of Chemical Information and Modeling, 55(9), pp.1973-1983.
[vii] Bottegoni, G., Rocchia, W., Rueda, M., Abagyan, R. and Cavalli, A. (2011). Systematic Exploitation of Multiple Receptor Conformations for Virtual Ligand Screening. PLoS ONE, 6(5), p.e18845.
[viii] Dixon, S. and Villar, H. (1998). Bioactive Diversity and Screening Library Selection via Affinity Fingerprinting. Journal of Chemical Information and Modeling, 38(6), pp.1192-1203.
[ix] Chen, Y. and Ung, C. (2001). Prediction of potential toxicity and side effect protein targets of a small molecule by a ligand–protein inverse docking approach. Journal of Molecular Graphics and Modelling, 20(3), pp.199-218.
[x] Patterson, D., Cramer, R., Ferguson, A., Clark, R. and Weinberger, L. (1996). Neighborhood Behavior: A Useful Concept for Validation of “Molecular Diversity” Descriptors. J. Med. Chem., 39(16), pp.3049-3059.
[xi] Cummins, D., Andrews, C., Bentley, J. and Cory, M. (1996). Molecular Diversity in Chemical Databases: Comparison of Medicinal Chemistry Knowledge Bases and Databases of Commercially Available Compounds. Journal of Chemical Information and Modeling, 36(4), pp.750-763.
[xii] Schamberger, J., Grimm, M., Steinmeyer, A. and Hillisch, A. (2011). Rendezvous in chemical space? Comparing the small molecule compound libraries of Bayer and Schering. Drug Discovery Today, 16(13-14), pp.636-641.
[xiii] Hassan, M., Bielawski, J., Hempel, J. and Waldman, M. (1996). Optimization and visualization of molecular diversity of combinatorial libraries. Molecular Diversity, 2(1-2), pp.64-74.
[xiv] Koehler, R., Dixon, S. and Villar, H. (1999). LASSOO: A Generalized Directed Diversity Approach to the Design and Enrichment of Chemical Libraries. J. Med. Chem., 42(22), pp.4695-4704.
[xv] Sadowski, J. and Kubinyi, H. (1998). A Scoring Scheme for Discriminating between Drugs and Nondrugs. J. Med. Chem., 41(18), pp.3325-3329.
[xvi] Schoonjans, V., Questier, F., Borosy, A., Walczak, B., Massart, D. and Hudson, B. (2000). Use of mass spectrometry for assessing similarity/diversity of natural products with unknown chemical structures. Journal of Pharmaceutical and Biomedical Analysis, 21(6), pp.1197-1214.
[xvii] Li, J., Murray, C., Waszkowycz, B. and Young, S. (1998). Targeted molecular diversity in drug discovery: Integration of structure-based design and combinatorial chemistry. Drug Discovery Today, 3(3), pp.105-112.
[xviii] Kick, E., Roe, D., Geoffrey Skillman, A., Liu, G., Ewing, T., Sun, Y., Kuntz, I. and Ellman, J. (1997). Structure-based design and combinatorial chemistry yield low nanomolar inhibitors of cathepsin D. Chemistry & Biology, 4(4), pp.297-307.
[xix] Murray, C. and Rees, D. (2015). Opportunity Knocks: Organic Chemistry for Fragment-Based Drug Discovery (FBDD). Angewandte Chemie International Edition, 55(2), pp.488-492.
[xx] Law, R., Barker, O., Barker, J., Hesterkamp, T., Godemann, R., Andersen, O., Fryatt, T., Courtney, S., Hallett, D. and Whittaker, M. (2009). The multiple roles of computational chemistry in fragment-based drug design. Journal of Computer-Aided Molecular Design, 23(8), pp.459-473.
[xxi] Spring, D. (2003). Diversity-oriented synthesis; a challenge for synthetic chemists. Organic & Biomolecular Chemistry, 1(22), p.3867.
[xxii] Ma DL, L. (2013). Future Frontiers in Diversity-Oriented Synthesis. Organic Chem Curr Res, 03(01).
[xxiii] Fergus, S., Bender, A. and Spring, D. (2005). Assessment of structural diversity in combinatorial synthesis. Current Opinion in Chemical Biology, 9(3), pp.304-309.
[xxiv] Spandl, R., Díaz‐Gavilán, M., O'Connell, K., Thomas, G. and Spring, D. (2008). Diversity‐oriented synthesis. Chem. Record, 8(3), pp.129-142.