The talk describes the science and results of a consortium of multiple pharmaceutical companies extracting medicinal chemistry knowledge from research data and the application to real drug design projects. A new technique for automating pharmacophore / toxophore finding from public data is disclosed.
MedChemica Active Learning - Combining MMPA and MLAl Dossetter
Describes MedChemica research on combining Matched Molecular Pair Analysis (MMPA) and Machine Learning (ML) into a closed loop to find and optimize new hits for drug discovery. The talks describes the MMPA and Regression Forest models and how they were combined and some early conclusion. Of these permutative MMPA is the clear winner (Free Wilson ++)
MedChemica - Automated Extraction of Actionable Knowledge from Large Scale in...Al Dossetter
The talk describes the importance of stereochemistry in drug discovery research. A picture of a life sciences industry that is faced with a need to reduce R&D costs is disclosed. The science and results of a consortium of multiple pharmaceutical companies extracting medicinal chemistry knowledge from research data and the application to real drug design projects is presented as a method of reducing costs in research. A new technique for automating pharmacophore / toxophore finding from public data is also disclosed.
How we Built a Large Scale Matched Pair Analysis Engine (MCPairs) using OpenE...Al Dossetter
MCPairs performed Matched Molecular Pair Analysis on large scale to build databases of exploitable knowledge which is accessible for Drug Discovery to accelerate research projects. The talk describes how we did this and some of the challenges.
MedChemica BigData What Is That All About?Al Dossetter
A light look at the world of BigData for the lay person - a look at a couple of examples and what we do in MedChemica to speed up drug discovery. First presented at Macclesfield SciBar, and then Knutsford SciBar.
Practical Drug Discovery using Explainable Artificial IntelligenceAl Dossetter
How to build AI systems to enable the drug hunting medicinal chemist in their day-to-day work. Levels are AI are described and the meaning and context Explainable AI to medicinal chemists. Six medicinal chemist projects are described, as well as Matched Molecular Pair Analysis (MMPA), Machine Learning and Permutative MMPA. In each case how a system can be built to drill back to chemical sub-structures so effective decisions can be made.
Accelerating multiple medicinal chemistry projects using Artificial Intellige...Al Dossetter
The technical methods and results of Matched Molecular Pair Analysis (MMPA) applied from a small, individual assay scale through large pharma scale, to multiple pharma data sharing scale have been published and reviewed. The drive behind these efforts has been to derive a medicinal chemistry knowledge base (i.e. definitive textbook) that can be applied to drug discovery projects. The aim is to greatly decrease the time in lead identification and optimization by the synthesis of fewer compounds. Such a system suggests compound designs to expert chemists to triage; such a process is Artificial Intelligence (AI). Given this context, how does this work on projects? How do the chemists make decisions? What are the results? The talk will answer these questions through project examples where MMPA has been applied and how this led to drug candidates. The projects disclosed are from multiple organisations and describe Cathepsin K inhibitors, Glucokinase Inhibitors, 11β-Hydroxysteroid Dehydrogenase Type I Inhibitors (11β-HSD1), Ghrelin inverse antagonists and Tubulin Polymerization inhibitors. An overview of MMPA will be presented and each project will be briefly described with a focus on how the chemists used MMPA to understand SAR and design compounds. The impact of project progress to CD will be quantified.
Lecture given by Ed Griffen UKQSAR meeting Sept 2017. Covers material from work in our paper http://pubs.acs.org/doi/10.1021/acs.jmedchem.7b00935 background discussed in https://www.linkedin.com/pulse/first-draft-medicinal-chemistry-admet-encyclopedia-ed-griffen/
MedChemica Active Learning - Combining MMPA and MLAl Dossetter
Describes MedChemica research on combining Matched Molecular Pair Analysis (MMPA) and Machine Learning (ML) into a closed loop to find and optimize new hits for drug discovery. The talks describes the MMPA and Regression Forest models and how they were combined and some early conclusion. Of these permutative MMPA is the clear winner (Free Wilson ++)
MedChemica - Automated Extraction of Actionable Knowledge from Large Scale in...Al Dossetter
The talk describes the importance of stereochemistry in drug discovery research. A picture of a life sciences industry that is faced with a need to reduce R&D costs is disclosed. The science and results of a consortium of multiple pharmaceutical companies extracting medicinal chemistry knowledge from research data and the application to real drug design projects is presented as a method of reducing costs in research. A new technique for automating pharmacophore / toxophore finding from public data is also disclosed.
How we Built a Large Scale Matched Pair Analysis Engine (MCPairs) using OpenE...Al Dossetter
MCPairs performed Matched Molecular Pair Analysis on large scale to build databases of exploitable knowledge which is accessible for Drug Discovery to accelerate research projects. The talk describes how we did this and some of the challenges.
MedChemica BigData What Is That All About?Al Dossetter
A light look at the world of BigData for the lay person - a look at a couple of examples and what we do in MedChemica to speed up drug discovery. First presented at Macclesfield SciBar, and then Knutsford SciBar.
Practical Drug Discovery using Explainable Artificial IntelligenceAl Dossetter
How to build AI systems to enable the drug hunting medicinal chemist in their day-to-day work. Levels are AI are described and the meaning and context Explainable AI to medicinal chemists. Six medicinal chemist projects are described, as well as Matched Molecular Pair Analysis (MMPA), Machine Learning and Permutative MMPA. In each case how a system can be built to drill back to chemical sub-structures so effective decisions can be made.
Accelerating multiple medicinal chemistry projects using Artificial Intellige...Al Dossetter
The technical methods and results of Matched Molecular Pair Analysis (MMPA) applied from a small, individual assay scale through large pharma scale, to multiple pharma data sharing scale have been published and reviewed. The drive behind these efforts has been to derive a medicinal chemistry knowledge base (i.e. definitive textbook) that can be applied to drug discovery projects. The aim is to greatly decrease the time in lead identification and optimization by the synthesis of fewer compounds. Such a system suggests compound designs to expert chemists to triage; such a process is Artificial Intelligence (AI). Given this context, how does this work on projects? How do the chemists make decisions? What are the results? The talk will answer these questions through project examples where MMPA has been applied and how this led to drug candidates. The projects disclosed are from multiple organisations and describe Cathepsin K inhibitors, Glucokinase Inhibitors, 11β-Hydroxysteroid Dehydrogenase Type I Inhibitors (11β-HSD1), Ghrelin inverse antagonists and Tubulin Polymerization inhibitors. An overview of MMPA will be presented and each project will be briefly described with a focus on how the chemists used MMPA to understand SAR and design compounds. The impact of project progress to CD will be quantified.
Lecture given by Ed Griffen UKQSAR meeting Sept 2017. Covers material from work in our paper http://pubs.acs.org/doi/10.1021/acs.jmedchem.7b00935 background discussed in https://www.linkedin.com/pulse/first-draft-medicinal-chemistry-admet-encyclopedia-ed-griffen/
Emerging Challenges for Artificial Intelligence in Medicinal ChemistryEd Griffen
Presentation by Dr Ed Griffen of MedChemica Ltd, at The IBSA Conference "How Artificial Intelligence Can Change the Pharmaceutical Landscape“ - LUGANO, October 9th 2019.
How predictive models help Medicinal Chemists design better drugs_webinarAnn-Marie Roche
All scientific disciplines, including medicinal chemistry, are experiencing a revolution in unprecedented rates of data being generated and the subsequent analysis and exploitation of this data is increasingly fundamental to innovation. Using data to design better compounds is a challenge for Medicinal and Computational chemists.
The design of small-molecule drug candidates, encompassing characteristics such as potency, selectivity and ADMET (absorption, distribution, metabolism, excretion and toxicity) is a key factor in the success of clinical trials and computer-aided drug discovery/design methods have played a major role in the development of therapeutically important small molecules for over three decades. These methods are broadly classified as either structure-based or ligand-based.
In this webinar our expert Dr. Olivier Barberan will discuss ligand-based methods and he will cover the following:
How to use only ligand information to predict activity depending on its similarity/dissimilarity to previously known active ligands.
- Discuss ligand-based pharmacophores, molecular descriptors, and quantitative structure-activity relationships and important tools such as target/ligand databases necessary for successful implementation of various computer-aided drug discovery/design methods in a drug discovery campaign.
Webinar: New RMC - Your lead_optimization Solution June082017Ann-Marie Roche
The drug discovery landscape is rapidly changing and drives the need to generate leads with lower attrition rates.
In this webinar, our expert Dr. Olivier Barberan discussed how NEW Reaxys Medicinal chemistry in NEW Reaxys allows better discovery and exploration of structure activity relationship and also supports a more efficient property-based drug design approach. He covered the following:
• How has RMC being transformed into a more accessible tool for all users, allowing complex searches and workflows to be easily carried out.
• A demonstration of how more than ever RMC is the only lead-optimization solution you will need.
Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)Hellmuth Broda
While we bemoan the ever increasing data tsunami new technologies allow to harvest the gold nuggets in the hay stack.
Using the example of the Pharmaceutical Industry some of the possible business uses for Big Data Analitics are outlined.
We describe what we have learned from four years of collaborating across the industry/not-for-profit boundary. Over this time we pursued multiple projects and some of the lessons learned are described here.
Many large pharma companies have reduced their research activity at the very early, hit- and lead- seeking phase of research. To compensate, organisations are becoming more porous and working more collaboratively in risk-sharing arrangements. Both parties need to give up some control, but gain a great deal in return
Emerging Challenges for Artificial Intelligence in Medicinal ChemistryEd Griffen
Presentation by Dr Ed Griffen of MedChemica Ltd, at The IBSA Conference "How Artificial Intelligence Can Change the Pharmaceutical Landscape“ - LUGANO, October 9th 2019.
How predictive models help Medicinal Chemists design better drugs_webinarAnn-Marie Roche
All scientific disciplines, including medicinal chemistry, are experiencing a revolution in unprecedented rates of data being generated and the subsequent analysis and exploitation of this data is increasingly fundamental to innovation. Using data to design better compounds is a challenge for Medicinal and Computational chemists.
The design of small-molecule drug candidates, encompassing characteristics such as potency, selectivity and ADMET (absorption, distribution, metabolism, excretion and toxicity) is a key factor in the success of clinical trials and computer-aided drug discovery/design methods have played a major role in the development of therapeutically important small molecules for over three decades. These methods are broadly classified as either structure-based or ligand-based.
In this webinar our expert Dr. Olivier Barberan will discuss ligand-based methods and he will cover the following:
How to use only ligand information to predict activity depending on its similarity/dissimilarity to previously known active ligands.
- Discuss ligand-based pharmacophores, molecular descriptors, and quantitative structure-activity relationships and important tools such as target/ligand databases necessary for successful implementation of various computer-aided drug discovery/design methods in a drug discovery campaign.
Webinar: New RMC - Your lead_optimization Solution June082017Ann-Marie Roche
The drug discovery landscape is rapidly changing and drives the need to generate leads with lower attrition rates.
In this webinar, our expert Dr. Olivier Barberan discussed how NEW Reaxys Medicinal chemistry in NEW Reaxys allows better discovery and exploration of structure activity relationship and also supports a more efficient property-based drug design approach. He covered the following:
• How has RMC being transformed into a more accessible tool for all users, allowing complex searches and workflows to be easily carried out.
• A demonstration of how more than ever RMC is the only lead-optimization solution you will need.
Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)Hellmuth Broda
While we bemoan the ever increasing data tsunami new technologies allow to harvest the gold nuggets in the hay stack.
Using the example of the Pharmaceutical Industry some of the possible business uses for Big Data Analitics are outlined.
We describe what we have learned from four years of collaborating across the industry/not-for-profit boundary. Over this time we pursued multiple projects and some of the lessons learned are described here.
Many large pharma companies have reduced their research activity at the very early, hit- and lead- seeking phase of research. To compensate, organisations are becoming more porous and working more collaboratively in risk-sharing arrangements. Both parties need to give up some control, but gain a great deal in return
5 Signs Your Critical Assets Are Costing You More Than You ThinkStephen Dobson
There are the obvious costs to owning assets in a business, but there are also the costs that you don’t see. Take a look to see how critical assets are costing you more than you think
In fashion, just like many other industries, branding is everything. But, in an increasingly online world, this presents a problem. Success is determined by the laws of conversion, not by offering serendipity. Sadly, this is why online fashion stores are starting to look the same.
Do you refuse to be next in line for the cookie cutter? We support that! At Fabrique we take branding seriously; branding that creates
a difference.
Here are some examples of our work. You will see that brands and conversion actually make a pretty good couple... when they are paired carefully.
Brought to you by Electio Invest Group - an international technology metal commodity brokerage.
Electio Invest offers individuals the opportunity to invest in tech metals as physical commodities. The investment offers high returns in a regulated structure and clients have the ability to buy and sell baskets of metals through Electio's bespoke online portal. Metals are stored in a secure facility in Jebel Ali, Dubai under the custody of a regulated custodian bank.
More information can be found at www.electio-invest.com
Recruit, Retain, Realize - How Third Party Transactional Data Can Power Your ...Doug Oldfield
This presentation provides an overview of many types of commercial B2C data on the market as well as how and when to use each type. We also cover how special and unique transactional data is and how this type of data can be incredibly powerful in predicting response.
My presentation at the DMU-hosted seminar [held on 8 December 2011] on THE ASSAULT ON UNIVERSITIES: Privatisation, Secrecy and the Future of Higher Education.
A look at how organizations can use social listening and analysis as input into their social media and business strategies. Shares case studies, use cases, and processes.
Sharing and standards christopher hart - clinical innovation and partnering...Christopher Hart
Acknowledging the increasing need for cooperation and collaboration in data sharing and access. Describing the complexity that this can bring. Then describing some of the ways to simplify that.
Originally presented at Terrapin's Clinical innovation and partnering world March 8-9 2017.
http://www.terrapinn.com/conference/innovation-and-partnering/index.stm
Insights from Building the Future of Drug Discovery with Apache Spark with Lu...Databricks
Human genetics holds the key to understanding pathogenesis of many devastating diseases like type 2 diabetes and Alzheimer’s disease. The discovery, development, and commercialization of new classes of drugs can take 10-15 years and greater than $5 billion in R&D investment only to see less than 5% of the drugs make it to market. Committed to creating therapeutic innovations, Regeneron has built one of the world’s most comprehensive genetics databases to supplement our state-of-the-art drug development pipeline. While these massive volumes of data provide an unprecedented opportunity to gain novel therapeutic insights, Regeneron has encountered a number of challenges on the road to delivering on the promises of big data and genomics in drug discovery. For example, how do you enable fast and accurate query from >80B data points? And how do you expedite novel statistical tests on TB-scale data?
This presentation will share Regeneron’s vision for building a scalable and performant informatics infrastructure to accelerate genetics-driven drug development. Specifically, we highlight key challenges in establishing the world’s largest clinical genetics databases, provide an overview of how Regeneron leverages Databricks’ Unified Analytics Platform and Apache Spark, and discuss in detail key engineering innovations that have already come out of this collaborative effort.
Maximize Your Understanding of Operational Realities in Manufacturing with Pr...Bigfinite
Maximize Your Understanding of Operational Realities in Manufacturing with Predictive Insights using Big Data, Artificial Intelligence, and Pharma 4.0
by Toni Manzano, PhD, Co-founder and CSO, Bigfinite
PDA Annual Meeting 2020
Microsoft: A Waking Giant in Healthcare Analytics and Big DataDale Sanders
Ten years ago, critics didn’t believe that Microsoft could scale in the second generation of relational data warehouses, but they did. More recently, many of these same pundits have criticized Microsoft for missing the technology wave du jour in cloud offerings, mobile technology, and big data. But, once again, Microsoft has been quietly reengineering its culture and products, and as a result, they now offer the best value and most visionary platform for cloud services, big data, and analytics in healthcare.
The iCSS CompTox Dashboard is a publicly accessible dashboard provided by the National Center for Computation Toxicology at the US-EPA. It serves a number of purposes, including providing a chemistry database underpinning many of our public-facing projects (e.g. ToxCast and ExpoCast). The available data and searches provide a valuable path to structure identification using mass spectrometry as the source data. With an underlying database of over 720,000 chemicals, the dashboard has already been used to assist in identifying chemicals present in house dust. However, it can also be applied to many other purposes, e.g., the identification of agrochemicals in waste streams. This presentation will provide a review of the EPA’s platform and underlying algorithms used for the purpose of compound identification using high-resolution mass spectrometry data. In order to examine its performance for structure identification, especially in terms of rank-ordering database hits, we have compared it with the ChemSpider database, a well-regarded public database that has become one of the community standards for structure identification. The study has shown that the CompTox Dashboard outperforms ChemSpider in terms of structure identification and ranking providing improved outcomes for mass spectrometry analysis of “known unknowns”.
Microsoft: A Waking Giant In Healthcare Analytics and Big DataHealth Catalyst
In 2005, Northwestern Memorial Healthcare embarked upon a strategic Enterprise Data Warehousing (EDW) initiative with the Microsoft technology platform as the foundation. Dale Sanders was CIO at Northwestern and led the development of Northwestern’s Microsoft-based EDW. At that time, Microsoft as an EDW platform was not en vogue and there were many who doubted the success of the Northwestern project. While other organizations were spending millions of dollars and years developing EDW’s and analytics on other platforms, Northwestern achieved great and rapid value at a fraction of the cost of the more typical technology platforms. Now, there are more healthcare data warehouses built around Microsoft products than any other vendor. The risky bet on Microsoft in 2005 paid off.
Ten years ago, critics didn’t believe that Microsoft could scale in the second generation of relational data warehouses, but they did. More recently, many of these same pundits have criticized Microsoft for missing the technology wave du jour in cloud offerings, mobile technology, and big data. But, once again, Microsoft has been quietly reengineering its culture and products, and as a result, they now offer the best value and most visionary platform for cloud services, big data, and analytics in healthcare.
In this context, Dale will talk about:
His up and down journey with Microsoft as an Air Force and healthcare CIO, and why he is now more bullish on Microsoft like never before
A quick review of the Healthcare Analytics Adoption Model and Closed Loop Analytics in healthcare, and how Microsoft products relate to both
The rise of highly specialized, cloud-based analytic services and their value to healthcare organizations’ analytics strategies
Microsoft’s transformation from a closed-system, desktop PC company to an open-system consumer and business infrastructure company
The current transition period of enterprise data warehouses between the decline of relational databases and the rise of non-relational databases, and the new Microsoft products, notably Azure and the Analytic Platform System (APS), that bridge the transition of skills and technology while still integrating with core products like Office, Active Directory, and System Center
Microsoft’s strategy with its PowerX product line, and geospatial analysis and machine learning visualization tools
Curlew Research Brussels 2014 Electronic Data & Knowledge ManagementNick Lynch
Life Science externalisation and collaboration overview and the challenges that Life Science companies face in delivering successful data sharing with their partners in either Open Innovation or pre-competitive workflows
Presented at Artificial Intelligence and Machine Learning for Advanced Drug Discovery & Development 2019 on 28th May 2019 by Dr Ed Griffen of MedChemica Ltd
Database design in the context of Clinical Data Management (CDM) is a crucial aspect of organizing and managing clinical trial data effectively and efficiently. A well-designed database ensures that data collected during a clinical trial is accurate, consistent, and accessible, facilitating data analysis, reporting, and regulatory submissions. Clinical Data Management involves various steps, including data collection, validation, cleaning, and reporting
DataFAIRy bioassays pilot -- lessons learned and future outlookIsabella Feierberg
We describe a precompetitive collaboration that makes public life science data FAIR and annotated with detailed, high quality metadata, at a shared cost. A data model based on public ontologies was defined to address the participants' business questions. This slide deck was presented at the Cambridge Cheminformatics meeting on June 2, 2021.
Seminar of U.V. Spectroscopy by SAMIR PANDASAMIR PANDA
Spectroscopy is a branch of science dealing the study of interaction of electromagnetic radiation with matter.
Ultraviolet-visible spectroscopy refers to absorption spectroscopy or reflect spectroscopy in the UV-VIS spectral region.
Ultraviolet-visible spectroscopy is an analytical method that can measure the amount of light received by the analyte.
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...University of Maribor
Slides from:
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Track: Artificial Intelligence
https://www.etran.rs/2024/en/home-english/
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...Studia Poinsotiana
I Introduction
II Subalternation and Theology
III Theology and Dogmatic Declarations
IV The Mixed Principles of Theology
V Virtual Revelation: The Unity of Theology
VI Theology as a Natural Science
VII Theology’s Certitude
VIII Conclusion
Notes
Bibliography
All the contents are fully attributable to the author, Doctor Victor Salas. Should you wish to get this text republished, get in touch with the author or the editorial committee of the Studia Poinsotiana. Insofar as possible, we will be happy to broker your contact.
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Ana Luísa Pinho
Functional Magnetic Resonance Imaging (fMRI) provides means to characterize brain activations in response to behavior. However, cognitive neuroscience has been limited to group-level effects referring to the performance of specific tasks. To obtain the functional profile of elementary cognitive mechanisms, the combination of brain responses to many tasks is required. Yet, to date, both structural atlases and parcellation-based activations do not fully account for cognitive function and still present several limitations. Further, they do not adapt overall to individual characteristics. In this talk, I will give an account of deep-behavioral phenotyping strategies, namely data-driven methods in large task-fMRI datasets, to optimize functional brain-data collection and improve inference of effects-of-interest related to mental processes. Key to this approach is the employment of fast multi-functional paradigms rich on features that can be well parametrized and, consequently, facilitate the creation of psycho-physiological constructs to be modelled with imaging data. Particular emphasis will be given to music stimuli when studying high-order cognitive mechanisms, due to their ecological nature and quality to enable complex behavior compounded by discrete entities. I will also discuss how deep-behavioral phenotyping and individualized models applied to neuroimaging data can better account for the subject-specific organization of domain-general cognitive systems in the human brain. Finally, the accumulation of functional brain signatures brings the possibility to clarify relationships among tasks and create a univocal link between brain systems and mental functions through: (1) the development of ontologies proposing an organization of cognitive processes; and (2) brain-network taxonomies describing functional specialization. To this end, tools to improve commensurability in cognitive science are necessary, such as public repositories, ontology-based platforms and automated meta-analysis tools. I will thus discuss some brain-atlasing resources currently under development, and their applicability in cognitive as well as clinical neuroscience.
What is greenhouse gasses and how many gasses are there to affect the Earth.moosaasad1975
What are greenhouse gasses how they affect the earth and its environment what is the future of the environment and earth how the weather and the climate effects.
The ability to recreate computational results with minimal effort and actionable metrics provides a solid foundation for scientific research and software development. When people can replicate an analysis at the touch of a button using open-source software, open data, and methods to assess and compare proposals, it significantly eases verification of results, engagement with a diverse range of contributors, and progress. However, we have yet to fully achieve this; there are still many sociotechnical frictions.
Inspired by David Donoho's vision, this talk aims to revisit the three crucial pillars of frictionless reproducibility (data sharing, code sharing, and competitive challenges) with the perspective of deep software variability.
Our observation is that multiple layers — hardware, operating systems, third-party libraries, software versions, input data, compile-time options, and parameters — are subject to variability that exacerbates frictions but is also essential for achieving robust, generalizable results and fostering innovation. I will first review the literature, providing evidence of how the complex variability interactions across these layers affect qualitative and quantitative software properties, thereby complicating the reproduction and replication of scientific studies in various fields.
I will then present some software engineering and AI techniques that can support the strategic exploration of variability spaces. These include the use of abstractions and models (e.g., feature models), sampling strategies (e.g., uniform, random), cost-effective measurements (e.g., incremental build of software configurations), and dimensionality reduction methods (e.g., transfer learning, feature selection, software debloating).
I will finally argue that deep variability is both the problem and solution of frictionless reproducibility, calling the software science community to develop new methods and tools to manage variability and foster reproducibility in software systems.
Exposé invité Journées Nationales du GDR GPL 2024
Toxic effects of heavy metals : Lead and Arsenicsanjana502982
Heavy metals are naturally occuring metallic chemical elements that have relatively high density, and are toxic at even low concentrations. All toxic metals are termed as heavy metals irrespective of their atomic mass and density, eg. arsenic, lead, mercury, cadmium, thallium, chromium, etc.
MedChemica - Extracting and exploiting medicinal chemistry ADMET knowledge automatically from public and large pharma data
1. MedChemica | 2016
ACS Philadelphia 2016
Extracting and exploiting
medicinal chemistry ADMET
knowledge automatically from
public and large pharma data
MedChemica
ACS August 2016
2. MedChemica | 2016
ACS Philadelphia 2016
Better compounds designed from Data
• Improved compounds quicker
• Applicable ideas
• Confident design decisions
• Help when stuck
• Clearly describable plans
• Maximizing value from ADMET testing
• Pursuing dead-end series
• Pursuing dead-end projects
• Running out of time or $
Essentials
Gains
Pains
3. MedChemica | 2016
ACS Philadelphia 2016
Key findings:
• Secure sharing of large scale ADMET knowledge
between large Pharma is possible
• The collaboration generated great synergy
• Many findings are highly significant
• MMP is a great tool for idea generation
• The rules have been used in drug-discovery projects
and generated meaningful results
• MMPA methodology can be extended to extract
pharmacophores
4. MedChemica | 2016
ACS Philadelphia 2016
Grand Rule database
Better medicinal chemistry by sharing knowledge not data & structures
5. MedChemica | 2016
ACS Philadelphia 2016
Barriers Broken to Sharing Knowledge
Data
Integrity and
curation
Knowledge
extraction
algorithms
Consortium
building to
share
knowledge Into the minds of
chemists
6. MedChemica | 2016
ACS Philadelphia 2016
Data Integrity and Curation
Structural
• Extensive standards for
inclusion of mixtures,
chiral compounds, salt
forms
• Tautomer and charge
state canonicalisation
client side
• Automated validation of
structures run client side
= “clean” comparable
structures submitted to
pair finding
Measured Data
• Assay protocols reviewed
prior to merging
• Precise documentation
on unit definitions and
data reporting standards
• Option to share standard
compound measured
values
• Automated extensive
data validation checks
prior to merging data
“client
side”
=
behind
Pharma
firewall
7. MedChemica | 2016
ACS Philadelphia 2016
IP security
• Contributing company structures are NOT shared
• Contributing company identifiers are NOT shared
• Each contributing company receives it’s OWN
custom GRD copy to enable drill back to OWN
example data
• Minimum of n=6 example pairs required for
inclusion in the GRD
• Enforced limits of shared substructure sizes
• Option for “time-blocking” sharing of data to
withhold data < 6 months old for IP submission
8. MedChemica | 2016
ACS Philadelphia 2016
Barriers Broken to Sharing Knowledge
Data
Integrity and
curation
Knowledge
extraction
algorithms
Consortium
building to
share
knowledge Into the minds of
chemists
✓
9. MedChemica | 2016
ACS Philadelphia 2016
MCPairs Platform
• Extract rules using Advanced Matched Molecular Pair Analysis
• Knowledge is captured as transformations
– divorced from structures => sharable
Measured
Data
rule
finder Exploitable
Knowledge
MC Expert
Enumerator
System
Problem molecule
Solution molecules
Pharmacophores
& toxophores
SMARTS
matching
Alerts
Virtual
screening
Library
design
Protect
the
IP
jewels
10. MedChemica | 2016
ACS Philadelphia 2016
• Matched Molecular Pairs – Molecules
that differ only by a particular, well-
defined structural transformation
• Transformation with environment capture –
MMPs can be recorded as transformations
from A B
• Environment is essential to understand
chemistry
Statistical analysis
• Learn what effect the transformation has had on ADMET properties in
the past
Griffen,
E.
et
al.
Matched
Molecular
Pairs
as
a
Medicinal
Chemistry
Tool.
Journal
of
Medicinal
Chemistry.
2011,
54(22),
pp.7739-‐7750.
Advanced MMPA
Δ Data
A-B1
2
2
3
3
3
4
4
4
12
23
3
34
4
4
A
B
11. MedChemica | 2016
ACS Philadelphia 2016
Environment really matters
HMe:
• Median Δlog(Solubility)
• 225 different
environments
2.5log
1.5log
HMe:
• Median Δlog(Clint)
Human microsomal
clearance
• 278 different
environments
12. MedChemica | 2016
ACS Philadelphia 2016
More environment = right detail
HMe Solubility:
• 225 different environments
13. MedChemica | 2016
ACS Philadelphia 2016
HF What effect on Clearance?
• Median Δlog(Clint) Human microsomal clearance
• 37 different environments
2
fold
improvement
2
fold
worse
Increase
clearance
decrease
clearance
14. MedChemica | 2016
ACS Philadelphia 2016
Identify and group matching SMIRKS
Calculate statistical parameters for each unique
SMIRKS (n, median, sd, se, n_up/n_down)
Is n ≥ 6?
Not enough data:
ignore transformation
Is the |median| ≤ 0.05 and the
intercentile range (10-90%) ≤ 0.3?
Perform two-tailed binomial test on the
transformation to determine the
significance of the up/ down frequency
transformation is
classified as ‘neutral’
Transformation classified as
‘NED’ (No Effect Determined)
Transformation classified as
‘increase’ or ‘decrease’
depending on which direction the
property is changing
pass fail
yes no
yes no
Rule selection
0 +ve-ve
Median data difference
Neutral
Increase
Decrease
NED
15. MedChemica | 2016
ACS Philadelphia 2016
Barriers Broken to Sharing Knowledge
Data
Integrity and
curation
Knowledge
extraction
algorithms
Consortium
building to
share
knowledge Into the minds of
chemists
✓
✓
16. MedChemica | 2016
ACS Philadelphia 2016
Merging knowledge
• Use the transforms that
are robust in both
companies to calibrate
assays.
• Once the assays are
calibrated against each
other the transform
data can be combined
to build support in
poorly exemplified
transforms
• Methodology
precedented in other
fields
CalibrateRobust
Robust
Weak
Weak
Discover
Novel
Pharma
1
Pharma
2
17. MedChemica | 2016
ACS Philadelphia 2016
Merging Datasets
• Datasets are standardized by comparison of
transformations shared by contributing companies
• Transformations are examined at the “pair example”
level
• Minimum of 6 transformations, each with a minimum of 6
pairs (42 compounds bare minimum) required to
standardise
• “calibration factors” extracted to standardize the
datasets to a common value – mean of calibration
factors 0.94, typical range 0.8-1.2.
• Datasets with too few common transformations have
standard compound measurements shared for
calibration.
“Blinded”
source
of
transformaLons
18. Current Knowledge sets – GRDv3
Numbers of statistically valid transforms
Grouped Datasets Number of Rules
logD7.4 153449
Merged solubility 46655
In vitro microsomal clearance:
Human, rat ,mouse, cyno, dog
88423
In vitro hepatocyte clearance :
Human, rat ,mouse, cyno, dog 26627
MCDK permeability A-B / B – A efflux 1852
Cytochrome P450 inhibition:
2C9, 2D6 , 3A4 , 2C19 , 1A2 40605
Cardiac ion channels
NaV 1.5 , hERG ion channel inhibition
15636
Glutathione Stability 116
plasma protein or albumin binding
Human, rat ,mouse, cyno, dog
64622
19. MedChemica | 2016
ACS Philadelphia 2016
Pharma 1 100k rules
Pharma 2 92k rules
Pharma 3 37k rules
5.8k rules in common (pre-merge) ~ 2%
New Rules 88k
~26% of total
Merge
Combining
data
yields
brand
new
rules
Gains:
300
-‐
900%
Merging knowledge – GRDv1
20. MedChemica | 2016
ACS Philadelphia 2016
Key findings:
• Secure sharing of large scale ADMET knowledge
between large Pharma is possible
• The collaboration generated great synergy
• Many findings are highly significant.
• MMP is a great tool for idea generation.
• The rules have been used in drug-discovery projects
and generated meaningful results
• MMPA methodology can be extended to extract
pharmacophores
21. MedChemica | 2016
ACS Philadelphia 2016
Barriers Broken to Sharing Knowledge
Data
Integrity and
curation
Knowledge
extraction
algorithms
Consortium
building to
share
knowledge Into the minds of
chemists
✓
✓
✓
22. MedChemica | 2016
ACS Philadelphia 2016
Integrating knowledge exploitation
..many possible connections
..high Stability and flexibility**
Rule
Database
MCExpert
API server
** api.medchemica.com has
delivered 18 months of uptime and
drives SaltTraX (Elixir)
Approx 50 chemists use the tool
RESTful
API
Chemistry
Shape
and
electrostaHcs
26. MedChemica | 2016
ACS Philadelphia 2016
A Less Simple Example
Increase logD and gain solubility
Property
Number
of
Observa@ons
Direc@on
Mean
Change
Probability
logD
8
Increase
1.2
100%
Log(Solubility)
14
Increase
1.4
92%
What
is
the
effect
on
lipophilicity
and
solubility?
Roche
data
is
inconclusive!
(2
pairs
for
logD,
1
pair
for
solubility)
logD
=
2.65
KineLc
solubility
=
84
µg/ml
IC50
SST5
=
0.8
µM
logD
=
3.63
KineLc
solubility
=
>452
µg/ml
IC50
SST5
=
0.19
µM
Ques@on:
Available
Sta@s@cs:
Roche
Example:
27. MedChemica | 2016
ACS Philadelphia 2016
The application helped lead
optimization in project
27
• 193
compounds
• Enumerated
Objective: improve metabolic stability
MMP
Enumeration
Calculated Property
Docking
8 compounds
synthesized
28. MedChemica | 2016
ACS Philadelphia 2016
Solving a tBu metabolism issue
Benchmark
compound
Predicted
to
offer
most
improvement
in
microsomal
stability
(in
at
least
1
species
/
assay)
R2
R1
tBu
Me
Et
iPr
99
392
16
64
78
410
53
550
99
288
78
515
41
35
98
327
92
372
24
247
35
128
24
62
60
395
39
445
3
21
20
27
57
89
54
89
• Data shown are Clint for HLM and MLM (top and bottom, respectively)
R1
R2
R1
tBu
Roger Butlin
Rebecca Newton
Allan Jordan
29. MedChemica | 2016
ACS Philadelphia 2016
Early successes
From GRDv1 May 2014
29
J.
Med.
Chem.,
2015,
58
(23),
pp
9309–9333
DOI:
10.1021/acs.jmedchem.5b01312
30. MedChemica | 2016
ACS Philadelphia 2016
Comparison of Merck in-house MMPA with SALTMinerTM
Structure:
ADMET Issue: hERG
Lead A2A receptor antagonist
compound in Merck Parkinson's
project
138 suggestion molecules with
predicted improvement in hERG
binding
How many match the results of
Merck?
• Also shows potent binding
to the hERG ion channel
• Deng et al performed in-
house MMPA on hERG
binding compound data
and have published 18
resulting fluorobenzene
transformations, which they
have synthesized and
tested for hERG activity
Deng
et
al,
Bioorg.
&
Med
Chem
Let
(2015),
doi:
hhp://dx.doi.org/10.1016/
j.bmcl.2015.05.036
31. MedChemica | 2016
ACS Philadelphia 2016
R
group:
Measured
hERG
pIC50
change
-‐1.187
-‐1.149
-‐1.038
-‐1.215
-‐1.157
-‐0.149
-‐1.487
-‐1.133
GRD
median
historic
pIC50
change
0
-‐0.171
-‐0.1
-‐0.283
-‐0.219
-‐0.318
-‐0.159
-‐0.103
Results:
8 out of the 18 fluorobenzene transformations produced by Merck were also suggested
by MCExpert to decrease hERG binding:
Searching the GRD for transformations that increase hERG there were none that
matched the remaining 10 of 18 transformations in the paper.
MCExpert also suggested an additional 50 fluorobenzene replacements to decrease
hERG binding NOT mentioned in the publication.
32. MedChemica | 2016
ACS Philadelphia 2016
Key findings:
• Secure sharing of large scale ADMET knowledge
between large Pharma is possible
• The collaboration generated great synergy
• Many findings are highly significant.
• MMP is a great tool for idea generation.
• The rules have been used in drug-discovery projects
and generated meaningful results
• MMPA methodology can be extended to extract
pharmacophores
33. MedChemica | 2016
ACS Philadelphia 2016
Barriers Broken to Sharing Knowledge
Data
Integrity and
curation
Knowledge
extraction
algorithms
Consortium
building to
share
knowledge Into the minds of
chemists
✓
✓
✓
✓
34. MedChemica | 2016
ACS Philadelphia 2016
Pharmacophores and Toxophores
by extended analysis from the MMPA
PharmacophoresBigData Stats
Matched
Pairs
Finding
Public and in-
house potency
data
35. Mining transform sets to find influential fragments
Identify the ‘Z’ fragments associated with a
significant number of potency increasing changes –
irrespective of what they are replaced with
‘Z’ is ‘worse than anything you replace it with’
Fragment A Fragment B
Change in binding
measurement
Public
Data
Find
Matched
Pairs
Find Potent
Fragments
+2.7
+3.2
+0.6
+0.6
Identify the ‘A’ fragments associated with a
significant number of potency decreasing changes
– irrespective of what they are replaced with
‘A’ is ‘better than anything you replace it with’
A
+2.1
+2.2
+1.4
+0.4
+1.8
Z
pKi/
pIC50
Compounds with
destructive fragment
Compounds with
constructive fragments
Generate
Pharmacophore
dyads
by
permutaLng
all
the
fragments
with
the
shortest
path
between
them
36. MedChemica | 2016
ACS Philadelphia 2016
Toxophores - Detailed, specific & transparent
Dopamine D2 receptor human
Actual: 9.5
Predicted: 9.1
Mean with: 8.0
Mean without: 6.6
Odds Ratio: 340
Dopamine Transporter
Actual: 9.1
Predicted: 8.6
Mean with: 8.3
Mean without: 6.7
Odds Ratio: 407
GABA-A
Actual: 9.0
Predicted: 8.7
Mean with: 8.0
Mean without: 6.8
Odds Ratio: 1506
β1 adrenergic receptor
Actual: 7.8
Predicted: 7.7
Mean with: 6.5
Mean without: 5.7
Odds Ratio: 1501
Find Potent
Fragments
Matched
Pairs
Finding
Find
Pharmacophore
Dyads
Public and in-
house potency
data
41. MedChemica | 2016
ACS Philadelphia 2016
Matched Molecular Pair
data A data B
data C data D
data E data F
Chemical Transformations
Δ data A B
Δ data C D
Δ data E F
Chemical Transformations
Δ data A B
Δ data C D
Δ data E F
Δ data G H
Δ data I J
Δ data K L
Matched Molecular Pair Analysis (MMPA) enables SAR sharing
Without sharing underlying structures and data
Grand
Rule
Database
Enumeration
Rate-My-Idea
GRD-Browser
ChEMBL Tox database Toxophores
MC-
Biophore
42. MedChemica | 2016
ACS Philadelphia 2016
Take home messages
• Systematic “Big Data” analysis of ADMET data yields a system
that SUGGESTS solution molecules for medicinal chemists
• Better decision making / Faster project progression
• Knowledge Sharing IS possible between pharma companies
– This yields HIGH VALUE databases and VAST new knowledge
• Pharmacophores and toxophores can be extracted from MMPA
44. MedChemica | 2016
ACS Philadelphia 2016
A Collaboration of the willing
Craig Bruce OE
John Cumming Roche
David Cosgrove C4XD
Andy Grant★
Martin Harrison Elixir
Huw Jones Base360
Al Rabow Consulting
David Riley AZ
Graeme Robb AZ
Attilla Ting AZ
Howard Tucker retired
Dan Warner Myjar
Steve St-Galley Syngenta
David Wood JDR
Lauren Reid MedChemica
Shane Monague MedChemica
Jessica Stacey MedChemica
Andy Barker Consulting
Pat Barton AZ
Andy Davis AZ
Andrew Griffin Elixir
Phil Jewsbury AZ
Mike Snowden AZ
Peter Sjo AZ
Martin Packer AZ
Manos Perros Entasis Therapeutics
Nick Tomkinson AZ
Martin Stahl Roche
Jerome Hert Roche
Martin Blapp Roche
Torsten Schindler Roche
Paula Petrone Roche
Christian Kramer Roche
Jeff Blaney Genentech
Hao Zheng Genentech
Slaton Lipscomb Genentech
Alberto Gobbi Genentech