This webinar is being recorded
Natural Language
Processing and Machine
Learning: Beyond the Hype
A Pistoia Alliance Debates Webinar
Moderated by David Milward – Linguamatics
September 14, 2017
Poll Question 1: What role do you play in
your company?
A. IT
B. Data scientist/bioinformatician
C. Clinical/bench scientist
D. Information professional
E. Other
©PistoiaAlliance
The Panel
David Milward, Ph.D., CTO Linguamatics
David Milward is chief technology officer (CTO) at Linguamatics. He is a pioneer of interactive
text mining, and a founder of Linguamatics. He has over 20 years' experience in product
development, consultancy and research in natural language processing (NLP). After receiving a
PhD from the University of Cambridge, he was a researcher and lecturer at the University of
Edinburgh. He has published in the areas of information extraction, spoken dialogue, parsing,
syntax and semantics.
Chengyi Zheng, Ph.D., NLP Specialist, Kaiser Permanente
Chengyi Zheng, PhD, is an NLP specialist at Kaiser Permanente Southern California. He has
worked on over 30 research projects using electronic health record (EHR) data from millions
of patients. He is the principal investigator of a CDC-funded study involving 5 health care
institutions on using NLP in vaccine safety studies. He was the winner of the Kaiser
Permanente predictive modeling competition. He took first place in the innovation
competition (InnoCentive@Lilly) while serving as a biomedical informatics scientist at Eli Lilly.
He was trained in computer science with a concentration in speech recognition. He will share
some experiences of using NLP and machine learning on EHR data for outcomes prediction.
Eugene Myshkin, Ph.D., Senior Research Scientist, Clarivate
Eugene Myshkin, PhD, is a senior scientist in bioinformatics at Clarivate Analytics. He
has over 15 years' experience in drug discovery, cheminformatics and bioinformatics. He
has also been involved in a number of text mining projects, including mining of chemical
reagents and antibodies from scientific literature.
Agenda
• AI, NLP and ML (David)
• Using NLP and ML in clinical research (Chengyi)
• Network and pathway driven machine learning
approaches to biomarker discovery and patient
stratification (Eugene)
NLP, AI and Machine Learning
David Milward, PhD
CTO, Linguamatics
2017
Overview
AI (Artificial Intelligence)
NLP (Natural Language Processing)
− and its applications in life sciences
ML (Machine Learning) and DL (Deep Learning)
NLP to feed ML-based DS (Decision Support)
ML in NLP
[Diagram relating AI, ML, DL, NLP and DS]
© 2017 Linguamatics
Artificial Intelligence (AI)
Artificial intelligence is intelligence
exhibited by machines
The central goals of AI research include
reasoning, knowledge, planning,
learning, natural language processing
(communication), perception and the
ability to move and manipulate objects
As machines become increasingly
capable, tasks considered as requiring
"intelligence" are often removed from
the definition, leading to the quip
“AI is whatever hasn't been done yet”
Wikipedia
Natural Language Processing (NLP)
Processing of natural languages e.g.
English, French, Chinese by computers
NLP is part of AI, but also key to other
areas of AI e.g. providing decision
support
− If 80% of knowledge is unstructured
we need NLP to get the right information
to provide good suggestions
− Currently many AI projects are limited: they
can only address questions where there is
structured data
− Worse, they often use inappropriate
structured data such as ICD billing codes for
non-billing tasks
Find information however it is expressed
Different word, same meaning:
− cyclosporine
− ciclosporin
− Neoral
− Sandimmune
Different expression, same meaning:
− Non-smoker
− Does not smoke
− Does not drink or smoke
− Denies tobacco use
Different grammar, same meaning:
− 5mg/kg of cyclosporine daily
− 5mg/kg/d of cyclosporine
− cyclosporine 5mg/kg/day
Same word, different context:
− Diagnosed with diabetes
− Family history of diabetes
− No family history of diabetes
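The "same word, different context" case can be illustrated with a few lines of Python. This is only a minimal sketch using hand-written regular expressions: the cue patterns and the `classify_mention` name are assumptions made for illustration, and it does not describe how Linguamatics I2E or any production NLP system handles context.

```python
import re

# Hypothetical cue patterns, checked in order: the most specific comes first.
CONTEXT_PATTERNS = [
    ("negated_family_history", re.compile(r"\bno family history of\b", re.I)),
    ("family_history", re.compile(r"\bfamily history of\b", re.I)),
    ("negated", re.compile(r"\b(no|denies|does not have)\b", re.I)),
]

def classify_mention(sentence: str, concept: str) -> str:
    """Return the context in which `concept` is mentioned in `sentence`."""
    if concept.lower() not in sentence.lower():
        return "not_mentioned"
    for label, pattern in CONTEXT_PATTERNS:
        if pattern.search(sentence):
            return label
    return "affirmed"

for text in ["Diagnosed with diabetes",
             "Family history of diabetes",
             "No family history of diabetes"]:
    print(text, "->", classify_mention(text, "diabetes"))
```

The same word "diabetes" ends up with three different labels, which is the distinction a downstream model needs in order to avoid counting a family history as a diagnosis.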
Represent it in a standardized form
Concept       Text                      Normalized Value
Diseases      breast cancer             Breast Neoplasm
              carcinoma of the breast
Genes         Raf-1                     RAF1
              Raf I
Dates         27th Feb 2014             20140227
              2014/02/27
Measurements  0.2g                      200 mg
              Two hundred milligrams
Mutations     Val 158 Met               V158M
              Val by Met at codon 158
Example relationship: nimesulide inhibits COX2 (Entrez Gene ID: 5743), from the text “nimesulide, a selective COX2 inhibitor, …”
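Two rows of the normalization table (dates and measurements) can be sketched in Python. This is an illustrative sketch, not the method used by any particular product: the function names `normalize_date` and `normalize_mass` are hypothetical, only the numeric surface forms are handled, and spelled-out values such as "Two hundred milligrams" would need an additional number-word parser.

```python
import re
from datetime import datetime

def normalize_date(text: str) -> str:
    """Map a few surface date formats to YYYYMMDD."""
    # Drop ordinal suffixes such as "27th" -> "27".
    cleaned = re.sub(r"(\d+)(st|nd|rd|th)\b", r"\1", text.strip())
    for fmt in ("%d %b %Y", "%Y/%m/%d"):
        try:
            return datetime.strptime(cleaned, fmt).strftime("%Y%m%d")
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")

# Conversion factors to milligrams (only the two units used in the example).
MG_PER_UNIT = {"g": 1000.0, "mg": 1.0}

def normalize_mass(text: str) -> str:
    """Map a numeric mass such as "0.2g" to a value in milligrams."""
    match = re.fullmatch(r"\s*([\d.]+)\s*(mg|g)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized measurement: {text!r}")
    value = float(match.group(1)) * MG_PER_UNIT[match.group(2)]
    return f"{value:g} mg"

print(normalize_date("27th Feb 2014"), normalize_date("2014/02/27"))
print(normalize_mass("0.2g"))
```

Both date spellings map to the same string, so a query over the normalized value finds either form.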
From Bench to Bedside: NLP Provides Insight
[Figure: pipeline from Idea through Basic research, Clinical trials (Phase 1, Phase 2, Phase 3) and Regulatory approval to Patient care, grouped into Discovery, Development and Delivery]
Business critical questions
− What targets are involved in bone cancer?
− What companies are patenting a particular technology?
− What are the safety risks of my drug?
− Where can I site my Phase 1, Phase 3 clinical study?
− What are the clinical risks for my patients?
Direct access to the Unstructured
− Weight ≥ 80kg
− Below 60 years old
− Reports after 2010
− With mutation C677T
− Cancer patients
Machine Learning
Machine Learning is used for AI in general and as a
technique within NLP
3 main flavours:
− Supervised: uses annotated data mapping between inputs and outputs
− Semi-supervised: uses machine analysis but incorporates a human in the loop
− Unsupervised: uses unannotated data, usually at very large scale
Recent successes with deep
learning approaches based on
neural networks for supervised and
unsupervised ML e.g.
− Machine translation using parallel
corpora
− Image classification in medicine
Using NLP to feed other AI
NLP provides access to the 80% of
information in unstructured text
Provides a set of potential features to be
used in e.g. ML models for Decision
Support
Example: building risk models from RWD
sets
− Predicting patients at risk of misusing opioid
prescription drugs (AMIA November 2017)
− Features extracted by Linguamatics I2E from
8.9 million de-identified medical record full-text
transcripts from RealHealthData
− SVM classifier trained on the features to flag
patients at risk
Machine Learning in NLP
Supervised ML
− Requires large-scale, representative annotated documents
− Main paradigm for core NLP components
− For extraction patterns, used in academic systems but less commonly
in commercial
Semi-supervised ML
− Useful for new tasks or data sets where no representative
annotated data exists
− Useful where a task is initially ill-defined
− Puts a human in the loop judging suggestions from the machine
learning
− Can provide good quality results quickly e.g. to test whether a feature
extracted by NLP is useful for an ML model
Unsupervised ML
− Uses large-scale unannotated data
− Key example is learning the meaning of a word via the context it
keeps (word embeddings)
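The idea of learning a word's meaning from its contexts can be sketched without any training at all. The example below uses raw co-occurrence counts and cosine similarity as a stand-in for learned word embeddings; the toy corpus, the window size and the function names are assumptions for illustration only.

```python
import math
from collections import Counter, defaultdict

def context_vectors(sentences, window=2):
    """Count, for each word, the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Hypothetical three-sentence corpus of unannotated text.
corpus = [
    "patient was given cyclosporine daily",
    "patient was given ciclosporin daily",
    "patient reported knee pain today",
]
vecs = context_vectors(corpus)
print(cosine(vecs["cyclosporine"], vecs["ciclosporin"]))  # shared contexts
print(cosine(vecs["cyclosporine"], vecs["pain"]))         # no shared contexts
```

Words that appear in the same contexts, such as the two spellings of the drug name, get near-identical vectors even though nobody annotated them as synonyms.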
Semi-Supervised ML Approaches
Similar distributions for words and syntactic constructions
Automatically discover what is in the data using an interactive,
agile text mining platform such as Linguamatics I2E
A long tail of infrequent cases
− prioritize the more frequent constructions
− generalize to cover items in the tail
Zipf’s Law: the frequency of
any word is inversely
proportional to its rank in the
frequency table
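Zipf's law and the "long tail" can be checked on any token list with a rank-frequency table. The sketch below is illustrative: the token counts are made up, and `zipf_expected` assumes the idealized form of the law with exponent 1.

```python
from collections import Counter

def rank_frequency(tokens):
    """Return (rank, word, count) tuples, most frequent word first."""
    counts = Counter(tokens).most_common()
    return [(rank, word, count)
            for rank, (word, count) in enumerate(counts, start=1)]

def zipf_expected(top_count, rank):
    """Count predicted for a given rank by an idealized Zipf distribution."""
    return top_count / rank

def coverage(table, top_k):
    """Fraction of all tokens covered by the top_k most frequent words."""
    total = sum(count for _, _, count in table)
    return sum(count for _, _, count in table[:top_k]) / total

# Hypothetical counts of verbs used to report a symptom.
tokens = ["denies"] * 6 + ["reports"] * 3 + ["endorses"] * 2 + ["disavows"]
table = rank_frequency(tokens)
for rank, word, count in table:
    print(rank, word, count, zipf_expected(table[0][2], rank))
print(coverage(table, 2))
```

The coverage figure shows why prioritizing the most frequent constructions pays off first, while the remaining items form the tail that has to be covered by generalization.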
Semi-Supervised NLP using
Linguamatics I2E
Summary
NLP is critical to success of many ML projects
− access to the unstructured text is key to using ML
widely, not just where there is convenient structured
data
Semi-supervised approaches to NLP provide an
efficient way to capture features for ML projects
Poll Question 2: What is your company’s
primary use for NLP?
A. Early Discovery/ Pre-clinical
B. Clinical
C. Real world data
D. Other
E. Don’t use NLP
Using NLP and ML in clinical research
Chengyi Zheng, PhD, MS
DEPARTMENT of Research & Evaluation
Two weeks of records for one patient in an EMR system (10/6/2012 to 10/19/2012)
10/6/2012: Imaging: DEXA Bone density
10/7/2012: Pt called
10/7/2012: Nurse Called Back
10/8/2012: Orthopedic office visit (Where: Medical Center, Department)
10/8/2012: Progress Notes: Reason for visit: Knee Pain; Vital Sign/BMI/Pain level/History; PE/Findings/Impression/A&P; Dx: icd-9 code; Nurse Exam Note: …
10/9/2012: Lab
10/10/2012: Pre-op dental exam (ext)
10/10/2012: Surgery Scheduled
10/11/2012: office visit
10/11/2012: Rx Prescribed
10/11/2012: Office Visit: Sinus Congestion, Ankle itchy; Dx: 401.9 Essential Hypertension, 274.9 Gout, 461.9 Acute Sinusitis
10/12/2012: Picked up the Rx
10/13/2012: Pt missed appt.
10/13/2012: Telephone Consult: Healthy bones PN
10/14/2012: Pt emailed: Drug adverse event
10/14/2012: Pt called: cancerous area
10/15/2012: EKG; Dx: Screening
10/15/2012: Ear Wax Wash
10/16/2012: Procedure: Remove Skin
10/16/2012 - 10/19/2012: Hospitalization
10/18/2012: Pathology Report Out
Membership length: 70% > 5 years, 50% > 10 years.
5 Ws: What, Who, When, Where and Why
• What
– What is the reason for the visit?
– What happened? (pain after a fall, pain after drinking a beer?)
• Who
– Who is the caregiver? (primary physician, rheumatologist?)
– What do we know about this patient? (age, race, past medical history, etc.)
• Where
– Where did this visit occur?
• When
– When did the problem start?
• Why
– Why did this problem happen? Possible causes?
Visual representation of KPSC research databases
Case study: Identify acute gout flare
• Published methods to identify gout flares using claims data
– Clinical coding is unreliable: under-coding, over-coding, too general
– Medication is unreliable:
  • Drugs for gout maintenance
  • Drugs also used for other diseases (which share similar symptoms)
• NLP has been used to:
– Identify study population and patient information
– Identify and extract clinical variables (genetic, biopsy, radiology)
– Evaluate patient status (disease progression, medication status)
Solution and challenges (NLP)
Challenges:
– Gout is a chronic disease which can be controlled but not cured
  • Signs and symptoms can appear in follow-up visits
  • Need to differentiate between acute and chronic status
– The gout population is generally older, with comorbidities that share similar symptoms
  • 100+ types of arthritis (> 50 million people)
  • Joint pain, erythema, and swelling
– Documented information varies by clinical note
Standard solutions:
– Each search query captures one set of information
– Each search query has its own sensitivity/specificity etc.
– Logical operators combine search results (union, join, etc.)
  • Difficult to optimize the overall sensitivity/specificity etc.
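The difficulty with the standard solution can be seen by combining two query results with set operations and scoring each combination. The note IDs, the gold-standard set and the two query results below are invented for illustration, and "join" is interpreted here as set intersection, which is an assumption.

```python
def sensitivity_specificity(predicted, positives, population):
    """Sensitivity and specificity of a predicted set against gold positives."""
    negatives = population - positives
    true_pos = len(predicted & positives)
    true_neg = len(negatives - predicted)
    return true_pos / len(positives), true_neg / len(negatives)

# Hypothetical toy data: note IDs 1-10, of which 1-4 are true gout flares.
population = set(range(1, 11))
flares = {1, 2, 3, 4}
query_a = {1, 2, 5}        # e.g. notes matching a symptom query
query_b = {2, 3, 4, 5, 6}  # e.g. notes matching a medication query

for name, result in [("A", query_a), ("B", query_b),
                     ("A OR B", query_a | query_b),
                     ("A AND B", query_a & query_b)]:
    sens, spec = sensitivity_specificity(result, flares, population)
    print(f"{name:8s} sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy case the union raises sensitivity at the cost of specificity and the intersection does the reverse, so no single logical combination optimizes both at once.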
Mining vs. NLP & ML in clinical research
Steps:
1. Preliminary analysis, estimate feasibility
2. Develop plan, estimate cost
3. Seek permit (government vs. IRB)
4. Mine (mining equipment vs. NLP)
– Focus on completeness (high sensitivity)
– Shallow & deep mining (good specificity)
5. Refine (chemical process vs. ML)
– Improve purity (higher specificity)
6. Manual verification (optional)
7. Deliver to customer
“art and science combined” “resource-heavy and time-consuming process”
Solution and challenges (NLP+ML)
Goal:
• NLP focuses on sensitivity or information completeness
– Separate ores from rock
• ML focuses on improving the specificity
– Improve purity without much loss of sensitivity
Solution:
• NLP results as input features to the ML system
– Identify related signs and symptoms
– Identify temporal relationship (when and how long?)
– Identify disease association (related to any other disease?)
– Identify implicit and explicit mention of gout flare
– Identify treatment plan associated with disease onset
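The hand-off from NLP to ML can be sketched as turning NLP-extracted concepts into a binary feature vector and training a classifier on it. The slides do not say which model was used, so a simple perceptron stands in for it here; the feature names and the four labelled notes are hypothetical.

```python
# Hypothetical NLP-derived features for one clinical note.
FEATURES = ["joint_pain", "swelling", "onset_within_days",
            "explicit_flare_mention", "other_arthritis_mentioned"]

def to_vector(nlp_output):
    """Turn a set of NLP-extracted concept names into a binary feature vector."""
    return [1 if name in nlp_output else 0 for name in FEATURES]

def predict(weights, bias, x):
    """Return 1 (flare) if the weighted sum is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train_perceptron(samples, labels, epochs=20):
    """Classic perceptron update rule; labels are 1 (flare) or 0 (no flare)."""
    weights, bias = [0.0] * len(FEATURES), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            step = y - predict(weights, bias, x)  # 0 when already correct
            weights = [w + step * xi for w, xi in zip(weights, x)]
            bias += step
    return weights, bias

# Toy training notes: (concepts found by NLP, manual label).
notes = [
    ({"joint_pain", "swelling", "onset_within_days", "explicit_flare_mention"}, 1),
    ({"joint_pain", "onset_within_days", "explicit_flare_mention"}, 1),
    ({"joint_pain", "swelling", "other_arthritis_mentioned"}, 0),
    ({"joint_pain"}, 0),
]
samples = [to_vector(concepts) for concepts, _ in notes]
labels = [label for _, label in notes]
weights, bias = train_perceptron(samples, labels)
print(dict(zip(FEATURES, weights)), bias)
```

In this sketch the symptom features alone do not separate the classes; the learned weights lean on the temporal and association features, which mirrors the division of labour described above: NLP supplies complete candidates and the model improves purity.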
Overview of the system development steps
Study period: 1/1/2007 to 12/31/2010.
Patients > 18 years, with a diagnosis of gout and on urate-lowering therapy.
Within [-3,+12] months of index date, 599,317 clinical notes for 16,519 patients.
Overview of the NLP+ML system
Performance comparisons (%)

Clinical note level gout flare identification
                  Sensitivity  Specificity  PPV   NPV
Rheumatologist 1  81.1         95.4         88.3  92.2
Rheumatologist 2  90.9         97.3         93    96.5
NLP+ML            84.8         92.2         81.1  93.9

Identify patients with ≥ 1 gout flares
                  Sensitivity  Specificity  PPV   NPV
Rheumatologist 1  98.5         92.9         97.1  96.3
Rheumatologist 2  97.1         92.9         97.1  92.9
NLP+ML            98.5         96.4         98.5  96.4
ICD-9             88.2         89.3         95.2  75.8

Identify patients with ≥ 3 gout flares (refractory gout)
                  Sensitivity  Specificity  PPV   NPV
Rheumatologist 1  74.2         92.3         82.1  88.2
Rheumatologist 2  83.9         95.4         89.7  92.5
NLP+ML            93.5         84.6         74.4  96.5
ICD-9             41.9         95.4         81.3  77.5
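For reference, the four measures reported in these comparisons are all derived from a 2x2 confusion matrix. The sketch below uses made-up counts, not the study's data, and the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true flares that were found
        "specificity": tn / (tn + fp),  # non-flares correctly rejected
        "ppv": tp / (tp + fp),          # predicted flares that are real
        "npv": tn / (tn + fn),          # predicted non-flares that are real
    }

# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=90, fp=10, tn=180, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```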
Results
• Note level (gout flare, n = 599,317):
– NLP: 49,415 positive cases => ML: 18,869 positive cases
• Patient level (with ≥ 3 flares, n = 16,519):
– Number of patients: 1,402 (NLP+ML) vs. 516 (Claim)
– Sensitivity: 93.5% (NLP+ML) vs. 41.9% (Claim)
• Impact:
– Identify refractory disease patients
– Estimate market size (KPSC / US population = 4.5/325 million = 1.4%)
– Better disease management, improve quality of life, and help reduce healthcare resource use.
  • 1,402 patients are more manageable than 16,519 patients
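The ratios behind these results can be recomputed directly from the figures on the slide. The variable names are illustrative; no numbers beyond those quoted above are used.

```python
# Figures quoted on the results slide.
nlp_positive_notes = 49_415
ml_positive_notes = 18_869
refractory_nlp_ml = 1_402
refractory_claims = 516
gout_patients = 16_519
kpsc_members_millions = 4.5
us_population_millions = 325

# Share of NLP-positive notes that the ML step kept as positive.
kept_fraction = ml_positive_notes / nlp_positive_notes
# Share of the study population flagged as refractory by NLP+ML.
refractory_fraction = refractory_nlp_ml / gout_patients
# How many more refractory patients NLP+ML found than claims data.
gain_over_claims = refractory_nlp_ml / refractory_claims
# KPSC membership as a share of the US population.
kpsc_share = kpsc_members_millions / us_population_millions

print(f"ML kept {kept_fraction:.1%} of NLP-positive notes")
print(f"Refractory patients: {refractory_fraction:.1%} of the cohort")
print(f"NLP+ML vs. claims: {gain_over_claims:.1f}x")
print(f"KPSC share of US population: {kpsc_share:.1%}")
```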
ML in healthcare
• Tremendous opportunities
– Prediction: high utilizers, risk scores
– Identification: cases, outcomes, social needs
– Image recognition: pathology and radiology images
• Challenges (Data)
– Data quality: dirty, missing data
– Heterogeneous data: different systems
– Structured, semi-structured and free text data
– Image, scanned documents
– Genetic and biobank data
• Challenges (People)
– Who understands NLP, ML and healthcare
– Who understands the complexity of healthcare data
Poll Question 3: How does your company
primarily use machine learning in drug
discovery?
A. Target prediction and repositioning
B. Biomarker discovery
C. Patient stratification
D. Other
E. We don’t use machine learning
Network and pathway
driven machine learning
approaches to biomarker
discovery and patient
stratification
Eugene Myshkin, PhD
September 2017
CLARIVATE ANALYTICS TEXT MINING
• Clarivate Analytics literature data feed
• Comprehensive coverage
– >20,000 journals
– Journal content mirrors: Current Contents; Web of Science; Biosis; International Pharmaceutical Abstracts;
Derwent Drug File
– http://ip-science.thomsonreuters.com/cgi-bin/jrnlst/jlresults.cgi?PC=MASTER
• Latest information
– Updated with over 170,921 articles/month, or 2,051,051+ articles/year
• Full text, cover to cover searching of all journals
• Comprehensive synonym collections
• Controlled vocabulary management software to support mining
CLARIVATE ANALYTICS LIFE SCIENCES SOLUTIONS
• Pharmacovigilance Literature Monitoring
• Biological and Chemical Reagent Monitoring
• Concepts in social media
• Automated Curation of Clinical Data
• Protein and Gene Variant Monitoring
USING NLP FOR MANUSCRIPT MATCHING
Analyze citation connections to place the publication in the right journal
FOCUS OF DRUG DISCOVERY: DRUG, TARGET, DISEASE
These associations can be obtained with NLP, but precision is a problem: a flood of false positives and the necessity to hire a bunch of people just to sort the true from the false alerts.
PITFALLS OF NLP FEATURES FOR ML
• 1-10 million features
• Feature vectors are binary and sparse
• Feature redundancy
• Feature selection takes a long time
METABASE MANUALLY ANNOTATED CONTENT
PUBLICATIONS (209 for EGF-EGFR interaction)
• Manual annotation from publications
• Team of PhDs, MDs
• Advanced editorial systems
• Controlled vocabularies
• Multiple levels of QC
• Invested more than 400 man-years
MOLECULAR INTERACTION NETWORK: ~ 1,500,000 molecular interactions
PATHWAY: ~ 3,000 pathways
INTEGRATED APPROACH
Pathway knowledge
Pathway-driven
approaches
Statistical
approaches
1. Target identification or repositioning
2. Biomarker discovery
3. Patient stratification
WHY DIFFERENT PATIENT RESPONSE?
Blockbuster strategy: “One drug for all patients”
• Drug toxic but beneficial
• Drug toxic but NOT beneficial
• Drug NOT toxic and beneficial
• Drug NOT toxic and NOT beneficial
New strategy is needed
Patient stratification: “The most efficient and safe drug for a cohort of patients”
HOW CAN PATIENTS BE STRATIFIED?
Mechanism 1 Mechanism 2
Biomarkers Biomarkers
Biomarker – measurable molecular indicator of:
disease subtype/progress
drug efficacy
side effect/toxicity
• Identify subtypes resulting in multiple
drug targets rather than one.
• A shift from the presumption of a disease
to multiple diseases would reframe the
drug development strategy
ORION BIONETWORKS
Orion Bionetworks (Cohen Veteran Biosciences) is an alliance of world leading
organizations in patient care, computational modelling, translational research and
patient advocacy that aims to develop open-source computational models for
multiple sclerosis and improve upon existing analytical tools for model
development.
~186 subjects with gene expression data and clinical parameters such as time to relapse
GOALS:
• Understand the structure of the population based on the molecular data: identify cohorts of patients whose clinical course differs over time
• Build stratification models
• Identify new therapeutic targets
NETWORK/PATHWAY BASED METHODS FOR BIOMARKER DISCOVERY
1. PATHWAY IDENTIFICATION
• 56 pathways identified
• 136 genes
• 39/136 genes were present in multiple pathways
• 44/136 genes are known MS biomarkers or drug targets (p = 5x10^-6)
• Individual expression values of each member gene were averaged into a combined z-score
• Activity score association with time to relapse in a Cox proportional hazard model was calculated
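The pathway activity score described above (member-gene expression averaged into a combined z-score) can be sketched as follows. The gene names and values are made up, and standardizing each gene across patients with the population standard deviation is an assumption; the slides do not give these details. The Cox proportional hazard step is not shown.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize one gene's expression values across patients."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def pathway_activity(expression, member_genes):
    """Average the member genes' z-scores, giving one score per patient.

    expression maps a gene name to a list of per-patient values.
    """
    per_gene = [zscores(expression[gene]) for gene in member_genes]
    return [mean(per_patient) for per_patient in zip(*per_gene)]

# Hypothetical expression values for three patients.
expression = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [2.0, 4.0, 6.0],
    "GENE_C": [5.0, 5.0, 9.0],
}
scores = pathway_activity(expression, ["GENE_A", "GENE_B"])
print(scores)
```

Averaging over member genes is what makes the pathway score less sensitive to noise in any single gene, which is consistent with the robustness claim made later in the deck. A gene with constant expression would need special handling, since its standard deviation is zero.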
2. PATIENTS CLUSTERING BY PATHWAYS
Clusters are significantly
associated with time to
relapse in the presence of
important clinical
covariates
Patients were clustered into groups based on k-means clustering of their pathway
activity profiles; k=3 resulted in the best separation of patient profiles.
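The clustering step can be illustrated with a small k-means implementation (Lloyd's algorithm). The two-dimensional "pathway activity profiles" below are invented, and using the first k points as initial centroids is a simplification chosen to keep the example deterministic; real analyses typically use repeated random initializations.

```python
import math

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with the first k points as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Hypothetical activity scores for two pathways in six patients.
profiles = [(-1.0, -1.2), (0.1, 0.0), (1.1, 0.9),
            (-0.9, -1.0), (0.0, 0.2), (1.0, 1.2)]
clusters, centers = kmeans(profiles, k=3)
print(clusters)
```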
3. CLASSIFICATION MODEL
• A K-Nearest Neighbor (KNN) model was previously generated to predict risk groups 1-3 using all biomarkers
• Feature selection was performed by taking the variable importance calculated from the trained KNN model
• Forward feature selection was then conducted using 10-fold CV, adding features to the model in order of their importance
• Once this process was complete, the predictive performance was evaluated in terms of the ability of the model to separate the three risk groups
• The final feature set was applied to test data
Signature was reduced from 56 to 13 pathways, containing 65 genes
GENE ONLY MODEL WAS NOT ROBUST TO TEST DATA
[Figure panels: PATHWAY BASED APPROACH, GENE BASED APPROACH]
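Forward feature selection with a KNN classifier and 10-fold cross-validation can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' pipeline: the importance ranking is taken as a given input (the slides do not say how it was computed), a feature is kept only if it improves cross-validated accuracy, and the data set is synthetic.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, x, k=3):
    """Majority label among the k training points nearest to x."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def cv_accuracy(rows, labels, features, folds=10, k=3):
    """Cross-validated accuracy using only the given feature columns."""
    data = [[row[f] for f in features] for row in rows]
    correct = 0
    for fold in range(folds):
        test_idx = [i for i in range(len(data)) if i % folds == fold]
        train_idx = [i for i in range(len(data)) if i % folds != fold]
        train_x = [data[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        correct += sum(knn_predict(train_x, train_y, data[i], k) == labels[i]
                       for i in test_idx)
    return correct / len(data)

def forward_select(rows, labels, ranked_features, folds=10, k=3):
    """Add features in ranked order, keeping each only if accuracy improves."""
    selected, best = [], 0.0
    for feature in ranked_features:
        score = cv_accuracy(rows, labels, selected + [feature], folds, k)
        if score > best:
            selected, best = selected + [feature], score
    return selected, best

# Synthetic data: column 0 tracks the risk group, column 1 is noise.
rows, labels = [], []
for i in range(30):
    group = i % 3
    rows.append([group + 0.01 * (i // 3), (i * 7) % 5])
    labels.append(group)

selected, accuracy = forward_select(rows, labels, ranked_features=[0, 1])
print(selected, accuracy)
```

On this synthetic data the noise column is rejected because it cannot improve on the accuracy already reached with the informative column, which is the mechanism by which a signature shrinks during selection.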
CONCLUSIONS
• Signature differentiating between patient cohorts was reduced from 56 to 13 pathways
• This new signature contains 65 genes
• 13 biomarkers could stratify subjects into risk groups with statistically significant differences in time to relapse
• This was validated in test subjects, with results consistent with what was observed in the training cohort
• Pathway activities were more robust than gene expression
Poll Question 4: What is the greatest
barrier to application of NLP/ML at your
company?
A. Technical expertise
B. Access to data
C. Data quality
D. Management support/understanding
E. Other
Poll Question 5: Do you expect an
increase in ML within Life Science in the
next 2 years?
A. Yes
B. No
C. Don’t Know
Audience Q&A
Please use the Question function in GoToWebinar
Where will AI/Deep learning
have an impact in Life Science
& Health?
The next Pistoia Alliance Debates Webinar:
Moderator: Nick Lynch with Sean Ekins CEO, Collaborations
Pharmaceuticals Inc, David Pearah, CEO HDF group, and Peter Henstock,
Pfizer Research
Date: September 27, 2017
check http://www.pistoiaalliance.org/pistoia-alliance-debates-webinar-series/ for the latest information
info@pistoiaalliance.org @pistoiaalliance www.pistoiaalliance.org

More Related Content

What's hot

What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...Simplilearn
 
Neural Networks with Focus on Language Modeling
Neural Networks with Focus on Language ModelingNeural Networks with Focus on Language Modeling
Neural Networks with Focus on Language ModelingAdel Rahimi
 
Natural language processing
Natural language processingNatural language processing
Natural language processingYogendra Tamang
 
Notes from Coursera Deep Learning courses by Andrew Ng
Notes from Coursera Deep Learning courses by Andrew NgNotes from Coursera Deep Learning courses by Andrew Ng
Notes from Coursera Deep Learning courses by Andrew NgdataHacker. rs
 
Deep learning - A Visual Introduction
Deep learning - A Visual IntroductionDeep learning - A Visual Introduction
Deep learning - A Visual IntroductionLukas Masuch
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingIla Group
 
Natural Language Processing (NLP) - Introduction
Natural Language Processing (NLP) - IntroductionNatural Language Processing (NLP) - Introduction
Natural Language Processing (NLP) - IntroductionAritra Mukherjee
 
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Simplilearn
 
Ross Chayka. Gartner Hype Cycle
Ross Chayka. Gartner Hype CycleRoss Chayka. Gartner Hype Cycle
Ross Chayka. Gartner Hype CycleLviv Startup Club
 
Artificial Intelligence: Natural Language Processing
Artificial Intelligence: Natural Language ProcessingArtificial Intelligence: Natural Language Processing
Artificial Intelligence: Natural Language ProcessingFrank Cunha
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processingsaurabhnarhe
 
Information retrieval 10 tf idf and bag of words
Information retrieval 10 tf idf and bag of wordsInformation retrieval 10 tf idf and bag of words
Information retrieval 10 tf idf and bag of wordsVaibhav Khanna
 
Graph Based Clustering
Graph Based ClusteringGraph Based Clustering
Graph Based ClusteringSSA KPI
 
Heuristic or informed search
Heuristic or informed searchHeuristic or informed search
Heuristic or informed searchHamzaJaved64
 
Computer Vision image classification
Computer Vision image classificationComputer Vision image classification
Computer Vision image classificationWael Badawy
 
Matrix chain multiplication 2
Matrix chain multiplication 2Matrix chain multiplication 2
Matrix chain multiplication 2Maher Alshammari
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaAlexey Grigorev
 

What's hot (20)

What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
 
Neural Networks with Focus on Language Modeling
Neural Networks with Focus on Language ModelingNeural Networks with Focus on Language Modeling
Neural Networks with Focus on Language Modeling
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
NLP
NLPNLP
NLP
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Notes from Coursera Deep Learning courses by Andrew Ng
Notes from Coursera Deep Learning courses by Andrew NgNotes from Coursera Deep Learning courses by Andrew Ng
Notes from Coursera Deep Learning courses by Andrew Ng
 
Deep learning - A Visual Introduction
Deep learning - A Visual IntroductionDeep learning - A Visual Introduction
Deep learning - A Visual Introduction
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Natural Language Processing (NLP) - Introduction
Natural Language Processing (NLP) - IntroductionNatural Language Processing (NLP) - Introduction
Natural Language Processing (NLP) - Introduction
 
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
 
Ross Chayka. Gartner Hype Cycle
Ross Chayka. Gartner Hype CycleRoss Chayka. Gartner Hype Cycle
Ross Chayka. Gartner Hype Cycle
 
Artificial Intelligence: Natural Language Processing
Artificial Intelligence: Natural Language ProcessingArtificial Intelligence: Natural Language Processing
Artificial Intelligence: Natural Language Processing
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Information retrieval 10 tf idf and bag of words
Information retrieval 10 tf idf and bag of wordsInformation retrieval 10 tf idf and bag of words
Information retrieval 10 tf idf and bag of words
 
Graph Based Clustering
Graph Based ClusteringGraph Based Clustering
Graph Based Clustering
 
Heuristic or informed search
Heuristic or informed searchHeuristic or informed search
Heuristic or informed search
 
Sementic nets
Sementic netsSementic nets
Sementic nets
 
Computer Vision image classification
Computer Vision image classificationComputer Vision image classification
Computer Vision image classification
 
Matrix chain multiplication 2
Matrix chain multiplication 2Matrix chain multiplication 2
Matrix chain multiplication 2
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga Petrova
 

Similar to NLP & ML Webinar

The Power of Natural Language Processing (NLP) | Enterprise Wired
The Power of Natural Language Processing (NLP) | Enterprise WiredThe Power of Natural Language Processing (NLP) | Enterprise Wired
The Power of Natural Language Processing (NLP) | Enterprise WiredEnterprise Wired
 
Demystifying Natural Language Processing: A Beginner’s Guide
Demystifying Natural Language Processing: A Beginner’s GuideDemystifying Natural Language Processing: A Beginner’s Guide
Demystifying Natural Language Processing: A Beginner’s Guidecyberprosocial
 
APznzaalselifJKjGQdTCA51cF7bldYdFMvDcshM8opKFZ_ZaIV-dqkiLoIKIfhz2tS6Fw5UBk25u...
APznzaalselifJKjGQdTCA51cF7bldYdFMvDcshM8opKFZ_ZaIV-dqkiLoIKIfhz2tS6Fw5UBk25u...APznzaalselifJKjGQdTCA51cF7bldYdFMvDcshM8opKFZ_ZaIV-dqkiLoIKIfhz2tS6Fw5UBk25u...
APznzaalselifJKjGQdTCA51cF7bldYdFMvDcshM8opKFZ_ZaIV-dqkiLoIKIfhz2tS6Fw5UBk25u...AishwaryaChemate
 
Lecture 7: Learning from Massive Datasets
Lecture 7: Learning from Massive DatasetsLecture 7: Learning from Massive Datasets
Lecture 7: Learning from Massive DatasetsMarina Santini
 
Jawaharlal Nehru Technological University Natural Language Processing Capston...
Jawaharlal Nehru Technological University Natural Language Processing Capston...Jawaharlal Nehru Technological University Natural Language Processing Capston...
Jawaharlal Nehru Technological University Natural Language Processing Capston...write4
 
Jawaharlal Nehru Technological University Natural Language Processing Capston...
Jawaharlal Nehru Technological University Natural Language Processing Capston...Jawaharlal Nehru Technological University Natural Language Processing Capston...
Jawaharlal Nehru Technological University Natural Language Processing Capston...write5
 
Conversational AI:An Overview of Techniques, Applications & Future Scope - Ph...
Conversational AI:An Overview of Techniques, Applications & Future Scope - Ph...Conversational AI:An Overview of Techniques, Applications & Future Scope - Ph...
NLP & ML Webinar

  • 1. This webinar is being recorded
  • 2. Natural Language Processing and Machine Learning: Beyond the Hype. A Pistoia Alliance Debates Webinar. Moderated by David Milward, Linguamatics. September 14, 2017
  • 3. This webinar is being recorded
  • 4. Poll Question 1: What role do you play in your company? A. IT B. Data scientist/bioinformatician C. Clinical/bench scientist D. Information professional E. Other
  • 5. The Panel
      David Milward, Ph.D., CTO, Linguamatics
      David Milward is chief technology officer (CTO) at Linguamatics. He is a pioneer of interactive text mining and a founder of Linguamatics. He has over 20 years' experience in product development, consultancy and research in natural language processing (NLP). After receiving a PhD from the University of Cambridge, he was a researcher and lecturer at the University of Edinburgh. He has published in the areas of information extraction, spoken dialogue, parsing, syntax and semantics.
      Chengyi Zheng, Ph.D., NLP Specialist, Kaiser Permanente
      Chengyi Zheng, PhD, is an NLP specialist at Kaiser Permanente Southern California. He has worked on over 30 research projects using electronic health record (EHR) data from millions of patients. He is the principal investigator of a CDC-funded study involving 5 health care institutions on the use of NLP in vaccine safety studies. He won the Kaiser Permanente predictive modeling competition. He ranked first in the innovation competition (InnoCentive@Lilly) while serving as a biomedical informatics scientist at Eli Lilly. He was trained in computer science with a concentration in speech recognition. He will share some experiences of using NLP and machine learning on EHR data for outcomes prediction.
      Eugene Myshkin, Ph.D., Senior Research Scientist, Clarivate
      Eugene Myshkin, PhD, is a senior scientist in bioinformatics at Clarivate Analytics. He has over 15 years' experience in drug discovery, cheminformatics and bioinformatics. He has also been involved in a number of text mining projects, including mining chemical reagents and antibodies from the scientific literature.
  • 6. Agenda
      − AI, NLP and ML (David)
      − Using NLP and ML in clinical research (Chengyi)
      − Network and pathway driven machine learning approaches to biomarker discovery and patient stratification (Eugene)
  • 7. NLP, AI and Machine Learning. David Milward, PhD, CTO, Linguamatics, 2017
  • 8. Overview
      AI (Artificial Intelligence)
      NLP (Natural Language Processing)
        − and its applications in life sciences
      ML (Machine Learning) and DL (Deep Learning)
      NLP to feed ML-based DS (Decision Support)
      ML in NLP
      [Diagram labels: AI, ML, DL, NLP, DS]
      © 2017 Linguamatics
  • 9. Overview (same list as slide 8; diagram highlights AI)
  • 10. Overview (same list as slide 8; diagram highlights AI, NLP)
  • 11. Overview (same list as slide 8; diagram highlights AI, ML, DL)
  • 12. Overview (same list as slide 8; diagram highlights AI, NLP, DS)
  • 13. Overview (same list as slide 8; diagram highlights AI, ML, NLP)
  • 14. Artificial Intelligence (AI)
      − Artificial intelligence is intelligence exhibited by machines
      − The central goals of AI research include reasoning, knowledge, planning, learning, natural language processing (communication), perception and the ability to move and manipulate objects
      − As machines become increasingly capable, tasks considered as requiring "intelligence" are often removed from the definition, leading to the quip "AI is whatever hasn't been done yet"
      Source: Wikipedia
  • 15. Natural Language Processing (NLP)
      Processing of natural languages (e.g. English, French, Chinese) by computers
      NLP is part of AI, but also key to other areas of AI, e.g. providing decision support
        − If 80% of knowledge is unstructured, we need NLP to get the right information to provide good suggestions
        − Currently many AI projects are limited: they can only address questions where there is structured data
        − Worse, they often use inappropriate structured data, such as ICD billing codes, for non-billing tasks
  • 16. Find information however it is expressed
      − Different word, same meaning: cyclosporine; ciclosporin; Neoral; Sandimmune
      − Different expression, same meaning: Non-smoker; Does not smoke; Does not drink or smoke; Denies tobacco use
      − Different grammar, same meaning: 5mg/kg of cyclosporine daily; 5mg/kg/d of cyclosporine; cyclosporine 5mg/kg/day
      − Same word, different context: Diagnosed with diabetes; Family history of diabetes; No family history of diabetes
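The two simplest cases on the slide above (different words with the same meaning, and the same word in different contexts) can be sketched with a lookup table and a few keyword rules. This is only an illustration: the function names, the synonym table and the rules are assumptions made here, not how Linguamatics I2E or any other product works, and real clinical text needs far more robust handling.

```python
import re

# Hypothetical synonym table: every surface form maps to one preferred name.
DRUG_SYNONYMS = {
    "cyclosporine": "cyclosporine",
    "ciclosporin": "cyclosporine",
    "neoral": "cyclosporine",
    "sandimmune": "cyclosporine",
}


def normalize_drug(mention):
    """Return the preferred name for a drug mention, or None if unknown."""
    return DRUG_SYNONYMS.get(mention.strip().lower())


def diabetes_context(sentence):
    """Classify a diabetes mention by its context using simple keyword rules."""
    text = sentence.lower()
    if "diabetes" not in text:
        return "not mentioned"
    if "family history" in text:
        # A "no" earlier in the same sentence negates the family history.
        if re.search(r"\bno\b[^.]*family history", text):
            return "no family history"
        return "family history"
    # Otherwise assume the mention refers to the patient.
    return "patient"
```

The point of the sketch is that the same keyword ("diabetes") leads to three different answers depending on its context, which a plain keyword search would not distinguish.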
  • 17. Represent it in a standardized form
      Concept      | Text                                      | Normalized Value
      Diseases     | breast cancer; carcinoma of the breast    | Breast Neoplasm
      Genes        | Raf-1; Raf I                              | RAF1
      Dates        | 27th Feb 2014; 2014/02/27                 | 20140227
      Measurements | 0.2g; Two hundred milligrams              | 200 mg
      Mutations    | Val 158 Met; Val by Met at codon 158      | V158M
      [Figure: the text "nimesulide, a selective COX2 inhibitor, …" annotated with the labels "inhibits" and "Entrez Gene ID: 5743"]
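A minimal sketch of the normalization step for three of the rows above (dates, measurements and mutations). The function names, the small amino-acid table and the regular expressions are assumptions for illustration; they cover only the simple variants shown on the slide and not forms such as "Two hundred milligrams" or "Val by Met at codon 158".

```python
import re
from datetime import datetime

# Small, illustrative subset of three-letter to one-letter amino acid codes.
AMINO_ACIDS = {"val": "V", "met": "M", "ala": "A", "gly": "G"}
UNIT_TO_MG = {"mg": 1.0, "g": 1000.0}


def normalize_date(text):
    """Map '27th Feb 2014' or '2014/02/27' to 'YYYYMMDD'; None if unparsed."""
    # Drop ordinal suffixes such as '27th' -> '27'.
    cleaned = re.sub(r"(\d+)(st|nd|rd|th)\b", r"\1", text.strip())
    for fmt in ("%d %b %Y", "%Y/%m/%d"):
        try:
            return datetime.strptime(cleaned, fmt).strftime("%Y%m%d")
        except ValueError:
            continue
    return None


def normalize_mass(text):
    """Map '0.2g' or '200 mg' to a string in milligrams; None if unparsed."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(mg|g)\s*", text.lower())
    if match is None:
        return None
    milligrams = float(match.group(1)) * UNIT_TO_MG[match.group(2)]
    return f"{milligrams:g} mg"


def normalize_mutation(text):
    """Map 'Val 158 Met' or 'V158M' to the short form; None if unparsed."""
    long_form = re.fullmatch(r"\s*([A-Za-z]{3})\s*(\d+)\s*([A-Za-z]{3})\s*", text)
    if long_form:
        ref = AMINO_ACIDS.get(long_form.group(1).lower())
        alt = AMINO_ACIDS.get(long_form.group(3).lower())
        if ref and alt:
            return f"{ref}{long_form.group(2)}{alt}"
        return None
    if re.fullmatch(r"\s*[A-Z]\d+[A-Z]\s*", text):
        return text.strip()
    return None
```

Mapping every variant to one value is what lets downstream queries and ML features treat "0.2g" and "200 mg" as the same measurement.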
  • 18. From Bench to Bedside: NLP Provides Insight
      [Figure: pipeline from Idea through Basic research, Clinical trials (Phase 1, Phase 2, Phase 3) and Regulatory approval to Patient care, grouped into Discovery, Development and Delivery]
      Business critical questions
        − What targets are involved in bone cancer?
        − What companies are patenting a particular technology?
        − What are the safety risks of my drug?
        − Where can I site my Phase 1, Phase 3 clinical study?
        − What are the clinical risks for my patients?
  • 19. Direct access to the Unstructured
      [Figure: example search criteria applied to unstructured text]
        − Weight ≥ 80kg
        − Below 60 years old
        − Reports after 2010
        − With mutation C677T
        − Cancer patients
  • 20. Machine Learning
      Machine Learning is used for AI in general and as a technique within NLP
      3 main flavours:
        − Supervised: uses annotated data mapping between inputs and outputs
        − Semi-supervised: uses machine analysis but incorporates a human in the loop
        − Unsupervised: uses unannotated data, usually at very large scale
      Recent successes with deep learning approaches based on neural networks for supervised and unsupervised ML, e.g.
        − Machine translation using parallel corpora
        − Image classification in medicine
  • 21. Using NLP to feed other AI
      NLP provides access to the 80% of information in unstructured text
      Provides a set of potential features to be used in e.g. ML models for Decision Support
      Example: building risk models from real-world data (RWD) sets
        − Predicting patients at risk of misusing opioid prescription drugs (AMIA November 2017)
        − Features extracted by Linguamatics I2E from 8.9 million de-identified medical record full-text transcripts from RealHealthData
        − SVM classifier trained on the features to flag patients at risk
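The last step on the slide above (training an SVM classifier on NLP-extracted features) can be sketched in plain Python as a linear SVM fitted by subgradient descent on the hinge loss. Everything here is an assumption for illustration: the three binary flags are placeholders with no clinical meaning, the toy data is invented, and the study mentioned on the slide may have used a different SVM formulation, kernel or library.

```python
def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Fit a linear SVM by subgradient descent on the regularized hinge loss.

    X is a list of feature vectors; y holds labels in {-1, +1}.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # L2 regularization shrinks the weights on every step.
            w = [wj * (1 - lr * lam) for wj in w]
            if margin < 1:
                # The point is inside the margin or misclassified: move
                # the separating hyperplane towards classifying it correctly.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Return +1 (flag) or -1 (do not flag) for one feature vector."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical toy data: each row is [flag_a, flag_b, flag_c], three binary
# features that an NLP step might have extracted from a record.
TOY_X = [[1, 1, 0], [1, 0, 1], [1, 1, 1], [0, 0, 1], [0, 1, 0], [0, 0, 0]]
TOY_Y = [1, 1, 1, -1, -1, -1]

weights, bias = train_linear_svm(TOY_X, TOY_Y)
predictions = [predict(weights, bias, row) for row in TOY_X]
```

In this toy set only the first flag separates the two classes, so the learned weight on it should dominate. With real data the features would come from the NLP extraction step and the model would be evaluated on held-out records.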
  • 22. Machine Learning in NLP
      – Supervised ML
          – Requires large-scale, representative annotated documents
          – Main paradigm for core NLP components
          – For extraction patterns, used in academic systems but less commonly in commercial systems
      – Semi-supervised ML
          – Useful for new tasks or data sets where there is no existing representative annotated data
          – Useful where a task is initially ill-defined
          – Puts a human in the loop judging suggestions from the machine learning
          – Can provide good quality results quickly, e.g. to test whether a feature extracted by NLP is useful for an ML model
      – Unsupervised ML
          – Uses large-scale unannotated data
          – Key example is learning the meaning of a word via the context it keeps (word embeddings)
  • 23. Semi-Supervised ML Approaches
      – Similar distributions for words and syntactic constructions
      – Zipf’s Law: the frequency of any word is inversely proportional to its rank in the frequency table
      – A long tail of infrequent cases: prioritize the more frequent constructions; generalize to cover items in the tail
      – Automatically discover what is in the data using an interactive, agile text mining platform such as Linguamatics I2E
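Zipf's Law as stated on slide 23 implies that rank multiplied by frequency stays roughly constant on a large corpus, and that a small number of frequent items covers most occurrences. The sketch below builds a rank-frequency table and measures that coverage; the tokenizer is a simplifying assumption, and the effect only becomes visible on realistic amounts of text.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, frequency, rank * frequency) rows, most frequent first."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, word, freq, rank * freq)
            for rank, (word, freq) in enumerate(ranked, start=1)]


def coverage(table, top_n):
    """Fraction of all word occurrences covered by the top_n most frequent words."""
    total = sum(freq for _, _, freq, _ in table)
    covered = sum(freq for _, _, freq, _ in table[:top_n])
    return covered / total
```

Calling `coverage(table, top_n)` for increasing `top_n` shows how quickly the frequent head is exhausted, which is the argument for prioritizing frequent constructions first and generalizing to the tail afterwards.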
  • 24. Semi-Supervised NLP using Linguamatics I2E
  • 25. Summary
      – NLP is critical to the success of many ML projects: access to the unstructured text is key to using ML widely, not just where there is convenient structured data
      – Semi-supervised approaches to NLP provide an efficient way to capture features for ML projects
      – Diagram labels: DL, AI, ML, NLP, DS
  • 26. Poll Question 2: What is your company’s primary use for NLP? A. Early Discovery/ Pre-clinical B. Clinical C. Real world data D. Other E. Don’t use NLP
  • 27. Using NLP and ML in clinical research Chengyi Zheng, PhD, MS DEPARTMENT of Research & Evaluation
  • 28. Two weeks of records for one patient in an EMR system (timeline, 10/6/2012 to 10/19/2012)
      – 10/6/2012 Imaging: DEXA Bone density
      – 10/7/2012 Pt called; 10/7/2012 Nurse Called Back
      – 10/8/2012 Orthopedic office visit. Where: Medical Center, Department
      – 10/8/2012 Progress Notes: Reason for visit: Knee Pain; Vital Sign/BMI/Pain level/History; PE/Findings/Impression/A&P; Dx: icd-9 code; Nurse Exam Note: …
      – 10/9/2012 Lab
      – 10/10/2012 Pre-op dental exam (ext); 10/10/2012 Surgery Scheduled
      – 10/11/2012 Office Visit: Sinus Congestion, Ankle itchy. Dx: 401.9 Essential Hypertension; 274.9 Gout; 461.9 Acute Sinusitis. 10/11/2012 Rx Prescribed
      – 10/12/2012 Picked up the Rx
      – 10/13/2012 Pt missed appt.; 10/13/2012 Telephone Consult; Healthy bones PN
      – 10/14/2012 Pt emailed: Drug adverse event; 10/14/2012 Pt called: cancerous area
      – 10/15/2012 EKG, Dx: Screening; 10/15/2012 Ear Wax Wash
      – 10/16/2012 Procedure: Remove Skin; 10/18/2012 Pathology Report Out
      – 10/16/2012 - 10/19/2012 Hospitalization
      – 5 Ws: What, Who, When, Where and Why
      – Membership length: 70% > 5 years, 50% > 10 years
  • 29. 5 Ws: What, Who, When, Where and Why
      – What: What is the reason for the visit? What happened? (pain after a fall, pain after drinking a beer?)
      – Who: Who is the caregiver? (primary physician, rheumatologist?) What do we know about this patient? (age, race, past medical history, etc.)
      – Where: Where did this visit occur?
      – When: When did the problem start?
      – Why: Why did this problem happen? Possible causes?
  • 30. Visual representation of KPSC research databases
  • 31. Case study: Identify acute gout flare
      – Published methods to identify gout flares using claims data
          – Clinical coding is unreliable: under-coding, over-coding, too general
          – Medication is unreliable: drugs are used for gout maintenance; the same drugs are also used for other diseases (which share similar symptoms)
      – NLP has been used to:
          – Identify the study population and patient information
          – Identify and extract clinical variables (genetic, biopsy, radiology)
          – Evaluate patient status (disease progression, medication status)
  • 32. Solution and challenges (NLP)
      – Challenges:
          – Gout is a chronic disease which can be controlled but not cured: signs and symptoms can appear at follow-up visits; acute and chronic status must be differentiated
          – The gout population is generally old, with comorbidities sharing similar symptoms: 100+ types of arthritis (> 50 million people); pain, erythema, and joint swelling
          – The information documented varies across clinical notes
      – Standard solutions:
          – Each search query captures one set of information
          – Each search query has its own sensitivity, specificity, etc.
          – Logic operators combine search results (union, join, etc.), which makes it difficult to optimize the overall sensitivity, specificity, etc.
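The difficulty with combining search queries by logic operators can be seen in a toy sketch. The note IDs, the gold standard, and the two query results below are all invented for illustration; they are not data from the study.

```python
def sensitivity_specificity(predicted, gold, universe):
    """Compute (sensitivity, specificity) for a set of predicted-positive IDs."""
    tp = len(predicted & gold)             # true positives
    fn = len(gold - predicted)             # missed positives
    fp = len(predicted - gold)             # false alarms
    tn = len(universe - predicted - gold)  # correctly rejected
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: ten notes, four of which truly describe a flare.
universe = set(range(10))
gold = {0, 1, 2, 3}
query_a = {0, 1, 4}  # e.g. a query on symptoms
query_b = {1, 2, 5}  # e.g. a query on treatment

results = {
    "A alone": sensitivity_specificity(query_a, gold, universe),
    "A OR B": sensitivity_specificity(query_a | query_b, gold, universe),
    "A AND B": sensitivity_specificity(query_a & query_b, gold, universe),
}
```

In this toy case the union raises sensitivity to 0.75 but drops specificity to about 0.67, while the intersection reaches specificity 1.0 at a sensitivity of only 0.25. Each operator moves both numbers at once, which is why the overall trade-off is hard to tune by hand.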
  • 33. Mining vs. NLP & ML in clinical research
      Steps:
      1. Preliminary analysis, estimate feasibility
      2. Develop plan, estimate cost
      3. Seek permit (government vs. IRB)
      4. Mine (mining equipment vs. NLP): focus on completeness (high sensitivity); shallow & deep mining (good specificity)
      5. Refine (chemical process vs. ML): improve purity (higher specificity)
      6. Manual verification (optional)
      7. Deliver to customer
      “art and science combined”; “resource-heavy and time-consuming process”
  • 34. Solution and challenges (NLP+ML)
      – Goal:
          – NLP focuses on sensitivity, or information completeness: separate ores from rock
          – ML focuses on improving the specificity: improve purity without much loss of sensitivity
      – Solution: NLP results serve as input features to the ML system
          – Identify related signs and symptoms
          – Identify temporal relationship (when and how long?)
          – Identify disease association (related to any other disease?)
          – Identify implicit and explicit mentions of gout flare
          – Identify treatment plan associated with disease onset
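The division of labour on slide 34 (a high-sensitivity NLP stage feeding features to a specificity-improving ML stage) can be sketched as a two-stage filter. Everything here is a stand-in: the notes are invented, the "NLP" is plain keyword matching, and the hand-set weights take the place of a trained model. None of it reflects the actual KPSC system.

```python
# Hypothetical notes; a real system would read EHR clinical notes.
NOTES = [
    {"id": 1, "text": "Acute gout flare in left toe, started 2 days ago, colchicine prescribed."},
    {"id": 2, "text": "History of gout, no flare today. Knee pain due to osteoarthritis."},
    {"id": 3, "text": "Sinus congestion. No joint complaints."},
]

# Hand-set weights standing in for a trained ML model (assumption).
WEIGHTS = {"explicit_flare": 2, "recent_onset": 1,
           "other_arthritis": -2, "flare_treatment": 1}
THRESHOLD = 2


def nlp_stage(note):
    """Stage 1 (high sensitivity): keep any note mentioning gout, extract crude features."""
    text = note["text"].lower()
    if "gout" not in text:
        return None
    return {
        "explicit_flare": "gout flare" in text,      # explicit mention
        "recent_onset": "days ago" in text,          # temporal relationship
        "other_arthritis": "osteoarthritis" in text, # association with another disease
        "flare_treatment": "colchicine" in text,     # treatment plan
    }


def ml_stage(features):
    """Stage 2 (improve specificity): score the features and apply a threshold."""
    score = sum(WEIGHTS[name] for name, present in features.items() if present)
    return score >= THRESHOLD


candidates = {note["id"]: nlp_stage(note) for note in NOTES}
candidates = {note_id: feats for note_id, feats in candidates.items() if feats is not None}
flares = [note_id for note_id, feats in candidates.items() if ml_stage(feats)]
```

Stage 1 keeps notes 1 and 2, and stage 2 keeps only note 1, mirroring the "separate ores from rock, then improve purity" framing.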
  • 35. Overview of the system development steps. Study period: 1/1/2007 to 12/31/2010. Patients > 18 years, with a diagnosis of gout and on urate-lowering therapy. Within [-3,+12] months of index date: 599,317 clinical notes for 16,519 patients.
  • 36. Overview of the NLP+ML system
  • 37. Performance comparisons (values in %, listed as Sensitivity / Specificity / PPV / NPV)
      – Clinical note level gout flare identification
          – Rheumatologist 1: 81.1 / 95.4 / 88.3 / 92.2
          – Rheumatologist 2: 90.9 / 97.3 / 93.0 / 96.5
          – NLP+ML: 84.8 / 92.2 / 81.1 / 93.9
      – Identify patients with ≥ 1 gout flares
          – Rheumatologist 1: 98.5 / 92.9 / 97.1 / 96.3
          – Rheumatologist 2: 97.1 / 92.9 / 97.1 / 92.9
          – NLP+ML: 98.5 / 96.4 / 98.5 / 96.4
          – ICD-9: 88.2 / 89.3 / 95.2 / 75.8
      – Identify patients with ≥ 3 gout flares (refractory gout)
          – Rheumatologist 1: 74.2 / 92.3 / 82.1 / 88.2
          – Rheumatologist 2: 83.9 / 95.4 / 89.7 / 92.5
          – NLP+ML: 93.5 / 84.6 / 74.4 / 96.5
          – ICD-9: 41.9 / 95.4 / 81.3 / 77.5
  • 38. Results
      – Note level (gout flare, n = 599,317): NLP: 49,415 positive cases => ML: 18,869 positive cases
      – Patient level (with ≥ 3 flares, n = 16,519):
          – Number of patients: 1,402 (NLP+ML) vs. 516 (Claim)
          – Sensitivity: 93.5% (NLP+ML) vs. 41.9% (Claim)
      – Impact:
          – Identify refractory disease patients
          – Estimate market size (KPSC / US population = 4.5/325 million = 1.4%)
          – Better disease management, improve quality of life, and help reduce healthcare resource use: 1,402 patients are more manageable than 16,519 patients
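The proportions behind slide 38 can be checked with a few lines of arithmetic. The inputs are the figures quoted on the slide; the variable names and the choice of which ratios to compute are illustrative.

```python
# Figures quoted on the slide.
notes_total = 599_317
nlp_positive = 49_415
ml_positive = 18_869
patients_total = 16_519
refractory_nlp_ml = 1_402
refractory_claims = 516
kpsc_members_millions = 4.5
us_population_millions = 325


def pct(part, whole):
    """Percentage of whole represented by part, rounded to one decimal."""
    return round(100 * part / whole, 1)


summary = {
    "notes flagged by NLP (%)": pct(nlp_positive, notes_total),
    "NLP positives kept by ML (%)": pct(ml_positive, nlp_positive),
    "patients with >= 3 flares (%)": pct(refractory_nlp_ml, patients_total),
    "KPSC share of US population (%)": pct(kpsc_members_millions, us_population_millions),
    "NLP+ML vs. claims patient count (ratio)": round(refractory_nlp_ml / refractory_claims, 1),
}
```

The ML stage keeps about 38% of the NLP positives, and NLP+ML identifies roughly 2.7 times as many refractory patients as claims data alone.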
  • 39. ML in healthcare
      – Tremendous opportunities
          – Prediction: high utilizers, risk scores
          – Identification: cases, outcomes, social needs
          – Image recognition: pathology and radiology images
      – Challenges (Data)
          – Data quality: dirty, missing data
          – Heterogeneous data: different systems
          – Structured, semi-structured and free text data
          – Image, scanned documents
          – Genetic and biobank data
      – Challenges (People)
          – Who understands NLP, ML and healthcare
          – Who understands the complexity of healthcare data
  • 40. Poll Question 3: How does your company primarily use machine learning in drug discovery? A. Target prediction and repositioning B. Biomarker discovery C. Patient stratification D. Other E. We don’t use machine learning
  • 41. Network and pathway driven machine learning approaches to biomarker discovery and patient stratification Eugene Myshkin, PhD September 2017
  • 42. CLARIVATE ANALYTICS TEXT MINING
      – Clarivate Analytics literature data feed
      – Comprehensive coverage
          – >20,000 journals
          – Journal content mirrors: Current Contents; Web of Science; Biosis; International Pharmaceutical Abstracts; Derwent Drug File
          – http://ip-science.thomsonreuters.com/cgi-bin/jrnlst/jlresults.cgi?PC=MASTER
      – Latest information: updated with over 170,921 articles/month, or 2,051,051+ articles/year
      – Full text, cover to cover searching of all journals
      – Comprehensive synonym collections
      – Controlled vocabulary management software to support mining
  • 43. CLARIVATE ANALYTICS LIFE SCIENCES SOLUTIONS: Pharmacovigilance Literature Monitoring; Biological and Chemical Reagent Monitoring; Concepts in social media; Automated Curation of Clinical Data; Protein and Gene Variant Monitoring
  • 44. USING NLP FOR MANUSCRIPT MATCHING: Analyze citation connections to place the publication in the right journal
  • 45. PITFALLS OF NLP FEATURES FOR ML
      – FOCUS OF DRUG DISCOVERY: DRUG, TARGET, DISEASE
      – These associations can be obtained with NLP, but precision is a problem: a flood of false positives, and the need to hire people just to sort the true alerts from the false ones
      – 1-10 million features
      – Feature vectors are binary and sparse
      – Feature redundancy
      – Feature selection takes a long time
  • 46. METABASE MANUALLY ANNOTATED CONTENT
      – PUBLICATIONS (209 for EGF-EGFR interaction)
          – Manual annotation from publications
          – Team of PhDs, MDs
          – Advanced editorial systems
          – Controlled vocabularies
          – Multiple levels of QC
          – More than 400 man years invested
      – MOLECULAR INTERACTION NETWORK: ~1,500,000 molecular interactions
      – PATHWAY: ~3,000 pathways
  • 47. INTEGRATED APPROACH: Pathway knowledge; Pathway-driven approaches; Statistical approaches. Applications: 1. Target identification or repositioning 2. Biomarker discovery 3. Patient stratification
  • 48. WHY DIFFERENT PATIENT RESPONSE?
      – Four response groups: Drug toxic but beneficial; Drug toxic but NOT beneficial; Drug NOT toxic and beneficial; Drug NOT toxic and NOT beneficial
      – Blockbuster strategy: “One drug for all patients”
      – New strategy is needed
      – Patient stratification: “The most efficient and safe drug for a cohort of patients”
  • 49. HOW CAN PATIENTS BE STRATIFIED?
      – Diagram: Mechanism 1 and Mechanism 2, each with its own biomarkers
      – Biomarker: measurable molecular indicator of disease subtype/progress, drug efficacy, or side effect/toxicity
      – Identify subtypes, resulting in multiple drug targets rather than one
      – A shift from the presumption of a single disease to multiple diseases would reframe the drug development strategy
  • 50. ORION BIONETWORKS
      – Orion Bionetworks (Cohen Veteran Biosciences) is an alliance of world-leading organizations in patient care, computational modelling, translational research and patient advocacy that aims to develop open-source computational models for multiple sclerosis and improve upon existing analytical tools for model development.
      – ~186 subjects with gene expression data and clinical parameters such as time to relapse
      – GOALS:
          – Understand the structure of the population based on the molecular data: identify cohorts of patients whose clinical course differs over time
          – Build stratification models
          – Identify new therapeutic targets
  • 51. NETWORK/PATHWAY BASED METHODS FOR BIOMARKER DISCOVERY
  • 52. 1. PATHWAY IDENTIFICATION
      – 56 pathways identified
          – 136 genes
          – 39/136 genes were present in multiple pathways
          – 44/136 genes are known MS biomarkers or drug targets (p = 5×10⁻⁶)
      – Individual expression values of each member gene were averaged into a combined z-score
      – Association of the activity score with time to relapse was calculated in a Cox proportional hazards model
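The pathway activity score described on slide 52 can be sketched as follows, assuming each gene is z-scored across patients (using the sample standard deviation) and the z-scores of a pathway's member genes are then averaged per patient. The slide does not give these details, so the exact recipe, the gene names, and the data are assumptions; the Cox proportional hazards step is not shown.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize one gene's expression values across patients."""
    mu, sd = mean(values), stdev(values)
    return [(value - mu) / sd for value in values]


def pathway_activity(expression, pathway_genes):
    """Average the member genes' z-scores to get one activity score per patient.

    expression: dict mapping gene name -> list of values, one per patient.
    pathway_genes: list of gene names belonging to the pathway.
    """
    z = {gene: zscores(expression[gene]) for gene in pathway_genes}
    n_patients = len(z[pathway_genes[0]])
    return [mean(z[gene][i] for gene in pathway_genes) for i in range(n_patients)]


# Hypothetical expression matrix: three patients, three genes.
expression = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [2.0, 4.0, 6.0],
    "GENE_C": [9.0, 1.0, 5.0],
}
activity = pathway_activity(expression, ["GENE_A", "GENE_B"])
```

Because each gene is standardized before averaging, genes measured on different scales contribute equally to the pathway score.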
  • 53. 2. PATIENTS CLUSTERING BY PATHWAYS
      – Patients were clustered into groups based on k-means clustering of their pathway activity profiles; k=3 resulted in the best separation of patient profiles
      – Clusters are significantly associated with time to relapse in the presence of important clinical covariates
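A minimal k-means sketch illustrates the clustering step on slide 53. It uses Euclidean distance and a naive deterministic initialization (the first k profiles), both of which are assumptions; the slide does not say how the authors initialized or ran the algorithm, and a production analysis would normally rely on a tested library implementation. The profiles are invented.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100):
    """Cluster points into k groups; returns (labels, centroids)."""
    centroids = [list(point) for point in points[:k]]  # naive init (assumption)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical pathway activity profiles (two pathways per patient).
profiles = [[0.0, 0.0], [5.0, 5.0], [10.0, 0.0],
            [0.2, 0.1], [5.1, 4.9], [9.8, 0.2]]
labels, centroids = kmeans(profiles, k=3)
```

With k=3, as on the slide, each patient receives one of three cluster labels, which can then be tested for association with time to relapse.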
  • 54. 3. CLASSIFICATION MODEL
      – A K-Nearest Neighbor model was previously generated to predict risk groups 1-3 using all biomarkers
      – Feature selection was performed by taking the variable importance calculated from the trained KNN model
      – Forward feature selection was then conducted using 10-fold CV, adding features to the model in order of their importance
      – Once this process was complete, the predictive performance was evaluated in terms of the ability of the model to separate the three risk groups
      – Final feature set was applied to test data
      – Signature was reduced from 56 to 13 pathways, containing 65 genes
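The forward feature selection on slide 54 can be sketched in plain Python. The importance ranking is taken as a given input (the slide does not say how variable importance was derived from the KNN model), cross-validated accuracy is used as the selection criterion, and the folds are assigned by index; all three are assumptions, and the data are invented.

```python
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the majority class among the k nearest training rows."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cv_accuracy(X, y, columns, folds=10, k=3):
    """Cross-validated accuracy of KNN restricted to the given columns."""
    Xs = [[row[c] for c in columns] for row in X]
    correct = 0
    for fold in range(folds):
        train = [i for i in range(len(Xs)) if i % folds != fold]
        test = [i for i in range(len(Xs)) if i % folds == fold]
        train_X = [Xs[i] for i in train]
        train_y = [y[i] for i in train]
        correct += sum(knn_predict(train_X, train_y, Xs[i], k) == y[i] for i in test)
    return correct / len(Xs)


def forward_select(X, y, ranked_columns, folds=10, k=3):
    """Add features in importance order; keep the prefix with the best CV score."""
    best_score, best_columns, current = -1.0, [], []
    for column in ranked_columns:
        current = current + [column]
        score = cv_accuracy(X, y, current, folds, k)
        if score > best_score:  # strict: prefer the smaller feature set on ties
            best_score, best_columns = score, list(current)
    return best_columns, best_score


# Hypothetical data: column 0 separates three risk groups, columns 1-2 are noise.
y = [i % 3 for i in range(20)]
X = [[(i % 3) * 10 + i * 0.01, (i * 7) % 5, (i * 3) % 4] for i in range(20)]
selected, score = forward_select(X, y, ranked_columns=[0, 1, 2])
```

On this toy data the first feature alone already separates the groups, so the procedure stops improving there. The same logic, applied to pathway activity features, is the kind of process that would shrink a signature such as the one on the slide.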
  • 55. GENE ONLY MODEL WAS NOT ROBUST TO TEST DATA (figure panels: PATHWAY BASED APPROACH; GENE BASED APPROACH)
  • 56. CONCLUSIONS
      – Signature differentiating between patient cohorts was reduced from 56 to 13 pathways
      – This new signature contains 65 genes
      – 13 biomarkers could stratify subjects into risk groups with statistically significant differences in time to relapse
      – This was validated in test subjects, with results consistent with what was observed in the training cohort
      – Pathway activities were more robust than gene expression
  • 57. Poll Question 4: What is the greatest barrier to application of NLP/ML at your company? A. Technical expertise B. Access to data C. Data quality D. Management support/understanding E. Other
  • 58. Poll Question 5: Do you expect an increase in ML within Life Science in the next 2 years? A. Yes B. No C. Don’t Know
  • 59. Audience Q&A Please use the Question function in GoToWebinar
  • 60. Where will AI/Deep learning have an impact in Life Science & Health? The next Pistoia Alliance Debates Webinar. Moderator: Nick Lynch, with Sean Ekins, CEO, Collaborations Pharmaceuticals Inc; David Pearah, CEO, HDF Group; and Peter Henstock, Pfizer Research. Date: September 27, 2017. Check http://www.pistoiaalliance.org/pistoia-alliance-debates-webinar-series/ for the latest information.