SlideShare a Scribd company logo
1 of 38
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Using weak supervision and
transfer learning techniques to
build knowledge graph
July 25
Sanghamitra Deb
@sangha_deb,
sdeb@chegg.com
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Zoha Zargham Sakshi Bhargava Priya Venkat
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Natural Language Processing
• Giving structure to
unstructured data
• Learn properties of the data
that makes decision making
simple
• Provide consise information to
drive intelligence of different
systems.
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Natural Language Processing
All industries generate content.
Healthcare
Finance
Retail
Legal
Education
Marketing
Real Estate Social Media
Academics
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Natural Language Processing :
How?
Build a Machine Learning
Pipeline to infer properties of
the text that will solve a
particular problem
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
• Student first company to improve
education
• Multiple services: question
answering, online tutoring,
flashcards, writing, math solver,
internships.
• Content drives product.
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
7
Chegg Study
*The Chegg logo is a registered trademark of Chegg, Inc. All other trademarks are owned by their respective owners
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
8
Chegg Study
*The Chegg logo is a registered trademark of Chegg, Inc. All other trademarks are owned by their respective owners
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
NLP Goal: Create a Knowledge Base
What is a knowledgebase?
All Content
Algebra Physics Statistics Mechanical Eng Accounting ….
Several Tens of Subjects
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
NLP Goal: Create a Knowledge Base
What is a knowledgebase?
Statistics
Probability Testing Regression
Discrete
PDs
Continuous
PD’s
Sampling
Estimation Hypothesis Testing Regression
Binomial
Normal
Probability
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Building a Machine Learning Pipeline for NLP
• Classification using tfidf
• Weak Supervision
• Transfer learning techniques
• Thresholding
• Active Learning
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Machine Learning needs huge number of examples
This is expensive
1. Collecting Data 2. Gathering labelled data 3. Feature Engineering 4. Fit a model
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Machine Learning needs huge number of examples
This is expensive
1. Collecting Data 2. Gathering labelled data 3. Feature Engineering 4. Fit a model
Deep learning replaces
feature engineering !!
However, DL requires huge
amounts of data.
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Weak Supervision
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
What is Weak Supervision?
Using noisy sources of truth to generate training data.
• Rules/heuristics
• Constraints
• Invariances
• Existing knowledge
• Cheaper sources of Labels
• Pre-trained model
• Get labels from the
customer
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
16
https://hazyresearch.github.io/snorkel/
A System for Fast Training Data Creation
• Weak supervision replaces manual
generation of labelled data
• Fast adoption and user friendly interface
Snorkel – packaging Weak supervision tools
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
17
Weak Supervision Pipeline using Snorkel
Labeling Functions:
o 𝜆1: Does the word
’wife’ occur between
names?
o 𝜆2: Is this a name in
the dictionary of
couples?
o 𝜆3: Does the word
‘spouse’ occur in the
sentence?
• Write multiple functions that can label data
• The functions encode one of the weak
supervision techniques
External
Knowledge
Sources
Patterns and
Dictionaries
Domain
heuristics
SMEs (Subject
Matter Experts)
providing
valuable inputs
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
18
Weak Supervision Pipeline using Snorkel
1. David and his wife Mel boarded the
flight to Adelaide.
2. Former US President Barack Obama
and the former first lady Michelle
Obama waved to the crowds in Ohio.
Document Parsing
Sentence
Phrases/n-grams
Labeling Functions:
o 𝜆1: Does the word
’wife’ occur between
names?
o 𝜆2: Is this a name in
the dictionary of
couples?
o 𝜆3: Does the word
‘spouse’ occur in the
sentence?
External
Knowledge
Sources
Patterns and
Dictionaries
Domain
heuristics
SMEs (Subject
Matter Experts)
providing
valuable inputs
Input Data
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Weak Supervision Pipeline using Snorkel
• Labeling Functions have different latent accuracies
• We want to learn these accuracies without using
labeled data
• Essentially, compare agreements and disagreements
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Weak Supervision Pipeline using Snorkel
• Build supervised models using probabilistic training labels
• Increase coverage
• Improved number of features
Ratner et. al., 2016
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Weak Supervision Steps
v
Prove that a triangle is equilateral
Question word preceding a Keyword
Keyword is contained in known Algebra: triangles
database
Concept: Algebra: triangles✓
Broad Stroke Filtering Rules
Language Pattern Recognition: Using part of
speech tags to extract keywords.
‘symmetric matrices’, ‘real number’
‘JJ NNS’
JJ = Noun, singular
NNS = Noun, plural
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
How do the rules work
0
20
40
60
80
100
120
140
160
180
200
1 0 -1
Training set
1: Class 1,document belongs to Algebra: triangles
0: unlabelled data
-1: Class 2, document does not belong to Algebra: triangles.
Rule 1: All documents with words/phrases contained in
Algebra: triangles database preceeded by question
words or phrases (ex: prove, calculate, etc) fall under the
category of Algebra: triangles
Rule 2: If words or phrases have appeared only once in
your corpus it is not Algebra: triangles.
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
How do the rules work?
0
20
40
60
80
100
120
140
160
180
200
1 0 -1
Training set
Rule 1: All documents with words/phrases contained in
Algebra: triangles database preceeded by question
words or phrases (ex: prove, calculate, etc) fall under the
category of Algebra: triangles
Rule 2: If words or phrases have appeared only once in
your corpus it is not Algebra: triangles.
Rule 3: All documents with words/phrases not in Algebra:
database do not fall under the category of Algebra:
triangles
Rule 4: All documents containing> 4 keywords/phrases
from Algebra: triangles database belong to Algebra:
triangles.
1: Class 1,document belongs to Algebra: triangles
0: unlabelled data
-1: Class 2, document does not belong to Algebra: triangles.
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Building the Machine Learning
Pipeline
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Thresholding
Probability
0 10.5
0.70.3
• You want your model to be correct.
• It is possible improve the
percentage of correct results at the
cost of coverage.
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Active Learning
1. Train a classifier and
predict on unseen data.
2. Evaluate points close to
the decision boundary
3. Collect SME annotations
on these points and add
them to the training set
Advantages:
• Requires less training Data.
• Strategizing can lead to higher coverage of
the model space
Dis-advantages:
• Uncertaintly sampling is noise
seeking: this can lead to fitting the
noise in the data.
• Outliers: Getting label on outliers
does not improve the models
(there are techniques to avoid this
issue).
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Machine Learning Pipline
Produce training data with
Weak Supervison
Feature Generation with
Transfer Learning
Supervised
Learning:
Active Learning:
gather more training
data
Populate database with inferred tags
from model
Prob>threshold
Yes:
prediction
accepted
No
Goal: Concept classification
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Collaboration
Product/Businesss
Content team
Iterate
• Redefine Filters
• Redefine rules
• Explain model
performance
What is the
application?
Product Integration
Collaboration
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
• Routing appropriate topics to relevant experts for answering questions
• Recommend appropriate topics to a student for practicing before an
exam
• Determine topics with highest demand to students and improve or create
content for these topics
• Connect different products based on topic similarity
Applications
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
In conclusion …
There is infinite amount of content
Language itself has logic, coherence and meaning.
Human curated examples for training models are expensive
Be smart about collecting examples and working with small amounts of example
Tackling new use cases less expensive.
Facilitate automation
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Questions
Sanghamitra Deb
@sangha_deb,
sdeb@chegg.com
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Appendix
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Using generative models to combine rules
Independent
Rules x_i
n accuracy dependencies
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Using generative models to combine rules
Including dependency
Here T is the set of
dependency types of interest
Learning the parameters of the generative model jointly requires
Gibbs sampling to estimate gradients. As the number of possible
dependencies increases at least quadratically in the number of
labeling functions
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Results: Precision
Subject
(example:
Statistics)
Level 2:
(probability,te
sting,regressi
on)
>90%
>90%
Level 3:
(discrete PDF,
estimation,sam
pling,etc
~80%
Level 4:
(discrete PDF,
estimation,sam
pling,etc
~70%
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Evaluation Metrics
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Other open source embedding tools
• Shallow Networks: Word2vec, Glove
• FASTtext : facebook, character based.
• ULMfit : fast.ai
• GPT-2 latest models from openai
• Shallow Networks: Word2vec, Glove
• FASTtext : facebook, character based.
• ULMfit : fast.ai
• GPT-2 latest models from openai
Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved
Sentence Embeddings
Autoencoders Language Models
Skip Thought Vectors
Embed
LSTM
Softmax
Predict

More Related Content

Similar to Using weak supervision and transfer learning techniques to build knowledge graph to improve student experiences at Chegg by Sanghamitra Deb, Staff Data Scientist @Chegg

Taras Fedorov "Evolution from ML to DL in NLP project"
Taras Fedorov "Evolution from ML to DL in NLP project"Taras Fedorov "Evolution from ML to DL in NLP project"
Taras Fedorov "Evolution from ML to DL in NLP project"Lviv Startup Club
 
Online ENGL 202 Syllabus
Online ENGL 202 SyllabusOnline ENGL 202 Syllabus
Online ENGL 202 SyllabusJodie Nicotra
 
Technology in Education & The Internet
Technology in Education& The InternetTechnology in Education& The Internet
Technology in Education & The InternetCARLOS MARTINEZ
 
Adaptive learning - Eight Key Questions
Adaptive learning - Eight Key QuestionsAdaptive learning - Eight Key Questions
Adaptive learning - Eight Key QuestionsCogBooks Ltd
 
7 steps to keep your ICT spend under control
7 steps to keep your ICT spend under control7 steps to keep your ICT spend under control
7 steps to keep your ICT spend under controlLesley Smith
 
Introduction to Viva Topics #CCAS2022
Introduction to Viva Topics #CCAS2022Introduction to Viva Topics #CCAS2022
Introduction to Viva Topics #CCAS2022Kanwal Khipple
 
How DeepSphere.AI Transformed Fresh Graduates Into Data Scientists At Databri...
How DeepSphere.AI Transformed Fresh Graduates Into Data Scientists At Databri...How DeepSphere.AI Transformed Fresh Graduates Into Data Scientists At Databri...
How DeepSphere.AI Transformed Fresh Graduates Into Data Scientists At Databri...HemaMaliniP5
 
How Free is Free?: Building courses with OERs
How Free is Free?: Building courses with OERsHow Free is Free?: Building courses with OERs
How Free is Free?: Building courses with OERsBCcampus
 
ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...
ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...
ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...Alok Singh
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningYogesh Sharma
 
Final Assignment Written assignment (Research Proposal)
Final Assignment Written assignment (Research Proposal) Final Assignment Written assignment (Research Proposal)
Final Assignment Written assignment (Research Proposal) ChereCheek752
 
Ngs Hsm 700bl Module 5 01062009
Ngs Hsm 700bl  Module 5 01062009Ngs Hsm 700bl  Module 5 01062009
Ngs Hsm 700bl Module 5 01062009Peter Stinson
 
Bridging the Divide: High Technology in Low-resource Settings -- an update (S...
Bridging the Divide: High Technology in Low-resource Settings -- an update (S...Bridging the Divide: High Technology in Low-resource Settings -- an update (S...
Bridging the Divide: High Technology in Low-resource Settings -- an update (S...James BonTempo
 
Assignment02 question file s1 2017
Assignment02 question file s1 2017Assignment02 question file s1 2017
Assignment02 question file s1 2017Sandeep Ratnam
 
Hvordan få søk til å fungere effektivt
Hvordan få søk til å fungere effektivtHvordan få søk til å fungere effektivt
Hvordan få søk til å fungere effektivtKristian Norling
 
Using the power of OpenAI with your own data: what's possible and how to start?
Using the power of OpenAI with your own data: what's possible and how to start?Using the power of OpenAI with your own data: what's possible and how to start?
Using the power of OpenAI with your own data: what's possible and how to start?Maxim Salnikov
 
Leveraging Generative AI & Best practices
Leveraging Generative AI & Best practicesLeveraging Generative AI & Best practices
Leveraging Generative AI & Best practicesDianaGray10
 
Inclusive, Accessible Tech: Bias-Free Language in Code and Configurations
Inclusive, Accessible Tech: Bias-Free Language in Code and ConfigurationsInclusive, Accessible Tech: Bias-Free Language in Code and Configurations
Inclusive, Accessible Tech: Bias-Free Language in Code and ConfigurationsAnne Gentle
 

Similar to Using weak supervision and transfer learning techniques to build knowledge graph to improve student experiences at Chegg by Sanghamitra Deb, Staff Data Scientist @Chegg (20)

Taras Fedorov "Evolution from ML to DL in NLP project"
Taras Fedorov "Evolution from ML to DL in NLP project"Taras Fedorov "Evolution from ML to DL in NLP project"
Taras Fedorov "Evolution from ML to DL in NLP project"
 
Online ENGL 202 Syllabus
Online ENGL 202 SyllabusOnline ENGL 202 Syllabus
Online ENGL 202 Syllabus
 
Technology in Education & The Internet
Technology in Education& The InternetTechnology in Education& The Internet
Technology in Education & The Internet
 
Adaptive learning - Eight Key Questions
Adaptive learning - Eight Key QuestionsAdaptive learning - Eight Key Questions
Adaptive learning - Eight Key Questions
 
7 steps to keep your ICT spend under control
7 steps to keep your ICT spend under control7 steps to keep your ICT spend under control
7 steps to keep your ICT spend under control
 
Introduction to Viva Topics #CCAS2022
Introduction to Viva Topics #CCAS2022Introduction to Viva Topics #CCAS2022
Introduction to Viva Topics #CCAS2022
 
How DeepSphere.AI Transformed Fresh Graduates Into Data Scientists At Databri...
How DeepSphere.AI Transformed Fresh Graduates Into Data Scientists At Databri...How DeepSphere.AI Transformed Fresh Graduates Into Data Scientists At Databri...
How DeepSphere.AI Transformed Fresh Graduates Into Data Scientists At Databri...
 
How Free is Free?: Building courses with OERs
How Free is Free?: Building courses with OERsHow Free is Free?: Building courses with OERs
How Free is Free?: Building courses with OERs
 
Digital employability workshop
Digital employability workshop Digital employability workshop
Digital employability workshop
 
ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...
ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...
ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Final Assignment Written assignment (Research Proposal)
Final Assignment Written assignment (Research Proposal) Final Assignment Written assignment (Research Proposal)
Final Assignment Written assignment (Research Proposal)
 
Ngs Hsm 700bl Module 5 01062009
Ngs Hsm 700bl  Module 5 01062009Ngs Hsm 700bl  Module 5 01062009
Ngs Hsm 700bl Module 5 01062009
 
Bridging the Divide: High Technology in Low-resource Settings -- an update (S...
Bridging the Divide: High Technology in Low-resource Settings -- an update (S...Bridging the Divide: High Technology in Low-resource Settings -- an update (S...
Bridging the Divide: High Technology in Low-resource Settings -- an update (S...
 
dScribe Workshop - U-M
dScribe Workshop - U-MdScribe Workshop - U-M
dScribe Workshop - U-M
 
Assignment02 question file s1 2017
Assignment02 question file s1 2017Assignment02 question file s1 2017
Assignment02 question file s1 2017
 
Hvordan få søk til å fungere effektivt
Hvordan få søk til å fungere effektivtHvordan få søk til å fungere effektivt
Hvordan få søk til å fungere effektivt
 
Using the power of OpenAI with your own data: what's possible and how to start?
Using the power of OpenAI with your own data: what's possible and how to start?Using the power of OpenAI with your own data: what's possible and how to start?
Using the power of OpenAI with your own data: what's possible and how to start?
 
Leveraging Generative AI & Best practices
Leveraging Generative AI & Best practicesLeveraging Generative AI & Best practices
Leveraging Generative AI & Best practices
 
Inclusive, Accessible Tech: Bias-Free Language in Code and Configurations
Inclusive, Accessible Tech: Bias-Free Language in Code and ConfigurationsInclusive, Accessible Tech: Bias-Free Language in Code and Configurations
Inclusive, Accessible Tech: Bias-Free Language in Code and Configurations
 

More from Paris Women in Machine Learning and Data Science

More from Paris Women in Machine Learning and Data Science (20)

Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Managing international tech teams, by Natasha Dimban
Managing international tech teams, by Natasha DimbanManaging international tech teams, by Natasha Dimban
Managing international tech teams, by Natasha Dimban
 
Optimizing GenAI apps, by N. El Mawass and Maria Knorps
Optimizing GenAI apps, by N. El Mawass and Maria KnorpsOptimizing GenAI apps, by N. El Mawass and Maria Knorps
Optimizing GenAI apps, by N. El Mawass and Maria Knorps
 
Perspectives, by M. Pannegeon
Perspectives, by M. PannegeonPerspectives, by M. Pannegeon
Perspectives, by M. Pannegeon
 
Evaluation strategies for dealing with partially labelled or unlabelled data
Evaluation strategies for dealing with partially labelled or unlabelled dataEvaluation strategies for dealing with partially labelled or unlabelled data
Evaluation strategies for dealing with partially labelled or unlabelled data
 
Combinatorial Optimisation with Policy Adaptation using latent Space Search, ...
Combinatorial Optimisation with Policy Adaptation using latent Space Search, ...Combinatorial Optimisation with Policy Adaptation using latent Space Search, ...
Combinatorial Optimisation with Policy Adaptation using latent Space Search, ...
 
An age-old question, by Caroline Jean-Pierre
An age-old question, by Caroline Jean-PierreAn age-old question, by Caroline Jean-Pierre
An age-old question, by Caroline Jean-Pierre
 
Applying Churn Prediction Approaches to the Telecom Industry, by Joëlle Lautré
Applying Churn Prediction Approaches to the Telecom Industry, by Joëlle LautréApplying Churn Prediction Approaches to the Telecom Industry, by Joëlle Lautré
Applying Churn Prediction Approaches to the Telecom Industry, by Joëlle Lautré
 
How to supervise a thesis in NLP in the ChatGPT era? By Laure Soulier
How to supervise a thesis in NLP in the ChatGPT era? By Laure SoulierHow to supervise a thesis in NLP in the ChatGPT era? By Laure Soulier
How to supervise a thesis in NLP in the ChatGPT era? By Laure Soulier
 
Global Ambitions Local Realities, by Anna Abreu
Global Ambitions Local Realities, by Anna AbreuGlobal Ambitions Local Realities, by Anna Abreu
Global Ambitions Local Realities, by Anna Abreu
 
Plug-and-Play methods for inverse problems in imagine, by Julie Delon
Plug-and-Play methods for inverse problems in imagine, by Julie DelonPlug-and-Play methods for inverse problems in imagine, by Julie Delon
Plug-and-Play methods for inverse problems in imagine, by Julie Delon
 
Sales Forecasting as a Data Product by Francesca Iannuzzi
Sales Forecasting as a Data Product by Francesca IannuzziSales Forecasting as a Data Product by Francesca Iannuzzi
Sales Forecasting as a Data Product by Francesca Iannuzzi
 
Identifying and mitigating bias in machine learning, by Ruta Binkyte
Identifying and mitigating bias in machine learning, by Ruta BinkyteIdentifying and mitigating bias in machine learning, by Ruta Binkyte
Identifying and mitigating bias in machine learning, by Ruta Binkyte
 
“Turning your ML algorithms into full web apps in no time with Python" by Mar...
“Turning your ML algorithms into full web apps in no time with Python" by Mar...“Turning your ML algorithms into full web apps in no time with Python" by Mar...
“Turning your ML algorithms into full web apps in no time with Python" by Mar...
 
Nature Language Processing for proteins by Amélie Héliou, Software Engineer @...
Nature Language Processing for proteins by Amélie Héliou, Software Engineer @...Nature Language Processing for proteins by Amélie Héliou, Software Engineer @...
Nature Language Processing for proteins by Amélie Héliou, Software Engineer @...
 
Sandrine Henry presents the BechdelAI project
Sandrine Henry presents the BechdelAI projectSandrine Henry presents the BechdelAI project
Sandrine Henry presents the BechdelAI project
 
Anastasiia Tryputen_War in Ukraine or how extraordinary courage reshapes geop...
Anastasiia Tryputen_War in Ukraine or how extraordinary courage reshapes geop...Anastasiia Tryputen_War in Ukraine or how extraordinary courage reshapes geop...
Anastasiia Tryputen_War in Ukraine or how extraordinary courage reshapes geop...
 
Khrystyna Grynko WiMLDS - From marketing to Tech.pdf
Khrystyna Grynko WiMLDS - From marketing to Tech.pdfKhrystyna Grynko WiMLDS - From marketing to Tech.pdf
Khrystyna Grynko WiMLDS - From marketing to Tech.pdf
 
Iana Iatsun_ML in production_20Dec2022.pdf
Iana Iatsun_ML in production_20Dec2022.pdfIana Iatsun_ML in production_20Dec2022.pdf
Iana Iatsun_ML in production_20Dec2022.pdf
 
41 WiMLDS Kyiv Paris Poznan.pdf
41 WiMLDS Kyiv Paris Poznan.pdf41 WiMLDS Kyiv Paris Poznan.pdf
41 WiMLDS Kyiv Paris Poznan.pdf
 

Recently uploaded

High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...Call girls in Ahmedabad High profile
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxhumanexperienceaaa
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...RajaP95
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingrknatarajan
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 

Recently uploaded (20)

High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 

Using weak supervision and transfer learning techniques to build knowledge graph to improve student experiences at Chegg by Sanghamitra Deb, Staff Data Scientist @Chegg

  • 1. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Using weak supervision and transfer learning techniques to build knowledge graph July 25 Sanghamitra Deb @sangha_deb, sdeb@chegg.com
  • 2. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Zoha Zargham Sakshi Bhargava Priya Venkat
  • 3. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Natural Language Processing • Giving structure to unstructured data • Learn properties of the data that makes decision making simple • Provide consise information to drive intelligence of different systems.
  • 4. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Natural Language Processing All industries generate content. Healthcare Finance Retail Legal Education Marketing Real Estate Social Media Academics
  • 5. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Natural Language Processing : How? Build a Machine Learning Pipeline to infer properties of the text that will solve a particular problem
  • 6. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved • Student first company to improve education • Multiple services: question answering, online tutoring, flashcards, writing, math solver, internships. • Content drives product.
  • 7. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved 7 Chegg Study *The Chegg logo is a registered trademark of Chegg, Inc. All other trademarks are owned by their respective owners
  • 8. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved 8 Chegg Study *The Chegg logo is a registered trademark of Chegg, Inc. All other trademarks are owned by their respective owners
  • 9. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved NLP Goal: Create a Knowledge Base What is a knowledgebase? All Content Algebra Physics Statistics Mechanical Eng Accounting …. Several Tens of Subjects
  • 10. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved NLP Goal: Create a Knowledge Base What is a knowledgebase? Statistics Probability Testing Regression Discrete PDs Continuous PD’s Sampling Estimation Hypothesis Testing Regression Binomial Normal Probability
  • 11. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Building a Machine Learning Pipeline for NLP • Classification using tfidf • Weak Supervision • Transfer learning techniques • Thresholding • Active Learning
  • 12. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Machine Learning needs huge number of examples This is expensive 1. Collecting Data 2. Gathering labelled data 3. Feature Engineering 4. Fit a model
  • 13. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Machine Learning needs huge number of examples This is expensive 1. Collecting Data 2. Gathering labelled data 3. Feature Engineering 4. Fit a model Deep learning replaces feature engineering !! However, DL requires huge amounts of data.
  • 14. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Weak Supervision
  • 15. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved What is Weak Supervision? Using noisy sources of truth to generate training data. • Rules/heuristics • Constraints • Invariances • Existing knowledge • Cheaper sources of Labels • Pre-trained model • Get labels from the customer
  • 16. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved 16 https://hazyresearch.github.io/snorkel/ A System for Fast Training Data Creation • Weak supervision replaces manual generation of labelled data • Fast adoption and user friendly interface Snorkel – packaging Weak supervision tools
  • 17. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved 17 Weak Supervision Pipeline using Snorkel Labeling Functions: o 𝜆1: Does the word ’wife’ occur between names? o 𝜆2: Is this a name in the dictionary of couples? o 𝜆3: Does the word ‘spouse’ occur in the sentence? • Write multiple functions that can label data • The functions encode one of the weak supervision techniques External Knowledge Sources Patterns and Dictionaries Domain heuristics SMEs (Subject Matter Experts) providing valuable inputs
  • 18. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved 18 Weak Supervision Pipeline using Snorkel 1. David and his wife Mel boarded the flight to Adelaide. 2. Former US President Barack Obama and the former first lady Michelle Obama waved to the crowds in Ohio. Document Parsing Sentence Phrases/n-grams Labeling Functions: o 𝜆1: Does the word ’wife’ occur between names? o 𝜆2: Is this a name in the dictionary of couples? o 𝜆3: Does the word ‘spouse’ occur in the sentence? External Knowledge Sources Patterns and Dictionaries Domain heuristics SMEs (Subject Matter Experts) providing valuable inputs Input Data
  • 19. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Weak Supervision Pipeline using Snorkel • Labeling Functions have different latent accuracies • We want to learn these accuracies without using labeled data • Essentially, compare agreements and disagreements
  • 20. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Weak Supervision Pipeline using Snorkel • Build supervised models using probabilistic training labels • Increase coverage • Improved number of features Ratner et. al., 2016
  • 21. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Weak Supervision Steps v Prove that a triangle is equilateral Question word preceding a Keyword Keyword is contained in known Algebra: triangles database Concept: Algebra: triangles✓ Broad Stroke Filtering Rules Language Pattern Recognition: Using part of speech tags to extract keywords. ‘symmetric matrices’, ‘real number’ ‘JJ NNS’ JJ = Noun, singular NNS = Noun, plural
  • 22. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved How do the rules work 0 20 40 60 80 100 120 140 160 180 200 1 0 -1 Training set 1: Class 1,document belongs to Algebra: triangles 0: unlabelled data -1: Class 2, document does not belong to Algebra: triangles. Rule 1: All documents with words/phrases contained in Algebra: triangles database preceeded by question words or phrases (ex: prove, calculate, etc) fall under the category of Algebra: triangles Rule 2: If words or phrases have appeared only once in your corpus it is not Algebra: triangles.
  • 23. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved How do the rules work? 0 20 40 60 80 100 120 140 160 180 200 1 0 -1 Training set Rule 1: All documents with words/phrases contained in Algebra: triangles database preceeded by question words or phrases (ex: prove, calculate, etc) fall under the category of Algebra: triangles Rule 2: If words or phrases have appeared only once in your corpus it is not Algebra: triangles. Rule 3: All documents with words/phrases not in Algebra: database do not fall under the category of Algebra: triangles Rule 4: All documents containing> 4 keywords/phrases from Algebra: triangles database belong to Algebra: triangles. 1: Class 1,document belongs to Algebra: triangles 0: unlabelled data -1: Class 2, document does not belong to Algebra: triangles.
  • 24. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Building the Machine Learning Pipeline
  • 25. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Thresholding Probability 0 10.5 0.70.3 • You want your model to be correct. • It is possible improve the percentage of correct results at the cost of coverage.
  • 26. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Active Learning 1. Train a classifier and predict on unseen data. 2. Evaluate points close to the decision boundary 3. Collect SME annotations on these points and add them to the training set Advantages: • Requires less training Data. • Strategizing can lead to higher coverage of the model space Dis-advantages: • Uncertaintly sampling is noise seeking: this can lead to fitting the noise in the data. • Outliers: Getting label on outliers does not improve the models (there are techniques to avoid this issue).
  • 27. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Machine Learning Pipline Produce training data with Weak Supervison Feature Generation with Transfer Learning Supervised Learning: Active Learning: gather more training data Populate database with inferred tags from model Prob>threshold Yes: prediction accepted No Goal: Concept classification
  • 28. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Collaboration Product/Businesss Content team Iterate • Redefine Filters • Redefine rules • Explain model performance What is the application? Product Integration Collaboration
  • 29. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved • Routing appropriate topics to relevant experts for answering questions • Recommend appropriate topics to a student for practicing before an exam • Determine topics with highest demand to students and improve or create content for these topics • Connect different products based on topic similarity Applications
  • 30. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved In conclusion … There is infinite amount of content Language itself has logic, coherence and meaning. Human curated examples for training models are expensive Be smart about collecting examples and working with small amounts of example Tackling new use cases less expensive. Facilitate automation
  • 31. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Questions Sanghamitra Deb @sangha_deb, sdeb@chegg.com
  • 32. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Appendix
  • 33. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Using generative models to combine rules Independent Rules x_i n accuracy dependencies
  • 34. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Using generative models to combine rules Including dependency Here T is the set of dependency types of interest Learning the parameters of the generative model jointly requires Gibbs sampling to estimate gradients. As the number of possible dependencies increases at least quadratically in the number of labeling functions
  • 35. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Results: Precision Subject (example: Statistics) Level 2: (probability,te sting,regressi on) >90% >90% Level 3: (discrete PDF, estimation,sam pling,etc ~80% Level 4: (discrete PDF, estimation,sam pling,etc ~70%
  • 36. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Evaluation Metrics
  • 37. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Other open source embedding tools • Shallow Networks: Word2vec, Glove • FASTtext : facebook, character based. • ULMfit : fast.ai • GPT-2 latest models from openai • Shallow Networks: Word2vec, Glove • FASTtext : facebook, character based. • ULMfit : fast.ai • GPT-2 latest models from openai
  • 38. Confidential Material / © 2019 Chegg, Inc. / All Rights Reserved Sentence Embeddings Autoencoders Language Models Skip Thought Vectors Embed LSTM Softmax Predict

Editor's Notes

  1. With respect to the diagram above, if we take the word “cat” it will be one of the words in the 10,000 word vocabulary.  Therefore we can represent it as a 10,000 length one-hot vector.  We then interface this input vector to a 300 node hidden layer (if you need to scrub up on neural networks, see this tutorial).  The weights connecting this layer will be our new word vectors – more on this soon.  The activations of the nodes in this hidden layer are simply linear summations of the weighted inputs (i.e. no non-linear activation, like a sigmoid or tanh, is applied).  These nodes are then fed into a softmax output layer.  During training, we want to change the weights of this neural network so that words surrounding “cat” have a higher probability in the softmax output layer.  So, for instance, if our text data set has a lot of Dr Seuss books, we would want our network to assign large probabilities to words like “the”, “sat” and “on” (given lots of sentences like “the cat sat on the mat”). By training this network, we would be creating a 10,000 x 300 weight matrix connecting the 10,000 length input with the 300 node hidden layer.  Each row in this matrix corresponds to a word in our 10,000 word vocabulary – so we have effectively reduced 10,000 length one-hot vector representations of our words to 300 length vectors.  The weight matrix essentially becomes a look-up or encoding table of our words.  Not only that, but these weight values contain context information due to the way we’ve trained our network.  Once we’ve trained the network, we abandon the softmax layer and just use the 10,000 x 300 weight matrix as our word embedding lookup table.
  2. For two classes the predicted probability varies between {0,1}, probability close to 0 implies one class & probability close to 1  other class. What if for a particular piece of content the probability is 0.55? This implies that the content does not have pattern that helps the model distinguish between the two classes. Here we do not accept results that lie between thresholds (say 0.35 - 0.65). This will give higher accuracy at the cost of lower coverage