Fusepool
Machine Learning
Framework
June 25th, Brussels
Fusepool
Structured Content
Visualization
Enable personalized software
Outline
Introduction to adaptive interfaces
Source refinement
Document labeling
Link prediction
Adaptive layout
Simple Machine Learning: Listen-Update-Predict (LUP)
LUP in detail for document labelling
Predictive Query: Predictive queries
Adaptive interfaces
Guillaume Bouchard (Xerox)
Customization/Contextualization of
interfaces
Known and accepted by big internet companies
Not easy to implement for SMEs
Annotation tools
●To manage large knowledge bases, there is a need for efficient interfaces for annotators
●Web 2.0 companies are investigating these tools
●Mixed initiative
o A learning algorithm + human interface
●Remark: a user can be an annotator for some time
Supervised automation
Introduction
Challenge
LOD provides a huge amount of data
Hard to organize
Goal
Streamline KB cleaning and management through
implicit and explicit feedback
Specifications
Easy tagging of documents
Near real-time prediction
Adaptive components in Fusepool
Document category
prediction
Entity labeling
Source refinement (re-ranking based on previous user clicks)
Adaptive Layout
Simple Machine Learning:
Listen-Update-Predict (LUP)
Motivation
●Adaptive systems
●Many systems use machine learning algorithms as internal components
●The interaction between raw data, annotations, algorithms and predictions is not
simple:
• Data: Large and distributed (the 3 Vs: Velocity, Variety, Volume)
• Algorithms: multiple possible algorithms for the same task, slow
training/inference
• Visualization: must carry the uncertainty about data, annotations and
predictions
●Common problems:
• Confusion between predictions and data
• Models not automatically updated (manually « re-train » models)
• No simple way to test new algorithms
• Annotations not shared across models in the same system
• Too few annotations in specific domains (no principled way to gather new annotations)
Prior art
• Patterns (and Anti-Patterns) for Developing Machine Learning Systems. SysML 2008
• https://www.usenix.org/legacy/event/sysml08/tech/rios_talk.pdf
• The Agent Learning Pattern: Implementing ML algorithms in multiagent systems
• http://www.cs.cmu.edu/~alberto/papers/LearningPatternSugarLoaf.pdf
• Gestalt, a general-purpose integrated development environment designed for the application of machine learning
• Kayur Patel (University of Washington)
• http://www.acm.org/uist/archive/adjunct/2010/pdf/doctoral_consortium/p355.pdf
• Scikit-learn. Three complementary interfaces: Estimator, Predictor, Transformer
• http://hal.inria.fr/docs/00/85/65/11/PDF/paper.pdf
• Infer.net: Probabilistic programming. Compilation of machine learning codes
• http://research.microsoft.com/en-us/um/people/cmbishop/downloads/bishop-mbml-2012.pdf
• Never-Ending Language Learning (NELL). The closest to our work but focused on language
• www.cs.cmu.edu/~acarlson/papers/carlson-aaai10.pdf
Never Ending Language Learning
●Intelligent computer agent
●Runs forever. Every day:
1. extract, or read, information from
the web
2. learn to perform this task better
●Carlson, Betteridge, Kisiel, Settles,
Hruschka and Mitchell (2010) give
the design principles for such an
agent
Machine learning process
LUPI Module overview
Listen
Gets notified when new annotations arrive
Update
Process annotation & update learning models
Predict
Exposes a prediction service available for other
components
Investigate
Actively ask for new annotations
LUP modules are monitored by
Fusepool main platform
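The Listen, Update and Predict steps above can be sketched as a small self-contained Java class. This is only an illustration under stated assumptions: the names Annotation, MajorityLabelModule and onAnnotation are hypothetical and are not part of the Fusepool platform, the "model" is a plain label counter, and the Investigate step is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical annotation: a document labelled by a user.
final class Annotation {
    final String docId;
    final String label;

    Annotation(String docId, String label) {
        this.docId = docId;
        this.label = label;
    }
}

// Illustrative module, not the Fusepool LUPEngine: it counts labels and
// predicts the most frequent label seen so far.
final class MajorityLabelModule {
    private final List<Annotation> pending = new ArrayList<>();
    private final Map<String, Integer> counts = new HashMap<>();

    // Listen: called when a new annotation arrives; triggers an update.
    void onAnnotation(Annotation a) {
        pending.add(a);
        update();
    }

    // Update: fold the pending annotations into the model, then clear the queue.
    void update() {
        for (Annotation a : pending) {
            counts.merge(a.label, 1, Integer::sum);
        }
        pending.clear();
    }

    // Predict: return the most frequent label, or null if nothing was learned yet.
    // The document id is ignored by this toy model.
    String predict(String docId) {
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }
}
```

The point of the separation is that callers only ever see predict(...), while annotations enter through the listener, so predictions and data are never mixed up.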
LUP Module Implementation
●LUPEngine is a Java interface
●Location: com.xerox.services.LUPEngine
o + getGraphListener(...);
o + graphChanged(...);
o + updateModels(...);
o + predict(...);
Supervised automation
Follow the LUP
Listen
Users give labels to documents in the GUI
Labels stored in annotation store
Update
Optimize the model with latest annotations
Warm start machine learning algorithms
Predict
Real time prediction based on updated model
Visible in the GUI
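A warm start means that each update continues from the previous model instead of retraining from scratch. The sketch below illustrates this with a binary logistic regression updated by one stochastic gradient step per new annotation. It is an assumption-laden example: the class name WarmStartLogistic, the feature vectors and the learning rate are hypothetical, and the slides do not say which model Fusepool uses at this step.

```java
// Binary logistic regression trained by stochastic gradient descent.
// The weights persist between calls, so every update warm-starts from
// the previous model instead of retraining from scratch.
final class WarmStartLogistic {
    private final double[] w;
    private final double learningRate;

    WarmStartLogistic(int dim, double learningRate) {
        this.w = new double[dim];
        this.learningRate = learningRate;
    }

    // Probability that the label is 1 for feature vector x.
    double predict(double[] x) {
        double z = 0.0;
        for (int i = 0; i < w.length; i++) {
            z += w[i] * x[i];
        }
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // One SGD step on a single new annotation (y is 0 or 1).
    void update(double[] x, int y) {
        double error = y - predict(x);
        for (int i = 0; i < w.length; i++) {
            w[i] += learningRate * error * x[i];
        }
    }
}
```

Because an update touches only the current weights and one example, its cost does not grow with the number of past annotations, which is what makes near real-time prediction plausible.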
Supervised automation
Architecture
Components Process
Supervised automation
Xerox web services
Update and prediction using REST interface
Scaling up prediction to huge datasets
Listen
private class MyListener implements GraphListener {
    /**
     * Listener method: called when matching modifications are detected on
     * the Annostore. It triggers the learning process through
     * updateModels(HashMap<String, String> params).
     */
    @Override
    public void graphChanged(List<GraphEvent> list) {
        annostore = tcManager.getMGraph(ANNOTATION_GRAPH_NAME);
        for (GraphEvent e : list) {
            log.info("New #MyKindOfAnnotation !");
            HashMap<String, String> params = new HashMap<String, String>();
            // 1.) Access the target of the annotation
            Iterator<Triple> it = annostore.filter(e.getTriple().getSubject(),
                    new UriRef("http://www.w3.org/ns/oa#hasTarget"),
                    null);
            if (!it.hasNext()) {
                continue; // annotation without a target: nothing to learn from
            }
            // 2.) Access the text content of the target,
            // e.g. the new word to insert into the dictionary
            Resource target = it.next().getObject();
            it = annostore.filter((NonLiteral) target,
                    new UriRef("http://www.w3.org/2011/content#chars"),
                    null);
            if (!it.hasNext()) {
                continue; // target without text content
            }
            String newWord = it.next().getObject().toString();
            params.put("newWord", newWord);
            updateModels(params);
        }
    }
}
Update
/**
 * Updates the learning models with the word carried by the annotation.
 */
public void updateModels(HashMap<String, String> params) {
    String newWord = params.get("newWord");
    log.info("Adding " + newWord + " to dictionary");
    myDictionnary.add(newWord);
}
Predict
HashMap<String, String> params = new HashMap<String, String>();
String docURI = "<http://fusepool.info/doc/pmc/2751467>";
// Build the parameters passed to the LUP3.4 module via the predictionHub
params.put("docURI", docURI);
// Call LUP34.predict(...) through predictionHub.predict(...)
String predictedLabels = predictionHub.predict("LUP34", params);
// Log the result of the prediction, for example:
// "tissue__0.713##sodium__0.09135##English__0.016"
log.info(predictedLabels);
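The prediction string shown above packs several labels into one value, with "##" between entries and "__" between a label and its number. A client might unpack it as sketched below. The class name PredictedLabels is hypothetical, and reading the number as a confidence score is an assumption, since the slides only show the raw string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PredictedLabels {
    // Parses "label__score##label__score" into a label -> score map that
    // keeps the order of the input string.
    static Map<String, Double> parse(String s) {
        Map<String, Double> scores = new LinkedHashMap<>();
        if (s == null || s.isEmpty()) {
            return scores;
        }
        for (String pair : s.split("##")) {
            // Split on the last "__" so labels may themselves contain "__".
            int sep = pair.lastIndexOf("__");
            if (sep < 0) {
                continue; // skip entries without a separator
            }
            String label = pair.substring(0, sep);
            double score = Double.parseDouble(pair.substring(sep + 2));
            scores.put(label, score);
        }
        return scores;
    }
}
```

A non-numeric score makes Double.parseDouble throw a NumberFormatException; a real client would decide whether to skip or report such entries.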
Supervised automation
Multi-task learning services
● Better prediction based on
multi-task algorithm with label
embedding
● Efficient learning algorithms
o Alternating optimization
o Stochastic Gradient Descent
● Efficient storage based on
Cassandra
Supervised automation
Sequence diagram
1. The GUI inserts
annotations
2. The Listener calls the
LUP3.4 Module
3. The LUP calls the
REST API
4. Then the information
flows back when
doing prediction
Supervised automation
Properly tested interface
Corpus        20 Newsgroups            WebKB                    Cade
Tolerance     1       2       3        1       2       3        1       2
Rank = 20     0.152   0.074   0.05     0.15    0.055   0.035    0.348   0.222
Rank = 50     0.16    0.072   0.052    0.2     0.085   0.04     0.386   0.266
Rank = 100    0.256   0.166   0.126    0.335   0.18    0.11     0.134   0.072
Predictive queries
Motivation for predictive queries
Most prediction problems can be expressed as a query
on “missing” information.
SELECT ?n WHERE
<?d, hasLabel, “WellWritten”>
<?p, isAuthor, ?d>
<?p, hasName, ?n>
Semantic Search API
Predictive SPARQL
Core idea: learn a model on the KB
→ Now we can query missing data!
● SPARQL is a standard query language for semantic data
● Predictive SPARQL: generalization to probabilistic models
Semantic Search API
Predictive SPARQL example
Semantic Search API
Predictive model
● Use of tensor
factorization methods
● Tensor=generalization of
matrices
● Scalable probabilistic
models
● Based on the RESCAL approximation:
$T_{ikj} \approx e_i^\top R_k e_j$
where:
o $e_i$ and $e_j$ are the vectors of entities $i$ and $j$
o $R_k$ is the matrix of relation $k$
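The bilinear score in this approximation is just a vector-matrix-vector product. A minimal sketch follows, assuming dense arrays and hypothetical names (RescalScore, score); it shows only how one tensor entry is scored, not how the factors are learned.

```java
final class RescalScore {
    // Computes e_i^T R_k e_j for entity vectors ei, ej and relation matrix rk.
    // rk must have ei.length rows and ej.length columns.
    static double score(double[] ei, double[][] rk, double[] ej) {
        double total = 0.0;
        for (int a = 0; a < ei.length; a++) {
            for (int b = 0; b < ej.length; b++) {
                total += ei[a] * rk[a][b] * ej[b];
            }
        }
        return total;
    }
}
```

A higher score for a triple (entity i, relation k, entity j) is read as stronger evidence that the triple holds, which is what allows a query to return facts that are missing from the KB.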
Predictive SPARQL example
Conclusion
Main achievements
● LUP: Listen-Update-Predict is a design pattern
that provides software engineering best practices
● Predictive SPARQL: A framework for predictive
queries on RDF data
Future of Fusepool
Xerox is using Fusepool for exploring and
organizing its customer KB

Pricing and business model Fusepool
 

Recently uploaded

AI in the Workplace Reskilling, Upskilling, and Future Work.pptx
AI in the Workplace Reskilling, Upskilling, and Future Work.pptxAI in the Workplace Reskilling, Upskilling, and Future Work.pptx
AI in the Workplace Reskilling, Upskilling, and Future Work.pptx
Sunil Jagani
 
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving
 
A Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's ArchitectureA Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's Architecture
ScyllaDB
 
AppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSFAppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSF
Ajin Abraham
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
c5vrf27qcz
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
Safe Software
 
Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
Pablo Gómez Abajo
 
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdfLee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
leebarnesutopia
 
Leveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and StandardsLeveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and Standards
Neo4j
 
Apps Break Data
Apps Break DataApps Break Data
Apps Break Data
Ivo Velitchkov
 
Christine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptxChristine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptx
christinelarrosa
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
AstuteBusiness
 
What is an RPA CoE? Session 2 – CoE Roles
What is an RPA CoE?  Session 2 – CoE RolesWhat is an RPA CoE?  Session 2 – CoE Roles
What is an RPA CoE? Session 2 – CoE Roles
DianaGray10
 
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptxPRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
christinelarrosa
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
DianaGray10
 
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
Fwdays
 
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsConnector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
DianaGray10
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
Javier Junquera
 
Day 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio FundamentalsDay 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio Fundamentals
UiPathCommunity
 

Recently uploaded (20)

AI in the Workplace Reskilling, Upskilling, and Future Work.pptx
AI in the Workplace Reskilling, Upskilling, and Future Work.pptxAI in the Workplace Reskilling, Upskilling, and Future Work.pptx
AI in the Workplace Reskilling, Upskilling, and Future Work.pptx
 
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
 
A Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's ArchitectureA Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's Architecture
 
AppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSFAppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSF
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
 
Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
 
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdfLee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
 
Leveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and StandardsLeveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and Standards
 
Apps Break Data
Apps Break DataApps Break Data
Apps Break Data
 
Christine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptxChristine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptx
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
 
What is an RPA CoE? Session 2 – CoE Roles
What is an RPA CoE?  Session 2 – CoE RolesWhat is an RPA CoE?  Session 2 – CoE Roles
What is an RPA CoE? Session 2 – CoE Roles
 
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptxPRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
 
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
 
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsConnector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
 
Day 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio FundamentalsDay 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio Fundamentals
 

Fusepool Machine Learning Framework

  • 3. Outline: Introduction to adaptive interfaces; Source refinement; Document labeling; Link prediction; Adaptive layout; Simple Machine Learning: Listen-Update-Predict (LUP); LUP in detail for document labeling; Predictive queries
  • 5. Customization/Contextualization of interfaces: known and accepted by big internet companies, but not easy to implement for SMEs
  • 6. Annotation tools ● To manage large knowledge bases, there is a need for efficient interfaces for annotators ● Web 2.0 companies are investigating these tools ● Mixed initiative: a learning algorithm + a human interface ● Remark: a user can be an annotator for some time
  • 7. Supervised automation: Introduction. Challenge: LOD provides a huge amount of data that is hard to organize. Goal: streamline KB cleaning and management through implicit and explicit feedback. Specifications: easy tagging of documents; near real-time prediction
  • 8. Adaptive components in Fusepool: document category prediction; entity labeling; source refinement (re-ranking based on previous user clicks); adaptive layout
  • 9. Simple Machine Learning: Listen-Update-Predict (LUP) Guillaume Bouchard (Xerox)
  • 10. Motivation
      ● Adaptive systems
      ● Many systems use machine learning algorithms as internal components
      ● The interaction between raw data, annotations, algorithms and predictions is not simple:
          • Data: large and distributed (the 3 Vs: Velocity, Variety, Volume)
          • Algorithms: multiple possible algorithms for the same task, slow training/inference
          • Visualization: must carry the uncertainty about data, annotations and predictions
      ● Common problems:
          • Confusion between predictions and data
          • Models not automatically updated (models are « re-trained » manually)
          • No simple way to test new algorithms
          • Annotations not shared across models in the same system
          • Too few annotations in specific domains (no principled way to gather new annotations)
  • 11. Prior art
      • Patterns (and Anti-Patterns) for Developing Machine Learning Systems. SysML 2008
        https://www.usenix.org/legacy/event/sysml08/tech/rios_talk.pdf
      • The Agent Learning Pattern: Implementing ML algorithms in multiagent systems
        http://www.cs.cmu.edu/~alberto/papers/LearningPatternSugarLoaf.pdf
      • Gestalt, a general-purpose integrated development environment designed for the application of machine learning. Kayur Patel (University of Washington)
        http://www.acm.org/uist/archive/adjunct/2010/pdf/doctoral_consortium/p355.pdf
      • Scikit-learn. Three complementary interfaces: Estimator, Predictor, Transformer
        http://hal.inria.fr/docs/00/85/65/11/PDF/paper.pdf
      • Infer.NET: Probabilistic programming. Compilation of machine learning codes
        http://research.microsoft.com/en-us/um/people/cmbishop/downloads/bishop-mbml-2012.pdf
      • Never-Ending Language Learning (NELL). The closest to our work, but focused on language
        www.cs.cmu.edu/~acarlson/papers/carlson-aaai10.pdf
  • 12. Never Ending Language Learning ● Intelligent computer agent ● Runs forever. Every day: 1. extract, or read, information from the web; 2. learn to perform this task better ● Carlson, Betteridge, Kisiel, Settles, Hruschka and Mitchell (2010) give the design principles for such an agent
  • 14. LUPI Module overview
      Listen: gets notified when new annotations arrive
      Update: processes annotations and updates the learning models
      Predict: exposes a prediction service available to other components
      Investigate: actively asks for new annotations
  • 15. LUP modules are monitored by the Fusepool main platform
  • 16. LUP Module Implementation ● LUPEngine is a Java interface ● Location: com.xerox.services.LUPEngine o + getGraphListener(...); o + graphChanged(...); o + updateModels(...); o + predict(...);
  • 19. Supervised automation: follow the LUP
      Listen: users give labels to documents in the GUI; labels are stored in the annotation store
      Update: optimize the model with the latest annotations; warm-start the machine learning algorithms
      Predict: real-time prediction based on the updated model; visible in the GUI
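The Listen / Update / Predict steps of slide 19 can be illustrated with a toy, self-contained sketch. Everything in it is an assumption made for illustration: the class `ToyLupModule`, its method names, and the word-count "model" are hypothetical and are not the Fusepool `LUPEngine` API or the learning algorithm used by the project.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified Listen-Update-Predict module (not the Fusepool LUPEngine API). */
class ToyLupModule {
    // The "model": for each label, how often each word was seen in documents with that label.
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    /** Listen: called when a new (document text, label) annotation arrives. */
    void onAnnotation(String text, String label) {
        updateModel(text, label);
    }

    /** Update: incrementally add the new annotation to the existing counts (no full re-training). */
    private void updateModel(String text, String label) {
        Map<String, Integer> wordCounts = counts.computeIfAbsent(label, k -> new HashMap<>());
        for (String w : text.toLowerCase().split("\\s+")) {
            if (!w.isEmpty()) {
                wordCounts.merge(w, 1, Integer::sum);
            }
        }
    }

    /** Predict: return the label whose counts best match the text, or null if nothing was learned yet. */
    String predict(String text) {
        String best = null;
        int bestScore = -1;
        for (Map.Entry<String, Map<String, Integer>> e : counts.entrySet()) {
            int score = 0;
            for (String w : text.toLowerCase().split("\\s+")) {
                score += e.getValue().getOrDefault(w, 0);
            }
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

The point of the sketch is the separation of concerns: annotations arrive through one entry point, the model is updated incrementally from its current state, and predictions always reflect the latest update.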
  • 21. Supervised automation: Xerox web services. Update and prediction using a REST interface; scaling up prediction to huge datasets
  • 22. Listen

        private class MyListener implements GraphListener {

            /**
             * Listener method: called when matching modifications are detected on
             * the Annostore. This method triggers the learning process, using
             * the updateModels(HashMap<String,String> params) method.
             */
            public void graphChanged(List<GraphEvent> list) {
                annostore = tcManager.getMGraph(ANNOTATION_GRAPH_NAME);
                for (GraphEvent e : list) {
                    log.info("New #MyKindOfAnnotation !");
                    HashMap<String, String> params = new HashMap<String, String>();

                    // 1.) Accessing the target of the annotation
                    Iterator<Triple> it = annostore.filter(e.getTriple().getSubject(),
                            new UriRef("http://www.w3.org/ns/oa#hasTarget"), null);
                    if (!it.hasNext()) {
                        continue; // annotation without a target: nothing to learn from
                    }

                    // 2.) Accessing the content as text of the target,
                    //     e.g. the new word to insert into the dictionary
                    Resource target = it.next().getObject();
                    it = annostore.filter((NonLiteral) target,
                            new UriRef("http://www.w3.org/2011/content#chars"), null);
                    if (!it.hasNext()) {
                        continue; // target without text content
                    }
                    String newWord = it.next().getObject().toString();
                    params.put("newWord", newWord);
                    updateModels(params);
                }
            }
        }
  • 23. Update

        /**
         * This method updates the learning models.
         */
        public void updateModels(HashMap<String, String> params) {
            String newWord = params.get("newWord");
            log.info("Adding " + newWord + " to dictionary");
            myDictionnary.add(newWord);
        }
  • 24. Predict

        HashMap<String, String> params = new HashMap<String, String>();
        String docURI = "<http://fusepool.info/doc/pmc/2751467>";

        // Build the parameters to pass to the LUP34 module via the predictionHub
        params.put("docURI", docURI);

        // Call the LUP34.predict(...) method via the predictionHub.predict(...) method
        String predictedLabels = predictionHub.predict("LUP34", params);

        // Dump the result of the prediction, e.g.
        // "tissue__0.713##sodium__0.09135##English__0.016"
        log.info(predictedLabels);
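The prediction on slide 24 comes back as a single string. A minimal sketch of how a caller might turn it into label/score pairs follows; it assumes, from the one example shown on the slide, that entries are separated by `##` and that each entry is `label__score`. The class `PredictedLabels` and its `parse` method are hypothetical helpers, not part of Fusepool.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; the "label__score##label__score" format is inferred from the slide's example. */
class PredictedLabels {

    /** Parses the prediction string into an ordered label -> score map. */
    static Map<String, Double> parse(String predicted) {
        Map<String, Double> scores = new LinkedHashMap<>(); // keeps the order of the input string
        if (predicted == null || predicted.isEmpty()) {
            return scores;
        }
        for (String entry : predicted.split("##")) {
            int sep = entry.lastIndexOf("__");
            if (sep <= 0) {
                continue; // skip entries that have no label or no separator
            }
            String label = entry.substring(0, sep);
            // Throws NumberFormatException if the score part is not a number.
            double score = Double.parseDouble(entry.substring(sep + 2));
            scores.put(label, score);
        }
        return scores;
    }
}
```

Using `lastIndexOf` rather than the first occurrence keeps labels that themselves contain `__` intact, which is a design guess rather than documented behavior.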
  • 25. Supervised automation Multi-task learning services ● Better prediction based on multi-task algorithm with label embedding ● Efficient learning algorithms o Alternating optimization o Stochastic Gradient Descent ● Efficient storage based on Cassandra
  • 26. Supervised automation: Sequence diagram 1. The GUI inserts annotations 2. The Listener calls the LUP3.4 module 3. The LUP calls the REST API 4. The information then flows back when doing prediction
  • 27. Supervised automation: Properly tested interface

        Corpus        20 Newsgroups            WebKB                    Cade
        Tolerance     1       2       3        1       2       3        1       2
        Rank = 20     0.152   0.074   0.05     0.15    0.055   0.035    0.348   0.222
        Rank = 50     0.16    0.072   0.052    0.2     0.085   0.04     0.386   0.266
        Rank = 100    0.256   0.166   0.126    0.335   0.18    0.11     0.134   0.072
  • 29. Motivation for predictive queries: most prediction problems can be expressed as a query on “missing” information.

        SELECT ?n WHERE
            <?d, hasLabel, “WellWritten”>
            <?p, isAuthor, ?d>
            <?p, hasName, ?n>
  • 30. Semantic Search API: Predictive SPARQL. Core idea: learn a model on the KB, then query the missing data ● SPARQL is a standard query language for semantic data ● Predictive SPARQL: generalization to probabilistic models
  • 32. Semantic Search API: Predictive model ● Use of tensor factorization methods ● Tensor = generalization of matrices ● Scalable probabilistic models ● Based on the RESCAL approximation: $T_{ikj} \approx e_i^\top R_k e_j$, where: o $e_i$ and $e_j$ are entities o $R_k$ is the relational matrix
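The bilinear form on slide 32 can be made concrete with a small sketch. It assumes each entity is represented by a latent vector and each relation by a square matrix, which is the standard reading of RESCAL; the class `RescalScore` and the plain-array representation are illustrative choices, not the Fusepool implementation.

```java
/** Hypothetical helper computing the RESCAL-style score e_i^T R_k e_j with plain arrays. */
class RescalScore {

    /**
     * @param ei latent vector of the subject entity (length d)
     * @param rk relation matrix (d x d)
     * @param ej latent vector of the object entity (length d)
     * @return the bilinear score e_i^T R_k e_j
     */
    static double score(double[] ei, double[][] rk, double[] ej) {
        double total = 0.0;
        for (int a = 0; a < ei.length; a++) {
            double row = 0.0;
            for (int b = 0; b < ej.length; b++) {
                row += rk[a][b] * ej[b]; // (R_k e_j)[a]
            }
            total += ei[a] * row;
        }
        return total;
    }
}
```

Because $R_k$ need not be symmetric, swapping subject and object generally gives a different score, which lets the model represent directed relations.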
  • 35. Main achievements ● LUP: Listen-Update-Predict is a design pattern that provides software engineering best practices ● Predictive SPARQL: a framework for predictive queries on RDF data
  • 36. Future of Fusepool Xerox is using Fusepool for exploring and organizing its customer KB