Deep Learning vs Multidimensional Classification in Human-Guided Text Mining
2015/07/04
Marat Zhanikeev
maratishe@gmail.com
GI Workshop (GI研) @ Tenjin IMS (天神イムズ)
PDF: http://bit.do/150704
Deep Learning vs MD Classifiers
• Deep Learning 08 10
◦ Feature-based: image → features → NN
◦ Raw/Pixels : image → raw pixels → NN
• Multi-Dimensional Classification 04 05
◦ assigning classes to items in multiple dimensions
• Human-Guided Text Mining 02
◦ Folksonomy + BigData
◦ learning from an empty state, with gradually diminishing human feedback
08 A.Nguyen+2 "Deep Neural Networks are Easily Fooled..." IEEE CVPR (2015)
10 G.Goos+2 "Neural Networks: Tricks of the Trade" Springer LNCS vol.7700, 2nd edition (2012)
04 X.Zhu+1 "Introduction to Semi-Supervised Learning" Morgan and Claypool Publishers (2009)
05 D.Koller+1 "Probabilistic Graphical Models: Principles and Techniques" MIT Press (2009)
02 myself+0 "Multidimensional Classification Automation with Human Interface based on Metromaps" 4th AAI (2015)
M.Zhanikeev -- maratishe@gmail.com -- Deep Learning vs Multidimensional Classification in Human-Guided Text Mining -- http://bit.do/150704 2/26
Deep Learning
Deep Learning (1) Feature-Based
• many feature extraction libraries, normally specific to environments/targets
• problem 1: error rates vary widely, from 50% up to 96%
• problem 2: who decides on the features?
Deep Learning (2) Raw Pixels
• just feed the raw pixels to the Neural Network and let it sort it out for itself
Deep Learning (3) Google Faces
• a feature-based method, extremely specific, recently acquired by Google
Deep Learning (4) Google Cats
Deep Learning (5) Raw/Pixel Method
• a standard process for pixel-based learning 12
• CSV files are the traditional format: one image becomes one line
[Figure: handwriting ("3") → black-and-white pixel map → matrix in a CSV file (rows of 0/1 values) → deep learning (training, testing) → "3"]
12 "MNIST Dataset of Handwritten Digits" http://yann.lecun.com/exdb/mnist/ (2015)
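The "one image becomes one line" step can be sketched as follows. This is a minimal illustration, not the exact preprocessing behind the slide: the 3x3 image, the threshold of 128, and the label-first column order are all assumptions.

```python
import csv
import io

def image_to_row(pixels, label, threshold=128):
    """Flatten a 2-D grayscale image into one row: label first, then 0/1 pixels."""
    row = [label]
    for line in pixels:
        for value in line:
            # Black-and-white conversion: 1 if the pixel is at or above the threshold.
            row.append(1 if value >= threshold else 0)
    return row

# A tiny hypothetical 3x3 grayscale image (0..255), labeled as the digit 3.
image = [
    [0, 255, 255],
    [0, 0, 255],
    [0, 255, 255],
]

# Write the row through the csv module; one image becomes one CSV line.
buffer = io.StringIO()
csv.writer(buffer).writerow(image_to_row(image, label=3))
csv_line = buffer.getvalue().strip()
print(csv_line)  # 3,0,1,1,0,0,1,0,1,1
```

A real dataset would repeat this for every image, so the CSV file becomes a matrix with one row per image and one column per pixel.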
Multi-Dimensional Classification
MDC : Binary Relevance (BR) Classes
• single dimension
• not practical today, when most things exist in a multi-dimensional space 06
Training tuples:
     x1   x2   Y1  Y2  Y3
1    0.7  0.4  1   1   0
2    0.6  0.2  1   1   0
3    0.1  0.9  0   0   1
4    0.3  0.1  0   0   0
h1: X → Y1
h2: X → Y2
h3: X → Y3
06 J.Ortigosa-Hernandez+3 "A Semi-supervised Approach to Multi-dimensional Classification..." 6th TAMIDA (2010)
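A minimal sketch of binary relevance on the four training tuples from this slide: one classifier per label, each trained independently of the others. The 1-nearest-neighbor base learner and the query points are assumptions chosen only to keep the example self-contained; any binary classifier could take its place.

```python
def nearest_neighbor(train_x, train_y):
    """Return a 1-NN classifier: predicts the label of the closest training point."""
    def predict(x):
        best = min(
            range(len(train_x)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
        )
        return train_y[best]
    return predict

# Training tuples from the slide: features (x1, x2) and labels (Y1, Y2, Y3).
X = [(0.7, 0.4), (0.6, 0.2), (0.1, 0.9), (0.3, 0.1)]
Y = [(1, 1, 0), (1, 1, 0), (0, 0, 1), (0, 0, 0)]

# Binary relevance: h1: X -> Y1, h2: X -> Y2, h3: X -> Y3, with no shared information.
classifiers = [nearest_neighbor(X, [row[j] for row in Y]) for j in range(3)]

def predict_br(x):
    """Query every per-label classifier and collect the answers."""
    return tuple(h(x) for h in classifiers)

print(predict_br((0.68, 0.38)))  # (1, 1, 0)
print(predict_br((0.15, 0.85)))  # (0, 0, 1)
```

Because each classifier sees only its own label, any dependency between Y1, Y2 and Y3 is ignored.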
MDC : PairWise (PW) Sets
• define classes as pairs of base BR classes 06
• lower complexity, higher error rate
Training tuples:
     x1   x2   Y1  Y2  Y3
1    0.7  0.4  1   1   0
2    0.6  0.2  1   1   0
3    0.1  0.9  0   0   1
4    0.3  0.1  0   0   0
h1: X → Z1
h2: X → Z2
Z1 Z2
1 0
0 1
0 0
0 0
06 J.Ortigosa-Hernandez+3 "A Semi-supervised Approach to Multi-dimensional Classification..." 6th TAMIDA (2010)
MDC : Label Combination (LC) Method
• a class for all combinations of base BR 06
• very high complexity, still high error rate
Training tuples:
     x1   x2   Y1  Y2  Y3
1    0.7  0.4  1   1   0
2    0.6  0.2  1   1   0
3    0.1  0.9  0   0   1
4    0.3  0.1  0   0   0
h: X → Z
Z
1
0
0
0
06 J.Ortigosa-Hernandez+3 "A Semi-supervised Approach to Multi-dimensional Classification..." 6th TAMIDA (2010)
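One common reading of label combination is the label powerset transformation: every distinct combination of (Y1, Y2, Y3) becomes one class of a single multi-class problem. The sketch below follows that reading. The class ids it assigns are arbitrary and are not meant to reproduce the Z column on the slide, and the 1-nearest-neighbor learner is an assumption.

```python
def nearest_neighbor(train_x, train_y):
    """Return a 1-NN classifier: predicts the label of the closest training point."""
    def predict(x):
        best = min(
            range(len(train_x)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
        )
        return train_y[best]
    return predict

# Training tuples from the slide: features (x1, x2) and labels (Y1, Y2, Y3).
X = [(0.7, 0.4), (0.6, 0.2), (0.1, 0.9), (0.3, 0.1)]
Y = [(1, 1, 0), (1, 1, 0), (0, 0, 1), (0, 0, 0)]

# Each distinct label combination seen in training gets one class id (arbitrary numbering).
combos = sorted(set(Y))
class_of = {combo: z for z, combo in enumerate(combos)}
Z = [class_of[y] for y in Y]

# A single classifier h: X -> Z replaces the three per-label classifiers.
h = nearest_neighbor(X, Z)

def predict_lc(x):
    """Predict the combined class, then decode it back into (Y1, Y2, Y3)."""
    return combos[h(x)]

print(Z)                         # [2, 2, 1, 0]
print(predict_lc((0.68, 0.38)))  # (1, 1, 0)
print(2 ** len(Y[0]))            # 8 possible combinations for 3 binary labels
```

The number of possible classes grows as 2^n with the number of binary labels, which is where the very high complexity comes from.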
MDC : The CC Method
• CC: Classifier Chains method 07 -- literally, a chain of BR classes
• controlled complexity and a much better error rate, but the main open problem is the order of the chain
Training tuples:
     x1   x2   Y1  Y2  Y3
1    0.7  0.4  1   1   0
2    0.6  0.2  1   1   0
3    0.1  0.9  0   0   1
4    0.3  0.1  0   0   0
h1: X → Y1
h2: Y1 → Y2
h3: Y2 → Y3
h1 → h2 → h3
07 J.Read+3 "Classifier chains for multi-label classification" Machine Learning, Springer (2011)
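A minimal sketch of a classifier chain on the same training tuples. The slide draws the links label-to-label; in the formulation by Read et al., each link receives the original features X plus the labels of all earlier links, which is what this sketch does. The 1-nearest-neighbor learner and the fixed order Y1, Y2, Y3 are assumptions.

```python
def nearest_neighbor(train_x, train_y):
    """Return a 1-NN classifier: predicts the label of the closest training point."""
    def predict(x):
        best = min(
            range(len(train_x)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
        )
        return train_y[best]
    return predict

# Training tuples from the slide: features (x1, x2) and labels (Y1, Y2, Y3).
X = [(0.7, 0.4), (0.6, 0.2), (0.1, 0.9), (0.3, 0.1)]
Y = [(1, 1, 0), (1, 1, 0), (0, 0, 1), (0, 0, 0)]

# Chain order Y1 -> Y2 -> Y3 (label indices); choosing this order is the open problem.
order = [0, 1, 2]

chain = []
for pos, j in enumerate(order):
    # Training input of each link: features plus the true labels of all earlier links.
    inputs = [x + tuple(y[k] for k in order[:pos]) for x, y in zip(X, Y)]
    chain.append(nearest_neighbor(inputs, [y[j] for y in Y]))

def predict_cc(x):
    """Run the chain, feeding each predicted label forward to the later links."""
    predicted = {}
    for pos, j in enumerate(order):
        features = tuple(x) + tuple(predicted[k] for k in order[:pos])
        predicted[j] = chain[pos](features)
    return tuple(predicted[j] for j in range(len(order)))

print(predict_cc((0.68, 0.38)))  # (1, 1, 0)
print(predict_cc((0.15, 0.85)))  # (0, 0, 1)
```

Changing `order` changes which label dependencies the chain can exploit, so different orders can give different error rates.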
The MetroMap Classifier (MMC)
The Metromap Concept
• like a map of a train network 01
• main advantage: end-to-end (e2e) paths in (ontology) graphs
01 myself+0 "On Context Management Using Metro Maps" 7th SOCA (2014)
MMC : A Practical Setting
[Figure labels: Human judgment, Auto judgment, Folksonomy]
MMC : Processing Logic
• processing is based on a human-defined metromap; the function is similar to chaining BR classes, but with higher performance
[Figure: flowchart with nodes Input, Metromap Classifier, Human Check, Metromap, Fuzzy?, Cold?, Hot?, Robot (Automatic Classification), Bad; branches labeled Yes/No]
MDC (MMC) vs DL(pixels)
DL : graphics vs Text
• graphics
◦ pixels are already numeric
◦ images can be resized to provide same-size input -- DL needs fixed-size input
• text requires complex processing
1. tokenize text (words)
2. frequency distribution -- variable size
3. sample distribution -- finally, the same/fixed size!
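The three text steps above can be sketched as follows. The slides do not say how the frequency distribution is sampled, so the strategy here (sort the counts in descending order, then read them at evenly spaced positions) is an assumption, as are the function name and the regular expression used for tokenizing.

```python
import re
from collections import Counter

def text_to_fixed_vector(text, size=8):
    """Turn a text of any length into a numeric vector of exactly `size` values (size >= 2)."""
    # 1. Tokenize the text into lowercase words.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    # 2. Frequency distribution, sorted in descending order; its length varies per text.
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if not freqs:
        return [0] * size
    # 3. Sample the distribution at `size` evenly spaced positions: fixed-size output.
    last = len(freqs) - 1
    return [freqs[round(i * last / (size - 1))] for i in range(size)]

long_text = "deep learning vs deep classifiers deep learning text"
short_text = "a a b"

print(text_to_fixed_vector(long_text, size=4))   # [3, 2, 1, 1]
print(text_to_fixed_vector(short_text, size=4))  # [2, 2, 1, 1]
```

Both texts produce vectors of the same length, which is the property a fixed-size network input needs.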
Experimental Setup (1) Humans
• 2 main cases: hot + cold = picked but not used, hot - cold = picked and used (black swans) 03
03 myself+0 "Black Swan Disaster Scenarios" IEICE PRMU Workshop (PRMU研) (2014)
Experimental Setup (2) Process
• text is not numeric by nature, so it has to be converted into a sampled frequency distribution
• calculations were done in R, using the h2o package 11 for deep learning
[Figure: two processing paths (Path 1, Path 2) starting from Text. Labels: Tokenize, Frequency Distribution, Sample, Matrix in a CSV file (rows of 0/1 values), Deep Learning, Bayes, Many (Chains, Metromap, etc.)]
11 "H2O: R Package for Learning Algorithms" http://cran.r-project.org/web/packages/h2o (2015)
Results (1) MMC vs BR
[Figure: "Hits on a timeline", three panels: title, title:keywords, title:keywords:abstract. x-axis: Time sequence (0 to 120); y-axis: Good count; series: Dumb Classifier, Metromap Classifier]
02 myself+0 "Multidimensional Classification Automation with Human Interface based on Metromaps" 4th AAI (2015)
Results (2) DL Results
[Figure: four panels. x-axis: Time sequence (0 to 100); y-axis: Deep learning hits (0 to 100); series: Diagonal/human, Deep learning. Panels: keys(title) rule(cold#yes hot#yes); keys(title:keywords:abstract) rule(cold#yes hot#yes); keys(title:keywords:abstract) rule(cold#no hot#yes); keys(title:keywords:abstract) rule(cold#yes hot#yes)]
• results are compared to the x = y case (the diagonal/human line)
• DL performs very badly
• it performs best when the abstract is used, and even then reaches only about 25% hits
• performance is the same for the hot + cold and hot - cold cases
That’s all, thank you ...
MDC and Social Robotics Go Together
Social Robotics in Text Mining Context
[Figure labels: Input; Robot (careless); Human; Human {structure} (pinpoint); Select; Browse (or use otherwise); Some Knowledge (folksonomies, knowledge bases, databases, indexes, ontologies, etc.) (metromaps)]
M.Zhanikeev -- maratishe@gmail.com -- Deep Learning vs Multidimensional Classification in Human-Guided Text Mining -- http://bit.do/150704 26/26
...
26/26

More Related Content

Similar to Deep Learning vs Multidimensional Classification in Human-Guided Text Mining

Metromaps as a Tool for Minimizing Human Interaction with Learning Bayesian C...
Metromaps as a Tool for Minimizing Human Interaction with Learning Bayesian C...Metromaps as a Tool for Minimizing Human Interaction with Learning Bayesian C...
Metromaps as a Tool for Minimizing Human Interaction with Learning Bayesian C...
Tokyo University of Science
 
Multidimentional Classification Automation with Human Interface based on Metr...
Multidimentional Classification Automation with Human Interface based on Metr...Multidimentional Classification Automation with Human Interface based on Metr...
Multidimentional Classification Automation with Human Interface based on Metr...
Tokyo University of Science
 
MetroMaps versus Facets: What Exactly is the Ontological Context?
MetroMaps versus Facets: What Exactly is the Ontological Context?MetroMaps versus Facets: What Exactly is the Ontological Context?
MetroMaps versus Facets: What Exactly is the Ontological Context?
Tokyo University of Science
 
HILDA 2023 Keynote Bill Howe
HILDA 2023 Keynote Bill HoweHILDA 2023 Keynote Bill Howe
HILDA 2023 Keynote Bill Howe
domoritz
 
On Context Management Using Metro Maps
On Context Management Using Metro MapsOn Context Management Using Metro Maps
On Context Management Using Metro Maps
Tokyo University of Science
 
Complexity Resolution Control for Context Based on Metromaps
Complexity Resolution Control for Context Based on MetromapsComplexity Resolution Control for Context Based on Metromaps
Complexity Resolution Control for Context Based on Metromaps
Tokyo University of Science
 
The Unbearable Lightness of Wiking
The Unbearable Lightness of Wiking The Unbearable Lightness of Wiking
The Unbearable Lightness of Wiking
Jie Bao
 
Crowdsourcing challenges and opportunities 2012
Crowdsourcing challenges and opportunities 2012Crowdsourcing challenges and opportunities 2012
Crowdsourcing challenges and opportunities 2012
xin wang
 
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
BigMine
 
The Rise of Crowd Computing (July 7, 2016)
The Rise of Crowd Computing (July 7, 2016)The Rise of Crowd Computing (July 7, 2016)
The Rise of Crowd Computing (July 7, 2016)
Matthew Lease
 
Top (10) challenging problems in data mining
Top (10) challenging problems  in data miningTop (10) challenging problems  in data mining
Top (10) challenging problems in data mining
Ahmedasbasb
 
A density based clustering approach for web robot detection
A density based clustering approach for web robot detectionA density based clustering approach for web robot detection
A density based clustering approach for web robot detection
Wright State University, Dayton, OH, USA
 
Complex Models for Big Data
Complex Models for Big DataComplex Models for Big Data
Complex Models for Big Data
Data Science Research Center
 
The LDBC Social Network Benchmark Interactive Workload - SIGMOD 2015
The LDBC Social Network Benchmark Interactive Workload - SIGMOD 2015The LDBC Social Network Benchmark Interactive Workload - SIGMOD 2015
The LDBC Social Network Benchmark Interactive Workload - SIGMOD 2015
Ioan Toma
 
lecture1.pptx
lecture1.pptxlecture1.pptx
lecture1.pptx
MrsKanimozhiKAIDS
 
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2ODeep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
Sri Ambati
 
Seattle Scalability Mahout
Seattle Scalability MahoutSeattle Scalability Mahout
Seattle Scalability Mahout
Jake Mannix
 
Classification of Big Data Use Cases by different Facets
Classification of Big Data Use Cases by different FacetsClassification of Big Data Use Cases by different Facets
Classification of Big Data Use Cases by different Facets
Geoffrey Fox
 
A Server-Assigned Crowdsourcing Framework
A Server-Assigned Crowdsourcing FrameworkA Server-Assigned Crowdsourcing Framework
A Server-Assigned Crowdsourcing Framework
University of Southern California
 
A Software Design and Algorithms for Multicore Capture in Data Center Forensics
A Software Design and Algorithms for Multicore Capture in Data Center ForensicsA Software Design and Algorithms for Multicore Capture in Data Center Forensics
A Software Design and Algorithms for Multicore Capture in Data Center Forensics
Tokyo University of Science
 

Similar to Deep Learning vs Multidimensional Classification in Human-Guided Text Mining (20)

Metromaps as a Tool for Minimizing Human Interaction with Learning Bayesian C...
Metromaps as a Tool for Minimizing Human Interaction with Learning Bayesian C...Metromaps as a Tool for Minimizing Human Interaction with Learning Bayesian C...
Metromaps as a Tool for Minimizing Human Interaction with Learning Bayesian C...
 
Multidimentional Classification Automation with Human Interface based on Metr...
Multidimentional Classification Automation with Human Interface based on Metr...Multidimentional Classification Automation with Human Interface based on Metr...
Multidimentional Classification Automation with Human Interface based on Metr...
 
MetroMaps versus Facets: What Exactly is the Ontological Context?
MetroMaps versus Facets: What Exactly is the Ontological Context?MetroMaps versus Facets: What Exactly is the Ontological Context?
MetroMaps versus Facets: What Exactly is the Ontological Context?
 
HILDA 2023 Keynote Bill Howe
HILDA 2023 Keynote Bill HoweHILDA 2023 Keynote Bill Howe
HILDA 2023 Keynote Bill Howe
 
On Context Management Using Metro Maps
On Context Management Using Metro MapsOn Context Management Using Metro Maps
On Context Management Using Metro Maps
 
Complexity Resolution Control for Context Based on Metromaps
Complexity Resolution Control for Context Based on MetromapsComplexity Resolution Control for Context Based on Metromaps
Complexity Resolution Control for Context Based on Metromaps
 
The Unbearable Lightness of Wiking
The Unbearable Lightness of Wiking The Unbearable Lightness of Wiking
The Unbearable Lightness of Wiking
 
Crowdsourcing challenges and opportunities 2012
Crowdsourcing challenges and opportunities 2012Crowdsourcing challenges and opportunities 2012
Crowdsourcing challenges and opportunities 2012
 
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
 
The Rise of Crowd Computing (July 7, 2016)
The Rise of Crowd Computing (July 7, 2016)The Rise of Crowd Computing (July 7, 2016)
The Rise of Crowd Computing (July 7, 2016)
 
Top (10) challenging problems in data mining
Top (10) challenging problems  in data miningTop (10) challenging problems  in data mining
Top (10) challenging problems in data mining
 
A density based clustering approach for web robot detection
A density based clustering approach for web robot detectionA density based clustering approach for web robot detection
A density based clustering approach for web robot detection
 
Complex Models for Big Data
Complex Models for Big DataComplex Models for Big Data
Complex Models for Big Data
 
The LDBC Social Network Benchmark Interactive Workload - SIGMOD 2015
The LDBC Social Network Benchmark Interactive Workload - SIGMOD 2015The LDBC Social Network Benchmark Interactive Workload - SIGMOD 2015
The LDBC Social Network Benchmark Interactive Workload - SIGMOD 2015
 
lecture1.pptx
lecture1.pptxlecture1.pptx
lecture1.pptx
 
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2ODeep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
 
Seattle Scalability Mahout
Seattle Scalability MahoutSeattle Scalability Mahout
Seattle Scalability Mahout
 
Classification of Big Data Use Cases by different Facets
Classification of Big Data Use Cases by different FacetsClassification of Big Data Use Cases by different Facets
Classification of Big Data Use Cases by different Facets
 
A Server-Assigned Crowdsourcing Framework
A Server-Assigned Crowdsourcing FrameworkA Server-Assigned Crowdsourcing Framework
A Server-Assigned Crowdsourcing Framework
 
A Software Design and Algorithms for Multicore Capture in Data Center Forensics
A Software Design and Algorithms for Multicore Capture in Data Center ForensicsA Software Design and Algorithms for Multicore Capture in Data Center Forensics
A Software Design and Algorithms for Multicore Capture in Data Center Forensics
 

More from Tokyo University of Science

A Method for Cloud-Assisted Secure Wireless Grouping of Client Devices at Net...
A Method for Cloud-Assisted Secure Wireless Grouping of Client Devices at Net...A Method for Cloud-Assisted Secure Wireless Grouping of Client Devices at Net...
A Method for Cloud-Assisted Secure Wireless Grouping of Client Devices at Net...
Tokyo University of Science
 
Ultrasound Relative Positioning for IoT Devices in Dense Wireless Spaces
Ultrasound Relative Positioning for IoT Devices in Dense Wireless SpacesUltrasound Relative Positioning for IoT Devices in Dense Wireless Spaces
Ultrasound Relative Positioning for IoT Devices in Dense Wireless Spaces
Tokyo University of Science
 
Towards a Packet Traffic Genome Project as a Method for Realtime Sub-Flow Tra...
Towards a Packet Traffic Genome Project as a Method for Realtime Sub-Flow Tra...Towards a Packet Traffic Genome Project as a Method for Realtime Sub-Flow Tra...
Towards a Packet Traffic Genome Project as a Method for Realtime Sub-Flow Tra...
Tokyo University of Science
 
What if We Atomize Student Data and Apps and Put Them on Docker Containers?
What if We Atomize Student Data and Apps and Put Them on Docker Containers?What if We Atomize Student Data and Apps and Put Them on Docker Containers?
What if We Atomize Student Data and Apps and Put Them on Docker Containers?
Tokyo University of Science
 
Large-Scale Crowdsourcing by Vehicular Data Packets in a Sparse Roadside Infr...
Large-Scale Crowdsourcing by Vehicular Data Packets in a Sparse Roadside Infr...Large-Scale Crowdsourcing by Vehicular Data Packets in a Sparse Roadside Infr...
Large-Scale Crowdsourcing by Vehicular Data Packets in a Sparse Roadside Infr...
Tokyo University of Science
 
On Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms
On Performance Under Hotspots in Hadoop versus Bigdata Replay PlatformsOn Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms
On Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms
Tokyo University of Science
 
Taking the Step from Software to Product Development \\ when teaching PBL at ...
Taking the Step from Software to Product Development \\ when teaching PBL at ...Taking the Step from Software to Product Development \\ when teaching PBL at ...
Taking the Step from Software to Product Development \\ when teaching PBL at ...
Tokyo University of Science
 
Design and Implementation of a 3-Party Cloud-Backed Handshake for Secure Grou...
Design and Implementation of a 3-Party Cloud-Backed Handshake for Secure Grou...Design and Implementation of a 3-Party Cloud-Backed Handshake for Secure Grou...
Design and Implementation of a 3-Party Cloud-Backed Handshake for Secure Grou...
Tokyo University of Science
 
The Switchboard Optimization Problem and Heuristics for Cut-Through Networking
The Switchboard Optimization Problem and Heuristics for Cut-Through NetworkingThe Switchboard Optimization Problem and Heuristics for Cut-Through Networking
The Switchboard Optimization Problem and Heuristics for Cut-Through Networking
Tokyo University of Science
 
The Switchboard Traffic Engineering Problem for Mixed Contention/Cut-Through ...
The Switchboard Traffic Engineering Problem for Mixed Contention/Cut-Through ...The Switchboard Traffic Engineering Problem for Mixed Contention/Cut-Through ...
The Switchboard Traffic Engineering Problem for Mixed Contention/Cut-Through ...
Tokyo University of Science
 
Bulk-n-Pick Method for One-to-Many Data Transfer in Dense Wireless Spaces
Bulk-n-Pick Method for One-to-Many Data Transfer in Dense Wireless SpacesBulk-n-Pick Method for One-to-Many Data Transfer in Dense Wireless Spaces
Bulk-n-Pick Method for One-to-Many Data Transfer in Dense Wireless Spaces
Tokyo University of Science
 
Fog Cloud Caching at Network Edge via Local Hardware Awareness Spaces
Fog Cloud Caching at Network Edge via Local Hardware Awareness SpacesFog Cloud Caching at Network Edge via Local Hardware Awareness Spaces
Fog Cloud Caching at Network Edge via Local Hardware Awareness Spaces
Tokyo University of Science
 
On a Hybrid Packets-and-Circuits Switching Logic
On a Hybrid Packets-and-Circuits Switching LogicOn a Hybrid Packets-and-Circuits Switching Logic
On a Hybrid Packets-and-Circuits Switching Logic
Tokyo University of Science
 
Image-Related Uses for Roadside Infrastructure \\ based on Wireless Beacons
Image-Related Uses for Roadside Infrastructure \\ based on Wireless BeaconsImage-Related Uses for Roadside Infrastructure \\ based on Wireless Beacons
Image-Related Uses for Roadside Infrastructure \\ based on Wireless Beacons
Tokyo University of Science
 
The Declarative-Coordinated Model for Self-Optimization of Service Networks
The Declarative-Coordinated Model for Self-Optimization of Service NetworksThe Declarative-Coordinated Model for Self-Optimization of Service Networks
The Declarative-Coordinated Model for Self-Optimization of Service Networks
Tokyo University of Science
 
3-Way Scripts as a Practical Platform for Secure Distributed Code in Clouds
3-Way Scripts as a Practical Platform for Secure Distributed Code in Clouds3-Way Scripts as a Practical Platform for Secure Distributed Code in Clouds
3-Way Scripts as a Practical Platform for Secure Distributed Code in Clouds
Tokyo University of Science
 
3-Way Scripts as a Base Unit for Flexible Scale-Out Code
3-Way Scripts as a Base Unit for Flexible Scale-Out Code3-Way Scripts as a Base Unit for Flexible Scale-Out Code
3-Way Scripts as a Base Unit for Flexible Scale-Out Code
Tokyo University of Science
 
Towards Social Robotics on Smartphones with Simple XYZV Sensor Feedback
Towards Social Robotics on Smartphones with Simple XYZV Sensor FeedbackTowards Social Robotics on Smartphones with Simple XYZV Sensor Feedback
Towards Social Robotics on Smartphones with Simple XYZV Sensor Feedback
Tokyo University of Science
 
Back to Rings but not Tokens: Physical and Logical Designs for Distributed Fi...
Back to Rings but not Tokens: Physical and Logical Designs for Distributed Fi...Back to Rings but not Tokens: Physical and Logical Designs for Distributed Fi...
Back to Rings but not Tokens: Physical and Logical Designs for Distributed Fi...
Tokyo University of Science
 
Browser Visualization using PNGs Generated by HTML5 Workers on Multicore
Browser Visualization using PNGs Generated by HTML5 Workers on MulticoreBrowser Visualization using PNGs Generated by HTML5 Workers on Multicore
Browser Visualization using PNGs Generated by HTML5 Workers on Multicore
Tokyo University of Science
 

More from Tokyo University of Science (20)

A Method for Cloud-Assisted Secure Wireless Grouping of Client Devices at Net...
A Method for Cloud-Assisted Secure Wireless Grouping of Client Devices at Net...A Method for Cloud-Assisted Secure Wireless Grouping of Client Devices at Net...
A Method for Cloud-Assisted Secure Wireless Grouping of Client Devices at Net...
 
Ultrasound Relative Positioning for IoT Devices in Dense Wireless Spaces
Ultrasound Relative Positioning for IoT Devices in Dense Wireless SpacesUltrasound Relative Positioning for IoT Devices in Dense Wireless Spaces
Ultrasound Relative Positioning for IoT Devices in Dense Wireless Spaces
 
Towards a Packet Traffic Genome Project as a Method for Realtime Sub-Flow Tra...
Towards a Packet Traffic Genome Project as a Method for Realtime Sub-Flow Tra...Towards a Packet Traffic Genome Project as a Method for Realtime Sub-Flow Tra...
Towards a Packet Traffic Genome Project as a Method for Realtime Sub-Flow Tra...
 
What if We Atomize Student Data and Apps and Put Them on Docker Containers?
What if We Atomize Student Data and Apps and Put Them on Docker Containers?What if We Atomize Student Data and Apps and Put Them on Docker Containers?
What if We Atomize Student Data and Apps and Put Them on Docker Containers?
 
Large-Scale Crowdsourcing by Vehicular Data Packets in a Sparse Roadside Infr...
Large-Scale Crowdsourcing by Vehicular Data Packets in a Sparse Roadside Infr...Large-Scale Crowdsourcing by Vehicular Data Packets in a Sparse Roadside Infr...
Large-Scale Crowdsourcing by Vehicular Data Packets in a Sparse Roadside Infr...
 
On Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms
On Performance Under Hotspots in Hadoop versus Bigdata Replay PlatformsOn Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms
On Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms
 
Taking the Step from Software to Product Development \\ when teaching PBL at ...
Taking the Step from Software to Product Development \\ when teaching PBL at ...Taking the Step from Software to Product Development \\ when teaching PBL at ...
Taking the Step from Software to Product Development \\ when teaching PBL at ...
 
Design and Implementation of a 3-Party Cloud-Backed Handshake for Secure Grou...
Design and Implementation of a 3-Party Cloud-Backed Handshake for Secure Grou...Design and Implementation of a 3-Party Cloud-Backed Handshake for Secure Grou...
Design and Implementation of a 3-Party Cloud-Backed Handshake for Secure Grou...
 
The Switchboard Optimization Problem and Heuristics for Cut-Through Networking
The Switchboard Optimization Problem and Heuristics for Cut-Through NetworkingThe Switchboard Optimization Problem and Heuristics for Cut-Through Networking
The Switchboard Optimization Problem and Heuristics for Cut-Through Networking
 
The Switchboard Traffic Engineering Problem for Mixed Contention/Cut-Through ...
The Switchboard Traffic Engineering Problem for Mixed Contention/Cut-Through ...The Switchboard Traffic Engineering Problem for Mixed Contention/Cut-Through ...
The Switchboard Traffic Engineering Problem for Mixed Contention/Cut-Through ...
 
Bulk-n-Pick Method for One-to-Many Data Transfer in Dense Wireless Spaces
Bulk-n-Pick Method for One-to-Many Data Transfer in Dense Wireless SpacesBulk-n-Pick Method for One-to-Many Data Transfer in Dense Wireless Spaces
Bulk-n-Pick Method for One-to-Many Data Transfer in Dense Wireless Spaces
 
Fog Cloud Caching at Network Edge via Local Hardware Awareness Spaces
Fog Cloud Caching at Network Edge via Local Hardware Awareness SpacesFog Cloud Caching at Network Edge via Local Hardware Awareness Spaces
Fog Cloud Caching at Network Edge via Local Hardware Awareness Spaces
 
On a Hybrid Packets-and-Circuits Switching Logic
On a Hybrid Packets-and-Circuits Switching LogicOn a Hybrid Packets-and-Circuits Switching Logic
On a Hybrid Packets-and-Circuits Switching Logic
 
Image-Related Uses for Roadside Infrastructure \\ based on Wireless Beacons
Image-Related Uses for Roadside Infrastructure \\ based on Wireless BeaconsImage-Related Uses for Roadside Infrastructure \\ based on Wireless Beacons
Image-Related Uses for Roadside Infrastructure \\ based on Wireless Beacons
 
The Declarative-Coordinated Model for Self-Optimization of Service Networks
The Declarative-Coordinated Model for Self-Optimization of Service NetworksThe Declarative-Coordinated Model for Self-Optimization of Service Networks
The Declarative-Coordinated Model for Self-Optimization of Service Networks
 
3-Way Scripts as a Practical Platform for Secure Distributed Code in Clouds
3-Way Scripts as a Practical Platform for Secure Distributed Code in Clouds3-Way Scripts as a Practical Platform for Secure Distributed Code in Clouds
3-Way Scripts as a Practical Platform for Secure Distributed Code in Clouds
 
3-Way Scripts as a Base Unit for Flexible Scale-Out Code
3-Way Scripts as a Base Unit for Flexible Scale-Out Code3-Way Scripts as a Base Unit for Flexible Scale-Out Code
3-Way Scripts as a Base Unit for Flexible Scale-Out Code
 
Towards Social Robotics on Smartphones with Simple XYZV Sensor Feedback
Towards Social Robotics on Smartphones with Simple XYZV Sensor FeedbackTowards Social Robotics on Smartphones with Simple XYZV Sensor Feedback
Towards Social Robotics on Smartphones with Simple XYZV Sensor Feedback
 
Back to Rings but not Tokens: Physical and Logical Designs for Distributed Fi...
Back to Rings but not Tokens: Physical and Logical Designs for Distributed Fi...Back to Rings but not Tokens: Physical and Logical Designs for Distributed Fi...
Back to Rings but not Tokens: Physical and Logical Designs for Distributed Fi...
 
Browser Visualization using PNGs Generated by HTML5 Workers on Multicore
Browser Visualization using PNGs Generated by HTML5 Workers on MulticoreBrowser Visualization using PNGs Generated by HTML5 Workers on Multicore
Browser Visualization using PNGs Generated by HTML5 Workers on Multicore
 



Deep Learning vs Multidimensional Classification in Human-Guided Text Mining

  • 1. Deep Learning vs Multidimensional Classification in Human-Guided Text Mining
       2015/07/04
       Marat Zhanikeev, maratishe@gmail.com
       GI研@天神イムズ (Tenjin IMS)
       PDF: http://bit.do/150704
  • 2. Deep Learning vs MD Classifiers
       • Deep Learning 08 10
         ◦ Feature-based: image → features → NN
         ◦ Raw/Pixels: image → raw pixels → NN
       • Multi-Dimensional Classification 04 05
         ◦ assigning classes to items in multiple dimensions
       • Human-Guided Text Mining 02
         ◦ Folksonomy + BigData
         ◦ learning from an empty state with gradually diminishing human feedback
       08 A.Nguyen+2 "Deep Neural Networks are Easily Fooled..." IEEE CVPR (2015)
       10 G.Goos+2 "Neural Networks: Tricks of the Trade" Springer LNCS vol.7700, 2nd edition (2012)
       04 X.Zhu+1 "Introduction to Semi-Supervised Learning" Morgan and Claypool Publishers (2009)
       05 D.Koller+1 "Probabilistic Graphical Models: Principles and Techniques" MIT Press (2009)
       02 myself+0 "Multidimensional Classification Automation with Human Interface based on Metromaps" 4th AAI (2015)
  • 3. Deep Learning
  • 4. Deep Learning (1) Feature-Based
       • many feature extraction libraries, normally specific to environments/targets
       • problem 1: wide range of errors, can be from 50% up to 96%
       • problem 2: who decides on the features?
  • 5. Deep Learning (2) Raw Pixels
       • just feed the raw pixels to the Neural Network and let it sort it out for itself
  • 6. Deep Learning (3) Google Faces
       • a feature-based method, extremely specific, recently acquired by Google
  • 7. Deep Learning (4) Google Cats
  • 8. Deep Learning (5) Raw/Pixel Method
       • a standard process for pixel-based learning 12
       • CSV files are traditional, one image becomes one line
       [Figure: Handwriting, Black-and-white pixel map, Matrix in a CSV file, Deep Learning (Training, Testing)]
       12 "MNIST Dataset of Handwritten Digits" http://yann.lecun.com/exdb/mnist/ (2015)
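The "one image becomes one line" step on slide 8 can be sketched as follows. This is a minimal illustration, not the author's code: the function name `image_to_row`, the threshold of 128, the tiny 3x3 image, and the choice to put the label in the first column are all assumptions made for the example.

```python
import csv
import io

def image_to_row(pixels, label, threshold=128):
    """Flatten a 2-D grayscale image into one CSV row: label first, then 0/1 pixels.

    The label-first layout and the threshold are assumptions for this sketch.
    """
    row = [label]
    for line in pixels:
        # Binarize each pixel: 1 for "ink", 0 for background.
        row.extend(1 if value >= threshold else 0 for value in line)
    return row

# A hypothetical 3x3 grayscale image (values 0..255), labeled as the digit 3.
image = [
    [0, 200, 255],
    [10, 180, 0],
    [0, 255, 30],
]

buffer = io.StringIO()
csv.writer(buffer).writerow(image_to_row(image, label=3))
csv_line = buffer.getvalue().strip()
print(csv_line)  # 3,0,1,1,0,1,0,0,1,0
```

Because every image is resized to the same dimensions before flattening, every row has the same number of columns, which is what gives the network its fixed-size input.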
  • 9. Multi-Dimensional Classification
  • 10. MDC : Binary Relevance (BR) Classes
       • single dimension
       • not practical today, when most things exist in multi-dimensional space 06
       Training Tuples
         #   x1    x2    Y1  Y2  Y3
         1   0.7   0.4   1   1   0
         2   0.6   0.2   1   1   0
         3   0.1   0.9   0   0   1
         4   0.3   0.1   0   0   0
       h1: X → Y1, h2: X → Y2, h3: X → Y3
       06 J.Ortigosa-Hernandez+3 "A Semi-supervised Approach to Multi-dimensional Classification..." 6th TAMIDA (2010)
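As a rough illustration of the binary relevance mapping on slide 10 (h1: X → Y1, h2: X → Y2, h3: X → Y3), the sketch below trains one independent classifier per label on the slide's four training tuples. The 1-nearest-neighbor base classifier and the query points are assumptions chosen to keep the example dependency-free; the slides do not say which base classifier was used.

```python
def nearest_neighbor(train_x, train_y, x):
    """Return the label of the training point closest to x (squared Euclidean distance)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    return train_y[best]

# Training tuples from the slide: features (x1, x2) and labels (Y1, Y2, Y3).
X = [(0.7, 0.4), (0.6, 0.2), (0.1, 0.9), (0.3, 0.1)]
Y = [(1, 1, 0), (1, 1, 0), (0, 0, 1), (0, 0, 0)]

def binary_relevance_predict(x):
    """Predict every label independently of the others: h_j: X -> Y_j."""
    return tuple(
        nearest_neighbor(X, [labels[j] for labels in Y], x)
        for j in range(len(Y[0]))
    )

print(binary_relevance_predict((0.7, 0.35)))  # (1, 1, 0)
print(binary_relevance_predict((0.2, 0.8)))   # (0, 0, 1)
```

Each label gets its own classifier and none of them sees the other labels, which is why relations between dimensions are lost.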
  • 11. MDC : PairWise (PW) Sets
       • define classes as pairs of base BR classes 06
       • lower complexity, higher error rate
       Training Tuples
         #   x1    x2    Y1  Y2  Y3
         1   0.7   0.4   1   1   0
         2   0.6   0.2   1   1   0
             0.1   0.9   0   0   1
             0.3   0.1   0   0   0
       h1: X → Z1, h2: X → Z2
       Z1 Z2: 1 0 0 1 0 0 0 0
       06 J.Ortigosa-Hernandez+3 "A Semi-supervised Approach to Multi-dimensional Classification..." 6th TAMIDA (2010)
  • 12. MDC : Label Combination (LC) Method
       • a class for all combinations of base BR 06
       • very high complexity, still high error rate
       Training Tuples
         #   x1    x2    Y1  Y2  Y3
         1   0.7   0.4   1   1   0
         2   0.6   0.2   1   1   0
         3   0.1   0.9   0   0   1
         4   0.3   0.1   0   0   0
       h: X → Z
       Z: 1 0 0 0
       06 J.Ortigosa-Hernandez+3 "A Semi-supervised Approach to Multi-dimensional Classification..." 6th TAMIDA (2010)
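The label combination idea on slide 12 (a single mapping h: X → Z, where Z is one class per combination of the base labels) can be sketched as below. The class numbering, the 1-nearest-neighbor base classifier, and the query points are assumptions for illustration; they are not taken from the slides.

```python
def nearest_neighbor(train_x, train_y, x):
    """Return the label of the training point closest to x (squared Euclidean distance)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    return train_y[best]

# Training tuples from the slide: features (x1, x2) and labels (Y1, Y2, Y3).
X = [(0.7, 0.4), (0.6, 0.2), (0.1, 0.9), (0.3, 0.1)]
Y = [(1, 1, 0), (1, 1, 0), (0, 0, 1), (0, 0, 0)]

# Map every distinct label vector seen in training to one combined class id.
combos = sorted(set(Y))
to_class = {labels: z for z, labels in enumerate(combos)}
Z = [to_class[labels] for labels in Y]

def label_combination_predict(x):
    """Single classifier h: X -> Z, then decode Z back into (Y1, Y2, Y3)."""
    return combos[nearest_neighbor(X, Z, x)]

print(Z)                                        # [2, 2, 1, 0]
print(label_combination_predict((0.7, 0.35)))   # (1, 1, 0)
print(2 ** len(Y[0]))                           # 8 possible combinations for 3 binary labels
```

The last line shows where the complexity comes from: the number of possible combined classes doubles with every additional binary label.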
  • 13. MDC : The CC Method
       • CC: Classifier Chains method 07 -- literally, a chain of BR classes
       • controlled complexity and a much better error rate, but the main problem is choosing the order of the chain
       Training Tuples
         #   x1    x2    Y1  Y2  Y3
         1   0.7   0.4   1   1   0
         2   0.6   0.2   1   1   0
         3   0.1   0.9   0   0   1
         4   0.3   0.1   0   0   0
       h1: X → Y1, h2: Y1 → Y2, h3: Y2 → Y3
       [Figure: chain of h1, h2, h3]
       07 J.Read+3 "Classifier chains for multi-label classification" Machine Learning, Springer (2011)
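A classifier chain can be sketched on the same four training tuples. Note one deliberate difference from the slide's shorthand (h2: Y1 → Y2): in this sketch each link sees the original features plus the labels predicted earlier in the chain, which is how classifier chains are usually formulated. The 1-nearest-neighbor base classifier, the `order` parameter, and the query points are assumptions for illustration.

```python
def nearest_neighbor(train_x, train_y, x):
    """Return the label of the training point closest to x (squared Euclidean distance)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    return train_y[best]

# Training tuples from the slide: features (x1, x2) and labels (Y1, Y2, Y3).
X = [(0.7, 0.4), (0.6, 0.2), (0.1, 0.9), (0.3, 0.1)]
Y = [(1, 1, 0), (1, 1, 0), (0, 0, 1), (0, 0, 0)]

def chain_predict(x, order=(0, 1, 2)):
    """Predict labels one at a time; each link sees x plus the labels predicted so far."""
    predicted = {}
    for position, j in enumerate(order):
        earlier = order[:position]
        # Training inputs are extended with the true values of the earlier labels.
        train_x = [tuple(xi) + tuple(yi[k] for k in earlier) for xi, yi in zip(X, Y)]
        train_y = [yi[j] for yi in Y]
        # The query is extended with the values predicted by the earlier links.
        query = tuple(x) + tuple(predicted[k] for k in earlier)
        predicted[j] = nearest_neighbor(train_x, train_y, query)
    return tuple(predicted[j] for j in range(len(order)))

print(chain_predict((0.7, 0.35)))                   # (1, 1, 0)
print(chain_predict((0.2, 0.8)))                    # (0, 0, 1)
print(chain_predict((0.7, 0.35), order=(2, 1, 0)))  # (1, 1, 0)
```

On this tiny data set both orders agree, but in general a different `order` can produce different predictions, which is the ordering problem the slide raises.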
  • 14. The MetroMap Classifier (MMC)
  • 15. The Metromap Concept
       • like a map of a train network 01
       • main advantage: end-to-end paths in (ontology) graphs
       01 myself+0 "On Context Management Using Metro Maps" 7th SOCA (2014)
  • 16. MMC : A Practical Setting
       [Figure: Human judgment, Auto judgement, Folksonomy]
  • 17. MMC : Processing Logic
       • processing is based on a human-defined metromap; the function is similar to chaining BR classes, but with higher performance
       [Figure: flowchart with nodes Metromap Classifier, Human Check, Metromap, Fuzzy?, Cold?, Hot?, Robot (Automatic Classification), Bad Input; branch labels No, Yes, No, No]
  • 18. MDC (MMC) vs DL (pixels)
  • 19. DL : Graphics vs Text
       • graphics
         ◦ pixels are already numeric
         ◦ images can be resized to provide same-size input -- DL needs fixed-size input
       • text requires complex processing
         1. tokenize text (words)
         2. frequency distribution -- variable size
         3. sample distribution -- finally, the same/fixed size!
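The three text-processing steps on slide 19 can be sketched as follows. The slides do not specify how the frequency distribution is sampled, so the evenly spaced sampling in `sample_fixed`, the tokenizer regex, and the example sentence are all assumptions for illustration only.

```python
import re
from collections import Counter

def tokenize(text):
    """Step 1: split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def frequency_distribution(tokens):
    """Step 2: word counts sorted from most to least frequent (variable length)."""
    return sorted(Counter(tokens).values(), reverse=True)

def sample_fixed(values, size):
    """Step 3: pick `size` evenly spaced points from the distribution (fixed length).

    Evenly spaced sampling is an assumption; empty input yields all zeros.
    """
    if not values:
        return [0] * size
    last = len(values) - 1
    steps = max(size - 1, 1)
    return [values[i * last // steps] for i in range(size)]

text = "deep learning needs fixed size input and text is not fixed size"
distribution = frequency_distribution(tokenize(text))
vector = sample_fixed(distribution, 4)
print(distribution)  # [2, 2, 1, 1, 1, 1, 1, 1, 1, 1]
print(vector)        # [2, 1, 1, 1]
```

Whatever the length of the input text, the resulting vector always has `size` entries, so it can be written as one CSV row in the same way as a flattened image.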
  • 20. Experimental Setup (1) Humans
       • 2 main cases: hot + cold = picked but not used, hot - cold = picked and used (blackswans) 03
       03 myself+0 "Black Swan Disaster Scenarios" IEICE PRMU研 (2014)
  • 21. Experimental Setup (2) Process
       • text is not numeric by nature and has to be converted into a sampled frequency distribution
       • calculations in R, using the h2o package 11 for deep learning
       [Figure: process diagram with labels Text, Tokenize, Frequency Distribution, Sample, Matrix in a CSV file, Deep Learning, Bayes, Many (Chains, Metromap, etc.), Path 1, Path 2]
       11 "H2O: R Package for Learning Algorithms" http://cran.r-project.org/web/packages/h2o (2015)
  • 22. Results (1) MMC vs BR
       [Figure: "Hits on a timeline"; three panels (title, title:keywords, title:keywords:abstract); x-axis: Time sequence; y-axis: Good count; series: Dumb Classifier, Metromap Classifier]
       02 myself+0 "Multidimensional Classification Automation with Human Interface based on Metromaps" 4th AAI (2015)
  • 23. Results (2) DL Results
       [Figure: four panels; x-axis: Time sequence; y-axis: Deep learning hits; series: Diagonal/human, Deep learning; panel titles: keys(title) rule(cold#yes hot#yes); keys(title:keywords:abstract) rule(cold#yes hot#yes); keys(title:keywords:abstract) rule(cold#no hot#yes); keys(title:keywords:abstract) rule(cold#yes hot#yes)]
       • compared to the x = y case
       • DL performs very badly
       • performs best when the abstract is used, and even then reaches only about 25% hits
       • same performance for the hot + cold and hot - cold cases
  • 24. That’s all, thank you ...
  • 25. MDC and Social Robotics Go Together
  • 26. Social Robotics in Text Mining Context
       [Figure: Robot (careless), Input, Human, Human {structure} (pinpoint), Select, Browse (or use otherwise), Some Knowledge (folksonomies, knowledge bases, databases, indexes, ontologies, etc.) (metromaps)]