Echo State Hoeffding Tree Learning
Diego Marrón (dmarron@ac.upc.edu)
Jesse Read (jesse.read@telecom-paristech.fr)
Albert Bifet (albert.bifet@telecom-paristech.fr)
Talel Abdessalem (talel.abdessalem@telecom-paristech.fr)
Eduard Ayguadé (eduard.ayguade@bsc.es)
José R. Herrero (josepr@ac.upc.edu)
ACML 2016
Hamilton, New Zealand
Introduction ESHT Evaluations Conclusions
Introduction
• Real-time classification of Big Data streams is becoming
essential in a variety of application domains.
• Real-time classification imposes some challenges:
  • Dealing with potentially infinite streams
  • Strong temporal dependences
  • Reacting to changes in the stream
  • Bounded response time and memory
Real Time Classification
• In real-time classification:
  • The Hoeffding Tree (HT) is the state-of-the-art streaming decision tree
  • HTs are powerful and easy to deploy (no hyper-parameters to tune)
  • But they are unable to capture strong temporal dependences
• Recurrent Neural Networks (RNNs) are very popular nowadays
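
As background on the Hoeffding Tree mentioned above, the sketch below shows the Hoeffding bound that such trees commonly use to decide when enough samples have been seen to split a leaf. It is an illustration only, not the authors' implementation: the function names `hoeffding_bound` and `can_split` are hypothetical, and the confidence parameter `delta` is assumed to be left at a library default in practice.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability at least 1 - delta, the
    true mean of a variable with the given range lies within epsilon of
    the mean observed over n independent samples."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))


def can_split(best_gain: float, second_gain: float,
              value_range: float, delta: float, n: int) -> bool:
    """Hypothetical split test: split once the gap between the two best
    candidate attributes exceeds the Hoeffding bound."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # The bound shrinks as more samples arrive, so decisions become safer.
    for n in (100, 1000, 10000):
        print(n, round(hoeffding_bound(1.0, 1e-7, n), 4))
```

Because the bound depends only on the number of samples seen, each sample can be processed once and then discarded, which fits the bounded-memory requirement listed in the introduction.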
Recurrent Neural Networks
• Recurrent Neural Networks (RNNs) are the state of the art in
handwriting recognition, speech recognition and natural language
processing, among others
• They are able to capture time dependences
• But their use for data streams is not straightforward:
  • Very sensitive to hyper-parameter configuration
  • Training requires many iterations over the data...
  • ...and a large amount of time
RNN: Echo State Network
• A type of Recurrent Neural Network
• Echo State Layer (ESL):
  • Dynamics driven only by the input
  • Requires very few computations
  • Easy-to-understand hyper-parameters
  • Can capture time dependences
• The ESN also requires the hyper-parameters needed by the NN
• Gradient descent methods converge slowly
Contribution
• Objective:
  • Model the evolution of the stream over time
  • Reduce the number of hyper-parameters
  • Reduce the number of samples needed to learn
• In this work we present the ESHT:
  • A combination of HT + ESL
  • Learns temporal dependences in data streams in real time
  • Requires fewer hyper-parameters than the ESN
ESHT
• Echo State Layer (ESL):
  • Only needs two hyper-parameters:
    • Alpha (α): weights the importance of events already in X(n) relative to new ones
    • Density: Wres is a sparse matrix with the given density
  • Encodes time dependences
• FIMT-DD: Hoeffding tree for regression
  • Works out of the box: no hyper-parameter tuning
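
To make the roles of α and density concrete, here is a minimal sketch of an echo state layer. It assumes the standard leaky-integrator update X(n) = (1 − α)·X(n−1) + α·tanh(Win·u(n) + Wres·X(n−1)); the slides do not state the exact formula, so this form, the class name `EchoStateLayer`, the `Win` matrix and the uniform weight initialisation are all assumptions. Reservoir scaling (for example by spectral radius) is omitted, and the FIMT-DD regressor that would consume the state is not shown.

```python
import math
import random


class EchoStateLayer:
    """Illustrative echo state layer (assumed leaky-integrator form)."""

    def __init__(self, n_inputs, n_neurons, alpha, density, seed=0):
        rng = random.Random(seed)
        self.alpha = alpha
        # Dense input weights (hypothetical initialisation in [-1, 1]).
        self.w_in = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
                     for _ in range(n_neurons)]
        # Sparse reservoir: each weight is non-zero with probability `density`.
        self.w_res = [[rng.uniform(-1.0, 1.0) if rng.random() < density else 0.0
                       for _ in range(n_neurons)]
                      for _ in range(n_neurons)]
        self.state = [0.0] * n_neurons

    def update(self, u):
        """Advance the state X(n) by one input vector u(n) and return it."""
        new_state = []
        for i, old in enumerate(self.state):
            pre = sum(w * x for w, x in zip(self.w_in[i], u))
            pre += sum(w * x for w, x in zip(self.w_res[i], self.state))
            new_state.append((1.0 - self.alpha) * old
                             + self.alpha * math.tanh(pre))
        self.state = new_state
        return self.state


if __name__ == "__main__":
    layer = EchoStateLayer(n_inputs=2, n_neurons=10, alpha=1.0, density=0.4)
    for u in ([0.5, 0.0], [0.0, 0.5], [0.5, 0.0]):
        features = layer.update(u)  # would be passed to the regression tree
    print(len(features))
```

In this sketch the weights are never trained; only the model reading the state learns, which is why the layer itself needs few computations per sample.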
ESHT: Evaluation Methodology
• We propose the ESHT to learn character-stream functions:
  • Counter (skipped in this presentation)
  • lastIndexOf
  • emailFilter
• lastIndexOf evaluation:
  • Study the effects of the hyper-parameters α and density
    • Alpha (α): weights the importance of events already in X(n) relative to new ones
    • Density: Wres is a sparse matrix with the given density
  • Use 1,000 neurons in the ESL
• emailFilter evaluation:
  • We focus on the speed of learning
  • Use outcomes from the previous evaluations to configure the
    ESHT for this task
• Metrics:
  • Cumulative loss
  • We count an error if |y_t − ŷ| ≥ 0.5
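
The two metrics can be sketched as follows. The error rule (|y_t − ŷ| ≥ 0.5) comes from the slide; the definition of cumulative loss as a running sum of absolute errors is an assumption, since the slides do not define it, and the function name `evaluate_stream` is hypothetical.

```python
def evaluate_stream(targets, predictions, threshold=0.5):
    """Return (loss_curve, errors, accuracy_percent) for a prediction stream.

    loss_curve[i] is the cumulative loss after sample i, assumed here to be
    the running sum of absolute errors. A prediction counts as an error
    when its absolute difference from the target is >= threshold.
    """
    cumulative_loss = 0.0
    errors = 0
    loss_curve = []
    for y, y_hat in zip(targets, predictions):
        diff = abs(y - y_hat)
        cumulative_loss += diff
        if diff >= threshold:
            errors += 1
        loss_curve.append(cumulative_loss)
    n = len(loss_curve)
    accuracy = 100.0 * (n - errors) / n if n else 0.0
    return loss_curve, errors, accuracy


if __name__ == "__main__":
    curve, errors, accuracy = evaluate_stream([1, 2, 3, 4], [1.2, 2.5, 2.9, 6.0])
    print(errors, accuracy)
```

A flat segment of the loss curve means the model has stopped making mistakes on that part of the stream, which is how the later loss plots are read.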
Input format
• Input is a vector of floats
  • Number of attributes = number of input symbols
  • The attribute representing the current symbol is set to 0.5
  • All other attributes are set to zero
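
The encoding described above can be sketched in a few lines. The function name `encode_symbol` and the representation of the alphabet as a Python list are illustrative choices, not taken from the slides.

```python
def encode_symbol(symbol, alphabet, active_value=0.5):
    """Encode one symbol as a float vector with one attribute per symbol.

    The attribute for the current symbol is set to active_value (0.5 in
    the slides); every other attribute is zero.
    """
    vector = [0.0] * len(alphabet)
    vector[alphabet.index(symbol)] = active_value
    return vector


if __name__ == "__main__":
    print(encode_symbol("b", ["a", "b", "c"]))  # [0.0, 0.5, 0.0]
```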
LastIndexOf
• Counts the number of time steps since the current symbol was
last observed
• Input stream is randomly generated
• We use 2, 3 and 4 symbols
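
A label generator for this task might look like the sketch below. The label for a symbol's first occurrence is not specified in the slides; returning 0 in that case is an assumption, and the names `last_index_of_labels` and `random_stream` are hypothetical.

```python
import random


def last_index_of_labels(stream):
    """For each position, return the number of time steps since the
    current symbol was last observed (0 on first occurrence, by assumption)."""
    last_seen = {}
    labels = []
    for t, symbol in enumerate(stream):
        labels.append(t - last_seen[symbol] if symbol in last_seen else 0)
        last_seen[symbol] = t
    return labels


def random_stream(length, n_symbols, seed=0):
    """Generate a random stream over the symbols 0 .. n_symbols - 1."""
    rng = random.Random(seed)
    return [rng.randrange(n_symbols) for _ in range(length)]


if __name__ == "__main__":
    print(last_index_of_labels(list("abcab")))  # [0, 0, 0, 3, 3]
    stream = random_stream(10, n_symbols=3)
    print(stream, last_index_of_labels(stream))
```

The target depends on how long ago a symbol appeared, so a learner that sees only the current input cannot predict it; some memory of the past is required.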
LastIndexOf: Vector vs Scalar Input
• Vector input improves accuracy in all cases
• Especially with 4 symbols
[Figure: accuracy (%) vs. α for 2, 3 and 4 symbols, scalar input vs. vector input ("-vec"), density = 0.4]
LastIndexOf: Alpha and Density vs Accuracy
• Lower values of alpha (α) give lower accuracy
• There is no clear correlation between accuracy and density
[Figure, first panel: accuracy (%) vs. alpha (α) for 2, 3 and 4 symbols at density = 0.1 and density = 0.4]
[Figure, second panel: accuracy (%) vs. density for α = 0.2 to α = 1.0]
EmailFilter
• ESHT configuration:
  • ESL: 4,000 neurons
  • α = 1.0 and density = 0.1
• Outputs the length on the next space character
• Dataset: 20 newsgroups dataset
  • Extracted 590 characters and repeated them 8 times
  • To reduce memory usage we used an input vector of 4 symbols
EmailFilter: Recurrence vs Non Recurrence
• Non-recurrent methods (FIMT-DD and NN) fail to capture
temporal dependences
• NN defaults to majority class
Algorithm   Density   α     Learning rate   Loss      Accuracy (%)
FIMT-DD     -         -     -               4,119.7   91.61
NN          -         -     0.8             2,760     97.80
ESN1        0.2       1.0   0.1             1,032     98.47
ESN2        0.7       1.0   0.1             850       98.47
ESHT        0.1       1.0   -               180       99.75
EmailFilter: ESN vs ESHT
• After 500 samples the ESHT loss is close to 0 (and the loss is 0
after 1,000 samples)
[Figure: cumulative loss vs. number of samples for ESN1, ESN2 and ESHT]
Conclusions and Future Work
• Conclusions:
  • We presented the ESHT to learn temporal dependences in data
    streams in real time
  • The ESHT requires fewer hyper-parameters than the ESN
  • Our proof-of-concept implementation is able to learn faster
    than an ESN (most of them at first shot)
• Future work:
  • We are currently reimplementing our prototype so we can test
    larger input sequences
  • We need to study the effects of the initial state vanishing in
    large sequences
Thank you
ESHT: Module Architecture
• In each evaluation we use the following architecture
• Label generator implements the function to be learnt
Counter: Introduction
• Stream of randomly generated zeros and ones
• Input is a scalar
• Two variants:
  • Option1: Outputs cumulative count
  • Option2: Outputs total count on the next zero
Counter: Cumulative Loss
• After 200 samples the loss is stable
[Figure: cumulative loss vs. number of samples for Op1 (density=0.3, α=1.0), Op1 (density=1.0, α=0.7), Op2 (density=0.8, α=1.0) and Op2 (density=0.8, α=0.7)]
Counter: Alpha and Density vs Accuracy
[Figure, first panel: accuracy (%) vs. alpha (α)]
[Figure, second panel: accuracy (%) vs. density (%)]
EmailFilter: ASCII to 4 symbols Table
ASCII Domain        4-Symbols Domain
Original Symbols    Target Symbol    Target Symbol Index
[\t\n\r]+           Single space     0
[a-zA-Z0-9]         x                1
@                   @                2
.                   .                3
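
The mapping in this table can be sketched as below. Two points are assumptions: the whitespace class in the first row is taken to cover tab, newline, carriage return and space, and characters that appear in none of the rows are simply skipped, since the table does not say how they are handled. The function name `to_four_symbols` is hypothetical.

```python
import re

# Assumed whitespace class: runs of tab, newline, carriage return or space.
_WHITESPACE_RUN = re.compile(r"[\t\n\r ]+")


def to_four_symbols(text):
    """Map ASCII text to the 4-symbol domain and return symbol indices.

    0 = single space (collapsed whitespace run), 1 = alphanumeric ("x"),
    2 = "@", 3 = ".". Any other character is skipped (assumption).
    """
    collapsed = _WHITESPACE_RUN.sub(" ", text)
    indices = []
    for ch in collapsed:
        if ch == " ":
            indices.append(0)
        elif ch.isascii() and ch.isalnum():
            indices.append(1)
        elif ch == "@":
            indices.append(2)
        elif ch == ".":
            indices.append(3)
    return indices


if __name__ == "__main__":
    print(to_four_symbols("bob@x.org \n hi"))
    # [1, 1, 1, 2, 1, 3, 1, 1, 1, 0, 1, 1]
```

Each index could then be turned into a 4-attribute input vector with the 0.5 encoding from the "Input format" slide.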
5/0

More Related Content

What's hot

Calculus ppt format
Calculus ppt formatCalculus ppt format
Calculus ppt formatvaani pathak
 
Magellan FOSS4G Talk, Boston 2017
Magellan FOSS4G Talk, Boston 2017Magellan FOSS4G Talk, Boston 2017
Magellan FOSS4G Talk, Boston 2017Ram Sriharsha
 
Deep learning from scratch
Deep learning from scratch Deep learning from scratch
Deep learning from scratch Eran Shlomo
 
Nearest neighbour algorithm
Nearest neighbour algorithmNearest neighbour algorithm
Nearest neighbour algorithmAnmitas1
 
Practical deep learning for computer vision
Practical deep learning for computer visionPractical deep learning for computer vision
Practical deep learning for computer visionEran Shlomo
 
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15MLconf
 
Relaxed Parsing of Regular Approximations of String-Embedded Languages
Relaxed Parsing of Regular Approximations of String-Embedded LanguagesRelaxed Parsing of Regular Approximations of String-Embedded Languages
Relaxed Parsing of Regular Approximations of String-Embedded LanguagesSemyon Grigorev
 
ensembles_emptytemplate_v2
ensembles_emptytemplate_v2ensembles_emptytemplate_v2
ensembles_emptytemplate_v2Shrayes Ramesh
 
Recent Developments in Spark MLlib and Beyond
Recent Developments in Spark MLlib and BeyondRecent Developments in Spark MLlib and Beyond
Recent Developments in Spark MLlib and BeyondXiangrui Meng
 
Integer sequence
Integer sequenceInteger sequence
Integer sequenceINA SINGHAL
 

What's hot (12)

Calculus ppt format
Calculus ppt formatCalculus ppt format
Calculus ppt format
 
Magellan FOSS4G Talk, Boston 2017
Magellan FOSS4G Talk, Boston 2017Magellan FOSS4G Talk, Boston 2017
Magellan FOSS4G Talk, Boston 2017
 
Deep learning from scratch
Deep learning from scratch Deep learning from scratch
Deep learning from scratch
 
Nearest neighbour algorithm
Nearest neighbour algorithmNearest neighbour algorithm
Nearest neighbour algorithm
 
Practical deep learning for computer vision
Practical deep learning for computer visionPractical deep learning for computer vision
Practical deep learning for computer vision
 
Deep Learning for Computer Vision: Optimization (UPC 2016)
Deep Learning for Computer Vision: Optimization (UPC 2016)Deep Learning for Computer Vision: Optimization (UPC 2016)
Deep Learning for Computer Vision: Optimization (UPC 2016)
 
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15
 
Relaxed Parsing of Regular Approximations of String-Embedded Languages
Relaxed Parsing of Regular Approximations of String-Embedded LanguagesRelaxed Parsing of Regular Approximations of String-Embedded Languages
Relaxed Parsing of Regular Approximations of String-Embedded Languages
 
ensembles_emptytemplate_v2
ensembles_emptytemplate_v2ensembles_emptytemplate_v2
ensembles_emptytemplate_v2
 
Recent Developments in Spark MLlib and Beyond
Recent Developments in Spark MLlib and BeyondRecent Developments in Spark MLlib and Beyond
Recent Developments in Spark MLlib and Beyond
 
Integer sequence
Integer sequenceInteger sequence
Integer sequence
 
Clustering: A Scikit Learn Tutorial
Clustering: A Scikit Learn TutorialClustering: A Scikit Learn Tutorial
Clustering: A Scikit Learn Tutorial
 

Viewers also liked

Viewers also liked (20)

las tecnologías de la información y comunicación (TIC)
las tecnologías de la información y comunicación (TIC)las tecnologías de la información y comunicación (TIC)
las tecnologías de la información y comunicación (TIC)
 
Los angulos
Los angulosLos angulos
Los angulos
 
Resolucon de la imagen.pptxr
Resolucon de la imagen.pptxrResolucon de la imagen.pptxr
Resolucon de la imagen.pptxr
 
Nr energy
Nr energyNr energy
Nr energy
 
Historia "Una buena pesadilla"
Historia "Una buena pesadilla"Historia "Una buena pesadilla"
Historia "Una buena pesadilla"
 
T3 misw simetria_mm
T3 misw simetria_mmT3 misw simetria_mm
T3 misw simetria_mm
 
Presentacion
PresentacionPresentacion
Presentacion
 
Desarrollo del pensamiento y la creatividad
Desarrollo del pensamiento y la creatividadDesarrollo del pensamiento y la creatividad
Desarrollo del pensamiento y la creatividad
 
Leo da vinci
Leo da vinciLeo da vinci
Leo da vinci
 
Presentación gustavo
Presentación gustavoPresentación gustavo
Presentación gustavo
 
Creatividad
CreatividadCreatividad
Creatividad
 
Proporcionalidad abc
Proporcionalidad abcProporcionalidad abc
Proporcionalidad abc
 
Jorge Newbery
Jorge NewberyJorge Newbery
Jorge Newbery
 
T15 misw derivada_lf
T15 misw derivada_lfT15 misw derivada_lf
T15 misw derivada_lf
 
Presentación1
Presentación1Presentación1
Presentación1
 
Académicos a honorarios: entrando en materia (19/08/2011)
Académicos a honorarios: entrando en materia (19/08/2011)Académicos a honorarios: entrando en materia (19/08/2011)
Académicos a honorarios: entrando en materia (19/08/2011)
 
Proyecto haarp
Proyecto haarpProyecto haarp
Proyecto haarp
 
Por que los perros viven menos que nosotros
Por que los perros viven menos que nosotrosPor que los perros viven menos que nosotros
Por que los perros viven menos que nosotros
 
RFra_FinalPaper
RFra_FinalPaperRFra_FinalPaper
RFra_FinalPaper
 
Presentación2.pptx planos
Presentación2.pptx planosPresentación2.pptx planos
Presentación2.pptx planos
 

Similar to Echo State Hoeffding Tree Learning

Low-latency Multi-threaded Ensemble Learning for Dynamic Big Data Streams
Low-latency Multi-threaded Ensemble Learning for Dynamic Big Data StreamsLow-latency Multi-threaded Ensemble Learning for Dynamic Big Data Streams
Low-latency Multi-threaded Ensemble Learning for Dynamic Big Data StreamsDiego Marrón Vida
 
recurrent_neural_networks_april_2020.pptx
recurrent_neural_networks_april_2020.pptxrecurrent_neural_networks_april_2020.pptx
recurrent_neural_networks_april_2020.pptxSagarTekwani4
 
Beating Floating Point at its Own Game: Posit Arithmetic
Beating Floating Point at its Own Game: Posit ArithmeticBeating Floating Point at its Own Game: Posit Arithmetic
Beating Floating Point at its Own Game: Posit Arithmeticinside-BigData.com
 
RNN and LSTM model description and working advantages and disadvantages
RNN and LSTM model description and working advantages and disadvantagesRNN and LSTM model description and working advantages and disadvantages
RNN and LSTM model description and working advantages and disadvantagesAbhijitVenkatesh1
 
[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You NeedDaiki Tanaka
 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model佳蓉 倪
 
Dataworkz odsc london 2018
Dataworkz odsc london 2018Dataworkz odsc london 2018
Dataworkz odsc london 2018Olaf de Leeuw
 
Reservoir Computing Overview (with emphasis on Liquid State Machines)
Reservoir Computing Overview (with emphasis on Liquid State Machines)Reservoir Computing Overview (with emphasis on Liquid State Machines)
Reservoir Computing Overview (with emphasis on Liquid State Machines)Alex Klibisz
 
04 accelerating dl inference with (open)capi and posit numbers
04 accelerating dl inference with (open)capi and posit numbers04 accelerating dl inference with (open)capi and posit numbers
04 accelerating dl inference with (open)capi and posit numbersYutaka Kawai
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRUananth
 
08 neural networks
08 neural networks08 neural networks
08 neural networksankit_ppt
 
Hardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningHardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningCastLabKAIST
 
Spark Summit EU talk by Ram Sriharsha and Vlad Feinberg
Spark Summit EU talk by Ram Sriharsha and Vlad FeinbergSpark Summit EU talk by Ram Sriharsha and Vlad Feinberg
Spark Summit EU talk by Ram Sriharsha and Vlad FeinbergSpark Summit
 
Histograms at scale - Monitorama 2019
Histograms at scale - Monitorama 2019Histograms at scale - Monitorama 2019
Histograms at scale - Monitorama 2019Evan Chan
 
Distributed Decision Tree Learning for Mining Big Data Streams
Distributed Decision Tree Learning for Mining Big Data StreamsDistributed Decision Tree Learning for Mining Big Data Streams
Distributed Decision Tree Learning for Mining Big Data StreamsArinto Murdopo
 
Model-based programming and AI-assisted software development
Model-based programming and AI-assisted software developmentModel-based programming and AI-assisted software development
Model-based programming and AI-assisted software developmentEficode
 
An Introduction to Distributed Data Streaming
An Introduction to Distributed Data StreamingAn Introduction to Distributed Data Streaming
An Introduction to Distributed Data StreamingParis Carbone
 
Approximation Data Structures for Streaming Applications
Approximation Data Structures for Streaming ApplicationsApproximation Data Structures for Streaming Applications
Approximation Data Structures for Streaming ApplicationsDebasish Ghosh
 
Master Thesis Presentation
Master Thesis PresentationMaster Thesis Presentation
Master Thesis PresentationMohamed Sobh
 

Similar to Echo State Hoeffding Tree Learning (20)

Low-latency Multi-threaded Ensemble Learning for Dynamic Big Data Streams
Low-latency Multi-threaded Ensemble Learning for Dynamic Big Data StreamsLow-latency Multi-threaded Ensemble Learning for Dynamic Big Data Streams
Low-latency Multi-threaded Ensemble Learning for Dynamic Big Data Streams
 
recurrent_neural_networks_april_2020.pptx
recurrent_neural_networks_april_2020.pptxrecurrent_neural_networks_april_2020.pptx
recurrent_neural_networks_april_2020.pptx
 
Beating Floating Point at its Own Game: Posit Arithmetic
Beating Floating Point at its Own Game: Posit ArithmeticBeating Floating Point at its Own Game: Posit Arithmetic
Beating Floating Point at its Own Game: Posit Arithmetic
 
RNN and LSTM model description and working advantages and disadvantages
RNN and LSTM model description and working advantages and disadvantagesRNN and LSTM model description and working advantages and disadvantages
RNN and LSTM model description and working advantages and disadvantages
 
[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need
 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model
 
Dataworkz odsc london 2018
Dataworkz odsc london 2018Dataworkz odsc london 2018
Dataworkz odsc london 2018
 
Reservoir Computing Overview (with emphasis on Liquid State Machines)
Reservoir Computing Overview (with emphasis on Liquid State Machines)Reservoir Computing Overview (with emphasis on Liquid State Machines)
Reservoir Computing Overview (with emphasis on Liquid State Machines)
 
04 accelerating dl inference with (open)capi and posit numbers
04 accelerating dl inference with (open)capi and posit numbers04 accelerating dl inference with (open)capi and posit numbers
04 accelerating dl inference with (open)capi and posit numbers
 
Searching Algorithms
Searching AlgorithmsSearching Algorithms
Searching Algorithms
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
 
08 neural networks
08 neural networks08 neural networks
08 neural networks
 
Hardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningHardware Acceleration for Machine Learning
Hardware Acceleration for Machine Learning
 
Spark Summit EU talk by Ram Sriharsha and Vlad Feinberg
Spark Summit EU talk by Ram Sriharsha and Vlad FeinbergSpark Summit EU talk by Ram Sriharsha and Vlad Feinberg
Spark Summit EU talk by Ram Sriharsha and Vlad Feinberg
 
Histograms at scale - Monitorama 2019
Histograms at scale - Monitorama 2019Histograms at scale - Monitorama 2019
Histograms at scale - Monitorama 2019
 
Distributed Decision Tree Learning for Mining Big Data Streams
Distributed Decision Tree Learning for Mining Big Data StreamsDistributed Decision Tree Learning for Mining Big Data Streams
Distributed Decision Tree Learning for Mining Big Data Streams
 
Model-based programming and AI-assisted software development
Model-based programming and AI-assisted software developmentModel-based programming and AI-assisted software development
Model-based programming and AI-assisted software development
 
An Introduction to Distributed Data Streaming
An Introduction to Distributed Data StreamingAn Introduction to Distributed Data Streaming
An Introduction to Distributed Data Streaming
 
Approximation Data Structures for Streaming Applications
Approximation Data Structures for Streaming ApplicationsApproximation Data Structures for Streaming Applications
Approximation Data Structures for Streaming Applications
 
Master Thesis Presentation
Master Thesis PresentationMaster Thesis Presentation
Master Thesis Presentation
 

Recently uploaded

Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 

Recently uploaded (20)

Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 

Echo State Hoeffding Tree Learning

  • 1. Echo State Hoeffding Tree Learning Diego Marr´on (dmarron@ac.upc.edu) Jesse Read (jesse.read@telecom-paristech.fr) Albert Bifet (albert.bifet@telecom-paristech.fr) Talel Abdessalem (talel.abdessalem@telecom-paristech.fr) Eduard Ayguad´e (eduard.ayguade@bsc.es) Jos´e R. Herrero (josepr@ac.upc.edu) ACML 2016 Hamilton, New Zeland
  • 2. Introduction ESHT Evaluations Conclusions Introduction • Real-time classification of Big Data streams is becoming essential in a variety of application domains. • Real-time classification imposes some challenges: • Deal with potentially infinite streams • Strong temporal dependences • React to changes on the stream • Response time and memory are bounded 2/18
  • 3. Introduction ESHT Evaluations Conclusions Real Time Classification • In real-time classification: • Hoeffding Tree (HT) is the streaming state-of-the art decision tree • HTs are powerful and easy–to–deploy (no hyper-parameter to tune) • But, they are unable to capture strong temporal dependences • Recurrent Neural Networks (RNN) are very popular nowadays 3/18
  • 4. Introduction ESHT Evaluations Conclusions Recurrent Neural Networks • Recurrent Neural Networks (RNNs) are the state-of-the-art in handwriting recognition, speech recognition, natural language processing among others • They are able to capture time dependences • But their use for data streams is not straight forward • Very sensitive to hyper-parameters configuration • Training requires many iterations over data... • ...and large amount of time 4/18
  • 5. Introduction ESHT Evaluations Conclusions RNN: Echo State Network • A type of Recurrent Neural Network • Echo State Layer (ESL): • Dynamics only driven by the input • Requires very few computations • Easy to understand hyper-parameters • Can capture time dependences • ESN also requires the hyper-parameters needed by the NN • Gradient Descent methods have slow convergence 5/18
  • 6. Introduction ESHT Evaluations Conclusions Contribution • Objective: • Need to model the evolution of the stream over time • Reduce number of hyper-parameters • Reduce amount of samples needed to learn • In this work we present the ESHT: • Combination of HT + ESL • To learn temporal dependences in data streams in real-time • Requires less hyper-parameters than the ESN 6/18
  • 7. Introduction ESHT Evaluations Conclusions ESHT • Echo State Layer (ESL): • Only needs two hyper-parameters: • Alpha (α): weights events in X(n) importance over new ones • Density: Wres is a sparse matrix with given density • Encodes time-dependences • FIMT-DD: Hoeffding tree for regression • Works out-of-the-box: no hyper-parameters tuning 7/18
  • 8. ESHT: Evaluation Methodology • We propose the ESHT to learn character-stream functions: • Counter (skipped in this presentation) • lastIndexOf • emailFilter • lastIndexOf evaluation: • Study the effects of the hyper-parameters α and density • Alpha (α): weights the importance of events already in the state X(n) against new ones • Density: Wres is a sparse matrix with the given density • Use 1,000 neurons in the ESL • emailFilter evaluation: • We focus on the speed of learning • Use outcomes from the previous evaluations to configure the ESHT for this task • Metrics: • Cumulative loss • We count a prediction as an error if |y_t − ŷ| ≥ 0.5
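The two metrics can be sketched as follows. The error rule (|y_t − ŷ| ≥ 0.5) comes from the slide; the definition of cumulative loss as a running sum of absolute errors is an assumption, since the slides do not define it, and `evaluate` is a hypothetical helper name.

```python
def evaluate(targets, predictions, threshold=0.5):
    """Return (cumulative_loss, accuracy) for paired targets and predictions."""
    if not targets or len(targets) != len(predictions):
        raise ValueError("targets and predictions must be non-empty and equal in length")
    cumulative_loss = 0.0
    errors = 0
    for y, y_hat in zip(targets, predictions):
        diff = abs(y - y_hat)
        cumulative_loss += diff      # assumption: loss is the absolute error
        if diff >= threshold:        # error rule stated on the slide
            errors += 1
    accuracy = 1.0 - errors / len(targets)
    return cumulative_loss, accuracy


# Usage: one of four predictions is off by exactly 0.5, so it counts as an error.
loss, acc = evaluate([1, 2, 3, 4], [1.0, 2.5, 3.0, 4.0])
```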
  • 9. Input format • Input is a vector of floats • Number of attributes = number of input symbols • The attribute representing the current symbol is set to 0.5 • All other attributes are set to zero
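A minimal sketch of this encoding, with `encode_symbol` as a hypothetical helper name and the alphabet passed in explicitly:

```python
def encode_symbol(symbol, alphabet, active=0.5):
    """Encode one symbol as a float vector with one attribute per alphabet entry."""
    vec = [0.0] * len(alphabet)
    # Only the attribute of the current symbol is non-zero (0.5 on the slide).
    vec[alphabet.index(symbol)] = active
    return vec


# Usage: a three-symbol alphabet yields three attributes per input.
vector = encode_symbol("b", ["a", "b", "c"])
```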
  • 10. LastIndexOf • Counts the number of time steps since the current symbol was last observed • Input stream is randomly generated • We use 2, 3 and 4 symbols
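A label generator for lastIndexOf could look like the sketch below. The slide does not say what is emitted the first time a symbol appears; emitting 0 in that case is an assumption, and the function names are hypothetical.

```python
import random


def last_index_of_labels(stream):
    """For each position, count the time steps since the same symbol was last seen."""
    last_seen = {}
    labels = []
    for t, symbol in enumerate(stream):
        # Assumption: a symbol that has not been seen yet gets label 0.
        labels.append(t - last_seen[symbol] if symbol in last_seen else 0)
        last_seen[symbol] = t
    return labels


def random_stream(n_symbols, length, seed=0):
    """Randomly generated input stream over symbols 0 .. n_symbols - 1."""
    rng = random.Random(seed)
    return [rng.randrange(n_symbols) for _ in range(length)]


# Usage: labels for a short hand-written stream, then for a random 4-symbol stream.
example = last_index_of_labels([0, 1, 0, 0, 1])
labels = last_index_of_labels(random_stream(n_symbols=4, length=100))
```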
  • 11. LastIndexOf: Vector vs Scalar Input • Vector input improves accuracy in all cases • Especially with 4 symbols • [Figure: accuracy vs. α at density = 0.4 for 2, 3 and 4 symbols, with scalar input and with vector input (-vec)]
  • 12. LastIndexOf: Alpha and Density vs Accuracy • Lower values of alpha (α) give low accuracy • There is no clear correlation between accuracy and density • [Figure: accuracy vs. alpha (α) for 2, 3 and 4 symbols at density = 0.1 and density = 0.4] • [Figure: accuracy vs. density for α = 0.2 to α = 1.0]
  • 13. EmailFilter • ESHT configuration: • ESL: 4,000 neurons • α = 1.0 and density = 0.1 • Outputs the length when the next space character arrives • Dataset: 20 newsgroups dataset • Extracted 590 characters and repeated them 8 times • To reduce memory usage we used an input vector of 4 symbols
  • 14. EmailFilter: Recurrence vs Non Recurrence • Non-recurrent methods (FIMT-DD and NN) fail to capture temporal dependences • NN defaults to majority class
    Algorithm   Density   α     Learning rate   Loss      Accuracy (%)
    FIMT-DD     -         -     -               4,119.7   91.61
    NN          -         -     0.8             2,760     97.80
    ESN1        0.2       1.0   0.1             1,032     98.47
    ESN2        0.7       1.0   0.1             850       98.47
    ESHT        0.1       1.0   -               180       99.75
  • 15. EmailFilter: ESN vs ESHT • After 500 samples the ESHT loss is close to 0 (and 0 loss after 1,000 samples) • [Figure: cumulative loss vs. number of samples for ESN1, ESN2 and ESHT]
  • 16. Conclusions and Future Work • Conclusions: • We presented the ESHT to learn temporal dependences in data streams in real time • The ESHT requires fewer hyper-parameters than the ESN • Our proof-of-concept implementation is able to learn faster than an ESN (most of them at first shot) • Future Work: • We are currently reimplementing our prototype so we can test larger input sequences • We need to study the effects of the initial state vanishing in large sequences
  • 18. Echo State Hoeffding Tree Learning Diego Marrón (dmarron@ac.upc.edu) Jesse Read (jesse.read@telecom-paristech.fr) Albert Bifet (albert.bifet@telecom-paristech.fr) Talel Abdessalem (talel.abdessalem@telecom-paristech.fr) Eduard Ayguadé (eduard.ayguade@bsc.es) José R. Herrero (josepr@ac.upc.edu) ACML 2016 Hamilton, New Zealand
  • 19. ESHT: Module Architecture • In each evaluation we use the following architecture • The label generator implements the function to be learnt
  • 20. Counter: Introduction • Stream of randomly generated zeros and ones • Input is a scalar • Two variants: • Option 1: outputs the cumulative count • Option 2: outputs the total count on the next zero
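The two Counter variants might be generated as in the sketch below. The slide leaves several details open, so the following are assumptions: the function counts ones since the last zero, a zero resets the count, Option 1 emits the running count at every one, and Option 2 emits 0 until a zero arrives and then emits the total. `counter_labels` is a hypothetical name.

```python
def counter_labels(stream, variant=1):
    """Labels for a 0/1 stream under the two assumed Counter variants."""
    labels = []
    count = 0
    for bit in stream:
        if bit == 1:
            count += 1
            # Option 1: running count; Option 2: nothing reported yet.
            labels.append(count if variant == 1 else 0)
        else:
            # Option 2: report the total when the zero arrives.
            labels.append(count if variant == 2 else 0)
            count = 0  # assumption: a zero resets the count
    return labels


# Usage: the same stream under both variants.
stream = [1, 1, 0, 1, 0, 0]
option1 = counter_labels(stream, variant=1)
option2 = counter_labels(stream, variant=2)
```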
  • 21. Counter: Cumulative Loss • After 200 samples the loss is stable • [Figure: cumulative loss vs. number of samples for Op1 (density=0.3, α=1.0), Op1 (density=1.0, α=0.7), Op2 (density=0.8, α=1.0) and Op2 (density=0.8, α=0.7)]
  • 22. Counter: Alpha and Density vs Accuracy • [Figure: accuracy vs. alpha (α)] • [Figure: accuracy vs. density]
  • 23. EmailFilter: ASCII to 4 symbols table
    ASCII domain        4-symbols domain
    Original symbols    Target symbol    Target symbol index
    [\t \n \r]+         Single space     0
    [a-zA-Z0-9]         x                1
    @                   @                2
    .                   .                3
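The mapping in this table can be sketched as a small function. It assumes the first row denotes runs of whitespace (tab, newline, carriage return, space) collapsed into a single space; the slide does not say how characters outside the table are handled, so skipping them is also an assumption, and `to_four_symbols` is a hypothetical name.

```python
import re

# Assumption: the first table row means "one or more whitespace characters".
_WHITESPACE_RUN = re.compile(r"[\t\n\r ]+")


def to_four_symbols(text):
    """Map ASCII text to the symbol indices 0..3 of the 4-symbols domain."""
    collapsed = _WHITESPACE_RUN.sub(" ", text)
    indices = []
    for ch in collapsed:
        if ch == " ":
            indices.append(0)            # whitespace run -> single space
        elif ch.isascii() and ch.isalnum():
            indices.append(1)            # [a-zA-Z0-9] -> x
        elif ch == "@":
            indices.append(2)
        elif ch == ".":
            indices.append(3)
        # Assumption: any other character is skipped.
    return indices


# Usage: an address followed by mixed whitespace and another token.
encoded = to_four_symbols("a@b.c \n d")
```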