Improving SVM classification on imbalanced
time series data sets with ghost points
Presenter: Shang-Tse Chen
Authors: Suzan Köknar-Tezel, Longin Jan Latecki
Introduction
● Imbalanced datasets are a challenge for data mining
○ always predicting the majority class already gives high accuracy
○ often, the rare events are the more interesting ones
● Common techniques:
○ Up-/down-sampling
○ SMOTE (adding synthetic points in feature space)
● This paper
○ adding synthetic points in distance space
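A small illustration of the accuracy point above, as a sketch with made-up numbers (990 majority and 10 minority labels are assumptions for the example, not figures from the paper): a classifier that always predicts the majority class scores high accuracy while finding none of the rare events.

```python
# Hypothetical label counts chosen only for illustration.
y_true = [0] * 990 + [1] * 10          # 0 = majority class, 1 = minority class
y_pred = [0] * len(y_true)             # trivial classifier: always predict the majority

# Fraction of all labels predicted correctly.
accuracy = sum(p == t for p, t in zip(y_pred, y_true)) / len(y_true)

# Fraction of true minority samples that were predicted as minority.
minority_total = sum(t == 1 for t in y_true)
minority_hits = sum(p == 1 and t == 1 for p, t in zip(y_pred, y_true))
minority_recall = minority_hits / minority_total

print(accuracy, minority_recall)
```

Accuracy alone hides the failure here, which is why over-sampling methods are usually judged with class-sensitive measures as well.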
Research Question
● For time series data
○ not intuitive to represent as feature vectors in ℝⁿ
○ the distance between two sequences is non-metric
○ so SMOTE cannot be used
● In many applications, pairwise distances are more relevant
○ many classifiers need only pairwise distances,
■ e.g. SVM, kNN
○ many good algorithms exist for computing distances between
time series, e.g. DTW, OSB
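To make the "non-metric" point concrete, here is a minimal sketch of textbook DTW (absolute-difference cost, no warping window; the exact variant used in the paper is not stated on the slides, so this is an assumption). The three toy sequences are invented for illustration and show a triangle-inequality violation.

```python
def dtw(s, t):
    """Textbook DTW: minimum summed |s[i] - t[j]| over monotone alignments."""
    n, m = len(s), len(t)
    inf = float("inf")
    # cost[i][j] = best alignment cost of s[:i] against t[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(s[i - 1] - t[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat t[j-1]
                                    cost[i][j - 1],      # repeat s[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]

# Toy sequences (illustrative only).
a, b, c = [0], [1], [1, 1, 1]
print(dtw(a, b), dtw(b, c), dtw(a, c))  # prints: 1.0 0.0 3.0
```

Since dtw(a, c) = 3 exceeds dtw(a, b) + dtw(b, c) = 1, the triangle inequality fails, so interpolation schemes that assume a vector space or a metric do not apply directly.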
Research Question
● Can we add synthetic data in distance space?
● Does it improve the performance?
Methodology
● Given any two points a, b in a distance space X, we can define a
ghost point e = μ(a, b).
● For every x ∈ X, the distance from x to e, d(x, μ(a, b)), is defined as follows:
○ case 1: If d restricted to {x, a, b} is a metric (the triangle inequality holds), then
■ d(x, μ(a, b))² = ½ d(x, a)² + ½ d(x, b)² - ¼ d(a, b)²
○ case 2: If d(a, b) > d(x, a) + d(x, b), then
■ d(x, μ(a, b)) = ½ d(a, b) - d(x, b)
○ case 3a: If d(x, a) > d(x, b) + d(a, b), then
■ d(x, μ(a, b))² = d(x, b)² + ¼ d(a, b)²
○ case 3b: If d(x, b) > d(x, a) + d(a, b), then
■ d(x, μ(a, b))² = d(x, a)² + ¼ d(a, b)²
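The four cases can be sketched in code as follows. This is an illustrative reading of the slide, not the authors' implementation: the function names `ghost_distance` and `add_ghost` are hypothetical, the distance matrix is assumed symmetric with a zero diagonal, and an appended ghost point is simply treated as one more row and column.

```python
import math

def ghost_distance(d_xa, d_xb, d_ab):
    """Distance from x to the ghost point mu(a, b), following the slide's cases."""
    if d_ab > d_xa + d_xb:                      # case 2
        # Taken verbatim from the slide; non-negative when d(x, b) <= d(x, a).
        return 0.5 * d_ab - d_xb
    if d_xa > d_xb + d_ab:                      # case 3a
        return math.sqrt(d_xb ** 2 + 0.25 * d_ab ** 2)
    if d_xb > d_xa + d_ab:                      # case 3b
        return math.sqrt(d_xa ** 2 + 0.25 * d_ab ** 2)
    # case 1: triangle inequality holds, so the value below is >= 0;
    # max() only guards against floating-point round-off.
    sq = 0.5 * d_xa ** 2 + 0.5 * d_xb ** 2 - 0.25 * d_ab ** 2
    return math.sqrt(max(sq, 0.0))

def add_ghost(D, i, j):
    """Return a copy of distance matrix D with mu(i, j) appended as the last point."""
    n = len(D)
    row = [ghost_distance(D[x][i], D[x][j], D[i][j]) for x in range(n)]
    new = [list(D[x]) + [row[x]] for x in range(n)]
    new.append(row + [0.0])                     # ghost-to-ghost distance is 0
    return new

# Sanity check with Euclidean points (0,0), (2,0), (1,1): the ghost of the
# first two is their midpoint (1,0), which is at distance 1 from all three.
r2 = math.sqrt(2)
D = [[0.0, 2.0, r2],
     [2.0, 0.0, r2],
     [r2,  r2,  0.0]]
D_aug = add_ghost(D, 0, 1)
print([round(v, 6) for v in D_aug[3]])
```

In the Euclidean check, case 1 reproduces the ordinary midpoint distance, which is the intuition behind the formula; cases 2 to 4 cover triples where the triangle inequality fails.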
Data Collection, Processing
● UCR Time series datasets
○ 17 datasets from various domains are used
○ the number of classes ranges from 2 to 50
● MPEG-7
○ 1400 binary images consisting of 70 object classes
○ within each class there are 20 shapes
○ each shape is represented with 100 equidistant sample points on the contour
○ these points are converted into sequences by calculating the curvature at each point with
respect to its five neighbors on each side
○ this yields 1400 sequences, each of length 100
○ this transformation is invariant to rotation and scale
Key Results
● UCR Data Sets and OSB
● Shaded results indicate the best performers
● The darker the shade, the larger the difference
Key Results
● UCR Data Sets and DTW
Key Results
● MPEG-7 dataset
Summary
● Proposed a new approach for over-sampling the minority
class of imbalanced data
● Unlike other, feature-based methods, the ghost points
are added in distance space.
● Ghost points can be added to a non-metric distance space
○ Can be used with DTW, OSB, and many other distance measures.
● Empirical results show significant improvement
Critique of work
● For large-scale data, over-sampling is time-consuming
● Introduces another parameter: the number of
ghost points to add
● May not perform well on highly noisy data
