Classification by CUT:
Clearance Under Threshold
Ryan McBride (rom2@sfu.ca),
Ke Wang (wangk@cs.sfu.ca),
and Wenyuan Li (wenyuanli630@gail.com)
June 17, 2015
Summary
Domain knowledge helps identify “bad”
cases.
Usual Domain Knowledge: Each
outcome’s cost or relative benefit, i.e.
cost-sensitive classification.
But costs are too hard to specify in
practice.
Our Idea: Model with a regulatory
threshold, a maximum acceptable
frequency in future cases.
Problem: Given a collection of
sampled electrical transformers, predict
ones with carcinogenic polychlorinated
biphenyls (PCBs), known to be harmful
to humans and the environment.
Similar Problems
Predict a cancer patient
Predict an unqualified applicant
Predict a broken car brake
Conventional Solution
User sets cost matrix
(note: negative=bad)
                          Object Class j
                          Positive   Negative
Predicted     Positive    C1         C2
Class i       Negative    C3         C4
Issue: What is the cost of not
removing a public health hazard?
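The conventional approach can be sketched as a minimum-expected-cost decision. This is a minimal illustration, not taken from the slides: the function name and the numeric costs are hypothetical, and only the layout of C1 to C4 follows the cost matrix above.

```python
def min_expected_cost_class(p_positive, cost):
    """Pick the predicted class with the lowest expected cost.

    cost[i][j] is the cost of predicting class i when the true class is j,
    mirroring the C1..C4 layout of the cost matrix.
    """
    p = {"positive": p_positive, "negative": 1.0 - p_positive}
    expected = {
        predicted: sum(cost[predicted][actual] * p[actual] for actual in p)
        for predicted in cost
    }
    return min(expected, key=expected.get), expected


def make_costs(c2):
    # Hypothetical values: C1 = C4 = 0, C3 = 1, and C2 (a missed hazard) varies.
    return {
        "positive": {"positive": 0.0, "negative": c2},
        "negative": {"positive": 1.0, "negative": 0.0},
    }


# Same object (95% likely to be safe), two guesses for the hazard cost C2.
decision_high, _ = min_expected_cost_class(0.95, make_costs(100.0))
decision_low, _ = min_expected_cost_class(0.95, make_costs(10.0))
print(decision_high, decision_low)
```

The decision flips with the guessed value of C2, which is the quantity the slide argues is hard to specify.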
Our Solution: Thresholds
Insight: Problems without costs focus
on acceptable rates of negatives:
1. Regulations: At most “1 hazard out of
100”.
2. Power Industries: Too frequent outages
in equipment ⇒ Strengthen equipment.
Idea: Model to find “under threshold”
groups.
CUT Classification: Given t,
partition attribute space:
[Figure: scatter plot of positive (+) and negative (-) objects over attributes x and y, partitioned into groups Gi.]
Gi Over Threshold ⇒ Mitigate Risk.
Gi Under Threshold ⇒ Delay Action.
Defining Cleared Groups
When is a group “under
threshold”?
One sample that isn’t contaminated?
One hundred samples with no PCBs?
One million samples with no PCBs?
Only “clear” if enough
observations...
Use statistics to estimate
potential frequencies
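The questions above can be made concrete for the special case of zero contaminated samples. The sketch below assumes independent samples and a one-sided bound: if the true rate is p, the chance of seeing no bad case in n samples is (1 - p)^n, so the largest p still plausible at a given confidence satisfies (1 - p)^n = 1 - confidence. The function name is illustrative, not from the slides.

```python
def zero_count_upper_bound(n, confidence=0.99):
    """Upper bound on the bad-case rate after n samples with zero bad cases.

    Solves (1 - p)**n = 1 - confidence for p. Assumes independent samples.
    """
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n)


for n in (1, 100, 1_000_000):
    print(n, zero_count_upper_bound(n))
```

Under these assumptions one clean sample only bounds the rate at 99%, one hundred clean samples bound it at roughly 4.5%, and a million clean samples bound it below 0.001%.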
Statistical Clearance
Use a confidence interval at a chosen
confidence level (e.g. 99%):
Frequency in future cases is no more
than upper bound: ub(Gi)
Example: There is a 99% chance that
no more than 5% of Dynamo
Incorporated transformers are
contaminated.
An object o of unknown class is cleared if
it falls in a group Gi where ub(Gi) ≤ t.
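A possible implementation of the clearance test is sketched below. The slides do not say which confidence interval is used, so this assumes a one-sided Clopper-Pearson (exact binomial) bound, found by bisection. The names `upper_bound` and `is_cleared` are illustrative, and the values it produces need not match the ub figures shown in the slides.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


def upper_bound(bad, n, confidence=0.99):
    """One-sided Clopper-Pearson upper bound on the bad-case rate (an assumption)."""
    if bad >= n:
        return 1.0
    alpha = 1.0 - confidence
    lo, hi = bad / n, 1.0
    # The CDF decreases as p grows, so bisect for the p where it equals alpha.
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_cdf(bad, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi


def is_cleared(bad, n, t, confidence=0.99):
    """A group is cleared when its upper bound does not exceed the threshold t."""
    return upper_bound(bad, n, confidence) <= t


print(is_cleared(0, 1, 0.05), is_cleared(0, 100, 0.05))
```

The bound always sits above the observed rate and tightens as the group grows, which is why small groups are hard to clear even when no bad case has been seen.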
Partitioning Objective
Goal: Prove many future cases
are cleared.
CUT+ Algorithm: Repeated
search for large cleared groupings.
Example with t = 5% on next slide.
List valid partitions and choose one:
Partition A: Region for t=5%
  Lowlands:   2 PCB of 300,    ub(Lowlands) = 1.6%,    300 CLEARED
  Midlands:   103 PCB of 150,  ub(Midlands) = 76.3%,   NON-CLEARED
  Highlands:  45 PCB of 550,   ub(Highlands) = 10.3%,  NON-CLEARED
Partition B: Manufacturer for t=5%
  Made-Up Electric:  130 PCB of 400,  ub(Made-Up) = 36.4%,  NON-CLEARED
  Dynamo Inc:        20 PCB of 600,   ub(Dynamo) = 4.8%,    600 CLEARED
Partition A clears 300 samples.
Partition B clears 600 samples.
Partition B preferred because it clears
more objects.
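The choice between the two partitions can be replayed in a few lines. The group sizes and ub values are copied from the slide. The data layout and the helper `cleared_count` are illustrative assumptions.

```python
T = 0.05  # clearance threshold t = 5%

# (group name, number of samples, ub value as given on the slide)
partitions = {
    "A: Region": [
        ("Lowlands", 300, 0.016),
        ("Midlands", 150, 0.763),
        ("Highlands", 550, 0.103),
    ],
    "B: Manufacturer": [
        ("Made-Up Electric", 400, 0.364),
        ("Dynamo Inc", 600, 0.048),
    ],
}


def cleared_count(groups, t):
    """Total number of samples in groups whose upper bound is at most t."""
    return sum(size for _name, size, ub in groups if ub <= t)


best = max(partitions, key=lambda name: cleared_count(partitions[name], T))
for name, groups in partitions.items():
    print(name, "clears", cleared_count(groups, T))
print("preferred:", best)
```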
Current Tree Partition:
All Objects
  Produced by Dynamo Inc:        20 PCB of 600,   ub(Dynamo): 4.8%,    600 CLEARED
  Produced by Made-Up Electric:  130 PCB of 400,  ub(Made-Up): 36.4%,  NON-CLEARED
Improvement 1: Repeat partition search in
non-cleared groups.
Final Tree
All Objects
  Produced by Dynamo Inc:  20 PCB of 600,  ub(Dynamo): 4.8%,  600 CLEARED
  Produced by Made-Up Electric
    In Lowlands:   2 PCB of 150,   ub(G): 4.2%,   150 CLEARED
    In Midlands:   98 PCB of 100,  ub(G): 100%,   NON-CLEARED
    In Highlands:  30 PCB of 150,  ub(G): 25.8%,  NON-CLEARED
Improvement 2: Merge all non-cleared
regions then search again.
CUT+ Algorithm
Given a set of training objects, G, and a
clearance threshold, t
REPEAT UNTIL no cleared group is
found:
CUT Tree(G, t)
Remove the objects assigned to a cleared
group from G
Three heuristics for building trees:
1. Immediate Clearance
2. Risk Reduction
3. Pure Potential
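The outer loop of CUT+ can be sketched as follows. This is only an illustration under stated assumptions: `one_level_tree` is a hypothetical stand-in for CUT Tree that tries a single split per round and does not implement any of the three heuristics, `wilson_upper` is an assumed choice of upper bound (Wilson score, one-sided 95%), and the example data are invented.

```python
import math
from collections import defaultdict


def wilson_upper(bad, n, z=1.645):
    """Wilson score upper bound on the bad-case rate (assumed bound, not from the slides)."""
    p = bad / n
    centre = p + z * z / (2 * n)
    spread = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre + spread) / (1 + z * z / n)


def one_level_tree(objects, t, attributes):
    """Hypothetical stand-in for CUT Tree: keep the single-attribute split
    whose cleared groups contain the most objects."""
    best, best_count = [], 0
    for attr in attributes:
        groups = defaultdict(list)
        for obj in objects:
            groups[obj[attr]].append(obj)
        cleared = [g for g in groups.values()
                   if wilson_upper(sum(o["pcb"] for o in g), len(g)) <= t]
        count = sum(len(g) for g in cleared)
        if count > best_count:
            best, best_count = cleared, count
    return best


def cut_plus(objects, t, attributes):
    """Repeat until no cleared group is found, removing cleared objects each round."""
    remaining, cleared_groups = list(objects), []
    while remaining:
        cleared = one_level_tree(remaining, t, attributes)
        if not cleared:
            break
        cleared_groups.extend(cleared)
        done = {id(o) for g in cleared for o in g}
        remaining = [o for o in remaining if id(o) not in done]
    return cleared_groups, remaining


def make(maker, region, size, bad):
    """Build `size` invented objects, the first `bad` of which contain PCBs."""
    return [{"maker": maker, "region": region, "pcb": int(i < bad)} for i in range(size)]


data = make("A", "south", 600, 6) + make("B", "south", 100, 90) + make("B", "east", 150, 1)
cleared_groups, remaining = cut_plus(data, 0.05, ("maker", "region"))
print([len(g) for g in cleared_groups], len(remaining))
```

On this invented data the first round clears maker A, and only after those objects are removed does a second round clear the east region, which is the effect the repeated search is meant to capture.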
Experiments (1)
Use cross-validation and compare:
3 CUT+ algorithms.
Competitors from other classification
areas.
Problem Set: PCB
identification problems.
Experiments (2)
Evaluate partition {G1, . . . , Gn}
with test set by:
Percent of positives cleared (TPR).
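The evaluation can be sketched as below. Following the slides' "negative = bad" convention, a positive is a safe object and TPR is the share of positives that land in a cleared group. FPR is taken here with its standard meaning, the share of negatives that land in a cleared group, which is an assumption because the slides only define TPR. The function and parameter names are illustrative.

```python
def clearance_rates(test_objects, in_cleared_group):
    """Return (TPR, FPR) of a partition on a labelled test set.

    test_objects: list of (features, is_positive) pairs, positive meaning safe.
    in_cleared_group: hypothetical callable, True when the features fall
    in a cleared group of the partition.
    """
    pos_total = pos_cleared = neg_total = neg_cleared = 0
    for features, is_positive in test_objects:
        cleared = bool(in_cleared_group(features))
        if is_positive:
            pos_total += 1
            pos_cleared += cleared
        else:
            neg_total += 1
            neg_cleared += cleared
    tpr = pos_cleared / pos_total if pos_total else 0.0
    fpr = neg_cleared / neg_total if neg_total else 0.0
    return tpr, fpr


# Invented test set: the partition clears everything made by "Dynamo Inc".
test_set = (
    [({"maker": "Dynamo Inc"}, True)] * 3
    + [({"maker": "Made-Up Electric"}, True)]
    + [({"maker": "Dynamo Inc"}, False)]
    + [({"maker": "Made-Up Electric"}, False)] * 3
)
tpr, fpr = clearance_rates(test_set, lambda f: f["maker"] == "Dynamo Inc")
print(tpr, fpr)
```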
PCB Experiment (1)
t ranges from 0% to p̂, where p̂ is the observed rate of PCB cases.
[Figure: two panels plotted against Clearance Threshold t (0% to 1.0p̂): FPR(t) on a 0% to 5% axis and TPR on a 0% to 100% axis. Series: Pure Potential, Baseline1: C4.5, Baseline2: SMOTE, Baseline3: MetaCost. Results for PCB50.]
CUT+ clears more non-PCB transformers.
Paper results show that there are not too
many “over threshold” errors.
PCB Experiment (2)
t ranges from 0% to p̂, where p̂ is the observed rate of PCB cases.
[Figure: two panels plotted against Clearance Threshold t (0% to 1.0p̂): FPR(t) on a 0% to 5% axis and TPR on a 0% to 100% axis. Series: Pure Potential, Baseline1: C4.5, Baseline2: SMOTE, Baseline3: MetaCost. Results for PCB50.]
Competitors have few cleared groups since:
Too few observations to clear group.
Or frequency too high to clear group.
More Experiments on UCI Sets:
Pure Potential best algorithm in 22 out
of 25 tests.
Code available at
http://www.cs.sfu.ca/~wangk/software/CUT_classification
Acknowledgments
Funding: BC Hydro R&D program and
Canada’s NSERC.
Transformer Image Source:
Wikipedia user Benutzer:Stahlkocher;
License: GFDL.

Wang ke classification by cut clearance under threshold

  • 1. Classification by CUT: Clearance Under Threshold Ryan McBride (rom2@sfu.ca), Ke Wang (wangk@cs.sfu.ca), and Wenyuan Li (wenyuanli630@gail.com) June 17, 2015
  • 2. Summary Domain knowledge helps identify “bad” cases. Usual Domain Knowledge: Each outcome’s cost or relative benefit - cost sensitive classification. But costs are too hard to specify in practice. Our Idea: Model with a regulatory threshold, a maximum acceptable frequency in future cases.
  • 3. Problem: Given a collection of sampled electrical transformers, predict ones with carcinogenic polychlorinated biphenyls (PCBs), known to be harmful to human and environment.
  • 4. Similar Problems Predict a cancer patient Predict an unqualified applicant Predict a broken car brake
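  • 5. Conventional Solution User sets cost matrix (note: negative=bad), with rows for predicted class i and columns for object class j. Predicted Positive: C1 (object Positive), C2 (object Negative). Predicted Negative: C3 (object Positive), C4 (object Negative). Issue: What is the cost of not removing a public health hazard?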
  • 6. Our Solution: Thresholds Insight: Problems without costs focus on acceptable rates of negatives: 1. Regulations: At most “1 hazard out of 100”. 2. Power Industries: Too frequent outages in equipment ⇒ Strengthen equipment. Idea: Model to find “under threshold” groups.
  • 7. CUT Classification: Given t, partition attribute space: [Figure: scatter of + and - objects over attributes x and y, partitioned into groups Gi.] Gi Over Threshold ⇒ Mitigate Risk. Gi Under Threshold ⇒ Delay Action.
  • 8. Defining Cleared Groups When is a group “under threshold”? One sample that isn’t contaminated? One hundred samples with no PCBs? Million samples with no PCBs? Only “clear” if enough observations... Use statistics to estimate potential frequencies
  • 9. Statistical Clearance Use confidence interval with some confidence (e.g. 99%): Frequency in future cases is no more than upper bound: ub(Gi) Example: There is a 99% chance that no more than 5% of Dynamo Incorporated transformers are contaminated. Unknown class object o cleared if in Gi where ub(Gi) ≤ t.
  • 10. Partitioning Objective Goal: Prove many future cases are cleared. CUT+ Algorithm: Repeated search for large cleared groupings. Example with t = 5% on next slide.
  • 11. List valid partitions and choose one (t = 5%). Partition A: Region. Lowlands: 2 PCB of 300, ub(Lowlands): 1.6%, 300 CLEARED. Midlands: 103 PCB of 150, ub(Midlands): 76.3%, NON-CLEARED. Highlands: 45 PCB of 550, ub(Highlands): 10.3%, NON-CLEARED. Partition B: Manufacturer. Made-Up Electric: 130 PCB of 400, ub(Made-Up): 36.4%, NON-CLEARED. Dynamo Inc: 20 PCB of 600, ub(Dynamo): 4.8%, 600 CLEARED. Partition A clears 300 samples. Partition B clears 600 samples. Partition B preferred because it clears more objects.
  • 12. Current Tree Partition: All Objects → Produced by Dynamo Inc: 20 PCB of 600, ub(Dynamo): 4.8%, 600 CLEARED. All Objects → Produced by Made-Up Electric: 130 PCB of 400, ub(Made-Up): 36.4%, NON-CLEARED. Improvement 1: Repeat partition search in non-cleared groups.
  • 13. Final Tree: All Objects → Produced by Dynamo Inc: 20 PCB of 600, ub(Dynamo): 4.8%, 600 CLEARED. All Objects → Produced by Made-Up Electric → split by region (branch labels on the slide: In Lowlands, In Midlands, In Highlands, In Surrey), with leaves: 98 PCB of 100, ub(G): 100%, NON-CLEARED; 30 PCB of 150, ub(G): 25.8%, NON-CLEARED; 2 PCB of 150, ub(G): 4.2%, 150 CLEARED. Improvement 2: Merge all non-cleared regions then search again.
  • 14. CUT+ Algorithm Given a set of training objects, G, and a clearance threshold, t: REPEAT UNTIL no cleared group is found: (a) CUT Tree(G, t); (b) remove the objects assigned to a cleared group from G. Three heuristics for building trees: 1. Immediate Clearance 2. Risk Reduction 3. Pure Potential
  • 15. Experiments (1) Use cross-validation and compare: 3 CUT+ algorithms. Competitors from other classification areas. Problem Set: PCB identification problems.
  • 16. Experiments (2) Evaluate partition {G1, . . . , Gn} with test set by: Percent of positives cleared (TPR).
  • 17. PCB Experiment (1) t ranges from 0% to p̂, where p̂ is the observed rate of PCB cases. [Figure: FPR(t) and TPR versus clearance threshold t (0% to 1.0p̂) on PCB50, comparing Pure Potential with Baseline1: C4.5, Baseline2: SMOTE, and Baseline3: MetaCost.] Results for PCB50: CUT+ clears more non-PCB transformers. Paper results show that there are not too many “over threshold” errors.
  • 18. PCB Experiment (2) t ranges from 0% to p̂, where p̂ is the observed rate of PCB cases. [Figure: FPR(t) and TPR versus clearance threshold t (0% to 1.0p̂) on PCB50, comparing Pure Potential with Baseline1: C4.5, Baseline2: SMOTE, and Baseline3: MetaCost.] Results for PCB50: Competitors have few cleared groups since: Too few observations to clear group. Or frequency too high to clear group.
  • 19. More Experiments on UCI Sets: Pure Potential best algorithm in 22 out of 25 tests. Code available at http://www.cs.sfu.ca/~wangk/software/CUT_classification
  • 20. Acknowledgments Funding: BC Hydro R&D program and Canada’s NSERC. Transformer Image Source: Wikipedia user Benutzer:Stahlkocher; License: GFDL.
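
The clearance test (slide 9), the choice between partitions (slide 11), and the CUT+ loop (slide 14) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The slides do not say which confidence interval is used, so a one-sided Wilson score bound is assumed here and its values will not match the ub figures on the slides. A single-attribute split stands in for CUT Tree, and none of the three tree-building heuristics is implemented. All names (`upper_bound`, `partition`, `cleared_groups`, `best_partition`, `cut_plus`, the `"pcb"` key) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def upper_bound(bad, n, confidence=0.99):
    """One-sided Wilson score upper bound on the rate of bad cases.

    Assumption: the slides do not name the interval; Wilson is one choice.
    """
    if n == 0:
        return 1.0  # no observations, so nothing can be cleared
    z = NormalDist().inv_cdf(confidence)
    p = bad / n
    centre = p + z * z / (2 * n)
    margin = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return min(1.0, (centre + margin) / (1 + z * z / n))


def partition(objects, attribute):
    """Group objects (dicts with a boolean "pcb" key) by one attribute."""
    groups = {}
    for obj in objects:
        groups.setdefault(obj[attribute], []).append(obj)
    return groups


def cleared_groups(groups, t, confidence=0.99):
    """Keep only the groups G with ub(G) <= t."""
    cleared = {}
    for name, members in groups.items():
        bad = sum(o["pcb"] for o in members)
        if upper_bound(bad, len(members), confidence) <= t:
            cleared[name] = members
    return cleared


def best_partition(objects, attributes, t):
    """Pick the attribute whose partition clears the most objects."""
    best_attr, best_cleared, best_count = None, {}, 0
    for attr in attributes:
        cleared = cleared_groups(partition(objects, attr), t)
        count = sum(len(m) for m in cleared.values())
        if count > best_count:
            best_attr, best_cleared, best_count = attr, cleared, count
    return best_attr, best_cleared


def cut_plus(objects, attributes, t):
    """Repeat the search on the remaining objects until nothing is cleared."""
    remaining, rules = list(objects), []
    while remaining:
        attr, cleared = best_partition(remaining, attributes, t)
        if not cleared:
            break
        for value, members in cleared.items():
            rules.append((attr, value, len(members)))
        cleared_ids = {id(o) for m in cleared.values() for o in m}
        remaining = [o for o in remaining if id(o) not in cleared_ids]
    return rules, remaining
```

The returned rules form an ordered list: a rule found in a later round applies only to objects not cleared by earlier rounds, which mirrors the "remove cleared objects, then search again" step of the loop.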