SlideShare a Scribd company logo
1 of 14
Download to read offline
Team AAA.
How we solved the Laser Challenge.
Data preprocessing & visualisation
● Clean up data (delete NaNs, Norm …)
● Grouping the data by ID
○ ID 00* - Basic
○ ID 01* - Sheet
○ ID 02* - Part
○ ID 03* - Pin
● Used t-SNE to visualize data clusters
Data insights - t-SNE
First approach - 5 Classificators
● 2 Classes
○ 68600 Good examples
○ 52400 Bad examples
○ Balanced ?
● Train 5 classificators
● Default hyperparameters
● -> Overfitting...
Second approach - Feature Selection
● Feature selection
○ Feature cross correlation & Manual
via histogram
○ 89 -> 16 features
● 0.85 accuracy & still overfitting
●
●
●
overlap separable?
Possible explanation for overfitting
● Many columns
describe one .LST
file
● 607 unique .LST
files
Third approach - Gradient Boosting
● CatBoost (Gradient boosting
on Decision Trees)
● BUT the same .LST file must
not be in test and train same
time
● Accuracy only 61%, but no
overfitting to .LST files!
Most relevant features
Results
● Specificity > 98% (Minimized False Positive Rate)
● Reduced feature set from 100 to 16
● Tunable decisions by adjusting features in our selected feature set
● Simple model interpretations through reduced feature set
In a nutshell:
If model predicts 1 (success), one can be almost sure it’s true
→ Potential usage: choose best parameter set from a list of possible
process parameters, almost 98% success chance
Experimental approach - Genetic Algorithm
● Use GA for feature selection
● Evaluated with logistic regression
● Own implementation - Recombine top 1/2/3 specimen of each
generation semi-randomly for each new feature vector
● Progress from 50% accuracy to 58% within less than two hours
● training still in progress
One step
further
In a nutshell:
If model predicts 1 (success), one can be almost
sure it’s true
→ Potential usage: Choose best parameter set
from a list of possible process parameters,
almost 98% success chance
Data insights - Feature selection
● Threshold feature cross correlation &
Manual via histogram
● 89 -> 15 features
overlap separable?
The challenge
● Goal: Problems with the part removal should be avoided
○ Binary decision problem
● No geometry of the parts
○ 89 derived characteristics from parts, pins, laser cutter ...
● CatBoost allows direct

More Related Content

Similar to Team AAA Pitch - Deeptech AI Hackathon

Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsScott Clark
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsSigOpt
 
Dimensionality Reduction in Machine Learning
Dimensionality Reduction in Machine LearningDimensionality Reduction in Machine Learning
Dimensionality Reduction in Machine LearningRomiRoy4
 
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC ...
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC                           ...Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC                           ...
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC ...PATHALAMRAJESH
 
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Md. Main Uddin Rony
 
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016MLconf
 
MLConf 2016 SigOpt Talk by Scott Clark
MLConf 2016 SigOpt Talk by Scott ClarkMLConf 2016 SigOpt Talk by Scott Clark
MLConf 2016 SigOpt Talk by Scott ClarkSigOpt
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality ReductionSaad Elbeleidy
 
Experimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles BakerExperimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles BakerDatabricks
 
DC02. Interpretation of predictions
DC02. Interpretation of predictionsDC02. Interpretation of predictions
DC02. Interpretation of predictionsAnton Kulesh
 
30thSep2014
30thSep201430thSep2014
30thSep2014Mia liu
 
House Sale Price Prediction
House Sale Price PredictionHouse Sale Price Prediction
House Sale Price Predictionsriram30691
 
Build Deep Learning model to identify santader bank's dissatisfied customers
Build Deep Learning model to identify santader bank's dissatisfied customersBuild Deep Learning model to identify santader bank's dissatisfied customers
Build Deep Learning model to identify santader bank's dissatisfied customerssriram30691
 
NUS-ISS Learning Day 2018- Application of analytics in manufacturing sector
NUS-ISS Learning Day 2018- Application of analytics in manufacturing sectorNUS-ISS Learning Day 2018- Application of analytics in manufacturing sector
NUS-ISS Learning Day 2018- Application of analytics in manufacturing sectorNUS-ISS
 
Final Presentation.pptx
Final Presentation.pptxFinal Presentation.pptx
Final Presentation.pptxMarkBauer47
 
Week 12 Dimensionality Reduction Bagian 1
Week 12 Dimensionality Reduction Bagian 1Week 12 Dimensionality Reduction Bagian 1
Week 12 Dimensionality Reduction Bagian 1khairulhuda242
 

Similar to Team AAA Pitch - Deeptech AI Hackathon (20)

Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning Models
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning Models
 
HW03 (1).pdf
HW03 (1).pdfHW03 (1).pdf
HW03 (1).pdf
 
Dimensionality Reduction in Machine Learning
Dimensionality Reduction in Machine LearningDimensionality Reduction in Machine Learning
Dimensionality Reduction in Machine Learning
 
MySQL performance tuning
MySQL performance tuningMySQL performance tuning
MySQL performance tuning
 
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC ...
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC                           ...Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC                           ...
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC ...
 
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
 
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016
 
MLConf 2016 SigOpt Talk by Scott Clark
MLConf 2016 SigOpt Talk by Scott ClarkMLConf 2016 SigOpt Talk by Scott Clark
MLConf 2016 SigOpt Talk by Scott Clark
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality Reduction
 
Experimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles BakerExperimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles Baker
 
DC02. Interpretation of predictions
DC02. Interpretation of predictionsDC02. Interpretation of predictions
DC02. Interpretation of predictions
 
30thSep2014
30thSep201430thSep2014
30thSep2014
 
House Sale Price Prediction
House Sale Price PredictionHouse Sale Price Prediction
House Sale Price Prediction
 
Build Deep Learning model to identify santader bank's dissatisfied customers
Build Deep Learning model to identify santader bank's dissatisfied customersBuild Deep Learning model to identify santader bank's dissatisfied customers
Build Deep Learning model to identify santader bank's dissatisfied customers
 
NUS-ISS Learning Day 2018- Application of analytics in manufacturing sector
NUS-ISS Learning Day 2018- Application of analytics in manufacturing sectorNUS-ISS Learning Day 2018- Application of analytics in manufacturing sector
NUS-ISS Learning Day 2018- Application of analytics in manufacturing sector
 
eam2
eam2eam2
eam2
 
Module-4_Part-II.pptx
Module-4_Part-II.pptxModule-4_Part-II.pptx
Module-4_Part-II.pptx
 
Final Presentation.pptx
Final Presentation.pptxFinal Presentation.pptx
Final Presentation.pptx
 
Week 12 Dimensionality Reduction Bagian 1
Week 12 Dimensionality Reduction Bagian 1Week 12 Dimensionality Reduction Bagian 1
Week 12 Dimensionality Reduction Bagian 1
 

Recently uploaded

APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 

Recently uploaded (20)

APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 

Team AAA Pitch - Deeptech AI Hackathon

  • 1. Team AAA. How we solved the Laser Challenge.
  • 2. Data preprocessing & visualisation ● Clean up data (delete NaNs, Norm …) ● Grouping the data by ID ○ ID 00* - Basic ○ ID 01* - Sheet ○ ID 02* - Part ○ ID 03* - Pin ● Used t-SNE to visualize data clusters
  • 4. First approach - 5 Classificators ● 2 Classes ○ 68600 Good examples ○ 52400 Bad examples ○ Balanced ? ● Train 5 classificators ● Default hyperparameters ● -> Overfitting...
  • 5. Second approach - Feature Selection ● Feature selection ○ Feature cross correlation & Manual via histogram ○ 89 -> 16 features ● 0.85 accuracy & still overfitting ● ● ● overlap separable?
  • 6. Possible explanation for overfitting ● Many columns describe one .LST file ● 607 unique .LST files
  • 7. Third approach - Gradient Boosting ● CatBoost (Gradient boosting on Decision Trees) ● BUT the same .LST file must not be in test and train same time ● Accuracy only 61%, but no overfitting to .LST files! Most relevant features
  • 8. Results ● Specificity > 98% (Minimized False Positive Rate) ● Reduced feature set from 100 to 16 ● Tunable decisions by adjusting features in our selected feature set ● Simple model interpretations through reduced feature set In a nutshell: If model predicts 1 (success), one can be almost sure it’s true → Potential usage: choose best parameter set from a list of possible process parameters, almost 98% success chance
  • 9. Experimental approach - Genetic Algorithm ● Use GA for feature selection ● Evaluated with logistic regression ● Own implementation - Recombine top 1/2/3 specimen of each generation semi-randomly for each new feature vector ● Progress from 50% accuracy to 58% within less than two hours ● training still in progress
  • 11. In a nutshell: If model predicts 1 (success), one can be almost sure it’s true → Potential usage: Choose best parameter set from a list of possible process parameters, almost 98% success chance
  • 12. Data insights - Feature selection ● Threshold feature cross correlation & Manual via histogram ● 89 -> 15 features overlap separable?
  • 13. The challenge ● Goal: Problems with the part removal should be avoided ○ Binary decision problem ● No geometry of the parts ○ 89 derived characteristics from parts, pins, laser cutter ...