Ericsson Internal | 2018-02-21
Practical challenges in ML workflows
Some tips from working with Ericsson data sets and use cases.
Steven Rochefort SA OSS PDU OSS S&T BX 2019-03-04
Bio
— Steve Rochefort is a hands-on data scientist and team leader, and an experienced telecom professional.
— An academic background in mathematics and marketing research has been applied in various ways over a 30+ year career in industry:
    — Software & system development for mission-critical systems
    — Innovation / advanced R&D
    — Machine Learning / Data Science / AI
— Most recently developing innovative use cases for telecom data sets at Ericsson (3+ years)
Process of Data Science
Data science process flowchart from "Doing Data Science", Cathy O'Neil and Rachel Schutt, 2013
From <https://en.wikipedia.org/wiki/Data_analysis#/media/File:Data_visualization_process_v1.png>
A Process of Data Science at Ericsson
Data Science Effort Allocation
1. Develop the use case (UC): 5%
2. Gain access to sample data, validate the UC's potential through data exploration, and deliver a report: 70%
3. Develop an ML model and test it: 15%
4. Deploy the model in the context of solving the UC and/or deliver a dashboard / visualization for the UC: 10%
A complete cycle is 2-3 people working 4-5 months
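As a rough worked example, the effort split can be turned into person-months. This is only a sketch: it assumes the four percentages on the slide map to the four steps in the order listed, and the step labels and function names are made up for illustration.

```python
# Effort share per step in percent, assuming the slide's percentages
# follow the order of the numbered steps.
EFFORT_SHARE_PCT = {
    "1. develop use case": 5,
    "2. data access, exploration, report": 70,
    "3. develop and test ML model": 15,
    "4. deploy model / dashboard": 10,
}


def person_months(people, months):
    """Total effort of a cycle in person-months."""
    return people * months


def split_effort(total_pm):
    """Distribute a total effort over the steps according to the shares."""
    return {step: total_pm * pct / 100 for step, pct in EFFORT_SHARE_PCT.items()}


low = split_effort(person_months(2, 4))    # smallest cycle named on the slide
high = split_effort(person_months(3, 5))   # largest cycle named on the slide

for step in EFFORT_SHARE_PCT:
    print(f"{step}: {low[step]:.1f} to {high[step]:.1f} person-months")
```

Under these assumptions a cycle is 8 to 15 person-months, and most of it goes into getting and exploring the data rather than into modelling.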
Tip #1: Correlation is not causation
— ML is about finding useful correlations between data (features) and results (labels)
— Correlation is very useful for making predictions and inferences, but only as long as the correlations persist
— If you want causation, find an expert to make you some rules
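A small illustration of "only as long as the correlations persist". The numbers and variable names below are invented for the example (a load feature and an alarm count, before and after some hypothetical change in the network); the point is only that the same feature can correlate strongly with the label in one period and weakly in the next.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical data, made up for illustration only.
load = [10, 20, 30, 40, 50, 60]
alarms_before = [1, 2, 2, 4, 5, 6]   # period the model was built on
alarms_after = [3, 3, 4, 3, 3, 4]    # later period, relationship has faded

r_before = pearson(load, alarms_before)
r_after = pearson(load, alarms_after)

print(f"correlation before: {r_before:.2f}")
print(f"correlation after:  {r_after:.2f}")
```

A model that leaned on the first correlation would keep producing predictions in the second period, but they would no longer mean much.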
Tip #2: Look at the data … a lot
— Find out what is in the data
    — Relationships
    — Patterns
— Find out how the data relates to the label
— Then … start building models
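One possible first look, sketched with the standard library only: summarize a feature overall and per label before any model is built. The rows are invented for illustration, and the helper names are not from the talk.

```python
from statistics import mean, median

# Hypothetical rows: (feature value, label). Values are made up.
rows = [(12.0, 0), (15.0, 0), (11.0, 0), (14.0, 0),
        (30.0, 1), (28.0, 1), (95.0, 0)]


def summarize(values):
    """Basic summary statistics for one list of numbers."""
    return {
        "n": len(values),
        "min": min(values),
        "median": median(values),
        "mean": mean(values),
        "max": max(values),
    }


def by_label(rows):
    """Group feature values by label and summarize each group."""
    groups = {}
    for value, label in rows:
        groups.setdefault(label, []).append(value)
    return {label: summarize(vals) for label, vals in sorted(groups.items())}


overall = summarize([value for value, _ in rows])
per_label = by_label(rows)

print("overall:", overall)
for label, stats in per_label.items():
    print(f"label {label}:", stats)
```

In this toy data the gap between mean and median for label 0 points at a single extreme value, which is the kind of thing worth finding before modelling.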
Tip #3: Sample the data
You can drink the ocean … but only cup by cup
— More data is better than less data
— All the data is never a good thing
— Sample the data statistically to make inferences and build models about the whole data set
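A minimal sketch of the idea, assuming a simple random sample is appropriate for the data at hand. The "whole data" here is synthetic (generated from a normal distribution with a fixed seed), purely to show that a small sample can estimate a property of the full set.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical "whole data": 100,000 synthetic measurements.
population = [rng.gauss(50.0, 10.0) for _ in range(100_000)]

# Simple random sample without replacement: one cup from the ocean.
sample = rng.sample(population, 1_000)

population_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

With 1% of the rows the estimate of the mean lands close to the full-data value. Real data may need stratified or time-aware sampling instead, which this sketch does not cover.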
Tip #4: If you don’t have data, make your own
— Simulation
    — Create realistic data from physical models
— Alteration
    — Take existing data and overlay a pattern or experience
— Simplification
    — Create a parallel way to generate data
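The first two approaches can be sketched as follows. Everything here is an assumption for illustration: the "physical model" is a plain daily sine cycle plus noise, the overlaid pattern is a multiplicative spike, and the function names and parameters are hypothetical. Simplification is not shown.

```python
import math
import random


def simulate_traffic(hours, rng, base=100.0, swing=40.0, noise=5.0):
    """Simulation: hourly traffic from a simple daily cycle plus noise."""
    return [
        base
        + swing * math.sin(2 * math.pi * (hour % 24) / 24)
        + rng.gauss(0.0, noise)
        for hour in range(hours)
    ]


def overlay_spike(series, start, length, factor):
    """Alteration: copy a series and scale one window to mimic an event."""
    altered = list(series)
    for i in range(start, min(start + length, len(altered))):
        altered[i] *= factor
    return altered


rng = random.Random(7)  # fixed seed so the sketch is repeatable
simulated = simulate_traffic(48, rng)
with_event = overlay_spike(simulated, start=30, length=3, factor=2.0)

print("hour 30 before:", round(simulated[30], 1))
print("hour 30 after: ", round(with_event[30], 1))
```

Because the event is injected deliberately, its position is known, which gives labeled data for testing a detector.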
Tip #5: Validate your results
Even if everything adds up, it can still be wrong
— It is hard not to get a result after running an ML algorithm or a Python/R script (GIGO: garbage in, garbage out)
— Validate the result against your common sense to ensure it is a correct answer
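One way to put a little common sense into code is a plausibility check on the output. This is a sketch only: the predictions and the bounds (a percentage that must lie between 0 and 100) are invented, and a range check is just one of many possible sanity checks.

```python
def sanity_check(predictions, lower, upper):
    """Return (index, value) pairs that fall outside a plausible range.

    NaN values fail the range comparison and are flagged as well.
    """
    return [
        (i, value)
        for i, value in enumerate(predictions)
        if not (lower <= value <= upper)
    ]


# Hypothetical model output: predicted utilization in percent.
predicted_utilization = [41.2, 57.9, 103.4, 62.0, -3.1]

suspicious = sanity_check(predicted_utilization, lower=0.0, upper=100.0)
print("suspicious predictions:", suspicious)
```

The script ran and produced five numbers, yet two of them cannot be right, which is exactly the situation the tip warns about.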
Tip #6: Most algorithms work pretty well out of the box
— Try your data against many different algorithms and see which ones work better
— Tuning hyperparameters usually gives a small improvement, but not a large one
— Choose the simplest algorithm that works best
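One reading of "choose the simplest algorithm that works best" is: among the candidates that score about as well as the best one, take the least complex. The sketch below assumes that reading. The candidate names, scores, complexity ranks, and the tolerance are all hypothetical values chosen for illustration, not results from the talk.

```python
# Hypothetical validation scores (higher is better). "complexity" is a
# rough manual ranking, not a measured quantity.
candidates = [
    {"name": "mean baseline", "complexity": 0, "score": 0.61},
    {"name": "linear model", "complexity": 1, "score": 0.84},
    {"name": "random forest", "complexity": 2, "score": 0.85},
    {"name": "deep network", "complexity": 3, "score": 0.85},
]


def simplest_good_enough(candidates, tolerance=0.02):
    """Pick the least complex candidate within `tolerance` of the best score."""
    best_score = max(c["score"] for c in candidates)
    good = [c for c in candidates if best_score - c["score"] <= tolerance]
    return min(good, key=lambda c: c["complexity"])


choice = simplest_good_enough(candidates)
print("chosen:", choice["name"])
```

The tolerance encodes how much score one is willing to trade for simplicity, and it should come from the use case rather than from the algorithm comparison.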
Tip #7: Assess the algorithm's usefulness
— An ML model that improves the overall performance of a UC by 5% could be fantastic … or complete rubbish
— The usefulness of an algorithm depends on how well it solves a real-life problem
Tip #8: Classification: Accuracy is rarely a good measure of performance
— For unbalanced data sets, accuracy ((TP + TN) / #observations) is not very useful
    — For a 99-1 data set split, the algorithm that says everything belongs to the majority class is 99% accurate
— Consider using the F1 score as a measure (link)
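The 99-1 example can be worked through directly. The sketch below uses the standard definitions, accuracy = (TP + TN) / N and F1 = 2·TP / (2·TP + FP + FN), with the minority class treated as the positive class. The function names are illustrative, and returning 0.0 when F1 is undefined is a convention chosen for this sketch.

```python
def confusion(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, fp, fn):
    """F1 of the positive class; 0.0 when there is nothing to score."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# 99 majority-class observations, 1 minority-class observation.
y_true = [0] * 99 + [1]
# A "classifier" that always answers with the majority class.
y_pred = [0] * 100

tp, tn, fp, fn = confusion(y_true, y_pred)
print("accuracy:", accuracy(tp, tn, fp, fn))
print("F1 (minority class):", f1_score(tp, fp, fn))
```

The always-majority classifier reaches 99% accuracy while its F1 score for the minority class is zero, because it never finds the one case that matters.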
Tip #9: Regression: Algorithm performance does not imply use case performance
— Comparing regression algorithms using RMSE, MAE, R², etc. is a good way to see which algorithm performs better than the others
— However, the real measure is how well the algorithm meets the use case's performance expectations
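A toy comparison of the two views. The metric definitions are the standard ones; the data, the two "models", and the use case requirement (every prediction within ±2 of the true value) are invented for illustration.

```python
from math import sqrt


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def share_within(y_true, y_pred, tolerance):
    """Hypothetical use case measure: share of predictions within tolerance."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tolerance)
    return hits / len(y_true)


# Made-up values for illustration.
y_true = [10, 20, 30, 40]
model_a = [12, 18, 32, 38]   # always off by 2
model_b = [10, 20, 30, 43]   # usually exact, once off by 3

for name, pred in (("A", model_a), ("B", model_b)):
    print(
        f"model {name}: MAE={mae(y_true, pred):.2f} "
        f"RMSE={rmse(y_true, pred):.2f} R2={r2(y_true, pred):.3f} "
        f"within +/-2: {share_within(y_true, pred, 2):.0%}"
    )
```

Model B is better on MAE, RMSE, and R², yet only model A satisfies the assumed requirement on every prediction, so the metric ranking and the use case ranking disagree.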
Tip #10: Imperfection can still be useful
— The real world is messy and tends to mess up many a good theory.
— On the other hand, a solution that is not perfect can still be useful.
— Weigh the benefit against the utility obtained, not the original goal.