© 2018 KNIME AG. All Rights Reserved.
From Raw Data to Deployment
Kilian.Thiel@knime.com
Marten.Pfannenschmidt@knime.com
Kathrin.Melcher@knime.com
KNIME
Do you recognize this?
https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining
Let’s unroll it!
It always starts with some data …

Data Preparation
• Data Manipulation
• Data Blending
• Missing Values Handling
• Feature Generation
• Dimensionality Reduction
• Feature Selection
• Outlier Removal
• Normalization
• Partitioning
• …

Model Training
• Model Training
• Bag of Models
• Model Selection
• Ensemble Models
• Own Ensemble Model
• External Models
• Import Existing Models
• Model Factory
• …

Model Optimization
• Parameter Tuning
• Parameter Optimization
• Regularization
• Model Size
• No. of Iterations
• …

Model Evaluation
• Performance Measures
• Accuracy
• ROC Curve
• Cross-Validation
• …

Deployment
• Files & DBs
• Dashboards
• REST API
• SQL Code Export
• Reporting
• …
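
A few of the steps named on this slide (Missing Values Handling, Normalization, Accuracy, Cross-Validation) can be sketched in plain Python. This is only an illustration under stated assumptions, not how the corresponding KNIME nodes are implemented: the function names and the toy values are hypothetical and chosen for this example.

```python
from statistics import mean


def impute_mean(values):
    """Missing Values Handling: replace None with the mean of the observed values.

    Assumes at least one value in the column is not None.
    """
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Normalization: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def accuracy(y_true, y_pred):
    """Performance Measures: fraction of predictions that match the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def k_fold_indices(n_rows, k):
    """Cross-Validation: yield (train_idx, test_idx) pairs, one per fold."""
    folds = [list(range(i, n_rows, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


if __name__ == "__main__":
    column = [10.0, None, 30.0, 20.0]  # hypothetical feature column
    print(min_max_normalize(impute_mean(column)))
    print(accuracy(["delay", "no delay"], ["delay", "delay"]))
    for train_idx, test_idx in k_fold_indices(6, 3):
        print(train_idx, test_idx)
```

Each row index appears in exactly one test fold, so every observation is used for evaluation once and for training in the remaining folds.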
The many Lives of a Dataset
Data Preparation → Model Training → Model Optimization → Model Evaluation → Deployment
Partitioning:
• Training Set
• Validation Set
• Test Set
[Diagram labels: Original Data Set with Past Observations; Training Set; Validation Set; Test Set; New Data from Real World Applications]
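
The partitioning into training, validation, and test sets can be sketched as follows. This is a minimal illustration, not the behavior of any particular KNIME node: the function name `partition`, the 60/20/20 fractions, and the fixed seed are assumptions made for the example.

```python
import random


def partition(rows, train_frac=0.6, val_frac=0.2, seed=42):
    """Shuffle rows and split them into training, validation, and test sets.

    The fractions are illustrative assumptions; whatever remains after the
    training and validation shares goes to the test set.
    """
    if not 0 < train_frac + val_frac < 1:
        raise ValueError("train_frac + val_frac must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a reproducible split
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    train, val, test = partition(range(10))
    print(len(train), len(val), len(test))
```

The three sets are disjoint, so rows used for training never reappear in the validation or test sets.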
Data Exploration
• Sometimes there is a Data Exploration phase between Data Access and Data Preparation
• The Data Exploration phase is useful for getting to know the data
• KNIME offers a few visualization nodes to build dashboards to explore the data
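
Outside of a dashboard, a first look at the data can be as simple as per-column summary statistics. The sketch below is an illustration only: the function name `summarize` and the sample rows are hypothetical, and it assumes rows are dictionaries with `None` marking a missing value.

```python
from statistics import mean


def summarize(rows, column):
    """Return count, missing count, min, max, and mean for one numeric column.

    Assumes the column has at least one non-missing value.
    """
    values = [row.get(column) for row in rows]
    observed = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(observed),
        "min": min(observed),
        "max": max(observed),
        "mean": mean(observed),
    }


if __name__ == "__main__":
    # Hypothetical sample rows; None marks a missing value.
    sample = [{"depdelay": 5}, {"depdelay": None}, {"depdelay": 25}]
    print(summarize(sample, "depdelay"))
```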
One Example for Every Need
The KNIME EXAMPLES Server
50_Applications/27_FromRawDataToDeployment
Classification Problem & Data Set
• Airline Dataset: http://stat-computing.org/dataexpo/2009/the-data.html
• Smaller dataset (Jan 2007) (AirlineDataset.table)
• Challenge: Predict Departure Delays
  • If working on the original airline dataset, use only flights from airport ORD
  • Output class = “delay” if depdelay > 15 min, otherwise “no delay”
  • Input features: everything that is available, and more if you can find it!
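
The output class defined in the challenge can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's actual implementation: the dictionary keys `origin`, `depdelay`, and `delay_class`, and the choice to skip rows with a missing delay, are assumptions made for the example.

```python
DELAY_THRESHOLD_MIN = 15  # from the challenge: "delay" if depdelay > 15 min


def label_flight(dep_delay_min):
    """Map a departure delay in minutes to the output class."""
    return "delay" if dep_delay_min > DELAY_THRESHOLD_MIN else "no delay"


def prepare(flights, origin="ORD"):
    """Keep flights from the given origin airport and attach the output class."""
    labeled = []
    for flight in flights:
        if flight["origin"] != origin:
            continue
        if flight["depdelay"] is None:
            continue  # assumption: rows without a recorded delay are skipped
        labeled.append({**flight, "delay_class": label_flight(flight["depdelay"])})
    return labeled


if __name__ == "__main__":
    # Hypothetical sample rows.
    flights = [
        {"origin": "ORD", "depdelay": 42},
        {"origin": "ORD", "depdelay": 15},
        {"origin": "JFK", "depdelay": 90},
        {"origin": "ORD", "depdelay": None},
    ]
    for row in prepare(flights):
        print(row)
```

Note that a delay of exactly 15 minutes falls into “no delay”, because the condition is strictly greater than 15.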
Challenges
• Group 1. Data Access and Data Preparation
• Group 2. ML Model Training
• Group 3. Model Deployment
• Import file Learnathon_2018.knar into your workspace
Group 1. Data Access and Data Preparation
Group 2. Model Training & Optimization
Group 3. Deployment
KNIME Spring Summit 2018
March 5–9 at Hotel Berlin, Berlin, Germany
• Monday & Tuesday: One-day courses
• Wednesday & Thursday: Summit sessions
• Friday: Workshops
Use the code LEARNATHON for 10% off tickets!
Register at www.KNIME.com
KNIME Beginner’s Luck Book
Free copy of KNIME Beginner’s Luck Book at KNIME Press
https://www.knime.org/knimepress
Promotion Code: KNIME_Learnathon_2018
You can find KNIMers here!
• KNIME (www.knime.org)
• BLOG for news, tips and tricks (www.knime.org/blog)
• FORUM for questions and answers (tech.knime.org/forum)
• EXAMPLE SERVER for example workflows
• LEARNING HUB (www.knime.org/learning-hub)
• KNIME TV channel on
• KNIME on @KNIME
• KNIME on https://www.facebook.com/KNIMEanalytics
• KNIME User Group UK on https://www.meetup.com/KNIME-User-Group-UK/
© 2017 KNIME AG. All Rights Reserved.
The KNIME® trademark and logo and OPEN FOR INNOVATION® trademark are used by KNIME.com AG under license from KNIME GmbH, and are registered in the United States. KNIME® is also registered in Germany.
Thank You!

Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 

From raw data to deployment

  • 1. © 2018 KNIME AG. All Rights Reserved. From Raw Data to Deployment. Kilian.Thiel@knime.com, Marten.Pfannenschmidt@knime.com, Kathrin.Melcher@knime.com. KNIME
  • 2. Do you recognize this? https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining
  • 3. Let’s unroll it! It always starts with some data … Stages: Data Preparation, Model Training, Model Optimization, Model Evaluation, Deployment. Topics per stage: Data Manipulation, Data Blending, Missing Values Handling, Feature Generation, Dimensionality Reduction, Feature Selection, Outlier Removal, Normalization, Partitioning, …; Model Training, Bag of Models, Model Selection, Ensemble Models, Own Ensemble Model, External Models, Import Existing Models, Model Factory, …; Parameter Tuning, Parameter Optimization, Regularization, Model Size, No Iterations, …; Performance Measures, Accuracy, ROC Curve, Cross-Validation, …; Files & DBs, Dashboards, REST API, SQL Code Export, Reporting, …
  • 4. The many Lives of a Dataset. Stages: Data Preparation, Model Training, Model Optimization, Model Evaluation, Deployment. Partitioning: • Training Set • Validation Set • Test Set. Data sets: Original Data Set with Past Observations; Training Set; Validation Set; Test Set; New Data from Real World Applications.
  • 5. Data Exploration • Sometimes in between Data Access and Data Preparation there is a Data Exploration phase • The Data Exploration phase is useful to get to know the data • KNIME offers a few visualization nodes to build dashboards to explore the data
  • 6. One Example for Every Need: The KNIME EXAMPLES Server. 50_Applications/27_FromRawDataToDeployment
  • 7. Classification Problem & Data Set • Airline Dataset: http://stat-computing.org/dataexpo/2009/the-data.html • Smaller dataset (Jan 2007) (AirlineDataset.table) • Challenge: Predict departure delays. If working on the original airline dataset, use only flights from airport ORD. Output class = “delay” if depdelay > 15min, otherwise “no delay”. Input features: everything that is available, and more if you can find it!
  • 8. Challenges • Group 1. Data Access and Data Preparation • Group 2. ML Model Training • Group 3. Model Deployment • Import file Learnathon_2018.knar into your workspace
  • 9. Group 1. Data Access and Data Preparation
  • 10. Group 2. Model Training & Optimization
  • 11. Group 3. Deployment
  • 12. KNIME Spring Summit 2018: March 5 – 9 at Hotel Berlin, Berlin in Germany • Monday & Tuesday: One-day courses • Wednesday & Thursday: Summit sessions • Friday: Workshops. Use the code LEARNATHON for 10% off tickets! Register at www.KNIME.com
  • 13. KNIME Beginner’s Luck Book: Free copy of KNIME Beginner’s Luck Book at KNIME Press, https://www.knime.org/knimepress. Promotion Code: KNIME_Learnathon_2018
  • 14. You can find KNIMers here! • KNIME (www.knime.org) • BLOG for news, tips and tricks (www.knime.org/blog) • FORUM for questions and answers (tech.knime.org/forum) • EXAMPLE SERVER for example workflows • LEARNING HUB (www.knime.org/learning-hub) • KNIME TV channel on • KNIME on @KNIME • KNIME on https://www.facebook.com/KNIMEanalytics • KNIME User Group UK on https://www.meetup.com/KNIME-User-Group-UK/
  • 15. © 2017 KNIME AG. All Rights Reserved. The KNIME® trademark and logo and OPEN FOR INNOVATION® trademark are used by KNIME.com AG under license from KNIME GmbH, and are registered in the United States. KNIME® is also registered in Germany. Thank You!
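
Slide 7 defines the output class with a simple threshold rule, and slide 4 splits the original data into training, validation and test sets. The two steps might be sketched in plain Python as follows. This is only an illustration, not the KNIME workflow itself: the dictionary keys `origin` and `depdelay`, the helper names, the 60/20/20 split, the fixed seed and the sample flights are all assumptions made for the example.

```python
import random

DELAY_THRESHOLD_MIN = 15  # slide 7: "delay" if depdelay > 15min


def label_delay(dep_delay_minutes):
    """Return the output class for a single flight (rule from slide 7)."""
    return "delay" if dep_delay_minutes > DELAY_THRESHOLD_MIN else "no delay"


def partition(rows, train_pct=60, valid_pct=20, seed=0):
    """Shuffle rows and split them into training, validation and test sets.

    The 60/20/20 proportions are an assumption; the slides do not give any.
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = len(rows) * train_pct // 100
    n_valid = len(rows) * valid_pct // 100
    train = rows[:n_train]
    valid = rows[n_train:n_train + n_valid]
    test = rows[n_train + n_valid:]
    return train, valid, test


# Made-up sample flights, for illustration only.
flights = [{"origin": "ORD", "depdelay": d}
           for d in (0, 5, 15, 16, 42, -3, 30, 10, 90, 12)]
flights.append({"origin": "JFK", "depdelay": 60})

# Keep only flights from ORD, then attach the class label.
labeled = [dict(f, label=label_delay(f["depdelay"]))
           for f in flights if f["origin"] == "ORD"]

train, valid, test = partition(labeled)
print(len(train), len(valid), len(test))  # 6 2 2
```

Note that a departure delay of exactly 15 minutes falls into the "no delay" class, because the rule on the slide uses a strict inequality.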