Copyright © 2015 KNIME.com AG
Big Data Science is just a
Click Away!
Rosaria Silipo
KNIME.com
Variety, Volume, Velocity
Variety:
• integrating heterogeneous data (and tools)
Volume:
• from small files...
• ...to distributed data repositories (Hadoop)
• bring the tools to the data
Velocity:
• from distributing heavy computations...
• ...to real-time scoring of millions of records/sec.
Every Minute…
IoT
The Challenge
Energy Usage Prediction from Smart Meters Data
• Read Smart Meter Energy Data (176 million rows)
• Clean Up and Aggregate total Energy Usage by hour,
week, day, month, year
• Calculate Behavioral Measures for each Smart Meter
• Cluster Smart Meters with Similar Behavior (k-Means)
• Predict Energy Usage in Clustered Smart Meters
(Auto-Regressive Time Series Prediction)
Workflow 1
Workflow 2
Workflow 3
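The last step, auto-regressive prediction, can be illustrated with a minimal sketch. It assumes the simplest possible model, AR(1), fitted by ordinary least squares on an hourly usage series. The function names `fit_ar1` and `forecast_ar1` and the sample values are hypothetical and not taken from the KNIME workflows, which may use more lags or a different fitting method.

```python
from typing import List, Tuple


def fit_ar1(series: List[float]) -> Tuple[float, float]:
    """Fit x[t] = intercept + phi * x[t-1] by ordinary least squares."""
    if len(series) < 3:
        raise ValueError("need at least three observations")
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    if var_prev == 0:
        raise ValueError("constant series: slope cannot be estimated")
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    phi = cov / var_prev
    intercept = mean_curr - phi * mean_prev
    return intercept, phi


def forecast_ar1(series: List[float], steps: int) -> List[float]:
    """Predict the next `steps` values by iterating the fitted model."""
    intercept, phi = fit_ar1(series)
    predictions = []
    last = series[-1]
    for _ in range(steps):
        last = intercept + phi * last
        predictions.append(last)
    return predictions


# Hypothetical hourly energy usage (kWh) for one cluster of smart meters.
hourly_usage = [10.0, 7.0, 5.5, 4.75, 4.375]
print(forecast_ar1(hourly_usage, steps=2))
```

Fitting one model per cluster rather than per meter keeps the number of models small, which is presumably why clustering precedes prediction in the list above.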
Workflow 1: PrepareData
~ 2 days
Big Data
Big Data Support
• KNIME Big Data Access Nodes
– preconfigured connectors
– in-database processing
• Big Data Platforms
– HDFS, Hive, Impala, HP Vertica, Hortonworks, ParStream,
Actian, any big data platform really!
• Spark MLlib integration (coming soon)
• Streaming Executor (coming soon)
Hadoop Sandboxes
• Hortonworks:
http://hortonworks.com/products/hortonworks-sandbox/
• Cloudera:
http://www.cloudera.com/content/cloudera/en/downloads/quickstart_vms.html
• Virtual Box
https://www.virtualbox.org/
• VMWare Player
http://www.vmware.com/
… as easy as 1, 2, 3, … 4
1. Access Big Data
2. Select Table
3. In-DB Processing
4. Into KNIME
1. Database Connector
Generic Database Connector
– Can connect to any JDBC source
– Register new JDBC driver via
preferences page
1. Register JDBC Driver
Open KNIME and go to File -> Preferences
Increase connection timeout for long-running retrieval operations
1. Dedicated Connectors
Dedicated pre-configured connectors
– Bundling necessary JDBC drivers
– Easy to use
– DB specific behavior/capability
Some dedicated connectors are part of
the open source KNIME Analytics
Platform, some belong to the
commercial KNIME Big Data Extension
Works for most Hadoop Hive installations, including Hortonworks (free)
2. Data Table Selection
3. In-Database Processing
• Filter rows and columns
• Join tables/queries
• Sort your data
• Write your own query
• Aggregate* your data
Similar settings as the GroupBy node
Similar settings as the Joiner node
* Database GroupBy node exposes DB specific aggregation methods
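One way to picture in-database processing is that each step wraps the previous query in a new SQL statement, so that only the final result leaves the database. The sketch below assumes this wrapping scheme for illustration only; it is not a description of how the KNIME nodes are implemented. The helper names, table names, and sample rows are hypothetical, and SQLite stands in for the big data platform.

```python
import sqlite3

# Note: queries are built by string formatting for illustration only;
# never do this with untrusted input.


def filter_rows(query: str, condition: str) -> str:
    """Keep only rows of `query` that satisfy `condition`."""
    return f"SELECT * FROM ({query}) AS t WHERE {condition}"


def join_using(left: str, right: str, key: str) -> str:
    """Inner-join two queries on a shared column."""
    return f"SELECT * FROM ({left}) AS l JOIN ({right}) AS r USING ({key})"


def group_by(query: str, key: str, aggregate: str) -> str:
    """Aggregate `query` per value of `key`."""
    return f"SELECT {key}, {aggregate} FROM ({query}) AS t GROUP BY {key}"


def sort_by(query: str, column: str) -> str:
    """Sort the rows of `query` by `column`."""
    return f"SELECT * FROM ({query}) AS t ORDER BY {column}"


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (meter_id TEXT, kwh REAL)")
    conn.execute("CREATE TABLE meters (meter_id TEXT, region TEXT)")
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?)",
        [("m1", 1.0), ("m1", 3.0), ("m2", 5.0), ("m2", -1.0)],
    )
    conn.executemany(
        "INSERT INTO meters VALUES (?, ?)",
        [("m1", "north"), ("m2", "south")],
    )
    return conn


# Build the query step by step; nothing is executed until the end.
query = "SELECT * FROM readings"
query = filter_rows(query, "kwh > 0")
query = join_using(query, "SELECT * FROM meters", "meter_id")
query = group_by(query, "region", "AVG(kwh) AS avg_kwh")
query = sort_by(query, "region")

conn = build_demo_db()
rows = conn.execute(query).fetchall()
print(rows)
```

Because every step only edits the SQL text, the heavy lifting stays in the database and the client receives one small aggregated table.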
3. Queries for average Measures
3. Average Monthly Values
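The screenshot for this slide is not preserved, so the following is only a sketch of what a monthly-average query could look like. The table `meter_readings`, its columns, and the sample rows are hypothetical, and SQLite's `strftime` is used for the month extraction; Hive and Impala provide their own date functions, so the exact syntax would differ there.

```python
import sqlite3

# Average usage per meter and calendar month, computed inside the database.
MONTHLY_AVERAGE_SQL = """
    SELECT meter_id,
           strftime('%Y-%m', reading_time) AS month,
           AVG(kwh) AS avg_kwh
    FROM meter_readings
    GROUP BY meter_id, strftime('%Y-%m', reading_time)
    ORDER BY meter_id, month
"""


def monthly_averages(conn: sqlite3.Connection) -> list:
    """Run the aggregation and return (meter_id, month, avg_kwh) rows."""
    return conn.execute(MONTHLY_AVERAGE_SQL).fetchall()


# Hypothetical sample data standing in for the smart meter table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE meter_readings (meter_id TEXT, reading_time TEXT, kwh REAL)"
)
conn.executemany(
    "INSERT INTO meter_readings VALUES (?, ?, ?)",
    [
        ("m1", "2010-01-01 00:00:00", 2.0),
        ("m1", "2010-01-15 12:00:00", 4.0),
        ("m1", "2010-02-01 00:00:00", 6.0),
        ("m2", "2010-01-03 08:00:00", 1.0),
    ],
)
print(monthly_averages(conn))
```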
4. Import Data from Database
< 30 min
New Big Data Platform?
No problem!
Just change the connector node!
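The same idea can be sketched in code: if the processing logic only relies on a generic connection interface, switching platforms means swapping the function that opens the connection. The example below assumes Python's DB-API 2.0 cursor interface as that generic layer, with SQLite as a stand-in platform; `sqlite_connector`, `total_usage_per_meter`, and the sample table are hypothetical names for illustration.

```python
import sqlite3
from typing import Any, Callable, List, Tuple


def sqlite_connector() -> Any:
    """Stand-in connector: an in-memory SQLite database with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE meter_readings (meter_id TEXT, kwh REAL)")
    conn.executemany(
        "INSERT INTO meter_readings VALUES (?, ?)",
        [("m1", 1.0), ("m1", 2.0), ("m2", 4.0)],
    )
    conn.commit()
    return conn


def total_usage_per_meter(connector: Callable[[], Any]) -> List[Tuple[str, float]]:
    """Steps 2 to 4: select the table, aggregate in the database, fetch the result.

    Only `connector` knows which platform is used; everything else relies on
    the generic DB-API calls cursor(), execute(), and fetchall().
    """
    conn = connector()
    try:
        cursor = conn.cursor()
        cursor.execute(
            "SELECT meter_id, SUM(kwh) FROM meter_readings "
            "GROUP BY meter_id ORDER BY meter_id"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Swapping the platform would mean passing a different connector function.
print(total_usage_per_meter(sqlite_connector))
```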
Other Useful Database Nodes
• Drop table
– missing table handling
– cascade option
• Execute any SQL
statement
• Manipulate existing
queries
Executes several queries separated by ; and a new line
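A minimal sketch of the "several queries separated by ; and a new line" convention, combined with a drop that tolerates a missing table. It assumes a naive split on a semicolon followed by a line break, which would break on string literals containing that sequence. `split_statements`, `run_script`, and the table `tmp_usage` are hypothetical names, and SQLite is used as a stand-in (it has no CASCADE option, so that part is not shown).

```python
import sqlite3
from typing import List


def split_statements(script: str) -> List[str]:
    """Split a script into statements at each ';' followed by a new line."""
    statements = []
    for part in script.split(";\n"):
        statement = part.strip().rstrip(";").strip()
        if statement:  # skip empty fragments, e.g. after the last separator
            statements.append(statement)
    return statements


def run_script(conn: sqlite3.Connection, script: str) -> int:
    """Execute every statement in order and return how many were run."""
    statements = split_statements(script)
    for statement in statements:
        conn.execute(statement)
    conn.commit()
    return len(statements)


SCRIPT = (
    "DROP TABLE IF EXISTS tmp_usage;\n"  # no error if the table is missing
    "CREATE TABLE tmp_usage (meter_id TEXT, kwh REAL);\n"
    "INSERT INTO tmp_usage VALUES ('m1', 1.5);\n"
)

conn = sqlite3.connect(":memory:")
executed = run_script(conn, SCRIPT)
print(executed)
```

Because the script starts with `DROP TABLE IF EXISTS`, it can be run repeatedly without failing on the first or any later run.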
KNIME Big Data Extension
• KNIME Big Data Access Nodes
– preconfigured connectors
– HDFS File Handling
– Hive/Impala Loader
• Big Data Platforms
– HDFS, Hive, Impala, HP Vertica, Hortonworks, ParStream,
Actian, SAP Hana (to be), …
• Spark MLlib integration (coming soon)
• Streaming Executor (coming soon)
HDFS File Handling
• KNIME & Extensions ->
KNIME File Handling Nodes
• HDFS Connection and
HDFS File Permission nodes
Hive/Impala Loader
• Upload a KNIME data table to Hive/Impala
KNIME Big Data Extension: Download and Install
KNIME.com Extension Store
License Required!
Installation Instructions
http://tech.knime.org/installation-instructions
Product Description
http://www.knime.org/knime-big-data-extension
License on KNIME Store
http://tech.knime.org/knime-store
30-day trial license available with special Promotion Code
education@knime.com
References
• Whitepaper “KNIME opens the Doors to Big Data”
http://www.knime.org/files/big_data_in_knime_1.pdf
• Blog Post “Integrating Big data is as Easy as 1,2,3, … 4”
http://www.knime.org/blog/integrating-big-data-is-as-easy-as-1-2-3-4
• The Big Data Extension Product Description
http://www.knime.org/knime-big-data-extension
Thank You!
• education@knime.com
• Twitter: @KNIME
• LinkedIn Group: KNIME
• KNIME Blog: http://www.knime.org/blog
KNIME Data Science Learnathon: From Raw Data To Deployment - Dublin - June 2019KNIME Data Science Learnathon: From Raw Data To Deployment - Dublin - June 2019
KNIME Data Science Learnathon: From Raw Data To Deployment - Dublin - June 2019
 
Scoring Metrics for Classification Models
Scoring Metrics for Classification ModelsScoring Metrics for Classification Models
Scoring Metrics for Classification Models
 
Open Source Story and what’s new in KNIME Software
Open Source Story and what’s new in KNIME SoftwareOpen Source Story and what’s new in KNIME Software
Open Source Story and what’s new in KNIME Software
 
Anomaly Detection - Discover unknown Frauds and Anomalies using Machine Learning
Anomaly Detection - Discover unknown Frauds and Anomalies using Machine LearningAnomaly Detection - Discover unknown Frauds and Anomalies using Machine Learning
Anomaly Detection - Discover unknown Frauds and Anomalies using Machine Learning
 
Sharing and Deploying Data Science with KNIME Server
Sharing and Deploying Data Science with KNIME ServerSharing and Deploying Data Science with KNIME Server
Sharing and Deploying Data Science with KNIME Server
 
Guided Automation- A Blueprint for Interactive Automated Machine Learning
Guided Automation- A Blueprint for Interactive Automated Machine LearningGuided Automation- A Blueprint for Interactive Automated Machine Learning
Guided Automation- A Blueprint for Interactive Automated Machine Learning
 
KNIME Data Science Learnathon: From Raw Data To Deployment - Paris - November...
KNIME Data Science Learnathon: From Raw Data To Deployment - Paris - November...KNIME Data Science Learnathon: From Raw Data To Deployment - Paris - November...
KNIME Data Science Learnathon: From Raw Data To Deployment - Paris - November...
 
Sentiment Analysis with KNIME Analytics Platform
Sentiment Analysis with KNIME Analytics PlatformSentiment Analysis with KNIME Analytics Platform
Sentiment Analysis with KNIME Analytics Platform
 
Chemistry Data Basics with KNIME Analytics Platform
Chemistry Data Basics with KNIME Analytics PlatformChemistry Data Basics with KNIME Analytics Platform
Chemistry Data Basics with KNIME Analytics Platform
 
Sentiment Analysis with Deep Learning, Machine Learning or Lexicon based
Sentiment Analysis with Deep Learning, Machine Learning or Lexicon basedSentiment Analysis with Deep Learning, Machine Learning or Lexicon based
Sentiment Analysis with Deep Learning, Machine Learning or Lexicon based
 
KNIME Data Science Learnathon: From Raw Data To Deployment
KNIME Data Science Learnathon: From Raw Data To DeploymentKNIME Data Science Learnathon: From Raw Data To Deployment
KNIME Data Science Learnathon: From Raw Data To Deployment
 
From Raw Data to Deployment
From Raw Data to DeploymentFrom Raw Data to Deployment
From Raw Data to Deployment
 
From raw data to deployment
From raw data to deployment From raw data to deployment
From raw data to deployment
 

Recently uploaded

04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 

Recently uploaded (20)

04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 

Big Data Science is just a Click Away

  • 1. Copyright © 2015 KNIME.com AG Big Data Science is just a Click Away! Rosaria Silipo KNIME.com
  • 2. Variety, Volume, Velocity. Variety: integrating heterogeneous data (and tools). Volume: from small files to distributed data repositories (Hadoop); bring the tools to the data. Velocity: from distributing heavy computations to real-time scoring of millions of records/sec.
  • 3. Every Minute…
  • 4. IoT
  • 5. The Challenge
  • 6. Energy Usage Prediction from Smart Meter Data. Read smart meter energy data (176 million rows); clean up and aggregate total energy usage by hour, day, week, month, year; calculate behavioral measures for each smart meter; cluster smart meters with similar behavior (k-Means); predict energy usage in clustered smart meters (auto-regressive time series prediction). Slide labels: Workflow 1, Workflow 2, Workflow 3.
  • 7. Workflow 1: PrepareData (~ 2 days)
  • 8. Big Data
  • 9. Big Data Support. KNIME Big Data Access Nodes: preconfigured connectors, in-database processing. Big Data Platforms: HDFS, Hive, Impala, HP Vertica, Hortonworks, ParStream, Actian, any big data platform really! Spark MLlib integration (coming soon). Streaming Executor (coming soon).
  • 10. Hadoop Sandboxes. Hortonworks: http://hortonworks.com/products/hortonworks-sandbox/ Cloudera: http://www.cloudera.com/content/cloudera/en/downloads/quickstart_vms.html VirtualBox: https://www.virtualbox.org/ VMware Player: http://www.vmware.com/
  • 11. … as easy as 1, 2, 3, … 4: (1) Access Big Data, (2) Select Table, (3) In-DB Processing, (4) Into KNIME
  • 12. 1. Database Connector (Access Big Data). Generic Database Connector: can connect to any JDBC source; register a new JDBC driver via the preferences page.
  • 13. 1. Register JDBC Driver (Access Big Data). Open KNIME and go to File -> Preferences. Increase the connection timeout for long-running retrieval operations.
  • 14. 1. Dedicated Connectors (Access Big Data). Dedicated pre-configured connectors: bundle the necessary JDBC drivers; easy to use; DB-specific behavior/capability. Some dedicated connectors are part of the open source KNIME Analytics Platform, some belong to the commercial KNIME Big Data Extension. Slide callout: works for most Hadoop Hive installations, including Hortonworks (free).
  • 15. 2. Data Table Selection (Select Table)
  • 16. 3. In-Database Processing (In-DB Processing). Filter rows and columns; join tables/queries (similar settings as the Joiner node); sort your data; write your own query; aggregate* your data (similar settings as the GroupBy node). * The Database GroupBy node exposes DB-specific aggregation methods.
  • 17. 3. Queries for Average Measures (In-DB Processing)
  • 18. 3. Average Monthly Values (In-DB Processing)
  • 19. 4. Import Data from Database (Into KNIME): < 30 min
  • 20. New Big Data Platform? No problem! Just change the connector node!
  • 21. Other Useful Database Nodes. Drop table: missing table handling, cascade option. Execute any SQL statement (executes several queries separated by ; and a new line). Manipulate existing queries.
  • 22. KNIME Big Data Extension
  • 23. KNIME Big Data Extension. KNIME Big Data Access Nodes: preconfigured connectors, HDFS File Handling, Hive/Impala Loader. Big Data Platforms: HDFS, Hive, Impala, HP Vertica, Hortonworks, ParStream, Actian, SAP Hana (to be), … Spark MLlib integration (coming soon). Streaming Executor (coming soon).
  • 24. HDFS File Handling. KNIME & Extensions -> KNIME File Handling Nodes. HDFS Connection and HDFS File Permission nodes.
  • 25. Hive/Impala Loader. Upload a KNIME data table to Hive/Impala.
  • 26. KNIME Big Data Extension: Download and Install. KNIME.com Extension Store. License required! Installation instructions: http://tech.knime.org/installation-instructions Product description: http://www.knime.org/knime-big-data-extension
  • 27. License on KNIME Store: http://tech.knime.org/knime-store 30-day trial license available with special Promotion Code: education@knime.com
  • 28. References. Whitepaper “KNIME opens the Doors to Big Data”: http://www.knime.org/files/big_data_in_knime_1.pdf Blog post “Integrating Big data is as Easy as 1,2,3, … 4”: http://www.knime.org/blog/integrating-big-data-is-as-easy-as-1-2-3-4 The Big Data Extension product description: http://www.knime.org/knime-big-data-extension
  • 29. Thank You! education@knime.com Twitter: @KNIME LinkedIn Group: KNIME KNIME Blog: http://www.knime.org/blog
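
The in-database processing idea from the slides (aggregate inside the database, then import only the small result) can be sketched in plain Python. This is a minimal illustration, not the KNIME workflow itself: it uses an in-memory SQLite database in place of Hive or Impala, and the table name `readings`, its columns, the sample rows, and the helper `average_monthly_usage` are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical smart meter readings: (meter_id, timestamp, kWh).
# The real data set in the slides has about 176 million rows.
ROWS = [
    ("m1", "2015-01-05 10:00", 1.0),
    ("m1", "2015-01-20 11:00", 3.0),
    ("m1", "2015-02-03 10:00", 4.0),
    ("m2", "2015-01-07 09:00", 2.0),
    ("m2", "2015-02-11 09:00", 6.0),
    ("m2", "2015-02-12 09:00", 8.0),
]


def average_monthly_usage(conn):
    """Compute the average usage per meter and month inside the database.

    Only the aggregated rows are fetched, which mirrors the slides' order of
    steps: in-DB processing first, import into the analysis tool last.
    """
    query = """
        SELECT meter_id,
               substr(ts, 1, 7) AS month,   -- 'YYYY-MM' prefix of the timestamp
               AVG(kwh)         AS avg_kwh
        FROM readings
        GROUP BY meter_id, substr(ts, 1, 7)
        ORDER BY meter_id, month
    """
    return conn.execute(query).fetchall()


# Step 1: connect (stand-in for a database connector).
conn = sqlite3.connect(":memory:")

# Step 2: create and fill the table that will be selected.
conn.execute("CREATE TABLE readings (meter_id TEXT, ts TEXT, kwh REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", ROWS)

# Steps 3 and 4: aggregate in the database, then import the small result.
monthly = average_monthly_usage(conn)
for meter_id, month, avg_kwh in monthly:
    print(meter_id, month, avg_kwh)
```

Swapping the database here would only mean changing the connection line, which is the same point the slides make about changing the connector node; the SQL dialect for date handling may differ between platforms.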