03/05/2016 www.bilot.fi 1
#azurehadoop
Hosts
Tuomas Autio
Bilot
Head of Big Data &
Business Lead (BI)
tuomas.autio@bilot.fi
@BigDataTuomas
Mikko Mattila
Bilot
Solution Lead,
Analytics
mikko.mattila@bilot.fi
@MattilaJMikko
Antti Alila
Microsoft
Product Manager,
Azure
antti.alila@microsoft.com
Mats Johansson
Hortonworks
Solution Architect
mjohansson@hortonworks.com
Pasi Vuorela
Hortonworks
Sales Manager Nordics
pvuorela@hortonworks.com
Hadoop and Modern Data
Architecture
Breakfast seminar 26.4.2016
Agenda
• Introductions
• Microsoft and Azure Marketplace
• Hadoop and modern data architecture + demo
• Hortonworks, HDP and HDF
• Case study by Hortonworks
• Wrap-up & next steps
Key take-aways from today
What to expect
• What Hadoop is
• How does Hadoop fit into
enterprise architecture
• What does Hadoop mean
to my organizational
structure
• Big data is relevant to
every industry
• Real world use cases
”Hadoop plays significant role filling that gap in the market. Open standard
approach is needed to keep up with the pace. Old technologies are not capable
for billions of things to be connected.” GE’s CIO Vince Campisi
”Spark [on top of Hadoop] has been ‘instrumental in where we’ve gotten to’”
Vinoth Chandar, Uber
”100% of large (over $1 billion) enterprises will adopt Hadoop by 2020” Forrester’s
Principal Analyst Mike Gualtieri
“Hadoop is the most important technological part of the digitalization” SAP’s
CTO Quentin Clark
“Who cares about Hadoop on Linux? Microsoft (yes, really) … We want Azure
to be a place where all operating systems can run” T. K. "Ranga" Rengarajan,
Microsoft's corporate VP, Data Platform
Bilot stands for BI
Bilot’s offering for Analytics & Big data, Tuomas Autio Bilot
About us
130+ EXPERTS
100+ CUSTOMERS
16M€ TURNOVER IN 2014
+40% AVERAGE GROWTH
15 NATIONALITIES
100% OWNED BY EMPLOYEES
10 YEARS
2 COUNTRY HQ’S
Bilot’s portfolio and analytics?
Our customers* are recognized
leaders in their markets
*) >120 customers in total. And increasing…
Microsoft and Azure
Marketplace
Antti Alila, Microsoft
Hadoop 101
Tuomas Autio, Bilot
DATA SYSTEMS
REPORTING & APPLICATIONS
Analytics
Custom
applications
Packaged
applications
EDW RDBMS MPP
New Data Sources
Social media
Click-stream
Marketing
data
Server logs /
RFID
(TRADITIONAL) DATA SOURCES
POS
ERP CRM
…
1
Sensor / Machine
data
Geo locations
Unstructured
documents
2
(Old) Architectures under pressure
Quick Intro to Hadoop
• Hadoop is an open source framework for distributed storage and
processing of large data sets
• Managed by the Apache Software Foundation
• De facto standard for big data
• Enterprise Hadoop distributions
• Hortonworks HDP (”Red Hat” of Hadoop), HDP for Windows,
IBM, Microsoft Azure HDInsight (HDP), Cloudera, MapR, AWS
(EMR), Rackspace
• >50% of US Fortune 100 companies use Hadoop, ~60% CAGR
(2020 $50bn)
• ~25 Finnish instances, ~10 known production instances in
Finland (strongly behind US and central European markets)
Hadoop 2.x
Framework
Key Features
• Cluster of commodity servers, scales out ”infinitely” affordably
• Linear growth of performance
• Distributed processing
• Schemaless
• Hadoop stores files in a distributed file system
• Fast (for big data), maps data wherever it is located in cluster
• Resilient to failure
• Flexible
• Cost effective
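The "distributed processing" and "maps data wherever it is located" points can be illustrated with a minimal, single-machine sketch of the map, shuffle and reduce steps. This is not Hadoop code: the `blocks` list, the function names and the sample words are assumptions made for illustration, with each string standing in for a file block that would live on a different node.

```python
from collections import defaultdict

# Hypothetical input: each string stands in for a file block stored on a different node.
blocks = [
    "click view click",
    "view buy",
    "click buy buy",
]

def map_phase(block):
    """Emit (word, 1) pairs for one block; each block can be processed independently."""
    return [(word, 1) for word in block.split()]

def shuffle(pairs):
    """Group emitted values by key, as a framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

mapped = [pair for block in blocks for pair in map_phase(block)]
counts = reduce_phase(shuffle(mapped))
print(counts)
```

Because `map_phase` only ever looks at one block, the work can run where the block is stored, which is the idea behind the scale-out claim in the list above.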
USE CASE
BUT ”Haters to the left! Kill the fear! Just get it started and go!”,
Symantec’s Cloud Platform Engineering Leader David Lin
Value compounds with use, as more use cases,
sources, time periods join in a data lake
”Hadoop – It’s damn hard to use”, anonymous CXO
Mitigation: Right Team and skills!
IT and the Business MUST Work Together to Create Maximum Value
Typical (new) roles needed in the
organization:
• The Data Architect
• The Data Scientist
• The Business Analyst
• The Developer
• The Administrators
Modern Data Architecture
Mikko Mattila, Bilot
Why Hadoop will succeed
IKEA’s Business Idea
“to offer a wide range of home furnishings with good design and
function at prices so low that as many people as possible will be
able to afford them”
Why Hadoop will succeed
“HADOOP IS A SOFTWARE PACKAGE AT SUCH A LOW PRICE
THAT ALMOST EVERY COMPANY IS ABLE TO AFFORD IT
ALREADY”
“HADOOP AND OTHER OPEN SOURCE BIG DATA PROJECTS
PROVIDE A HUGE RANGE OF IT SOFTWARE FOR AREAS OF
DATA MANAGEMENT AND SYSTEM INTEGRATION”
“HADOOP TOOLS ARE DESIGNED TO SOLVE ISSUES
IMPOSSIBLE FOR TRADITIONAL COMMERCIAL TOOLS”
Hadoop Scenario 1: pre-process ETL
Hadoop Scenario 2: hot and cold storage
Hot data in DW
Cold data in Hadoop
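One way to picture the hot/cold split is a simple age-based routing rule. The sketch below is an assumption-laden illustration, not the presenters' method: the 365-day window, the `order_date` field and the sample records are all hypothetical, and a real setup would move data between a data warehouse and HDFS rather than between two Python lists.

```python
from datetime import date, timedelta

HOT_WINDOW_DAYS = 365  # assumption: records newer than this stay in the DW

def split_hot_cold(records, today, hot_window_days=HOT_WINDOW_DAYS):
    """Route recent records to the DW (hot) and older ones to Hadoop (cold)."""
    cutoff = today - timedelta(days=hot_window_days)
    hot = [r for r in records if r["order_date"] >= cutoff]
    cold = [r for r in records if r["order_date"] < cutoff]
    return hot, cold

# Hypothetical sample records
records = [
    {"order_id": 1, "order_date": date(2016, 4, 1)},
    {"order_id": 2, "order_date": date(2014, 1, 15)},
    {"order_id": 3, "order_date": date(2015, 12, 24)},
]

hot, cold = split_hot_cold(records, today=date(2016, 4, 26))
print("hot:", [r["order_id"] for r in hot])
print("cold:", [r["order_id"] for r in cold])
```

The window length is the main design choice here: a shorter window keeps the warehouse small and fast, while older data remains queryable in the cheaper store.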
Hadoop Scenario 3: true data discovery
Traditional Enterprise software and files
Online systems (log or Streams)
RDBMS
ERP
Hadoop ecosystem: All you need for modern analytics
architecture as open source
Webshops, Mobile Applications, Contact Centers, ERP, CRM systems etc
(Un)Structured &
documents
Clickstream
Server logs /
RFID
Sentiment,
Some Sensor
ETL +
DW
Digital organization Traditional organization
Traditional Enterprise software and files
Online systems (log or Streams)
RDBMS
ERP
Hadoop ecosystem: All you need for modern analytics
architecture as open source
Real-time stream, log data and rdbms change capturing
(Flume or Hortonworks data flow)
Webshops, Mobile Applications, Contact Centers, ERP, CRM systems etc
(Un)Structured &
documents
Clickstream
Server logs /
RFID
Sentiment,
Some Sensor
Message Queue and history
(Kafka)
Complex event processing
(Storm, SparkStreaming, KafkaStreams, Flink)
Real time machine interface for applications
ETL +
DW
Digital organization Traditional organization
Traditional Enterprise software and files
Interactive
processing & queries
(Spark & Hive)
Online systems (log or Streams)
FileSystem (HDFS) +
Core Services
RDBMS
ERP
Hadoop ecosystem: All you need for modern analytics
architecture as open source
Real-time stream, log data and rdbms change capturing
(Flume or Hortonworks data flow)
Webshops, Mobile Applications, Contact Centers, ERP, CRM systems etc
(Un)Structured &
documents
Clickstream
Server logs /
RFID
Sentiment,
Some Sensor
Message Queue and history
(Kafka)
Complex event processing
(Storm, SparkStreaming, KafkaStreams, Flink)
Real time machine interface for applications
ETL +
DW
BI User
Digital organization Traditional organization
Batch Processing
Traditional Enterprise software and files
Interactive
processing & queries
(Spark & Hive)
Online systems (log or Streams)
FileSystem (HDFS) +
Core Services
RDBMS
ERP
Batch processing
(MapReduce & Pig
Latin)
Hadoop ecosystem: All you need for modern analytics
architecture as open source
Real-time stream, log data and rdbms change capturing
(Flume or Hortonworks data flow)
Webshops, Mobile Applications, Contact Centers, ERP, CRM systems etc
(Un)Structured &
documents
Clickstream
Server logs /
RFID
Sentiment,
Some Sensor
Message Queue and history
(Kafka)
Complex event processing
(Storm, SparkStreaming, KafkaStreams, Flink)
Real time machine interface for applications
ETL +
DW
RDBMS ->
HDFS
batch load
(Sqoop)
Statistical
Analysis
(Spark)
BI User Data Scientist
Digital organization Traditional organization
Batch Processing
Traditional Enterprise software and files
Interactive
processing & queries
(Spark & Hive)
Online systems (log or Streams)
FileSystem (HDFS) +
Core Services
RDBMS
ERP
Batch processing
(MapReduce & Pig
Latin)
Hadoop ecosystem: All you need for modern analytics
architecture as open source
Real-time stream, log data and rdbms change capturing
(Flume or Hortonworks data flow)
Webshops, Mobile Applications, Contact Centers, ERP, CRM systems etc
(Un)Structured &
documents
Clickstream
Server logs /
RFID
Sentiment,
Some Sensor
Message Queue and history
(Kafka)
Complex event processing
(Storm, SparkStreaming, KafkaStreams, Flink)
Real time machine interface for applications
ETL +
DW
RDBMS ->
HDFS
batch load
(Sqoop)
Statistical
Analysis
(Spark)
NoSQL
database for
interactive
use (hbase)
BI User Data Scientist
Batch Processing
Digital organization Traditional organization
Traditional Enterprise software and files
Interactive
processing & queries
(Spark & Hive)
Online systems (log or Streams)
FileSystem (HDFS) +
Core Services
RDBMS
ERP
Batch processing
(MapReduce & Pig
Latin)
Hadoop ecosystem: All you need for modern analytics
architecture as open source
Real-time stream, log data and rdbms change capturing
(Flume or Hortonworks data flow)
Webshops, Mobile Applications, Contact Centers, ERP, CRM systems etc
(Un)Structured &
documents
Clickstream
Server logs /
RFID
Sentiment,
Some Sensor
Message Queue and history
(Kafka)
Complex event processing
(Storm, SparkStreaming, KafkaStreams, Flink)
Real time machine interface for applications
ETL +
DW
RDBMS ->
HDFS
batch load
(Sqoop)
Statistical
Analysis
(Spark)
NoSQL
database for
interactive
use (hbase) Data Virtualization
Virtual Datamodels / security
O/JDBC, MDX, REST outbound interfaces
BI User Data Scientist
Batch Processing
O/JDBC, MDX, REST inbound interfaces
Logical Data Warehouse
Traditional BI Tools
Digital organization Traditional organization
Example use case: Dynamic Pricing
Dynamic pricing will be more and more common in the future
Usage of dynamic pricing should be a business decision – not
restricted by your technical capabilities
Dynamic Pricing
Same price for everyone in every store
The more you visit booking pages, the higher the price
Dynamic OmniChannel Pricing
Store
Consumer
buying
On-line Channel
Consumption
(IoT)
Price Cache
(SmartPricing
Accelerator SPA)
Pricing
rules
Price List
Customer
Product
Basket Size
History
Warehouse levels
Delivery time / type
WebSite Activity
IOT consumption
MQ
Analytics and Pricing Simulations
(SmartPricing)
Supply Chain
Management
(+other sources)
Batch
Processing
& History
Second & Minute Level Price optimization
Monthly level Price optimization
Orders /
ClickStream
Sensor Data
POS data
CEP
Demo Scope
Consumer
buying
Pricing
updates
MQ
Analytics
Data
Warehouse CEP
WebShop ClickStream
ClickStream
Orders, Product categories, Suppliers
MS SQL Server
HTML5 + Tomcat server
Kafka
HDFS +
Hive
MS PowerBI
Log file sniffing to stream
Flume-ng
Every visit to a ”product page” increases the price by 5%
Identifies the ”product page” and the viewed product, then sends a request to
increase the price
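The pricing rule in the demo scope can be sketched in a few lines. This is a simplified stand-in for the Flume, Kafka and CEP pipeline named above, not the demo's actual code: the log line format, the `/product/<id>` URL pattern, the product IDs and the starting prices are all assumptions made for illustration. Only the 5% increase per product page visit comes from the slide.

```python
import re

PRICE_INCREASE = 0.05  # from the slide: each product page visit raises the price by 5%

# Hypothetical URL pattern for a product page; the real webshop's URLs are not given.
PRODUCT_PAGE = re.compile(r"GET /product/(?P<product_id>\w+)")

def process_clickstream(log_lines, prices):
    """Return updated prices after applying the rule to every product page view."""
    updated = dict(prices)
    for line in log_lines:
        match = PRODUCT_PAGE.search(line)
        if match is None:
            continue  # not a product page, so no price change
        product_id = match.group("product_id")
        if product_id in updated:
            updated[product_id] = round(updated[product_id] * (1 + PRICE_INCREASE), 2)
    return updated

# Hypothetical web server log lines
log_lines = [
    "10.0.0.1 GET /product/A100 200",
    "10.0.0.2 GET /cart 200",
    "10.0.0.1 GET /product/A100 200",
]

prices = process_clickstream(log_lines, {"A100": 100.0, "B200": 50.0})
print(prices)
```

Note that the increase compounds: two visits raise a price of 100.00 to 110.25, not 110.00.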
DEMO
Real-time Dynamic Webshop Pricing and real-time Reporting (Hadoop), Mikko Mattila Bilot
Hortonworks: HDP & HDF
References & Use Cases
Pasi Vuorela, Hortonworks
Hortonworks Technical Case Study
Mats Johansson, Hortonworks
Next Steps?
Tuomas Autio, Bilot
Bilot’s Hadoop Accelerator Program
1. Business Strategy
2. Hadoop bootcamp
3. Proof of Concept
4. Proof of Solution
5. Build & Implement
6. Run
0.5 day 1 day
• Intro to Hadoop
• Vision
• Use cases
• Prioritization
• 1 use case
• Deep dive with business, IT, and operations
• Business case
• Platform deployed on Azure
• Integrations + use case
• Look & feel
• Test drive
• Scalability
• Security
• Tools and methods
• Cloud/on-prem
• Licences/support descriptions
• Implementation
• Agile dev
• Roll-out and roadmap
• Change mgmt. begins
• Hadoop as a Service
• AMS
• Data driven enterprise/organization dev
2-8 weeks 2-3 months 3-6 months
DELIVERABLES
• Insight for Hadoop-enabled business
• List of prioritized Hadoop use cases
• Business case for PoC use case
• “How to get there?”
• Technical: Up and running system and technical evaluation
• Confirmed business case
• Plans for scalable and secure Hadoop solution ready for implementation
• Hadoop implemented
• Roadmap for further use cases
• Fully functional Hadoop environment
• Continuous support model
• Organizational adaptation
PoC / Pilot Production implementation
Contact Bilot to hear more
Interested? Contact us for a tailored demo
and workshop!
Bilot is Hortonworks’ first systems integrator partner in Finland and
Microsoft’s Gold Partner
Real customer use cases and industry examples
available for demo. Contact us for your own
tailored session!
In the pre-PoC phase, for sandboxing and light
demo purposes, we can use Azure or Bilot’s
5-node on-premises HDP cluster
Mikko Mattila
Solution Lead,
Analytics
Mikko.mattila@bilot.fi
@MattilaJMikko
Tuomas Autio
Head of Big Data & BI
Business Lead
tuomas.autio@bilot.fi
@BigDataTuomas
