STOP moving data around
Edge Analytics made easy
Analytics built for IoT
Where IoT analytics delivers 15% more output
from renewable energy sources
30TB: Analyze data in real time
15%: Increase efficiency
$18K/hr; $158M/yr: Generate operational/economic benefits
(20,000 wind turbines; 10 GW capacity; 0.3 capacity factor; $40/MWh)
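The benefit figures on this slide can be reproduced from the inputs given in parentheses. The Python sketch below shows one plausible reading of that arithmetic. It assumes the 15% efficiency gain becomes 15% more energy sold at the stated price, and that a year has 8,760 hours; neither assumption is stated on the slide, and the function name is illustrative only.

```python
HOURS_PER_YEAR = 8_760  # assumption: non-leap year


def annual_benefit(capacity_mw, capacity_factor, price_per_mwh, gain):
    """Return (extra revenue per hour, extra revenue per year) in dollars."""
    avg_output_mw = capacity_mw * capacity_factor      # average fleet output
    baseline_per_hour = avg_output_mw * price_per_mwh  # revenue without the gain
    extra_per_hour = baseline_per_hour * gain          # value of the extra output
    return extra_per_hour, extra_per_hour * HOURS_PER_YEAR


per_hour, per_year = annual_benefit(
    capacity_mw=10_000,   # 10 GW expressed in MW
    capacity_factor=0.3,
    price_per_mwh=40.0,
    gain=0.15,
)
print(f"${per_hour / 1e3:.0f}K/hr; ${per_year / 1e6:.0f}M/yr")
```

Under these assumptions the sketch arrives at the slide's rounded values: 3,000 MW average output, $120K of baseline revenue per hour, and 15% of that as the hourly benefit.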
IoT analytics has a set of distinct requirements.
Big Data: Data is growing faster and bigger because of the number of sensors (10B+ rows, 5TB+)
Fast Data: Data streamed from sensors requires fast ingestion (1M+ rows per sec)
Edge Analytics: IoT data is mostly generated at the 'edges' of the network (100+ locations)
Real-Time Insights: Use cases require near-real-time analytics (<1 sec query response time)
Problem
Horizontal Integration, Vertical Applications
Delivering actionable insights requires use-case-specific
applications that can be adapted rapidly to changing needs
Today: purpose-built solution stacks. Network Analytics and Mobile Analytics each run their own queries against their own 4 billion records.
Future: flexible application architecture. Applications such as Network Analytics, Mobile Analytics and User-Experience Management (and others still open, marked "?") sit on the layers Data Collection, Storage & Analytics, and Application.
ParStream is integrated with leading IoT solutions.
Analytics / Visualization: Custom Apps, DATAWATCH
Data Collection
Standard SQL
Streaming import (MQTT, Flume, Avro...)
UDx for analytical libraries
Solution
Geo-Distributed setup substantially reduces network traffic,
enabling continuous monitoring (sampling proved insufficient)
Decentralization
Overcoming bandwidth limitations and keeping data transfer
costs low requires analytics to be pushed closer to the data
source (edge, fog, ...)
Today: the application queries a central database. Five sites of 4 billion records each return more than 20 billion records to it; the query search results show 40 records found.
Tomorrow with ParStream Geo-Distributed Analytics: the application queries the ParStream Geo-Distributed Server. ParStream nodes next to the 4-billion-record data sets return 7, 18, 5, 12 and 8 records; the query search results show 40 records found.
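The contrast between the two setups can be sketched in a few lines of Python. This is a toy model, not ParStream's implementation: `EdgeNode`, `central_query` and `geo_distributed_query` are hypothetical names, and "network traffic" is approximated by the number of rows that leave a site.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeNode:
    """Hypothetical stand-in for one remote site holding its own rows."""
    name: str
    rows: list = field(default_factory=list)  # (sensor_id, value) tuples

    def local_query(self, predicate):
        # Filtering happens where the data lives; only matches leave the site.
        return [row for row in self.rows if predicate(row)]


def central_query(nodes, predicate):
    """'Today': ship every row to one place, then filter."""
    shipped = [row for node in nodes for row in node.rows]
    matches = [row for row in shipped if predicate(row)]
    return matches, len(shipped)   # second value: rows that crossed the network


def geo_distributed_query(nodes, predicate):
    """'Tomorrow': each node filters locally, the master only merges."""
    partials = [node.local_query(predicate) for node in nodes]
    matches = [row for part in partials for row in part]
    return matches, len(matches)   # second value: rows that crossed the network


nodes = [EdgeNode(f"site-{i}", [(s, s % 1000) for s in range(10_000)])
         for i in range(5)]
is_alarm = lambda row: row[1] >= 999   # toy predicate: rare high readings

central, shipped_central = central_query(nodes, is_alarm)
edge, shipped_edge = geo_distributed_query(nodes, is_alarm)
print(len(central), shipped_central, shipped_edge)
```

Both functions return the same matches; the difference is that the centralized variant moves every row before filtering, while the distributed variant moves only the matching rows.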
GDA architecture vs MPP vs Federated servers
Architecture
− Analytical Cluster (MPP): tightly coupled nodes
− Geo-Distributed Deployment: loosely coupled nodes
− Federation Servers: independent sub-systems
High Availability
− Analytical Cluster (MPP): 'system' redundancy; full or no results
− Geo-Distributed Deployment: 'node' redundancy; full or partial results
− Federation Servers: 'sub-system' redundancy; full or no results
Data Model
− Analytical Cluster (MPP): one model
− Geo-Distributed Deployment: one model with local extensions
− Federation Servers: joint view on local models
Query Execution
− Analytical Cluster (MPP): data-structure-aware system optimization
− Geo-Distributed Deployment: data-structure-aware execution plan with data-aware local optimization
− Federation Servers: data-structure-aware execution-plan translation with independent execution
Problem
ParStream introduces EdgeAnalyticsBox
The industry’s first appliance built for edge analytics/GDA
• Specifically designed to enable edge analytics (Geo-Distributed Analytics).
• Ruggedized for use in real-world edge analytics applications such as oil/drilling
sites, cell phone towers, wind farms, etc.
• Pre-loaded and tested with ParStream software.
• Technical Specs: Intel Core i5/i7 processor, 8-16 GB RAM and 64-128GB SSD
• EdgeAnalyticsBox provides customers with the convenience of a one-stop shop
for their edge analytics needs. Customers can, however, run GDA on any
standard hardware that meets certain processing and storage requirements.
Solution
New Product of the Week
Geo-Distributed setup substantially reduces network traffic,
enabling continuous monitoring (sampling proved insufficient)
Integration of analytics into OT processes
Demystification and integration of advanced analytics into
regular business processes connecting operations and analytics
Traditional OT setup: Data Stream; Historian / CEP; Monitoring & Control; Compliance & Archiving
Future OT + Data Scientist setup: Data Stream; Historian / CEP; ParStream; Monitoring & Control; Compliance & Archiving; Analytics & Optimization
ParStream is the only solution for all IoT analytics
requirements.
Customer Applications and Visualization Tools
IoT Data Collection Platforms Enterprise Data Sources
ParStream DB: Geo-Distributed Analytics; Alarm + Action; Time Series; Advanced Analytics
Solution
Demo: Sensor Analytics for Real-time Environmental
Compliance
Solution
Backup
Global IoT Survey
- Global, cross industry survey
- 50/50 mix of business and
technology leaders
- Over 200 participants
- Focus on the use and value of
data in IoT initiatives
Better IoT data collection and analysis delivers more value
• 70% say they make better, more meaningful decisions with improved data
• 86% would increase the ROI of their IoT investment
IoT not delivering full potential because of data challenges
• 86% of business stakeholders say data is important to their IoT project
• Only 8% are fully capturing and analyzing IoT data in a timely fashion
• 94% face challenges collecting and analyzing IoT data
IoT projects vary widely – but all have challenges
• 53% are using IoT projects to optimize existing businesses;
47% treat them as a strategic business investment
• 96% have faced challenges with their IoT projects
Global IoT Survey: Key Findings
Analytical Frameworks
[Figure: comparison of analytical framework stacks, layer by layer]
Layers: Query/Scripting Language; Programming Interface/Model; Execution Engine; Data Store; Storage Access; Focus
Technologies shown: SQL, Hive, Pig, Mongo MR, MapReduce, Hadoop MapReduce, MongoDB MapReduce, MPP execution engine, declarative API, Mongo CRUD, column store, HBase, MongoDB row/document store, key-value, file, HDFS, Giraph, Spark, Spark SQL, Scala
Focus: Real-Time Analytics; Peta-Scale Batch Processing; High-Throughput Key-Value/Document Store
IoT analytics has a set of distinct requirements.
Big Data: Data is growing faster and bigger because of the number of sensors (10B+ rows, 5TB+)
Wind turbine: 100 turbines x 100M rows per year
Race car: 400M records / day x 365 days test drive
Telco: 1,000 cells x 1,000 rows / sec x 1 day (wow)
Traffic analysis: 60M cars x 1 read / min x 365 days
Oil rig: 1 rig = 8 billion records / day (not verified)
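Multiplying out the Big Data examples shows how quickly they pass the 10B-row mark. The Python sketch below only does that arithmetic with the slide's own factors; the dictionary labels are illustrative, and the oil rig figure is left out because the slide quotes it as a total rather than as factors.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_DAY = 24 * 60

# Row counts implied by the factors listed on the slide.
rows = {
    "wind turbines (per year)": 100 * 100_000_000,
    "race car (365 test days)": 400_000_000 * 365,
    "telco (per day)": 1_000 * 1_000 * SECONDS_PER_DAY,
    "traffic analysis (per year)": 60_000_000 * MINUTES_PER_DAY * 365,
}

for name, count in rows.items():
    print(f"{name}: {count / 1e9:,.1f} billion rows")
```

Every example reaches at least 10 billion rows; the telco case reaches tens of billions in a single day, and the traffic case runs into the tens of trillions per year.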
Fast Data: Data streamed from sensors requires fast ingestion (1M+ rows per sec)
Network monitoring: 1M rows per sec per cell
Asset monitoring: 60M cars x 1 reading per minute
Airplane monitoring: 4 turbines x 3k sensors x 100Hz
Oil exploration: 10,000 wells x 100 sensors x 1Hz
Oil rig: 1 drilling rig x 10,000 sensors x avg 100Hz
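The Fast Data examples can be converted into sustained ingest rates in the same way. This Python sketch assumes every sensor emits exactly one row per sample, which the slide does not state; `rows_per_second` is a hypothetical helper, and sampling rates are passed per minute so that the one-reading-per-minute case stays in whole numbers.

```python
HZ = 60  # samples per minute for a 1 Hz signal


def rows_per_second(sources, sensors_per_source, samples_per_minute):
    """Sustained ingest rate, assuming one row per sensor sample."""
    return sources * sensors_per_source * samples_per_minute / 60


rates = {
    "asset monitoring": rows_per_second(60_000_000, 1, 1),
    "airplane monitoring": rows_per_second(4, 3_000, 100 * HZ),
    "oil exploration": rows_per_second(10_000, 100, 1 * HZ),
    "oil rig": rows_per_second(1, 10_000, 100 * HZ),
}

for name, rate in rates.items():
    print(f"{name}: {rate / 1e6:.1f}M rows per sec")
```

Under that assumption each listed case lands at roughly one million rows per second or more, which matches the 1M+ rows per sec requirement in the heading.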
Edge Analytics: IoT data is mostly generated at the 'edges' of the network (100+ locations)
Manufacturing: 300,000 plants in US (2012)
Cars / ships / airplanes: >1 billion worldwide
Telco: 190,000 cell towers in US (2013)
Oil: 950,000 wells worldwide; 500,000 in US
Mobile advertising: decentralized ad serving / monitoring
Real-Time Insights: Use cases require near-real-time analytics (<1 sec query response time)
Dashboarding: real-time visualization, many queries
Network monitoring: root cause analysis, optimization
Asset monitoring: condition monitoring, safety
Security: anomaly detection, building safety
Traffic: location-aware recommendations
Problem
Customer Proof Point for IoT in Renewable Energy:
Real-time Analytics for Wind Turbines
Business Challenge
• Optimize wind turbine performance by quickly adjusting to changing
environmental factors (e.g., wind direction, temperature).
• Minimize turbine downtime through predictive maintenance.
Use Case
• Real-time and continuous monitoring of data from 20,000 wind turbines,
including analysis of over 20TB of historical data
ParStream's Technology Value Proposition
• Real-time monitoring of continuous data flow for immediate
insights/actions
• Historical analysis: storage and analytics in one integrated
platform that imports and stores turbine readings immediately.
Benefits/Results (estimated)
• 15% improvement in productivity
• Decreased downtime
• $158M of annual economic benefits
Traction
Customer Proof Point for IoT in Manufacturing:
Real-time Analytics for Gas Turbine Monitoring
Business Challenge
• Optimization of complex systems for efficiency and operational
(automated) decisions in real time
• Enabling new service-driven business models
Use Case
• 5,000 data points per second per turbine for real-time analytics and
historical storage for model-based learning/root cause analysis.
ParStream's Technology Value Proposition
• Real-time monitoring of continuous data flow for immediate
insights/actions
• "Analytical Historian": storage and analytics in an
integrated platform that imports and stores all sensor readings
immediately
Benefits/Results
• Improved startup: less vibration means less deterioration
• Lower NOx emissions
• Improved overall efficiency
• Prediction of material fatigue
Traction
Customer Proof Point for IoT in Manufacturing:
Real-time Analytics for Semiconductor Testing
Business Challenge
• Current MySQL environment requires pre-built aggregations. The ability to
perform root cause analysis is limited.
• Computing aggregations takes too long, reducing machine utilization and
causing more scrap product.
Use Case
• One Automated Testing Equipment handles 24 wafers per lot; 1 wafer
generates 1 billion test results. The data volume required pre-built
aggregations, which took too long to build.
ParStream's Technology Value Proposition
• Real-time monitoring of continuous data flow for immediate insight/action
to reduce waste and increase output
• Unlimited scalability allows Galaxy to market to bigger semiconductor
testing and manufacturing companies
Benefits/Results
• Improved Machine Utilization: Current batch-style analysis of test data
causes expensive test machines to be underutilized
• Revenue Increase: Increased data volume opens new, more lucrative
markets and the ability to sell to larger customers
• New Products: Drill-down analysis to detailed test results leads to new
insights
• Cost Savings: Ability to analyze detail-level data is expected to produce new
insights into causes of test failures
Traction
Customer Proof Point for IoT in Automotive/Telematics:
Real-time Analytics for Sensor Data/Vehicle Monitoring
Business Challenge
• Optimization of multiple systems for efficiency and operational
(automated) decisions on billions of records
• Enabling new service-driven business models
Use Case
• Real-time monitoring of continuous GPS data and event flows
ParStream's Technology Value Proposition
• Over 260 million new records/month for real-time analytics
• 31 billion records of historical data
• ParStream collects all data from different systems in near real time
Benefits/Results
• Reduced overall data manipulation time by over 90%
• Reduced annual hardware by over 60%
• Improved execution time and scheduling efforts
• Improved analysis/prediction of driver profiles
Traction
ParStream is uniquely positioned for Real-time
Analytics in IoT
REAL-TIME
IMPORT
REAL-TIME
QUERYING
FLEXIBLE
ANALYTICS
Small Form Factor / Low TCO
Billions of Records
Thousands of Columns
Solution
ParStream’s patented technology provides a
competitive advantage.
• High Performance Compressed Indexes: provide ultra-high query performance
• Massively parallel processing: delivers linear scalability and high query throughput
• Lockless architecture: enables ultra-fast query and data import performance
• Small footprint: enables analytics at the edge with a low TCO
Architecture components: SQL API / JDBC / ODBC; C++ UDx API; Real-Time Analytics Engine; In-Memory and Disk Technology; Multi-Dimensional Partitioning; Massively Parallel Processing (MPP); Shared Nothing Architecture; 3rd generation Columnar Storage; High Performance Compressed Index (HPCI); High Speed Parallel Loader with Low Latency
Solution
Industry-leading Product Recognition
"ParStream is the most reliable System in our Data Center" (CTO, etracker)
"ParStream was 40 times faster than its nearest competitor." (CTO, Cake Marketing)
"ParStream enabled us to scale internationally - TCO is much lower than with Hadoop" (VP Eng, Searchmetrics)
#1 Big Data Startup
Cisco Entrepreneurs in Residence
Traction
2014 IoT Excellence Award
GDA architecture vs MPP vs Federated servers
Hardware
− Analytical Cluster (MPP): commodity or high-end DB server; "low latency, high bandwidth"; colocated nodes
− Geo-Distributed Deployment: GDA master is loosely coupled to slave system; nodes are individual ParStream deployments; nodes can have very different sizing; geographically spread
− Federation Servers: coupled to component systems
High Availability
− Analytical Cluster (MPP): redundancy per structure; full/no results
− Geo-Distributed Deployment: GDA master expects nodes to be unavailable at query runtime; nodes with normal HA characteristics
− Federation Servers: decoupled from the component systems; best of breed per backend system
Data Model
− Analytical Cluster (MPP): global model
− Geo-Distributed Deployment: master and nodes share a common data model; independent local schema extensions; distribution models supported very similar to cluster (COLOCATION, EVERYWHERE replicated dimension tables)
− Federation Servers: joint view on component data model
Execution
− Analytical Cluster (MPP): MPP; data-structure aware; computation next to data
− Geo-Distributed Deployment: data-aware engine optimizations during query execution; node-local, partition-centric workload; query results can be fully or partially computed remotely
− Federation Servers: queries against the virtual data model; translation for different components; no coupling and optimization as in a native cluster possible
Problem
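The geo-distributed column expects that some nodes will be unreachable at query time and that partial results are acceptable. A minimal Python sketch of that behavior follows. It is a generic illustration, not ParStream's protocol: `query_available_nodes`, `run_query` and the use of `None` to simulate an offline site are all hypothetical.

```python
def query_available_nodes(nodes, run_query):
    """Merge results from reachable nodes and report the ones that were skipped.

    `nodes` maps a node name to a connection-like object; `run_query` is a
    callable that raises ConnectionError when a node cannot be reached.
    Both are hypothetical placeholders.
    """
    rows, unavailable = [], []
    for name, node in nodes.items():
        try:
            rows.extend(run_query(node))
        except ConnectionError:
            unavailable.append(name)   # tolerate the gap and keep going
    return rows, unavailable


def run_query(node):
    if node is None:                   # None simulates an offline site
        raise ConnectionError("node offline")
    return [reading for reading in node if reading > 10]


nodes = {"rig-a": [5, 12, 40], "rig-b": None, "rig-c": [11, 3]}
rows, unavailable = query_available_nodes(nodes, run_query)
print(rows, unavailable)
```

Returning the list of skipped nodes alongside the rows lets the caller decide whether a partial answer is good enough, which is the design difference from the "full or no results" behavior of the other two columns.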

Michael Hummel - Stop Storing Data! - Parstream
