New Business Applications Powered by In-Memory Technologies from Academia
MIT Forum for Supply Chain Innovation
HPI February 2011
Paul Hofmann, PhD
VP, Group of the Chief Scientist
SAP Chief Scientist Group
Examples of Projects with In-Memory Technology in Collaboration with Academia
© SAP AG 2011. All rights reserved. / Page 2 
Taking In-Memory Computing Seriously
Coherent Shared Memory – BigIron in Palo Alto
Keep programming model simple AND solve very complex problems
Princeton University
• Optimal pricing for energy management, online pricing, truck scheduling, …
Infinite DRAM - RAMCloud
Stanford University and HPI
• Extremely low latency and very high bandwidth
• Facebook-like problems with high read AND write rate
• Advanced analytics, what-if scenarios, demand planning, ...
Hybrid In-Memory Store
MIT CSAIL and HPI
• Aggregate column store – the best of both worlds
Multithreading Real Time Event Platform
MIT Auto-ID Lab and HPI
• 500k events/s and millions of threads in-memory or distributed
• Automatic meter reading, online billing, mobile billing, Smart Grid
Taking In-Memory Computing Seriously 
Chief Scientist Group 
Princeton University – Operations Research and Financial Engineering 
Warren Powell, et al 
Taking In-Memory Computing Seriously! 
Basic Assumptions
• Disk is tape - active data must be in DRAM
• Data locality is king → avoid cache misses and stalled CPUs
Problems and Opportunities for In-Memory Computing
• Addressable DRAM per box is limited, unlike hard disk capacity.
  We need to scale memory independently from physical boxes
• Scaling Architecture
  – Arbitrary scaling of the amount of data stored in DRAM
  – Arbitrary & independent scaling of number of active users & associated computing load
• Inter-Process Communication is slow and hard to program (latencies are in the range of 0.5-1 ms)
We can do better
Taking In-Memory Computing Seriously! 
How can we do better?
• Coherent Shared Memory or ccNUMA
  All CPUs can access all memory and all I/O channels in about 1 μs
  We can scale independently with DRAM and CPUs
  You need more computing power – add another board …
  You need more DRAM – add another board …
• Merge Application Server & DB Server – reference memory directly from app
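The two latency figures quoted so far (0.5-1 ms for inter-process communication, about 1 μs for coherent shared memory) can be put side by side with a little arithmetic. This is only a sketch of that comparison; the function and constant names are illustrative and not part of the original slides.

```python
# Latency figures as quoted on the slides, expressed in seconds.
IPC_LATENCY_RANGE = (0.5e-3, 1e-3)   # inter-process communication: 0.5-1 ms
SHARED_MEMORY_LATENCY = 1e-6         # coherent shared memory: about 1 microsecond


def speedup_range(slow_range, fast):
    """Return the (min, max) ratio of the slow latencies to the fast latency."""
    low, high = slow_range
    return low / fast, high / fast


low, high = speedup_range(IPC_LATENCY_RANGE, SHARED_MEMORY_LATENCY)
print(f"Shared memory access is roughly {low:.0f}x to {high:.0f}x faster than IPC")
```

Under these quoted numbers the gap is about three orders of magnitude, which is the motivation for referencing memory directly from the application.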
BigIron - A System We Architected for HANA with Leading-Edge Cluster Server Components
Big Iron 2: Extreme Performance, Scalability, and much simpler system model
• Large shared coherent memory (5TB) across servers via ScaleMP
• 160 cores (320 HT)
System Specifications
Research Server Cluster
• 5 x 4U Servers (4 Intel Xeon X7560 2.26 GHz)
• 160 cores (32 cores/server)
• 5TB memory (64 x 16GB DDR3/server)
• 30TB SSD (solid state disk) storage
• 5 Networks
  – VPN of ScaleMP (40-160Gb IB)
  – VPN of Server Cluster (10GbE)
  – VPN of Storage Array (10GbE)
  – VPN of SAP Internal Network (10MbE metered)
  – Firewalled GW to Internet (1GbE Expandable)
• 1 NAS (72TB Expandable to 180)
• 1 x 48U Rack
• System Software
  – SLES11 Linux OS Licenses
  – ScaleMP vSMP Licenses
• System cost: $618K with tax and support
Architecture, Assembly, & Hosting
• System architecture: SAP Technology Infrastructure Research Practice
• Assembly and Test: Colfax International
• Hosting: Bay Area Internet Solutions, Santa Clara, CA
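The headline numbers of the cluster can be cross-checked from the per-server figures. The sketch below assumes 8 cores per Xeon X7560 socket (implied by 32 cores per server with 4 sockets) and 16 GB per DIMM (the size that makes 64 DIMMs per server add up to the stated 5 TB total); both are labeled as assumptions in the code.

```python
SERVERS = 5
SOCKETS_PER_SERVER = 4
CORES_PER_SOCKET = 8       # assumption: implied by 32 cores/server across 4 sockets
DIMMS_PER_SERVER = 64
DIMM_SIZE_GB = 16          # assumption: the size that yields the stated 5 TB total
SYSTEM_COST_USD = 618_000  # "$618K with tax and support"

cores = SERVERS * SOCKETS_PER_SERVER * CORES_PER_SOCKET
hyperthreads = cores * 2
memory_tb = SERVERS * DIMMS_PER_SERVER * DIMM_SIZE_GB / 1024
cost_per_tb = SYSTEM_COST_USD / memory_tb

print(f"{cores} cores, {hyperthreads} hardware threads, {memory_tb:.0f} TB DRAM")
print(f"about ${cost_per_tb:,.0f} of system cost per TB of coherent memory")
```

The totals agree with the slide: 160 cores, 320 hyperthreads, and 5 TB of DRAM.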
Coherent Shared Memory – Improved Productivity at Low Price
Before
Traditional server clusters – Distributed memory
SAP developer for in-memory computing: Programming of Queries, Distribution of Data
Physical Server #1, Physical Server #2, … Physical Server n
• Developers need to distribute queries and data across physical servers; access to data requires mastering complex communications protocols
• Design trapped at a single “scale” – platform growth forces redesign every couple of years
• Specialized programming skills held only by SAP’s top developers
After
Server Cluster with Coherent Shared Memory
Traditional SAP developer
Coherent Shared Memory via software approach
Physical Server #1, Physical Server #2, … Physical Server n
• Developers can treat system as one “big” server and let the operating system and lower level software handle the problem
• Initial design is timeless – hardware scaling handled below app design layer
• Developers do not need additional skills for in-memory computing
Solve Very Compute Intensive Problems Like Stochastic Optimization
We need to juggle intermittent energy from wind or solar and volatile electricity prices to meet time-varying loads – Princeton has the necessary algorithms
Figure panels: Wind speed, Electricity prices, Load
We can reduce compute time from days to minutes!
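To make "stochastic optimization" concrete, here is a toy example of planning an energy purchase under uncertain wind. It is not the Princeton algorithm referred to on the slide; it is a plain sample-average sketch, and every number in it (load, prices, wind distribution) is hypothetical.

```python
import random

LOAD_MWH = 100.0         # hypothetical fixed load to be met
DAY_AHEAD_PRICE = 40.0   # hypothetical price per MWh committed in advance
REAL_TIME_PRICE = 100.0  # hypothetical price per MWh bought to cover a shortfall


def cost(commit, wind):
    """Cost of committing `commit` MWh in advance when `wind` MWh materialize."""
    shortfall = max(0.0, LOAD_MWH - wind - commit)
    return DAY_AHEAD_PRICE * commit + REAL_TIME_PRICE * shortfall


def expected_cost(commit, scenarios):
    """Average cost of a commitment over all sampled wind scenarios."""
    return sum(cost(commit, w) for w in scenarios) / len(scenarios)


def best_commitment(scenarios, step=1.0):
    """Grid search for the commitment with the lowest average cost."""
    candidates = [i * step for i in range(int(LOAD_MWH / step) + 1)]
    return min(candidates, key=lambda q: expected_cost(q, scenarios))


rng = random.Random(0)
scenarios = [rng.uniform(0.0, 60.0) for _ in range(2000)]  # sampled wind output
mean_wind = sum(scenarios) / len(scenarios)

deterministic = LOAD_MWH - mean_wind     # plan as if the mean forecast were certain
stochastic = best_commitment(scenarios)  # plan against the whole distribution

print(f"deterministic plan: {deterministic:.1f} MWh, "
      f"expected cost {expected_cost(deterministic, scenarios):.0f}")
print(f"stochastic plan:    {stochastic:.1f} MWh, "
      f"expected cost {expected_cost(stochastic, scenarios):.0f}")
```

Even in this tiny case the plan that accounts for the distribution commits more energy up front and has a lower expected cost than the plan built on the mean forecast. Realistic versions add time steps, storage, and price uncertainty, which is where the compute time grows.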
Modeling uncertainty in power scheduling
The effect of modeling uncertainty in wind
Figure: comparison of Uncertain forecast, Perfect forecast, and Constant wind at 2% wind and 40% wind (vertical axis from 0 to 12)
Modeling Uncertainty In Power Scheduling 
Designing energy portfolios…. 
… is like building a stone wall. You can do a perfect job with a perfect forecast. The 
challenge is dealing with uncertainty. 
In the beginning…. 
Infinite DRAM with RAM Cloud 
Stanford University and HPI 
John Ousterhout, Mendel Rosenblum, Christian Tinnefeld, et al 
Impact of Latency for Internet Applications 
Large-scale Apps Struggle with High Latency
Figure: Web Application; Application Server and Storage Server; 0.5-10 ms latency
RAMCloud – Stanford University 
Create distributed storage system that keeps data entirely in DRAM
Combining the main memories of thousands of servers
Use high-end network (10 Gigabit Ethernet / InfiniBand)
Replicate data synchronously into the main memories of other nodes
Asynchronous writes to disk only for backup/archival purposes
This results in:
• Gracefully scaling storage solution
• Very low access latency AND high bandwidth
• Latency of memory access via network 1-5 μs
• Eventually consistent data storage – consistency is sub-second
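The write path described on this slide (synchronous replication into the memory of other nodes, asynchronous writes to disk for backup) can be sketched in a few lines. This is a single-process toy, not RAMCloud's actual API or implementation: `Node`, `TinyRamCloud`, and all method names are hypothetical, dictionaries stand in for DRAM, and a list stands in for disk.

```python
import queue
import threading


class Node:
    """One server; a dict stands in for its DRAM."""

    def __init__(self, name):
        self.name = name
        self.memory = {}


class TinyRamCloud:
    """Hypothetical toy store: in-memory replicas first, disk later."""

    def __init__(self, nodes, replicas=2):
        self.nodes = nodes
        self.replicas = replicas
        self.disk_log = []             # stands in for the backup disk
        self._backlog = queue.Queue()  # writes waiting to reach disk
        threading.Thread(target=self._flush_to_disk, daemon=True).start()

    def _owners(self, key):
        # Pick `replicas` consecutive nodes, starting from a hash of the key.
        first = hash(key) % len(self.nodes)
        return [self.nodes[(first + i) % len(self.nodes)]
                for i in range(self.replicas)]

    def write(self, key, value):
        for node in self._owners(key):   # synchronous in-memory replication
            node.memory[key] = value
        self._backlog.put((key, value))  # disk write is deferred

    def read(self, key):
        return self._owners(key)[0].memory[key]

    def _flush_to_disk(self):
        # Background thread: drain the backlog to "disk" one entry at a time.
        while True:
            self.disk_log.append(self._backlog.get())
            self._backlog.task_done()

    def wait_for_backup(self):
        self._backlog.join()


cloud = TinyRamCloud([Node(f"node{i}") for i in range(4)])
cloud.write("order:42", {"value": 250.0})
print(cloud.read("order:42"))   # served from memory, no disk access involved
cloud.wait_for_backup()         # block until the background disk write is done
```

The point of the structure is that `write` returns as soon as the in-memory replicas are updated; durability on disk follows in the background.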
RAMCloud – Stanford University 
RAMCloud is ideal for apps with millions of concurrent users that need more complex data structures than a key-value store, e.g. Facebook, Quora, Yelp, etc.
Research Questions in the Context of Enterprise Applications
What-If Analytics
• Ideally, companies want to do what-if analytics on their complete history of transactional data, not on a subset (think about Walmart or Unilever)
• What-if analytics need high bursts of read AND write access
• How can the needed data be modeled and placed in a RAMCloud?
• How can the needed data operations be implemented in a scalable way?
Transactional Properties
• Enterprise applications rely on transactional properties
• RAMCloud provides extremely low latency, but is only eventually consistent
• Are the reduced transaction times sufficient to avoid lock wait times or reduce aborted transactions?
Hybrid In-Memory Store HYRISE
MIT CSAIL and HPI 
Sam Madden, Philippe Cudre-Mauroux, Jens Krueger, Martin Grund, et al 
Hybrid Partitioning for Mixed Workloads
Figure: one table (Rows 1-4) laid out three ways; workload axis: OLTP, OLAP
Row Store: Doc Num, Doc Date, Sold-To, Value, Sales Org, Status
Column Store: Doc Num, Doc Date, Sold-To, Value, Sales Org, Status
Hybrid Store: Doc Num, Sold-To, Doc Date, Value, Sales Org, Status
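The three layouts can be illustrated with a small vertical-partitioning sketch over the six columns shown in the figure. The sample rows are made up, and the particular column grouping used for the hybrid store is an assumption chosen for illustration; the figure does not say which columns share a container.

```python
COLUMNS = ["DocNum", "DocDate", "SoldTo", "Value", "SalesOrg", "Status"]

# Made-up sample rows, one tuple per sales document.
ROWS = [
    (1, "2011-01-03", "C100", 250.0, "US01", "open"),
    (2, "2011-01-04", "C200", 100.0, "DE01", "paid"),
    (3, "2011-01-04", "C100", 75.5, "US01", "open"),
    (4, "2011-01-05", "C300", 310.0, "DE01", "paid"),
]


def partition(rows, groups):
    """Vertically partition rows: one container (list of tuples) per column group."""
    layout = {}
    for group in groups:
        positions = [COLUMNS.index(name) for name in group]
        layout[tuple(group)] = [tuple(row[p] for p in positions) for row in rows]
    return layout


def total_value(layout):
    """OLAP-style aggregate: touch only the container that holds Value."""
    for group, container in layout.items():
        if "Value" in group:
            pos = group.index("Value")
            return sum(record[pos] for record in container)
    raise KeyError("Value")


row_store = partition(ROWS, [COLUMNS])                        # one wide container
column_store = partition(ROWS, [[name] for name in COLUMNS])  # one container per column
hybrid_store = partition(                                     # hypothetical grouping
    ROWS, [["DocNum", "SoldTo"], ["DocDate"], ["Value"], ["SalesOrg", "Status"]]
)

for name, layout in [("row", row_store), ("column", column_store), ("hybrid", hybrid_store)]:
    print(name, len(layout), "containers, total value", total_value(layout))
```

All three layouts return the same aggregate; what differs is how much unrelated data sits next to `Value` in memory, which is what a cache-miss cost model measures.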
HYRISE Architecture 
■ Query Processor - chooses the best possible query plan for the hybrid data storage structure
■ Layout Manager - given a workload, it performs an evaluation based on a cache-miss-based cost model and generates the best possible layout
■ Storage Manager - main memory hybrid storage manager capable of vertical partitioning in single relations
Cost Model 
Used to describe the layout-dependent costs based on cache misses
Combine complex access patterns from simpler ones (scan, lookup, …)
Measure
• All different access variants to the hybrid storage layer - full projection, partial projection, selection, tuple reconstruction
• Different performance parameters - compiler settings, hardware pre-fetcher, …
Observe
• Combined behavior of all components – look at total workload
• Count cache misses using the CPU hardware performance counter
Understand
• Develop the cost model
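A cache-miss cost model can be illustrated with a deliberately simplified estimate for one access pattern, a sequential scan that projects a single attribute. This is not the HYRISE cost model; it ignores alignment and prefetching, and the 64-byte cache line and the container widths are assumptions.

```python
import math

CACHE_LINE_BYTES = 64  # assumption: common cache-line size


def scan_misses(n_rows, container_width, attr_width, line=CACHE_LINE_BYTES):
    """Estimate cache lines touched when scanning one attribute of a container.

    container_width: bytes per record in the container holding the attribute
    attr_width:      bytes of the attribute actually read
    """
    if container_width - attr_width < line:
        # The gap between two needed values is smaller than a cache line,
        # so no line can be skipped: the whole container is pulled in.
        return math.ceil(n_rows * container_width / line)
    # Wide records: roughly the lines spanned by the attribute, once per record.
    return n_rows * math.ceil(attr_width / line)


rows = 1_000_000
column_layout = scan_misses(rows, container_width=8, attr_width=8)  # narrow container
row_layout = scan_misses(rows, container_width=128, attr_width=8)   # hypothetical wide row

print(f"column container: {column_layout:,} lines")
print(f"row container:    {row_layout:,} lines")
print(f"ratio: {row_layout / column_layout:.0f}x")
```

A layout manager would evaluate estimates like this for every access pattern in the workload and pick the partitioning with the lowest total.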
Results 
Multithreading Real Time Event Platform 
MIT Auto-ID Lab and HPI
John Williams, Sergio Herrero, Abel Sanchez, et al 
Motivation: Rapid Growth of Events and Messaging Platforms
A Comparative Study of Data Storage and Processing Architectures
Verizon and T-Mobile: 2-3 days to generate phone bill
iTunes: 24 hours to generate bill
Uninterrupted growth of online billing systems (Hulu, Netflix…)
Dynamic Pricing on SmartGrid requires design of infrastructure capable of ingesting millions of events in quasi-real time
Goal: Design a multi-threaded system that produces the electricity consumption bill of a city of 1M households
8 hours → seconds
Smart Meter Reading Problem 
Figure: Users and Energy Producers; stages Data Generation, Data Persistence, Data Processing
Multithreaded Simulator for Smart Meter
Figure: simulator architecture
Layers: Management Layer, Generation Layer, Storage Layer, Consumption Layer
Components: Simulation Manager, Meter Read Generators, Head Ends, MDUS (Meter Data Unification System), Query Clients (3), Web Service Interfaces
Platforms: DRYAD, HADOOPDB, SAP HANA
Conclusion 
Platform that handles billions of events/day AND large numbers of threads on one machine (> 1 million), e.g. Siemens 500k events/s
RDBMS (used by today’s MDUS vendors) provides good query performance but does not scale to millions of households (8 h)
Distributed File System (DFS): Provides the scalability, reliability & insert performance necessary for the storage of Smart Meter Reader data
Using MapReduce on top of a DFS (HDFS or Ceph) is good for batch processing systems
Loading data from DFS and executing queries in-memory using HANA provides very good performance results for real time queries → RAPTOR
Prototype for SmartGrid that ingests smart meter data in real time, does dynamic pricing (4 buckets), stores in DFS & does real time analytics
Bill for 1 M households in seconds
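The billing computation itself is a partitioned aggregation: shard the meter readings by household, price each reading by its time-of-day bucket, and sum per household. The sketch below shows that structure at a small scale. The four price buckets and their rates are hypothetical, and Python threads are used only to show the partitioning, not to claim the performance reported for the prototype.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def price_for_hour(hour):
    """Hypothetical tariff with four time-of-day buckets (price per kWh)."""
    if hour < 6:
        return 0.08   # night
    if hour < 12:
        return 0.15   # morning
    if hour < 18:
        return 0.12   # afternoon
    return 0.20       # evening peak


def bill_partition(readings):
    """Bill one shard of (household_id, hour, kwh) readings."""
    bills = defaultdict(float)
    for household, hour, kwh in readings:
        bills[household] += kwh * price_for_hour(hour)
    return dict(bills)


def bill_all(readings, workers=4):
    """Shard readings by household, bill the shards in parallel, merge the results."""
    shards = [[] for _ in range(workers)]
    for reading in readings:
        shards[reading[0] % workers].append(reading)  # a household stays in one shard
    merged = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(bill_partition, shards):
            merged.update(partial)
    return merged


# Synthetic day of data: 1,000 households, one 1 kWh reading per hour.
readings = [(h, hour, 1.0) for h in range(1000) for hour in range(24)]
bills = bill_all(readings)
print(len(bills), "bills; household 0 owes", round(bills[0], 2))
```

Because every household's readings land in exactly one shard, the partial results never overlap and the merge step is a plain dictionary update.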
ELECTRONIC NERVOUS SYSTEM
Sense
• Collect current conditions at fine grain in real time
Analyze
• Access real-time data – analyze data, learn, build models
Actuate
• Control (soft, hard, persuasive, personal)
Figure labels: GPS, SIM Card
Knowledge axis: No Prior Knowledge to Perfect Knowledge
Data axis: No Data to Complete Data
Analytical Approach:
$$y_i = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x_i^2}{2}\right)$$
Inductive Approach:
X1  -1    Y1  0.02540487
X2  -0.9  Y2  0.02779527
X3  -0.8  Y3  0.03010825
X4  -0.7  Y4  0.03228947
X5  -0.6  Y5  0.03428442
..  ..    ..  ..

More Related Content

What's hot

Big Data World Forum
Big Data World ForumBig Data World Forum
Big Data World Forum
bigdatawf
 
How to Succeed in the Cloud (Financially)
How to Succeed in the Cloud (Financially)How to Succeed in the Cloud (Financially)
How to Succeed in the Cloud (Financially)
Rand Group
 
TUSC-Piocon OBIEE Case Studies
TUSC-Piocon OBIEE Case StudiesTUSC-Piocon OBIEE Case Studies
TUSC-Piocon OBIEE Case Studies
Mark West
 
Big Data: Infrastructure Implications for “The Enterprise of Things” - Stampe...
Big Data: Infrastructure Implications for “The Enterprise of Things” - Stampe...Big Data: Infrastructure Implications for “The Enterprise of Things” - Stampe...
Big Data: Infrastructure Implications for “The Enterprise of Things” - Stampe...
StampedeCon
 
HP Moonshot. Progettato per i Data Center, costruito per il pianeta.
HP Moonshot. Progettato per i Data Center, costruito per il pianeta.HP Moonshot. Progettato per i Data Center, costruito per il pianeta.
HP Moonshot. Progettato per i Data Center, costruito per il pianeta.
HP Enterprise Italia
 
Why Infrastructure Matters for Big Data & Analytics
Why Infrastructure Matters for Big Data & AnalyticsWhy Infrastructure Matters for Big Data & Analytics
Why Infrastructure Matters for Big Data & Analytics
Rick Perret
 
SAP vs SAS - Comparison
SAP vs SAS - ComparisonSAP vs SAS - Comparison
SAP vs SAS - Comparison
Arnab Roy Chowdhury
 
Monitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service ProvidersMonitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service Providers
DataWorks Summit
 
Building Innovative Industry Solutions for System z
Building Innovative Industry Solutions for System zBuilding Innovative Industry Solutions for System z
Building Innovative Industry Solutions for System z
dkang
 
Infosys – Cloud Business Value Architecture
Infosys – Cloud Business Value ArchitectureInfosys – Cloud Business Value Architecture
Infosys – Cloud Business Value Architecture
Infosys
 
IBM InfoSphere Data Replication for Big Data
IBM InfoSphere Data Replication for Big DataIBM InfoSphere Data Replication for Big Data
IBM InfoSphere Data Replication for Big Data
IBM Analytics
 
Dell AI Oil and Gas Webinar
Dell AI Oil and Gas WebinarDell AI Oil and Gas Webinar
Dell AI Oil and Gas Webinar
Bill Wong
 
Big Data World Forum
Big Data World ForumBig Data World Forum
Big Data World Forum
bigdatawf
 
next-generation-data-centers
next-generation-data-centersnext-generation-data-centers
next-generation-data-centers
Jason Hoffman
 
Big Memory Webcast
Big Memory WebcastBig Memory Webcast
Big Memory Webcast
MemVerge
 
IBM Power Systems at the heart of Cognitive Solutions
IBM Power Systems at the heart of Cognitive SolutionsIBM Power Systems at the heart of Cognitive Solutions
IBM Power Systems at the heart of Cognitive Solutions
David Spurway
 
Haven 2 0
Haven 2 0 Haven 2 0
Keynote for the IBM Avnet Indonesia MSP Day
Keynote for the IBM Avnet Indonesia MSP DayKeynote for the IBM Avnet Indonesia MSP Day
Keynote for the IBM Avnet Indonesia MSP Day
Pandu W Sastrowardoyo
 
1524 how ibm's big data solution can help you gain insight into your data cen...
1524 how ibm's big data solution can help you gain insight into your data cen...1524 how ibm's big data solution can help you gain insight into your data cen...
1524 how ibm's big data solution can help you gain insight into your data cen...
IBM
 
PaaS: Open For Business
PaaS: Open For Business PaaS: Open For Business
PaaS: Open For Business
VMware Tanzu
 

What's hot (20)

Big Data World Forum
Big Data World ForumBig Data World Forum
Big Data World Forum
 
How to Succeed in the Cloud (Financially)
How to Succeed in the Cloud (Financially)How to Succeed in the Cloud (Financially)
How to Succeed in the Cloud (Financially)
 
TUSC-Piocon OBIEE Case Studies
TUSC-Piocon OBIEE Case StudiesTUSC-Piocon OBIEE Case Studies
TUSC-Piocon OBIEE Case Studies
 
Big Data: Infrastructure Implications for “The Enterprise of Things” - Stampe...
Big Data: Infrastructure Implications for “The Enterprise of Things” - Stampe...Big Data: Infrastructure Implications for “The Enterprise of Things” - Stampe...
Big Data: Infrastructure Implications for “The Enterprise of Things” - Stampe...
 
HP Moonshot. Progettato per i Data Center, costruito per il pianeta.
HP Moonshot. Progettato per i Data Center, costruito per il pianeta.HP Moonshot. Progettato per i Data Center, costruito per il pianeta.
HP Moonshot. Progettato per i Data Center, costruito per il pianeta.
 
Why Infrastructure Matters for Big Data & Analytics
Why Infrastructure Matters for Big Data & AnalyticsWhy Infrastructure Matters for Big Data & Analytics
Why Infrastructure Matters for Big Data & Analytics
 
SAP vs SAS - Comparison
SAP vs SAS - ComparisonSAP vs SAS - Comparison
SAP vs SAS - Comparison
 
Monitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service ProvidersMonitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service Providers
 
Building Innovative Industry Solutions for System z
Building Innovative Industry Solutions for System zBuilding Innovative Industry Solutions for System z
Building Innovative Industry Solutions for System z
 
Infosys – Cloud Business Value Architecture
Infosys – Cloud Business Value ArchitectureInfosys – Cloud Business Value Architecture
Infosys – Cloud Business Value Architecture
 
IBM InfoSphere Data Replication for Big Data
IBM InfoSphere Data Replication for Big DataIBM InfoSphere Data Replication for Big Data
IBM InfoSphere Data Replication for Big Data
 
Dell AI Oil and Gas Webinar
Dell AI Oil and Gas WebinarDell AI Oil and Gas Webinar
Dell AI Oil and Gas Webinar
 
Big Data World Forum
Big Data World ForumBig Data World Forum
Big Data World Forum
 
next-generation-data-centers
next-generation-data-centersnext-generation-data-centers
next-generation-data-centers
 
Big Memory Webcast
Big Memory WebcastBig Memory Webcast
Big Memory Webcast
 
IBM Power Systems at the heart of Cognitive Solutions
IBM Power Systems at the heart of Cognitive SolutionsIBM Power Systems at the heart of Cognitive Solutions
IBM Power Systems at the heart of Cognitive Solutions
 
Haven 2 0
Haven 2 0 Haven 2 0
Haven 2 0
 
Keynote for the IBM Avnet Indonesia MSP Day
Keynote for the IBM Avnet Indonesia MSP DayKeynote for the IBM Avnet Indonesia MSP Day
Keynote for the IBM Avnet Indonesia MSP Day
 
1524 how ibm's big data solution can help you gain insight into your data cen...
1524 how ibm's big data solution can help you gain insight into your data cen...1524 how ibm's big data solution can help you gain insight into your data cen...
1524 how ibm's big data solution can help you gain insight into your data cen...
 
PaaS: Open For Business
PaaS: Open For Business PaaS: Open For Business
PaaS: Open For Business
 

Viewers also liked

Dynamic Search Using Semantics & Statistics
Dynamic Search Using Semantics & StatisticsDynamic Search Using Semantics & Statistics
Dynamic Search Using Semantics & Statistics
Paul Hofmann
 
Economics of Cloud Computing
Economics of Cloud ComputingEconomics of Cloud Computing
Economics of Cloud Computing
Paul Hofmann
 
RFID Simulation of the US Pharmaceutical Supply Chain
RFID Simulation of the US Pharmaceutical Supply ChainRFID Simulation of the US Pharmaceutical Supply Chain
RFID Simulation of the US Pharmaceutical Supply Chain
Paul Hofmann
 
LINK TO VIDEOS
LINK TO VIDEOSLINK TO VIDEOS
LINK TO VIDEOS
Saffron Technology Inc.
 
Saffron Tech Company Profile
Saffron Tech Company ProfileSaffron Tech Company Profile
Saffron Tech Company Profile
IT Chimes
 
e-Learning Reimagined: the Secret to Achieving and Measuring ROI
e-Learning Reimagined: the Secret to Achieving and Measuring ROIe-Learning Reimagined: the Secret to Achieving and Measuring ROI
e-Learning Reimagined: the Secret to Achieving and Measuring ROI
Saffron Interactive
 
Production technology and processing of saffron (crocus) by Mr Allah Dad Khan...
Production technology and processing of saffron (crocus) by Mr Allah Dad Khan...Production technology and processing of saffron (crocus) by Mr Allah Dad Khan...
Production technology and processing of saffron (crocus) by Mr Allah Dad Khan...
Mr.Allah Dad Khan
 
Saffron
SaffronSaffron
深層学習フレームワーク Chainer の開発と今後の展開
深層学習フレームワーク Chainer の開発と今後の展開深層学習フレームワーク Chainer の開発と今後の展開
深層学習フレームワーク Chainer の開発と今後の展開
Seiya Tokui
 

Viewers also liked (9)

Dynamic Search Using Semantics & Statistics
Dynamic Search Using Semantics & StatisticsDynamic Search Using Semantics & Statistics
Dynamic Search Using Semantics & Statistics
 
Economics of Cloud Computing
Economics of Cloud ComputingEconomics of Cloud Computing
Economics of Cloud Computing
 
RFID Simulation of the US Pharmaceutical Supply Chain
RFID Simulation of the US Pharmaceutical Supply ChainRFID Simulation of the US Pharmaceutical Supply Chain
RFID Simulation of the US Pharmaceutical Supply Chain
 
LINK TO VIDEOS
LINK TO VIDEOSLINK TO VIDEOS
LINK TO VIDEOS
 
Saffron Tech Company Profile
Saffron Tech Company ProfileSaffron Tech Company Profile
Saffron Tech Company Profile
 
e-Learning Reimagined: the Secret to Achieving and Measuring ROI
e-Learning Reimagined: the Secret to Achieving and Measuring ROIe-Learning Reimagined: the Secret to Achieving and Measuring ROI
e-Learning Reimagined: the Secret to Achieving and Measuring ROI
 
Production technology and processing of saffron (crocus) by Mr Allah Dad Khan...
Production technology and processing of saffron (crocus) by Mr Allah Dad Khan...Production technology and processing of saffron (crocus) by Mr Allah Dad Khan...
Production technology and processing of saffron (crocus) by Mr Allah Dad Khan...
 
Saffron
SaffronSaffron
Saffron
 
深層学習フレームワーク Chainer の開発と今後の展開
深層学習フレームワーク Chainer の開発と今後の展開深層学習フレームワーク Chainer の開発と今後の展開
深層学習フレームワーク Chainer の開発と今後の展開
 

Similar to New Business Applications Powered by In-Memory Technology @MIT Forum for Supply Chain Innovation, 2011

NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in Vacouver
NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in VacouverNGD Systems and Microsoft Keynote Presentation at IPDPS MPP in Vacouver
NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in Vacouver
Scott Shadley, MBA,PMC-III
 
DM Radio Webinar: Adopting a Streaming-Enabled Architecture
DM Radio Webinar: Adopting a Streaming-Enabled ArchitectureDM Radio Webinar: Adopting a Streaming-Enabled Architecture
DM Radio Webinar: Adopting a Streaming-Enabled Architecture
DATAVERSITY
 
IBM Spectrum Scale Overview november 2015
IBM Spectrum Scale Overview november 2015IBM Spectrum Scale Overview november 2015
IBM Spectrum Scale Overview november 2015
Doug O'Flaherty
 
K5.Fujitsu World Tour 2016-Winning with NetApp in Digital Transformation Age,...
K5.Fujitsu World Tour 2016-Winning with NetApp in Digital Transformation Age,...K5.Fujitsu World Tour 2016-Winning with NetApp in Digital Transformation Age,...
K5.Fujitsu World Tour 2016-Winning with NetApp in Digital Transformation Age,...
Fujitsu India
 
NetApp All Flash storage
NetApp All Flash storageNetApp All Flash storage
NetApp All Flash storage
MarketingArrowECS_CZ
 
HPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big DataHPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big Data
Lviv Startup Club
 
Saviak lviv ai-2019-e-mail (1)
Saviak lviv ai-2019-e-mail (1)Saviak lviv ai-2019-e-mail (1)
Saviak lviv ai-2019-e-mail (1)
Lviv Startup Club
 
Webinar: Three Reasons Why NAS is No Good for AI and Machine Learning
Webinar: Three Reasons Why NAS is No Good for AI and Machine LearningWebinar: Three Reasons Why NAS is No Good for AI and Machine Learning
Webinar: Three Reasons Why NAS is No Good for AI and Machine Learning
Storage Switzerland
 
Green Plum IIIT- Allahabad
Green Plum IIIT- Allahabad Green Plum IIIT- Allahabad
Green Plum IIIT- Allahabad
IIIT ALLAHABAD
 
Aerospike: Enabling Your Digital Transformation
Aerospike: Enabling Your Digital TransformationAerospike: Enabling Your Digital Transformation
Aerospike: Enabling Your Digital Transformation
Brillix
 
Big Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case StudyBig Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case Study
Nati Shalom
 
Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...
Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...
Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...
StampedeCon
 
Presentazione PernixData @ VMUGIT UserCon 2015
Presentazione PernixData @ VMUGIT UserCon 2015Presentazione PernixData @ VMUGIT UserCon 2015
Presentazione PernixData @ VMUGIT UserCon 2015
VMUG IT
 
IBM Data Centric Systems & OpenPOWER
IBM Data Centric Systems & OpenPOWERIBM Data Centric Systems & OpenPOWER
IBM Data Centric Systems & OpenPOWER
inside-BigData.com
 
Macroview Netapp Overview
Macroview Netapp OverviewMacroview Netapp Overview
Macroview Netapp Overview
Alex Tsui
 
Autodesk Technical Webinar: SAP HANA in-memory database
Autodesk Technical Webinar: SAP HANA in-memory databaseAutodesk Technical Webinar: SAP HANA in-memory database
Autodesk Technical Webinar: SAP HANA in-memory database
SAP PartnerEdge program for Application Development
 
Architecting and Tuning IIB/eXtreme Scale for Maximum Performance and Reliabi...
Architecting and Tuning IIB/eXtreme Scale for Maximum Performance and Reliabi...Architecting and Tuning IIB/eXtreme Scale for Maximum Performance and Reliabi...
Architecting and Tuning IIB/eXtreme Scale for Maximum Performance and Reliabi...
Prolifics
 
EMEA TechTalk – The NetApp Flash Optimized Portfolio
EMEA TechTalk – The NetApp Flash Optimized PortfolioEMEA TechTalk – The NetApp Flash Optimized Portfolio
EMEA TechTalk – The NetApp Flash Optimized Portfolio
NetApp
 
Add Memory, Improve Performance, and Lower Costs with IBM MAX5 Technology
Add Memory, Improve Performance, and Lower Costs with IBM MAX5 TechnologyAdd Memory, Improve Performance, and Lower Costs with IBM MAX5 Technology
Add Memory, Improve Performance, and Lower Costs with IBM MAX5 Technology
IBM India Smarter Computing
 
Tendencias Storage
Tendencias StorageTendencias Storage
Tendencias Storage
Fran Navarro
 

Similar to New Business Applications Powered by In-Memory Technology @MIT Forum for Supply Chain Innovation, 2011 (20)

NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in Vacouver
NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in VacouverNGD Systems and Microsoft Keynote Presentation at IPDPS MPP in Vacouver
NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in Vacouver
 
DM Radio Webinar: Adopting a Streaming-Enabled Architecture
DM Radio Webinar: Adopting a Streaming-Enabled ArchitectureDM Radio Webinar: Adopting a Streaming-Enabled Architecture
DM Radio Webinar: Adopting a Streaming-Enabled Architecture
 
IBM Spectrum Scale Overview november 2015
IBM Spectrum Scale Overview november 2015IBM Spectrum Scale Overview november 2015
IBM Spectrum Scale Overview november 2015
 
K5.Fujitsu World Tour 2016-Winning with NetApp in Digital Transformation Age,...
K5.Fujitsu World Tour 2016-Winning with NetApp in Digital Transformation Age,...K5.Fujitsu World Tour 2016-Winning with NetApp in Digital Transformation Age,...
K5.Fujitsu World Tour 2016-Winning with NetApp in Digital Transformation Age,...
 
NetApp All Flash storage
NetApp All Flash storageNetApp All Flash storage
NetApp All Flash storage
 
HPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big DataHPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big Data
 
Saviak lviv ai-2019-e-mail (1)
Saviak lviv ai-2019-e-mail (1)Saviak lviv ai-2019-e-mail (1)
Saviak lviv ai-2019-e-mail (1)
 
Webinar: Three Reasons Why NAS is No Good for AI and Machine Learning
Webinar: Three Reasons Why NAS is No Good for AI and Machine LearningWebinar: Three Reasons Why NAS is No Good for AI and Machine Learning
Webinar: Three Reasons Why NAS is No Good for AI and Machine Learning
 
Green Plum IIIT- Allahabad
Green Plum IIIT- Allahabad Green Plum IIIT- Allahabad
Green Plum IIIT- Allahabad
 
Aerospike: Enabling Your Digital Transformation
Aerospike: Enabling Your Digital TransformationAerospike: Enabling Your Digital Transformation
Aerospike: Enabling Your Digital Transformation
 
Big Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case StudyBig Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case Study
 
Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...
Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...
Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...
 
Presentazione PernixData @ VMUGIT UserCon 2015
Presentazione PernixData @ VMUGIT UserCon 2015Presentazione PernixData @ VMUGIT UserCon 2015
Presentazione PernixData @ VMUGIT UserCon 2015
 
IBM Data Centric Systems & OpenPOWER
IBM Data Centric Systems & OpenPOWERIBM Data Centric Systems & OpenPOWER
IBM Data Centric Systems & OpenPOWER
 
Macroview Netapp Overview
Macroview Netapp OverviewMacroview Netapp Overview
Macroview Netapp Overview
 
Autodesk Technical Webinar: SAP HANA in-memory database
Autodesk Technical Webinar: SAP HANA in-memory databaseAutodesk Technical Webinar: SAP HANA in-memory database
Autodesk Technical Webinar: SAP HANA in-memory database
 
Architecting and Tuning IIB/eXtreme Scale for Maximum Performance and Reliabi...
Architecting and Tuning IIB/eXtreme Scale for Maximum Performance and Reliabi...Architecting and Tuning IIB/eXtreme Scale for Maximum Performance and Reliabi...
Architecting and Tuning IIB/eXtreme Scale for Maximum Performance and Reliabi...
 
EMEA TechTalk – The NetApp Flash Optimized Portfolio
EMEA TechTalk – The NetApp Flash Optimized PortfolioEMEA TechTalk – The NetApp Flash Optimized Portfolio
EMEA TechTalk – The NetApp Flash Optimized Portfolio
 
Add Memory, Improve Performance, and Lower Costs with IBM MAX5 Technology
Add Memory, Improve Performance, and Lower Costs with IBM MAX5 TechnologyAdd Memory, Improve Performance, and Lower Costs with IBM MAX5 Technology
Add Memory, Improve Performance, and Lower Costs with IBM MAX5 Technology
 
Tendencias Storage
Tendencias StorageTendencias Storage
Tendencias Storage
 

New Business Applications Powered by In-Memory Technology @MIT Forum for Supply Chain Innovation, 2011

  • 1. New Business Applications Powered by In-Memory Technologies from Academia MIT Forum for Supply Chain Innovation HPI February 2011 Paul Hofmann, PhD VP, Group of the Chief Scientist SAP Chief Scientist Group
  • 2. Examples of Projects with In-Memory Technology in Collaboration with Academia: Taking In-Memory Computing Seriously
      – Coherent Shared Memory – BigIron in Palo Alto (Princeton University): keep the programming model simple AND solve very complex problems; optimal pricing for energy management, online pricing, truck scheduling, …
      – Infinite DRAM – RAMCloud (Stanford University and HPI): extremely low latency and very high bandwidth; Facebook-like problems with high read AND write rates; advanced analytics, what-if scenarios, demand planning, ...
      – Hybrid In-Memory Store (MIT CSAIL and HPI): aggregate column store – the best of both worlds
      – Multithreading Real Time Event Platform (MIT Auto-ID Lab and HPI): 500k events/s and millions of threads in-memory or distributed; automatic meter reading, online billing, mobile billing, Smart Grid
  • 3. Taking In-Memory Computing Seriously (Chief Scientist Group). Princeton University – Operations Research and Financial Engineering: Warren Powell, et al.
  • 4. Taking In-Memory Computing Seriously! Basic assumptions: disk is tape, so active data must be in DRAM; data locality is king, so avoid cache misses and stalled CPUs. Problems and opportunities for in-memory computing: (1) Addressable DRAM per box is limited, unlike hard disks, so we need to scale memory independently of physical boxes. (2) Scaling architecture: arbitrary scaling of the amount of data stored in DRAM, and arbitrary and independent scaling of the number of active users and the associated computing load. (3) Inter-process communication is slow and hard to program (latencies are in the range of 0.5-1 ms). We can do better.
  • 5. Taking In-Memory Computing Seriously! How can we do better? Coherent shared memory (ccNUMA): all CPUs can access all memory and all I/O channels in about 1 μs, and we can scale DRAM and CPUs independently. Need more computing power? Add another board. Need more DRAM? Add another board. Merge the application server and the DB server, so that the app references memory directly.
  • 6. BigIron – A System We Architected for HANA with Leading-Edge Cluster Server Components ("Big Iron 2": extreme performance, scalability, and a much simpler system model)
      Architecture, assembly, hosting:
      – System architecture: SAP Technology Infrastructure Research Practice
      – Assembly and test: Colfax International
      – Hosting: Bay Area Internet Solutions, Santa Clara, CA
      System specifications (research server cluster):
      – Large coherent shared memory (5 TB) across servers via ScaleMP
      – 5 x 4U servers (4 Intel Xeon X7560, 2.26 GHz, per server)
      – 160 cores (32 cores/server), 320 with Hyper-Threading
      – 5 TB memory (64 x 16 GB DDR3 per server)
      – 30 TB SSD (solid-state disk) storage
      – 5 networks: VPN of ScaleMP (40-160 Gb IB); VPN of server cluster (10 GbE); VPN of storage array (10 GbE); VPN of SAP internal network (10 MbE, metered); firewalled gateway to the Internet (1 GbE, expandable)
      – 1 NAS (72 TB, expandable to 180)
      – 1 x 48U rack
      – System software: SLES 11 Linux OS licenses, ScaleMP vSMP licenses
      – System cost: $618K including tax, support, and hosting
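The headline figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check, not part of the original deck: it assumes 16 GB memory modules (the size needed for 64 modules per server to add up to the stated 5 TB) and uses the fact that the Xeon X7560 is an 8-core part with two hardware threads per core.

```python
# Figures from the slide, plus two labeled assumptions.
SERVERS = 5
SOCKETS_PER_SERVER = 4      # "4 Intel Xeon X7560" per server
CORES_PER_SOCKET = 8        # the X7560 is an 8-core CPU
THREADS_PER_CORE = 2        # Hyper-Threading
DIMMS_PER_SERVER = 64
DIMM_GB = 16                # assumption: 16 GB modules, needed to reach 5 TB

cores = SERVERS * SOCKETS_PER_SERVER * CORES_PER_SOCKET
threads = cores * THREADS_PER_CORE
memory_tb = SERVERS * DIMMS_PER_SERVER * DIMM_GB / 1024

print(f"{cores} cores, {threads} hardware threads, {memory_tb} TB DRAM")
```

With these inputs the totals agree with the slide: 160 cores, 320 hardware threads, and 5 TB of DRAM.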
  • 7. Coherent Shared Memory – Improved Productivity at Low Price
      Before: traditional server clusters with distributed memory (physical servers #1, #2, … n). The SAP developer for in-memory computing programs both the queries and the distribution of data.
      – Developers need to distribute queries and data across physical servers; access to data requires mastering complex communications protocols
      – Design is trapped at a single "scale": platform growth forces a redesign every couple of years
      – Specialized programming skills are held only by SAP's top developers
      After: server cluster with coherent shared memory via a software approach (physical servers #1, #2, … n), programmed by a traditional SAP developer.
      – Developers can treat the system as one "big" server and let the operating system and lower-level software handle the problem
      – Initial design is timeless: hardware scaling is handled below the app design layer
      – Developers do not need additional skills for in-memory computing
  • 8. Solve Very Compute-Intensive Problems Like Stochastic Optimization. We need to juggle intermittent energy from wind or solar and volatile electricity prices to meet time-varying loads. Princeton has the necessary algorithms. We can reduce compute time from days to minutes! [Charts: wind speed, electricity prices, load]
  • 9. Modeling Uncertainty in Power Scheduling. [Chart: the effect of modeling uncertainty in wind; vertical axis 0 to 12; labels: uncertain forecast, perfect forecast, constant wind, 2% wind, 40% wind]
  • 10. Modeling Uncertainty in Power Scheduling. Designing energy portfolios is like building a stone wall. You can do a perfect job with a perfect forecast. The challenge is dealing with uncertainty.
  • 11. In the beginning…
  • 12. Infinite DRAM with RAMCloud. Stanford University and HPI: John Ousterhout, Mendel Rosenblum, Christian Tinnefeld, et al.
  • 13. Impact of Latency for Internet Applications. Large-scale apps struggle with high latency. [Diagram: web application, application server, storage server; 0.5-10 ms latency]
  • 14. RAMCloud – Stanford University. Create a distributed storage system that keeps data entirely in DRAM:
      – Combine the main memories of thousands of servers
      – Use a high-end network (10 Gigabit Ethernet / InfiniBand)
      – Replicate data synchronously into the main memories of other nodes
      – Write to disk asynchronously, for backup/archival purposes only
      This results in:
      – A gracefully scaling storage solution
      – Very low access latency AND high bandwidth
      – Latency of memory access via the network of 1-5 μs
      – Eventually consistent data storage, with consistency reached in under a second
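To make the latency figures concrete, the following sketch compares how many dependent (one-after-another) storage accesses fit into a single request at the 0.5-10 ms storage-server latency quoted earlier versus the 1-5 μs RAMCloud latency. The 100 ms request budget is an assumption chosen for illustration, not a number from the deck.

```python
# Assumed per-request time budget for illustration: 100 ms, in microseconds.
BUDGET_US = 100_000

# Latency ranges quoted in the deck, in microseconds.
LATENCIES_US = {
    "storage server, best case (0.5 ms)": 500,
    "storage server, worst case (10 ms)": 10_000,
    "RAMCloud, best case (1 us)": 1,
    "RAMCloud, worst case (5 us)": 5,
}

def sequential_accesses(budget_us: int, latency_us: int) -> int:
    """Number of dependent accesses that fit into the budget."""
    return budget_us // latency_us

for name, latency in LATENCIES_US.items():
    print(f"{name}: {sequential_accesses(BUDGET_US, latency)} accesses")
```

Under this assumed budget, a request can chain between 10 and 200 dependent accesses against a conventional storage server, but between 20,000 and 100,000 against memory reached over the network.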
  • 15. RAMCloud – Stanford University. RAMCloud is ideal for apps with millions of concurrent users that need more complex data structures than a key-value store, e.g. Facebook, Quora, Yelp. Research questions in the context of enterprise applications:
      – What-if analytics: ideally, companies want to run what-if analytics on their complete history of transactional data, not on a subset (think of WalMart or Unilever). What-if analytics need high bursts of read AND write access. How can the needed data be modeled and placed in a RAMCloud? How can the needed data operations be implemented in a scalable way?
      – Transactional properties: enterprise applications rely on transactional properties. RAMCloud provides extremely low latency, but is only eventually consistent. Are the reduced transaction times sufficient to avoid lock wait times or reduce aborted transactions?
  • 16. Hybrid In-Memory Store HYRISE. MIT CSAIL and HPI: Sam Madden, Philippe Cudre-Mauroux, Jens Krueger, Martin Grund, et al.
  • 17. Hybrid Partitioning for Mixed Workloads (OLTP and OLAP). [Diagram: the same four rows, with columns Doc Num, Doc Date, Sold-To, Value, Sales Org, Status, laid out as a row store, a column store, and a hybrid store]
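The three layouts in this diagram can be sketched as one function that differs only in how columns are grouped into vertical partitions. This is a minimal illustration, not HYRISE code: the sample values and the particular hybrid grouping are hypothetical, and only the column names come from the slide.

```python
# Column names follow the slide; values and the hybrid grouping are hypothetical.
COLUMNS = ["doc_num", "doc_date", "sold_to", "value", "sales_org", "status"]

rows = [
    {"doc_num": 1, "doc_date": "2011-02-01", "sold_to": "A",
     "value": 10.0, "sales_org": "US", "status": "open"},
    {"doc_num": 2, "doc_date": "2011-02-02", "sold_to": "B",
     "value": 20.0, "sales_org": "DE", "status": "paid"},
]

def serialize(rows, partitions):
    """Return one flat list per vertical partition.

    Inside a partition, values are stored row by row, so attributes that
    share a partition sit next to each other in memory.
    """
    return [[row[col] for row in rows for col in part] for part in partitions]

row_store = serialize(rows, [COLUMNS])                 # one wide partition
column_store = serialize(rows, [[c] for c in COLUMNS]) # one partition per column
hybrid_store = serialize(rows, [["doc_num", "sold_to"], ["doc_date"],
                                ["value"], ["sales_org", "status"]])

print(row_store[0][:6])   # first tuple, contiguous
print(column_store[3])    # the "value" column, contiguous
print(hybrid_store[0])    # doc_num and sold_to interleaved
```

A row store is the special case with a single partition, a column store the special case with one partition per attribute; the hybrid store sits anywhere in between.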
  • 18. HYRISE Architecture
      – Query processor: chooses the best possible query plan for the hybrid data storage structure
      – Layout manager: given a workload, evaluates candidate layouts with a cost model based on cache misses and generates the best possible layout
      – Storage manager: main-memory hybrid storage manager capable of vertical partitioning within single relations
  • 19. Cost Model. Used to describe the layout-dependent costs based on cache misses; complex access patterns are combined from simpler ones (scan, lookup, …).
      – Measure: all access variants to the hybrid storage layer (full projection, partial projection, selection, tuple reconstruction) and different performance parameters (compiler settings, hardware pre-fetcher, …)
      – Observe: the combined behavior of all components (look at the total workload); count cache misses using the CPU hardware performance counters
      – Understand: develop the cost model
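As a rough illustration of a cache-miss-based cost model, the sketch below estimates misses for two simple access patterns, a column scan and a single-tuple lookup, under different vertical partitionings. It is a deliberately simplified model and not the HYRISE cost model: it assumes 64-byte cache lines, 8-byte attributes, aligned partitions, and that touching any attribute of a partition pulls in the whole partition row.

```python
CACHE_LINE = 64  # bytes; assumed cache line size

# Assumed attribute widths in bytes (hypothetical: every column is 8 bytes).
WIDTHS = {"doc_num": 8, "doc_date": 8, "sold_to": 8,
          "value": 8, "sales_org": 8, "status": 8}

def ceil_div(a: int, b: int) -> int:
    return (a + b - 1) // b

def scan_misses(n_rows, partitions, needed):
    """Estimated misses for scanning the `needed` columns over all rows."""
    misses = 0
    for part in partitions:
        if any(col in needed for col in part):
            width = sum(WIDTHS[col] for col in part)
            misses += ceil_div(n_rows * width, CACHE_LINE)
    return misses

def lookup_misses(partitions):
    """Estimated misses for reconstructing one full tuple."""
    return sum(ceil_div(sum(WIDTHS[c] for c in part), CACHE_LINE)
               for part in partitions)

cols = list(WIDTHS)
layouts = {
    "row": [cols],
    "column": [[c] for c in cols],
    "hybrid": [["doc_num", "sold_to"], ["doc_date"], ["value"],
               ["sales_org", "status"]],  # hypothetical grouping
}

for name, parts in layouts.items():
    print(name, scan_misses(1_000_000, parts, {"value"}), lookup_misses(parts))
```

Even this toy model reproduces the trade-off the slides describe: the row layout is cheapest for tuple reconstruction but expensive for a single-column scan, the column layout is the reverse, and a hybrid layout can match the column store on the scan while needing fewer misses per lookup.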
  • 20. Results
  • 21. Multithreading Real Time Event Platform. MIT Auto-ID Lab and HPI: John Williams, Sergio Herrero, Abel Sanchez, et al.
  • 22. Motivation: Rapid Growth of Events and Messaging Platforms (A Comparative Study of Data Storage and Processing Architectures). Verizon and T-Mobile: 2-3 days to generate a phone bill. iTunes: 24 hours to generate a bill. Uninterrupted growth of online billing systems (Hulu, Netflix, …). Dynamic pricing on the Smart Grid requires an infrastructure capable of ingesting millions of events in quasi-real time. Goal: design a multi-threaded system that produces the electricity consumption bill for a city of 1M households, cutting the time from 8 hours to seconds.
  • 23. Smart Meter Reading Problem. [Diagram: users, energy producers; data generation, data persistence, data processing]
  • 24. Multithreaded Simulator for Smart Meter. [Diagram layers: management, generation, consumption, storage. Components: meter read generators, head ends, simulation manager, query clients, MDUS (Meter Data Unification System), Dryad, HadoopDB, SAP HANA, web service interfaces]
  • 25. Conclusion (A Comparative Study of Data Storage and Processing Architectures): bill for 1M households in seconds
      – Platform that handles billions of events/day AND large numbers of threads on one machine (> 1 million), e.g. Siemens: 500k events/s
      – RDBMS (used by today's MDUS vendors) provides good query performance but does not scale to millions of households (8 h)
      – Distributed file system (DFS) provides the scalability, reliability, and insert performance necessary for storing Smart Meter Reader data
      – MapReduce on top of a DFS (HDFS or Ceph) is good for batch processing systems
      – Loading data from the DFS and executing queries in memory with HANA gives very good performance for real-time queries
      – RAPTOR: prototype for the Smart Grid that ingests smart meter data in real time, does dynamic pricing (4 buckets), stores the data in a DFS, and does real-time analytics
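The billing step with four price buckets can be sketched as a simple in-memory aggregation over meter reads. This is an illustration only and not the RAPTOR prototype: the bucket boundaries (four six-hour windows), the prices, and the read format are all hypothetical.

```python
from collections import defaultdict

# Hypothetical prices per kWh for four six-hour buckets of the day:
# 00-06, 06-12, 12-18, 18-24.
BUCKET_PRICES = [0.08, 0.12, 0.20, 0.15]

def bucket_for(hour: int) -> int:
    """Map an hour of day (0-23) to one of the four price buckets."""
    return hour // 6

def compute_bills(reads):
    """Aggregate (household_id, hour, kwh) reads into one bill per household."""
    bills = defaultdict(float)
    for household_id, hour, kwh in reads:
        bills[household_id] += kwh * BUCKET_PRICES[bucket_for(hour)]
    return dict(bills)

# Hypothetical meter reads.
reads = [
    ("h1", 2, 1.0),
    ("h1", 13, 2.0),
    ("h2", 19, 4.0),
    ("h1", 23, 1.0),
]

bills = compute_bills(reads)
for household_id, amount in sorted(bills.items()):
    print(f"{household_id}: {amount:.2f}")
```

Each household's total depends only on its own reads, so the aggregation can be partitioned by household and spread across threads or machines, which is the property a design for 1M households would rely on.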
  • 26. Electronic Nervous System
      – Sense: collect current conditions at fine grain in real time
      – Analyze: access real-time data; analyze data, learn, build models
      – Actuate: control (soft, hard, persuasive, personal)
      [Diagram: analytical approach vs. inductive approach, ranging from no data to complete data and from no prior knowledge to perfect knowledge; labels: GPS, SIM card]
      Knowledge: $y_i = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x_i^2}{2}\right)$
      Data:
        i    x_i     y_i
        1    -1      0.02540487
        2    -0.9    0.02779527
        3    -0.8    0.03010825
        4    -0.7    0.03228947
        5    -0.6    0.03428442
        …    …       …