Munich Re’s Journey to the Big Data and Analytics Self-Service Platform including a Central Data Lake
06 April 2017
Marc Wewers, IT Architect, Munich Re
Hans Edert, Solution Architect, SAS Institute GmbH
Agenda
1 Data Analytics Framework
2 Technology
3 Data Lake
4 Lessons Learned
Big Data Analytics @ Munich Re, 27 March 2017
Munich Re
Munich Re (Group): Reinsurance, Primary insurance, Asset Management
History of Munich Re
1880: Munich Re is founded on 19 April 1880 at the instigation of Carl von Thieme, Baron Theodor von Cramer-Klett and Wilhelm Finck.
1906: First major loss in the 20th century: the earthquake in San Francisco on 18 April 1906. Munich Re's liability: US$ 2.5m. Munich Re deals with all aspects of claims on the spot.
1997: The insurance groups VICTORIA/D.A.S. and Hamburg-Mannheimer/DKV announce that they will merge under the name of ERGO Versicherungsgruppe AG. ERGO, which belongs to Munich Re, is now represented in more than 30 countries.
2009: Munich Re pools its international health insurance and reinsurance expertise in a new business segment: Munich Health. With the introduction of the new brand, Munich Re redefines its position in the reinsurance markets.
2011: With overall losses amounting to some US$ 380bn, 2011 becomes the costliest natural catastrophe year to date. After the terrible earthquake in Japan on 11 March 2011, Munich Re invites internationally recognised experts to assess the event.
Big Data use cases in insurance
Make the uninsurable insurable
• Diabetics
• Wind Energy
Consolidate the information and process
• Automated underwriting
• Risk management platform
Artificial Intelligence supported workflow
• Early Loss Detection
• Visual Loss Adjustment
Big Data Analytics is a Combination of Methods, Technology, Data and People
Methods
• Regression Models
• Machine Learning Models
• Text Mining
Technology
• Hardware (Compute power)
• Software (SAS, R, Spark, …)
Data
• Internal Data
• External Data
• Structured Data
• Unstructured Data
People
• Data Scientists
• Data Engineers
• Business People
• IT Architects
Design principles for Big Data & Analytics Platform
• SAS & Hortonworks
• Self-Service
• Multi Tenancy
• One Central Data Lake
• DevOps
• Continuous improvement
• Automation
• Hybrid (On-Prem & Cloud)
Building the Infrastructure
[Diagram: two environments, BI Lab and Production. Each shows SAS, HANA and a Hadoop Stack, each with its own User Interface. Further labels: A2P; Data Lake (HDFS): long-term unstructured and structured data.]
Roadmap to Production via Lab environment
Timeline: Q2 2015, Q3 2015, Q4 2015
Phases: Design, Setup / Build, Run, Enhance / Optimize
• Setup of first BI-Lab Hadoop Cluster
• Pilot SAS – Hadoop Integration
• On-boarding & support of Big Data & Analytics pilots
• Stabilization of BI-Lab Hadoop cluster: Authentication & Security, Automation
• Setup of new BI-Lab Hadoop Cluster
• New BI-Lab Hadoop Cluster available: Large shared cluster, Dedicated clusters, Single-Node cluster
Building the Big Data & Analytics Platform: Production Environment
Phases: Design, Setup / Build, Run, Enhance / Optimize (2-week iterations with Rolling Upgrades)
Timeline: start of platform setup in month 12, then months 01 to 12 of 2016, then 2017
Releases: v1.0, v2.0, v3.0, v4.0
SAS components
• SAS 9.4 M3
• SAS Visual Analytics (VA)
• Self-Service Data Upload
• SAS Embedded Process for Hadoop
• SAS Enterprise Guide (EG)
• SAS MS Office Add-in
• Data Access to SAP HANA, Oracle & MS SQL-Server
• SAS Enterprise Miner
• SAS Contextual Analysis
• SAS Mobile BI iOS App
Hadoop components
• HDP 2.3
• Hue
• Hive
• Ambari
• Ranger with LDAP
• Sqoop
• Pig
• Spark 1.4
• Oozie
Hadoop components added with the next upgrade
• HDP 2.4.2
• Ambari Views
• Spark 1.6
• Solr Cloud
• Tesseract
Later additions (listed with Release v4.0 and 2017)
• SAS VA Row Level Security
• SAS HA
• HDP 2.5
• Atlas
• Zeppelin
• Data Catalogue Tool
• Data Lineage
• Compliance & Security
About SAS
• Founded in 1976; "Giving you The Power to Know®" since 1976
• Leader in Advanced Analytics, BI, and Data Management
• More than 80,000 sites, across 148 countries
• 91 of the top 100 companies on the 2014 Fortune Global 500® list
• Top ranked in Fortune Magazine’s list of Best Companies to Work For in the US
• Reinvests 25% of revenue in R&D
• 14,000+ employees
• 58 countries
SAS capabilities on the Hortonworks Data Platform
SAS on the Edge
• Conduct low-latency assessment, enrichment, and analytics on high-volume streaming data
Data Management Inside HDP
• Data ingestion, cleansing, and transformation
Predictive Modeling Inside HDP
• Apply analytics and rules to pinpoint event relevance and urgency with continuous pattern detection
Model Deployment Inside HDP
• Apply analytics and rules to pinpoint event relevance and urgency with continuous pattern detection
[Architecture diagram for the slide above. Components shown:
• Clients: SAS code from Enterprise Guide and Enterprise Miner; Web Browser access to Enterprise Guide, Enterprise Miner, Visual Analytics and Data Loader for Hadoop
• SAS tiers: Web Tier, Compute Tier, Metadata
• Access paths: SAS/ACCESS to Hadoop (SQL), SAS Accelerators (SAS EP, YARN), SAS In-Memory engine (HDFS)
• Data Lake: Hive, HDFS]
"Simplified" Server Architecture SAS and HDP
• Data Nodes 1 to y: each runs HDFS, Hive, YARN, Solr, Spark and the SAS EP
• Data Nodes 1 to x (x < y): additionally run the SAS In-Memory engine (LASR)
• Separate servers (SERVER / OS): SAS Mgmt & Metadata; Hadoop Mgmt & Metadata; Hadoop Frontend (Ambari Views, Zeppelin, …)
• EP = Embedded Process: "bring calculation to data"
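The "bring calculation to data" idea behind the Embedded Process can be illustrated with a minimal sketch. This is not SAS or Hadoop code: the `DataNode` class, the node names and the claim figures are all hypothetical. It only shows the pattern of aggregating on each node and shipping the small partial results instead of the raw rows.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class DataNode:
    """Hypothetical stand-in for one data node holding a local slice of a table."""
    name: str
    rows: List[Tuple[str, float]]  # (line_of_business, claim_amount), made-up data

    def local_aggregate(self) -> Dict[str, Tuple[float, int]]:
        """Runs where the data lives: reduce local rows to (sum, count) per key."""
        partial: Dict[str, Tuple[float, int]] = {}
        for key, amount in self.rows:
            total, count = partial.get(key, (0.0, 0))
            partial[key] = (total + amount, count + 1)
        return partial


def combine(partials: List[Dict[str, Tuple[float, int]]]) -> Dict[str, float]:
    """Runs on the client: merge the partial results into a mean per key."""
    merged: Dict[str, Tuple[float, int]] = {}
    for partial in partials:
        for key, (total, count) in partial.items():
            m_total, m_count = merged.get(key, (0.0, 0))
            merged[key] = (m_total + total, m_count + count)
    return {key: total / count for key, (total, count) in merged.items()}


nodes = [
    DataNode("node-1", [("property", 100.0), ("property", 200.0), ("marine", 40.0)]),
    DataNode("node-2", [("property", 300.0), ("property", 300.0)]),
    DataNode("node-3", [("marine", 60.0), ("property", 200.0), ("marine", 80.0)]),
]

partials = [node.local_aggregate() for node in nodes]
mean_claim = combine(partials)

rows_total = sum(len(node.rows) for node in nodes)      # rows that stay on the nodes
rows_moved = sum(len(partial) for partial in partials)  # partial results sent over the network
print(mean_claim, rows_moved, rows_total)
```

In this toy example only five partial results leave the nodes instead of eight rows; on a real cluster the gap between raw rows and partial aggregates is what makes in-place computation attractive.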
Big Data & Analytics Production Environments
[Diagram: three environments, Sandbox, Integration and Production. Each combines a SAS side (LASR and EP instances) with a Hortonworks (HWX) side (nodes running YARN and Hive). Both sides are marked "Scalability". Workloads are labelled Self Service Analytics and Scheduled Analytics. A second build of the slide shows additional LASR instances.]
Lessons learned
1 Make use of Lab environment
2 Enable Self-Service
3 Agile IT project management
4 Automation
5 Security
6 YARN queue management
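The slides do not say how the YARN queues were set up. As a rough illustration of what queue management involves, the sketch below assumes the YARN Capacity Scheduler, where the capacities of sibling queues are percentages that must add up to 100. The queue names (borrowed from the Self Service and Scheduled Analytics labels), the shares and the cluster size are hypothetical.

```python
from typing import Dict


def split_cluster_memory(total_memory_gb: int,
                         queue_capacity_pct: Dict[str, float]) -> Dict[str, float]:
    """Translate per-queue capacity percentages into guaranteed memory in GB.

    Enforces the Capacity Scheduler rule that sibling queue capacities sum to 100.
    """
    total_pct = sum(queue_capacity_pct.values())
    if abs(total_pct - 100.0) > 1e-9:
        raise ValueError(f"queue capacities sum to {total_pct}, expected 100")
    return {queue: total_memory_gb * pct / 100.0
            for queue, pct in queue_capacity_pct.items()}


# Hypothetical queues and shares; the presentation gives no actual values.
queues = {"self_service": 40.0, "scheduled": 50.0, "default": 10.0}
guaranteed = split_cluster_memory(1000, queues)
print(guaranteed)
```

Keeping interactive self-service work and scheduled jobs in separate queues means a heavy ad hoc query cannot consume the capacity guaranteed to scheduled runs.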
Thank you!
Contact:
Marc Wewers, IT Architect, mwewers@munichre.com
Hans Edert, Solution Architect, Hans-Joachim.Edert@sas.com
© 2016 Münchener Rückversicherungs-Gesellschaft
© 2016 Munich Reinsurance Company
