Munich Re, the world’s largest reinsurer, is relying on SAS Analytics and Hortonworks Data Platform (HDP) for its big data initiative. Together with its technology partners, the reinsurer has installed an innovative platform capable of analyzing extraordinarily large quantities of data.
Munich Re’s big data initiative is a central component of its strategy to actively address evolving worldwide risk. Munich Re intends to increase its clients’ resilience against economic, political and cyber risks while setting and shaping trends in the insurance market. The platform enables departments to effectively explore new ideas, develop new business fields and further enhance customer service. The platform launched in February 2016.
Analyzing semi-structured and unstructured data – from paper documents to emails and video files – is crucial. Munich Re also wants to integrate external data such as weather information or sensor data from vehicles, machinery and other networked devices. The company expects the combination of these sources to lead to new insights that promote innovative products, services and processes.
In this presentation we give an overview of our journey to the Big Data & Analytics Self-Service Platform, which uses a central Data Lake on top of HDP with tight SAS integration. We also present the architectural approach we took, the technical challenges we encountered and how we solved them.
Munich Re's Journey to the Big Data and Analytics Self-Service Platform including a Central Data Lake
1. Munich Re’s Journey to the Big Data and Analytics Self-Service Platform including a Central Data Lake
06 April 2017
Marc Wewers, IT Architect, Munich Re
Hans Edert, Solution Architect, SAS Institute GmbH
2. Agenda
1 Data Analytics Framework
2 Technology
3 Data Lake
4 Lessons Learned
27 March 2017, Big Data Analytics @ Munich Re
4. Munich Re
Munich Re (Group)
• Reinsurance
• Primary insurance
• Asset Management
5. History of Munich Re
1880
Munich Re is founded on 19 April 1880 at the instigation of Carl von Thieme, Baron Theodor von Cramer-Klett and Wilhelm Finck.
1906
First major loss in the 20th century: the earthquake in San Francisco on 18 April 1906. Munich Re's liability: US$ 2.5m. Munich Re deals with all aspects of claims on the spot.
1997
The insurance groups VICTORIA/D.A.S. and Hamburg-Mannheimer/DKV announce that they will merge under the name of ERGO Versicherungsgruppe AG. ERGO, which belongs to Munich Re, is now represented in more than 30 countries.
2009
Munich Re pools its international health insurance and reinsurance expertise in a new business segment: Munich Health. With the introduction of the new brand, Munich Re redefines its position in the reinsurance markets.
2011
With overall losses amounting to some US$ 380bn, 2011 becomes the costliest natural catastrophe year to date. After the terrible earthquake in Japan on 11 March 2011, Munich Re invites internationally recognised experts to assess the event.
6. Big Data use cases in insurance
Make the uninsurable insurable
• Diabetics
• Wind Energy
Consolidate the information and process
• Automated underwriting
• Risk management platform
Artificial Intelligence supported workflow
• Early Loss Detection
• Visual Loss Adjustment
7. Big Data Analytics
Big Data Analytics is a Combination of Methods, Technology, Data and People
Methods
• Regression Models
• Machine Learning Models
• Text Mining
Technology
• Hardware (Compute power)
• Software (SAS, R, Spark, …)
Data
• Internal Data
• External Data
• Structured Data
• Unstructured Data
People
• Data Scientists
• Data Engineers
• Business People
• IT Architects
9. Design principles for Big Data & Analytics Platform
• SAS & Hortonworks
• Self-Service
• Multi Tenancy
• One Central Data Lake
• DevOps
• On-Prem & Cloud
• Continuous improvement
• Automation
• Hybrid
10. Building the Infrastructure
[Diagram: two environments, BI Lab and Production. Each has its own SAS, HANA and Hadoop stack, each with a user interface. Both sit on the Data Lake (HDFS), which holds long-term unstructured and structured data. The label A2P also appears in the diagram.]
11. Roadmap to Production via Lab environment
Q2 - 2015 Q3 - 2015 Q4 - 2015
Setup of new
BI-Lab Hadoop Cluster
On-boarding & support of
Big Data & Analytics pilots
Stabilization of BI-Lab Hadoop cluster
Authentication & Security
Automation
New BI-Lab Hadoop Cluster available
• Large shared cluster
• Dedicated clusters
• Single-Node cluster
Pilot SAS – Hadoop Integration
Design Setup / Build Run
Setup of first
BI-Lab Hadoop
Cluster
Enhance / Optimize
Enhance / Optimize
12. Building the Big Data & Analytics Platform
Production Environment
Design Setup / Build Run
2016
Release v1.0 Release v2.0 Release v3.0
• SAS 9.4 M3
• SAS Visual Analytics (VA)
• Self-Service Data Upload
• SAS Embedded Process for Hadoop
• SAS Enterprise Guide (EG)
• SAS MS Office Add-in
• Data Access to SAP HANA, Oracle & MS SQL Server
• SAS Enterprise Miner
• SAS Contextual Analysis
• SAS Mobile BI iOS App
• HDP 2.3
• Hue
• Hive
• Ambari
• Ranger with LDAP
• Sqoop
• Pig
• Spark 1.4
• Oozie
• HDP 2.4.2
• Ambari Views
• Spark 1.6
• Solr Cloud
• Tesseract
Start setup platform Release v4.0
2017
• SAS VA Row Level Security
• SAS HA
• HDP 2.5
• Atlas
• Zeppelin
• Data Catalogue Tool
• Data Lineage
• Compliance & Security
Optimize
2-week iterations with Rolling Upgrades
Enhance / Optimize
Enhance
13. About SAS
• Leader in Advanced Analytics, BI, and Data Management
• More than 80,000 sites, across 148 countries
• 91 of the top 100 companies on the 2014 Fortune Global 500® list
• Top ranked in Fortune Magazine’s list of Best Companies to Work For in the US
• Reinvests 25% of revenue in R&D
• Founded in 1976
• Giving you The Power to Know® since 1976
• 14,000+ employees
• 58 countries
14. SAS capabilities on the Hortonworks Data Platform
SAS on the Edge
• Conduct low-latency assessment, enrichment, and analytics on high-volume streaming data
Data Management Inside HDP
• Data ingestion, cleansing, and transformation
Predictive Modeling Inside HDP
• Apply analytics and rules to pinpoint event relevance and urgency with continuous pattern detection
Model Deployment Inside HDP
• Apply analytics and rules to pinpoint event relevance and urgency with continuous pattern detection
15. SAS capabilities on the Hortonworks Data Platform
[Architecture diagram. Labels that survive extraction:
• Clients: Web Browser, Enterprise Guide, Enterprise Miner, Visual Analytics, Data Loader for Hadoop
• SAS: Web Tier, Compute Tier, Metadata, SAS In-Memory engine, SAS/Accelerators, SAS/Access to Hadoop, SAS code (from Enterprise Guide and Enterprise Miner)
• Hadoop: Hive (SQL), HDFS, YARN, SAS EP, Data Lake]
16. “Simplified” Server Architecture SAS and HDP
[Diagram: data nodes 1 to y, each running HDFS, Hive, YARN, Solr, Spark and a SAS EP. The SAS In-Memory engine (LASR) runs on a subset of the data nodes (x < y). Separate servers host SAS Mgmt & Metadata, Hadoop Mgmt & Metadata, and the Hadoop Frontend (Ambari Views, Zeppelin, …).]
EP = Embedded Process: “bring calculation to data”
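The "bring calculation to data" idea behind the Embedded Process can be illustrated with a small, self-contained sketch. This is not SAS EP code and does not use any SAS or Hadoop API. The `DataNode` class, the record layout and the sample figures are all hypothetical. It only shows the general pattern: each node aggregates its own partition locally, and only the small partial results travel to the coordinator.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical record layout: (line_of_business, claim_amount).
Row = Tuple[str, float]
Partial = Dict[str, Tuple[float, int]]


@dataclass
class DataNode:
    """Stands in for one cluster node holding a local partition of the data."""
    name: str
    rows: List[Row]

    def local_aggregate(self) -> Partial:
        """Runs next to the data and returns one (sum, count) pair per key."""
        partial: Partial = {}
        for key, amount in self.rows:
            total, count = partial.get(key, (0.0, 0))
            partial[key] = (total + amount, count + 1)
        return partial


def combine(partials: List[Partial]) -> Dict[str, float]:
    """Runs on the coordinator: merges the partial results into per-key means."""
    merged: Partial = {}
    for partial in partials:
        for key, (total, count) in partial.items():
            run_total, run_count = merged.get(key, (0.0, 0))
            merged[key] = (run_total + total, run_count + count)
    return {key: total / count for key, (total, count) in merged.items()}


# Made-up sample data spread over three nodes.
nodes = [
    DataNode("node1", [("property", 100.0), ("property", 200.0), ("marine", 40.0)]),
    DataNode("node2", [("property", 300.0), ("property", 400.0)]),
    DataNode("node3", [("marine", 60.0), ("marine", 20.0), ("property", 500.0)]),
]

partials = [node.local_aggregate() for node in nodes]
means = combine(partials)

# Compare what crosses the network with what stays on the nodes.
rows_total = sum(len(node.rows) for node in nodes)
rows_shipped = sum(len(partial) for partial in partials)
print(means, f"shipped {rows_shipped} partial results instead of {rows_total} rows")
```

With real data volumes the gap between raw rows and partial results is far larger than in this toy example, which is the motivation for running the computation on the data nodes rather than pulling the data to a separate compute tier.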
17. Big Data & Analytics Production Environments
[Diagram: three environments, Sandbox, Integration and Production, built from SAS nodes (EP, LASR) and HWX nodes (YARN, Hive). Scalability is marked on both the SAS and the HWX side. Two workload types are labelled: Self Service Analytics and Scheduled Analytics.]
18. Big Data & Analytics Production Environments
[Diagram: same layout as the previous slide, with additional LASR nodes drawn in.]