Speakers
Simon Elliston Ball – Solutions Architect, Hortonworks
Adam Morton – Enterprise Data Architect, Admiral Group plc
• Over 10 years' experience in Data Warehousing, Business Intelligence and Analytics
• Working at Admiral for the past 2 years delivering a greenfield Enterprise Data Warehouse as part of an overall Data Architecture modernisation programme
The Admiral Group
Admiral Group has grown from a small start-up to one of the largest car
insurance providers in the UK, with a presence in seven countries.
Our strategy is simple: To continue to progress in the UK Car Insurance market
whilst taking what we do well to new markets and products: keep doing what
we’re doing and do it better year after year.
Admiral – International Operations
Admiral employs more than 7,000 people at its offices in the UK, Spain, Italy, France,
USA, Canada and India.
"People who like what they do, do it better"
R&D at Admiral
• Strong history of using data to drive innovation which needs to be continued
• New function aimed at testing and learning through technology
• Time-boxed iterative efforts of no more than 4-6 weeks
• Fail-fast approach; success or failure can end the PoC early
• Understand ‘Big Data’ and trial Hadoop ecosystem projects
Why Telematics?
• Scalability – A product with large potential and potentially huge volumes
• Timeliness – Data and scoring were processed in batch; how quickly can this be done?
• Granularity – Suppliers provide aggregated data; could map matching be improved?
• Event Notification – Can we respond quickly to near-real-time (NRT) events in the data?
• Data Enrichment – Opportunity to uncover further insights by integrating with interesting data sources
Objectives of the Telematics PoC
• Scalability – Prove that data storage and high-performance analytics can be accomplished cost-effectively on large data sets
• Timeliness – Reduce scoring time
• Data Enrichment
• NRT data processing – acting on events such as proximity to an airport
• Improve stability and flexibility
• Test the viability of a cloud solution
• Data Visualisation
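The "proximity to an airport" objective amounts to a geofence check on each incoming GPS point. A minimal sketch of one way to do that follows, using the haversine great-circle distance. The airport coordinates (approximate), the 2 km radius, and the function names are illustrative assumptions, not details from the PoC.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Hypothetical point of interest: approximate coordinates of Cardiff Airport.
AIRPORT = (51.3967, -3.3433)

def near_airport(lat, lon, radius_km=2.0):
    """True when the point lies within radius_km of the example airport."""
    return haversine_km(lat, lon, AIRPORT[0], AIRPORT[1]) <= radius_km
```

In a streaming job, a check like `near_airport` would run per event, so the notification latency is bounded by the ingest and micro-batch delay rather than by a nightly batch.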
Technical Challenges – Networking and Security
• Privacy Sensitive
• Third Party Sources
• Real-time data
8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
There’s a VPN, it will be fine!
[Diagram labels: Admiral vNET, Third Party vNET, Telematics Provider DC, External Users, Internal Users]
Kafka SSL
[Diagram labels: Admiral vNET, Telematics Provider DC, External Users, Internal Users, K (Kafka), SSL]
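The slide names Kafka over SSL but gives no client details. As a rough sketch, a client built on librdkafka (for example the confluent-kafka Python package) is configured for SSL with properties like the ones below. The choice of client, the use of a client certificate, the host name, and all file paths are assumptions made for illustration.

```python
def kafka_ssl_client_config(bootstrap_servers, ca_path, cert_path, key_path):
    """Build a librdkafka-style configuration dict for an SSL-secured connection.

    All argument values are supplied by the caller; nothing here is taken
    from the PoC environment.
    """
    return {
        "bootstrap.servers": bootstrap_servers,
        "security.protocol": "SSL",            # encrypt traffic to the brokers
        "ssl.ca.location": ca_path,            # CA certificate used to verify the brokers
        "ssl.certificate.location": cert_path, # client certificate (for mutual TLS)
        "ssl.key.location": key_path,          # client private key
    }

# Hypothetical values, for illustration only.
config = kafka_ssl_client_config(
    "kafka.example.internal:9093",
    "/etc/kafka/ca.pem",
    "/etc/kafka/client.pem",
    "/etc/kafka/client.key",
)
```

A dict of this shape would be passed to the client's producer or consumer constructor. The client certificate and key are only needed if the brokers require client authentication.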
Ingest with NiFi
[Diagram labels: Admiral vNET, Telematics Provider DC, External Users, Internal Users, K (Kafka), HDF, Other Providers]
Real-time Scoring
• Clean-up done in NiFi
– Basic data correctness
– Format changes
• Fed to Kafka
• Spark Streaming
– Near-real-time requirement
– Mixing Scala RDD and DataFrame code
– Integrating with map matching library
• Output fed into Kafka
– Kafka to WebSockets bridge for real-time visualization
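The clean-up stage ("basic data correctness" and "format changes") ran in NiFi in the PoC. The sketch below only illustrates the kind of per-record logic such a stage applies, written in plain Python. The comma-separated input layout and the field names `device_id`, `time`, `lat` and `lon` are hypothetical.

```python
import json
from datetime import datetime, timezone

def clean_reading(raw_line):
    """Validate one 'device_id,epoch_seconds,lat,lon' line.

    Returns a JSON string for a valid reading, or None when the record
    should be dropped.
    """
    parts = raw_line.strip().split(",")
    if len(parts) != 4:
        return None
    device_id, ts, lat, lon = parts
    try:
        lat, lon = float(lat), float(lon)
        when = datetime.fromtimestamp(int(ts), tz=timezone.utc)
    except (ValueError, OverflowError, OSError):
        return None
    # Basic correctness: non-empty device id and coordinates within valid ranges.
    if not device_id or not (-90 <= lat <= 90) or not (-180 <= lon <= 180):
        return None
    # Format change: emit JSON with an ISO 8601 UTC timestamp.
    record = {"device_id": device_id, "time": when.isoformat(), "lat": lat, "lon": lon}
    return json.dumps(record, sort_keys=True)
```

Rejecting malformed records before they reach Kafka keeps the downstream scoring code free of defensive parsing.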
Batch Scoring
• More Spark!
• Zeppelin for ease of use, interaction
• Productionized into batch Spark jobs
SAS on Hive
• Spark as ETL engine
• Hive for large-scale processing
• SAS connector using Hive
• ORC as a file format
– Significantly smaller than JSON
– So much faster to process
Technical Challenges – Map Matching
• GPS data is messy
• Open Data sources based on roads
• Nearest road is fast, but not very good
• Hidden Markov Models: know where you’re going, and where you’ve been.
• Open source to the rescue…
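To illustrate why a Hidden Markov Model beats a nearest-road lookup on noisy GPS, here is a toy sketch. It is not Barefoot's implementation: the two-road network, the Gaussian emission model, the fixed stay/switch transition probability, and all names are simplifying assumptions. It also needs at least two roads.

```python
import math

# Toy network in a flat x/y plane (metres): two parallel roads 10 m apart.
ROADS = {
    "A": ((0.0, 0.0), (100.0, 0.0)),
    "B": ((0.0, 10.0), (100.0, 10.0)),
}

def point_segment_distance(p, seg):
    """Shortest distance from point p to a non-degenerate line segment."""
    (x1, y1), (x2, y2) = seg
    dx, dy = x2 - x1, y2 - y1
    t = ((p[0] - x1) * dx + (p[1] - y1) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(p[0] - (x1 + t * dx), p[1] - (y1 + t * dy))

def nearest_road(points):
    """Baseline: match every point independently to its closest road."""
    return [min(ROADS, key=lambda r: point_segment_distance(p, ROADS[r])) for p in points]

def hmm_match(points, sigma=5.0, p_stay=0.9):
    """Viterbi decoding: most likely road sequence for the whole trace."""
    roads = list(ROADS)

    def emit(p, r):  # log-likelihood of observing p while on road r (up to a constant)
        d = point_segment_distance(p, ROADS[r])
        return -(d * d) / (2 * sigma * sigma)

    def trans(a, b):  # log-probability of moving from road a to road b
        return math.log(p_stay if a == b else (1 - p_stay) / (len(roads) - 1))

    score = {r: emit(points[0], r) for r in roads}
    back = []
    for p in points[1:]:
        prev, new_score = {}, {}
        for r in roads:
            best = max(roads, key=lambda q: score[q] + trans(q, r))
            prev[r] = best
            new_score[r] = score[best] + trans(best, r) + emit(p, r)
        back.append(prev)
        score = new_score
    path = [max(roads, key=lambda r: score[r])]
    for prev in reversed(back):  # walk the back-pointers to recover the path
        path.append(prev[path[-1]])
    return path[::-1]

# A trace driven along road A, with one noisy fix that lands closer to road B.
trace = [(10, 2), (30, 3), (50, 6), (70, 2), (90, 1)]
```

On this trace, `nearest_road` jumps to road B for the noisy third point, while `hmm_match` keeps the vehicle on road A because a switch there and back is far less likely than one bad fix.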
Barefoot – Map Matching
• https://github.com/bmwcarit/barefoot
• Docker based service
• PostGIS map server loaded from OSM data
• Serializable map, distributed in Spark
Next Steps
• Completing knowledge transfer workshops with Hortonworks
• How to move from a PoC to production-ready?
• Establishing an in-house R&D function
• Deciding on the tools and frameworks to use within a PoC environment in the future

Admiral Group


Editor's Notes

  • #3 Launched in 1993, Admiral Group is an insurance company based in Cardiff in the UK. It has grown from a start-up to become a household name in UK car insurance. Historically the business model has been simple and straightforward: “keep doing what we’re doing and do it better year after year”. Admiral's culture encourages people to innovate and suggest new ways of working, whether through new products, processes or technology. All staff are shareholders and own a small stake in the company.
  • #4 Admiral is also the youngest company in the FTSE 100, employing more than 7,000 staff worldwide. Our philosophy at Admiral is that people who like what they do, do it better, so we go out of our way to ensure coming to work here is enjoyable. As a result, the Admiral Group is consistently voted into the top 5 best places to work for each office it operates.