Big Data LDN 2017: Unleash Data Science Upon Your Organisation

Date: 15th November 2017
Location: AI Lab Theatre
Time: 14:30 - 15:00
Speakers: Simon Ricketts / Dimitris Pertsinis
Organisation: SYNTASA / TMG

Data & Analytics

Unleash Data Science
Across Your Organisation
Simon Ricketts
Customer Engagement Director
SYNTASA
Dimitris Pertsinis
Head of Data Science
Telegraph Media Group

Predictive Behavioral Analytics Dataflow
- INPUT ADAPTORS: DoubleClick, Google Analytics, Adobe Clickstream/Livestream, In-Store Transactions, ...
- PREDICTIVE MODELS: Buying Behavior, Product Recommendation, Purchase/Churn Likelihood, Offer Response Likelihood, ...
- OUTPUT ADAPTORS: DMP, Adobe Marketing Cloud, Email Automation, CRM, ...
- Unified Customer Intelligence: Consolidated Behavioral Schema + Identity Resolution
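The input-adaptor stage of the dataflow above can be pictured as a set of small functions that map each source's records onto one shared event shape. The following is only a sketch under assumptions: the adaptor names, the source field names (`visitorId`, `hitTime`, `loyalty_card`, `sold_at`) and the consolidated fields are hypothetical and are not taken from the SYNTASA product.

```python
# Hypothetical input adaptors: each maps a source-specific row onto one
# consolidated event shape (user_id, event, timestamp, source).

def from_web_analytics(row):
    # Field names "visitorId" and "hitTime" are assumed for illustration.
    return {
        "user_id": row["visitorId"],
        "event": "page_view",
        "timestamp": row["hitTime"],
        "source": "web",
    }

def from_in_store(row):
    # Field names "loyalty_card" and "sold_at" are assumed for illustration.
    return {
        "user_id": row["loyalty_card"],
        "event": "purchase",
        "timestamp": row["sold_at"],
        "source": "store",
    }

ADAPTORS = {"web": from_web_analytics, "store": from_in_store}

def consolidate(batches):
    """Run every row through its adaptor and return events in time order."""
    events = [
        ADAPTORS[name](row)
        for name, rows in batches.items()
        for row in rows
    ]
    # ISO 8601 timestamps in the same format sort correctly as plain strings.
    return sorted(events, key=lambda e: e["timestamp"])

events = consolidate({
    "web": [{"visitorId": "u1", "hitTime": "2017-11-15T10:05:00"}],
    "store": [{"loyalty_card": "u1", "sold_at": "2017-11-15T09:30:00"}],
})
```

Once every source lands in the same shape, downstream models only need to understand one schema rather than one per source.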

Extending the Marketing Cloud
Application
2nd/3rd Party Data
Marketing Cloud
Advertisements, Personalisation, Communications
Personalisation, Data Management Platform
Email Campaign, Profiles & Audiences, Analytics
Enterprise Data Platform
Email, EPOS, ERP, Loyalty
Digital Events, CRM, Inventory, Call Center

Industries We Serve
- Financial Services
- Government
- Retail & Ecommerce
- Media

The Key Points
- The Ball on DS Court (Learning to Swim)
- Post Lake Issues
- Deploying Software
- Unleashing DS

The Key Non Points
- An In-Depth Comparison of Infrastructures (Hadoop vs GCP vs AWS)
- Yet Another Overview of the DS Hierarchy of Needs or Maturity Model
- Prescriptive Success
- Complaining (Maybe Some)

Pre Lake
- Interest in Data Science, Team assembled.
- Getting Any Significant Project Started Required Weeks of Data Herding
- Hard to Offer Value with Disparate Data Sets and Lack of Clarity on Schemas
- Investment is Required to Supercharge Returns
- Bring That Data In – Design it so it is not as Disparate
- Day 0 – Data Lake Delivered

Post Lake
In at the Deep End
- Business Invested, Now Wants Return
- New Data Sources Into Lake at Rate of 1-2 per Week
- Design Allowed for Deterministic Joining of Most Sources
- Team of Data Engineers Assembled
- New Kinds of Silos
- Lack of Documentation
- Technology Starts Building Products on Top
- Data Engineers Become Resource Gold
- Prototyping Faster – Lack of Familiarity With Datasets Adding Time
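As an illustration of what deterministic joining of sources means in practice, here is a minimal sketch that merges records from several sources on an exact shared key. The key name `customer_id`, the source names and the sample records are all assumptions made for the example; the slides do not describe the Telegraph's actual schema.

```python
from collections import defaultdict

def deterministic_join(sources, key="customer_id"):
    """Merge records from several sources that share an exact key value.

    `sources` maps a source name to a list of dict records. Records that
    lack the key are skipped, because they cannot be matched exactly.
    """
    merged = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            if key not in record:
                continue  # no shared key: not joinable deterministically
            for field, value in record.items():
                if field != key:
                    # Prefix with the source name to keep provenance visible.
                    merged[record[key]][f"{source_name}.{field}"] = value
    return dict(merged)

# Hypothetical sample data for two lake sources.
sources = {
    "web": [
        {"customer_id": "c1", "page_views": 12},
        {"customer_id": "c2", "page_views": 3},
    ],
    "crm": [
        {"customer_id": "c1", "subscriber": True},
        {"email": "x@example.com"},  # lacks the key, so it is left out
    ],
}
profiles = deterministic_join(sources)
```

Records without the shared key are exactly the cases that would need probabilistic matching or manual work, which is why a design that allows deterministic joins for most sources saves prototyping time.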

Deploying Syntasa
From a Naïve Observer
- Pre Lake Decision to Go With GCP and BQ. Reasons:
  - Lack of Maintenance Overhead
  - Resource Allocation – Speed (~1 Minute to Set Up Cluster and Deploy Code)
  - Cost
- Previous Employer: On-Premise Hadoop (on Lockdown)
  - Difference in Speed of Deployment and Processing Massive
  - Access to Outside World – Edge Node on Lockdown

Unleash Your Team
- Automate Data Consolidation, Data Validation, Data Transformation
- Consolidate Front End, Back End System and Service Data
- Cut Your Data Exploration and Data Cleaning From Days to Hours
- Free Team’s Time to Allow Better Prototyping
- Optimise the Time Your Data Engineers Need to be Involved
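To make the idea of automated data validation concrete, the sketch below splits incoming records into valid and rejected ones using simple rules for required fields and expected types. The function name, the rules and the sample records are assumptions for illustration only, not a description of any particular product's behaviour.

```python
def validate(records, required, types):
    """Split records into (valid, rejected) using simple rules.

    `required` lists fields that must be present and non-empty.
    `types` maps a field name to the Python type its value must have.
    Each rejected entry carries the list of problems found.
    """
    valid, rejected = [], []
    for record in records:
        problems = [
            f"missing {field}"
            for field in required
            if record.get(field) in (None, "")
        ]
        problems += [
            f"bad type for {field}"
            for field, expected in types.items()
            if record.get(field) is not None
            and not isinstance(record[field], expected)
        ]
        if problems:
            rejected.append({"record": record, "problems": problems})
        else:
            valid.append(record)
    return valid, rejected

# Hypothetical sample batch.
batch = [
    {"customer_id": "c1", "page_views": 12},
    {"customer_id": "", "page_views": 5},
    {"customer_id": "c3", "page_views": "seven"},
]
valid, rejected = validate(
    batch,
    required=["customer_id"],
    types={"customer_id": str, "page_views": int},
)
```

Running checks like these on every load, rather than by hand per project, is one way the exploration and cleaning effort described above could shrink.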


Big Data LDN 2017: Unleash Data Science Upon Your Organisation

1. Unleash Data Science Across Your Organisation. Simon Ricketts, Customer Engagement Director, SYNTASA; Dimitris Pertsinis, Head of Data Science, Telegraph Media Group

2. Predictive Behavioral Analytics Dataflow. INPUT ADAPTORS: DoubleClick, Google Analytics, Adobe Clickstream/Livestream, In-Store Transactions, ... PREDICTIVE MODELS: Buying Behavior, Product Recommendation, Purchase/Churn Likelihood, Offer Response Likelihood, ... OUTPUT ADAPTORS: DMP, Adobe Marketing Cloud, Email Automation, CRM, ... Unified Customer Intelligence: Consolidated Behavioral Schema + Identity Resolution

3. Extending the Marketing Cloud. Application; 2nd/3rd Party Data; Marketing Cloud; Advertisements, Personalisation, Communications; Personalisation, Data Management Platform; Email Campaign, Profiles & Audiences, Analytics; Enterprise Data Platform; Email, EPOS, ERP, Loyalty; Digital Events, CRM, Inventory, Call Center

4. Industries We Serve: Financial Services, Government, Retail & Ecommerce, Media

5. What This is About

6. The Key Points - The Ball on DS Court (Learning to Swim) - Post Lake Issues - Deploying Software - Unleashing DS

7. What This is Not About

8. The Key Non Points - An In-Depth Comparison of Infrastructures (Hadoop vs GCP vs AWS) - Yet Another Overview of the DS Hierarchy of Needs or Maturity Model - Prescriptive Success - Complaining (Maybe Some)

9. A History of DS at The Telegraph

10. Pre Lake - Interest in Data Science, Team assembled. - Getting any Significant Project Required Weeks of Data Herding - Hard to Offer Value with Disparate Data Sets and Lack of Clarity on Schemas - Investment is Required to Supercharge Returns - Bring That Data In – Design it so it is not as Disparate - Day 0 – Data Lake Delivered

11. Post Lake Blues

12. Post Lake: In at the Deep End - Business Invested, Now Wants Return - New Data Sources Into Lake at Rate of 1-2 per Week - Design Allowed for Deterministic Joining of Most Sources - Team of Data Engineers Assembled - New Kinds of Silos - Lack of Documentation - Technology Starts Building Products on Top - Data Engineers Become Resource Gold - Prototyping Faster – Lack of Familiarity With Datasets Adding Time

13. Deploying Software: An Interlude

14. Deploying Syntasa: From a Naïve Observer - Pre Lake Decision to Go With GCP and BQ. Reasons: - Lack of Maintenance Overhead - Resource Allocation – Speed (~1 Minute to Set Up Cluster and Deploy Code) - Cost - Previous Employer: On-Premise Hadoop (on Lockdown) - Difference in Speed of Deployment and Processing Massive - Access to Outside World – Edge Node on Lockdown

15. Unleash Your DS Team: Our Time Has Come

16. Unleash Your Team - Automate Data Consolidation, Data Validation, Data Transformation - Consolidate Front End, Back End System and Service Data - Cut Your Data Exploration and Data Cleaning From Days to Hours - Free Team’s Time to Allow Better Prototyping - Optimise the Time Your Data Engineers Need to be Involved

17. Questions
