BIG DATA
PLATFORM, TECHNOLOGY & TOOLS
Summary








Intro – what is Big Data?
Objectives
Technology approach
ETL, infrastructure, applications & tools
Existing platforms and tools
Evolution
What is Big Data?


Big Data = 3V
  High Volume
  High Velocity
  High Variety

Includes: Capture, Curation, Storage, Search, Sharing, Transfer, Analysis, Visualization
Objectives


Actionable analytics
  A/B testing
  Channel content automation and optimization

Accountable marketing
  Measure marketing initiatives impact
  Using predictive technology

Creative discovery
  Using BI tools
  Explore what questions could be asked
Brand Ecosystem

VOLUME / VELOCITY / VARIETY
Web & E-commerce
Social Media
Mobile Applications
Ad Serving
Data & CRM
Platforms & Services
Connecting the dots – Big Data Platform

BIG DATA PLATFORM
Events and Data Capturing
Distributed Infrastructure
Platforms & Tools

Analytics & Reporting
Automation & Optimization
Big Data - High Level System Architecture

Brand Ecosystem
  Web Platforms
  Social Media
  Mobile Applications
  Ad Serving
  Data & CRM
  Platforms & Services

Events and Data Capturing – Tracking, Logging, ETL
Distributed Infrastructure
Platforms & Tools
Analytics & Reporting
Automation & Optimization
Big Data - Data Flow & Tools
DATA SOURCES
Unstructured Data

Log Files

Exhaust Data

Social Media

Sensors, Devices

DB Data

LOG SERVICES

DATA WAREHOUSE

ANALYTICS

REAL TIME DATA STORAGE

REPORTING
d3.js

AUTOMATION,
OPTIMIZATION
Real Time APIs
A/B Testing
Big Data Roles


Program manager
  Project scope definition and planning, delivery, documentation and circulation of an end-to-end plan, driving a unified message to all stakeholders, providing actionable detail on future requirements, presenting program status and issues

Infrastructure
  IT Administrators – cluster configuration, management and maintenance

Software
  Software Engineers – programming and technical analysis for the main Big Data solution and related products
  Software Architects – solution and application architecture for all related products (ETL, data warehouse, real time databases, platforms and tools)
  Data Architects – distributed data storage architecture, database architecture for related platforms and tools
  BI Developers – programming for distributed queries, predictive analysis tools, automation tools

Analysis
  Data Analysts – data analysis, reporting tools, cross platform data analysis
  BI Analysts – predictive multichannel analysis, BI tools
  Data Scientists – Big Data algorithms for BI and predictive models
Big Data Components







Events and Data Capturing
Distributed Infrastructure
Platforms & Tools
Reporting & Analytics
Automation & Optimization

Events and Data Capturing
Every user action or state change on each client platform will be logged using a common structure (JSON format):

USER uid, reg_uid
  Unique identifier for each end user
  When a user is known (logged in, across multiple platforms), merge previous activity (events) into a single thread

EVENT tstamp, client_id, app_id, obj_id, event_id
  When the event occurred
  What event is logged (platform, object, event)

CONTEXT ip, uagent, referrer, qstring, geo_coords
  User context (application used)
  IP address and geo-location
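
The USER / EVENT / CONTEXT structure above could be sketched roughly as follows. The field names (uid, reg_uid, tstamp, client_id, app_id, obj_id, event_id) come from the slide; the nesting into "user", "event" and "context" groups, the helper names build_event and stitch_threads, and the sample values are assumptions made for illustration, not the platform's actual format.

```python
import json
import time
from collections import defaultdict


def build_event(uid, event_id, *, reg_uid=None, client_id="web", app_id="shop",
                obj_id=None, context=None, tstamp=None):
    """Assemble one log record with USER / EVENT / CONTEXT groups (assumed nesting)."""
    return {
        "user": {"uid": uid, "reg_uid": reg_uid},
        "event": {
            "tstamp": tstamp if tstamp is not None else int(time.time()),
            "client_id": client_id,
            "app_id": app_id,
            "obj_id": obj_id,
            "event_id": event_id,
        },
        # e.g. ip, uagent, referrer, qstring, geo_coords
        "context": context or {},
    }


def stitch_threads(events):
    """Group event ids into one thread per user.

    Any uid that was ever logged together with a reg_uid is mapped to that
    reg_uid, so earlier anonymous activity joins the known user's thread.
    """
    uid_to_reg = {}
    for e in events:
        user = e["user"]
        if user["reg_uid"] is not None:
            uid_to_reg[user["uid"]] = user["reg_uid"]

    threads = defaultdict(list)
    for e in sorted(events, key=lambda e: e["event"]["tstamp"]):
        uid = e["user"]["uid"]
        threads[uid_to_reg.get(uid, uid)].append(e["event"]["event_id"])
    return dict(threads)


# Hypothetical sample data: one visitor who logs in on web and later on mobile.
events = [
    build_event("anon-1", "page_view", tstamp=100),
    build_event("anon-1", "login", reg_uid="user-42", tstamp=200),
    build_event("anon-9", "app_open", reg_uid="user-42", client_id="mobile", tstamp=300),
    build_event("anon-7", "page_view", tstamp=150),
]
line = json.dumps(events[0], sort_keys=True)  # one JSON log line
threads = stitch_threads(events)
```

In this sketch the anonymous page view by anon-1 ends up in the same thread as the later web login and the mobile session, while anon-7 stays on its own thread until that visitor becomes known.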

Events and Data Capturing
Additional data to be captured to complement user-related events and states, such as:
  Sales information
  Context information – weather, events, etc.
  Other relevant data
Data stored using a common structure (JSON) – somewhat similar to user events but related to the context or the business client, not the user
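
As an illustration of such a non-user record, the sketch below stores a sales entry and a weather entry in one shared JSON shape. The keys (kind, tstamp, business_client_id, data), the helper name build_context_record and all values are hypothetical; the slide only states that the structure resembles user events without carrying a user.

```python
import json


def build_context_record(kind, business_client_id, tstamp, payload):
    """Build a record keyed by business client and time instead of by user."""
    return {
        "record": {
            "kind": kind,                          # e.g. "sales", "weather"
            "tstamp": tstamp,                      # when the fact applies
            "business_client_id": business_client_id,
        },
        "data": payload,                           # free-form, kind-specific fields
    }


# Hypothetical sample values.
sales = build_context_record(
    "sales", "brand-a", 1_700_000_000,
    {"store": "s-12", "amount": 129.90, "currency": "USD"},
)
weather = build_context_record(
    "weather", "brand-a", 1_700_000_000,
    {"city": "Montreal", "temp_c": -3},
)
lines = [json.dumps(r, sort_keys=True) for r in (sales, weather)]
```

Sharing the timestamp and client identifier with the user events is what would let later analysis relate, for example, a weather record to the events logged around the same time.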


Events and Data Capturing


Shared libraries and protocols to be used across all platforms

LIBRARIES (per channel: Web & Ecommerce, Social Media, Mobile Applications, Ad Serving, Data & CRM, Platforms & Services)
Browser client library: Web & Ecommerce ✔, Ad Serving ✔, Platforms & Services ✔
Mobile client libraries: ✔ ✔ ✔ ✔
Log files import: ✔
Data import: ✔ ✔ ✔ ✔ ✔

Distributed Infrastructure

Platforms & Tools




CRM Marketing View
Media Publishing Platform
Other Platforms & Tools – related to social media, loyalty platforms, ecommerce, CRM, etc.

Platforms & Tools – CRM Marketing View










Segment CRM users based on a group/segment definition schema
Generic admin interface for managing segments and quality control
Generic solution for any CRM platform
Simplify CRM operations
Simplify custom CRM dashboards and reports
Integrates smoothly with other Big Data components
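
A segment definition schema of the kind mentioned above might look like the following minimal sketch. The rule format (field, operator, value), the segment names, the user fields and the function segments_for are all assumptions for illustration; the slides do not describe the actual schema.

```python
import operator

# Supported comparison operators for a rule (assumed set).
OPS = {
    "eq": operator.eq,
    "gte": operator.ge,
    "lte": operator.le,
    "in": lambda value, allowed: value in allowed,
}

# Hypothetical schema: a segment is a list of rules that must all hold.
SEGMENTS = {
    "loyal_buyers": [("orders", "gte", 5), ("country", "in", {"CA", "US"})],
    "new_mobile": [("orders", "eq", 0), ("platform", "eq", "mobile")],
}


def segments_for(user, segments=SEGMENTS):
    """Return the names of all segments whose rules the CRM user satisfies."""
    matched = []
    for name, rules in segments.items():
        if all(field in user and OPS[op](user[field], value)
               for field, op, value in rules):
            matched.append(name)
    return matched


loyal = segments_for({"orders": 7, "country": "CA", "platform": "web"})
fresh = segments_for({"orders": 0, "country": "FR", "platform": "mobile"})
```

Keeping the definitions as plain data is one way an admin interface could edit segments without code changes, and the same rules could be applied to records from any CRM platform once they are mapped to common field names.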
Platforms & Tools – Publishing Platform














Generic scalable platform
Easily add any type of input
Manage real-time aggregation rules
Automatically publish live banners, ads, etc.
A/B testing for output media
Integration with CRM and live feeds
Integration with other Big Data components

Platforms & Tools – Top Voice




Social media brand influence platform
Real time data synchronization
Scalable infrastructure & services

Analytics & Reporting




Big Data Ultimate Dashboards
Trends & Semantic Analysis
BI Applications & Tools

Analytics & Reporting – Dashboards







Tableau Software platform
Leader in data visualization
Connects with relational databases
Connects with data stores such as Hadoop, Google BigQuery, HP Vertica
Rich and interactive dashboards and reports
Analytics & Reporting – Sentiment Analysis






Nexalogy
Processes unstructured text data
Easily connects with social, CRM or any other brand proprietary data
Finds relevant streams of conversations and data

Analytics & Reporting – BI Applications





BI Tools
Sophisticated reports and correlations
Predictive technology
Software solutions such as Mahout, HP Vertica, R, Platfora, Datameer, SAS, SPSS, PSPP, Pivotal

Automation & Optimization







Automation services and processes
Dynamic and personalized offers and content on websites, social media, mobile, ad banners, etc.
Feed Big Data analytics into live input channel applications
A/B testing
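
One common way to support A/B testing across channels is deterministic bucketing: hash the user identifier together with the experiment name so the same user sees the same variant on every platform without storing an assignment. The sketch below assumes this approach; the function name assign_variant, the experiment name and the uid format are hypothetical, and the slides do not say how variants are actually assigned.

```python
import hashlib
from collections import Counter


def assign_variant(uid, experiment, variants=("A", "B")):
    """Map a user to a variant; the same (uid, experiment) always gets the same one."""
    digest = hashlib.sha256(f"{experiment}:{uid}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


# Hypothetical experiment over 1000 synthetic user ids.
split = Counter(assign_variant(f"user-{i}", "banner-test") for i in range(1000))
```

Including the experiment name in the hash keeps assignments independent between experiments, so a user placed in variant A of one test is not systematically placed in variant A of the next.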
Big Data – Client Facing Tools


Platforms and tools






Media Publishing Platform for real-time content automation
CRM Marketing View for cross platform state marketing
Other tools integrated with Big Data

Analytics & Reporting




Big Data Ultimate Dashboards using Tableau Software
Predictive models for content and campaign optimization
Possibility to expose query tools directly to end-users
Big Data

BEFORE

AFTER

Big Data
