The Cone™ – Digital Marketing
Digital Transformation
Throughout eternity, all that is of like form comes around again –
everything that is the same must return again in its own
everlasting cycle.....
• Marcus Aurelius – Emperor of Rome •
Digital Product Lifecycle Strategy
• Everything that goes around, comes around – everything has its own
lifecycle, in its own time. Things are born, grow, age, and ultimately
they die. It’s easy to spot a lifecycle in action everywhere you look. As
a person is born, grows, ages, and dies – then so does a star, a tree, a
bird, a bee, or a civilization – and so does a company, a product, a
technology or a market - everything goes around in a lifecycle of its own.
[Figure: Digital Product Lifecycle Strategy. Labels shown on the Product Lifecycle curve: Innovation (Prototype / Pilot / Proof-of-concept), Product Planning, Product Design, Product Launch, Investment, Early Growth, Product Maturity, Cash Cow, Plateau, Aging, Decline, Cease Investment, Migrate Customers to new Products, Withdraw, Death.]
The Cone™ - Lifestyle Understanding
The CONE™
The CONE™ - Social Intelligence
Getting to the heart of audiences - and
putting audiences back at the heart of marketing.
The CONE™ - Audience Measurement
• Due to severe competition, Communications Service Providers (CSPs) such as 3 Mobile, EE, TalkTalk and Vodafone, along with Mobile Virtual Network Operators (MVNOs) such as Virgin, Tesco and giffgaff, no longer make significant profit from their core services (Mobile, Fixed-line and Broadband). This has caused the dash for “Quad-play”, where CSPs add Media and Entertainment Packages to their core network services offering.
• TV Set-top Boxes (Virgin, TalkTalk, Sky, EE) are connected to the Internet and continuously stream Audience Channel Selection data and Music Play-lists to the CSP's Audience Insight and Analytics servers. Similarly, Smart Phone Apps (BBC iPlayer, Sky Go, Netflix, Spotify) continuously stream Audience Channel Selection data and Music Play-lists to the CSP - via Apigee to AWS Big Data.
• In a typical household (Mother, Father, two children) there may be four Smart Phones and as many as ten other internet-connected devices (Tablets, Laptops, Internet TVs, TV Set-top Boxes and Video Games Boxes), all streaming video, audio and data. The details are captured, stored and analysed by the CSP using “Big Data” Analytics techniques. This yields Audience Metrics and Analytics based on an intimate understanding of the video, audio and internet content that consumers stream. The actionable audience insights derived from this streaming data drive Personalised Advertising across all devices (Smart Phone, Tablet, Internet TV, Games Boxes).
The CONE™ - Social Intelligence
This revolutionary Digital Marketing approach is called the Cone™ - a next-generation Social Intelligence solution for real-time lifestyle understanding: -
• The Cone™ solution uses Social Intelligence to get right to the heart of every audience - and puts the audience back at the heart of every media organisation.
• The Cone™ Digital Marketing solution works through Real-time Analytics – tuning directly into the dynamic nature of people, fashion, media and culture.
• The Cone™ solution analyses intimate audience viewing behaviour using Social Intelligence and Real-time Insight, inspiring better digital marketing campaigns, faster – ideas which connect directly with the widest possible network audience.
• Most importantly, the Cone™ solution tracks and understands the changing behaviour of viewers, fans and audiences and their propensity to engage with different ideas, lifestyles, interests, needs, passions, aspirations and desires.
21st Century Lifestyle Understanding
Fanatics (10%) Enthusiasts (20%) Casuals (30%) Indifferent (40%)
Cone™ Fan Base Understanding©
©2013 Innovation Pipeline
The CONE™ - a New Lens
Today we can view audiences through a better lens than the one given by traditional segmentation. Our better lens is what we now call the Cone™.
The Cone™ visualises the volume and behaviour of a user-defined audience.
When an audience is viewed in this way, its behaviours and volumes are visualised across our Cone™ spectrum, which segments the audience by its propensity to engage.
It’s this understanding of behaviour and volume that the Cone™ visualises.
[Figure labels: Scene Setters, Restless, Contented]
Cone™ Lifestyle Understanding
What is ‘The Cone’?
• At its simplest, The Cone™ is a visual metaphor that maps the volume of audiences across an engagement spectrum with regards to how people connect with different passions and ideas.
• At its most sophisticated, the Cone™ delivers total entertainment digital innovation.
Why a Cone?
• The Cone™ shape is informed by the correlation between the volume of audiences and their propensity to engage with different passions. This Cone shape proves to be universal in its application to brands, ideas and industries that have ‘fans’, i.e. –
1. The thin, pointy end of the Cone™ -
• Low audience volume but incredibly high engagement and therefore high ‘purchase intent’
2. The fat, base end of the Cone™ -
• High audience volume but low engagement and therefore much lower ‘purchase intent’
• We use our proprietary IP to produce The Cone™ for industries and clients that have fans (or at least where people engage through ‘passionate interest’ vs mere ‘consumption’). Thus The Cone™ maps people as fans and audiences with active interests, needs and desires - not just as passive consumers.
Cone™ Lifestyle Understanding
Cone™ Lifestyle Understanding©
Fanatics (10%) - Core fans, including cultural arbiters, trend setters, curators, editors.
Enthusiasts (20%) - Social amplifiers, restless for the new, who enjoy the discovery and social kudos of feeling and “being first”.
Casuals (30%) - The wider market, happy to be influenced by others and open to engagement through social influence.
Indifferent (40%) - Generally agnostic, uninterested and indifferent to the ideas in question.
Cone™ Lifestyle Understanding
How does the Cone work?
• The principle of The Cone™ Audience Metrics & Analytics Solution is firstly to understand people’s lives, and then to understand the role that different entertainment concepts and content play in their lives. Using this narrative of understanding, we can gain unique insights into who ideas are connecting with and why, which helps us make better, more incisive decisions and inspires creative marketing. We then apply The Cone™ creative inspiration to innovate compelling propositions and ideas that will connect with the widest possible audiences.
• On the surface, The Cone™ profiles people’s propensity to engage with any given lens (e.g. film, reality TV, music, radio, mobile) along our FECI continuum, ranging from Fanatics through Enthusiasts to Casuals and “Indifferent”, and finally the “Unconnected”. We then use proprietary data analytics to profile and describe groups of similar people within the FECI continuum.
• The Cone™ facilitates our understanding of how groups of like-minded individuals are connecting (or not connecting…..) with our brand and content. We can thus use intimate personal insights to learn how to inspire the right kinds of ideas and events, to better target brand positioning and product content, and to influence more receptive audiences, so delivering new core fan connections which drive an expanding and increasingly loyal fan base …..
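The FECI profiling described above can be illustrated with a small sketch. This is not the proprietary Cone™ analytics: it assumes each person already has a numeric engagement score (the `audience` scores and the function name `assign_feci` are hypothetical), ranks the audience from most to least engaged, and cuts the ranking at the nominal Cone shares of 10% / 20% / 30% / 40%.

```python
# Hypothetical sketch: bucket an audience into FECI segments by ranked
# engagement score. Scores, names and the rank-and-cut rule are assumptions
# made for illustration only.

FECI_SHARES = [
    ("Fanatics", 0.10),
    ("Enthusiasts", 0.20),
    ("Casuals", 0.30),
    ("Indifferent", 0.40),
]


def assign_feci(scores):
    """Map each person to a FECI segment, most engaged first."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    segments = {}
    start = 0
    cumulative = 0.0
    for name, share in FECI_SHARES:
        cumulative += share
        end = round(cumulative * n)  # cut point for this segment
        for person in ranked[start:end]:
            segments[person] = name
        start = end
    # Anyone left over because of rounding falls into the last segment.
    for person in ranked[start:]:
        segments[person] = FECI_SHARES[-1][0]
    return segments


# Ten illustrative people with engagement scores 100, 90, ..., 10.
audience = {f"person_{i}": 100 - 10 * i for i in range(10)}
segments = assign_feci(audience)
```

A real profile would presumably derive the score from viewing, social and CRM data rather than a single number; the fixed shares are used here only because they are the percentages quoted in this deck.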
Cone™ Lifestyle Understanding
The CONE™ - BBC Radio 1
Cone™ Innovation - BBC Radio 1, 2002-05
• In 2002, BBC Radio 1 - the UK’s no.1 youth radio brand (now globally streamed to millions) - was in
danger of losing its public service licence. Listener volume was in decline, with a total RAJAR audience
of circa 7 million. Radio 1 had become disconnected from its core audiences.
• We were asked to help innovate the total transformation of ideas, creativity and environment to return
Radio 1 to its pre-eminent place in youth culture.
• Central to Radio 1’s innovative revival was a new lens through which to view the Radio 1 audience. This
lens helped us understand audience engagement through behaviour - versus fixed demographics.
Sony Music: Audience Cone™ / Artist DNA
Sony Music 2007-2011 - Audience Cone™ / Artist DNA
• The key to success at Sony Music was using the Audience Cone™ and Artist DNA to help A&R Managers and Producers understand the role music plays in people's lives - and then understand the impact of any particular genre or specific artist within that audience and cultural context.
• We provided a unique approach to make sense of Digital Marketing and Social Intelligence as part of an artist's musical and career development. We called it the Artist DNA – a tool which supports the insightful creative foundation for all artist releases, tours, appearances and campaigns.
• Today the Cone™ App - our proprietary solution using the Audience Cone™ and Artist DNA approach – is used by Sony Music in 32 global territories – placing the audience back at the heart of Sony Music and putting the artists back at the heart of their audiences - attracting new fans and re-connecting with old fans – to give the widest possible audience and fan-base.
The Challenge – American Idol, 2014
• Analyse the Reality TV audience spectrum so that we can better understand who American Idol fans are, and therefore gain insight into how we can halt the audience decline of 2014…..
• There is a very real and present Reality TV Cone, because there exist distinct Reality TV audience clusters - discrete groups of people who engage with Reality TV in a variety of different ways…..
• Reality TV is a well-understood lens into how people live out their own lives (they might not admit this), so we can understand viewers' lives and lifestyles and engage them through the Reality TV lens.
• We can map this lens through our Fanatics, Enthusiasts, Casuals and Indifferent (FECI) spectrum in order to place each individual along a continuum of audience interest, affinity, loyalty and engagement.
• We can then profile and segment these people into different groups along the FECI spectrum, and so identify those within these groups who have a greater propensity and appetite for American Idol: -
– Viewers with an increased or decreased awareness of the Reality TV genre
– Viewers with a higher or lower interest in Reality TV shows / media coverage
– Viewers with a greater or lesser knowledge of Reality TV presenters / participants
– Viewers who invest more or less time in consuming Reality TV – live / streamed content
The CONE™ - American Idol, 2014
Cone™ Innovation – American Idol, 2014
1. Fanatics - 10%: Know about each contestant in every show and devote time to Reality TV. Primarily live viewers.
2. Enthusiasts - 26%: Buy very much into Reality TV. Have other passions. Love social media ‘second screening’.
3. Casuals - 42%: A more diverse group. Reality TV is only one part of their busy lives. Will engage if it meets their needs and values. American Idol 2014 over-indexed on “Casuals”, but under-indexed on Audience Total.
4. Indifferent - 22%: “Indifferent” viewers interact with the brand when there are other brand Fans within their social network who act as “Influencers”. American Idol 2014 under-indexed on both “Indifferent” and Audience Total.
5. Unconnected: Huge marketplace. Generally, “Unconnected” viewers only connect with the brand if there are other brand advocates within their social network who act as influencers or “Introducers” to Reality TV series.
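The over- and under-indexing mentioned above can be made concrete with a simple index calculation. The deck does not state how its index is computed, so this sketch assumes the conventional form (segment share divided by a baseline share, times 100) and assumes the baseline is the nominal Cone split of 10% / 20% / 30% / 40% quoted elsewhere in this deck. The function name `segment_index` is hypothetical.

```python
# Assumption: index = 100 * observed share / baseline share, where the
# baseline is the nominal Cone split and the observed shares are the
# American Idol 2014 percentages listed above.
BASELINE = {"Fanatics": 10, "Enthusiasts": 20, "Casuals": 30, "Indifferent": 40}
AMERICAN_IDOL_2014 = {"Fanatics": 10, "Enthusiasts": 26, "Casuals": 42, "Indifferent": 22}


def segment_index(observed, baseline):
    """Index each segment against the baseline (100 = in line with it)."""
    return {seg: round(100 * observed[seg] / baseline[seg]) for seg in baseline}


index = segment_index(AMERICAN_IDOL_2014, BASELINE)
for seg, value in index.items():
    label = "over" if value > 100 else "under" if value < 100 else "on par"
    print(f"{seg}: {value} ({label})")
```

Under these assumptions Casuals come out at 140 (over-indexed) and Indifferent at 55 (under-indexed), which matches the direction described in the list. The "Audience Total" comparison is not covered, because the deck gives no total audience figures.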
The Challenge – American Idol, 2014
Analyse the Reality TV audience so that we can better understand who American Idol fans are, and therefore gain insight into how we can halt the audience decline of 2014…..
• There is a Reality TV Cone because there exist discrete groups of people who engage with Reality TV in different ways.
• Reality TV is a well-understood lens in people's lives (they might not admit this - but we can view their lives through this Reality TV lens).
• We can map this lens through our Fanatics, Enthusiasts, Casuals and Indifferent (FECI) continuum in order to place every individual along the spectrum of audience engagement.
Cone™ Fan Base Understanding
The Cone™ Application
• Where old-school audience analysis was retrospective and fixed, the
new Cone™ data science is lean, agile, current, fluid and predictive.
• The Cone™ App takes our proven Audience Cone™ and Artist DNA approach and puts it on-line to render a custom lens for an audience; a lens you can zoom, pan and focus - to reveal more hidden detail.
• The Cone™ App applies data science and digital analytics principles to generate innovative marketing insights - translated into a narrative of real-time audience understanding - that answers the six key questions: -
1. What’s happening now?
2. Who’s making it happen?
3. Where is it happening?
4. Why is it happening?
5. When is it happening?
6. How is it happening?
The Cone™ Application
[Figure: The Cone™ Application architecture. Components shown: Social Intelligence; Cloud; CRM Data; Profile Data; CRM / CEM; Big Data Analytics; Customer Management (CRM / CEM); Campaign Management; e-Business; The Cone™ (Customer Loyalty & Brand Affinity); The Cone™ Smart Apps; Audience Survey Data; Insights, Reports; TV Set-top Box.]
Proof-of-concept and Prototype
The Cone™ approach is lean, agile, smart and creative: -
• We start by providing a custom Cone™ app as a proof of concept. We then work
with client key stakeholders to scope a detailed brief which articulates a business
problem domain that the Cone™ can help resolve.
• Under normal circumstances we utilise all current and past audience research and
any other available internal data to first establish a baseline client Cone™.
• We then augment this by overlaying external data - Social Media Intelligence and
other live streamed audience data that will provide our new real-time view for who /
what / why / where / when and how fan-base and lifestyle understanding.
• Lastly, we apply this social intelligence understanding as new actionable insights
to inform creative marketing campaign solutions against the agreed brief.
• Post proof-of-concept, we then agree a Cone™ app fixed term licence along with
Cone™ consulting, mentoring and support – on-demand, as and when required.
The Cone™ – Model Design and Delivery
Phase | Step | Description | Input | Design Process | Output | Cost (estimate) | Skill Set
1 | 1 | Cone™ Model Data Analysis / Design | User Requirements | Data Analysis & Data Modelling | Cone™ Logical Data Model | £k | Business / Data Analyst
1 | 2 | Cone™ Data Design – Questionnaire | User Requirements | Data Analysis & Data Modelling | Questionnaire Survey Form | £k | Business / Data Analyst
1 | 3 | Cone™ Physical Database Design | Logical Data Model | Cone™ Database Design | Physical Cone™ Design | £k | Data Analyst / DBA
1 | 4 | Cone™ Data Load – Questionnaire / Survey Forms | Physical Data Model, Survey Questionnaire | Cone™ Model Calibration and Tuning Runs | Initialised Cone™ Model | £k | Business / Data Analyst, DBA
2 | 5 | Cone™ Data Load – In-house CRM and Audience Data | Physical Data Model, People CRM Data | Cone™ Model CRM Data Load | Populated Cone™ Model | £k | Business / Data Analyst, DBA
2 | 6 | Cone™ Profiling | Cone™ Clustering Algorithms | Cone™ Model Data Profiling – Kernel k-means | Profiled Cone™ Model | £k | Data Analyst, DBA, Data Scientists
3 | 7 | Cone™ Streaming and Segmentation | Historic Sales and CRM Data | Cone™ History Matching Runs | Cone™ Historic Trends | £k | Data Scientists
3 | 8 | Cone™ Real-time Social Media Feeds | Global Social Intelligence | Cone™ Real-Time Analytics | Actionable Cone™ Insights | (variable with Cone™ total data volume) | Data Scientists
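Step 6 names kernel k-means as the profiling technique but gives no further detail. The following is a minimal pure-Python sketch of the generic kernel k-means algorithm, not the Cone™ implementation: the RBF kernel, the `gamma` value, the toy points and the starting labels are all assumptions chosen for illustration.

```python
import math


def rbf_kernel(points, gamma=1.0):
    """Gram matrix K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [[math.exp(-gamma * sqdist(a, b)) for b in points] for a in points]


def kernel_kmeans(K, labels, k, max_iter=50):
    """Refine an initial labelling with kernel k-means on Gram matrix K."""
    n = len(K)
    labels = list(labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Within-cluster term: mean of K over all pairs of cluster members.
        within = [
            sum(K[j][l] for j in m for l in m) / (len(m) ** 2) if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best, best_dist = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # an empty cluster cannot attract points
                cross = sum(K[i][j] for j in m) / len(m)
                # Squared feature-space distance from point i to centroid c.
                dist = K[i][i] - 2.0 * cross + within[c]
                if dist < best_dist:
                    best, best_dist = c, dist
            new_labels.append(best)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


# Two well-separated toy groups, deliberately mislabelled at the start.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
K = rbf_kernel(points, gamma=0.1)
labels = kernel_kmeans(K, [0, 1, 0, 1, 0, 1], k=2)
```

The kernel form lets clusters be separated in a transformed feature space without computing that space explicitly, which is presumably why it is named for profiling audience data that is not linearly separable. In practice the result depends on the kernel and on the initial labelling.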
The Cone™ – Social Intelligence
The Cone™ – Digital Marketing
– turning Social Intelligence into Actionable Marketing Insights / Sales Opportunities…
1. Education Cone™ – Training and Education Business Scenario and Use Cases
2. Utilities Cone™ – Water, Gas and Electricity Business Scenario and Use Cases
3. Media Cone™ – Broadband, Land-line, Mobile and Entertainment Business Scenario and Use Cases
4. Music Cone™ – Brand / Genre / Label / Artists Business Scenario and Use Cases
5. Political Cone™ – Party and Voter Election Business Scenario and Use Cases
6. Fashion Cone™ – Fashion and Luxury Brands Business Scenario and Use Cases
7. Sports Cone™ – Elite Team Sports Franchise Business Scenario and Use Cases
8. Patient Cone™ – Digital Healthcare / medical Business Scenario and Use Cases
The Cone™ - Digital Marketing
Telematics
The Internet of Things (IoT) – Smart Devices, Smart Apps, Wearable
Technology, Vehicle Telemetry, Smart Homes and Building Automation
SMACT/4D Digital Technology Stack
Social Intelligence – Brand Loyalty and Affinity
CONE SEGMENTS – Brand Loyalty and Affinity
Social Intelligence drives Brand Loyalty and Affinity, Lifestyle Understanding, Fan-base Profiling, Streaming and Segmentation, and marketing Campaigns. It is expressed in the creation and maintenance of a detailed History and Balanced Scorecard for every individual in the Cone, allowing summation by Stream / Segment: -
1. Inactive – need to draw their attention towards the Brand
2. Indifferent – need to educate them about core Brand Values
3. Disconnected – need to re-engage with the Brand
4. Casuals – exhibit Brand awareness and interest
5. Followers – follow the Brand, engage with social media and consume brand communications
6. Enthusiasts – engaged with the Brand, participate in Brand / Product / Media events and merchandising
7. Supporters – show strong need, desire and propensity to support Brand / Product / Media consumption
8. Fanatics – demonstrate total Commitment / Dedication / Loyalty for all aspects of the Brand / Product / Media
PROPENSITY – Balanced Scorecard
• Balanced Scorecard – a summary of all the data-points for an Individual / Stream / Segment
• Propensity Score – in the statistical analysis of observational data, Propensity Score Matching (PSM) is a statistical matching technique that attempts to estimate the effect of a Campaign / Offer / Promotion or other intervention by accounting for the factors that predict whether an individual receives that Campaign / Offer / Promotion.
• Propensity Model – the Bayesian probability of the outcome of an event for an Individual / Stream / Segment
• Predictive Analytics - an area of data mining that deals with extracting information from data and using it to predict trends and behaviour patterns. Often the unknown event of interest is in the future; however, Predictive Analytics can be applied to any type of event with an unknown outcome - in the past, present or future.
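As an illustration of a Bayesian propensity estimate per segment, the sketch below uses a Beta-Binomial model: a Beta prior on the response probability is updated with observed campaign responses. This is one simple way to obtain such a probability, not necessarily the model used in the Cone™. The campaign counts and the function name `propensity_posterior` are hypothetical.

```python
def propensity_posterior(responders, contacted, prior_a=1.0, prior_b=1.0):
    """Posterior mean response probability under a Beta(prior_a, prior_b) prior."""
    if responders < 0 or contacted < responders:
        raise ValueError("need 0 <= responders <= contacted")
    a = prior_a + responders                # prior successes + observed
    b = prior_b + (contacted - responders)  # prior failures + observed
    return a / (a + b)


# Hypothetical campaign results: (responders, contacted) per segment.
campaign_results = {
    "Fanatics": (45, 50),
    "Enthusiasts": (60, 100),
    "Casuals": (30, 150),
    "Indifferent": (4, 200),
}
propensity = {
    seg: propensity_posterior(r, n) for seg, (r, n) in campaign_results.items()
}
```

With the uniform Beta(1, 1) prior, a segment with no observations gets a propensity of 0.5, and the estimate moves towards the observed response rate as evidence accumulates. The same function could be applied per individual or per stream.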
Social Intelligence – Streaming and Segmentation
[Figure: The Cone™ – Streaming & Segmentation. Hybrid Cone with 3 Dimensions: Social Interaction, Brand Affinity and Geo-demographic Profile (Experian Mosaic – 15 Groups (Streams), 66 Types (Segments)). Cone shown: Social Interaction.]
Social Intelligence – Social Interaction
Social Interaction Cone Rules
1. Inactive – not engaged – low evidence / low affinity / low interest in Social Media
2. Lone Wolf – sparse / thin social network - may share negative information (Trolling)
3. Home Boy – Social Network clustered around Home Location Postcodes (Gang Culture)
4. Eternal Student – Social Network clustered around School / College / University Alumni
5. Workplace – Social Network clustered around Work and Colleagues (e.g. City Brokers, Traders)
6. Friends and Family – Social Network clustered around physical social contacts - Friends and Family
7. Enthusiast – Social Network clustered around shared, common interests – Sport, Music and Fashion etc.
8. Promiscuous – Open Networker – virtual Social Network across all categories - will connect with anybody
Number of Segments
• With anonymous data (e.g. surveys and polls), the number of initial Segments is 4 (Matt Hart). With people data (named individuals) we can discover much richer internal and external data from multiple sources (Social Media / User Content / Experian), and can therefore segment the population with greater granularity.
Individuals Qualifying for Multiple Segments
• When individuals qualify for multiple segments, we can either add these non-standard individuals to the Segment with which they have the greatest affinity, or move them into an Outlying / Outcast / Miscellaneous Segment for further statistical processing or for processing through manual intervention.
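The two options for individuals who qualify for multiple segments can be combined in a small rule, sketched below. The affinity scores, the `min_margin` threshold and the function name `resolve_segment` are hypothetical: the deck describes the two options but not how to choose between them, so using a margin between the top two affinities is an assumption.

```python
MISC_SEGMENT = "Miscellaneous"


def resolve_segment(affinities, min_margin=0.1):
    """Pick the segment with the greatest affinity.

    If the top two affinities are closer than min_margin, the individual is
    treated as ambiguous and parked in MISC_SEGMENT for later statistical or
    manual processing.
    """
    if not affinities:
        return MISC_SEGMENT
    ranked = sorted(affinities.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] < min_margin:
        return MISC_SEGMENT
    return ranked[0][0]


clear_case = resolve_segment({"Enthusiast": 0.8, "Workplace": 0.3})
ambiguous_case = resolve_segment({"Enthusiast": 0.52, "Workplace": 0.50})
```

Setting `min_margin=0` reduces this to the first option alone (always take the greatest affinity), while a larger margin sends more individuals to the Miscellaneous segment.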
Social Intelligence – Actionable Insights
[Figure: The Cone™ – Actionable Insights. Hybrid Cone with 3 Dimensions: Brand Affinity, Social Interaction and Geo-demographic Profile (Experian Mosaic – 15 Groups (Streams), 66 Types (Segments)). Cone shown: Brand Loyalty & Affinity, with Fanatics 10%, Enthusiasts 20%, Casuals 30%, Indifferent 40%.]
How consumers use social media (e.g., Facebook, Twitter) to address and/or engage with companies around social and environmental issues.
The chart above illustrates the richness and diversity of social media.....
The pattern of Social Relationships.....
Social Media is the fastest growing category of user-provided global content and will eventually grow
to 20% of all internet content. Gartner defines social media content as unstructured data created,
edited and published by users on external platforms including Facebook, MySpace, LinkedIn, Twitter,
Xing, YouTube and a myriad of other social networking platforms - in addition to internal Corporate
Wikis, special interest group blogs, communications and collaboration platforms.....
Social Mapping is the method used to describe the social linkage between individuals, in order to
define Social Networks and to understand the nature of intimate relationships between individuals.
Social Conversations SCRM in the Cloud
Traditional CRM was very much based around data and information that brands could collect
on their customers, all of which would go into a CRM system that then allowed the company
to better target various customers. CRM comprised sales, marketing and service /
support-based functions whose purpose was to move the customer through a pipeline, with
the goal of keeping the customer coming back to buy more and more stuff......
TRADITIONAL CRM – Customer Management Pipeline
Evolution of CRM to SCRM - The challenge for organizations now is adapting and evolving
to meet the needs and demands of these new social customers - many organizations still
do not understand the CRM value of social media.....
SOCIAL CRM – Social Media Conversations
In Social CRM - the customer is actually the focal point of how an organization operates. Instead of
marketing products or pushing messages to customers, brands now talk to and collaborate with
their customers to solve business problems, empower customers to shape their own Customer
Experience and Journeys and develop strong customer relationships - which will over time, turn
participants into brand evangelists and positive customer advocates.....
SOCIAL CRM – Social CRM Processes
Posted on April 20, 2010 by Laurance Buchanan - Capgemini
SOCIAL CRM – a Business Framework and Operating Model
Social Graphs and Market Sentiment
• Using “BIG DATA” to drive Market Sentiment •
Unprompted online conversations, statements and news create an online reflection of real-life events and
issues – influencing the thoughts of individual consumers – managing Reputational Risk and so shaping
Market Sentiment. The Social Media data, Blogs and News feeds that form this digital mirror of the world
provide a gold mine of actionable information.....
• Influencer Programmes have a long history in industries such as software, computers and electronics, but today they are successfully deployed across all types of industries including automotive, smart phones, fashion, health and nutrition, wine, sports, music, technology, travel, tourism and leisure, and financial services.....
• In a hyper-connected world, market-makers and influencers increasingly provide the gateway to decision makers who drive consumer behaviour.
• Unprompted online conversations, statements and news create an online reflection of real-life events and issues – influencing the thoughts of individual consumers and so shaping Market Sentiment.
• The Social Media data and News feeds that form this digital mirror of the world provide a gold mine of information. However, unlocking the data is not straightforward, as it requires a complex and unique set of technologies, skills and methods.....
INFLUENCER PROGRAMMES – Social Media Conversations
The Cone™ - Digital Marketing
SalesForce.com – a Cloud Platform Social CRM Business Solution
The Cone™ - Lifestyle Understanding
[Figure: The Cone™ solution components. Labels shown: Customer Management (CRM / CEM); Social Intelligence; Campaign Management; e-Business; Big Data Analytics; The Cone™ (Customer Loyalty & Brand Affinity); The Cone™ Smart Apps; Alarms & Alerts; Reporting.]
Digital Marketing – Solution Options
Vendor | Social Intelligence | Mobile | Big Data | Analytics | Cloud | CRM / CEM
Amazon + Salesforce | Anomaly 42 | Apple iOS + Android | AWS Elastic MapReduce (EMR), AWS S3 | “R” Revolution, Kernel k-means | AWS EC2 | SalesForce + 3rd Party Apps Store
Google | Google Analytics | Google Nexus | Google Hadoop | Google Analytics | Google Cloud | Google Office + Apps
IBM | - | - | IBM InfoSphere BigInsights | - | IBM Cloud | -
Microsoft | - | Nokia, Windows 8 for Mobile | Microsoft SQL/Server + Hadoop | Microsoft Analytics, DOT.NET, C# | Windows Azure HDInsight | Microsoft Office 365 + Dynamics
Oracle | - | - | Oracle DBMS + Hadoop | OBIE | Oracle Cloud | Oracle CRM and EBS
SAP | - | SUP + Fiori | SAP HANA + Hadoop | Business Objects | SAP HANA Cloud | SAP CRM + Hybris
The Cone™ - Digital Marketing
[Figure: The Cone™ – Brand Loyalty and Affinity / Lifestyle Understanding. Labels shown: The Cloud – SalesForce.com; Amazon Web Services (AWS); Social Intelligence; Data Science / Big Data Analytics; Customer Experience & Journey - CRM / CEM; Alarms / Alerts; Reporting; e-Business; Smart Apps.]
The Cone™ – Digital Marketing
Connecting the Unconnected…..
• FMCG, Media, Entertainment and other enterprises which supply products and services
indirectly to consumers – via Channel Partners such as Distributors, Dealers, Wholesalers
and Retailers – are not directly connected to their customer base. In order to drive brand
strategy and customer loyalty / affinity – they have to reach out to, contact and connect
with, on the most intimate terms - the widest possible range of end-user consumers: -
– Music (e.g. BBC and Sony Music)
– Broadcasting (e.g. Radio 1 / American Idol)
– Digital Media Content (e.g. Sony Films / Netflix)
– Sports Franchises (e.g. Manchester City / New York City)
– Fast Fashion Retailers (e.g. ASOS, Next, New Look, Primark, Top Shop)
– Luxury Brands / Aggregators (e.g. Armani, Burberry, Versace / LVMH, PPR, Richemont)
– Multi-channel Retailers – Loyalty, Campaigns, Offers and Promotions
– Financial Services Companies – Brand Protection and Reputation Management
– Travel, Leisure and Entertainment Organisations - Destination Resorts and Events
– MVNO / CSPs - OTT Business Partner Analytics (Sky Go, Netflix via Firebrand / Apigee)
– Telco, Media and Communications - Churn Management / Conquest / Up-sell / Cross-sell Campaigns
– Digital Healthcare – Private / Public Healthcare Service Provisioning: - Geo-demographic Clustering and
Propensity Modelling (Patient Monitoring, Wellbeing, Clinical Trials, Morbidity and Actuarial Outcomes)
The Cone™ - Eight Primitives
Primitive | Problem / Opportunity | Business Domain | System Function | Software Product
Who ? | Who are our Customers ? | Party - People / Organisations | CRM / CEM | SalesForce.com - Customer Management
What ? | What are they saying about us ? | Social Media / Communications | Social Intelligence | Google Analytics, Anomaly 42
Why ? | Why - their Interest / Behaviour / Motivation / Aspirations / Desires ? | Brand Identity / Loyalty / Affinity / Offers / Promos | Marketing, Campaign Management | Predictive Analytics / Propensity Modelling
Where ? | Where do they Live / Work / Shop / Relax ? | Places - Location | GIS / GPS | Geospatial Analytics
When ? | When do they contact / buy products from us ? | Time / Date | Contact Event / Sales Transaction | Multi-channel Retail / Mobile Platforms
How ? | How do they contact and connect with us – Media / Telecoms Channels ? | Communications Channel | Mobile, Internet, In-store | Multi-channel Retail / Mobile Platforms
Which ? | Which Brands / Ranges / Categories / Products ? | Retail Merchandising | Product Catalogue | IBM Product Centre / Stebo / Kalido
Via ? | Via Business Partners / 3rd Party Channels ? | Sales Channel | Retail Channel / Outlet | Amazon, E-bay, Alibaba
The Cone™ – EIGHT PRIMITIVES (Version 2 – Media Co’s)
[Figure: data model with a central Cone™ MEDIA FACT surrounded by eight dimensions.]
Dimensions: Event, Party, Geographic, Motivation, Time, Media, Product (WHICH ?), Channel (VIA ?)
Questions: WHO ? WHAT ? WHERE ? HOW ? WHEN ? WHY ?
Attribute groups shown in the figure:
• Indifferent, Casuals, Enthusiasts, Fanatics
• Radio Show, Television Show, Internet Advert, Campaign, Offer, Promotion
• Pre-order, Purchase, Download, Playlist, Booking, Attendance
• Advert / Publicity, Posting / Blog
• Facebook, LinkedIn, Myspace, Twitter, YouTube, Xing
• Region / Country, State / County, City / Town, Street / Building, Postcode
• Person, Organisation
• Product Dimension (WHICH ?): Category, Label / Artist, Album / Track, Tour / City / Arena, Merchandise
• Channel Dimension (VIA ?): Channel / Partner, In-store, Internet Service, Mobile Smart App (Spotify etc.)
• Motivation: Awareness, Interest, Need, Desire
Relationship labels: Advert / Publicity Type, Sales Channel, Posting / Blog Source / Type, Subject, Location, Media, Event, Customer, Time / Date
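A fact table surrounded by dimensions, as in the Cone™ MEDIA FACT figure, is commonly implemented as a star schema. The sketch below builds a cut-down version in an in-memory SQLite database using Python's standard `sqlite3` module. Only three of the eight dimensions are included, and every table name, column name and data row is a hypothetical illustration; in particular, storing the Cone segment on the party dimension is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE party_dim   (party_id INTEGER PRIMARY KEY, name TEXT, cone_segment TEXT);
CREATE TABLE event_dim   (event_id INTEGER PRIMARY KEY, event_type TEXT);
CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, artist TEXT, track TEXT);
-- Central fact table: one row per media event, keyed to each dimension.
CREATE TABLE media_fact (
    fact_id    INTEGER PRIMARY KEY,
    party_id   INTEGER REFERENCES party_dim(party_id),
    event_id   INTEGER REFERENCES event_dim(event_id),
    product_id INTEGER REFERENCES product_dim(product_id),
    event_date TEXT
);
""")

conn.executemany(
    "INSERT INTO party_dim VALUES (?, ?, ?)",
    [(1, "Alice", "Fanatics"), (2, "Bob", "Casuals")],
)
conn.executemany(
    "INSERT INTO event_dim VALUES (?, ?)",
    [(1, "Download"), (2, "Purchase")],
)
conn.execute("INSERT INTO product_dim VALUES (1, 'Example Artist', 'Example Track')")
conn.executemany(
    "INSERT INTO media_fact (party_id, event_id, product_id, event_date) "
    "VALUES (?, ?, ?, ?)",
    [(1, 1, 1, "2014-01-05"), (1, 2, 1, "2014-01-06"), (2, 1, 1, "2014-01-07")],
)

# Count media events per Cone segment by joining the fact to a dimension.
rows = conn.execute("""
    SELECT p.cone_segment, COUNT(*) AS events
    FROM media_fact f
    JOIN party_dim p ON p.party_id = f.party_id
    GROUP BY p.cone_segment
    ORDER BY p.cone_segment
""").fetchall()
```

The remaining dimensions (Geographic, Motivation, Time, Media, Channel) would be added in the same way, each as its own table with a foreign key on the fact table.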
Social Intelligence – Profiling and Analysis
Fanatics - 10%
Enthusiasts - 20%
Casuals - 30%
Indifferent - 40%
The Cone™‫‏‬
Brand Loyalty & Affinity
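The 10 / 20 / 30 / 40 split above can be sketched as a rank-based banding of customers by an affinity score. This assumes a single numeric score per customer already exists. The source does not define such a score, so the function name and the inputs below are hypothetical.

```python
from collections import Counter

# Cumulative upper bounds (percent of ranked customers) for each band.
BANDS = [("Fanatics", 10), ("Enthusiasts", 30), ("Casuals", 60), ("Indifferent", 100)]


def assign_segments(scores):
    """Map each customer id to a segment, given id -> affinity score (higher = more loyal)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    total = len(ranked)
    segments = {}
    for position, customer in enumerate(ranked, start=1):
        for name, upper_percent in BANDS:
            # Integer comparison avoids floating-point edge cases at band boundaries.
            if position * 100 <= upper_percent * total:
                segments[customer] = name
                break
    return segments


# Ten illustrative customers with scores 0..9.
example_scores = {f"c{i}": i for i in range(10)}
example_segments = assign_segments(example_scores)
segment_sizes = Counter(example_segments.values())
```

With ten customers this yields one Fanatic, two Enthusiasts, three Casuals and four Indifferent, matching the proportions on the slide.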
The Cone™ – Profiling & Analysis
The Cone™ – Model Development
Initialise
Cone™‫‏‬
Model
Cone™‫‏‬
Model
Design
Data Load
Cone™‫‏‬
Model
Calibration
and Tuning
Cone™‫‏‬
History
Matching
Cone™‫‏‬
Real-Time
Analytics
Survey
Script Data
Data Model
Customer
Data
Profiling
Data
Historic
Data
Real-Time
Data
Cone™
Model
Database
Design
Populated
Cone™
Model
Profiled
Cone™
Model
Historic
Trends
Actionable
Insights
Step 1 Step 3 Step 4 Step 5 Step 6Step 2
The Cone™ – Model Delivery
Phase | Step | Description | Input | Design Process | Output | Cost (estimate) | Skill Set
1 | 1 | Cone™ Model Data Analysis / Design | User Requirements | Data Analysis & Data Modelling | Cone™ Logical Data Model | £k | Business / Data Analyst
1 | 2 | Cone™ Data Design – Questionnaire | User Requirements | Data Analysis & Data Modelling | Questionnaire Survey Form | £k | Business / Data Analyst
1 | 3 | Cone™ Physical Database Design | Logical Data Model | Cone™ Database Design | Physical Cone™ Design | £k | Data Analyst / DBA
1 | 4 | Cone™ Data Load – Questionnaire / Survey Forms | Physical Data Model, Survey Questionnaire | Cone™ Model Calibration and Tuning Runs | Initialised Cone™ Model | £k | Business / Data Analyst, DBA
2 | 5 | Cone™ Data Load – In-house CRM and Audience Data | Physical Data Model, People CRM Data | Cone™ Model CRM Data Load | Populated Cone™ Model | £k | Business / Data Analyst, DBA
2 | 6 | Cone™ Profiling | Cone™ Clustering Algorithms | Cone™ Model Data Profiling – Kernel k-means | Profiled Cone™ Model | £k | Data Analyst, DBA, Data Scientists
3 | 7 | Cone™ Streaming and Segmentation | Historic Sales and CRM Data | Cone™ History Matching Runs | Cone™ Historic Trends | £k | Data Scientists
3 | 8 | Cone™ Real-time Social Media Feeds | Global Social Intelligence | Cone™ Real-Time Analytics | Actionable Cone™ Insights | (variable with Cone™ total data volume) | Data Scientists
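Step 6 names kernel k-means as the profiling technique but gives no further detail. The sketch below shows the general idea in plain Python: distances to cluster centres are computed in kernel space from a kernel matrix rather than from raw coordinates. The RBF kernel, the gamma value, the deterministic initial assignment and the sample points are all assumptions for illustration, not the Cone™ implementation.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) similarity between two points."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def kernel_kmeans(points, k, gamma=1.0, max_iter=50):
    """Return a cluster label per point using kernel k-means."""
    n = len(points)
    K = [[rbf_kernel(p, q, gamma) for q in points] for p in points]
    labels = [i % k for i in range(n)]  # simple deterministic start (assumption)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Mean pairwise similarity inside each cluster; shared by every point.
        within = [
            sum(K[j][l] for j in m for l in m) / len(m) ** 2 if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best, best_distance = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # skip empty clusters
                # Squared distance from point i to the centre of cluster c in kernel space.
                distance = K[i][i] - 2.0 * sum(K[i][j] for j in m) / len(m) + within[c]
                if distance < best_distance:
                    best, best_distance = c, distance
            new_labels.append(best)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


# Two well-separated illustrative groups of profile vectors.
profile_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
profile_labels = kernel_kmeans(profile_points, k=2)
```

Unlike plain k-means, only the kernel matrix is needed, so the same loop works with any similarity measure that forms a valid kernel.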
The Cone™ – Model Implementation
Initialise
Cone™‫‏‬
Model
Cone™‫‏‬
Model
Design
Data Load
Cone™‫‏‬
Model
Calibration
and Tuning
Cone™‫‏‬
History
Matching
Cone™‫‏‬
Real-Time
Analytics
Data Model
Database
Schema
Business
Analyst
DBA
Survey Data
Cone™‫‏‬Model
Data
Architect
DBA
CRM Data
Populated
Cone™‫‏‬Model
Data
Architect
DBA
Stream and
Segment Data
Profiled
Cone™‫‏‬Model
Data
Architect
DBA
Historic Data
Historic
Trends
Data
Architect
Data Scientists
Real-Time
Data
Actionable
Insights
Data
Architect
Data Scientists
The Cone™‫‏‬– Digital Marketing
Data Streams into Revenue Streams…..
• Digital Marketing is the communication, advertising and marketing of
brands, products and services via multiple digital channels and channel
partners in order to reach out to, contact and connect, on the most intimate
terms, with the widest possible range of consumers. Through the exploitation of
Digital Media we can initiate and maintain engaging Social Conversations.
• Digital Marketing extends key Brand Messages across every digital platform,
from simple internet marketing to mobile, broadcast and social media channels
– yielding Social Intelligence data in order to discover actionable Marketing
Insights – which in turn convert digital Data Streams into Revenue Streams.
• The key objective of Digital Marketing is to reach out to, contact and connect
directly with carefully selected consumers, so that we create strong, lasting
relationships through which to promote key brand, category and product
messages to targeted consumers and thus develop a tangible, valuable
and distinct brand / category / product interest, following, affinity and loyalty.
The Cone™
Converting Data Streams into Revenue Streams
Salesforce
Anomaly 42
Cone
Unica
End User
BIG DATA
ANALYTICS
SOCIAL MEDIA
E-Commerce
Platform
FULFILMENT
Sales Orders
Salesforce
CRM
Geo-demographics
• Streaming
• Segmentation
• Household Data
SOCIAL CRM
Households
Insights
InsightsInsights
Anomaly
42
Unica
Offers and
Promotions
People
and Places
Campaigns
Social Intelligence
• User Content and Blogs
• Social Groups and Networks
SOCIAL INTELLIGENCE
Actionable Marketing Insights
EXPERIAN
The Cone™‫‏‬
Big Wheel keeps on turning – Perfect Store
SalesForce.com – a Cloud Platform Social CRM Business Solution
The Cone™‫‏‬- Digital Marketing
The Cone™‫‏‬- Lifestyle Understanding
Customer Management
(CRM / CEM)
Social
Intelligence
Campaign
Management
e-Business
Big Data Analytics
The Cone™‫‏‬
Customer Loyalty
& Brand Affinity
The Cone™‫‏‬
Smart
Apps
Alarms
& Alerts
Reporting
“DATA SCIENCE” – my own special area of Business expertise
Targeting – Map / Reduce
Consume – End-User Data
Data Acquisition – High-Volume Data Flows
– Mobile Enterprise Platforms (MEAP’s)
Apache Hadoop Framework
HDFS, MapReduce, MATLAB, “R”
Autonomy, Vertica
Smart Devices
Smart Apps
Smart Grid
Clinical Trial, Morbidity and Actuarial Outcomes
Market Sentiment and Price Curve Forecasting
Horizon Scanning, Tracking and Monitoring
Weak Signal, Wild Card and Black Swan Event Forecasting
– Data Delivery and Consumption
News Feeds and Digital Media
Global Internet Content
Social Mapping
Social Media
Social CRM
– Data Discovery and Collection
– Analytics Engines - Hadoop
– Data Presentation and Display
Excel
Web
Mobile
– Data Management Processes
Data Audit
Data Profile
Data Quality Reporting
Data Quality Improvement
Data Extract, Transform, Load
– Performance Acceleration
GPU’s – massive parallelism
SSD’s – in-memory processing
DBMS – ultra-fast data replication
– Data Management Tools
DataFlux
Embarcadero
Informatica
Talend
– Info. Management Tools
Business Objects
Cognos
Hyperion
Microstrategy
Biolap
Jedox
Sagent
Polaris
Teradata
SAP HANA
Netezza (now IBM)
Greenplum (now EMC2)
Extreme Data xdg
Zybert Gridbox
– Data Warehouse Appliances
Ab Initio
Ascential
Genio
Orchestra
SOCIAL CRM – The Emerging Big Data Stack
The Cone™‫‏‬- Brand Loyalty / Affinity
1. Brand
Affinity
2. Social
Interaction
3. Geo-demographic Profile – Experian Mosaic -15 Groups (Segments), 66 Types (Streams)
Hybrid Cone™ – 3 Dimensions
Fanatics - 10%
Enthusiasts - 20%
Casuals - 30%
Indifferent - 40%
The Cone™‫‏‬
Brand Loyalty & Affinity
The Cone™‫‏‬- CAMPAIGN
Salesforce
Anomaly 42
Cone
Unica
End User
BIG DATA
ANALYTICS
Cone™‫‏‬
Brand Affinity
Campaign
CRM
Insights
InsightsInsights
SALES
PEOPLE
DEMOGRAPHICS
Household Data
SOCIAL INTELLIGENCE
User Content, Social
Groups and Networks
Offers and
Promotions
People
& Places
PROFILING
Streaming & Segmentation
The Cone™ – CONSUMER CYCLE
e-Business
Smart
Apps
Big Wheel keeps on turning – Perfect Store
Hadoop
Clustering and Managing Data.....
Managing Data Transfers in Networked Computer Clusters using Orchestra
To illustrate I/O bottlenecks, we studied the impact of data transfers in two clustered computing systems:
Hadoop - using a trace from a 3000-node cluster at Facebook
Spark - a MapReduce-like framework with iterative machine learning and graph algorithms.
Mosharaf Chowdhury, Matei Zaharia, Justin Ma, Michael I. Jordan, Ion Stoica
University of California, Berkeley
{mosharaf, matei, jtma, jordan, istoica}@cs.berkeley.edu
Hadoop Framework
• The workhorse relational database has been the tool of choice for businesses for well over 20
years now. Challengers have come and gone but the trusty RDBMS is the foundation of almost
all enterprise systems today. This includes almost all transactional and data warehousing
systems. The RDBMS has earned its place as a proven model that, despite some quirks, is
fundamental to the very integrity and operational success of IT systems around the world.
• The relational database is finally showing some signs of age as data volumes and network
speeds grow faster than processing power, even when it tracks Moore's Law, can
keep pace with. The Web in particular is driving innovation in new ways of processing
information as the data footprints of Internet-scale applications become prohibitive using
traditional SQL database engines.
• When it comes to database processing today, change is being driven by (at least) four factors:
– Speed. The seek times of physical storage are not keeping pace with improvements in network speeds.
– Scale. The RDBMS is difficult to scale out efficiently (clustering beyond a handful of servers is
notoriously hard).
– Integration. Today's data processing tasks increasingly have to access and combine data from many
different non-relational sources, often over a network.
– Volume. Data volumes have grown from tens of gigabytes in the 1990s to hundreds of terabytes and
often petabytes in recent years.
RDBMS and Hadoop: Apples and Oranges?
• Figure 1 below compares the overall differences between RDBMS
platforms and MapReduce-based systems such as Hadoop.
• From this it's clear that the MapReduce model cannot replace the
traditional enterprise RDBMS. However, it can be a key enabler of a
number of interesting scenarios that can considerably increase
flexibility, improve turn-around times, and make it possible to tackle
problems that weren't tractable before.
• With RDBMS platforms, SQL-based processing of data sets
tends to fall away and stop scaling linearly beyond a specific volume ceiling,
usually just a handful of nodes in a cluster. With MapReduce, you can
generally keep obtaining performance gains by increasing the size of the
cluster. In other words, double the size of a Hadoop cluster and a suitably
parallel job will run roughly twice as fast; quadruple it and the job will run
roughly four times as fast. The relationship stays close to linear across a
wide range of data volumes and throughput.
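The scaling claim above can be checked with simple arithmetic. The sketch below compares the ideal linear case with Amdahl's law, which is introduced here as an assumption to show how a small non-parallel share of a job erodes the gain. The runtimes, node counts and 5% serial fraction are illustrative figures, not measurements.

```python
def ideal_runtime(base_runtime, base_nodes, nodes):
    """Runtime under perfectly linear scaling: work divides evenly across nodes."""
    return base_runtime * base_nodes / nodes


def amdahl_runtime(base_runtime, base_nodes, nodes, serial_fraction):
    """Runtime when a fixed fraction of the job cannot be parallelised (Amdahl's law)."""
    def relative_time(n):
        return serial_fraction + (1.0 - serial_fraction) / n
    return base_runtime * relative_time(nodes) / relative_time(base_nodes)


# A job that takes 120 minutes on 10 nodes (illustrative).
doubled = ideal_runtime(120.0, 10, 20)                  # ideal: half the time
quadrupled = ideal_runtime(120.0, 10, 40)               # ideal: a quarter of the time
doubled_with_serial = amdahl_runtime(120.0, 10, 20, serial_fraction=0.05)
```

Under the ideal model the doubled cluster finishes in 60 minutes. With a 5% serial share the same doubling takes noticeably longer, which is why "roughly linear" is the safer reading.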
Comparing Data in DWH, Appliances,
Hadoop Clusters and Analytics Engines
Feature | RDBMS DWH | DWH Appliance | Hadoop Cluster | Analytics Appliance
Data size | Gigabytes | Terabytes | Petabytes | Petabytes
Access | Interactive and batch | Interactive and batch | Batch | Interactive
Structure | Fixed schema | Fixed schema | Flexible schema | Flexible schema
Language | SQL | SQL | Non-procedural Languages (Java, C++, Ruby, “R” etc) | Non-procedural Languages (Java, C++, Ruby, “R” etc)
Data Integrity | High | High | Low | Very High
Architecture | Shared memory - SMP | Shared nothing - MPP | Hadoop DFS | In-memory Processing – GPGPUs / SSDs
Virtualisation | Partitions / Regions | MPP / Nodal | MPP / Clustered | MPP / Clustered
Scaling | Non-linear | Nodal / Linear | Clustered / Linear | Clustered / Linear
Updates | Read and write | Write once, read many | Write once, read many | Write once, read many
Selects | Row-based | Set-based | Column-based | Array-based
Latency | Low – Real-time | Low – Near Real-time | High – Historic Reporting | Very Low – Real-time Analytics
Figure 1: Comparing RDBMS to MapReduce
Hadoop Framework
• These datasets would previously have been very challenging and expensive to take on with a
traditional RDBMS using standard bulk load and ETL approaches. Never mind trying to efficiently
combine multiple data sources simultaneously or deal with volumes of data that simply can't
reside on any single machine (or often even dozens). Hadoop deals with this by using a distributed
file system (HDFS) that's designed to deal coherently with datasets that can only reside across
distributed server farms. HDFS is also fault resilient and so doesn't impose the overhead of RAID
drives and mirroring on individual nodes in a Hadoop compute cluster, allowing the use of truly low
cost commodity hardware.
• So what does this specifically mean to enterprise users that would like to improve their data
processing capabilities? Well, first there are some catches to be aware of. Despite enormous
strengths in distributed data processing and analysis, MapReduce is not good in some key areas that
the RDBMS is extremely strong in (and vice versa). The MapReduce approach tends to have high
latency (i.e. not suitable for real-time transactions) compared to relational databases and is
strongest at processing large volumes of write-once data where most of the dataset needs to be
processed at one time. The RDBMS excels at point queries and updates, while MapReduce is best
when data is written once and read many times.
• The story is the same with structured data, where the RDBMS and the rules of database
normalization identified precise laws for preserving the integrity of structured data and which have
stood the test of time. MapReduce is designed for a less structured, more federated world where
schemas may be used but data formats can be much looser and freeform.
Hadoop Framework
• Each of these factors is presently driving interest in alternatives that are significantly better at
dealing with these requirements. I'll be clear here: The relational database has proven to be
incredibly versatile and is the right tool for the majority of business needs today. However, the edge
cases for many large-scale business applications are moving out into areas where the RDBMS is
often not the strongest option. One of the most discussed new alternatives at the moment
is Hadoop, a popular open source implementation of MapReduce. MapReduce is a simple yet very
powerful method for processing and analyzing extremely large data sets, even up to the multi-
petabyte level. At its most basic, MapReduce is a process for combining data from multiple inputs
(creating the "map"), and then reducing it using a supplied function that will distill and extract the
desired results. It was originally invented by engineers at Google to deal with the building of
production search indexes. The MapReduce technique has since spilled over into other disciplines
that process vast quantities of information including science, industry, and systems management.
For its part, Hadoop has become the leading implementation of MapReduce.
• While there are many non-relational database approaches out there today (see my emerging IT and
business topics post for a list), nothing currently matches Hadoop for the amount of attention it's
receiving or the concrete results that are being reported in recent case studies. A quick look at
the list of organizations that have applications powered by Hadoop includes Yahoo! with over
25,000 nodes (including a single, massive 4,000 node cluster), Quantcast which says it has over
3,000 cores running Hadoop and currently processes over 1PB of data per day, and Adknowledge
who uses Hadoop to process over 500 million clickstream events daily using up to 200 nodes.
HP HAVEn Big Data Platform
Informatica / Hortonworks Vibe
Telco 2.0 “Big Data” Analytics Architecture
Case Study – Huawei SmartCare CEM
Customers
Campaign Mart
Analytics &
Customer
Loyalty
Loyalty Mart
CRM Data
Customer DWH Customer Care
“BIG‫‏‬DATA”
Merchandising &
Logistics Data
Retail Data
Warehouse
Retail
Multi-channel
Sales Analysis
Mobile
Platforms
EPOS Data
Call Centre Data
Internet Data
e-Commerce
Systems
Store Systems
Merchandising
Warehousing
& Logistics
Inventory &
Provisioning
Hadoop Cluster
SAP HANA
ERP
Systems
Finance
Managers
Financial Data
Warehouse
Head
OfficeFinancial
Analysis
Reports
ERP Data
OSS – Network Management
Network Provisioning &
Fault Management
OperationsNetwork Data
Network and
Fault Reports
Operations
Managers
Inventory,
Provisioning &
Replenishment
BSS – Rating, Mediation and Billing
Mediation
Rating and
Billing
Systems
Business
Managers
Supplier Data
Product Data
Customer Data
Inventory &
Provisioning
Reports
Planning &
Forecasting
Systems
CDR Data
Call Data
Warehouse
Billing Data
Autonomy Vertica
Operational
“BIG‫‏‬DATA”
Multi-channel Retail
MSS – Head Office – Finance, Planning &Strategy
Social Media -
External Data
Customer Care
Systems
CRM & Digital
Marketing
Systems
Customers
CEM
SAP HANA
Catalogue
Hadoop ClusterPentaho,
MetLab, “R”
Cloudera
Apache
Hadoop
Framework
Big Data – Products
The MapReduce technique has spilled over into many other disciplines that process vast
quantities of information including science, industry, and systems management. The Apache
Hadoop Library has become the most popular implementation of MapReduce, with
framework distributions from Cloudera, Hortonworks and MapR.
Split-Map-Shuffle-Reduce Process
Big Data
Consumers
Split Map Shuffle Reduce
Key / Value Pairs Actionable InsightsData Provisioning Raw Data
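The four stages named on this slide can be sketched in a few lines of single-process Python. This is only an illustration of the data flow from raw data to key / value pairs to aggregated results. A Hadoop cluster runs the same stages in parallel across many nodes, and the function names and clickstream records below are hypothetical.

```python
from collections import defaultdict


def split(records, n_splits):
    """Split: partition the raw input into chunks, one per (notional) worker."""
    return [records[i::n_splits] for i in range(n_splits)]


def map_phase(chunk):
    """Map: emit a (key, value) pair for every record; here (page, 1) per click."""
    return [(page, 1) for _user, page in chunk]


def shuffle(mapped_chunks):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's values to one result; here a click count."""
    return {key: sum(values) for key, values in groups.items()}


# Illustrative clickstream events: (user, page).
clicks = [("u1", "/home"), ("u2", "/offers"), ("u1", "/offers"),
          ("u3", "/home"), ("u2", "/home")]

page_counts = reduce_phase(shuffle(map_phase(chunk) for chunk in split(clicks, 2)))
```

Because each chunk is mapped independently and each key is reduced independently, both phases can be spread across as many workers as there are chunks or keys.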
Apache Hadoop Component Stack
HDFS | Hadoop Distributed File System (HDFS)
MapReduce | Scalable Data Applications Framework
Pig | Procedural Language – abstracts low-level MapReduce operators
Zookeeper | High-reliability distributed cluster co-ordination
Hive | Structured Data Access Management
HBase | Hadoop Database Management System
Oozie | Job Management and Data Flow Co-ordination
Mahout | Scalable Knowledge-base Framework
Data Management Component Stack
Informatica | Informatica Big Data Edition / Vibe Data Stream
Drill | Data Analysis Framework
Millwheel | Data Analytics on-the-fly + Extract – Transform – Load Framework
Flume | Extract – Transform – Load
Sqoop | Extract – Transform – Load
Scribe | Extract – Transform – Load
Talend | Extract – Transform – Load
Pentaho | Extract – Transform – Load Framework + Data Reporting on-the-fly
Big Data Storage Platforms
Autonomy | HP Unstructured Data DBMS
Vertica | HP Columnar DBMS
MongoDB | High-availability DBMS
CouchDB | Couchbase Database Server for Big Data with NoSQL / Hadoop Integration
Pivotal | Pivotal Big Data Suite – GreenPlum, GemFire, SQLFire, HAWQ
Cassandra | Cassandra Distributed Database for Big Data with NoSQL and Hadoop Integration
NoSQL | NoSQL Database for Oracle, SQL/Server, Couchbase etc.
Riak | Basho Technologies Riak Big Data DBMS with NoSQL / Hadoop Integration
Big Data Analytics Engines and Appliances
Alpine | Alpine Data Studio - Advanced Big Data Analytics
Karmasphere | Karmasphere Studio and Analyst – Hadoop Customer Analytics
Kognitio | Kognitio In-memory Big Data Analytics MPP Platform
Skytree | Skytree Server Artificial Intelligence / Machine Learning Platform
Redis | Redis is an open source key-value database for AWS, Pivotal etc.
Teradata | Teradata Appliance for Hadoop
Neo4j | Crunchbase Neo4j - Graph Database for Big Data
InfiniDB | Columnar MPP open-source DB version hosted on GitHub
Big Data Analytics and Visualisation Platforms
Tableau Tableau - Big Data Visualisation Engine
Eclipse Symentec Eclipse - Big Data Visualisation
Mathematica Mathematical Expressions and Algorithms
StatGraphics Statistical Expressions and Algorithms
FastStats Numerical computation, visualization and programming toolset
MatLab
R
Data Acquisition and Analysis Application Development Toolkit
“R” Statistical Programming / Algorithm Language
Revolution Revolution Analytics Framework and Library for “R”
Hadoop / Big Data Extended Infrastructure Stack
SSD Solid State Drive (SSD) – configured as cached memory / fast HDD
CUDA CUDA (Compute Unified Device Architecture)
GPGPU GPGPU (General Purpose Graphical Processing Unit Architecture)
IMDG IMDG (In-memory Data Grid – extended cached memory)
Vibe
Splunk
High Velocity / High Volume Machine / Automatic Data Streaming
High Velocity / High Volume Machine / Automatic Data Streaming
Ambari High-availability distributed cluster co-ordination
YARN Hadoop Resource Scheduling
Big Data Extended Architecture Stack
Cloud-based Big-Data-as-a-Service and Analytics
AWS
Amazon Web Services (AWS) – Big Data-as-a-Service (BDaaS)
Elastic Compute Cloud (EC2) and Simple Storage Service (S3)
1010 Data Big Data Discovery, Visualisation and Sharing Cloud Platform
SAP HANA SAP HANA Cloud - In-memory Big Data Analytics Appliance
Azure Microsoft Azure Data-as-a-Service (DaaS) and Analytics
Anomaly 42 Anomaly 42 Smart-Data-as-a-Service (SDaaS) and Analytics
Workday Workday Big-Data-as-a-Service (BDaaS) and Analytics
Google Cloud
Google Cloud Platform – Cloud Storage, Compute Platform,
Firebrand API Resource Framework
Apigee Apigee API Resource Framework
Gartner Magic Quadrant for BI and Analytics Platforms
Hadoop Framework Distributions
FEATURE Hortonworks Cloudera MAPR Pivotal
Open Source Hadoop Library Yes Yes Yes Pivotal HD
Support Yes Yes Yes Yes
Professional Services Yes Yes Yes Yes
Catalogue Extensions Yes Yes Yes Yes
Management Extensions Yes Yes Yes
Architecture Extensions Yes Yes
Infrastructure Extensions Yes Yes
Library
Support
Services
Catalogue
Job Management
Library
Support
Services
Catalogue
Hortonworks Cloudera MAPR
Library
Support
Services
Catalogue
Job Management
Resilience
High Availability
Performance
Pivotal
Library
Support
Services
Catalogue
Job Management
Resilience
High Availability
Performance
Gartner Magic Quadrant for BI
Data Warehouse Appliance / Real-time
Analytics Engine Price Comparison
Manufacturer | Server Configuration | Cached Memory | Server Type | Software Platform | Cost (est.)
SAP HANA | 32-node (4 Channels x 8 CPU) | 1.3 Terabytes | SMP | Proprietary | $ 6,000,000
Teradata | 20-node (2 Channels x 10 CPU) | 1 Terabyte | MPP | Proprietary | $ 1,000,000
Netezza (now IBM) | 20-node (2 Channels x 10 CPU) | 1 Terabyte | MPP | Proprietary | $ 180,000
IBM ex5 (non-HANA configuration) | 32-node (4 Channels x 8 CPU) | 1.3 Terabytes | SMP | Proprietary | $ 120,000
Greenplum (now Pivotal) | 20-node (2 Channels x 10 CPU) | 1 Terabyte | MPP | Open Source | $ 20,000
XtremeData xdb (BO BW) | 20-node (2 Channels x 10 CPU) | 1 Terabyte | MPP | Open Source | $ 18,000
Zybert Gridbox | 48-node (4 Channels x 12 CPU) | 20 Terabytes | SMP | Open Source | $ 60,000
Clustering in “Big Data”
“A Cluster is a group of the same or similar data elements
which are aggregated – or closely distributed – together”
Clustering is a technique used to explore content and
understand information in every business sector and scientific
field that collects and processes very large volumes of data
Clustering is an essential tool for any “Big Data” problem
• “Big Data” refers to vast aggregations (super sets) consisting of numerous individual
datasets (structured and unstructured) whose size and scope are beyond the capability of
conventional transactional (OLTP) or analytics (OLAP) Database Management Systems
and Enterprise Software Tools to capture, store, analyse and manage. Examples of “Big
Data” include the vast and ever-changing amounts of data generated in social networks
where we maintain Blogs and have conversations with each other, news data streams,
geo-demographic data, internet search and browser logs, as well as the ever-growing
amount of machine data generated by pervasive smart devices (monitors, sensors and
detectors in the environment), captured via the Smart Grid, then processed in the Cloud
and delivered to end-user Smart Phones and Tablets via Intelligent Agents and Alerts.
• Data Set Mashing and “Big Data” Global Content Analysis drive Horizon Scanning,
Monitoring and Tracking processes by taking numerous, apparently unrelated RSS and
other Information Streams and Data Feeds and loading them into Very Large Scale (VLS)
DWH Structures and Document Management Systems for Real-time Analytics. The aim is to
search for and identify possible signs of relationships hidden in data (Facts / Events), in order to
discover and interpret previously unknown Data Relationships driven by hidden Clustering
Forces. These are revealed via “Weak Signals” indicating emerging and developing Application
Scenarios, Patterns and Trends, which in turn predict possible, probable and alternative
global transformations that may unfold as future “Wild Card” or “Black Swan” events.
“Big Data”
Clustering in “Big Data”
• The profiling and analysis of
large aggregated datasets in
order to determine a ‘natural’
structure of groupings provides
an important technique for many
statistical and analytic
applications. Cluster analysis
on the basis of profile similarities
or geographic distribution is a
method where no prior
assumptions are made
concerning the number of
groups or group hierarchies and
internal structure. Geo-
demographic techniques are
frequently used in order to
profile and segment populations
by ‘natural’ groupings - such as
common behavioural traits,
Clinical Trial, Morbidity or
Actuarial outcomes - along with
many other shared
characteristics and common
factors.....
Clustering in “Big Data”
• “BIG DATA” ANALYTICS – PROFILING, CLUSTERING and 4D GEOSPATIAL ANALYSIS •
• The profiling and analysis of large aggregated datasets in order to determine a ‘natural’
structure of data relationships or groupings, is an important starting point forming the basis of
many mapping, statistical and analytic applications. Cluster analysis of implicit similarities -
such as time-series demographic or geographic distribution - is a critical technique where no
prior assumptions are made concerning the number or type of groups that may be found, or
their relationships, hierarchies or internal data structures. Geospatial and demographic
techniques are frequently used in order to profile and segment populations by ‘natural’
groupings. Shared characteristics or common factors such as Behaviour / Propensity or
Epidemiology, Clinical, Morbidity and Actuarial outcomes – allow us to discover and explore
previously unknown, concealed or unrecognised insights, patterns, trends or data relationships.
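One way to cluster without fixing the number of groups in advance is a density-based method. The sketch below is a compact DBSCAN written in plain Python. It is offered as one example of the idea and is not a technique the slides name. The `eps` and `min_pts` values and the sample coordinates are illustrative assumptions.

```python
import math


def region_query(points, index, eps):
    """Indices of all points within eps of points[index] (including itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[index], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    NOISE, UNVISITED = -1, None
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # i is a core point, so it seeds a new cluster
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is also a core point
    return labels


# Two dense groups and one isolated point (illustrative coordinates).
locations = [(0, 0), (0, 1), (1, 0), (1, 1),
             (10, 10), (10, 11), (11, 10), (11, 11),
             (50, 50)]
cluster_labels = dbscan(locations, eps=1.5, min_pts=3)
```

The number of clusters emerges from the data: here two groups are found and the isolated point is marked as noise, without the analyst stating a group count up front.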
• PREDICTIVE ANALYTICS and EVENT FORECASTING •
• Predictive Analytics and Event Forecasting use Horizon Scanning, Tracking and Monitoring
methods combined with Cycle, Pattern and Trend Analysis techniques for Event Forecasting
and Propensity Models in order to anticipate a wide range of business, economic, social and
political Future Events – ranging from micro-economic Market phenomena such as forecasting
Market Sentiment and Price Curve movements - to large-scale macro-economic Fiscal
phenomena using Weak Signal processing to predict future Wild Card and Black Swan Events
- such as Monetary System shocks.
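The slides do not say how Weak Signals are detected. As a deliberately simple stand-in, the sketch below flags points in a time series that deviate sharply from their recent history using a trailing z-score. The window length, threshold and sample series are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def flag_deviations(series, window=5, threshold=3.0):
    """Return indices whose value sits more than `threshold` standard deviations
    away from the mean of the preceding `window` observations."""
    flagged = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(series[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Illustrative daily sentiment index with one abrupt jump at position 7.
sentiment = [10, 11, 10, 12, 11, 10, 11, 25, 11, 10]
signals = flag_deviations(sentiment)
```

A production forecasting model would need seasonality handling and far more history. The point here is only that a signal is defined relative to a recent baseline rather than to a fixed level.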
Multi-channel Retail - Digital Architecture
• The last decade has seen an unprecedented explosion in mobile platforms
as the internet and mobile worlds came of age. It is no longer acceptable to
have only a bricks-and-mortar high-street presence – customer-focused
companies are now expected to deliver their Customer Experience and
Journey via internet websites, mobiles and more recently tablets.
GIS MAPPING and SPATIAL DATA ANALYSIS
• A Geographic Information System (GIS) integrates hardware, software and
digital data capture devices for acquiring, managing, analysing, distributing and
displaying all forms of geographically dependent location data – including
machine generated data such as Computer-aided Design (CAD) data from land
and building surveys, Global Positioning System (GPS) terrestrial location data -
as well as all kinds of data streams - HDCCTV, aerial and satellite image data.....
GIS Mapping and Spatial Analysis
• Spatial Data Analysis is a set of techniques for analysing 3-dimensional spatial
(Geographic) data and location (Positional) object data overlays. Software that
implements spatial analysis techniques requires access to both the locations of
objects and their physical attributes. Spatial statistics extends traditional statistics
to support the analysis of geographic data. Spatial Data Analysis provides
techniques to describe the distribution of data in the geographic space (descriptive
spatial statistics), analyse the spatial patterns of the data (spatial pattern or cluster
analysis), identify and measure spatial relationships (spatial regression), and
create a surface from sampled data (spatial interpolation, usually categorized as
geo-statistics).
• The results of spatial data analysis are largely dependent upon the type,
quantity, distribution and data quality of the spatial objects under analysis.
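Spatial interpolation, mentioned above as creating a surface from sampled data, can be illustrated with inverse distance weighting (IDW). IDW is one common interpolation method, chosen here as an assumption since the slides do not name one. The sample points, the power parameter and the use of flat x / y coordinates instead of latitude and longitude are also illustrative.

```python
import math


def idw_interpolate(samples, x, y, power=2.0):
    """Estimate the value at (x, y) from (x, y, value) samples.

    Each sample is weighted by 1 / distance**power, so nearby samples dominate.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sample_x, sample_y, value in samples:
        distance = math.hypot(x - sample_x, y - sample_y)
        if distance == 0.0:
            return value  # exactly on a sample point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Illustrative visitor counts measured at two locations.
visitor_samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw_interpolate(visitor_samples, 1.0, 0.0)
near_first_estimate = idw_interpolate(visitor_samples, 0.5, 0.0)
```

Evaluating the function over a grid of (x, y) positions produces the interpolated surface. As the text notes, the result depends heavily on how many samples there are and how they are distributed.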
World-wide Visitor Count – GIS Mapping
Geo-demographic Clustering in “Big Data”
• GEODEMOGRAPHIC PROFILING – CLUSTERING IN “BIG DATA” •
• The profiling and analysis of large aggregated datasets in order to determine a
‘natural’ or implicit structure of data relationships or groupings is an important
starting point forming the basis of many statistical and analytic applications. No prior
assumptions are made concerning the number or type of groups discovered, or their
relationships, hierarchies or internal data structures; the aim is to discover hidden data
relationships. The subsequent explicit Cluster Analysis of the discovered data
relationships is a critical technique which attempts to explain the nature, cause and
effect of those implicit profile similarities or geographic distributions. Demographic
techniques are frequently used in order to profile and segment populations using
‘natural’ groupings - such as common behavioural traits, Clinical, Morbidity or Actuarial
outcomes, along with many other shared characteristics and common factors – and
then attempt to understand and explain those natural group affinities and geographical
distributions using methods such as Causal Layered Analysis (CLA).
GIS Mapping and Spatial Analysis
• Spatial Data Analysis is a set of techniques for analysing spatial (Geographic)
location data. The results of spatial analysis are dependent on the locations of
the objects being analysed. Software that implements spatial analysis techniques
requires access to both the locations of objects and their physical attributes.
• Spatial statistics extends traditional statistics to support the analysis of geographic
data. Spatial Data Analysis provides techniques to describe the distribution of data in
the geographic space (descriptive spatial statistics), analyse the spatial patterns of the
data (spatial pattern or cluster analysis), identify and measure spatial relationships
(spatial regression), and create a surface from sampled data (spatial interpolation,
usually categorized as geo-statistics).
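As a minimal sketch of the spatial interpolation step mentioned above, the following uses inverse distance weighting (IDW), one common interpolation method. The slides do not name a specific method, so the choice of IDW, the function name `idw_interpolate` and the sample values are all assumptions made for illustration.

```python
import math


def idw_interpolate(samples, x, y, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) samples.

    Uses inverse distance weighting: nearer samples get larger weights.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # the query point sits exactly on a sample
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical sampled data: four measurements at the corners of a square.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 30.0), (2.0, 2.0, 40.0)]
centre_estimate = idw_interpolate(samples, 1.0, 1.0)
```

Because the centre of the square is equally distant from all four samples, the estimate there is simply their average. Repeating the call over a grid of (x, y) points would produce the interpolated surface.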
BTSA Induction Cluster Map
Geo-Demographic Profile Clusters
Targeting – Map / Reduce; Consume – End-User Data; Data Acquisition – High-Volume
– Mobile Enterprise Platforms (MEAPs): Smart Devices, Smart Apps, Smart Grid
– Data Delivery and Consumption: Clinical Trial, Morbidity and Actuarial Outcomes; Market Sentiment and Price Curve Forecasting; Horizon Scanning, Tracking and Monitoring; Weak Signal, Wild Card and Black Swan Event Forecasting
– Data Discovery and Collection: News Feeds and Digital Media, Global Internet Content, Social Mapping, Social Media, Social CRM
– Analytics Engines - Hadoop: Apache Hadoop Framework, HDFS, MapReduce, MATLAB, “R”, Autonomy, Vertica
– Data Management Processes: Data Audit, Data Profile, Data Quality Reporting, Data Quality Improvement, Data Extract, Transform, Load
– Performance Acceleration: GPUs – massive parallelism; SSDs – in-memory processing; DBMS – ultra-fast data replication
– Data Presentation and Display: Excel, Web, Mobile
– Data Management Tools: DataFlux, Embarcadero, Informatica, Talend, Ab Initio, Ascential, Genio, Orchestra
– Info. Management Tools: Business Objects, Cognos, Hyperion, Microstrategy, Biolap, Jedox, Sagent, Polaris
– Data Warehouse Appliances: Teradata, SAP HANA, Netezza (now IBM), Greenplum (now EMC2), Extreme Data xdg, Zybert Gridbox
Clustering Phenomena in “Big Data”
“A Cluster is a group of profiled data similarities aggregated closely together”
• Cluster Analysis is a technique which is used to explore very large volumes of
structured and unstructured data - transactional, machine generated (automatic),
social media and internet content, and geo-demographic information - in order to
discover previously unknown, unrecognised or hidden logical data relationships.
Event Clusters and Connectivity
(Diagram: Events A, B, C, D, E, F, G and H and the connections between them)
The above is an illustration of Event relationships - how Events might be connected. Any detailed,
intimate understanding of the connection between Events may help us to answer questions such as: -
• If Event A occurs, does it make Event B or H more or less likely to occur?
• If Event B occurs, what effect does it have on Events C, D, E, F and G?
Answering questions such as these allows us to plan our Event Management approach and Risk
mitigation strategy – and to decide how better to focus our Incident / Event resources and effort…..
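One way to approach the first question (does Event A make Event B more or less likely?) is to compare the conditional probability P(B | A) with the base rate P(B) over a history of observed incidents. The sketch below is only illustrative: the incident records, the event labels and the function names are hypothetical, and a real analysis would need far more data and a significance test.

```python
def base_rate(records, target):
    """Fraction of all records in which the target event occurred."""
    return sum(1 for r in records if target in r) / len(records)


def conditional_probability(records, given, target):
    """Estimate P(target | given) from sets of co-occurring event labels."""
    with_given = [r for r in records if given in r]
    if not with_given:
        return None  # the conditioning event was never observed
    return sum(1 for r in with_given if target in r) / len(with_given)


# Hypothetical history: each set lists the events seen in one incident window.
records = [
    {"A", "B"}, {"A", "B"}, {"A"}, {"B"},
    {"C"}, {"A", "B", "C"}, {"C"}, set(),
]

p_b = base_rate(records, "B")
p_b_given_a = conditional_probability(records, "A", "B")
lift = p_b_given_a / p_b  # above 1: A makes B more likely; below 1: less likely
```

A lift well above 1 suggests the two events are candidates for the "related" or "connected" categories and deserve further investigation; it does not by itself establish cause.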
Event Clusters and Connectivity
• Aggregated Events include coincident, related, connected and interconnected Events: -
• Coincident - two or more Events appear simultaneously in the same domain –
but they arise from different triggers (unrelated causal events)
• Related - two or more Events materialise in the same domain sharing common
Event features or characteristics (they may share a hidden common trigger or
cause – and so are candidates for further analysis and investigation)
• Connected - two or more Events materialise in the same domain due to the same
trigger (common cause)
• Interconnected - two or more Events materialise together in an Event cluster, series
or “storm” - the previous (prior) Event triggering the subsequent (next) Event
in an Event Series…..
• A series of Aggregated Events may result in a significant cumulative impact - and is
therefore frequently identified incorrectly as a Wild-card or Black Swan Event - rather
than simply as an event cluster or event “storm”.....
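The four aggregated event types above can be expressed as a small decision rule. This is a sketch under stated assumptions: the `Event` record, its fields (`trigger`, `features`, `caused_by`) and the order in which the rules are tested are all hypothetical choices made for illustration, not a method given in the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    name: str
    trigger: str                       # identifier of the causal trigger
    features: frozenset = frozenset()  # observable characteristics
    caused_by: Optional[str] = None    # name of the prior event, if any


def classify_pair(first, second):
    """Classify two events in the same domain using the four definitions."""
    if first.caused_by == second.name or second.caused_by == first.name:
        return "interconnected"  # one event triggered the other
    if first.trigger == second.trigger:
        return "connected"       # same trigger (common cause)
    if first.features & second.features:
        return "related"         # different triggers, shared characteristics
    return "coincident"          # different triggers, nothing in common


a = Event("A", "trigger-a", frozenset({"flood"}))
b = Event("B", "trigger-b", frozenset({"theft"}))
c = Event("C", "trigger-1", frozenset({"late-night", "vehicle"}))
d = Event("D", "trigger-2", frozenset({"vehicle"}))
e = Event("E", "trigger-3")
f = Event("F", "trigger-3")
g = Event("G", "trigger-4")
h = Event("H", "event:G", caused_by="G")
```

Here `classify_pair(c, d)` reports a "related" pair, which is the case the slide flags as a candidate for further investigation of a possible hidden common trigger.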
Event Clusters and Connectivity
(Diagram: Risk Events 1 to 8 and the connections between them)
The above is an illustration of Event relationships - how Risk Events might be connected. A detailed and
intimate understanding of Event clusters and the connection between Events may help us to understand: -
• What is the relationship between Events 1 and 8, and what impact do they have on Events 2 - 7?
• Events 2 - 5 and Events 6 and 7 occur in clusters – what are the factors influencing these clusters?
Answering questions such as these allows us to plan our Risk Event management approach and mitigation
strategy – and to decide how to better focus our resources and effort on Risk Events and fraud management.
(Diagram labels: Claimant 1, Risk Event, Claimant 2, Residence, Vehicle, Event Cluster)
Aggregated Event Types
• Coincident Events: Trigger A → Event A; Trigger B → Event B
• Related Events: Trigger 1 → Event C; Trigger 2 → Event D
• Connected Events: one Trigger → Event E and Event F
• Inter-connected Events: Trigger → Event G → Event H
Event Complexity Map
• 4D Geospatial Analytics is the profiling and analysis of large aggregated datasets in
order to determine a ‘natural’ structure of groupings. It provides an important technique
for many statistical and analytic applications.
• Demographic and Geospatial Cluster Analysis - on the basis of profile similarities or
geographic distribution - is a statistical method whereby no prior assumptions are made
concerning the number of groups, group hierarchies or internal structure. Geo-spatial and
geodemographic techniques are frequently used in order to profile and segment populations
by ‘natural’ groupings - such as common behavioural traits, Clinical Trial, Morbidity or
Actuarial outcomes - along with many other shared characteristics and common factors.....
4D Geospatial Analytics
The Flow of Information through Time
• String Theory posits that Space-Time exists in discrete packages, with Time Present always
in some way inextricably woven into both Time Past and Time Future. This yields the intriguing
possibility of insights through the mists of time into the outcome of future events – as any item of
Data or Information (Global Content) may contain faint traces which offer glimpses into the future
trajectory of Clusters of linked Past, Present and Future Events. If all future timelines were linear,
then every event would unfold in an unerringly predictable manner towards a known and certain
conclusion. The future is, however, both unknown and unknowable (Hawking Paradox). Future
outcomes are uncertain – future timelines are non-linear (branched) with a multitude of possible
alternative futures. Chaos Theory suggests that even the most subliminal inputs, originating from
unknown forces so minute as to be undetectable, might become amplified through numerous
system cycles to grow in influence and impact over time – deviating Space-Time trajectories far
away from their original predicted path – so fundamentally altering the outcome of future events.
• Every item of Global Content in the Present is somehow connected with both Past and Future
temporal planes. Space-Time is a Dimension Cluster consisting of the three Spatial dimensions
(x, y and z axes) plus Time (the fourth dimension - t) – which together flow in a single direction –
relentlessly towards the future. Space-Time does not flow uniformly – the “arrow of time” may
be deflected by unknown factors. There may exist “hidden external forces” (unseen interactions)
that create disturbance in the temporal plane stack which marks the passage of time - with the
potential to create eddies, vortices and whirlpools along the trajectory of Time (chaos, disorder
and uncertainty) – which in turn possess the capacity to generate ripples and waves (randomness
and disruption) – thus changing the course of the Space-Time continuum. “Weak Signals” are
“Ghosts in the Machine” – echoes of these subliminal temporal interactions – that may contain
within them insights or clues about possible future “Wild card” or “Black Swan” random events.
4D Geospatial Analytics – The Temporal Wave
• The Temporal Wave is a novel method for Visual Modelling and Exploration
of Geospatial “Big Data” - simultaneously within a Time (history) and Space (geographic)
context. The problems encountered in exploring and analysing vast volumes of
spatial–temporal information in today's data-rich landscape are becoming increasingly
difficult to manage effectively. Overcoming the problem of data volume and scale in a Time
(history) and Space (location) context requires not only the traditional location–space and
attribute–space analysis common in GIS Mapping and Spatial Analysis - but also the
additional dimension of time–space analysis. The Temporal Wave supports a new method
of Visual Exploration for Geospatial (location) data within a Temporal (timeline) context.
• This time-visualisation approach integrates Geospatial (location) data within a Temporal
(timeline) dataset - along with data visualisation techniques - thus improving accessibility,
exploration and analysis of the huge amounts of geo-spatial data used to support
geo-visual “Big Data” analytics. The Temporal Wave combines the strengths of both linear
timeline and cyclical wave-form analysis – and is able to represent data within both a Time
(history) and Space (geographic) context simultaneously – and even at different levels of
granularity. Linear and cyclic trends in space-time data may be represented in combination
with other graphic representations typical for location–space and attribute–space
data-types. The Temporal Wave can be used as a time–space data reference system,
as a time–space continuum representation tool, and as a time–space interaction tool.
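To make the idea of combining a linear timeline with a cyclical wave-form concrete, the sketch below maps each timestamp to a pair of coordinates: a linear position (elapsed days) and a cyclic position (a sine wave over a chosen cycle length). This is only one possible reading of the description above; the function name `temporal_wave_point`, the sine mapping and the annual default cycle are assumptions, not the published Temporal Wave design.

```python
import math
from datetime import datetime, timedelta


def temporal_wave_point(timestamp, origin, cycle_days=365.25):
    """Return (linear, cyclic) coordinates for a timestamp.

    linear: days elapsed since the origin (the timeline axis)
    cyclic: sine of the phase within the current cycle (the wave axis)
    """
    elapsed_days = (timestamp - origin).total_seconds() / 86400.0
    phase = (elapsed_days % cycle_days) / cycle_days
    return elapsed_days, math.sin(2.0 * math.pi * phase)


# Hypothetical observations: events with a location and a timestamp.
origin = datetime(2020, 1, 1)
observations = [
    ("London", datetime(2020, 4, 1)),
    ("London", datetime(2021, 4, 1)),
    ("Leeds", datetime(2020, 10, 1)),
]
wave_points = [
    (place, *temporal_wave_point(when, origin)) for place, when in observations
]
```

Events that recur at the same point in the cycle (the two April observations) land at almost the same height on the wave while remaining a year apart on the timeline, which is the property that lets linear and cyclic trends be read from one display.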
4D Geospatial Analytics – London Timeline
• How did London evolve from its creation as a Roman city in AD 43 into the
crowded, chaotic cosmopolitan megacity we see today? The London Evolution
Animation takes a holistic view of what has been constructed in the capital over
different historical periods – what has been lost, what saved and what protected.
• Greater London covers 600 square miles. Up until the 17th century, however,
the capital city was crammed largely into a single square mile which today is
marked by the skyscrapers which are a feature of the financial district of the City.
• This visualisation, originally created for the Almost Lost exhibition by the Bartlett
Centre for Advanced Spatial Analysis (CASA), explores the historic evolution of
the city by plotting a timeline of the development of the road network - along with
documented buildings and other features – through 4D geospatial analysis of a
vast number of diverse geographic, archaeological and historic data sets.
• Unlike other historical cities such as Athens or Rome, with an obvious patchwork
of districts from different periods, London's individual structures, scheduled sites
and listed buildings were in many cases constructed gradually, from parts assembled
during different periods. Researchers who have tried previously to locate and
document archaeological structures and research historic references will know
that these features, when plotted, appear scrambled up like pieces of different
jigsaw puzzles – all scattered across the contemporary London cityscape.
Social Intelligence – Brand Affinity
CONE SEGMENTS - BRAND AFFINITY
• Social Intelligence drives Brand Loyalty Understanding - Fan-base Profiling, Streaming and Segmentation –
expressed in the creation and maintenance of a detailed History and Balanced Scorecard for every individual in
the Cone, allowing summation by Stream / Segment: -
1. Inactive – need to draw their attention towards the Brand
2. Indifferent – need to educate them about core Brand Values
3. Disconnected – need to re-engage with the Brand
4. Casuals – exhibit Brand awareness and interest
5. Followers – follow the Brand, engage with social media and consume brand communications
6. Enthusiasts – engaged with the Brand, participate in Brand / Product / Media events and merchandising
7. Supporters – show strong need, desire and propensity to support Brand / Product / Media consumption
8. Fanatics – demonstrate total Commitment / Dedication / Loyalty for all aspects of the Brand / Product / Media
PROPENSITY
• Balanced Scorecard – a summary of all the data-points for an Individual / Stream / Segment
• Propensity Score – in the statistical analysis of observational data, Propensity Score Matching (PSM) is a
statistical matching technique that attempts to estimate the effect of a Campaign / Offer / Promotion or other
intervention by accounting for the factors (covariates) that predict who receives the Campaign / Offer / Promotion.
• Propensity Model – the Bayesian probability of the outcome of an event for an Individual / Stream / Segment
• Predictive Analytics - an area of data mining that deals with extracting information from data and using it to
predict trends and behaviour patterns. Often the unknown event of interest is in the future; however, Predictive
Analytics can be applied to any type of event with an unknown outcome - in the past, present or future.
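A Bayesian propensity for a Segment can be sketched with a Beta-Binomial model: start from a prior belief about the response rate and update it with the observed responses to a Campaign. This illustrates the Propensity Model bullet only, not Propensity Score Matching. The uniform Beta(1, 1) prior, the function name and the response counts are assumptions made for illustration.

```python
def posterior_propensity(responses, contacts, prior_alpha=1.0, prior_beta=1.0):
    """Posterior mean response probability under a Beta-Binomial model.

    responses: number of individuals in the segment who responded
    contacts:  number of individuals in the segment who were contacted
    """
    if responses < 0 or contacts < responses:
        raise ValueError("need 0 <= responses <= contacts")
    return (prior_alpha + responses) / (prior_alpha + prior_beta + contacts)


# Hypothetical campaign results per Cone segment: (responses, contacts).
campaign_results = {
    "Casuals": (3, 50),
    "Followers": (9, 45),
    "Enthusiasts": (18, 40),
}
propensity = {
    segment: posterior_propensity(r, n)
    for segment, (r, n) in campaign_results.items()
}
ranked_segments = sorted(propensity, key=propensity.get, reverse=True)
```

The prior keeps estimates for small segments away from the extremes of 0 and 1: a segment with no contacts at all gets a propensity of 0.5 under the uniform prior rather than an undefined value.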
Social Intelligence – Fan-base Understanding
Football Supporters – Map of London
Social Intelligence – Fan-base Understanding
CONE STREAMING and SEGMENTATION
• Multiple Cones can be created and cross-referenced using Social Intelligence and Brand
Interaction / Fan-base Profiling and Segmentation in order to deliver actionable insights for any
genre of Brand Loyalty and Fan-base Understanding – as well as for other Geo-demographic
Analytics purposes – e.g. Digital Healthcare, Clinical Trials, Morbidity and Actuarial Outcomes: -
– Music (e.g. BBC and Sony Music)
– Broadcasting (e.g. Radio 1 / American Idol)
– Digital Media Content (e.g. Sony Films / Netflix)
– Sports Franchises (e.g. Manchester City / New York City)
– Sport Footwear and Apparel (e.g. Nike, Puma, Adidas, Reebok)
– Fast Fashion Retailers (e.g. ASOS, Next, New Look, Primark)
– Luxury Brands / Aggregators (e.g. Armani, Burberry, Versace / LVMH, PPR, Richemont)
– Multi-channel Retailers – Brand Affinity / Loyalty Marketing + Product Campaigns, Offers & Promotions
– Financial Services Companies – Brand Protection and Reputation Management
– Travel, Leisure and Entertainment Organisations - Destination Events and Resorts
– MVNO / CSPs - OTT Business Partner Analytics (Sky Go, Netflix, iPlayer via Firebrand / Apigee)
– Telco, Media and Communications - Churn Management / Conquest / Up-sell / Cross-sell Campaigns
– Digital Healthcare – Private / Public Healthcare Service Provisioning: - Geo-demographic Clustering and
Propensity Modelling (Patient Monitoring, Wellbeing, Clinical Trials, Morbidity and Actuarial Outcomes)
Social Intelligence – Social Interaction
Social Interaction Cone Rules
1. Inactive – not engaged – low evidence / low affinity / low interest in Social Media
2. Lone Wolf – sparse / thin social network - may share negative information (Trolling)
3. Home Boy – Social Network clustered around Home Location Postcodes (Gang Culture)
4. Eternal Student – Social Network clustered around School / College / University Alumni
5. Workplace – Social Network clustered around Work and Colleagues (e.g. City Brokers, Traders)
6. Friends and Family – Social Network clustered around physical social contacts - Friends and Family
7. Enthusiast – Social Network clustered around shared, common interests – Sport, Music, Fashion, etc.
8. Promiscuous – Open Networker – virtual Social Network across all categories - will connect with anybody
Number of Segments
• With anonymous data (e.g. polls), the
number of initial Segments is 4 (Matt
Holland). With named individuals we can
discover much richer internal and external
Social Interaction
How consumers use social media (e.g., Facebook, Twitter) to address and/or engage with companies around social and environmental issues.
Clustering in “Big Data”
“A Cluster is a group of profiled data similarities aggregated closely together”
• Cluster Analysis is a technique used to explore very large volumes of transactional and
machine generated (automatic) data, social media and internet content and information -
in order to discover previously unknown, unrecognised or hidden data relationships.
• Clustering is an essential tool for any “Big Data” problem. Cluster Analysis of both
explicit (given) and implicit (discovered) data relationships in “Big Data” is a critical
technique which attempts to explain the nature, cause and effect of the forces which drive
clustering. Any observed profiled data similarities – geographic or temporal aggregations,
mathematical or statistical distributions – may be explained through Causal Layered Analysis.
– The choice of clustering algorithm and parameters is both process and data dependent
– Approximate Kernel K-means provides a good trade-off between clustering accuracy and
data volumes, throughput, performance and scalability
– Challenges include homogeneous and heterogeneous data (structured versus unstructured
data), data quality, streaming, scalability, cluster cardinality and validity
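For orientation, the sketch below shows plain K-means (Lloyd's algorithm) on a handful of 2-D points. It is deliberately the basic algorithm, not the Approximate Kernel K-means mentioned above, and the initialisation from the first k points, the function name and the sample data are assumptions chosen to keep the example deterministic.

```python
import math


def k_means(points, k, iterations=20):
    """Cluster points into k groups with Lloyd's algorithm.

    Returns (centroids, assignments), where assignments[i] is the index
    of the centroid nearest to points[i].
    """
    centroids = list(points[:k])  # naive deterministic start (assumption)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, assignments


# Hypothetical data: two loose groups of 2-D points.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centroids, assignments = k_means(points, k=2)
```

The result depends on k and on the starting centroids, which is one concrete sense in which the choice of algorithm and parameters is data dependent. Plain K-means also assumes roughly convex groups; kernel variants exist for shapes such as concentric rings.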
Cluster Types
Deep Space Galactic Clusters
Hadoop Cluster – “Big Data” Servers
Molecular Clusters
Geo-Demographic Clusters
Mineral Lode Clusters
• GEODEMOGRAPHIC PROFILING – CLUSTERING IN “BIG DATA” •
• The profiling and analysis of very large aggregated datasets to determine ‘natural’ or
implicit data relationships and discover hidden common factors and data structures -
where no prior assumptions are made concerning the number or type of groups - is
driven by uncovering previously unknown data relationships and natural groupings.
The discovery of such Cluster / Group relationships, hierarchies or internal data
structures is an important starting point forming the basis of many statistical and
analytic applications which are designed to expose hidden data relationships.
• A subsequent explicit Cluster Analysis of previously discovered data relationships is
an important technique which attempts to understand the true nature, cause and
impact of unknown clustering forces driving implicit profile similarities, mathematical
and geographic distributions. Geo-demographic techniques are frequently used in
order to profile and segment Demographic and Spatial data by ‘natural’ groupings –
including common behavioural traits, Clinical Trial, Morbidity or Actuarial outcomes –
along with numerous other shared characteristics and common factors. Cluster
Analysis attempts to understand and explain those natural group affinities and
geographical distributions using methods such as Causal Layered Analysis (CLA).....
Clustering in “Big Data”
Cluster Types
DISCIPLINE: Astrophysics
  CLUSTER TYPE: 4D Distribution of Matter across the Universe through Space and Time
  CLUSTERS: Star Systems, Stellar Clusters, Galaxies, Galactic Clusters
  DIMENSIONS: Mass / Energy, Space / Time
  DATA TYPE: Astronomy Images – Microwave, Infrared, Optical, Ultraviolet, Radio, X-ray, Gamma-ray
  DATA SOURCE: Optical Telescope, Infrared Telescope, Radio Telescope, X-ray Telescope
  CLUSTERING FACTORS / FORCES: Gravity, Dark Matter, Dark Energy, Dark Flow
DISCIPLINE: Climate Change
  CLUSTER TYPE: Temperature Changes, Precipitation Changes, Ice-mass Changes
  CLUSTERS: Hot / Cold, Dry / Wet, More / Less ice
  DIMENSIONS: Temperature, Precipitation, Sea / Land Ice
  DATA TYPE: Average Temperature, Average Precipitation, Greenhouse Gases %
  DATA SOURCE: Weather Station Data, Ice Core Data, Tree-ring Data
  CLUSTERING FACTORS / FORCES: Solar Forcing, Oceanic Forcing, Atmospheric Forcing
DISCIPLINE: Actuarial Science, Morbidity, Clinical Trials, Epidemiology
  CLUSTER TYPE: Place / Date of birth, Place / Date of death, Cause of Death
  CLUSTERS: Birth / Death, Longevity, Cause of Death
  DIMENSIONS: Medical Events, Geography, Time
  DATA TYPE: Biomedical Data, Demographic Data, Geographic data
  DATA SOURCE: Register of Births, Register of Deaths, Medical Records
  CLUSTERING FACTORS / FORCES: Health, Wealth, Demographics
DISCIPLINE: Price Curves, Economic Modelling, Long-range Forecasting
  CLUSTER TYPE: Economic growth, Economic recession
  CLUSTERS: Bull markets, Bear markets
  DIMENSIONS: Monetary Value, Geography, Time
  DATA TYPE: Real (Austrian) GDP, Foreign Exchange Rates, Interest Rates, Price movements, Daily Closing Prices
  DATA SOURCE: Government, Central Banks, Money Markets, Stock Exchange, Commodity Exchange
  CLUSTERING FACTORS / FORCES: Business Cycles, Economic Trends, Market Sentiment, Fear and Greed, Supply / Demand
DISCIPLINE: Business Clusters
  CLUSTER TYPE: Retail Parks, Digital / Fin Tech, Leisure / Tourism, Creative / Academic
  CLUSTERS: Retail, Technology, Resorts, Arts / Sciences
  DIMENSIONS: Company / SIC, Geography, Time
  DATA TYPE: Entrepreneurs, Start-ups, Mergers, Acquisitions
  DATA SOURCE: Investors, NGAs, Government, Academic Bodies
  CLUSTERING FACTORS / FORCES: Capital / Finance, Political policy, Economic policy, Social policy
DISCIPLINE: Elite Team Sports, Performance Science
  CLUSTER TYPE: Winners, Losers
  CLUSTERS: Team / Athlete, Sport / Club, League Tables, Medal Tables
  DIMENSIONS: Sporting Events, Team / Athlete, Sport / Club, Geography, Time
  DATA TYPE: Performance Data, Biomedical Data
  DATA SOURCE: Sports Governing Bodies, RSS News Feeds, Social Media, Hawk-Eye, Pro-Zone
  CLUSTERING FACTORS / FORCES: Technique, Application, Form / Fitness, Ability / Attitude, Training / Coaching, Speed / Endurance
DISCIPLINE: Future Management
  CLUSTER TYPE: Human Activity, Natural Events, Random Events
  CLUSTERS: Waves, Cycles, Patterns, Trends
  DIMENSIONS: Random Events, Geography, Time
  DATA TYPE: Weak Signals, Strong Signals, Wild Card Events, Black Swan Events
  DATA SOURCE: Global Internet Content / Big Data Analytics - Horizon Scanning, Tracking and Monitoring
  CLUSTERING FACTORS / FORCES: Random Events, Waves, Cycles, Patterns, Trends, Extrapolations
Clustering in “Big Data”
• “BIG DATA” ANALYTICS – PROFILING, CLUSTERING and 4D GEOSPATIAL ANALYSIS •
• The profiling and analysis of large aggregated datasets in order to determine a ‘natural’
structure of data relationships or groupings, is an important starting point forming the basis of
many mapping, statistical and analytic applications. Cluster analysis of implicit similarities -
such as time-series demographic or geographic distribution - is a critical technique where no
prior assumptions are made concerning the number or type of groups that may be found, or
their relationships, hierarchies or internal data structures. Geospatial and demographic
techniques are frequently used in order to profile and segment populations by ‘natural’
groupings. Shared characteristics or common factors such as Behaviour / Propensity or
Epidemiology, Clinical, Morbidity and Actuarial outcomes – allow us to discover and explore
previously unknown, concealed or unrecognised insights, patterns, trends or data relationships.
• PREDICTIVE ANALYTICS and EVENT FORECASTING •
• Predictive Analytics and Event Forecasting use Horizon Scanning, Tracking and Monitoring
methods combined with Cycle, Pattern and Trend Analysis techniques for Event Forecasting
and Propensity Models in order to anticipate a wide range of business, economic, social and
political Future Events – ranging from micro-economic Market phenomena such as forecasting
Market Sentiment and Price Curve movements - to large-scale macro-economic Fiscal
phenomena using Weak Signal processing to predict future Wild Card and Black Swan Events
- such as Monetary System shocks.
Cluster Analysis
• Data Representation
– Metadata - identifying common Data Objects, Types and Formats
• Data Taxonomy and Classification
– Similarity Matrix (labelled data)
– Grouping of explicit data relationships
• Data Audit - given any collection of labelled objects.....
– Identifying relationships between discrete data items
– Identifying common data features - values and ranges
– Identifying unusual data features - outliers and exceptions
• Data Profiling and Clustering - given any collection of unlabelled objects.....
– Pattern Matrix (unlabelled data)
– Discover implicit data relationships
– Find meaningful groupings in Data (Clusters)
– Predictive Analytics – Bayesian Event Forecasting
– Wave-form Analytics – Periodicity, Cycles and Trends
– Explore hidden relationships between discrete data features
Many big data problems feature unlabelled objects
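The two data representations named above can be related with a short sketch: a pattern matrix holds one row of feature values per object, and a similarity matrix holds one score for every pair of objects. The example derives the second from the first using cosine similarity. The choice of cosine similarity, the function names and the feature values are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(pattern_matrix):
    """Build the n x n similarity matrix from an n x d pattern matrix."""
    return [
        [cosine_similarity(row_i, row_j) for row_j in pattern_matrix]
        for row_i in pattern_matrix
    ]


# Hypothetical pattern matrix: three unlabelled objects, two features each.
pattern_matrix = [
    [1.0, 0.0],
    [2.0, 0.0],
    [0.0, 3.0],
]
similarity = similarity_matrix(pattern_matrix)
```

The first two objects point in the same direction and score as highly similar, while the third shares nothing with them. Many clustering algorithms accept either form as input.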
k-means/Gaussian-Mixture Clustering of Audio Segments
Cluster Analysis
Clustering Algorithms
Hundreds of spatial, mathematical and statistical clustering algorithms are available –
many clustering algorithms are “admissible” – but no single algorithm alone is “optimal”
• K-means
• Gaussian mixture models
• Kernel K-means
• Spectral Clustering
• Nearest neighbour
• Latent Dirichlet Allocation
Challenges in “Big Data” Clustering
• Data quality
• Volume – number of data items
• Cardinality – number of clusters
• Synergy – measures of similarity
• Values – outliers and exceptions
• Cluster accuracy - validity and verification
• Homogeneous versus heterogeneous data (structured and unstructured data)
Distributed Clustering Model Performance
Clustering 100,000 2-D points with 2 clusters on 2.3 GHz quad-core
Intel Xeon processors, with 8GB memory in intel07 cluster
Network communication cost increases with the no. of processors
Distributed Clustering Models: K-means and Kernel K-means
Number of processors   Speedup Factor - K-means   Speedup Factor - Kernel K-means
2                      1.1                        1.3
3                      2.4                        1.5
4                      3.1                        1.6
5                      3.0                        3.8
6                      3.1                        1.9
7                      3.3                        1.5
8                      1.2                        1.5
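The speedup figures above can be turned into parallel efficiency (speedup divided by the number of processors), which makes the growing cost of network communication easier to see. The speedup values are copied from the table; the efficiency calculation and the variable names are additions for illustration.

```python
# Speedup factors from the table: processors -> (K-means, Kernel K-means).
speedups = {
    2: (1.1, 1.3),
    3: (2.4, 1.5),
    4: (3.1, 1.6),
    5: (3.0, 3.8),
    6: (3.1, 1.9),
    7: (3.3, 1.5),
    8: (1.2, 1.5),
}


def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processors


efficiency = {
    n: (parallel_efficiency(k_means, n), parallel_efficiency(kernel, n))
    for n, (k_means, kernel) in speedups.items()
}

# Processor counts giving the highest reported speedup for each model.
best_k_means = max(speedups, key=lambda n: speedups[n][0])
best_kernel = max(speedups, key=lambda n: speedups[n][1])
```

On these figures no configuration reaches an efficiency of 1.0, and at 8 processors K-means drops to roughly 15% of ideal, which is consistent with the note that communication cost rises with the processor count.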
Distributed Approximate Kernel K-means
2-D data set with 2 concentric circles
2.3 GHz quad-core Intel Xeon processors, with 8GB memory in intel07 cluster
Run-time
Size of dataset (no. of Records)   Benchmark Performance (Speedup Factor)
10K                                3.8
100K                               4.8
1M                                 3.8
10M                                6.4
HPCC Clustering Models
High Performance / High Concurrence Real-time Delivery (HPCC)
Distributed Clustering Models
The Cone™ – Brand Loyalty / Affinity

Digital Marketing 101 5496prashantnandan1
 
Audiences hamzaa new
Audiences hamzaa newAudiences hamzaa new
Audiences hamzaa newkinghamzaa
 
Audiences hamzaa new
Audiences hamzaa newAudiences hamzaa new
Audiences hamzaa newkinghamzaa
 
Audiences hamzaa new
Audiences hamzaa newAudiences hamzaa new
Audiences hamzaa newkinghamzaa
 
Webinar Presentation on Communication Design
Webinar Presentation on Communication DesignWebinar Presentation on Communication Design
Webinar Presentation on Communication DesignPearlAcademy India
 

Similar to Cone TM Digital Marketing - Principles PDF (20)

Enliven campaign brief
Enliven campaign briefEnliven campaign brief
Enliven campaign brief
 
Lyns new powerpont new
Lyns new powerpont newLyns new powerpont new
Lyns new powerpont new
 
Marketing for Artists: Best Practices for Fine Artists on Social Media
Marketing for Artists: Best Practices for Fine Artists on Social Media Marketing for Artists: Best Practices for Fine Artists on Social Media
Marketing for Artists: Best Practices for Fine Artists on Social Media
 
Film and audience
Film and audienceFilm and audience
Film and audience
 
Silent Night
Silent NightSilent Night
Silent Night
 
Culture Vulture, Entertainment – inspiring original thinking through a deeper...
Culture Vulture, Entertainment – inspiring original thinking through a deeper...Culture Vulture, Entertainment – inspiring original thinking through a deeper...
Culture Vulture, Entertainment – inspiring original thinking through a deeper...
 
films and audiences
films and audiences films and audiences
films and audiences
 
Podcasting Environment Nov 2008
Podcasting Environment Nov 2008Podcasting Environment Nov 2008
Podcasting Environment Nov 2008
 
Bbdo big idea_today
Bbdo big idea_todayBbdo big idea_today
Bbdo big idea_today
 
About Senserit - Multi-media organizations at the intersection of health + de...
About Senserit - Multi-media organizations at the intersection of health + de...About Senserit - Multi-media organizations at the intersection of health + de...
About Senserit - Multi-media organizations at the intersection of health + de...
 
assignment 3
assignment 3assignment 3
assignment 3
 
Media Audiences an Introduction
Media Audiences an IntroductionMedia Audiences an Introduction
Media Audiences an Introduction
 
Capstone(V2)
Capstone(V2)Capstone(V2)
Capstone(V2)
 
Audiences & institutions
Audiences & institutionsAudiences & institutions
Audiences & institutions
 
Digital Marketing 101 5496
Digital Marketing 101 5496Digital Marketing 101 5496
Digital Marketing 101 5496
 
Digital + Marketing 101
Digital + Marketing 101Digital + Marketing 101
Digital + Marketing 101
 
Audiences hamzaa new
Audiences hamzaa newAudiences hamzaa new
Audiences hamzaa new
 
Audiences hamzaa new
Audiences hamzaa newAudiences hamzaa new
Audiences hamzaa new
 
Audiences hamzaa new
Audiences hamzaa newAudiences hamzaa new
Audiences hamzaa new
 
Webinar Presentation on Communication Design
Webinar Presentation on Communication DesignWebinar Presentation on Communication Design
Webinar Presentation on Communication Design
 

More from Nigel Tebbutt 奈杰尔 泰巴德

Strategic Foresight Plaform - Training and Education Modules (TEM) PDF
Strategic Foresight Plaform - Training and Education Modules (TEM) PDFStrategic Foresight Plaform - Training and Education Modules (TEM) PDF
Strategic Foresight Plaform - Training and Education Modules (TEM) PDFNigel Tebbutt 奈杰尔 泰巴德
 
Strategic Foresight Plaform - Training and Education Modules (TEM) PDF
Strategic Foresight Plaform - Training and Education Modules (TEM) PDFStrategic Foresight Plaform - Training and Education Modules (TEM) PDF
Strategic Foresight Plaform - Training and Education Modules (TEM) PDFNigel Tebbutt 奈杰尔 泰巴德
 

More from Nigel Tebbutt 奈杰尔 泰巴德 (15)

Pyramid™‏Digital Marketing PDF
Pyramid™‏Digital Marketing PDFPyramid™‏Digital Marketing PDF
Pyramid™‏Digital Marketing PDF
 
Connected Fashion™ Final‏
Connected Fashion™ Final‏Connected Fashion™ Final‏
Connected Fashion™ Final‏
 
Strategic Foresight Plaform - Training and Education Modules (TEM) PDF
Strategic Foresight Plaform - Training and Education Modules (TEM) PDFStrategic Foresight Plaform - Training and Education Modules (TEM) PDF
Strategic Foresight Plaform - Training and Education Modules (TEM) PDF
 
Strategic Foresight Plaform - Training and Education Modules (TEM) PDF
Strategic Foresight Plaform - Training and Education Modules (TEM) PDFStrategic Foresight Plaform - Training and Education Modules (TEM) PDF
Strategic Foresight Plaform - Training and Education Modules (TEM) PDF
 
4D Geospatial Analytics in Digital Healthcare PDF
4D Geospatial Analytics in Digital Healthcare PDF4D Geospatial Analytics in Digital Healthcare PDF
4D Geospatial Analytics in Digital Healthcare PDF
 
Enterprise Risk Management 2015 PDF
Enterprise Risk Management 2015 PDFEnterprise Risk Management 2015 PDF
Enterprise Risk Management 2015 PDF
 
Ghost in the Machine 2015 - Workbook PDF
Ghost in the Machine 2015 - Workbook PDFGhost in the Machine 2015 - Workbook PDF
Ghost in the Machine 2015 - Workbook PDF
 
Ghost in the Machine 2015 - Principles PDF
Ghost in the Machine 2015 - Principles PDFGhost in the Machine 2015 - Principles PDF
Ghost in the Machine 2015 - Principles PDF
 
Thinking about the Future 3 - Scenarios and Use Cases PDF
Thinking about the Future 3 - Scenarios and Use Cases PDFThinking about the Future 3 - Scenarios and Use Cases PDF
Thinking about the Future 3 - Scenarios and Use Cases PDF
 
Thinking about the Future 3 - Principles PDF
Thinking about the Future 3 - Principles PDFThinking about the Future 3 - Principles PDF
Thinking about the Future 3 - Principles PDF
 
Nigel Tebbutt Profile - Fin Tech PDF
Nigel Tebbutt Profile - Fin Tech PDFNigel Tebbutt Profile - Fin Tech PDF
Nigel Tebbutt Profile - Fin Tech PDF
 
Business Cycles, Patterns and Trends Version 6 PDF
Business Cycles, Patterns and Trends Version 6 PDFBusiness Cycles, Patterns and Trends Version 6 PDF
Business Cycles, Patterns and Trends Version 6 PDF
 
Future Homes Business Model PDF
Future Homes Business Model PDFFuture Homes Business Model PDF
Future Homes Business Model PDF
 
The Internet of Things (IoT) PDF
The Internet of Things (IoT) PDFThe Internet of Things (IoT) PDF
The Internet of Things (IoT) PDF
 
Digital Healthcare - Detailed Presentation PDF
Digital Healthcare - Detailed Presentation PDFDigital Healthcare - Detailed Presentation PDF
Digital Healthcare - Detailed Presentation PDF
 

Cone TM Digital Marketing - Principles PDF

  • 2. Digital Transformation: "Throughout eternity, all that is of like form comes around again – everything that is the same must return again in its own everlasting cycle." Marcus Aurelius, Emperor of Rome
  • 3. Digital Product Lifecycle Strategy • Everything that goes around, comes around – everything has its own lifecycle, in its own time. Things are born, grow, age, and ultimately die. It’s easy to spot a lifecycle in action everywhere you look. Just as a person is born, grows, ages, and dies, so does a star, a tree, a bird, a bee, or a civilization, and so does a company, a product, a technology or a market. Everything goes around in a lifecycle of its own.
  • 4. Digital Product Lifecycle Strategy [Diagram labels: Investment; Product Lifecycle; Product Design; Product Launch; Product Planning; Death; Plateau; Product Maturity; Decline; Aging; Early Growth; Migrate Customers to new Products; Withdraw; Innovation; Prototype / Pilot / Proof-of-concept; Cash Cow; Cease Investment]
  • 7. The CONE™ - Social Intelligence: Getting to the heart of audiences - and putting audiences back at the heart of marketing.
  • 8. The CONE™ - Audience Measurement • Due to severe competition, Communications Service Providers (CSPs) such as 3 Mobile, EE, Talk-Talk and Vodafone, along with Mobile Virtual Network Operators (MVNOs) such as Virgin, Tesco and Giff-gaff, no longer make significant profit from their core services (Mobile, Fixed-line and Broadband). This has caused the dash for “Quad-play”, where CSPs add Media and Entertainment Packages to their core network services offering. • TV Set-top Boxes (Virgin, Talk-Talk, Sky, EE) are connected to the Internet and continuously stream Audience Channel Selection data and Music Play-lists to the CSP's Audience Insight and Analytics servers. Similarly, Smart Phone Apps (BBC iPlayer, Sky Go, Netflix, Spotify) continuously stream Audience Channel Selection data and Music Play-lists to the CSP, via Apigee to AWS Big Data. • In a typical household (Mother, Father, two children) there may be four Smart Phones and as many as ten other internet-connected devices (Tablets, Laptops, Internet TVs, TV Set-top Boxes and Video Games Boxes), all streaming video, audio and data. The details are captured, stored and analysed by the CSP using “Big Data” Analytics techniques. This yields Audience Metrics and Analytics based on an intimate understanding of the video, audio and internet content that consumers stream. The actionable audience insights derived from this streaming data drive Personalised Advertising across all devices (Smart Phone, Tablet, Internet TV, Games Boxes).
  • 10. The CONE™ - Social Intelligence. This revolutionary Digital Marketing approach is called the Cone™, a next-generation Social Intelligence solution for real-time lifestyle understanding: • The Cone™ solution uses Social Intelligence to get right to the heart of every audience, and puts the audience back at the heart of every media organisation. • The Cone™ Digital Marketing solution works through Real-time Analytics, tuning directly into the dynamic nature of people, fashion, media and culture. • The Cone™ solution analyses intimate audience viewing behaviour using Social Intelligence and Real-time Insight, inspiring better digital marketing campaigns, faster: ideas which connect directly with the widest possible network audience. • Most importantly, the Cone™ solution tracks and understands the changing behaviour of viewers, fans and audiences and their propensity to engage with different ideas, lifestyles, interests, needs, passions, aspirations and desires.
  • 11. 21st Century Lifestyle Understanding: Cone™ Fan Base Understanding. Fanatics (10%), Enthusiasts (20%), Casuals (30%), Indifferent (40%). ©2013 Innovation Pipeline
  • 12. The CONE™ - a New Lens. Today we can view audiences through a better lens than traditional segmentation provides. We call this lens the Cone™. The Cone™ visualises the volume and behaviour of a user-defined audience. Viewed in this way, an audience's behaviours and volumes are spread across the Cone™ spectrum, which segments the audience by its propensity to engage. [Diagram labels: Scene Setters; Restless; Contented]
  • 13. Cone™ Lifestyle Understanding. What is ‘The Cone’? • At its simplest, The Cone™ is a visual metaphor that maps the volume of audiences across an engagement spectrum, according to how people connect with different passions and ideas. • At its most sophisticated, the Cone™ delivers total entertainment digital innovation. Why a Cone? • The Cone™ shape is informed by the correlation between the volume of audiences and their propensity to engage with different passions. This Cone shape proves to be universal in its application to brands, ideas and industries that have ‘fans’, i.e. 1. The thin, pointy end of the Cone™: low audience volume but incredibly high engagement, and therefore high ‘purchase intent’. 2. The fat, base end of the Cone™: high audience volume but low engagement, and therefore much lower ‘purchase intent’. • We use our proprietary IP to produce The Cone™ for industries and clients that have fans (or at least where people engage through ‘passionate interest’ rather than mere ‘consumption’). Thus The Cone™ maps people as fans and audiences with active interests, needs and desires, not just as passive consumers.
  • 14. Cone™ Lifestyle Understanding. Fanatics (10%) - Core fans, including cultural arbiters, trend setters, curators, editors. Enthusiasts (20%) - Social amplifiers, restless for the new, who enjoy the discovery and social kudos of feeling and “being first”. Casuals (30%) - The wider market, happy to be influenced by others and open to engagement through social influence. Indifferent (40%) - Generally agnostic, uninterested and indifferent to the ideas in question.
  • 15. Cone™ Lifestyle Understanding. How does the Cone work? • The principle of The Cone™ Audience Metrics & Analytics Solution is first to understand people’s lives, and then to understand the role that different entertainment concepts and content play in those lives. This narrative of understanding gives us unique insights. Knowing who ideas are connecting with, and why, supports better and more incisive decisions and inspires creative marketing. We then apply The Cone™ creative inspiration to innovate compelling propositions and ideas that will connect with the widest possible audiences. • On the surface, The Cone™ profiles people’s propensity to engage with any given lens (e.g. film, reality TV, music, radio, mobile) along our FECI continuum, ranging from Fanatics through Enthusiasts to Casuals and “Indifferent”, and finally the “Unconnected”. We then use proprietary data analytics to profile and describe groups of similar people within the FECI continuum. • The Cone™ shows how groups of like-minded individuals are connecting (or not connecting) with our brand and content. We can use these intimate personal insights to inspire the right kinds of ideas and events, better target brand positioning and product content, and influence more receptive audiences, so delivering new core fan connections that drive an expanding and increasingly loyal fan base.
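As a rough illustration of placing people along the FECI continuum, the sketch below ranks individuals by an engagement score and cuts the ranking at the Cone™ volume shares. It is only a sketch under stated assumptions: the 10/20/30/40 split is taken from the Cone™ chart labels, while the engagement score, the function name `assign_feci` and the rank-and-cut rule are hypothetical and are not the proprietary Cone™ analytics.

```python
from typing import Dict, List, Tuple

# Assumed volume shares, ordered from most to least engaged (from the chart labels).
FECI_SHARES: List[Tuple[str, float]] = [
    ("Fanatics", 0.10),
    ("Enthusiasts", 0.20),
    ("Casuals", 0.30),
    ("Indifferent", 0.40),
]


def assign_feci(scores: Dict[str, float]) -> Dict[str, str]:
    """Rank people by a (hypothetical) engagement score and cut at the FECI shares."""
    ranked = sorted(scores, key=scores.get, reverse=True)  # most engaged first
    n = len(ranked)
    tiers: Dict[str, str] = {}
    start = 0
    cumulative = 0.0
    for index, (tier, share) in enumerate(FECI_SHARES):
        cumulative += share
        # The last tier takes everyone who is left, so nobody is dropped by rounding.
        end = n if index == len(FECI_SHARES) - 1 else round(cumulative * n)
        for person in ranked[start:end]:
            tiers[person] = tier
        start = end
    return tiers


if __name__ == "__main__":
    demo = {f"p{i}": float(i) for i in range(10)}  # made-up scores
    for person, tier in sorted(assign_feci(demo).items()):
        print(person, tier)
```

A percentile cut like this fixes the segment volumes in advance; a clustering approach would instead let the data decide where the boundaries fall.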
  • 17. The CONE™ - BBC Radio 1. Cone™ Innovation - BBC Radio 1, 2002-05 • In 2002, BBC Radio 1, the UK’s no.1 youth radio brand (now globally streamed to millions), was in danger of losing its public service licence. Listener volume was in decline, with a total RAJAR audience of circa 7 million. Radio 1 had become disconnected from its core audiences. • We were asked to help innovate the total transformation of ideas, creativity and environment to return Radio 1 to its pre-eminent place in youth culture. • Central to Radio 1’s innovative revival was a new lens through which to view the Radio 1 audience. This lens helped us understand audience engagement through behaviour rather than fixed demographics.
  • 18. Sony Music 2007-2011 - Audience Cone™ / Artist DNA • The key to success at Sony Music was using the Audience Cone™ and Artist DNA to help A&R Managers and Producers understand the role music plays in people's lives, and then understand the impact of any particular genre or specific artist within that audience and cultural context. • We provided a unique approach to make sense of Digital Marketing and Social Intelligence as part of an artist's musical and career development. We called it the Artist DNA: a tool which supports the insightful creative foundation for all artist releases, tours, appearances and campaigns. • Today the Cone™ App, our proprietary solution using the Audience Cone™ and Artist DNA approach, is used by Sony Music in 32 global territories. It places the audience back at the heart of Sony Music and puts the artists back at the heart of their audiences, attracting new fans and re-connecting with old fans to give the widest possible audience and fan-base.
  • 19. The Challenge – American Idol, 2014 • Analyse the Reality TV audience spectrum so that we can better understand who American Idol fans are, and therefore gain insight into how we can halt the audience decline of 2014. • There is a very real and present Reality TV Cone, because there exist distinct Reality TV audience clusters: discrete groups of people who engage with Reality TV in a variety of different ways. • Reality TV is a well understood lens into how people live out their own lives (they might not admit this), so we can understand viewers' lives and lifestyles and engage them through the Reality TV lens. • We can map this lens through our Fanatics, Enthusiasts, Casuals and Indifferent (FECI) spectrum in order to place each individual along a continuum of audience interest, affinity, loyalty and engagement. • We can then profile and segment these people into different groups along the FECI spectrum, and so identify those within these groups who have a greater propensity and appetite for American Idol: – Viewers with an increased or decreased awareness of the Reality TV genre – Viewers with a higher or lower interest in Reality TV shows / media coverage – Viewers with a greater or lesser knowledge of Reality TV presenters / participants – Viewers who invest more or less time in consuming Reality TV live / streamed content
  • 20. The CONE™ - American Idol, 2014. Cone™ Innovation – American Idol, 2014: 1. Fanatics - 10%: Know about each contestant in every show and devote time to Reality TV. Primarily live viewers. 2. Enthusiasts - 26%: Buy very much into Reality TV. Have other passions. Love social media ‘second screening’. 3. Casuals - 42%: A more diverse group. Reality TV is only one part of their busy lives. Will engage if it meets their needs and values. American Idol 2014 over-indexed on “Casuals” but under-indexed on Audience Total. 4. Indifferent - 22%: “Indifferent” viewers interact with the brand when there are other brand Fans within their social network who act as “Influencers”. American Idol 2014 under-indexed on both “Indifferent” and Audience Total. 5. Unconnected: Huge marketplace. Generally, “Unconnected” viewers only connect with the brand if there are other brand advocates within their social network who act as influencers or “Introducers” to Reality TV series. [Chart: Fanatics 10%, Enthusiasts 26%, Casuals 42%, Indifferent 22%] The Challenge – American Idol, 2014: Analyse the Reality TV audience so that we can better understand who American Idol fans are, and therefore gain insight into how we can halt the audience decline of 2014. • There is a Reality TV Cone because there exist discrete groups of people who engage with Reality TV in different ways. • Reality TV is a well understood lens in people's lives (they might not admit this, but we can view their lives through this Reality TV lens). • We can map this lens through our Fanatics, Enthusiasts, Casuals and Indifferent (FECI) continuum in order to place every individual along the spectrum of audience engagement.
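The "over-indexed" and "under-indexed" statements can be illustrated with a simple index calculation. The sketch below assumes, since the deck does not say what the comparison base is, that each American Idol 2014 segment share is compared with the generic Cone™ split of 10/20/30/40. An index of 100 means parity; the names `BASELINE_CONE` and `segment_index` are hypothetical.

```python
from typing import Dict

# Assumed comparison base: the generic Cone split shown elsewhere in the deck.
BASELINE_CONE: Dict[str, int] = {
    "Fanatics": 10, "Enthusiasts": 20, "Casuals": 30, "Indifferent": 40,
}
# Segment shares quoted for American Idol, 2014.
AMERICAN_IDOL_2014: Dict[str, int] = {
    "Fanatics": 10, "Enthusiasts": 26, "Casuals": 42, "Indifferent": 22,
}


def segment_index(audience: Dict[str, int], baseline: Dict[str, int]) -> Dict[str, int]:
    """Index each segment share against the baseline share (100 = parity)."""
    return {seg: round(100 * audience[seg] / baseline[seg]) for seg in baseline}


if __name__ == "__main__":
    for seg, idx in segment_index(AMERICAN_IDOL_2014, BASELINE_CONE).items():
        status = "over" if idx > 100 else "under" if idx < 100 else "at parity"
        print(f"{seg}: index {idx} ({status})")
```

Under this assumed base, Casuals index above 100 and Indifferent below 100, which is consistent with the direction of the statements on the slide. The "Audience Total" comparison concerns absolute volume and is not modelled here.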
  • 21. Cone™ Fan Base Understanding
  • 22. The Cone™ Application • Where old-school audience analysis was retrospective and fixed, the new Cone™ data science is lean, agile, current, fluid and predictive. • The Cone™ App takes our proven Audience Cone™ and Artist DNA approach and puts it on-line to render a custom lens for an audience: a lens you can zoom, pan and focus to reveal more hidden detail. • The Cone™ App applies data science and digital analytics principles to generate innovative marketing insights, translated into a narrative of real-time audience understanding that answers the six key questions: 1. What’s happening now? 2. Who’s making it happen? 3. Where is it happening? 4. Why is it happening? 5. When is it happening? 6. How is it happening?
  • 23. The Cone™ Application [Diagram labels: Social Intelligence Cloud; CRM Data; Profile Data; CRM / CEM; Big Data Analytics; Customer Management (CRM / CEM); Social Intelligence; Campaign Management; e-Business; The Cone™ Customer Loyalty & Brand Affinity; The Cone™ Smart Apps; Audience Survey Data; Insights Reports; TV Set-top Box]
  • 24. Proof-of-concept and Prototype. The Cone™ approach is lean, agile, smart and creative: • We start by providing a custom Cone™ app as a proof of concept. We then work with the client's key stakeholders to scope a detailed brief which articulates a business problem domain that the Cone™ can help resolve. • Under normal circumstances we use all current and past audience research and any other available internal data to first establish a baseline client Cone™. • We then augment this by overlaying external data: Social Media Intelligence and other live streamed audience data that provide our new real-time view of the who / what / why / where / when and how of fan-base and lifestyle understanding. • Lastly, we apply this social intelligence understanding as new actionable insights to inform creative marketing campaign solutions against the agreed brief. • Post proof-of-concept, we then agree a Cone™ app fixed-term licence along with Cone™ consulting, mentoring and support, on demand, as and when required.
  • 25. The Cone™ – Model Design and Delivery
    Phase | Step | Description | Input | Design Process | Output | Cost (estimate) | Skill Set
    1 | 1 | Cone™ Model Data Analysis / Design | User Requirements | Data Analysis & Data Modelling | Cone™ Logical Data Model | £k | Business / Data Analyst
    1 | 2 | Cone™ Data Design – Questionnaire | User Requirements | Data Analysis & Data Modelling | Questionnaire Survey Form | £k | Business / Data Analyst
    1 | 3 | Cone™ Physical Database Design | Logical Data Model | Cone™ Database Design | Physical Cone™ Design | £k | Data Analyst / DBA
    1 | 4 | Cone™ Data Load – Questionnaire / Survey Forms | Physical Data Model, Survey Questionnaire | Cone™ Model Calibration and Tuning Runs | Initialised Cone™ Model | £k | Business / Data Analyst, DBA
    2 | 5 | Cone™ Data Load – In-house CRM and Audience Data | Physical Data Model, People CRM Data | Cone™ Model CRM Data Load | Populated Cone™ Model | £k | Business / Data Analyst, DBA
    2 | 6 | Cone™ Profiling | Cone™ Clustering Algorithms | Cone™ Model Data Profiling – Kernel k-means | Profiled Cone™ Model | £k | Data Analyst, DBA, Data Scientists
    3 | 7 | Cone™ Streaming and Segmentation | Historic Sales and CRM Data | Cone™ History Matching Runs | Cone™ Historic Trends | £k | Data Scientists
    3 | 8 | Cone™ Real-time Social Media Feeds | Global Social Intelligence | Cone™ Real-Time Analytics | Actionable Cone™ Insights | (variable with Cone™ total data volume) | Data Scientists
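Step 6 names kernel k-means as the profiling technique. The sketch below is a generic textbook kernel k-means with an RBF kernel, written in plain Python. It is not the proprietary Cone™ clustering algorithm; the feature vectors, the `gamma` value and the initial labels are all hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float) -> float:
    """Gaussian (RBF) similarity between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def kernel_kmeans(points: List[Sequence[float]], k: int, init_labels: List[int],
                  gamma: float = 0.1, max_iter: int = 50) -> List[int]:
    """Cluster points in kernel feature space, starting from init_labels."""
    n = len(points)
    K = [[rbf_kernel(points[i], points[j], gamma) for j in range(n)] for i in range(n)]
    labels = list(init_labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Mean pairwise kernel value inside each cluster (squared norm of its centroid).
        within = [sum(K[a][b] for a in m for b in m) / len(m) ** 2 if m else 0.0
                  for m in members]
        new_labels = []
        for i in range(n):
            best_cluster, best_dist = labels[i], math.inf
            for c, m in enumerate(members):
                if not m:
                    continue  # skip empty clusters
                cross = sum(K[i][j] for j in m) / len(m)
                # Squared feature-space distance from point i to the centroid of c.
                dist = K[i][i] - 2.0 * cross + within[c]
                if dist < best_dist:
                    best_cluster, best_dist = c, dist
            new_labels.append(best_cluster)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


if __name__ == "__main__":
    # Two made-up groups of audience feature vectors.
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(kernel_kmeans(data, k=2, init_labels=[0, 1, 0, 1, 0, 1]))
```

Because distances are computed from the kernel matrix alone, the same loop works with any positive-definite kernel, which is the usual reason for choosing kernel k-means over plain k-means when segments are not linearly separable.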
  • 27. The Cone™ – Digital Marketing – turning Social Intelligence into Actionable Marketing Insights / Sales Opportunities. Business Scenarios and Use Cases: 1. Education Cone™ – Training and Education 2. Utilities Cone™ – Water, Gas and Electricity 3. Media Cone™ – Broadband, Land-line, Mobile and Entertainment 4. Music Cone™ – Brand / Genre / Label / Artists 5. Political Cone™ – Party and Voter Election 6. Fashion Cone™ – Fashion and Luxury Brands 7. Sports Cone™ – Elite Team Sports Franchise 8. Patient Cone™ – Digital Healthcare / Medical
  • 29. Telematics The Internet of Things (IoT) – Smart Devices, Smart Apps, Wearable Technology, Vehicle Telemetry, Smart Homes and Building Automation SMACT/4D Digital Technology Stack
  • 31. Social Intelligence – Brand Loyalty and Affinity. CONE SEGMENTS – Brand Loyalty and Affinity: Social Intelligence drives Brand Loyalty and Affinity, Lifestyle Understanding, Fan-base Profiling, Streaming and Segmentation, and marketing Campaigns. It is expressed in the creation and maintenance of a detailed History and Balanced Scorecard for every individual in the Cone, allowing summation by Stream / Segment: 1. Inactive – need to draw their attention towards the Brand 2. Indifferent – need to educate them about core Brand Values 3. Disconnected – need to re-engage with the Brand 4. Casuals – exhibit Brand awareness and interest 5. Followers – follow the Brand, engage with social media and consume brand communications 6. Enthusiasts – engaged with the Brand, participate in Brand / Product / Media events and merchandising 7. Supporters – show strong need, desire and propensity to support Brand / Product / Media consumption 8. Fanatics – demonstrate total Commitment / Dedication / Loyalty for all aspects of the Brand / Product / Media. PROPENSITY – Balanced Scorecard • Balanced Scorecard – a summary of all the data-points for an Individual / Stream / Segment • Propensity Score – in the statistical analysis of observational data, Propensity Score Matching (PSM) is a statistical matching technique that attempts to estimate the effect of a Campaign / Offer / Promotion or other intervention by accounting for the factors that predict who receives that Campaign / Offer / Promotion. • Propensity Model – the Bayesian probability of the outcome of an event for an Individual / Stream / Segment • Predictive Analytics – an area of data mining that deals with extracting information from data and using it to predict trends and behaviour patterns. Often the unknown event of interest is in the future, but Predictive Analytics can be applied to any type of event with an unknown outcome, whether in the past, present or future.
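The Propensity Model is described as a Bayesian probability of an outcome for an Individual / Stream / Segment. One common way to sketch that idea, assumed here rather than taken from the deck, is a Beta-Binomial update: start from a prior belief about a segment's response rate and update it with observed campaign responses. The function name, the uniform Beta(1, 1) prior and the example counts are hypothetical.

```python
def posterior_propensity(responses: int, contacts: int,
                         prior_a: float = 1.0, prior_b: float = 1.0) -> float:
    """Posterior mean response probability under a Beta(prior_a, prior_b) prior.

    responses: number of people in the segment who responded to the campaign
    contacts:  number of people in the segment who were contacted
    """
    if responses < 0 or contacts < responses:
        raise ValueError("need 0 <= responses <= contacts")
    # Beta prior + Binomial likelihood gives a Beta posterior; return its mean.
    return (prior_a + responses) / (prior_a + prior_b + contacts)


if __name__ == "__main__":
    # Made-up example: 3 responses out of 10 contacted Enthusiasts.
    print(posterior_propensity(3, 10))
    # With no data at all, the estimate falls back to the prior mean.
    print(posterior_propensity(0, 0))
```

The prior keeps small segments from swinging to 0% or 100% after a handful of contacts, which matters when scorecards are summed by Stream / Segment.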
  • 32. Social Intelligence – Streaming and Segmentation. The Cone™ – Streaming & Segmentation. Hybrid Cone – 3 Dimensions: Social Interaction; Brand Affinity; Geo-demographic Profile (Experian Mosaic – 15 Groups (Streams), 66 Types (Segments)).
  • 33. Social Intelligence – Social Interaction. Social Interaction Cone Rules: 1. Inactive – not engaged – low evidence / low affinity / low interest in Social Media 2. Lone Wolf – sparse / thin social network – may share negative information (Trolling) 3. Home Boy – Social Network clustered around Home Location Postcodes (Gang Culture) 4. Eternal Student – Social Network clustered around School / College / University Alumni 5. Workplace – Social Network clustered around Work and Colleagues (e.g. City Brokers, Traders) 6. Friends and Family – Social Network clustered around physical social contacts (Friends and Family) 7. Enthusiast – Social Network clustered around shared, common interests (Sport, Music, Fashion etc.) 8. Promiscuous – Open Networker – virtual Social Network across all categories – will connect with anybody. Number of Segments • With anonymous data (e.g. surveys and polls), the number of initial Segments is 4 (Matt Hart). With people data (named individuals) we can discover much richer internal and external data from multiple sources (Social Media / User Content / Experian), and therefore segment the population with greater granularity. Individuals Qualifying for Multiple Segments • When individuals qualify for multiple segments, we can either add these deviant (non-standard) individuals to the Segment with which they have the greatest affinity, or move them into an Outlying / Outcast / Miscellaneous Segment for further statistical processing or for processing through manual intervention.
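The rule for individuals who qualify for multiple segments can be sketched as follows. The slide offers two options (greatest affinity, or an Outlying segment); this illustration combines them by sending a person to the top segment only when it clearly leads the runner-up. The affinity scores, the `threshold` and `min_margin` values and the function name are hypothetical assumptions, not part of the Cone™ rules.

```python
from typing import Dict

OUTLYING = "Outlying"  # catch-all segment for manual or further statistical review


def resolve_segment(affinity: Dict[str, float], threshold: float = 0.5,
                    min_margin: float = 0.1) -> str:
    """Pick one segment for a person from their per-segment affinity scores."""
    # Segments the person qualifies for, best score first.
    qualifying = sorted(
        ((score, seg) for seg, score in affinity.items() if score >= threshold),
        reverse=True,
    )
    if not qualifying:
        return OUTLYING  # qualifies for nothing
    if len(qualifying) == 1:
        return qualifying[0][1]  # unambiguous
    (top_score, top_seg), (second_score, _) = qualifying[0], qualifying[1]
    # Multiple segments: keep the greatest affinity only if it clearly leads.
    return top_seg if top_score - second_score >= min_margin else OUTLYING


if __name__ == "__main__":
    print(resolve_segment({"Workplace": 0.9, "Enthusiast": 0.6}))
    print(resolve_segment({"Workplace": 0.70, "Enthusiast": 0.68}))
```

Setting `min_margin` to zero reduces this to the pure "greatest affinity" option; raising it pushes more ambiguous individuals into the Outlying segment.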
  • 34. Social Intelligence – Actionable Insights. The Cone™ – Actionable Insights. Hybrid Cone – 3 Dimensions: Brand Affinity; Social Interaction; Geo-demographic Profile (Experian Mosaic – 15 Groups (Segments), 66 Types (Streams)). The Cone™ Brand Loyalty & Affinity: Fanatics - 10%, Enthusiasts - 20%, Casuals - 30%, Indifferent - 40%.
  • 35. Social Interaction: how consumers use social media (e.g. Facebook, Twitter) to address and/or engage with companies around social and environmental issues.
  • 37. The chart above illustrates the richness and diversity of social media.....
  • 38. The pattern of Social Relationships..... Social Media is the fastest growing category of user-provided global content and will eventually grow to 20% of all internet content. Gartner defines social media content as unstructured data created, edited and published by users on external platforms including Facebook, MySpace, LinkedIn, Twitter, Xing, YouTube and a myriad of other social networking platforms - in addition to internal Corporate Wikis, special interest group blogs, communications and collaboration platforms..... Social Mapping is the method used to describe how social linkage between individuals in order to define Social Networks and to understand the nature of intimate relationships between individuals.
  • 40. Traditional CRM was very much based around data and information that brands could collect on their customers, all of which would go into a CRM system that then allowed the company to better target various customers. CRM comprises sales, marketing and service / support-based functions whose purpose is to move the customer through a pipeline with the goal of keeping the customer coming back to buy more and more stuff...... TRADITIONAL CRM – Customer Management Pipeline
  • 41. Evolution of CRM to SCRM - The challenge for organizations now is adapting and evolving to meet the needs and demands of these new social customers - many organizations still do not understand the CRM value of social media..... SOCIAL CRM – Social Media Conversations
  • 42. In Social CRM - the customer is actually the focal point of how an organization operates. Instead of marketing products or pushing messages to customers, brands now talk to and collaborate with their customers to solve business problems, empower customers to shape their own Customer Experience and Journeys and develop strong customer relationships - which will, over time, turn participants into brand evangelists and positive customer advocates..... SOCIAL CRM – Social CRM Processes SOCIAL CRM – Social Media Conversations
  • 43. Posted on April 20, 2010 by Laurance Buchanan - Capgemini SOCIAL CRM – a Business Framework and Operating Model
  • 44. Social Graphs and Market Sentiment • Using “BIG DATA” to drive Market Sentiment • Unprompted online conversations, statements and news create an online reflection of real-life events and issues – influencing the thoughts of individual consumers – managing Reputational Risk and so shaping Market Sentiment. The Social Media data, Blogs and News feeds that form this digital mirror of the world provide a gold mine of actionable information.....
  • 45. • Influencer Programmes have a long history in industries such as software, computers and electronics - but today they are successfully deployed across all types of industries including automotive, smart phones, fashion, health and nutrition, wine, sports, music, technology, travel, tourism and leisure – and financial services..... • In a hyper-connected world market-makers and influencers increasingly provide the gateway to decision makers who drive consumer behaviour. • Unprompted online conversations, statements and news create an online reflection of real-life events and issues – influencing the thoughts of individual consumers and so shaping Market Sentiment. • The Social Media data and News feeds that form this digital mirror of the world provide a gold mine of information. However, unlocking the data is not straightforward as it requires a complex and unique set of technologies, skills and methods..... INFLUENCER PROGRAMMES – Social Media Conversations
  • 47. SalesForce.com – a Cloud Platform Social CRM Business Solution The Cone™ - Digital Marketing The Cone™ - Lifestyle Understanding Customer Management (CRM / CEM) Social Intelligence Campaign Management e-Business Big Data Analytics The Cone™ Customer Loyalty & Brand Affinity The Cone™ Smart Apps Alarms & Alerts Reporting
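  • 48. Digital Marketing – Solution Options (by Vendor: Social Intelligence / Mobile / Big Data / Analytics / Cloud / CRM / CEM)
    – Amazon + Salesforce: Social Intelligence – Anomaly 42; Mobile – Apple iOS + Android; Big Data – AWS Elastic MapReduce (EMR), AWS S3; Analytics – “R” Revolution, Kernel k-means; Cloud – AWS EC2; CRM / CEM – SalesForce + 3rd Party Apps Store
    – Google: Social Intelligence – Google Analytics; Mobile – Google Nexus; Big Data – Google Hadoop; Analytics – Google Analytics; Cloud – Google Cloud; CRM / CEM – Google Office + Apps
    – IBM: Big Data – IBM InfoSphere BigInsights; Cloud – IBM Cloud
    – Microsoft: Mobile – Nokia, Windows 8 for Mobile; Big Data – Microsoft SQL/Server + Hadoop; Analytics – Microsoft Analytics, DOT.NET, C#; Cloud – Windows Azure HDInsight; CRM / CEM – Microsoft Office 360 + Dynamics
    – Oracle: Big Data – Oracle DBMS + Hadoop; Analytics – OBIE; Cloud – Oracle Cloud; CRM / CEM – Oracle CRM and EBS
    – SAP: Mobile – SUP + Fiori; Big Data – SAP HANA + Hadoop; Analytics – Business Objects; Cloud – SAP HANA Cloud; CRM / CEM – SAP CRM + Hybris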
  • 49. The Cone™ - Digital Marketing The Cone™ Lifestyle Understanding The Cone™ – Brand Loyalty and Affinity The Cloud – SalesForce.com Amazon Web Services (AWS) Social Intelligence Data Science / Big Data Analytics Customer Experience & Journey - CRM / CEM Alarms / Alerts Reporting e-Business Smart Apps
  • 50. The Cone™ – Digital Marketing Connecting the Unconnected..... • FMCG, Media, Entertainment and other enterprises which supply products and services indirectly to consumers – via Channel Partners such as Distributors, Dealers, Wholesalers and Retailers – are not directly connected to their customer base. In order to drive brand strategy and customer loyalty / affinity they have to reach out to, contact and connect with, on the most intimate terms, the widest possible range of end-user consumers: – Music (e.g. BBC and Sony Music) – Broadcasting (e.g. Radio 1 / American Idol) – Digital Media Content (e.g. Sony Films / Netflix) – Sports Franchises (e.g. Manchester City / New York City) – Fast Fashion Retailers (e.g. ASOS, Next, New Look, Primark, Top Shop) – Luxury Brands / Aggregators (e.g. Armani, Burberry, Versace / LVMH, PPR, Richemont) – Multi-channel Retailers – Loyalty, Campaigns, Offers and Promotions – Financial Services Companies – Brand Protection and Reputation Management – Travel, Leisure and Entertainment Organisations - Destination Resorts and Events – MVNO / CSPs - OTT Business Partner Analytics (Sky Go, Netflix via Firebrand / Apigee) – Telco, Media and Communications - Churn Management / Conquest / Up-sell / Cross-sell Campaigns – Digital Healthcare – Private / Public Healthcare Service Provisioning: Geo-demographic Clustering and Propensity Modelling (Patient Monitoring, Wellbeing, Clinical Trials, Morbidity and Actuarial Outcomes)
  • 51. The Cone™ - Eight Primitives
    Primitive | Problem / Opportunity | Business Domain | System Function | Software Product
    Who ? | Who are our Customers ? | Party - People / Organisations | CRM / CEM | SalesForce.com - Customer Management
    What ? | What are they saying about us ? | Social Media / Communications | Social Intelligence | Google Analytics, Anomaly 42
    Why ? | Why - their Interest / Behaviour / Motivation / Aspirations / Desires ? | Brand Identity / Loyalty / Affinity / Offers / Promos | Marketing, Campaign Management | Predictive Analytics / Propensity Modelling
    Where ? | Where do they Live / Work / Shop / Relax ? | Places - Location | GIS / GPS | Geospatial Analytics
    When ? | When do they contact / buy products from us ? | Time / Date | Contact Event / Sales Transaction | Multi-channel Retail / Mobile Platforms
    How ? | How do they contact and connect with us – Media / Telecoms Channels ? | Communications Channel | Mobile, Internet, In-store | Multi-channel Retail / Mobile Platforms
    Which ? | Which Brands / Ranges / Categories / Products ? | Retail Merchandising | Product Catalogue | IBM Product Centre / Stebo / Kalido
    Via ? | Via Business Partners / 3rd Party Channels ? | Sales Channel | Retail Channel / Outlet | Amazon, E-bay, Alibaba
  • 52. The Cone™ – EIGHT PRIMITIVES Event Dimension Party Dimension Geographic Dimension Motivation Dimension Time Dimension Media Dimension Cone™ MEDIA FACT WHO ? WHAT ? WHERE ? HOW ? WHEN ? WHY ? • Indifferent • Casuals • Enthusiasts • Fanatics • Radio Show • Television Show • Internet Advert • Campaign • Offer • Promotion • Pre-order • Purchase • Download • Playlist • Booking • Attendance • Advert / Publicity • Posting / Blog • Facebook • LinkedIn • Myspace • Twitter • YouTube • Xing • Region / Country • State / County • City / Town • Street / Building • Postcode • Person • Organisation Product Dimension WHICH ? • Category • Label / Artist • Album / Track • Tour / City / Arena • Merchandise Channel Dimension VIA ? • Channel / Partner • In-store • Internet Service • Mobile Smart App (Spotify etc.) Advert / Publicity Type Sales Channel Posting / Blog Source / Type Subject Location Media Event • Awareness • Interest • Need • Desire Motivation Customer Time / Date Version 2 – Media Co’s
  • 53. Social Intelligence – Profiling and Analysis Fanatics - 10% Enthusiasts - 20% Casuals - 30% Indifferent - 40% The Cone™ Brand Loyalty & Affinity The Cone™ – Profiling & Analysis
  • 54. The Cone™ – Model Development Initialise Cone™ Model Cone™ Model Design Data Load Cone™ Model Calibration and Tuning Cone™ History Matching Cone™ Real-Time Analytics Survey Script Data Data Model Customer Data Profiling Data Historic Data Real-Time Data Cone™ Model Database Design Populated Cone™ Model Profiled Cone™ Model Historic Trends Actionable Insights Step 1 Step 2 Step 3 Step 4 Step 5 Step 6
  • 55. The Cone™ – Model Delivery
    Phase | Step | Description | Input | Design Process | Output | Cost (estimate) | Skill Set
    1 | 1 | Cone™ Model Data Analysis / Design | User Requirements | Data Analysis & Data Modelling | Cone™ Logical Data Model | £k | Business / Data Analyst
    1 | 2 | Cone™ Data Design – Questionnaire | User Requirements | Data Analysis & Data Modelling | Questionnaire Survey Form | £k | Business / Data Analyst
    1 | 3 | Cone™ Physical Database Design | Logical Data Model | Cone™ Database Design | Physical Cone™ Design | £k | Data Analyst / DBA
    1 | 4 | Cone™ Data Load – Questionnaire / Survey Forms | Physical Data Model, Survey Questionnaire | Cone™ Model Calibration and Tuning Runs | Initialised Cone™ Model | £k | Business / Data Analyst, DBA
    2 | 5 | Cone™ Data Load – In-house CRM and Audience Data | Physical Data Model, People CRM Data | Cone™ Model CRM Data Load | Populated Cone™ Model | £k | Business / Data Analyst, DBA
    2 | 6 | Cone™ Profiling | Cone™ Clustering Algorithms | Cone™ Model Data Profiling – Kernel k-means | Profiled Cone™ Model | £k | Data Analyst, DBA, Data Scientists
    3 | 7 | Cone™ Streaming and Segmentation | Historic Sales and CRM Data | Cone™ History Matching Runs | Cone™ Historic Trends | £k | Data Scientists
    3 | 8 | Cone™ Real-time Social Media Feeds | Global Social Intelligence | Cone™ Real-Time Analytics | Actionable Cone™ Insights | (variable with Cone™ total data volume) | Data Scientists
  • 56. The Cone™ – Model Implementation Initialise Cone™ Model Cone™ Model Design Data Load Cone™ Model Calibration and Tuning Cone™ History Matching Cone™ Real-Time Analytics Data Model Database Schema Business Analyst DBA Survey Data Cone™ Model Data Architect DBA CRM Data Populated Cone™ Model Data Architect DBA Stream and Segment Data Profiled Cone™ Model Data Architect DBA Historic Data Historic Trends Data Architect Data Scientists Real-Time Data Actionable Insights Data Architect Data Scientists
  • 57. The Cone™ – Digital Marketing Data Streams into Revenue Streams..... • Digital Marketing is the communication, advertising and marketing of brands, products and services via multiple digital channels and channel partners in order to reach out to, contact and connect, on the most intimate terms, with the widest possible range of consumers. Through the exploitation of Digital Media we can initiate and maintain engaging Social Conversations. • Digital Marketing extends key Brand Messages across every digital platform, from simple internet marketing to mobile, broadcast and social media channels – yielding Social Intelligence data in order to discover actionable Marketing Insights – which in turn convert digital Data Streams into Revenue Streams. • The key objective of Digital Marketing is to reach out to, contact and connect directly with carefully selected consumers – so that we create strong, lasting and durable relationships in order to promote key brand, category and product messages to targeted consumers and thus develop a tangible, valuable, very real and distinct brand / category / product interest, following, affinity and loyalty.
  • 58. The Cone™ Converting Data Streams into Revenue Streams Salesforce Anomaly 42 Cone Unica End User BIG DATA ANALYTICS SOCIAL MEDIA E-Commerce Platform FULFILMENT Sales Orders Salesforce CRM Geo-demographics • Streaming • Segmentation • Household Data SOCIAL CRM Households Insights Insights Insights Anomaly 42 Unica Offers and Promotions People and Places Campaigns Social Intelligence • User Content and Blogs • Social Groups and Networks SOCIAL INTELLIGENCE Actionable Marketing Insights EXPERIAN The Cone™ Big Wheel keeps on turning – Perfect Store
  • 59. SalesForce.com – a Cloud Platform Social CRM Business Solution The Cone™ - Digital Marketing The Cone™ - Lifestyle Understanding Customer Management (CRM / CEM) Social Intelligence Campaign Management e-Business Big Data Analytics The Cone™ Customer Loyalty & Brand Affinity The Cone™ Smart Apps Alarms & Alerts Reporting
  • 60. “DATA SCIENCE” – my own special area of Business expertise Targeting – Map / Reduce Consume – End-User Data Data Acquisition – High-Volume Data Flows – Mobile Enterprise Platforms (MEAP’s) Apache Hadoop Framework HDFS, MapReduce, MatLab “R” Autonomy, Vertica Smart Devices Smart Apps Smart Grid Clinical Trial, Morbidity and Actuarial Outcomes Market Sentiment and Price Curve Forecasting Horizon Scanning, Tracking and Monitoring Weak Signal, Wild Card and Black Swan Event Forecasting – Data Delivery and Consumption News Feeds and Digital Media Global Internet Content Social Mapping Social Media Social CRM – Data Discovery and Collection – Analytics Engines - Hadoop – Data Presentation and Display Excel Web Mobile – Data Management Processes Data Audit Data Profile Data Quality Reporting Data Quality Improvement Data Extract, Transform, Load – Performance Acceleration GPU’s – massive parallelism SSD’s – in-memory processing DBMS – ultra-fast data replication – Data Management Tools DataFlux Embarcadero Informatica Talend – Info. Management Tools Business Objects Cognos Hyperion Microstrategy Biolap Jedox Sagent Polaris Teradata SAP HANA Netezza (now IBM) Greenplum (now EMC2) Extreme Data xdg Zybert Gridbox – Data Warehouse Appliances Ab Initio Ascential Genio Orchestra SOCIAL CRM – The Emerging Big Data Stack
  • 61. The Cone™ - Brand Loyalty / Affinity 1. Brand Affinity 2. Social Interaction 3. Geo-demographic Profile – Experian Mosaic - 15 Groups (Segments), 66 Types (Streams) Hybrid Cone™ – 3 Dimensions Fanatics - 10% Enthusiasts - 20% Casuals - 30% Indifferent - 40% The Cone™ Brand Loyalty & Affinity
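One way to read the Fanatics / Enthusiasts / Casuals / Indifferent split is as population shares ranked by brand affinity. The sketch below works under that assumption only: the slide does not say how the bands are derived, and the affinity scores and the `cone_tiers` function are hypothetical.

```python
from typing import Dict

# Cumulative percentage cut-offs, read from the 10 / 20 / 30 / 40 split.
BANDS = [("Fanatics", 10), ("Enthusiasts", 30), ("Casuals", 60), ("Indifferent", 100)]


def cone_tiers(scores: Dict[str, float]) -> Dict[str, str]:
    """Rank people by affinity score and band them 10 / 20 / 30 / 40 per cent."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    tiers = {}
    for position, person in enumerate(ranked):
        # Integer arithmetic avoids floating-point edge cases at band borders.
        for name, cutoff in BANDS:
            if (position + 1) * 100 <= cutoff * n:
                tiers[person] = name
                break
    return tiers
```

With ten scored individuals this yields one Fanatic, two Enthusiasts, three Casuals and four Indifferent.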
  • 63. Salesforce Anomaly 42 Cone Unica End User BIG DATA ANALYTICS Cone™ Brand Affinity Campaign CRM Insights Insights Insights SALES PEOPLE DEMOGRAPHICS Household Data SOCIAL INTELLIGENCE User Content, Social Groups and Networks Offers and Promotions People & Places PROFILING Streaming & Segmentation The Cone™ – CYCLE The Cone™ – CONSUMER CYCLE e-Business Smart Apps Big Wheel keeps on turning – Perfect Store
  • 64. Hadoop Clustering and Managing Data..... Managing Data Transfers in Networked Computer Clusters using Orchestra To illustrate I/O Bottlenecks, we studied Data Transfer impact in two clustered computing systems: – Hadoop, using a trace from a 3000-node cluster at Facebook – Spark, a MapReduce-like framework with iterative machine learning + graph algorithms. Mosharaf Chowdhury, Matei Zaharia, Justin Ma, Michael I. Jordan, Ion Stoica University of California, Berkeley {mosharaf, matei, jtma, jordan, istoica}@cs.berkeley.edu
  • 65. Hadoop Framework • The workhorse relational database has been the tool of choice for businesses for well over 20 years now. Challengers have come and gone but the trusty RDBMS is the foundation of almost all enterprise systems today. This includes almost all transactional and data warehousing systems. The RDBMS has earned its place as a proven model that, despite some quirks, is fundamental to the very integrity and operational success of IT systems around the world. • The relational database is finally showing some signs of age as data volumes and network speeds grow faster than the computer industry's present rate of progress under Moore's Law can match. The Web in particular is driving innovation in new ways of processing information as the data footprints of Internet-scale applications become prohibitive using traditional SQL database engines. • When it comes to database processing today, change is being driven by (at least) four factors: – Speed. The seek times of physical storage are not keeping pace with improvements in network speeds. – Scale. The difficulty of scaling the RDBMS out efficiently (i.e. clustering beyond a handful of servers is notoriously hard). – Integration. Today's data processing tasks increasingly have to access and combine data from many different non-relational sources, often over a network. – Volume. Data volumes have grown from tens of gigabytes in the 1990s to hundreds of terabytes and often petabytes in recent years.
  • 66.
  • 67. RDBMS and Hadoop: Apples and Oranges? • Below is Figure 1 - a comparison of the overall differences between Database RDBMS and MapReduce-based systems such as Hadoop. • From this it's clear that the MapReduce model cannot replace the traditional enterprise RDBMS. However, it can be a key enabler of a number of interesting scenarios that can considerably increase flexibility, improve turn-around times, and make it possible to tackle problems that could not be tackled before. • With Database RDBMS platforms, SQL-based processing of data sets tends to fall away and not scale linearly after a specific volume ceiling, usually just a handful of nodes in a cluster. With MapReduce, you can consistently obtain performance gains by increasing the size of the cluster. In other words, double the size of a Hadoop cluster and a job will run twice as fast; quadruple it and the job will run four times faster. It's the same linear relationship, irrespective of data volume and throughput.
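The linear-scaling claim above amounts to a simple inverse proportion, sketched here as an idealised model. The function name and the figures in the usage note are illustrative assumptions; real clusters carry scheduling, shuffle and network overheads, so measured speed-ups generally fall short of this ideal.

```python
def estimated_runtime(base_runtime_min: float, base_nodes: int, nodes: int) -> float:
    """Idealised linear scaling: runtime is inversely proportional to cluster size.

    base_runtime_min is the measured runtime on base_nodes nodes; the result is
    the runtime the slide's linear model would predict on `nodes` nodes.
    """
    if base_nodes <= 0 or nodes <= 0:
        raise ValueError("node counts must be positive")
    return base_runtime_min * base_nodes / nodes
```

Under this model a job that takes 120 minutes on 10 nodes would take 60 minutes on 20 nodes and 30 minutes on 40 nodes.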
  • 68. Comparing Data in DWH, Appliances, Hadoop Clusters and Analytics Engines
    Attribute | RDBMS DWH | DWH Appliance | Hadoop Cluster | Analytics Appliance
    Data size | Gigabytes | Terabytes | Petabytes | Petabytes
    Access | Interactive and batch | Interactive and batch | Batch | Interactive
    Structure | Fixed schema | Fixed schema | Flexible schema | Flexible schema
    Language | SQL | SQL | Procedural Languages (Java, C++, Ruby, “R” etc) | Procedural Languages (Java, C++, Ruby, “R” etc)
    Data Integrity | High | High | Low | Very High
    Architecture | Shared memory - SMP | Shared nothing - MPP | Hadoop DFS | In-memory Processing – GPGPUs / SSDs
    Virtualisation | Partitions / Regions | MPP / Nodal | MPP / Clustered | MPP / Clustered
    Scaling | Non-linear | Nodal / Linear | Clustered / Linear | Clustered / Linear
    Updates | Read and write | Write once, read many | Write once, read many | Write once, read many
    Selects | Row-based | Set-based | Column-based | Array-based
    Latency | Low – Real-time | Low – Near Real-time | High – Historic Reporting | Very Low – Real-time Analytics
    Figure 1: Comparing RDBMS to MapReduce
  • 69. Hadoop Framework • These datasets would previously have been very challenging and expensive to take on with a traditional RDBMS using standard bulk load and ETL approaches. Never mind trying to efficiently combine multiple data sources simultaneously or dealing with volumes of data that simply can't reside on any single machine (or often even dozens). Hadoop deals with this by using a distributed file system (HDFS) that's designed to deal coherently with datasets that can only reside across distributed server farms. HDFS is also fault resilient and so doesn't impose the overhead of RAID drives and mirroring on individual nodes in a Hadoop compute cluster, allowing the use of truly low cost commodity hardware. • So what does this specifically mean to enterprise users that would like to improve their data processing capabilities? Well, first there are some catches to be aware of. Despite enormous strengths in distributed data processing and analysis, MapReduce is not good in some key areas that the RDBMS is extremely strong in (and vice versa). The MapReduce approach tends to have high latency (i.e. not suitable for real-time transactions) compared to relational databases and is strongest at processing large volumes of write-once data where most of the dataset needs to be processed at one time. The RDBMS excels at point queries and updates, while MapReduce is best when data is written once and read many times. • The story is the same with structured data, where the RDBMS and the rules of database normalization identified precise laws for preserving the integrity of structured data, laws which have stood the test of time. MapReduce is designed for a less structured, more federated world where schemas may be used but data formats can be much looser and freeform.
  • 70. The Emerging “Big Data” Stack Targeting – Map / Reduce Consume – End-User Data Data Acquisition – High-Volume Data Flows – Mobile Enterprise Platforms (MEAP’s) Apache Hadoop Framework HDFS, MapReduce, MatLab “R” Autonomy, Vertica Smart Devices Smart Apps Smart Grid Clinical Trial, Morbidity and Actuarial Outcomes Market Sentiment and Price Curve Forecasting Horizon Scanning, Tracking and Monitoring Weak Signal, Wild Card and Black Swan Event Forecasting – Data Delivery and Consumption News Feeds and Digital Media Global Internet Content Social Mapping Social Media Social CRM – Data Discovery and Collection – Analytics Engines - Hadoop – Data Presentation and Display Excel Web Mobile – Data Management Processes Data Audit Data Profile Data Quality Reporting Data Quality Improvement Data Extract, Transform, Load – Performance Acceleration GPU’s – massive parallelism SSD’s – in-memory processing DBMS – ultra-fast database replication – Data Management Tools DataFlux Embarcadero Informatica Talend – Info. Management Tools Business Objects Cognos Hyperion Microstrategy Biolap Jedox Sagent Polaris Teradata SAP HANA Netezza (now IBM) Greenplum (now EMC2) Extreme Data xdg Zybert Gridbox – Data Warehouse Appliances Ab Initio Ascential Genio Orchestra
  • 71.
  • 72. Hadoop Framework • Each of these factors is presently driving interest in alternatives that are significantly better at dealing with these requirements. I'll be clear here: The relational database has proven to be incredibly versatile and is the right tool for the majority of business needs today. However, the edge cases for many large-scale business applications are moving out into areas where the RDBMS is often not the strongest option. One of the most discussed new alternatives at the moment is Hadoop, a popular open source implementation of MapReduce. MapReduce is a simple yet very powerful method for processing and analyzing extremely large data sets, even up to the multi-petabyte level. At its most basic, MapReduce is a process for combining data from multiple inputs (creating the "map"), and then reducing it using a supplied function that will distill and extract the desired results. It was originally invented by engineers at Google to deal with the building of production search indexes. The MapReduce technique has since spilled over into other disciplines that process vast quantities of information including science, industry, and systems management. For its part, Hadoop has become the leading implementation of MapReduce. • While there are many non-relational database approaches out there today (see my emerging IT and business topics post for a list), nothing currently matches Hadoop for the amount of attention it's receiving or the concrete results that are being reported in recent case studies. A quick look at the list of organizations that have applications powered by Hadoop includes Yahoo! with over 25,000 nodes (including a single, massive 4,000 node cluster), Quantcast, which says it has over 3,000 cores running Hadoop and currently processes over 1PB of data per day, and Adknowledge, which uses Hadoop to process over 500 million clickstream events daily using up to 200 nodes.
  • 73. HP HAVEn Big Data Platform
  • 75. Telco 2.0 “Big Data” Analytics Architecture
  • 76. Case Study – Huawei SmartCare CEM Customers Campaign Mart Analytics & Customer Loyalty Loyalty Mart CRM Data Customer DWH Customer Care “BIG DATA” Merchandising & Logistics Data Retail Data Warehouse Retail Multi-channel Sales Analysis Mobile Platforms EPOS Data Call Centre Data Internet Data e-Commerce Systems Store Systems Merchandising Warehousing & Logistics Inventory & Provisioning Hadoop Cluster SAP HANA ERP Systems Finance Managers Financial Data Warehouse Head Office Financial Analysis Reports ERP Data OSS – Network Management Network Provisioning & Fault Management Operations Network Data Network and Fault Reports Operations Managers Inventory, Provisioning & Replenishment BSS – Rating, Mediation and Billing Mediation Rating and Billing Systems Business Managers Supplier Data Product Data Customer Data Inventory & Provisioning Reports Planning & Forecasting Systems CDR Data Call Data Warehouse Billing Data Autonomy Vertica Operational “BIG DATA” Multi-channel Retail MSS – Head Office – Finance, Planning & Strategy Social Media - External Data Customer Care Systems CRM & Digital Marketing Systems Customers CEM SAP HANA Catalogue Hadoop Cluster Pentaho, MatLab, “R” Cloudera Apache Hadoop Framework
  • 77. Big Data – Products The MapReduce technique has spilled over into many other disciplines that process vast quantities of information including science, industry, and systems management. The Apache Hadoop Library has become the most popular implementation of MapReduce – with framework implementations from Cloudera, Hortonworks and MapR.
  • 78. Split-Map-Shuffle-Reduce Process Big Data Consumers Split Map Shuffle Reduce Key / Value Pairs Actionable Insights Data Provisioning Raw Data
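The Split-Map-Shuffle-Reduce stages can be illustrated with a single-process word count in plain Python. This is a teaching sketch of the data flow only; it does not use Hadoop, and the function names are chosen for this example. In a real cluster each chunk would be mapped on a different node and the shuffle would move key / value pairs across the network.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def split(raw: str, n_chunks: int) -> List[List[str]]:
    """Split: divide the raw input lines into n_chunks independent chunks."""
    lines = raw.splitlines()
    return [lines[i::n_chunks] for i in range(n_chunks)]


def map_phase(chunk: List[str]) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) key / value pair for every word in the chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(mapped: List[List[Tuple[str, int]]]) -> Dict[str, List[int]]:
    """Shuffle: group all values that share the same key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(raw: str, n_chunks: int = 2) -> Dict[str, int]:
    """Run the four stages in order on one machine."""
    chunks = split(raw, n_chunks)
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle(mapped))
```

Calling `word_count("big data\nBig cluster\ndata data")` counts "big" twice, "data" three times and "cluster" once, regardless of how many chunks the input is split into.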
  • 79. Apache Hadoop Component Stack
    – HDFS: Hadoop Distributed File System (HDFS)
    – MapReduce: Scalable Data Applications Framework
    – Pig: Procedural Language – abstracts low-level MapReduce operators
    – Zookeeper: High-reliability distributed cluster co-ordination
    – Hive: Structured Data Access Management
    – HBase: Hadoop Database Management System
    – Oozie: Job Management and Data Flow Co-ordination
    – Mahout: Scalable Knowledge-base Framework
  • 80. Data Management Component Stack Informatica Drill Millwheel Informatica Big Data Edition / Vibe Data Stream Data Analysis Framework Data Analytics on-the-fly + Extract – Transform – Load Framework Flume Sqoop Scribe Extract – Transform - Load Extract – Transform - Load Extract – Transform - Load Talend Extract – Transform - Load Pentaho Extract – Transform – Load Framework + Data Reporting on-the-fly
  • 81. Big Data Storage Platforms
    – Autonomy: HP Unstructured Data DBMS
    – Vertica: HP Columnar DBMS
    – MongoDB: High-availability DBMS
    – CouchDB: Couchbase Database Server for Big Data with NoSQL / Hadoop Integration
    – Pivotal: Pivotal Big Data Suite – GreenPlum, GemFire, SQLFire, HAWQ
    – Cassandra: Cassandra Distributed Database for Big Data with NoSQL and Hadoop Integration
    – NoSQL: NoSQL Database for Oracle, SQL/Server, Couchbase etc.
    – Riak: Basho Technologies Riak Big Data DBMS with NoSQL / Hadoop Integration
  • 82. Big Data Analytics Engines and Appliances Alpine Karmasphere Kognito Alpine Data Studio - Advanced Big Data Analytics Karmasphere Studio and Analyst – Hadoop Customer Analytics Kognito In-memory Big Data Analytics MPP Platform Skytree Redis Skytree Server Artificial Intelligence / Machine Learning Platform Redis is an open source key-value database for AWS, Pivotal etc. Teradata Teradata Appliance for Hadoop Neo4j Crunchbase Neo4j - Graphical Database for Big Data InfiniDB Columnar MPP open-source DB version hosted on GitHub Big Data Analytics Engines / Appliances
  • 83. Big Data Analytics and Visualisation Platforms Tableau Tableau - Big Data Visualisation Engine Eclipse Symentec Eclipse - Big Data Visualisation Mathematica Mathematical Expressions and Algorithms StatGraphics Statistical Expressions and Algorithms FastStats Numerical computation, visualization and programming toolset MatLab R Data Acquisition and Analysis Application Development Toolkit “R” Statistical Programming / Algorithm Language Revolution Revolution Analytics Framework and Library for “R”
  • 84. Hadoop / Big Data Extended Infrastructure Stack
    – SSD: Solid State Drive (SSD) – configured as cached memory / fast HDD
    – CUDA: CUDA (Compute Unified Device Architecture)
    – GPGPU: GPGPU (General Purpose Graphical Processing Unit Architecture)
    – IMDG: IMDG (In-memory Data Grid – extended cached memory)
    – Vibe: High Velocity / High Volume Machine / Automatic Data Streaming
    – Splunk: High Velocity / High Volume Machine / Automatic Data Streaming
    – Ambari: High-availability distributed cluster co-ordination
    – YARN: Hadoop Resource Scheduling
  • 85. Cloud-based Big-Data-as-a-Service and Analytics
    – AWS: Amazon Web Services (AWS) – Big Data-as-a-Service (BDaaS), Elastic Compute Cloud (EC2) and Simple Storage Service (S3)
    – 1010 Data: Big Data Discovery, Visualisation and Sharing Cloud Platform
    – SAP HANA: SAP HANA Cloud - In-memory Big Data Analytics Appliance
    – Azure: Microsoft Azure Data-as-a-Service (DaaS) and Analytics
    – Anomaly 42: Anomaly 42 Smart-Data-as-a-Service (SDaaS) and Analytics
    – Workday: Workday Big-Data-as-a-Service (BDaaS) and Analytics
    – Google Cloud: Google Cloud Platform – Cloud Storage, Compute Platform, Firebrand API Resource Framework
    – Apigee: Apigee API Resource Framework
  • 86. Gartner Magic Quadrant for BI and Analytics Platforms
  • 87. Hadoop Framework Distributions FEATURE Hortonworks Cloudera MAPR Pivotal Open Source Hadoop Library Yes Yes Yes Pivotal HD Support Yes Yes Yes Yes Professional Services Yes Yes Yes Yes Catalogue Extensions Yes Yes Yes Yes Management Extensions Yes Yes Yes Architecture Extensions Yes Yes Infrastructure Extensions Yes Yes Library Support Services Catalogue Job Management Library Support Services Catalogue Hortonworks Cloudera MAPR Library Support Services Catalogue Job Management Resilience High Availability Performance Pivotal Library Support Services Catalogue Job Management Resilience High Availability Performance
  • 89. Data Warehouse Appliance / Real-time Analytics Engine Price Comparison
    Manufacturer | Server Configuration | Cached Memory | Server Type | Software Platform | Cost (est.)
    SAP HANA | 32-node (4 Channels x 8 CPU) | 1.3 Terabytes | SMP | Proprietary | $ 6,000,000
    Teradata | 20-node (2 Channels x 10 CPU) | 1 Terabyte | MPP | Proprietary | $ 1,000,000
    Netezza (now IBM) | 20-node (2 Channels x 10 CPU) | 1 Terabyte | MPP | Proprietary | $ 180,000
    IBM ex5 (non-HANA configuration) | 32-node (4 Channels x 8 CPU) | 1.3 Terabytes | SMP | Proprietary | $ 120,000
    Greenplum (now Pivotal) | 20-node (2 Channels x 10 CPU) | 1 Terabyte | MPP | Open Source | $ 20,000
    XtremeData xdb (BO BW) | 20-node (2 Channels x 10 CPU) | 1 Terabyte | MPP | Open Source | $ 18,000
    Zybert Gridbox | 48-node (4 Channels x 12 CPU) | 20 Terabytes | SMP | Open Source | $ 60,000
  • 90. Clustering in “Big Data” “A Cluster is a group of the same or similar data elements which are aggregated – or closely distributed – together” Clustering is a technique used to explore content and understand information in every business sector and scientific field that collects and processes very large volumes of data Clustering is an essential tool for any “Big Data” problem
  • 91.
  • 92. • “Big Data” refers to vast aggregations (super sets) consisting of numerous individual datasets (structured and unstructured) - whose size and scope is beyond the capability of conventional transactional (OLTP) or analytics (OLAP) Database Management Systems and Enterprise Software Tools to capture, store, analyse and manage. Examples of “Big Data” include the vast and ever changing amounts of data generated in social networks where we maintain Blogs and have conversations with each other, news data streams, geo-demographic data, internet search and browser logs, as well as the ever-growing amount of machine data generated by pervasive smart devices - monitors, sensors and detectors in the environment – captured via the Smart Grid, then processed in the Cloud – and delivered to end-user Smart Phones and Tablets via Intelligent Agents and Alerts. • Data Set Mashing and “Big Data” Global Content Analysis drive Horizon Scanning, Monitoring and Tracking processes by taking numerous, apparently unrelated RSS and other Information Streams and Data Feeds and loading them into Very Large Scale (VLS) DWH Structures and Document Management Systems for Real-time Analytics – searching for and identifying possible signs of relationships hidden in data (Facts / Events) – in order to discover and interpret previously unknown Data Relationships driven by hidden Clustering Forces – revealed via “Weak Signals” indicating emerging and developing Application Scenarios, Patterns and Trends - in turn predicting possible, probable and alternative global transformations which may unfold as future “Wild Card” or “Black Swan” events. “Big Data”
  • 93. Clustering in “Big Data” • The profiling and analysis of large aggregated datasets in order to determine a ‘natural’ structure of groupings is an important technique for many statistical and analytic applications. Cluster analysis on the basis of profile similarities or geographic distribution is a method in which no prior assumptions are made concerning the number of groups, group hierarchies or internal structure. Geo-demographic techniques are frequently used to profile and segment populations by ‘natural’ groupings, such as common behavioural traits and Clinical Trial, Morbidity or Actuarial outcomes, along with many other shared characteristics and common factors.
  • 94. Clustering in “Big Data” • “BIG DATA” ANALYTICS – PROFILING, CLUSTERING and 4D GEOSPATIAL ANALYSIS • • The profiling and analysis of large aggregated datasets in order to determine a ‘natural’ structure of data relationships or groupings is an important starting point for many mapping, statistical and analytic applications. Cluster analysis of implicit similarities, such as time-series demographic or geographic distribution, is a critical technique in which no prior assumptions are made concerning the number or type of groups that may be found, or their relationships, hierarchies or internal data structures. Geospatial and demographic techniques are frequently used to profile and segment populations by ‘natural’ groupings. Shared characteristics or common factors, such as Behaviour / Propensity or Epidemiology, Clinical, Morbidity and Actuarial outcomes, allow us to discover and explore previously unknown, concealed or unrecognised insights, patterns, trends or data relationships. • PREDICTIVE ANALYTICS and EVENT FORECASTING • • Predictive Analytics and Event Forecasting use Horizon Scanning, Tracking and Monitoring methods, combined with Cycle, Pattern and Trend Analysis techniques and Propensity Models, in order to anticipate a wide range of business, economic, social and political Future Events. These range from micro-economic Market phenomena, such as Market Sentiment and Price Curve movements, to large-scale macro-economic Fiscal phenomena, where Weak Signal processing is used to predict future Wild Card and Black Swan Events such as Monetary System shocks.
  • 95.
  • 96. Multi-channel Retail - Digital Architecture • The last decade has seen an unprecedented explosion in mobile platforms as the internet and mobile worlds came of age. It is no longer acceptable to have only a bricks-and-mortar high-street presence – customer-focused companies are now expected to deliver their Customer Experience and Journey via internet websites, mobiles and more recently tablets.
  • 97. Social Intelligence – The Emerging Big Data Stack [Diagram] Stack layers: Data Acquisition – High-Volume Data Flows; Mobile Enterprise Platforms (MEAPs); Data Discovery and Collection; Analytics Engines – Hadoop; Targeting – Map / Reduce; Data Delivery and Consumption; Consume – End-User Data; Data Presentation and Display; Data Management Processes; Performance Acceleration; Data Management Tools; Info. Management Tools; Data Warehouse Appliances. Component labels: Apache Hadoop Framework; HDFS, MapReduce, MATLAB, “R”; Autonomy, Vertica; Smart Devices, Smart Apps, Smart Grid; Clinical Trial, Morbidity and Actuarial Outcomes; Market Sentiment and Price Curve Forecasting; Horizon Scanning, Tracking and Monitoring; Weak Signal, Wild Card and Black Swan Event Forecasting; News Feeds and Digital Media; Global Internet Content; Social Mapping, Social Media, Social CRM; Excel, Web, Mobile; Data Audit, Data Profile, Data Quality Reporting, Data Quality Improvement; GPUs – massive parallelism; SSDs – in-memory processing; DBMS – ultra-fast data replication; DataFlux, Embarcadero, Informatica, Talend; Business Objects, Cognos, Hyperion, MicroStrategy; Biolap, Jedox, Sagent, Polaris; Teradata, SAP HANA, Netezza (now IBM), Greenplum (now EMC2), Extreme Data xdg; Ab Initio, Ascential, Genio, Orchestra.
  • 98. GIS MAPPING and SPATIAL DATA ANALYSIS • A Geographic Information System (GIS) integrates hardware, software and digital data capture devices for acquiring, managing, analysing, distributing and displaying all forms of geographically dependent location data. This includes machine-generated data such as Computer-aided Design (CAD) data from land and building surveys and Global Positioning System (GPS) terrestrial location data, as well as all kinds of data streams: HDCCTV, aerial and satellite image data.
  • 99. GIS Mapping and Spatial Analysis • GIS MAPPING and SPATIAL DATA ANALYSIS • • A Geographic Information System (GIS) integrates hardware, software and digital data capture devices for acquiring, managing, analysing, distributing and displaying all forms of geographically dependent location data. This includes machine-generated data such as Computer-aided Design (CAD) data from land and building surveys and Global Positioning System (GPS) terrestrial location data, as well as all kinds of data streams: HDCCTV, aerial and satellite image data. • Spatial Data Analysis is a set of techniques for analysing 3-dimensional spatial (Geographic) data and location (Positional) object data overlays. Software that implements spatial analysis techniques requires access to both the locations of objects and their physical attributes. Spatial statistics extends traditional statistics to support the analysis of geographic data. Spatial Data Analysis provides techniques to describe the distribution of data in geographic space (descriptive spatial statistics), analyse the spatial patterns of the data (spatial pattern or cluster analysis), identify and measure spatial relationships (spatial regression), and create a surface from sampled data (spatial interpolation, usually categorised as geo-statistics). • The results of spatial data analysis are largely dependent upon the type, quantity, distribution and data quality of the spatial objects under analysis.
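As an illustration of the "create a surface from sampled data" step, here is a minimal sketch of inverse distance weighting (IDW), one common spatial interpolation technique. The slides do not name a specific method, so the choice of IDW, the function name `idw_interpolate`, the planar (x, y) coordinates and the default `power` of 2 are all assumptions made for this example.

```python
import math
from typing import List, Tuple

# A sample is (x, y, measured value); planar coordinates are assumed here.
Sample = Tuple[float, float, float]


def idw_interpolate(samples: List[Sample], x: float, y: float,
                    power: float = 2.0) -> float:
    """Estimate the value at (x, y) as a distance-weighted average of samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            # The query point coincides with a sample: return it directly.
            return value
        weight = 1.0 / distance ** power  # nearer samples weigh more
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one sample is required")
    return weighted_sum / weight_total


# Hypothetical readings at two locations, queried at the midpoint.
readings = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw_interpolate(readings, 1.0, 0.0)
```

Evaluating the function over a regular grid of query points would give the interpolated surface; real GIS packages typically offer this alongside geo-statistical methods such as kriging.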
  • 100. World-wide Visitor Count – GIS Mapping
  • 101. Geo-demographic Clustering in “Big Data” • GEODEMOGRAPHIC PROFILING – CLUSTERING IN “BIG DATA” • • The profiling and analysis of large aggregated datasets in order to determine a ‘natural’ or implicit structure of data relationships or groupings is an important starting point for many statistical and analytic applications. No prior assumptions are made concerning the number or type of groups discovered, or their relationships, hierarchies or internal data structures; the aim is to discover hidden data relationships. The subsequent explicit Cluster Analysis of the discovered data relationships is a critical technique which attempts to explain the nature, cause and effect of those implicit profile similarities or geographic distributions. Demographic techniques are frequently used to profile and segment populations using ‘natural’ groupings, such as common behavioural traits and Clinical, Morbidity or Actuarial outcomes, along with many other shared characteristics and common factors, and then to understand and explain those natural group affinities and geographical distributions using methods such as Causal Layered Analysis (CLA).
  • 102. GIS Mapping and Spatial Analysis • A Geographic Information System (GIS) integrates hardware, software and digital data capture devices for acquiring, managing, analysing, distributing and displaying all forms of geographically dependent location data. This includes machine-generated data such as Computer-aided Design (CAD) data from land and building surveys and Global Positioning System (GPS) terrestrial location data, as well as all kinds of data streams: HDCCTV, aerial and satellite image data. • Spatial Data Analysis is a set of techniques for analysing spatial (Geographic) location data. The results of spatial analysis are dependent on the locations of the objects being analysed. Software that implements spatial analysis techniques requires access to both the locations of objects and their physical attributes. • Spatial statistics extends traditional statistics to support the analysis of geographic data. Spatial Data Analysis provides techniques to describe the distribution of data in geographic space (descriptive spatial statistics), analyse the spatial patterns of the data (spatial pattern or cluster analysis), identify and measure spatial relationships (spatial regression), and create a surface from sampled data (spatial interpolation, usually categorised as geo-statistics).
  • 105.
  • 106. [Diagram: the Big Data stack] Stack layers: Data Acquisition – High-Volume; Mobile Enterprise Platforms (MEAPs); Data Discovery and Collection; Analytics Engines – Hadoop; Targeting – Map / Reduce; Data Delivery and Consumption; Consume – End-User Data; Data Management Processes; Performance Acceleration; Data Presentation and Display; Data Management Tools; Info. Management Tools; Data Warehouse Appliances. Component labels: Apache Hadoop Framework; HDFS, MapReduce, MATLAB, “R”; Autonomy, Vertica; Smart Devices, Smart Apps, Smart Grid; Clinical Trial, Morbidity and Actuarial Outcomes; Market Sentiment and Price Curve Forecasting; Horizon Scanning, Tracking and Monitoring; Weak Signal, Wild Card and Black Swan Event Forecasting; News Feeds and Digital Media; Global Internet Content; Social Mapping, Social Media, Social CRM; Data Audit, Data Profile, Data Quality Reporting, Data Quality Improvement; Data Extract, Transform, Load; GPUs – massive parallelism; SSDs – in-memory processing; DBMS – ultra-fast data replication; Excel, Web, Mobile; DataFlux, Embarcadero, Informatica, Talend; Business Objects, Cognos, Hyperion, MicroStrategy; Biolap, Jedox, Sagent, Polaris; Teradata, SAP HANA, Netezza (now IBM), Greenplum (now EMC2), Extreme Data xdg; Zybert Gridbox; Ab Initio, Ascential, Genio, Orchestra.
  • 107. Clustering Phenomena in “Big Data” “A Cluster is a group of profiled data similarities aggregated closely together” • Cluster Analysis is a technique used to explore very large volumes of structured and unstructured data (transactional data, machine-generated (automatic) data, social media and internet content, and geo-demographic information) in order to discover previously unknown, unrecognised or hidden logical data relationships.
  • 108. Event Clusters and Connectivity [Diagram: a network of linked Events A to H] The above is an illustration of Event relationships: how Events might be connected. A detailed understanding of the connections between Events may help us to answer questions such as: • If Event A occurs, does it make Event B or H more or less likely to occur? • If Event B occurs, what effect does it have on Events C, D, E, F and G? Answering questions such as these allows us to plan our Event Management approach and Risk mitigation strategy, and to decide how better to focus our Incident / Event resources and effort.
  • 109. Event Clusters and Connectivity • Aggregated Events include coincident, related, connected and interconnected Events: • Coincident – two or more Events appear simultaneously in the same domain, but arise from different triggers (unrelated causal events) • Related – two or more Events materialise in the same domain sharing common Event features or characteristics (they may share a hidden common trigger or cause, and so are candidates for further analysis and investigation) • Connected – two or more Events materialise in the same domain due to the same trigger (common cause) • Interconnected – two or more Events materialise together in an Event cluster, series or “storm”, with the prior Event triggering the subsequent Event in an Event Series • A series of Aggregated Events may result in a significant cumulative impact, and is therefore frequently misidentified as a Wild Card or Black Swan Event rather than simply as an event cluster or event “storm”.
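The four aggregated-event categories above can be sketched as a simple pairwise classification. This is only an illustration under stated assumptions: the `Event` record, its fields (`trigger`, `caused_by`, `features`) and the order in which the rules are checked are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Event:
    name: str
    trigger: str                      # identifier of the causal trigger (hypothetical field)
    caused_by: Optional[str] = None   # name of a prior Event that triggered this one, if any
    features: FrozenSet[str] = frozenset()


def classify_pair(a: Event, b: Event) -> str:
    """Classify two Events observed together in the same domain."""
    # Interconnected: one Event directly triggered the other.
    if a.caused_by == b.name or b.caused_by == a.name:
        return "interconnected"
    # Connected: both Events share the same trigger (common cause).
    if a.trigger == b.trigger:
        return "connected"
    # Related: different triggers, but shared features worth investigating.
    if a.features & b.features:
        return "related"
    # Coincident: simultaneous, with nothing in common.
    return "coincident"


# Hypothetical Events mirroring the labels A to H used in the slides.
event_g = Event("G", trigger="t-g")
event_h = Event("H", trigger="t-h", caused_by="G")
pair_label = classify_pair(event_g, event_h)
```

Applying `classify_pair` to every pair in an incident log would give a first-pass view of which Events form a cluster or "storm" and which merely coincide.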
  • 110. Event Clusters and Connectivity [Diagram: a network of linked Events 1 to 8, with labels Claimant 1, Claimant 2, Residence, Vehicle, Risk Event and Event Cluster] The above is an illustration of Event relationships: how Risk Events might be connected. A detailed understanding of Event clusters and the connections between Events may help us to understand: • What is the relationship between Events 1 and 8, and what impact do they have on Events 2 to 7? • Events 2 to 5 and Events 6 and 7 occur in clusters; what are the factors influencing these clusters? Answering questions such as these allows us to plan our Risk Event management approach and mitigation strategy, and to decide how to better focus our resources and effort on Risk Events and fraud management.
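  • 111. Aggregated Event Types [Diagram] Coincident Events: Trigger A gives rise to Event A and Trigger B gives rise to Event B. Related Events: Trigger 1 gives rise to Event C and Trigger 2 gives rise to Event D. Connected Events: a single Trigger gives rise to both Event E and Event F. Inter-connected Events: a Trigger gives rise to Event G, which in turn triggers Event H.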
  • 113. 4D Geospatial Analytics • 4D Geospatial Analytics is the profiling and analysis of large aggregated datasets in order to determine a ‘natural’ structure of groupings, and is an important technique for many statistical and analytic applications. • Demographic and Geospatial Cluster Analysis, on the basis of profile similarities or geographic distribution, is a statistical method in which no prior assumptions are made concerning the number of groups, group hierarchies or internal structure. Geo-spatial and geo-demographic techniques are frequently used to profile and segment populations by ‘natural’ groupings, such as common behavioural traits and Clinical Trial, Morbidity or Actuarial outcomes, along with many other shared characteristics and common factors.
  • 114. The Flow of Information through Time • String Theory postulates that Space-Time exists in discrete packages, with Time Present always in some way inextricably woven into both Time Past and Time Future. This yields the intriguing possibility of insights into the outcome of future events, as any item of Data or Information (Global Content) may contain faint traces which offer glimpses into the future trajectory of Clusters of linked Past, Present and Future Events. If all future timelines were linear, then every event would unfold in a predictable manner towards a known and certain conclusion. The future is, however, both unknown and unknowable (Hawking Paradox). Future outcomes are uncertain: future timelines are non-linear (branched), with a multitude of possible alternative futures. Chaos Theory suggests that even the smallest inputs, originating from forces so minute as to be undetectable, might become amplified through numerous system cycles to grow in influence and impact over time, deviating Space-Time trajectories far away from their original predicted path and so fundamentally altering the outcome of future events. • Every item of Global Content in the Present is somehow connected with both Past and Future temporal planes. Space-Time is a Dimension Cluster consisting of the three Spatial dimensions (x, y and z axes) plus Time (the fourth dimension, t), which together flow in a single direction, relentlessly towards the future. Space-Time does not flow uniformly: the “arrow of time” may be deflected by unknown factors. There may exist “hidden external forces” (unseen interactions) that create disturbance in the temporal plane stack which marks the passage of time, with the potential to create eddies, vortices and whirlpools along the trajectory of Time (chaos, disorder and uncertainty). These in turn possess the capacity to generate ripples and waves (randomness and disruption), thus changing the course of the Space-Time continuum. “Weak Signals” are “Ghosts in the Machine”, echoes of these subliminal temporal interactions, that may contain insights or clues about possible future “Wild Card” or “Black Swan” random events.
  • 115. 4D Geospatial Analytics – The Temporal Wave • The Temporal Wave is a novel method for Visual Modelling and Exploration of Geospatial “Big Data” simultaneously within a Time (history) and Space (geographic) context. The problems encountered in exploring and analysing vast volumes of spatial–temporal information in today's data-rich landscape are becoming increasingly difficult to manage effectively. Overcoming the problem of data volume and scale in a Time (history) and Space (location) context requires not only the traditional location–space and attribute–space analysis common in GIS Mapping and Spatial Analysis, but also the additional dimension of time–space analysis. The Temporal Wave supports a new method of Visual Exploration for Geospatial (location) data within a Temporal (timeline) context. • This time-visualisation approach integrates Geospatial (location) data within a Temporal (timeline) dataset, along with data visualisation techniques, thus improving accessibility, exploration and analysis of the huge amounts of geo-spatial data used to support geo-visual “Big Data” analytics. The Temporal Wave combines the strengths of both linear timeline and cyclical wave-form analysis, and is able to represent data within a Time (history) and Space (geographic) context simultaneously, even at different levels of granularity. Linear and cyclic trends in space-time data may be represented in combination with other graphic representations typical for location–space and attribute–space data types. The Temporal Wave can be used as a time–space data reference system, as a time–space continuum representation tool, and as a time–space interaction tool.
  • 116. 4D Geospatial Analytics – London Timeline
  • 117. 4D Geospatial Analytics – London Timeline • How did London evolve from its creation as a Roman city in AD 43 into the crowded, chaotic cosmopolitan megacity we see today? The London Evolution Animation takes a holistic view of what has been constructed in the capital over different historical periods: what has been lost, what saved and what protected. • Greater London covers 600 square miles. Up until the 17th century, however, the capital was crammed largely into a single square mile, which today is marked by the skyscrapers of the City's financial district. • This visualisation, originally created for the Almost Lost exhibition by the Bartlett Centre for Advanced Spatial Analysis (CASA), explores the historic evolution of the city by plotting a timeline of the development of the road network, along with documented buildings and other features, through 4D geospatial analysis of a vast number of diverse geographic, archaeological and historic data sets. • Unlike other historical cities such as Athens or Rome, which have an obvious patchwork of districts from different periods, London's individual structures, scheduled sites and listed buildings were in many cases constructed gradually, from parts assembled during different periods. Researchers who have previously tried to locate and document archaeological structures and research historic references will know that these features, when plotted, appear scrambled like pieces of different jigsaw puzzles, scattered across the contemporary London cityscape.
  • 119. Social Intelligence – Brand Affinity CONE SEGMENTS – BRAND AFFINITY • Social Intelligence drives Brand Loyalty Understanding (Fan-base Profiling, Streaming and Segmentation), expressed in the creation and maintenance of a detailed History and Balanced Scorecard for every individual in the Cone, allowing summation by Stream / Segment: 1. Inactive – need to draw their attention towards the Brand 2. Indifferent – need to educate them about core Brand Values 3. Disconnected – need to re-engage with the Brand 4. Casuals – exhibit Brand awareness and interest 5. Followers – follow the Brand, engage with social media and consume brand communications 6. Enthusiasts – engaged with the Brand, participate in Brand / Product / Media events and merchandising 7. Supporters – show strong need, desire and propensity to support Brand / Product / Media consumption 8. Fanatics – demonstrate total Commitment / Dedication / Loyalty for all aspects of the Brand / Product / Media PROPENSITY • Balanced Scorecard – a summary of all the data-points for an Individual / Stream / Segment • Propensity Score – in the statistical analysis of observational data, Propensity Score Matching (PSM) is a statistical matching technique that attempts to estimate the effect of a Campaign / Offer / Promotion or other intervention by accounting for the factors (covariates) that predict whether an individual receives that Campaign / Offer / Promotion • Propensity Model – the Bayesian probability of the outcome of an event in an Individual / Stream / Segment • Predictive Analytics – an area of data mining that deals with extracting information from data and using it to predict trends and behaviour patterns. Often the unknown event of interest is in the future; however, Predictive Analytics can be applied to any type of event with an unknown outcome, whether in the past, present or future.
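One simple way to read "Bayesian probability of the outcome of an event in a Segment" is as a Beta-Binomial estimate of a segment's response rate. The sketch below assumes that reading; the function name `posterior_propensity`, the uniform Beta(1, 1) prior and the campaign counts are all hypothetical and are not taken from the slides.

```python
from typing import Dict, Tuple


def posterior_propensity(responses: int, contacts: int,
                         prior_alpha: float = 1.0,
                         prior_beta: float = 1.0) -> float:
    """Posterior mean response probability under a Beta(prior_alpha, prior_beta) prior."""
    if not 0 <= responses <= contacts:
        raise ValueError("responses must lie between 0 and contacts")
    return (prior_alpha + responses) / (prior_alpha + prior_beta + contacts)


# Made-up campaign counts per Cone segment: (responses, contacts).
segment_counts: Dict[str, Tuple[int, int]] = {
    "Casuals": (3, 100),
    "Followers": (12, 100),
    "Fanatics": (45, 100),
}

propensity = {
    segment: posterior_propensity(responses, contacts)
    for segment, (responses, contacts) in segment_counts.items()
}
```

The prior keeps estimates sensible for segments with very few contacts: with no data at all the estimate falls back to the prior mean of 0.5 rather than dividing by zero.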
  • 120. Social Intelligence – Fan-base Understanding Football Supporters – Map of London
  • 121. Social Intelligence – Fan-base Understanding CONE STREAMING and SEGMENTATION • Multiple Cones can be created and cross-referenced using Social Intelligence and Brand Interaction / Fan-base Profiling and Segmentation in order to deliver actionable insights for any genre of Brand Loyalty and Fan-base Understanding – as well as for other Geo-demographic Analytics purposes – e.g. Digital Healthcare, Clinical Trials, Morbidity and Actuarial Outcomes: - – Music (e.g. BBC and Sony Music) – Broadcasting (e.g. Radio 1 / American Idol) – Digital Media Content (e.g. Sony Films / Netflix) – Sports Franchises (e.g. Manchester City / New York City) – Sport Footwear and Apparel (e.g. Nike, Puma, Adidas, Reebok) – Fast Fashion Retailers (e.g. ASOS, Next, New Look, Primark) – Luxury Brands / Aggregators (e.g. Armani, Burberry, Versace / LVMH, PPR, Richemont) – Multi-channel Retailers – Brand Affinity / Loyalty Marketing + Product Campaigns, Offers & Promotions – Financial Services Companies – Brand Protection and Reputation Management – Travel, Leisure and Entertainment Organisations - Destination Events and Resorts – MVNO / CSPs - OTT Business Partner Analytics (Sky Go, Netflix, iPlayer via Firebrand / Apigee) – Telco, Media and Communications - Churn Management / Conquest / Up-sell / Cross-sell Campaigns – Digital Healthcare – Private / Public Healthcare Service Provisioning: - Geo-demographic Clustering and Propensity Modelling (Patient Monitoring, Wellbeing, Clinical Trials, Morbidity and Actuarial Outcomes)
  • 122. Social Intelligence – Fan-base Understanding
  • 123. Social Intelligence – Social Interaction Social Interaction Cone Rules 1. Inactive – not engaged – low evidence / low affinity / low interest in Social Media 2. Lone Wolf – sparse / thin social network – may share negative information (Trolling) 3. Home Boy – Social Network clustered around Home Location Postcodes (Gang Culture) 4. Eternal Student – Social Network clustered around School / College / University Alumni 5. Workplace – Social Network clustered around Work and Colleagues (e.g. City Brokers, Traders) 6. Friends and Family – Social Network clustered around physical social contacts – Friends and Family 7. Enthusiast – Social Network clustered around shared, common interests – Sport, Music, Fashion etc. 8. Promiscuous – Open Networker – virtual Social Network across all categories – will connect with anybody Number of Segments • With anonymous data (e.g. polls), the number of initial Segments is 4 (Matt Holland). With named individuals we can discover much richer internal and external
  • 124. Social Interaction How consumers use social media (e.g., Facebook, Twitter) to address and/or engage with companies around social and environmental issues.
  • 125. Clustering in “Big Data” “A Cluster is a group of profiled data similarities aggregated closely together” • Cluster Analysis is a technique used to explore very large volumes of transactional and machine-generated (automatic) data, social media and internet content and information, in order to discover previously unknown, unrecognised or hidden data relationships. • Clustering is an essential tool for any “Big Data” problem. Cluster Analysis of either explicit (given) or implicit (discovered) data relationships in “Big Data” is a critical technique which attempts to explain the nature, cause and effect of the forces which drive clustering. Any observed profiled data similarities (geographic or temporal aggregations, mathematical or statistical distributions) may be explained through Causal Layered Analysis. – The choice of clustering algorithm and parameters is both process- and data-dependent – Approximate Kernel K-means provides a good trade-off between clustering accuracy and data volumes, throughput, performance and scalability – Challenges include homogeneous and heterogeneous data (structured versus unstructured data), data quality, streaming, scalability, cluster cardinality and validity
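To make the clustering step concrete, here is a minimal sketch of plain K-means (Lloyd's algorithm) in standard-library Python. It is the basic algorithm, not the Approximate Kernel K-means variant named above; the function names, the random seeding of centroids and the toy points are assumptions for illustration only.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(p: Sequence[float], q: Sequence[float]) -> float:
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points: List[Point], k: int, iterations: int = 100,
            seed: int = 0) -> Tuple[List[Point], List[int]]:
    """Plain Lloyd's algorithm; returns (centroids, cluster label per point)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # start from k of the input points
    labels: List[int] = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
        if new_labels == labels:
            break                       # assignments are stable: converged
        labels = new_labels
    return centroids, labels


# Toy data: two well-separated groups of 2-D points.
toy_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
              (10.0, 10.0), (10.1, 9.9), (9.9, 10.2)]
toy_centroids, toy_labels = k_means(toy_points, k=2)
```

The number of clusters `k` has to be supplied up front, which is one reason the choice of algorithm and parameters depends on the data, as the slide notes.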
  • 126. Cluster Types [Images] Deep Space Galactic Clusters; Hadoop Cluster – “Big Data” Servers; Molecular Clusters; Geo-Demographic Clusters; Mineral Lode Clusters
  • 127. Clustering in “Big Data” • GEODEMOGRAPHIC PROFILING – CLUSTERING IN “BIG DATA” • • The profiling and analysis of very large aggregated datasets to determine ‘natural’ or implicit data relationships and to discover hidden common factors and data structures, where no prior assumptions are made concerning the number or type of groups, is driven by uncovering previously unknown data relationships and natural groupings. The discovery of such Cluster / Group relationships, hierarchies or internal data structures is an important starting point for many statistical and analytic applications which are designed to expose hidden data relationships. • A subsequent explicit Cluster Analysis of previously discovered data relationships is an important technique which attempts to understand the true nature, cause and impact of the unknown clustering forces driving implicit profile similarities and mathematical and geographic distributions. Geo-demographic techniques are frequently used to profile and segment Demographic and Spatial data by ‘natural’ groupings, including common behavioural traits and Clinical Trial, Morbidity or Actuarial outcomes, along with numerous other shared characteristics and common factors. Cluster Analysis attempts to understand and explain those natural group affinities and geographical distributions using methods such as Causal Layered Analysis (CLA).
  • 128. Cluster Types. Table columns: DISCIPLINE; CLUSTER TYPE; CLUSTERS; DIMENSIONS; DATA TYPE; DATA SOURCE; CLUSTERING FACTORS / FORCES
    – Astrophysics: 4D Distribution of Matter across the Universe through Space and Time; Star Systems; Stellar Clusters; Galaxies; Galactic Clusters; Mass / Energy; Space / Time; Astronomy Images – Microwave, Infrared, Optical, Ultraviolet, Radio, X-ray, Gamma-ray; Optical Telescope; Infrared Telescope; Radio Telescope; X-ray Telescope; Gravity; Dark Matter; Dark Energy; Dark Flow
    – Climate Change: Temperature Changes; Precipitation Changes; Ice-mass Changes; Hot / Cold; Dry / Wet; More / Less ice; Temperature; Precipitation; Sea / Land Ice; Average Temperature; Average Precipitation; Greenhouse Gases %; Weather Station Data; Ice Core Data; Tree-ring Data; Solar Forcing; Oceanic Forcing; Atmospheric Forcing
    – Actuarial Science: Morbidity, Clinical Trials, Epidemiology; Place / Date of birth; Place / Date of death; Cause of Death; Birth / Death; Longevity; Cause of Death; Medical Events; Geography; Time; Biomedical Data; Demographic Data; Geographic data; Register of Births; Register of Deaths; Medical Records; Health; Wealth; Demographics
    – Price Curves: Economic Modelling; Long-range Forecasting; Economic growth; Economic recession; Bull markets; Bear markets; Monetary Value; Geography; Time; Real (Austrian) GDP; Foreign Exchange Rates; Interest Rates; Price movements; Daily Closing Prices; Government; Central Banks; Money Markets; Stock Exchange; Commodity Exchange; Business Cycles; Economic Trends; Market Sentiment; Fear and Greed; Supply / Demand
    – Business Clusters: Retail Parks; Digital / Fin Tech; Leisure / Tourism; Creative / Academic; Retail; Technology; Resorts; Arts / Sciences; Company / SIC; Geography; Time; Entrepreneurs; Start-ups; Mergers; Acquisitions; Investors; NGAs; Government; Academic Bodies; Capital / Finance; Political policy; Economic policy; Social policy
    – Elite Team Sports: Performance Science; Winners; Losers; Team / Athlete; Sport / Club; League Tables; Medal Tables; Sporting Events; Team / Athlete; Sport / Club; Geography; Time; Performance Data; Biomedical Data; Sports Governing Bodies; RSS News Feeds; Social Media; Hawk-Eye; Pro-Zone; Technique; Application; Form / Fitness; Ability / Attitude; Training / Coaching; Speed / Endurance
    – Future Management: Human Activity; Natural Events; Random Events; Waves, Cycles, Patterns, Trends; Random Events; Geography; Time; Weak Signals; Strong Signals; Wild Card Events; Black Swan Events; Global Internet Content / Big Data Analytics – Horizon Scanning, Tracking and Monitoring; Random Events; Waves, Cycles, Patterns, Trends, Extrapolations
  • 130.
  • 131. Cluster Analysis • Data Representation – Metadata: identifying common Data Objects, Types and Formats • Data Taxonomy and Classification – Similarity Matrix (labelled data) – Grouping of explicit data relationships • Data Audit, given any collection of labelled objects – Identifying relationships between discrete data items – Identifying common data features: values and ranges – Identifying unusual data features: outliers and exceptions • Data Profiling and Clustering, given any collection of unlabelled objects – Pattern Matrix (unlabelled data) – Discover implicit data relationships – Find meaningful groupings in Data (Clusters) – Predictive Analytics: Bayesian Event Forecasting – Wave-form Analytics: Periodicity, Cycles and Trends – Explore hidden relationships between discrete data features • Many big data problems feature unlabelled objects
  • 133. Cluster Analysis • Clustering Algorithms • Hundreds of spatial, mathematical and statistical clustering algorithms are available. Many clustering algorithms are “admissible”, but no single algorithm alone is “optimal” • K-means • Gaussian mixture models • Kernel K-means • Spectral Clustering • Nearest neighbour • Latent Dirichlet Allocation • Challenges in “Big Data” Clustering • Data quality • Volume – number of data items • Cardinality – number of clusters • Synergy – measures of similarity • Values – outliers and exceptions • Cluster accuracy – validity and verification • Homogeneous versus heterogeneous data (structured and unstructured data)
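As an illustration of the first algorithm in the list, here is a minimal sketch of K-means (Lloyd's algorithm) on a small unlabelled pattern matrix. The toy data, the function names `farthest_first` and `kmeans`, and the farthest-first initialisation are assumptions made for this example; the slides do not specify an implementation.

```python
import math


def farthest_first(points, k):
    """Deterministic initialisation (an assumption): start at the first point,
    then repeatedly add the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed cluster, so the partition is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Toy unlabelled pattern matrix: eight 2-D objects forming two loose groups.
points = [
    (1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (1.1, 1.3),
    (5.0, 5.2), (5.3, 4.9), (4.8, 5.1), (5.1, 4.7),
]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

The number of clusters `k` has to be supplied by the caller, which is the “Cardinality” challenge listed on the slide.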
  • 134. Distributed Clustering Model Performance • Clustering 100,000 2-D points with 2 clusters on 2.3 GHz quad-core Intel Xeon processors, with 8 GB memory, in the intel07 cluster • Network communication cost increases with the number of processors • Charts: K-means, Kernel K-means
  • 135. Distributed Clustering Models • Clustering 100,000 2-D points with 2 clusters on 2.3 GHz quad-core Intel Xeon processors, with 8 GB memory, in the intel07 cluster • Network communication cost increases with the number of processors
      Number of processors | Speedup Factor - K-means | Speedup Factor - Kernel K-means
      2 | 1.1 | 1.3
      3 | 2.4 | 1.5
      4 | 3.1 | 1.6
      5 | 3.0 | 3.8
      6 | 3.1 | 1.9
      7 | 3.3 | 1.5
      8 | 1.2 | 1.5
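One way to read the speedup figures is to convert them to parallel efficiency, that is, speedup divided by the number of processors. The sketch below uses the values as transcribed from the slide. The efficiency metric and the names `speedup`, `efficiency`, `best_kmeans` and `best_kernel` are additions for illustration and do not appear in the deck.

```python
# Speedup factors as transcribed: processors -> (K-means, Kernel K-means)
speedup = {
    2: (1.1, 1.3),
    3: (2.4, 1.5),
    4: (3.1, 1.6),
    5: (3.0, 3.8),
    6: (3.1, 1.9),
    7: (3.3, 1.5),
    8: (1.2, 1.5),
}


def efficiency(speedup_factor, processors):
    """Parallel efficiency: the fraction of ideal linear speedup achieved."""
    return speedup_factor / processors


for p, (km, kkm) in sorted(speedup.items()):
    print(f"{p} processors: K-means {efficiency(km, p):.2f}, "
          f"Kernel K-means {efficiency(kkm, p):.2f}")

# Processor count at which each algorithm uses its processors most efficiently.
best_kmeans = max(speedup, key=lambda p: efficiency(speedup[p][0], p))
best_kernel = max(speedup, key=lambda p: efficiency(speedup[p][1], p))
print(best_kmeans, best_kernel)
```

Efficiency well below 1.0 at higher processor counts is consistent with the slide's remark that network communication cost grows with the number of processors.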
  • 136. Distributed Clustering Model Performance • Distributed Approximate Kernel K-means • 2-D data set with 2 concentric circles • 2.3 GHz quad-core Intel Xeon processors, with 8 GB memory, in the intel07 cluster • Run-time
      Size of dataset (no. of records) | Benchmark Performance (Speedup Factor)
      10K | 3.8
      100K | 4.8
      1M | 3.8
      10M | 6.4
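Two concentric circles share the same centre, so plain K-means, which splits 2-D data into two clusters with a straight line, cannot separate them. Kernel K-means measures distances in a kernel-induced feature space instead. The following is a minimal single-machine sketch of exact kernel K-means on a toy version of that data set. It is not the distributed approximate variant benchmarked on the slide, and the kernel `radial_kernel`, the ring sizes and the initial labelling are all assumptions chosen for illustration.

```python
import math


def ring(radius, n):
    """n evenly spaced points on a circle centred at the origin."""
    return [
        (radius * math.cos(2 * math.pi * i / n), radius * math.sin(2 * math.pi * i / n))
        for i in range(n)
    ]


def radial_kernel(x, y):
    """Hand-picked kernel k(x, y) = |x|^2 * |y|^2 (an assumption for this toy data)."""
    return (x[0] ** 2 + x[1] ** 2) * (y[0] ** 2 + y[1] ** 2)


def kernel_kmeans(points, labels, k, kernel, max_iter=50):
    """Exact kernel K-means: reassign points using feature-space distances
    computed from the Gram matrix only, until the labels stop changing."""
    n = len(points)
    gram = [[kernel(a, b) for b in points] for a in points]
    for _ in range(max_iter):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        new_labels = []
        for i in range(n):
            dists = []
            for members in clusters:
                if not members:
                    dists.append(math.inf)  # never assign to an empty cluster
                    continue
                size = len(members)
                cross = sum(gram[i][j] for j in members) / size
                within = sum(gram[j][l] for j in members for l in members) / size ** 2
                # Squared distance from point i to the cluster mean in feature space.
                dists.append(gram[i][i] - 2 * cross + within)
            new_labels.append(dists.index(min(dists)))
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Toy data: inner ring (radius 1) followed by outer ring (radius 3).
points = ring(1.0, 20) + ring(3.0, 20)
# Deliberately poor starting partition: a vertical cut through the outer ring.
initial = [1 if x > 1.5 else 0 for x, _ in points]
labels = kernel_kmeans(points, initial, k=2, kernel=radial_kernel)
print(labels)
```

The kernel here depends only on each point's distance from the origin, which suits rings centred there. Real data would need a more general kernel, and the full Gram matrix grows quadratically with the number of records, which is the cost that approximate and distributed variants aim to reduce.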
  • 137. HPCC Clustering Models • High Performance / High Concurrency Real-time Delivery (HPCC)
  • 139. The Cone™ – Brand Loyalty / Affinity