Approaching Big Data: Lesson Plan

Bessie Chu
Director, Global Product Management at GroupM
Leveraging Engagement:
Big Data Lesson
Agenda
What is Big Data?
•  Some Definitions
•  Mixed Methods Approach
Champions League & World Cup Case Study
•  Process
•  Results and Usage
•  Pitfalls and Learnings
Moving Forward
•  Data Approach Flow
•  Caveats
•  Organization and Communication
What is Big Data?
So many different definitions… nobody quite
agrees….
… except that it’s definitely a buzzword
What is Big Data?
It is just generally agreed upon that it’s messy and complex. This
is an opportunity and challenge for us to innovate.
“an all-encompassing term for any collection of data sets so large and complex
that it becomes difficult to process using on-hand data management tools or
traditional data processing applications.”
“Big data is a buzzword, or catch-phrase, used to describe a massive
volume of both structured and unstructured data that is
so large that it's difficult to process using traditional database and
software techniques. In most enterprise scenarios the data is too big or it moves
too fast or it exceeds current processing capacity. Big data has the
potential to help companies improve operations and
make faster, more intelligent decisions.”
“Volume, Variety, Velocity, Variability, Complexity”
Quotes from:
http://www.forbes.com/sites/gilpress/2014/09/03/12-big-data-definitions-whats-yours/2/
http://www.webopedia.com/TERM/B/big_data.html
http://en.wikipedia.org/wiki/Big_data
What Do We Need to Solve Big Data?
… for leveraging engagement at least.
•  Determine Right Questions and Goals for Data
•  Interdisciplinary Approach
•  Iterative Refinement
“Combining the what (quantitative) with the why (qualitative) can
be exponentially powerful.  It is also critical to our ability to take all our
clickstream data and truly analyze it, to find insights that drive
meaningful website changes that will improve our customers’
experiences.” – Avinash Kaushik
Answer:
Mixed Methods and Innovation
Quote from: Web Analytics: An Hour a Day by Avinash Kaushik
CHAMPIONS LEAGUE AND
WORLD CUP BIG DATA
DISCOVERY PROCESS
Annenberg Lab Framework
GOALS
Sports Fan and Engagement
Study Overall Goals for HAVAS
•  to identify and define communities of sports fans based around passion points (A)
•  to analyze fan interactions with those passions (B)
•  to position HAVAS Sports & Entertainment to more effectively advise brands on how to meaningfully engage with sports fans by leveraging passion-based communities (C)
Big Data Research Objectives
External (for Havas)
•  Discover a mixed methodology framework for sports and entertainment fan engagement
Internal (for Lab)
•  Justify our fan logic topology in relation to Twitter conversations through natural language processing
Initial Data Collection Steps
1) Modify data collection process to fit live soccer events, using the Champions League as a test run
2) Establish methodology for seeding the initial pool of users, keywords, and hashtags
3) Analyze tweets and how they fit into logics of engagement
4) Establish methodology for gaining insight from Twitter conversations
“Analyzing Big Data is a BIG JOB
with Many People” – Jake
Inputs & Equipment
•  Keywords, hashtags, and user clusters file in a txt document
•  Dedicated server system collecting information
Engineering
•  Run and modify Python script
•  Register Public Streaming API
•  Parse for results
Live Viewing Team
Team to watch game and look for patterns
Data Collection Process
1) Engineering & Team: Tech and Data Set-Up
2) Engineer: Run Script with Seed File
3) Team: Watch Event for Patterns and Additional Seeds
4) Team: Decide Data to Analyze
5) Engineer: Parse Data into User-Friendly Format
6) Team: Look at Data and prepare for next event
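The seed-file step above can be pictured with a minimal Python sketch. It assumes, purely for illustration, that the seed file holds one term per line, with hashtags starting with # and user handles starting with @; the function names `load_seeds` and `matches_seed` are hypothetical and are not the Lab's actual script.

```python
def load_seeds(lines):
    """Split seed lines into hashtags, user handles, and plain keywords.

    Assumed format (not from the original deck): one term per line.
    """
    seeds = {"hashtags": set(), "users": set(), "keywords": set()}
    for raw in lines:
        term = raw.strip().lower()
        if not term:
            continue  # skip blank lines
        if term.startswith("#"):
            seeds["hashtags"].add(term)
        elif term.startswith("@"):
            seeds["users"].add(term)
        else:
            seeds["keywords"].add(term)
    return seeds


def matches_seed(tweet_text, seeds):
    """Return True if the tweet mentions any seed term (case-insensitive)."""
    text = tweet_text.lower()
    # Strip trailing punctuation so "#worldcup!" still matches "#worldcup".
    tokens = {token.strip(".,!?:;") for token in text.split()}
    if tokens & (seeds["hashtags"] | seeds["users"]):
        return True
    # Keywords may be multi-word phrases, so use substring matching.
    return any(keyword in text for keyword in seeds["keywords"])
```

In use, the team watching the event could append new seeds to the list between matches and re-run the filter, which mirrors the "Additional Seeds" step.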
DATA SEED METHODOLOGY
Initial Keyword Seed Scoping
Keep it simple
Discover through
observations
Soccer Hashtags and Keywords
Official Hashtags
Sponsors
Team Names
Key Terms
Key Players
Headliners
Official Organization Handles
Official Team Handles
Sponsors
Sponsors will often have official hashtags promoted during
sporting events to cross-promote their brand and the sporting
event.
Supporting Characters
•  Superfans: fans with unusual followings on Twitter
•  Sports Commentators: ESPN commentators and the like
•  Prominent Bloggers: blogs or bloggers with a large following on certain teams
Initial Data Seed Scoping Caveats
• Twitter caps the Public API at a couple of thousand tweets per second
• Tweets received through the Public API do not appear to be affected by location-based factors the way individual user feeds are
• Twitter chunks these tweets using an undisclosed algorithm, by whatever it deems important
• The number of tweets scraped renders these factors nominal in terms of large-scale user behavior
ENGAGEMENT HYPOTHESIS &
ANALYSIS
What kind of Tweets or tone in
tweets fit into logics of
engagement?
*Informed by survey and ethnography
Entertainment
Immersion
Social Connection
Identification
Mastery
Pride
Play
Advocacy
Operational Process
Plan for World Cup & Modeling with Beacon Capabilities
See how conversations analyzed from a big data perspective fit and build on the
logics of engagement model
Determine what data frameworks worked in capturing useful information
Initial qualitative look at data
Exercise: Seed Scoping
Questions on Approach Before
We Get Into Analysis?
Big Data
Analysis Process to Dashboard
Big Data
Basic Methods of Analysis
Textual
• Text processing of tweets and plotting using algorithms into agglomerative clusters (aka cool visuals)
• Frequency of terms, associations, and word clouds fall under here
• Goal: Find texts of what spurred the most conversation
Networks
• A way to visually see social connection data
• Understand forms of bonds and the connections between individual data points worth exploring
• Goal: Detecting communities (our clusters, brands)
Sentiment
• Toolkits (such as Hootsuite) that measure “sentiment” using positive and negative language
• Can be used to see if an initiative performed well
• Goal: Measure success of a campaign at different times
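As a rough illustration of the textual method above (frequency of terms), here is a minimal Python sketch using only the standard library. The stopword list and the function name `term_frequencies` are assumptions made for this example, not part of the original analysis.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "for", "on", "rt"}


def term_frequencies(tweets, top_n=10):
    """Count the most frequent terms, hashtags, and mentions in a list of tweets."""
    counts = Counter()
    for text in tweets:
        # Drop URLs first so their fragments are not counted as terms.
        text = re.sub(r"https?://\S+", " ", text.lower())
        # Keep a leading # or @ so hashtags and handles stay distinguishable.
        for token in re.findall(r"[#@]?\w+", text):
            if token not in STOPWORDS and len(token) > 1:
                counts[token] += 1
    return counts.most_common(top_n)
```

The resulting counts are the kind of input a word cloud or a "what spurred the most conversation" ranking could be built from.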
Big Data
Low-Hanging Fruit - Topline
Retweeted Author Screen Name, Retweets
FIFAWorldCup 76172
9GAG 37459
DFB_Team_EN 21247
BBCSport 19564
FCBayern 14782
FTBpro 13409
_Snape_ 11371
benparr 10616
TheTweetOfGod 9435
espn 7465
Queen_UK 7174
thereaIbanksy 7113
sulsultm3 6646
damnitstrue 6603
asshaaban 6513
SportsCenter 6470
fifaworldcup_es 6365
LicDice_ 6361
FIFAworldcup_e 6241
DFB_Team 6114
Argentina 5964
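A topline table like the one above could be produced by tallying which authors get retweeted most. The sketch below is one possible approach in Python; it assumes retweets appear in the classic "RT @handle: ..." text form, and the function name `count_retweeted_authors` is hypothetical.

```python
import re
from collections import Counter

# Assumption: retweets begin with "RT @handle", the classic text-based format.
RT_PATTERN = re.compile(r"^RT @(\w+)")


def count_retweeted_authors(tweets):
    """Return (handle, retweet_count) pairs, most retweeted first."""
    counts = Counter()
    for text in tweets:
        match = RT_PATTERN.match(text)
        if match:
            counts[match.group(1)] += 1
    return counts.most_common()
```

Tweets that are not retweets are simply ignored, so the same function can be run over the full collected stream.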
Big Data
Low-Hanging Fruit – Sentiment Analysis
1) Fan Handles
2) Game Data
3) Brand Data
Integrate insights with Ethnographic and Survey Data for final deliverables
Initial Idealized Approach
•  Survey Twitter Handles
–  See if their online behavior matches survey logics
–  What does the content they’re sharing look like
–  Trends by cluster, gender, other data points
•  Match Data
–  Look for clusters of behavior to events in games
–  See popularity of brand campaigns and behavioral response to brand stories
–  Gain insight from bursts of activity and real-time marketing
–  See what the characteristics of influencers are
•  Brand Data
–  Identify how these strategies were executed in online conversations and responses
–  Identify types of interactions/content/other markers around brands on Twitter
–  Do influential brands mean consistent users interacting across brands? Why are people
interacting in this way? How can we categorize these interactions according to our logic
clusters?
–  Was the content agile?
–  See how users responded by the logics to different types of content
–  Look for differences in fan response and fan-initiated behavior to the brands
Questions and Hypothesis
What We Planned To Do
•  Steps
•  Define interesting WC fan moments and brand moments
•  Examine moments in time and certain brand campaigns
•  Investigate possible Natural Language Processing tools
•  Formulated Questions
•  Timeline
•  Created a timeline assigning roles to each person
•  Deliverables
•  TBD, likely looking at clusters of behavior around brand campaigns.
•  Sentiment analysis may tie in here
Ethnographic Report
- What did people say about the brand or the logics they used?
Survey Data
- Under this brand logic utilized, what is the intensity and who are the clusters?
Big Data
- How did audiences respond online to actions by the brand?
Approaching with Mixed
Methods
Exercise: Group Datasets
Figure out what insight you might be able
to get from each piece of data and how
you would apply mixed methods.
Dashboard Process
The Future of Social Media
Analytics
“We will be moving beyond key-word based
queries into machine-learning algorithms.
Influencers whom I have with with echo
similar ideas about the increasing use and
refine of latent semantic indexing (or some
variant of it) and other machine-learning
algorithms in order to improve social
listening, automatic categorization of
content, and the ability to take action on
data” - Marshall Sponder
Key Learnings for Mood Board
Ethnography
Survey Twitter Data
Brazil
Brought Together All Data
Concept Creation
The Dashboard Build Process
1) Pulled 250 Retweeted Tweets with Verification from BigSheets
2) Coded Tweets According to Logic for Testing Data
3) Built Dictionary According to Sample Tweets, Ethnography, Survey
4) Created Natural Language Processing and Machine Learning Algorithms
5) Fan Engagement Dashboard Prototype Model
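The dictionary step in the build process above can be sketched as a simple keyword classifier. This is only an illustration in Python: the dictionary entries below are invented for the example (the real dictionary came from sample tweets, ethnography, and survey work), and `classify_tweet` is a hypothetical name. Only the logic names come from the Leveraging Engagement framework.

```python
# Hypothetical dictionary entries; only the logic names come from the framework.
LOGIC_DICTIONARY = {
    "Pride": {"proud", "champions", "glory"},
    "Social Connection": {"together", "friends", "watching with"},
    "Mastery": {"stats", "tactics", "formation"},
}


def classify_tweet(text, dictionary=LOGIC_DICTIONARY):
    """Return the logic whose dictionary terms appear most often, or None."""
    lowered = text.lower()
    scores = {
        logic: sum(1 for term in terms if term in lowered)
        for logic, terms in dictionary.items()
    }
    best_logic, best_score = max(scores.items(), key=lambda item: item[1])
    # No dictionary term matched at all: leave the tweet unclassified.
    return best_logic if best_score > 0 else None
```

A dictionary baseline like this is easy to inspect and refine by hand, which is one reason it may be a useful first pass before a trained machine learning model.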
Technology
Collaboration
Innovation Fan Engagement
Dashboard Prototype
jStart Beacon
Custom-Built Twitter Collection Web App
jStart BigSheets
Leveraging Engagement Framework
Annenberg Innovation Lab Fan
Engagement Dashboard built through
collaboration and mixed methods
learning.
67% Accuracy in classifying tweets by
Logic of Engagement leading to
actionable insight and business intelligence
for Leveraging Fan Engagement.
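An accuracy figure of this kind is commonly computed by comparing model output against hand-coded tweets. The Python sketch below shows that calculation under the assumption that accuracy means the share of tweets where the predicted logic equals the human-coded logic; the deck does not state exactly how its figure was derived, and the function name `accuracy` is illustrative.

```python
def accuracy(predicted, coded):
    """Share of tweets where the predicted logic matches the human-coded logic."""
    if len(predicted) != len(coded):
        raise ValueError("predicted and coded must have the same length")
    if not coded:
        raise ValueError("need at least one coded tweet")
    correct = sum(1 for p, c in zip(predicted, coded) if p == c)
    return correct / len(coded)
```

For example, two matches out of three coded tweets gives an accuracy of about 0.67.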
The Process End-to-End
•  Collecting and Managing Data
•  Data Back Up
•  Data Clean Up
•  Run Models
•  Gain Insights
•  Refine Models
•  Learn Actionable Insights
•  Communicate Insights (Reports, Infographic Blueprints)
•  Create Initial Dictionary for Natural Language Processing
•  Annotate/Code Tweets for Training Data for Machine Learning
•  Created Dashboard
•  Improve on Design
Now What?
Moving Forward
Your Challenge
•  Your data will be different client-to-client
•  Twitter is just the beginning
•  You will get to be creative and work on collaborative cross-functional teams to dive into the data
•  *This will be both rewarding and potentially difficult
Tasks Ahead
•  Begin thinking about
what you can learn from
data to help our sponsors
reach their goals
•  Start thinking about how
your fans behave in your
approach to figuring out
what questions to ask the
data
Most Basic Steps
•  Determine Goals
•  Capture Data
•  Curate Data
•  Merge Datasets and Bring Together Methodologies if Necessary
•  Additional Data Processing to Usable Form
•  Deliver Insight to the Client
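The "Merge Datasets" step above might look like the following Python sketch, which joins survey records to tweet counts by Twitter handle. The field names (`handle`, `cluster`, `tweet_count`) and the function `merge_by_handle` are assumptions for illustration; real client data will be shaped differently.

```python
def merge_by_handle(survey_rows, tweet_counts):
    """Attach a tweet count to each survey row, matching on Twitter handle.

    survey_rows: list of dicts with at least a "handle" key (assumed shape).
    tweet_counts: dict mapping lowercase handles (no "@") to counts.
    """
    merged = []
    for row in survey_rows:
        # Normalize handles so "@FanOne" and "fanone" are treated as the same.
        handle = row["handle"].lstrip("@").lower()
        merged.append(
            {**row, "handle": handle, "tweet_count": tweet_counts.get(handle, 0)}
        )
    return merged
```

Survey respondents with no collected tweets keep a count of zero rather than being dropped, so gaps in the big data side stay visible when the methods are brought together.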
Thinking About Process
Bumps in the Road Ahead
•  Privacy Issues and
Respecting the Fans
•  Company layers and
politics – releasing data
from companies is
fraught with back and
forth
•  Getting data into a
usable form
•  Assumptions were wrong
or have to be redefined
– it’s ok to fail fast – but
be ready to keep moving
•  Working in cross-
functional groups
Image from: CapGemini http://www.capgemini.com/sites/default/files/technology-blog/files/2012/09/big-data-vendors.jpg
Cross-Functional Communication
•  Goal
•  Timing
•  Point People
•  Resources Needed
Bring it Together
Draw connections between the data sets
and how they could relate to the eight
logics and situational triggers.
“While social media data are always interesting in
themselves (at least, for an analyst), when business
owners are able to combine data and layer them
efficiently, the information will become more useful
and actionable.” – Marshall Sponder
Thank You
Questions?
1 of 52

Recommended

Applications of Big Data by
Applications of Big DataApplications of Big Data
Applications of Big DataPrashant Kumar Jadia
986 views31 slides
Big data (word file) by
Big data  (word file)Big data  (word file)
Big data (word file)Shahbaz Anjam
810 views14 slides
Big data 2017 final by
Big data 2017   finalBig data 2017   final
Big data 2017 finalAmjid Ali
423 views110 slides
Big data by
Big dataBig data
Big datamadhavsolanki
916 views26 slides
The promise and challenge of Big Data by
The promise and challenge of Big DataThe promise and challenge of Big Data
The promise and challenge of Big DataThe Marketing Distillery
5K views31 slides
Team 2 Big Data Presentation by
Team 2 Big Data PresentationTeam 2 Big Data Presentation
Team 2 Big Data PresentationMatthew Urdan
760 views21 slides

More Related Content

What's hot

The Pros and Cons of Big Data in an ePatient World by
The Pros and Cons of Big Data in an ePatient WorldThe Pros and Cons of Big Data in an ePatient World
The Pros and Cons of Big Data in an ePatient WorldPYA, P.C.
2.6K views36 slides
Big Data Analytics by
Big Data AnalyticsBig Data Analytics
Big Data AnalyticsNapier University
697 views12 slides
Chapter 4 what is data and data types by
Chapter 4  what is data and data typesChapter 4  what is data and data types
Chapter 4 what is data and data typesPro Guide
118 views6 slides
NewMR 2016 presents: 9 Big Applications of Big Data by
NewMR 2016 presents: 9 Big Applications of Big DataNewMR 2016 presents: 9 Big Applications of Big Data
NewMR 2016 presents: 9 Big Applications of Big DataAnnie Pettit, Research Methodologist
1K views15 slides
Big data lecture notes by
Big data lecture notesBig data lecture notes
Big data lecture notesMohit Saini
32.1K views59 slides

What's hot(20)

The Pros and Cons of Big Data in an ePatient World by PYA, P.C.
The Pros and Cons of Big Data in an ePatient WorldThe Pros and Cons of Big Data in an ePatient World
The Pros and Cons of Big Data in an ePatient World
PYA, P.C.2.6K views
Chapter 4 what is data and data types by Pro Guide
Chapter 4  what is data and data typesChapter 4  what is data and data types
Chapter 4 what is data and data types
Pro Guide118 views
Big data lecture notes by Mohit Saini
Big data lecture notesBig data lecture notes
Big data lecture notes
Mohit Saini32.1K views
Big Data can be fun! by Bruno Aziza
Big Data can be fun!Big Data can be fun!
Big Data can be fun!
Bruno Aziza4.1K views
Big Data for Beginners by Michael Perez
Big Data for BeginnersBig Data for Beginners
Big Data for Beginners
Michael Perez5.7K views
Big Data & Future - Big Data, Analytics, Cloud, SDN, Internet of things by Ramakant Gawande
Big Data & Future - Big Data, Analytics, Cloud, SDN, Internet of thingsBig Data & Future - Big Data, Analytics, Cloud, SDN, Internet of things
Big Data & Future - Big Data, Analytics, Cloud, SDN, Internet of things
Ramakant Gawande7.1K views
Big data unit i by Navjot Kaur
Big data unit iBig data unit i
Big data unit i
Navjot Kaur11.8K views
Big data overview external by Brett Colbert
Big data overview externalBig data overview external
Big data overview external
Brett Colbert1.1K views
Big data PPT prepared by Hritika Raj (Shivalik college of engg.) by Hritika Raj
Big data PPT prepared by Hritika Raj (Shivalik college of engg.)Big data PPT prepared by Hritika Raj (Shivalik college of engg.)
Big data PPT prepared by Hritika Raj (Shivalik college of engg.)
Hritika Raj1.2K views
Big Data - 25 Amazing Facts Everyone Should Know by Bernard Marr
Big Data - 25 Amazing Facts Everyone Should KnowBig Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should Know
Bernard Marr487.1K views
Big Data vs. Small Data...what's the difference? by Anna Kuhn
Big Data vs. Small Data...what's the difference?Big Data vs. Small Data...what's the difference?
Big Data vs. Small Data...what's the difference?
Anna Kuhn4.2K views
Big Data Characteristics And Process PowerPoint Presentation Slides by SlideTeam
Big Data Characteristics And Process PowerPoint Presentation SlidesBig Data Characteristics And Process PowerPoint Presentation Slides
Big Data Characteristics And Process PowerPoint Presentation Slides
SlideTeam765 views
10 Most Effective Big Data Technologies by Mahindra Comviva
10 Most Effective Big Data Technologies10 Most Effective Big Data Technologies
10 Most Effective Big Data Technologies
Mahindra Comviva328 views
Data-Ed Webinar: Demystifying Big Data by DATAVERSITY
Data-Ed Webinar: Demystifying Big Data Data-Ed Webinar: Demystifying Big Data
Data-Ed Webinar: Demystifying Big Data
DATAVERSITY4.5K views

Viewers also liked

10. Graph Databases by
10. Graph Databases10. Graph Databases
10. Graph DatabasesFabio Fumarola
7.3K views65 slides
crimes vocabulary by
crimes vocabularycrimes vocabulary
crimes vocabularyIrina K
10.7K views3 slides
Big data ppt by
Big data pptBig data ppt
Big data pptThirunavukkarasu Ps
75K views33 slides
4 Best Practices for Analyzing Healthcare Data by
4 Best Practices for Analyzing Healthcare Data4 Best Practices for Analyzing Healthcare Data
4 Best Practices for Analyzing Healthcare DataHealth Catalyst
191.5K views13 slides

Similar to Approaching Big Data: Lesson Plan

Open Analytics: Building Effective Frameworks for Social Media Analysis by
Open Analytics: Building Effective Frameworks for Social Media AnalysisOpen Analytics: Building Effective Frameworks for Social Media Analysis
Open Analytics: Building Effective Frameworks for Social Media Analysisikanow
3.2K views32 slides
How to use your data science team: Becoming a data-driven organization by
How to use your data science team: Becoming a data-driven organizationHow to use your data science team: Becoming a data-driven organization
How to use your data science team: Becoming a data-driven organizationYael Garten
739 views26 slides
Introduction to Enterprise Search by
Introduction to Enterprise SearchIntroduction to Enterprise Search
Introduction to Enterprise SearchFindwise
1.5K views59 slides
Optimising Your Content for findability by
Optimising Your Content for findabilityOptimising Your Content for findability
Optimising Your Content for findabilityKristian Norling
674 views56 slides
Optimising Your Content for Findability by
Optimising Your Content for FindabilityOptimising Your Content for Findability
Optimising Your Content for FindabilityFindwise
3.2K views57 slides
Introducing Data Driven Tech Leadership: Social media, Google Analytics, and ... by
Introducing Data Driven Tech Leadership: Social media, Google Analytics, and ...Introducing Data Driven Tech Leadership: Social media, Google Analytics, and ...
Introducing Data Driven Tech Leadership: Social media, Google Analytics, and ...Debra Askanase
5K views53 slides

Similar to Approaching Big Data: Lesson Plan (20)

Open Analytics: Building Effective Frameworks for Social Media Analysis by ikanow
Open Analytics: Building Effective Frameworks for Social Media AnalysisOpen Analytics: Building Effective Frameworks for Social Media Analysis
Open Analytics: Building Effective Frameworks for Social Media Analysis
ikanow3.2K views
How to use your data science team: Becoming a data-driven organization by Yael Garten
How to use your data science team: Becoming a data-driven organizationHow to use your data science team: Becoming a data-driven organization
How to use your data science team: Becoming a data-driven organization
Yael Garten739 views
Introduction to Enterprise Search by Findwise
Introduction to Enterprise SearchIntroduction to Enterprise Search
Introduction to Enterprise Search
Findwise1.5K views
Optimising Your Content for findability by Kristian Norling
Optimising Your Content for findabilityOptimising Your Content for findability
Optimising Your Content for findability
Kristian Norling674 views
Optimising Your Content for Findability by Findwise
Optimising Your Content for FindabilityOptimising Your Content for Findability
Optimising Your Content for Findability
Findwise3.2K views
Introducing Data Driven Tech Leadership: Social media, Google Analytics, and ... by Debra Askanase
Introducing Data Driven Tech Leadership: Social media, Google Analytics, and ...Introducing Data Driven Tech Leadership: Social media, Google Analytics, and ...
Introducing Data Driven Tech Leadership: Social media, Google Analytics, and ...
Debra Askanase5K views
Information Search by allerhed
Information SearchInformation Search
Information Search
allerhed614 views
Social media analytics powered by data science by Navin Manaswi
Social media analytics powered by data scienceSocial media analytics powered by data science
Social media analytics powered by data science
Navin Manaswi1.6K views
Data Informed Design - Good Tech Test - May 2018 by Courtney Clark
Data Informed Design - Good Tech Test - May 2018Data Informed Design - Good Tech Test - May 2018
Data Informed Design - Good Tech Test - May 2018
Courtney Clark876 views
Finding Meaning in the Numbers: Making Data-Informed Decisions Across Your Or... by TechSoup Canada
Finding Meaning in the Numbers: Making Data-Informed Decisions Across Your Or...Finding Meaning in the Numbers: Making Data-Informed Decisions Across Your Or...
Finding Meaning in the Numbers: Making Data-Informed Decisions Across Your Or...
TechSoup Canada 1.6K views
Big data for sales and marketing people by Edward Chenard
Big data for sales and marketing peopleBig data for sales and marketing people
Big data for sales and marketing people
Edward Chenard2.6K views
HighRoad Solution Session at AUC016-Creating the Insight-Driven Content Marke... by HighRoad Solution
HighRoad Solution Session at AUC016-Creating the Insight-Driven Content Marke...HighRoad Solution Session at AUC016-Creating the Insight-Driven Content Marke...
HighRoad Solution Session at AUC016-Creating the Insight-Driven Content Marke...
HighRoad Solution198 views
More Than a Feeling: Data-Informed Design by Courtney Clark
More Than a Feeling: Data-Informed DesignMore Than a Feeling: Data-Informed Design
More Than a Feeling: Data-Informed Design
Courtney Clark447 views
Using Google Analytics To Market Your Software Idea by Pierre DeBois
Using Google Analytics To Market Your Software IdeaUsing Google Analytics To Market Your Software Idea
Using Google Analytics To Market Your Software Idea
Pierre DeBois58 views
Social Data Intelligence: Webinar with Susan Etlinger by Susan Etlinger
Social Data Intelligence: Webinar with Susan EtlingerSocial Data Intelligence: Webinar with Susan Etlinger
Social Data Intelligence: Webinar with Susan Etlinger
Susan Etlinger15.7K views
Introduction to enterprise search for intranets and websites by Kristian Norling
Introduction to enterprise search for intranets and websitesIntroduction to enterprise search for intranets and websites
Introduction to enterprise search for intranets and websites
Kristian Norling1.2K views
ScienceOnline impact workshop by SpotOnLondon
ScienceOnline impact workshop ScienceOnline impact workshop
ScienceOnline impact workshop
SpotOnLondon475 views

More from Bessie Chu

Product Management Impact Mapping by
Product Management Impact MappingProduct Management Impact Mapping
Product Management Impact MappingBessie Chu
11 views11 slides
Brand Building for Start-Ups by
Brand Building for Start-UpsBrand Building for Start-Ups
Brand Building for Start-UpsBessie Chu
6 views6 slides
Tweet Tracking App Design Document by
Tweet Tracking App Design DocumentTweet Tracking App Design Document
Tweet Tracking App Design DocumentBessie Chu
704 views24 slides
Simple Slide Design and Data Visualization Crash Course by
Simple Slide Design and Data Visualization Crash CourseSimple Slide Design and Data Visualization Crash Course
Simple Slide Design and Data Visualization Crash CourseBessie Chu
916 views33 slides
Marketing to Female Sports Fans by
Marketing to Female Sports FansMarketing to Female Sports Fans
Marketing to Female Sports FansBessie Chu
4.5K views43 slides
Thesis: Marketing to Female Sports Fans by
Thesis: Marketing to Female Sports Fans Thesis: Marketing to Female Sports Fans
Thesis: Marketing to Female Sports Fans Bessie Chu
3.3K views73 slides

More from Bessie Chu(16)

Product Management Impact Mapping by Bessie Chu
Product Management Impact MappingProduct Management Impact Mapping
Product Management Impact Mapping
Bessie Chu11 views
Brand Building for Start-Ups by Bessie Chu
Brand Building for Start-UpsBrand Building for Start-Ups
Brand Building for Start-Ups
Bessie Chu6 views
Tweet Tracking App Design Document by Bessie Chu
Tweet Tracking App Design DocumentTweet Tracking App Design Document
Tweet Tracking App Design Document
Bessie Chu704 views
Simple Slide Design and Data Visualization Crash Course by Bessie Chu
Simple Slide Design and Data Visualization Crash CourseSimple Slide Design and Data Visualization Crash Course
Simple Slide Design and Data Visualization Crash Course
Bessie Chu916 views
Marketing to Female Sports Fans by Bessie Chu
Marketing to Female Sports FansMarketing to Female Sports Fans
Marketing to Female Sports Fans
Bessie Chu4.5K views
Thesis: Marketing to Female Sports Fans by Bessie Chu
Thesis: Marketing to Female Sports Fans Thesis: Marketing to Female Sports Fans
Thesis: Marketing to Female Sports Fans
Bessie Chu3.3K views
Video Game Journalism: Content Analysis by Bessie Chu
Video Game Journalism: Content AnalysisVideo Game Journalism: Content Analysis
Video Game Journalism: Content Analysis
Bessie Chu944 views
Comparative Broadband Policy: Hong Kong, Singapore, and Taiwan by Bessie Chu
Comparative Broadband Policy: Hong Kong, Singapore, and TaiwanComparative Broadband Policy: Hong Kong, Singapore, and Taiwan
Comparative Broadband Policy: Hong Kong, Singapore, and Taiwan
Bessie Chu1.4K views
Mad Men and Scandal: Marketing via Fan Tastes by Bessie Chu
Mad Men and Scandal: Marketing via Fan TastesMad Men and Scandal: Marketing via Fan Tastes
Mad Men and Scandal: Marketing via Fan Tastes
Bessie Chu979 views
Snapbasket Survey Results by Bessie Chu
Snapbasket Survey ResultsSnapbasket Survey Results
Snapbasket Survey Results
Bessie Chu421 views
Data Visualization as Narrative? by Bessie Chu
Data Visualization as Narrative?Data Visualization as Narrative?
Data Visualization as Narrative?
Bessie Chu689 views
Design and Material Culture: Taiwan Case Study by Bessie Chu
Design and Material Culture: Taiwan Case StudyDesign and Material Culture: Taiwan Case Study
Design and Material Culture: Taiwan Case Study
Bessie Chu1.4K views
Amazon Web Services SWOT & Competitor Analysis by Bessie Chu
Amazon Web Services SWOT & Competitor AnalysisAmazon Web Services SWOT & Competitor Analysis
Amazon Web Services SWOT & Competitor Analysis
Bessie Chu36.7K views
Social Good App for EBT Users: SNAPBasket by Bessie Chu
Social Good App for EBT Users: SNAPBasketSocial Good App for EBT Users: SNAPBasket
Social Good App for EBT Users: SNAPBasket
Bessie Chu769 views
Amazon Web Services SWOT by Bessie Chu
Amazon Web Services SWOTAmazon Web Services SWOT
Amazon Web Services SWOT
Bessie Chu6.4K views
Square Payments Class Presentation by Bessie Chu
Square Payments Class PresentationSquare Payments Class Presentation
Square Payments Class Presentation
Bessie Chu1.1K views

Recently uploaded

Construction Accidents & Injuries by
Construction Accidents & InjuriesConstruction Accidents & Injuries
Construction Accidents & InjuriesBisnar Chase Personal Injury Attorneys
6 views5 slides
[DSC Europe 23] Matteo Molteni - Implementing a Robust CI Workflow with dbt f... by
[DSC Europe 23] Matteo Molteni - Implementing a Robust CI Workflow with dbt f...[DSC Europe 23] Matteo Molteni - Implementing a Robust CI Workflow with dbt f...
[DSC Europe 23] Matteo Molteni - Implementing a Robust CI Workflow with dbt f...DataScienceConferenc1
5 views18 slides
Oral presentation (1).pdf by
Oral presentation (1).pdfOral presentation (1).pdf
Oral presentation (1).pdfreemalmazroui8
5 views10 slides
[DSC Europe 23][AI:CSI] Dragan Pleskonjic - AI Impact on Cybersecurity and P... by
[DSC Europe 23][AI:CSI]  Dragan Pleskonjic - AI Impact on Cybersecurity and P...[DSC Europe 23][AI:CSI]  Dragan Pleskonjic - AI Impact on Cybersecurity and P...
[DSC Europe 23][AI:CSI] Dragan Pleskonjic - AI Impact on Cybersecurity and P...DataScienceConferenc1
8 views36 slides
[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation by
[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation
[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented GenerationDataScienceConferenc1
19 views29 slides
Inawsidom - Data Journey by
Inawsidom - Data JourneyInawsidom - Data Journey
Inawsidom - Data JourneyPhilipBasford
8 views38 slides

Recently uploaded(20)

[DSC Europe 23] Matteo Molteni - Implementing a Robust CI Workflow with dbt f... by DataScienceConferenc1
[DSC Europe 23] Matteo Molteni - Implementing a Robust CI Workflow with dbt f...[DSC Europe 23] Matteo Molteni - Implementing a Robust CI Workflow with dbt f...
[DSC Europe 23] Matteo Molteni - Implementing a Robust CI Workflow with dbt f...
[DSC Europe 23][AI:CSI] Dragan Pleskonjic - AI Impact on Cybersecurity and P... by DataScienceConferenc1
[DSC Europe 23][AI:CSI]  Dragan Pleskonjic - AI Impact on Cybersecurity and P...[DSC Europe 23][AI:CSI]  Dragan Pleskonjic - AI Impact on Cybersecurity and P...
[DSC Europe 23][AI:CSI] Dragan Pleskonjic - AI Impact on Cybersecurity and P...
[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation by DataScienceConferenc1
[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation

Approaching Big Data: Lesson Plan

  • 2. Agenda What is Big Data? •  Some Definitions •  Mixed Methods Approach Champions League & World Cup Case Study •  Process •  Results and Usage •  Pitfalls and Learnings Moving Forward •  Data Approach Flow •  Caveats •  Organization and Communication
  • 3. What is Big Data? So many different definitions… nobody quite agrees…. … except that it’s definitely a buzzword
  • 4. What is Big Data? It is just generally agreed upon that it’s messy and complex. This is an opportunity and challenge for us to innovate. “an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process using on-hand data management tools or traditional data processing applications.” “Big data is a buzzword, or catch-phrase, used to describe a massive volume of both structured and unstructured data that is so large that it's difficult to process using traditional database and software techniques. In most enterprise scenarios the data is too big or it moves too fast or it exceeds current processing capacity. Big data has the potential to help companies improve operations and make faster, more intelligent decisions.” “Volume, Variety, Velocity, Variability, Complexity” Quotes from: http://www.forbes.com/sites/gilpress/2014/09/03/12-big-data-definitions-whats-yours/2/ http://www.webopedia.com/TERM/B/big_data.html http://en.wikipedia.org/wiki/Big_data
  • 5. What Do We Need to Solve Big Data? … for leveraging engagement at least.
  • 6. … for leveraging engagement at least. Determine Right Questions and Goals for Data • Interdisciplinary Approach • Iterative Refinement “Combining the what (quantitative) with the why (qualitative) can be exponentially powerful. It is also critical to our ability to take all our clickstream data and truly analyze it, to find insights that drive meaningful website changes that will improve our customers’ experiences.” – Avinash Kaushik Answer: Mixed Methods and Innovation Quote from: Web Analytics: An Hour a Day by Avinash Kaushik
  • 7. CHAMPIONS LEAGUE AND WORLD CUP BIG DATA DISCOVERY PROCESS Annenberg Lab Framework
  • 9. Sports Fan and Engagement Study Overall Goals for HAVAS •  to identify and define communities of sports fans based around passion points (A) •  to analyze fan interactions with those passions (B) •  to position HAVAS Sports & Entertainment to more effectively advise brands on how to meaningfully engage with sports fans by leveraging passion-based communities (C)
  • 10. Big Data Research Objectives •  Discover a mixed methodology framework for sports and entertainment fan engagement External for Havas •  Justify our fan logic topology in relation to Twitter conversations through natural language processing Internal for Lab
  • 11. Initial Data Collection Steps 1) Modify the data collection process to fit live soccer events, using the Champions League as a test run 2) Establish a methodology for seeding the initial pool of users, keywords, and hashtags 3) Analyze tweets and how they fit into the logics of engagement 4) Establish a methodology for gaining insight from Twitter conversations
  • 12. “Analyzing Big Data is a BIG JOB with Many People” – Jake Inputs & Equipment: keywords, hashtags, and user clusters in a txt file; dedicated server system collecting information. Engineering: run and modify Python script; register for the Public Streaming API; parse for results. Live Viewing Team: team to watch the game and look for patterns.
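The seed inputs on this slide (keywords, hashtags, and user handles kept in a txt file) can be illustrated with a small sketch. The file layout, the seed terms, and the function names `parse_seeds` and `matches_seed` are all assumptions made for illustration; the deck does not show the lab's actual script, and no Twitter API call is made here.

```python
import re

# Hypothetical seed-file layout: one term per line. '#' marks a hashtag,
# '@' marks a user handle, anything else is treated as a plain keyword.
SEED_TEXT = """\
#WorldCup
#UCL
@FIFAWorldCup
penalty
offside
"""

def parse_seeds(text):
    """Split seed lines into hashtags, handles, and keywords (lowercased)."""
    seeds = {"hashtags": set(), "handles": set(), "keywords": set()}
    for line in text.splitlines():
        term = line.strip().lower()
        if not term:
            continue
        if term.startswith("#"):
            seeds["hashtags"].add(term)
        elif term.startswith("@"):
            seeds["handles"].add(term)
        else:
            seeds["keywords"].add(term)
    return seeds

def matches_seed(tweet_text, seeds):
    """Return True if the tweet contains any seeded hashtag, handle, or keyword."""
    tokens = set(re.findall(r"[#@]?\w+", tweet_text.lower()))
    all_terms = seeds["hashtags"] | seeds["handles"] | seeds["keywords"]
    return bool(tokens & all_terms)

seeds = parse_seeds(SEED_TEXT)
sample = ["What a penalty save! #UCL", "Lunch was great today"]  # made-up tweets
kept = [t for t in sample if matches_seed(t, seeds)]
```

Keeping the seeds in a plain text file means the live viewing team can add terms they spot during a match without touching the script.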
  • 13. Data Collection Process Engineering & Team: Tech and Data Set-Up Engineer: Run Script with Seed File Team: Watch Event for Patterns and Additional Seeds Team: Decide Data to Analyze Engineer: Parse Data into User-Friendly Format Team: Look at Data and prepare for next event
  • 15. Initial Keyword Seed Scoping Keep it simple Discover through observations
  • 16. Soccer Hashtags and Keywords Official Hashtags Sponsors Team Names Key Terms Key Players
  • 17. Headliners Official Organization Handles Official Team Handles Official Hashtags Sponsors Team Names Key Terms Key Players
  • 18. Sponsors Sponsors will often have official hashtags promoted during sporting events to cross-promote their brand and the sporting event. Official Hashtags Sponsors Team Names Key Terms Key Players
  • 19. Supporting Characters Superfans -Fans with unusual followings on Twitter Sports Commentators -ESPN commentators and the like Prominent Bloggers -Blogs or bloggers with large following on certain teams
  • 20. Initial Data Seed Scoping Caveats • Twitter caps the Public API at a couple of thousand tweets per second • Tweets received through the Public API do not appear to be affected by location-based factors the way individual user feeds are • Twitter selects these tweets with an undisclosed algorithm, based on what it deems important • The number of tweets scraped renders these factors negligible in terms of large-scale user behavior
  • 22. What kind of Tweets or tone in tweets fit into logics of engagement? *Informed by survey and ethnography Entertainment Immersion Social Connection Identification Mastery Pride Play Advocacy
  • 23. Operational Process Plan for World Cup & Modeling with Beacon Capabilities See how conversations analyzed from a big data perspective fit and build on the logics of engagement model Determine what data frameworks worked in capturing useful information Initial qualitative look at data
  • 25. Questions on Approach Before We Get Into Analysis?
  • 27. Big Data Basic Methods of Analysis Textual: • Text processing of tweets and plotting using algorithms into agglomerative clusters (aka cool visuals) • Frequency of terms, associations, and word clouds fall under here • Goal: Find texts of what spurred the most conversation Networks: • A way to visually see social connection data • Understand forms of bonds and the connections between individual data points worth exploring • Goal: Detecting communities (our clusters, brands) Sentiment: • Toolkits (such as Hootsuite) that measure “sentiment” using positive and negative language • Can be used to see if an initiative performed well • Goal: Measure success of a campaign at different times
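As a rough illustration of the textual method named on this slide (frequency of terms), the sketch below counts tokens across a handful of tweets. The stopword list and sample tweets are invented for the example, and this is only one simple way to do it, not necessarily the tooling the lab used.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it",
             "for", "on", "what", "rt"}

def term_frequencies(tweets):
    """Count how often each word or hashtag appears across all tweets."""
    counts = Counter()
    for text in tweets:
        # Drop URLs so link fragments do not pollute the counts.
        text = re.sub(r"https?://\S+", " ", text.lower())
        for token in re.findall(r"#?\w+", text):
            if token not in STOPWORDS and len(token) > 1:
                counts[token] += 1
    return counts

tweets = [  # made-up sample tweets
    "GOAL! What a goal for Germany #WorldCup",
    "Germany on fire tonight #WorldCup",
    "The goal of the tournament? #WorldCup",
]
counts = term_frequencies(tweets)
top_terms = counts.most_common(3)
```

The same counts can feed a word cloud or be compared across time windows to see which terms spiked around a given match event.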
  • 28. Big Data Low-Hanging Fruit - Topline RT Author Screenname and count: FIFAWorldCup 76172; 9GAG 37459; DFB_Team_EN 21247; BBCSport 19564; FCBayern 14782; FTBpro 13409; _Snape_ 11371; benparr 10616; TheTweetOfGod 9435; espn 7465; Queen_UK 7174; thereaIbanksy 7113; sulsultm3 6646; damnitstrue 6603; asshaaban 6513; SportsCenter 6470; fifaworldcup_es 6365; LicDice_ 6361; FIFAworldcup_e 6241; DFB_Team 6114; Argentina 5964
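A topline like this one can be approximated by tallying which accounts are being retweeted. The sketch below assumes tweets are available as plain text and that retweets follow the conventional `RT @screenname:` prefix; the sample tweets and the function name `retweeted_author_counts` are hypothetical.

```python
import re
from collections import Counter

# Conventional retweet prefix: "RT @screenname: ..."
RT_PATTERN = re.compile(r"^RT @(\w+)")

def retweeted_author_counts(tweets):
    """Tally how many collected tweets are retweets of each author."""
    counts = Counter()
    for text in tweets:
        match = RT_PATTERN.match(text)
        if match:
            counts[match.group(1)] += 1
    return counts

tweets = [  # made-up sample tweets
    "RT @FIFAWorldCup: Full time in Rio!",
    "RT @FIFAWorldCup: Kick-off in ten minutes",
    "RT @BBCSport: Germany are world champions",
    "Watching the final with friends",
]
topline = retweeted_author_counts(tweets).most_common()
```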
  • 29. Big Data Low-Hanging Fruit – Sentiment Analysis
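For readers unfamiliar with how "positive and negative language" turns into a score, here is a minimal lexicon-based sketch. The word lists are invented and far too small for real use, and commercial toolkits use their own proprietary methods; this only shows the basic idea.

```python
import re

# Assumed toy lexicons; real sentiment lexicons are much larger.
POSITIVE = {"great", "love", "amazing", "win", "brilliant"}
NEGATIVE = {"awful", "hate", "terrible", "lose", "boring"}

def sentiment_score(text):
    """Positive-word count minus negative-word count for one tweet."""
    tokens = re.findall(r"\w+", text.lower())
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives

def sentiment_label(text):
    """Map the score to a coarse positive / negative / neutral label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

examples = [  # made-up sample tweets
    "Amazing goal, love this team",
    "Terrible refereeing, boring match",
    "Kick-off at nine",
]
labels = [sentiment_label(t) for t in examples]
```

A word-count approach like this misses sarcasm and negation ("not great"), which is one reason to read sentiment numbers alongside the qualitative data.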
  • 30. Initial Idealized Approach 1. Fan Handles 2. Game Data 3. Brand Data Integrate insights with Ethnographic and Survey Data for final deliverables
  • 31. •  Survey Twitter Handles –  See if their online behavior matches survey logics –  What does the content they’re sharing look like –  Trends by cluster, gender, other data points •  Match Data –  Look for clusters of behavior to events in games –  See popularity of brand campaigns and behavioral response to brand stories –  Gain insight from bursts of activity and real-time marketing –  See what are characteristics of influencers •  Brand Data –  Identify how these strategies were executed in online conversations and responses –  Identify types of interactions/content/other markers around brands on Twitter –  Do influential brands mean consistent users interacting across brands? Why are people interacting in this way? How can we categorize these interactions according to our logic clusters? –  Was the content agile? –  See how users responded by the logics to different types of content –  Look for differences in fan response and fan-initiated behavior to the brands Questions and Hypothesis
  • 32. What We Planned To Do •  Steps •  Define interesting WC fan moments and brand moments •  Examine moments in time and certain brand campaigns •  Investigate possible Natural Language Processing tools •  Formulated Questions •  Timeline •  Created a timeline assigning roles to each person •  Deliverables •  TBD, likely looking at clusters of behavior around brand campaigns. •  Sentiment analysis may tie in here
  • 33. Ethnographic Report -What did people say about the brand or the logics they used? Survey Data -Under this brand logic utilized, what is the intensity and who are the clusters? Big Data -How did audiences respond online to actions by the brand? Approaching with Mixed Methods
  • 34. Exercise: Group Datasets Figure out what insight you might be able to get from each piece of data and how would you apply mixed methods.
  • 36. The Future of Social Media Analytics “We will be moving beyond key-word based queries into machine-learning algorithms. Influencers whom I have [spoken] with echo similar ideas about the increasing use and [refinement] of latent semantic indexing (or some variant of it) and other machine-learning algorithms in order to improve social listening, automatic categorization of content, and the ability to take action on data” - Marshall Sponder
  • 37. Key Learnings for Mood Board Ethnography Survey Twitter Data Brazil Brought Together All Data
  • 39. The Dashboard Build Process Pulled 250 Retweeted Tweets with Verification from BigSheets Coded Tweets According to Logic for Testing Data Built Dictionary According to Sample Tweets, Ethnography, Survey Created Natural Language Processing and Machine Learning Algorithms Fan Engagement Dashboard Prototype
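The build steps on this slide (hand-code a sample of tweets by logic, build a dictionary, then test a classifier against the coded sample) can be sketched in a few lines. This is a dictionary-only stand-in, not the lab's natural language processing and machine learning pipeline; the dictionary entries, sample tweets, and function names are all hypothetical, and only three of the eight logics are shown.

```python
import re

# Hypothetical dictionary: a few cue words per logic of engagement.
LOGIC_DICTIONARY = {
    "Pride": {"proud", "champions", "our"},
    "Mastery": {"stats", "tactics", "formation"},
    "Social Connection": {"friends", "together", "family"},
}

def classify(text, dictionary=LOGIC_DICTIONARY):
    """Return the logic whose cue words appear most often, or None if none match."""
    tokens = re.findall(r"\w+", text.lower())
    scores = {
        logic: sum(token in words for token in tokens)
        for logic, words in dictionary.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

def accuracy(coded_tweets):
    """Share of hand-coded tweets where the predicted logic matches the code."""
    correct = sum(classify(text) == logic for text, logic in coded_tweets)
    return correct / len(coded_tweets)

coded = [  # made-up tweets with made-up hand codes
    ("So proud of our boys tonight", "Pride"),
    ("That 4-3-3 formation killed the tactics battle", "Mastery"),
    ("Watching with friends and family", "Social Connection"),
    ("What a night", "Pride"),
]
acc = accuracy(coded)
```

Tweets with no cue words (like the last sample) come back as `None`, which is where a trained model or a richer dictionary informed by the ethnography and survey would be expected to help.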
  • 40. Model Technology Collaboration Innovation Fan Engagement Dashboard Prototype jStart Beacon Custom-Built Twitter Collection Web App jStart BigSheets Leveraging Engagement Framework
  • 41. Annenberg Innovation Lab Fan Engagement Dashboard built through collaboration and mixed methods learning. 67% Accuracy in classifying tweets by Logic of Engagement leading to actionable insight and business intelligence for Leveraging Fan Engagement.
  • 43. The Process End-to-End Collecting and Managing Data Data Back Up Data Clean Up Run Models Gain Insights Refine Models Learn Actionable Insights Communicate Insights (Reports, Infographic Blueprints) Create Initial Dictionary for Natural Language Processing Annotate/Code Tweets for Training Data for Machine Learning Created Dashboard Improve on Design
  • 45. Moving Forward Your Challenge •  Your data will be different client-to-client •  Twitter is just the beginning •  You will get to be creative and work on collaborative cross-functional teams to dive into the data •  *This will be both rewarding and potentially difficult Tasks Ahead •  Begin thinking about what you can learn from data to help our sponsors reach their goals •  Start thinking about how your fans behave in your approach to figuring out what questions to ask the data
  • 46. Most Basic Steps Determine Goals Capture Data Curate Data Merge Datasets and Bring Together Methodologies if Necessary Additional Data Processing to Usable Form Deliver Insight to the Client
  • 48. Bumps in the Road Ahead •  Privacy Issues and Respecting the Fans •  Company layers and politics – releasing data from companies is fraught with back and forth •  Getting data into a usable form •  Assumptions were wrong or have to be redefined – it’s ok to fail fast – but be ready to keep moving •  Working in cross-functional groups
  • 49. Image from: CapGemini http://www.capgemini.com/sites/default/files/technology-blog/files/2012/09/big-data-vendors.jpg
  • 50. Cross-Functional Communication Goal • Timing • Point People • Resources Needed
  • 51. Bring it Together Draw connections between the data sets and how could they relate to the eight logics and situational triggers. “While social media data are always interesting in themselves (at least, for an analyst), when business owners are able to combine data and layer them efficiently, the information will become more useful and actionable.” – Marshall Sponder