Choosing a Big Data Technology Stack for Digital Marketing


Confused about the right technologies for your digital marketing and analytics needs? You’re not alone. The challenges are complex and the range of possible solutions potentially bewildering.

In this webinar, Gary Angel, President and CTO of Semphonic, and Krishnan Parasuraman, CTO of IBM Big Data Solutions, demonstrate a common-sense approach to finding the right solutions for your company. Underlying the approach is a deep intellectual framework that highlights why digital marketing analysis requires new technologies and approaches. Using that framework, a whole range of specific digital marketing tasks (from full attribution analysis to social media marketing) is examined in light of the unique stresses each places on your technology stack. The result? A powerful way to match your specific business goals and requirements to the array of new technologies now coming online.
If you’re thinking about, evaluating, planning or designing a technology stack for digital marketing, this webinar is specifically for you.

What you’ll take away:
1. A much deeper understanding of why certain types of analysis in digital challenge traditional architectures – even when data volumes aren’t enormous
2. A true working definition of “big data” and a way to evaluate what it means beyond the hype
3. A detailed examination of the most common digital marketing business requirements in terms of their specific challenges to your technology stack
4. A good understanding of how IBM’s Big Data Solutions fit within that framework

See the recording of this webinar at

Published in: Technology

Choosing a Big Data Technology Stack for Digital Marketing

  1. 1. Choosing a Big Data Technology Stack for Digital Marketing • Gary Angel, President and CTO • Krishnan Parasuraman, CTO, IBM Big Data Solutions
  2. 2. Your Hosts • Gary Angel, Semphonic President and Co-Founder – 20+ years of experience with BI and database marketing – 15 years of experience with digital measurement – Leading industry expert, speaker, blogger and Semphonic practice leader for advanced analytics – Selected: Digital Analytics Association (formerly WAA) Most Influential Industry Contributor, 2012 • Krishnan Parasuraman, CTO Big Data Solutions, IBM – 15+ years of experience with large-scale distributed information systems – Background in product development, consulting and technology management – Leading authority on big data technologies such as massively parallel data warehousing and Hadoop – Author of the book Harness the Power of Big Data
  3. 3. Talking Points for today's discussion • Challenges with Digital Marketing and Analytics. What makes this problem so unique and different? • Why is it hard to use traditional database technologies to analyze Digital Data? • What type of framework would we use to evaluate and select the right technology stack for Digital Analytics needs? • How does IBM's stack address the needs of Digital Marketing?
  4. 4. Introducing Semphonic • Founded in 1997 and exclusively focused on digital measurement and digital customer analytics • Deep expertise in traditional Web analytics solutions (Omniture, GA Premium, IBM, etc.) AND in the use of advanced technologies for warehousing, integrating, and analyzing digital data • Practice focused on high-end customer analytics including: – Digital Segmentation – Site Optimization and Personalization – Customer Analytics – Attribution Analysis & Media Mix Modeling – Digital Data Models for the Warehouse
  5. 5. Two Worlds Divided: BI & Customer Analytics | Web Analytics and Digital • Traditional BI and Customer Analytics teams have deep methods and powerful tools. But digital data is surprisingly different and challenging. • Digital Measurement professionals lack the tools, the methodology, and the expertise to do multi-channel analytics.
  6. 6. Our Goal is to Bring Them Together [Diagram labels: Statistical Models; Proven; Actionable; Online Behavior; Demographics; Email Marketing Database; Web Marketing Analytics; Database-Driven; Event Driven; Social; Old; SaaS; List Enhancement; Customer Driven]
  7. 7. Digital Analytics in a Big-Data World
  8. 8. Digital Analytics is a Paradigm Big Data Application. Digital Measurement is a paradigm case of big data: • Lots of data – Millions (hundreds of millions?) of events per day – Lots of data per event • Lots of key High Cardinality variables – Page Name, Product Sets, Referrers, Campaigns, Keywords – and Customers • Focus on Detail-Level Analytics: – Customer Lifetime Value – Full multi-touch attribution • Lack of meaning at the Row-Level – In digital, meaning exists in a collection of records.
  9. 9. Why it's Challenging. These unique aspects of digital data make it difficult for most traditional technology stacks to support effective digital measurement and analysis: • Large Row Volumes – Defeats systems not set up to optimize full table scans – Often creates unmanageable indexing sizes – Creates basic load and availability issues • High Cardinality – Defeats many classic OLAP strategies – Forces full-table or index scans of the data • Focus on Detail Level Analytics – Defeats aggregation strategies – No opportunity for fixed aggregates to succeed • Lack of Meaning at the Row Level – Defeats simple aggregation strategies – Defeats traditional row-based ETL
  10. 10. Digital Data: Meaning & Integration
  11. 11. Why Digital Data IS DIFFERENT • Here's why your traditional BI and Customer Analytics folks struggle with Digital: – There are no domain experts – Nearly all digital data is stream data – Unlike transaction data, digital streams don't aggregate cleanly – Digital Data often contains a hidden topographic structure • Callouts: – Traditional data modeling relied on domain experts. These don't exist in digital. – Most modeling and analysis systems provide row-based analysis. Analytics data is stream data. – Unlike Transaction data, digital stream data doesn't aggregate. – Hidden Structure skews Basic Statistical Analysis.
  12. 12. Talking Streams (More detail because this is hard to convey)
  13. 13. Aggregation of Streams • Aggregation of streams is critical to effective digital measurement – This isn't because of performance (though it helps) – Digital data has meaning as a collection, not a single row – So no matter how powerful your processing system, you need to understand whole sequences of behavior • Traditional aggregation doesn't work: [Diagram: Transaction → Total Transactions; Page View → Total Page Views]
  14. 14. Why Streams Matter • The single biggest driver of digital analytics measurement is the need to de-silo data. • Proper answers to ALL of these questions require multi-channel data integration. – Where do mobile apps fit in the broader customer journey? – How does web engagement translate into offline sales? – How do my best offline customers use the digital channel? – What's the Predicted Lifetime Value of a Digital Lead? – What's the value of a Facebook Fan? – What impact does Positive Social Chatter have on Brand Affinity? – What content on my Website is most effective?
  15. 15. A Quick Primer on Joins • Joining one type of Customer Record to another yields a single row per customer with an easy to use combined record. • Joining a Customer Record to Geo or Census data yields a single row per customer with an easy to use record.
  16. 16. And Why Streams Are a Pain • Combining streams like digital and mobile, even with a join key, just yields two distinct streams. The join doesn't simplify analysis. • Putting multiple digital data sources on the same box WITH join keys doesn't really solve the problem.
  17. 17. Every Sub-Channel has Distinct Streams • One of the HUGE challenges facing a digital technology stack is that almost every digital source is quite different. – One of the most common failure points we see is the assumption that putting the data in one place makes it useful. – Given the challenges of stream analytics in a single sub-channel, asking the analyst to join streams on unmodified data is overly optimistic. – An effective data model has to provide a means of unifying sub-channels in a coherent structure.
  18. 18. Choosing the Right Digital Technology Stack
  19. 19. Don't Have a Homer Moment (D'oh) • Creating a strong foundation for assessment begins with your business purposes. Each of these puts different stresses on the underlying technology stack: – Advanced Web Analytics – Customer Modeling – Personalization – Email Targeting – Site Personalization – Loyalty Program Analytics – Merchandising Analytics – Enterprise Dashboarding – Social Media Analytics – Operations (Call Avoidance, etc.)
  20. 20. Here are the Key Decision Vectors • We've matched the business functions to the following key attributes of various big data technology stacks: • The goal is to help you assess what technology trade-offs best fit your needs.
  21. 21. Decision Vectors [Two radar charts, "Advanced Web Analytics" and "Advanced Web Analytics & Hadoop", with an axis scale of 0 to 90. Axes: Handling Huge Volume; Uptime/Load Without Disruption; Minimize Data Modeling; Easy Data Integration; Minimize Administration; Support Integrated Marketing Solutions; Real-time Support; Support BI Tools; Support Algorithmic Queries; Support Stats Tools; Expertise Available. Series: Advanced Web Analytics, Hadoop]
  22. 22. And Here's a Snapshot of the Decision Matrix
  23. 23. Most Common Failure Points. Here are some common risk points: • Data Windows – Insisting on Too Much History – Using a single technology • ETL – Keeping too much data – Missing Join Keys • Integration – Failure to Reckon with Streams – Assumption that a key is all that's necessary • Analytics – Ad Hoc Effort instead of up-front segmentation – Failure to understand Topology • Data Democratization – Lack of structure – Tool Complexity • Real-time – Single Technology Stack – Unrealistic expectations
  24. 24. Digital Analytics is a Paradigm Big Data Application. Digital Measurement is a paradigm case of big data: • Lots of data – Millions (hundreds of millions?) of events per day – Lots of data per event • Lots of key High Cardinality variables – Page Name, Product Sets, Referrers, Campaigns, Keywords – and Customers • Focus on Detail-Level Analytics: – Customer Lifetime Value – Full multi-touch attribution • Lack of meaning at the Row-Level – In digital, meaning exists in a collection of records.
  25. 25. Digital Analytics is a Paradigm Big Data Application. Digital Measurement is a paradigm case of big data: • Lots of data – Millions (hundreds of millions?) of events per day – Lots of data per event • Lots of key High Cardinality variables – Page Name, Product Sets, Referrers, Campaigns, Keywords – and Customers • Focus on Detail-Level Analytics: – Customer Lifetime Value – Full multi-touch attribution • Lack of meaning at the Row-Level – In digital, meaning exists in a collection of records. [Overlay graphic: BIG DATA PLATFORM]
  26. 26. The Big Data Platform Requirements • Analyze Extreme Volumes of Data – Online, Offline, Social, Behavior, First Party & Third Party across multiple channels • Analyze Wide Variety of Data – Structured: POS, 3rd Party, Transactions – Unstructured: Social, Video, Blogs – Semi-Structured: Cookies, Impressions • Analyze Data in Real Time – Product Recommendations, Real Time offers, Targeted Ads in Real Time • Discover & Experiment – Ad-hoc analytics, data discovery & experimentation • Governance – Enforce data structure, integrity and control to ensure consistency [Diagram: data sources feeding the BIG DATA PLATFORM. Online: Impressions, Cookies, Registrations, Purchase Transactions, In-Market Intent. Social: Influence, Sentiments, Followers, Recommendations, Likes. 3rd Party: Psychographic surveys, Geo-Demographic Segments. Offline: Transactions, Responses]
  27. 27. IBM's Big Data Platform • Netezza – Extreme Performance – In-Database Analytics – Scalable Appliance • Streams – Act on Data "In-Motion" – Real time analytics – Alerts/Actions • Big Insights – Hadoop / Unstructured Data – Complex Analytics [Diagram: data sources feeding the BIG DATA PLATFORM. Online: Impressions, Cookies, Registrations, Purchase Transactions, In-Market Intent. Social: Influence, Sentiments, Followers, Recommendations, Likes. 3rd Party: Psychographic surveys, Geo-Demographic Segments. Offline: Transactions, Responses]
  28. 28. Audience Q&A • Download the Full Whitepaper at: http://-big-data-technology-stack.html • Learn more about IBM's big data solutions at: http:// http://
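
Slide 13's point that digital data has meaning as a collection rather than as a single row can be illustrated with a small sketch. The Python below is hypothetical: the event tuples, the 30-minute session gap, and the function names are assumptions made for illustration and do not come from the deck. It contrasts a transaction-style total with a grouping that keeps each visitor's ordered sequence of pages.

```python
from collections import defaultdict

# Hypothetical event rows: (visitor_id, timestamp_seconds, page_name).
events = [
    ("v1", 0, "home"),
    ("v1", 40, "product"),
    ("v1", 95, "cart"),
    ("v1", 130, "checkout"),
    ("v2", 10, "home"),
    ("v2", 25, "product"),
    ("v2", 5000, "home"),
]

SESSION_GAP = 1800  # assumed inactivity timeout of 30 minutes, in seconds


def total_page_views(rows):
    """Transaction-style aggregate: one page-view count per visitor."""
    totals = defaultdict(int)
    for visitor, _, _ in rows:
        totals[visitor] += 1
    return dict(totals)


def sessions(rows):
    """Group each visitor's events into ordered page sequences (sessions)."""
    by_visitor = defaultdict(list)
    for visitor, ts, page in sorted(rows, key=lambda r: (r[0], r[1])):
        by_visitor[visitor].append((ts, page))

    result = {}
    for visitor, hits in by_visitor.items():
        visitor_sessions, current, last_ts = [], [], None
        for ts, page in hits:
            # Start a new session when the gap since the last hit is too long.
            if last_ts is not None and ts - last_ts > SESSION_GAP:
                visitor_sessions.append(current)
                current = []
            current.append(page)
            last_ts = ts
        visitor_sessions.append(current)
        result[visitor] = visitor_sessions
    return result


print(total_page_views(events))
print(sessions(events))
```

By count alone the two visitors look similar (4 and 3 page views). Only the sequences show that v1 reached checkout in a single visit, while v2 left after the product page and later started over from the home page.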
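
The contrast between slides 15 and 16 (a customer-to-census join versus a join of two streams) can be sketched under some stated assumptions. The Python below uses invented customers, census attributes, and web and mobile events; none of these values or function names appear in the deck. It shows that a one-to-one join keeps a single row per customer, while joining two event streams on the same key pairs every web event with every app event for that customer.

```python
# Hypothetical one-row-per-customer tables, keyed by customer id.
customers = {"c1": {"name": "Ann"}, "c2": {"name": "Bob"}}
census = {"c1": {"region": "West"}, "c2": {"region": "East"}}

# Hypothetical event streams: (customer_id, event_name), many rows per customer.
web_events = [("c1", "home"), ("c1", "product"), ("c1", "checkout"), ("c2", "home")]
app_events = [("c1", "open"), ("c1", "search"), ("c1", "buy"), ("c2", "open")]


def join_one_to_one(left, right):
    """Join two keyed tables: the result still has one row per customer."""
    return {key: {**left[key], **right[key]} for key in left if key in right}


def join_streams(left, right):
    """Join two streams on customer id: every pairing of events is emitted."""
    return [
        (left_key, left_event, right_event)
        for left_key, left_event in left
        for right_key, right_event in right
        if left_key == right_key
    ]


combined_customers = join_one_to_one(customers, census)
combined_streams = join_streams(web_events, app_events)

print(len(combined_customers))  # one combined row per customer
print(len(combined_streams))    # more rows than either input stream
```

With this sample data the customer join returns 2 rows, one per customer. The stream join returns 10 rows from two 4-row inputs, because c1's three web events pair with all three app events. The join key exists, yet the output is harder to analyze than either input, which is the situation slide 16 describes.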
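
Slides 20 and 21 match business functions to technology stacks along a set of decision vectors, but the transcript does not say how the profiles are compared. One simple possibility, offered only as a sketch, is to total how far a stack falls short of a function's needs on each vector. Every number below is invented, the stack names are placeholders rather than real products, and the `shortfall` function is an assumption, not the presenters' method. Only the vector names and the 0 to 90 scale come from slide 21.

```python
# Hypothetical requirement profile for one business function (0-90 scale).
need = {
    "Handling Huge Volume": 80,
    "Support Algorithmic Queries": 70,
    "Support BI Tools": 40,
    "Real-time Support": 20,
}

# Hypothetical capability profiles for two placeholder stacks.
stacks = {
    "stack_a": {
        "Handling Huge Volume": 90,
        "Support Algorithmic Queries": 80,
        "Support BI Tools": 30,
        "Real-time Support": 10,
    },
    "stack_b": {
        "Handling Huge Volume": 60,
        "Support Algorithmic Queries": 40,
        "Support BI Tools": 80,
        "Real-time Support": 50,
    },
}


def shortfall(required, offered):
    """Sum of how far a stack falls below the requirement on each vector.

    Exceeding a requirement earns no credit, so strength on one vector
    cannot hide weakness on another.
    """
    return sum(max(0, required[v] - offered.get(v, 0)) for v in required)


# Rank stacks from smallest to largest total shortfall.
ranking = sorted(stacks, key=lambda name: shortfall(need, stacks[name]))

for name in ranking:
    print(name, shortfall(need, stacks[name]))
```

With these made-up numbers stack_a has a shortfall of 20 and stack_b a shortfall of 50, so stack_a ranks first for this function. A different business function with a heavier real-time or BI requirement could reverse the order, which is why the deck starts the assessment from business purposes.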