Master Analytics Data Solution
~ Multiple Channels
DRAFT, currently covering the prevailing social domain; other areas remain under completion
Copy for Mr. Gary Chin, 2015-02-06, prepared by Teng Xiaolu
Draw out the analytics framework as:
Data Solution = 1. (Statistics Model + Machine Learning) + 2. (Strategy Insights + Metrics Schema + Innovation Tech)
Given the intimidating abundance of realms involved, it can be split into 2 major parts:
Bracket 1. Riding on a foundation of methodology, propagate algorithm techniques and statistical tests.
Bracket 2. Based on the data responses, decisions led by data analysis can be made by weaving together insights, measurements, and additive innovations [fig.3].
Machine Learning
In general, here is machine learning at a glance (collected from blog discussions, for your reference), which points to the processing steps to be emphasized: train, tune, test.
	
  
Basically you have three data sets: training, validation and testing. You train the classifier using the 'training set', tune the parameters using the 'validation set', and then test the performance of your classifier on the unseen 'test set'. Normally, for the test vs. training data size I have seen differing versions, 30%:70% or 10%:90%. Probably there is no one way to choose. Is it eliminating classification bias? Does it result in better generalization?

A well-accepted method is N-fold cross validation, in which you randomize the dataset and create N (almost) equal-size partitions. Then choose the Nth partition for testing and the remaining N-1 partitions for training the classifier. Within the training set you can further employ another K-fold cross validation to create a validation set and find the best parameters. Repeat this process N times to get an average of the metric.

Key words: unbiased, cross-validation, randomized data, average of the metric
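As a quick illustration of the train, tune, test workflow above, here is a minimal sketch of nested cross-validation, assuming scikit-learn, a synthetic dataset, and logistic regression as a stand-in classifier; none of these choices come from the original notes.

```python
# Nested cross-validation: an outer N-fold loop for testing on unseen folds,
# and an inner K-fold loop (via GridSearchCV) acting as the validation set.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Inner K-fold: tunes the regularization parameter C on the training portion.
inner = GridSearchCV(LogisticRegression(max_iter=1000),
                     param_grid={"C": [0.01, 0.1, 1, 10]},
                     cv=KFold(n_splits=5, shuffle=True, random_state=1))

# Outer N-fold: each partition serves once as the unseen test set; the reported
# score is the average of the metric over the N repetitions.
outer = KFold(n_splits=10, shuffle=True, random_state=2)
scores = cross_val_score(inner, X, y, cv=outer)
print("mean accuracy over N folds: %.3f (+/- %.3f)" % (scores.mean(), scores.std()))
```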
  
	
  
Strategy Insights + Metrics Schema (Social genre)
In the sense of social mining, social channels employ the first approach to deliver sophisticated insights; they are also the best place to derive market distinctions.
In this first section, it would add social listening:
Which fans in the network are identified as influential nodes, and what fraction do they make up of the total fan base? In particular, how are these scattered across the diversified layers of the network? How frequently do reactions flow to the posts, and how can they be classified into follower volumes by follower size? How do we figure out the overlapping area between various communities that share high interest in the same hashtags? (not an expanded version yet)
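To make the social-listening questions above concrete, below is a minimal sketch using networkx; the fans, edges, hashtag sets, centrality threshold and Jaccard overlap are all made-up illustrations rather than a prescribed method.

```python
# Influential nodes, their share of the fan base, and hashtag overlap between communities.
import networkx as nx

G = nx.Graph()
G.add_edges_from([("ann", "bob"), ("ann", "cat"), ("ann", "dan"),
                  ("bob", "cat"), ("eve", "dan")])

centrality = nx.degree_centrality(G)                 # simple influence proxy
influencers = [fan for fan, c in centrality.items() if c > 0.5]
print("influential nodes:", influencers,
      "= %.0f%% of all fans" % (100 * len(influencers) / G.number_of_nodes()))

# Hashtag interest per community; Jaccard similarity approximates the overlapped area.
community_a = {"#superbowl", "#tv", "#ads"}
community_b = {"#ads", "#coupons", "#tv"}
overlap = len(community_a & community_b) / len(community_a | community_b)
print("hashtag overlap (Jaccard): %.2f" % overlap)
```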
This is set against a static view of attributes.
Source: MYTH-BUSTING SOCIAL MEDIA ADVERTISING
Source: nielsen-cross-platform-report-march-2014.pdf
Feature selection can be kept separate, or folded into the scenarios.
Statistics Model
Once the social mining is fulfilled, it can be arrayed into a statistical model.
I would suggest running Logistic Regression, Decision Tree, and Neural Network together, considering the complementary effect of these 3 classifiers.
Before, it’s unavoided daunts to discover a certain period who should be used
in who among flourished classification techniques.
Now it’s able to identify the limitations to be removed in the same time
maximize the strengths, for instance, tolerance of missing data is found in
decision tree, in the result to tackle the black-box happened in neural network.
Nonetheless, this phenomena tends to high allowance on features less
restricted, and tolerance to the highly interdependent attributions, it ends to
don’t know what to be predicted why it’s predicted.
(Collected from blog discussions, for your reference) On determining the number of neurons:
	
  
• The VC dimension provides a rule of thumb for the number of neurons. Basically it states that the number of free parameters should be much less than the number of examples in your training set. "Free parameters" translates to the number of connections in your neural net that need to be tuned, which in a fully connected net depend on the number of neurons and how many of them are in the input layer vs the hidden layer. [1]
• In general, with a large dataset, the more parameters the better. Regularization can prevent overfitting.

The structure of the neural net is also critical, and actually determines the number of parameters (which corresponds much more to the number of connections).

The most popular architectures these days use many (e.g. 10) "layers" of neurons, and/or feedback connections (see recurrent neural nets, now almost always using LSTM).

So in short, #neurons <<< #examples in training set

[1] Notice how low-dimensional examples becomes a positive thing here
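To make the rule of thumb tangible, here is a small, assumed example that counts the free parameters (weights plus biases) of a hypothetical fully connected net and compares the count with an assumed training-set size.

```python
# Count free parameters in a fully connected net and check them against
# the "#free parameters << #training examples" rule of thumb above.
def free_parameters(layer_sizes):
    """Weights plus biases for a fully connected net, e.g. [30, 20, 1]."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

n_params = free_parameters([30, 20, 1])   # 30 inputs, 20 hidden neurons, 1 output
n_examples = 10_000                       # assumed training-set size
print(n_params, "free parameters vs", n_examples, "training examples")
# 641 << 10000, consistent with the rule of thumb.
```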
  
Good background for understanding neural networks.
	
  
THINKING: A hybrid solution is suggested in the current version, together with the paper in [fig.1]. Despite the continued lack of evidence on how much weight to place on learning speed and data consumption, at this moment I would support phasing this operation from non-clicks → clicks.
  
[fig.1]
Neural networks are routinely ignored as a modeling tool because they are largely uninterpretable overall and are generally less familiar to analysts and business people alike. Neural networks can provide great diagnostic insights into the potential shortcomings of other modeling methods, and comparing the results of different models can help identify what is needed to improve model performance.

For example, consider a situation where the best tree model fits poorly, but the best neural network model and the best regression model perform similarly well on the validation data. Had the analyst not considered using a neural network, little performance would be lost by investigating only the regression model. Consider a similar situation where the best tree fits poorly and the best regression fits somewhat better, but the best neural network shows marked improvement over the regression model. The poor tree fit might indicate that the relationship between the predictors and the response changes smoothly. The improvement of the neural network over the regression indicates that the regression model is not capturing the complexity of the relationship between the predictors and the response. Without the neural network results, the regression model would be chosen and much interpretation would go into interpreting a model that inadequately describes the relationship. Even if the neural network is not a candidate to present to the final client or management team, the neural network can be highly diagnostic for other modeling approaches.

In another situation, the best tree model and the best neural network model might be performing well, but the regression model is performing somewhat poorly. In this case, the relative interpretability of the tree might lead to its selection, but the neural network fit confirms that the tree model adequately summarizes the relationship. In yet another scenario, the tree is performing very well relative to both the neural network and regression models. This scenario might imply that there are certain variables that behave unusually with respect to the response when a missing value is present. Because trees can handle missing values directly, they are able to differentiate between a missing value and a value that has been imputed for use in a regression or neural network model. In this case, it might make more sense to investigate missing value indicators rather than to look at increasing the flexibility of the regression model because the neural network shows that this improved flexibility does not improve the fit.

To overcome this problem, select variables judiciously and fit a neural network while ensuring that there is an adequate amount of data in the validation data set. As discussed earlier, performing variable selection in a variety of ways ensures that important variables are included. Evaluate the models fit by decision tree, regression, and neural network methods to better understand the relationships in the data, and use this information to identify ways to improve the overall fit.

Source: <Identifying and Overcoming Common Data Mining Mistakes>
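As a small illustration of the missing-value-indicator idea mentioned in the passage, the sketch below adds indicator columns alongside simple imputation; the column names and values are made up.

```python
# Trees can split on missingness directly; regression and neural networks need
# imputation, so explicit indicator columns let them see the missingness too.
import numpy as np
import pandas as pd

df = pd.DataFrame({"income": [50_000, np.nan, 72_000, np.nan],
                   "age":    [34, 45, np.nan, 52]})

for col in ["income", "age"]:
    df[col + "_missing"] = df[col].isna().astype(int)   # 1 where the value was missing
    df[col] = df[col].fillna(df[col].median())          # simple imputation
print(df)
```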
  
From the other book,
"However, a neural network is a 'black box' method that does not provide any interpretable explanation to accompany its classifications or predictions. Adjusting the parameters to tune the neural network performance is largely a matter of trial and error guided by rules of thumb and user experience."
  
{SIDE NOTE}
Inspired by listening → imitation → recode, I would like to believe that the other tuple is required for heterogeneous interpretation with a discriminant effect. It probably requires Naïve Bayes and VMC to iterate stringently. Please kindly note that independent feature selection to support the formula could be packed into scenarios.
  
	
  
About Naïve Bayes, in a few paragraphs:
  
• The second contribution is a technical contribution: We introduce a version of Naïve Bayes with a multivariate event model that can scale up efficiently to massive, sparse datasets. Specifically, this version of the commonly used multivariate Bernoulli Naïve Bayes only needs to consider the "active" elements of the dataset—those that are present or non-zero—which can be a tiny fraction of the elements in the matrix for massive, sparse data. This means that predictive modelers wanting to work with the very convenient Naïve Bayes algorithm are not forced to use the multinomial event model simply because it is more scalable. This article thereby makes a small but important addition to the cumulative answer to a current open research question17:
• How can we learn predictive models from lots of data?
• Note that our use of Naïve Bayes should not be interpreted as a claim that Naïve Bayes is by any means the best modeling technique for these data. Other methods exist that handle large transactional datasets, such as the popular Vowpal Wabbit software based on scalable stochastic gradient descent and input hashing.2,18,19 Moreover, results based on Naïve Bayes are conservative. As one would expect theoretically20 and as shown empirically,15 nonlinear modeling and less-restrictive linear modeling generally will show continued improvements in predictive performance for much larger datasets than will Naïve Bayes modeling. (However, how to conduct robust, effective nonlinear modeling with massive high-dimensional data is still an open question.) Nevertheless, Naïve Bayes is popular and quite robust. Using it provides a clear and conservative baseline to demonstrate the point of the article. If we see continued improvements when scaling up Naïve Bayes to massive data, we should expect even greater improvements when scaling up more sophisticated induction algorithms.
• These results are important because they help provide some solid empirical grounding to the importance of big data for predictive analytics and highlight a particular sort of data in which predictive analytics is likely to benefit from big data. They also add to the observation3 that firms (or other entities) with massive data assets21 may indeed have a considerable competitive advantage over firms with smaller data assets.
Source: big%2E2013%2E0037.pdf <Is Bigger Really Better?>
  	
  
More discussion in the paper concerns digital data occurrences that are sparse, fine-grained, and massive → more data actually beats algorithms.
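For reference, here is a minimal sketch of Bernoulli Naïve Bayes on sparse binary data, using scikit-learn rather than the paper's own multivariate event-model implementation; the data and labels are synthetic, so it only illustrates the sparse-input mechanics.

```python
# Bernoulli Naive Bayes fed a sparse presence/absence matrix: only the non-zero
# ("active") elements are stored and visited, which keeps massive data tractable.
import numpy as np
import scipy.sparse as sp
from sklearn.naive_bayes import BernoulliNB

rng = np.random.default_rng(0)
X = sp.random(10_000, 50_000, density=0.001, format="csr", random_state=0)
X.data[:] = 1.0                           # binarize: feature present or not
y = rng.integers(0, 2, size=10_000)       # random labels, illustration only

model = BernoulliNB()
model.fit(X, y)
print(model.predict(X[:5]))
```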
  	
  
[fig.2]
Dynamic Programming
Source:
https://www.cs.utexas.edu/~eladlieb/RLRG.html
http://theanalyticsstore.com/deep-learning/
 
	
  
	
  
INNOVATION is a case that combines technology advances.
TV + Social probably arrives afterwards, viewed from the top down. I have seen a similar opinion somewhere; due to time constraints I will point out the specific article later for your convenience. I would support the tool kit, 0 → -1.
The same goes for a real-time approach, which attempts to roll out under specific real-time recency metrics, particularly linked to data streams by hour, day, week and month (refer to ++Insight+1++). Functionally it needs to run parallel to brand metrics, awareness and retention rate. With that, it will not cover much about impressions, projected ROI, estimated cost of prospect acquisition = estimated margin per prospect / (1 + ROI threshold), or CLTV (new, deducted from existing) within the monothetic statistical tests, which include a longer list: p-value, F-test, t-test, R², adjusted R², correlation matrix, elasticity and coefficients to validate functionally, and type I/II errors. MAPE, error-rate management, ROC and lift depend on the model selected, and more likely on time series, association and what-if analysis. (Note: the longer treatment sits in the book(s), thicker and thicker, every fraction a self-study semester.)
There is an analytics session named transaction analysis, with RFM discerning acquisition → transaction. It illustrates the possibilities: in a conditional setting, the click-but-no-purchase group might be the one stimulated into a longer relationship with brands that offer coupons, probably a variation on cost sensitivity. A model helps recognize this type of behavior through its parameters. In contrast, the purchased group can alternately be encouraged toward repeat purchase, based on demand shifts identified by cross-sell and up-sell analysis. Both upstream and downstream could be covered.
How far is this from the overarching, intrinsic digital relevance, by channel or by touchpoint? TV, plus diminishing time-shifted TV, even with a frequency cap of 2, on or off, should be left out of this motion in certain scenarios, for instance a customer scoring program. Social network analysis plays throughout the scenarios of estimating customer profitability, listening → imitation → recode, on the path toward extrapolation, both supervised and unsupervised.
Source: Bayesian+reasoning+and+machine+learning.pdf
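To ground the RFM and prospect-cost pieces of this section, here is a minimal pandas sketch with a made-up transaction log; the column names, reference date and margin/ROI figures are assumptions.

```python
# RFM (recency, frequency, monetary) per customer, plus the acquisition-cost formula above.
import pandas as pd

tx = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "date": pd.to_datetime(["2015-01-05", "2015-02-01", "2014-12-20",
                            "2015-01-15", "2015-01-28", "2015-02-03"]),
    "amount": [20.0, 35.0, 15.0, 50.0, 10.0, 40.0],
})

now = pd.Timestamp("2015-02-06")
rfm = tx.groupby("customer_id").agg(
    recency=("date", lambda d: (now - d.max()).days),   # days since last purchase
    frequency=("date", "count"),                         # number of purchases
    monetary=("amount", "sum"),                          # total spend
)
print(rfm)

# est. cost of prospect acquisition = estimated margin per prospect / (1 + ROI threshold)
margin_per_prospect, roi_threshold = 30.0, 0.5           # assumed figures
print("max acquisition cost:", margin_per_prospect / (1 + roi_threshold))
```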
NEWS,
In particular, they want to see highly granular data from all touchpoints. "Increasing the granularity and variability of media inputs can increase the estimate of a medium's RoI by as much as 27%," they reported.

They also highlighted the "shocking oversight" when it comes to measuring creativity, with some observers claiming that 70% of the sales effectiveness of advertising can be attributed to the creative message.

Acknowledging that this is a difficult area, they argued that more direct integration of copy tests into marketing mix models would move the industry on from determining which ads worked to understanding why they worked.
Source: New marketing models emerge, London: 6 February 2015
http://www.warc.com/LatestNews/News/EmailNews.news?ID=34271&Origin=WARCNewsEmail&CID=N34271&PUB=Warc_News&utm_source=WarcNews&utm_medium=email&utm_campaign=WarcNews20150206
  
	
  
	
  
	
  
	
  
Other gifts given from London
What else did Winston say? :>>
"During these turbulent times, predictive analytics is how smart companies are turning data into knowledge to gain a competitive advantage." Source: <Drive Your Business with Predictive Analytics>
THINKING from the Facebook case: it might be two-way. TV does not simply dominate and drive social responses in a trend; on the contrary, the social platform reflects TV opportunities, as Facebook leverages the significance of the Super Bowl. It is the typical event-show pattern vs. proportion vs. longer viewership extension, combined with transaction history to capture higher-value customers.
[fig.3]
• January 30, 2015, 1:53 PM
• Facebook's new Super Bowl ad play
• By Zak Stambor, Managing Editor
• The social network will launch a live feed where fans can discuss the game, and it is selling video ads that target consumers based on what they talk about. Among those signing up to advertise are Toyota, Pepsi, Intuit TurboTax and Anheuser-Busch.
• Facebook Inc. wants to be on consumers' second screen during the Super Bowl.
• The social network will launch a Super Bowl-specific feed during the game where consumers can comment on the game—and the surrounding hoopla around it, including ads. And advertisers can target consumers within the feed based on what participants are discussing.
• Among the brands that plan to advertise within Facebook's feed are Toyota, Pepsi, Intuit TurboTax and Anheuser-Busch. Each of those brands is also running ads during the game's TV broadcast.
• Using Facebook, as well as other digital channels, to amplify a costly ad buy is an essential part of advertising strategy in today's media climate, says Rebecca Lieb, an analyst at the business research and advisory firm Altimeter Group.
• "Brands are in a position where making corresponding web and social ad buys is de rigueur," she says. "Why would you invest all the time and money in a Super Bowl ad and give it the lifespan of a fruit fly by letting it begin and end on broadcast TV?"
• This year 30 seconds of Super Bowl air time costs advertisers $4.5 million, according to Variety. That doesn't begin to factor in production costs, which can also be extremely costly, Lieb says.
• In addition to letting large advertisers amplify their Super Bowl campaigns, the feed will also let smaller marketers, including e-retailers, use attention-grabbing ads to be a part of consumers' Super Bowl discussion, says Lou Kerner, a social media analyst and investor at The Social Internet Fund.
• While Twitter is often thought of as the social network consumers engage with while watching TV, its audience is roughly one-fifth the size of Facebook's, Lieb says. Twitter has 284 million monthly active users—and only 63 million in the United States—compared to Facebook, which has 1.393 billion monthly active users, including 208 million in the United States and Canada (Facebook doesn't release a U.S.-only figure).
• "There's never been a medium as big as Facebook," Lieb says. "Now clearly not all of Facebook's users are Americans, not all of those American users are football fans, but there are millions and millions of people who represent a very large potential audience for advertisers," she says. While TV gives advertisers a tool to reach a wide swath of consumers, Facebook gives them an even bigger audience that they can finely target, she says.
• Facebook recognizes this and is emphasizing to potential advertisers that, in addition to football fans, they can reach people discussing party planning, sharing recipes, buying a new flat-screen TV, the half-time show or chattering about ads, a spokeswoman says. Facebook declined to say what it is charging marketers to advertise in the Super Bowl feed.
• While 115 million U.S. consumers watched the Super Bowl last year, Facebook says 170 million people saw Super Bowl-related posts and ads last year. By developing a dedicated feed, Facebook aims to grow that number.

Source: https://www.internetretailer.com/2015/01/30/facebooks-new-super-bowl-ad-play
  
	
  
++Insight+1++ from Nielsen, Spredfast, Rentrak:
We also know that 40% of U.S. tablet and smartphone users visit a social network while watching TV.
Five of the top 10 primetime TV shows integrate social media online and/or on-air: NBC Sunday Night Football, both nights of The Voice, and both nights of X Factor. In addition, Spredfast reaches 135 million people each week through our on-air social visualizations, which is 40% of the U.S. population. Rentrak's scale allows us to sell on cycles up to 28 days for most shows because we have tremendous coverage across users.
  
 
Reading for more references:
Nielsen-cross-platform-report-march-2014.pdf
Do display ad influence search.pdf
Tech Trends 2014 Inspiring Disruption – Deloitte.pdf
Accenture_Technology_Vision_2014.pdf
Social_Shopping_2011_Brief1.pdf
Social_Media_Analytics_-_Sample_report_-_Marketing_effectiveness.pdf
13926_di_social_q413_v5.pdf
