The document discusses data analytics solutions involving machine learning and statistical modeling. It proposes splitting the solution into two parts: 1) applying algorithm techniques and statistical tests to data, and 2) making data-driven decisions using insights, metrics, and innovations. It then provides more details on machine learning techniques like training/testing data sets, and determining the optimal number of neurons in neural networks. Statistical modeling techniques like logistic regression, decision trees, and neural networks are recommended. The document emphasizes comparing different model results to identify ways to improve performance.
Master Analytics Data Solution
~ Multiple Channels
DRAFT, focused on the social domain; still under completion
Copy for Mr. Gary Chin, 2015-02-06, prepared by Teng Xiaolu

The analytics framework can be drawn out as:
Data Solution = 1. (Statistics Model + Machine Learning) + 2. (Strategy Insights + Metrics Schema + Innovation Tech)

Given the breadth of the realms involved, the solution splits into two major parts:
Bracket 1. On a foundation of methodology, apply algorithm techniques and statistical tests to the data.
Bracket 2. Based on the data responses, make data-driven decisions through a combination of insights, measurements, and innovations [fig.3].

Machine Learning
At a glance, machine learning (drawing on the blog discussions collected here for reference) emphasizes three stages of processing: train, tune, test.
Basically you have three data sets: training, validation, and testing. You train the classifier on the training set, tune the parameters on the validation set, and then test the performance of your classifier on the unseen test set. For the test-vs-training split size, I have seen different versions, e.g. 30%:70% or 10%:90%; there is probably no single right way to choose. Does the split eliminate classification bias? Does the result generalize?

A well-accepted method is N-fold cross-validation, in which you randomize the dataset and create N (almost) equal-size partitions, then choose the Nth partition for testing and the remaining N-1 partitions for training the classifier. Within the training set you can further employ another K-fold cross-validation to create a validation set and find the best parameters. Repeat this process N times to get an average of the metric.

Key words: unbiased, cross-validation, randomized data, average of the metric.
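The N-fold procedure above can be sketched in a few lines of plain Python. This is a minimal illustration (not tied to any particular library); `train_and_score` is a hypothetical callback introduced here for the sketch:

```python
import random

def n_fold_indices(n_samples, n_folds, seed=0):
    """Randomize the dataset indices and cut them into n_folds
    (almost) equal-size partitions, as described above."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    fold_size, rem = divmod(n_samples, n_folds)
    folds, start = [], 0
    for f in range(n_folds):
        size = fold_size + (1 if f < rem else 0)
        folds.append(idx[start:start + size])
        start += size
    return folds

def cross_validate(data, labels, train_and_score, n_folds=5):
    """Hold each fold out once for testing, train on the other
    N-1 folds, and return the average of the metric."""
    folds = n_fold_indices(len(data), n_folds)
    scores = []
    for f in range(n_folds):
        held_out = set(folds[f])
        train = [(data[i], labels[i]) for i in range(len(data)) if i not in held_out]
        test = [(data[i], labels[i]) for i in folds[f]]
        scores.append(train_and_score(train, test))
    return sum(scores) / n_folds
```

Here `train_and_score` is assumed to fit a classifier on the training pairs and return its metric on the test pairs; a further K-fold split inside `train` would supply the validation set for parameter tuning, as the text suggests.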
Strategy Insights + Metrics Schema (Social genre)
In the sense of social mining, social channels use the first approach (Bracket 1) to deliver sophisticated insights, and are also the best place to derive market distinctions.
This first section adds social listening questions:
• Which fans of the network can be identified as influencer nodes, and what fraction do they make up of the total fan base? In particular, these nodes are scattered across the diversified layers of the network.
• How frequently do reactions land on the posts? These can be classified into volumes by follower size.
• How do we identify the overlap between communities that share high interest in the same hashtags? (no expanded version yet)
This is set against a static view of attributes.
Source: MYTH-BUSTING SOCIAL MEDIA ADVERTISING
Source: nielsen-cross-platform-report-march-2014.pdf
Feature selection can be kept separate, or built into the scenarios.
Statistics Model
Once the social mining is fulfilled, the results can be arrayed into a statistical model.
I would suggest running Logistic Regression, Decision Tree, and Neural Network together, given the complementary strengths of these three classifiers.
Previously it was a daunting task to work out which of the many classification techniques should be used when. Now we can identify and remove the limitations of each while maximizing its strengths: for instance, the decision tree's tolerance of missing data helps tackle the black-box problem of the neural network. The neural network, in turn, allows less-restricted features and tolerates highly interdependent attributes, but at the cost of not knowing what is predicted or why it is predicted.
The blog discussions collected below (for reference) address how to determine the number of neurons:
• The VC dimension provides a rule of thumb for the number of neurons. Basically it states that the number of free parameters should be much less than the number of examples in your training set. "Free parameters" translates to the number of connections in your neural net that need to be tuned, which in a fully connected net depends on the number of neurons and how many of them are in the input layer vs the hidden layer. [1]
• In general, with a large dataset, the more parameters the better; regularization can prevent overfitting. The structure of the neural net is also critical, and actually determines the number of parameters (which corresponds much more to the number of connections). The most popular architectures these days use many (e.g. 10) "layers" of neurons, and/or feedback connections (see recurrent neural nets, now almost always using LSTM). So in short: #neurons <<< #examples in the training set. [1]
Notice how low-dimensional examples become a positive thing here.
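The free-parameter count mentioned above is easy to compute for a fully connected net. A quick sketch (the layer sizes and example count below are made-up illustration values):

```python
def free_parameters(layer_sizes):
    """Number of tunable connections (weights + biases) in a
    fully connected net with the given layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Hypothetical net: 10 inputs, one hidden layer of 8 neurons, 2 outputs.
params = free_parameters([10, 8, 2])   # (10*8 + 8) + (8*2 + 2) = 106
# Rule of thumb from above: free parameters << number of training examples.
ok = params < 0.1 * 50_000             # e.g. against 50,000 training examples
```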
Good to understand neural networks.
THINKING: A hybrid solution is suggested in the current version, following the paper in [fig.1]. Despite the continued lack of evidence on how much weight to give learning speed and data consumption, at this moment I would support phasing this operation from no clicks → clicks.
[fig.1]
Neural networks are routinely ignored as a modeling tool because they are largely uninterpretable overall and are generally less familiar to analysts and business people alike. Neural networks can provide great diagnostic insights into the potential shortcomings of other modeling methods, and comparing the results of different models can help identify what is needed to improve model performance.

For example, consider a situation where the best tree model fits poorly, but the best neural network model and the best regression model perform similarly well on the validation data. Had the analyst not considered using a neural network, little performance would be lost by investigating only the regression model. Consider a similar situation where the best tree fits poorly and the best regression fits somewhat better, but the best neural network shows marked improvement over the regression model. The poor tree fit might indicate that the relationship between the predictors and the response changes smoothly. The improvement of the neural network over the regression indicates that the regression model is not capturing the complexity of the relationship between the predictors and the response. Without the neural network results, the regression model would be chosen and much interpretation would go into interpreting a model that inadequately describes the relationship. Even if the neural network is not a candidate to present to the final client or management team, the neural network can be highly diagnostic for other modeling approaches.

In another situation, the best tree model and the best neural network model might be performing well, but the regression model is performing somewhat poorly. In this case, the relative interpretability of the tree might lead to its selection, but the neural network fit confirms that the tree model adequately summarizes the relationship. In yet another scenario, the tree is performing very well relative to both the neural network and regression models. This scenario might imply that there are certain variables that behave unusually with respect to the response when a missing value is present. Because trees can handle missing values directly, they are able to differentiate between a missing value and a value that has been imputed for use in a regression or neural network model. In this case, it might make more sense to investigate missing value indicators rather than to look at increasing the flexibility of the regression model, because the neural network shows that this improved flexibility does not improve the fit.

To overcome this problem, select variables judiciously and fit a neural network while ensuring that there is an adequate amount of data in the validation data set. As discussed earlier, performing variable selection in a variety of ways ensures that important variables are included. Evaluate the models fit by decision tree, regression, and neural network methods to better understand the relationships in the data, and use this information to identify ways to improve the overall fit.

Source: <Identifying and Overcoming Common Data Mining Mistakes>
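The scenarios quoted above amount to a small decision rule over validation scores. The function below is my simplified paraphrase of that logic, not part of the cited source; the 0.1 "similar" threshold is an arbitrary assumption:

```python
def diagnose(tree_score, reg_score, nn_score, tol=0.1):
    """Map relative validation scores of tree, regression, and
    neural network models to the diagnostic readings described
    above. Higher score = better fit."""
    close = lambda a, b: abs(a - b) < tol
    best = max(tree_score, reg_score, nn_score)
    # Tree alone stands out: missing values may carry signal.
    if close(best, tree_score) and best - reg_score >= tol and best - nn_score >= tol:
        return "tree stands out: inspect missing-value indicators"
    # Tree poor, regression and neural net similar: smooth relationship.
    if best - tree_score >= tol and close(reg_score, nn_score):
        return "smooth relationship: regression alone suffices"
    # Tree poor, neural net clearly beats regression.
    if best - tree_score >= tol and nn_score - reg_score >= tol:
        return "regression misses complexity revealed by the neural net"
    # Tree and neural net agree, regression lags.
    if close(tree_score, nn_score) and best - reg_score >= tol:
        return "tree adequately summarizes the relationship"
    return "no clear pattern: compare models further"
```

For example, `diagnose(0.6, 0.9, 0.9)` reads as the first quoted scenario (tree poor, regression and neural net similarly good), while `diagnose(0.9, 0.7, 0.7)` reads as the missing-value scenario.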
From the other book: "However, a neural network is a 'black box' method that does not provide any interpretable explanation to accompany its classifications or predictions. Adjusting the parameters to tune the neural network performance is largely a matter of trial and error guided by rules of thumb and user experience."
{SIDE NOTE} Inspired by the listening → imitation → recode path, I would like to believe another tuple of methods is required for heterogeneous interpretation with a discriminant effect. It probably requires Naïve Bayes and VMC iterated stringently. Please note that independent feature selection, used to support the formula, could be packed into the scenarios. About Naïve Bayes, a few paragraphs:
• The second contribution is a technical contribution: We introduce a version of Naïve Bayes with a multivariate event model that can scale up efficiently to massive, sparse datasets. Specifically, this version of the commonly used multivariate Bernoulli Naïve Bayes only needs to consider the "active" elements of the dataset—those that are present or non-zero—which can be a tiny fraction of the elements in the matrix for massive, sparse data. This means that predictive modelers wanting to work with the very convenient Naïve Bayes algorithm are not forced to use the multinomial event model simply because it is more scalable. This article thereby makes a small but important addition to the cumulative answer to a current open research question17:
• How can we learn predictive models from lots of data?
• Note that our use of Naïve Bayes should not be interpreted as a claim that Naïve Bayes is by any means the best modeling technique for these data. Other methods exist that handle large transactional datasets, such as the popular Vowpal Wabbit software based on scalable stochastic gradient descent and input hashing.2,18,19 Moreover, results based on Naïve Bayes are conservative. As one would expect theoretically20 and as shown empirically,15 nonlinear modeling and less-restrictive linear modeling generally will show continued improvements in predictive performance for much larger datasets than will Naïve Bayes modeling. (However, how to conduct robust, effective nonlinear modeling with massive high-dimensional data is still an open question.) Nevertheless, Naïve Bayes is popular and quite robust. Using it provides a clear and conservative baseline to demonstrate the point of the article. If we see continued improvements when scaling up Naïve Bayes to massive data, we should expect even greater improvements when scaling up more sophisticated induction algorithms.
• These results are important because they help provide some solid empirical grounding to the importance of big data for predictive analytics and highlight a particular sort of data in which predictive analytics is likely to benefit from big data. They also add to the observation3 that firms (or other entities) with massive data assets21 may indeed have a considerable competitive advantage over firms with smaller data assets.
Source: big%2E2013%2E0037.pdf <Is Bigger Really Better?>
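A minimal plain-Python sketch of the "active elements only" idea described above (an illustration of the scoring trick, not the cited paper's implementation): training precomputes, per class, the log-likelihood of an all-inactive row, so that scoring each row afterwards touches only its active (non-zero) elements.

```python
import math
from collections import defaultdict

def fit_bernoulli_nb(rows, labels, n_features, alpha=1.0):
    """rows: list of sets of active (non-zero) feature indices.
    Returns, per class: log prior, log-likelihood of an all-inactive
    row, and per-feature log-odds adjustments for active features."""
    model = {}
    for c in set(labels):
        members = [r for r, y in zip(rows, labels) if y == c]
        n = len(members)
        counts = defaultdict(int)
        for r in members:
            for j in r:
                counts[j] += 1
        # Laplace-smoothed P(feature j active | class c).
        p = {j: (counts[j] + alpha) / (n + 2 * alpha) for j in range(n_features)}
        base = sum(math.log(1 - p[j]) for j in range(n_features))
        adjust = {j: math.log(p[j]) - math.log(1 - p[j]) for j in range(n_features)}
        model[c] = (math.log(n / len(rows)), base, adjust)
    return model

def predict(model, row):
    """Score each class while touching only the row's active elements."""
    def score(c):
        log_prior, base, adjust = model[c]
        return log_prior + base + sum(adjust[j] for j in row)
    return max(model, key=score)
```

The design point mirrors the quote: `base` is paid once per class, and each prediction costs time proportional to the number of active elements, not to the full feature count.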
More discussion from the paper about digital data occurrences: sparse, fine-grained, and massive → more data actually beats algorithms. [fig.2]
Dynamic Programming
Source:
https://www.cs.utexas.edu/~eladlieb/RLRG.html
http://theanalyticsstore.com/deep-learning/
INNOVATION is a case of combined tech advances
TV + Social probably arrives afterwards, viewed top-to-bottom. I have seen a similar opinion somewhere; due to time constraints I will point out the specific article later for your convenience. I would support adding it to the tool kit, 0 → -1.
The same goes for a real-time approach that rolls out under specific real-time recency metrics, particularly linked to data streams by hour, day, week, and month (refer to ++Insight+1++). Functionally it needs to run parallel to brand metrics, awareness and retention rate. With that, it need not cover much about impressions, projected ROI, estimated cost of prospect gaining = estimated margin per prospect / (1 + ROI threshold), or CLTV (new, deducted from existing) in the monothetic statistical tests, including a longer list: p-value, F-test, t-test, R², adjusted R², correlation matrix, elasticity and coefficients functionally validated, and type I/II errors. MAPE, error-rate management, ROC, and lift depend on the model selection, and more likely on time series, association, and what-if analysis. (Note: covered at much greater length in the books.)
There is an analytics session named transaction analysis, with RFM distinguishing acquisition → transaction. It illustrates the possibilities: in a conditional setting, the click-but-no-purchase group might be the one stimulated into a longer relationship with brands that offer coupons, probably a variation on cost sensitivity. A model with parameters helps recognize this type of customer. In contrast, the purchased group can instead be encouraged toward repeat purchase, with demand shifted by cross-sell and up-sell analysis. Both upstream and downstream could be covered. How far does the overarching digital relevance reach, by channel or by touchpoint? TV, plus diminishing time-shifted TV, even capped at 2, on or off, should be left out of this motion in certain scenarios, for instance a customer scoring program. Social network analysis plays throughout the scenarios for estimating customer profitability, listening → imitation → recode, on the path to extrapolation, both supervised and unsupervised.
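The prospect-cost formula above is simple arithmetic; a quick sketch (the margin and ROI-threshold figures are made-up illustration values):

```python
def max_cost_per_prospect(margin_per_prospect, roi_threshold):
    """Estimated cost of prospect gaining = estimated margin per
    prospect / (1 + ROI threshold), per the formula above."""
    return margin_per_prospect / (1 + roi_threshold)

# Hypothetical campaign: $60 expected margin per prospect, 50% required ROI.
cost_cap = max_cost_per_prospect(60.0, 0.5)  # 60 / 1.5 = 40.0 per prospect
```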
Source: Bayesian+reasoning+and+machine+learning.pdf
NEWS: In particular, they want to see highly granular data from all touchpoints. "Increasing the granularity and variability of media inputs can increase the estimate of a medium's RoI by as much as 27%," they reported. They also highlighted the "shocking oversight" when it comes to measuring creativity, with some observers claiming that 70% of the sales effectiveness of advertising can be attributed to the creative message. Acknowledging that this is a difficult area, they argued that more direct integration of copy tests into marketing mix models would move the industry on from determining which ads worked to understanding why they worked.
Source: New marketing models emerge, London: 6 February 2015
http://www.warc.com/LatestNews/News/EmailNews.news?ID=34271&Origin=WARCNewsEmail&CID=N34271&PUB=Warc_News&utm_source=WarcNews&utm_medium=email&utm_campaign=WarcNews20150206
Other gifts from London
What else did Winston say? :>>
"During these turbulent times, predictive analytics is how smart companies are turning data into knowledge to gain a competitive advantage."
Source: <Drive your business with predictive analytics>
THINKING from the Facebook case: it might work both ways. TV does not simply dominate as the driver of trending social responses; on the contrary, the social platform reflects TV opportunities, which Facebook leverages around the significance of the Super Bowl. It is the typical event-show pattern vs. proportion vs. longer viewership extension, combined with transaction history to capture higher-value customers.
[fig.3]
• January 30, 2015, 1:53 PM
• Facebook’s new Super Bowl ad play
• By Zak Stambor, Managing Editor
• The social network will launch a live feed where fans can discuss the game, and it is selling video ads that target consumers based on what they talk about. Among those signing up to advertise are Toyota, Pepsi, Intuit TurboTax and Anheuser-Busch.
• Facebook Inc. wants to be on consumers’ second screen during the Super Bowl.
• The social network will launch a Super Bowl-specific feed during the game where consumers can comment on the game—and the surrounding hoopla around it, including ads. And advertisers can target consumers within the feed based on what participants are discussing.
• Among the brands that plan to advertise within Facebook’s feed are Toyota, Pepsi, Intuit TurboTax and Anheuser-Busch. Each of those brands is also running ads during the game’s TV broadcast.
• Using Facebook, as well as other digital channels, to amplify a costly ad buy is an essential part of advertising strategy in today’s media climate, says Rebecca Lieb, an analyst at the business research and advisory firm Altimeter Group.
• “Brands are in a position where making corresponding web and social ad buys is de rigueur,” she says. “Why would you invest all the time and money in a Super Bowl ad and give it the lifespan of a fruit fly by letting it begin and end on broadcast TV?”
• This year 30 seconds of Super Bowl air time costs advertisers $4.5 million, according to Variety. That doesn’t begin to factor in production costs, which can also be extremely costly, Lieb says.
• In addition to letting large advertisers amplify their Super Bowl campaigns, the feed will also let smaller marketers, including e-retailers, use attention-grabbing ads to be a part of consumers’ Super Bowl discussion, says Lou Kerner, a social media analyst and investor at The Social Internet Fund.
• While Twitter is often thought of as the social network consumers engage with while watching TV, its audience is roughly one-fifth the size of Facebook’s, Lieb says. Twitter has 284 million monthly active users—and only 63 million in the United States—compared to Facebook, which has 1.393 billion monthly active users, including 208 million in the United States and Canada (Facebook doesn’t release a U.S.-only figure).
• “There’s never been a medium as big as Facebook,” Lieb says. “Now clearly not all of Facebook’s users are Americans, not all of those American users are football fans, but there are millions and millions of people who represent a very large potential audience for advertisers,” she says. While TV gives advertisers a tool to reach a wide swath of consumers, Facebook gives them an even bigger audience that they can finely target, she says.
• Facebook recognizes this and is emphasizing to potential advertisers that, in addition to football fans, they can reach people discussing party planning, sharing recipes, buying a new flat-screen TV, the half-time show, or chattering about ads, a spokeswoman says. Facebook declined to say what it is charging marketers to advertise in the Super Bowl feed.
• While 115 million U.S. consumers watched the Super Bowl last year, Facebook says 170 million people saw Super Bowl-related posts and ads last year. By developing a dedicated feed, Facebook aims to grow that number.
Source: https://www.internetretailer.com/2015/01/30/facebooks-new-super-bowl-ad-play
++Insight+1++ from Nielsen, Spredfast, Rentrak:
We also know that 40% of U.S. tablet and smartphone users visit a social network while watching TV. Five of the top 10 primetime TV shows integrate social media online and/or on-air: NBC Sunday Night Football, both nights of The Voice, and both nights of X Factor. In addition, Spredfast reaches 135 million people each week through our on-air social visualizations, which is 40% of the U.S. population. Rentrak’s scale allows us to sell on cycles up to 28 days for most shows because we have tremendous coverage across users.
Reading for more references:
Nielsen-cross-platform-report-march-2014.pdf
Do display ad influence search.pdf
Tech Trends 2014 Inspiring Disruption – Deloitte.pdf
Accenture_Technology_Vision_2014.pdf
Social_Shopping_2011_Brief1.pdf
Social_Media_Analytics_-_Sample_report_-_Marketing_effectiveness.pdf
13926_di_social_q413_v5.pdf