SMA-Unit-I: The Foundation for Analytics

Data Analytics
Accumulation of raw data captured from various sources (e.g., discussion boards, emails, exam logs, chat logs in e-learning systems) can be used to identify fruitful patterns and relationships.
Exploratory visualization – uses exploratory data analytics to capture relationships that are unknown, or at least not yet formally formulated.
Confirmatory visualization – theory-driven; used to examine relationships that have been formulated in advance.
© 2012 Dr. Yair Levy and Dr. Michelle M. Ramim – Chais 2012 Conference, February 16, 2012.
What Can Be Learned From Data Sets?
Business Intelligence (BI) or Business Analytics (BA)
Data analytics is an emerging technique that dives into a data set without a prior set of hypotheses.
The data yield meaningful trends or intriguing findings that were not previously seen or empirically validated.
Data analytics enables quick decisions or helps change policies based on the trends observed.
The Power of Data Analytics
Source: http://mobile.informationweek.com/80256/show/488f5c42fd3f92317e5ac29faeee033e/
Data Analytics vs. Statistical Analysis

Statistical Analysis
• Utilizes statistical and/or mathematical techniques
• Used based on a theoretical foundation
• Seeks to identify a significance level to address hypotheses or research questions (RQs)

Data Analytics
• Utilizes data mining techniques
• Identifies inexplicable or novel relationships/trends
• Seeks to visualize the data to allow the observation of relationships/trends
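To make the contrast concrete, the sketch below (not from the slides) runs both styles of analysis on a small synthetic e-learning dataset: a theory-driven t-test evaluated at a significance level, followed by an exploratory clustering with no prior hypothesis. The data, the activity threshold, and the libraries (NumPy, SciPy, scikit-learn) are all assumptions for illustration.

# A minimal sketch contrasting confirmatory statistical analysis with
# exploratory data analytics on hypothetical e-learning records.
import numpy as np
from scipy import stats
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)

# Hypothetical data: forum posts per student and final exam scores.
posts = rng.poisson(lam=12, size=200)
scores = 55 + 1.5 * posts + rng.normal(0, 10, size=200)

# Confirmatory / statistical analysis: a theory-driven hypothesis
# ("active forum users score higher") tested at a significance level.
active = scores[posts >= np.median(posts)]
passive = scores[posts < np.median(posts)]
t_stat, p_value = stats.ttest_ind(active, passive, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}  (reject H0 if p < 0.05)")

# Exploratory data analytics: no prior hypothesis, just look for structure
# (here, clusters of students) that can then be visualized and interpreted.
X = np.column_stack([posts, scores])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
for k in range(3):
    print(f"cluster {k}: {np.sum(labels == k)} students, "
          f"mean score {scores[labels == k].mean():.1f}")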
The Foundation for Analytics
The difference between social media analysis and traditional business intelligence is the competitive information that is readily available on social media.
Historically, companies have mostly dealt with their own data, and eventually added studies based on external data published by research organizations and so forth. Even in the digital age, web site analysis and digital advertising still bind most of the useful data to its owners, and being competitive has never been an easy task.
Competitive intelligence and other aspects of social media data and analytics create a new context for social media and a data-driven strategy.
Social Media Data Sources: Offline and Online
• Offline originated data – Data that has been generated with no connection to the Internet and then registered into a system, which may be accessed via the Internet later. Examples include physical retail, printed press, live events, telephone marketing and customer support, traditional television audience measurements, and so forth.
• Online originated data – Created from systems connected to the Internet. Examples include web sites, e-commerce, media streaming, e-mail, mobile applications, social media, online devices, and so forth.
Within these systems we might also have different points of data generation. The social media post itself is one source, for example, while the comments under that same post come from a different source. Considering everything that happens on social media, with all the different types of interactions and rapid sharing of content, there are many data sources within each social media network.
Defining Social Media Data
Within online generated data, only a portion is considered truly social media data. There is always a gray area of debate about what social media is and what it is not. To keep it simple, we can define social media data as data generated within a self-named social media platform.
Data Sources in Social Media Channels
Users of a social media channel can:
• Create a profile.
• Publish content.
• Praise or react to content (e.g., likes, favorites, etc.).
• Comment on content.
• Share content.
• Create groups and content only available to the group.
• Send direct messages and chat with other users.
• Connect to another profile (as a friend, follower, etc.).
• Purchase products and perform transactions.
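Each of these interaction types is a distinct data source. As a rough illustration (not from the text), a collection pipeline might normalize them into a single event record; the field names and enum values below are hypothetical.

# Hypothetical sketch of normalizing different social media interactions
# (posts, reactions, comments, shares, ...) into one event record.
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional

class InteractionType(Enum):
    PROFILE_CREATED = "profile_created"
    POST = "post"
    REACTION = "reaction"
    COMMENT = "comment"
    SHARE = "share"
    DIRECT_MESSAGE = "direct_message"
    CONNECTION = "connection"
    PURCHASE = "purchase"

@dataclass
class SocialEvent:
    platform: str                 # e.g., "facebook", "instagram"
    interaction: InteractionType  # which data source the event came from
    actor_id: str                 # user who performed the action
    target_id: Optional[str]      # post, profile, or product acted upon
    timestamp: datetime
    payload: dict                 # raw platform-specific fields

# Example: a "like" on a post, recorded as a reaction event.
event = SocialEvent(
    platform="facebook",
    interaction=InteractionType.REACTION,
    actor_id="user_123",
    target_id="post_456",
    timestamp=datetime(2024, 5, 1, 10, 30),
    payload={"reaction": "like"},
)
print(event.interaction.value, event.target_id)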
Estimated Data Sources and Factual Data Sources
• It is important to understand when data sources represent a fact and when they represent a possible fact or an estimate. Especially when going into paid media and content promotion, we come across many estimated, or sometimes questionable, sources.
• These estimated metrics include views, impressions, and reach, for example, which give the number of times a user has potentially seen a certain piece of content or advertising.
• With a large amount of high-quality data, we can reach very high-quality estimates, to the point where our estimates are correct and statistically validated every single time. That is the point where machines take over the decision-making process and automation kicks in to take care of such tasks.
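As a small illustration (the metric names and values are hypothetical), a reporting layer can carry an explicit flag so that factual counts and platform-estimated metrics are never mixed silently:

# Hypothetical sketch: tag each metric as a fact or an estimate so reports
# make the distinction explicit.
from dataclasses import dataclass

@dataclass
class Metric:
    name: str
    value: float
    is_estimate: bool  # True for reach/impressions-style metrics

metrics = [
    Metric("likes", 1_240, is_estimate=False),        # factual: counted events
    Metric("comments", 87, is_estimate=False),        # factual: counted events
    Metric("impressions", 54_000, is_estimate=True),  # estimated by the platform
    Metric("reach", 31_500, is_estimate=True),        # estimated by the platform
]

for m in metrics:
    label = "estimate" if m.is_estimate else "fact"
    print(f"{m.name:12s} {m.value:>10,.0f}  ({label})")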
Public and Private Data
• Public data is what anyone can see when navigating a social platform.
• Private data is what only the owner of the social media profile can see.
• If we are looking for competitive analytics and social media benchmarking, which is really a must-do for optimizing a social strategy, we need public data. Public-level data allows us to see the performance of our competitors' social media channels, or any of the available information that does not belong to us.
• Two points are important to remember:
  1. What is public can be easily compared.
  2. What is private, if it does not belong to us, can only be estimated.
• Some services, for example, offer to detect paid posts on Facebook. Paid information is private, so it is not available if we don't own the data. Paid-post detection therefore offers an estimate based on a machine learning process.
Data Gathering in Social Media Analytics
Data can be gathered in two ways when it comes to pulling human-action data from social media:
• Via an API (application programming interface)
• Via web crawling or scraping
API: Application Programming Interface
An API is a structured channel of access into an application. It allows a programmer to see a clear structure of the information that is stored in the application, and this structure points the programmer straight to the data that he or she is looking for. Facebook offers API access to its data: a programmer can look into Facebook and request any specific piece of information, for example, “total likes for a certain page post”.
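As a rough sketch of what such a request looks like (the endpoint version, field names, access token, and post ID below are placeholders and are not guaranteed to match the current Graph API), a programmer might ask Facebook's Graph API for the like count of a single post:

# Hypothetical sketch of an API request for a post's like count.
# The endpoint, fields, token, and post ID are placeholders; the real
# Graph API requires a registered app, permissions, and a valid token.
import requests

ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"   # assumption: obtained via a Facebook app
POST_ID = "1234567890_9876543210"    # assumption: a page post you can access

url = f"https://graph.facebook.com/v19.0/{POST_ID}"
params = {
    "fields": "likes.summary(true)",  # ask the API to summarize total likes
    "access_token": ACCESS_TOKEN,
}

response = requests.get(url, params=params, timeout=10)
response.raise_for_status()
data = response.json()

# The structured response points straight to the figure we asked for.
total_likes = data["likes"]["summary"]["total_count"]
print(f"Total likes for post {POST_ID}: {total_likes}")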
Web Crawling or Scraping
Everything we see on the Internet has source code driving it: a set of instructions for all the systems connecting and interacting with the data. A good-looking web site full of images has a set of hidden instructions telling the browser how to display all that information.
A programmer can tap into the source code and then crawl it for any specific information needed. Other terms are also used to describe this process, such as scraping. Web crawling/scraping is a very fragile and unstable way of gathering data, because when anything changes on the web site, the source code changes, and the programmer has to reprogram a new way to crawl for the information.
It is also likely that crawling will bump into the privacy rules of web sites, and the owners of those web sites will not like it very much. So whenever possible, API access is the way to go; APIs are what most analytics platforms rely upon.
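For contrast, here is a minimal scraping sketch. The URL and the HTML structure are hypothetical, and the selector will break as soon as the page layout changes, which is exactly the fragility described above.

# Hypothetical scraping sketch: fetch a page and pull a value out of its HTML.
# The URL and the CSS selector are assumptions about one page's layout; if the
# site changes its markup, this selector silently stops matching.
import requests
from bs4 import BeautifulSoup

URL = "https://example.com/some-public-page"  # placeholder URL

response = requests.get(URL, timeout=10, headers={"User-Agent": "demo-crawler"})
response.raise_for_status()

soup = BeautifulSoup(response.text, "html.parser")

# Assumption: the page exposes a like count in an element such as
# <span class="like-count">42</span>. Real pages rarely make it this easy.
element = soup.select_one("span.like-count")
if element is None:
    print("Selector no longer matches; the page layout may have changed.")
else:
    print("Scraped like count:", element.get_text(strip=True))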
