2. The Role of Social Media in Research
Marketers rely on several variants of
marketing research to make decisions.
Options include:
Secondary research – information already
collected and available for use.
Primary research – information collected
solely for the research purpose at hand.
3. Primary Social Media Research
The possible approaches to collecting primary data
in social spaces include the use of consumer diaries,
interviews and focus groups, surveys, and
experiments.
4. Qualitative Social Media Research
Observational Research involves recording
behavior or the residual evidence of behavior.
5. Qualitative Social Media Research
Ethnographic Research occurs when marketing
researchers conduct field research by visiting
people’s homes and offices to observe them as they
go about everyday tasks.
Netnography is a rapidly growing research
methodology that adapts ethnographic research
techniques to study the communities that emerge
through computer-mediated communications.
7. Quantitative Social Media Research
Monitoring and Tracking
Social media monitoring involves carefully
choosing the appropriate keywords and searching
the relevant social communities.
This process answers four basic questions:
1. How many times was the search term found?
2. When was the search term found?
3. Where was the search term found?
4. Who mentioned the search term?
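The four monitoring questions above can be sketched in code. This is a minimal Python illustration, not a real monitoring tool: the posts, brand name, and field names are all hypothetical stand-ins for data a crawler or platform API would supply.

```python
from collections import Counter

# Hypothetical monitoring data: each post records when, where, and by whom
# a brand-related conversation occurred.
posts = [
    {"text": "Loving the new BrandX latte!", "date": "2024-05-01",
     "site": "twitter.com", "author": "coffeefan"},
    {"text": "BrandX service was slow today.", "date": "2024-05-02",
     "site": "facebook.com", "author": "dailyposter"},
    {"text": "Anyone tried the BrandX rewards app?", "date": "2024-05-02",
     "site": "reddit.com", "author": "coffeefan"},
]

def monitor(posts, term):
    """Answer the four basic monitoring questions for a search term."""
    hits = [p for p in posts if term.lower() in p["text"].lower()]
    return {
        "how_many": len(hits),                     # 1. How many times?
        "when": Counter(p["date"] for p in hits),  # 2. When?
        "where": Counter(p["site"] for p in hits), # 3. Where?
        "who": Counter(p["author"] for p in hits), # 4. Who?
    }

summary = monitor(posts, "BrandX")
print(summary["how_many"])  # 3 matching posts
```

In practice the same tallies would be produced by a monitoring service over millions of posts; the structure of the answer, counts grouped by time, venue, and author, is the same.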
8. Quantitative Social Media Research
Sentiment Analysis refers to
determining how people think or feel
about an object.
The analysis consists of four steps:
1. Crawl, fetch, and cleanse.
2. Extract entities of interest.
3. Extract sentiment.
4. Aggregate raw data into a summary.
9. Quantitative Social Media Research
The challenges of sentiment analysis
Accuracy in gauging sentiment with automated tools
Cultural factors, linguistic nuances and differing contexts
Defining the sentiment dictionary
Accuracy in the categorical data needed to make better
use of data
10. Quantitative Social Media Research
Content Analysis
Deep insight into the text (or other content)
Pieces of information are classified and analyzed
for themes
12. Caution! Research Errors and Biases
Coverage and Sampling Errors
Coverage error is the result of a failure to cover all
components of a population being studied.
Sampling error is the result of collecting data from only a subset,
rather than all, of the members of the sampling frame; it increases
the chance that the results will not accurately represent the population.
13. Caution! Research Errors and Biases
Nonresponse Bias
Nonresponse error is the potential that those
units not included in the final sample are
significantly different from those that were.
Secondary research may be internal, published publicly, or available via syndicated sources. Secondary data might include background on the market, industry, competitors, and the brand’s history.
Primary research can help marketers understand consumers in the market, including psychological makeup, spending and media consumption patterns, and responsiveness to message appeals and offers.
Netnography is an unobtrusive approach to research with a key benefit of observing what is likely to be credible information, unaffected by the research process. Many marketers already use a very informal and unsystematic form of netnography by simply exploring relevant online communities. However, to minimize the limitations of netnography, researchers should be careful in their evaluations by employing triangulation to confirm findings whenever possible.
Step 1: Crawl, fetch, and cleanse. Data from the sources are collected using web crawlers. These are the same types of programs search engines use to catalog web pages. Using the word-phrase dictionary, the crawlers select only the content that appears to be relevant based on matches with the dictionary. This process is called fetching or web scraping. The scraped data need to be cleansed to eliminate unnecessary formatting before moving forward. A text classifier (built from the dictionary) is then applied to the data to filter out any irrelevant content that made it into the data set.
Step 2: Extract entities of interest. From this filtered set of content, relevant posts are extracted. The data are filtered again using rules to tag the entities of interest and further narrow the data set.
Step 3: Extract sentiment. From there, the analyst can begin sentiment extraction using sentiment indicators. These are words or other cues used to indicate positive or negative sentiment. A sentiment dictionary specifies sentiment indicators and rules to be used in the analysis.
Step 4: Aggregate raw sentiment data into a summary. Raw sentiments are then aggregated creating a sentiment summary.
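The four steps can be sketched end to end in a few lines of Python. This is a toy illustration under loud assumptions: the pages, topic terms, and sentiment dictionary are invented, crawling is simulated by a hard-coded list, and real systems use far richer lexicons and classifiers.

```python
import re

# Hypothetical word-phrase dictionary and sentiment dictionary.
TOPIC_TERMS = {"brandx"}
SENTIMENT_DICT = {"love": 1, "great": 1, "delightful": 1,
                  "slow": -1, "terrible": -1, "broken": -1}

# Step 1: crawl/fetch is simulated by this list; cleansing strips markup.
raw_pages = [
    "<p>I love BrandX coffee, great taste!</p>",
    "<p>BrandX checkout was slow and terrible.</p>",
    "<p>Unrelated post about gardening.</p>",
]
cleansed = [re.sub(r"<[^>]+>", "", page) for page in raw_pages]

# Step 2: extract entities of interest (content mentioning the topic terms).
relevant = [t for t in cleansed
            if any(term in t.lower() for term in TOPIC_TERMS)]

# Step 3: extract sentiment by summing the scores of sentiment indicators.
def score(text):
    words = re.findall(r"\w+", text.lower())
    return sum(SENTIMENT_DICT.get(w, 0) for w in words)

scores = [score(t) for t in relevant]

# Step 4: aggregate raw sentiment into a summary.
summary = {
    "posts": len(scores),
    "positive": sum(s > 0 for s in scores),
    "negative": sum(s < 0 for s in scores),
}
print(summary)  # {'posts': 2, 'positive': 1, 'negative': 1}
```

Note how the gardening page is dropped in Step 2 and each remaining post contributes one raw sentiment score to the Step 4 summary; this word-counting scorer also exhibits exactly the nuance problems discussed next.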
• First and foremost is accuracy in gauging sentiment with automated tools. The sheer volume of conversation creates an information overload issue for most brands wanting to use social media monitoring and research. The solution is the use of an automated system, but these systems still struggle with accuracy in the coding of meaning.
• Cultural factors, linguistic nuances, and differing contexts all make it difficult to code text into negative, neutral, or positive categories. Consider this example: A search on attitudes toward the movie Julie & Julia revealed a positive sentiment score from 77 percent of tweets related to the movie. But some tweets may have been miscoded. A tweet reading “Julie and Julia was truly delightful! We all felt hungry afterwards” was coded as negative, because the word “hungry” was treated as a negative sentiment indicator. A person could see that this statement was positive for the movie, but the software program couldn’t. Linguistic nuances make it difficult for mining software to achieve better accuracy levels. A chocolate torte described as wickedly sinful would be coded as negative, when in fact the descriptor is positive.
• Defining the sentiment dictionary can also be a challenge, ultimately affecting whether the right words are extracted. Words can have many meanings. Take BP, for instance. When the oil spill in the Gulf of Mexico created a public relations crisis for the company, measuring sentiment before and after recovery steps and announcements was a useful tool for gauging damage control for the brand’s image. But in a world of acronyms, BP may mean blood pressure, border patrol, business plan, Brad Pitt, or bipolar disorder.
• Accuracy in the categorical data needed to make better use of data is also an issue. It’s difficult to gauge who is making comments (which segments they represent) in terms of demographic and geographic descriptors. Conversation origin may be identifiable using the URL, the IP address, or the language used, but all of these methods have flaws. The URL and IP address are not always helpful (take Facebook, for instance, with users around the world). Language indicators likewise leave a lot to be desired.
To conduct a content analysis, the text is coded, or broken down, into manageable categories on a variety of levels (word, word sense, phrase, sentence, and theme) and then examined further for interpretation. Using codes, labels that classify and assign meaning to pieces of information, analysts can identify the themes reflected in the comments.
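The coding step can be illustrated with a small Python sketch. The codebook and comments below are hypothetical; in a real content analysis the codes are developed iteratively from the material itself rather than fixed in advance.

```python
# Hypothetical codebook: each theme code is defined by keyword indicators.
CODEBOOK = {
    "price": ["expensive", "cheap", "cost"],
    "service": ["staff", "rude", "helpful"],
    "quality": ["fresh", "stale", "taste"],
}

comments = [
    "The staff were so helpful!",
    "Way too expensive for stale pastries.",
    "Love the taste but the cost adds up.",
]

def code_comment(comment):
    """Assign theme codes to a comment based on keyword matches."""
    text = comment.lower()
    return {theme for theme, words in CODEBOOK.items()
            if any(w in text for w in words)}

coded = [code_comment(c) for c in comments]
# Tally how often each theme appears across all comments.
theme_counts = {theme: sum(theme in c for c in coded) for theme in CODEBOOK}
print(theme_counts)  # {'price': 2, 'service': 1, 'quality': 2}
```

A comment can carry several codes at once (the third one touches both quality and price), which is why the tallies can sum to more than the number of comments.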
In survey research, sampling error is associated with how we draw the sample, using either a probability or non-probability method. Social media researchers follow the same guidelines when collecting data, but two situations create additional sampling-error concerns: (1) the echo effect and (2) the participation effect. The actual number of conversations may not always be what it seems.
If there are relevant differences, the results based on participating units may not accurately reflect the population of interest. For example, people who are willing to take a 30-minute phone survey may be different from those who aren’t.