Text Analytics Presentation
1. An Introduction to Text Analytics in IBM SPSS Modeler
Skylar Ritchie
Shawn Bergman
2. Objectives
1. To give a broad overview of text analytics…
a. Defining key terms
b. Describing important steps in the process
2. To provide a step-by-step tutorial for how to use IBM SPSS Modeler to...
a. Read in source text
b. Extract concepts, sentiment, and text link patterns from records
c. Categorize records
d. Visualize the results
4. Text Analytics
“The process of deriving high quality information from text”
--Marisa Peacock, Social Media Strategist
“A technology and process both, a mechanism for knowledge discovery applied to documents, a means of finding value in text. Solutions…analyze linguistic structure...discern entities...as well as relationships, concepts, and even sentiments. They...automate classification...of source documents. They exploit visualization for exploratory analysis.”
--Seth Grimes, Analytics Strategy Consultant
1. Extraction: to discern entities, relationships, concepts, and sentiments
2. Categorization: to automate classification
3. Visualization
5. What does text analytics “look” like?
1: Source Text
• File
• Web Feed
2: Dictionaries
• Substitution
• Type
• Exclude
3: Extraction Results
• Concepts
• Types
• Text Link Analysis Patterns
4: Grouping Techniques
• Concept Inclusion
• Concept Root Derivation
• Semantic Network
• Co-occurrence
5: Categorization Results
• Categories
• Descriptors
Sourcing (Step 1)
Extracting (Steps 2-3)
Categorizing (Steps 4-5)
Visualizing
Handout provided
9. Key Terms
Types: higher-level concepts (example: <Organization>)
Concepts: lead terms under which similar terms are grouped together (example: university)
Terms: single words (uni-terms) or word phrases (multi-terms) that are interesting or relevant (examples: university, college, school, academy, institute, polytechnic, alma mater, graduate school…)
Handout provided
10. Substitution Dictionary: Terms → Concepts
An editable collection of synonymous terms grouped under a target term, or concept
Target Term    Synonyms
university     university, college, school, academy, institute, polytechnic, alma mater, graduate school…
student        student, scholar, undergraduate, graduate, grad student, postdoctoral fellow, freshman, sophomore, junior, senior…
professor      professor, prof, tenured faculty member, dean, assistant professor, associate professor, lecturer, academic…
[Diagram: the terms graduate school, college, and university are grouped under the concept university]
11. Type Dictionary: Concepts → Types
An editable collection of concepts grouped under a label known as the type name
Concept Type
5 star <Positive>
a lot better <Positive>
beyond my expectations <Positive>
abhor <Negative>
bizarre <Negative>
can’t stand <Negative>
all about the same <Uncertain>
been with it for too little time <Uncertain>
can’t think of any <Uncertain>
12. Exclude Dictionary
An editable collection of terms and types that will be removed from the final extraction results
Exclude List
any kind of problem
can’t say enough
can’t wait
i was out of
if it ain’t broke, don’t fix it
prefer not to
to work with
went down to
13. Text Link Analysis (TLA)
A pattern-matching technology that is used to extract relationships found between…
• Either concepts
• Or types
Example 1
• Type pattern: <Organization> + <Positive>
• Concept pattern: university + excellent
• Sentence: “This is a 5 star university”
Example 2
• Type pattern: <Unknown> + <Unknown> + <Negative>
• Concept pattern: undergraduates + lecturers + dislike
• Sentence: “Undergraduates abhor mere lecturers”
Handout provided
15. Key Terms
Categorization: the process of assigning records to a category when the text within them matches a descriptor
Category: higher-level ideas that capture the central message of the text
Descriptor: concepts, types, patterns, and category rules that have been used to define a category
Descriptors
• Concepts
• Types
• TLA patterns
• Category rules
16. Category Rules
Statements that classify records into a category based on a logical expression using extracted concepts, types, and patterns as well as Boolean operators
Operator   Meaning                       Examples
+          “And” (order important)       <Organization> + <Positive>; university + excellent
&          “And” (order not important)   <Positive> & <Organization>; excellent & university
|          “Or”                          <Person> | <Organization>; student | university
!()        “Not”                         !(<Person>); !(student)
Matching Sentence: This is a 5 star university
Handout provided
17. Wildcard Operator
The Boolean operator * that acts as a variable and stands in for a missing word or word fragment
Usage                  Example       Matching Phrases
Space after word       graduate *    graduate school; graduate student
Space before word      * graduate    university graduate
No space after word    graduate*     graduates; graduated
No space before word   *graduate     undergraduate
18. Grouping Techniques
The mechanisms underlying the categorization process
Extraction Results
• Concepts
• Types
• Text Link Analysis Patterns
Grouping Techniques
• Concept Inclusion
• Concept Root Derivation
• Semantic Network
• Co-occurrence
Categorization Results
• Categories
• Descriptors
Handout provided
19. Concept Inclusion
What?
Grouping based on subsets and supersets
How?
1. Breaking concepts into components
2. De-inflecting components
When?
Text that is somewhat technical
Concepts: graduate faculty; faculty committees; tenured faculty members
Components: {graduate, faculty}; {faculty, committees}; {tenured, faculty, members}
De-inflected Components: {graduate, faculty}; {faculty, committee}; {tenure, faculty, member}
Descriptor (shared de-inflected component): faculty
20. Concept Root Derivation
What?
Grouping based on morphological relationships
How?
1. Breaking concepts into components
2. De-inflecting components
3. Removing suffixes to find root
When?
Any text, but few categories
Concepts: studies in psychology; psychological studies; noteworthy psychologist
Components: {studies, psychology}; {psychological, studies}; {noteworthy, psychologist}
De-inflected Components: {study, psychology}; {psychological, study}; {noteworthy, psychologist}
Descriptor (shared de-inflected component root): psycholog-
21. Semantic Network
What?
Grouping based on semantic relationships
How?
• Synonyms: “are” relationship
• Hyponyms: “is a” relationship
When?
Text that is not highly technical
Category: educators (synonyms: professors, teachers)
Category: social science (hyponym: psychology)
22. Co-occurrence
What?
Grouping based on concepts that
appear together
How?
$$C_{XY} \geq 2 \;\rightarrow\; \text{Similarity} = \frac{(C_{XY})^2}{C_X \times C_Y}$$
When?
Any text, but many categories based on possibly distant relationships
Example                                    Concepts
Students flock to ASU                      students = W; ASU = X
ASU focuses on sustainability              ASU = X; sustainability = Y
Sustainability is the way of the future    sustainability = Y; way of the future = Z
$$C_W = 1, \quad C_X = 2, \quad C_Y = 2, \quad C_Z = 1$$
$$C_{WX} = 1, \quad C_{XY} = 1, \quad C_{YZ} = 1$$
$$\frac{(C_{WX})^2}{C_W \times C_X} = \frac{1^2}{1 \times 2} = \frac{1}{2}$$
$$\frac{(C_{XY})^2}{C_X \times C_Y} = \frac{1^2}{2 \times 2} = \frac{1}{4}$$
$$\frac{(C_{YZ})^2}{C_Y \times C_Z} = \frac{1^2}{2 \times 1} = \frac{1}{2}$$
23. Extraction v. Categorization
         Extraction                                                      Categorization
Ends     To discover what records contain                                To classify records based on what they contain
Means    Substitution dictionary; type dictionary; exclude dictionary    Concept root derivation; concept inclusion; semantic network; co-occurrence
Output   Concepts; types; TLA patterns                                   Categories; descriptors (concepts, types, TLA patterns, category rules)
29. Sourcing an Excel File
1. Click the tab
2. Double click the node or click and drag it into the
stream
3. Double click the node within the stream or right
click and click Edit
4. Click on the tab
5. Select the
6. Select the
7. A
8. Select
9. Click Ok
30. Starting Interactive Workbench Session with…
• Basic Resources Template: less information in substitution, type, and exclude dictionaries; no categories
• Opinions Template: more information in substitution, type, and exclude dictionaries; no categories
• Opinions Text Analysis Package: more information in substitution, type, and exclude dictionaries; pre-built categories
Handout provided
32. Starting an Interactive Workbench Session with the Basic Resources Template
1. Click the tab
2. Double click the node or click and drag it into
the stream
3. Double click the node within the stream or right
click and click Edit
4. Click on the tab
5. Select the
6. Click on the tab
7. Select
8. Click
36. Starting an Interactive Workbench Session with the Opinions Template
1. Double click the node within the stream or right
click and click Edit
2. Click on the tab
3. Click
4. Select
5. Click Ok
6. Click
41. Starting an Interactive Workbench Session with the Opinions Text Analysis Package
1. Double click the node within the stream or right
click and click Edit
2. Click on the tab
3. Select
4. Click
5. Select
6. Click
7. Click
48. Editing the Substitution Dictionary
1. Right click on the concept
2. Select Add to Synonym
3. Click New
4. Create the target term to which you want to assign the
synonym
5. Click Ok
6. Click
52. Editing the Type Dictionary
1. Right click on the concept
2. Select Add to Type
3. Click More
4. Select the type to which you want to assign the concept
5. Click Ok
6. Click Ok again
7. Click
58. Using the Opinions Template for…
Extraction by…
• Editing the…
  • Substitution Dictionary
  • Type Dictionary
  • Exclude Dictionary
• Extracting TLA Patterns
Categorization by…
• Automatically Building Categories
• Manually Categorizing…
  • Concepts
  • Types
  • TLA Patterns
• Manually Creating Category Rules
59. Extracting TLA Patterns
1. In the Text Link Analysis View, click
2. Select a type pattern to see the concept patterns that
correspond to it
3. Click to see the concepts and type webs
corresponding to these patterns
68. Manually Categorizing Concepts
1. Select the concept you want to categorize
2. Click
3. Select the category to which you want to assign the
concept:
4. Click Ok
72. Manually Categorizing Types
1. Select the type you want to categorize
2. Click
3. Select the category to which you want to assign the type or create a new category:
4. Click Ok
76. Manually Categorizing TLA Patterns
1. Select the TLA pattern you want to categorize
2. Click
3. Select the category to which you want to assign the TLA pattern or create a new category:
4. Click Ok
79. Manually Creating Category Rules
1. Right click on the category for which you want to create
a rule
2. Click Create Category Rule
3. Create your rule by…
1. Dragging concepts or types into the Rule Editor
2. Combining them with Boolean operators
4. Click to see how many records match
5. Click
81. Using the Opinions Text Analysis Package for…
• Manually Adjusting Categories
• Generating Model
• Converting Model Categories to Fields
• Deriving...
  • Total Negativity Score
  • Overall Sentiment Score
• Visualizing Model Results
Handout provided
85. Manually Adjusting Categories
1. Right click on the category or categories that you want
to adjust
2. Select either Move to Category or Merge Categories or
Edit > Delete
89. Generating Model
1. Once you are satisfied with the categories you have
created, click
2. Drag the newly created modeling node
into your stream
3. Right click on your source node
4. Click Connect
5. Click on your modeling node to connect the
two nodes
92. Converting Model Categories to Fields
1. Right click on your modeling node
2. Click Edit
3. Click on the tab
4. Select
5. Change the
6. Click Ok
94. Deriving a Total Negativity Score
1. Click on the tab
2. Double click the node or click and drag it into the
stream
3. Double click the node within the stream or right
click and click Edit
4. Give a descriptive name to your
5. Click to create a formula
6. In Expression Builder, click on a category that you want to
be in your formula
7. Click to add it
8. Click on an operator such as
9. Add another category
10. When you are finished, click Ok
11. Repeat the process to create additional formulas
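The formula built in these steps can be pictured outside Modeler with a rough Python sketch. It assumes that each converted category field is a true/false flag and that the operator chosen in step 8 is addition; the slide does not preserve the operator, and the field names below are invented for illustration.

```python
# Hypothetical category fields produced by the model: True when the record
# was assigned to that category. Field names are invented for illustration.
records = [
    {"Neg: Training": True, "Neg: Equipment": True, "Pos: Training": False},
    {"Neg: Training": False, "Neg: Equipment": False, "Pos: Training": True},
]

# Assumption: the negativity score adds up the negative category flags.
NEGATIVE_FIELDS = ["Neg: Training", "Neg: Equipment"]


def total_negativity(record):
    """Count how many negative categories a record was assigned to."""
    return sum(int(record[field]) for field in NEGATIVE_FIELDS)


scores = [total_negativity(record) for record in records]
print(scores)  # [2, 0]
```

A record that lands in more negative categories receives a higher score, which is the idea behind adding the category fields together in the Expression Builder.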
96. Deriving an Overall Sentiment Score
1. Click on the tab
2. Double click the node or click and drag it into
the stream
3. Double click the node within the stream or right
click and click Edit
4. Give a descriptive name to your
5. Select
6. Define field settings:
7. Click Ok
99. Visualizing Model Results
1. Click on the tab
2. Double click the node or click and drag it into
the stream
3. Double click the node within the stream or right
click and click Edit
4. Click on the tab
5. Select
6. Select overlay:
7. Select
8. Click
101. Summary
1. To give a broad overview of text analytics…
a. Defining key terms
b. Describing important steps in the process
2. To provide a step-by-step tutorial for how to use IBM SPSS Modeler to...
a. Read in source text
b. Extract concepts, sentiment, and text link patterns from records
c. Categorize records
d. Visualize the results
102. Additional Resources
• User’s Guide:
http://public.dhe.ibm.com/software/analytics/spss/documentation/modeler/17.0/en/ModelerTextAnalytics.pdf
• Introduction to SPSS Text Analytics Webinar:
https://www.youtube.com/watch?v=tK-o4MnRScQ&list=WL&index=2
Editor's Notes
Having read the 225-page User’s Guide cover to cover and watched countless videos on Modeler, I can personally attest that the two most difficult aspects of learning the software are…
Distinguishing between terms that look similar, but signify very different ideas
Coming up with an organizational framework for understanding the many things you can do in Modeler
The first half of the presentation is dedicated to the first difficulty, and the second half of the presentation, to the second
The overriding goal of this presentation is for you to feel as though you can explore the software for yourselves
In putting it together, I tried to focus only on the essentials, and even though I only scratched the surface of what the software can do, we will have to hustle to make it through everything
However, we will post this presentation with all of its examples and videos on the Office of Research Consultation website so that you can use it as a resource and refer back to it when you need it
In the interest of time, I am going to cover the first half of the presentation relatively quickly, but if I am moving too quickly, please do not hesitate to ask questions and slow me down—just understand that we may not get to everything and that you may have to watch some of the videos at the end for yourself
Let’s start with a definition of text analytics
One thing both of these definitions have in common is that they both describe text analytics as a process
Furthermore, both definitions describe the outcome of this process in similar terms: the outcome is high quality information, knowledge, and value
The second definition, however, is somewhat more descriptive than the first, since it enumerates the principal steps in this process
Those steps are to…
Discern entities, relationships, concepts, and sentiments, something IBM calls extraction
Automate classification—something IBM calls categorization
Visualize the results
In my presentation today, I will first describe these steps in greater detail and then show you how to perform them for yourself
So what does this process look like?
On the macro-level, the process involves four primary steps:
Reading in source text
Extracting linguistic entities, relationships, and sentiment
Categorizing records
Visualizing the results
On more of a micro-level, the primary steps of extracting and categorizing can be broken down further:
Extraction involves passing the source text through a variety of dictionaries (to be described in greater detail) in order to identify…
Concepts
Types
Text Link Analysis patterns
Categorization involves taking these extraction results and applying a number of grouping techniques in order to create categories and descriptors that classify records
These diagrams depict text analytics as a linear process; however, as the User’s Guide repeatedly emphasizes, text analytics is an iterative process, so a more accurate depiction might include a feedback loop
Let’s take a look at the first step in the text analytics process: sourcing
Source text can take the form of either a computer file (such as an Excel file) or a Web feed (such as an RSS feed with various web links)
Since the focus of today’s presentation is to demonstrate how to perform extraction, categorization, and visualization, I will use an Excel file as source text
Using a Web feed as a source is a little less straightforward, but if you are interested in that as well, I can make that the topic of a future presentation
Within an Excel file, you have worksheets, whose columns are known as “fields” and whose cells are referred to as either “documents” or “records,” two terms that IBM uses interchangeably
For the sake of simplicity, I will refer to them in the future as records
Let’s turn now to the second main step in text analytics: extraction
Here for the first time we encounter a number of terms that look similar, but signify very different ideas
In fact, these ideas are arranged hierarchically, with “terms” at the bottom and “types” at the highest level of abstraction
Terms and concepts are always written as lowercase words or word phrases, and types are always enclosed in brackets
The general types that come with the Core Library—more on that later—include <Person>, <Product>, <Organization>, and <Location>
But types in other more specific libraries can themselves be more specific:
Types in the Opinions Library include <Positive>, <Negative>, <Contextual>, and <Uncertain> among others
Types in the Employee Satisfaction Library include <CoWorker>, <Management>, <Benefits>, and <WorkLifeBalance> among others
As I mentioned earlier, there are several linguistic dictionaries that are instrumental in the extraction process
The first of these is known as a substitution dictionary, and it is responsible for grouping terms under what are called target terms or concepts
The computer scans all of the records, and whenever it finds synonymous terms, it essentially rewrites them as the target term
It is important to note that this dictionary, like all the others, is editable
So if, for example, you want to distinguish between “universities” and “institutes,” you can separate the two terms in your substitution dictionary
And if, on the other hand, you want to use two terms synonymously, you can combine them in this dictionary
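A minimal Python sketch can make the substitution step concrete. It is not how Modeler is implemented; the entries come from the slide, while the names `SUBSTITUTIONS` and `to_concept` are hypothetical and the lookup is far simpler than the real extractor.

```python
# Toy substitution dictionary: target term (concept) -> synonymous terms.
# Entries come from the slide; the structure is an assumption, not Modeler's format.
SUBSTITUTIONS = {
    "university": ["university", "college", "school", "academy", "institute"],
    "student": ["student", "scholar", "undergraduate", "grad student"],
    "professor": ["professor", "prof", "lecturer", "academic"],
}

# Invert the dictionary so that each term points at its target term.
TERM_TO_CONCEPT = {
    term: concept
    for concept, terms in SUBSTITUTIONS.items()
    for term in terms
}


def to_concept(term):
    """Return the target term for a term, or the term itself if it is unlisted."""
    return TERM_TO_CONCEPT.get(term.lower(), term.lower())


print(to_concept("college"))   # university
print(to_concept("Lecturer"))  # professor
print(to_concept("stadium"))   # stadium (not in the dictionary)
```

In this picture, distinguishing “institute” from “university” amounts to moving “institute” out of the university list and giving it its own target term.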
The second linguistic dictionary is known as a type dictionary, and as its name implies, it is responsible for grouping concepts under their respective types
Here the computer assigns a higher-level descriptive label to the concepts themselves, and although it is generally pretty good at assigning types when given some kind of context, if it is not given context, it will often assign the type <Unknown>
The third and final linguistic dictionary is known as the exclude dictionary, and as its name suggests, whatever it contains is excluded from the final extraction
As you peruse this dictionary, you might find a term or phrase that you do want to extract, and by deselecting it in this dictionary, you can ensure that it shows up in the extraction results
There is also a way to assign unwanted terms and phrases to the exclude dictionary
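The interplay of the type and exclude dictionaries can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: the entries are taken from the slides, the function name `assign_types` is hypothetical, and context-sensitive typing is reduced to a plain lookup.

```python
# Toy type and exclude dictionaries; entries are taken from the slides.
TYPE_DICTIONARY = {
    "5 star": "<Positive>",
    "a lot better": "<Positive>",
    "abhor": "<Negative>",
    "bizarre": "<Negative>",
    "all about the same": "<Uncertain>",
}
EXCLUDE_LIST = {"can't wait", "prefer not to", "to work with"}


def assign_types(concepts):
    """Drop excluded concepts, then label the rest; unlisted ones get <Unknown>."""
    results = {}
    for concept in concepts:
        if concept in EXCLUDE_LIST:
            continue  # excluded concepts never reach the extraction results
        results[concept] = TYPE_DICTIONARY.get(concept, "<Unknown>")
    return results


extracted = assign_types(["5 star", "abhor", "to work with", "lecturers"])
print(extracted)
```

Removing “to work with” from `EXCLUDE_LIST` would let it through to the results, which mirrors deselecting an entry in the exclude dictionary.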
Text Link Analysis (or TLA) is where text analytics really demonstrates its value
TLA patterns are the fourth and final kind of extraction results
Whereas the other extraction results (terms, concepts, and types) represent a single linguistic unit, TLA patterns represent the relationships between these units and can express the meaning of an entire sentence with a subject, verb, and object
As the examples at right indicate…
Patterns can contain 2 or more concepts or types
Order is important (indicated by the + operator), but sentiments always come last
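The ordering convention can be illustrated with a short Python sketch. It only mimics the “sentiment comes last” rule on concepts that have already been extracted and typed; real TLA relies on linguistic pattern rules, and the function name `build_pattern` is hypothetical.

```python
# Types treated as sentiments in this sketch (taken from the Opinions examples).
SENTIMENT_TYPES = {"<Positive>", "<Negative>", "<Uncertain>"}


def build_pattern(tagged):
    """Order (concept, type) pairs so that sentiment types come last."""
    # sorted() is stable, so non-sentiment items keep their sentence order.
    ordered = sorted(tagged, key=lambda pair: pair[1] in SENTIMENT_TYPES)
    type_pattern = " + ".join(type_name for _, type_name in ordered)
    concept_pattern = " + ".join(concept for concept, _ in ordered)
    return type_pattern, concept_pattern


# "Undergraduates abhor mere lecturers", after substitution and typing.
tagged = [
    ("undergraduates", "<Unknown>"),
    ("dislike", "<Negative>"),
    ("lecturers", "<Unknown>"),
]
print(build_pattern(tagged))
```

For this sentence the sketch yields the type pattern `<Unknown> + <Unknown> + <Negative>` and the concept pattern `undergraduates + lecturers + dislike`, matching the slide.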
Finally let’s turn to the third main step in text analytics: categorization
Whereas extraction involves bundling the terms, concepts, and types within records, categorization bundles the records themselves on the basis of what they contain
Descriptors determine whether or not a record is assigned to a given category, and descriptors can take the form of either concepts, types, TLA patterns, or category rules
Since we have already covered concepts, types, and TLA patterns, let’s move on and cover category rules
In one way, category rules are like TLA patterns: they often join concepts or types to describe a record and determine whether or not it belongs in a category
In another way, however, category rules are unlike TLA patterns
In the first place, they can use operators such as the ampersand or the vertical bar, in which case order is not important
(excellent & university) would capture the exact same records as (university & excellent)
In the second place, category rules can indicate the absence of something, whereas TLA patterns only focus on the presence of things
!(student) would capture all of the records that do not contain student, and this might be a considerable number
Usually, you would want to use the not operator in conjunction with another operator such as student & !(professor)
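The four operators covered so far can be sketched as small Python predicates. This assumes each record is represented as the ordered list of concepts extracted from it; that representation and the function names are hypothetical, chosen only to show how the operators differ.

```python
def has(record, concept):
    """True when the concept was extracted from the record."""
    return concept in record


def and_any_order(record, a, b):  # a & b
    return has(record, a) and has(record, b)


def and_in_order(record, a, b):  # a + b
    return and_any_order(record, a, b) and record.index(a) < record.index(b)


def either(record, a, b):  # a | b
    return has(record, a) or has(record, b)


def absent(record, a):  # !(a)
    return not has(record, a)


# Assumption: a record is the ordered list of concepts extracted from it.
record = ["university", "excellent"]

print(and_in_order(record, "university", "excellent"))   # True
print(and_in_order(record, "excellent", "university"))   # False
print(and_any_order(record, "excellent", "university"))  # True
print(either(record, "student", "university"))           # True
print(absent(record, "student"))                         # True
```

A combined rule such as student & !(professor) would be `has(record, "student") and absent(record, "professor")` in this sketch.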
The fifth and final Boolean operator is known as the wildcard, and you can think of it as a variable that represents a missing…
Prefix
Suffix
Or word that precedes or comes after a given word
If there is a space either before or after the wildcard, the wildcard represents a missing word
If, on the other hand, there is no space, then the wildcard only represents a part of a word
Wildcards can be useful for generalizing category descriptors, but in some instances, they can overgeneralize
For example, “graduated” can be either an adjective or a verb, and if it is an adjective, it can refer to an alumnus or to a cylinder, and depending on the context, you may want to capture one concept but not the other with your descriptor
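The spacing rule can be tried out with a small Python sketch that translates a wildcard descriptor into a regular expression. It assumes the wildcard must stand for at least one character, which the slides do not state, and the helper names are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard descriptor into a compiled, whole-phrase regex."""
    # Escape everything, then turn the escaped '*' into one or more word
    # characters. With a space beside it, that forces a separate word;
    # without one, it extends the neighbouring word.
    escaped = re.escape(pattern).replace(r"\*", r"\w+")
    return re.compile(rf"^{escaped}$", re.IGNORECASE)


def matches(pattern, phrase):
    """True when the whole phrase fits the wildcard descriptor."""
    return wildcard_to_regex(pattern).match(phrase) is not None


print(matches("graduate *", "graduate school"))      # True
print(matches("graduate *", "graduates"))            # False
print(matches("* graduate", "university graduate"))  # True
print(matches("graduate*", "graduated"))             # True
print(matches("*graduate", "undergraduate"))         # True
```

Note that `graduate*` matches “graduated” regardless of whether it refers to an alumnus or a cylinder, which is the overgeneralization risk described above.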
Having covered category rules, the fourth kind of category descriptor, let’s turn to the grouping techniques that generate both the categories and their descriptors
There are four of these: concept inclusion, concept root derivation, semantic networks, and co-occurrence
Concept inclusion is a grouping technique that involves breaking concepts into their component sets, de-inflecting these components, and then identifying areas of overlap
For example, let’s say you had the multi-term concepts “graduate faculty,” “faculty committees,” and “tenured faculty members”
These concepts would first be broken down into their component sets and then these sets would be de-inflected (e.g., converting nouns from plural to singular)
In the process at right, I have illustrated the de-inflection process by underlining the parts of the word that are removed in a subsequent step
In these component sets, the order of the words is not important; the only thing that is important for the concept inclusion technique is whether or not these component sets have areas of overlap
Concept inclusion is a technique that is relatively robust and works well on text that contains technical jargon
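The overlap check can be sketched in Python with sets. The de-inflection table below covers only the slide's example words and is an assumption; the real software relies on linguistic resources rather than a lookup like this.

```python
# Toy de-inflection table covering only the slide's example words.
DEINFLECT = {"committees": "committee", "members": "member", "tenured": "tenure"}


def components(concept):
    """Split a multi-term concept into a set of de-inflected components."""
    return {DEINFLECT.get(word, word) for word in concept.split()}


concepts = ["graduate faculty", "faculty committees", "tenured faculty members"]
component_sets = [components(concept) for concept in concepts]

# Word order is irrelevant in a set; only the overlap matters.
shared = set.intersection(*component_sets)
print(shared)  # {'faculty'}
```

Because {faculty} is a subset of every component set, “faculty” can serve as the descriptor that groups all three concepts.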
Concept root derivation employs a very similar process, but goes one step further—stripping words down to their morphological or structural roots so that areas of overlap can be identified
As you can see at right, “psychology,” “psychological,” and “psychologist” all have the same root—”psycholog-”—and the concepts can be grouped into categories on the basis of this similarity
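A rough Python sketch of the extra suffix-stripping step follows. The suffix list is a toy chosen to fit the slide's example and is an assumption; a real morphological analyzer is considerably more careful.

```python
from collections import defaultdict

# Toy suffix list for the slide's example only; longer suffixes are tried first.
SUFFIXES = ("ical", "ist", "y")


def root(word):
    """Strip the first matching suffix from a de-inflected component."""
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word


# De-inflected components for each concept, as shown on the slide.
deinflected = {
    "studies in psychology": ["study", "psychology"],
    "psychological studies": ["psychological", "study"],
    "noteworthy psychologist": ["noteworthy", "psychologist"],
}

# Group concepts under every root that one of their components reduces to.
groups = defaultdict(set)
for concept, parts in deinflected.items():
    for part in parts:
        groups[root(part)].add(concept)

print(sorted(groups["psycholog"]))
```

All three concepts end up under the root “psycholog”, even though no single de-inflected component is shared by all of them.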
Unlike concept root derivation, which categorizes concepts on the basis of morphological relationships, the semantic network technique looks for and categorizes concepts on the basis of semantic relationships, relationships having to do with word meanings
These semantic relationships generally take the form of either synonyms or hyponyms, where the former denotes an “are” relationship, and the latter, an “is a” relationship
“Professors” and “teachers,” for example, might be considered synonyms, since they both are educators
“Psychology,” on the other hand, is a hyponym of “social science,” since psychology is a social science
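The two relationship kinds can be pictured with a tiny Python lookup. The entries mirror the slide, and the data structure and the function name `categories_for` are assumptions; Modeler ships its own, much larger, semantic network.

```python
# Toy semantic network: category -> member concepts, split by relationship kind.
SYNONYM_GROUPS = {"educators": {"professors", "teachers"}}
HYPONYMS = {"social science": {"psychology"}}


def categories_for(concept):
    """Return the categories a concept falls under, with the relationship used."""
    found = []
    for category, members in SYNONYM_GROUPS.items():
        if concept in members:
            found.append((category, "synonym"))
    for category, members in HYPONYMS.items():
        if concept in members:
            found.append((category, "hyponym"))
    return found


print(categories_for("teachers"))    # [('educators', 'synonym')]
print(categories_for("psychology"))  # [('social science', 'hyponym')]
```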
The fourth and final grouping technique is that of co-occurrence
$C_{XY}$ represents the number of records in which two concepts co-occur; $C_X$, the number in which the first concept occurs; $C_Y$, the number in which the second occurs
Generally, concepts must co-occur two or more times in order for them to be categorized together; however, this setting can be adjusted either higher or lower
If your setting is high, you will generate fewer categories, but these categories will contain concepts that are more similar to each other
If your setting is low, you will generate more categories, but they will be more heterogeneous
Co-occurrence is a relatively straightforward technique, but if you are interested in how it computes a similarity coefficient for two concepts, several sample calculations are illustrated at right
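The sample calculations can be reproduced with a short Python sketch that counts concepts per record and applies the similarity formula $(C_{XY})^2 / (C_X \times C_Y)$. The three records are the slide's example sentences reduced to their concepts; the function name `similarity` is hypothetical, and the minimum co-occurrence threshold is left out, as it is in the slide's own examples.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# One set of extracted concepts per record, following the slide's example.
records = [
    {"students", "ASU"},
    {"ASU", "sustainability"},
    {"sustainability", "way of the future"},
]

single = Counter()  # C_X: number of records containing each concept
pair = Counter()    # C_XY: number of records containing both concepts
for concepts in records:
    single.update(concepts)
    pair.update(frozenset(p) for p in combinations(sorted(concepts), 2))


def similarity(x, y):
    """Compute (C_XY)^2 / (C_X * C_Y) as an exact fraction."""
    c_xy = pair[frozenset((x, y))]
    return Fraction(c_xy ** 2, single[x] * single[y])


print(similarity("students", "ASU"))                      # 1/2
print(similarity("ASU", "sustainability"))                # 1/4
print(similarity("sustainability", "way of the future"))  # 1/2
```

ASU and sustainability score lowest because each also appears in a record without the other, which enlarges the denominator.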
To sum up what we have said so far, extraction differs from categorization both in terms of its purpose or end and in terms of its means to that end
The purpose of extraction is to discover what records contain, whereas the purpose of categorization is to classify records on the basis of what they contain
The means used are also different
Extraction takes place by comparing records against a number of dictionaries
Categorization, on the other hand, involves applying a variety of algorithms to the extraction results to create categories
In this way, concepts, types, and TLA patterns are both output and input: output for the extraction process and input for the categorization process
They are what gets pulled out of records and what the software then turns around and uses to classify those records
Now that we have parsed out what the terminology means, let’s take a look at the software itself and see how to perform the various tasks associated with sourcing, extracting, categorizing, and visualizing
As I mentioned earlier, one difficulty in learning Modeler is distinguishing between terms that look similar; however, a second difficulty concerns organizing the many different tasks you can perform in Modeler
To surmount this second difficulty, I have provided a number of charts so that you can keep track of what we have done and what we are doing
If you have the data set, you may find it helpful to follow along on your computer
A stream is just your workspace, and it lays out in a visual fashion…
What data you are using
What processes you are running it through
The data set that you gave us to analyze is a focus group conversation about the strategic direction of the College of Business
Because you are probably less interested in moderator comments than you are in those of participants, you may want to filter out the moderator’s remarks in Excel before you start the analysis process
Templates initiate the extraction phase and pull out concepts and types
There are many different kinds of templates, some of which contain more in their substitution, type, and exclude dictionaries than others
There are also what are called text analysis packages (or TAPs) that come…
Not only with a wealth of information in their dictionaries
But also with a number of pre-built categories that you may be interested in when you are conducting your analysis
For example, there is a TAP for employee satisfaction surveys, and the categories that it comes with include positive and negative sentiment toward…
Coworkers
Managers
Communication
Job security
Benefits
Etc.
If you are not interested in all of the pre-built categories, you can delete or modify them to suit your preferences
Now that we have explored the extraction and categorization results with the Opinions Template, let’s move to the Opinions Text Analysis Package
As you’ll remember from the first part of the tutorial, the difference between a template and a text analysis package is that the former does not come with pre-built categories, whereas the latter does
Because the focus group conversation is not in the proper format with a question as the field header and each record as one person’s response to that question, we will switch to a slightly different data set that is in the proper format so that we can demonstrate the remaining capabilities
This data set is a questionnaire about a company’s safety program, and the field that we will be looking at has to do with what employees want the company to stop doing with regard to safety
Because this is an employee opinion questionnaire, we can use the employee opinion text analysis package