1. The Corpus of Business
Discourse
A Comparison of Accounting and HR Learners
Dr. Alfred Miller
@ACBSPAccredited #ACBSP2016
ACBSP Region 8
The International Council of Business Schools
and Programs
3. OBJECTIVE
•Explore Creating New Knowledge in the Classroom
•Build Awareness of Software-mediated Content Analysis
•Improve Teaching and Learning
4. Overview of Method
•Mixed-method grounded theory study
•Data mining and content analysis
•KH Coder, prepped with Stanford POS Tagger
•Data is student reflections from Taxation and HR
•Qualitative interpretation of quantitative data
•Five quantitative methods + intuition and induction
5. KH Coder Five Quantitative Methods
•Word frequency analysis
•Hierarchical cluster analysis
•Co-occurrence network
•Multi-dimensional scaling
•Self-organizing map
6. Grounded Theory
•Course reflections = Business discourse data
•HR section included an overseas study tour to Greece
•Taxation Accounting students learn U.S. Tax Code
•Data are coded and tagged; machine interaction
enables identification of emergent properties, from
which a new theory is constructed
7. Individual Reflection is submitted with
the Final Project: Outline below
•Introduction
•What did you learn about individual and
business taxation in BUS 4163 Taxation or
International HRM in BUS 4936?
•Challenges you faced?
•Your experience with the group project?
•Conclusion
8. Three Groups
•2015 Year 4 International
HR: 25 Students
•2015 Year 4 Taxation
(Accounting): 23 Students
•2016 Year 4 Taxation
(Accounting): 28 Students
9. Data Preparation
•Assemble reflections into a single master
document for each section
•Scrub learner names, headings, bullet points,
fonts and special treatments
•Separate each learner’s contribution using a
unique HTML tag: <h1>Learner1</h1>
•Save the data sample as a plain-text (.txt) file
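The separation step can be illustrated with a minimal sketch. It assumes a master document in which each reflection follows its own <h1>LearnerN</h1> tag; the function name and the two sample sentences are hypothetical, and KH Coder itself reads these tags without any such script.

```python
import re

def split_learners(master_text):
    """Return {learner_id: reflection_text} from a master document."""
    # re.split with a capturing group keeps the captured IDs in the result:
    # [text_before_first_tag, id1, body1, id2, body2, ...]
    parts = re.split(r"<h1>(.*?)</h1>", master_text, flags=re.IGNORECASE)
    ids, bodies = parts[1::2], parts[2::2]
    return {i.strip(): b.strip() for i, b in zip(ids, bodies)}

# Hypothetical sample text, not taken from the study data
sample = (
    "<h1>Learner1</h1>\nI learned how deductions work.\n"
    "<h1>Learner2</h1>\nThe group project was a challenge.\n"
)
reflections = split_learners(sample)
print(reflections)
```

Splitting this way is also a quick check that the number of tags matches the number of students in the section.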
10. Research Questions
• Q1. Is text-mining methodology an effective way to
reorganize, visualize and analyze text from business
student reflections?
• Q2. Is text analysis from student reflections a meaningful
way to assess creation of new knowledge in the classroom?
• Q3. Was new knowledge created as a result of the learners’
experience in the classroom?
11. Graduate Outcomes HR
• Global Awareness
• Propose the application of functional knowledge and managerial
insight through research on complex challenges facing human
resource management.
• Evaluate the relevance of various theories related to challenges
in human resource management using criteria derived from
industry-based knowledge and skills.
• Analyze challenges facing human resource management using
quantitative and qualitative analytical tools (Higher Colleges of
Technology, 2015).
14. Course Learning Outcomes Taxation
•Differentiate the principles and practices in various tax
systems
•Critique the effectiveness of tax system compliance
•Calculate the taxable income and tax liability of
individuals
•Calculate the taxable income of business entities
•Calculate the taxable capital gains of assets
•Recommend a business model predicated on a tax
system
16. Definition of Terms
•Corpus Linguistics. Corpus linguistics
is the study of a language through
large databases of native texts,
written or spoken. It includes using
frequency and concordancing
techniques (Tang, 2008).
17. Definition of Terms
•KH Coder. Open-source text mining and
quantitative content analysis software.
Continuously improved since its introduction
in 2001. Originally developed for use with
Japanese text, now expanded for use
with several languages (Higuchi, 2015;
Text Analysis, 2015), and used in more
than 532 studies (Pelet, 2014).
18. Definition of Terms
•R. KH Coder utilizes the R programming
language, which is becoming an increasingly
popular lingua franca for data analytics in
both academia and the corporate world.
Open-source R is considered on par
with proprietary packages like SPSS,
SAS and Stata (Northeastern University,
2016).
19. Definition of Terms
•Preprocessing text. To use KH Coder,
text must first be preprocessed,
usually with a computer program, to
remove characters that will not
transfer effectively or be read by
the coding software (Pelet, Khan,
Papadopoulou, & Bernardin, 2014).
20. Definition of Terms
•Stanford POS Tagger. Efficient, basic
part-of-speech tagger: software that reads text
(originally English only, since expanded to other
languages) and assigns a part of speech, such as
noun, verb or adjective, to each word and token.
The tagger also performs lemmatization,
identifying and grouping similar words
according to their root form, use and
meaning (Toutanova, Klein, Manning, &
Singer, 2003).
21. Definition of Terms
•Fruchterman-Reingold algorithm. Force-directed
algorithm used to lay out co-occurrence
networks in KH Coder. Stabilization of force
vectors determines node placement based on
spring-like attraction and electric
particle-like repulsion (Fruchterman &
Reingold, 1991).
22. Definition of Terms
•Jaccard similarity coefficient. A co-occurrence measure:
the frequency of co-occurrence of words a and b (the
intersection) divided by the frequency of word a plus the
frequency of word b minus their frequency of co-occurrence
(the union). For example, if the frequency of word a is 4,
the frequency of word b is 3, and a and b co-occur 2 times,
the Jaccard coefficient is 2 / (4 + 3 - 2) = 0.4 (Mori,
Matsuo, & Ishizuka, 2004, p. 2).
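The Jaccard arithmetic can be checked with a short sketch. The function names and the five toy documents are assumptions made for illustration only; they reproduce the 4, 3 and 2 frequencies from the worked example, not any data from the study.

```python
def jaccard(freq_a, freq_b, freq_both):
    """Jaccard coefficient from two word frequencies and their co-occurrence count."""
    union = freq_a + freq_b - freq_both
    return freq_both / union if union else 0.0

def jaccard_from_docs(word_a, word_b, docs):
    """Same coefficient, counted over documents given as sets of words."""
    has_a = {i for i, d in enumerate(docs) if word_a in d}
    has_b = {i for i, d in enumerate(docs) if word_b in d}
    return jaccard(len(has_a), len(has_b), len(has_a & has_b))

# Hypothetical documents: "career" appears in 4, "future" in 3, both in 2
docs = [
    {"career"},
    {"career", "project"},
    {"career", "future"},
    {"career", "future"},
    {"future"},
]
print(jaccard(4, 3, 2))                            # 0.4
print(jaccard_from_docs("career", "future", docs)) # 0.4
```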
23. Word Frequency
•Word frequency: First and most basic way to
identify themes. Key assumption is that
frequently occurring words are more important
clues to the major themes of the text than words
that occur less frequently (Ryan, 2003).
•Proper Noun. 68 for HR Class reflections
•Tier 1: Two were high frequency, Greece (44) and
HR (25)
•Tier 2: Beiersdorf, manufacturer of Nivea, and
Demo Pharmaceuticals, firms that students toured
in Greece, interviewing the management
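A minimal sketch of the counting idea follows. It is a simplification under stated assumptions: KH Coder counts lemmas by part of speech after tagging, whereas this sketch only lowercases, tokenizes and drops a small hypothetical stopword list. The sample sentences are invented.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real study would use a fuller one
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "i", "we",
             "was", "is", "about", "from", "how", "with"}

def word_frequencies(text):
    """Count lowercase alphabetic tokens that are not stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)

# Invented sample, not taken from the student reflections
sample = ("We learn about HR in Greece. I learn from companies in Greece. "
          "Greece was new.")
freq = word_frequencies(sample)
print(freq.most_common(2))
```

Sorting the counter output into tiers by frequency band gives lists of the same shape as the tier lists on these slides.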
24. Word Frequency
• Noun. 373 total words, most used word was course (60)
• Tier 1: course, research, company, project
• Tier 2: student, employee, information, knowledge,
experience, problem, organization, time, country, skill, class,
interview, way, work, people
• Tier 3: business, industry, opportunity, thing, addition, issue,
team, career, example, culture, topic, lot, semester,
department, life, manager, responsibility, study, challenge,
environment, idea, term, trip, chance, goal, strategy, teacher
25. Word Frequency
• Adjective. 153 total words
• Tier 1: different (43)
• Tier 2: future, new, important, good, right, useful, able, best,
international, integrative, confident, great, better, clear, economic,
human, interesting, professional, real
• Adverb. 33 total words, most used adverb was abroad (7)
• Tier 1: abroad, especially, finally, furthermore, really
• Tier 2: actually, effectively, exactly, just, likewise, totally
• Tier 3: additionally, basically, briefly, efficiently, externally, firstly,
greatly, hard, instead, internally, late, maybe, nowadays, precisely,
prior, proficiently, properly, randomly, secondly, short, successfully
26. Word Frequency
•Verb. 189 total words; the most used verb, and the most
used word overall, was learn (86).
•Tier 1: learn, help, make
•Tier 2: know, write, face, gain, improve, work, think,
create, study, relate, deal, let, provide, apply,
develop, teach, understand, analyze, collect, meet
•Tier 3: choose, follow, interview, notice, organize,
recruit, use, visit, benefit, build, expand, happen,
manage, need, overcome, solve, support, travel,
communicate, cooperate, discover, discuss,
encourage, feel, like, live, search, try
27. WORD FREQUENCY ANALYSIS for TAXATION vs HR
Part of Speech    HR 2015    Tax 2015    Tax 2016
Nouns             373        603         617
Proper nouns      60         133         95
Adjectives        153        264         251
Adverbs           33         59          69
Verbs             189        296         288
n=                25         23          28
28. Word Frequency 2015 (Anzai and Matsuzawa, 2013)
tax 754
income 299
taxation 197
deduction 191
C or S corporation 173
individual 146
business 130
government 94
learn 94
pay 90
rate 79
different 79
project 78
use 73
type 71
course 70
form 69
group 68
country 63
expense 60
work 60
help 57
taxpayer 56
challenge 54
know 53
people 48
face 47
entity 47
thing 46
understand 45
liability 44
calculate 43
profit 41
wealth 41
case 40
credit 40
corporate 40
deduct 40
apply 39
time 38
include 37
make 37
work 37
gross 37
example 36
person 36
partnership 35
service 35
filing 33
important 33
exemption 32
personal 32
standard 32
married 31
money 31
company 29
information 29
revenue 29
year 29
like 29
29. Word Frequency 2016 (Anzai and Matsuzawa, 2013)
tax 573
taxation 201
learn 147
income 136
business 131
project 119
C or S Corporation 103
course 99
type 97
help (ful) 96
know 87
deduction 86
understand 80
information 78
entity 74
calculate 73
group 66
work 65
face 63
thing 63
government 63
different 62
make 60
time 50
country 49
knowledge 45
work 45
mean 43
example 42
challenge 41
people 41
wealth 41
include 41
rate 40
new 40
individual 39
experience 37
good 37
use 36
member 36
lot 35
difficult 34
class 33
subject 32
student 31
difficulty 30
need 30
study 30
problem 29
calculation 29
future 29
provide 28
liability 28
way 28
service 27
payment 26
point 26
value 26
idea 26
consumption 26
30. Analysis of TAX Word Frequency
• 6 out of 10 top words match up between 2015 and 2016
• 70% Similarity between top 60 words
• Unique to 2015: form, expense, report, taxpayer, liability, case,
profit, credit, apply, gross, filing, important, exemption, personal,
standard, married, year, like
• Differences were tax related, e.g. AMT, IRS Publication 17 Individual
Tax Guide, Standard Deduction vs Itemized, Foreign Tax Credit
• Unique to 2016: mean, new, experience, good, member, lot, class,
subject, student, need, study, problem, future, provide, liability,
way, point, idea, consumption
• Differences were teaching and learning related, e.g. Active, IELTS,
Extended Outside the Classroom, Experiential
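The top-10 comparison can be reproduced with a short sketch. The two lists are the first ten entries of the 2015 and 2016 word frequency slides; the function name is hypothetical, and matching is case-insensitive because "C or S corporation" is capitalized differently in the two lists.

```python
# First ten entries from the 2015 and 2016 word frequency lists
top10_2015 = ["tax", "income", "taxation", "deduction", "C or S corporation",
              "individual", "business", "government", "learn", "pay"]
top10_2016 = ["tax", "taxation", "learn", "income", "business", "project",
              "C or S Corporation", "course", "type", "help (ful)"]

def shared_words(list_a, list_b):
    """Words of list_a that also occur in list_b, ignoring letter case."""
    in_b = {w.lower() for w in list_b}
    return [w for w in list_a if w.lower() in in_b]

shared = shared_words(top10_2015, top10_2016)
print(len(shared), shared)  # 6 shared words
```

The same function applied to longer lists gives the words unique to each year as whatever is left over after removing the shared ones.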
31. HIERARCHICAL CLUSTER ANALYSIS
• Produces a tree-shaped dendrogram
• At every step, the two least dissimilar objects are
merged.
• Agglomerative, bottom-up approach; Ward’s (1963)
minimum variance method is used. Average and complete
linkage are other methods.
• Tree branches can be cut to isolate construct categories
• Often followed with multi-dimensional scaling
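The merging rule can be sketched in a few lines. This is an illustration under stated assumptions, not KH Coder's implementation (which runs in R): the five words are given invented two-dimensional coordinates, and the function names are hypothetical. At each step the pair of clusters whose merger least increases the within-cluster sum of squares (Ward's criterion) is joined.

```python
def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    (na, ca), (nb, cb) = a, b
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * sq_dist

def ward_cluster(points, labels):
    """Return the merge history as (cluster_a, cluster_b, cost) tuples."""
    # Each cluster is stored as label -> (size, centroid)
    clusters = {lab: (1, tuple(p)) for lab, p in zip(labels, points)}
    merges = []
    while len(clusters) > 1:
        names = list(clusters)
        # Pick the cheapest pair to merge (bottom-up, one merge per step)
        cost, a, b = min(
            (ward_cost(clusters[a], clusters[b]), a, b)
            for i, a in enumerate(names) for b in names[i + 1:]
        )
        (na, ca), (nb, cb) = clusters.pop(a), clusters.pop(b)
        centroid = tuple((na * x + nb * y) / (na + nb) for x, y in zip(ca, cb))
        clusters[f"({a}+{b})"] = (na + nb, centroid)
        merges.append((a, b, cost))
    return merges

# Invented coordinates for illustration only
labels = ["tax", "income", "group", "project", "learn"]
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
merges = ward_cluster(points, labels)
for a, b, cost in merges:
    print(f"merge {a} + {b} at cost {cost:.2f}")
```

Reading the merge history from first to last is reading the dendrogram from its leaves to its root; cutting the tree corresponds to stopping before the last few merges.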
37. HIERARCHICAL CLUSTER ANALYSIS of TAXATION 2015 vs 2016
[Figure: two dendrograms. Cluster labels, in order of appearance:
Cooperative Learning, Business Taxation, Individual Taxation;
Cooperative Learning, Business Taxation, Understanding US Tax Code]
38. 2015 Cluster Analysis
[Figure: dendrogram with clusters C1, C2, C3. Construct labels:]
•Married filing separately vs jointly
•Ways to lower taxes
•Itemized Deductions
•Tax Revenue and distribution of benefits
•Tax Deductible Business Expenses
•Sources of Income
•Educating and understanding by non-US students
•Student learning through group work
•Business Entities
39. 2016 Cluster Analysis
[Figure: dendrogram with clusters C1, C2, C3. Construct labels:]
•Dealing with hard math problems
•Teamwork skills: Being an effective team member
•Business entity taxation
•USA Study for International Students
•Study skills and Learning
•Taxation facts
•Taxation and Justice
•Taxable Income
•Tax deductible business expenses
40. Multi-dimensional Scaling
• Multi-dimensional scaling (MDS) is used for interval- or
ratio-scaled data, or correspondence analysis for
nominal data, to map observations in space.
• Graphical way of finding groupings in the data.
• Preferred in some cases because MDS has relaxed
assumptions of normality, scale data, equal
variances and covariances, and sample size.
• In the analysis, look mainly for clusters and
dimensions.
41. Multi-dimensional Scaling
= How to create an effective cross-cultural training program =
trip, problem, culture, HR, class, write, interview, department, topic.
= Career Aspirations = relate, term, career, team, future,
industry.
= Participative Management = organization, study, idea, issue, work,
important, business, manager.
= Management of the Classroom Environment = responsibility,
deal, semester, study, life, let, environment, create, think.
= Career Skills that Travel Will Improve = good, improve,
opportunity, skill, thing, country.
= How to Face Challenges = provide, right, lot, face, know,
way, people, gain, new, challenge, information, example.
= Knowledge Worker = work, addition, make, employee,
student, learn, course, project, Greece, different, research, company,
help, time, knowledge, experience.
46. Co-occurrence Network
• Pink color = company node = single most central concept
• Darker Blue = Central concept following Pink. In the project,
students researched and proposed solutions to specific HR
problems uncovered at particular firms.
• Learn = Largest node and also centrally located.
• Lines = Network relationships, Thicker line = Stronger
connection.
• Strongest bond = career + future, 2nd strongest = project +
industry.
• Greece is center of its own cluster.
• Responsibility and work are related only to each other.
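How lines and line thickness arise can be sketched as follows, assuming edges are drawn between word pairs whose Jaccard coefficient meets a threshold. The four toy documents, the threshold of 0.3 and the function name are hypothetical; they were chosen so the strongest edge echoes the career + future bond, not to reproduce the study's network.

```python
from itertools import combinations

def cooccurrence_edges(docs, min_jaccard=0.3):
    """Return (word_a, word_b, jaccard) edges, strongest first."""
    vocab = sorted(set().union(*docs))
    edges = []
    for a, b in combinations(vocab, 2):
        in_a = {i for i, d in enumerate(docs) if a in d}
        in_b = {i for i, d in enumerate(docs) if b in d}
        strength = len(in_a & in_b) / len(in_a | in_b)
        if strength >= min_jaccard:
            edges.append((a, b, strength))
    # A thicker line in the network corresponds to a larger coefficient
    return sorted(edges, key=lambda e: -e[2])

# Hypothetical documents, each reduced to its set of words
docs = [
    {"career", "future"},
    {"career", "future", "project"},
    {"project", "industry"},
    {"career", "future", "industry"},
]
edges = cooccurrence_edges(docs)
for a, b, strength in edges:
    print(f"{a} - {b}: {strength:.2f}")
```

Node placement is a separate step: the edge list would then be passed to a force-directed layout such as Fruchterman-Reingold.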
47. Co-occurrence Network
• The central cluster of words (learn, company,
knowledge, experience, make, information,
research and project) appears to be
significantly related, as core concepts.
• Others that are darker blue but located more
peripherally are topics of tangential importance.
• Some words that logically seem to go together are
connected, such as face, department and challenge;
or important, organization and employee, which are
connected to company. Interestingly, these two
streams intersect first and then connect at the
problem node.
53. Self Organizing Map
• Construct groups in colors closer to pink, like orange or
red, indicate a large difference in the vectors of
neighboring nodes; they are distant.
• A pink line, if present, denotes a vast gulf dividing clusters.
• Shades closer to blue, like purple and green, are proximally
related to neighbors.
• Shades closer to white, such as gray, are more neutral
(Higuchi, 2010).
• Constructs are listed below by color from proximal to more
distant and correlated with search hits, shown in italics, for the
construct. These relate to the map on the following page.
54. Self Organizing Map
= Challenges of Growing a Business and Employee Retention =
experience, challenge, good, employee, important, organization, lot,
information, know, problem, company, face
= General Manager = business, responsibility, country, manager, work,
term, study
= The Advanced Way = example, way, issue.
= Travel Strategy (gray) = project, industry, new, trip, people, different,
improve, create.
= Study Abroad = life, addition, make, idea, student, study, let, think,
semester.
= Human Resource Management = course, career, future, team, thing, HR,
deal, Greece, culture, environment, learn, interview, department.
= How to Write = write, relate, provide, right.
= Research Skills = research, topic, time, skill, gain, help, work, class,
opportunity, knowledge.
58. Q1. Is text-mining methodology an effective
way to reorganize, visualize and analyze text
from business student reflections?
•Yes, the five text mining methods from KH Coder
contributed new ways to interpret the data.
•The method is widely deployed, with close to
600 published studies.
•The software is open source, works with many
languages, and is easy to use.
59. Q2. Is text analysis from student reflections a
meaningful way to assess creation of new knowledge
in the classroom?
• Word frequency analysis = understanding mastery and use of
language
• Co-occurrence network allowed the identification of
differentiated concepts — central and tangential
• Hierarchical cluster analysis, multi-dimensional scaling and
self-organizing map contributed to theoretical model
building.
• In keeping with grounded theory, these theoretical models appear
as emergent properties. They are meaningful ways to assess
creation of new knowledge in the classroom.
60. Q3. Was new knowledge created as a result
of the learners’ experience in the global
classroom?
• Models using hierarchical cluster analysis, multi-dimensional
scaling, and the self-organizing map along with the supporting
information gained from word frequency analysis and the co-
occurrence network assessed the creation of new knowledge.
• Models demonstrate alignment with learning outcomes from
Higher Colleges of Technology courses.
• New knowledge was created in the classroom via exploratory
theoretical models that explain what the students gained from
completing the course or program.
61. Author Bio
• Dr. Alfred Miller, Mizzou alum, entrepreneur & company
founder
• US Army veteran; holds a Ph.D. in E-Commerce from NCU and four
MAs from Webster, and is a graduate of Wharton’s Global Faculty
Development Program.
• BAC/GRAD Commissioner, Editor ACBSP Region 8 Journal
• Faculty and ACBSP Champion for Higher Colleges of Technology
• Chair-Elect for ACBSP Region 8, representative to the Scholar
Practitioner Publication Committee/Editorial Review Board
• Member of the American Center for Mongolian Studies, the
Gulf Comparative Education Society, and Higher Education
Teaching and Learning’s Liaison for UAE and Ecommerce
Discipline Officer
62. Thank You!
• Dr. Alfred Miller (Ph.D.)
• Commissioner, ACBSP BAC/GRAD Board
• Business Faculty/ACBSP Champion/System
Representative
• Chairman Elect of the Board of Directors, ACBSP Region 8
• Editor Region 8 Journal
• Higher Colleges of Technology-Fujairah Women’s College
• Direct: 971 9 201 1325
• Fax: 971 9 228 1313
• Mobile: 971 50 324 1094
Editor's Notes
An international collaboration between a UAE federal university and a US private university, initially pursued in 2009 at the height of the financial crisis, failed to materialize, with each school pursuing other opportunities instead. The two ACBSP Region 8 institutions, Higher Colleges of Technology and Webster University, later overcame obstacles to forming a partnership through cultivation of key relationships, as organizational behavior yielded ripened opportunities. Parties to this collaboration capitalized on their key competencies and became proactively engaged in the ACBSP Global Accreditation Community.
Photo taken at the Webster Athens Cultural Center, before heading off for the farewell reception.