Comment: Advertising Won’t Die, But Defining It Will
Continue to be Challenging
Jisu Huh
University of Minnesota, Minneapolis, Minnesota, USA
The evolution of the field of advertising has constantly
demanded redefining what advertising is and what topics fit
under the rubric of “advertising research.” In response, both
academic and industry organizations have often developed
definitions resembling a laundry list of new advertising types
added to earlier definitions of advertising. While this might be
useful for temporarily addressing the gap between the estab-
lished concept of advertising and the changing reality of the
phenomenon, more thoughtful and rigorous conceptualization
of advertising has been long overdue.
Thus, I am excited to see Dahlen and Rosengren’s (2016)
article and thank the authors for their contribution to the adver-
tising discipline by reinitiating the important discussion about
how advertising should be defined. This work presents the
compelling need for redefining advertising, a thoughtful over-
view of the historical development of advertising definitions, a
systematic conceptual approach focusing on three specific
dynamics, and well-developed empirical studies testing the
new definition.
While Dahlen and Rosengren (2016) make an excellent
effort, developing a universally accepted definition that delin-
eates the boundaries of the phenomenon that is the central
focus of scholarly inquiry is not easy for any academic disci-
pline. Especially for a field like ours, which is closely linked
to constantly evolving phenomena, it seems impossible to
develop the perfect definition including everything and satisfy-
ing everyone. Thus, an important question preceding “What is
advertising?” would be “What is the purpose and motivation
of (re-)defining advertising?” Is it to include everything prac-
ticed under the name of advertising? To expand the size of the
academic field of advertising? To advance advertising
scholarship and theory building? Or to determine what should
be covered in advertising education?
Some of these questions are present in every call to develop
a better definition of advertising. However, would it be truly
attainable, or even desirable, to try to address all of these ques-
tions in a single definition? In an ideal sense, research, educa-
tion, and practice should be closely connected, but they are not
exactly the same field because of differences in the missions,
objectives, and environmental/structural factors between aca-
demia and industry, and even within the academic community.
Keeping this in mind, I respond here to Dahlen and Rose-
ngren’s (2016) proposed working definition and pose some
questions with the purpose of advancing the academic field of
advertising and advertising theory building. I hope my com-
ments serve as food for thought and help advance the impor-
tant dialogue about the definition of advertising.
TO ADVANCE THE ACADEMIC FIELD OF ADVERTISING
AS A UNIQUE AND COHESIVE DISCIPLINE
Is advertising a unique scientific field? This question has
been confronting us for decades, and with our collective
actions we suggest an answer of “yes” to this question (for a
detailed discussion, see Thorson and Rodgers 2012). The aca-
demic discipline of advertising is a unique and cohesive field
that is “formed around advertisements” (Thorson and Rodgers
2012, p. 13) and situated at the intersection of mass communi-
cation, journalism, and marketing.
To establish and advance an academic discipline, it is
essential for its members to share an understanding of the
unique attributes of the phenomenon of the collective schol-
arly interest that distinguish it from related others. Thus, the
definition of advertising is inherently linked to the legitimacy
of the advertising field as a unique scientific discipline and
fundamental to the field’s cohesive identity.
This seemingly straightforward task becomes complicated
due to the complex and accidental nature of the academic
origin of the advertising field and diverse backgrounds of its
members (Ross and Richards 2008). Depending on individual
backgrounds and academic affiliations, an attribute that is
important and unique to some might not be so to others. This
perspective-driven difference is noticeable in Dahlen and
Rosengren’s (2016) study. Reviewing the previous advertising
definitions, for example, the authors criticize use of the term
mediated as confusing and suggest it should be dropped from
the definition, and argue paid is too limiting and should be
replaced with brand-initiated.
Address correspondence to Jisu Huh, School of Journalism and
Mass Communication, University of Minnesota, 206 Church Street
SE, Murphy Hall 338, Minneapolis, MN 55455. E-mail:
[email protected]
Jisu Huh (PhD, University of Georgia) is Professor, Raymond O.
Mithun Chair in Advertising, School of Journalism and Mass
Communication, University of Minnesota, and 2016 President of
the American Academy of Advertising.
Journal of Advertising, 45(3), 356–358
Copyright © 2016, American Academy of Advertising
ISSN: 0091-3367 print / 1557-7805 online
DOI: 10.1080/00913367.2016.1191391
The confusion about the term mediated may stem from Dahlen
and Rosengren’s (2016) marketing-oriented backgrounds. The
authors state that “brands are increasingly advertising
through own channels, ranging from social media to
Web sites and apps, which would not be a mediated
communication,” but all of these examples are actually medi-
ated forms of communication. Mediated communication is a
concept referring to communication performed through media,
not through face-to-face interpersonal communication. From
the marketing scholars’ perspectives, mediated and paid might
not be considered unique attributes that matter for advertising
research. However, from the mass communication perspective,
distinguishing advertising from other forms of communication
is important for the purpose of establishing advertising as a
unique academic field and as a unique academic unit within a
university. Dahlen and Rosengren (2016) are too quick to
identify advertising as a discipline in marketing. However,
given that about two-thirds of the members of the American
Academy of Advertising are from journalism and mass com-
munication programs, and 95% of advertising programs in the
United States are situated in journalism and mass communica-
tion or arts and sciences colleges (Ross and Richards 2008), it
would be important to bring in mass communication perspectives
and compare them with marketing perspectives.
Within the broad field of communication, interpersonal
communication and mediated communication are clearly dis-
tinguished and studied in separate disciplines where unique
theoretical parameters exist, causing different directions in
theory development. In the previous definitions of advertising,
mass communication or mediated form of communication has
served the purpose of distinguishing advertising from interper-
sonal/speech communication. Likewise, paid has distinguished
advertising from public relations and other forms of communication
that are not under the advertiser’s control (Richards and
Curran 2002).
I agree with Dahlen and Rosengren’s (2016) contention that
the terms paid and identifiable sources might be too limiting
given new forms of advertising. However, I still struggle with
the disciplinary distinctions that exist in the academe, espe-
cially in the broad field of communication. It is true that
boundaries in the business practices of advertising and public
relations have been blurring. However, is a changing business
practice sufficient justification for eliminating key components
in the definition of the phenomenon that identifies an academic
discipline? Furthermore, would the convergence trend in mar-
keting communication necessarily make the academic disci-
plinary distinction between advertising and other forms of
communication obsolete and meaningless? These questions
need to be examined thoughtfully to advance our field as a
unique and cohesive academic discipline.
TO DRAW CONCEPTUAL BOUNDARIES FOR
ADVERTISING THEORY
Applying the conceptualization of level fields versus vari-
able fields, Faber, Duff, and Nan (2012) elegantly described
the nature of theory building in variable fields: “rather than
desiring theories with broad abstract generalizations, variable
fields should be concerned with identifying the boundary con-
ditions where a broader theory might no longer be true. To do
this, a variable field needs to recognize what makes it unique
and to identify the variables that it can contribute to testing
and qualifying broad theories from level fields” (pp. 19–20).
For advertising theory building, therefore, it is imperative to
identify the unique attributes of advertising and recognize how
they may influence more general theories.
Faber, Duff, and Nan (2012) proposed four unique attrib-
utes that could serve to challenge the boundary conditions for
general theories and thereby advance advertising theory build-
ing: consumer skepticism, repetition, message coordination,
and clutter. If we agree that “persuasive intent by an identifiable
source” are two essential elements of an advertising definition,
consumer skepticism and persuasion knowledge would
be important variables of interest that would lead to meaning-
ful advertising theory development. However, if such concep-
tual elements are not part of the definition, researchers would
have to qualify their theoretical contributions as confined to
only certain types of advertising but not all.
The issue of boundary conditions for advertising theory is
inherently linked to the issue of ecological and external validity
of advertising research. Advertising theories should be relevant to
practice, and advertising research should be based on real-world
advertising issues. The definition of advertising, therefore, is
fundamental for determining the ecological validity of research and
identifying true advertising research as such, and disguised
marketing, consumer behavior, psychology, and mass communication
research as such. I would encourage the authors to consider the
issues of boundary conditions for advertising theory building,
external and ecological validity, and the implications of their
proposed definition for addressing these issues.
TO EXPAND THE FIELD OF ADVERTISING RESEARCH
The expansion of advertising practice and research has been
happening in multiple dimensions, including expansion beyond
traditional mass media advertising, advertising agency work,
effect outcomes, and branding/marketing communication. I agree
with Dahlen and Rosengren’s (2016) suggestion that a revised
definition of advertising should update the term receiver to
reflect the more active roles of consumers and broaden the scope
of advertising effects.
However, the fourth expansion area and the growing trend
in grant-oriented research in the mass communication field
call the brand-initiated component in the proposed definition
into question. What would be the implications of the term
brand-initiated for the growing subareas of advertising that
are not branding or marketing communication? If we are to
embrace or even foster research about communication campaigns
promoting ideas, issues, or health for the benefit of the
general public, brand-initiated might not be the best word
choice, even with a qualifying explanation of the term. Per-
haps a less marketing-oriented phrase, for example,
“communication initiated by an organization or person,”
would make the definition more open to the growing research
areas focusing on nonmarketing, nonbranding communication
campaigns.
CONCLUDING THOUGHTS
Advertising will be alive and well, constantly transformed,
and continue to be understood differently by different stake-
holders. Some aspects of the conceptualization of advertising
can and should cover the common denominator agreed on by
everyone (e.g., “communication”), but other aspects will likely
be debated continuously and disagreed upon regularly. However,
such disagreement should be considered not a sign of a
discipline in crisis but a sign of a vigorously growing field
with many new avenues for future research.
The advertising academic field has been dealing with the
criticism of lagging behind real-world practice and its seeming
unwillingness to broaden the definition of advertising. Dahlen
and Rosengren’s (2016) proposed working definition is defi-
nitely less narrow than the previous ones. As acknowledged
by the authors, however, the trade-off between the overinclu-
siveness and underinclusiveness of a definition is an important
issue. Depending on the motivation of an author, different defi-
nitions would likely err on different sides. The current working
definition errs on the overinclusive side in an attempt to “stay
relevant,” which brings up another important balancing issue:
dealing with the different motivations of practitioners and
academics.
Many verbatim comments quoted in Richards and Curran
(2002) indicate significantly different viewpoints of the two
groups. The mean scores reported in the current study about
practitioners’ and academics’ ratings of different definitions
also showed that academics rated the previous definition as
more proper than the new definition, but professionals rated
the new definition as more proper. Unfortunately, this very
interesting result did not get adequate attention from the
authors, who focused more on a general discussion arguing
that the new definition was an improvement over the previous
one.
The missions and motivations driving advertising practice and
academic research and education, and challenges and opportunities
affecting them, are intertwined but not the same across the
different sectors. Our different backgrounds would likely have
significant impact on what each of us would consider an acceptable
or perfect definition of advertising. Finding a definition that
satisfies everyone might be impossible, but the chance of doing so
would improve if additional studies and open discussion continue.
In doing so, more thoughtful consideration of purpose-driven
definitions and cross-fertilization across scholars with different
backgrounds is strongly recommended.
REFERENCES
Dahlen, Micael, and Sara Rosengren (2016), “If Advertising Won’t Die, What Will It Be? Toward a Working Definition of Advertising,” Journal of Advertising, 45 (3), 334–345.
Faber, Ronald J., Brittany R.L. Duff, and Xiaoli Nan (2012), “Coloring Outside the Lines: Suggestions for Making Advertising Theory More Meaningful,” in Advertising Theory, Shelly Rodgers and Esther Thorson, eds., New York: Routledge, 18–32.
Richards, Jef I., and Catherine M. Curran (2002), “Oracles on ‘Advertising’: Searching for a Definition,” Journal of Advertising, 31 (2), 63–77.
Ross, Billy I., and Jef I. Richards (2008), A Century of Advertising Education, American Academy of Advertising.
Thorson, Esther, and Shelly Rodgers (2012), “What Does ‘Theories of Advertising’ Mean?” in Advertising Theory, Shelly Rodgers and Esther Thorson, eds., New York: Routledge, 3–17.
Copyright of Journal of Advertising is the property of Taylor &
Francis Ltd and its content
may not be copied or emailed to multiple sites or posted to a
listserv without the copyright
holder's express written permission. However, users may print,
download, or email articles for
individual use.
3. Access and authority control
Access points are often under some form of authority control
(also called access control or terminology control). Authority
control is a mechanism for bringing consistency to data values
in an information organization system. Data entered in fields
that are under authority control must come from a file or list of
authorized (or controlled) terms. In your system, terms related
to subjects and to names of people and corporations are under
authority control. You can establish authority control in two
forms for purposes of this assignment:
· thesaurus (external to the main database file) for subject terms
(section 4.2, required)
· name authority file (external to the main database file) for
names in the records (section 5, required)
In this section, you explain authority control in general and
state which fields are under which type of control.
Tasks: Determine which fields (both physical description and
subject description) are under some form of authority control.
Consider the following:
· Fields with simple, predictable terms. These are usually
physical description fields such as Format with terms such as
"book" and "video." Decide whether any such field should be
under control of a controlled vocabulary.
· The field with the greatest number of potential terms and the
most semantically (conceptually) complex terms, especially in how
the terms are related to one another, is a candidate for a
thesaurus. Usually this is a subject field. Choose one field only
for vocabulary control using a thesaurus.
· Fields with proper names. These may be personal names
(people) or corporate names (companies, organizations).
Usually all name fields are controlled by a name authority file.
The name authority file also controls the form of names used in
subject fields.
Write narrative.
Narrative:
· Discuss the purpose of authority control and its importance in
your system.
· Explain how it works.
· Explain the relationship to controlled vocabularies.
· Explain why it is beneficial to have specific access points
under authority control from the perspectives of the end user
searching the system and the technical user creating the records.
· State the kinds of authority control in your system. Note that
access points do not always have to be under authority control,
and you can have authority control on non-access points.
· Discuss the fields under control of a thesaurus, and a name
authority file. State explicitly which fields are under which type
of control mechanism.
Hint: If you have trouble completing this section, come back to
it after completing section 4.
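One way to picture how authority control works is the minimal Python sketch below. It assumes a simple validation list for a Format field and a small name authority file that maps variant forms of a name to one authorized heading. The function names, the sample name entries, and the dictionary layout are illustrative assumptions, not part of the assignment or of any particular database product.

```python
# Hypothetical authority data; in a real system these files live outside
# the main database file.
FORMAT_TERMS = {"book", "video"}  # validation list for a Format field

# Name authority file: every known variant points to one authorized heading.
NAME_AUTHORITY = {
    "twain, mark": "Twain, Mark, 1835-1910",
    "mark twain": "Twain, Mark, 1835-1910",
    "clemens, samuel langhorne": "Twain, Mark, 1835-1910",
}


def validate_format(value: str) -> str:
    """Accept a Format value only if it is an authorized term."""
    term = value.strip().lower()
    if term not in FORMAT_TERMS:
        raise ValueError(f"{value!r} is not an authorized Format term")
    return term


def authorized_name(value: str) -> str:
    """Return the authorized heading for any known variant of a name."""
    key = " ".join(value.lower().split())  # normalize case and spacing
    if key not in NAME_AUTHORITY:
        raise ValueError(f"{value!r} is not in the name authority file")
    return NAME_AUTHORITY[key]
```

The same lookup serves both audiences: the technical user is stopped from entering an unauthorized value, and the end user who types a variant form is led to the single heading used in the records.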
4. Representation of information content
Given the basic resource description for the information
container developed in section 2, you now need to determine the
metadata elements necessary for representing information
content (or intellectual content, subjects, topics). Section 4
focuses on problems of describing subjects, including use of
controlled vocabulary in section 4.2, and subject-based
classification in section 4.3.
4.1. Subject access
Tasks:
· Determine how to provide subject representation, or how to
represent the information content of the objects.
· The subject representations will be the basis for providing
subject access in your system.
· Consider the kinds of subjects (e.g., topics, themes, time
period, geographic area) of the information objects.
· Note that, although fields such as title and table of contents
can provide clues to aboutness, these fields are
considered physical description of the information container,
not subject description of the information content.
· Decide how many subject fields you need.
· You may translate Subject into more than one field (e.g.,
Topics and Time Period) and/or you may rename the metadata
element and database field.
· You may have some subject fields controlled by a subject
heading list, or controlled by a thesaurus, or fields that contain
natural language terms (e.g., abstracts, summaries, etc.).
The classification code to be developed in Draft 3 should be
based in part on information content.
Narrative:
· Define and discuss subject representation, subject analysis and
subject access.
· Explain the importance of subject access for your users.
· Describe how your organization system provides subject
access by listing all fields in your records that contain subject-
related data or information.
· Explain that classification is partially based on subject,
identify the subject-based facet(s) in your classification scheme,
and name the field that contains the classification code. (You
may need to return to this after you complete section 4.3).
4.2. Thesaurus structure
This section addresses subject authority control (also called
vocabulary control or terminology control) using a thesaurus. A
thesaurus is a list of controlled vocabulary terms that provides
data values (terms) for a single field under subject authority
control. It serves both technical users (indexers, cataloguers) as
a source of terms to enter in the record and end users as a
source of search terms.
Tasks:
· Review the Thesaurus Tutorial in the Canvas course site.
· Review, discuss and demonstrate the three semantic
(conceptual) relationships in the thesaurus, and understand how
mandatory reciprocals are used to indicate these three
relationships. This should be a thorough discussion that fully
informs the readers on this topic.
· Determine the domain and scope of the thesaurus.
· Make decisions concerning specificity and exhaustivity.
· Consider how each decision may affect information retrieval
performance based on measures of precision and recall.
Write narrative.
Narrative:
· Explain the purpose of subject authority control, how it is
implemented in your system, and why it is important for both
end users and technical users of your system.
· Discuss why the subject field needs authority control.
· Define the thesaurus as a kind of controlled vocabulary.
Explain the purpose of its syndetic structure.
· Define and describe the three (3) kinds of semantic
relationships and how each is displayed.
· Explain mandatory reciprocals and how they are used.
· Describe the domain and scope of the thesaurus.
· Define specificity.
· State the level of specificity in the thesaurus (high, moderate,
low) and explain why it is appropriate for the users and/or
information objects.
· Discuss the probable effect of this level of specificity
on precision and recall measures of information retrieval
performance.
· Define exhaustivity. State the level of exhaustivity for
indexing, that is, whether the indexer should tend more
toward depth indexing or summarization.
· Explain why this level is appropriate for the users and/or
information objects.
· Discuss the probable effect of this level of exhaustivity
on precision and recall measures of information retrieval
performance.
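When discussing precision and recall, it can help to compute both for a small case. The sketch below uses the standard definitions: precision is the share of retrieved records that are relevant, and recall is the share of relevant records that were retrieved. The record identifiers and the two sample searches are made up for illustration.

```python
def precision_recall(retrieved: set, relevant: set) -> tuple:
    """Return (precision, recall) for one search result set."""
    hits = retrieved & relevant  # relevant records that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: four records are relevant to the user's question.
relevant = {"r1", "r2", "r3", "r4"}

# A search on a highly specific term retrieves few records, all relevant.
narrow = precision_recall({"r1", "r2"}, relevant)

# A search on a broader term retrieves more records, some not relevant.
broad = precision_recall({"r1", "r2", "r3", "r5", "r6", "r7"}, relevant)

print(narrow)  # (1.0, 0.5)
print(broad)   # (0.5, 0.75)
```

In this toy case the narrow search gives higher precision and lower recall than the broad search, which is the trade-off usually expected when specificity goes up. Real collections will not always behave this neatly.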
Refer to Appendix D: Sample thesaurus.
Note: The instructor understands that your thesaurus is only a
sample and that it is not comprehensive. The reader should have
a thorough understanding of how a thesaurus works, how the
three relationships work, how they look in the thesaurus, what
mandatory reciprocals are, and how they are shown in the
thesaurus. Actual examples go a long way here.
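The three relationships and their mandatory reciprocals can be sketched in a few lines of Python. This sketch assumes the conventional thesaurus tags: USE/UF for equivalence, BT/NT for hierarchy, and RT for association. The class name, the sample terms, and the display layout are illustrative assumptions and are not taken from the course tutorial.

```python
from collections import defaultdict

# Each tag and the reciprocal tag that must appear under the other term.
RECIPROCALS = {"BT": "NT", "NT": "BT", "RT": "RT", "USE": "UF", "UF": "USE"}


class Thesaurus:
    def __init__(self) -> None:
        # term -> tag -> set of related terms
        self.entries = defaultdict(lambda: defaultdict(set))

    def relate(self, term: str, tag: str, other: str) -> None:
        """Record a relationship and its mandatory reciprocal in one step."""
        self.entries[term][tag].add(other)
        self.entries[other][RECIPROCALS[tag]].add(term)

    def display(self, term: str) -> str:
        """Show a term with its relationships, one tagged line per related term."""
        lines = [term]
        for tag in ("USE", "UF", "BT", "NT", "RT"):
            for other in sorted(self.entries[term][tag]):
                lines.append(f"  {tag} {other}")
        return "\n".join(lines)


thesaurus = Thesaurus()
thesaurus.relate("Cats", "BT", "Pets")             # hierarchical
thesaurus.relate("Cats", "RT", "Veterinary care")  # associative
thesaurus.relate("Felines", "USE", "Cats")         # equivalence
print(thesaurus.display("Cats"))
```

Because `relate` always writes both directions, the entry for Pets automatically gains "NT Cats" and the entry for Cats gains "UF Felines". That pairing is what a mandatory reciprocal guarantees.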
4.3. Classification scheme
Classification is a process of categorizing objects according to
one or more attributes or characteristics. Formal classification
systems such as Dewey Decimal and Library of Congress are
called schemes. Classification codes are derived from schemes
and assigned to objects to group items that are similar in one or
more ways together. The primary function of bibliographic
classification is to bring items together that contain similar
intellectual content or subject matter. In the library world,
bibliographic classification systems are also used as the basis
for physical location. Classification schemes are used by
technical users who create the codes and by end users who want
to understand the organization of materials. Ultimately, your
classification codes will be your call numbers.
Tasks:
· Review Faceted Classification Tutorial and/or Hierarchical
Classification Tutorial in the Canvas course site.
· Determine your approach to classification: faceted
(recommended) or hybrid (hierarchical first facet).
· Choose three or four attributes of the objects (e.g., subject,
creator, literary form or genre, media format, date) to be used in
classification.
· Consider attributes suggested by users' questions and how
these relate to users' expectations for physical arrangement of
objects (e.g., whether to arrange objects first by subject or by
format).
· For this project, you should have at least three (3) facets, and
at least one (1) facet must relate to information content or
subjects. Your first facet should not be Author or any other
facet that merely alphabetizes the collection.
· Develop a notation code (you may not use a pre-existing code
such as Dewey or LC) to identify and group the objects by
class.
· In order to physically organize the objects, make this a unique
identifier (call number) by adding to the notation code a unique
number (for example, RecordID) to identify the individual
object.
· Be sure to create a code (call number) for each of your
records.
Create Appendix E: Classification scheme.
Write narrative.
Narrative:
· Define classification and its purposes in general.
· Describe the role of classification in your system with regard
to providing intellectual access and physical access if
appropriate.
· Define and describe the difference between faceted and
hierarchical approaches to classification; state your approach
and explain your choice. The reader should have a thorough
understanding of the differences, pros, cons, etc. of each.
· State the primary facet and explain why you chose it with
regard to providing intellectual access (subject-based). List the
other facets in order.
· Explain why you chose these facets, including their
effectiveness as a system for intellectual and physical
organization of the objects (if applicable).
· Your primary facet should be derived from a field that uses a
controlled vocabulary.
· If you are adding a unique identifier to the classification code
for physical arrangement, explain why that is necessary and the
source of the unique identifier.
· In a separate paragraph, illustrate your classification system
by providing a complete example of one classification code:
· Briefly describe one of your 10 objects
· Show the classification code for that object
· Explain what each part of the classification code represents.
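As a rough illustration of how a faceted notation code becomes a call number, the Python sketch below joins three facet codes and then appends the RecordID as the unique identifier. The facet choices, the notation tables, the separators, and the sample record are all hypothetical. Your own scheme should use the facets and notation you define in Appendix E.

```python
from dataclasses import dataclass

# Hypothetical facet notations (an original code, not Dewey or LC).
SUBJECT_CODES = {"Astronomy": "AST", "Gardening": "GAR"}
FORMAT_CODES = {"book": "B", "video": "V"}


@dataclass
class Record:
    record_id: int
    subject: str  # primary facet, taken from a controlled vocabulary field
    format: str
    year: int


def call_number(record: Record) -> str:
    """Build a call number: subject, format, and date facets, then RecordID."""
    facets = [
        SUBJECT_CODES[record.subject],
        FORMAT_CODES[record.format],
        str(record.year),
    ]
    # The zero-padded RecordID makes the code unique to one object.
    return "-".join(facets) + f".{record.record_id:03d}"


example = Record(record_id=7, subject="Astronomy", format="video", year=2015)
print(call_number(example))  # AST-V-2015.007
```

Reading the result left to right: AST is the subject facet, V the format facet, 2015 the date facet, and 007 the RecordID that separates this object from others in the same class.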
3. Access and authority control
An access point is a field for a record that can be searched.
Access points are defined in any record to make the record
searchable. Access points are selected such that they
complement users’ searching behavior and cater to their needs.
An access point represents information that is returned when a
user enters a search term into a field. Access points generally
have authority control applied to them.
Authority control is an important concept that greatly helps in
standardizing data and reducing inconsistencies. It is defined as
a way of controlling or manipulating data that is entered into a
field so that standardization of data is enforced and achieved.
Authority control ensures that only allowed or acceptable terms
are used when entering data into a field or for searching.
Authority control, when used, applies to cataloguers as
well as users: cataloguers must adhere to the
authorized terms when entering data into fields; users, in
turn, are provided with relevant results only when they select
the right terms to search for an object. Authority control can be
of the following types:
One type of authority control uses a controlled vocabulary.
A controlled vocabulary is a list of
standardized or authorized terms that can be used to retrieve
information about an object. This type of authority control is
very efficient and practical when the authorized terms are
limited; for example, a drop-down menu for the Genre field, where
terms are limited. This type of authority control does not allow
for an increase in the number of authorized terms as the collection
grows. In this collection, the field Tags (for Subject and Genre)
has predictable terms and should have a controlled vocabulary
applied. Of these, the field Tags (for Subject) has a very large
number of possible terms and must
therefore be under vocabulary control using a thesaurus.
Comment by Jeannie Naylor: Make sure to clearly define
the two types of authority control- name and subject - and then
the mechanisms - NA File, thesaurus, and validation list. All
three mechanisms are a form of controlled vocabulary. - 1 point
Another type of authority control applies to
name-type fields. This is
called name authority control. It uses a name authority file, and
the list of authorized terms grows as the collection grows. The field
Author in this collection should have name authority control
applied to it. Similarly, the field Publisher should have name
authority control applied to it to ensure that different variations of
user input still retrieve relevant objects.
4. Representation of information content
4.1. Subject access
An information object can have two types of descriptions
associated with it. Bibliographic description is information
from which the physical features of an object, such as the title
of the object, the number of pages and the audience level, can be
determined. Intellectual description, on the other hand,
concerns the aboutness of the object, that is, its subject. The
subject of an object is defined as its central idea or main
theme; fields such as subject, topic, theme and tag can be
considered to describe an object intellectually.
Comment by Jeannie Naylor: Don’t use etc.
Subject access is a broad concept that deals with the intellectual
content or topic searched by users. It is a collective term that
encompasses all the procedures and measures taken in a system
to provide access to the intellectual content of the objects
within a collection. It covers all the fields that support
subject access, which in this collection are Subject, Genre and
Plot. Here the concept of natural language indexing versus
authority control is also important. Natural language is what
comes freely to people while communicating, whether written or
oral; in natural language indexing, the terms used are therefore
not controlled and stay as close to natural language as possible.
Authority control, on the other hand, allows only authorized
terms to be used. The main aim in either case is for users to be
able to access the right objects in the easiest and most
convenient manner. For this collection, the field Plot is not
searchable and as such needs no indexing, while Subject and
Genre, as discussed above, have controlled vocabulary.
An important process involved in subject representation is
subject analysis. Subject analysis is mainly for cataloguers and
is defined as the process of finding terms to represent the
information object. Depending on the field and its rules, subject
analysis may be done for natural language indexing or for
authority control. In both cases, the three steps of
familiarization, extraction and assignment are common.
Familiarization involves identifying the major theme or idea of
the object, in this case a book; this need not be done in depth
and a cursory examination is enough. Once the cataloguer is
familiar with the book, they move to the next step, extraction,
in which the cataloguer starts thinking of terms to use based on
what they now know of the book. This is also where the field and
its input rules come into the picture and the decision on whether
to use natural language indexing or authority control is made. If
natural language indexing is used, the cataloguer's domain
knowledge comes into play for term selection, and the selected
terms lead to the final step, assignment. Assignment is simply
entering the selected terms into the field while adhering to the
input rules. When authority control is applied, an additional
step, translation, follows extraction: the cataloguer compares
each extracted term with the controlled vocabulary, and the
authorized term that is most similar or closest in meaning or
relevance is chosen for assignment.
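The translation step described above can be sketched as a lookup
against the controlled vocabulary. This is a simplified
illustration: the term lists are assumptions made for the
example, and in practice the choice of the closest authorized
term rests on the cataloguer's judgement rather than on a fixed
table.

```python
# Assumed controlled vocabulary and USE references for illustration only.
AUTHORIZED = {"bravery", "air force", "concentration camps"}
USE = {"valor": "bravery", "courage": "bravery"}  # unauthorized -> authorized


def translate(extracted_terms):
    """Map extracted terms to authorized terms.

    Returns (assigned, unresolved): authorized terms ready for assignment,
    and extracted terms that still need a cataloguer's decision.
    """
    assigned, unresolved = [], []
    for term in extracted_terms:
        key = term.strip().lower()
        if key in AUTHORIZED:
            target = key
        elif key in USE:
            target = USE[key]
        else:
            unresolved.append(term)
            continue
        if target not in assigned:  # avoid assigning the same term twice
            assigned.append(target)
    return assigned, unresolved
```

For example, translate(["Valor", "air force", "heroism"]) assigns
bravery and air force and leaves heroism for the cataloguer to
resolve.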
Subject representation thus involves subject analysis carried
out by the cataloguer, which leads to subject access. Subject
analysis also carries partially into the process of
classification: the cataloguer determines the predominant subject
term for classification, and this in turn determines the physical
location of the object. In essence, subject analysis starts with
the author and continues through the publishers, cataloguers,
indexers and classifiers to, finally, the users.
Classification is the process used to organize information
objects in a systematic way. Using one or more subject-based
fields to classify information objects allows technical users to
organize them in a manner that is more user-friendly; users
searching for information objects by subject find them easier to
access under such a classification. In this collection, two
subject-based fields are used in the classification scheme: Tags
(for Subject) and Tags (for Genre).
4.2. Thesaurus structure
Subject authority control is defined as the process of applying
controlled vocabulary to subject search terms as well as subject
headings. A subject heading is the word or group of words closest
to the subject of a book. The field chosen in this collection for
subject authority control is the Subject field. For users of this
field to retrieve the right results, controlled vocabulary needs
to be applied to it to minimize inconsistencies and eliminate
disparity between what users search for and what cataloguers
enter in the field. Authority control is applied to the Genre
field for the same reason, i.e., to reduce inconsistencies and
impose standardization.
Subject authority control makes use of subject authority files,
which contain subject records, which in turn contain the
controlled vocabulary that represents the subject. Subject
authority files are of two types: thesauri and subject heading
lists. A thesaurus is defined as a document containing words
with their associated relationships; it allows for vocabulary
control, thereby improving the retrieval of search results. In
this collection, the thesaurus is developed for the Subject
field. Controlled vocabulary, defined earlier, is a solution to
the indexing problems that result from natural language: it
allows a single term, spelled in a single, specific way, to be
used for content representation. When considering controlled
vocabulary in terms of a thesaurus, it is important to understand
what authorized and unauthorized terms are. An authorized term is
one selected by the indexer as allowed or acceptable; using any
other term in the field is unacceptable. Unauthorized terms are
terms that are not to be used in the field; in their place, the
related authorized term needs to be used.
Semantic relationships are associations between words based on
their meanings. Semantic relationships follow the syndetic
structure, which is defined as cross-referencing between the
terms used in the controlled vocabulary, in this case the
thesaurus. Three types of semantic relationships are taken into
consideration in building the thesaurus for the Subject field:
equivalent, hierarchical and associative. An equivalent
relationship is one where the associated words have the same
meaning, or very close to it. For example, the terms bravery and
valor are nearly identical in meaning and therefore share an
equivalent relationship. A hierarchical relationship is one where
one term is a broader representation of the other and,
conversely, the other is a narrower representation of the first.
For example, armed forces and air force are the broader and
narrower terms respectively, because an air force is a type of
armed force. An associative relationship is one where the two
terms are related without being equivalent or hierarchical. For
example, Holocaust and concentration camps are related terms and
therefore have an associative relationship.
Each semantic relationship is defined so that its associations
are complementary; these cross references are called mandatory
reciprocals. For example, the equivalent relationship has the
USE FOR – USE cross reference, the hierarchical relationship
has the BROADER TERM – NARROWER TERM cross
reference and the associative relationship has the RELATED
TERM – RELATED TERM cross reference.
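The mandatory reciprocals above can be illustrated with a small
data structure that records both directions of a cross reference
at once and reports any entry whose reciprocal is missing. This
is a sketch under assumptions: the dictionary layout and function
names are invented, and treating bravery as the authorized term
for valor is an assumption made for the example.

```python
# Each cross-reference label paired with its mandatory reciprocal.
RECIPROCAL = {
    "USE": "USE FOR",
    "USE FOR": "USE",
    "BROADER TERM": "NARROWER TERM",
    "NARROWER TERM": "BROADER TERM",
    "RELATED TERM": "RELATED TERM",
}


def add_relation(thesaurus, term, relation, other):
    """Record the relation on `term` and its reciprocal on `other`."""
    thesaurus.setdefault(term, {}).setdefault(relation, set()).add(other)
    thesaurus.setdefault(other, {}).setdefault(RECIPROCAL[relation], set()).add(term)


def missing_reciprocals(thesaurus):
    """Return sorted (term, relation, other) entries lacking a reciprocal."""
    gaps = []
    for term, relations in thesaurus.items():
        for relation, others in relations.items():
            for other in others:
                back = thesaurus.get(other, {}).get(RECIPROCAL[relation], set())
                if term not in back:
                    gaps.append((term, relation, other))
    return sorted(gaps)


thesaurus = {}
add_relation(thesaurus, "valor", "USE", "bravery")  # assumed direction
add_relation(thesaurus, "armed forces", "NARROWER TERM", "air force")
add_relation(thesaurus, "Holocaust", "RELATED TERM", "concentration camps")
```

Because add_relation always writes both directions, a thesaurus
built only through it has no gaps; missing_reciprocals is useful
for checking entries that were typed in by hand.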
The domain of a thesaurus is the complete range, concept-wise, of
terms that can be used in the field that the thesaurus is designed
for. The scope, on the other hand, defines the limit or boundary
that is applied to the domain. In this collection, the domain of
the thesaurus is topics and themes related to World War II,
whereas the scope of the thesaurus is that only topics and themes
pertaining to World War II are allowed.
Specificity is the extent of precision of the terms used to
represent the subject of the book in the chosen field. The higher
the level of specificity, the higher the precision of the subject
representation. Conversely, the lower the level of specificity,
the lower the accuracy of the subject representation. Specificity
partially depends on the concreteness, or lack thereof, of the
chosen terms. For terms that are more abstract, specificity is
generally low. For this collection, a high level of specificity
is appropriate, as users have high domain knowledge. The terms
selected represent the theme of the book accurately. A high level
of specificity results in high precision and low recall.
Exhaustivity determines the number of terms assigned for
representing the subjects of each object in the collection. It is
the extent of subject representation for every object. Depending
on subject coverage, exhaustivity is classified further into depth
indexing and summarization. Depth indexing covers more ground,
covering main topics as well as subtopics, whereas summarization
covers only the main subject of the object. Depth indexing yields
high exhaustivity, whereas summarization yields low exhaustivity.
Depth indexing is more applicable where the selected terms are
more abstract and more terms are assigned to each record. For
this thesaurus, the depth-indexing method is used to yield better
results for each search, because user domain knowledge is high in
this case but the subject terms are more abstract. For example,
even though the term bravery precisely specifies the subject of
the book, bravery can also be described as courage or chivalry,
and all these terms need to be taken into consideration. The
exhaustivity level is high for this thesaurus. For depth
indexing, recall is high and precision is low.
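The trade-off between precision and recall mentioned for
specificity and exhaustivity can be made concrete with the
standard definitions: precision is the share of retrieved records
that are relevant, and recall is the share of relevant records
that are retrieved. The record numbers below are invented purely
for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    precision = relevant records retrieved / all records retrieved
    recall    = relevant records retrieved / all relevant records
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Invented record numbers: records 1-4 are the relevant ones.
relevant = {1, 2, 3, 4}
narrow_search = {1, 2}                    # few, highly specific terms
broad_search = {1, 2, 3, 4, 5, 6, 7, 8}   # many terms per record

narrow = precision_recall(narrow_search, relevant)  # (1.0, 0.5)
broad = precision_recall(broad_search, relevant)    # (0.5, 1.0)
```

The narrow search returns only relevant records but misses half
of them, while the broad search finds every relevant record along
with as many irrelevant ones.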
Refer to Appendix D for the thesaurus for this collection.
4.3. Classification scheme
Classification is a system of information organization that
enables proper organizing and arrangement of information
objects. The system of classification is implemented via
classification schemes, which are useful in that they enable
proper ordering of information objects and also make them
logically easier to locate. Classification can be done using two
approaches: the hierarchical approach and the faceted approach.
The hierarchical approach uses prearrangement into classes and
subclasses, where a class is a category of similar objects and
subclasses are a further classification of classes. This approach
is exhaustive in terms of including all possible concepts and is
rigid: modifications are not allowed. The faceted approach
requires prior selection of subject fields that are possible
candidates for facets. Here, there is no prearrangement of
classes and subclasses. This approach requires prior analysis of
the information object, and based on that analysis, the notation
is coined. The faceted approach allows for certain modifications
if required, such as the addition of classes in the future.
In the case of this collection, the faceted approach is used.
Facets are different types of classes or categories, and they
enable better organization of objects. The user questions are
analyzed, and based on this information it is concluded that user
searching behavior involves knowledge of the Genre and Subject
fields, which are subject class candidates, as well as the author
name. The Genre field is selected as the primary facet because it
has limited terms and allows for better organization and, as a
result, retrieval. Along with these, the Publication Date field
is selected to form the classification notation, followed by a
unique identifier; the unique identifier is not a facet. A unique
identifier is, in this case, a number that is assigned to only a
single information object, here a book, which distinguishes it
from all other information objects and decides the physical shelf
location of the book. The unique identifier starts from the first
record created and increases by one for every new record created.
The facets and unique identifier enable precise identification of
the book, and the notation created allows for logical ordering
and placement of the book.
The classification scheme for this collection is designed to
produce the following kind of code. Consider a book from the
collection whose genre is classified as Holocaust (Hol), whose
first Tags (for Subject) term is Auschwitz, whose author's last
name is Morris and whose year of publication, from the
Publication Date, is 2018. Following the notation as defined in
Appendix E, the classification code is Hol.Aus.Mor.2018/10.
The notation requires the abbreviation of the Genre term as
provided in the Appendix E table, followed by a period; the first
three letters of the first Subject term, with the first letter
capitalized, followed by a period; the first three letters of the
last name of the Author, with the first letter capitalized,
followed by a period; the four digits of the year field in
Publication Date, followed by a period; and finally the unique
identifier assigned to the book.
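The notation rules above can be sketched as a short function.
This is an illustration under assumptions, not the scheme in
Appendix E: the function and parameter names are invented, and
the Genre abbreviation is passed in directly because the Appendix
E table is not reproduced here. The written rules place a period
before the unique identifier while the sample code shows a slash,
so the sketch leaves that separator as a parameter.

```python
def classification_code(genre_abbrev, subject_term, author_last_name,
                        year, unique_id, id_separator="."):
    """Build a classification code from the facets and unique identifier.

    genre_abbrev is assumed to come from the Genre abbreviation table;
    id_separator is the character placed before the unique identifier.
    """
    def first_three(text):
        # First three letters, first letter capitalized, rest lower-case.
        return text.strip()[:3].capitalize()

    facets = [
        genre_abbrev,
        first_three(subject_term),
        first_three(author_last_name),
        f"{int(year):04d}",
    ]
    return ".".join(facets) + id_separator + str(unique_id)


code_with_period = classification_code("Hol", "Auschwitz", "Morris", 2018, 10)
code_with_slash = classification_code("Hol", "Auschwitz", "Morris", 2018, 10,
                                      id_separator="/")
```

With the default separator the function returns
Hol.Aus.Mor.2018.10, and with id_separator="/" it returns
Hol.Aus.Mor.2018/10, matching the sample code in the text.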