Best Practices
Information Management
Best Practices - Volume 1
“Regardless of the kind of information you need to manage, this book
will make your projects better.”
Bob Boiko, University of Washington
“The TIMAF organization proves its dedication to raising the bar for
information management practitioners everywhere. It has assembled the
best thought leaders in the field to share insights from case studies which
are both actionable and universally relevant. Even the most experienced
IM professionals will learn something new with each turn of the page.”
Scott Liewehr, President CM Professionals
“It is, quite frankly, the best collection of case studies and solutions I’ve
run across and will be an invaluable resource for our readers. The reports
are solid, practical and real-world examples of how to do things right.”
Hugh McKellar, Editor in Chief KM World Magazine
Charlotte Robidoux, Stacey Swart
Charlotte Robidoux (charlotte.robidoux@hp.com) is a Content Strategy Manager at
Hewlett-Packard Company (HP) and has over 17 years of experience in technical communication.
At HP, she oversees the single sourcing strategy and implementation for the StorageWorks
Division. Charlotte holds a Ph.D. from the Catholic University of America in rhetoric and
technical communication. She is the author of ‘Rhetorically Structured Content: Developing a
Collaborative Single-Sourcing Curriculum’ published in Technical Communication Quarterly. She
is co-editor of ‘Collaborative Writing in Virtual Workplaces: Computer-Mediated Communication
Technologies and Tools’.
Stacey Swart (stacey.swart@hp.com) is the Content Management System Administrator and
Strategist at StorageWorks Division of Hewlett-Packard Company (HP). She has over 16 years
in the tech industry in areas ranging from technical support to technical communication, and is
certified by HP as a Lean Sigma Green Belt. Stacey holds a B.S. from the University of Kansas in
Education and English.
Develop a Metadata Strategy in Eight Steps
Streamlining Your Path to Metadata
Charlotte Robidoux
Stacey Swart
TIMAF Information Management Best Practices Vol. 1
Abstract
A Content Management System (CMS) allows a business to
streamline its content development processes, using and
reusing content from a single source for multiple purposes.
Fully leveraging this capability requires the ability to access
and manage your content, and managing your content
efficiently necessitates a robust metadata strategy. However,
developing a metadata strategy can be intimidating, onerous,
and costly. The sheer amount of time needed–time to research,
evaluate, synthesize, implement, and maintain a viable
solution–can prompt even the most dedicated among us to
abandon a strategic effort altogether. For this reason, it is
essential to find a streamlined approach to metadata strategy
development. This case study explores how groups can
streamline their metadata development without cutting corners and
without undermining the purpose of having a CMS.
Establishing a metadata strategy using a gradual approach
makes the process more streamlined and manageable. The
key to this solution is to define metadata components that are
meaningful in your environment. After creating these compo-
nents, you can determine the optimal configuration for your
business and customize a taxonomy that makes your content
easier to find. This means that a comprehensive metadata
solution is both user-driven, with users assigning predefined
values from controlled vocabularies, and system-driven.
The solution also depends on input from all team members
involved in content development, from content developers to
editors and administrators. This article discusses the essential
steps needed to streamline your metadata strategy.
Background
At times, research on metadata can make the
concept seem more like a metaphysical journey
than one related to any practical outcomes. Yet
as long as there has been a need to categorize
objects and the information describing them,
metadata has been the essential means for
managing information collections or reposito-
ries. In our modern age, the need to manage
and access data on a large scale in a global
economy is no less important. Metadata is
central to modern authoring environments.
For example, it is an integral part of automa-
ting technical documentation development,
documentation that enables users to operate
the complex technologies that help to drive
business transactions. More generally, it is vital
to administer metadata efficiently, as indicated
by metadata expert Murtha Baca: “Institutions
must streamline metadata production and
replace manual methods of metadata creation
with ‘industrial production whenever possible
and appropriate’.” (1, page 72)
The Skills Needed to Perform
this Best Practice
Successfully implementing this strategy requires
one or more people in each of the following
roles:
•	 Content Librarian: Oversees the quality of 	
	 modular content in the CMS and assists 		
	 writers with opportunities for reuse across 	
	 the database.
•	 Editor: Manages edits at the sentence level 	
	 and reviews content against style guidelines.
•	 Content Developer: Uses content mapping 	
	 to define and create reusable content
	 modules.
•	 Tools Administrator: Configures and manages the tool set.
Including these roles as a part of your strategy
is key to your success. Without them, you will
find holes in what could be a more streamlined
approach. Step 7, Assign Metadata Tasks to
Roles, describes this in detail.
Step 1: Define What Metadata
Means to Your Organization
and Why It is Important
If you have found that some simply stated
definitions of metadata are hard to make use
of, and other highly technical ones are hard to
understand, you are not alone. Metadata is an
intricate subject that has become increasingly
technologized. While stripping the term to its
bare essence—such as “data about data”—helps
demystify it, such definitions leave us with
few clues about how to move forward. Finding
comparative definitions that make sense in your
organization can serve as a useful starting point
for understanding the concept: card catalogs in
a library, directories in a grocery store or mall,
playlists on an iPod, for example.
For our team, the most compelling comparison
was right in front of us: metadata as an index
in a book or a help system. The index compari-
son enabled our team to appreciate why meta-
data is important: it helps us organize and
access content. We also related the concept of
metadata to our own environment by review-
ing established metadata standards to see if,
and how, they would fit our needs. Standards
or schemas are rules for uniformly managing
information within and across repositories. They
fall into various types:
•	 Structure;
•	 Value;
•	 Content;
•	 Format.
For example, the Dublin Core Metadata
Element Set (DCMES) (2) is a general standard
that provides guidelines for structuring informa-
tion into categories. It was formulated to
describe documents on the Web. DCMES
defines 15 metadata elements, including:
•	 Title;
•	 Creator;
•	 Subject;
•	 Description;
•	 Date;
•	 Type.
This standard features a succinct set of elements
or categories and has been endorsed in other
standards like ISO Standard 15836-2009 (3)
and ANSI/NISO Standard Z39.85-2007 (4). The
general nature of DCMES elements makes them
applicable to many organizations.
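To make the element set concrete, a few of the 15 DCMES elements can be modeled as a simple record. The values and the required-element check below are a hypothetical sketch for illustration, not part of the standard itself:

```python
# A minimal, hypothetical record using a few of the 15 Dublin Core
# (DCMES) elements. The values are invented for illustration only.
dc_record = {
    "title": "Configuring the Storage Array",
    "creator": "Technical Publications",
    "subject": "storage configuration",
    "description": "Steps for initial setup of a storage array.",
    "date": "2010-06-01",
    "type": "Text",
}

def missing_elements(record, required=("title", "creator", "date")):
    """Return any required DCMES elements absent or empty in a record."""
    return [name for name in required if not record.get(name)]
```

A record missing required elements fails the check, which is the kind of uniformity a structure standard is meant to provide.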
However, if you want to streamline your path
to metadata, avoid getting lost in the sea of
standards available. Because “[t]here is no
‘one-size-fits-all’ metadata schema or controlled
vocabulary or data content (cataloging) stan-
dard”, consider drawing on aspects of various
standards that will fit your organization
(1, page 72).
No specific standards seemed to target
computer documentation, but our team did
consider standards related to structure in order
to verify that we were targeting all of the key
elements. We only evaluated other standards,
including one from an HP marketing group, if
they seemed pertinent to our environment.
For example, we drew on a value standard,
ANSI/NISO Z39.19-2005 (5), to find guidelines
for developing controlled vocabularies,
as discussed below.
The challenge in defining metadata was learning
to appreciate the power inherent in distinguish-
ing content from the descriptors used to access
and manage that content effectively.
Step 2: Determine the Goals
That Drive Your Metadata
Strategy
Knowing what you want metadata to achieve
is fundamental to developing a sound strategy.
Once your team agrees on a definition of metadata,
turn your attention to identifying the primary
goals that will drive the strategy. Experts
suggest “working backwards” from your goals
to the metadata needed to reach them.
“Deciding which aspects of metadata are essen-
tial for the desired goal and how granular each
type of metadata needs to be” is essential to
the process of formulating a strategy (6, page
193) (1, page 19).
We began by listing the various kinds of
information that would be useful to us: track-
ing types of content and components, content
status (new, draft, approved), who originally
created the content, who revised it and when,
what content is reused, where the content is
reused, workflow tasks, multimedia objects
available, version details, profiled content, sys-
tem performance, and reports related to these
items. Next we compared this list with several
types of metadata: descriptive, administrative,
and structural. While experts refer to the
number and names of these types differently,
our team drew on the types identified by
NISO (7). These types are described in Table 1.
When looking at these types of metadata, we
saw that items on our list could be understood
in terms of these categories. From this view, we
began formulating and prioritizing our goals,
short-term vs. long-term. Given our focus on
gaining efficiency, we determined that being
able to retrieve and reuse content was a paired
goal. Another important goal was to minimize
the risk of content being reused prematurely.
Longer term goals included tracking the
percentage of content we reuse, determining
what reuse opportunities are still untapped,
ensuring the quality of our deliverables, and
identifying what content is being localized.
Through this exercise, we could see that all
these types of metadata could help us achieve our goals.
The focus on metadata types helped to stream-
line how we thought about our goals. Our next
step was to understand what specific metadata
components would help us attain our short-
term goals.
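The “working backwards” exercise can be sketched in a few lines. The goal-to-type pairings below are illustrative assumptions drawn from our reading of the goals above, not a prescription:

```python
# Hypothetical mapping from goals to the NISO metadata types
# (descriptive, administrative, structural) that support them.
goal_to_types = {
    "retrieve and reuse content": {"descriptive", "structural"},
    "prevent premature reuse": {"administrative"},
    "track reuse percentage": {"administrative", "structural"},
}

def types_needed(goals):
    """Work backwards: union the metadata types the chosen goals require."""
    needed = set()
    for goal in goals:
        needed |= goal_to_types.get(goal, set())
    return needed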
Step 3: Identify the Metadata
Components That Help You
Obtain Your Goals
Selecting metadata components is extremely
important in the process of establishing a
metadata strategy.The ability to decide on the
optimal number of metadata components is not
easy. How do you pick just the right number,
not too many or too few?Which ones will have
the biggest impact and help to minimize risk?
Here are some sample questions you should
consider (5, page 193-194 and 196):
•	 What type of content is it?
•	 What else do you need to know about the 	
	 content to ensure the correct piece of content 	
	 is retrieved?
•	 In what form will users want to retrieve
	 content?
•	 How will users specifically identify the
	 desired content?
Table 1 Metadata Types
Metadata Type
Descriptive
Administrative
Structural
Identifies and describes
collections resources.
Used in managing and
administering collections,
versioning, and reten-
tion.
Delineates the
organization and
relationship of content
units within a system.
Assists with queries and the ability to locate types
of content that can be reused. This includes:
•	 Content type and status
•	 Tracking types of content/components
•	 Profiled content
•	 Multimedia objects available
Enables creation and management of collections
and configuration of tasks, permissions, status,
and history:
•	 Who created content and when
•	 Workflow tasks
•	 Version details
•	 Reuse statistics
•	 System performance
Supports navigation and means of combing
components into deliverables:
•	 What content is reused
•	 Where content is reused
•	 Where multimedia objects referenced
•	 Reporting
Purpose Relevance to our environment
10 TIMAF Information Management Best Practices Vol. 1
Research into structure standards showed our
team that we should focus on components that
describe the subject of our content (one of
the Dublin Core elements).These components
would be the basis of user queries.The best way
to streamline this step is to look at your own
content for the answers. Once again, the index
serves as a valuable tool for understanding what
terms might be queried, along with the table of
contents, providing clues about the hierarchy of
terms as they relate to the subject. Linking the
index concept to metadata was useful in helping
team members understand metadata hierarchies
and how the components related to each other.
FollowingAnn Rockley’s advice to select three
to five components, we chose four that were
subject related and two that were user related,
as shown inTable 2.
Table 2 Metadata Elements and Attributes
Element
ContentType
Product
Keyword
Abstract
Originator
Reuser
Exactly 1
1 or more
At least 2
At least 1
Exactly one
Exactly one
The largest “container” used to describe major
topics that make up our documentation.
“ContentType” describes the subject matter of
the content.
A smaller “container” used to qualify how a topic
applies to various products. “Product” designates
the name of the product for which the content was
written, including the model and / or version.
The smallest category that further limits the
relevance of a topic. “Keyword” helps to
further narrow search results.
Provides a synopsis of content that authors
can use to determine if reuse is appropriate,
describing the subject of the content, why it is
relevant, and guidelines for using the content.
Who originally created a reusable piece of
content.
Authors are reusing a piece of content.
Occurrence Rule Purpose
1115 Charlotte Robidoux, Stacey Swart
After choosing our components, we had to
consider how to manage them in our CMS.We
streamlined the process by drawing on options
that our CMS already supported. Our CMS
allowed searching on fields such as “Status,”
“Create date,” “Edit date,” and “username,”
but we needed to search on more specific
subject-related content as well. Our DTD,
which is a subset of DocBook, only contains
“keywordset/keyword.”To fill the gaps, we
developed custom elements and attributes,
adding custom elements for “ContentType”
and “Product,” and two attributes for
“Originator” and “Reuser.”We chose elements
when we might need to use multiple values,
and attributes when we wanted to enforce
only one value.
While it was clear that the goals of retrieval
and reuse could be achieved by building related
metadata into our content, we felt that the
goal of minimizing the risk of premature reuse
needed additional CMS support.To achieve this,
we organized our content into two collections:
“Working” and “Approved.”Working collec-
tions would contain work in progress; only the
“originator” content developer could reference
this content; reuse by others was not supported.
In contrast, “Approved” collections would
contain finalized content that had been
reviewed by an editor as well as subject matter
experts and could not be changed; any author
could be a “Reuser” of the content contained
here. Separate collections ensure that original
content will not change if reused. Instead,
“Reusers” must copy content from anApproved
collection to aWorking collection to propose
changes.After those changes are made,
the author initiates a Change Proposal work-
flow, illustrated below, via the CMS.The
workflow automatically notifies the assigned
stakeholders that a change to content is being
proposed. Some of the automation is possible
because of the metadata attributes “Reuser”
and “originator.”The CMS is able to determine
who initiated the change proposal and who
the changes will affect.The workflow content
librarian also employs automated email notifi-
cations and task assignments.Two options are
possible: either the approved content is updated
to reflect the changes agreed upon by the
review team, or new content is added to the
Approved collection because the original
content is still needed as first written. By
organizing the CMS collections this way, and
by creating a workflow that leverages user-
related metadata, we effectively streamlined
our use of metadata and found a way to
leverage elements to minimize risk when
reusing content.That is, a strategic approach
to metadata from the outset triggers additional
efficiencies; streamlining metadata cascades
into workflow and CMS implementation.
Step 4: Identify Metadata
Values
Without question, identifying metadata values
to create a stable list of terms-a controlled
vocabulary-is the most time consuming and
contentious step of the process. Deliberating
over synonyms and laboring over documents
to test the appropriateness of the values seems
endless.The best way to streamline this part
of the process is to form a small workgroup
of three or more members who can begin to
evaluate document conventions and create lists
of terms related to the components selected.
(A workbook works well for managing the
terms on separate spreadsheets.)As mentioned
earlier, our team drew extensively fromANSI/
NISO Z39.19-2005, Guidelines for the Con-
struction, Format, and Management of Mono-
lingual ControlledVocabularies.
This standard helped the workgroup and users
appreciate why a controlled vocabulary is so
important, given that “[t]wo or more words
or terms can be used to represent a single
concept” and that “[t]wo or more words that
have the same spelling can represent different
12 TIMAF Information Management Best Practices Vol. 1
concepts” (5).When creating the lists, the work-
group relied on the Standard’s recommendations
for conducting “top- down” and “bottom-up”
assessments, determining “the correct form of
each term,” and for following key principles
such as: “eliminating ambiguity,” “controlling
synonyms,” “establishing relationships among
terms where appropriate,” and “testing and
validation of the terms” (5).
Once the lists were created, workgroup
members began vetting these lists with
seasoned authors, many of whom were not
co-located.The ability to engage teams across
the organization when the workgroup had little
authority was especially challenging.We relied
on many virtual collaboration techniques to
streamline our efforts so that we could complete
the work. Do not overlook the importance of
showing the value of metadata to the users-they
need to understand and believe in the purpose
of their work, and realize that metadata:
•	 Enhances query capabilities in the CMS by 	
	 enabling “effective retrieval” (6, page 18).
•	 Allows users to locate their own content, as 	
	 well as other content that they could reuse
	 or leverage.
•	 Reduces “redundant content” (6, page 185), 	
	 making content developers more productive.
•	 Reduces costs (Management may care more 	
	 about this, but in today’s work environment, 	
	 a content developer who is saving the
	 company money is a content developer 		
	 worth keeping.).
Additionally, employing a controlled vocabulary
saves the content developers time by increasing
the amount of content that can be successfully
retrieved.
TheANSI/NISO Z39.19-2005 standard
provided essential principles for maintaining
a controlled vocabulary, especially how best to
manage additions and modifications as well as
a history of changes (5, page 97).The change
history was especially critical when updating
the values in our tools.These processes are
contained within a single resource that we refer
to as metadata guidelines.
Documenting the metadata process is a must.
Bob Boiko discusses the idea of a metatorial
guide containing “a set of rigorous metadata
guidelines,” similar to an editorial guide (8,
page 495). Boiko goes on to say that the
metadata process must ensure (8, page 508):
•	 Metadata completeness and consistency.
•	 Content manageability, accessibility, and 		
	 targetability (intended for the appropriate
	 audience).
A thoroughly documented set of rules and
procedures helps take the guesswork out of
metadata application.As Boiko explains, “in a
sizable system with numerous contributors you
can almost guarantee that you will find wide
variation in the ways that people interpret and
apply even the most precisely stated tagging
rules” (8, page 509). Providing a link from the
tool’s support menu to the metatorial guide
puts the information at the content developers’
fingertips, giving users easy access to the meta-
data processes and guidelines.As previously
discussed, proper application of metadata is
critical to ensure quality search results. Making
the guidelines as accessible as possible will help
ensure that they are followed.
Once guidelines are documented, you need
to determine what type of user will apply the
metadata. Should content developers add all
user-driven metadata, or should a content
librarian assist them?What are the roles regard-
ing metadata application? Boiko contends that
“a different set of skills is necessary to deal
with this metadata” (8, page 495). Some users
can be trained to apply metadata. However, as
he goes on to say, users “rarely have the where-
withal to discover what content others are sub-
mitting and exactly how to relate that material
1315 Charlotte Robidoux, Stacey Swart
to what they submit.” Someone on the team
with an eye for detail like a content librarian
is more appropriate for this role.
While content developers understand their
content and usage better than anyone else, as
noted by Peter Emonds-Banfield, they might
not have the “expertise necessary for meta-
data creation, nor the time to keep up with it”;
whereas “... metators (= editors that manage
metadata) can play a role by educating content
teams on metadata creation” (8, page 509).
As previously discussed, some tools can be
configured to enforce certain rules; however,
some standards require the human eye. In those
cases, the content librarian can audit metadata
application before content is approved, ensur-
ing the metadata values chosen by the content
developer meet quality standards.You can
liken the role of a content librarian to that of
an editor. Instead of reviewing content against
structure and style rules, the content librarian
reviews metadata against metatorial guidelines,
ensuring that metadata application is consistent
throughout all content in the CMS. Boiko refers
to this as “a central point of decision” (8, page
511).The more complex the metadata and
content, and the more users who access it, the
more critical such a point of decision becomes.
On the other hand, is having the content
librarian audit metadata application by content
developers sufficient, or should the content
librarian apply all metadata to content, com-
pletely releasing the content developer from
such a burden?According to Boiko “the task
of looking after the metadata health of a CMS
is crucial if you want to be confident that the
content in the system will show up when you
expect it to.” (8, page 511).This the point for
content to be retrievable so that you can then
reuse it. If you want to be completely sure
that metadata is applied consistently across
all content, regardless of who originated it,
then having a content librarian perform this
task is as close to a guarantee as you might get.
However, some organizations do not have the
resources to staff a content librarian. In that
case, an editor might take this on as a new
role. If resource constraints are an issue, some
organizations must rely on content developers
to apply user-driven metadata. In this case, the
metatorial guide is what you are betting on,
and it must be rock solid.
In our case, we rely on content developers to
apply user-driven metadata.The editors are
charged with reviewing metadata as they would
any other content.The content librarian is
consulted when questions arise, and also audits
content in the CMS for consistency. Ultimately,
the content librarian is the point of decision
and is responsible for educating others and
maintaining the metatorial guide.We have also
staffed a trainer who works with the content
librarian to develop metadata training for all
(content developers and editors).The primary
reason we have this model is to share the work-
load; we do not have the resources to assign
such a role in a full-time capacity. Regardless
of who is doing it, applying “metadata well
requires a lot of human energy” (8, page 495).
Step 5: Determine What
Metadata Components Can
Be Automated
Determining which metadata components,
if any, can be automated, is important at this
stage in developing a metadata strategy. Some
components need the human touch for quality
purposes, or because tools such as the CMS are
not able to automate the application of such
metadata. However, when possible, utilize auto-
mation.The options for this will vary depending
on the tool. In our case, we looked to the CMS
for automating the application of metadata.
Why automate?Automating metadata applica-
tion lessens the burden on the content develop-
ers and helps avoid inconsistency. In addition,
if it is “up to the author to remember to add
14 TIMAF Information Management Best Practices Vol. 1
the metadata in all the relevant places”, it is
a “recipe for missed metadata” (6, page 200).
As Boiko writes, “without a rigorous consis-
tency and careful attention, metadata becomes
useless” (8, page 495). He goes on to say that
“someone or, better, some system is necessary
to ensure that people handle metadata
thoroughly and consistently” (8, page 495).
So if the CMS can handle it, automate it!
What metadata makes a good candidate for
automation? From our experience, metadata
with a yes or no value should be automated
if the question can be answered by data that
is accessible to the CMS. For example, to
answer the question “Is the content being
reused?”, populate the reuse attribute with
either “yes” or “no.” In our case, if content
lives within a specific CMS collection, then
it is reused. Otherwise, it is not. Our CMS is
smart enough to answer this question based
on the location of the content – in a certain
collection, so we let it answer that question
for us.
Metadata containing a value that is definite
should also be automated. For example, the
originator attribute can be populated with the
username of the person who created the con-
tent because the CMS knows who that person
is. Likewise, the CMS knows who is reusing
content because it can follow the reference
to the content back to the username who
created the reference.As a result, we let the
CMS capture the username for us by adding
it to the reuser attribute.
On the other hand, what metadata should
not be automated? Metadata requiring a
discerning human eye should not be automated.
For example, a person is needed to determine
the subject of the content. One could argue
that if the content contains a title, the subject
could be leveraged from the title. However,
not all content chunks include a title.As a
result, we do not automate the ContentType
metadata element.
A gray area might be keywords. In our case,
we depend on a person to assign keywords.
This person is typically the content developer,
with some assistance from the content librarian
if required.As content grows, new keywords
might be necessary. If they are not part of the
controlled vocabulary, the content librarian
can make note of that and modify the list as
needed. From our experience, controlled
vocabularies are certainly living lists, as
previously discussed.
Table 3 shows our system-driven metadata,
including metadata used to manage the status
of content (whether or not it can be reused).
Be sure to also consider the risks of automa-
tion. Boiko states that “the problem isn’t to
find the blanks, but to correctly fill them”
(8, page 509).The key word here is “correctly”.
Similarly, Rockley explains that “[i]mproperly
identified metadata ... can cause problems
ranging from misfiled and therefore inacces-
sible content to even more serious problems ...”
(6, page 185). In our case, inaccessible content
would be a deal breaker since our primary goal
is retrieval for reuse. It is critical that metadata
applied automatically by the CMS is done with
the highest quality standards.There can be no
room for incorrectly applied metadata or for
the possibility of inaccessible content.
Consequently, if you rely on the CMS to auto-
mate the application of metadata, make sure it
is fool- proof (tool-proof).
Step 6: Ensure That Users Will
Apply the Metadata
Once you have determined which metadata
components can be automated, the remaining
components will be user-driven.The next step
it to ensure that users will apply it.As Rockley
notes, metadata is “only valuable if it gets
used” (6, page 200).
1515 Charlotte Robidoux, Stacey Swart
Table 3 System-Driven Attributes
System-Driven
Attributes
Status
Collection
Reuse
Originator
Reuser
Working
Approved
Section
Chapter
Glossentry
Yes
No
Username
Username
External use in the authoring environment. Used
upon extract to work with style sheets to lock
content from changes if approved.
Used to properly reload content to the correct
collection.
If yes, content is from an approved collection.
Used to color-code approved content so that
reviewers and editors know it has already been
approved.
Who created the content; used by CMS
workflow.
Who is reusing the content; used by CMS
workflow.
Value Goal
One method to ensure users apply metadata is
to configure your tools with metadata require-
ments.The DTD behind an authoring tool can
utilize occurrence rules to require specific
metadata components to be added (such as at
least two keywords must be present).A CMS
can be configured to enforce the same rules.
In our case, we have rules established in both
tools. Regardless of which tool the metadata is
applied in, the user must meet certain require-
ments.The tools alert the user when those
requirements are not met.
We have found that the CMS provides greater
specificity than our authoring tool in such
requirements.While the DTD behind the
authoring tool can require that metadata
components be present, it cannot enforce that
values be added to those components. For
example, in the authoring environment, a user
could add two keyword elements, but leave
them empty with no values assigned.Techni-
cally, they would meet the DTD rules.The CMS
provides the additional reinforcement. In our
case, content shows as incomplete unless the
metadata components are present and they
contain values. For example, two keywords
are present and the values are this and that.
Content within the CMS shows as incomplete
unless all metadata requirements are met, and
because all content is ultimately managed in
the CMS, it becomes the final checkpoint.
To fully utilize the benefits of metadata,
however, users must do more than just apply
metadata to their content. They must apply the
appropriate metadata. A well-designed metadata
strategy ensures that the metadata components
and values are tailored to the needs of the user;
metadata guidelines assist them with the tasks
they need to accomplish and include terms they
will use when retrieving content. But as previously
discussed, users do not all think the same way.
This is where having a controlled vocabulary is
a must. Even though one user might be inclined
to search on “America” and another might search
on “U.S.A.”, they will both work off of the same
list of terms, which in this example could include
“United States”. Such search standards can be
taught, and will ensure effective search results
rather than wasting the user’s valuable time.
There are other ways to assist users in metadata
application. One is to provide templates that
are pre-populated with the required metadata
components. We have done this in our authoring
toolset; the content developer only needs
to assign values to the components. Another
method that both assists users and provides a
level of control that can be a partner to occurrence
rules is described by Boiko as “categorizing
metadata by its allowed values” (8, page
506). For example, we use a “closed list,” which
allows users to select a value from a predefined
set of terms, or a controlled vocabulary (8, page
507). In our case, the controlled vocabulary is
built into the authoring tool and the CMS. The
user cannot type in metadata values; the only
option is to select them from a list.
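A closed list can be sketched as follows. This is a hypothetical example (the vocabulary, synonym map, and function names are illustrative, not part of our toolset): free-typed values are rejected, and known variants resolve to the canonical term, so that “America” and “U.S.A.” both land on “United States”.

```python
# Illustrative controlled vocabulary (a closed list of allowed values).
CONTROLLED_VOCABULARY = {"United States", "Canada", "Mexico"}

# Known variants mapped to their canonical terms.
SYNONYMS = {
    "America": "United States",
    "U.S.A.": "United States",
    "US": "United States",
}

def resolve_term(value):
    """Map a user-entered value onto the controlled vocabulary,
    rejecting anything outside the closed list."""
    canonical = SYNONYMS.get(value, value)
    if canonical not in CONTROLLED_VOCABULARY:
        raise ValueError(f"'{value}' is not in the controlled vocabulary")
    return canonical

print(resolve_term("America"))  # United States
print(resolve_term("U.S.A."))   # United States
```

In practice the closed list lives inside the authoring tool and CMS as a drop-down menu, so resolution happens at selection time rather than through a lookup like this one.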
Step 7: Assign Metadata Tasks to Roles
To ensure your metadata goals become part of
your business processes and tool environment,
assign roles to team members who can implement
the metadata strategy. These assignments
streamline the implementation effort. Table 4
describes each of these roles.
Metadata is dependent on many contributors.
While tool administrators can ensure that
system-driven metadata and automation are
set up behind the scenes, they are not the sole
contributors. The realization of your strategy
becomes much easier with all team members
involved.
Table 4: Roles and Responsibilities

Content Librarian
Responsible for the quality of modular content in the CMS as well as
for flagging opportunities for reuse across the database. The content
librarian’s tasks include:
*	 Assisting content developers when needed in understanding the metadata guidelines.
*	 Maintaining the metadata guidelines document and the master list of values.
*	 Reviewing and accepting or rejecting new metadata value requests.
*	 Notifying tool developers when new metadata values need to be added to the tool set.
*	 Auditing the quality of the metadata values that content developers apply before the content can be made available for reuse.
*	 Overseeing content change proposals for reused modules and validating the requests for changes.
*	 Facilitating the review process to ensure all Reusers participate by either accepting or rejecting their changes.
*	 Implementing the final result by either overwriting the original “approved” content in the CMS, or by creating a variant of the original “approved” content.
*	 Populating the CMS with common queries to assist content developers with locating content to be reused or leveraged.
*	 Assisting content developers when more specific search criteria are needed for database queries to locate content to be reused or leveraged.

Editor
Manages edits at the sentence level and reviews content against style
guidelines. The editor’s responsibilities include:
*	 Reviewing metadata values as part of the literary edit to ensure consistent usage.
*	 Maintaining an eye toward content that can be leveraged or reused when a content developer opts to create new content.

Originator (Content Developer)
Identifies the need for and creates reusable modules of content using
content mapping. The originator’s responsibilities include:
*	 Identifying unique, similar, and identical content across the deliverables set.
*	 Capturing metadata values for identical content.
*	 Analyzing similar content for opportunities to make it identical.
*	 Creating reusable topics of information.
*	 Requesting new metadata values as needed via the CMS workflow.

Reuser (Content Developer)
A content developer who uses metadata to query the CMS for reusable
content. The reuser’s responsibilities include:
*	 Reusing “approved” content by referencing it in deliverables.
*	 Initiating the change proposal workflow as needed to request changes to “approved” content.
*	 Reviewing change proposals from other reusers.

Tool Administrators
Responsible for configuring and managing the tool set. Examples of tool
administrators include the DTD Developer, the Authoring Tool Developer,
the CMS Administrator, and the Publishing Tool Developer. The tool
administrator’s responsibilities include:
*	 Addressing requirements versus options.
*	 Automating processes where possible.
*	 Ensuring that the tools support the reuse and metadata strategy.

DTD Developer
The DTD developer’s responsibilities include:
*	 Managing DTD elements, attributes, and occurrence rules.
*	 Communicating with the CMS Administrator when DTD changes are needed.

Authoring Tool Developer
The authoring tool developer’s responsibilities include:
*	 Making templates available for new content creation.
*	 Automating the addition of required child elements when a parent element is selected.
*	 Maintaining pre-populated menus with required user-driven metadata values.
*	 Providing links to support documentation from the authoring tool menu.

CMS Administrator
The CMS administrator’s responsibilities include:
*	 Managing content collections for editing, loading, and extracting behaviors, including extracts directly to the publishing tool.
*	 Maintaining user roles and privileges.
*	 Configuring the CMS to ensure alignment with DTD rules.
*	 Setting up CMS-specific elements and attributes as needed.
*	 Tightening structure rules by requiring text to be present and/or valid values to be used for applicable elements and attributes.
*	 Making components, properties, and operators available to help ensure effective query options.
*	 Maintaining pre-populated menus with required user-driven metadata values.
*	 Implementing visual aids to assist users when viewing content in the CMS.
*	 Automating the capturing of system-driven attributes.
*	 Creating workflow configurations to support CMS-assisted procedures.

Publishing Tool Developer
The publishing tool developer’s responsibilities include:
*	 Creating and maintaining style sheets for use in the authoring and publishing tools.
*	 Developing authoring tool scripts to provide visual cues for reused content to content developers, editors, and Subject Matter Experts.
It is critical that team members have a clear
understanding of their roles and of how each
role contributes to the overall success of the
strategy.
Step 8: Prove That Your Strategy is Sound
Readiness to release metadata to production
can take months or years, depending on the
complexity of your strategy. Because our
organization did not always have dedicated
resources to devote to this implementation,
tracking our progress via a schedule was
absolutely essential. As priorities shifted in our
organization, deliverables were either pulled in
or pushed out as needed. Shifting priorities and
balancing resources may ultimately determine
the time needed to develop and implement
your metadata strategy. While it took our team
a number of years, we understood the return
on investment. Had we given in to the pressure
to release any sooner, we would have had a less
effective, less efficient, and less robust metadata
strategy. And because metadata is truly the
backbone of our reuse strategy, skimping was
not an option.
It is also necessary to come to an agreement
with management as to what qualifies the
strategy to be ready for release. For example,
we negotiated to have a full quarter of simulation
testing, and agreed that we would only
release to production if simulation testing
resulted in zero process or tool issues. Test
scenarios should be as realistic as possible.
In our case, we used actual customer content,
assigned roles, and created scenarios to put
our business and tool processes to the test.
The testers received new test scenarios each
week so that they couldn’t see what was coming
next. Our support staff, including the editor
and the content librarian, were also given test
scenarios. In some cases, we set up intentional
conflicts to ensure users knew how to handle
them.
Before you can release your metadata strategy
to production, you must ensure that:
•	 Your tools are functioning as expected to support the strategy.
•	 Roles and expectations are clear.
•	 The metadata guide is available.
•	 Training has occurred, including making all users aware of the importance of metadata (resulting in a willingness to use it).
•	 No gaps have been identified.
•	 There are no technical issues with any of the processes supported by the tools.
After all of the “human energy” (8, page 495)
spent on creating your metadata strategy, don’t
short-change yourself by rushing through the
testing process. When you do release your
metadata strategy, you want to know it is rock
solid.
Metadata in Action
As previously discussed, our goal is to enable
effective reuse by making content easy to find.
Because the originating content developer
(Originator) added metadata for reuse, other
content developers (Reusers) can query on
those values. Figure 1 shows some of an
Originator’s content.
In some cases, a content developer might know
the content exists, and is already familiar with
it. In that case, she would have knowledge of
the metadata values that are likely to be associ-
ated with the content.
In other cases, the content developer has a
need for content, but is not sure it exists. Rather
than creating it from scratch, she searches the
CMS to see if content exists that she can reuse
or at least leverage. For example, a user needs
content specific to installing NAS products
onto servers. Because our CMS is configured
with drop-down lists of valid values (controlled
vocabulary metadata values), the user selects
the appropriate metadata elements and values
from the list.
In this example, the content developer would
query on:
•	 ContentType = Installing;
•	 Product = NAS;
•	 Keyword = servers.
The content developer can search on one
or more metadata elements, as shown in
Figure 2. Combining multiple metadata
elements provides narrower results. In the
preceding example, over 1,200 section modules
were queried, resulting in one section that met
the content developer’s query (shown in
Figure 3). At this point, the content developer
can review the content in more detail and
decide if she can use it as is, or leverage it.
Figure 2: Searching Using Metadata
Figure 1: Originator Content
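The narrowing effect of combining metadata criteria can be sketched as a simple filter. This is a hypothetical model (the module records, field names, and `query` function are illustrative, not our CMS’s actual query interface): each additional criterion removes non-matching modules from the result set.

```python
# A toy collection of CMS section modules with metadata.
modules = [
    {"id": 101, "ContentType": "Installing", "Product": "NAS",
     "Keywords": ["servers", "rack"]},
    {"id": 102, "ContentType": "Installing", "Product": "SAN",
     "Keywords": ["switches"]},
    {"id": 103, "ContentType": "Troubleshooting", "Product": "NAS",
     "Keywords": ["servers"]},
]

def query(modules, content_type=None, product=None, keyword=None):
    """Filter modules by each supplied metadata criterion in turn."""
    results = modules
    if content_type:
        results = [m for m in results if m["ContentType"] == content_type]
    if product:
        results = [m for m in results if m["Product"] == product]
    if keyword:
        results = [m for m in results if keyword in m["Keywords"]]
    return results

# ContentType alone matches two modules; adding Product and Keyword
# narrows the set to the single reusable section.
hits = query(modules, content_type="Installing", product="NAS",
             keyword="servers")
print([m["id"] for m in hits])  # [101]
```

This mirrors the NAS installation example: each drop-down selection in the CMS corresponds to one more filter applied to the candidate modules.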
It is easy to see that without the metadata to
support the query, the content developer would
likely never have located the content she needed.
She probably would have just created a new
section, duplicating existing data. She would
have spent time doing this, taking away from
her other work. In addition, the CMS would
have become populated with redundant content.
Even when a content developer locates content
that already exists, it might not fully meet her
needs. In that case, she can propose changes to
the content. We use the Change Proposal Workflow
feature in our CMS to manage this process.
The workflow has the following steps, shown in
Figure 4:
•	 Proposal: The content developer copies content to a working collection, makes changes as needed, and initiates the workflow.
•	 Review: The content librarian validates the request. System-generated email notifications are sent to all Reusers. The content librarian facilitates an offline review and mediates any counter-proposals.
•	 Outcome: If all Reusers accept the change proposal, the CMS automatically overwrites the original content in the approved collection with the changed content. A system-generated email notification is sent to all content developers, letting them know that the workflow has been completed.
•	 Relink: If only some Reusers accepted the change proposal, the content librarian assigns a unique ID to the content, and the CMS automatically moves the variant to the approved collection. A system-generated email notification is sent to all content developers, reminding them to relink to the variant as needed.
Figure 3: Reused Content
Figure 4: Change Proposal Workflow
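The Outcome/Relink decision at the heart of the workflow can be sketched as a small function. This is a hypothetical model of the voting logic only (the function name, outcome labels, and the all-reject fallback are assumptions for illustration; the text above describes only the unanimous and split cases):

```python
def resolve_proposal(votes):
    """Decide the fate of a change proposal.

    votes: dict mapping each Reuser's name to True (accept)
    or False (reject).
    """
    if all(votes.values()):
        # Outcome step: unanimous acceptance overwrites the
        # original "approved" content.
        return "overwrite-original"
    if any(votes.values()):
        # Relink step: a split vote produces a variant with a
        # unique ID that accepting Reusers relink to.
        return "create-variant"
    # Assumed fallback, not described in the workflow above.
    return "reject-proposal"

print(resolve_proposal({"ana": True, "ben": True}))   # overwrite-original
print(resolve_proposal({"ana": True, "ben": False}))  # create-variant
```

In the real CMS this decision triggers the system-generated notifications and content moves described above; the sketch captures only the branch condition.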
If, on the other hand, query results show that
new content needs to be created, the content
developer can do so. Adding the metadata
elements to the new content will help ensure
that other content developers can locate the
content for future usage.
Summary and Conclusion
Proving the soundness of metadata in our case
entailed extensive collaboration and testing
among team members. Key areas of focus
included:
•	 Checking and rechecking that our metadata values were entered into the tools correctly.
•	 Ensuring high usability in the tools and in written processes so that team members could add metadata easily.
•	 Configuring our tools to easily locate metadata and to indicate if values and elements were missing.
•	 Proving the concept that our metadata would enable us to locate content effectively for the purpose of reuse.
The ultimate test of success is verifying that
implementing metadata allows your organiza-
tion to achieve the goals you identified at the
outset.
There is little guidance available on how to
develop a metadata strategy. While some
industries have developed specifications tailored
to their content, others seem to be starting
at square one. Technical communication, as it
relates to the computer industry, could benefit
from more substantial models to follow. Having
a specification as a starting point would help
companies launch their metadata strategy by
providing a list of components, possible values,
and the pros and cons of building upon this
foundation.
References
1.	Baca, Murtha, ed. Introduction to Metadata, 2nd Edition. Los Angeles, CA: Getty Publications, 2008.
2.	 Dublin Core Metadata Initiative. “Dublin Core Metadata Element Set, Version 1.1.” 2008. Retrieved on 20 Mar. 2009. http://dublincore.org/documents/dces/.
3.	 International Organization for Standardization. “ISO 15836:2009, Information and documentation - The Dublin Core metadata element set.” 2009. Retrieved on 21 Feb. 2009. http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=52142.
4.	 American National Standards Institute. “The Dublin Core Metadata Element Set.” ANSI/NISO Z39.85. Bethesda, MD: NISO Press, 2007. Retrieved on 8 Jun. 2008. http://www.niso.org/kst/reports/standards/kfile_download?id%3Austring%3Aiso-8859-1=Z39-85-2007.pdf&pt=RkGKiXzW643YeUaYUqZ1BFwDhIG4-24RJbcZB-Wg8uE4vWdpZsJDs4RjLz0t90_d5_ymGsj_IKVa86hjP37r_hFEijh12LhLqJw52B-5udAaMy22WJJl0y5GhhtjwcI3V.
5.	 American National Standards Institute. “Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies.” ANSI/NISO Z39.19-2005. Bethesda, MD: NISO Press, 2005. Retrieved on 9 Jun. 2008. http://www.niso.org/kst/reports/standards?step=2&gid=None&project_key=7cc9b583cb5a62e8c15d3099e0bb46bbae9cf38a.
6.	Rockley, Ann. Managing Enterprise Content: A Unified Content Strategy. Indianapolis, IN: New Riders Publishing, 2002.
7.	 National Information Standards Organization. “Understanding Metadata.” Bethesda, MD: NISO Press. Retrieved on 21 Feb. 2009. http://www.niso.org/publications/press/UnderstandingMetadata.pdf.
8.	 Boiko, Bob. The Content Management Bible, 2nd Edition. Indianapolis, IN: Wiley Publishers, 2005.
About the TIMAF Library
This ‘Information Management Best Practices’ book is a publication in the TIMAF
Library. The publications in the TIMAF Library are aimed at promoting Information
Management and are published on behalf of TIMAF. TIMAF, the Information Management
Foundation, is an initiative of information management practitioners to provide
a strong and clear foundation of information management.
TIMAF encourages authors from around the world, who are experts in their
particular information management sub discipline, to contribute to the development
of the TIMAF publications. Are you interested in sharing your ideas and experiences
online with the TIMAF Community? Visit www.timaf.org and join the discourse.
Have you experienced the merits of a specific approach to information management
strategies, methodologies, technologies or tools? Submit a proposal according to the
requirements listed in the ‘Call for Best Practices’ at www.timaf.org.
The following publications are available in the TIMAF Library:
Introduction books
Information Management Framework
paper edition - release: September 2011
Best Practices books
Information Management Best Practices 2009 Sneak Preview
online edition - www.timaf.org
Information Management Best Practices – Volume 1
paper edition - ISBN 978-94-90164-03-4
Pocket Guides
Information Management Framework – A Pocket Guide
paper edition - release: November 2011
Social Networks
Information Management Framework Wiki
www.timaf.org/wiki
We will publish new books and other instruments on a regular basis. For further
enquiries about the Information Management Library, please visit www.timaf.org
or send an email to info@timaf.org.
Introduction
Information? Manage!
Information is the term we use to stand for all forms of preserved communication
that organizations care to produce, store and distribute. If we communicate
it and record it, it is information. So, for us, information is anything from
sales figures in a database to a video on philosophy viewed on a mobile phone.
We define information management as the organized collection, storage and
use of information for the benefit of an enterprise.
Our definitions are intentionally wide enough to cover content, document,
asset, data, records and all other ‘information managements’ that organizations
do. We believe that while each of these “sub-disciplines” has its own tools and
types of information, there is much more that unites them than divides them.
Our definitions are intentionally quite practical. For us, information management
simply means moving pieces of recorded communication from creation
to consumption to retirement. Our definitions are crafted to carve out a niche
for the information manager. Information managers make sure that recorded
communication can be amassed and distributed in a way that benefits their
organization. Finally, our definitions are crafted to be a simple guiding
principle. Any person working in any information project can use this definition
to remain focused on the ultimate aim of their particular kind of work.
Information Management? TIMAF!
The field of information management is currently fractured and incoherent.
Each sub discipline (content, document, asset, data, records management
to name just a few) has its own practitioners, applications and professional
communities. We believe that behind the seeming differences between these
‘managements’ there is a deeper unity that will eventually define a strong
and clear foundation for all of them.
We do not believe that all managements will or should merge, but rather that
just as business underlies a variety of business practices including accounting
and finance, there is a common foundation for the various forms of information
management.
The Information Management Foundation (TIMAF) tries to provide this foundation
by publishing these information management best practices. In addition,
TIMAF develops and maintains an information management framework that
brings the commonalities between sub disciplines to light and helps to organize
the best practices that we publish.
Best Start? Best Practice!
Just as business is practiced within a more specific context, information
management is also practiced in context. Thus, we believe that the best way
to illustrate the concepts and practices of information management is within
the context of one or more sub disciplines. So, this best practices book tries
to show global principles of information management in the context of
projects in one or more of the sub disciplines.
This is the first volume of ‘Information Management Best Practices.’ In future
publications we will provide an ongoing compilation of high quality best
practice guidance, written for and by experts in the Information Management
field from around the world. These best practices are designed to help
professionals overcome their information management challenges. They bring
complex models down to earth, with practical guidance on tough problems.
In this volume, practitioners describe nineteen projects that you can learn
from. In return, we ask that you let us learn from you! Please let us know
what your experiences are with these or other projects at www.timaf.org.
Colophon
Title
TIMAF Information Management Best Practices – Volume 1
Editors
Bob Boiko – USA - Erik M. Hartman – NL
Copy-editors
Jonah Bull – USA – Jenny Collins – USA - Elishema Fishman – USA
Publisher
Erik Hartman Communicatie – NL
Edition
Volume 1 – 1st impression – November 2010
ISBN
978-94-90164-03-4
Design & Layout	
Nevel Karaali – NL
Print
Wöhrmann Print Service – NL
© 2010, TIMAF
All rights reserved. No part of this publication may be reproduced in any form by
print, photo print, microfilm or any other means without written permission by the
publisher. Although this publication has been composed with much care, neither
author, nor editor, nor publisher can accept any liability for damage caused by possible
errors and/or incompleteness in this publication.
TRADEMARK NOTICE
TIMAF® is a Registered Trade Mark and Registered Community Trade Mark of the
Office of Government Commerce, and is Registered in the U.S. Patent and Trademark Office.
Please contact the editors for ideas, suggestions and improvements at info@timaf.org.
Streamlining Your Path to Metadata, by Charlotte Robidoux and Stacey Swart
Develop a Metadata Strategy in Eight Steps
TIMAF Information Management Best Practices Vol. 1

Abstract

A Content Management System (CMS) allows a business to streamline its content development processes, using and reusing content from a single source for multiple purposes. Fully leveraging this capability requires the ability to access and manage your content, and managing your content efficiently necessitates a robust metadata strategy. However, developing a metadata strategy can be intimidating, onerous, and costly. The sheer amount of time needed–time to research, evaluate, synthesize, implement, and maintain a viable solution–can prompt even the most dedicated among us to abandon a strategic effort altogether. For this reason, it is essential to find a streamlined approach to metadata strategy development.

This case study explores how groups can streamline their metadata development without cutting corners and without undermining the purpose of having a CMS. Establishing a metadata strategy using a gradual approach makes the process more streamlined and manageable. The key to this solution is to define metadata components that are meaningful in your environment. After creating these components, you can determine the optimal configuration for your business and customize a taxonomy that makes your content easier to find. This means that a comprehensive metadata solution is both directly managed by users, who assign predefined values from controlled vocabularies, and system-driven. The solution also depends on input from all team members involved in content development, from content developers to editors and administrators. This article discusses the essential steps needed to streamline your metadata strategy.
Background

At times, research on metadata can make the concept seem more like a metaphysical journey than one related to any practical outcomes. Yet as long as there has been a need to categorize objects and the information describing them, metadata has been the essential means for managing information collections or repositories. In our modern age, the need to manage and access data on a large scale in a global economy is no less important. Metadata is central to modern authoring environments. For example, it is an integral part of automating technical documentation development; documentation which enables users to operate the complex technologies that help to drive business transactions. More generally, it is vital to administer metadata efficiently, as indicated by metadata expert Murtha Baca: “Institutions must streamline metadata production and replace manual methods of metadata creation with ‘industrial production whenever possible and appropriate’.” (1, page 72)

The Skills Needed to Perform this Best Practice

Successfully implementing this strategy requires one or more people in each of the following roles:

• Content Librarian: Oversees the quality of modular content in the CMS and assists writers with opportunities for reuse across the database.
• Editor: Manages edits at the sentence level and reviews content against style guidelines.
• Content Developer: Uses content mapping to define and create reusable content modules.
• Tools Administrator: Configures and manages the tool set.

Including these roles as a part of your strategy is key to your success. Without them, you will find holes in what could be a more streamlined approach. Step 7, Assign Metadata Tasks to Roles, describes this in detail.
Step 1: Define What Metadata Means to Your Organization and Why It is Important

If you have found that some simply stated definitions of metadata are hard to make use of, and other highly technical ones are hard to understand, you are not alone. Metadata is an intricate subject that has become increasingly technologized. While stripping the term to its bare essence—such as “data about data”—helps demystify it, such definitions leave us with few clues about how to move forward. Finding comparative definitions that make sense in your organization can serve as a useful starting point for understanding the concept: card catalogs in a library, directories in a grocery store or mall, playlists on an iPod, for example.

For our team, the most compelling comparison was right in front of us—metadata as an index in a book or a help system. The index comparison enabled our team to appreciate why metadata is important—it helps us organize and access content. We also related the concept of metadata to our own environment by reviewing established metadata standards to see if, and how, they would fit our needs. Standards or schemas are rules for uniformly managing information within and across repositories. They fall into various types:

• Structure;
• Value;
• Content;
• Format.
For example, the Dublin Core Metadata Element Set (DCMES) (2) is a general standard that provides guidelines for structuring information into categories. It was formulated to describe documents on the Web. DCMES defines 15 metadata elements, including:

• Title;
• Creator;
• Subject;
• Description;
• Date;
• Type.

This standard features a succinct set of elements or categories and has been endorsed in other standards like ISO Standard 15836-2009 (3) and ANSI/NISO Standard Z39.85-2007 (4). The general nature of DCMES elements makes them applicable to many organizations. However, if you want to streamline your path to metadata, avoid getting lost in the sea of standards available. Because “[t]here is no ‘one-size-fits-all’ metadata schema or controlled vocabulary or data content (cataloging) standard”, consider drawing on aspects of various standards that will fit your organization (1, page 72). No specific standards seemed to target computer documentation, but our team did consider standards related to structure in order to verify that we were targeting all of the key elements. We only evaluated other standards, including one from an HP marketing group, if they seemed pertinent to our environment. For example, we drew on a value standard, ANSI/NISO Z39.19-2005 (5), to find guidelines for developing controlled vocabularies, as discussed below.

The challenge in defining metadata was learning to appreciate the power inherent in distinguishing content from the descriptors used to access and manage that content effectively.

Step 2: Determine the Goals That Drive Your Metadata Strategy

Knowing what you want metadata to achieve is fundamental to developing a sound strategy. Once your team agrees on a definition of metadata, turn their attention to identifying the primary goals that will drive the strategy. Experts suggest “working backwards” from your goals to the metadata needed to reach your goal.
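The standards survey in Step 1 is easier to picture with a concrete record. The sketch below models a DCMES-style record as a plain mapping; the element names come from the Dublin Core subset listed in Step 1, while the sample values, function name, and validation behavior are our own illustrative assumptions, not part of any standard or tool.

```python
# A minimal DCMES-style record check. The element names come from the
# Dublin Core subset discussed above; everything else is illustrative.
DCMES_ELEMENTS = {"Title", "Creator", "Subject", "Description", "Date", "Type"}

def validate_record(record):
    """Return the DCMES elements (from our subset) missing from a record."""
    return sorted(DCMES_ELEMENTS - record.keys())

sample = {
    "Title": "Configuring the storage array",      # invented sample values
    "Creator": "jdoe",
    "Subject": "configuration",
    "Description": "Steps for initial array setup.",
    "Date": "2010-03-15",
    "Type": "Text",
}

print(validate_record(sample))          # [] -- nothing missing
print(validate_record({"Title": "x"}))  # every other element is missing
```

A check like this is the mechanical counterpart of asking whether a standard’s categories actually cover your content.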
“Deciding which aspects of metadata are essential for the desired goal and how granular each type of metadata needs to be” is essential to the process of formulating a strategy (6, page 193) (1, page 19).

We began by listing the various kinds of information that would be useful to us: tracking types of content and components, content status (new, a draft, approved), who originally created the content, who revised it and when, what content is reused, where the content is reused, workflow tasks, multimedia objects available, version details, profiled content, system performance, and reports related to these items. Next we compared this list with several types of metadata: descriptive, administrative, and structural. While experts refer to the number and names of these types differently, our team drew on the types identified by NISO (7). These types are described in Table 1.
When looking at these types of metadata, we saw that items on our list could be understood in terms of these categories. From this view, we began formulating and prioritizing our goals, short-term vs. long-term. Given our focus on gaining efficiency, we determined that being able to retrieve and reuse content was a paired goal. Another important goal was to minimize the risk of content being reused prematurely. Longer term goals included tracking the percentage of content we reuse, determining what reuse opportunities are still untapped, ensuring the quality of our deliverables, and identifying what content is being localized. Through this exercise, we could see that all these metadata could help us achieve our goals.

The focus on metadata types helped to streamline how we thought about our goals. Our next step was to understand what specific metadata components would help us attain our short-term goals.

Step 3: Identify the Metadata Components That Help You Obtain Your Goals

Selecting metadata components is extremely important in the process of establishing a metadata strategy. The ability to decide on the optimal number of metadata components is not easy. How do you pick just the right number, not too many or too few? Which ones will have the biggest impact and help to minimize risk? Here are some sample questions you should consider (5, page 193-194 and 196):

• What type of content is it?
• What else do you need to know about the content to ensure the correct piece of content is retrieved?
• In what form will users want to retrieve content?
• How will users specifically identify the desired content?

Table 1: Metadata Types

Descriptive
• Purpose: Identifies and describes collection resources.
• Relevance to our environment: Assists with queries and the ability to locate types of content that can be reused. This includes: content type and status; tracking types of content/components; profiled content; multimedia objects available.

Administrative
• Purpose: Used in managing and administering collections, versioning, and retention.
• Relevance to our environment: Enables creation and management of collections and configuration of tasks, permissions, status, and history: who created content and when; workflow tasks; version details; reuse statistics; system performance.

Structural
• Purpose: Delineates the organization and relationship of content units within a system.
• Relevance to our environment: Supports navigation and means of combining components into deliverables: what content is reused; where content is reused; where multimedia objects are referenced; reporting.
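In code terms, the sorting exercise behind Table 1 is just a lookup from each wish-list item to a NISO metadata type. The sketch below paraphrases a few items from our list; the dictionary and helper function are illustrative, not part of any CMS.

```python
# Mapping wish-list items to NISO metadata types, paraphrasing Table 1.
# Item wording and the helper name are ours, for illustration only.
METADATA_TYPE = {
    "content type and status": "descriptive",
    "profiled content": "descriptive",
    "multimedia objects available": "descriptive",
    "who created content and when": "administrative",
    "workflow tasks": "administrative",
    "version details": "administrative",
    "where content is reused": "structural",
    "reporting": "structural",
}

def items_of_type(metadata_type):
    """List the wish-list items that fall under one NISO type."""
    return sorted(k for k, v in METADATA_TYPE.items() if v == metadata_type)

print(items_of_type("administrative"))
# ['version details', 'who created content and when', 'workflow tasks']
```

Grouping the list this way makes it obvious which goals each metadata type serves before any tooling work begins.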
Research into structure standards showed our team that we should focus on components that describe the subject of our content (one of the Dublin Core elements). These components would be the basis of user queries. The best way to streamline this step is to look at your own content for the answers. Once again, the index serves as a valuable tool for understanding what terms might be queried, along with the table of contents, providing clues about the hierarchy of terms as they relate to the subject. Linking the index concept to metadata was useful in helping team members understand metadata hierarchies and how the components related to each other. Following Ann Rockley’s advice to select three to five components, we chose four that were subject related and two that were user related, as shown in Table 2.

Table 2: Metadata Elements and Attributes

ContentType
• Occurrence Rule: Exactly 1.
• Purpose: The largest “container” used to describe major topics that make up our documentation. “ContentType” describes the subject matter of the content.

Product
• Occurrence Rule: 1 or more.
• Purpose: A smaller “container” used to qualify how a topic applies to various products. “Product” designates the name of the product for which the content was written, including the model and/or version.

Keyword
• Occurrence Rule: At least 2.
• Purpose: The smallest category that further limits the relevance of a topic. “Keyword” helps to further narrow search results.

Abstract
• Occurrence Rule: At least 1.
• Purpose: Provides a synopsis of content that authors can use to determine if reuse is appropriate, describing the subject of the content, why it is relevant, and guidelines for using the content.

Originator
• Occurrence Rule: Exactly one.
• Purpose: Who originally created a reusable piece of content.

Reuser
• Occurrence Rule: Exactly one.
• Purpose: The author who is reusing a piece of content.
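Occurrence rules like those in Table 2 lend themselves to a mechanical check. The sketch below encodes them as minimum and maximum counts; the rule values follow Table 2, while the function and the sample module are hypothetical, not a real DTD or CMS API.

```python
# Occurrence rules from Table 2 as (minimum, maximum) counts per element;
# None means unbounded. Helper and sample data are illustrative only.
OCCURRENCE_RULES = {
    "ContentType": (1, 1),     # exactly 1
    "Product":     (1, None),  # 1 or more
    "Keyword":     (2, None),  # at least 2
    "Abstract":    (1, None),  # at least 1
    "Originator":  (1, 1),     # exactly one
    "Reuser":      (1, 1),     # exactly one
}

def check_occurrences(metadata):
    """Return elements whose value counts violate the occurrence rules."""
    errors = []
    for element, (lo, hi) in OCCURRENCE_RULES.items():
        n = len(metadata.get(element, []))
        if n < lo or (hi is not None and n > hi):
            errors.append(element)
    return sorted(errors)

module = {
    "ContentType": ["procedure"],
    "Product": ["Sample Product 1.0"],   # invented product name
    "Keyword": ["configuration"],        # violates "at least 2"
    "Abstract": ["How to configure the product."],
    "Originator": ["jdoe"],
    "Reuser": ["asmith"],
}
print(check_occurrences(module))  # ['Keyword']
```

A check of this shape is what a DTD or a CMS enforces automatically when the rules are configured into the tools, as discussed in Step 6.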
After choosing our components, we had to consider how to manage them in our CMS. We streamlined the process by drawing on options that our CMS already supported. Our CMS allowed searching on fields such as “Status,” “Create date,” “Edit date,” and “username,” but we needed to search on more specific subject-related content as well. Our DTD, which is a subset of DocBook, only contains “keywordset/keyword.” To fill the gaps, we developed custom elements and attributes, adding custom elements for “ContentType” and “Product,” and two attributes for “Originator” and “Reuser.” We chose elements when we might need to use multiple values, and attributes when we wanted to enforce only one value.

While it was clear that the goals of retrieval and reuse could be achieved by building related metadata into our content, we felt that the goal of minimizing the risk of premature reuse needed additional CMS support. To achieve this, we organized our content into two collections: “Working” and “Approved.” Working collections would contain work in progress; only the “originator” content developer could reference this content; reuse by others was not supported. In contrast, “Approved” collections would contain finalized content that had been reviewed by an editor as well as subject matter experts and could not be changed; any author could be a “Reuser” of the content contained here.

Separate collections ensure that original content will not change if reused. Instead, “Reusers” must copy content from an Approved collection to a Working collection to propose changes. After those changes are made, the author initiates a Change Proposal workflow, illustrated below, via the CMS. The workflow automatically notifies the assigned stakeholders that a change to content is being proposed.
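The collection scheme above boils down to a small reuse policy. The sketch below expresses it as a single check; the collection names come from our setup, while the function name and signature are illustrative, not a real CMS API.

```python
# Reuse policy behind the "Working"/"Approved" collections: any author may
# reference approved content; work in progress is visible only to its
# originator. Function name and signature are illustrative, not a CMS API.
def can_reference(collection, originator, requesting_user):
    if collection == "Approved":
        return True  # finalized, reviewed content: open to any Reuser
    if collection == "Working":
        return requesting_user == originator  # work in progress: originator only
    return False

print(can_reference("Approved", "jdoe", "asmith"))  # True: approved content
print(can_reference("Working", "jdoe", "asmith"))   # False: not yet approved
print(can_reference("Working", "jdoe", "jdoe"))     # True: originator only
```

Keeping the policy this simple is what makes it enforceable by the CMS rather than by convention.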
Some of the automation is possible because of the metadata attributes “Reuser” and “Originator.” The CMS is able to determine who initiated the change proposal and who the changes will affect. The workflow content librarian also employs automated email notifications and task assignments. Two options are possible: either the approved content is updated to reflect the changes agreed upon by the review team, or new content is added to the Approved collection because the original content is still needed as first written.

By organizing the CMS collections this way, and by creating a workflow that leverages user-related metadata, we effectively streamlined our use of metadata and found a way to leverage elements to minimize risk when reusing content. That is, a strategic approach to metadata from the outset triggers additional efficiencies; streamlining metadata cascades into workflow and CMS implementation.

Step 4: Identify Metadata Values

Without question, identifying metadata values to create a stable list of terms—a controlled vocabulary—is the most time consuming and contentious step of the process. Deliberating over synonyms and laboring over documents to test the appropriateness of the values seems endless. The best way to streamline this part of the process is to form a small workgroup of three or more members who can begin to evaluate document conventions and create lists of terms related to the components selected. (A workbook works well for managing the terms on separate spreadsheets.) As mentioned earlier, our team drew extensively from ANSI/NISO Z39.19-2005, Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies.

This standard helped the workgroup and users appreciate why a controlled vocabulary is so important, given that “[t]wo or more words or terms can be used to represent a single concept” and that “[t]wo or more words that have the same spelling can represent different concepts” (5). When creating the lists, the workgroup relied on the Standard’s recommendations for conducting “top-down” and “bottom-up” assessments, determining “the correct form of each term,” and for following key principles such as: “eliminating ambiguity,” “controlling synonyms,” “establishing relationships among terms where appropriate,” and “testing and validation of the terms” (5).

Once the lists were created, workgroup members began vetting these lists with seasoned authors, many of whom were not co-located. The ability to engage teams across the organization when the workgroup had little authority was especially challenging. We relied on many virtual collaboration techniques to streamline our efforts so that we could complete the work. Do not overlook the importance of showing the value of metadata to the users—they need to understand and believe in the purpose of their work, and realize that metadata:

• Enhances query capabilities in the CMS by enabling “effective retrieval” (6, page 18).
• Allows users to locate their own content, as well as other content that they could reuse or leverage.
• Reduces “redundant content” (6, page 185), making content developers more productive.
• Reduces costs (Management may care more about this, but in today’s work environment, a content developer who is saving the company money is a content developer worth keeping.).

Additionally, employing a controlled vocabulary saves the content developers time by increasing the amount of content that can be successfully retrieved.

The ANSI/NISO Z39.19-2005 standard provided essential principles for maintaining a controlled vocabulary, especially how best to manage additions and modifications as well as a history of changes (5, page 97). The change history was especially critical when updating the values in our tools. These processes are contained within a single resource that we refer to as metadata guidelines.
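Synonym control, one of the Z39.19 principles the workgroup followed, amounts to resolving variant terms to a single preferred term from the controlled vocabulary. A minimal sketch, with invented terms (the mapping and function name are ours, not from the standard):

```python
# Synonym control in the spirit of ANSI/NISO Z39.19: variant terms resolve
# to one preferred term. The vocabulary entries here are invented examples.
PREFERRED_TERM = {
    "america": "United States",
    "u.s.a.": "United States",
    "united states": "United States",
    "hard disk": "disk drive",
    "hdd": "disk drive",
}

def normalize(term):
    """Resolve a search or tagging term to its controlled-vocabulary form."""
    return PREFERRED_TERM.get(term.strip().lower(), term)

print(normalize("U.S.A."))   # United States
print(normalize("HDD"))      # disk drive
print(normalize("zoning"))   # zoning (not in the vocabulary; passed through)
```

In practice the same lookup serves two audiences: content developers tagging modules and authors querying the CMS both land on the same preferred term.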
Documenting the metadata process is a must. Bob Boiko discusses the idea of a metatorial guide containing “a set of rigorous metadata guidelines,” similar to an editorial guide (8, page 495). Boiko goes on to say that the metadata process must ensure (8, page 508):

• Metadata completeness and consistency.
• Content manageability, accessibility, and targetability (intended for the appropriate audience).

A thoroughly documented set of rules and procedures helps take the guesswork out of metadata application. As Boiko explains, “in a sizable system with numerous contributors you can almost guarantee that you will find wide variation in the ways that people interpret and apply even the most precisely stated tagging rules” (8, page 509). Providing a link from the tool’s support menu to the metatorial guide puts the information at the content developers’ fingertips, giving users easy access to the metadata processes and guidelines. As previously discussed, proper application of metadata is critical to ensure quality search results. Making the guidelines as accessible as possible will help ensure that they are followed.

Once guidelines are documented, you need to determine what type of user will apply the metadata. Should content developers add all user-driven metadata, or should a content librarian assist them? What are the roles regarding metadata application? Boiko contends that “a different set of skills is necessary to deal with this metadata” (8, page 495). Some users can be trained to apply metadata. However, as he goes on to say, users “rarely have the wherewithal to discover what content others are submitting and exactly how to relate that material to what they submit.” Someone on the team with an eye for detail like a content librarian is more appropriate for this role.

While content developers understand their content and usage better than anyone else, as noted by Peter Emonds-Banfield, they might not have the “expertise necessary for metadata creation, nor the time to keep up with it”; whereas “... metators (= editors that manage metadata) can play a role by educating content teams on metadata creation” (8, page 509). As previously discussed, some tools can be configured to enforce certain rules; however, some standards require the human eye. In those cases, the content librarian can audit metadata application before content is approved, ensuring the metadata values chosen by the content developer meet quality standards. You can liken the role of a content librarian to that of an editor. Instead of reviewing content against structure and style rules, the content librarian reviews metadata against metatorial guidelines, ensuring that metadata application is consistent throughout all content in the CMS. Boiko refers to this as “a central point of decision” (8, page 511). The more complex the metadata and content, and the more users who access it, the more critical such a point of decision becomes.

On the other hand, is having the content librarian audit metadata application by content developers sufficient, or should the content librarian apply all metadata to content, completely releasing the content developer from such a burden? According to Boiko, “the task of looking after the metadata health of a CMS is crucial if you want to be confident that the content in the system will show up when you expect it to” (8, page 511). This is the point: content must be retrievable so that you can then reuse it.
If you want to be completely sure that metadata is applied consistently across all content, regardless of who originated it, then having a content librarian perform this task is as close to a guarantee as you might get. However, some organizations do not have the resources to staff a content librarian. In that case, an editor might take this on as a new role. If resource constraints are an issue, some organizations must rely on content developers to apply user-driven metadata. In this case, the metatorial guide is what you are betting on, and it must be rock solid.

In our case, we rely on content developers to apply user-driven metadata. The editors are charged with reviewing metadata as they would any other content. The content librarian is consulted when questions arise, and also audits content in the CMS for consistency. Ultimately, the content librarian is the point of decision and is responsible for educating others and maintaining the metatorial guide. We have also staffed a trainer who works with the content librarian to develop metadata training for all (content developers and editors). The primary reason we have this model is to share the workload; we do not have the resources to assign such a role in a full-time capacity. Regardless of who is doing it, applying “metadata well requires a lot of human energy” (8, page 495).

Step 5: Determine What Metadata Components Can Be Automated

Determining which metadata components, if any, can be automated, is important at this stage in developing a metadata strategy. Some components need the human touch for quality purposes, or because tools such as the CMS are not able to automate the application of such metadata. However, when possible, utilize automation. The options for this will vary depending on the tool. In our case, we looked to the CMS for automating the application of metadata.

Why automate? Automating metadata application lessens the burden on the content developers and helps avoid inconsistency.
In addition, if it is “up to the author to remember to add the metadata in all the relevant places”, it is a “recipe for missed metadata” (6, page 200). As Boiko writes, “without a rigorous consistency and careful attention, metadata becomes useless” (8, page 495). He goes on to say that “someone or, better, some system is necessary to ensure that people handle metadata thoroughly and consistently” (8, page 495). So if the CMS can handle it, automate it!

What metadata makes a good candidate for automation? From our experience, metadata with a yes or no value should be automated if the question can be answered by data that is accessible to the CMS. For example, to answer the question “Is the content being reused?”, populate the reuse attribute with either “yes” or “no.” In our case, if content lives within a specific CMS collection, then it is reused. Otherwise, it is not. Our CMS is smart enough to answer this question based on the location of the content in a certain collection, so we let it answer that question for us.

Metadata containing a value that is definite should also be automated. For example, the originator attribute can be populated with the username of the person who created the content because the CMS knows who that person is. Likewise, the CMS knows who is reusing content because it can follow the reference to the content back to the username who created the reference. As a result, we let the CMS capture the username for us by adding it to the reuser attribute.

On the other hand, what metadata should not be automated? Metadata requiring a discerning human eye should not be automated. For example, a person is needed to determine the subject of the content. One could argue that if the content contains a title, the subject could be leveraged from the title. However, not all content chunks include a title. As a result, we do not automate the ContentType metadata element. A gray area might be keywords.
In our case, we depend on a person to assign keywords. This person is typically the content developer, with some assistance from the content librarian if required. As content grows, new keywords might be necessary. If they are not part of the controlled vocabulary, the content librarian can make note of that and modify the list as needed. From our experience, controlled vocabularies are certainly living lists, as previously discussed. Table 3 shows our system-driven metadata, including metadata used to manage the status of content (whether or not it can be reused).

Be sure to also consider the risks of automation. Boiko states that “the problem isn’t to find the blanks, but to correctly fill them” (8, page 509). The key word here is “correctly”. Similarly, Rockley explains that “[i]mproperly identified metadata ... can cause problems ranging from misfiled and therefore inaccessible content to even more serious problems ...” (6, page 185). In our case, inaccessible content would be a deal breaker since our primary goal is retrieval for reuse. It is critical that metadata applied automatically by the CMS is done with the highest quality standards. There can be no room for incorrectly applied metadata or for the possibility of inaccessible content. Consequently, if you rely on the CMS to automate the application of metadata, make sure it is fool-proof (tool-proof).

Step 6: Ensure That Users Will Apply the Metadata

Once you have determined which metadata components can be automated, the remaining components will be user-driven. The next step is to ensure that users will apply it. As Rockley notes, metadata is “only valuable if it gets used” (6, page 200).
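The system-driven assignments described in Step 5, deriving "reuse" from where content lives and stamping usernames automatically, can be sketched as follows. The collection-based rule comes from our setup; the function name, signature, and collection set are illustrative, not a real CMS API.

```python
# System-driven metadata, sketched: the CMS derives "reuse" from the
# content's collection and records usernames itself. Names illustrative.
APPROVED_COLLECTIONS = {"Approved"}

def system_metadata(collection, creating_user, referencing_user=None):
    """Populate the attributes the CMS can fill without human input."""
    meta = {
        # yes/no value answerable from data the CMS already has:
        "reuse": "yes" if collection in APPROVED_COLLECTIONS else "no",
        # definite value the CMS knows: who created the content
        "originator": creating_user,
    }
    if referencing_user is not None:
        # the CMS follows the reference back to the user who created it
        meta["reuser"] = referencing_user
    return meta

print(system_metadata("Approved", "jdoe", "asmith"))
print(system_metadata("Working", "jdoe"))
```

Note what is absent: nothing subject-related (ContentType, keywords) is filled in here, because those values need the discerning human eye discussed above.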
Table 3: System-Driven Attributes

Status
• Values: Working, Approved.
• Goal: External use in the authoring environment. Used upon extract to work with style sheets to lock content from changes if approved.

Collection
• Values: Section, Chapter, Glossentry.
• Goal: Used to properly reload content to the correct collection.

Reuse
• Values: Yes, No.
• Goal: If yes, content is from an approved collection. Used to color-code approved content so that reviewers and editors know it has already been approved.

Originator
• Value: Username.
• Goal: Who created the content; used by CMS workflow.

Reuser
• Value: Username.
• Goal: Who is reusing the content; used by CMS workflow.

One method to ensure users apply metadata is to configure your tools with metadata requirements. The DTD behind an authoring tool can utilize occurrence rules to require specific metadata components to be added (such as at least two keywords must be present). A CMS can be configured to enforce the same rules. In our case, we have rules established in both tools. Regardless of which tool the metadata is applied in, the user must meet certain requirements. The tools alert the user when those requirements are not met.

We have found that the CMS provides greater specificity than our authoring tool in such requirements. While the DTD behind the authoring tool can require that metadata components be present, it cannot enforce that values be added to those components. For example, in the authoring environment, a user could add two keyword elements, but leave them empty with no values assigned. Technically, they would meet the DTD rules. The CMS provides the additional reinforcement. In our case, content shows as incomplete unless the metadata components are present and they contain values. For example, two keywords are present and the values are “this” and “that”.
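The two layers of enforcement just described, presence (which a DTD can require) versus presence plus non-empty values (which the CMS checks), can be sketched as a single completeness test. The required counts for the subject-related elements follow Table 2; the helper and sample data are hypothetical.

```python
# The DTD can require that elements be present; the CMS additionally
# requires that they carry values. A sketch of that stricter check.
# Required minimum counts follow Table 2; everything else is illustrative.
REQUIRED = {"ContentType": 1, "Product": 1, "Keyword": 2}

def is_complete(metadata):
    """True only if required elements exist AND every value is non-empty."""
    for element, minimum in REQUIRED.items():
        values = metadata.get(element, [])
        if len(values) < minimum or any(not v.strip() for v in values):
            return False
    return True

print(is_complete({"ContentType": ["task"], "Product": ["Sample Product"],
                   "Keyword": ["this", "that"]}))   # True: present with values
print(is_complete({"ContentType": ["task"], "Product": ["Sample Product"],
                   "Keyword": ["", ""]}))           # False: empty keyword values
```

The second call is exactly the case the DTD alone would pass: two keyword elements exist, but they carry no values, so the CMS flags the content as incomplete.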
Content within the CMS shows as incomplete unless all metadata requirements are met, and because all content is ultimately managed in the CMS, it becomes the final checkpoint.

To fully utilize the benefits of metadata, however, users must do more than just apply metadata to their content. They must apply the appropriate metadata to their content. A well designed metadata strategy ensures that the metadata components and values are tailored to the needs of the user; metadata guidelines assist them with the tasks they need to accomplish and include terms they will use when retrieving content. But as previously discussed, users do not all think the same way. This is where having a controlled vocabulary is a must. Even though one user might be inclined to search on “America” and another might search on “U.S.A.”, they will both work off of the same list of terms, which in this example could include “United States”. Such search standards can be taught, and will ensure effective search results, rather than wasting the user’s valuable time.

There are other ways to assist users in metadata application. One is to provide templates that are pre-populated with the required metadata components. We have done this in our authoring toolset; the content developer only needs to assign values to the components. Another method that both assists users and provides a level of control that can be a partner to occurrence rules is described by Boiko as “categorizing metadata by its allowed values” (8, page 506). For example, we use a “closed list,” which allows users to select a value from a predefined set of terms, or a controlled vocabulary (8, page 507). In our case, the controlled vocabulary is built into the authoring tool and the CMS. The user cannot type in metadata values; the only option is to select them from a list.

Step 7: Assign Metadata Tasks to Roles

To ensure your metadata goals become part of your business processes and tool environment, assign roles to team members who can implement the metadata strategy. These assignments streamline the implementation effort. Table 4 describes each of these roles.

Metadata is dependent on many contributors. While tool administrators can ensure that system-driven metadata and automation are set up behind the scenes, they are not the sole contributors. The realization of your strategy becomes much easier with all team members involved.
Table 4: Roles and Responsibilities

Content Librarian
Responsible for the quality of modular content in the CMS as well as for flagging opportunities for reuse across the database. The content librarian's tasks include:
* Assisting content developers, when needed, in understanding the metadata guidelines.
* Maintaining the metadata guidelines document and the master list of values.
* Reviewing and accepting or rejecting new metadata value requests.
* Notifying tool developers when new metadata values need to be added to the tool set.
* Auditing the quality of the metadata values that content developers apply before the content can be made available for reuse.
* Overseeing content change proposals for reused modules and validating the requests for changes.
* Facilitating the review process to ensure all Reusers participate by either accepting or rejecting the proposed changes.
* Implementing the final result by either overwriting the original "approved" content in the CMS or by creating a variant of the original "approved" content.
* Populating the CMS with common queries to assist content developers with locating content to be reused or leveraged.
* Assisting content developers when more specific search criteria are needed for database queries to locate content to be reused or leveraged.

Editor
Manages edits at the sentence level and reviews content against style guidelines. The editor's responsibilities include:
* Reviewing metadata values as part of the literary edit to ensure consistent usage.
* Maintaining an eye toward content that can be leveraged or reused when a content developer opts to create new content.

Originator (Content Developer)
Identifies the need for and creates reusable modules of content using content mapping. The originator's responsibilities include:
* Identifying unique, similar, and identical content across the deliverables set.
* Capturing metadata values for identical content.
* Analyzing similar content for opportunities to make it identical.
* Creating reusable topics of information.
* Requesting new metadata values as needed via the CMS workflow.

Reuser (Content Developer)
A content developer who uses metadata to query the CMS for reusable content. The reuser's responsibilities include:
* Reusing "approved" content by referencing it in deliverables.
* Initiating the change proposal workflow as needed to request changes to "approved" content.
* Reviewing change proposals from other reusers.
Tool Administrators
Responsible for configuring and managing the tool set. Examples of tool administrators include the DTD developer, the authoring tool developer, the CMS administrator, and the publishing tool developer. The tool administrators' shared responsibilities include:
* Addressing requirements versus options.
* Automating processes where possible.
* Ensuring that the tools support the reuse and metadata strategy.

DTD Developer
The DTD developer's responsibilities include:
* Managing DTD elements, attributes, and occurrence rules.
* Communicating with the CMS administrator when DTD changes are needed.

Authoring Tool Developer
The authoring tool developer's responsibilities include:
* Making templates available for new content creation.
* Automating the addition of required child elements when a parent element is selected.
* Maintaining pre-populated menus with required user-driven metadata values.
* Providing links to support documentation from the authoring tool menu.

CMS Administrator
The CMS administrator's responsibilities include:
* Managing content collections for editing, loading, and extracting behaviors, including extracts directly to the publishing tool.
* Maintaining user roles and privileges.
* Configuring the CMS to ensure alignment with DTD rules.
* Setting up CMS-specific elements and attributes as needed.
* Tightening structure rules by requiring text to be present and/or valid values to be used for applicable elements and attributes.
* Making components, properties, and operators available to help ensure effective query options.
* Maintaining pre-populated menus with required user-driven metadata values.
* Implementing visual aids to assist users when viewing content in the CMS.
* Automating the capture of system-driven attributes.
* Creating workflow configurations to support CMS-assisted procedures.

Publishing Tool Developer
The publishing tool developer's responsibilities include:
* Creating and maintaining style sheets for use in the authoring and publishing tools.
* Developing authoring tool scripts to provide visual cues for reused content to content developers, editors, and subject matter experts.
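Many of the tool administrators' responsibilities amount to automated checks that the tools enforce before content is accepted. The following Python sketch illustrates the general idea of requiring metadata attributes and validating their values against a closed list. The element name, attribute names, and vocabulary values are invented for illustration only; they are not our actual DTD or CMS configuration.

```python
# A minimal sketch of an automated metadata completeness check.
# All names and values below are hypothetical examples.
import xml.etree.ElementTree as ET

# Closed lists: the only values a content developer may select.
CONTROLLED_VOCABULARY = {
    "ContentType": {"Installing", "Configuring", "Troubleshooting"},
    "Product": {"NAS", "SAN", "Tape"},
}
REQUIRED_ATTRIBUTES = ("ContentType", "Product")

def validate_module(xml_text):
    """Return a list of problems; an empty list means the module is complete."""
    problems = []
    root = ET.fromstring(xml_text)
    for attr in REQUIRED_ATTRIBUTES:
        value = root.get(attr)
        if value is None:
            # The CMS would flag this module as incomplete.
            problems.append(f"missing required attribute: {attr}")
        elif value not in CONTROLLED_VOCABULARY[attr]:
            # The value is not on the closed list of valid terms.
            problems.append(f"invalid value for {attr}: {value!r}")
    return problems

# A module with a miscased value is flagged rather than silently accepted:
module = '<section ContentType="Installing" Product="Nas">...</section>'
print(validate_module(module))
```

Because the authoring tool only offers values from a drop-down list, content developers never hit the "invalid value" branch in practice; the check exists to catch content loaded from outside the toolset.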
It is critical that the team members have a clear definition of their roles and understand the importance of each role in contributing to the overall success of the strategy.

Step 8: Prove That Your Strategy is Sound

Readiness to release metadata to production can take months or years, depending on the complexity of your strategy. Because our organization did not always have dedicated resources to devote to this implementation, tracking our progress via a schedule was absolutely essential. As priorities shifted in our organization, deliverables were either pulled in or pushed out as needed. Shifting priorities and balancing resources may ultimately determine the time needed to develop and implement your metadata strategy. While it took our team a number of years, we understood the return on investment. Had we given in to the pressure to release any sooner, we would have had a less effective, less efficient, and less robust metadata strategy. And because metadata is truly the backbone of our reuse strategy, skimping was not an option.

It is also necessary to come to an agreement with management on what qualifies the strategy as ready for release. For example, we negotiated a full quarter of simulation testing and agreed that we would only release to production if simulation testing resulted in zero process or tool issues. Test scenarios should be as realistic as possible. In our case, we used actual customer content, assigned roles, and created scenarios to put our business and tool processes to the test. The testers received new test scenarios each week so that they couldn't see what was coming next. Our support staff, including the editor and the content librarian, were also given test scenarios. In some cases, we set up intentional conflicts to ensure users knew how to handle them.
Before you can release your metadata strategy to production, you must ensure that:

• Your tools are functioning as expected to support the strategy.
• Roles and expectations are clear.
• The metadata guide is available.
• Training has occurred, including making all users aware of the importance of metadata (resulting in a willingness to use it).
• No gaps have been identified.
• There are no technical issues with any of the processes supported by the tools.

After all of the "human energy" (8, page 495) spent on creating your metadata strategy, don't shortchange yourself by rushing through the testing process. When you do release your metadata strategy, you want to know it is rock solid.

Metadata in Action

As previously discussed, our goal is to enable effective reuse by making content easy to find. Because the originating content developer (Originator) added metadata for reuse, other content developers (Reusers) can query on those values. Figure 1 shows some of an Originator's content.
In some cases, a content developer might know the content exists and is already familiar with it. In that case, she would know the metadata values that are likely to be associated with the content. In other cases, the content developer has a need for content but is not sure it exists. Rather than creating it from scratch, she searches the CMS to see if content exists that she can reuse or at least leverage.

For example, a user needs content specific to installing NAS products onto servers. Because our CMS is configured with drop-down lists of valid values (controlled vocabulary metadata values), the user selects the appropriate metadata elements and values from the list. In this example, the content developer would query on:

• ContentType = Installing;
• Product = NAS;
• Keyword = servers.

The content developer can search on one or more metadata elements, as shown in Figure 2. Combining multiple metadata elements provides narrower results. In the preceding example, over 1,200 section modules were queried, resulting in one section that met the content developer's query (shown in Figure 3). At this point, the content developer can review the content in more detail and decide if she can use it as is, or leverage it.

Figure 1: Originator Content
Figure 2: Searching Using Metadata
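The narrowing effect of combining metadata elements can be sketched in a few lines of Python. The module records and the matching logic below are hypothetical stand-ins for the CMS query engine, reusing the ContentType/Product/Keyword values from the example above; real CMS queries run against the repository, not an in-memory list.

```python
# A hypothetical sketch of AND-style metadata queries narrowing results.
# Module records and field names are invented for illustration.
modules = [
    {"id": "S-001", "ContentType": "Installing",  "Product": "NAS", "Keyword": ["servers"]},
    {"id": "S-002", "ContentType": "Installing",  "Product": "SAN", "Keyword": ["switches"]},
    {"id": "S-003", "ContentType": "Configuring", "Product": "NAS", "Keyword": ["servers"]},
]

def query(modules, **criteria):
    """Return IDs of modules matching every criterion; more criteria, fewer hits."""
    results = []
    for m in modules:
        if all(
            # Keyword is multi-valued, so membership is checked; other
            # fields are single-valued and compared for equality.
            value in m[field] if isinstance(m[field], list) else m[field] == value
            for field, value in criteria.items()
        ):
            results.append(m["id"])
    return results

print(query(modules, ContentType="Installing"))
print(query(modules, ContentType="Installing", Product="NAS", Keyword="servers"))
```

Querying on ContentType alone returns two modules; adding Product and Keyword narrows the result to the single NAS installation section, mirroring the 1,200-modules-to-one example above.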
It is easy to see that without the metadata to support the query, the content developer would likely never have located the content she needed. She probably would have just created a new section, duplicating existing data. She would have spent time doing this, taking away from her other work. In addition, the CMS would have become populated with redundant content.

Even when a content developer locates content that already exists, it might not fully meet her needs. In that case, she can propose changes to the content. We use the Change Proposal Workflow feature in our CMS to manage this process. The workflow has the following steps, shown in Figure 4:

• Proposal: The content developer copies content to a working collection, makes changes as needed, and initiates the workflow.
• Review: The content librarian validates the request. System-generated email notifications are sent to all Reusers. The content librarian facilitates an offline review and mediates any counter-proposals.
• Outcome: If all Reusers accept the change proposal, the CMS automatically overwrites the original content in the approved collection with the changed content. A system-generated email notification is sent to all content developers, letting them know that the workflow has been completed.
• Relink: If only some Reusers accepted the change proposal, the content librarian assigns a unique ID to the content, and the CMS automatically moves the variant to the approved collection. A system-generated email notification is sent to all content developers, reminding them to relink to the variant as needed.

Figure 3: Reused Content
Figure 4: Change Proposal Workflow
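The outcome rule of the workflow above can be summarized in a short Python sketch: unanimous acceptance overwrites the approved original, while partial acceptance produces a variant under a new ID. The function name, the review structure, the "-v2" suffix, and the all-reject branch are illustrative assumptions; in our CMS the content librarian assigns the actual unique ID, and notifications are handled by the workflow engine.

```python
# A simplified, hypothetical model of the change-proposal outcome rule.
def resolve_proposal(original_id, reviews):
    """reviews maps each Reuser's name to True (accept) or False (reject)."""
    if all(reviews.values()):
        # Outcome step: every Reuser accepted, so the CMS overwrites
        # the original content in the approved collection.
        return {"action": "overwrite", "id": original_id}
    if any(reviews.values()):
        # Relink step: partial acceptance, so the change becomes a
        # variant under a unique ID (suffix invented for illustration).
        return {"action": "variant", "id": original_id + "-v2"}
    # Assumed behavior when nobody accepts: the approved content stands.
    return {"action": "reject", "id": original_id}

print(resolve_proposal("S-001", {"alice": True, "bob": True}))
print(resolve_proposal("S-001", {"alice": True, "bob": False}))
```

The point of the split between "overwrite" and "variant" is that a Reuser who rejected a change is never forced onto the new wording; her deliverables keep referencing the original while accepting Reusers relink to the variant.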
If, on the other hand, query results show that new content needs to be created, the content developer can do so. Adding the metadata elements to the new content will help ensure that other content developers can locate the content for future use.

Summary and Conclusion

Proving the soundness of metadata in our case entailed extensive collaboration and testing among team members. Key areas of focus included:

• Checking and rechecking that our metadata values were entered into the tools correctly.
• Ensuring high usability in the tools and in written processes so that team members could add metadata easily.
• Configuring our tools to easily locate metadata and to indicate if values and elements were missing.
• Proving the concept that our metadata would enable us to locate content effectively for the purpose of reuse.

The ultimate test of success is verifying that implementing metadata allows your organization to achieve the goals you identified at the outset.

There is little guidance available on how to develop a metadata strategy. While some industries have developed specifications tailored to their content, others seem to be starting at square one. Technical communication, as it relates to the computer industry, could benefit from more substantial models to follow. Having a specification as a starting point would help companies get started with their metadata strategy by providing a list of components, possible values, and the pros and cons of building upon this as a foundation.

References

1. Baca, Murtha, ed. Introduction to Metadata, 2nd Edition. Los Angeles, CA: Getty Publications, 2008.
2. Dublin Core Metadata Initiative. "Dublin Core Metadata Element Set, Version 1.1." 2008. Retrieved on 20 Mar. 2009. http://dublincore.org/documents/dces/.
3. International Organization for Standardization. "ISO 15836:2009, Information and documentation - The Dublin Core metadata element set." 2009.
Retrieved on 21 Feb. 2009. http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=52142.
4. American National Standards Institute. "The Dublin Core Metadata Element Set." ANSI/NISO Z39.85. Bethesda, MD: NISO Press, 2007. Retrieved on 8 Jun. 2008. http://www.niso.org/kst/reports/standards/kfile_download?id%3Austring%3Aiso-8859-1=Z39-85-2007.pdf&pt=RkGKiXzW643YeUaYUqZ1BFwDhIG4-24RJbcZB-Wg8uE4vWdpZsJDs4RjLz0t90_d5_ymGsj_IKVa86hjP37r_hFEijh12LhLqJw52B-5udAaMy22WJJl0y5GhhtjwcI3V.
5. American National Standards Institute. "Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies." ANSI/NISO Z39.19-2005. Bethesda, MD: NISO Press, 2005. Retrieved on 9 Jun. 2008. http://www.niso.org/kst/reports/standards?step=2&gid=None&project_key=7cc9b583cb5a62e8c15d3099e0bb46bbae9cf38a.
6. Rockley, Ann. Managing Enterprise Content: A Unified Content Strategy. Indianapolis, IN: New Riders Publishing, 2002.
7. National Information Standards Organization. "Understanding Metadata." Bethesda, MD: NISO Press. Retrieved on 21 Feb. 2009. http://www.niso.org/publications/press/UnderstandingMetadata.pdf.
8. Boiko, Bob. The Content Management Bible, 2nd Edition. Indianapolis, IN: Wiley Publishers, 2005.
About the TIMAF Library

This 'Information Management Best Practices' book is a publication in the TIMAF Library. The publications in the TIMAF Library are aimed at promoting information management and are published on behalf of TIMAF. TIMAF, the Information Management Foundation, is an initiative of information management practitioners to provide a strong and clear foundation of information management. TIMAF encourages authors from around the world, who are experts in their particular information management sub-discipline, to contribute to the development of the TIMAF publications. Are you interested in sharing your ideas and experiences online with the TIMAF community? Visit www.timaf.org and join the discourse. Have you experienced the merits of a specific approach to information management strategies, methodologies, technologies or tools? Submit a proposal according to the requirements listed in the 'Call for Best Practices' at www.timaf.org.

The following publications are available in the TIMAF Library:

Introduction books
Information Management Framework (paper edition, release: September 2011)

Best Practices books
Information Management Best Practices 2009 Sneak Preview (online edition, www.timaf.org)
Information Management Best Practices – Volume 1 (paper edition, ISBN 978-94-90164-03-4)

Pocket Guides
Information Management Framework – A Pocket Guide (paper edition, release: November 2011)

Social Networks
Information Management Framework Wiki (www.timaf.org/wiki)

We will publish new books and other instruments on a regular basis. For further enquiries about the Information Management Library, please visit www.timaf.org or send an email to info@timaf.org.
Introduction

Information? Manage!

Information is the term we use to stand for all forms of preserved communication that organizations care to produce, store and distribute. If we communicate it and record it, it is information. So, for us, information is anything from sales figures in a database to a video on philosophy viewed on a mobile phone. We define information management as the organized collection, storage and use of information for the benefit of an enterprise.

Our definitions are intentionally wide enough to cover content, document, asset, data, records and all other 'information managements' that organizations do. We believe that while each of these "sub-disciplines" has its own tools and types of information, there is much more that unites them than divides them. Our definitions are intentionally quite practical. For us, information management simply means moving pieces of recorded communication from creation to consumption to retirement. Our definitions are crafted to carve out a niche for the information manager. Information managers make sure that recorded communication can be amassed and distributed in a way that benefits their organization. Finally, our definitions are crafted to be a simple guiding principle. Any person working on any information project can use this definition to remain focused on the ultimate aim of their particular kind of work.

Information Management? TIMAF!

The field of information management is currently fractured and incoherent. Each sub-discipline (content, document, asset, data, and records management, to name just a few) has its own practitioners, applications and professional communities. We believe that behind the seeming differences between these 'managements' there is a deeper unity that will eventually define a strong and clear foundation for all of them.
We do not believe that all managements will or should merge, but rather that, just as business underlies a variety of business practices including accounting and finance, there is a common foundation for the various forms of information management. The Information Management Foundation (TIMAF) tries to provide this foundation by publishing these information management best practices. In addition, TIMAF develops and maintains an information management framework that brings the commonalities between sub-disciplines to light and helps to organize the best practices that we publish.
Best Start? Best Practice!

Just as business is practiced within a more specific context, information management is also practiced in context. Thus, we believe that the best way to illustrate the concepts and practices of information management is within the context of one or more sub-disciplines. So, this best practices book tries to show global principles of information management in the context of projects in one or more of the sub-disciplines.

This is the first volume of 'Information Management Best Practices.' In future publications we will provide an ongoing compilation of high-quality best practice guidance, written for and by experts in the information management field from around the world. These best practices are designed to help professionals overcome their information management challenges. They bring complex models down to earth, with practical guidance on tough problems. In this volume, practitioners describe nineteen projects that you can learn from. In return, we ask that you let us learn from you! Please let us know what your experiences are with these or other projects at www.timaf.org.
Colophon

Title: TIMAF Information Management Best Practices – Volume 1
Editors: Bob Boiko (USA), Erik M. Hartman (NL)
Copy-editors: Jonah Bull (USA), Jenny Collins (USA), Elishema Fishman (USA)
Publisher: Erik Hartman Communicatie (NL)
Edition: Volume 1 – 1st impression – November 2010
ISBN: 978-94-90164-03-4
Design & Layout: Nevel Karaali (NL)
Print: Wöhrmann Print Service (NL)

© 2010, TIMAF. All rights reserved. No part of this publication may be reproduced in any form by print, photo print, microfilm or any other means without written permission from the publisher. Although this publication has been composed with much care, neither author, nor editor, nor publisher can accept any liability for damage caused by possible errors and/or incompleteness in this publication.

TRADEMARK NOTICE
TIMAF® is a Registered Trade Mark and Registered Community Trade Mark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office.

Please contact the editors for ideas, suggestions and improvements at info@timaf.org.