Best Practices
Information Management
Best Practices - Volume 1
“Regardless of the kind of information you need to manage, this book
will make your projects better.”
Bob Boiko, University of Washington
“The TIMAF organization proves its dedication to raising the bar for
information management practitioners everywhere. It has assembled the
best thought leaders in the field to share insights from case studies which
are both actionable and universally relevant. Even the most experienced
IM professionals will learn something new with each turn of the page.”
Scott Liewehr, President CM Professionals
“It is, quite frankly, the best collection of case studies and solutions I’ve
run across and will be an invaluable resource for our readers. The reports
are solid, practical and real-world examples of how to do things right.”
Hugh McKellar, Editor in Chief KM World Magazine
Charlotte Robidoux, Stacey Swart
Charlotte Robidoux (charlotte.robidoux@hp.com) is a Content Strategy Manager at
Hewlett-Packard Company (HP) and has over 17 years of experience in technical communication.
At HP, she oversees the single sourcing strategy and implementation for the StorageWorks
Division. Charlotte holds a Ph.D. from the Catholic University of America in rhetoric and
technical communication. She is the author of ‘Rhetorically Structured Content: Developing a
Collaborative Single-Sourcing Curriculum’ published in Technical Communication Quarterly. She
is co-editor of ‘Collaborative Writing in Virtual Workplaces: Computer-Mediated Communication
Technologies and Tools’.
Stacey Swart (stacey.swart@hp.com) is the Content Management System Administrator and
Strategist at StorageWorks Division of Hewlett-Packard Company (HP). She has over 16 years
in the tech industry in areas ranging from technical support to technical communication, and is
certified by HP as a Lean Sigma Green Belt. Stacey holds a B.S. from the University of Kansas in
Education and English.
Develop a Metadata Strategy in Eight Steps
Streamlining Your Path to Metadata
Charlotte Robidoux
Stacey Swart
TIMAF Information Management Best Practices Vol. 1
Abstract
A Content Management System (CMS) allows a business to
streamline its content development processes, using and
reusing content from a single source for multiple purposes.
Fully leveraging this capability requires the ability to access
and manage your content, and managing your content
efficiently necessitates a robust metadata strategy. However,
developing a metadata strategy can be intimidating, onerous,
and costly. The sheer amount of time needed–time to research,
evaluate, synthesize, implement, and maintain a viable
solution–can prompt even the most dedicated among us to
abandon a strategic effort altogether. For this reason, it is
essential to find a streamlined approach to metadata strategy
development. This case study explores how groups can
streamline their metadata development without cutting corners and
without undermining the purpose of having a CMS.
Establishing a metadata strategy using a gradual approach
makes the process more streamlined and manageable. The
key to this solution is to define metadata components that are
meaningful in your environment. After creating these compo-
nents, you can determine the optimal configuration for your
business and customize a taxonomy that makes your content
easier to find. This means that a comprehensive metadata
solution is both user-driven, with users assigning predefined
values from controlled vocabularies, and system-driven.
The solution also depends on input from all team members
involved in content development, from content developers to
editors and administrators. This article discusses the essential
steps needed to streamline your metadata strategy.
Background
At times, research on metadata can make the
concept seem more like a metaphysical journey
than one related to any practical outcomes. Yet
as long as there has been a need to categorize
objects and the information describing them,
metadata has been the essential means for
managing information collections or reposito-
ries. In our modern age, the need to manage
and access data on a large scale in a global
economy is no less important. Metadata is
central to modern authoring environments.
For example, it is an integral part of automa-
ting technical documentation development,
documentation that enables users to operate
the complex technologies that help to drive
business transactions. More generally, it is vital
to administer metadata efficiently, as indicated
by metadata expert Murtha Baca: “Institutions
must streamline metadata production and
replace manual methods of metadata creation
with ‘industrial production whenever possible
and appropriate’.” (1, page 72)
The Skills Needed to Perform
this Best Practice
Successfully implementing this strategy requires
one or more people in each of the following
roles:
•	 Content Librarian: Oversees the quality of 	
	 modular content in the CMS and assists 		
	 writers with opportunities for reuse across 	
	 the database.
•	 Editor: Manages edits at the sentence level 	
	 and reviews content against style guidelines.
•	 Content Developer: Uses content mapping 	
	 to define and create reusable content
	 modules.
•	 Tools Administrator: Configures and manages the tool set.
Including these roles as a part of your strategy
is key to your success. Without them, you will
find holes in what could be a more streamlined
approach. Step 7, Assign Metadata Tasks to
Roles, describes this in detail.
Step 1: Define What Metadata
Means to Your Organization
and Why It is Important
If you have found that some simply stated
definitions of metadata are hard to make use
of, and other highly technical ones are hard to
understand, you are not alone. Metadata is an
intricate subject that has become increasingly
technologized. While stripping the term to its
bare essence—such as “data about data”—helps
demystify it, such definitions leave us with
few clues about how to move forward. Finding
comparative definitions that make sense in your
organization can serve as a useful starting point
for understanding the concept: card catalogs in
a library, directories in a grocery store or mall,
playlists on an iPod, for example.
For our team, the most compelling comparison
was right in front of us: metadata as an index
in a book or a help system. The index compari-
son enabled our team to appreciate why meta-
data is important: it helps us organize and
access content. We also related the concept of
metadata to our own environment by review-
ing established metadata standards to see if,
and how, they would fit our needs. Standards
or schemas are rules for uniformly managing
information within and across repositories. They
fall into various types:
•	 Structure;
•	 Value;
•	 Content;
•	 Format.
For example, the Dublin Core Metadata
Element Set (DCMES) (2) is a general standard
that provides guidelines for structuring informa-
tion into categories. It was formulated to
describe documents on the Web. DCMES
defines 15 metadata elements, including:
•	 Title;
•	 Creator;
•	 Subject;
•	 Description;
•	 Date;
•	 Type.
This standard features a succinct set of elements
or categories and has been endorsed in other
standards like ISO Standard 15836-2009 (3)
and ANSI/NISO Standard Z39.85-2007 (4). The
general nature of DCMES elements makes them
applicable to many organizations.
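To make the element set concrete, a few of the 15 DCMES elements can be modeled as a simple record. The values and the required-element check below are a hypothetical sketch for illustration, not part of the standard itself:

```python
# A minimal, hypothetical record using a few of the 15 Dublin Core
# (DCMES) elements. The values are invented for illustration only.
dc_record = {
    "title": "Configuring the Storage Array",
    "creator": "Technical Publications",
    "subject": "storage configuration",
    "description": "Steps for initial setup of a storage array.",
    "date": "2010-06-01",
    "type": "Text",
}

def missing_elements(record, required=("title", "creator", "date")):
    """Return any required DCMES elements absent or empty in a record."""
    return [name for name in required if not record.get(name)]
```

A record missing required elements fails the check, which is the kind of uniformity a structure standard is meant to provide.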
However, if you want to streamline your path
to metadata, avoid getting lost in the sea of
standards available. Because “[t]here is no
‘one-size-fits-all’ metadata schema or controlled
vocabulary or data content (cataloging) stan-
dard”, consider drawing on aspects of various
standards that will fit your organization
(1, page 72).
No specific standards seemed to target
computer documentation, but our team did
consider standards related to structure in order
to verify that we were targeting all of the key
elements. We only evaluated other standards,
including one from an HP marketing group, if
they seemed pertinent to our environment.
For example, we drew on a value standard,
ANSI/NISO Z39.19-2005 (5), to find guidelines
for developing controlled vocabularies,
as discussed below.
The challenge in defining metadata was learning
to appreciate the power inherent in distinguish-
ing content from the descriptors used to access
and manage that content effectively.
Step 2: Determine the Goals
That Drive Your Metadata
Strategy
Knowing what you want metadata to achieve
is fundamental to developing a sound strategy.
Once your team agrees on a definition of metadata,
turn your attention to identifying the primary
goals that will drive the strategy. Experts
suggest “working backwards” from your goals
to the metadata needed to reach them.
“Deciding which aspects of metadata are essen-
tial for the desired goal and how granular each
type of metadata needs to be” is essential to
the process of formulating a strategy (6, page
193) (1, page 19).
We began by listing the various kinds of
information that would be useful to us: track-
ing types of content and components, content
status (new, draft, approved), who originally
created the content, who revised it and when,
what content is reused, where the content is
reused, workflow tasks, multimedia objects
available, version details, profiled content, sys-
tem performance, and reports related to these
items. Next we compared this list with several
types of metadata: descriptive, administrative,
and structural. While experts refer to the
number and names of these types differently,
our team drew on the types identified by
NISO (7). These types are described in Table 1.
When looking at these types of metadata, we
saw that items on our list could be understood
in terms of these categories. From this view, we
began formulating and prioritizing our goals,
short-term vs. long-term. Given our focus on
gaining efficiency, we determined that being
able to retrieve and reuse content was a paired
goal. Another important goal was to minimize
the risk of content being reused prematurely.
Longer term goals included tracking the
percentage of content we reuse, determining
what reuse opportunities are still untapped,
ensuring the quality of our deliverables, and
identifying what content is being localized.
Through this exercise, we could see that all
these types of metadata could help us achieve our goals.
The focus on metadata types helped to stream-
line how we thought about our goals. Our next
step was to understand what specific metadata
components would help us attain our short-
term goals.
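The “working backwards” exercise can be sketched in a few lines. The goal-to-type pairings below are illustrative assumptions drawn from our reading of the goals above, not a prescription:

```python
# Hypothetical mapping from goals to the NISO metadata types
# (descriptive, administrative, structural) that support them.
goal_to_types = {
    "retrieve and reuse content": {"descriptive", "structural"},
    "prevent premature reuse": {"administrative"},
    "track reuse percentage": {"administrative", "structural"},
}

def types_needed(goals):
    """Work backwards: union the metadata types the chosen goals require."""
    needed = set()
    for goal in goals:
        needed |= goal_to_types.get(goal, set())
    return needed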
Step 3: Identify the Metadata
Components That Help You
Obtain Your Goals
Selecting metadata components is extremely
important in the process of establishing a
metadata strategy.The ability to decide on the
optimal number of metadata components is not
easy. How do you pick just the right number,
not too many or too few?Which ones will have
the biggest impact and help to minimize risk?
Here are some sample questions you should
consider (5, page 193-194 and 196):
•	 What type of content is it?
•	 What else do you need to know about the 	
	 content to ensure the correct piece of content 	
	 is retrieved?
•	 In what form will users want to retrieve
	 content?
•	 How will users specifically identify the
	 desired content?
Table 1 Metadata Types
Metadata Type
Descriptive
Administrative
Structural
Identifies and describes
collections resources.
Used in managing and
administering collections,
versioning, and reten-
tion.
Delineates the
organization and
relationship of content
units within a system.
Assists with queries and the ability to locate types
of content that can be reused. This includes:
•	 Content type and status
•	 Tracking types of content/components
•	 Profiled content
•	 Multimedia objects available
Enables creation and management of collections
and configuration of tasks, permissions, status,
and history:
•	 Who created content and when
•	 Workflow tasks
•	 Version details
•	 Reuse statistics
•	 System performance
Supports navigation and means of combing
components into deliverables:
•	 What content is reused
•	 Where content is reused
•	 Where multimedia objects referenced
•	 Reporting
Purpose Relevance to our environment
10 TIMAF Information Management Best Practices Vol. 1
Research into structure standards showed our
team that we should focus on components that
describe the subject of our content (one of
the Dublin Core elements).These components
would be the basis of user queries.The best way
to streamline this step is to look at your own
content for the answers. Once again, the index
serves as a valuable tool for understanding what
terms might be queried, along with the table of
contents, providing clues about the hierarchy of
terms as they relate to the subject. Linking the
index concept to metadata was useful in helping
team members understand metadata hierarchies
and how the components related to each other.
FollowingAnn Rockley’s advice to select three
to five components, we chose four that were
subject related and two that were user related,
as shown inTable 2.
Table 2 Metadata Elements and Attributes
Element
ContentType
Product
Keyword
Abstract
Originator
Reuser
Exactly 1
1 or more
At least 2
At least 1
Exactly one
Exactly one
The largest “container” used to describe major
topics that make up our documentation.
“ContentType” describes the subject matter of
the content.
A smaller “container” used to qualify how a topic
applies to various products. “Product” designates
the name of the product for which the content was
written, including the model and / or version.
The smallest category that further limits the
relevance of a topic. “Keyword” helps to
further narrow search results.
Provides a synopsis of content that authors
can use to determine if reuse is appropriate,
describing the subject of the content, why it is
relevant, and guidelines for using the content.
Who originally created a reusable piece of
content.
Authors are reusing a piece of content.
Occurrence Rule Purpose
1115 Charlotte Robidoux, Stacey Swart
After choosing our components, we had to
consider how to manage them in our CMS.We
streamlined the process by drawing on options
that our CMS already supported. Our CMS
allowed searching on fields such as “Status,”
“Create date,” “Edit date,” and “username,”
but we needed to search on more specific
subject-related content as well. Our DTD,
which is a subset of DocBook, only contains
“keywordset/keyword.”To fill the gaps, we
developed custom elements and attributes,
adding custom elements for “ContentType”
and “Product,” and two attributes for
“Originator” and “Reuser.”We chose elements
when we might need to use multiple values,
and attributes when we wanted to enforce
only one value.
While it was clear that the goals of retrieval
and reuse could be achieved by building related
metadata into our content, we felt that the
goal of minimizing the risk of premature reuse
needed additional CMS support.To achieve this,
we organized our content into two collections:
“Working” and “Approved.”Working collec-
tions would contain work in progress; only the
“originator” content developer could reference
this content; reuse by others was not supported.
In contrast, “Approved” collections would
contain finalized content that had been
reviewed by an editor as well as subject matter
experts and could not be changed; any author
could be a “Reuser” of the content contained
here. Separate collections ensure that original
content will not change if reused. Instead,
“Reusers” must copy content from anApproved
collection to aWorking collection to propose
changes.After those changes are made,
the author initiates a Change Proposal work-
flow, illustrated below, via the CMS.The
workflow automatically notifies the assigned
stakeholders that a change to content is being
proposed. Some of the automation is possible
because of the metadata attributes “Reuser”
and “originator.”The CMS is able to determine
who initiated the change proposal and who
the changes will affect.The workflow content
librarian also employs automated email notifi-
cations and task assignments.Two options are
possible: either the approved content is updated
to reflect the changes agreed upon by the
review team, or new content is added to the
Approved collection because the original
content is still needed as first written. By
organizing the CMS collections this way, and
by creating a workflow that leverages user-
related metadata, we effectively streamlined
our use of metadata and found a way to
leverage elements to minimize risk when
reusing content.That is, a strategic approach
to metadata from the outset triggers additional
efficiencies; streamlining metadata cascades
into workflow and CMS implementation.
Step 4: Identify Metadata
Values
Without question, identifying metadata values
to create a stable list of terms-a controlled
vocabulary-is the most time consuming and
contentious step of the process. Deliberating
over synonyms and laboring over documents
to test the appropriateness of the values seems
endless.The best way to streamline this part
of the process is to form a small workgroup
of three or more members who can begin to
evaluate document conventions and create lists
of terms related to the components selected.
(A workbook works well for managing the
terms on separate spreadsheets.)As mentioned
earlier, our team drew extensively fromANSI/
NISO Z39.19-2005, Guidelines for the Con-
struction, Format, and Management of Mono-
lingual ControlledVocabularies.
This standard helped the workgroup and users
appreciate why a controlled vocabulary is so
important, given that “[t]wo or more words
or terms can be used to represent a single
concept” and that “[t]wo or more words that
have the same spelling can represent different
12 TIMAF Information Management Best Practices Vol. 1
concepts” (5).When creating the lists, the work-
group relied on the Standard’s recommendations
for conducting “top- down” and “bottom-up”
assessments, determining “the correct form of
each term,” and for following key principles
such as: “eliminating ambiguity,” “controlling
synonyms,” “establishing relationships among
terms where appropriate,” and “testing and
validation of the terms” (5).
Once the lists were created, workgroup
members began vetting these lists with
seasoned authors, many of whom were not
co-located.The ability to engage teams across
the organization when the workgroup had little
authority was especially challenging.We relied
on many virtual collaboration techniques to
streamline our efforts so that we could complete
the work. Do not overlook the importance of
showing the value of metadata to the users-they
need to understand and believe in the purpose
of their work, and realize that metadata:
•	 Enhances query capabilities in the CMS by 	
	 enabling “effective retrieval” (6, page 18).
•	 Allows users to locate their own content, as 	
	 well as other content that they could reuse
	 or leverage.
•	 Reduces “redundant content” (6, page 185), 	
	 making content developers more productive.
•	 Reduces costs (Management may care more 	
	 about this, but in today’s work environment, 	
	 a content developer who is saving the
	 company money is a content developer 		
	 worth keeping.).
Additionally, employing a controlled vocabulary
saves the content developers time by increasing
the amount of content that can be successfully
retrieved.
TheANSI/NISO Z39.19-2005 standard
provided essential principles for maintaining
a controlled vocabulary, especially how best to
manage additions and modifications as well as
a history of changes (5, page 97).The change
history was especially critical when updating
the values in our tools.These processes are
contained within a single resource that we refer
to as metadata guidelines.
Documenting the metadata process is a must.
Bob Boiko discusses the idea of a metatorial
guide containing “a set of rigorous metadata
guidelines,” similar to an editorial guide (8,
page 495). Boiko goes on to say that the
metadata process must ensure (8, page 508):
•	 Metadata completeness and consistency.
•	 Content manageability, accessibility, and 		
	 targetability (intended for the appropriate
	 audience).
A thoroughly documented set of rules and
procedures helps take the guesswork out of
metadata application.As Boiko explains, “in a
sizable system with numerous contributors you
can almost guarantee that you will find wide
variation in the ways that people interpret and
apply even the most precisely stated tagging
rules” (8, page 509). Providing a link from the
tool’s support menu to the metatorial guide
puts the information at the content developers’
fingertips, giving users easy access to the meta-
data processes and guidelines.As previously
discussed, proper application of metadata is
critical to ensure quality search results. Making
the guidelines as accessible as possible will help
ensure that they are followed.
Once guidelines are documented, you need
to determine what type of user will apply the
metadata. Should content developers add all
user-driven metadata, or should a content
librarian assist them?What are the roles regard-
ing metadata application? Boiko contends that
“a different set of skills is necessary to deal
with this metadata” (8, page 495). Some users
can be trained to apply metadata. However, as
he goes on to say, users “rarely have the where-
withal to discover what content others are sub-
mitting and exactly how to relate that material
1315 Charlotte Robidoux, Stacey Swart
to what they submit.” Someone on the team
with an eye for detail like a content librarian
is more appropriate for this role.
While content developers understand their
content and usage better than anyone else, as
noted by Peter Emonds-Banfield, they might
not have the “expertise necessary for meta-
data creation, nor the time to keep up with it”;
whereas “... metators (= editors that manage
metadata) can play a role by educating content
teams on metadata creation” (8, page 509).
As previously discussed, some tools can be
configured to enforce certain rules; however,
some standards require the human eye. In those
cases, the content librarian can audit metadata
application before content is approved, ensur-
ing the metadata values chosen by the content
developer meet quality standards.You can
liken the role of a content librarian to that of
an editor. Instead of reviewing content against
structure and style rules, the content librarian
reviews metadata against metatorial guidelines,
ensuring that metadata application is consistent
throughout all content in the CMS. Boiko refers
to this as “a central point of decision” (8, page
511).The more complex the metadata and
content, and the more users who access it, the
more critical such a point of decision becomes.
On the other hand, is having the content
librarian audit metadata application by content
developers sufficient, or should the content
librarian apply all metadata to content, com-
pletely releasing the content developer from
such a burden?According to Boiko “the task
of looking after the metadata health of a CMS
is crucial if you want to be confident that the
content in the system will show up when you
expect it to.” (8, page 511).This the point for
content to be retrievable so that you can then
reuse it. If you want to be completely sure
that metadata is applied consistently across
all content, regardless of who originated it,
then having a content librarian perform this
task is as close to a guarantee as you might get.
However, some organizations do not have the
resources to staff a content librarian. In that
case, an editor might take this on as a new
role. If resource constraints are an issue, some
organizations must rely on content developers
to apply user-driven metadata. In this case, the
metatorial guide is what you are betting on,
and it must be rock solid.
In our case, we rely on content developers to
apply user-driven metadata.The editors are
charged with reviewing metadata as they would
any other content.The content librarian is
consulted when questions arise, and also audits
content in the CMS for consistency. Ultimately,
the content librarian is the point of decision
and is responsible for educating others and
maintaining the metatorial guide.We have also
staffed a trainer who works with the content
librarian to develop metadata training for all
(content developers and editors).The primary
reason we have this model is to share the work-
load; we do not have the resources to assign
such a role in a full-time capacity. Regardless
of who is doing it, applying “metadata well
requires a lot of human energy” (8, page 495).
Step 5: Determine What
Metadata Components Can
Be Automated
Determining which metadata components,
if any, can be automated, is important at this
stage in developing a metadata strategy. Some
components need the human touch for quality
purposes, or because tools such as the CMS are
not able to automate the application of such
metadata. However, when possible, utilize auto-
mation.The options for this will vary depending
on the tool. In our case, we looked to the CMS
for automating the application of metadata.
Why automate?Automating metadata applica-
tion lessens the burden on the content develop-
ers and helps avoid inconsistency. In addition,
if it is “up to the author to remember to add
14 TIMAF Information Management Best Practices Vol. 1
the metadata in all the relevant places”, it is
a “recipe for missed metadata” (6, page 200).
As Boiko writes, “without a rigorous consis-
tency and careful attention, metadata becomes
useless” (8, page 495). He goes on to say that
“someone or, better, some system is necessary
to ensure that people handle metadata
thoroughly and consistently” (8, page 495).
So if the CMS can handle it, automate it!
What metadata makes a good candidate for
automation? From our experience, metadata
with a yes or no value should be automated
if the question can be answered by data that
is accessible to the CMS. For example, to
answer the question “Is the content being
reused?”, populate the reuse attribute with
either “yes” or “no.” In our case, if content
lives within a specific CMS collection, then
it is reused. Otherwise, it is not. Our CMS is
smart enough to answer this question based
on the location of the content – in a certain
collection, so we let it answer that question
for us.
Metadata containing a value that is definite
should also be automated. For example, the
originator attribute can be populated with the
username of the person who created the con-
tent because the CMS knows who that person
is. Likewise, the CMS knows who is reusing
content because it can follow the reference
to the content back to the username who
created the reference.As a result, we let the
CMS capture the username for us by adding
it to the reuser attribute.
On the other hand, what metadata should
not be automated? Metadata requiring a
discerning human eye should not be automated.
For example, a person is needed to determine
the subject of the content. One could argue
that if the content contains a title, the subject
could be leveraged from the title. However,
not all content chunks include a title.As a
result, we do not automate the ContentType
metadata element.
A gray area might be keywords. In our case,
we depend on a person to assign keywords.
This person is typically the content developer,
with some assistance from the content librarian
if required.As content grows, new keywords
might be necessary. If they are not part of the
controlled vocabulary, the content librarian
can make note of that and modify the list as
needed. From our experience, controlled
vocabularies are certainly living lists, as
previously discussed.
Table 3 shows our system-driven metadata,
including metadata used to manage the status
of content (whether or not it can be reused).
Be sure to also consider the risks of automa-
tion. Boiko states that “the problem isn’t to
find the blanks, but to correctly fill them”
(8, page 509).The key word here is “correctly”.
Similarly, Rockley explains that “[i]mproperly
identified metadata ... can cause problems
ranging from misfiled and therefore inacces-
sible content to even more serious problems ...”
(6, page 185). In our case, inaccessible content
would be a deal breaker since our primary goal
is retrieval for reuse. It is critical that metadata
applied automatically by the CMS is done with
the highest quality standards.There can be no
room for incorrectly applied metadata or for
the possibility of inaccessible content.
Consequently, if you rely on the CMS to auto-
mate the application of metadata, make sure it
is fool- proof (tool-proof).
Step 6: Ensure That Users Will
Apply the Metadata
Once you have determined which metadata
components can be automated, the remaining
components will be user-driven.The next step
it to ensure that users will apply it.As Rockley
notes, metadata is “only valuable if it gets
used” (6, page 200).
1515 Charlotte Robidoux, Stacey Swart
Table 3 System-Driven Attributes
System-Driven
Attributes
Status
Collection
Reuse
Originator
Reuser
Working
Approved
Section
Chapter
Glossentry
Yes
No
Username
Username
External use in the authoring environment. Used
upon extract to work with style sheets to lock
content from changes if approved.
Used to properly reload content to the correct
collection.
If yes, content is from an approved collection.
Used to color-code approved content so that
reviewers and editors know it has already been
approved.
Who created the content; used by CMS
workflow.
Who is reusing the content; used by CMS
workflow.
Value Goal
One method to ensure users apply metadata is
to configure your tools with metadata require-
ments.The DTD behind an authoring tool can
utilize occurrence rules to require specific
metadata components to be added (such as at
least two keywords must be present).A CMS
can be configured to enforce the same rules.
In our case, we have rules established in both
tools. Regardless of which tool the metadata is
applied in, the user must meet certain require-
ments.The tools alert the user when those
requirements are not met.
We have found that the CMS provides greater
specificity than our authoring tool in such
requirements.While the DTD behind the
authoring tool can require that metadata
components be present, it cannot enforce that
values be added to those components. For
example, in the authoring environment, a user
could add two keyword elements, but leave
them empty with no values assigned.Techni-
cally, they would meet the DTD rules.The CMS
provides the additional reinforcement. In our
case, content shows as incomplete unless the
metadata components are present and they
contain values. For example, two keywords
are present and the values are this and that.
Content within the CMS shows as incomplete
unless all metadata requirements are met, and
because all content is ultimately managed in
the CMS, it becomes the final checkpoint.
To fully utilize the benefits of metadata,
however, users must do more than just apply
metadata to their content. They must apply the
appropriate metadata. A well-designed metadata
strategy ensures that the metadata components
and values are tailored to the needs of the user;
metadata guidelines assist them with the tasks
they need to accomplish and include terms they
will use when retrieving content. But as previously
discussed, users do not all think the same way.
This is where having a controlled vocabulary is
a must. Even though one user might be inclined
to search on “America” and another might search
on “U.S.A.”, they will both work off of the same
list of terms, which in this example could include
“United States”. Such search standards can be
taught, and will ensure effective search results
rather than wasting the user’s valuable time.
There are other ways to assist users in metadata
application. One is to provide templates that
are pre-populated with the required metadata
components. We have done this in our authoring
toolset; the content developer only needs
to assign values to the components. Another
method that both assists users and provides a
level of control that can be a partner to occurrence
rules is described by Boiko as “categorizing
metadata by its allowed values” (8, page
506). For example, we use a “closed list,” which
allows users to select a value from a predefined
set of terms, or a controlled vocabulary (8, page
507). In our case, the controlled vocabulary is
built into the authoring tool and the CMS. The
user cannot type in metadata values; the only
option is to select them from a list.
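A closed list can be sketched as follows. This is a hypothetical example (the vocabulary, synonym map, and function names are illustrative, not part of our toolset): free-typed values are rejected, and known variants resolve to the canonical term, so that “America” and “U.S.A.” both land on “United States”.

```python
# Illustrative controlled vocabulary (a closed list of allowed values).
CONTROLLED_VOCABULARY = {"United States", "Canada", "Mexico"}

# Known variants mapped to their canonical terms.
SYNONYMS = {
    "America": "United States",
    "U.S.A.": "United States",
    "US": "United States",
}

def resolve_term(value):
    """Map a user-entered value onto the controlled vocabulary,
    rejecting anything outside the closed list."""
    canonical = SYNONYMS.get(value, value)
    if canonical not in CONTROLLED_VOCABULARY:
        raise ValueError(f"'{value}' is not in the controlled vocabulary")
    return canonical

print(resolve_term("America"))  # United States
print(resolve_term("U.S.A."))   # United States
```

In practice the closed list lives inside the authoring tool and CMS as a drop-down menu, so resolution happens at selection time rather than through a lookup like this one.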
Step 7: Assign Metadata Tasks to Roles
To ensure your metadata goals become part of
your business processes and tool environment,
assign roles to team members who can implement
the metadata strategy. These assignments
streamline the implementation effort. Table 4
describes each of these roles.
Metadata is dependent on many contributors.
While tool administrators can ensure that
system-driven metadata and automation are
set up behind the scenes, they are not the sole
contributors. The realization of your strategy
becomes much easier with all team members
involved.
Table 4: Roles and Responsibilities

Content Librarian
Responsible for the quality of modular content in the CMS as well as
for flagging opportunities for reuse across the database. The content
librarian’s tasks include:
*	 Assisting content developers when needed in understanding the metadata guidelines.
*	 Maintaining the metadata guidelines document and the master list of values.
*	 Reviewing and accepting or rejecting new metadata value requests.
*	 Notifying tool developers when new metadata values need to be added to the tool set.
*	 Auditing the quality of the metadata values that content developers apply before the content can be made available for reuse.
*	 Overseeing content change proposals for reused modules and validating the requests for changes.
*	 Facilitating the review process to ensure all Reusers participate by either accepting or rejecting their changes.
*	 Implementing the final result by either overwriting the original “approved” content in the CMS, or by creating a variant of the original “approved” content.
*	 Populating the CMS with common queries to assist content developers with locating content to be reused or leveraged.
*	 Assisting content developers when more specific search criteria are needed for database queries to locate content to be reused or leveraged.

Editor
Manages edits at the sentence level and reviews content against style
guidelines. The editor’s responsibilities include:
*	 Reviewing metadata values as part of the literary edit to ensure consistent usage.
*	 Maintaining an eye toward content that can be leveraged or reused when a content developer opts to create new content.

Originator (Content Developer)
Identifies the need for and creates reusable modules of content using
content mapping. The originator’s responsibilities include:
*	 Identifying unique, similar, and identical content across the deliverables set.
*	 Capturing metadata values for identical content.
*	 Analyzing similar content for opportunities to make it identical.
*	 Creating reusable topics of information.
*	 Requesting new metadata values as needed via the CMS workflow.

Reuser (Content Developer)
A content developer who uses metadata to query the CMS for reusable
content. The reuser’s responsibilities include:
*	 Reusing “approved” content by referencing it in deliverables.
*	 Initiating the change proposal workflow as needed to request changes to “approved” content.
*	 Reviewing change proposals from other reusers.

Tool Administrators
Responsible for configuring and managing the tool set. Examples of tool
administrators include the DTD Developer, the Authoring Tool Developer,
the CMS Administrator, and the Publishing Tool Developer. The tool
administrator’s responsibilities include:
*	 Addressing requirements versus options.
*	 Automating processes where possible.
*	 Ensuring that the tools support the reuse and metadata strategy.

DTD Developer
The DTD developer’s responsibilities include:
*	 Managing DTD elements, attributes, and occurrence rules.
*	 Communicating with the CMS Administrator when DTD changes are needed.

Authoring Tool Developer
The authoring tool developer’s responsibilities include:
*	 Making templates available for new content creation.
*	 Automating the addition of required child elements when a parent element is selected.
*	 Maintaining pre-populated menus with required user-driven metadata values.
*	 Providing links to support documentation from the authoring tool menu.

CMS Administrator
The CMS administrator’s responsibilities include:
*	 Managing content collections for editing, loading, and extracting behaviors, including extracts directly to the publishing tool.
*	 Maintaining user roles and privileges.
*	 Configuring the CMS to ensure alignment with DTD rules.
*	 Setting up CMS-specific elements and attributes as needed.
*	 Tightening structure rules by requiring text to be present and/or valid values to be used for applicable elements and attributes.
*	 Making components, properties, and operators available to help ensure effective query options.
*	 Maintaining pre-populated menus with required user-driven metadata values.
*	 Implementing visual aids to assist users when viewing content in the CMS.
*	 Automating the capturing of system-driven attributes.
*	 Creating workflow configurations to support CMS-assisted procedures.

Publishing Tool Developer
The publishing tool developer’s responsibilities include:
*	 Creating and maintaining style sheets for use in the authoring and publishing tools.
*	 Developing authoring tool scripts to provide visual cues for reused content to content developers, editors, and Subject Matter Experts.
It is critical that team members have a clear
understanding of their roles and of how each
role contributes to the overall success of the
strategy.
Step 8: Prove That Your Strategy is Sound
Readiness to release metadata to production
can take months or years, depending on the
complexity of your strategy. Because our
organization did not always have dedicated
resources to devote to this implementation,
tracking our progress via a schedule was
absolutely essential. As priorities shifted in our
organization, deliverables were either pulled in
or pushed out as needed. Shifting priorities and
balancing resources may ultimately determine
the time needed to develop and implement
your metadata strategy. While it took our team
a number of years, we understood the return
on investment. Had we given in to the pressure
to release any sooner, we would have had a less
effective, less efficient, and less robust metadata
strategy. And because metadata is truly the
backbone of our reuse strategy, skimping was
not an option.
It is also necessary to come to an agreement
with management as to what qualifies the
strategy to be ready for release. For example,
we negotiated to have a full quarter of simulation
testing, and agreed that we would only
release to production if simulation testing
resulted in zero process or tool issues. Test
scenarios should be as realistic as possible.
In our case, we used actual customer content,
assigned roles, and created scenarios to put
our business and tool processes to the test.
The testers received new test scenarios each
week so that they couldn’t see what was coming
next. Our support staff, including the editor
and the content librarian, were also given test
scenarios. In some cases, we set up intentional
conflicts to ensure users knew how to handle
them.
Before you can release your metadata strategy
to production, you must ensure that:
•	 Your tools are functioning as expected to support the strategy.
•	 Roles and expectations are clear.
•	 The metadata guide is available.
•	 Training has occurred, including making all users aware of the importance of metadata (resulting in a willingness to use it).
•	 No gaps have been identified.
•	 There are no technical issues with any of the processes supported by the tools.
After all of the “human energy” (8, page 495)
spent on creating your metadata strategy, don’t
short-change yourself by rushing through the
testing process. When you do release your
metadata strategy, you want to know it is rock
solid.
Metadata in Action
As previously discussed, our goal is to enable
effective reuse by making content easy to find.
Because the originating content developer
(Originator) added metadata for reuse, other
content developers (Reusers) can query on
those values. Figure 1 shows some of an
Originator’s content.
In some cases, a content developer might know
the content exists, and is already familiar with
it. In that case, she would have knowledge of
the metadata values that are likely to be associ-
ated with the content.
In other cases, the content developer has a
need for content, but is not sure it exists. Rather
than creating it from scratch, she searches the
CMS to see if content exists that she can reuse
or at least leverage. For example, a user needs
content specific to installing NAS products
onto servers. Because our CMS is configured
with drop-down lists of valid values (controlled
vocabulary metadata values), the user selects
the appropriate metadata elements and values
from the list.
In this example, the content developer would
query on:
•	 ContentType = Installing;
•	 Product = NAS;
•	 Keyword = servers.
The content developer can search on one
or more metadata elements, as shown in
Figure 2. Combining multiple metadata
elements provides narrower results. In the
preceding example, over 1,200 section modules
were queried, resulting in one section that met
the content developer’s query (shown in
Figure 3). At this point, the content developer
can review the content in more detail and
decide if she can use it as is, or leverage it.
Figure 2: Searching Using Metadata
Figure 1: Originator Content
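The narrowing effect of combining metadata criteria can be sketched as a simple filter. This is a hypothetical model (the module records, field names, and `query` function are illustrative, not our CMS’s actual query interface): each additional criterion removes non-matching modules from the result set.

```python
# A toy collection of CMS section modules with metadata.
modules = [
    {"id": 101, "ContentType": "Installing", "Product": "NAS",
     "Keywords": ["servers", "rack"]},
    {"id": 102, "ContentType": "Installing", "Product": "SAN",
     "Keywords": ["switches"]},
    {"id": 103, "ContentType": "Troubleshooting", "Product": "NAS",
     "Keywords": ["servers"]},
]

def query(modules, content_type=None, product=None, keyword=None):
    """Filter modules by each supplied metadata criterion in turn."""
    results = modules
    if content_type:
        results = [m for m in results if m["ContentType"] == content_type]
    if product:
        results = [m for m in results if m["Product"] == product]
    if keyword:
        results = [m for m in results if keyword in m["Keywords"]]
    return results

# ContentType alone matches two modules; adding Product and Keyword
# narrows the set to the single reusable section.
hits = query(modules, content_type="Installing", product="NAS",
             keyword="servers")
print([m["id"] for m in hits])  # [101]
```

This mirrors the NAS installation example: each drop-down selection in the CMS corresponds to one more filter applied to the candidate modules.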
It is easy to see that without the metadata to
support the query, the content developer would
likely never have located the content she needed.
She probably would have just created a new
section, duplicating existing data. She would
have spent time doing this, taking away from
her other work. In addition, the CMS would
have become populated with redundant content.
Even when a content developer locates content
that already exists, it might not fully meet her
needs. In that case, she can propose changes to
the content. We use the Change Proposal Workflow
feature in our CMS to manage this process.
The workflow has the following steps, shown in
Figure 4:
•	 Proposal: The content developer copies content to a working collection, makes changes as needed, and initiates the workflow.
•	 Review: The content librarian validates the request. System-generated email notifications are sent to all Reusers. The content librarian facilitates an offline review and mediates any counter-proposals.
•	 Outcome: If all Reusers accept the change proposal, the CMS automatically overwrites the original content in the approved collection with the changed content. A system-generated email notification is sent to all content developers, letting them know that the workflow has been completed.
•	 Relink: If only some Reusers accepted the change proposal, the content librarian assigns a unique ID to the content, and the CMS automatically moves the variant to the approved collection. A system-generated email notification is sent to all content developers, reminding them to relink to the variant as needed.
Figure 3: Reused Content
Figure 4: Change Proposal Workflow
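The Outcome/Relink decision at the heart of the workflow can be sketched as a small function. This is a hypothetical model of the voting logic only (the function name, outcome labels, and the all-reject fallback are assumptions for illustration; the text above describes only the unanimous and split cases):

```python
def resolve_proposal(votes):
    """Decide the fate of a change proposal.

    votes: dict mapping each Reuser's name to True (accept)
    or False (reject).
    """
    if all(votes.values()):
        # Outcome step: unanimous acceptance overwrites the
        # original "approved" content.
        return "overwrite-original"
    if any(votes.values()):
        # Relink step: a split vote produces a variant with a
        # unique ID that accepting Reusers relink to.
        return "create-variant"
    # Assumed fallback, not described in the workflow above.
    return "reject-proposal"

print(resolve_proposal({"ana": True, "ben": True}))   # overwrite-original
print(resolve_proposal({"ana": True, "ben": False}))  # create-variant
```

In the real CMS this decision triggers the system-generated notifications and content moves described above; the sketch captures only the branch condition.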
If, on the other hand, query results show that
new content needs to be created, the content
developer can do so. Adding the metadata
elements to the new content will help ensure
that other content developers can locate the
content for future usage.
Summary and Conclusion
Proving the soundness of metadata in our case
entailed extensive collaboration and testing
among team members. Key areas of focus
included:
•	 Checking and rechecking that our metadata values were entered into the tools correctly.
•	 Ensuring high usability in the tools and in written processes so that team members could add metadata easily.
•	 Configuring our tools to easily locate metadata and to indicate if values and elements were missing.
•	 Proving the concept that our metadata would enable us to locate content effectively for the purpose of reuse.
The ultimate test of success is verifying that
implementing metadata allows your organiza-
tion to achieve the goals you identified at the
outset.
There is little guidance available on how to
develop a metadata strategy. While some
industries have developed specifications tailored
to their content, others seem to be starting
at square one. Technical communication, as it
relates to the computer industry, could benefit
from more substantial models to follow. Having
a specification as a starting point would help
companies launch their metadata strategy by
providing a list of components, possible values,
and the pros and cons of building upon this
foundation.
References
1.	Baca, Murtha, ed. Introduction to Metadata, 2nd Edition. Los Angeles, CA: Getty Publications, 2008.
2.	 Dublin Core Metadata Initiative. “Dublin Core Metadata Element Set, Version 1.1.” 2008. Retrieved on 20 Mar. 2009. http://dublincore.org/documents/dces/.
3.	 International Organization for Standardization. “ISO 15836:2009, Information and documentation - The Dublin Core metadata element set.” 2009. Retrieved on 21 Feb. 2009. http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=52142.
4.	 American National Standards Institute. “The Dublin Core Metadata Element Set.” ANSI/NISO Z39.85. Bethesda, MD: NISO Press, 2007. Retrieved on 8 Jun. 2008. http://www.niso.org/kst/reports/standards/kfile_download?id%3Austring%3Aiso-8859-1=Z39-85-2007.pdf&pt=RkGKiXzW643YeUaYUqZ1BFwDhIG4-24RJbcZB-Wg8uE4vWdpZsJDs4RjLz0t90_d5_ymGsj_IKVa86hjP37r_hFEijh12LhLqJw52B-5udAaMy22WJJl0y5GhhtjwcI3V.
5.	 American National Standards Institute. “Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies.” ANSI/NISO Z39.19-2005. Bethesda, MD: NISO Press, 2005. Retrieved on 9 Jun. 2008. http://www.niso.org/kst/reports/standards?step=2&gid=None&project_key=7cc9b583cb5a62e8c15d3099e0bb46bbae9cf38a.
6.	Rockley, Ann. Managing Enterprise Content: A Unified Content Strategy. Indianapolis, IN: New Riders Publishing, 2002.
7.	 National Information Standards Organization. “Understanding Metadata.” Bethesda, MD: NISO Press. Retrieved on 21 Feb. 2009. http://www.niso.org/publications/press/UnderstandingMetadata.pdf.
8.	 Boiko, Bob. The Content Management Bible, 2nd Edition. Indianapolis, IN: Wiley Publishers, 2005.
About the TIMAF Library
This ‘Information Management Best Practices’ book is a publication in the TIMAF
Library. The publications in the TIMAF Library are aimed at promoting Information
Management and are published on behalf of TIMAF. TIMAF, the Information Management
Foundation, is an initiative of information management practitioners to provide
a strong and clear foundation of information management.
TIMAF encourages authors from around the world, who are experts in their
particular information management sub discipline, to contribute to the development
of the TIMAF publications. Are you interested in sharing your ideas and experiences
online with the TIMAF Community? Visit www.timaf.org and join the discourse.
Have you experienced the merits of a specific approach to information management
strategies, methodologies, technologies or tools? Submit a proposal according to the
requirements listed in the ‘Call for Best Practices’ at www.timaf.org.
The following publications are available in the TIMAF Library:
Introduction books
Information Management Framework
paper edition - release: September 2011
Best Practices books
Information Management Best Practices 2009 Sneak Preview
online edition - www.timaf.org
Information Management Best Practices – Volume 1
paper edition - ISBN 978-94-90164-03-4
Pocket Guides
Information Management Framework – A Pocket Guide
paper edition - release: November 2011
Social Networks
Information Management Framework Wiki
www.timaf.org/wiki
We will publish new books and other instruments on a regular basis. For further
enquiries about the Information Management Library, please visit www.timaf.org
or send an email to info@timaf.org.
Introduction
Information? Manage!
Information is the term we use to stand for all forms of preserved communication
that organizations care to produce, store and distribute. If we communicate
it and record it, it is information. So, for us, information is anything from
sales figures in a database to a video on philosophy viewed on a mobile phone.
We define information management as the organized collection, storage and
use of information for the benefit of an enterprise.
Our definitions are intentionally wide enough to cover content, document,
asset, data, records and all other ‘information managements’ that organizations
do. We believe that while each of these “sub-disciplines” has its own tools and
types of information, there is much more that unites them than divides them.
Our definitions are intentionally quite practical. For us, information management
simply means moving pieces of recorded communication from creation
to consumption to retirement. Our definitions are crafted to carve out a niche
for the information manager. Information managers make sure that recorded
communication can be amassed and distributed in a way that benefits their
organization. Finally, our definitions are crafted to be a simple guiding
principle. Any person working in any information project can use this definition
to remain focused on the ultimate aim of their particular kind of work.
Information Management? TIMAF!
The field of information management is currently fractured and incoherent.
Each sub discipline (content, document, asset, data, records management
to name just a few) has its own practitioners, applications and professional
communities. We believe that behind the seeming differences between these
‘managements’ there is a deeper unity that will eventually define a strong
and clear foundation for all of them.
We do not believe that all managements will or should merge, but rather that
just as business underlies a variety of business practices including accounting
and finance, there is a common foundation for the various forms of information
management.
The Information Management Foundation (TIMAF) tries to provide this foundation
by publishing these information management best practices. In addition,
TIMAF develops and maintains an information management framework that
brings the commonalities between sub disciplines to light and helps to organize
the best practices that we publish.
Best Start? Best Practice!
Just as business is practiced within a more specific context, information
management is also practiced in context. Thus, we believe that the best way
to illustrate the concepts and practices of information management is within
the context of one or more sub disciplines. So, this best practices book tries
to show global principles of information management in the context of
projects in one or more of the sub disciplines.
This is the first volume of ‘Information Management Best Practices.’ In future
publications we will provide an ongoing compilation of high quality best
practice guidance, written for and by experts in the Information Management
field from around the world. These best practices are designed to help
professionals overcome their information management challenges. They bring
complex models down to earth, with practical guidance on tough problems.
In this volume, practitioners describe nineteen projects that you can learn
from. In return, we ask that you let us learn from you! Please let us know
what your experiences are with these or other projects at www.timaf.org.
Colophon
Title
TIMAF Information Management Best Practices – Volume 1
Editors
Bob Boiko – USA - Erik M. Hartman – NL
Copy-editors
Jonah Bull – USA – Jenny Collins – USA - Elishema Fishman – USA
Publisher
Erik Hartman Communicatie – NL
Edition
Volume 1 – 1st impression – November 2010
ISBN
978-94-90164-03-4
Design & Layout	
Nevel Karaali – NL
Print
Wöhrmann Print Service – NL
© 2010, TIMAF
All rights reserved. No part of this publication may be reproduced in any form by
print, photo print, microfilm or any other means without written permission by the
publisher. Although this publication has been composed with much care, neither
author, nor editor, nor publisher can accept any liability for damage caused by possible
errors and/or incompleteness in this publication.
TRADEMARK NOTICE
TIMAF® is a Registered Trade Mark and Registered Community Trade Mark of the
Office of Government Commerce, and is Registered in the U.S. Patent and Trademark Office.
Please contact the editors for ideas, suggestions and improvements at info@timaf.org.
Streamlining Your Path to Metadata, by Charlotte Robidoux and Stacey Swart
Develop a Metadata Strategy in Eight Steps
TIMAF Information Management Best Practices Vol. 1

Abstract

A Content Management System (CMS) allows a business to streamline its content development processes, using and reusing content from a single source for multiple purposes. Fully leveraging this capability requires the ability to access and manage your content, and managing your content efficiently necessitates a robust metadata strategy. However, developing a metadata strategy can be intimidating, onerous, and costly. The sheer amount of time needed–time to research, evaluate, synthesize, implement, and maintain a viable solution–can prompt even the most dedicated among us to abandon a strategic effort altogether. For this reason, it is essential to find a streamlined approach to metadata strategy development.

This case study explores how groups can streamline their metadata development without cutting corners and without undermining the purpose of having a CMS. Establishing a metadata strategy using a gradual approach makes the process more streamlined and manageable. The key to this solution is to define metadata components that are meaningful in your environment. After creating these components, you can determine the optimal configuration for your business and customize a taxonomy that makes your content easier to find. This means that a comprehensive metadata solution is both directly managed by users, who assign predefined values from controlled vocabularies, and system-driven. The solution also depends on input from all team members involved in content development, from content developers to editors and administrators. This article discusses the essential steps needed to streamline your metadata strategy.
Background

At times, research on metadata can make the concept seem more like a metaphysical journey than one related to any practical outcomes. Yet as long as there has been a need to categorize objects and the information describing them, metadata has been the essential means for managing information collections or repositories. In our modern age, the need to manage and access data on a large scale in a global economy is no less important. Metadata is central to modern authoring environments. For example, it is an integral part of automating technical documentation development; documentation which enables users to operate the complex technologies that help to drive business transactions. More generally, it is vital to administer metadata efficiently, as indicated by metadata expert Murtha Baca: “Institutions must streamline metadata production and replace manual methods of metadata creation with ‘industrial production whenever possible and appropriate’.” (1, page 72)

The Skills Needed to Perform this Best Practice

Successfully implementing this strategy requires one or more people in each of the following roles:

• Content Librarian: Oversees the quality of modular content in the CMS and assists writers with opportunities for reuse across the database.
• Editor: Manages edits at the sentence level and reviews content against style guidelines.
• Content Developer: Uses content mapping to define and create reusable content modules.
• Tools Administrator: Configures and manages the tool set.

Including these roles as a part of your strategy is key to your success. Without them, you will find holes in what could be a more streamlined approach. Step 7, Assign Metadata Tasks to Roles, describes this in detail.
Step 1: Define What Metadata Means to Your Organization and Why It is Important

If you have found that some simply stated definitions of metadata are hard to make use of, and other highly technical ones are hard to understand, you are not alone. Metadata is an intricate subject that has become increasingly technologized. While stripping the term to its bare essence—such as “data about data”—helps demystify it, such definitions leave us with few clues about how to move forward. Finding comparative definitions that make sense in your organization can serve as a useful starting point for understanding the concept: card catalogs in a library, directories in a grocery store or mall, playlists on an iPod, for example.

For our team, the most compelling comparison was right in front of us—metadata as an index in a book or a help system. The index comparison enabled our team to appreciate why metadata is important—it helps us organize and access content. We also related the concept of metadata to our own environment by reviewing established metadata standards to see if, and how, they would fit our needs. Standards or schemas are rules for uniformly managing information within and across repositories. They fall into various types:

• Structure;
• Value;
• Content;
• Format.
For example, the Dublin Core Metadata Element Set (DCMES) (2) is a general standard that provides guidelines for structuring information into categories. It was formulated to describe documents on the Web. DCMES defines 15 metadata elements, including:

• Title;
• Creator;
• Subject;
• Description;
• Date;
• Type.

This standard features a succinct set of elements or categories and has been endorsed in other standards like ISO Standard 15836-2009 (3) and ANSI/NISO Standard Z39.85-2007 (4). The general nature of DCMES elements makes them applicable to many organizations. However, if you want to streamline your path to metadata, avoid getting lost in the sea of standards available. Because “[t]here is no ‘one-size-fits-all’ metadata schema or controlled vocabulary or data content (cataloging) standard”, consider drawing on aspects of various standards that will fit your organization (1, page 72). No specific standards seemed to target computer documentation, but our team did consider standards related to structure in order to verify that we were targeting all of the key elements. We only evaluated other standards, including one from an HP marketing group, if they seemed pertinent to our environment. For example, we drew on a value standard, ANSI/NISO Z39.19-2005 (5), to find guidelines for developing controlled vocabularies, as discussed below.

The challenge in defining metadata was learning to appreciate the power inherent in distinguishing content from the descriptors used to access and manage that content effectively.

Step 2: Determine the Goals That Drive Your Metadata Strategy

Knowing what you want metadata to achieve is fundamental to developing a sound strategy. Once your team agrees on a definition of metadata, turn their attention to identifying the primary goals that will drive the strategy. Experts suggest “working backwards” from your goals to the metadata needed to reach your goal.
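The standards survey in Step 1 is easier to picture with a concrete record. The sketch below models a DCMES-style record as a plain mapping; the element names come from the Dublin Core subset listed in Step 1, while the sample values, function name, and validation behavior are our own illustrative assumptions, not part of any standard or tool.

```python
# A minimal DCMES-style record check. The element names come from the
# Dublin Core subset discussed above; everything else is illustrative.
DCMES_ELEMENTS = {"Title", "Creator", "Subject", "Description", "Date", "Type"}

def validate_record(record):
    """Return the DCMES elements (from our subset) missing from a record."""
    return sorted(DCMES_ELEMENTS - record.keys())

sample = {
    "Title": "Configuring the storage array",      # invented sample values
    "Creator": "jdoe",
    "Subject": "configuration",
    "Description": "Steps for initial array setup.",
    "Date": "2010-03-15",
    "Type": "Text",
}

print(validate_record(sample))          # [] -- nothing missing
print(validate_record({"Title": "x"}))  # every other element is missing
```

A check like this is the mechanical counterpart of asking whether a standard’s categories actually cover your content.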
“Deciding which aspects of metadata are essential for the desired goal and how granular each type of metadata needs to be” is essential to the process of formulating a strategy (6, page 193) (1, page 19).

We began by listing the various kinds of information that would be useful to us: tracking types of content and components, content status (new, a draft, approved), who originally created the content, who revised it and when, what content is reused, where the content is reused, workflow tasks, multimedia objects available, version details, profiled content, system performance, and reports related to these items. Next we compared this list with several types of metadata: descriptive, administrative, and structural. While experts refer to the number and names of these types differently, our team drew on the types identified by NISO (7). These types are described in Table 1.
When looking at these types of metadata, we saw that items on our list could be understood in terms of these categories. From this view, we began formulating and prioritizing our goals, short-term vs. long-term. Given our focus on gaining efficiency, we determined that being able to retrieve and reuse content was a paired goal. Another important goal was to minimize the risk of content being reused prematurely. Longer term goals included tracking the percentage of content we reuse, determining what reuse opportunities are still untapped, ensuring the quality of our deliverables, and identifying what content is being localized. Through this exercise, we could see that all these metadata could help us achieve our goals.

The focus on metadata types helped to streamline how we thought about our goals. Our next step was to understand what specific metadata components would help us attain our short-term goals.

Step 3: Identify the Metadata Components That Help You Obtain Your Goals

Selecting metadata components is extremely important in the process of establishing a metadata strategy. The ability to decide on the optimal number of metadata components is not easy. How do you pick just the right number, not too many or too few? Which ones will have the biggest impact and help to minimize risk? Here are some sample questions you should consider (5, page 193-194 and 196):

• What type of content is it?
• What else do you need to know about the content to ensure the correct piece of content is retrieved?
• In what form will users want to retrieve content?
• How will users specifically identify the desired content?

Table 1: Metadata Types

Descriptive
• Purpose: Identifies and describes collection resources.
• Relevance to our environment: Assists with queries and the ability to locate types of content that can be reused. This includes: content type and status; tracking types of content/components; profiled content; multimedia objects available.

Administrative
• Purpose: Used in managing and administering collections, versioning, and retention.
• Relevance to our environment: Enables creation and management of collections and configuration of tasks, permissions, status, and history: who created content and when; workflow tasks; version details; reuse statistics; system performance.

Structural
• Purpose: Delineates the organization and relationship of content units within a system.
• Relevance to our environment: Supports navigation and means of combining components into deliverables: what content is reused; where content is reused; where multimedia objects are referenced; reporting.
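In code terms, the sorting exercise behind Table 1 is just a lookup from each wish-list item to a NISO metadata type. The sketch below paraphrases a few items from our list; the dictionary and helper function are illustrative, not part of any CMS.

```python
# Mapping wish-list items to NISO metadata types, paraphrasing Table 1.
# Item wording and the helper name are ours, for illustration only.
METADATA_TYPE = {
    "content type and status": "descriptive",
    "profiled content": "descriptive",
    "multimedia objects available": "descriptive",
    "who created content and when": "administrative",
    "workflow tasks": "administrative",
    "version details": "administrative",
    "where content is reused": "structural",
    "reporting": "structural",
}

def items_of_type(metadata_type):
    """List the wish-list items that fall under one NISO type."""
    return sorted(k for k, v in METADATA_TYPE.items() if v == metadata_type)

print(items_of_type("administrative"))
# ['version details', 'who created content and when', 'workflow tasks']
```

Grouping the list this way makes it obvious which goals each metadata type serves before any tooling work begins.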
Research into structure standards showed our team that we should focus on components that describe the subject of our content (one of the Dublin Core elements). These components would be the basis of user queries. The best way to streamline this step is to look at your own content for the answers. Once again, the index serves as a valuable tool for understanding what terms might be queried, along with the table of contents, providing clues about the hierarchy of terms as they relate to the subject. Linking the index concept to metadata was useful in helping team members understand metadata hierarchies and how the components related to each other. Following Ann Rockley’s advice to select three to five components, we chose four that were subject related and two that were user related, as shown in Table 2.

Table 2: Metadata Elements and Attributes

ContentType
• Occurrence Rule: Exactly 1.
• Purpose: The largest “container” used to describe major topics that make up our documentation. “ContentType” describes the subject matter of the content.

Product
• Occurrence Rule: 1 or more.
• Purpose: A smaller “container” used to qualify how a topic applies to various products. “Product” designates the name of the product for which the content was written, including the model and/or version.

Keyword
• Occurrence Rule: At least 2.
• Purpose: The smallest category that further limits the relevance of a topic. “Keyword” helps to further narrow search results.

Abstract
• Occurrence Rule: At least 1.
• Purpose: Provides a synopsis of content that authors can use to determine if reuse is appropriate, describing the subject of the content, why it is relevant, and guidelines for using the content.

Originator
• Occurrence Rule: Exactly one.
• Purpose: Who originally created a reusable piece of content.

Reuser
• Occurrence Rule: Exactly one.
• Purpose: The author who is reusing a piece of content.
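Occurrence rules like those in Table 2 lend themselves to a mechanical check. The sketch below encodes them as minimum and maximum counts; the rule values follow Table 2, while the function and the sample module are hypothetical, not a real DTD or CMS API.

```python
# Occurrence rules from Table 2 as (minimum, maximum) counts per element;
# None means unbounded. Helper and sample data are illustrative only.
OCCURRENCE_RULES = {
    "ContentType": (1, 1),     # exactly 1
    "Product":     (1, None),  # 1 or more
    "Keyword":     (2, None),  # at least 2
    "Abstract":    (1, None),  # at least 1
    "Originator":  (1, 1),     # exactly one
    "Reuser":      (1, 1),     # exactly one
}

def check_occurrences(metadata):
    """Return elements whose value counts violate the occurrence rules."""
    errors = []
    for element, (lo, hi) in OCCURRENCE_RULES.items():
        n = len(metadata.get(element, []))
        if n < lo or (hi is not None and n > hi):
            errors.append(element)
    return sorted(errors)

module = {
    "ContentType": ["procedure"],
    "Product": ["Sample Product 1.0"],   # invented product name
    "Keyword": ["configuration"],        # violates "at least 2"
    "Abstract": ["How to configure the product."],
    "Originator": ["jdoe"],
    "Reuser": ["asmith"],
}
print(check_occurrences(module))  # ['Keyword']
```

A check of this shape is what a DTD or a CMS enforces automatically when the rules are configured into the tools, as discussed in Step 6.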
After choosing our components, we had to consider how to manage them in our CMS. We streamlined the process by drawing on options that our CMS already supported. Our CMS allowed searching on fields such as “Status,” “Create date,” “Edit date,” and “username,” but we needed to search on more specific subject-related content as well. Our DTD, which is a subset of DocBook, only contains “keywordset/keyword.” To fill the gaps, we developed custom elements and attributes, adding custom elements for “ContentType” and “Product,” and two attributes for “Originator” and “Reuser.” We chose elements when we might need to use multiple values, and attributes when we wanted to enforce only one value.

While it was clear that the goals of retrieval and reuse could be achieved by building related metadata into our content, we felt that the goal of minimizing the risk of premature reuse needed additional CMS support. To achieve this, we organized our content into two collections: “Working” and “Approved.” Working collections would contain work in progress; only the “originator” content developer could reference this content; reuse by others was not supported. In contrast, “Approved” collections would contain finalized content that had been reviewed by an editor as well as subject matter experts and could not be changed; any author could be a “Reuser” of the content contained here.

Separate collections ensure that original content will not change if reused. Instead, “Reusers” must copy content from an Approved collection to a Working collection to propose changes. After those changes are made, the author initiates a Change Proposal workflow, illustrated below, via the CMS. The workflow automatically notifies the assigned stakeholders that a change to content is being proposed.
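The collection scheme above boils down to a small reuse policy. The sketch below expresses it as a single check; the collection names come from our setup, while the function name and signature are illustrative, not a real CMS API.

```python
# Reuse policy behind the "Working"/"Approved" collections: any author may
# reference approved content; work in progress is visible only to its
# originator. Function name and signature are illustrative, not a CMS API.
def can_reference(collection, originator, requesting_user):
    if collection == "Approved":
        return True  # finalized, reviewed content: open to any Reuser
    if collection == "Working":
        return requesting_user == originator  # work in progress: originator only
    return False

print(can_reference("Approved", "jdoe", "asmith"))  # True: approved content
print(can_reference("Working", "jdoe", "asmith"))   # False: not yet approved
print(can_reference("Working", "jdoe", "jdoe"))     # True: originator only
```

Keeping the policy this simple is what makes it enforceable by the CMS rather than by convention.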
Some of the automation is possible because of the metadata attributes “Reuser” and “Originator.” The CMS is able to determine who initiated the change proposal and who the changes will affect. The workflow content librarian also employs automated email notifications and task assignments. Two options are possible: either the approved content is updated to reflect the changes agreed upon by the review team, or new content is added to the Approved collection because the original content is still needed as first written.

By organizing the CMS collections this way, and by creating a workflow that leverages user-related metadata, we effectively streamlined our use of metadata and found a way to leverage elements to minimize risk when reusing content. That is, a strategic approach to metadata from the outset triggers additional efficiencies; streamlining metadata cascades into workflow and CMS implementation.

Step 4: Identify Metadata Values

Without question, identifying metadata values to create a stable list of terms—a controlled vocabulary—is the most time consuming and contentious step of the process. Deliberating over synonyms and laboring over documents to test the appropriateness of the values seems endless. The best way to streamline this part of the process is to form a small workgroup of three or more members who can begin to evaluate document conventions and create lists of terms related to the components selected. (A workbook works well for managing the terms on separate spreadsheets.) As mentioned earlier, our team drew extensively from ANSI/NISO Z39.19-2005, Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies.

This standard helped the workgroup and users appreciate why a controlled vocabulary is so important, given that “[t]wo or more words or terms can be used to represent a single concept” and that “[t]wo or more words that have the same spelling can represent different concepts” (5). When creating the lists, the workgroup relied on the Standard’s recommendations for conducting “top-down” and “bottom-up” assessments, determining “the correct form of each term,” and for following key principles such as: “eliminating ambiguity,” “controlling synonyms,” “establishing relationships among terms where appropriate,” and “testing and validation of the terms” (5).

Once the lists were created, workgroup members began vetting these lists with seasoned authors, many of whom were not co-located. The ability to engage teams across the organization when the workgroup had little authority was especially challenging. We relied on many virtual collaboration techniques to streamline our efforts so that we could complete the work. Do not overlook the importance of showing the value of metadata to the users—they need to understand and believe in the purpose of their work, and realize that metadata:

• Enhances query capabilities in the CMS by enabling “effective retrieval” (6, page 18).
• Allows users to locate their own content, as well as other content that they could reuse or leverage.
• Reduces “redundant content” (6, page 185), making content developers more productive.
• Reduces costs (Management may care more about this, but in today’s work environment, a content developer who is saving the company money is a content developer worth keeping.).

Additionally, employing a controlled vocabulary saves the content developers time by increasing the amount of content that can be successfully retrieved.

The ANSI/NISO Z39.19-2005 standard provided essential principles for maintaining a controlled vocabulary, especially how best to manage additions and modifications as well as a history of changes (5, page 97). The change history was especially critical when updating the values in our tools. These processes are contained within a single resource that we refer to as metadata guidelines.
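Synonym control, one of the Z39.19 principles the workgroup followed, amounts to resolving variant terms to a single preferred term from the controlled vocabulary. A minimal sketch, with invented terms (the mapping and function name are ours, not from the standard):

```python
# Synonym control in the spirit of ANSI/NISO Z39.19: variant terms resolve
# to one preferred term. The vocabulary entries here are invented examples.
PREFERRED_TERM = {
    "america": "United States",
    "u.s.a.": "United States",
    "united states": "United States",
    "hard disk": "disk drive",
    "hdd": "disk drive",
}

def normalize(term):
    """Resolve a search or tagging term to its controlled-vocabulary form."""
    return PREFERRED_TERM.get(term.strip().lower(), term)

print(normalize("U.S.A."))   # United States
print(normalize("HDD"))      # disk drive
print(normalize("zoning"))   # zoning (not in the vocabulary; passed through)
```

In practice the same lookup serves two audiences: content developers tagging modules and authors querying the CMS both land on the same preferred term.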
Documenting the metadata process is a must. Bob Boiko discusses the idea of a metatorial guide containing “a set of rigorous metadata guidelines,” similar to an editorial guide (8, page 495). Boiko goes on to say that the metadata process must ensure (8, page 508):

• Metadata completeness and consistency.
• Content manageability, accessibility, and targetability (intended for the appropriate audience).

A thoroughly documented set of rules and procedures helps take the guesswork out of metadata application. As Boiko explains, “in a sizable system with numerous contributors you can almost guarantee that you will find wide variation in the ways that people interpret and apply even the most precisely stated tagging rules” (8, page 509). Providing a link from the tool’s support menu to the metatorial guide puts the information at the content developers’ fingertips, giving users easy access to the metadata processes and guidelines. As previously discussed, proper application of metadata is critical to ensure quality search results. Making the guidelines as accessible as possible will help ensure that they are followed.

Once guidelines are documented, you need to determine what type of user will apply the metadata. Should content developers add all user-driven metadata, or should a content librarian assist them? What are the roles regarding metadata application? Boiko contends that “a different set of skills is necessary to deal with this metadata” (8, page 495). Some users can be trained to apply metadata. However, as he goes on to say, users “rarely have the wherewithal to discover what content others are submitting and exactly how to relate that material to what they submit.” Someone on the team with an eye for detail like a content librarian is more appropriate for this role.

While content developers understand their content and usage better than anyone else, as noted by Peter Emonds-Banfield, they might not have the “expertise necessary for metadata creation, nor the time to keep up with it”; whereas “... metators (= editors that manage metadata) can play a role by educating content teams on metadata creation” (8, page 509). As previously discussed, some tools can be configured to enforce certain rules; however, some standards require the human eye. In those cases, the content librarian can audit metadata application before content is approved, ensuring the metadata values chosen by the content developer meet quality standards. You can liken the role of a content librarian to that of an editor. Instead of reviewing content against structure and style rules, the content librarian reviews metadata against metatorial guidelines, ensuring that metadata application is consistent throughout all content in the CMS. Boiko refers to this as “a central point of decision” (8, page 511). The more complex the metadata and content, and the more users who access it, the more critical such a point of decision becomes.

On the other hand, is having the content librarian audit metadata application by content developers sufficient, or should the content librarian apply all metadata to content, completely releasing the content developer from such a burden? According to Boiko, “the task of looking after the metadata health of a CMS is crucial if you want to be confident that the content in the system will show up when you expect it to” (8, page 511). This is the point: content must be retrievable so that you can then reuse it.
If you want to be completely sure that metadata is applied consistently across all content, regardless of who originated it, then having a content librarian perform this task is as close to a guarantee as you might get. However, some organizations do not have the resources to staff a content librarian. In that case, an editor might take this on as a new role. If resource constraints are an issue, some organizations must rely on content developers to apply user-driven metadata. In this case, the metatorial guide is what you are betting on, and it must be rock solid.

In our case, we rely on content developers to apply user-driven metadata. The editors are charged with reviewing metadata as they would any other content. The content librarian is consulted when questions arise, and also audits content in the CMS for consistency. Ultimately, the content librarian is the point of decision and is responsible for educating others and maintaining the metatorial guide. We have also staffed a trainer who works with the content librarian to develop metadata training for all (content developers and editors). The primary reason we have this model is to share the workload; we do not have the resources to assign such a role in a full-time capacity. Regardless of who is doing it, applying “metadata well requires a lot of human energy” (8, page 495).

Step 5: Determine What Metadata Components Can Be Automated

Determining which metadata components, if any, can be automated, is important at this stage in developing a metadata strategy. Some components need the human touch for quality purposes, or because tools such as the CMS are not able to automate the application of such metadata. However, when possible, utilize automation. The options for this will vary depending on the tool. In our case, we looked to the CMS for automating the application of metadata.

Why automate? Automating metadata application lessens the burden on the content developers and helps avoid inconsistency.
In addition, if it is “up to the author to remember to add the metadata in all the relevant places”, it is a “recipe for missed metadata” (6, page 200). As Boiko writes, “without a rigorous consistency and careful attention, metadata becomes useless” (8, page 495). He goes on to say that “someone or, better, some system is necessary to ensure that people handle metadata thoroughly and consistently” (8, page 495). So if the CMS can handle it, automate it!

What metadata makes a good candidate for automation? From our experience, metadata with a yes or no value should be automated if the question can be answered by data that is accessible to the CMS. For example, to answer the question “Is the content being reused?”, populate the reuse attribute with either “yes” or “no.” In our case, if content lives within a specific CMS collection, then it is reused. Otherwise, it is not. Our CMS is smart enough to answer this question based on the location of the content in a certain collection, so we let it answer that question for us.

Metadata containing a value that is definite should also be automated. For example, the originator attribute can be populated with the username of the person who created the content because the CMS knows who that person is. Likewise, the CMS knows who is reusing content because it can follow the reference to the content back to the username who created the reference. As a result, we let the CMS capture the username for us by adding it to the reuser attribute.

On the other hand, what metadata should not be automated? Metadata requiring a discerning human eye should not be automated. For example, a person is needed to determine the subject of the content. One could argue that if the content contains a title, the subject could be leveraged from the title. However, not all content chunks include a title. As a result, we do not automate the ContentType metadata element. A gray area might be keywords.
In our case, we depend on a person to assign keywords. This person is typically the content developer, with some assistance from the content librarian if required. As content grows, new keywords might be necessary. If they are not part of the controlled vocabulary, the content librarian can make note of that and modify the list as needed. From our experience, controlled vocabularies are certainly living lists, as previously discussed. Table 3 shows our system-driven metadata, including metadata used to manage the status of content (whether or not it can be reused).

Be sure to also consider the risks of automation. Boiko states that “the problem isn’t to find the blanks, but to correctly fill them” (8, page 509). The key word here is “correctly”. Similarly, Rockley explains that “[i]mproperly identified metadata ... can cause problems ranging from misfiled and therefore inaccessible content to even more serious problems ...” (6, page 185). In our case, inaccessible content would be a deal breaker since our primary goal is retrieval for reuse. It is critical that metadata applied automatically by the CMS is done with the highest quality standards. There can be no room for incorrectly applied metadata or for the possibility of inaccessible content. Consequently, if you rely on the CMS to automate the application of metadata, make sure it is fool-proof (tool-proof).

Step 6: Ensure That Users Will Apply the Metadata

Once you have determined which metadata components can be automated, the remaining components will be user-driven. The next step is to ensure that users will apply it. As Rockley notes, metadata is “only valuable if it gets used” (6, page 200).
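The system-driven assignments described in Step 5, deriving "reuse" from where content lives and stamping usernames automatically, can be sketched as follows. The collection-based rule comes from our setup; the function name, signature, and collection set are illustrative, not a real CMS API.

```python
# System-driven metadata, sketched: the CMS derives "reuse" from the
# content's collection and records usernames itself. Names illustrative.
APPROVED_COLLECTIONS = {"Approved"}

def system_metadata(collection, creating_user, referencing_user=None):
    """Populate the attributes the CMS can fill without human input."""
    meta = {
        # yes/no value answerable from data the CMS already has:
        "reuse": "yes" if collection in APPROVED_COLLECTIONS else "no",
        # definite value the CMS knows: who created the content
        "originator": creating_user,
    }
    if referencing_user is not None:
        # the CMS follows the reference back to the user who created it
        meta["reuser"] = referencing_user
    return meta

print(system_metadata("Approved", "jdoe", "asmith"))
print(system_metadata("Working", "jdoe"))
```

Note what is absent: nothing subject-related (ContentType, keywords) is filled in here, because those values need the discerning human eye discussed above.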
Table 3: System-Driven Attributes

Status
• Values: Working, Approved.
• Goal: External use in the authoring environment. Used upon extract to work with style sheets to lock content from changes if approved.

Collection
• Values: Section, Chapter, Glossentry.
• Goal: Used to properly reload content to the correct collection.

Reuse
• Values: Yes, No.
• Goal: If yes, content is from an approved collection. Used to color-code approved content so that reviewers and editors know it has already been approved.

Originator
• Value: Username.
• Goal: Who created the content; used by CMS workflow.

Reuser
• Value: Username.
• Goal: Who is reusing the content; used by CMS workflow.

One method to ensure users apply metadata is to configure your tools with metadata requirements. The DTD behind an authoring tool can utilize occurrence rules to require specific metadata components to be added (such as at least two keywords must be present). A CMS can be configured to enforce the same rules. In our case, we have rules established in both tools. Regardless of which tool the metadata is applied in, the user must meet certain requirements. The tools alert the user when those requirements are not met.

We have found that the CMS provides greater specificity than our authoring tool in such requirements. While the DTD behind the authoring tool can require that metadata components be present, it cannot enforce that values be added to those components. For example, in the authoring environment, a user could add two keyword elements, but leave them empty with no values assigned. Technically, they would meet the DTD rules. The CMS provides the additional reinforcement. In our case, content shows as incomplete unless the metadata components are present and they contain values. For example, two keywords are present and the values are “this” and “that”.
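The two layers of enforcement just described, presence (which a DTD can require) versus presence plus non-empty values (which the CMS checks), can be sketched as a single completeness test. The required counts for the subject-related elements follow Table 2; the helper and sample data are hypothetical.

```python
# The DTD can require that elements be present; the CMS additionally
# requires that they carry values. A sketch of that stricter check.
# Required minimum counts follow Table 2; everything else is illustrative.
REQUIRED = {"ContentType": 1, "Product": 1, "Keyword": 2}

def is_complete(metadata):
    """True only if required elements exist AND every value is non-empty."""
    for element, minimum in REQUIRED.items():
        values = metadata.get(element, [])
        if len(values) < minimum or any(not v.strip() for v in values):
            return False
    return True

print(is_complete({"ContentType": ["task"], "Product": ["Sample Product"],
                   "Keyword": ["this", "that"]}))   # True: present with values
print(is_complete({"ContentType": ["task"], "Product": ["Sample Product"],
                   "Keyword": ["", ""]}))           # False: empty keyword values
```

The second call is exactly the case the DTD alone would pass: two keyword elements exist, but they carry no values, so the CMS flags the content as incomplete.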
Content within the CMS shows as incomplete unless all metadata requirements are met, and because all content is ultimately managed in the CMS, it becomes the final checkpoint.

To fully utilize the benefits of metadata, however, users must do more than just apply metadata to their content. They must apply the appropriate metadata to their content. A well designed metadata strategy ensures that the metadata components and values are tailored to the needs of the user; metadata guidelines assist them with the tasks they need to accomplish and include terms they will use when retrieving content. But as previously discussed, users do not all think the same way. This is where having a controlled vocabulary is a must. Even though one user might be inclined to search on “America” and another might search on “U.S.A.”, they will both work off of the same list of terms, which in this example could include “United States”. Such search standards can be taught, and will ensure effective search results, rather than wasting the user’s valuable time.

There are other ways to assist users in metadata application. One is to provide templates that are pre-populated with the required metadata components. We have done this in our authoring toolset; the content developer only needs to assign values to the components. Another method that both assists users and provides a level of control that can be a partner to occurrence rules is described by Boiko as “categorizing metadata by its allowed values” (8, page 506). For example, we use a “closed list,” which allows users to select a value from a predefined set of terms, or a controlled vocabulary (8, page 507). In our case, the controlled vocabulary is built into the authoring tool and the CMS. The user cannot type in metadata values; the only option is to select them from a list.

Step 7: Assign Metadata Tasks to Roles

To ensure your metadata goals become part of your business processes and tool environment, assign roles to team members who can implement the metadata strategy. These assignments streamline the implementation effort. Table 4 describes each of these roles.

Metadata is dependent on many contributors. While tool administrators can ensure that system-driven metadata and automation are set up behind the scenes, they are not the sole contributors. The realization of your strategy becomes much easier with all team members involved.
Table 4: Roles and Responsibilities

Content Librarian
Responsible for the quality of modular content in the CMS as well as for flagging opportunities for reuse across the database. The content librarian's tasks include:
* Assisting content developers, when needed, in understanding the metadata guidelines.
* Maintaining the metadata guidelines document and the master list of values.
* Reviewing and accepting or rejecting new metadata value requests.
* Notifying tool developers when new metadata values need to be added to the tool set.
* Auditing the quality of the metadata values that content developers apply before the content can be made available for reuse.
* Overseeing content change proposals for reused modules and validating the requests for changes.
* Facilitating the review process to ensure all Reusers participate by either accepting or rejecting the proposed changes.
* Implementing the final result by either overwriting the original "approved" content in the CMS or by creating a variant of the original "approved" content.
* Populating the CMS with common queries to assist content developers with locating content to be reused or leveraged.
* Assisting content developers when more specific search criteria are needed for database queries to locate content to be reused or leveraged.

Editor
Manages edits at the sentence level and reviews content against style guidelines. The editor's responsibilities include:
* Reviewing metadata values as part of the literary edit to ensure consistent usage.
* Maintaining an eye toward content that can be leveraged or reused when a content developer opts to create new content.

Originator (Content Developer)
Identifies the need for and creates reusable modules of content using content mapping. The originator's responsibilities include:
* Identifying unique, similar, and identical content across the deliverables set.
* Capturing metadata values for identical content.
* Analyzing similar content for opportunities to make it identical.
* Creating reusable topics of information.
* Requesting new metadata values as needed via the CMS workflow.

Reuser (Content Developer)
A content developer who uses metadata to query the CMS for reusable content. The reuser's responsibilities include:
* Reusing "approved" content by referencing it in deliverables.
* Initiating the change proposal workflow as needed to request changes to "approved" content.
* Reviewing change proposals from other reusers.
Tool Administrators
Responsible for configuring and managing the tool set. Examples of tool administrators include the DTD developer, the authoring tool developer, the CMS administrator, and the publishing tool developer. The tool administrators' shared responsibilities include:
* Addressing requirements versus options.
* Automating processes where possible.
* Ensuring that the tools support the reuse and metadata strategy.

DTD Developer
The DTD developer's responsibilities include:
* Managing DTD elements, attributes, and occurrence rules.
* Communicating with the CMS administrator when DTD changes are needed.

Authoring Tool Developer
The authoring tool developer's responsibilities include:
* Making templates available for new content creation.
* Automating the addition of required child elements when a parent element is selected.
* Maintaining pre-populated menus with required user-driven metadata values.
* Providing links to support documentation from the authoring tool menu.

CMS Administrator
The CMS administrator's responsibilities include:
* Managing content collections for editing, loading, and extracting behaviors, including extracts directly to the publishing tool.
* Maintaining user roles and privileges.
* Configuring the CMS to ensure alignment with DTD rules.
* Setting up CMS-specific elements and attributes as needed.
* Tightening structure rules by requiring text to be present and/or valid values to be used for applicable elements and attributes.
* Making components, properties, and operators available to help ensure effective query options.
* Maintaining pre-populated menus with required user-driven metadata values.
* Implementing visual aids to assist users when viewing content in the CMS.
* Automating the capture of system-driven attributes.
* Creating workflow configurations to support CMS-assisted procedures.

Publishing Tool Developer
The publishing tool developer's responsibilities include:
* Creating and maintaining style sheets for use in the authoring and publishing tools.
* Developing authoring tool scripts to provide visual cues for reused content to content developers, editors, and subject matter experts.
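Many of the tool administrators' responsibilities amount to automated checks that the tools enforce before content is accepted. The following Python sketch illustrates the general idea of requiring metadata attributes and validating their values against a closed list. The element name, attribute names, and vocabulary values are invented for illustration only; they are not our actual DTD or CMS configuration.

```python
# A minimal sketch of an automated metadata completeness check.
# All names and values below are hypothetical examples.
import xml.etree.ElementTree as ET

# Closed lists: the only values a content developer may select.
CONTROLLED_VOCABULARY = {
    "ContentType": {"Installing", "Configuring", "Troubleshooting"},
    "Product": {"NAS", "SAN", "Tape"},
}
REQUIRED_ATTRIBUTES = ("ContentType", "Product")

def validate_module(xml_text):
    """Return a list of problems; an empty list means the module is complete."""
    problems = []
    root = ET.fromstring(xml_text)
    for attr in REQUIRED_ATTRIBUTES:
        value = root.get(attr)
        if value is None:
            # The CMS would flag this module as incomplete.
            problems.append(f"missing required attribute: {attr}")
        elif value not in CONTROLLED_VOCABULARY[attr]:
            # The value is not on the closed list of valid terms.
            problems.append(f"invalid value for {attr}: {value!r}")
    return problems

# A module with a miscased value is flagged rather than silently accepted:
module = '<section ContentType="Installing" Product="Nas">...</section>'
print(validate_module(module))
```

Because the authoring tool only offers values from a drop-down list, content developers never hit the "invalid value" branch in practice; the check exists to catch content loaded from outside the toolset.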
It is critical that the team members have a clear definition of their roles and understand the importance of each role in contributing to the overall success of the strategy.

Step 8: Prove That Your Strategy is Sound

Readiness to release metadata to production can take months or years, depending on the complexity of your strategy. Because our organization did not always have dedicated resources to devote to this implementation, tracking our progress via a schedule was absolutely essential. As priorities shifted in our organization, deliverables were either pulled in or pushed out as needed. Shifting priorities and balancing resources may ultimately determine the time needed to develop and implement your metadata strategy. While it took our team a number of years, we understood the return on investment. Had we given in to the pressure to release any sooner, we would have had a less effective, less efficient, and less robust metadata strategy. And because metadata is truly the backbone of our reuse strategy, skimping was not an option.

It is also necessary to come to an agreement with management on what qualifies the strategy as ready for release. For example, we negotiated a full quarter of simulation testing and agreed that we would only release to production if simulation testing resulted in zero process or tool issues. Test scenarios should be as realistic as possible. In our case, we used actual customer content, assigned roles, and created scenarios to put our business and tool processes to the test. The testers received new test scenarios each week so that they couldn't see what was coming next. Our support staff, including the editor and the content librarian, were also given test scenarios. In some cases, we set up intentional conflicts to ensure users knew how to handle them.
Before you can release your metadata strategy to production, you must ensure that:

• Your tools are functioning as expected to support the strategy.
• Roles and expectations are clear.
• The metadata guide is available.
• Training has occurred, including making all users aware of the importance of metadata (resulting in a willingness to use it).
• No gaps have been identified.
• There are no technical issues with any of the processes supported by the tools.

After all of the "human energy" (8, page 495) spent on creating your metadata strategy, don't shortchange yourself by rushing through the testing process. When you do release your metadata strategy, you want to know it is rock solid.

Metadata in Action

As previously discussed, our goal is to enable effective reuse by making content easy to find. Because the originating content developer (Originator) added metadata for reuse, other content developers (Reusers) can query on those values. Figure 1 shows some of an Originator's content.
In some cases, a content developer might know the content exists and is already familiar with it. In that case, she would know the metadata values that are likely to be associated with the content. In other cases, the content developer has a need for content but is not sure it exists. Rather than creating it from scratch, she searches the CMS to see if content exists that she can reuse or at least leverage.

For example, a user needs content specific to installing NAS products onto servers. Because our CMS is configured with drop-down lists of valid values (controlled vocabulary metadata values), the user selects the appropriate metadata elements and values from the list. In this example, the content developer would query on:

• ContentType = Installing;
• Product = NAS;
• Keyword = servers.

The content developer can search on one or more metadata elements, as shown in Figure 2. Combining multiple metadata elements provides narrower results. In the preceding example, over 1,200 section modules were queried, resulting in one section that met the content developer's query (shown in Figure 3). At this point, the content developer can review the content in more detail and decide if she can use it as is, or leverage it.

Figure 1: Originator Content
Figure 2: Searching Using Metadata
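The narrowing effect of combining metadata elements can be sketched in a few lines of Python. The module records and the matching logic below are hypothetical stand-ins for the CMS query engine, reusing the ContentType/Product/Keyword values from the example above; real CMS queries run against the repository, not an in-memory list.

```python
# A hypothetical sketch of AND-style metadata queries narrowing results.
# Module records and field names are invented for illustration.
modules = [
    {"id": "S-001", "ContentType": "Installing",  "Product": "NAS", "Keyword": ["servers"]},
    {"id": "S-002", "ContentType": "Installing",  "Product": "SAN", "Keyword": ["switches"]},
    {"id": "S-003", "ContentType": "Configuring", "Product": "NAS", "Keyword": ["servers"]},
]

def query(modules, **criteria):
    """Return IDs of modules matching every criterion; more criteria, fewer hits."""
    results = []
    for m in modules:
        if all(
            # Keyword is multi-valued, so membership is checked; other
            # fields are single-valued and compared for equality.
            value in m[field] if isinstance(m[field], list) else m[field] == value
            for field, value in criteria.items()
        ):
            results.append(m["id"])
    return results

print(query(modules, ContentType="Installing"))
print(query(modules, ContentType="Installing", Product="NAS", Keyword="servers"))
```

Querying on ContentType alone returns two modules; adding Product and Keyword narrows the result to the single NAS installation section, mirroring the 1,200-modules-to-one example above.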
It is easy to see that without the metadata to support the query, the content developer would likely never have located the content she needed. She probably would have just created a new section, duplicating existing data. She would have spent time doing this, taking away from her other work. In addition, the CMS would have become populated with redundant content.

Even when a content developer locates content that already exists, it might not fully meet her needs. In that case, she can propose changes to the content. We use the Change Proposal Workflow feature in our CMS to manage this process. The workflow has the following steps, shown in Figure 4:

• Proposal: The content developer copies content to a working collection, makes changes as needed, and initiates the workflow.
• Review: The content librarian validates the request. System-generated email notifications are sent to all Reusers. The content librarian facilitates an offline review and mediates any counter-proposals.
• Outcome: If all Reusers accept the change proposal, the CMS automatically overwrites the original content in the approved collection with the changed content. A system-generated email notification is sent to all content developers, letting them know that the workflow has been completed.
• Relink: If only some Reusers accepted the change proposal, the content librarian assigns a unique ID to the content, and the CMS automatically moves the variant to the approved collection. A system-generated email notification is sent to all content developers, reminding them to relink to the variant as needed.

Figure 3: Reused Content
Figure 4: Change Proposal Workflow
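The outcome rule of the workflow above can be summarized in a short Python sketch: unanimous acceptance overwrites the approved original, while partial acceptance produces a variant under a new ID. The function name, the review structure, the "-v2" suffix, and the all-reject branch are illustrative assumptions; in our CMS the content librarian assigns the actual unique ID, and notifications are handled by the workflow engine.

```python
# A simplified, hypothetical model of the change-proposal outcome rule.
def resolve_proposal(original_id, reviews):
    """reviews maps each Reuser's name to True (accept) or False (reject)."""
    if all(reviews.values()):
        # Outcome step: every Reuser accepted, so the CMS overwrites
        # the original content in the approved collection.
        return {"action": "overwrite", "id": original_id}
    if any(reviews.values()):
        # Relink step: partial acceptance, so the change becomes a
        # variant under a unique ID (suffix invented for illustration).
        return {"action": "variant", "id": original_id + "-v2"}
    # Assumed behavior when nobody accepts: the approved content stands.
    return {"action": "reject", "id": original_id}

print(resolve_proposal("S-001", {"alice": True, "bob": True}))
print(resolve_proposal("S-001", {"alice": True, "bob": False}))
```

The point of the split between "overwrite" and "variant" is that a Reuser who rejected a change is never forced onto the new wording; her deliverables keep referencing the original while accepting Reusers relink to the variant.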
If, on the other hand, query results show that new content needs to be created, the content developer can do so. Adding the metadata elements to the new content will help ensure that other content developers can locate the content for future use.

Summary and Conclusion

Proving the soundness of metadata in our case entailed extensive collaboration and testing among team members. Key areas of focus included:

• Checking and rechecking that our metadata values were entered into the tools correctly.
• Ensuring high usability in the tools and in written processes so that team members could add metadata easily.
• Configuring our tools to easily locate metadata and to indicate if values and elements were missing.
• Proving the concept that our metadata would enable us to locate content effectively for the purpose of reuse.

The ultimate test of success is verifying that implementing metadata allows your organization to achieve the goals you identified at the outset.

There is little guidance available on how to develop a metadata strategy. While some industries have developed specifications tailored to their content, others seem to be starting at square one. Technical communication, as it relates to the computer industry, could benefit from more substantial models to follow. Having a specification as a starting point would help companies get started with their metadata strategy by providing a list of components, possible values, and the pros and cons of building upon this as a foundation.

References

1. Baca, Murtha, ed. Introduction to Metadata, 2nd Edition. Los Angeles, CA: Getty Publications, 2008.
2. Dublin Core Metadata Initiative. "Dublin Core Metadata Element Set, Version 1.1." 2008. Retrieved on 20 Mar. 2009. http://dublincore.org/documents/dces/.
3. International Organization for Standardization. "ISO 15836:2009, Information and documentation - The Dublin Core metadata element set." 2009.
Retrieved on 21 Feb. 2009. http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=52142.
4. American National Standards Institute. "The Dublin Core Metadata Element Set." ANSI/NISO Z39.85. Bethesda, MD: NISO Press, 2007. Retrieved on 8 Jun. 2008. http://www.niso.org/kst/reports/standards/kfile_download?id%3Austring%3Aiso-8859-1=Z39-85-2007.pdf&pt=RkGKiXzW643YeUaYUqZ1BFwDhIG4-24RJbcZB-Wg8uE4vWdpZsJDs4RjLz0t90_d5_ymGsj_IKVa86hjP37r_hFEijh12LhLqJw52B-5udAaMy22WJJl0y5GhhtjwcI3V.
5. American National Standards Institute. "Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies." ANSI/NISO Z39.19-2005. Bethesda, MD: NISO Press, 2005. Retrieved on 9 Jun. 2008. http://www.niso.org/kst/reports/standards?step=2&gid=None&project_key=7cc9b583cb5a62e8c15d3099e0bb46bbae9cf38a.
6. Rockley, Ann. Managing Enterprise Content: A Unified Content Strategy. Indianapolis, IN: New Riders Publishing, 2002.
7. National Information Standards Organization. "Understanding Metadata." Bethesda, MD: NISO Press. Retrieved on 21 Feb. 2009. http://www.niso.org/publications/press/UnderstandingMetadata.pdf.
8. Boiko, Bob. The Content Management Bible, 2nd Edition. Indianapolis, IN: Wiley Publishers, 2005.
About the TIMAF Library

This 'Information Management Best Practices' book is a publication in the TIMAF Library. The publications in the TIMAF Library are aimed at promoting information management and are published on behalf of TIMAF. TIMAF, the Information Management Foundation, is an initiative of information management practitioners to provide a strong and clear foundation of information management. TIMAF encourages authors from around the world, who are experts in their particular information management sub-discipline, to contribute to the development of the TIMAF publications. Are you interested in sharing your ideas and experiences online with the TIMAF community? Visit www.timaf.org and join the discourse. Have you experienced the merits of a specific approach to information management strategies, methodologies, technologies or tools? Submit a proposal according to the requirements listed in the 'Call for Best Practices' at www.timaf.org.

The following publications are available in the TIMAF Library:

Introduction books
Information Management Framework (paper edition, release: September 2011)

Best Practices books
Information Management Best Practices 2009 Sneak Preview (online edition, www.timaf.org)
Information Management Best Practices – Volume 1 (paper edition, ISBN 978-94-90164-03-4)

Pocket Guides
Information Management Framework – A Pocket Guide (paper edition, release: November 2011)

Social Networks
Information Management Framework Wiki (www.timaf.org/wiki)

We will publish new books and other instruments on a regular basis. For further enquiries about the Information Management Library, please visit www.timaf.org or send an email to info@timaf.org.
Introduction

Information? Manage!

Information is the term we use to stand for all forms of preserved communication that organizations care to produce, store and distribute. If we communicate it and record it, it is information. So, for us, information is anything from sales figures in a database to a video on philosophy viewed on a mobile phone. We define information management as the organized collection, storage and use of information for the benefit of an enterprise.

Our definitions are intentionally wide enough to cover content, document, asset, data, records and all other 'information managements' that organizations do. We believe that while each of these "sub-disciplines" has its own tools and types of information, there is much more that unites them than divides them. Our definitions are intentionally quite practical. For us, information management simply means moving pieces of recorded communication from creation to consumption to retirement. Our definitions are crafted to carve out a niche for the information manager. Information managers make sure that recorded communication can be amassed and distributed in a way that benefits their organization. Finally, our definitions are crafted to be a simple guiding principle. Any person working on any information project can use this definition to remain focused on the ultimate aim of their particular kind of work.

Information Management? TIMAF!

The field of information management is currently fractured and incoherent. Each sub-discipline (content, document, asset, data, and records management, to name just a few) has its own practitioners, applications and professional communities. We believe that behind the seeming differences between these 'managements' there is a deeper unity that will eventually define a strong and clear foundation for all of them.
We do not believe that all managements will or should merge, but rather that, just as business underlies a variety of business practices including accounting and finance, there is a common foundation for the various forms of information management. The Information Management Foundation (TIMAF) tries to provide this foundation by publishing these information management best practices. In addition, TIMAF develops and maintains an information management framework that brings the commonalities between sub-disciplines to light and helps to organize the best practices that we publish.
Best Start? Best Practice!

Just as business is practiced within a more specific context, information management is also practiced in context. Thus, we believe that the best way to illustrate the concepts and practices of information management is within the context of one or more sub-disciplines. So, this best practices book tries to show global principles of information management in the context of projects in one or more of the sub-disciplines.

This is the first volume of 'Information Management Best Practices.' In future publications we will provide an ongoing compilation of high-quality best practice guidance, written for and by experts in the information management field from around the world. These best practices are designed to help professionals overcome their information management challenges. They bring complex models down to earth, with practical guidance on tough problems. In this volume, practitioners describe nineteen projects that you can learn from. In return, we ask that you let us learn from you! Please let us know what your experiences are with these or other projects at www.timaf.org.
Colophon

Title: TIMAF Information Management Best Practices – Volume 1
Editors: Bob Boiko (USA), Erik M. Hartman (NL)
Copy-editors: Jonah Bull (USA), Jenny Collins (USA), Elishema Fishman (USA)
Publisher: Erik Hartman Communicatie (NL)
Edition: Volume 1 – 1st impression – November 2010
ISBN: 978-94-90164-03-4
Design & Layout: Nevel Karaali (NL)
Print: Wöhrmann Print Service (NL)

© 2010, TIMAF. All rights reserved. No part of this publication may be reproduced in any form by print, photo print, microfilm or any other means without written permission from the publisher. Although this publication has been composed with much care, neither author, nor editor, nor publisher can accept any liability for damage caused by possible errors and/or incompleteness in this publication.

TRADEMARK NOTICE
TIMAF® is a Registered Trade Mark and Registered Community Trade Mark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office.

Please contact the editors for ideas, suggestions and improvements at info@timaf.org.