Taxonomy 101: Classifying DITA Tasks


DITA Tasks are often the most valuable content we create, especially when we present them in support portals. But if end users can’t find them, they have no value. Avoiding that requires classifying them with metadata and labels from a standard taxonomy.

Taxonomy and metadata can seem like scary or complex turf to the uninitiated – but they don’t have to be. In this 40-minute webinar, Paul Wlodarczyk will walk you through a simple process to begin to assemble a basic taxonomy of controlled vocabularies for tagging your DITA Tasks.

You will learn:

  • The most critical metadata for classifying tasks, regardless of your industry
  • How to use tools that you already own to build your taxonomy
  • Simple rules for keeping your terms consistent
  • How to use existing lists of terms so you don’t have to build a taxonomy from scratch

Published in: Technology, Education

Taxonomy 101: Classifying DITA Tasks

  1. easyDITA How-To Series: Taxonomy 101: Classifying DITA Tasks
     Paul Wlodarczyk, CEO, Jorsek LLC
     June 28, 2012
  2. Poll: Please complete while folks arrive
     How are you delivering DITA Tasks or other procedural / how-to content?
     • Portal with advanced / faceted search
     • Static web pages or web help
     • Print / PDF
     • Windows Help
     • Other
     6/28/2012 © Jorsek, LLC. All Rights Reserved.
  3. Why talk about task-oriented content?
     Task-oriented content is valuable:
     • It is versatile and can be reused in more deliverables than conceptual content
         – Product user guides
         – Context-sensitive help
         – Knowledge base
         – Support
         – Training
     • It’s what most users are searching for in a knowledge base or help
     Image caption: A DITA Task published to MindTouch
  4. Benefits of using DITA for authoring tasks
     • Tasks authored in DITA are
         – Concise
         – Consistent
         – Modular
         – Semantic
     • DITA Tasks make good templates for content contributed by SMEs (like product engineers)
     • For software UA in particular, task-oriented content is perfect for QA. The task becomes the Test Case.
     Image caption: The XML Source for a DITA Task
  5. Anatomy of a DITA Task
     • Title
     • Short description
     • Context
     • Prerequisite
     • Step section
     • Step
     • Command
     • Sub Step
     • Step Info
     • Step Result
     • Step Example
     • Choice and Choice Table
     • Example
     • Post-requisite
     • Result
     Image caption: A DITA Task in easyDITA
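The parts listed on this slide map onto DITA task elements such as `<shortdesc>`, `<prereq>`, `<steps>`, `<cmd>`, and `<postreq>`. The following is a minimal sketch in Python, not taken from the webinar: the task content is invented for illustration, and it uses only the standard-library XML parser to show how the structure can be read programmatically.

```python
import xml.etree.ElementTree as ET

# Hypothetical task; the id and wording are invented for illustration.
TASK_XML = """<task id="clear-paper-jam">
  <title>Clearing a paper jam</title>
  <shortdesc>Remove jammed paper from the copier.</shortdesc>
  <taskbody>
    <prereq>Turn off the copier.</prereq>
    <context>Paper jams can occur in the envelope tray.</context>
    <steps>
      <step>
        <cmd>Open the front cover.</cmd>
        <info>The cover latch is on the right side.</info>
        <stepresult>The paper path is visible.</stepresult>
      </step>
      <step>
        <cmd>Pull the jammed sheet out.</cmd>
      </step>
    </steps>
    <result>The jam indicator turns off.</result>
    <postreq>Close the cover and turn the copier on.</postreq>
  </taskbody>
</task>"""

task = ET.fromstring(TASK_XML)

# Each step's command is its own element, not just a list item.
commands = [cmd.text for cmd in task.iter("cmd")]

# The task body is a sequence of named sections.
sections = [child.tag for child in task.find("taskbody")]

print(commands)
print(sections)
```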
  6. DITA Tasks are semantic
     • DITA tasks are inherently semantic
         – Not simple ordered lists
         – Not simple paragraphs
     • This is useful for
         – Dynamic rendition, e.g.
             • Expand / collapse steps
             • Interactive UI controls
         – Semantic search in the context of the structure, e.g.
             • Find STEPS that contain MENU CASCADES
             • Find STEP INFORMATION that contains IMAGES tagged with [text]
             • Find PREREQUISITES that contain [text]
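As a rough illustration of the structure-aware searches named on this slide, the sketch below finds steps that contain a menu cascade and prerequisites that contain a given text. It is an assumption-laden example, not the search engine of any particular CMS: the sample task and the helper names (`full_text`, `steps_with_menu_cascades`, `prerequisites_containing`) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical task used only to demonstrate the queries.
TASK_XML = """<task id="publish-map">
  <title>Publishing a DITA Map to PDF</title>
  <taskbody>
    <prereq>Install the DITA Open Toolkit.</prereq>
    <steps>
      <step><cmd>Open the map.</cmd></step>
      <step>
        <cmd>Choose <menucascade><uicontrol>File</uicontrol><uicontrol>Publish</uicontrol></menucascade>.</cmd>
      </step>
    </steps>
  </taskbody>
</task>"""


def full_text(element):
    """Return all text inside an element with whitespace normalized."""
    return " ".join("".join(element.itertext()).split())


def steps_with_menu_cascades(root):
    """Find STEPS that contain MENU CASCADES at any depth."""
    return [s for s in root.iter("step") if s.find(".//menucascade") is not None]


def prerequisites_containing(root, text):
    """Find PREREQUISITES whose text contains the search string (case-insensitive)."""
    needle = text.lower()
    return [p for p in root.iter("prereq") if needle in full_text(p).lower()]


root = ET.fromstring(TASK_XML)
print(len(steps_with_menu_cascades(root)))
print([full_text(p) for p in prerequisites_containing(root, "open toolkit")])
```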
  7. Making tasks more findable with metadata
     • Q: How can we make content even more findable
         – For authors and content managers?
         – For end users in a dynamic delivery system?
     • A: Tag tasks with semantic metadata
         – Semantic = “meaning”
         – Metadata can be set with terms from controlled vocabularies defined and managed in a taxonomy
  8. What is Metadata?
     • Literally “Data about the data”
     • Also known as “tags”
         – Not to be confused with the content itself (e.g. XML structure)
         – Can be embedded in a file (e.g. the DITA Prolog or attributes; JPEG image data) or associated in a CMS
     • Two main flavors:
         – Administrative metadata
             • e.g. Content Type, Author, Date Modified, Version, Title, etc.
             • Usually system-generated
             • What the content is
         – Descriptive metadata
             • Subject classification, keywords, etc.
             • Usually manually authored
             • What the content is about
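To make the two flavors concrete, here is a small sketch that reads metadata embedded in a DITA prolog and separates administrative values from descriptive ones. The topic, author name, and the `read_metadata` helper are hypothetical; which fields count as administrative or descriptive follows the slide's split and is a judgment call in practice.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic: the id, author, and date are invented for illustration.
TOPIC_XML = """<task id="clear-paper-jam">
  <title>Clearing a paper jam</title>
  <prolog>
    <author>Pat Example</author>
    <critdates>
      <created date="2012-06-28"/>
    </critdates>
    <metadata>
      <audience type="user" experiencelevel="novice"/>
      <category>Maintenance</category>
      <keywords>
        <keyword>paper jam</keyword>
        <keyword>envelope tray</keyword>
      </keywords>
    </metadata>
  </prolog>
  <taskbody/>
</task>"""


def read_metadata(topic_xml):
    """Split prolog metadata into administrative and descriptive dictionaries."""
    root = ET.fromstring(topic_xml)
    prolog = root.find("prolog")
    # Administrative: what the content is.
    administrative = {
        "title": root.findtext("title"),
        "author": prolog.findtext("author"),
        "created": prolog.find("critdates/created").get("date"),
    }
    # Descriptive: what the content is about.
    descriptive = {
        "audience": prolog.find("metadata/audience").get("experiencelevel"),
        "categories": [c.text for c in prolog.iter("category")],
        "keywords": [k.text for k in prolog.iter("keyword")],
    }
    return administrative, descriptive


administrative, descriptive = read_metadata(TOPIC_XML)
print(administrative)
print(descriptive)
```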
  9. Key Concept: Taxonomy
     taxonomy n. A categorization scheme for concepts, often hierarchical
     • Most often, taxonomies show “is a” relationships, e.g. a mammal is a vertebrate, a rodent is a mammal, etc.
     • Navigation up and down the tree yields broader term (BT) and narrower term (NT) classification
         – Can be used to adjust search scope
     • Can also show related terms (RT)
         – Can be used to suggest related searches / “see also”
     • Can manage synonyms (UF, Used For)
         – Can be used to find content when search terms are not the preferred terms
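The BT, NT, RT, and UF relationships on this slide can be sketched as a small in-memory data structure. This is an illustrative model only: the `Term` class, the sample terms, and the helper functions are assumptions, and a real taxonomy tool would store and query these relationships differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Term:
    label: str                                          # preferred term
    broader: Optional[str] = None                       # BT: parent term, if any
    related: List[str] = field(default_factory=list)    # RT: "see also" terms
    used_for: List[str] = field(default_factory=list)   # UF: non-preferred synonyms


# Hypothetical sample taxonomy built from the slide's "is a" example.
TERMS = {t.label: t for t in [
    Term("Vertebrates"),
    Term("Mammals", broader="Vertebrates"),
    Term("Rodents", broader="Mammals", used_for=["Rodentia"]),
    Term("Mice", broader="Rodents", related=["Rats"], used_for=["Mouse"]),
    Term("Rats", broader="Rodents", related=["Mice"]),
]}


def preferred(label):
    """Map a search term to its preferred term via UF synonyms, or None."""
    lowered = label.lower()
    for term in TERMS.values():
        if lowered == term.label.lower() or lowered in (s.lower() for s in term.used_for):
            return term.label
    return None


def narrower(label):
    """NT: the direct children of a term, in alphabetical order."""
    return sorted(t.label for t in TERMS.values() if t.broader == label)


def broader_chain(label):
    """BT: walk up the tree from a term to the root."""
    chain = []
    current = TERMS[label].broader
    while current is not None:
        chain.append(current)
        current = TERMS[current].broader
    return chain


def expand(label):
    """Widen search scope: the term plus all of its narrower descendants."""
    result = [label]
    for child in narrower(label):
        result.extend(expand(child))
    return result


print(preferred("mouse"))
print(broader_chain("Mice"))
print(expand("Mammals"))
```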
  10. Using Taxonomy for controlled vocabularies
      • A taxonomy is the “source of truth” for what terms to use for various concepts, so terms are consistent.
      • Taxonomy terms can be used as controlled vocabularies (“pick-lists”) for metadata, so authors simply select preferred terms
          – Avoids typos, duplicates, word form variations, use of non-preferred terms
      • Some content management systems enable controlled vocabularies from taxonomies to be used for setting attribute values in DITA (e.g. select-atts like Audience, Product, Platform, etc.).
      • Relationships between terms in a taxonomy can improve search
          – CMS search and site search indexing tools can use equivalent and related terms to find content that does not contain the search term
          – Relationships between terms can be expressed as RDF in HTML content for improving web search indexing
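One way to picture a pick-list in action is a check that flags DITA attribute values that are not in the controlled vocabulary. The sketch below is hypothetical: the allowed values and the `invalid_values` helper are invented, and it only assumes that `audience`, `product`, and `platform` hold space-separated values.

```python
import xml.etree.ElementTree as ET

# Hypothetical pick-lists; in practice these would come from the taxonomy.
CONTROLLED = {
    "audience": {"casual-user", "service-technician"},
    "product": {"MFD100", "P1000"},
    "platform": {"windows", "linux"},
}


def invalid_values(element):
    """Return (attribute, value) pairs that are not in the controlled vocabulary."""
    problems = []
    for attribute, allowed in CONTROLLED.items():
        # Attribute values are treated as space-separated lists.
        for value in element.get(attribute, "").split():
            if value not in allowed:
                problems.append((attribute, value))
    return problems


tagged = ET.fromstring('<task id="t1" audience="casual-user technician" product="MFD100"/>')
print(invalid_values(tagged))
```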
  11. Simple framework for tagging tasks
      • In any industry, we’re all trying to help people do something to something in a context:
          – Who is doing what to what (+ other important context or condition)
      • Examples
          – Junior Service Technician doing preventive maintenance on Acme Jetpack XR7 that uses nitrous oxide injection technology
          – Casual User clearing paper jam on MFD100 Copier with envelope tray option
          – Case Worker performing an intake interview for a recently unemployed person in New York State
          – Intermediate User publishing a DITA Map using DITA OT to PDF format
          – Financial analyst calculating a WACC for a publicly traded company located in a country using GAAP accounting
          – Registered Nurse administering medication to patient in the ICU and drug is a controlled substance
          – Contract Service Technician doing diagnosis on P1000 Printer showing missing sections of the printed image
  12. What metadata do you need?
      Information about the Performer, Activity, Object, and Context will help narrow search results for a user or author (see our blog post on Metadata 101: A Search First Approach)
      • Performer metadata:
          – Types of users (roles, experience, education level, etc.)
          – Types of employees (title, training, certifications, clearance, department, skill level, etc.)
          – Types of customers
      • Activity metadata:
          – Broad Task Types (e.g. for service: maintenance, diagnosis, repair, calibration, startup, etc.)
          – High Level Task names from a performance analysis / instructional design
          – Competencies from a model
          – Commercial Services listing
  13. What metadata do you need?
      • Object (i.e. “To what / to whom”) metadata:
          – Things: Product, product components, product subsystems
          – People: Types of customers or clients
      • Context metadata:
          – Market / locale
          – Product options
          – Technologies
          – Special situations
          – Tools required
          – Security classification
          – Symptoms / Fault codes
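The Performer, Activity, Object, and Context framework can be sketched as four facets on each task, with a search that narrows results by any combination of them. This is a minimal illustration: the `TaggedTask` class, the `find_tasks` function, and the activity labels assigned to each example are assumptions, loosely based on the examples in slide 11.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedTask:
    title: str
    performer: str                    # who
    activity: str                     # is doing what
    obj: str                          # to what / to whom
    context: frozenset = frozenset()  # other conditions


# Hypothetical tagging of three of the slide's examples.
TASKS = [
    TaggedTask("Clearing a paper jam", "Casual User", "Repair",
               "MFD100 Copier", frozenset({"Envelope tray option"})),
    TaggedTask("Diagnosing missing image sections", "Contract Service Technician",
               "Diagnosis", "P1000 Printer",
               frozenset({"Missing sections of the printed image"})),
    TaggedTask("Publishing a DITA Map to PDF", "Intermediate User", "Publishing",
               "DITA Map", frozenset({"DITA OT", "PDF format"})),
]


def find_tasks(tasks, performer=None, activity=None, obj=None, context=None):
    """Return titles of tasks matching every facet that was supplied."""
    results = []
    for task in tasks:
        if performer is not None and task.performer != performer:
            continue
        if activity is not None and task.activity != activity:
            continue
        if obj is not None and task.obj != obj:
            continue
        if context is not None and context not in task.context:
            continue
        results.append(task.title)
    return results


print(find_tasks(TASKS, obj="P1000 Printer"))
print(find_tasks(TASKS, performer="Intermediate User", context="PDF format"))
```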
  14. Do we have to create these terms from scratch?
      No! You are surrounded by free sources for term lists, many of them governed and authoritative. Don’t reinvent; borrow! Here are some common sources of terms:
      • Corporate ECM or Web taxonomy (from IT or marketing)
      • Industry-specific taxonomies (e.g. MeSH for life sciences, DSM for mental health)
      • Government taxonomies (e.g. UK IPSV - Integrated Public Sector Vocabulary)
      • Generic public domain taxonomies (e.g. People, Places, and Cultures; AP News)
      • Other corporate sources:
          – Training group (competency models, task analyses)
          – HR (Job codes and Job Titles)
          – Support / field service systems (Parts, fault classifications, failure modes, tools used)
          – CRM data (Customer names, Customer categories, SKUs, Products & Services)
          – Product data (Product BOMs, platforms, parts, subsystems, options)
          – Organization Charts (Divisions, departments, locations, budget centers)
          – Business Process Analysis (process names and steps, inputs and outputs)
  15. Taxonomy Tools
      • You can build and manage a simple taxonomy in Microsoft Excel
      • Even if authors manually tag metadata, the Excel taxonomy can be a useful guide and source of terms to copy/paste
      • Each row is a term and each column is a level in the hierarchy
      • Put other data required for related and equivalent terms in columns to the right of the preferred term hierarchy
      • Add a column for scope notes
      • Use Grouping to help expand / collapse sections of a long taxonomy
      • If you have a CMS or other tool that consumes taxonomy, you can export a CSV file from Excel and import it to the CMS (see Mary Garcia’s excellent TaxoDiary blog posts, listed under Resources, to learn how)
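The spreadsheet layout described on this slide (one term per row, one column per hierarchy level, extra columns to the right) can be read back from a CSV export with a few lines of Python. The sketch below is an assumption about one plausible layout: the column names, the sample terms, the semicolon separator for synonyms, and the `load_taxonomy` helper are all hypothetical, and it assumes no row skips a level.

```python
import csv
import io

# Hypothetical CSV export: three hierarchy columns, then synonyms and scope notes.
CSV_EXPORT = """Level 1,Level 2,Level 3,Used For,Scope Note
Products,,,,All sellable items
,Printers,,Printing devices,
,,P1000 Printer,P1000,
,Copiers,,,
,,MFD100 Copier,MFD100,Multifunction device
"""


def load_taxonomy(text, levels=3):
    """Turn level-per-column rows into terms with a broader (parent) term."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    path = []     # preferred terms from the root down to the current row
    terms = []
    for row in reader:
        cells = row[:levels]
        # The term's level is the first non-empty hierarchy column.
        depth = next((i for i, cell in enumerate(cells) if cell.strip()), None)
        if depth is None:
            continue  # no term on this row
        label = cells[depth].strip()
        path = path[:depth] + [label]
        terms.append({
            "term": label,
            "broader": path[-2] if len(path) > 1 else None,
            "used_for": [s.strip() for s in row[levels].split(";") if s.strip()],
            "scope_note": row[levels + 1].strip(),
        })
    return terms


terms = load_taxonomy(CSV_EXPORT)
for entry in terms:
    print(entry["term"], "<-", entry["broader"])
```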
  16. Taxonomy Tools
      • Consider using a Taxonomy Management System if:
          – You have a large taxonomy (over 500 terms)
          – The taxonomy changes often
          – You have a complex governance process for approving new terms
          – The taxonomy needs to be consumed by more than one system
          – You are using term relationships to improve search indexing
  17. Guidelines for taxonomy quality
      • The hierarchy should reflect any of three relationships:
          – Generic (e.g. Vehicle → Car)
          – Instance (e.g. Mountain regions → Rockies)
          – Whole-Part (e.g. House → Roof)
      • Terms should be nouns or noun phrases.
      • Activities should be nouns or gerunds.
      • Avoid adjectives and prepositions unless integral to the term.
      • When in doubt between singular and plural, choose plural; these are categories. Singular is OK for instances at the narrow end.
      • Named entities should be proper nouns.
      • Avoid punctuation and ampersands. Eliminate hyphens except where the term is confusing or unclear without them.
      • Make the most commonly used term the preferred term, even if it is an acronym (e.g. NASA). Make other forms Equivalent Terms.
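A few of these guidelines are mechanical enough to check automatically. The sketch below covers only the punctuation, ampersand, hyphen, and whitespace rules; the `check_term` function and its warning messages are hypothetical, and rules about parts of speech or singular versus plural still need a human reviewer.

```python
import re


def check_term(term):
    """Return a list of warnings for a candidate taxonomy term."""
    warnings = []
    if "&" in term:
        warnings.append("contains an ampersand")
    if re.search(r"[.,;:!?()/]", term):
        warnings.append("contains punctuation")
    if "-" in term:
        # Hyphens are allowed only when the term is unclear without them.
        warnings.append("contains a hyphen")
    if term != term.strip() or "  " in term:
        warnings.append("has stray whitespace")
    return warnings


for candidate in ["Printers", "Parts & Tools", "Start-up", "Fault codes (legacy)"]:
    print(candidate, check_term(candidate))
```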
  18. Poll: Are you currently using controlled vocabularies for any of the following?
      • CMS Metadata
      • DITA Attributes
      • Prolog Metadata and Keywords
      • Other
      • Not using controlled vocabularies
  19. Resources
      • LinkedIn Taxonomy Community of Practice
      • ANSI/NISO Z39.19-2005 - Guidelines on Construction, Format, and Management of Monolingual Controlled Vocabularies
      • IBM Presentation: Writing Effective DITA Task Topics
      • TaxoDiary blog posts by Mary Garcia: Maintaining a Thesaurus in an Excel Workbook (two parts)
          – workbook/
          – workbook-part-2/
      • easyDITA blog posts and Twitter (@easydita)
  20. Thank you!
      • Questions?
      • Recorded webcast will be available soon through our website; you will get an email with the link
      • Anyone can register after the event to view the recording
      • Slides will be available on SlideShare
      • Next webcast July 25, featuring Amber Swope of DITA Strategies discussing Using Taxonomy for DITA Content. Please join us!