DIY Content Management
Chapter 1
Content Management from the Ground Up
Intro
Clay Helberg
Principal Writer at SPSS Inc.
Worked as a statistical consultant for the University of Wisconsin Medical School
before joining SPSS in 1996.
Has worn many hats at SPSS, including statistician, technical writer, training
instructor, and most recently, XML architect and programmer.
Has worked with XML and XSLT since 2000, with Epic since 2002, and with
XSL-FO since 2003.
Maintains an elaborate Rube-Goldberg-esque system of tools to provide authoring
and production support for technical publications at SPSS.
Telecommutes from St. Cloud, MN.
SPSS Inc.
Statistical (data analysis) software company
Flagship products: SPSS, Clementine
Used in academic research, business analytics, marketing, data mining
$210M revenue for 2002
Diverse user base, ranging from data entry staff to university professors
XML Use
Documentation is in XML (custom DTD) stored as files in source control
Generate various output formats that ship with our software: online help (HTML
and CHM), tutorials, and print/PDF
Why Content Management?
Why do we want Content Management? The usual reasons:
Reuse
Consistency
Availability of content
Workflow
Our Setting
Publications Department
Writers
Nine writers in home office
Editors
Three full-time editors
Production
One full-time production manager
Remote offices
Four writers in Rochester, MN office
Two writers in Cambridge, MA office
Three writers in UK office
Four full-time telecommuters
Other departments
QA
Tech support
Marketing
Design
Our dream: collaboration across departments to create and use content
Why a Homebrew System?
There are lots of “canned” content management systems available. Why are we
building our own instead of just buying one?
We already have a source control system (IBM/Rational ClearCase), which solves
some of the trickier problems.
Prepackaged systems are expensive, especially in a time of tight budgets.
The cost of integrating a purchased system with our software build process would
offset the value of the system.
What We Get from Source Control
The source control system gives us a lot of the features of a CMS:
Versioning/history
Ownership data
Ability to attach triggers (event handlers) to content changes
Access control
What We Have to Add
Other features that we want, but don’t get “for free” with source control:
Search & Insert
Enhanced metadata
Company-wide access
Workflow management
Search & Insert
Find content by full-text search (FTS) and metadata
See where each content fragment is referenced
Insert reference to desired fragment in document
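The “used where” lookup and the autogenerated reference can be sketched roughly as follows. This is a minimal illustration, not the SPSS implementation: the `<include href="..."/>` element, the file names, and the in-memory `DOCS` dictionary are all hypothetical stand-ins for the custom DTD and the ClearCase-managed files.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from xml.sax.saxutils import quoteattr

# Hypothetical repository: file name -> XML source.
# <include href="..."/> is an assumed reference element, not the real DTD.
DOCS = {
    "install_guide.xml": '<doc><include href="license.xml"/>'
                         '<include href="sysreq.xml"/></doc>',
    "tutorial.xml": '<doc><include href="sysreq.xml"/></doc>',
    "sysreq.xml": "<fragment><p>System requirements.</p></fragment>",
    "license.xml": "<fragment><p>License terms.</p></fragment>",
}


def build_used_where(docs):
    """Map each referenced fragment to the sorted list of files that use it."""
    used_where = defaultdict(set)
    for name, source in docs.items():
        root = ET.fromstring(source)
        for ref in root.iter("include"):
            target = ref.get("href")
            if target:
                used_where[target].add(name)
    return {target: sorted(users) for target, users in used_where.items()}


def make_reference(target):
    """Build the (hypothetical) reference markup to insert into a document."""
    return f"<include href={quoteattr(target)}/>"


USED_WHERE = build_used_where(DOCS)
print(USED_WHERE.get("sysreq.xml", []))
print(make_reference("sysreq.xml"))
```

Building the reverse map once, rather than scanning every file per query, keeps the lookup cheap when a writer wants to know what a change to a shared fragment will affect.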
Enhanced Metadata
Currently have version numbers, labels, and the person who created each version
Need to add:
Keywords (all files)
Project leader (project files)
Company-wide Access
Remote access to content
Remote transformation from XML to end-user documentation (various formats)
Remote editing of content
Workflow Management
Access control
Ownership of files and projects
Ability to route files for review, edit, production, etc.
Change tracking
Roll Out Plan
We’re planning to roll out our custom CMS solution in four phases:
Phase I: Search and Reuse
Phase II: Remote Access
Phase III: Workflow Management
Phase IV: Remote Editing
Phase I: Search and Reuse
Basics already implemented (demo)
Figure 1-1
CMS Search Dialog
Text search
Limit by version
List of hits
File details for selected hit
Link backtracking (“Used where”)
Autogenerate reference to selected file
Open selected file
Figure 1-2
Implementation of CMS search
Our implementation uses Lucene from the Apache project
(http://jakarta.apache.org/lucene/docs/index.html) to create the
searchable index of our content on the back end and to provide the search
function on the front end.
The ClearCase functions are controlled through ClearCaseJava, an open-source
project that in turn relies on the open-source Jacob project to provide the
required COM-Java bridge.
We use JavaScript to control the XUI search dialog, and Java to manage the interface
between the Lucene index, ClearCase, and Epic.
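To make the indexing idea concrete, here is a toy inverted index in plain Python. It is only a sketch of the general technique that a library such as Lucene implements far more capably, and it does not use the Lucene API. The class name `ToyIndex`, the sample topics, and the version labels are invented for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def extract_text(xml_source):
    """Return the text content of an XML document, ignoring the markup."""
    return " ".join(ET.fromstring(xml_source).itertext())


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ToyIndex:
    """Minimal inverted index: term -> set of file names containing it."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.versions = {}  # file name -> version label (assumed metadata)

    def add(self, name, xml_source, version):
        self.versions[name] = version
        for term in tokenize(extract_text(xml_source)):
            self.postings[term].add(name)

    def search(self, query, version=None):
        """Return files containing every query term, optionally by version."""
        terms = tokenize(query)
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        if version is not None:
            hits = {h for h in hits if self.versions[h] == version}
        return sorted(hits)


index = ToyIndex()
index.add("regression.xml",
          "<topic><title>Linear Regression</title><p>Fit a linear model.</p></topic>",
          "12.0")
index.add("anova.xml",
          "<topic><title>ANOVA</title><p>Compare means with a linear model.</p></topic>",
          "11.5")
print(index.search("linear model"))
print(index.search("linear model", version="12.0"))
```

Indexing only the text nodes is the simplest choice; a smarter parser could weight titles or skip boilerplate elements while building the index.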
Will eventually add:
More metadata options
Smarter parsing of XML content while building the index
Index compilation from dedicated server
Web-based searching (for more information, see “Phase II: Remote Access”)
Phase II: Remote Access
Web interface to content:
Searching
Check in/check out
Building end-user documentation from XML source
Editing would be done on user’s machine by downloading file(s) and opening with
local software (Epic, Word 2003, Emacs, etc.), then uploading when done.
Phase III: Workflow Management
Document status tracking (create, update, edit, format, complete)
Change tracking
Alerts and messages on status change
Access control (content owners can accept/reject changes made by others)
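One way to picture status tracking with alerts is a small state machine. The sketch below is an assumption-laden illustration rather than a design: it treats the five statuses as a strict linear sequence, lets only the owner advance a document (a simplification of the accept/reject idea), and uses hypothetical names such as `TrackedDocument` and `subscribe`.

```python
# Statuses taken from the list above; the strict ordering is an assumption.
STATUSES = ["create", "update", "edit", "format", "complete"]


class TrackedDocument:
    """A document with an owner, a status, a history, and alert listeners."""

    def __init__(self, name, owner):
        self.name = name
        self.owner = owner
        self.status = STATUSES[0]
        self.history = [STATUSES[0]]  # simple change tracking
        self.listeners = []           # callbacks notified on status change

    def subscribe(self, callback):
        """Register a callback(name, old_status, new_status)."""
        self.listeners.append(callback)

    def advance(self, user):
        """Move to the next status; only the owner may do so (assumed policy)."""
        if user != self.owner:
            raise PermissionError(f"{user} does not own {self.name}")
        position = STATUSES.index(self.status)
        if position == len(STATUSES) - 1:
            raise ValueError(f"{self.name} is already complete")
        old = self.status
        self.status = STATUSES[position + 1]
        self.history.append(self.status)
        for callback in self.listeners:
            callback(self.name, old, self.status)


messages = []
doc = TrackedDocument("regression.xml", owner="alice")
doc.subscribe(lambda name, old, new: messages.append(f"{name}: {old} -> {new}"))
doc.advance("alice")
print(messages)
```

In a real system the callbacks would send email or post messages, and the ownership check would defer to the access control already provided by source control.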
Phase IV: Remote Editing
Document editing via web interface
Similar to Contributor (but we can’t afford E3, so we’re exploring other options)
Implementation TBD