The document discusses converting legacy files to an XML/DITA compliant format using FrameMaker conversion tables. It provides an overview of the agenda which includes converting unstructured content to structured content using an EDD, conversion table, and structured template. It then demonstrates how to convert files with basic content like character and paragraph tags, and how to add support for images and tables. The document includes a demo of converting unstructured content to structured content using conversion tables, with samples that are easy to recreate but powerful in functionality.
Moving unstructured FrameMaker content to structure
1. Planning and implementing conversion
of legacy files to XML/DITA compliancy
Bernard Aschwanden
www.publishingsmarter.com
bernard@publishingsmarter.com
Migrating to XML with
FrameMaker Conversion Tables
21:06
1
@publishsmarter
2. The agenda
Convert content from unstructured to structured
EDD, conversion table, and a structured template
Using basic examples to get you started, this
session:
Convert files with content such as character tags and
paragraph tags
Add support for images and tables
Demo converting unstructured to structured using conversion
tables
Samples are easy to recreate, but complex and
powerful in functionality
3. Housekeeping and note taking
Not all slides or topics are
equally weighted
Use some, discard others
Slide speed varies as this is
a QUICK session
Questions? Ask along the
way!
I’d love to claim errors/typos
is on purpose… they isn’t,
ain’t, and weren’t never;
however, I’ll fix ‘em as I
can…
4. About your speaker
Publishing Smarter:
President
Content strategist,
publishing technologies
expert, author, and geek-
enough
Certified Technical Trainer
DITA
Content management
Topic-based writing
Society for Technical
Communication
(www.stc.org)
President
STC Associate Fellow
5. Standard disclaimer
In the interest of brevity I
will make some blanket
statements to keep it
simple
It’s not all 100% “the
truth”, but I’ll stay close
Purists may complain
And they are wrong!
(except when they are
right)
6. Major disclaimer
This is a quick session
There are LOTS of
samples in slides or
FrameMaker
Simple samples
Still complex ideas
Tricky to set things up
Happy to share files
To review/apply this
Watch the recording
Jot down “time stamps”
Cool item at 17:23
Excel formulas 18:57
Word updates 26:33
Then watch it again
Pause it, rewind, try it
Do this at your own pace
Slowly test with your
content
8. Legacy content and document review
Include analysis of legacy files
Identify what can stay and what needs to go.
Approach with flexibility
What structure to use?
Decide on the overall structural environment you want to work
with
Could include S1000D, DocBook and DITA
Can also build your own
Develop your FrameMaker support materials
EDD, conversion table, and a template at the least
9. ID a rule set
Use existing rules
If rules already exist, you have a solid starting point
Learn the rules and adapt your content to them
Build your own rules
If no rules exist, you can set your own from the start
Learn how to create the rules and build all the components
Hybrid approach
If you see a set of rules that look promising, learn about them
Find out how you need to adapt your content to match the
rules
If that does not work, then consider adapting the rules
10. Not chaos, but it’s at least
unstructured and needs work
Create structure from chaos
11. Method 1—Manually, element by element
Apply structural rules to your content
Manually wrap content such as text ranges and
tables
Continue to manually wrap contents of paragraphs
together in Para elements
Then wrap sequences of Head and Para elements in
Section elements
And so on until entire document is wrapped in single
highest-level element
12. Method 2—Automatically
Similar to adding structure manually
Apply rules to document objects below paragraph level
Then at paragraph level and through successively higher
levels
Stops at the root element, or when no more rules apply
Automatic wrapping requires a conversion table
Provides table of mappings to automate task of adding
structure to unstructured documents
Uses paragraph and character tags, and object types (such as
equations or footnotes), to identify how to wrap document
components in elements
Also specifies how to wrap child elements in parent elements
13. Let’s dive into it
Conversion tables
14. Conversion Table—Overview
Conversion table: rules for mapping content in
unstructured files to structured content.
Conversion table can be split up into several tables with text or
graphics in between for comments
Cannot have any tables other than conversion tables
Must be saved at least once before it can be used
Allows for iterative testing though
Can be in structured or unstructured document
15. Conversion Table—Organization
Organization of conversion table:
Regular table, with at least 3 columns and 1 body row
Additional columns and heading/footing rows can hold
comments
Each body row holds 1 rule
Column 1: specifies document object, child element, or sequence to wrap
Column 2: specifies element in which to wrap
Column 3: specifies optional qualifier (“nickname”) to use as temporary label
16. Conversion Table—Sample
Wrap this object    In this element    With this qualifier
P:Bullet            Item               Unordered
P:Numbered          Item               Ordered
Column 1: specifies document object, child element, or sequence to wrap
Column 2: names the element in which to wrap
Column 3: specifies optional qualifier (“nickname”) to use as temporary label
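To make the three-column layout concrete, here is a minimal Python sketch that models each body row as a small record. The class name `Rule`, its field names, and the `lookup` helper are hypothetical, chosen only for illustration; FrameMaker stores these rules in an ordinary table, not in code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One body row of a conversion table (hypothetical model)."""
    wrap: str                        # column 1: object, child element, or sequence
    element: str                     # column 2: element in which to wrap
    qualifier: Optional[str] = None  # column 3: optional temporary label


# The two sample rows from the slide.
rules = [
    Rule("P:Bullet", "Item", "Unordered"),
    Rule("P:Numbered", "Item", "Ordered"),
]


def lookup(rules, wrap):
    """Return the first rule whose column 1 equals `wrap`, or None."""
    for rule in rules:
        if rule.wrap == wrap:
            return rule
    return None
```

Both paragraph formats map to the same element (Item); only the qualifier tells them apart, which is what later rules rely on when wrapping items into lists.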
18. Conversion Table Production: Manual
You have full control. No automatically inserted
content. All the rules are specific to what you tell the
system. However, you have to be explicit. (I am not a
fan)
Wrap this object    In this element    With this qualifier
P:Head1             Head1
P:Head2             Head2
P:Body              Body
P:Code              Code
SV:Current Date     CurrentDate
C:Code              cCode
TC:                 CELL
TR:                 ROW
19. Conversion Table Production: Automatic
Autogenerated content, then develop more rules or tweak as
needed. Rules are based on content used in source files. (I like
this a lot more)
Use if you already have an unstructured document
Scans body page flows to ID every object that can be structured
Lists object type and format tag (if any) used in document
Maps object to element
Element tag named same as format tag
If object does not have format, element tag is a default name
for example: CELL or BODY
Removes parentheses and other characters to create valid element tag
Object type identifier in lowercase is prepended to duplicate
tags
Developer adds additional rules to:
Wrap elements in higher-level elements
Set attributes as elements are created
Wrap all elements in root element (by using root RE or by making
elements wrap up properly)
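The naming behavior described above can be sketched in Python. This is an illustrative model only, not FrameMaker's actual algorithm: the function names are hypothetical, the set of characters that are kept is an assumption, and only the default names CELL and ROW are taken from the slides.

```python
import re

# Default element names for objects without a format tag. CELL and ROW appear
# in the slides; other object types fall back to their code as a placeholder.
DEFAULT_NAMES = {"TC": "CELL", "TR": "ROW"}


def element_name(code, tag):
    """Derive an element tag from an object code and its format tag."""
    if not tag:
        return DEFAULT_NAMES.get(code, code)
    # Assumption: keep letters, digits, periods, underscores, and hyphens;
    # drop everything else (spaces, parentheses, and so on).
    return re.sub(r"[^A-Za-z0-9._-]", "", tag)


def generate_table(objects):
    """Map (code, tag) pairs to (column 1, column 2) rows."""
    rows, used = [], set()
    for code, tag in objects:
        name = element_name(code, tag)
        if name in used:
            # Duplicate tag: prepend the object type identifier in lowercase.
            name = code.lower() + name
        used.add(name)
        rows.append((f"{code}:{tag}", name))
    return rows
```

Given a paragraph format and a character format that are both called Code, this yields Code for the first and cCode for the second, matching the pattern shown in the manual table earlier.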
20. Number of conversion tables you need
Based on types of high level elements and
amount/quality of content
If documents are clear and short, with a single highest
level
Create unique conversion table for each document type and convert
in bulk
For example if your documents are already clearly defined as task,
reference or concept you can apply one of three conversion tables to
groups of files
If documents are clear, but long and with multiple highest
level
Create a single conversion table that covers as much as possible
and then divide up content as required, or;
Reorganize first, then you have clear, short files with one highest
level
If documents are scattered with content
Create a single conversion table that does initial work and then
manually rework the structure as needed, or;
22. Your first conversion table
1. Open document with objects you want to structure
2. Structure Tools > Generate Conversion Table
3. From the Generate Conversion Table dialog box,
select Generate New Conversion Table
4. Click Generate
23. Expected results
Unnamed conversion table appears with rules based
on objects in document and element tags based on
format tags (tags used in the file, not all in catalog)
Wrap this object    In this element    With this qualifier
P:Title             Title
P:Body              Body
P:Heading1          Heading1
P:Heading2          Heading2
P:Heading3          Heading3
C:Emphasis          Emphasis
X:See Heading       SeeHeading
M:Index             Index
M:Cross-Ref         Cross-Ref
24. Update a conversion table
Update to get a more complete list of objects (for
example, after parsing one chapter, you find a more
complete one)
1. Open document with objects you want to structure
2. Structure Tools > Generate Conversion Table
3. From the Generate Conversion Table dialog box, select
Update Conversion Table
4. From Update Conversion Table popup menu, choose a
previously saved and open conversion table to update
5. Click Generate
26. Rule Syntax—Character Restrictions
Case-Sensitivity in Tags
Format and element tags must be specified as defined in catalogs
Qualifier tags are case-sensitive; two occurrences of one qualifier
must match exactly
Special characters in Tags include ( ) & | , * + ? % [ ] :
In format tags and qualifier tags: allowed, but must be preceded by
a backslash (\) in the table
In element tags: not allowed
A space character in tags does not need to be preceded
with backslash (you can write tag Format A)
Wildcard character (%) in Tags
Use % in a format or element tag to match zero, one, or more
characters (similar to * in a general rule)
(for example, P:%Body matches paragraphs with format tag Body,
FirstBody, or BulletBody)
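The wildcard and backslash rules above can be modeled with a small Python sketch. It translates a tag as written in a rule into a regular expression; the function names `tag_pattern` and `matches` are hypothetical, and this only approximates the behavior the slide describes rather than reproducing FrameMaker's matcher.

```python
import re


def tag_pattern(tag):
    """Compile a tag from a rule into a regex; % matches zero or more characters.

    A backslash makes the next character literal, as described for special
    characters in format and qualifier tags.
    """
    parts = []
    i = 0
    while i < len(tag):
        ch = tag[i]
        if ch == "\\" and i + 1 < len(tag):
            parts.append(re.escape(tag[i + 1]))  # escaped special character
            i += 2
        elif ch == "%":
            parts.append(".*")                   # wildcard
            i += 1
        else:
            parts.append(re.escape(ch))          # ordinary character
            i += 1
    return re.compile("".join(parts))


def matches(rule_tag, format_tag):
    """True if the whole format tag matches the rule tag (case-sensitive)."""
    return tag_pattern(rule_tag).fullmatch(format_tag) is not None
```

With this model, `matches("%Body", "FirstBody")` is true, `matches("%Body", "Bodytext")` is false because the match must cover the whole tag, and `matches("Body", "body")` is false because tags are case-sensitive.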
27. Rule Syntax—Specifying What to Wrap
In Column 1 of the Conversion Table
1 or 2 letter code to ID item type
Type format name to narrow definitions
Object             Code    Additional Info After Code                Example
Paragraph          P:      Paragraph format tag                      P:Body
Text range         C:      Character format tag                      C:Emphasis
Table              T:      Table format tag                          T:Format A
Table title        TT:     (none)                                    TT:
Table heading      TH:     (none)                                    TH:
Table body         TB:     (none)                                    TB:
Table row          TR:     (none)                                    TR:
Table cell         TC:     (none)                                    TC:
System variable    SV:     Variable format name                      SV:Current Date
User variable      UV:     Variable format name                      UV:CompanyName
Graphic            G:      (none)                                    G:
Footnote           F:      Location of footnote                      F:Flow
Marker             M:      Marker type                               M:Index
Cross-reference    X:      Cross-reference format                    X:Heading Only
Text Inset         TI:     (none)                                    TI:
Equation           Q:      Equation size: Small, Medium, or Large    Q:Medium
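As a quick way to check column 1 entries against the codes listed above, the following Python sketch splits an entry into its object type and tag. The dictionary and the `parse_object` function are hypothetical helpers for illustration; element sequences that start with E: are deliberately not handled here.

```python
# Object codes from the table above, mapped to the object type they identify.
OBJECT_CODES = {
    "P": "Paragraph", "C": "Text range", "T": "Table", "TT": "Table title",
    "TH": "Table heading", "TB": "Table body", "TR": "Table row",
    "TC": "Table cell", "SV": "System variable", "UV": "User variable",
    "G": "Graphic", "F": "Footnote", "M": "Marker", "X": "Cross-reference",
    "TI": "Text Inset", "Q": "Equation",
}


def parse_object(spec):
    """Split a column 1 entry such as 'P:Body' into (object type, tag)."""
    code, sep, tag = spec.partition(":")
    if not sep or code not in OBJECT_CODES:
        raise ValueError(f"not an object specifier: {spec!r}")
    return OBJECT_CODES[code], tag
```

For example, `parse_object("SV:Current Date")` returns the object type "System variable" and the tag "Current Date", while an unknown code raises `ValueError`.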
28. Rule Syntax—Specifying the Wrapper
In Column 2 of the conversion table
Type object identifier E: (optional)
Followed by element tag
Wrap this object          In this element    With this qualifier
P:Body                    Para
C:ReportName              Report
T:Format Part             PartsTable
TT:                       TableTitle
TH:                       TableHeading
TB:                       TableBody
TR:                       PartsRow
TC:                       PartName
SV:Current Date (Long)    Date
UV:Customer               Customer
G:                        Graphic
F:Flow                    Footnote
M:Index                   IndexEntry
X:ElemNumTextPage         XRef
TI:                       Para
Q:Large                   EQ
29. Rule Syntax—Specifying a Qualifier
In Column 3, type qualifier (optional) for new element tag
Wrap this object    In this element    With this qualifier
P:Bullet            Item               Bullet
P:StepRestart       Item               Step1
P:Step              Item               Step
30. Rule Syntax—Identifying Sequence to Wrap
In Column 1 of the conversion table
Type E: for element, then the element tag
Type qualifier (optional) in brackets
Add more element tags with code identifiers and connectors (as in EDD)
Symbol               Meaning
Plus sign (+)        Item is required and can occur more than once
Question mark (?)    Item is optional and can occur once
Asterisk (*)         Item is optional and can occur more than once
Comma (,)            Items must occur in order given
Vertical bar (|)     Any one of items in sequence can occur
Parentheses          Beginning and end of sequence
Wrap this object                In this element    With this qualifier
P:Bullet                        Item               Bullet
P:StepRestart                   Item               Step1
P:Step                          Item               Step
E:Item[Bullet]+                 List
E:Item[Step1], E:Item[Step]+    List
E:Head, (Para | List)+          Section
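To show how a sequence rule groups sibling elements, here is a deliberately small Python sketch. It supports only comma-separated E: terms with an optional qualifier and an optional +, *, or ? quantifier; vertical bars and parentheses are not handled, and matching is greedy with no backtracking. All function names are hypothetical, and elements are modeled as plain tuples rather than anything FrameMaker uses internally.

```python
import re

# One term: E:Tag, optional [Qualifier], optional quantifier.
TERM = re.compile(r"E:(\w+)(?:\[(\w+)\])?([+*?]?)$")


def parse_sequence(spec):
    """Parse 'E:Item[Step1], E:Item[Step]+' into (tag, qualifier, quantifier)."""
    terms = []
    for part in spec.split(","):
        m = TERM.match(part.strip())
        if m is None:
            raise ValueError(f"unsupported term: {part!r}")
        terms.append((m.group(1), m.group(2), m.group(3)))
    return terms


def match_at(items, start, terms):
    """Return the index just past a greedy match of `terms`, or None."""
    pos = start
    for tag, qualifier, quant in terms:
        count = 0
        # Compare only (tag, qualifier); wrapped items carry children as well.
        while pos < len(items) and items[pos][:2] == (tag, qualifier):
            pos += 1
            count += 1
            if quant in ("", "?"):
                break  # these terms match at most once
        if count == 0 and quant in ("", "+"):
            return None  # a required term is missing
    return pos


def wrap(items, spec, parent):
    """Replace each run matching `spec` with (parent, None, children)."""
    terms = parse_sequence(spec)
    out, i = [], 0
    while i < len(items):
        end = match_at(items, i, terms)
        if end is None or end == i:
            out.append(items[i])
            i += 1
        else:
            out.append((parent, None, items[i:end]))
            i = end
    return out
```

Applying `wrap(items, "E:Item[Bullet]+", "List")` and then `wrap(result, "E:Item[Step1], E:Item[Step]+", "List")` mirrors the two List rows in the table: each run of bullet items becomes one List, and each Step1 item followed by Step items becomes another.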
31. Rule Syntax—Adding Attributes to Elements
Optional in Column 2 of the Conversion Table
Type attribute name and value in brackets after element tag
Separate name and value with equal sign, and enclose value
in double quotation marks
Wrap this object                In this element             With this qualifier
P:Bullet                        Item                        Bull
P:StepRestart                   Item                        Step1
P:Step                          Item                        Step
E:Item[Bull]+                   List [Type = "Bulleted"]
E:Item[Step1], E:Item[Step]+    List [Type = "Numbered"]
E:Head, (Para | List)+          Section
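The attribute syntax in column 2 is simple enough to parse with a short Python sketch. It assumes straight double quotation marks around each value and a single bracketed group after the element tag; `parse_wrapper` is a hypothetical helper, not part of any FrameMaker API.

```python
import re

# name = "value" pairs inside the brackets; values use straight double quotes.
ATTR = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def parse_wrapper(cell):
    """Split 'List [Type = "Bulleted"]' into ('List', {'Type': 'Bulleted'})."""
    tag, _, rest = cell.partition("[")
    attrs = dict(ATTR.findall(rest))
    return tag.strip(), attrs
```

A cell without brackets, such as Section, comes back with an empty attribute dictionary.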
32. Rule Syntax—Promoting Anchored Object
When user adds structure to document, table or
graphic becomes child of paragraph with anchor
FrameMaker can break table or graphic out of its
paragraph and promote element to be sibling of
paragraphs:
In Column 2:
Type element tag for table or graphic
Add keyword “promote” in parentheses after element tag
Wrap this object    In this element             With this qualifier
T:Format A          ProcedureTable (promote)
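The effect of promote on the element tree can be pictured with generic XML. The Python sketch below uses the standard `xml.etree.ElementTree` module to move a table element out of its paragraph so that it follows the paragraph as a sibling. It models only the resulting structure, not how FrameMaker performs the operation; the function names are hypothetical, and the Section and Para element names are borrowed from earlier slides.

```python
import xml.etree.ElementTree as ET


def _detach(para, child):
    """Remove child from para, keeping any text that followed the anchor."""
    siblings = list(para)
    pos = siblings.index(child)
    tail, child.tail = child.tail or "", None
    if pos == 0:
        para.text = (para.text or "") + tail
    else:
        prev = siblings[pos - 1]
        prev.tail = (prev.tail or "") + tail
    para.remove(child)


def promote(parent, tag):
    """Move each `tag` child of a paragraph up to follow that paragraph."""
    for para in list(parent):
        moved = [c for c in para if c.tag == tag]
        index = list(parent).index(para)
        for offset, child in enumerate(moved, start=1):
            _detach(para, child)
            parent.insert(index + offset, child)
    return parent
```

Before promotion the ProcedureTable element sits inside the Para that holds its anchor; afterwards the Section contains Para, ProcedureTable, Para in that order, and the paragraph text is left intact.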
33. Rule Syntax—Flagging Format Overrides
Flags elements where the Paragraph or Character
Designer was used to make formatting changes
without saving to the catalog format. This adds an
attribute called Override with value Yes.
In Column 1:
Add rule “flag paragraph format overrides”
Add rule “flag character format overrides”
Wrap this object                   In this element    With this qualifier
flag paragraph format overrides
flag character format overrides
34. Rule Syntax—Wrapping Untagged Text
To wrap untagged formatted text:
In Column 1, add rule “untagged character formatting”
In Column 2, add element tag
Wrap this object                 In this element    With this qualifier
untagged character formatting    UntaggedText
35. Structuring a file (or set of files) with
a conversion table
Converting files
36. Procedure: Structuring Current Unstructured
Docs
1. Open conversion table and unstructured document
2. In unstructured doc, import element definitions from existing
structured template or EDD
Makes elements available in Element Catalog
If you do not perform this step, next steps produce elements in
Element Catalog defined by rules specified in conversion table
Can always import element definitions after generating structure
3. In unstructured file, StructureTools > Utilities > Structure
Current Document
4. From Conversion Table Document popup menu, choose
open conversion table file
5. Click Add Structure. A new document appears with content
wrapped into elements as defined in rules of conversion
table
6. Validate, correct errors, save file
37. Procedure: Structuring Group of Unstructured
Files
1. Place files to convert in separate directory
2. Open a conversion table file
3. StructureTools > Utilities > Structure Documents and
the Structure Documents dialog box appears
4. From Conversion Table Document popup menu,
choose the conversion table
5. Under Input Unstructured Files, set directory to
structure
6. Optionally, if files have unique extension, in Suffix text
box, type extension (otherwise, all files in directory will
be structured)
7. Under Output Structured Files, set directory to write to
38. (continued)
8. Turn on Allow Existing Files to Be Overwritten
As documents are structured, resulting files might have same names
as some existing files in directory specified for storing structured files
When on, overwrites older versions
When off, skips over files with existing matching filenames and
presents log file
9. Click Add Structure
10. When the “Operation completed normally” alert appears,
click OK to dismiss alert (structured files appear in output
directory with filenames matching those in input directory)
11. Open each file and import element definitions from any
existing structured template or EDD (makes elements in
Element Catalog match those in structured template or
EDD)
12. Validate, correct errors, save files
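The file selection behavior in this batch procedure (the optional suffix filter and the overwrite setting) can be previewed with a short Python sketch before running a large conversion. It only models the rules as described on the slides, it does not call FrameMaker, and `plan_batch` and its parameters are hypothetical names.

```python
from pathlib import Path


def plan_batch(input_dir, output_dir, suffix="", overwrite=False):
    """Return (to_convert, skipped) lists of file names for a batch run."""
    to_convert, skipped = [], []
    for path in sorted(Path(input_dir).iterdir()):
        if not path.is_file():
            continue
        # With no suffix given, every file in the directory is selected.
        if suffix and not path.name.endswith(suffix):
            continue
        target = Path(output_dir) / path.name
        if target.exists() and not overwrite:
            skipped.append(path.name)    # would be reported in the log file
        else:
            to_convert.append(path.name)
    return to_convert, skipped
```

Running this against the input and output directories shows which files would be structured and which would be skipped when overwriting is turned off.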
40. Procedure: Structuring Unstructured Book
1. Open saved conversion table file
2. Open unstructured book
3. In unstructured book, import element definitions
from any structured template or EDD
Makes elements available in Element Catalog
If you do not perform this step, next steps produce elements
in Element Catalog defined by rules specified in conversion
table
Can always import element definitions after generating
structure
4. Select StructureTools > Utilities > Structure Current
Book (the Structure Book dialog box appears)
41. (continued)
5. From Conversion Table Document popup menu,
choose saved conversion table file
6. In Output Directory text box, type directory for saving
structured files or choose from Browse
7. Turn on Allow Existing Files to Be Overwritten
As you add structure to documents, resulting files might have
same names as some existing files in specified directory for storing
structured files
When on, overwrites older versions
When off, skips over files with existing matching filenames and
presents log file
8. Click Add Structure (structured book and files appear in
output directory with filenames matching those in input
directory)
9. Validate, correct errors, save
42. Summing up the discussion,
and options to continue it.
Conclusion and contact
43. About this session
Convert content from unstructured to structured
EDD, conversion table, and a structured template
Using basic examples to get you started, this session:
Convert files with content such as character tags and paragraph
tags
Add support for images and tables
Demo converting unstructured to structured using conversion
tables
Samples are easy to recreate, but complex and
powerful in functionality
44. My request
Please suggest this session to others
If there are any problems with slides, please let me
know
Remember my disclaimer at the beginning
Not all slides are equal: Use some, discard others
In the interest of brevity I make some blanket statements
It’s not all 100% “the truth”, but I’ll stay close
Purists may complain
And they are wrong!
(except when they are right)