Introduction to Content Engineering


This is an introductory tutorial that presents, in a whirlwind fashion, the core concepts underlying Content Engineering.

Published in: Technology, Business

  1. 1. An Introduction to Content Engineering. Joe Gollner, VP e-Publishing Solutions. Copyright © Stilo International plc 2008.
  2. 2. Introduction to Content Engineering: Topics. What is Content? Content Engineering & the Content Processing Roadmap. The Business Context of Content Engineering. Aims: establish the nature of, and need for, Content Engineering; define a rubric of terminology for the tools and techniques that constitute a practical working framework for discussing, designing, developing and deploying content management and processing systems.
  3. 3. What is Content?
  4. 4. Content is how we Communicate. Content is the physical form of human communication. Content is meaningful because it entails context. Content is typically serialized due to the ways we express, store and interpret information. [Diagram labels: Narrative Structures; Implied Associations; Associative Memory and Acquired Perspectives (each shown twice); Imperfect Expression; Imperfect Interpretation]
  5. 5. The Document as the Popular Face of Content. The document has proven to be a powerful device for communicating and retaining content. While documents provide effective physical containers for content, they also lead to multiple modes of exchange and potential obsolescence.
  6. 6. Content is Everywhere. This has been true since the dawn of civilization and its importance grows daily. Content populates an ecosystem where people receive, internalize, modify, create and share that content. Content connects everything.
  7. 7. The Truth about Content. We are faced with: massively expanding content volumes; diversifying venues for content delivery; proliferating format varieties; rising expectations of users; escalating specialization of content; evolving interconnectedness of content; multiplying problems related to content security; continuing lifecycle challenges (obsolescence remains a risk); increasing complexity of content (the reintegration of data & documents); growing recognition of the central importance of content.
  8. 8. What Lies Ahead? What are the biggest challenges you face today in managing and using content? What do you suspect will be the biggest challenge you will be facing in the next five years? What are the opportunities emerging to leverage content in your business?
  9. 9. An Essential Response: Content Engineering. Working Definition: the application of rigorous engineering discipline to the design, development and deployment of content management and processing systems. Distinguishing Features: systematic approach; progressive use of technology; awareness of lifecycle considerations; total cost of ownership; solution scalability.
  10. 10. Engineering and Content. Organizing work; laying out work spaces; sequencing of process steps; optimizing tasks; refining tools; improving materials; transferring results between stages; sharing resources; performing maintenance; troubleshooting problems. [Image: Differential Analyzer, Vannevar Bush (1930s)]
  11. 11. Content Engineering. Content Engineering: governing discipline, goal-directed. Content Management: protect value. Content Processing: enhance value. People: create value (planning, designing, authoring, editing).
  12. 12. Content Management Components. Control: organize resources, access and lifecycle. Change: facilitate the evolution of content and the associated services. Deploy: enable the services the content makes possible.
  13. 13. Content Management and Content Processing: A Close Relationship. CM cannot exist without content processing services. Expanding CM services demands more processing. The sophistication of the processing functions increases more rapidly than management functions. Many CMS solutions are constrained by weak content processing capabilities.
  14. 14. Content Processing Components. Content Processing: Convert, Transform, Publish. Key focus in Content Engineering.
  15. 15. Content Processing Components. Content Processing: Convert, Transform, Publish. Transformation breaks down into: Refactor, Relate, Collect, Resolve, Compile. Emphasis on leveraging efficient automation.
  16. 16. The Content Processing Roadmap. [Diagram: a grid with three stages (ACQUIRE, ENRICH, DELIVER) across three layers. CONTEXT: Import, Metadata, Select. CONTENT: Import, Manage, Select, Publish. CONNECTIONS: Import, Links, Select. Content Processing steps between the layers: Convert, Collect, Compile; Refactor, Relate, Resolve.]
  17. 17. Convert Content. [Section divider: repeats the Content Processing Roadmap diagram from slide 16]
  18. 18. Converting Content. Conversion: changing the format of legacy content to make it increasingly suitable for efficient management, revision, reuse and publishing.
  19. 19. The Harsh Reality of Legacy Content. Legacy Content: all content resources that require modification in order to be useful. The Legacy Content Spectrum: Opaque: not directly processable (e.g., paper). Annoying: aggressively proprietary; little or no predictability in usage. Polluted: normally processable but frequently filled with deviations & additions (HTML). Tolerable: documented format that exposes format & structure in a processable form.
  20. 20. Conversion Fundamentals. Conversion is unavoidable and always under-estimated. Conversion is fundamentally a matter of interpretation: parsing the legacy format & layout; inferring a meaning from this information; correlating the format & layout to a target structure; addressing problems introduced by format peculiarities; leveraging the content itself to guide format interpretation; enhancing interpretive rules by matching content patterns. Automating conversion typically relies on two stages: a Format Interpreter that can make sense of source formatting, and a Rules-based Correlation Processor that maps content into structures.
  21. 21. Conversion Process Template. [Flow diagram. Recoverable labels: Target XML Schema; Source Analysis; Source to Target Mapping; Subject Matter Experts; Interaction Guidance; Legacy Source Content; Existing Conversion Rules; Modify Conversion Rules; Modified Conversion Process; Manual Editing; Execute Conversion Process; Result Analysis; Identified Issues; Application Tests; Validation & Verification. Three passes are numbered: 1 Example Set, 2 Sample Set (10%), 3 Complete Set (100%).]
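The two automation stages named on slide 20 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the layout cues (all-caps lines, numbered lines), the `RULES` table and the target element names `topic`, `title`, `step` and `para` are hypothetical and do not come from the presentation.

```python
import re
import xml.etree.ElementTree as ET


def interpret_format(lines):
    """Stage 1 (hypothetical format interpreter): turn layout cues into style tokens."""
    tokens = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # blank lines carry no content
        if text.isupper():
            tokens.append(("all-caps", text))
        elif re.match(r"^\d+\.\s", text):
            # Strip the legacy numbering; the target structure implies the order.
            tokens.append(("numbered", re.sub(r"^\d+\.\s+", "", text)))
        else:
            tokens.append(("body", text))
    return tokens


# Stage 2 (hypothetical correlation rules): map each source style to a target element.
RULES = {"all-caps": "title", "numbered": "step", "body": "para"}


def correlate(tokens, rules=RULES):
    """Build the target XML structure from the interpreted tokens."""
    root = ET.Element("topic")
    for style, text in tokens:
        ET.SubElement(root, rules[style]).text = text
    return root


legacy = [
    "REPLACING THE FILTER",
    "",
    "Switch off the pump.",
    "1. Open the cover.",
    "2. Lift out the filter.",
]
xml_out = ET.tostring(correlate(interpret_format(legacy)), encoding="unicode")
print(xml_out)
```

Keeping the interpretation of formatting separate from the mapping rules means the rules table can be revised between passes over the example, sample and complete sets without touching the parser.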
  22. 22. Refactor Content. [Section divider: repeats the Content Processing Roadmap diagram from slide 16]
  23. 23. Refactoring Content. Refactoring: restructuring content, without loss of meaning, to improve its suitability for management, maintenance and specifically reuse.
  24. 24. Aspects of Refactoring. Refactoring breaks down into two tasks: Bursting and Normalization. Content Bursting: decomposing content into components optimized for reuse. Content Normalization: systematic removal of redundancies to improve maintainability. Challenges: ensuring content components remain meaningful & manageable; maintaining a complete equivalence with the original; adapting the linking mechanisms so they remain valid and functional (usually entails introduction of an indirect referencing scheme).
  25. 25. Refactoring Strategies. Strategy needed to ensure adequate returns on investment. Refactor content that undergoes the highest rates of change first. [Diagram labels: Conversion Outputs; Compare Outputs]
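Bursting and normalization, as described on slide 24, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: paragraphs are treated as the reusable components, a truncated SHA-1 digest stands in for the indirect referencing scheme, and the document names are invented.

```python
import hashlib


def burst(document):
    """Bursting: decompose a document into paragraph-level components."""
    return [part.strip() for part in document.split("\n\n") if part.strip()]


def normalize(documents):
    """Normalization: store each distinct component once and keep only references."""
    store = {}  # component id -> component text
    maps = {}   # document name -> ordered list of component ids
    for name, text in documents.items():
        refs = []
        for component in burst(text):
            # Hypothetical indirect reference: a short content hash as identifier.
            cid = hashlib.sha1(component.encode("utf-8")).hexdigest()[:8]
            store[cid] = component
            refs.append(cid)
        maps[name] = refs
    return store, maps


def resolve(name, store, maps):
    """Reassemble a document from its references to check equivalence with the original."""
    return "\n\n".join(store[cid] for cid in maps[name])


docs = {
    "manual_a": "Warning: disconnect power.\n\nOpen the cover.",
    "manual_b": "Warning: disconnect power.\n\nReplace the fuse.",
}
store, maps = normalize(docs)
print(len(store), "components stored for", len(docs), "documents")
```

The shared warning is stored once and referenced from both manuals, and reassembling each manual from its references reproduces the original text, which is the equivalence check the slide lists among the challenges.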
  26. 26. Collect Metadata. [Section divider: repeats the Content Processing Roadmap diagram from slide 16]
  27. 27. Collecting Metadata. Metadata: a set of data that provides information about other data. Collecting Metadata: extracting, validating, integrating, supplementing, synchronizing and storing metadata from, and about, the content.
  28. 28. The Function of Metadata. Metadata is used to make the context of content explicit. Used to facilitate: Control (security; limitation of rights; orderly storage & retrieval); Discovery (searching; navigating); Exchange. Surprisingly important point: the boundary between metadata and content is never completely clear. [Image credit: Yale University Library]
  29. 29. The Storage of Metadata. Useful Design Pattern: Detachable Metadata. Key metadata clustered into a document sub-component; shareable amongst many uses; incorporated into document when important to do so & only then.
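One possible reading of the Detachable Metadata pattern on slide 29 is sketched below in Python. The store, the `metadata-ref` attribute and the element names are all hypothetical; the point is only that the metadata cluster lives outside the document and is embedded on request.

```python
import xml.etree.ElementTree as ET

# Hypothetical shared store: one metadata cluster usable by many documents.
METADATA_STORE = {
    "meta-001": {
        "owner": "Service Engineering",
        "status": "approved",
        "title": "Pump Maintenance",
    }
}


def build_document(body_text, metadata_id, include_metadata=False):
    """Create a document that references its metadata and embeds it only on request."""
    doc = ET.Element("document", {"metadata-ref": metadata_id})
    if include_metadata:
        meta = ET.SubElement(doc, "metadata")
        for key, value in sorted(METADATA_STORE[metadata_id].items()):
            ET.SubElement(meta, key).text = value
    ET.SubElement(doc, "body").text = body_text
    return doc


light = build_document("Check the seals monthly.", "meta-001")
full = build_document("Check the seals monthly.", "meta-001", include_metadata=True)
print(ET.tostring(light, encoding="unicode"))
print(ET.tostring(full, encoding="unicode"))
```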
  30. 30. Ontologies, Taxonomies & Metadata. The Meaning of Metadata: metadata categories and values relate content to aspects of an Ontology; the Ontology provides the context for metadata. Ontologies: describe a domain of knowledge; can be used as the basis of taxonomies (classification schemes), link networks and context-driven navigational aids. [Diagram labels: Ontology; metadata; Topic; Taxonomy; Link Network]
  31. 31. Establish Relationships. [Section divider: repeats the Content Processing Roadmap diagram from slide 16]
  32. 32. Establishing Relationships. Links: the connections or relationships between things that represent a significant portion of the meaning and value of content. Three link tables are shown. Explicit Links (Actual): columns Identifier, Source, Target, Type; rows A1, A2. Implicit Links (Potential): columns Identifier, Source, Target, Type; rows B1, B2. Reuse Links (Physical): columns Identifier, Resource, Request, Condition; rows R1, R2.
  33. 33. Link Management. Link Analysis: Outbound Links: intact or broken. Transclusions: where used. Inbound Links: track-back / where cited. External Links: network participation. Observations: increasingly important metadata; increasingly complex metadata; significant processing; leverages external storage of links & link metadata; bidirectional link generation becoming critical. [Diagram labels: Link Analysis; Outbound Link; Transclusion; Inbound Link; External Link; Link Base]
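The link analysis outlined on slides 32 and 33 can be sketched over a small, externally stored link base. The `Link` fields follow the column names of the Explicit Links table; the resource names, the link types and the `analyze` function are assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    identifier: str
    source: str
    target: str
    link_type: str


def analyze(links, known_resources):
    """Report outbound links, inbound links (where cited) and broken links."""
    outbound = defaultdict(list)
    inbound = defaultdict(list)
    broken = []
    for link in links:
        outbound[link.source].append(link.target)
        if link.target in known_resources:
            inbound[link.target].append(link.source)
        else:
            broken.append(link.identifier)  # the target cannot be found
    return dict(outbound), dict(inbound), broken


resources = {"topic-a", "topic-b"}
link_base = [
    Link("A1", "topic-a", "topic-b", "see-also"),
    Link("A2", "topic-b", "topic-c", "prerequisite"),
]
outbound, inbound, broken = analyze(link_base, resources)
print("outbound:", outbound)
print("inbound:", inbound)
print("broken:", broken)
```

Because the links are held apart from the content, the inbound view can be generated from the same records as the outbound view, which is one simple way to obtain bidirectional links.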
  34. 34. Deliver Content. [Section divider: repeats the Content Processing Roadmap diagram from slide 16]
  35. 35. Delivering Content. Resolve: assemble content and instantiate applicable relationships. Compile: convert resolved content into a form suitable for rendition. Publish: render the content in the forms required by the context.
  36. 36. The Goal: High Fidelity Automation. Delivery Processing: assembling the inputs (Map & View): content requested, supporting assets, applicable stylesheets & rules. Resolve into a processable whole. Compile formattable content representations. Publish final formatted renditions. [Diagram labels: Content; Assets; Rules; Transformations; Templates; Output Plan; Output Variants; Deliver (Resolve, Compile, Publish); Render; Print Publishing (PDF); Web Publishing (Portal / Portable); Print Output Products (PDF); Web Output Products (XHTML)]
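The Resolve, Compile and Publish steps defined on slide 35 can be sketched as three small functions. This is a toy pipeline under stated assumptions: the component store, the map of component identifiers, the block representation and the two output formats are all invented for illustration.

```python
from html import escape

# Hypothetical component store: id -> (title, body).
COMPONENTS = {
    "intro": ("Overview", "Content connects everything."),
    "steps": ("Procedure", "Resolve, compile, publish."),
}


def resolve(content_map, components):
    """Resolve: assemble the requested components into one processable whole."""
    return [components[cid] for cid in content_map]


def compile_content(resolved):
    """Compile: convert resolved content into format-neutral, formattable blocks."""
    blocks = []
    for title, body in resolved:
        blocks.append({"role": "heading", "text": title})
        blocks.append({"role": "para", "text": body})
    return blocks


def publish(blocks, fmt):
    """Publish: render the compiled blocks in the form the context requires."""
    if fmt == "html":
        tags = {"heading": "h2", "para": "p"}
        parts = []
        for block in blocks:
            tag = tags[block["role"]]
            parts.append(f"<{tag}>{escape(block['text'])}</{tag}>")
        return "".join(parts)
    if fmt == "text":
        return "\n".join(
            b["text"].upper() if b["role"] == "heading" else b["text"] for b in blocks
        )
    raise ValueError(f"unsupported format: {fmt}")


compiled = compile_content(resolve(["intro"], COMPONENTS))
html_out = publish(compiled, "html")
text_out = publish(compiled, "text")
print(html_out)
print(text_out)
```

The compiled blocks are produced once and rendered twice, which mirrors the slide's separation of a single resolved whole from its multiple output products.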
  37. 37. Content Processing & Validation. Validation: essential capability; enables consistent processing; streamlines processes. Validation must be: accurate; manageable; informative; actionable; pro-active; continuously improving.
  38. 38. Validate & Transform: Simple. Content Validation: DTD structural rules; instance conformance. Content Transformation: traditionally focused on arranging content for formatting; supporting primarily structural manipulation. Validated Outputs: inputs to rendition processes; HTML outputs; XML outputs.
  39. 39. Validate & Transform: Complex. Content Validation & Verification: schema structural rules; rules governing content values; instance conformance. Content Transformation: continuous process of improvement (parse, validate, align, verify…repeat); manipulation of many content types. Validated Outputs: inputs to rendition processes; HTML outputs; XML outputs; data outputs for applications. [Diagram labels: Schema; Rules; Content Instance; Structure Validation; Content Verification; Transformation Processing; Outputs]
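The distinction on slide 39 between structural rules and rules governing content values can be sketched as follows. The dictionary of allowed children is a stand-in for a real schema, and the value rule for `step` is invented; neither comes from the presentation.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical structural rules: element name -> child elements it may contain.
ALLOWED_CHILDREN = {"procedure": {"title", "step"}, "title": set(), "step": set()}

# Hypothetical value rules: element name -> (pattern, message shown on failure).
VALUE_RULES = {
    "step": (
        re.compile(r"^[A-Z].*\.$"),
        "must start with a capital letter and end with a period",
    )
}


def validate(root):
    """Return informative issue messages; an empty list means the instance conforms."""
    issues = []
    if root.tag not in ALLOWED_CHILDREN:
        issues.append(f"structure: unexpected root <{root.tag}>")
    for element in root.iter():
        allowed = ALLOWED_CHILDREN.get(element.tag, set())
        for child in element:
            if child.tag not in allowed:
                issues.append(
                    f"structure: <{child.tag}> is not allowed inside <{element.tag}>"
                )
        rule = VALUE_RULES.get(element.tag)
        if rule:
            pattern, message = rule
            text = (element.text or "").strip()
            if not pattern.match(text):
                issues.append(f"content: <{element.tag}> {text!r} {message}")
    return issues


sample = ET.fromstring(
    "<procedure><title>Reset</title><step>Press the button.</step>"
    "<step>wait</step><note>x</note></procedure>"
)
issues = validate(sample)
for issue in issues:
    print(issue)
```

Each message names the element and the rule that failed, in keeping with the requirement on slide 37 that validation be informative and actionable.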
  40. 40. Complexity and the Cost of Quality. Complexity is inherent in the nature of content. Increasing content complexity increases the amount and sophistication of content processing tasks. Increases in content processing tasks result in a significant increase in the total cost of quality.
  41. 41. Solution Architectures. Assembles components to provide integrated services. Technology selection & integration. Standards selection & integration. Multiple solution instances will exist. [Diagram labels: Content Engineering; Content Management; Content Processing; Solution Architectures; Convert, Transform, Publish; Refactor, Collect, Compile, Relate, Resolve; Validate]
  42. 42. Managing Solution Risk. Integration risk represents: the potential loss of services; the potential loss of assets. Integration risk increases with the number of technologies used to build a solution. System complexity: can be managed; ultimately limits solution affordability and even viability; addressed in design selections.
  43. 43. Technology Selection. Key Considerations: solution context; scored against requirements; scoring scale from 0 (No Fit) to 6 (Total Fit); results weighed against acquisition cost.
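The scoring approach on slide 43 can be worked through with a small Python sketch. The 0 to 6 scale is taken from the slide; the requirement weights, the candidate tools, their costs and the "fit per 10,000 of cost" ratio are assumptions chosen only to show one way of weighing results against acquisition cost.

```python
def fit_score(scores, weights):
    """Weighted fit on a 0..1 scale, from per-requirement scores of 0 (No Fit) to 6 (Total Fit)."""
    for requirement, score in scores.items():
        if not 0 <= score <= 6:
            raise ValueError(f"{requirement}: score {score} is outside the 0-6 scale")
    total_weight = sum(weights.values())
    weighted = sum(weights[req] * scores[req] for req in weights)
    return weighted / (6 * total_weight)


def fit_per_cost(scores, weights, cost, unit=10_000):
    """Hypothetical value measure: weighted fit per `unit` of acquisition cost."""
    return fit_score(scores, weights) / (cost / unit)


# Hypothetical requirement weights and candidate tools.
weights = {"validation": 3, "conversion": 2, "publishing": 1}
tool_a = {"validation": 6, "conversion": 3, "publishing": 3}
tool_b = {"validation": 4, "conversion": 6, "publishing": 6}

fit_a = fit_score(tool_a, weights)
fit_b = fit_score(tool_b, weights)
value_a = fit_per_cost(tool_a, weights, cost=50_000)
value_b = fit_per_cost(tool_b, weights, cost=100_000)
print(f"tool A: fit {fit_a:.2f}, fit per 10k {value_a:.3f}")
print(f"tool B: fit {fit_b:.2f}, fit per 10k {value_b:.3f}")
```

In this invented case tool B fits the requirements slightly better, but tool A delivers more fit per unit of cost, which is the kind of trade-off the slide's last consideration points to.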
  44. 44. Technology Lifecycle Considerations. Solution context includes: urgency; complexity; criticality; constraints. Projected lifecycle: expected lifespan; rate of change; influencing factors. [Chart: Measuring Overall Productivity over Time; axes labelled Time and Complexity, each running from Low to High]
  45. 45. Solution Component Dependencies. Because all components within a solution evolve, their inter-dependencies require explicit description and management. [Diagram of interdependent solution components; labels include structure, content, media, process, maps, schemas, rules, templates, scripts, style sheets, sources, reports, log files and configuration files]
  46. 46. Evaluating Standards as Potential Tools. Independence: from parochial interests, proprietary claims, external influences. Formality: of creation, validation, approval & modification process. Stability: of standard over time & the backward compatibility of changes. Completeness: sufficiency for declared scope as well as availability of useful documentation & reference implementations. Adoption: extent of support amongst tool vendors, authorities & users. Practicality: the extent to which all, or parts, of the standard can be deployed.
  47. 47. Evaluating a Specialized Industry Standard. Scenario: industry specification; broad scope; specialized stakeholder community; continuously changing & expanding. Strategy: implement where necessary; address risk areas.
  48. 48. Evaluating a Cross-Industry Standard. Scenario: addressing widespread issues; broad stakeholder community; mature; further capabilities emerging. Strategy: plan for adoption; consider for use in variety of areas.
  49. 49. Content Solution Architecture Framework. [Framework diagram with a central Integrate activity. Controls: Enterprise Programs; Domains; Specialized Document Models; Ontology; Rules; Content Architecture. Inputs: sources (labels include Active, External, Legacy and Data). Outputs: services (labels include Web, Publishing, Print, Discovery, Application and Data). Mechanisms: Users (Authors; Subject Matter Experts; Administrators; Information Architects; Developers); Tools (Content Management; Content Processing; Content Authoring; Development Tools; Web Services); Resources (Budget; Personnel; Infrastructure).]
  50. 50. Content Architecture. Establishes governing model of the knowledge domain: the knowledge that has informed the content; the knowledge being encapsulated in the solutions. Supports multiple solution instances. [Diagram labels: Content Engineering; Content Architecture; Content Management; Content Processing; Solution Architectures; Convert, Transform, Publish; Refactor, Collect, Compile, Relate, Resolve; Validate]
  51. 51. The Central Role of the Content Architecture. [Diagram labels: Content Requirements; Service Requirements; Discovery Taxonomies; Specialized Architecture; Topic; Concept, Task, Reference; Description, Procedure, Data; Specialized Information Types; Specialized Delivery Processes; Annotation; Formatting; Effectivity; Change; Specialized Domains]
  52. 52. Content Solution Design Principles. The nature of content demands an adaptable architecture. Technology components should be loosely coupled. Content must always be available in its simplest self-describing form. Data stores should be replaceable by stored instances (true for content, metadata and links). Content processing events can be performed many ways: simple methods must be present, sophisticated methods may be. All interfaces established as the exchange of validated content. Processing rules are, themselves, managed & processable content. Content Processing should be extensively leveraged: content validation, analysis and reporting at every stage; used to manage & optimize solution components to improve efficiency.
  53. 53. Content Engineering Maturity Model. Modeled on the Software Engineering Institute's (SEI) Capability Maturity Model Integration (CMMI): “managed” used instead of “quantitatively managed” for level 4; “repeated” used instead of “managed” for level 2; “reactive” used instead of “performed” for level 1. Objective: follow software engineering in emphasizing the importance of formalization & quantitative methods for continuous improvement. Levels: 5 Optimized; 4 Managed; 3 Defined; 2 Repeated; 1 Reactive; 0 Incomplete.
  54. 54. CE Maturity Model: Level 0 Incomplete. Often the complete absence of a documented process; a process that is documented but not followed also qualifies. Features: new requirements addressed using available tools; each solution seeks cost minimization; no persistent infrastructure; no improvement between projects.
  55. 55. CE Maturity Model: Level 1 Reactive. A process exists for specific goals; sufficient for the needs of selected products; not institutionalized and not integrated with institutional processes. Features: not designed to handle new or changing requirements; can result in multiple solutions, each created as a reaction.
  56. 56. CE Maturity Model: Level 2 Repeated. A managed process exists and is supported by basic infrastructure; predictability can be achieved in process performance & products; reviews are conducted to identify & initiate improvements. Features: a common set of tools has been selected; procedures exist for steps; solution components documented.
  57. 57. CE Maturity Model: Level 3 Defined. Standardization in processes established on an institutional level; common tools & techniques used across processes & projects. Features: a single infrastructure used to support multiple processes & projects; processes defined with reference to enterprise models; interrelationships are known.
  58. 58. CE Maturity Model: Level 4 Managed. Processes are managed using quantitative measurement; automation is maximized in the execution of process steps; a single integrated & managed environment supports all processes. Features: infrastructure components managed as content, with automation used to adapt behaviour; high levels of quality sustained.
  59. 59. CE Maturity Model: Level 5 Optimized. Continuous orientation towards improvement; continuous refactoring of solution and content to achieve efficiencies; continuous identification & implementation of heightened standards. Features: systematic analysis & correction of variations; proactive identification of new products & services that can be offered; industry innovation.
  60. 60. General Observations. Content is inherently complex: current trends have moved content to the center of attention. Content Engineering is an essential response: it provides the necessary discipline & the conceptual framework; content has not typically received this level of attention in the past. Effective Content Processing is central to success: Content Management services are enabled by content processes; adaptive content processing is essential for addressing change. Effective Content Solutions are designed to cover the complete content lifecycle and all stakeholder perspectives. The efficient management and processing of content remains an elusive goal for most organizations.
  61. 61. Content Engineering and Business Value. The design of Content Solutions should: continuously minimize the costs of acquiring, enriching, managing and delivering content; continuously improve content resources through enrichment; continuously increase the benefits realized through the delivery of content; continuously reduce risks threatening content assets or the services being supported. Each of these represents an increase in value.
  62. 62. Top Ten Secrets of Content Solution Success. (1) Don’t underestimate your content or your business. (2) Don’t underestimate the power of good automation. (3) Choose an appropriate tool set and validate your choices. (4) Don’t invest in content management technology too early. (5) Carefully plan and execute migration activities. (6) Take a “customer service” focus in delivering tangible benefits (new products / services) from your investments. (7) Be demanding of your suppliers (expect quality). (8) Engage your stakeholders and “take control” of the solution. (9) Leverage standards, don’t be enslaved by them. (10) Be an active part of the community as a way to learn and as a way to share what you have learned.
  63. 63. The End. Admittedly an awful lot to cover in a single go. Hopefully some of the ideas connect with some of your experiences and perhaps help in framing aspects of your next project. Joe Gollner, VP e-Publishing Solutions, Stilo International.