“It’s not rocket science!” Applying CMS and semantic enrichment to transform book publishing

  • 1. Wednesday Webinar Series “It’s not rocket science!” Applying CMS and semantic enrichment to transform book publishing ©2010 Really Strategies, Inc. | www.rsuitecms.com
  • 2. Webinar Agenda • Welcome, Overview, Introductions • Online Poll • Semantic enrichment • Wolters Kluwer Health case study • Online Poll • Content management with RSuite • Q&A ©2010 Really Strategies, Inc. | www.reallysi.com
  • 3. Who is Really Strategies? • Founded: 2000 • Consulting Services to Publishers • Specialists in XML-based Content Management Solutions • Project/Program Management • Workflow Analysis and Reengineering • Content and Metadata Modeling • Technology Assessment and Roadmaps • Much more… A content management system for publishers.
  • 4. Serving over 100 companies: STM, Educational, Media, Tech Pubs
  • 5. Webinar Presenters • Jake Zarnegar, CTO, Silverchair • Jabin White, Director of Strategic Content, Wolters Kluwer Health – Professional & Education • Mike Sherlock, Program Manager, Really Strategies, Inc.
  • 6. The CMS – Semantic Landscape (diagram labels): Content management, Content delivery, Content creation, Author submission, Online submission, iPad and other mobile apps, Editorial workflow, Check in/check out, Tools used, Version control, Production, Metadata, Web sites, Transformations, Good old print, Content enrichment, Taxonomy management
  • 7. ONLINE POLL
  • 8. Jake Zarnegar, Silverchair Why Enrich Your Content With Semantics? Silverchair | www.silverchair.com
  • 9. Semedica, a division of Silverchair • Tagmaster: semantic autotagging w/expert review • Totem: taxonomy/ontology manager • Cortex: biomedical taxonomy & thesaurus • Swiss: semantic web services
  • 10. My Brother is a Rocket Scientist
  • 11. His Test Answers A-C-B-A-E-D-C-A-B-E-A-C-B-B-E-D-C-A-B-C-B-C-A-D-B-E-A-D-C-A-B-A-E-A-C-A-E-C-B-A-E-B-D-D-E-A-B-C-A-B-A-C-A-A-C-E-C-B-D-D-A-B-C-E-A-C-C-D-E-B-A-B-B-C-D-E-A-D-A-E-B-E-C-A-E-D-A-C-E-B-C-A-B-E-A-A-D-E-A-B
  • 12. Semantic Enrichment Raison D’Etre To put thousands and thousands of tiny meaningful hooks in your data so that your software applications can create richer outcomes for your users and your organization.
  • 14. Semantics in 3 Minutes
  • 15. Semantics are About Meaning Semantics describe the meaning of your content, on top of the physical structure. Meaning is generally conveyed in topics and concepts. Semantic metadata formally answers the most important question of all for content producers and users: What is this content about?
  • 16. “Atomizing” Information The semantic approach requires us to go beyond documents and think of our content as data. For example: 1 textbook chapter = 1 document OR 1 textbook chapter = 712 distinct pieces of data (sections, paragraphs, lists, tables, figures, equations, etc.)
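To make the "chapter as data" idea on slide 16 concrete, here is a minimal Python sketch that counts the separately addressable pieces in a small XML chapter. The sample markup, the element names, and the `ATOMIC_TAGS` set are all assumptions made for illustration; they are not taken from the presenters' DTDs or tools.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical chapter markup; element names are illustrative, not a real DTD.
CHAPTER_XML = """
<chapter id="ch1">
  <title>Hypertension</title>
  <section id="s1">
    <para>High blood pressure is common in adults.</para>
    <para>Treatment usually starts with lifestyle changes.</para>
    <list><item>Diet</item><item>Exercise</item></list>
  </section>
  <section id="s2">
    <para>Drug therapy is added when needed.</para>
    <table><row><cell>Drug class</cell></row></table>
    <figure src="bp-chart.png"/>
  </section>
</chapter>
"""

# Element types treated as separately addressable "atoms" (an assumption).
ATOMIC_TAGS = {"section", "para", "list", "table", "figure", "equation"}


def count_atoms(xml_text):
    """Return a Counter of atomic element types found in the chapter."""
    root = ET.fromstring(xml_text)
    return Counter(el.tag for el in root.iter() if el.tag in ATOMIC_TAGS)


atom_counts = count_atoms(CHAPTER_XML)
total_atoms = sum(atom_counts.values())
print(dict(atom_counts), total_atoms)
```

Seen as a document, the sample is one chapter. Seen as data, it is a set of typed pieces that can each carry their own metadata.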
  • 17. But breaking down content into its smallest parts is not an end unto itself…
  • 18. Taxonomy as Semantic Foundation • The taxonomy is the framework for the semantic layer and semantic tagging—crucial for concept grouping and hierarchical relationships • Also serves to normalize terminology and language variances when combined with a robust thesaurus • Industry-standard taxonomies facilitate integration
  • 19. Use taxonomy axes to organize your atomized content on key traits and prepare it for recombination…
  • 20. Nuts & Bolts: Semantic Tagging • Tagging is the insertion of semantic (meaning) information in the XML, whose smallest unit is called a tag • Tagging can also be placed in database tables and header files if the content is inaccessible (such as images and videos) • Tagging should be done at the smallest “atomic” level of data possible
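As a rough illustration of slides 18 and 20, the sketch below tags each paragraph of an XML fragment with normalized concepts looked up in a small thesaurus. It is a toy under stated assumptions: the `THESAURUS` mapping, the `concepts` attribute, and the `para` element are hypothetical, and simple string matching stands in for whatever a real autotagger with expert review would do.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical thesaurus: surface term (lowercase) -> preferred taxonomy concept.
THESAURUS = {
    "high blood pressure": "hypertension",
    "hypertension": "hypertension",
    "heart attack": "myocardial infarction",
    "myocardial infarction": "myocardial infarction",
}


def concepts_in(text):
    """Return the sorted preferred concepts whose surface terms occur in text."""
    lowered = text.lower()
    found = set()
    for term, concept in THESAURUS.items():
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found.add(concept)
    return sorted(found)


def tag_paragraphs(xml_text):
    """Add a 'concepts' attribute (an assumed convention) to each matching <para>."""
    root = ET.fromstring(xml_text)
    for para in root.iter("para"):
        concepts = concepts_in("".join(para.itertext()))
        if concepts:
            para.set("concepts", "|".join(concepts))
    return root


SAMPLE = (
    "<section>"
    "<para>High blood pressure raises the risk of heart attack.</para>"
    "<para>See the next section.</para>"
    "</section>"
)
tagged = tag_paragraphs(SAMPLE)
print(ET.tostring(tagged, encoding="unicode"))
```

Tagging at paragraph level rather than chapter level follows the "smallest atomic level" advice, and the thesaurus step is what lets "high blood pressure" and "hypertension" land on the same concept.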
  • 21. Paragraph entity identification. What is this content about?
  • 22. Semantic article summary. What is this content about?
  • 23. Semantics for Your Users
  • 24. Know Your Users! • Focus your metadata creation on how your users want to use your content: ▫ How do they search? Browse? At what point in their workflow is your product used? • Almost all information sites have multiple use cases. You need to know what those use cases are for your products. • Start with what is the most important to the most users and work your way down a priority list.
  • 25. The Semantic Use Test I am specifically identifying __________ because ____________ is very important to my ____________ users when they are _____________.
  • 26. Semantic Metadata: Focus on Use Example: I am specifically identifying concise disease treatment content because immediate access to treatment options is very important to my emergency physician users when they have 8 seconds to look up an answer.
  • 27. McGraw-Hill: metadata targeted to deliver fast, concise treatment info to ED
  • 28. Semantic Metadata: Focus on Use Example: I am specifically identifying skin disorder images on all body locations and all types of skin because visual diagnosis is very important to my family physician users when they are trying to identify a rash.
  • 29. Derm101: images show up immediately in the diagnosis results for searches
  • 30. Semantic Metadata: Focus on Use Example: I am specifically identifying manufacturer names because the source of medical devices is very important to my surgical resident users when they are prepping for a procedure.
  • 31. Semantic Metadata: Focus on Use Example: I am specifically identifying manufacturer names because the source of medical devices is very important to my surgical resident users when they are prepping for a procedure. Not Likely!
  • 32. Semantics for Your Organization
  • 33. Use Semantics to Know Your Users
  • 34. Use Semantics to “Know Thyself!”
  • 35. Thank you! For more information: Jake Zarnegar, CTO, Silverchair; President, Silverchair Information Systems • jakez@silverchair.com • (434) 296-6333 x236 • www.silverchair.com • www.semedica.com
  • 36. Jabin White Director of Strategic Content Wolters Kluwer Health – Professional & Education Really Strategies/Silverchair Webinar – September 29, 2010
  • 37. Agenda • A little background (framing the problem) • Our goals • When we’re done, we’ll be able to…
  • 38. Who we are • We are Wolters Kluwer Health – Professional & Education • Wolters Kluwer Health includes: ▫ Lippincott Williams & Wilkins titles ▫ Ovid ▫ UpToDate ▫ Provation Order Sets ▫ Drug Facts & Comparisons ▫ Medi-Span ▫ Clin-eguide
  • 39. A Little History • Joined WK Health in May 2009 ▫ Responsible for making sure content flows through company more efficiently (DTDs, Content Management, Authoring Tools, Semantic Enrichment, Product Information Management, etc.) • The reasons are not important, but we hadn’t spent a lot of time modernizing our digital production methods
  • 40. Today – Our typical workflow • Book is “signed” • Instructions for authors are sent, and ignored • Chapters, etc., are submitted in MS Word • Word files are sent “over the wall” (outsourced), coded, and placed in pagination software (still some Quark, moving to Adobe InDesign) • Final pages are approved • High-resolution PDFs are sent to printer • After final pages are approved, vendors convert into XML (if the title was comped after May 2009). If before, we roll the dice… • Delivered back to P&E archive, along with printer PDFs, application files, and images
  • 41. So what’s your problem? • We pay at every step of the previous workflow, and we believe unnecessarily near the end • If we need ePub, we have to go back into the archive to a “mixed bag” of content (some Quark, some PDF, some XML) • There is no central repository – or common format – in which to apply semantic tagging ▫ And the frustrating thing is we have GOOD DTDs! • If we believe in semantic markup, which we do, we must essentially throw content over the wall again just as in composition (shampoo, rinse, repeat)
  • 42. Enter RSuiteCMS • RSuiteCMS gives us the ability to control the workflow and use good content management practices (it does a LOT more, but we’re starting slow) • Very importantly, we get to have authors write in XML without them knowing (or quite frankly caring) • We put a LOT of work into the authoring environment, trying to keep authors away from angle brackets • “It takes a lot of hard work to make things simple”
  • 43. When we’re done, we’ll be able to… • …Produce structured content with lower effort/cost • Working on SECOND RSuiteCMS implementation as we speak ▫ Will scale in latter part of 2010 and 2011 • We are moving cautiously and ensuring “buy in” from stakeholders at each step • Ideally, we will grow our ability to produce clean, structured XML to check into our repository • But Rome wasn’t tagged in a day...
  • 44. Enter Semedica • Gives us the ability to add semantic tagging to our content, either when it is finished (in the repository) or while it is being worked on (within RSuiteCMS) • Semedica gives us the ability to: ▫ Leverage a standard taxonomy (Cortex) ▫ Add to the taxonomy and manage equivalencies – perhaps mined from our search logs – “Wenckenback = Wenckebach” (Totem) ▫ Apply the tags to our content (Tagmaster)
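The idea on slide 44 of mining equivalencies such as "Wenckenback = Wenckebach" from search logs can be sketched with Python's standard `difflib`. This is only an illustration of the concept, not a description of how Totem works: the `PREFERRED_TERMS` list, the `suggest_equivalencies` helper, and the 0.8 similarity cutoff are assumptions, and in practice any suggestion would presumably go to an expert for review before entering the taxonomy.

```python
import difflib

# Hypothetical preferred terms already in the taxonomy (lowercase).
PREFERRED_TERMS = ["wenckebach", "hypertension", "myocardial infarction", "leukemia"]


def suggest_equivalencies(search_terms, preferred=PREFERRED_TERMS, cutoff=0.8):
    """Map unrecognized search terms to the closest preferred term, if any.

    Returns candidate equivalencies only; nothing is applied automatically.
    """
    suggestions = {}
    for term in search_terms:
        normalized = term.strip().lower()
        if normalized in preferred:
            continue  # already a known term, nothing to suggest
        close = difflib.get_close_matches(normalized, preferred, n=1, cutoff=cutoff)
        if close:
            suggestions[normalized] = close[0]
    return suggestions


# Terms as they might appear in a search log (made-up sample).
suggestions = suggest_equivalencies(["Wenckenback", "hypertension", "aspirin"])
print(suggestions)
```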
  • 45. Why Semantic Tagging? • It adds extra power to our content to drive: ▫ More precise searching ▫ Contextually-based connections ▫ Lowering of “two terms meaning the same thing” syndrome (hypertension vs. high blood pressure; heart attack vs. myocardial infarction) ▫ Filling in of content gaps ▫ Asking questions of data (aka, querying): “How many chapters do we publish that are tagged with the term ‘pediatric oncology’ or ‘leukemia’ that also contain the treatment ‘interferon therapy’?”
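The querying example on slide 45 maps naturally onto set operations once chapters carry semantic tags. The Python sketch below assumes a hypothetical in-memory `Chapter` record with `topics` and `treatments` tag sets. The chapter titles and tags are made-up sample data, and a real repository would more likely answer this through its own query interface.

```python
from dataclasses import dataclass, field


@dataclass
class Chapter:
    """A chapter plus the semantic tags applied to it (hypothetical model)."""
    title: str
    topics: set = field(default_factory=set)
    treatments: set = field(default_factory=set)


# Made-up sample data for illustration only.
CHAPTERS = [
    Chapter("Childhood Leukemias", {"leukemia", "pediatric oncology"},
            {"interferon therapy", "chemotherapy"}),
    Chapter("Solid Tumors in Children", {"pediatric oncology"}, {"surgery"}),
    Chapter("Chronic Myeloid Leukemia", {"leukemia"}, {"interferon therapy"}),
    Chapter("Adult Hypertension", {"hypertension"}, {"lifestyle changes"}),
]


def chapters_matching(chapters, any_topics, treatment):
    """Chapters tagged with at least one of any_topics AND the given treatment."""
    wanted = set(any_topics)
    return [c for c in chapters if c.topics & wanted and treatment in c.treatments]


matches = chapters_matching(
    CHAPTERS, ["pediatric oncology", "leukemia"], "interferon therapy"
)
match_count = len(matches)
print(match_count, [c.title for c in matches])
```

The same question asked of untagged Word or PDF files would require full-text search and manual checking, which is the gap the slide is pointing at.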
  • 46. How RSuiteCMS & Semedica need each other • I wouldn’t think of using Semedica to enrich Word files (and not just because Jake would laugh) • I couldn’t make the business case for RSuiteCMS to help produce structural XML without dangling the prospect of semantic enrichment • Which came first, the chicken or the egg?
  • 47. Jabin White Director of Strategic Content Wolters Kluwer Health Jabin.white@wolterskluwer.com 215.521.8911 Twitter: @jabinwhite Blog: Technically Speaking at http://www.bookbusinessmag.com/channel/technically-speaking
  • 48. Facilitating semantic enrichment for the 5-Minute Clinical Consult product • XML source content must be updated regularly by working medical professionals without using desktop software • Updates must be easily imported and exported for multiple channels without technical intervention
  • 49. Solution: a simplified, browser-based interface to RSuite Contributors only see what they need to see
  • 50. Workflow tools Enable contributors to manage their own tasks
  • 51. Integrated Xopus XML editor Enables balance between required content structure and flexibility to enhance information
  • 52. Custom PubMed lookup and reference management Users can search, insert, cite, and auto-renumber to ensure markup consistency
  • 53. Editing in XML source enables users to tag items of interest In this example, user highlights text and uses an icon to label text as a drug name
  • 54. Alternate view: XML markup stays hidden in background Contributors are not aware that they are editing XML content
  • 55. Exporting XML from repository Managing Editor can easily select a content set and choose an export target
  • 56. Lessons learned • It’s hard to add XML structure to unstructured content • Authors must work on a single content source; semantic enrichment is too valuable to throw away • Challenges to manage: ▫ Editing tool vs. form: need to balance conformity vs. medical usefulness ▫ Hiding XML from editors requires very tight content controls ▫ Training occasional external contributors is not a viable option ▫ Lack of control over users’ browser types and versions makes technical support difficult It’s not rocket science, it’s just a lot of hard work
  • 57. QUESTIONS