Document repositories-and-metadata


Published in: Technology
  • Let's start with the most important driver from a purely pragmatic perspective: re-use. Many organizations view re-use as the ultimate driver for a DAM initiative, and it makes total sense: we can spend less and do more if we re-use. It's true that budget reductions and cost pressures can drive a DAM initiative, but we can't make the mistake of assuming that budgets on their own can change ingrained ways of working.
  • So even with the cards stacked against them, with no repositories and no mechanism for sharing, some people manage to share assets by email, by mailing CDs, or through ad hoc file shares. But it's hard to maintain and the road is filled with potholes. Let's go to the next slide to look at an ad hoc re-use scenario.
  • The classic ad hoc re-use scenario illustrated here shows how the intricacies of file formats, digital rights, and process inefficiency will usually ruin the best sharing intentions. Maverick One tells Maverick Two about the fantastic photo he saw in his colleague's print ad and asks if he can use it as well; he just needs to crop out part of the photo. Maverick Two says sure and talks to the creative agency that created the photo, and they upload a copy to their file share. Maverick Two downloads the photo, puts it on a CD, and mails it to Maverick One. Maverick One forwards the CD to his creative agency, only to be told the format is incorrect and unusable. Maverick One emails Maverick Two to let her know the problem. Maverick Two emails her creative agency, only to be told that their contract won't allow them to forward the re-usable source file version of the photo. Weeks of time and energy have been wasted.
  • So we have talked about two major drivers that generally push organizations towards investing in DAM: 1. The appealing business case of re-using assets to reduce overall operating costs. 2. The frustrations and efforts of people trying to share bubbling up to audible levels. So let's look at what usually happens next in these situations.
  • The organization buys and installs a system with little thought or consideration given to what will really make it usable. So if we look at the scenario above, even though everyone can access all the assets in the same place, without taxonomy, metadata, content strategy, and good governance the potential time and money saved from re-using assets is lost because it takes an exceedingly long time to find anything. So let's take the time to walk through each problem. Consider the images and text that make up a print advertisement. If the creative director stored the advertisement and its components in a folder labeled "Ted's Print Projects 2009", it would be very difficult for people in another part of the organization to locate and reuse any of the components. That's where taxonomy and metadata come in.
  • Taxonomy and metadata are ways of describing assets so that they become findable for a larger audience. I am sure Ted knows what "Ted's Print Projects 2009" are, but how could anybody else? For that matter, Ted might not even remember what Ted's Print Projects were in 2009 when it's 2010. In the above example, each bolded term represents what is termed a "facet" of taxonomy. Each facet is made up of its own controlled vocabulary and represents a different way of accessing a piece of information. In this example, we might define the type of asset, the specific channel, the target demographic, a country or region, a language, and perhaps a term to describe the concept. The number of facets is limited by a couple of practical issues (like who will add the terms to describe content) but can be tailored to the organization's specific processes, content, markets, asset types, channels, brands, regions, etc. The point to be made, however, is that there is no replacement for taxonomy in a DAM solution. No search engine can do the job that taxonomy does, which is to give everybody a consistent mechanism for finding content.
  • On this slide you can see an ideal DAM system. Let's imagine that you have done it all: a centralized repository with well-established governance that leverages taxonomy and formal organizing principles to ensure that search and retrieval of assets is a smooth and streamlined process. It looks simple on this slide, but the reality is that making sure the proper taxonomy, metadata, and content strategy are in place before you simply create the DAM dumping ground is something people rarely do. But here's the catch... you are only halfway there at this point.

    1. Document Repositories & Metadata. Richard Beatch, Earley & Associates
    2. About Earley & Associates, Inc.
       • Focus: Information Architecture ("IA") Services
       • Founded: 1994
       • Personnel: Twenty core team consultants, plus a network of other top industry experts
          • ECM and KM experts
          • taxonomy specialists
          • search experts
          • information architects
          • usability professionals
          • technology consultants
          • business process experts
       • Headquarters: Boston, MA
       • Consulting Philosophy:
          • Organizing Principles based on business context and goals
          • Four Pillars - People, Content, Process, and Technology
    3. Core Capabilities: Enterprise Search, Portal Design, Collaboration; Web Content Management; Workflow Management; Security & Privacy Management; Rights Management; Records Management; Website Navigation, Search & SEO; Digital Asset Management; Taxonomy, Metadata, & Usability
    4. Core Capabilities
       • Document/Content/Management:
          • Strategy and requirements planning
          • Taxonomy, Metadata, Object modeling
          • Audit and analysis
          • Migration
          • Tagging and indexing
          • Lifecycle and workflow planning
          • Technology selection, RFP development
          • Governance
       • Taxonomy & Metadata:
          • Taxonomy strategy
          • Taxonomy development (for e-commerce, faceted search, ECM, DAM, enterprise taxonomy, thesauri)
          • Taxonomy evaluation and testing
          • Taxonomy implementation
          • Taxonomy governance and training
          • Taxonomy tool selection
          • Metadata standards development
          • Metadata schema design
          • Metadata governance
       • Digital Asset Management:
          • DAM strategy
          • DAM taxonomy
          • DAM technology evaluation
          • Asset lifecycle management
          • Marketing resource management (MRM)
       • Information Architecture/Usability:
          • Usability studies (site, navigation, taxonomy)
          • Wireframes and IA design
       • Search:
          • Search audit and user testing
          • Search strategy and ROI analysis
          • Taxonomy for faceted search and search optimization
          • Search deployment
          • Search and business intelligence
          • Search tuning and SEO
          • Search technology evaluation/tool selection
    5. About Me
       • Richard Beatch
          • Senior Consultant at Earley & Associates, Inc.
          • Ph.D. in Ontology
          • Specialized in taxonomy, search, metadata, and content architecture.
          • Extensive industry experience leading the design and implementation of taxonomies and search solutions for a range of companies including Apple, McAfee, Allstate, Dell, and AT&T.
          • Blog:
    6. The Challenge
       • Suppose you have roughly 1 million scanned documents entering your document management system each week
       • Suppose you want users to be able to find them in the future so as to conduct your business
       • Suppose it is 2001
    7. The result
       • H:DocStoreCaliforniaClaimsAutoPoliceRepPhotosDR65876KL
       • H:DocStoreCaliforniaClaimsAutoPoliceRepPhotosDR64876KL
       • H:DocStoreCaliforniaClaimsAutoPoliceRepPhotosDR64879DL
       • H:DocStoreCaliforniaClaimsAutoPoliceRepPhotosDW72876KL
       • Multiplied by (roughly) 250K each week
    8. Why should I care about access anyways?
       • Reuse of content
       • Access in order to do business, e.g., process an insurance claim
       • Access for regulatory needs
       • In short, to either generate revenue or save money
    9. How do we access this information now?
       • Ad hoc mechanisms
          • File shares
          • Snail mail/sending CDs
          • Email
    10. Why ad hoc approaches still fail:
       • Intricacies of:
          • file formats
          • digital rights
          • time to transfer content
       • Speech bubbles on the slide:
          • "Wow, what a cool photo! Can I re-use it?"
          • "Yeah, sure, let me get a copy and send it over."
    11. Document Management to the Rescue
       • Ad hoc sharing frustration bubbles up to the surface
       • Business recognizes the need and the potential cost savings over time
       • Speech bubble on the slide: "We need a document management system"
    12. But we all know the answer:
       • A database and metadata!
    13. But how do you expect me to find content?
       • Print: "Ted's Print Projects 2009"
       • Websites: "Home_.html"
       • Social Media: "Facebook new ideas"
    14. Taxonomy & Metadata For Findability
       • Type: Magazine Advertisement
       • Channel: Print
       • Target Demographic: Parents
       • Country: US
       • Language: Spanish
       • Concept: Rebellion
       • Brand: Settletra
       • Advertisement copy shown on the slide:
          • Do your kids:
          • Have discipline problems?
          • Trouble paying attention in school?
          • Trouble getting along with others?
          • Maybe it's time to find out how Settletra™ can help
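The facets on this slide can be sketched as a small set of controlled vocabularies plus a validation step. This is a minimal illustration, not the deck's implementation: the facet names and the tagged values come from the slide, while the other vocabulary terms and the `validate_tags` helper are hypothetical.

```python
# Controlled vocabularies, one per facet. Only the values used on the
# slide are from the deck; every other term is a hypothetical placeholder.
CONTROLLED_VOCABULARIES = {
    "Type": {"Magazine Advertisement", "Web Banner", "Press Release"},
    "Channel": {"Print", "Web", "Social Media"},
    "Target Demographic": {"Parents", "Teens"},
    "Country": {"US", "CA", "MX"},
    "Language": {"English", "Spanish"},
    "Concept": {"Rebellion", "Family"},
    "Brand": {"Settletra"},
}


def validate_tags(tags):
    """Return a list of problems; an empty list means the tags are valid."""
    problems = []
    for facet, allowed in CONTROLLED_VOCABULARIES.items():
        value = tags.get(facet)
        if value is None:
            problems.append(f"missing facet: {facet}")
        elif value not in allowed:
            problems.append(f"{facet}: {value!r} is not in the controlled vocabulary")
    for facet in tags:
        if facet not in CONTROLLED_VOCABULARIES:
            problems.append(f"unknown facet: {facet}")
    return problems


# The advertisement from the slide, tagged with one term per facet.
settletra_ad = {
    "Type": "Magazine Advertisement",
    "Channel": "Print",
    "Target Demographic": "Parents",
    "Country": "US",
    "Language": "Spanish",
    "Concept": "Rebellion",
    "Brand": "Settletra",
}

print(validate_tags(settletra_ad))
print(validate_tags({**settletra_ad, "Channel": "Billboard"}))
```

Rejecting free-text values at submission time is what keeps a label like "Ted's Print Projects 2009" out of the repository's searchable fields.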
    15. Metadata: a refresher
       • Structured data that describes the attributes of an "information package" (Taylor, 1994)
       • Helps manage & share information
       • Helps find information
       • Metadata can be applied at any level: Library, Document, Component, Data
    16. I am metadata
    17. Types of metadata
       • Categories: Structural, Administrative, Descriptive
       • Taxonomy can apply in any category
       • Questions metadata answers: What is it? What is it about? What is it called? When was it created? Who owns it? What's its status? What parts does it have?
    18. Types of metadata
       • Categories: Structural, Administrative, Descriptive
       • Taxonomy can apply in any category
       • Example fields: Subject, Title, Document type, Description, Date created, File type, Review date, Publication Status, Is_Part_Of, Requires, Parent_Object
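One way to picture the three categories is to split a flat metadata record into descriptive, administrative, and structural groups. The sketch below is illustrative only. The slide's diagram is flattened in this transcript, so the assignment of each field to a category is an assumption, and the `group_by_type` helper and the sample record are hypothetical.

```python
# Assumed assignment of the slide's example fields to the three
# metadata categories (the original diagram layout is not recoverable).
FIELDS_BY_TYPE = {
    "descriptive": ["Subject", "Title", "Document type", "Description"],
    "administrative": ["Date created", "File type", "Review date", "Publication Status"],
    "structural": ["Is_Part_Of", "Requires", "Parent_Object"],
}


def group_by_type(record):
    """Split a flat record into descriptive/administrative/structural groups."""
    grouped = {kind: {} for kind in FIELDS_BY_TYPE}
    grouped["unclassified"] = {}
    for field, value in record.items():
        for kind, fields in FIELDS_BY_TYPE.items():
            if field in fields:
                grouped[kind][field] = value
                break
        else:
            # No category claimed this field.
            grouped["unclassified"][field] = value
    return grouped


# Hypothetical record for illustration.
record = {
    "Title": "Settletra magazine advertisement",
    "Date created": "2009-05-15",
    "Is_Part_Of": "Settletra print campaign",
    "Colour": "blue",
}
grouped = group_by_type(record)
for kind, fields in grouped.items():
    print(kind, fields)
```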
    19. Taxonomy as metadata
       • Taxonomy is applied to content as metadata
          • Describes
             • Is-ness
             • About-ness
       • Taxonomy shown on the slide:
          • Item Types: Press (Press Releases, Logos, Press Kits)
          • Brands: ELAVIL, IRESSA
       • Metadata shown on the slide:
          • Document type: Document
          • Document name: IRESSA Recommended...
          • Date created: May-15-2009
          • Item Type ("Is a"): Press Release
          • Brand ("Is about"): IRESSA
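The "is a" and "is about" relationships can be sketched as two metadata fields whose values must resolve to terms in a taxonomy tree. This is a minimal sketch: the tree is one reading of the slide's flattened diagram, and the `find_path` helper and the `is_a` / `is_about` field names are hypothetical.

```python
# Taxonomy tree as nested dicts; an empty dict marks a leaf term.
# The hierarchy is an interpretation of the slide's flattened diagram.
TAXONOMY = {
    "Item Types": {
        "Press": {"Press Releases": {}, "Logos": {}, "Press Kits": {}},
    },
    "Brands": {"ELAVIL": {}, "IRESSA": {}},
}


def find_path(term, tree=TAXONOMY, trail=()):
    """Return the list of terms from the root down to `term`, or None."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == term:
            return list(here)
        found = find_path(term, children, here)
        if found:
            return found
    return None


# The document from the slide, with taxonomy terms applied as metadata.
document = {
    "Document name": "IRESSA Recommended...",
    "Date created": "May-15-2009",
    "is_a": "Press Releases",   # is-ness: what kind of thing it is
    "is_about": "IRESSA",       # about-ness: what it discusses
}

print(find_path(document["is_a"]))
print(find_path(document["is_about"]))
```

Storing the full path lets a search for the broader term "Press" also retrieve items tagged with the narrower term "Press Releases".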
    20. Uses for Metadata
       • Identification
       • Discovery
       • Structural
       • Rights
       • Product
    21. Identification
       • Globally unique identifiers
       • Single or federated registries (directories)
       • Choice of what to identify
          • Abstract piece of IP
          • Manifestation of work (US version, German version, etc.)
          • Individual copy
       • General or content type-specific
       • Examples:
          • Book publishing: ISBN, ISTC
          • Journal publishing: ISSN
          • Video content: ISAN
          • Music: ISWC, ISRC, ISMN, GRid
          • Broadcast industry: UMID
          • All content types: DOI, Handle
          • Internet resources: URL, URN, URI
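As an illustration of why standard identifiers are useful, many of them carry a check digit that lets a repository reject mistyped values at ingest. The sketch below validates an ISBN-13 using its standard weighting (alternating weights of 1 and 3, total divisible by 10). The function name is hypothetical and the sample number is used purely as test data; the deck itself does not discuss checksums.

```python
def is_valid_isbn13(isbn):
    """Check the ISBN-13 check digit; hyphens and spaces are ignored."""
    compact = isbn.replace("-", "").replace(" ", "")
    if len(compact) != 13 or not (compact.isascii() and compact.isdigit()):
        return False
    # Digits in odd positions (1st, 3rd, ...) weigh 1, the others weigh 3.
    total = sum(
        int(digit) * (1 if index % 2 == 0 else 3)
        for index, digit in enumerate(compact)
    )
    return total % 10 == 0


print(is_valid_isbn13("978-0-306-40615-7"))
print(is_valid_isbn13("978-0-306-40615-8"))
```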
    22. Discovery
       • Enable searching, querying, categorization
       • Basic identifying information
       • Descriptive metadata
       • Examples
          • Identifying information, from the Dublin Core schema: Title, Creator, Publisher, Format
          • Descriptive information, from the Dublin Core schema: Subject, Description
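The Dublin Core elements named on this slide can be serialized as a small XML record. The sketch below uses Python's standard `xml.etree.ElementTree` and the Dublin Core element namespace `http://purl.org/dc/elements/1.1/`. The wrapper element name `metadata`, the `build_dc_record` helper, and all field values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15-element Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Build an XML element holding one dc:* child per field."""
    root = ET.Element("metadata")  # wrapper name is an arbitrary choice
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Hypothetical values for illustration only.
record = build_dc_record({
    "title": "Settletra magazine advertisement",
    "creator": "Example Creative Agency",
    "publisher": "Example Publisher",
    "format": "application/pdf",
    "subject": "Rebellion",
    "description": "Spanish-language print ad aimed at parents.",
})

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The first four elements carry the identifying information and the last two the descriptive information, matching the split on the slide.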
    23. Discovery Standards
       • Basic bibliographic: Dublin Core
       • Books: ONIX
       • Magazine articles (print & online): PRISM
       • Journal articles (online): CrossRef
       • News stories: NewsML
       • Educational content: LOM
       • Images: TIFF, DIG35
       • Music: MUZE, AMG
    24. Structural
       • Describe logical structure of content
          • Ideally without defining output appearance
       • Allow content to be fed to predefined templates for production & distribution
       • Replacements for old markup languages (TROFF, SCRIPT, etc.)
       • Examples
          • From NITF tagset: <hedline> [sic], <byline>
    25. Structural Standards
       • Web pages: XHTML (HTML that can be validated through an XML parser)
       • News stories: NITF
       • E-books: IDPF OPS/OPF
       • Technical documentation (book form): DocBook
       • Technical documentation (modular): DITA
       • Multimedia: SMIL/MMS
    26. Rights
       • Establish rights that can be conveyed to user
       • Define rights that you own or can grant
       • Examples
          • From ODRL 1.1 Permission Elements: display, print, play, execute, sell, lend, give, lease, modify, excerpt, …
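Rights metadata pays off when it can be checked before an asset changes hands. The following is a simplified sketch, not ODRL itself: it borrows only the permission names listed on the slide (the slide's list is truncated, so this vocabulary is partial), and the `is_permitted` helper and the sample grant are hypothetical.

```python
# Permission names as listed on the slide; the slide's list ends with an
# ellipsis, so this set is knowingly incomplete.
ODRL_PERMISSIONS = frozenset({
    "display", "print", "play", "execute", "sell",
    "lend", "give", "lease", "modify", "excerpt",
})


def is_permitted(granted, action):
    """Return True if `action` is among the permissions granted for an asset."""
    if action not in ODRL_PERMISSIONS:
        raise ValueError(f"unknown permission: {action}")
    return action in granted


# Hypothetical grant: the photo may be shown and printed, but not altered.
photo_rights = {"display", "print"}

print(is_permitted(photo_rights, "print"))
print(is_permitted(photo_rights, "modify"))
```

Had the photo in the ad hoc sharing story carried rights metadata like this, the request to crop it could have been answered before anyone burned a CD.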
    27. Rights Standards
       • DRM-based distribution: ODRL, MPEG REL/XrML
       • Website indexing/search: ACAP
       • Image licensing: PLUS
       • Downstream reuse rights: Creative Commons
    28. Product
       • Describe characteristics of product
          • Physical or appearance
          • Marketing
       • Allow separation of content from product
       • Examples
          • From ONIX: <ProductForm>, <NumberOfPieces>, <Audience>, <NumberOfPages>
       • Product metadata standard: ONIX (books)
    29. The Holy Grail (diagram labels: Taxonomy & Metadata; Governance & Content Strategy; submission; retrieval)
    30. Why stop there?
    31. Perhaps we can do better…
       • This is ALL just metadata
       • Different users can focus on what is valuable to them:
          • Price
          • Optical zoom
          • Megapixels
       • The good news: this used to cost a fortune. Not anymore.
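The eCommerce-style browsing this slide alludes to is faceted navigation: count how many items carry each value of a facet, then narrow the result set as the user picks values. A minimal sketch follows, using the three attributes named on the slide. The camera records and the `facet_counts` and `refine` helpers are hypothetical.

```python
from collections import Counter

# Hypothetical product records; only the attribute names come from the slide.
CAMERAS = [
    {"model": "A", "price": 199, "optical_zoom": 3, "megapixels": 10},
    {"model": "B", "price": 249, "optical_zoom": 5, "megapixels": 10},
    {"model": "C", "price": 349, "optical_zoom": 5, "megapixels": 12},
]


def facet_counts(items, facet):
    """Count how many items carry each value of the given facet."""
    return Counter(item[facet] for item in items)


def refine(items, **selected):
    """Keep only the items matching every selected facet value."""
    return [
        item for item in items
        if all(item[facet] == value for facet, value in selected.items())
    ]


# A user who cares about megapixels first, then optical zoom.
print(facet_counts(CAMERAS, "megapixels"))
ten_mp = refine(CAMERAS, megapixels=10)
print(facet_counts(ten_mp, "optical_zoom"))
print([camera["model"] for camera in refine(ten_mp, optical_zoom=3)])
```

The same two functions work unchanged on document metadata such as Type, Channel, or Brand, which is the point of the slide: it is all just metadata.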
    32. Conclusion
       • Managing large and changing document repositories is challenging.
       • File stores and databases alone cannot provide for genuine findability.
       • Semantically rich metadata can provide for findability through search.
       • Shifts in the costs of faceted navigation make eCommerce-style searching a real option within the enterprise.
    33. Communities & Events
       • Communities of Practice
          • Taxonomy:
          • SharePoint IA:
          • Search:
       • Upcoming Webinars
          • Taxonomy Community of Practice series
          • Technology Showcase series
          • Jumpstarts
    34. Communities & Events
       • SharePoint IA Group:
       • Taxonomy Group:
       • Search Group:
       • Upcoming Taxonomy Community of Practice Webinars
          • May 5, 2010: Cross-Channel Brand Management
          • June 2, 2010: Mega Menus
          • July 7, 2010: Taxonomy for SharePoint 2010
       • Upcoming Vendor Showcase Webinars
          • March 30, 2010: SharePoint Search
          • May 11, 2010: Optimizing Search with FAST
       • Visit for upcoming schedules and archives.
    35. For Additional Reading
       • Conquering Chaos via Smart Content Management
       • Tips for Keyword Research
       • Measuring the Success of a Taxonomy Project
       • Retrospective Indexing: Strategies for Cataloging Legacy Content
       • Designing for Faceted Search
       • Search & Taxonomy - Leveraging Metadata to Return Content in Context
    36. Questions