Drilling Down to the Challenges of SharePoint Taxonomy Implementation


Webinar presented by Marjorie M.K. Hlava of Access Innovations, Inc. and Joe Shepley of Doculabs on August 10, 2011 for the American Society for Information Science and Technology.

Published in: Technology, Education

  1. Drilling Down to the Challenges of SharePoint Taxonomy Implementation
     By Joe Shepley and Marjorie M.K. Hlava
  2. What You’ll Learn
     At the end of this webinar, you’ll better understand:
       The problems caused by having a poor taxonomy for SharePoint
       The benefits of having an effective taxonomy for SharePoint
       How to create a taxonomy in SharePoint
       How “partner” technologies can improve the taxonomy creation and management process in SharePoint
  3. SharePoint 2010 Capabilities
  4. SharePoint 2010 Capabilities
     TAXONOMY
  5. SharePoint Has Many Potential Benefits
  6. If You Can Implement It Correctly
     Thousands of sites, most unknown to SharePoint administrators
     Terabytes of unnecessary content
     No rhyme or reason to site and site collection structure
     No consistent use of metadata…if used at all
  7. The Results of Poor SharePoint Taxonomy
     The result is a tangle of SharePoint sites, with poorly organized content at every level, which renders the SharePoint environment little better than traditional shared drives
  8. The Results of Poor SharePoint Taxonomy
     In fact, in many ways a SharePoint without an IA (or with a poorly designed one) is worse than shared drives:
       Higher storage volumes (multiple copies of a document, each with version control on it)
       Higher per-user costs (need licenses to use SharePoint)
       Higher maintenance (DBA, SharePoint developers, and admins needed for the care and feeding of SharePoint)
  9. Challenges to Building a SharePoint Taxonomy
     Even when a SharePoint implementation is planned, taxonomy typically gets eclipsed by “nuts and bolts” activities required to stand up the environment, like network architecture
       Tight schedule, budget constraints
     Lack of experience with taxonomy at most organizations means it gets low (or no) priority during SharePoint design and implementation
       Often no one owns taxonomy at the organization
       Few people outside of web design have heard of it
       Fewer have ever had any direct experience with it
       Taxonomy may have never been done at any time, in any part of the organization at all
  10. How Does a Taxonomy Help SharePoint?
  11. SharePoint 2010 Metadata Management
      Create taxonomy lists in the Term Store
      Use the taxonomy for assisted indexing
      Type-ahead suggestion for indexing content
      Use synonyms to represent multiple ways to express a single subject
      Improves precision and recall for indexing
      Import preexisting taxonomies from a CSV file
  12. Term Sets
      Select Term Store Management, located under Site Administration
      Edit Term Sets to accurately reflect your document libraries and content types. Term sets can be individual taxonomies or flat controlled vocabulary lists
  13. SharePoint Server 2010 Capabilities
      Some of the features of Windows SharePoint Services are used directly by Office SharePoint Server 2010:
        List management
        Storage capabilities
        Web Part framework
  14. Features of SharePoint Server 2010
      Features highlighted in Microsoft Office SharePoint Server (MOSS) 2010:
        Search (FAST ESP)
        Document management
        Enterprise content management
        Business process automation and workflows
        Taxonomy and metadata management
  15. Managing Site Content
      Create document libraries to reflect different content types used in all departments.
      Add metadata
        Author
        File extension
        Subject and indexing terms
        Company code
        Locations
        Date added
        Other metadata (Dublin Core)
      Add Retention: Choose when the server deletes the content, or updates it.
  16. SharePoint needs
      Metadata on every document
      Relevant search
      Related content alerts
      Automatically aggregated content
      Many use cases
      Simple tagging
      Authors
      Staff
      As uploaded
      Automatic Security and retention for content types
  17. Taxonomy in SharePoint Allows
      Browse by terms
      Search Documents
      Limit Search by Facets
      Update terms
      Reindex Documents
      Automatic and Assisted indexing methods
      Facilitate document retention
      Document security by user and document type
      Allow for the ability to use tagging – view and select
      Integrate seamlessly with SharePoint 2010
      Integrate with other CMS (Ektron, Drupal, etc.)
  18. Why add a partner to SharePoint?
      Use taxonomy in multiple systems
      Manage, audit, and govern the taxonomy
      Identify and extract information from documents
      Tag legacy data automatically
      Bulk add the metadata by populating site columns with reference to taxonomies
  19. Taxonomy Fully Integrated with MOSS
      [Diagram. Labels: Client Data; Full Text; HTML, PDF, Data Feeds, etc.; Automatic Summarization; Search; Presentation: 90% accuracy; Browse by Subject; Auto-completion; Broader Terms; Narrower Terms; Related Terms; Machine Aided Indexer (M.A.I.™); SharePoint Server Repository; Search Software; Inline Tagging; Client Taxonomy; Metadata and Entity Extractor; Thesaurus Master]
  20. Adding terms to the taxonomy
      Suggest new (unused) terms for content after bulk import
      Use the folksonomy features of SharePoint
      Use the search logs
      Could also use Novelty Detection
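
The "use the search logs" idea on the slide above can be illustrated with a minimal Python sketch. It assumes the log is already available as a plain list of query strings and the taxonomy as a list of term labels; the function name, the sample data, and the `min_count` threshold are all hypothetical, and this is not how SharePoint or Data Harmony expose their logs.

```python
from collections import Counter


def suggest_candidate_terms(queries, taxonomy_terms, min_count=2):
    """Return (query, count) pairs for frequent queries that match no taxonomy term."""
    known = {term.casefold() for term in taxonomy_terms}
    # Normalize case and whitespace so "NDA template" and "nda template" count together.
    counts = Counter(q.strip().casefold() for q in queries if q.strip())
    return [(q, n) for q, n in counts.most_common() if n >= min_count and q not in known]


# Hypothetical sample data, for illustration only.
taxonomy_terms = ["Contracts", "Invoices", "Retention policy"]
search_log = [
    "invoices", "NDA template", "nda template", "contracts",
    "NDA template", "expense report", "Expense report",
]

candidates = suggest_candidate_terms(search_log, taxonomy_terms)
print(candidates)
```

Queries that recur but match no existing term are candidates for a taxonomist to review; the sketch only ranks them and does not add anything to the taxonomy on its own.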
  21. Taxonomy in Functions
      Equivalent relationships (synonyms; preferred and non-preferred terms)
      Associative relationships (related terms)
      Easy updating and modification of terms
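
The two relationship types named on the slide above can be modeled with a small data structure. The following Python sketch is an assumption-laden illustration: `TermRecord`, `Thesaurus`, and the sample terms are hypothetical names, not the term record format of SharePoint or of any thesaurus product.

```python
from dataclasses import dataclass, field


@dataclass
class TermRecord:
    preferred: str
    # Equivalent relationship: non-preferred labels (synonyms) that point to this term.
    non_preferred: set = field(default_factory=set)
    # Associative relationship: other preferred terms related to this one.
    related: set = field(default_factory=set)


class Thesaurus:
    def __init__(self):
        self.records = {}
        self._labels = {}  # casefolded label -> preferred term

    def add_term(self, preferred, synonyms=()):
        self.records[preferred] = TermRecord(preferred, set(synonyms))
        for label in (preferred, *synonyms):
            self._labels[label.casefold()] = preferred

    def relate(self, a, b):
        """Record an associative relationship in both directions."""
        self.records[a].related.add(b)
        self.records[b].related.add(a)

    def resolve(self, label):
        """Return the preferred term for any known label, or None."""
        return self._labels.get(label.casefold())


thesaurus = Thesaurus()
thesaurus.add_term("Automobiles", synonyms=["Cars", "Motor vehicles"])
thesaurus.add_term("Highways")
thesaurus.relate("Automobiles", "Highways")

print(thesaurus.resolve("cars"))
print(thesaurus.records["Highways"].related)
```

Storing synonyms on the preferred term's record means an indexer can type any variant and still attach the one preferred term, which is what keeps tagging consistent.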
  22. SharePoint Browsable Search
  23. Core Architectural Components
      [Architecture diagram: adding the taxonomy. Labels: Custom Connector; Email Connector; Database Connector; File Traverser; Web Crawler; FAST Management API; Query API; Content API; Data Harmony Governance API; Search Server; FilterServer; Administrator’s Dashboard; Web Content; Vertical Applications; Pipeline; Query Pipeline; Files, Documents; Query Processor; Portals; Index DB; Databases; Document Processor; Results; Custom Front-Ends; Alerts; Email, Groupware; Search Harmony; Mobile Devices; Custom Applications; Content Push; MAIstro; Agent DB]
  24. Federated SharePoint Search
  25. Fields searched
  26. Role of Staff
      Project Coordination
      Sample data
      Copy of thesaurus
      Update and maintain thesaurus
      Take training
      Decide who will do the indexing
        Only staff
        Everyone
      SharePoint Server Admin will install
  27. Incorporating Taxonomy into SharePoint
      Add an EventHandler to Document Library
      After a user uploads a file, EventHandler will send the file content to the Data Harmony server
      Data Harmony server creates metadata by adding suggested terms from M.A.I.
      SharePoint updates metadata fields
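
The upload flow on the slide above can be simulated in a few lines of Python. This is only a sketch of the control flow: it does not use SharePoint's event receiver API or any Data Harmony API. `DocumentLibrary`, `suggest_terms`, `RULE_BASE`, and the `Subject` field are hypothetical stand-ins, and the phrase-matching "indexer" replaces what would in practice be a network call to the indexing server.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    name: str
    content: str
    metadata: dict = field(default_factory=dict)


def suggest_terms(content, rule_base):
    """Stand-in for the indexing server: return terms whose trigger phrases occur in the text."""
    text = content.casefold()
    return sorted(
        term for term, phrases in rule_base.items()
        if any(phrase.casefold() in text for phrase in phrases)
    )


class DocumentLibrary:
    """Toy document library that calls registered handlers after each upload."""

    def __init__(self):
        self.items = []
        self._item_added_handlers = []

    def on_item_added(self, handler):
        self._item_added_handlers.append(handler)

    def upload(self, doc):
        self.items.append(doc)
        for handler in self._item_added_handlers:
            handler(doc)  # step 2: the handler sends the content out for indexing
        return doc


# Hypothetical rule base: taxonomy term -> phrases that trigger it.
RULE_BASE = {
    "Contracts": ["agreement", "contract"],
    "Retention": ["retention schedule"],
}


def tagging_handler(doc):
    # Steps 3 and 4: suggested terms come back and the metadata field is updated.
    doc.metadata["Subject"] = suggest_terms(doc.content, RULE_BASE)


library = DocumentLibrary()
library.on_item_added(tagging_handler)  # step 1: attach the handler to the library
doc = library.upload(Document("msa.txt", "Master services agreement with a retention schedule."))
print(doc.metadata)
```

Keeping the indexer behind a single function makes it straightforward to swap the stand-in for a real remote call without touching the handler registration.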
  28. Taxonomy view and Thesaurus Term Record view
  29. Machine-Aided Indexing (M.A.I.)
      Automatically populate Keywords, Descriptors, Indexing terms, etc.
      Allow for manual review of auto-tagging for quality assurance.
  30. Automated Indexing for SharePoint
      User adds a document to the SharePoint space and attaches indexing terms to the document.
      A new version is saved on the SharePoint 2010 server with edited properties
      Batch upload documentation to SharePoint
  31. Taxonomy Management
      Export an existing taxonomy into a CSV
      Import new taxonomy as a Term set into SharePoint Term store management
      Use the taxonomy for assisted searches and indexing
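
The export-then-import step on the slide above depends on producing a CSV in the layout the Term Store import expects. The Python sketch below writes a file in the column layout commonly described for the SharePoint 2010 managed metadata import file (one row per term, with the term's full path spread across "Level 1 Term" to "Level 7 Term"). Treat the header names and the seven-level limit as assumptions to check against the sample import file offered in Term Store Management; the term set name and terms are hypothetical.

```python
import csv
import io

# Assumed header of the SharePoint 2010 term set import file; verify against the sample file.
HEADER = [
    "Term Set Name", "Term Set Description", "LCID",
    "Available for Tagging", "Term Description",
] + [f"Level {i} Term" for i in range(1, 8)]


def term_set_rows(name, description, paths):
    """Yield one row per term; each path is the term's chain of labels from level 1 down."""
    for i, path in enumerate(paths):
        if not 1 <= len(path) <= 7:
            raise ValueError(f"term path must have 1 to 7 levels: {path!r}")
        first = i == 0  # term set name and description appear only on the first data row
        yield (
            [name if first else "", description if first else "", "", "TRUE", ""]
            + list(path)
            + [""] * (7 - len(path))
        )


def write_term_set_csv(name, description, paths):
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(term_set_rows(name, description, paths))
    return buf.getvalue()


# Hypothetical taxonomy: each tuple is a full path from the top-level term.
paths = [
    ("Records",),
    ("Records", "Contracts"),
    ("Records", "Invoices"),
    ("Policies",),
]
csv_text = write_term_set_csv("Corporate Taxonomy", "Hypothetical sample term set", paths)
print(csv_text)
```

A flat path-per-row layout like this carries hierarchy only; synonyms and related terms held in the source thesaurus would need a separate route into the Term Store.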
  32. Data Harmony Sample
      A sample taxonomy exported directly from Data Harmony
  33. Data Harmony Sample
      Create and name a Column for adding metadata.
      Select the Managed Metadata radio button to add a Term set or taxonomy
  34. Managed Metadata
      Importing a taxonomy enhances the way users can manually add indexing terms
        Inclusion of synonyms
        Type-ahead for searching and adding metadata
        Browsing the hierarchy for indexing terms
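
How synonyms and type-ahead work together, as described on the slide above, can be sketched in Python. This is an illustration of the idea only, not SharePoint's managed metadata picker: `build_index`, `type_ahead`, and the sample terms are hypothetical.

```python
def build_index(term_synonyms):
    """Map every casefolded label (preferred or synonym) to its preferred term."""
    index = {}
    for preferred, synonyms in term_synonyms.items():
        for label in (preferred, *synonyms):
            index[label.casefold()] = preferred
    return index


def type_ahead(prefix, index, limit=5):
    """Suggest preferred terms whose label or synonym starts with the typed prefix."""
    typed = prefix.casefold()
    suggestions = []
    for label in sorted(index):
        preferred = index[label]
        if label.startswith(typed) and preferred not in suggestions:
            suggestions.append(preferred)
    return suggestions[:limit]


# Hypothetical term set: preferred term -> synonyms.
TERMS = {
    "Automobiles": ["Cars", "Autos"],
    "Aviation": ["Air travel"],
    "Carpentry": [],
}
index = build_index(TERMS)

print(type_ahead("car", index))
print(type_ahead("au", index))
```

Typing "car" surfaces "Automobiles" even though the preferred label does not start with those letters, because the synonym "Cars" resolves to it; that is the benefit of importing synonyms along with the hierarchy.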
  35. Managed Metadata
      Users can browse for indexing terms or…
      Type ahead and select the appropriate suggestion
  36. Data Harmony & MOSS
      User uploads a document to SharePoint space
      Before uploading to SharePoint server, the EventHandler sends the document to Data Harmony.
      Data Harmony automatically attaches indexing terms before uploading to MOSS
      [Diagram labels: Event Handler; Data Harmony Server (M.A.I.); Microsoft SharePoint Server 2010; Returns subject metadata; Returns data to user]
  37. About Doculabs
      Doculabs consultants are experts in enterprise social collaboration and content management. We deliver highly actionable and comprehensive strategic plans and road maps that help our clients achieve their business goals, create competitive advantage, and reduce risk.
      Our services help organizations govern information for the benefit of internal and external constituents through enhanced customer communications, e-discovery, and collaboration processes.
      Quick Facts
        Founded in 1993
        Headquartered in Chicago
        Privately held
        Delivered more than 1000 engagements to more than 500 customers
  38. About Access Innovations
      Access Innovations are experts in content creation, enrichment, and conversion services. We provide services to semantically enrich and tag raw text, turning it into highly structured data. We deliver clean, well-formed, metadata-enriched data so our clients can reuse, repurpose, store, and find their knowledge assets. We go beyond the standards to build taxonomies and other data control structures as a solid foundation for data.
      Our services and software allow organizations to use and present their information to both internal and external constituents by leveraging search, presentation, and e-commerce. We change search to found!
      Quick Facts
        Founded in 1978
        Headquartered in Albuquerque
        Privately held
        Delivered more than 2000 engagements
  39. Questions?
      Marjorie M.K. Hlava
      Access Innovations, Inc.
      mhlava@accessinn.com
      (505) 998-0800
      Joe Shepley
      Doculabs, Inc.
      jshepley@doculabs.com
      (773) 827-2945
      http://flavors.me/jshepley