Why Metadata Matters in SharePoint Search and Information Governance Webinar


Explore innovative approaches to leverage your SharePoint investment and discover how other SharePoint organizations are solving the challenges of information governance in SharePoint and in Office 365.

Join Cem Aykan, Senior Product Manager for Search at Microsoft, and Don Miller, VP of Commercial Accounts at Concept Searching, for this one-hour webinar.

• Understand the direction of SharePoint Search and the impact and changing landscape of Microsoft’s focus on the cloud and Office 365
• Obtain a high-level view of the importance of metadata in search, records management, data privacy, eDiscovery, litigation support, text analytics, and social content and collaboration
• Find out how you can eliminate manual tagging and develop an enterprise metadata repository to solve all your information governance challenges
• See the award-winning conceptClassifier for SharePoint platform in action and how it addresses information governance – on-premises, in the cloud, and in hybrid environments

Published in: Technology


  1. 1. Why Metadata Matters in SharePoint Search and Information Governance Cem Aykan Senior Product Manager of Enterprise Search Microsoft cem.aykan@microsoft.com Twitter @Microsoft Don Miller Vice President of Commercial Accounts Concept Searching donm@conceptsearching.com Twitter @conceptsearch
  2. 2. Expert Speakers Cem Aykan – Senior Product Manager of Enterprise Search at Microsoft has been active in the field from the early days of FAST and joined Microsoft with the FAST Search acquisition back in 2008. Since then, he has been involved in a range of search projects and initiatives, and is currently working on nextgen Search and Discovery scenarios. Don Miller – Vice President of Sales at Concept Searching has over 20 years’ experience in knowledge management. He is a frequent speaker on records management, and information architecture challenges and solutions, and has been a guest speaker at Taxonomy Boot Camp, and numerous SharePoint events about information organization and records management.
  3. 3. Agenda • Microsoft • Roadmap for SharePoint Search • SharePoint Search 2013 features and functions • Impact and changing landscape of Microsoft’s focus on the cloud and Office 365 • Concept Searching • Information Governance and why it matters – on-premise and in the cloud • Technologies and solutions • Next Steps
  4. 4. Enterprise Search Cem Aykan Sr. Product Marketing Manager Business-Critical SharePoint
  5. 5. Evolution of Search
  6. 6. More than 10 Blue Links… Unified Search: 10 blue links vs. • Visual and actionable • Contextual and personalized • Rich recommendations driven by user behavior • Extensible search platform with industry standards Business-Critical SharePoint
  7. 7. Search Drives Experiences…
  8. 8. Why Search and Information Governance? • Information overload makes it difficult to stay on top of topics that matter most • Finding the right information for the task at hand can be difficult and time consuming • Connecting with the right experts across different teams is challenging
  9. 9. Forward Looking… • Faster release cadence on service • A/B Instrumentation on new features • Enhanced analytics • Connected experiences across O365 • Mobile experiences & responsive designs • Extract meaning and context from social • Engaging, visual and actionable results • Flexible on-premises, cloud and hybrid • Consume diverse content and signals • Don’t just search, ask questions
  10. 10. Native Integration into Search/SharePoint 2013 • SharePoint 2013 is the enabler to achieve the objectives of Information Governance • conceptClassifier for SharePoint platform does not replace SharePoint Search but augments it by providing rich multi-term metadata to the search index, auto-classifying content, and allowing management of content via the Term Store/taxonomy • conceptClassifier for SharePoint, coupled with the features and functions of SharePoint Search, results in a powerful enterprise application that can be used to significantly improve ‘findability’ and is critical to solving Information Governance challenges
  11. 11. The Global Leader in Managed Metadata Solutions • Company founded in 2002 • Product launched in 2003 • Focus on management of structured and unstructured information • Technology Platform • Delivered as a web service • Automatic concept identification, content tagging, auto-classification, taxonomy management • Only statistical vendor that can extract conceptual metadata • 2009, 2010, 2011, 2012, 2013, 2014 ‘100 Companies that Matter in KM’ KMWorld and Trend Setting product of 2009, 2010, 2011, 2012, 2013 • Authority to Operate enterprise wide US Air Force and enterprise wide NETCON US Army • Locations: US, UK, and South Africa • Client base: Fortune 500/1000 organizations • Microsoft Business-Critical SharePoint Program partner, Gold Certification in Application Development • Smart Content Framework™ for Information Governance comprising • Five Building Blocks for success • Product Platforms: conceptClassifier for SharePoint, conceptClassifier for Office 365, conceptClassifier, and Concept Searching Technology
  12. 12. What is Information Governance? Wikipedia says: • “Information governance, or IG, is the set of multi-disciplinary structures, policies, procedures, processes and controls implemented to manage information at an enterprise level, supporting an organization's immediate and future regulatory, legal, risk, environmental and operational requirements.” • “IG encompasses more than traditional records management. It incorporates privacy attributes, electronic discovery requirements, storage optimization, and metadata management.” Gartner says: • “Information governance is the specification of decision rights and an accountability framework to encourage desirable behavior in the valuation, creation, storage, use, archival and deletion of information. It includes the processes, roles, standards and metrics that ensure the effective and efficient use of information in enabling an organization to achieve its goals.” More simply put, the goal of Information Governance is to optimize the value of information, while simultaneously minimizing the associated risks and costs.
  13. 13. What is Information Governance? • Managing the information lifecycle of structured and unstructured information to improve business performance and to address • Regulatory Compliance • Organizational Policy and Risk • Privacy/Security • At a tactical solution level • Search • Records Management • Metadata Management • Migration • At an application solution level • eDiscovery • PII, PHI, FOIA • Enterprise Metadata Platform
  14. 14. Why do you care? • Without effective governance, most technology focused metadata projects will fail (Forrester Research) • Less than 50% of content is correctly indexed, meta tagged, or efficiently searchable • Unstructured data and metadata are increasing at an average annual growth rate of 62% • Corporations will be responsible for the security, privacy, reliability, and compliance of 85% of that information (IDC 2010 Digital Universe Study) • 67% of data loss in records management is due to end user error (Prism International) • 70% of data breaches are due to end user error (Ponemon Institute)
  15. 15. Hopefully this isn’t your job! “Gartner predicts by 2016, 20% of CIOs in regulated industries will lose their jobs for failing to implement the discipline of information governance successfully.”
  16. 16. Smart Content Framework™ Sum of parts is greater than whole Metadata driven applications – conceptClassifier for SharePoint platform has been deployed by clients in diverse industries to automatically generate metadata and use that metadata to apply and enforce Information Governance policies
  17. 17. Metadata “The metadata infrastructure provides the critical glue that binds the information infrastructure to the underlying IT infrastructure. Sound information governance practices would take advantage of the metadata infrastructure to ensure that content and data are managed consistently and adhere to written policies, across on-premise and cloud based environments.” 2010 IDC Digital Universe Study Advantages • Ability to develop a single repository of organizationally relevant metadata to be made available to any application that requires the use of metadata • Elimination of costs and errors associated with end user tagging • Normalization of content across functional and geographic boundaries to remove ambiguity in vocabulary • Metadata managed and changed in one place • Ability to apply policy consistently across diverse repositories and applications • Provides flexibility to rapidly make changes to the repository for regulatory compliance where changes are immediately available for use by applications
  18. 18. Insight “Automated tools for handling the flood of information are the only solution to coping with the increasing demands for compliance, more targeted discovery and better business intelligence.” IDC Advantages • Provides the ability to find and deliver the most relevant and granular results from large, heterogeneous repositories • Provides access to relevant knowledge assets that typically would not be found • Reduces duplication of content • Makes content available for re-use and re-purposing instead of recreating it • Removes ambiguity in search • Compliance and security of content assets • Improves any interactive metadata application such as search, eDiscovery, litigation support, FOIA, text analytics, social tagging, and collaboration
  19. 19. Risk “More than 100,000 international laws and regulations are potentially relevant to Forbes Global 1000 companies—ranging from financial disclosure requirements to standards for data retention and privacy. Additionally, many of these regulations are evolving and often vary or even contradict one another across borders and jurisdictions.” Lorrie Luellig, Of counsel, Ryley Carlock & Applewhite, PC Advantages Risk is different for every organization – regulatory, intellectual property protection, cyber security, eDiscovery, data retention, even the use of information in unintended ways • The ability to effectively identify and validate the ‘risk’ factor • Cost versus Benefit – you may want to assume risk in certain instances • Provides the ability to identify risk – known and unknown – while weighing information value • Proactively addresses and reduces risk factors through the use of business processes and technology • Integrated into an organization’s enterprise objectives or functional objectives
  20. 20. Policy “Sound information governance practices and tools would enable organizations to align their data retention, acceptable use and communication, data privacy, records management, and information security policies, processes, and technical controls.” Worldwide Governance, Risk, and Compliance Infrastructure 2010–2014 Advantages Policy is driven by the organization – people – not by applications • Requires the appropriate and individualized approach for the disposition of diverse content • Includes identifying where the content resides, cleansing the content, identifying the relationship between content, then defining the policies • Key component is business user responsibilities and adaptability of the users to follow new procedures – i.e. elimination of end user tagging • Provides the infrastructure processes where content is relevant, protected, archived, or deleted
  21. 21. Action Action as a pillar in the Smart Content Framework™ is the execution and interactive management of the policies and subsequent processes that ensures all unstructured and semi-structured content is processed in a manner that achieves the Information Governance objectives. Advantages • Fulfils the defined organizational policies that reduce risk and enable effective management of all semi-structured and unstructured content • Enforceable and adopted by business users • Facilitates and improves business processes • Quantifiable and able to be measured
  22. 22. How do we get to Information Governance with SharePoint? How do we get there with SharePoint? 1. Manually create term set for Information Governance and manage 2. Manually search content to validate 3. Manually add metadata to documents in alignment with Regulations, Risk and Policy 4. Apply metadata to all legacy content 5. SharePoint 2010 or 2013 and Office 365 How do we really get there? 1. Extract out semantic vocabulary from your content (conceptTaxonomyManager) 2. Validate in alignment with your vocabulary (conceptTaxonomyManager) 3. Automate applying metadata to new and legacy content (conceptClassifier for SharePoint) 4. Move documents into SharePoint 2010, 2013 or Office 365 (conceptTaxonomyWorkflow)
  23. 23. How to Achieve SharePoint Information Governance conceptClassifier for SharePoint and conceptClassifier for Office 365 platforms: • conceptClassifier Both automated and manual classification is supported to one or more term sets within the Term Store and across content hubs. • conceptTaxonomyManager This is an advanced enterprise class, easy-to-use taxonomy and term set development and management tool. It integrates natively with the SharePoint Term Store reading and writing in real time ensuring that the taxonomy/term set definition is maintained in only one place. • conceptSearch Compound Term Indexing Engine Licensed for the sole use of building and refining the taxonomy/term set, the engine provides automatic semantic metadata generation that extracts multi-word terms or concepts along with keywords and acronyms. Optional Product: • conceptTaxonomyWorkflow Can perform an action on a document following a classification decision when certain criteria are met. The workflow source type works in SharePoint 2007, 2010, and 2013, as well as all document types, including FILE and HTTP.
  24. 24. Why is Metadata so hard to get right for Information Governance? A manual metadata approach will fail 95%+ of the time. Issue and organizational impact: • Inconsistent: Less than 50% of content is correctly indexed, meta-tagged or efficiently searchable, rendering it unusable to the organization (IDC) • Subjective: Highly trained information specialists will agree on meta tags between 33%-50% of the time (C. Cleverdon) • Cumbersome, expensive: Average cost of manually tagging one item runs from $4 - $7 per document and does not factor in the accuracy of the meta tags nor the repercussions from mistagged content (Hoovers) • Malicious compliance: End users select first value in list (Perspectives on Metadata, Sarah Courier) • No perceived value for end user: What’s in it for me? End user creates document, does not see value for organization nor risks associated with litigation and non-conformance to policies • What have you seen: Metadata will continue to be a problem due to inconsistent human behavior
  25. 25. Building an Information Governance Concept Index Concept Searching has a unique approach to ensure success • Concept Searching’s unique statistical concept identification underpins all technologies • Multi-word suggestion is explicitly more valuable than single term suggestion algorithms Concept Searching provides Automatic Concept Term Extraction (diagram: the concept ‘Triple Heart Bypass’ versus its single keywords: Triple – Baseball, Three; Heart – Organ, Center; Bypass – Highway, Avoid) • conceptClassifier for SharePoint will generate conceptual metadata by extracting multi-word terms that identify ‘triple heart bypass’ as a concept as opposed to single keywords • Metadata can be used by any search engine index or any application/process that uses metadata.
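
The contrast between single keywords and multi-word terms on slide 25 can be illustrated with a small Python sketch. This is not Concept Searching's statistical algorithm, which the slides do not describe; it is a plain n-gram frequency count used as a stand-in. The function names, the stopword list, and the sample text are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for the illustration.
STOPWORDS = {"a", "an", "the", "of", "and", "for", "to", "in", "on", "was", "is", "after"}


def candidate_terms(text, max_words=3):
    """Count word n-grams (1..max_words) that neither start nor end with a stopword."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter()
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            counts[" ".join(gram)] += 1
    return counts


def multiword_terms(text, min_count=2, max_words=3):
    """Return repeated multi-word terms, longest first, then alphabetically."""
    counts = candidate_terms(text, max_words)
    return sorted(
        (term for term, c in counts.items() if " " in term and c >= min_count),
        key=lambda t: (-len(t.split()), t),
    )


sample = (
    "The patient was scheduled for a triple heart bypass. "
    "Recovery after a triple heart bypass takes several weeks."
)
print(multiword_terms(sample))
```

In this toy input the three-word phrase ranks first, ahead of its two-word fragments, so a tag built from it keeps the medical meaning that the isolated words "triple", "heart", and "bypass" would lose.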
  26. 26. How to create Metadata Alignment for Information Governance conceptClassifier for SharePoint provides an automated metadata approach for an immediate ROI and enforces Information Governance • Create enterprise automated metadata framework/model • Average return on investment minimum of 38% and runs as high as 600% (IDC) • Apply consistent meaningful metadata to enterprise content • Incorrect meta tags cost an organization $2,500 per user per year, in addition to potential costs for non-compliance (IDC) • Guide users to relevant content with taxonomy navigation • Savings of $8,965 per year per user based on an $80K salary (Chen & Dumais) • 100% “Recall” of content, 35% Faster access to content “Precision” • Use automatic conceptual metadata generation to improve Records Management • Eliminate inconsistent end user tagging at $4-$7 per record (Hoovers) • Improve compliance processes, eliminate potential privacy exposures Cycle: 1. Create concept index from your content 2. Model and Validate 3. Automate Tagging 4. Findability 5. Business Processes 6. Records Management and PII 7. Life Cycle Management
  27. 27. Enterprise Search “By itself the search function has limited value. The real value of search and information access technologies is in the ongoing efforts needed to establish effective taxonomies, to index and classify content of all kinds, in order to provide meaningful results.” Tom Eid, Research Vice President, Gartner Group
  28. 28. Building Blocks Metadata, Insight, Governance Situation: • Not-for-profit organization that contributes to the prevention and cure of cancer • More than 30,000 users • Outpatient treatment programs that record more than 328,300 visits a year Challenge: • Portal to enable patients to access information relevant to their specific health situations • Accurate, medically sound, and secure information necessary Solution: • conceptClassifier for SharePoint platform • SharePoint 2010 Microsoft FAST Search • Integrated solution with partner Aeturnum Benefits: • Accuracy of search • Relevance of results • Confidence in data • Control and trust “With more than 30,000 current users, the MyMoffitt Patient Portal has seen significant growth, and of the new patients that come to Moffitt, 87% register for a patient portal account. All developments and enhancements are about improving the patient experience.” Jennifer Camps, Director of Portal Technologies and Data Management, Moffitt Cancer Center Read the Case Study
  29. 29. Search Demonstration
  30. 30. Records Management “It is simply not realistic to expect broad sets of employees to navigate extensive classification options while referring to a records schedule that may weigh in at more than 100 pages.” Forrester Research/ARMA International Survey
  31. 31. Data Privacy and Cyber Security “70% of all breaches are due to the organization’s own staff.” Ponemon Institute
  32. 32. Building Blocks Metadata, Policy, Governance Situation: • UK County Council • Serves approximately 1 million citizens Challenge: • Management of digital records • Reduction in paper records • Reduce physical and digital storage Solution: • conceptClassifier for SharePoint platform • SharePoint Search • SharePoint Records Management • Managed Metadata Services “One way to manage records whatever their medium” Benefits: • Migration of thousands of documents from File Shares to SharePoint to deliver an integrated view of all information • Cleanse content existing in multiple repositories • Real time identification and classification from all sources of ingestion – fax, scanned, email, etc. – as well as from all semi-structured and unstructured content • Improved search to quickly identify value of document/record in ‘plain English’
  33. 33. Building Blocks Metadata, Insight, Policy, Governance Situation: • Budget of $6.9 billion • Over 60,000 users • Runs 75 hospitals and clinics providing care to more than 2.6 million beneficiaries Challenge: • Data Privacy • Intelligent Migration • Before and after • Records Management • 72,000 Site Collections, 5,300 retention codes, classify 200,000 documents per hour with minimum resources Solution: • conceptClassifier for SharePoint platform Benefits: • Automatic tagging based on organizational vocabulary and descriptors • Automatic routing and the ability to change the SharePoint content type • Eliminated manual tagging, removes from unauthorized access and portability • No security exposures or breaches in 4 years “Concept Searching’s Taxonomy Manager provides our Subject Matter Experts with a user friendly web interface enabling the development of controlled vocabularies that can be used to filter search results and autoclassify content to folder structures.” J.D. Whitlock, Lt Col, USAF, MSC, CPHIMS Air Force Medical Service Read the Case Study
  34. 34. Records Management and Data Privacy Demonstration
  35. 35. Migration “At the 2012 Compliance, Governance and Oversight Council (CGOC) Summit, a survey of corporate CIOs and general counsels found that, typically, 1% of corporate information is on litigation hold, 5% is in a records-retention category and 25% has current business value. This means that approximately 69% of the data most organizations keep can – and should – be deleted.” Compliance, Governance and Oversight Council (CGOC)
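
The 69% figure quoted on slide 35 follows from simple subtraction, sketched below in Python. The function name is hypothetical, and the calculation assumes the three categories do not overlap, which the quote implies but does not state.

```python
# Shares quoted from the CGOC survey, as percentages of corporate information.
litigation_hold = 1
records_retention = 5
current_business_value = 25


def deletable_share(*kept_shares):
    """Return the percentage left once the kept categories are subtracted.

    Assumes the kept categories are mutually exclusive.
    """
    return 100 - sum(kept_shares)


print(deletable_share(litigation_hold, records_retention, current_business_value))  # 69
```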
  36. 36. Building Blocks Metadata, Governance, Migration Situation: • Multiple clients Challenge: • Simply moving content to new location did not provide any benefits • Human error and time was too costly Solution: • conceptClassifier for SharePoint platform Benefits: • Cleanses irrelevant and unnecessary documents • Dramatically reduces the time for migration • Eliminates manual intervention • Improves the outcome enabling improvements in: • Search • Records management • Data privacy • eDiscovery and litigation support • Text analytics conceptClassifier for SharePoint identified 66,000 duplicates out of a total of 270,000 documents, representing a 24% reduction in disk space. Global Supplier of Automotive Parts The goal was to improve search for 40,000 business users but needed to migrate literally millions of documents. conceptClassifier for SharePoint was used for the pre and post migration and for enabling concept based searching with their existing search engine and taxonomy based search after the migration.
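
The duplicate figures on slide 36 can be checked, and the general idea of duplicate detection illustrated, with the Python sketch below. The slides do not say how conceptClassifier for SharePoint identifies duplicates; exact-match content hashing is an assumption made here for illustration, and the function names and sample documents are hypothetical.

```python
import hashlib


def find_duplicates(documents):
    """Return names of documents whose bytes match an earlier document.

    `documents` maps a name to its content as bytes. Only byte-identical
    copies are detected; near-duplicates are out of scope for this sketch.
    """
    seen = {}
    duplicates = []
    for name, content in documents.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in seen:
            duplicates.append(name)
        else:
            seen[digest] = name
    return duplicates


def percent(part, total):
    """Express part as a percentage of total."""
    return 100 * part / total


docs = {
    "policy_v1.docx": b"Retention policy text",
    "copy_of_policy.docx": b"Retention policy text",
    "invoice.pdf": b"Invoice 1001",
}
print(find_duplicates(docs))                 # ['copy_of_policy.docx']
print(round(percent(66_000, 270_000), 1))    # 24.4
```

Note that 66,000 of 270,000 is about 24% of the document count; it matches the quoted disk-space reduction only if the duplicates are of roughly average size.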
  37. 37. Migration Demonstration
  38. 38. eDiscovery, FOIA, Litigation Support “Law firms must keep up with the ever-increasing number of compliance regulations for their clients. In addition, the average Fortune 500 companies have 125 lawsuits at any given point. If law firms and compliance departments have control of the information, they will know where to look and be able to preserve the information during discovery. IG can therefore also serve as an organizational tool during litigation.” National Law Review
  39. 39. Building Blocks Metadata, Insight, Governance Situation: • Legal department • FOIA processing Challenge: • Reduce costs associated with litigation support • Overabundance of content – much was not tagged • Increase relevance in finding appropriate information Solution: • conceptClassifier for SharePoint Benefits: • Vocabulary normalization • Enables ‘concept based’ searching and eliminates the construction of complex queries • Removes the ambiguity in content • Enables the identification of new information to be captured and identified early in the discovery process and immediately made available to the discovery team • Eliminates manual tagging unless authorized • Scalable to consume terabytes of content
  40. 40. Final Comments – Q&A • SharePoint and SharePoint 2013 Search provide the foundation to build an Information Governance Platform • conceptClassifier for SharePoint augments and automates the critical components for Information Governance • Content validation in alignment with Policy and Risk • Auto-application of metadata in alignment with Policy and Risk • Action/Migration of content in alignment with Policy and Risk • Native Metadata to enhance SharePoint 2013 Search • Native SharePoint integration into Term Store/managed metadata service • SharePoint 2010, 2013, Office 365 • Lowest cost to deploy, lowest cost to maintain, fast ROI • Metadata Survey – What are industry leaders doing? • http://www.conceptsearching.com/wp/sharepoint-survey/
  41. 41. Next Steps Ready to explore SharePoint Search and Integration into Line of Business Applications to Achieve Information Governance and a Solid ROI? Please contact Don Miller, Vice President of Commercial Accounts at Concept Searching donm@conceptsearching.com
  42. 42. Please join us for our Next Webinar Climbing the Slippery Slope of SharePoint Migrations Date: March 25th Time: 11:30am-12:30pm EST “A survey of corporate CIOs and general counsels found that, typically, 69% of the data most organizations keep can – and should – be deleted.” Compliance, Governance and Oversight Council (CGOC) Summit So what happens to the 69%? Most likely it will get migrated with no rhyme or reason, just because it seems easier, and the organization is still left with mismanaged, useless information. That’s only one migration scenario. Migrations can be fraught with delays, budget overruns, and overall frustration. Register for this practical and informative webinar, sponsored by Portal Solutions and Concept Searching, and learn how you can eliminate migration challenges and reach the pinnacle of success. To Register: https://www3.gotomeeting.com/register/937305526
  43. 43. Thank You Cem Aykan Senior Product Manager of Enterprise Search Microsoft cem.aykan@microsoft.com Twitter @Microsoft Don Miller Vice President of Commercial Accounts Concept Searching donm@conceptsearching.com Twitter @conceptsearch