Just the Facts - Auto-classification and Taxonomies Webinar


The recent AIIM survey, ‘Automating Information Governance – Assuring Compliance’, co-sponsored by Concept Searching, showed that only 10% of organizations have a workable Information Governance policy. Is your company one of them? If not, it’s time to register for this educational webinar.

Information Governance is people, process, and technology. The technology is relatively easy, but it’s the people and the process that are the tough parts. Surprisingly, executive buy-in isn’t always easy to obtain, unless there has been a major data breach, or fines due to non-compliance.

If you own a piece of Information Governance (and who doesn’t?), we highly recommend this webinar. It covers the business challenges, the justification and ROI, and the technology issues that need to be answered, even if you are approaching Information Governance one step at a time. You will also learn, from a best-practices perspective, how to develop a framework for Information Governance that can be implemented incrementally.

What we will discuss:
• What is Information Governance and why do you care?
• The business issues and application challenges
• Justifying Information Governance, ROI and savings, and building the case
• Developing a framework for Information Governance
• Technologies
• Information Governance for Office 365 – the unique issues
• Case studies in SharePoint and Office 365 – challenges and solutions

Although it focuses on the SharePoint and Office 365 environments, this webinar will also benefit any organization, regardless of platform.

Published in: Technology

Just the Facts - Auto-classification and Taxonomies Webinar

  1. 1. Just the Facts Auto-classification and Taxonomies Don Miller Vice President of Sales Concept Searching donm@conceptsearching.com Twitter @conceptsearch © Concept Searching 2014
  2. 2. Expert Speakers Doug Miles – Director of Market Intelligence at AIIM International researches and documents end user drivers and priorities for information management and collaboration, as AIIM’s chief analyst. He has surveyed users and authored reports on ECM, records management, scanning and capture, BPM, mobile, social, cloud, and big data. Don Miller – Vice President of Sales at Concept Searching has over 20 years’ experience in knowledge management. He is a frequent speaker on records management, and information architecture challenges and solutions, and has been a guest speaker at Taxonomy Boot Camp, and numerous SharePoint events about information organization and records management. © Concept Searching 2014
  3. 3. Agenda •Introduction •The Role of Metadata •Auto-classification Primer •Example: Auto-classification in Records Management •Typical Enforcement Challenges •Responsibilities •The Role of Taxonomies •Evaluating Solutions – Food for Thought •Types of Questions to Ask Solution Providers •Calculating ROI •Addendums •Case Studies © Concept Searching 2014
  4. 4. Best Practice – Training – Thought Leadership •Download at: www.aiim.org/research •Plus many other reports and AIIM White Papers •Training at: www.aiim.org/training
  5. 5. Volume of Electronic Records (survey percentages not captured in this transcript) •Keep ALL emails, or have no policy on email archiving •Agree: “Our strategy for managing increasing content volume is to buy more discs” •Of content is of no business value, i.e. is ROT (redundant, obsolete or trivial) •Have had significant compliance or litigation issues in the last 2 years
  6. 6. Information Chaos
  7. 7. Automating Compliance How do we encourage, enforce and automate information governance? (survey percentages not captured in this transcript) •Know they are storing stuff they don’t need •Not confident that they know what is safe to delete •Feel automated classification is the only way to keep up with the volumes
  8. 8. Delete?
  9. 9. Automated Classification – plans How would you best describe your overall plans for automated declaration/classification of records? •We are doing it successfully across a number of content types, 8% •We are doing it successfully across one or two content types, 6% •We’re just getting started, 25% •We are keen to automate as soon as we can, 10% •It’s something we plan to do in the future, 28% •We have no plans, 24% Summary: 14% mature across content types, 25% just getting started, 10% keen to get going; half overall have immediate interest © AIIM 2014
  10. 10. Automated Classification – benefits What would you expect to be/what have been the two biggest benefits from automated classification? (Max TWO) Top three: #1 Searchability, #2 Productivity, #3 Compliance. Chart categories (bar values not captured in this transcript): •Improved searchability •General staff productivity •Defensible compliance •Repository alignment •Storage volume reduction •Staff cooperation •Big data readiness •Migration •Not sure yet © AIIM 2014
  11. 11. •Company founded in 2002 •Product launched in 2003 •Focus on management of structured and unstructured information •Technology Platform •Delivered as a web service •Automatic concept identification, content tagging, auto-classification, and taxonomy management •Only statistical vendor that can extract conceptual metadata •8 years KMWorld ‘100 Companies that Matter in Knowledge Management’ •7 years KMWorld ‘Trend Setting Product’ •Authority to operate enterprise wide US Air Force, enterprise wide NETCON US Army, and Canadian SLSA •Locations: US, UK, and South Africa •Client base: Fortune 500/1000 organizations •Microsoft Gold Certification in Application Development •Microsoft Business-Critical SharePoint elite program partner •Smart Content Framework™ for information governance comprising •conceptClassifier for SharePoint and conceptClassifier for Office 365 •Concept Searching Technology Platform and conceptClassifier Platform •Add on – conceptTaxonomyWorkflow, and conceptClassifier for OneDrive for Business The Global Leader in Managed Metadata Solutions © Concept Searching 2014
  12. 12. Before we start... auto-classification is great but what about metadata? Metadata © Concept Searching 2014
  13. 13. Definition Metadata describes other data. It provides information about a certain item's content. For example, an image may include metadata that describes how large the picture is, the color depth, the image resolution, when the image was created, and other data. A text document's metadata may contain information about how long the document is, who the author is, when the document was written, and a short summary of the document. TechTerms.com Metadata © Concept Searching 2014
  14. 14. Types of Classification Metadata Intrinsic Information that can be extracted directly from an object (file name, size) Administrative/Management Information used to manage the document (author, date created, date to be reviewed) Descriptive Information that describes the object (title, subject, audience) Semantic Ability to extract concepts from within content and generate the metadata (intelligent metadata) © Concept Searching 2014
  15. 15. Metadata Matters The challenges of content overload •80% of enterprise data is unstructured (IDC) •60% of documents are obsolete (eLaw) •50% of documents are duplicates (Equivio) The benefits of automatic semantic metadata generation •Elimination of costs and errors associated with end user tagging •Identification and protection of secure content assets from unauthorized access and portability in accordance with compliance procedures •Automatic identification and tagging of documents of record •Normalization of content across functional and geographic boundaries •Integration with search •Ability to apply policy consistently across diverse repositories and environments © Concept Searching 2014
  16. 16. Garbage In – Garbage Out Poor-quality metadata degrades auto-classification, can ultimately negate your records management outcomes, and increases organizational risk and non-compliance •Manual tagging efforts are labor-intensive, error prone, redundant, and inconsistent •Best solution is automated semantic metadata generation •Defines which words are meaningful in which contexts •Creates facets for drill-down capabilities (user does not necessarily know the ‘right’ words to use to achieve the ‘right’ results when searching) •Automatically tags and classifies content in real time •Workflows to any metadata application (records management, security) •Eliminates end user tagging, yet provides overrides for authorized users © Concept Searching 2014
  17. 17. Why is metadata so hard to get right? A manual metadata approach will fail 95%+ of the time. Issue and organizational impact: •Inconsistent: Less than 50% of content is correctly indexed, meta-tagged or efficiently searchable, rendering it unusable to the organization (IDC) •Subjective: Highly trained information specialists will agree on meta tags between 33%-50% of the time (C. Cleverdon) •Cumbersome, expensive: Average cost of manually tagging one item runs from $4 - $7 per document and does not factor in the accuracy of the meta tags nor the repercussions from mistagged content (Hoovers) •Malicious compliance: End users select first value in list (Perspectives on Metadata, Sarah Courier) •No perceived value for end user: What’s in it for me? End user creates document, does not see value for organization nor risks associated with litigation and non-conformance to policies What have you seen? Metadata will continue to be a problem due to inconsistent human behavior © Concept Searching 2014
  18. 18. OK, we have our metadata, what’s next? Auto-classification © Concept Searching 2014
  19. 19. •A feature found in some content management systems or records management applications that will scan the contents of a document and automatically assign metadata, categories, and keywords based on the document contents •Content based assignment of one or more pre-defined categories to documents (records), usually machine learning, statistical pattern recognition, or neural network approaches that are used to construct classifiers automatically What is Auto-classification? © Concept Searching 2014
  20. 20. The History of Auto-classification •Content Based: Weight given to a particular subject in a document determines the class to which the document is assigned •Request Based: Sometimes referred to as indexing; classification in which the anticipated requests from users influence how documents are classified •Policy Based: Classification that is aimed at a particular audience or user group © Concept Searching 2014
  21. 21. Automatic Document Classification •Supervised: Some external mechanism, such as human feedback, provides information on the correct classification •Unsupervised: Also known as document clustering, where the classification has no reference to external information •Semi-supervised: Where parts of the documents are labeled by an external mechanism and some by human intervention © Concept Searching 2014
  22. 22. Taxonomies and thesauri are the foundation of an auto-classifier. They provide the vocabulary against which rules are built and ‘teach’ the machine how to ‘understand’ and categorize content Accuracy rates with auto-classification systems are approximately 60% - 85% Statistical •Often use Bayes’ theorem: measures ‘degrees of belief’ (or ‘degrees of aboutness’) •Use frequency and location to determine important or useful concepts •Feed the system example text for the specific category •Statistically identifies and extracts significant keywords and patterns •Document training sets •Match word/concept patterns to categories •Often need sets of 50+ documents, or more! •Poor document choice can cause pollution/noise •Drawbacks •Effort required to create the training set •Relies on the availability of keyword-rich text •Hard to determine problems Auto-classification Systems – Statistical © Concept Searching 2014
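As a rough illustration of the statistical approach on this slide, the sketch below trains a tiny multinomial naive Bayes classifier (one common use of Bayes’ theorem) from example text per category. The category names, the training sentences, and the `NaiveBayes` class are all hypothetical; it is not a description of any vendor's product, and a real training set would be far larger (the slide suggests 50+ documents).

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Minimal multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word frequencies
        self.doc_counts = Counter()              # category -> number of training docs
        self.vocab = set()                       # every word seen in training

    def train(self, category, text):
        words = text.lower().split()
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, float("-inf")
        for category in self.doc_counts:
            # log P(category) + sum of log P(word | category)
            score = math.log(self.doc_counts[category] / total_docs)
            total_words = sum(self.word_counts[category].values())
            for word in words:
                count = self.word_counts[category][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Hypothetical training set: two categories, two example documents each.
classifier = NaiveBayes()
classifier.train("contracts", "supplier agreement signed renewal terms")
classifier.train("contracts", "contract terms liability agreement clause")
classifier.train("invoices", "invoice amount due payment total")
classifier.train("invoices", "payment received invoice number total")

label = classifier.classify("please review the agreement terms")
```

With so little training text the result is driven entirely by the words "agreement" and "terms", which illustrates the slide's warning that poor document choice can cause pollution/noise.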
  23. 23. Taxonomies and thesauri are the foundation of an auto-classifier. They provide the vocabulary against which rules are built and ‘teach’ the machine how to ‘understand’ and categorize content Rules-based •Rely on Boolean (and, or, not) categorization rules to find either a positive or negative evidence of a match to a category •LexisNexis: and, not, not w/n, not w/para, or, pre/n, pre/, w/, not w/seg, not w/sent, w/n, w/p, w/seg, w/s, atleast, allcaps, caps, nocaps, plural, singular •More control over behavior •More work! •Success depends on quality of rules •Example: (Google OR Salesforce) NOT LinkedIn •Drawbacks •Dependent on the richness of the taxonomy and collection of synonyms/keywords •Creating and/or tweaking the rules for each category – can be onerous Most popular taxonomy management suites include auto-classification modules •With few exceptions, taxonomy tools are generally rules-based systems Auto-classification Systems – Rules Based © Concept Searching 2014
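The slide's example rule, (Google OR Salesforce) NOT LinkedIn, can be sketched as a small Boolean matcher. This is a minimal illustration only: the `matches_rule` helper, the rule dictionary layout, and the sample sentences are hypothetical, and it does not implement proximity operators such as w/n or pre/n.

```python
import re


def tokens(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches_rule(text, any_of, none_of):
    """True if the text contains at least one `any_of` term and no `none_of` term."""
    words = tokens(text)
    has_positive = any(term.lower() in words for term in any_of)
    has_negative = any(term.lower() in words for term in none_of)
    return has_positive and not has_negative


# The slide's example rule: (Google OR Salesforce) NOT LinkedIn
rule = {"any_of": ["Google", "Salesforce"], "none_of": ["LinkedIn"]}

doc_a = "Salesforce announced a new CRM integration."
doc_b = "Google and LinkedIn both published hiring reports."

result_a = matches_rule(doc_a, **rule)  # positive evidence only
result_b = matches_rule(doc_b, **rule)  # negative evidence blocks the match
```

The rule is easy to inspect and tweak, which is the "more control" benefit the slide mentions; the cost is that every category needs its own hand-maintained term lists.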
  24. 24. Linguistic •No commitment to a taxonomic tree, based on parts of speech and their relationships, typically not scalable •Related to parts of speech, syntactic parses, or semantic interpretations Machine Learning •Subfield of computer science (CS) and artificial intelligence (AI) that deals with the construction and study of systems that can learn from data, rather than follow only explicitly programmed instructions Semantic Networks •Refers to a set of relationships between concepts and words, including parts of speech and real-world relationships •These can include rules of various types – not just Boolean Auto-classification Systems – Other © Concept Searching 2014
  25. 25. Pros and Cons of Most Widely Used Classification Techniques Statistical: •Work involved in building good training sets •If there’s a problem, can be difficult to diagnose and rectify/retrain •Machine learning can augment accuracy or lead to pollution (accuracy can wax and wane) Rules-based: •Work involved in building exhaustive rules (mitigated by taxonomy tools) •If there’s a problem, go back to the rule set and tweak •System doesn’t evolve without new rules, but high degree of control (accuracy mostly increases) Overall: •Most widely used are statistical and rules-based •Several are a combination of both statistical and rules-based © Concept Searching 2014
  26. 26. Auto-classification Systems – What do they do? Document Preparation •Split into language blocks (paragraphs, headings), formatting, layout Parsing •Entity extraction •NLP: parts of speech, phrases •Terms, variants Weighting •Frequency •Location in text, phrase •Proximity •Combination •Format of text Classification •If threshold reached •Can influence search results This is where rules vs statistics come into play… Not all classification solutions are created equal! © Concept Searching 2014
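The weighting and threshold steps on this slide can be sketched as follows, assuming a document that has already been split into blocks. The location weights, the category term sets, and the threshold value are invented for illustration; real products combine more signals (proximity, phrases, text format) than this sketch does.

```python
import re

# Hypothetical weights: a term in a heading counts more than one in body text.
LOCATION_WEIGHTS = {"heading": 3.0, "body": 1.0}


def score_document(blocks, category_terms):
    """Sum location-weighted term frequencies over (location, text) blocks."""
    score = 0.0
    for location, text in blocks:
        words = re.findall(r"[a-z0-9]+", text.lower())
        hits = sum(1 for word in words if word in category_terms)
        score += LOCATION_WEIGHTS[location] * hits
    return score


def classify(blocks, categories, threshold):
    """Return every category whose score reaches the threshold."""
    return sorted(
        name
        for name, terms in categories.items()
        if score_document(blocks, terms) >= threshold
    )


# Hypothetical document, already split into blocks (document preparation).
document = [
    ("heading", "Retention schedule for employee records"),
    ("body", "Each record is retained for seven years before disposal."),
]
categories = {
    "records_management": {"retention", "records", "disposal", "retained"},
    "finance": {"invoice", "payment", "budget"},
}

assigned = classify(document, categories, threshold=5.0)
```

Note that "record" in the body does not match the term "records": without the term variants the slide lists under parsing, exact matching misses it.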
  27. 27. There has been an influx of information governance tools in recent years – know the difference between the problem you are trying to solve and the solution you choose •eDiscovery indexing engines that aim to index all of an organization’s content across however many repositories/applications it uses •Records management tools that apply classification and retention rules to content •Email archive tools that are more ambitious than the previous generation of email archives and offer features supporting the auto-classification of email •Plugins for SharePoint to bridge the gaps, often used for enterprise search and content management •Clean-up tools for shared drives •Migration tools What Types of Tools are Available? © Concept Searching 2014
  28. 28. “More than 100,000 international laws and regulations are potentially relevant to Forbes Global 1000 companies – ranging from financial disclosure requirements to standards for data retention and privacy. Additionally, many of these regulations are evolving and often vary or even contradict one another across borders and jurisdictions.” Lorrie Luellig is of counsel, Ryley Carlock & Applewhite, PC Example – Compliance and Records Management © Concept Searching 2014
  29. 29. Has different requirements from general document classification systems or approaches Classification •Records management repository is typically the only definitive information management taxonomy managed by an organization Declaration •Conscious decision to determine what is a business record and what is not •Goal is to remove decision making from the end user •Requires classification and workflow Retention Management •Information governance •Preserve, retain, move, dispose •In-place retention management •Retention and disposition managed across multiple retention management repositories, applications, collaboration environments, file systems •No need to relocate the content Security and Auditability •Safe-haven for records •A guarantee of integrity, security, authenticity, and availability •Must provide a full audit trail that can withstand legal scrutiny Difference between auto-classification and auto-declaration Auto-classification in Records Management © Concept Searching 2014
  30. 30. •Cloud is not often used as the records management repository, nor for applications that can impact compliance, due to where and how information is stored •Storage issue – what country? •Privacy/security issue •Integration issue •Compliance issue •Infrastructure, licensing decision, not necessarily functional BUT…. •Can be used for auto-classification for any metadata application such as search, text analytics, social tagging, migration, identification of data exposures •Entirely dependent on classification/taxonomy software •Evaluate vendors who have an integrated solution •Ask for production clients using the integrated solution Implications of the Cloud © Concept Searching 2014
  31. 31. eDiscovery •Legal fees, fines and damages could be reduced by 25% if companies applied best practices to records management, security and eDiscovery (AIIM) Search •Estimates indicate that end users spend 2.5 hours per day to find information necessary to do their jobs (IDC) •85% of relevant documents are never retrieved in search (IDC) Security and Privacy •70% of data breaches are due to a mistake or malicious intent by end users (Ponemon Institute) •88% are attributed to negligence (Wharton Information Security Best Practices Conference) Implications Across the Organization Even if content is not declared a record, it still must be governed © Concept Searching 2014
  32. 32. “74% of organizations continue to depend on individuals to manually comply with legal, regulatory, and record management requirements. Given the projected growth and the inability of employees to manually manage information, organizations need to start automating the tasks associated with classifying, managing, and disposing of information assets.” Council for Information Auto-Classification (CIAC) The Problem of Course is the End User “It is simply not realistic to expect broad sets of employees to navigate extensive classification options while referring to a records schedule that may weigh in at more than 100 pages.” Forrester Research/ARMA International Survey © Concept Searching 2014
  33. 33. What Are the Typical Challenges? Electronic information is growing at a rate of 30% to 60% per year – electronic records typically constitute 90% of an organization’s records •Users •Training (rarely done; if done, a minimalist approach) •Policy enforcement •Reality – biggest stumbling block in classification is the end user •Impact •Classification can be subjective, erroneous, or non-existent if content is not tagged correctly •Impacts productivity •Increases organizational risk •End user classification – 20% to 80% accurate (typically more on the low end) •Automated classification – 80% to 90% accurate, if tuned and managed Typical Enforcement Challenges Business User © Concept Searching 2014
  34. 34. Using auto-classification with the goal of streamlining the records management application, and removing the end user where possible from the process, will require •Updating policies •Creating new processes and workflows •Creating content based retention schedules •Identifying exemplar documents •Creating test programs •Audit and review processes •Ensure overall transparency Typical Challenges for IT and Management Transitioning to metadata, auto-classification, and taxonomy technologies will require one or more of the items listed, depending on the application © Concept Searching 2014
  35. 35. We still have one more missing piece! Taxonomies © Concept Searching 2014
  36. 36. Types of Taxonomies List, Picklist, Controlled Vocabulary, Authority Files List of lead or preferred terms, selected by the end user, may or may not have relationships among the terms, can include a synonym ring Synonym Lists The use of synonyms allows one concept to be instantiated as the same as the other, but still allows a term to be preferred over another Hierarchical Each content item resides in only one category, referred to as a ‘tree’ •Piano •Musical instrument Taxonomy is sometimes called the ‘world’s oldest profession’ (Aristotle) Swedish botanist Carolus Linnaeus is regarded as the father of taxonomy © Concept Searching 2014
  37. 37. Types of Taxonomies Polyhierarchical, Faceted, Thesauri Content items can exist in more than one category, more structured controlled vocabulary, provides information about each term and its relationship to other terms, features of a hierarchical taxonomy plus associative relationships •Piano •Musical instrument •Stringed instrument •Percussion instrument Ontology Multiple taxonomies with additional relationships added to specify concepts within a domain Sources: Marlene Rockmore – The Taxonomy Blog, and Heather Hedden, author of ‘The Accidental Taxonomist’ © Concept Searching 2014
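The difference between a strict hierarchy and a polyhierarchy can be shown with a small data structure. The sketch below reads the slide's piano example as "Piano sits under both Stringed instrument and Percussion instrument, which both sit under Musical instrument"; that reading, the `BROADER` mapping, and the helper names are assumptions made for illustration.

```python
# Hypothetical polyhierarchical taxonomy: each term maps to its broader terms.
BROADER = {
    "Piano": ["Stringed instrument", "Percussion instrument"],
    "Stringed instrument": ["Musical instrument"],
    "Percussion instrument": ["Musical instrument"],
    "Musical instrument": [],
}


def ancestors(term, broader=BROADER):
    """Return every broader term reachable from `term`, without duplicates."""
    found = set()
    stack = list(broader.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(broader.get(parent, []))
    return found


def is_strict_hierarchy(broader=BROADER):
    """A strict hierarchy (a 'tree') allows at most one broader term per entry."""
    return all(len(parents) <= 1 for parents in broader.values())


piano_ancestors = ancestors("Piano")
tree_like = is_strict_hierarchy()
```

Because "Piano" has two broader terms, the structure is not a tree, so content tagged "Piano" can be found by drilling down from either parent.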
  38. 38. Evaluating Solutions © Concept Searching 2014
  39. 39. Solution Components *Basic* •Semantic metadata generation, automated classification, taxonomy management •Stay away from lengthy implementations, required use of vendor consultants, stand-alone applications, use of new languages •Look for a flexible solution that addresses more than ‘just’ records management and is extendable to other applications, such as search, privacy, migration, and text analytics •Accomplished through the development of an enterprise metadata repository •Evaluate solution and richness of function as well as ease of use •Who will maintain? IT or Business? •How easy for Subject Matter Experts to contribute? •Identify how the taxonomy tool integrates with other applications and how this reduces time and effort, yet delivers high quality Records Management *As an example, each application will have its own benefits* •Automatic identification and declaration of documents of record as content is created or ingested •In-place retention management option •Elimination of end user tagging •Cloud integration •Workflow capabilities including actions on documents/records Evaluating Solutions © Concept Searching 2014
  40. 40. A Few Questions to Ask to Get You Started •How often should a drive or repository be indexed for new content? •Does the system need to perform in real time? •Should old content be re-classified to determine if it should be classified according to a different category? •How are classification errors solved? •Should the user have the ability to override the classification assignment? •Who should manage the system – IT or Business? Or both? •How long should deployment and ongoing management take? •Can end user involvement be eliminated? •How does the system handle vocabulary and/or language ambiguities? © Concept Searching 2014
  41. 41. “The metadata infrastructure provides the critical glue that binds the information infrastructure to the underlying IT infrastructure. Sound information governance practices would take advantage of the metadata infrastructure to ensure that content and data are managed consistently and adhere to written policies, across on-premise and cloud based environments.” 2010 IDC Digital Universe Study The Advantages •Ability to develop a single repository of organizationally relevant metadata to be made available to any application that requires the use of metadata •Elimination of costs and errors associated with end user tagging •Normalization of content across functional and geographic boundaries to remove ambiguity in vocabulary •Metadata managed and changed in one place •Ability to apply policy consistently across diverse repositories and applications •Provides flexibility to rapidly make changes to the repository for regulatory compliance where changes are immediately available for use by applications Recommendation – Enterprise Metadata Repository © Concept Searching 2014
  42. 42. Calculating ROI © Concept Searching 2014
  43. 43. Real World Business Drivers •Three areas •Business Impact •IT Impact •Process Impact •Most enterprises will see highest ROI from Process Impact © Concept Searching 2014
  44. 44. The primary research collected in this reference white paper illustrates •There are several benefits spanning the categories of IT, process, and business impact, all of which have moderate to high levels of business impact •The leading ROI value drivers are related to processes and effective decision making •The value of process and IT drivers is manifested via their business impact Pique Solutions – Connected Value: The ROI Benefits of Business Critical SharePoint ROI Objectives © Concept Searching 2014
  45. 45. Real World Savings Pique Solutions The Business Solutions •Search •Records Management •Migration •Data Security •eDiscovery/Litigation Support, FOIA •Information Governance •Text Analytics •Business Social Networking •Collaboration •Content Management •Metadata Management © Concept Searching 2014
  46. 46. Freebie If you would like to calculate the ROI, including soft and hard benefits for a specific application solution, please contact us on marketing@conceptsearching.com or 1 703 531 8564 © Concept Searching 2014
  47. 47. Questions? © Concept Searching 2014
  48. 48. Thank You Don Miller Vice President of Sales Concept Searching donm@conceptsearching.com Twitter @conceptsearch © Concept Searching 2014
  50. 50. Metadata driven applications - conceptClassifier for SharePoint platform has been deployed by clients in diverse industries to automatically generate metadata and use that metadata to apply and enforce information governance policies Smart Content Framework™ The whole is greater than the sum of its parts © Concept Searching 2014
  51. 51. Building an Information Governance Concept Index - Example •Concept Searching’s unique statistical concept identification underpins all technologies •Multi-word suggestion is explicitly more valuable than single term suggestion algorithms Concept Searching has a unique approach to ensure success •conceptClassifier for SharePoint will generate conceptual metadata by extracting multi-word terms that identify ‘triple heart bypass’ as a concept as opposed to single keywords •Metadata can be used by any search engine index or any application/process that uses metadata. Concept Searching provides Automatic Concept Term Extraction. Diagram labels, single keywords and their associations: Triple (Baseball, Three); Heart (Organ, Center); Bypass (Highway, Avoid) © Concept Searching 2014
  52. 52. Situation: •Nonprofit organization that contributes to the prevention and cure of cancer •More than 30,000 users •Outpatient treatment programs that record more than 328,300 visits a year Challenge: •Portal to enable patients to access information relevant to their specific health situations •Accurate, medically sound, and secure information necessary •Aggregate content from internal and external sources Solution: •conceptClassifier for SharePoint platform •SharePoint 2010 •Microsoft FAST Search •Integrated solution with partner Aeturnum Benefits: •Accuracy of search •Relevance of results •Confidence in data •Control and trust “With more than 30,000 current users, the MyMoffitt Patient Portal has seen significant growth, and of the new patients that come to Moffitt, 87% register for a patient portal account. All developments and enhancements are about improving the patient experience.” Jennifer Camps, Director of Portal Technologies and Data Management, Moffitt Cancer Center Read the Case Study Case Study – Intelligent Search © Concept Searching 2014
  53. 53. Situation: •UK County Council •Serves approximately 1 million citizens Challenge: •Management of digital records •Reduction in paper records •Reduce physical and digital storage Solution: •conceptClassifier for SharePoint platform •SharePoint Search •SharePoint Records Management •Managed Metadata Services Benefits: •Migration of thousands of documents from File Shares to SharePoint to deliver an integrated view of all information •Cleanse content existing in multiple repositories •Real time identification and classification from all sources of ingestion (fax, scanned, and email) as well as from all semi-structured and unstructured content •Improved search to quickly identify value of document/record in ‘plain English’ Case Study - Records Management “One way to manage records whatever their medium” © Concept Searching 2014
  54. 54. Situation: •Budget of $6.9 billion •Over 60,000 users •Runs 75 hospitals and clinics providing care to more than 2.6 million beneficiaries Challenge: •Data Privacy •Intelligent Migration •Before and after •Records Management •Pilot project: 72,000 Site Collections, 5,300 retention codes, classify 200,000 documents per hour with minimum resources Solution: •conceptClassifier for SharePoint platform Benefits: •Automatic tagging based on organizational vocabulary and descriptors •Automatic routing and the ability to change the SharePoint content type •Eliminated manual tagging, removes from unauthorized access and portability •No security exposures or breaches in 4 years “Concept Searching’s Taxonomy Manager provides our Subject Matter Experts with a user friendly web interface enabling the development of controlled vocabularies that can be used to filter search results and auto-classify content to folder structures.” J.D. Whitlock, Lt Col, USAF, MSC, CPHIMS Air Force Medical Service Read the Case Study Case Study - Search, Data Privacy, Records Management, Migration © Concept Searching 2014
  55. 55. Situation: •Legal Department •FOIA Processing Challenge: •Reduce costs associated with litigation support •Overabundance of content - much was not tagged •Increase relevance in finding appropriate information Solution: •conceptClassifier for SharePoint Benefits: •Vocabulary normalization •Enables ‘concept based’ searching and eliminates the construction of complex queries •Removes the ambiguity in content •Enables the identification of new information to be captured and identified early in the discovery process and immediately made available to the discovery team •Eliminates manual tagging unless authorized Case Study - eDiscovery, Litigation Support, FOIA © Concept Searching 2014
  56. 56. Situation: •Multiple Clients Challenge: •Simply moving content to new location did not provide any benefits •Human error and time was too costly •Quantity of content too great Solution: •conceptClassifier for SharePoint platform Benefits: •Cleanses irrelevant and unnecessary documents •Dramatically reduces the time for migration •Eliminates manual intervention •Improves the outcome enabling improvements in: •Search •Records management •Data privacy •eDiscovery and litigation support •Text analytics Case Study – Intelligent Migration Automotive Parts Company: The goal was to improve search for 147,000 business users, but the company needed to migrate literally millions of documents. conceptClassifier for SharePoint was used for the pre- and post-migration work, and for enabling concept based searching with their existing search engine and taxonomy based search after the migration. conceptClassifier for SharePoint identified 66,000 duplicates out of a total of 270,000 documents, representing a 24% reduction in disk space. © Concept Searching 2014
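To make the duplicate figures on this slide concrete, the sketch below finds exact duplicates by hashing file content and then checks the quoted ratio. Content hashing is a swapped-in, generic technique chosen for illustration; the slide does not say how conceptClassifier for SharePoint identifies duplicates, and the file names and contents here are hypothetical.

```python
import hashlib


def find_duplicates(documents):
    """Return names of documents whose content exactly matches an earlier one."""
    seen = {}          # content hash -> first document name with that content
    duplicates = []    # names of documents whose content was already seen
    for name, content in documents.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in seen:
            duplicates.append(name)
        else:
            seen[digest] = name
    return duplicates


# Hypothetical file share contents (name -> bytes).
file_share = {
    "policy_v1.docx": b"Travel policy 2014",
    "policy_copy.docx": b"Travel policy 2014",
    "budget.xlsx": b"Q3 budget figures",
}
duplicates = find_duplicates(file_share)

# The figures quoted on the slide: 66,000 duplicates in 270,000 documents.
duplicate_share = 66_000 / 270_000
```

The ratio works out to roughly 24% of the document count, which matches the slide's figure if the duplicated documents are of about average size.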
  57. 57. Situation: •Budget of $6.9 billion •Over 60,000 users •Runs 75 hospitals and clinics providing care to more than 2.6 million beneficiaries Challenge: •Data Privacy •Intelligent Migration •Before and after •Records Management •72,000 Site Collections, 5,300 retention codes, classify 200,000 documents per hour with minimum resources (Proof of Concept) Solution: •conceptClassifier for SharePoint platform Benefits: •Automatic tagging based on organizational vocabulary and descriptors •Automatic routing and the ability to change the SharePoint content type •Eliminated manual tagging, removes from unauthorized access and portability •No security exposures or breaches in 5 years, since deployed The US Air Force deployed the technologies to implement data privacy protection processes and after five years has not had a data breach Read the Case Study Case Study – Automatic Tagging, Policy, and Governance © Concept Searching 2014
  58. 58. Situation: •8,000 registered users •800 virtual workspaces and communities of practice •Share and disseminate mission critical information and technical excellence •Collaboration Challenge: •Poor search results •No collaboration capabilities •Lack of security and control of content assets for each end user •No robust knowledge management capabilities (wikis, blogs, collaboration, poor search, etc.) Solution: •Partner Triune Group provided the knowledge management and collaboration platform •conceptSearch embedded into the solution Benefits: •Accuracy of search and relevant results •Confidence in data •Increase in quality of decision making “With the integration of the Concept Searching intelligent search capability, Triune Group was able to provide us with a robust and scalable collaboration tool that delivers not only powerful advanced searching capabilities, but also a controlled and secure environment.” Brian Follen NASA Safety Center (NSC) KnowledgeNow Program Manager Client received KMWorld ‘Reality Award Winner’ for successful deployment and recognition for a superior Knowledge Management Solution Read the Case Study Case Study – Social Tagging and Collaboration © Concept Searching 2014