Webinar - The Swiss Army Knife for SharePoint 2010 – Tagging, Term Store and Taxonomies

The presentation from our recent Webinar.

Published in: Business, Technology
  • It is important to note that metadata, auto-classification, and taxonomies are not applications – the business value of these tools is often realized when they are integrated with other solutions, such as the offerings of the other participants in this panel

    Let’s look at where these tools can complement other solutions and improve business processes

    CLICK: Migration:
    With vast amounts of content, moving all of it doesn’t make sense, and using valuable resources to identify what should or should not be migrated isn’t a good use of time or money
    Before the migration you can use these technologies to:
    Eliminate duplicate documents
    Identify documents that contain confidential or privacy data
    Identify and declare records
    Identify high value content

    Savings: We had one client who needed to manually tag 45K marketing documents and estimated that it would take 6 months with 2 full-time people – with our tools it took 2 weeks

    CLICK Search:
    The age-old problem is how to get end users to tag content – it’s estimated that less than 50% of content is correctly indexed, meta tagged or efficiently searchable – it isn’t about what search engine you use
    Statistics still claim that end users spend 15% of their time duplicating information and 25% searching, and that 40% can’t find what they need to do their jobs
    Automatic generation of conceptual metadata removes the end user from the tagging process. HUMANS WON’T TAG CONTENT THROUGH FORMS, PICKLISTS, OR DROP DOWNS, AND WE WILL ALWAYS FIND WAYS TO AVOID TAGGING
    Content, once tagged, can be provided to any search engine index to deliver more accurate search results
    Using the taxonomy users can more efficiently find relevant information via the hierarchical structure

    Savings: 2.5 hours per day per user

    CLICK Records Management:
    The problem cited most frequently is inconsistent end user tagging in the declaration of records
    With metadata generation and a taxonomy that mirrors the file plan – documents can be automatically declared records based on the concepts and descriptors within the document
    Based on custom Content Types in SharePoint the document can be declared a record and routed to the RM repository

    Savings: $4 - $7.04 per document record

    CLICK Data Privacy Protection
    Taxonomy(s) can be created to identify any organizationally defined confidential information
    When content is created or ingested, the document can be identified as containing confidential information. Using Content Type updating, the document can then be routed to a secure location and locked down using Windows Rights Management

    Cost Avoidance: Average cost of a data exposure is $225K - $35 million
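
The idea of combining pattern matching with an associated vocabulary can be illustrated with a minimal Python sketch. This is not how conceptClassifier is implemented; the regular expressions, the vocabulary terms, and the function names `privacy_flags` and `is_confidential` are all assumptions made up for illustration. The sketch flags a document only when a pattern match is backed by at least one vocabulary term, which is one plausible way to cut down false positives from patterns alone.

```python
import re

# Hypothetical patterns and vocabulary, chosen for illustration only.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}
VOCABULARY = {"social security", "date of birth", "patient", "diagnosis"}


def privacy_flags(text):
    """Return (pattern names matched, vocabulary terms found), both sorted."""
    lowered = text.lower()
    pattern_hits = sorted(name for name, rx in PATTERNS.items() if rx.search(text))
    term_hits = sorted(term for term in VOCABULARY if term in lowered)
    return pattern_hits, term_hits


def is_confidential(text):
    """Flag a document only when a pattern match is backed by a vocabulary term."""
    pattern_hits, term_hits = privacy_flags(text)
    return bool(pattern_hits) and bool(term_hits)
```

In a real deployment the flag would drive the Content Type update and routing step described above; here it is just a boolean.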

  • Tying this all together and seeing how it works in the real world:

    The USAF Human Performance Clearinghouse (HPC) is an enterprise solution that serves the USAF Human Systems Integration Community.  Led by the Air Force Medical Service (AFMS), the HPC leverages SharePoint and conceptClassifier to deliver real-time collaboration, Information/Content Management, Knowledge Management, Taxonomy Management, automatic metadata tagging, and automated Windows Rights Management to over 75 locations worldwide.

    US Air Force Medical Service
    Initially deployed conceptClassifier to power Knowledge Portal with over 65K users
    Controlled vocabulary consists of over 27K unique keywords, metadata, and multi-word fragments generated by conceptClassifier
    They are now using it to solve a variety of challenges

    CLICK THROUGH ARROWS:
    CLICK: Migration
    CLICK: Data Privacy
    CLICK: Search
    CLICK: Although we didn’t talk about this, the US Air Force HPC also uses it for eDiscovery and Freedom of Information Act

    conceptClassifier provides technologies that are natively integrated with SharePoint and delivers the missing pieces, including the conceptual metadata generation, auto-classification, and taxonomy management tools that can be used to leverage your metadata and improve business outcomes, resulting in a tangible ROI and reduced costs and organizational risk.
  • Traditional search assumes the end user knows what they are looking for, or must enter the ‘right’ combination of words to get the ‘right’ result.

    Knowledge workers need to identify content in the context of what they are seeking. The fundamental problem with search solutions is that they are based on an index of single words. Yet most queries are expressed in short patterns of words and not single words in isolation – which are highly ambiguous. In the example above, a search engine would identify all the documents that contained the words: triple, heart, bypass instead of documents that contained the concept of ‘triple heart bypass’. Since the concept has been identified, other documents that have related concepts will be identified even if they do not contain that exact phrase.

    Metadata generation is a growing concern in enterprises, not only for search but also for records management, compliance, and enterprise content management. A comprehensive approach requires more than syntactic metadata, and requiring end users to add rich metadata is haphazard and subjective at best. Since conceptClassifier for SharePoint is not restricted to keyword identification, compound term metadata can be automatically generated when the content is created or ingested. Concept-based metadata generation extracts the compound terms and keywords from a document or corpus of documents that are highly correlated to a particular concept. By identifying the most significant patterns in any text, these compound terms can then be used to generate non-subjective metadata based on an understanding of conceptual meaning.

    Compound term processing can address many challenges facing large enterprises. Identifying concepts within a large corpus of information removes the ambiguity in search and eliminates inconsistent meta-tagging, while automatic classification and taxonomy management based on concept identification simplify development and ongoing maintenance.
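
The difference between a single-word index and compound term matching can be sketched in a few lines of Python. This is a simplified illustration, not conceptClassifier's statistical engine: it only checks whether the query words occur adjacent and in order, and it does not attempt the related-concept matching described in these notes. The function names and the two sample documents are assumptions invented for the example.

```python
def tokens(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    return [w.strip(".,;:!?()'\"").lower() for w in text.split()]


def single_word_match(query, doc):
    """Keyword-style match: every query word appears somewhere in the document."""
    words = set(tokens(doc))
    return all(w in words for w in tokens(query))


def compound_term_match(query, doc):
    """Phrase-style match: the query words appear adjacent and in order."""
    q, d = tokens(query), tokens(doc)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


# Hypothetical documents: one about surgery, one that merely contains the words.
DOCS = {
    "cardiology": "The patient recovered well after a triple heart bypass.",
    "mixed": "He hit a triple, took the bypass home, and his heart was racing.",
}

if __name__ == "__main__":
    for name, text in DOCS.items():
        print(name,
              single_word_match("triple heart bypass", text),
              compound_term_match("triple heart bypass", text))
```

Both documents satisfy the single-word test, but only the cardiology document contains the compound term, which is the ambiguity the notes describe.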
     

  • conceptClassifier for SharePoint is fully integrated with SharePoint, Microsoft Office, Windows Server 2008 R2 FCI, FAST and Microsoft Enterprise Search.

    The automatic extraction of compound terms enables the Subject Matter Expert (SME) to use the terms within the taxonomy generation process, reducing the time to build out and maintain taxonomies by 80%.

    Features:
    Downloadable in 30 minutes – no programming required
    Automatic classification and compound term metadata extraction
    Classification technology uses concept extraction and compound term processing
    Taxonomy based and faceted navigation
    Robust suite of tools to build and maintain taxonomies
    Fully integrated with Content Types
    Automatic classification from MS Office and Outlook
    Taxonomy browse, faceted navigation, and preview functionality from the search interface
    Can automatically classify from SharePoint, folders, and web sites, providing a single interface to all permissible content
    Simple intuitive interface designed for the SME
    Fully SOA compliant, delivered as Web Parts, based on open standards
    Integrates with Microsoft Office, Microsoft Records Center
  • The Only Microsoft Solution that Runs Natively in ...
    FAST Search, SharePoint 2007 and 2010, Windows Server 2008 R2 FCI, and Microsoft Office
     
    conceptClassifier provides the tools to rapidly build and easily manage unstructured content. With automatic conceptual metadata generation, automated classification, and taxonomy management, organizations can harness the power of content not only to improve findability within the FAST Search product suite, but also to drive additional business processes such as records management and compliance, and to enforce governance.
      
    The Only FAST Search Solution that ...
    Automatically Generates Conceptual Metadata
     
    Utilizing our unique concept identification and extraction capabilities, conceptClassifier’s statistical engine can identify, out of the box, all the meaningful concepts resident within an organization’s own information repositories and automatically generate semantic metadata that is unique to the organization and its nomenclature.
     
    With conceptual multi-word term metadata automatically generated and placed in the FAST Search index, search can be performed with a higher degree of accuracy because the ambiguity inherent in single words is no longer a problem.
     
    Utilizing the Concept Searching technology framework, end users can now search on concepts, delivering a multi-dimensional view of relevant information and easily identifying relationships between content assets that otherwise might not have been found.
     
     
    The Only FAST Search Solution that ...
    Eliminates Manual Metadata Tagging
     
    The Only FAST Search Solution that...
    Delivers Innovative, Intuitive, & Rapidly Deployed Taxonomy Management Managed by Business Users
     
  • Webinar - The Swiss Army Knife for SharePoint 2010 – Tagging, Term Store and Taxonomies

    1. Don Miller, VP of Business Development, 1 (408) 828-3400, donm@conceptsearching.com
       John Challis, CTO Founder, johnc@conceptsearching.com
       The Swiss Army Knife of SharePoint
    2. Agenda – Tagging Taxonomies and Term Store
       Concept Searching • Don Miller • (408) 828-3400 • donm@conceptsearching.com
       First 20 Minutes
       • Company Overview
       • Review metadata, taxonomy, classification, search problem and value
       • Case Study
       • Review tagging approaches
       • Review suggested taxonomy approaches to build out 2010 Term Store
       • Concept Searching turn key approach to metadata management and term store development
       Second 30 Minutes
       • Product demo – End Users
         – How does an end user search in 2010
         – How does an end user tag in 2010
       • How does conceptClassifier and Taxonomy Manager help 2010
         – How do we accelerate process of tagging
         – How do we improve end user search experience with “Guided Navigation”
       • Product demo – Technical
         – Improve Information Architecture Design
         – Integration into 2010 Term Store
         – Improve Taxonomy Design for Content Stewards
         – Integration to FAST 2010
    3. Concept Searching, Inc.
       • Company founded in 2002
       • Product launched in 2003
       • Focus on management of structured and unstructured information
       • Technology
       • Automatic concept identification, content tagging, auto-classification, taxonomy management
       • Only statistical vendor that can extract conceptual metadata
       • 2009 and 2010 ‘100 Companies that Matter in KM’ (KM World Magazine)
       • KMWorld ‘Trend Setting Product’ of 2009 and 2010
       • Locations: US, UK, & South Africa
       • Client base: Fortune 500/1000 organizations
       • Managed Partner under Microsoft global ISV Program - “go to partner” for Microsoft for auto-classification and taxonomy management
       • Microsoft Enterprise Search ISV, FAST Partner
       • Software Product Suite: conceptSearch, conceptTaxonomyManager, conceptClassifier, conceptClassifier for SharePoint, contentTypeUpdater
    4. What Is Keyword vs. Metadata Costing You?
       Data Privacy Protection
         Problem
         • Average cost per exposed record is $197 and ranges from $90 - $305 per record
         • 70% of breaches are due to a mistake or malicious intent by an organization’s own staff
         Solution
         • Identify any type of organizationally defined privacy data
         • Combines pattern matching with associated vocabulary
         • Automatic Content Type updating enabling workflows and rights management
         Benefit
         • Average cost runs from $225K to $35M
       Search
         Problem
         • “It’s not about better search”
         • Less than 50% of content is correctly indexed, meta tagged or efficiently searchable
         • 85% of relevant documents are never retrieved in search
         Solution
         • Eliminate manual tagging & replace with automatic identification of multi-word concepts
         • Provide guided navigation via the taxonomy structure (i.e. concepts)
         • Go beyond dynamic clustering with conceptual clustering based on the taxonomies
         Benefit
         • Taxonomy navigation is 36% - 48% faster
         • Savings 2.5 hours per user per day
       Records Management
         Problem
         • 67% of data loss in Records Management is due to end user error
         • It costs an organization $180 per document to recreate it when it is not tagged correctly and cannot be found
         Solution
         • Eliminate inconsistent end user tagging
         • Automatically declare documents of record based on vocabulary and retention codes
         • Automatically change the Content Type and route to the Records Management repository
         Benefit
         • Savings of $4.00 - $7.04 per record by eliminating manual tagging
         • Ensures compliance and reduces potential litigation exposures
       Pre Migration/Collaboration
         Problem
         • 60% of stored documents are obsolete
         • 50% of documents are duplicates
         • Requires resources to identify content alignment and what should/not be migrated
         Solution
         • Eliminate duplicate documents
         • Identify privacy data exposures
         • Identify and declare records that were not previously identified
         • Notify users of high value content
         • Migrating required content to a structure
         Benefit
         • Reduces migration costs
         • Ensures compliance and protection of content assets
         • Easy end user updates
    5. USAF Human Performance Clearinghouse
       Tel: 703.246.9360 | Fax: 240.465.1182
       GOAL: Leverage Existing USAF, AFDW, and AFMS License Agreements to Enable IM, RM, & Privacy & Security Compliance Requirements
       • DoDD 8320 (Data Sharing in a Net-Centric DoD)
       • DoDD 5015 (Records Management)
       • USAF Privacy Act Program & HIPAA
       • Freedom of Information Act (FOIA)
       Migration, Data Privacy, Records Management, Search, eDiscovery & FOIA
       Distribution Statement A: Approved for public release; distribution is unlimited. 311 ABG/PA No. 09-488, 16 Oct 2009
    6. Accurate metadata requires three components… taxonomy management, classification and search
       Taxonomy Management or IA Design; Classification and Content Alignment; Search and Validation
       • There is no such thing as an auto taxonomy generation tool. Without taxonomy, you cannot align content or guide users to content.
       • Without classification your taxonomy does not connect to search or records management. There is no alignment. 100% of results.
       • Without search, you cannot validate taxonomy and classification. Should support multi words, because single key words are ambiguous.
       An enterprise metadata generation solution requires all three, otherwise the results are inferior and cumbersome or impossible to implement
    7. A Manual Metadata Approach Will Fail 95%+ Of The Time
       Issue: Organizational Impact
       • Inconsistent: Less than 50% of content is correctly indexed, meta-tagged or efficiently searchable rendering it unusable to the organization (IDC)
       • Subjective: Highly trained Information Specialists will agree on meta tags between 33% - 50% of the time. (C. Cleverdon)
       • Cumbersome - Expensive: Average cost of manually tagging one item runs from $4 - $7 per document and does not factor in the accuracy of the meta tags nor the repercussions from mis-tagged content (Hoovers)
       • Malicious Compliance: End users select first value in list (Perspectives on Metadata, Sarah Courier)
       • No perceived value for end user: What’s in it for me? End user creates document, does not see value for organization nor risks associated with litigation and non conformance to policies.
       • What have you seen: Metadata will continue to be a problem due to inconsistent human behavior
       The answer to consistent metadata is an automated approach that can extract the meaning from content, eliminating manual metadata generation yet still providing the ability to manage knowledge assets in alignment with the unique corporate knowledge infrastructure.
    8. conceptClassifier’s TaxonomyManager automated metadata approach drives business value
       • Create enterprise automated metadata framework/model
       • Average return on investment minimum of 38% and runs as high as 600% (IDC)
       • Apply consistent meaningful metadata to enterprise content
       • Incorrect meta tags cost an organization $2,500 per user per year – in addition potential costs for non-compliance (IDC)
       • Guide users to relevant content with taxonomy navigation
       • Savings of $8,965 per year per user based on an $80K salary (Chen & Dumais)
       • 100% “Recall” of content, 35% Faster access to content “Precision”
       • Use automatic conceptual metadata generation to improve Records Management
       • Eliminate inconsistent end user tagging at $4-$7 per record (Hoovers)
       • Improve compliance processes, eliminate potential privacy exposures
       Steps: 1. Model and Validate 2. Automate Tagging 3. Findability 4. Business Processes 5. Records Management and PII 6. Life Cycle Management
    9. Core search expertise drives business value
       • Concept Extraction – the ability to extract ‘concepts in context’
         – Only statistical metadata generation and classification company that can extract concepts from content as it is created or ingested
       [Diagram: the query ‘triple heart bypass’ split into single words, each with ambiguous senses: Triple (Baseball, Three); Heart (Organ, Center); Bypass (Highway, Avoid)]
       • conceptClassifier will generate conceptual metadata by extracting multi-word terms that identify ‘triple heart bypass’ as a concept as opposed to single keywords
         – Search will return results based on the concept even if the exact terms are not contained in the document (i.e. ‘coronary artery surgery’, ‘heart surgery’)
         – Metadata can be used by any search engine index or any application/process that uses metadata
    10. conceptClassifier and Taxonomy Manager Value Propositions
        • No Behavior Modification for tagging or searching
        • Accurate metadata (Metadata is descriptive information about information)
        • Accelerate and validate the building out of Taxonomies for Business Applications (THERE IS NO SUCH THING AS AUTOGENERATING A TAXONOMY!!!). If nothing, start with file share folder structure.
        • Align content with Business Requirements
        • Drive: Records Management, PII, Collaboration, Findability, EIA and Taxonomy Based Applications
    11. Manual taxonomy approaches to build out 2010 Term Store
        Start
        • File Share
        • Corporate WWW Site Directory
        • Industry Standard – ~50% of terms; but which 50% are in alignment with your content?
        • Align with Business Goals
        • Search Logs
        Optimize
        • Subject Matter Expert Interviews
        • Card Sorts/Tree Builds
        • Clustering
        Validate
        • Classify against Taxonomy
        • Guided Navigation
        However, the real problem will still be that until you force end users to manually tag, your taxonomy is of little value!
    12. Taxonomy Manager aligns the content and provides “Guided Navigation”
        • Enterprise Architects / Content Editors can ensure alignment with taxonomy
        • After 100% of Results are returned, leverage metadata for guided navigation and refiners
        • Accelerate document finding [PRECISION] by a minimum of 35%
        I want all proposals in two specific regions. I could then have a guided refiner for vertical, amount, etc.
    13. Dynamic clustering is not Guided Navigation for “Proposals”
        • Most clustering is key word based
        • Brings back clusters, they are best guesses
        • Clustering is not a starting point taxonomy, good for clues about a term or concept.
        • They might help, they might make it worse
        • Better than nothing, but not a long term strategy or evolution of key word search
        Dynamic navigation (CLUSTERING) is helpful, but how does an information worker know when it is a good topic or not? This is NOT PRECISION!
    14. Clustering can be used to drive relevant clues about a term
    15. Search allows us to validate our terms and the clues for our terms
    16. conceptClassifier and taxonomyManager
        We Make Metadata Work For You
        • Automatic Conceptual Metadata Generation
        • Automated Classification
        • Taxonomy Development & Management
          – Proven to reduce taxonomy development by 80%
        • Microsoft Integration
          – Runs natively in SharePoint 2007 and SharePoint 2010, Microsoft Office Applications, SharePoint Search and FAST, Windows Server 2008 R2 FCI
          – Fully integrated with SharePoint Content Types
        • Content Type Updater
          – Automatically changes the Content Type based on presence of organizationally defined metadata found within the document
          – Identification of confidential/privacy data
          – Ability to identify records based on the records retention schedule and route to the records center
        • Technology
          – Downloadable in 30 minutes – no programming required
          – Fully SOA compliant, delivered as Web Parts, based on open standards
          – Highly scalable
    17. conceptClassifier for FAST Search
        • Improves search outcomes by placing conceptual metadata in the FAST Search index to increase relevancy of search results
        • Enables import of FAST Entities into the conceptClassifier taxonomy manager to fine-tune them with metadata generated from your own content and nomenclature
        • Runs natively as a FAST Pipeline Stage eliminating integration and customization issues
        • Eliminates vocabulary normalization issues across global boundaries through controlled vocabularies
        • Improves faceted search results as facets are based on concepts aligned with the taxonomy
        • Provides taxonomy browse capabilities based on the nodes within the corporate taxonomy(s)
        • Provides accurate metadata filters such as numeric range searching and wildcard alphanumeric matching
        • Removes documents from search results that are confidential/sensitive through automatic Content Type updating and routing to secure server
        • Automatically tags content with both vocabulary and retention codes and respects SharePoint security that could prevent access to the document once it has been declared a record
    18. Demo
    19. Please feel free to call me about your term store questions and best of breed approaches. Don Miller, VP of Business Development, 1 (408) 828-3400, donm@conceptsearching.com
