© Concept Searching 2017
Going Meta –
How to Really Use Metadata in SharePoint
www.conceptsearching.com
Twitter @conceptsearch
Robert Piddocke
Vice President of Business and Channel Development
Concept Searching
robertp@conceptsearching.com
Robert Piddocke, Vice President of Business and Channel
Development, is passionate about information management and
governance. He has worked for several information management
companies and assisted with a number of migration projects. In
addition, he is an information retrieval geek who has authored two
books on SharePoint Search:
• Pro SharePoint 2010 Search, Apress
• Working with FAST Search Server 2010, Microsoft Press
Agenda
• What is metadata and why should we use it?
• Types of metadata
• Metadata in SharePoint
• What about records?
• Metadata and auto-classification
• Case study
• Takeaways
The Global Leader in Managed Metadata Solutions
• Company founded in 2002
• Product launched in 2003
• Focus on management of structured and unstructured information
• Profitable, debt free
• Technology Platform
• Delivered as a web service
• Automatic concept identification, content tagging, auto-classification,
taxonomy management
• Only statistical vendor that can extract conceptual metadata
• 8 years KMWorld ‘100 Companies that Matter in Knowledge Management’
• 8 years KMWorld ‘Trend Setting Product’
• Authority to Operate enterprise wide US Air Force, NETCON US Army,
and Canadian SLSA
• Client base: Fortune 500/1000 organizations in Healthcare,
Financial Services, Manufacturing, Energy, Professional Services,
Pharmaceutical, Public sector and DoD
• Microsoft Gold Certification in Application Development
• Member of SharePoint PAC and TAP programs
• Deployed as a full trust Add-in for all versions of SharePoint on-premises
and SharePoint Online, including the latest vNext dedicated platform and the
government cloud
What Do We Do?
Concept Searching’s technology platforms deliver
semantic metadata generation, auto-classification and
taxonomy/Term Store management, and are fully
integrated with all versions of SharePoint on-premises,
Microsoft Online/Office 365, and OneDrive for Business.
These infrastructure platforms integrate not only with
SharePoint but also other content repositories, search
engines and file shares, enabling our clients to add
structure and manage their enterprise content,
regardless of environment.
The resulting classification metadata is used by clients
to deliver ‘intelligent metadata solutions’ in areas such
as enhanced search, migration, data privacy, records
management, policy enforcement, compliance, text
analytics, and business and social collaboration.
Unique Approach – Compound Term Processing
• Remains unique in the industry
• Ability to identify and correctly weight
multi-word concepts in unstructured text
Concept Searching provides Automatic Concept Term Extraction
• Triple: Baseball, Three
• Heart: Organ, Center
• Bypass: Highway, Avoid
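The ambiguity of single words shown on this slide can be illustrated with a small Python sketch. It only contrasts matching on individual keywords with matching on the adjacent phrase; it is not Concept Searching's compound term processing, which is statistical and is not described in this deck. The toy corpus and the function names are assumptions made up for the example.

```python
import re

# Toy corpus, invented for illustration only.
DOCS = {
    "cardiology": "The patient recovered well after a triple heart bypass.",
    "sports": "He hit a triple, showed real heart, and took the bypass road home.",
    "roads": "The new highway bypass avoids the city center.",
}

def tokens(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def keyword_match(query, docs):
    """Return ids of documents that contain every query word, in any order."""
    wanted = set(tokens(query))
    return sorted(d for d, text in docs.items() if wanted <= set(tokens(text)))

def phrase_match(query, docs):
    """Return ids of documents that contain the query words as one adjacent phrase."""
    q = tokens(query)
    hits = []
    for doc_id, text in docs.items():
        t = tokens(text)
        if any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1)):
            hits.append(doc_id)
    return sorted(hits)

print(keyword_match("triple heart bypass", DOCS))  # ['cardiology', 'sports']
print(phrase_match("triple heart bypass", DOCS))   # ['cardiology']
```

The keyword index returns the sports document as well, because it happens to contain all three words; treating the words as one multi-word unit keeps only the medical document.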
Take Control
• Content is a mess
• It’s costing you time and money
• You may not even realize it
What is Metadata?
• Data about data
• Metadata explains information
• Usually tags but could be anything
• Columns in SharePoint can hold metadata
• Helps you understand information
• Helps you find information
• Helps you control information
• Helps you dispose of information
“That book is so meta!”
Why Should I Use It?
• Organize data logically
• Improve “look and see” user experience
• Create different ways of viewing and slicing the same information
• Metadata navigation
• Fix search
• Enable governance
• Ensure compliance
• Business processes
• Save time and money
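To make "different ways of viewing and slicing the same information" concrete, here is a minimal Python sketch. It models documents as plain dictionaries rather than calling any SharePoint API; the file names, column names, and helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical documents; the column names are invented for illustration.
DOCUMENTS = [
    {"name": "acme-msa.docx", "doc_type": "Contract", "department": "Legal", "year": 2016},
    {"name": "acme-invoice-01.pdf", "doc_type": "Invoice", "department": "Finance", "year": 2017},
    {"name": "globex-nda.docx", "doc_type": "Contract", "department": "Legal", "year": 2017},
    {"name": "plant-plan.mpp", "doc_type": "Project plan", "department": "Engineering", "year": 2017},
]

def group_by(documents, column):
    """Group document names by the value of one metadata column (one 'view')."""
    groups = defaultdict(list)
    for doc in documents:
        groups[doc[column]].append(doc["name"])
    return dict(groups)

def refine(documents, **filters):
    """Keep only documents whose metadata matches every column=value pair."""
    return [
        doc["name"]
        for doc in documents
        if all(doc.get(column) == value for column, value in filters.items())
    ]

# The same four documents, sliced three different ways.
print(group_by(DOCUMENTS, "doc_type"))
print(group_by(DOCUMENTS, "year"))
print(refine(DOCUMENTS, department="Legal", year=2017))  # ['globex-nda.docx']
```

A folder tree can express only one of these groupings at a time; with metadata, each grouping is just another query over the same items.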
Metadata Types
• Structural
• Location
• Size
• Filetype
• Administrative
• Creation date, last modified
• Author
• Access rights
• Classifications
• Descriptive
• Concepts describing the content
• Functional
• Purpose
• Department
Source: National Information Standards Organization (NISO)
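As a rough illustration of the grouping on this slide, the sketch below sorts a flat set of document properties into the four metadata types. The category contents follow the slide; the snake_case property names and the `categorize` helper are assumptions for the example, not part of any standard or product.

```python
# Category -> property names, following the grouping on the slide.
# The snake_case property names are hypothetical.
METADATA_TYPES = {
    "structural": {"location", "size", "filetype"},
    "administrative": {"created", "modified", "author", "access_rights", "classification"},
    "descriptive": {"concepts"},
    "functional": {"purpose", "department"},
}

def categorize(properties):
    """Split a flat property dict into the four metadata types."""
    result = {category: {} for category in METADATA_TYPES}
    result["uncategorized"] = {}
    for name, value in properties.items():
        for category, names in METADATA_TYPES.items():
            if name in names:
                result[category][name] = value
                break
        else:
            # No category claimed this property.
            result["uncategorized"][name] = value
    return result

example = categorize({
    "size": 1024,
    "author": "rp",
    "concepts": ["metadata"],
    "department": "Legal",
    "color": "red",
})
print(example["functional"])     # {'department': 'Legal'}
print(example["uncategorized"])  # {'color': 'red'}
```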
Metadata in SharePoint
• Document properties
• Custom columns
• Document sets and folders
• Folksonomies – end user tagging
• Managed Metadata Service
• Content types
Document Properties
• Name
• Title
• Author – maps to the creator document property
• Last modified by
• Last modified date
• Creation date
• Content type
• Access
Custom Columns
• Sortable
• Calculable
• Building blocks of content types
• Searchable
• Refinable
• Required for records management
Document Sets
• A type of folder
• A single work “product”
• Group common content
• Route common work product
• A place to physically keep related items
• Can be seen on a single pane – welcome page
• Manageable permissions
• Can have default content
Taxonomies and Folksonomies
• You need a controlled vocabulary
• Folksonomies are not great – this is not Instagram
• Managed metadata can be user augmented
• Folksonomies are good in theory, but tough in practice
• The key to governance is control
• SharePoint supports taxonomies not ontologies – but that’s ok
Managed Metadata
• Managed metadata forces metadata consistency
• Build term sets, enforce language
• Take control of information governance – don’t just talk about it
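The idea of a term set that forces consistency can be sketched as follows. This is a toy model in plain Python, not the SharePoint Managed Metadata Service API; the `TermSet` class, its methods, and the sample terms are hypothetical.

```python
class TermSet:
    """A tiny controlled vocabulary: preferred terms plus their synonyms."""

    def __init__(self, name):
        self.name = name
        self._lookup = {}  # lowercase label -> preferred term

    def add_term(self, preferred, synonyms=()):
        """Register a preferred term and any labels that should map to it."""
        for label in (preferred, *synonyms):
            self._lookup[label.lower()] = preferred

    def resolve(self, label):
        """Return the preferred term for a label, or raise if it is not in the vocabulary."""
        try:
            return self._lookup[label.strip().lower()]
        except KeyError:
            raise ValueError(f"{label!r} is not a term in {self.name!r}") from None

departments = TermSet("Departments")
departments.add_term("Human Resources", synonyms=["HR", "Personnel"])
departments.add_term("Finance", synonyms=["Accounting"])

print(departments.resolve("hr"))         # Human Resources
print(departments.resolve("Personnel"))  # Human Resources
```

Free-form tags would have stored "HR", "hr", and "Personnel" as three unrelated values; the term set collapses them to one, and anything outside the vocabulary is rejected instead of silently creating a new tag.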
Content Types
What can I do with content types?
• Group properties
• Share properties
• Create common documents
• Use a specific document in a
standard way
• Share specific types of content
uniformly
• Set retention schedules
• Dispose of a certain type of content
• Examples
• Contracts
• Legal agreements
• Invoices
• Project plans
• Receipts
• HR records
• Digital assets
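A content type bundles shared properties with a retention rule. The sketch below models that in plain Python as an assumption-laden illustration: the `ContentType` class, the column names, and the seven-year retention period are all hypothetical and are not taken from SharePoint or from any real retention schedule.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class ContentType:
    """A reusable bundle of metadata columns plus a retention period."""
    name: str
    columns: tuple
    retention_years: int

def disposal_date(content_type, created):
    """Earliest date an item of this content type may be disposed of."""
    target_year = created.year + content_type.retention_years
    try:
        return created.replace(year=target_year)
    except ValueError:
        # 29 February with a non-leap target year: fall back to 28 February.
        return created.replace(year=target_year, day=28)

# Hypothetical content type; the retention period is an example value only.
CONTRACT = ContentType("Contract", ("Counterparty", "Contract date", "Expiry"), 7)

print(disposal_date(CONTRACT, date(2017, 3, 15)))  # 2024-03-15
print(disposal_date(CONTRACT, date(2016, 2, 29)))  # 2023-02-28
```

Because the rule hangs off the content type rather than the individual file, every item tagged as a contract inherits the same columns and the same disposal logic.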
Records and Compliance
• Use metadata to identify and control records – content types
• Content types are required to apply information management
policies and route records
• Record file classification plans – file plans – can be created as
managed metadata and record retention schedules applied
• Metadata allows for in place records management
• Ensure compliance by knowing what is in documents or how they
are related
• Find, tag sensitivity, and sort based on metadata
• FOIA and eDiscovery
Auto-classification and Taxonomy Building
• Standard metadata
• Document purpose
• Document author
• Dates – creation, last modified, approval, contract date, expiry
• Department owner
• Audience
• Advanced metadata
• Document meaning
• Customers
• Compliance levels
• Sentiment
• Value
Types of Taxonomies
• Lists – dictionaries
• Thesauri – synonyms
• Categories – classification lists
• Ontologies
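The difference between a strict hierarchy and a poly-hierarchy, where a child term may have more than one parent, can be shown with a small data structure. This is a minimal Python sketch; the `Taxonomy` class and the automotive sample terms are invented for the example.

```python
from collections import defaultdict

class Taxonomy:
    """Terms linked by child -> parent edges; a child may have several parents."""

    def __init__(self):
        self._parents = defaultdict(set)

    def add(self, child, parent):
        """Record that `child` sits under `parent`."""
        self._parents[child].add(parent)

    def ancestors(self, term):
        """Return every term reachable by walking parent links upward."""
        seen = set()
        stack = [term]
        while stack:
            for parent in self._parents.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def is_polyhierarchical(self):
        """True if at least one term has more than one parent."""
        return any(len(parents) > 1 for parents in self._parents.values())

# Hypothetical terms for illustration.
parts = Taxonomy()
parts.add("Braking systems", "Components")
parts.add("Wear parts", "Components")
parts.add("Brake pads", "Braking systems")
print(parts.is_polyhierarchical())  # False: every term has exactly one parent

parts.add("Brake pads", "Wear parts")
print(parts.is_polyhierarchical())  # True: 'Brake pads' now has two parents
print(sorted(parts.ancestors("Brake pads")))
# ['Braking systems', 'Components', 'Wear parts']
```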
Case Study – Not the Norm
Situation:
• Global automotive organization
• 40,000 users
Challenge:
• Indexed over 20 million documents as part of migration from SharePoint
on-premises to the Office 365 dedicated vNext platform
• Improved enterprise search and collaboration across 30 content sources
• Simplified access to information for a variety of stakeholders
Solution:
• conceptClassifier for Office 365 platform
Benefits:
• Cost reduction – on-premises servers reduced from 50 to 5
• Content now auto-classified and searchable in the cloud
• Ease of access to information
• Improved business production
How Did the Process Work?
• Create a taxonomy
• Taxonomy designed for subject-matter experts
• Easy to use
• Begin auto-classification
• Update taxonomy using the taxonomy prompt ‘suggest clues from class’
• Repeat for content optimization, security breaches, and records
• Index and classify the whole corpus in alignment with business
requirements
What Was the Result?
• Reduced on-premises servers from 50 servers to 5
• Achieved immediate improvements in enterprise search and eDiscovery,
enabled concept-based searching
• Accomplished in 2 weeks
• Successful indexing of 20 million documents
Takeaways
• Use metadata not folders
• Use managed metadata to control vocabulary
• Automate tagging wherever possible
• Use properties wherever it makes sense
• Metadata controls your success in governance and compliance
Thank You
www.conceptsearching.com
Twitter @conceptsearch
Robert Piddocke
Vice President of Business and Channel Development
Concept Searching
robertp@conceptsearching.com

SharePoint Saturday Toronto - Going Meta – How to Use Metadata in SharePoint and Office 365


Editor's Notes

  • #5 For those of you who perhaps do not know who we are, we have been in business since 2002. We were the first to market with a solution to automatically apply semantic metadata, and we are also the only solution in the market with a fully integrated statistical multi-term auto-classification platform, which we use to deliver metadata-driven search and governance solutions. We have been a managed Microsoft partner since 2008, and we are one of only 11 search partners worldwide that are Microsoft Business-Critical SharePoint program partners. Our product platforms deliver metadata-driven search and governance solutions for all versions of SharePoint on-premises from 2007 onwards, as well as all versions of SharePoint Online (the multi-tenant version, the older dedicated version, and the government cloud) and OneDrive for Business. We offer full product support in the form of both a farm solution and a provider-hosted add-in.
  • #6 [This is an animated screen with each paragraph appearing after pressing enter.] This is actually a pretty straightforward definition. The first paragraph is essentially what we do: metadata generation, auto-classification, and taxonomy/Term Store management. The second: the flexibility of the technology to access content regardless of where it is stored. The third: simply that once we generate the semantic metadata and classify it to a taxonomy, all of the Intelligent Metadata Solutions can be developed and deployed. [Remember, the Intelligent Metadata Solutions are not ‘real’ applications. Although they can be run standalone, they also integrate with records management, security, and search solutions; we augment traditional applications with one set of technologies.]
  • #7 Key differentiator, still unique in the industry: compound term processing. Traditional search assumes the end user knows what they are looking for, or must enter the ‘right’ combination of words to get the ‘right’ result. Knowledge workers need to identify content in the context of what they are seeking. The fundamental problem with search solutions is that they are based on an index of single words. Yet most queries are expressed in short patterns of words and not single words in isolation, which are highly ambiguous. In the example above, a search engine relying on keywords would identify all the documents that contained the words triple, heart, and bypass, instead of documents that contained the concept of ‘triple heart bypass’. Since the concept has been identified, other documents that have related concepts will be identified even if they do not contain that exact phrase (coronary surgery, heart attack, stroke, etc.).
  • #8 The business world was dominated by paper for centuries but now the shift to electronic information is upon us, largely due to the vast amount of information we deal with but also due to its transient nature. Physical paper has taught us how to structure content in folders and cabinets but this method of organizing information is now outmoded by modern techniques. We are in the age of metadata. Records managers that have used color bar filing systems are actually using a form of Metadata. Now we have so much more flexibility but we often have resistance to that flexibility. Folders seem logical.
  • #9 Metadata is data about data.
  • #10 Metadata is why we invented computers. We can break out of the structures that bind us to our physical representation of the world and apply logical relationships that are multifaceted. In the past we had to put a file into a folder in a filing cabinet, and the folder explained what that file was about. With large amounts of documents and information, things got lost. Secretaries were important because they were the only ones who could navigate those structures.
  • #11 There are many choices when it comes to structuring metadata. This is a key part of building your metadata strategy: how will you organize your content?
  • #12 SharePoint is essentially a metadata machine. It differs from a traditional file share in many ways but many of these essentially boil down to its capability to associate data with documents.
  • #13 Everyone is familiar with the idea that documents have, or can have, properties, and all documents in SharePoint have some default properties applied. These are often ignored but can be leveraged to help find and use documents. Things as simple as a title are actually metadata.
  • #14 Columns are the default container for metadata in SharePoint. Extensive use of these columns can add a lot of value to content but can also be a burden for users. Information architecture (IA) is, again, essential in this process: making sure the metadata is in line with the needs of your information workers (IW) will mean the difference between success and failure.
  • #15 Document sets allow for the best of metadata with the comfortable feel of a folder. Document sets are a type of Content Type and can be used to contain a single work product. An easy example is a project where documents can be collected. They can even have default documents added each time a set is created, to help kickstart and shape the work product. The disadvantage is that the documents must be physically collected in the set, possibly creating duplicates and permission-based discovery issues.
  • #16 Taxonomies and good information architecture (IA) are the key to successful information management. Build out your taxonomy in the Managed Metadata Service (MMS) and decide on its structure before you start applying it. Changing Content Types in SharePoint can be a pain, so plan them out before you start using them.
  • #17 Use a taxonomy for data profiling. This gives you the ability to segregate content quickly and easily, and to eliminate garbage data. Because the technology understands the context within content, and how content is inter- and intra-related, grouping is very precise. In addition, separate taxonomies can be set up to identify security breaches within content as well as undeclared records.
  • #18 Content Types are a reusable container of Metadata, workflow, and settings in SharePoint. This container can also dictate information management policies and document routing. This collection of metadata can be used to define a work product or process of a typical type of content in SharePoint. The key to Content Types is, of course, Metadata – data used to describe the content this container holds. By creating content types you can automatically define what types of information should be used to describe and extend the content in question.
  • #19 The point being made: files with no value are also returned during a search, which makes effective search a significant challenge. Users overwhelmingly use file stores instead of designated repositories to store information, and as a result the file stores grow out of control. This impacts any application that requires the use of metadata, such as eDiscovery, text mining, and analytics. It increases risk and is costly.
  • #20 Use a taxonomy for data profiling. This gives you the ability to segregate content quickly and easily, and to eliminate garbage data. Because the technology understands the context within content, and how content is inter- and intra-related, grouping is very precise. In addition, separate taxonomies can be set up to identify security breaches within content as well as undeclared records.
  • #21 1. Term lists (authority files, glossaries, dictionaries, and gazetteers). 2. Classifications and categories (subject headings, classification schemes, taxonomies, and categorization schemes). 3. Relationship lists (thesauri, semantic networks, and ontologies). Ontologies show relationships. Hierarchical taxonomies follow a parent-child relationship. Poly-hierarchical taxonomies allow at least one child to have more than one parent.
  • #22 BUT metadata can become a burden on information worker (IW) time and workloads. We are no longer only asking our employees to sell our products and services, create projects, coordinate with colleagues, and deliver those products and services to the customer while keeping them satisfied. We are also asking them to be experts in information management and governance and to apply meaningful metadata across the organization. What Concept Searching (CS) does is support your metadata and information management (IM) strategy without burdening the IW end user.
  • #23 Client Profile: One of the largest global automotive parts manufacturers, the client operates technical centers, manufacturing sites, and customer support services in 44 countries. A highly diversified company, it designs, engineers, and manufactures a wide variety of components, integrated systems, and modules.
    Issue/Situation/Challenge: The client needed to provide business tools to improve its enterprise-wide search and collaboration experience, through the tagging, classification, and reduction of discrete pieces of information that business users had to sort through daily. Cost reduction was required, while at the same time recognizing that the key user groups dependent on the search facility were the engineering and knowledge management teams. Both the design and manufacturing processes needed to be part of the solution structure, to create effective best practices and bill of design and bill of process methods. In support of improving information access for its global internal community, the organization faced the classification of literally millions of documents before the business user search experience could be improved. It indexed over 20 million documents as part of a migration to the Office 365 dedicated vNext platform, live just two weeks after availability of Microsoft cloud hybrid search, reducing IT costs in the digital workplace. This solution enabled the client to reduce its on-premises search environment from more than 50 servers to just five. It achieved this challenging goal with management consulting expertise and technologies that enabled multi-term metadata generation, auto-classification, taxonomy management, and search improvements through the conceptClassifier for Office 365 platform.
    Benefits: The solution allows any of the 40,000 users to search 20 million documents from over 30 content sources, securely and within seconds. It leveraged the Microsoft Azure cloud platform, which reduced the required infrastructure tenfold, while improving performance and reducing complexity in the digital workplace. Cost reduction was gained from the reduction of on-premises servers. The move to the cloud platform was the organization’s first Office 365 implementation, and was achieved speedily. Content is now auto-classified and searchable in the cloud. The solution maintained and enhanced enterprise-wide search capability, which improved business production. The speed of this development and deployment was unprecedented and enabled cost reduction and return on investment to be achieved extremely quickly. The solution enables collaboration, as it is also used by suppliers and partners, giving them access to the same information network resources. The responsiveness of this platform means the client can now intelligently migrate all its content, to reduce the number of servers further. The requirements of this solution were the impetus for achieving this future-proof approach.
  • #24 A taxonomy was created with input and collaboration from all stakeholders across the organization, and these ‘super users’ were then given taxonomy access via CS and the MMS. End users are never required to apply metadata.