This session introduces metadata and its full use in SharePoint, both on-premises and Online. It explores best practices on how meaningful metadata can save time and money, improve user experience, and determine the overall success of collaboration and document management.
Robert Piddocke – with over a decade of experience in SharePoint, passionate about information management, and the author of two books on SharePoint Search – discusses business values and considerations involved when determining how to govern information in SharePoint and SharePoint Online.
Learn how going meta helps transcend typical SharePoint information architecture, and understand how to realize the potential of intelligent content in context.
Client Profile:
The client is one of the largest global automotive parts manufacturers and operates technical centers, manufacturing sites, and customer support services in 44 countries. A highly diversified company, it designs, engineers, and manufactures a wide variety of components, integrated systems, and modules.
Issue/Situation/Challenge:
Needed to provide business tools to improve its enterprise-wide search and collaboration experience, through the tagging, classification, and reduction of discrete pieces of information that business users had to sort through daily.
Cost reduction was required, while at the same time recognizing that key user groups dependent on the search facility were the engineering and knowledge management teams.
Both the design and manufacturing processes needed to be part of the solution structure, to create effective best practices and bill of design and bill of process methods.
In support of improving information access for its global internal community, the organization faced the classification of literally millions of documents, before the business user search experience could be improved.
Indexed over 20 million documents as part of a migration to the Office 365 dedicated vNext platform, live just two weeks after availability of Microsoft cloud hybrid search – reducing IT costs in the digital workplace.
This solution enabled the client to reduce its on-premises search environment from more than 50 servers to just five.
Achieved this challenging goal with management consulting expertise and technologies that enabled multi-term metadata generation, auto-classification, taxonomy management, and search improvements through the conceptClassifier for Office 365 platform.
Benefits:
The solution allows any of the 40,000 users to search 20 million documents from over 30 content sources, securely and within seconds. It leveraged the Microsoft Azure cloud platform, which reduced the required infrastructure tenfold, while improving performance and reducing complexity in the digital workplace.
Cost reduction, gained from the reduction of on-premises servers.
The move to the cloud platform – this was the organization’s first Office 365 implementation, and was achieved speedily. Content is now auto-classified and searchable in the cloud.
Maintained and enhanced enterprise-wide search capability, which improved business production.
The speed of this development and deployment was unprecedented and enabled cost reduction and return on investment to be achieved extremely quickly.
The solution enables collaboration, as it is also used by suppliers and partners, giving them access to the same information network resources.
The responsiveness of this platform means the client can now intelligently migrate all of its content and further reduce the number of servers. The requirements of this solution were the impetus for achieving this future-proof approach.
For those of you that perhaps do not know who we are, we have been in business since 2002.
We were the first to market with a solution to automatically apply semantic metadata; and we are also the only solution in the market with a fully integrated statistical multi-term auto-classification platform which we use to deliver metadata driven search and governance solutions.
We have been a managed Microsoft partner since 2008, and we are one of only 11 search partners worldwide that are Microsoft Business-Critical SharePoint program partners.
Our product platforms deliver metadata-driven search and governance solutions for all versions of SharePoint on-premises from 2007 onwards, as well as all versions of SharePoint Online (the multi-tenant version, the older dedicated version, and the government cloud) and OneDrive for Business. We offer full product support in the form of both a farm solution and a provider-hosted add-in.
[This is an animated screen with each paragraph appearing after pressing enter.]
This is actually a pretty straightforward definition.
The first is essentially what we do:
Metadata Generation
Auto-classification
Taxonomy/Term Store Management
The second:
Flexibility of the technology to access content regardless of where it is stored
The third:
Simply that once we generate the semantic metadata and classify it to a taxonomy, all of the Intelligent Metadata Solutions can be developed and deployed.
[Remember, the Intelligent Metadata Solutions are not ‘real’ applications, although they can be run stand-alone. They also integrate with records management, security, and search solutions; we augment traditional applications with one set of technologies.]
Key Differentiator – still unique in the industry – compound term processing
Traditional search assumes the end user knows what they are looking for, or must enter the ‘right’ combination of words to get the ‘right’ result.
Knowledge workers need to identify content in the context of what they are seeking. The fundamental problem with search solutions is that they are based on an index of single words. Yet most queries are expressed in short patterns of words and not single words in isolation – which are highly ambiguous.
In the example above, a search engine relying on keywords would identify all the documents that contained the words ‘triple’, ‘heart’, and ‘bypass’, instead of documents that contained the concept of ‘triple heart bypass’. Since the concept has been identified, other documents that have related concepts will be identified even if they do not contain that exact phrase (coronary surgery, heart attack, stroke, etc.).
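The difference between single-word matching and multi-word matching can be illustrated with a small sketch. This is only a toy comparison, assuming simple word tokenization and exact phrase matching; it is not how conceptClassifier or any statistical compound term engine works, and it does not find related concepts. The function names and sample documents are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_match(query, document):
    """Single-word matching: every query word appears somewhere in the document."""
    doc_words = set(tokenize(document))
    return all(word in doc_words for word in tokenize(query))


def phrase_match(query, document):
    """Multi-word matching: the query words appear together and in order."""
    q = tokenize(query)
    d = tokenize(document)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


# Hypothetical sample documents, invented for illustration.
docs = {
    "cardiology": "The patient recovered well after a triple heart bypass.",
    "sports": "A triple jump champion with a big heart used the bypass road.",
}

query = "triple heart bypass"
keyword_hits = [name for name, text in docs.items() if keyword_match(query, text)]
phrase_hits = [name for name, text in docs.items() if phrase_match(query, text)]

print(keyword_hits)  # both documents contain all three words
print(phrase_hits)   # only the cardiology document contains the phrase
```

The keyword approach returns the sports document as well, because it contains all three words in unrelated positions. That is the ambiguity of single-word indexes described above.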
The business world was dominated by paper for centuries but now the shift to electronic information is upon us, largely due to the vast amount of information we deal with but also due to its transient nature. Physical paper has taught us how to structure content in folders and cabinets but this method of organizing information is now outmoded by modern techniques. We are in the age of metadata. Records managers that have used color bar filing systems are actually using a form of Metadata. Now we have so much more flexibility but we often have resistance to that flexibility. Folders seem logical.
Metadata is data about data.
Metadata is why we invented computers. We can break out of the structures that bind us with our physical representation of the world and apply logical relationships that are multifaceted. In the past we had to put files into a folder in a filing cabinet that explained what that file was about. With large amounts of documents and information, things got lost. Being a secretary was an important job because secretaries were the only ones who could navigate those structures.
There are many choices when it comes to structuring metadata. This is a key part of building your metadata strategy. How will you organize your content?
Even if you are using term sets (where these steps are actually more important than with a taxonomy tool), you need to plan for the organizational metadata, the various functional groups, what they are using the metadata for, and what metadata is useful to them. Are they records managers or in sales? Quite a difference.
Content Lifecycle Management: Is your goal to improve search, or to actually implement content lifecycle management? If you think you might, take that into consideration. (The implementation of retention schedules in SharePoint can be associated with specific types of content through the application of information management policies.)
Use a taxonomy for the data profiling. This will provide you the ability to segregate content quickly and easily.
Using the taxonomy for data profiling, you will be able to eliminate garbage data. Because the technology understands the context within content, the grouping of inter- and intra-related content is very precise. In addition, separate taxonomies can be set up to identify security breaches within content as well as undeclared records.
1. Term lists (authority files, glossaries, dictionaries, and gazetteers)
2. Classifications and categories (subject headings, classification schemes, taxonomies, and categorization schemes)
3. Relationship lists (thesauri, semantic networks, and ontologies)
Ontologies show relationships. Hierarchical taxonomies follow a parent-child relationship. Poly-hierarchical taxonomies allow at least one child to have more than one parent.
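As a rough sketch of that distinction, a taxonomy can be modeled as a mapping from each term to its parent terms. The terms below are hypothetical and the structure is plain Python, not a SharePoint term store API.

```python
# Each term maps to the list of its parent terms (hypothetical example terms).
hierarchical = {
    "Vehicles": [],
    "Engines": ["Vehicles"],
    "Brakes": ["Vehicles"],
}

poly_hierarchical = {
    "Vehicles": [],
    "Safety": [],
    "Engines": ["Vehicles"],
    "Brakes": ["Vehicles", "Safety"],  # one child with two parents
}


def is_poly_hierarchical(taxonomy):
    """Return True if at least one term has more than one parent."""
    return any(len(parents) > 1 for parents in taxonomy.values())


def ancestors(taxonomy, term):
    """Collect every term reachable by following parent links upward."""
    found = set()
    stack = list(taxonomy[term])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(taxonomy[parent])
    return found


print(is_poly_hierarchical(hierarchical))
print(is_poly_hierarchical(poly_hierarchical))
print(sorted(ancestors(poly_hierarchical, "Brakes")))
```

In the poly-hierarchical case, a document tagged with "Brakes" can be found by browsing from either "Vehicles" or "Safety".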
SharePoint is essentially a metadata machine. It differs from a traditional file share in many ways but many of these essentially boil down to its capability to associate data with documents. This is why we have
Everyone is familiar with the idea that documents have, or can have, properties, and all documents in SharePoint will have some default properties applied. These are often ignored but can be leveraged to help find and use documents. Things as simple as a title are actually metadata.
Columns are the default container for metadata in SharePoint. Extensive use of these columns can add a lot of value to content but can also be a burden for users. Information architecture (IA) is, again, essential in this process: making sure the metadata is in line with the needs of your information workers (IW) will mean the difference between success and failure.
Document sets allow for the best of metadata with the comfortable feel of a folder. Document sets are a type of Content Type and can be used to contain a single work product. An easy example is a project where documents can be collected. They can even have default documents added each time a set is created to help kickstart and shape the work product. The disadvantage is that the documents must be physically collected in the set, possibly creating duplicates and permission-based discovery issues.
Content Types are a reusable container of Metadata, workflow, and settings in SharePoint. This container can also dictate information management policies and document routing. This collection of metadata can be used to define a work product or process of a typical type of content in SharePoint. The key to Content Types is, of course, Metadata – data used to describe the content this container holds. By creating content types you can automatically define what types of information should be used to describe and extend the content in question.
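The idea of a reusable container can be sketched in a few lines. This is a conceptual model only, written in plain Python; the class names, column names, and policy text are hypothetical and do not correspond to the SharePoint object model.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    """A single piece of metadata that describes an item."""
    name: str
    required: bool = False


@dataclass
class ContentType:
    """A reusable bundle of columns plus a policy, applied to many items."""
    name: str
    columns: list = field(default_factory=list)
    retention_policy: str = ""

    def missing_metadata(self, item):
        """Return the names of required columns that the item leaves empty."""
        return [c.name for c in self.columns if c.required and not item.get(c.name)]


# Hypothetical content type for contracts.
contract = ContentType(
    name="Contract",
    columns=[
        Column("Title", required=True),
        Column("Customer", required=True),
        Column("Notes"),
    ],
    retention_policy="Keep 7 years",
)

item = {"Title": "Supply agreement", "Customer": ""}
print(contract.missing_metadata(item))
```

Because the definition lives in one place, every item of that type is described by the same columns and governed by the same policy.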
Taxonomies and good AI are the key to successful Information Management. Build out your taxonomy in the MMS and decide on its structure before you start applying it. Making changes to Content Types in SharePoint can be painful, so plan them out before you start using them.
Prompt end users to add metadata: statistics show that they will predominantly select the first item in the list (Engineering as opposed to Product XYZ Revision 28 RP)
Attention View – very helpful tool
External sharing of content: this is a good feature, but end users will typically just send information. Secure collaboration is an issue unto itself; there are just so many loopholes for unintentional sharing of content.
Self Service Migration Kit:
Designed to move content (such as document libraries or file shares) from SharePoint Server sites located at an organization's datacenters to Microsoft's cloud-based SharePoint Online service or the OneDrive Office 365 service. This should be used with care. Garbage in, garbage out. Most end users will not clean up their data before they move it. This can be a big detriment to search if end-user migrations are frequent.
Implement Sharing Controls:
Let users share SharePoint content with external users
Let users share OneDrive content with external users
Default sharing links
Direct – only people with permission
Internal – all internal to the organization
Anonymous – includes expiration date and file permissions
Device policy:
Control access from devices that aren’t compliant or joined to a domain
Limited access
Block access
Block access from third party apps and Office 2010 and earlier
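The way these settings combine can be sketched as a small decision function. This is a deliberately simplified toy model with hypothetical names. It ignores link expiration, limited access behavior, and app restrictions, and it is not how SharePoint Online evaluates access.

```python
from enum import Enum


class LinkType(Enum):
    DIRECT = "only people with permission"
    INTERNAL = "all internal to the organization"
    ANONYMOUS = "anyone with the link"


class DevicePolicy(Enum):
    ALLOW = "allow"
    LIMITED = "limited access"
    BLOCK = "block access"


def can_open(link, user_is_internal, user_has_permission,
             device_is_compliant, device_policy):
    """Decide whether a user may open a shared item (toy model)."""
    # The device policy is checked first: a blocked device never gets in.
    if not device_is_compliant and device_policy is DevicePolicy.BLOCK:
        return False
    if link is LinkType.ANONYMOUS:
        return True
    if link is LinkType.INTERNAL:
        return user_is_internal
    # DIRECT links work only for people who already have permission.
    return user_has_permission


print(can_open(LinkType.INTERNAL, user_is_internal=False, user_has_permission=True,
               device_is_compliant=True, device_policy=DevicePolicy.ALLOW))
```

Even in this reduced form, the anonymous branch shows why the default link type matters: it is the only one that needs nothing from the recipient.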
RM is now dependent on Content Types. It is available in place, with a record center, or with a combination of the two by using information management policies. New in O365 are two features in the Security and Compliance Center: retention policies, and labels that can be applied by end users. This allows users to choose a “retention policy” without having to understand the file plan, because the content is labeled with a friendly term. This applies in-place retention and disposition for these items in both SharePoint (OneDrive) and Exchange. Data Governance retention works on sites or entire mailboxes.
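The idea behind a friendly label can be shown with a minimal sketch: the user picks a term, and the retention period behind it is looked up for them. The label names and periods below are hypothetical, and the code is plain Python rather than any Security and Compliance Center API.

```python
from datetime import date

# Hypothetical friendly labels mapped to retention periods in years.
RETENTION_LABELS = {
    "Contract": 7,
    "Project document": 3,
    "Working draft": 1,
}


def disposition_date(label, labeled_on):
    """Return the date on which an item with this label is due for disposition."""
    years = RETENTION_LABELS[label]
    try:
        return labeled_on.replace(year=labeled_on.year + years)
    except ValueError:
        # February 29 in a non-leap target year: fall back to February 28.
        return labeled_on.replace(year=labeled_on.year + years, day=28)


print(disposition_date("Contract", date(2017, 3, 15)))
```

The user only ever sees the label name. The file plan detail, here the number of years, stays with the records manager who maintains the mapping.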
A taxonomy was created with input and collaboration from all stakeholders across the organization, and then taxonomy access via CS and the MMS was given to these ‘super users’. End users are never required to apply metadata.
BUT metadata can become a burden on IW time and workloads. We are now asking our employees not only to sell our products and services, create projects, coordinate with colleagues, and deliver those products and services to the customer while keeping them satisfied; we are also asking them to be experts in information management and governance and to apply meaningful metadata across the organization. What CS does is support your metadata and IM strategy while not burdening the IW end user.