2. INTRO
The news is that SharePoint 2016 will improve search.
This can mean many things. It may, for example, mean that something that
impacts search quality has improved.
Taken in this sense, maybe the news means that SharePoint usability has
improved. Said in another way, maybe the news is about buttons that are
easier to find and search bars that are easier on the eyes.1
Then again, maybe the news points at something else entirely. It may rather
mean that something otherwise constraining or limiting the scope of search
has been removed.
For example, it may mean that SharePoint 2016 will allow for joint indexing
in search of cloud and on-premises repositories (which Microsoft calls
“cloud hybrid search”). SharePoint 2013 did not provide for cloud hybrid
search. SharePoint 2016, on the other hand, does provide cloud hybrid
search.2
So in the sense of being able to search cloud databases and on-premises
databases at the same time, yes, there is an improvement in SharePoint
2016.
But is that the news? – Or can it be said that SharePoint 2016 search
modifications include any benefits in the case of a direct migration from
SharePoint 2013?
1 No such changes were observed.
2 See the “Cloud hybrid search” entry in https://technet.microsoft.com/en-us/library/mt346121(v=office.16).aspx#hybrid
Last accessed March 4, 2016.
Will the improvement in SharePoint2016 search increase user adoption?
→ and this improvement supports user adoption
→ which translates into more inputs for an adaptive system to work with
→ which in turn can be expected to encourage adaptation in the search mechanism in a way that benefits search
→ which itself supports further adoption and utilization
→ and creates a feedback loop that results in search improvement.
3. Scenario of interest
Imagine an organization with a ton of unstructured data, which lives in
SharePoint 2013 in a multitude of libraries across a multitude of sites. Staff
in units across the organization need contextual knowledge to be able to
come up with search queries, and from time to time they find what they
need.
Now imagine that:
This data is to be migrated over to SharePoint 2016.
During the migration, while the Intranet may be made more engaging in an effort to effect user adoption, the data will not be given any more structure. For example, metadata will not be enhanced, nor duplicates noted.
Rather, the sites and contents will simply be mirrored in SharePoint 2016, which allows people to rely on their prior knowledge of where things are.
In short, the data itself will simply cross over into SharePoint 2016, where
the familiar search bars of SharePoint 2013 will be there for users, exactly
where they are used to seeing them – one at library level, and one at site
level.
Question?
In such a scenario, does the user have reason to expect better search results
in SharePoint 2016?
Hint: When evaluating enterprise solutions, think of users and typical use
cases. These common use scenarios have the most impact on your bottom line.
4. Results
After a comparative study designed specifically to look into the kinds of
search enhancements that would make a difference in such a scenario (see
the Methodology section for details), it is fair to say that, while something in
the way of underlying search mechanisms has changed in SharePoint 2016,
it cannot be said that the quality of search results for queries of large
document repositories has been meaningfully improved.
A paradox
But how does this make sense? – That is, how can it be that the underlying
search mechanisms can be shown to be different, and yet no enhancement
in the quality of search results over a large office document store can be
noticed?
As mentioned, SharePoint 2016 does include some search modifications;
the most notable being those related to the new cloud hybrid search. These
modifications may have had an impact on the underlying flow by which
search results are retrieved. This is not unimaginable, given that the flow of
the hybrid search can be expected to be at least to some extent different in
order to accommodate the hybrid indexing.
SP 2016 is no better
than SP 2013 at locating
particular documents
inside libraries.
5. Differences observed
During the study, some differences were observed in the behavior of SharePoint 2016 when compared to the behavior of SharePoint 2013. For
example, from time to time, SharePoint 2016 returned a different total number of results. Further, as a user progressed through consecutive
pages of results, the recalculation function of SharePoint 2016 – by which is meant the process whereby a new estimate of total results is generated
– did not always behave exactly as the recalculation function in SharePoint 2013 did.
However, the difference in total number of results was never dramatic, and the size of the deviation correlated with the total number of
results returned. Importantly, for queries returning fewer than 100 total results, the numbers were nearly identical in all cases observed.
[Figure: comparison chart with series labeled 2016 and 2013]
6. Are these differences important?
These are not the kinds of differences that would translate into any benefit
to a user.
Users often prefer to recraft their query rather than move on to the second
page of results, and often do so at a mere glance at the first three results.
This is because if the first three results appear to indicate that their query is
too broad or vague – for example, by revealing that the document store
includes more documents of some variety than they suspected – they will
often add greater specificity to their query.
At the same time, except for maybe in the most suffocating offices, it is
unlikely that any user progresses through every page of thousands of
results just to take a ride on the recalculation function curve.3
Again, such differences, while noticeable under a comparative study, do
not support the notion that users in the scenario of interest will be more
satisfied with SharePoint 2016 search. At the same time, comparison of
qualitative evaluation of results across the two environments leads to the
same conclusion. While some minimal reordering of results could from time
to time be observed, quality of results was altogether identical across the
two environments, and the reordering was ultimately trivial to overall user
experience.4
3 Although it must be admitted that this is by far the most amusing use of SharePoint, whether 2013 or 2016. And comparing
the curves between the two SharePoint environments, this was – and bear in mind that all comments are to be taken in context
– absolutely thrilling.
4 See the Methodology section for more information.
Hint: How staff uses
software greatly affects
what software changes
matter.
7. What the results do not show
This does not, however, mean that comparison of the two environments can
decisively show that SharePoint 2016 search will not function better in the
scenario of interest. The study conducted did not engage with the question
of whether SharePoint 2016 is more adaptive than SharePoint 2013. At the
same time, as adaptation in the main benefits from anything that lifts user
engagement – as computers need inputs to adapt, regardless of their adaptive
means – the general expansion in the capaciousness of SharePoint 2016 can
be said to complement adaptive potential.5
Enriching data, however, still remains the most effective means of
improving search, as environment adaptation can potentially push user
behavior in a direction that does not align with organizational goals and
reinforces unconventional uses of language. It is noted that SharePoint
2016 is as convenient as SharePoint 2013 for enrichment and categorization
projects, and that, as with SharePoint 2013, search can be made responsive
to your desired information outcomes. In the absence of responsible
policy-driven intervention, adaptation can entrench user habits that
ultimately undermine organization-wide information retrieval goals.
5 Please see http://mstechtalk.com/comparing-sharepoint-2016-boundaries-and-limitations-with-sharepoint-2013-2010/ Last accessed March 4, 2016.
What is meant more specifically here is that, as SharePoint 2016 can
technically handle more data, if that capaciousness is exploited and user engagement with search in SharePoint is enhanced,
then it is possible that SharePoint 2016 search may benefit from this in terms of adaptive response. However, this presumes
that something in the way of total engagement and input benefits adaptation, that is, that adaptation goes beyond localized activities
such as the search behaviors of single users.
Hint: Be careful with “adaptive” search, as it can easily entrench
undesirable staff practices, and thereby erode organizational initiatives
related to culture and education.
8. METHODOLOGY
A setup was designed to compare the search of SharePoint 2013 with
the search in SharePoint 2016. The setup design was attentive to the
following concerns:
That no difference in the SharePoint environments used during the comparison would tend to affect the results
That the documents used would be representative of the kinds of documents typically found on SharePoint in an organizational or business setting
That the queries initiated would be representative of the kinds of queries that people in an office environment from time to time use
That results would be measured in a way that gives a meaningful view into the kinds of differences that are likely to translate into better or worse search
The following subsections summarize these elements of the setup design:
the SharePoint environments used
the documents used
the kinds of queries used
and the data collection schema
9. SharePoint environments used
The environments were designed so that site-level6 search could be
meaningfully compared.
First, new sites were made in each of the SharePoint environments. These
sites did not have any sub sites.
The documents were then manually uploaded into the sites prepared for
the comparison. A full search crawl was initiated and allowed to finish on
both environments, so that the site-level search would be ready.
No search reconfiguration was performed, and the Default Search Model
ranking model was retained in both environments. Prior to the making of
any observations, the SharePoint environments were not interacted with,
so as to avoid, to the extent possible, triggering adaptation in SharePoint
search.
SharePoint search is said to be adaptive. Microsoft states that “SharePoint
search is like Bing for any information within your company”,7 and describes
the search results as “Personalized results based on your intent and past
behavior”.8 Accordingly, throughout the comparison, care was taken to
ensure that the level of interaction across the two SharePoint environments
was consistent. Although the query practices used were not designed to
look into SharePoint search adaptation, it is here noted that no adaptation
was observed throughout the comparison.9
6 It is noted that SharePoint 2013 introduced library-specific search, and that this feature is retained in SharePoint 2016.
As attention in corporate search optimization focuses on site-level search, the comparison does not include any results from
the library-specific search function. It is, however, noted that differences exist between the site-level and the library-specific
search options. In general, although this article in no way focuses on the matter and will offer no support for the assertion, it
appears that the library-specific search and site-level search can, depending on the query, return very different results.
7 https://products.office.com/en-us/SharePoint/connect-with-employees-across-the-enterprise?tab=fcf30fc4-890b-c550-f1cd-79c5ced96edb#a
(click the “Discover” tab, and then click on the “Find stuff” video link icon). Last accessed March 3,
2016. To the extent that SharePoint search is similar to Bing, it can be expected to learn over time, as Bing has been providing
adaptive search since 2011 (https://blogs.bing.com/search/2011/09/14/adapting-search-to-you/ Last accessed March 3,
2016).
8 https://products.office.com/en-us/SharePoint/connect-with-employees-across-the-enterprise?tab=fcf30fc4-890b-c550-f1cd-79c5ced96edb#
(click on the “Discover” tab, and then click on the “Your results” video link icon). Last accessed March 3,
2016.
9 Note that the total level of interaction was not comparable to the level to be expected in an organizational setting. In an
organization, the level of interaction involved in the comparison would likely be generated by any given employee in several
days (presuming the nature of their work to be document intensive).
10. Documents
The document set consisted of 1,500 disparate office documents
collected from the Internet. The document set included documents in a
variety of formats (Word documents, PDFs, emails and so on) to try to
approximate common office conditions.10 The documents also varied in
terms of:
type (including reports, CVs, memos, minutes, draft articles, etc.)
content (including public policy, opinion, notes, research, communications, etc.)
date (from mid-20th century to 2015)
length (from one page to several hundred pages)
language (ranging across different English dialects and including foreign-language text)
features within the document (codes, graphics, layout elements, etc.)
The variety in the documents was intended to mirror, as much as possible,
the variety found in the document libraries of large organizations.
10 By proportion of total data: 83.3% of the files were PDFs; 7.2% Office Open XML documents; 5.6% DOC files; 1.6%
MSG files; 1.5% PPT files; 0.6% HTML files; and less than 1% of HTM, JPEG, CSV, PPSX, and DOTX files. By proportion of total
number of files: 69.2% were PDFs; 14.3% Office Open XML documents; 8.9% DOC files; 3.3% HTML files; 2.6% MSG files; 0.7%
PPT files; 0.6% HTM files; and exactly 2 JPEG files, 1 DOTX file, 1 PPSX file, and 1 CSV file.
11. Queries
Queries were chosen so as to simulate the kind of search that commonly
occurs in an office environment. The queries fall into four categories:
(1) names (e.g., John Smith, Smith)
(2) document types (e.g., contract, report)
(3) subject-matter queries (e.g., alarm system, bicycle ride)
(4) other (in the main, queries for specific materials – e.g., a specific document known to exist in the document library – and queries using features known to exist in particular documents – e.g., a unique document identifier)
For each of these categories, five representative queries were used, for a
total of twenty queries made across both environments.11
In order to locate names that would make for useful queries, a search was
performed across both environments so as to elicit exposure of names
across documents.12 For document types and subject-matter queries,
familiarity with the document set (which is frequently used internally for
testing purposes) assisted with identification of representative terms. From
time to time, what was noticed within a document was used as inspiration
for queries in the “other” category.
11 The queries were made in the exact same order across both environments, although, as noted, no adaptive behavior was
observed. After the comparison was made and the data collected, repeating the same queries returned the exact
same results as appeared during the first run.
12 The same search was used across both environments.
12. Data collection schema
When comparing search, some metrics are more important than others.
For example, it is well known that user behavior prioritizes results on the
first page, and especially the top three results therein. What is meant by
user behavior here is a generalization taken from observations made across
large populations, and variations admittedly exist. Nonetheless, in order to
collect meaningful data, the comparison focused on the top three results
and those appearing on the first page (in SharePoint, a maximum of ten).
The specific order of results was ignored. This is because users are generally
indifferent as to where a result appears in the top three, so long as it
does appear in the top three. The same is true for the results appearing
on the first page. That is, if a user has had to take the trouble of reading
beyond the top three, it is not of great importance whether the result they
find most compelling appears seventh or eighth, so long as it appears on the
first page.
Accordingly, the comparison made use of the measurement called overlap,
which is the proportion of elements that are the same across the sets under
comparison.
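The overlap measurement can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study does not say which denominator was used, so the sketch divides the number of shared results by the size of the larger set, and the function name `overlap` and the file names are hypothetical.

```python
from typing import Hashable, Iterable


def overlap(results_a: Iterable[Hashable], results_b: Iterable[Hashable]) -> float:
    """Order-insensitive overlap between two result lists.

    Assumption: shared results are divided by the size of the larger set,
    so the score is 1.0 only when both sets hold the same results.
    """
    set_a, set_b = set(results_a), set(results_b)
    larger = max(len(set_a), len(set_b))
    if larger == 0:
        return 1.0  # both environments returned nothing: treated as identical
    return len(set_a & set_b) / larger


# Hypothetical top-three results for one query in each environment.
top3_sp2013 = ["budget.pdf", "memo.docx", "minutes.doc"]
top3_sp2016 = ["memo.docx", "budget.pdf", "report.pdf"]

top3_overlap = overlap(top3_sp2013, top3_sp2016)
print(f"top-3 overlap: {top3_overlap:.2f}")
```

Because the inputs are converted to sets, the position of a result within the top three (or within the first page) has no effect on the score, which matches the decision to ignore the specific order of results.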
The comparison also included observation of the total number of search
results.13 This metric gives a crude, but useful, measurement of the general
comparability of results. For example, it can be expected that if the total
number of results across identical data sets is widely divergent for a great
proportion of queries, then the underlying search mechanisms likely differ
greatly. This is important, because when the underlying search mechanisms
are very different, the quality of search results can be affected.
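One way to make "widely divergent for a great proportion of queries" concrete is sketched below. The relative-difference formula, the 0.5 threshold, and the sample totals are all assumptions chosen for illustration; none of them are taken from the study.

```python
def relative_difference(count_a: int, count_b: int) -> float:
    """Difference between two result totals, relative to the larger one."""
    larger = max(count_a, count_b)
    if larger == 0:
        return 0.0  # neither environment returned anything
    return abs(count_a - count_b) / larger


def share_divergent(totals, threshold=0.5):
    """Fraction of queries whose totals differ by more than `threshold`.

    `threshold` is a hypothetical cut-off for "widely divergent".
    """
    if not totals:
        return 0.0
    divergent = sum(1 for a, b in totals if relative_difference(a, b) > threshold)
    return divergent / len(totals)


# Hypothetical (SharePoint 2013, SharePoint 2016) totals, one pair per query.
totals = [(42, 42), (97, 96), (1200, 1350), (15, 15)]

divergent_share = share_divergent(totals)
print(f"share of widely divergent queries: {divergent_share:.2f}")
```

A share near zero would suggest comparable underlying mechanisms, while a large share would suggest that they differ greatly.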
Assessment of quality of results for purposes of qualitative comparison
made use of the top three results in every category.14 Each result was
evaluated on the basis set out in the Quality Assessment section.
13 The actual figure taken is that which appears at the bottom of the first page of results. This number can change the moment
that you move on to the second page of results (the larger the number, the more probable the change). It is here noted that
the change observed across the two SharePoint environments in the recalculation of total results weakly indicates that how
this recalculation is performed is different. It must be kept in mind that SharePoint 2016 is still offered in only Beta form, and
that resource allocations supporting performance of SharePoint 2016 may or may not be comparable to those allocated for
SharePoint 2013 performance. This may or may not have an effect on the recalculation, less as a function of the underlying
mechanism, but possibly as a function of its workings under different resource scenarios.
14 It is noted that sometimes only one or two results were returned by either environment.
Hint: When comparing
enterprise solutions, make
sure that your metrics relate
to what you care about.
13. Quality Assessment
EXACT – If a result was exactly what was wanted, in the case that the query was formed in an attempt at getting a particular known document.
EXCELLENT – If, when looking at a result, it appears likely and sensible that a user would use the query in an effort to locate that particular document.
OK – If there is some discernible basis15 on which the result can be said to relate to the query, but the query is not the most obvious one to use if you wanted to locate the document in particular and were familiar with its contents.16
POOR – If something integral to the query is overlooked by the search mechanism when it generates results. For example, in the case of complex queries involving multiple terms, from time to time, results will be returned that overlook the relationship between the terms used. Accordingly, the results will be unrelated to the query as a whole, although not completely unrelated to the terms themselves. It is admitted that the crafting of the query itself is involved in such scenarios, and that users often recraft queries when they suspect that the search mechanism has failed to grasp something in the nature of the query.
INVALID – If no explanation for the result can be found with reasonable effort.
Evaluation on this basis, coupled with the overlap measurement, allowed
for a look into whether SharePoint 2016 could be said to have significantly
improved search.
15 For example, the words in a complex query (meaning a query using multiple terms) happen to appear somewhere across
the document, but independently of each other.
16 It is acknowledged that the result is still considered valid and useful, as sometimes a user will try to locate a document
based on a small detail recalled, and possibly only recalled imperfectly.
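The five ratings defined in this section lend themselves to a simple tally per environment. The sketch below is one possible way to record and compare them; the `Rating` enum, the `tally` helper, and the sample ratings are hypothetical and not part of the study's tooling.

```python
from collections import Counter
from enum import Enum


class Rating(Enum):
    """The five quality ratings used in the assessment."""
    EXACT = "exact"
    EXCELLENT = "excellent"
    OK = "ok"
    POOR = "poor"
    INVALID = "invalid"


def tally(ratings):
    """Count how often each rating was assigned, including zero counts."""
    counts = Counter(ratings)
    return {rating: counts.get(rating, 0) for rating in Rating}


# Hypothetical ratings of the top three results for one query.
ratings_sp2013 = [Rating.EXACT, Rating.OK, Rating.POOR]
ratings_sp2016 = [Rating.EXACT, Rating.POOR, Rating.OK]

# Order within the top three is ignored, so only the counts are compared.
same_quality = tally(ratings_sp2013) == tally(ratings_sp2016)
print(f"same quality profile: {same_quality}")
```

Comparing tallies rather than ordered lists keeps the quality comparison consistent with the order-insensitive treatment of the top three results.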
14. CONCLUSION
Search in SharePoint 2016 cannot be said to be better than search in
SharePoint 2013 at finding documents in large file repositories. The most
common challenges with corporate search remain unaffected, and content
enrichment is still required for improvements that align user behaviors with
business goals and increase user adoption.
15. DiscoveryOne™
DiscoveryOne Content Enrichment is the
easiest way to improve search, enable
defensible deletion and identify document
security risks. By reading, categorizing
and tagging documents, DiscoveryOne
automatically creates metadata. This
metadata can be used in systems such as
enterprise search, document management,
email and CRM.
CONTENT ENRICHMENT
DiscoveryOne Content Inventory reads file
systems to present an overview of what
they contain. It identifies documents that
are redundant, outdated, trivial, or useful
and worth retaining. Usually, only 25% of the content in a file system
is relevant, valuable and useful; the rest is a candidate for
disposal.
CONTENT INVENTORY
DiscoveryOne Content Intelligence allows
you to mine text for business opportunities
and commercial risks. It extracts and
condenses insights from massive amounts
of text and it does in hours what would
otherwise take a person months. Analysts
can now identify relationships, topics, trends
and sentiment from emails, documents, the
web and social media. Utilise your content to
discover new value as you have never been
able to do before.
CONTENT INTELLIGENCE
For more information please contact us | North America +1 408 663 2328 | Asia Pacific +64 9 950 3299 | info@pingar.com | www.pingar.com