1. Jeff Fried
CTO, BA Insight
SPS Boston
September 2016
Understanding and Applying Cloud Hybrid Search
2. we love hybrid search - it's amazing how fast usage is growing
Jeff Teper @jeffteper
3. SPS Boston 2016 is made possible by our Sponsors
Mindsharp
Contego Cyber Solutions
4. About Jeff Fried
Focused on Search and SharePoint since 2004
Longtime Search Nerd
• CTO, BA Insight
• Senior PM, Microsoft
• VP, FAST
• SVP, LingoMotors
Passionate About
• Search
• SharePoint
• Search-driven applications
• Information Strategy
Blog: BAinsight.com/blog
TechNet Column: “A View from the Crawlspace”
jeff.fried@bainsight.com
12. Approaches to Hybrid SharePoint
Split Workload: different tools in different places
Split User: task uses content or sites across ‘the divide’
Exchange, SharePoint, Skype
OneDrive, Yammer, PowerBI, Delve
Extranet, Mysites, Team Sites, Project Sites
Portals, Intranet, Services/Applications
Links Search
19. Reduce your footprint
Servers required, by volume of content (indexable items):
Volume of Content       Pattern               On-prem Search Farm   Cloud Hybrid Search
0-10 million items      small                 4 App + 2 DB          1 or 2
10-40 million items     medium                12 App + 2 DB         2
40-100 million items    large                 28 App + 4 DB         2
400 million items       XL example (SP2016)   86 App + 4 DB         2 or 3
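The sizing table above can be read as a simple lookup. The sketch below is only an illustration: the function name `sizing_for` is hypothetical, and it assumes each range includes its upper bound (the slide does not say which tier a boundary value such as 10 million belongs to). The 400 million item row is a single example rather than a range, so it is left out of the lookup.

```python
# Tiers copied from the slide: (upper bound in items, pattern,
# on-prem search farm servers, cloud hybrid search servers).
SIZING_TIERS = [
    (10_000_000, "small", "4 App + 2 DB", "1 or 2"),
    (40_000_000, "medium", "12 App + 2 DB", "2"),
    (100_000_000, "large", "28 App + 4 DB", "2"),
]


def sizing_for(indexable_items):
    """Return (pattern, on_prem_servers, hybrid_servers) for an item count.

    Assumption: upper bounds are inclusive. Returns None above 100 million
    items, where the slide gives only one 400 million item example.
    """
    for upper_bound, pattern, on_prem, hybrid in SIZING_TIERS:
        if indexable_items <= upper_bound:
            return pattern, on_prem, hybrid
    return None


print(sizing_for(25_000_000))
```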
20. Benefits of Cloud Hybrid Search
1) Simpler, easier, and less costly to run search
2) Makes finding content easy, wherever the content lives
21. SharePoint Server
(On-premises or Hosted)
Office 365
SharePoint Online Content
OneDrive for Business Content
SharePoint Content
Cloud Hybrid Search
26. Result Sources are Your Friend
SharePoint Online
Custom result source using Local SharePoint results plus
a filter which excludes results from on-premises
TIP: Can be used during validation of hybrid search in
the production tenant.
Result source query:
{searchTerms}
NOT(IsExternalContent:1)
27. Start with “Everything”?
This is the default result source using Local
SharePoint results but it has been renamed to
«Everything» in the Search Navigation
configuration.
SharePoint Online Everything
28. Result Sources are Your Friend
The Support Search vertical only searches sites that
are relevant to the Support team.
It uses Local SharePoint results plus a filter on
which sites to include in the search results
Result source query:
{searchTerms} (
Path:"http://sp2010" OR
Path:"file://fileshare" OR
Path:"http://demohybrid...
/../supportforum")
SharePoint Online Support Search
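Both result source queries shown so far follow the same shape: `{searchTerms}` plus a KQL filter. As a rough sketch, the query text could be assembled like this. The helper names are hypothetical, the paths are the two complete ones from the slide (the third is truncated in the source and is not guessed here), and the sketch assumes paths are quoted with plain double quotes.

```python
def online_only_query():
    """Query for a result source that excludes on-premises results."""
    return "{searchTerms} NOT(IsExternalContent:1)"


def path_scoped_query(paths):
    """Query for a result source limited to the given sites or shares.

    Each path becomes a quoted Path: restriction, combined with OR.
    """
    if not paths:
        raise ValueError("at least one path is required")
    clauses = " OR ".join('Path:"{}"'.format(p) for p in paths)
    return "{searchTerms} (" + clauses + ")"


print(path_scoped_query(["http://sp2010", "file://fileshare"]))
```

The resulting string is what would be pasted into the query transform box of the result source; nothing here calls a SharePoint API.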
29. SharePoint 2016 Hybrid
Cloud Hybrid
Search
User Profiles Following
Extranet
Compliance
(DLP/eDiscovery)
Config
Experience
Built on Search
34. External Content in O365 UX
Unified view across all content
- on-premises and on-line
- inside and outside SharePoint
35. DLP Sensitive Data Search works with hybrid
Search for sensitive data across on-premises and SharePoint Online
All Built-in sensitive types
Identification and export
Extends to data in OneDrive
Sensitive Information type detection through KQL searches
Get instant statistics
Preview & export results
39. Item Limits and Pricing
Licensing: 1M items of external content in index for every 1TB storage in O365
1TB included by default
+ 0.5 GB per licensed O365 user
No limit on number of items from O365 in the index
Default throttling at 20M external items; current threshold at 25M
2000 users x 0.5 GB = 1TB
+ 1TB default = 2 TB total
-> 2M external items indexed
+ Can also buy the “Office 365 Extra File Storage” Add-on
$0.20/GB/Month = $200/TB/Month = $200/M items/Month
50,000 users x 0.5 GB = 25TB
+ 1TB default = 26 TB total
-> 26M external items indexed
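The arithmetic on this slide can be checked with a few lines of code. This is a minimal sketch using only the figures quoted above (1 TB base, 0.5 GB per licensed user, 1M external items per TB, $0.20 per GB per month for extra storage). The function names are hypothetical, and 1 TB is taken as 1000 GB because that is how the slide's own examples round.

```python
GB_PER_TB = 1000  # the slide treats 2000 users x 0.5 GB as 1 TB


def external_item_quota(licensed_users, extra_tb=0.0,
                        base_tb=1.0, gb_per_user=0.5,
                        items_per_tb=1_000_000):
    """Estimate how many external (on-premises) items may be indexed."""
    total_tb = base_tb + licensed_users * gb_per_user / GB_PER_TB + extra_tb
    return round(total_tb * items_per_tb)


def extra_storage_cost_per_month(extra_tb, price_per_gb=0.20):
    """Monthly cost in dollars of the extra file storage add-on."""
    return round(extra_tb * GB_PER_TB * price_per_gb, 2)


print(external_item_quota(2000))         # the 2,000 user example
print(external_item_quota(50_000))       # the 50,000 user example
print(extra_storage_cost_per_month(1))   # one extra TB, i.e. 1M more items
```

Note that the 50,000 user example lands above the 20M default throttling figure quoted on the slide, so the quota alone does not guarantee that all items are accepted.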
44. Customizations with Cloud Hybrid Search
Cloud SSA
SUPPORTED
– Custom IFilter
– BCS connectors
– Partner connectors
NOT SUPPORTED
– Content that requires custom security trimming
SCS/O365
SUPPORTED
– Tenant level schema mapping
– Query rules
– Result sources
NOT SUPPORTED
– Site collection level schema mapping
– Custom security trimming
– Custom entity extraction
– Content enrichment web service
45. Issues with Cloud Hybrid Search (1)
Cloud Hybrid Search "annoyances"
Performance Characteristics
slower query latency for on-prem queries against Cloud SSA
SharePoint Online Limitations
no synonyms
no site-level schema
no full trust code access
Hybrid Administration Weaknesses
clunky metadata mapping
can't remove on-premises search results from Cloud SSA
trickier to test & debug crawls
can't reset index from Cloud SSA
Be aware of these & compensate for them
And it’s getting better:
(Fixed in August PU)
(Semi-addressed in June PU)
46. Should I run index reset?
NO!
DeleteAllCloudHybridSearchContent()
https://blogs.technet.microsoft.com/beyondsharepoint/2016/07/07/cloud-hybrid-search-service-application-removing-items-from-the-office-365-search-index/
47. Issues with Cloud Hybrid Search OOB
Limitations of Cloud SSA
Content Enrichment
no CEWS
no Entity Extraction
Security
no Custom Security Trimming
Can't crawl across Multiple Domains
Can't Crawl SP in Classic Auth Mode
Data Sovereignty
export-restricted content can't be put in O365 index
48. External Content
(on-premises and/or
in the cloud)
SharePoint Server
(On-premises or Hosted)
SPO Content
OneDrive Content
Connectors
SharePoint Content
Connector
Framework
Office 365
AutoClassifier
(app version)
CEWS
Custom
Processing
49. Case study D:
Content Enrichment
Content
Cloud
SSA
Connector
Framework
Indexing
Connectors
Smart
Pipeline
AutoClassifier
Custom Stage A
Custom Stage B
Custom Stage C
50. Cloud Hybrid Search under the covers
Online
On-Prem
Security = identity sync + ACL mapping
Cloud SSA
Cloud SSA
ParseCrawl
SCS
ACL Map Process
Blob
store
queue
52. Mapping of Access Control Lists
Allow: S-1-5-21-1212121212-1212121212-1212 -> Allow: PUID-XXXX-XXXXXXXXXX
• User SIDs are mapped to PUIDs
• Group SIDs are mapped to Object IDs
• «Everyone» and «Authenticated users» are mapped to
«Everyone except external users»
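The three mapping rules above can be sketched as a small lookup function. This is an illustration only, not how the Cloud SSA is implemented: the function name and the two lookup tables are hypothetical stand-ins for whatever identity sync provides, and the SID and PUID values are the placeholders from the slide. The only outside facts used are the standard Windows well-known SIDs for Everyone (S-1-1-0) and Authenticated Users (S-1-5-11).

```python
# Well-known Windows SIDs for "Everyone" and "Authenticated Users".
BROAD_ACCESS_SIDS = {"S-1-1-0", "S-1-5-11"}
EVERYONE_EXCEPT_EXTERNAL = "Everyone except external users"


def map_principal(sid, user_puids, group_object_ids):
    """Map an on-premises SID to its cloud-side principal.

    user_puids:       hypothetical table of user SID -> PUID
    group_object_ids: hypothetical table of group SID -> Object ID
    Returns None when the SID has no synced counterpart.
    """
    if sid in BROAD_ACCESS_SIDS:
        return EVERYONE_EXCEPT_EXTERNAL
    if sid in user_puids:
        return user_puids[sid]
    return group_object_ids.get(sid)


# Placeholder values taken from the slide.
users = {"S-1-5-21-1212121212-1212121212-1212": "PUID-XXXX-XXXXXXXXXX"}
print(map_principal("S-1-5-21-1212121212-1212121212-1212", users, {}))
```

A SID that maps to None is the case to watch for: an entry that was never synced has no cloud identity to grant access to.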
59. Issues with Cloud Hybrid Search OOB
Limitations of Cloud SSA, each followed by the BA Insight Solution
Content Enrichment
no CEWS
no Entity Extraction
BA Insight Solution: Connector Framework, AutoClassifier
Security
no Custom Security Trimming
Can't crawl across Multiple Domains
Can't Crawl SP in Classic Auth Mode
BA Insight Solution: Connector Framework
can 'map down' to AD groups
can 'map across' cross-domain
can crawl and map security
Data Sovereignty
export-restricted content can't be put in O365 index
BA Insight Solution: Federator
61. Key Considerations for Hybrid: Workloads, Environment, Data, Customizations
Availability of features Online versus On-Premises on particular workloads
Significant investments in customization of On-Premises workloads
Concerns over global network performance with remote sites
Regulatory considerations
Manageability concerns
66. Visit extaCloud’s booth for Drink Tickets!
Champions Bar
6pm
LOCATED IN BOSTON MARRIOTT
CAMBRIDGE
2 Cambridge Center
Cambridge, MA 02142
(1 min walk from Microsoft)
http://www.championscambridge.com/
Key take-away
Search is much more than 10 blue links:
it is a powerful way to create a unified view of enterprise information
and to build compelling search-based applications and dashboards
Remote Result Sources
Data Sovereignty Laws
Safe Harbor agreement struck down this fall (US/EC)
New Russian localization law (went into effect in September)
Currently 20+ countries also considering similar privacy laws