Jeff Fried
CTO, BA Insight
SPS Boston
September 2016
Understanding and Applying Cloud Hybrid Search
we love hybrid search - it's amazing how fast usage is growing
Jeff Teper @jeffteper
SPS Boston 2016 is made possible
by our Sponsors
Mindsharp
Contego Cyber
Solutions
Focused on Search and
SharePoint since 2004
Longtime
Search Nerd
• CTO, BA Insight
• Senior PM, Microsoft
• VP, FAST
• SVP, LingoMotors
About Jeff Fried
Passionate About
• Search
• SharePoint
• Search-driven
applications
• Information Strategy
Blog:
BAinsight.com/blog
Technet Column
“A View from the
Crawlspace”
jeff.fried@bainsight.com
About BA Insight


– Connectivity
– Applications
– Classification
– Analytics

KCTCS (background)
Search is not stationary

Why Hybrid SharePoint?
The Evolution of SharePoint:
[Diagram: Experiences, Management, Extensibility; Server and HYBRID]
Team Sites, Portals, Enterprise Content Mngt, BI
Approaches to Hybrid SharePoint
Split Workload: different tools in different places
Split User: task uses content or sites across ‘the divide’
Exchange, SharePoint, Skype
OneDrive, Yammer, PowerBI, Delve
Extranet, Mysites, Team Sites, Project Sites
Portals, Intranet, Services/Applications
Links Search
Search Provides a Unified View
“Classic” Hybrid Search is Federated
not a single result set OOB
SharePoint 2013/2016
Search Architecture
Web Service (CEWS)

Case Study B:
Crawling O365
Cloud Hybrid Search
Reduce your footprint

Volume of Content       Pattern                Servers:               Servers:
(indexable items)                              On-prem Search Farm    Cloud Hybrid Search
0-10 million items      small                  4 App + 2 DB           1 or 2
10-40 million items     medium                 12 App + 2 DB          2
40-100 million items    large                  28 App + 4 DB          2
400 million items       XL example (SP2016)    86 App + 4 DB          2 or 3
Benefits of Cloud Hybrid Search
1) Simpler, easier, and less costly to run search
2) Makes finding content easy, wherever the content lives
SharePoint Server
(On-premises or Hosted)
Office 365
SharePoint Online Content
OneDrive for Business Content
SharePoint Content
Cloud Hybrid Search
Case Study C: Split Users with SharePoint





Setting up Cloud Hybrid Search
The Cloud SSA
Result Sources are Your Friend
SharePoint Online
Custom result source using Local SharePoint results plus
a filter which excludes results from on-premises
TIP: Can be used during validation of hybrid search in
the production tenant.
Result source query:
{searchTerms} NOT(IsExternalContent:1)
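The result source query above is a template: the user's query takes the place of {searchTerms}, and the NOT(IsExternalContent:1) clause filters out the on-premises results. A minimal sketch of that expansion follows, in Python purely for illustration. The function name `expand_online_only` is hypothetical, and the substitution is simulated here; in a real tenant SharePoint performs it.

```python
# Template copied from the slide; {searchTerms} is the query variable
# that the result source replaces with whatever the user typed.
ONLINE_ONLY_TEMPLATE = "{searchTerms} NOT(IsExternalContent:1)"


def expand_online_only(user_query: str) -> str:
    """Simulate how the result source template wraps a user's query.

    Hypothetical helper for illustration only.
    """
    # str.replace is used instead of str.format so that any braces typed
    # by the user are passed through untouched.
    return ONLINE_ONLY_TEMPLATE.replace("{searchTerms}", user_query.strip())


if __name__ == "__main__":
    print(expand_online_only("quarterly report"))
```

Running the same user query with and without the filter is one way to sanity-check, during validation, which results come from on-premises content.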
Start with “Everything”?
This is the default result source using Local SharePoint results, but it has been renamed to “Everything” in the Search Navigation configuration.
SharePoint Online Everything
Result Sources are your friend
The Support Search vertical only searches sites that
are relevant to the Support team.
It uses Local SharePoint results plus a filter on
which sites to include in the search results
Result source query:
{searchTerms} (Path:"http://sp2010" OR Path:"file://fileshare" OR Path:"http://demohybrid.../../supportforum")
SharePoint Online Support Search
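A site-scoped vertical like Support Search boils down to OR-ing a set of Path restrictions after {searchTerms}. The sketch below shows one way to assemble such a query string from a list of locations. The helper name `build_path_scoped_query` is hypothetical, and the two sample paths are the complete ones from the slide; the truncated third path is left out.

```python
from typing import Iterable


def build_path_scoped_query(paths: Iterable[str]) -> str:
    """Build a result source query limited to the given locations.

    Hypothetical helper for illustration; the output is meant to be
    pasted into a result source's query transform.
    """
    # Each location becomes a quoted Path: restriction.
    clauses = [f'Path:"{p}"' for p in paths]
    if not clauses:
        raise ValueError("at least one path is required")
    # {searchTerms} stays literal: SharePoint fills it in at query time.
    return "{searchTerms} (" + " OR ".join(clauses) + ")"


if __name__ == "__main__":
    print(build_path_scoped_query(["http://sp2010", "file://fileshare"]))
```

Keeping the list of locations in one place makes it easier to add or retire a source without hand-editing a long query string.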
SharePoint 2016 Hybrid
Cloud Hybrid
Search
User Profiles Following
Extranet
Compliance (DLP/e-Discovery)
Config
Experience
Built on Search
Cloud SSA Pro/Con
External Content
(on-premises and/or
in the cloud)
SharePoint Server
(On-premises or Hosted)
Office 365
SharePoint Online Content
OneDrive for Business Content
Connectors
SharePoint Content
Adding External Content
Cloud Hybrid Search
Connectors to MANY Enterprise Systems
ERP and Portal Systems


External Content in O365 UX
Unified view across all content
- on-premises and on-line
- inside and outside SharePoint
DLP Sensitive Data Search works with hybrid
Search for sensitive data
across on-premises and
SharePoint Online
All Built-in sensitive types
Identification and export
Extends to data in OneDrive
Sensitive Information type
detection through KQL
searches
Get instant statistics
Preview &
export results
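Sensitive information type detection is expressed as an ordinary KQL restriction, so it can be combined with any other query. The sketch below builds such a restriction. It assumes the `SensitiveType:"<type name>|<min count>.."` form described in Microsoft's documentation for sensitive-data queries; verify the exact syntax and type names against the current documentation before relying on it. The helper name `sensitive_type_query` is hypothetical.

```python
from typing import Optional


def sensitive_type_query(type_name: str, min_count: Optional[int] = None) -> str:
    """Build a KQL restriction for a built-in sensitive information type.

    Hypothetical helper. The "|<min>.." count-range suffix is an
    assumption taken from Microsoft's sensitive-data query documentation.
    """
    value = type_name
    if min_count is not None:
        if min_count < 1:
            raise ValueError("min_count must be at least 1")
        # An open-ended range: at least min_count occurrences in an item.
        value += f"|{min_count}.."
    return f'SensitiveType:"{value}"'


if __name__ == "__main__":
    print(sensitive_type_query("Credit Card Number"))
    print(sensitive_type_query("Credit Card Number", min_count=5))
```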
Current Caveats:
1) don’t see thumbnails, just file icons
2) Have to query for it to show up

Case Study C:
Cloud SSA, external content
Large global company
in materials science
Scaling
Item Limits and Pricing
Licensing: 1M items of external content in index for every 1TB storage in O365
1TB included by default
+ 0.5 GB per licensed O365 user
No limit on number of items from O365 in the index
Default throttling at 20M external items; current threshold at 25M
2000 users x 0.5 GB = 1TB
+ 1TB default = 2 TB total
-> 2M external items indexed
+ Can also buy the “Office 365 Extra File Storage” Add-on
$0.20/GB/Month = $200/TB/Month = $200/M items/Month
50,000 users x 0.5 GB = 25TB
+ 1TB default = 26 TB total
-> 26M external items indexed
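The licensing arithmetic above can be written as a small calculator. This is a sketch under the slide's own assumptions (1 TB included, 0.5 GB per licensed user, 1M external items per TB, 1 TB treated as 1000 GB, add-on storage at $0.20/GB/month); the function names are hypothetical and actual Office 365 terms may differ.

```python
GB_PER_TB = 1000                 # the slide's arithmetic treats 1 TB as 1000 GB
DEFAULT_STORAGE_TB = 1           # included by default
GB_PER_LICENSED_USER = 0.5       # added per licensed O365 user
ITEMS_PER_TB = 1_000_000         # external items allowed per TB of storage
ADDON_USD_PER_GB_MONTH = 0.20    # "Office 365 Extra File Storage" add-on


def external_item_quota(licensed_users: int) -> int:
    """External items that can be indexed for a tenant of this size."""
    storage_tb = DEFAULT_STORAGE_TB + licensed_users * GB_PER_LICENSED_USER / GB_PER_TB
    return round(storage_tb * ITEMS_PER_TB)


def addon_cost_per_month(extra_items: int) -> float:
    """Monthly add-on cost (USD) to cover extra external items."""
    extra_gb = extra_items / ITEMS_PER_TB * GB_PER_TB
    return extra_gb * ADDON_USD_PER_GB_MONTH


if __name__ == "__main__":
    print(external_item_quota(2000))        # the 2000-user example
    print(external_item_quota(50_000))      # the 50,000-user example
    print(addon_cost_per_month(1_000_000))  # one extra million items
```

Comparing the computed quota with the throttling figures quoted above shows whether a tenant is likely to hit the throttle before it exhausts its licensed quota.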
External Content
(on-premises and/or
in the cloud)
Custom
Processing
CEWS
Bottlenecks:
1) Source systems
2) Content Processing
3) Indexer
….
External Content
(on-premises and/or
in the cloud)
Bottlenecks:
1) Uplink
2) Source systems
….
Performance
500K items crawled on an Azure D3
50 DPS -> 100 DPS -> 1 hour
Customizations with Cloud Hybrid Search

Cloud SSA
SUPPORTED
– Custom IFilter
– BCS connectors
– Partner connectors
NOT SUPPORTED
– Content that requires custom security trimming

SCS/O365
SUPPORTED
– Tenant level schema mapping
– Query rules
– Result sources
NOT SUPPORTED
– Site collection level schema mapping
– Custom security trimming
– Custom entity extraction
– Content enrichment web service
Issues with Cloud Hybrid Search (1)
Cloud Hybrid Search "annoyances"
Performance Characteristics
slower query latency for on-prem queries against Cloud SSA
SharePoint Online Limitations
no synonyms
no site-level schema
no full trust code access
Hybrid Administration Weaknesses
clunky metadata mapping
can't remove on-premises search results from Cloud SSA
trickier to test & debug crawls
can't reset index from Cloud SSA
Be aware of these
& compensate for them
(Fixed in August PU)
(Semi-addressed in June PU)
And it’s getting better:
Should I run index reset?
NO!
DeleteAllCloudHybridSearchContent()
https://blogs.technet.microsoft.com/beyondsharepoint/2016/07/07/cloud-hybrid-search-service-application-removing-items-from-the-office-365-search-index/
Issues with Cloud Hybrid Search OOB
Content Enrichment
no CEWS
no Entity Extraction
Security
no Custom Security Trimming
Can't crawl across Multiple Domains
Can't Crawl SP in Classic Auth Mode
Data Sovereignty
export-restricted content
can't be put in O365 index
Limitations of Cloud SSA
External Content
(on-premises and/or
in the cloud)
SharePoint Server
(On-premises or Hosted)
SPO Content
OneDrive Content
Connectors
SharePoint Content
Connector
Framework
Office 365
AutoClassifier
(app version)
CEWS
Custom
Processing
Case study D:
Content Enrichment











Content
Cloud
SSA
Connector
Framework
Indexing
Connectors
Smart
Pipeline
AutoClassifier
Custom
Stage A
Custom
Stage C
Custom
Stage B
Online
On-Prem
Cloud Hybrid Search under the covers
Security = identity sync + ACL mapping
Cloud SSA
Cloud SSA
ParseCrawl
SCS
ACL Map Process
Blob
store
queue
Directory Synchronization
SID S-1-5-21-1212121212-1212121212-1212
jaden@corp.hybridsearch.com
msOnline-OnPremiseSecurityIdentifier S-1-5-21-1212121212-1212121212-1212
PUID PUID-XXXX-XXXXXXXXXX
Mapping of Access Control Lists
Allow: S-1-5-21-1212121212-1212121212-1212 -> Allow: PUID-XXXX-XXXXXXXXXX
• User SIDs are mapped to PUIDs
• Group SIDs are mapped to Object IDs
• “Everyone” and “Authenticated users” are mapped to “Everyone except external users”
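The three mapping rules above can be sketched as a lookup over an item's access control list. This is an illustration only, not the service's implementation. The lookup tables stand in for what directory synchronization provides, the group SID and object ID in the usage example are made up, and dropping unmapped SIDs is an assumption, since the slides do not say what happens to them.

```python
from typing import Dict, List

EVERYONE_SID = "S-1-1-0"              # well-known Windows SID for Everyone
AUTHENTICATED_USERS_SID = "S-1-5-11"  # well-known SID for Authenticated Users
EVERYONE_EXCEPT_EXTERNAL = "Everyone except external users"


def map_acl(
    on_prem_acl: List[str],
    user_puids: Dict[str, str],
    group_object_ids: Dict[str, str],
) -> List[str]:
    """Translate on-premises SIDs into their Office 365 counterparts."""
    mapped = []
    for sid in on_prem_acl:
        if sid in (EVERYONE_SID, AUTHENTICATED_USERS_SID):
            mapped.append(EVERYONE_EXCEPT_EXTERNAL)
        elif sid in user_puids:
            mapped.append(user_puids[sid])        # user SID -> PUID
        elif sid in group_object_ids:
            mapped.append(group_object_ids[sid])  # group SID -> Object ID
        # Assumption: SIDs with no synchronized identity are dropped here.
    return mapped


if __name__ == "__main__":
    users = {"S-1-5-21-1212121212-1212121212-1212": "PUID-XXXX-XXXXXXXXXX"}
    groups = {"S-1-5-21-hypothetical-group": "object-id-hypothetical"}
    acl = ["S-1-5-21-1212121212-1212121212-1212", "S-1-5-11"]
    print(map_acl(acl, users, groups))
```

The sketch makes the dependency explicit: an ACL entry can only be honored online if identity sync has already produced a cloud identity for that SID.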







Case Study E:
Crawling Cross-Domain
A global single index solution
Cloud SSA
Cloud SSA
Cloud SSA
Cloud SSA
Cloud SSA
BUT export-restricted content
can’t be in the global index
Connect & Crawl
Federate
“Classic” Hybrid Search is Federated
not a single result set OOB
BA Insight Federator
Case study F:
Data Sovereignty & Federation









Issues with Cloud Hybrid Search OOB
Limitations of Cloud SSA / BA Insight Solution

Content Enrichment
– Limitations: no CEWS; no Entity Extraction
– BA Insight Solution: Connector Framework; AutoClassifier
Security
– Limitations: no Custom Security Trimming; can't crawl across Multiple Domains; can't crawl SP in Classic Auth Mode
– BA Insight Solution: Connector Framework (can 'map down' to AD groups, can 'map across' cross-domain, can crawl and map security)
Data Sovereignty
– Limitations: export-restricted content can't be put in O365 index
– BA Insight Solution: Federator
Key Considerations for Hybrid:
Workloads, Environment, Data, Customizations
Availability of features Online versus
On-Premises on particular workloads
Significant investments in
customization of On-Premises
workloads
Concerns over global network
performance with remote sites
Regulatory
considerations
Manageability concerns
Contact:
Jeff.Fried@BAinsight.com
www.BAinsight.com
Questions
References
http://technet.microsoft.com/en-us/library/dn197172(v=office.15).aspx
http://sp2013searchtool.codeplex.com/
https://github.com/OfficeDev/PnP-Tools/tree/master/Scripts/SharePoint.Hybrid.Search.Configuration
References - Blogs
http://blogs.msdn.com/b/spses/archive/2015/09/15/cloud-hybrid-search-service-application.aspx
http://blogs.msdn.com/b/spses/archive/2013/10/22/office-365-configure-hybrid-search-with-directory-synchronization.aspx
http://blogs.msdn.com/b/spses/archive/2014/01/05/office-365-configure-hybrid-search-with-directory-synchronization-password-sync-part2.aspx
http://blogs.msdn.com/b/spses/archive/2014/01/07/identity-federation-amp-single-sign-on-deployment-for-hybrid-search-in-office-365-sharepoint-online-part3.aspx
http://blogs.msdn.com/b/spses/archive/2015/03/19/configuring-microsoft-web-application-proxy-server-for-inbound-hybrid-topology-with-office-365-and-microsoft-sharepoint-server-2013-part7.aspx
https://www.youtube.com/watch?v=JWEZx9SHDb0&list=PLvmwu6WYeFdjNbiy7SISJAZd1HjzIJoz5
https://azure.microsoft.com/en-us/documentation/articles/active-directory-aadconnect/
http://blogs.msdn.com/b/spses/archive/2015/09/15/cloud-hybrid-search-service-application.aspx
References – Installing with SP2016


Editor's Notes

  • #14 Key take-away: Search is much more than 10 blue links. It is a powerful way to create a unified view of enterprise information, and compelling search-based applications and dashboards.
  • #17 Remote Result Sources
  • #59 Data Sovereignty Laws: Safe Harbor agreement struck down this fall (US/EC). New Russian localization law (went into effect in September). Currently 20+ countries are also considering similar privacy laws.