Microsoft Office SharePoint Server 2007 Search Workshop 游家德  Jade Yu 敦群數位科技股份有限公司
Microsoft Office SharePoint Server 2007 Enterprise Search Enterprise Search Advanced Training – Building and Implementing ...
Workshop Agenda <ul><li>Day 1 –  Search Overview </li></ul><ul><ul><li>Microsoft Search Landscape </li></ul></ul><ul><ul><...
Assumptions <ul><li>Some knowledge and experience with Search functionality </li></ul><ul><li>Knowledge of the Business Da...
Workshop Objectives <ul><li>Explain how to use the Office 2007 Search functionality </li></ul><ul><li>Interpret the Office...
Module 1 Enterprise Search Overview
Module Agenda <ul><li>Microsoft Enterprise Search </li></ul><ul><li>Client-side Search Platform </li></ul><ul><li>Client-s...
Microsoft Enterprise Search Server-Side Search Platform Line-of-business systems and structured data sources Unstructured ...
Client-Side Search Platform <ul><li>Windows Desktop Search (WDS) for XP and Windows Server </li></ul><ul><ul><li>You must ...
Client-Side Comparison Microsoft ®  Windows ®  Desktop Search Microsoft ®  Windows ®  Vista Rich, actionable interface X X...
Server-Side Search Platforms <ul><li>Windows SharePoint Services v3 </li></ul><ul><ul><li>“ Basic” index / search capabili...
Key Differences Between WSS and MOSS WSS v3 Microsoft Office SharePoint Server (MOSS) Can Index Local SharePoint content X...
MOSS 2007 for Search <ul><li>A Search-only solution for intranets and public-facing Web (Internet) sites </li></ul><ul><li...
MOSS 2007 and MOSS FS Usage Scenarios Description Scenario MOSS 2007 An information management solution that includes ente...
MOSS 2007 for Search and MOSS 2007 Features Comparison Features MOSS 2007  for Search  (Standard Edition) MOSS 2007  for S...
Questions?
Module 2 Microsoft Office SharePoint Search 2007 –  Walkthrough
Module Agenda <ul><li>End-User Improvements </li></ul><ul><ul><li>Relevance </li></ul></ul><ul><ul><li>People and Expertis...
End-User Improvements Relevance <ul><li>Dramatically improved relevance is the top goal of this release </li></ul><ul><li>...
End-User Improvements  People and Expertise <ul><li>Bring people into the Search experience </li></ul><ul><ul><li>Getting ...
End-User Improvements  Business Data Search <ul><li>Information in Line of Business (LOB) systems is often hard to access ...
<ul><li>Address SPS 2003 administration user interface pain points </li></ul><ul><li>Unify WSS and MOSS search </li></ul><...
<ul><li>Streamlined experience and more control </li></ul><ul><li>One index per shared service; no need to worry about man...
Administration Improvements   Security <ul><li>Query-time security trimming in SPS 2003 </li></ul><ul><ul><li>File shares,...
Administration Improvements   Customization <ul><li>Search in  every  company is different </li></ul><ul><ul><li>Different...
Administration Improvements   Query Reporting <ul><li>Best way to improve Search is to understand current usage </li></ul>...
Performance Improvements <ul><li>Key new features make the crawls faster so the content is fresher </li></ul><ul><ul><li>M...
Demo – MOSS 2007 <ul><li>Goal of the demo is a high-level overview with a focus on: </li></ul><ul><li>Search boxes and advanced...
<ul><li>Questions? </li></ul>
Module 3  Architecture and Deployment Scenarios
Agenda <ul><li>Key concepts  </li></ul><ul><ul><li>MS Search Architecture </li></ul></ul><ul><ul><li>Deployment Building B...
Microsoft Search Architecture Notes Query Engine Index Engine Protocol Handlers iFilters Content Index OOB Search UI/Custo...
SharePoint Search Topologies: Deployment Building Blocks <ul><li>Physical building blocks:  </li></ul><ul><ul><li>Web Fron...
WSS v3 Search Topology Basics <ul><li>WSS uses both server roles on the same machine (“Search Server”) </li></ul><ul><ul><...
Sample  WSS  v3  Topology
WSS v3 - Topology Considerations <ul><li>Scale out just like WSS </li></ul><ul><li>Add content databases for content </li>...
<ul><li>Adds new functionality over base WSS Search </li></ul><ul><li>Application server roles can be separated: </li></ul...
MOSS 2007 Search Topology Basics (cont) <ul><li>Query role can be assigned to one or more servers </li></ul><ul><li>Indexi...
Sample  MOSS  2007  Topology Query servers separated from indexer Indexer crawling local + external content
MOSS 2007 – Search Topology Considerations <ul><li>Indexing operations are CPU intensive </li></ul><ul><li>Dedicated query...
MOSS 2007 – Search Topology Considerations (cont) <ul><li>Each farm can index up to 50 million items </li></ul><ul><li>Bey...
Shared Search Service <ul><li>Shared Service Provider (SSP) – grouped high-value, resource intensive services </li></ul><u...
Search Shared Service Search service People service … Shared Service  Provider (SSP) http://sales http://finance http://hr...
Search Shared Service Search service People service … Shared Service  Provider http://sales http://finance http://hr spsit...
Common Search Topologies <ul><li>Deployment scenarios  </li></ul><ul><ul><li>Small  </li></ul></ul><ul><ul><li>Medium  </l...
Small Search Deployment <ul><li>WSS </li></ul><ul><ul><li>Single Search Server with both roles </li></ul></ul><ul><ul><ul>...
Medium Search Deployment <ul><li>WSS </li></ul><ul><ul><li>Multiple Search Servers with the following limitations </li></u...
Large Search Deployment <ul><li>WSS </li></ul><ul><ul><li>Multiple Search Servers with the following limitations </li></ul...
Geographically Distributed Sites MOSS Search Deployment Other Locations Corp. Sites Search service  People service  --- Sh...
Deployment Scenarios <ul><li>Collaboration Environment (WSS v3) </li></ul><ul><li>Enterprise Portal (MOSS 2007) </li></ul>...
Collaboration Environment Scenario  WSS v3 <ul><li>iTech – startup software consulting firm </li></ul><ul><li>Large number...
Collaboration Environment Scenario  WSS v3 (cont) <ul><li>WSS farm with single IIS virtual server http://team  </li></ul><...
Collaboration Environment Scenario  WSS v3 (cont) <ul><ul><li>Search – core feature of WSS </li></ul></ul><ul><ul><li>Cont...
Enterprise Portal Scenario MOSS 2007 <ul><li>iTech – growing company with growing needs </li></ul><ul><li>iTech – needs a ...
Enterprise Portal Scenario MOSS 2007 (cont) <ul><li>Upgrade from WSS to MOSS </li></ul><ul><li>Search is a shared service...
Enterprise Portal Scenario MOSS 2007 (cont) Farm http://team team1 team2 spweb spweb Virtual Server team3 spweb spweb SPSi...
Enterprise Portal Scenario MOSS 2007 (cont) <ul><li>Topology with indexer and query servers </li></ul><ul><li>Load balance...
Internet Facing Portal Scenario - MOSS 2007 <ul><li>Internet facing site for customers – www.itech.com </li></ul><ul><li>H...
Internet Facing Portal Scenario - MOSS 2007 (cont) <ul><li>Two separate farms:  Production and test farms </li></ul><ul><l...
Internet Facing Portal Scenario - MOSS 2007 (cont) www.itech.com Services Customers spweb spweb Virtual Server About  itec...
Questions?
Module 4   Crawl and Query Processes
Agenda <ul><li>The Crawl Process </li></ul><ul><ul><li>Crawl Walkthrough </li></ul></ul><ul><ul><li>Index Propagation </li...
Crawl Walkthrough <ul><li>When a crawl is requested . . . </li></ul><ul><li>Indexer grabs the start address of content sou...
Crawl Walkthrough (cont) <ul><li>Protocol handler invokes IFilter associated with content node type </li></ul><ul><li>IFil...
Crawl Overview Diagram
Index Propagation Farm Sample Indexer Load Balancer Crawling Web  front  ends Index Propagation Query Servers User Requests
<ul><li>Propagation will occur only when the index and search components are on separate servers </li></ul><ul><li>Continu...
Index Propagation <ul><li>Index File Location </li></ul><ul><ul><li>Set in Office SharePoint Server Search Service setting...
The Query Process <ul><li>Query Initiation and Results Presentation </li></ul><ul><li>Query Execution </li></ul><ul><li>Qu...
Query Initiation and Results Presentation <ul><li>Typically, provided by the WSS / MOSS WFE role, through OOB WebParts </l...
Query Execution <ul><li>Always provided by a server tagged with the Query role </li></ul><ul><li>Consumes a query request ...
Query Walkthrough (cont) <ul><li>When a query is requested . . . </li></ul><ul><li>Query terms collected </li></ul><ul><li...
Questions?
Module 5 The Search End-User Experience
Module Agenda <ul><li>Introducing the Search End-User Experience </li></ul><ul><li>Customizing Search </li></ul><ul><li>Pe...
Introducing the Search End-User Experience <ul><li>Complete Search experience </li></ul><ul><li>Search is everywhere </li>...
Introducing the End-User Search Experience <ul><li>Search Boxes </li></ul><ul><li>Search Center </li></ul><ul><li>Search W...
Query Results Http: Get Http: Post Search Box XML Web Parts XSL Transformation Query OM  Advanced Search Hidden Object XML...
Search WebParts <ul><li>Nine  Standard Search Web Parts  </li></ul><ul><ul><li>Search Box </li></ul></ul><ul><ul><li>Core ...
Result page infrastructure  <ul><li>Data shared through hidden object </li></ul><ul><ul><li>All Search Web Parts within th...
Advanced Search  <ul><li>Allows power searchers to exercise greater control on how they query </li></ul><ul><li>A link fro...
Customizing the End User Experience <ul><li>Search in every   company is different </li></ul><ul><ul><li>Different metadat...
Customization Choices <ul><li>Search Center </li></ul><ul><ul><li>Simple Site with few pages </li></ul></ul><ul><ul><ul><l...
Customizing Search <ul><li>Adding Search Center Tabs </li></ul><ul><li>Customizing Search Web Parts </li></ul><ul><li>Cust...
People Search <ul><li>Bring people into the search experience </li></ul><ul><ul><li>Getting your job done means working wi...
People Search <ul><li>People Results </li></ul><ul><li>Customizing Results </li></ul>
Refine Your People Search  <ul><li>Refine by Job Title </li></ul><ul><ul><li>Searches for the selected Job Title </li></ul...
People Search Web Parts  <ul><li>Two OOB People Search Web Parts  </li></ul><ul><ul><li>People Search Box </li></ul></ul><...
People Results Search Web Parts <ul><li>Web Part properties such as: </li></ul><ul><li>(similar to Core Search WP) </li></...
Social Distance Colleagues  <ul><li>Suggested Colleague list members are mined from: </li></ul><ul><ul><li>Microsoft Windo...
Questions?
Module 6 Search Object Model
Workshop Agenda <ul><li>Scenarios for Extending Search </li></ul><ul><li>Query Syntax </li></ul><ul><li>Query Object Model...
Topic:  Scenarios for Extending Search <ul><li>In this first section we will examine 2 scenarios for extending Search: </l...
Integrate with MOSS Search Center <ul><li>Use cases: </li></ul><ul><li>Use Search URL request parameters to add predefined...
Integrate MOSS Search into 3rd Party Sites and Applications <ul><li>Build 3rd party user interface which leverages MOSS Se...
Topic: Query Syntax <ul><li>In this section we will examine the three types of search syntax for building search queries s...
Keyword Syntax <ul><li>Used in standard Search Box </li></ul><ul><li>New keyword syntax </li></ul><ul><li>Simple and easy ...
<ul><li>Built-in support for using include and exclude terms </li></ul><ul><ul><li>Look for term bike, but not related to ...
<ul><li>Narrowing results by default </li></ul><ul><ul><li>Searches using “AND” between query terms </li></ul></ul><ul><li...
Keyword Syntax Property restrictions <ul><li>Supports property:value as part of the keyword string </li></ul><ul><li>Can u...
<ul><li>No wildcard support in Keyword Syntax </li></ul><ul><ul><li>Search box does not do wildcard searching. The followi...
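The keyword syntax slides above (include and exclude terms, property:value restrictions, no wildcards) can be illustrated with a small string builder. This is only a sketch: the `+` and `-` prefixes, the quoting of multi-word values, the function name `build_keyword_query`, and the sample terms are assumptions that do not appear in the visible slide text.

```python
def build_keyword_query(terms=(), include=(), exclude=(), properties=None):
    """Assemble a keyword query string from plain, included, and excluded terms."""

    def quote(text):
        # The slides state that keyword syntax has no wildcard support.
        if "*" in text:
            raise ValueError("keyword syntax has no wildcard support: " + text)
        # Assumption: multi-word values are wrapped in double quotes.
        return '"' + text + '"' if " " in text else text

    parts = [quote(term) for term in terms]
    parts += ["+" + quote(term) for term in include]   # assumed include prefix
    parts += ["-" + quote(term) for term in exclude]   # assumed exclude prefix
    for name, value in (properties or {}).items():
        parts.append(name + ":" + quote(value))        # property:value restriction
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical example: find "bike", exclude a term, restrict by author.
    print(build_keyword_query(terms=["bike"], exclude=["mountain"],
                              properties={"author": "Jane Doe"}))
```

Because terms are joined with spaces, the resulting string narrows results, consistent with the "AND between query terms" behaviour described in the slides.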
URL Syntax <ul><li>Use Case </li></ul><ul><ul><li>Launching a URL in custom application </li></ul></ul><ul><ul><li>Save Se...
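To make the URL syntax use case concrete, the sketch below builds a Search Center results URL from a keyword string. The parameter names `k` (keywords) and `s` (scope), the results page address, and the function name are assumptions for illustration; check them against the actual Search Center deployment.

```python
from urllib.parse import urlencode


def build_results_url(results_page, keywords, scope=None):
    """Return a results-page URL carrying the query as request parameters."""
    params = {"k": keywords}          # assumed keyword parameter name
    if scope:
        params["s"] = scope           # assumed scope parameter name
    return results_page + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical portal address; a custom application could launch this URL
    # or store it as a saved search.
    print(build_results_url("http://portal/searchcenter/Pages/results.aspx",
                            "bike helmet", scope="All Sites"))
```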
SQL Syntax Overview <ul><li>SQL Syntax offers: </li></ul><ul><li>Consistent SQL across enterprise and desktop </li></ul><u...
<ul><li>Write complex Boolean searches using AND, OR, NOT </li></ul>SQL Syntax Complex Boolean Searches
<ul><li>Returns documents for which the following is true: </li></ul><ul><ul><li>Document contains all the search terms in...
<ul><li>Get wildcard support using the CONTAINS predicate: </li></ul><ul><ul><li>Wildcard: Words or phrases with an asteri...
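A prefix wildcard search with the CONTAINS predicate might be assembled as follows. This is a hedged sketch: the selected columns, the `FROM Scope()` clause, the `ORDER BY Rank DESC` ordering, and the helper name are assumptions rather than text taken from the slides.

```python
def build_prefix_query(prefix, columns=("Title", "Path", "Rank")):
    """Build a SQL-syntax search query matching words that start with `prefix`."""
    if not prefix or "*" in prefix or '"' in prefix:
        raise ValueError("prefix must be a plain word without * or double quotes")
    escaped = prefix.replace("'", "''")   # double single quotes inside the SQL literal
    # The asterisk goes at the end of the quoted word: "prefix*"
    predicate = "CONTAINS('\"" + escaped + "*\"')"
    return ("SELECT " + ", ".join(columns) +
            " FROM Scope() WHERE " + predicate +
            " ORDER BY Rank DESC")


if __name__ == "__main__":
    print(build_prefix_query("share"))
```

A string like this would be passed to the query engine as SQL syntax rather than typed into the standard Search Box, which does not support wildcards.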
<ul><li>Removed in MOSS 2007  </li></ul><ul><li>Query property weights </li></ul><ul><li>UNION ALL  </li></ul><ul><li>MATC...
Topic:  Query Object Model <ul><li>In this section we will examine: </li></ul><ul><li>The Query Object Model </li></ul><ul...
Query Object Model <ul><li>New object model </li></ul><ul><li>Use the query object model to: </li></ul><ul><ul><li>Build c...
Query Object Model Features <ul><li>Managed code API </li></ul><ul><li>Single request – multiple results </li></ul><ul><li...
Query Object Path Query OM Input Output SQL Query Optional Parameters Query Engine ResultTableCollection ResultTable: IDat...
Query Web Service Use and Methods <ul><li>Use Case </li></ul><ul><ul><li>Leverage Search in remote sites or application  <...
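A remote application calling the Query Web Service sends its query as an XML request. The sketch below builds such a request with the Python standard library only. The element names (`QueryPacket`, `Query`, `Context`, `QueryText`), the namespace, and the `type` values are assumptions that should be verified against the service's published schema; none of them appear in the visible slide text.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the query request schema.
QUERY_NS = "urn:Microsoft.Search.Query"


def build_query_packet(query_text, query_type="STRING", language="en-US"):
    """Return an XML query request as a string.

    query_type is assumed to distinguish keyword queries ("STRING")
    from SQL-syntax queries.
    """
    ET.register_namespace("", QUERY_NS)
    packet = ET.Element("{%s}QueryPacket" % QUERY_NS)
    query = ET.SubElement(packet, "{%s}Query" % QUERY_NS)
    context = ET.SubElement(query, "{%s}Context" % QUERY_NS)
    text = ET.SubElement(context, "{%s}QueryText" % QUERY_NS,
                         {"language": language, "type": query_type})
    text.text = query_text
    return ET.tostring(packet, encoding="unicode")


if __name__ == "__main__":
    print(build_query_packet("bike"))
```

The resulting string would be posted to the web service by whatever SOAP client the remote application already uses; that transport step is deliberately left out here.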
Query Web Service Search Center Features <ul><li>Standard Search Center features not built into the Web service </li></ul>...
Questions?
Module 7 Administration
Module Agenda <ul><li>Administrative Architecture </li></ul><ul><ul><li>Farm Administration </li></ul></ul><ul><ul><li>SSP...
Administrative Architecture <ul><li>Shared Services </li></ul><ul><li>Business unit IT </li></ul><ul><li>Service-level  co...
Farm Management (IT Administrators)
SharePoint 3.0 Central Administration <ul><li>Common Tasks </li></ul><ul><ul><li>Manage Topology and Services </li></ul></...
Using Central Admin
Operations – Topology and Services Servers in Farm / Services on Server <ul><li>Query Server(s) </li></ul><ul><ul><li>Offi...
Operations – Backup and Restore <ul><li>Perform a backup </li></ul><ul><li>Restore from backup </li></ul>
Operations – Global Configuration <ul><li>Timer Job Definitions </li></ul><ul><ul><li>SharePoint Services Search Refresh <...
Search Application Management <ul><li>Manage Search Service </li></ul><ul><ul><li>Farm-level Search settings </li></ul></u...
Crawler Impact Rules <ul><li>Configured through Central Administration </li></ul><ul><li>Allows “throttling” of the indexe...
Crawler Impact Rules (cont) Use . . . To . . . * as the site name Apply the rule to all sites *.* as the site name Apply t...
Shared Services Provider (SSP) Management (SSP Administrators) (Content Oriented Administration)
Common Tasks <ul><li>Configure Search Settings  </li></ul><ul><ul><li>Content Sources </li></ul></ul><ul><ul><li>Crawl Set...
Content Sources <ul><li>Represent an arbitrary container of information </li></ul><ul><li>Require at least one start addre...
SharePoint Content Source <ul><li>Includes SPS 2003, MOSS 2007, WSS v2, and WSS v3 sites </li></ul><ul><li>Can limit ...
Web Site Content Source <ul><li>Any content source available over HTTP or HTTPS </li></ul><ul><li>If a SharePoint URL is p...
Web Site Content Source  (cont) <ul><li>Security information around content is not included in index </li></ul><ul><li>Dyn...
File Shares Content Source <ul><li>Any content visible over a Windows server shared folder </li></ul><ul><li>Some non-Wind...
Exchange Public Folders Content Source <ul><li>Allows the indexer to crawl a public folder that exists on Exchange </li></...
Business Data Content Source <ul><li>Allows the indexer to crawl metadata exposed through the Business Data Catalog </li><...
Lotus Notes Content Source
Crawling Schedules <ul><li>Allow administrator to indicate the frequency at which a content source will be re-crawled (dai...
Maximum File Size <ul><li>Default file size limit is 16MB </li></ul><ul><li>To change the limit, you must add in the regis...
Crawl Rules <ul><li>Define exceptions to the “typical” crawl process </li></ul><ul><ul><li>Addresses can be pattern matche...
Search Result Removal (From Live Index) <ul><li>Typically used when someone discovers something in the index that shouldn’...
Default Content Access Account <ul><li>Account used for crawling, by default </li></ul><ul><li>Can be overridden in the Cr...
Metadata Property Mappings
Server Name Mapping <ul><li>Override how MOSS displays search results </li></ul><ul><li>Hide file path </li></ul><ul><li>S...
Search-based Alerts <ul><li>Can be Activated / Deactivated </li></ul><ul><li>Deactivated after a reset of crawled content ...
Reset Crawled Content <ul><li>Powerful action! </li></ul><ul><li>Will delete the content index! </li></ul><ul><li>Search R...
Specify Authoritative Pages <ul><li>Helps prioritize Search Results - a way to influence relevance results that are linked...
Scopes <ul><li>Scopes are filters applied to search results to narrow the results of a search query </li></ul><ul><li>Type...
Site Collection Management (Site Collection Administrators)  (Application Administrators)
Site Collection Administration Options <ul><li>Common Tasks </li></ul><ul><ul><li>Search Settings </li></ul></ul><ul><ul><...
Search Settings <ul><li>Two Options </li></ul><ul><ul><li>Use the Search Center and custom scopes in the dropdown </li></u...
Site Level Scopes <ul><li>Site Level Scopes display all scopes associated with a Site Collection </li></ul><ul><li>Display...
Keywords and Best Bets <ul><li>Prominently present editorially selected search results </li></ul><ul><li>Keywords: Glossar...
Search Settings for Fields - NoCrawl <ul><li>Set a NoCrawl  attribute on one or more columns within the site collection </...
Search Visibility <ul><li>Site level </li></ul><ul><ul><li>Allow or deny the site to appear in search results. </li></ul><...
Search Usage Reports
Benefits of Search Queries and Results Reporting <ul><li>Allows Site and SSP Administrators to: </li></ul><ul><ul><li>Have...
To Improve the Overall Search Experience One Must… <ul><li>Best way to improve search is to  understand visitors’ current ...
Reporting Tools <ul><li>Two sets of reports </li></ul><ul><ul><li>Search Query Reports </li></ul></ul><ul><ul><li>Search R...
Reporting Tools <ul><li>At the SSP level </li></ul><ul><li>For enterprise content oriented administrators </li></ul>
Reporting Tools <ul><li>At the Site Collection level </li></ul><ul><li>For Site Collection administrators </li></ul>
Search Query Reporting – SSP <ul><li>Tracks Queries that users issued for  all sites managed by this SSP </li></ul><ul><li...
Search Query Reporting – Site Collection <ul><li>Tracks Queries issued  within this Site Collection </li></ul><ul><li>Four...
Search Results Reporting – SSP <ul><li>Tracks Result Click Selections by users within the sites managed by this SSP </li><...
Search Results Reporting – Site Collection <ul><li>Tracks Result Click Selections by users for this Site Collection </li><...
Exporting Results <ul><li>Export data for extended reporting in Excel and/or Excel Services </li></ul>
Questions?
Module 8  Performance, Scalability, and  Capacity Planning
Module Agenda <ul><li>Introduction </li></ul><ul><li>Search Capacity Planning in SPS 2003 </li></ul><ul><li>MOSS 2007 Sear...
MOSS 2007 Search Capacity Planning <ul><li>Improvement highlights </li></ul><ul><ul><li>Topology restrictions removed </li...
Topology <ul><li>Deployment options </li></ul><ul><ul><li>Collapse index and query services on the same server </li></ul><...
Topology (cont) <ul><li>Topology restrictions from v2 removed </li></ul><ul><ul><li>Can mix indexer/search roles </li></ul...
Topology (cont) <ul><li>Topology Scaling Recommendations (for Search): </li></ul><ul><ul><li>Query servers: 8 per farm </l...
MOSS 2007 Search Topology Indexer Load Balancer Propagation of indexes Content databases External content Web  front  ends...
Querying <ul><li>Performance parameters </li></ul><ul><li>Scaling factors </li></ul>
Querying – Performance Parameters <ul><li>The network always affects the query performance that end users experience: </li...
Querying – Performance Parameters
Querying – Performance Parameters <ul><li>Query server memory: </li></ul><ul><ul><li>The more memory is available, the les...
Querying – Scaling Factors <ul><li>Processor architecture </li></ul><ul><ul><li>Use 64-bit servers </li></ul></ul><ul><li>...
Indexing <ul><li>Planning </li></ul><ul><li>Performance optimization </li></ul><ul><li>Storage </li></ul><ul><li>Limitatio...
Indexing Planning <ul><li>Customer environment </li></ul><ul><ul><li>Number of users </li></ul></ul><ul><ul><li>Network an...
Indexing Planning (cont) <ul><li>Corpus definition: </li></ul><ul><ul><li>A corpus is defined as the sum of all content th...
Indexing Planning (cont) <ul><li>For each content source estimate: </li></ul><ul><ul><li>Number of items </li></ul></ul><u...
Indexing -  Performance   Optimization <ul><li>Use dedicated front-end for best indexing performance </li></ul><ul><ul><li...
Indexing -  Performance   Optimization <ul><li>Index server CPU: </li></ul><ul><ul><li>As many processors are available as...
Index Storage <ul><li>Planning index storage as ratio of corpus </li></ul><ul><li>Sizing depends on content in corpus </li...
Index Storage (cont) <ul><li>Index / Query Server disk space requirements: </li></ul><ul><ul><li>Index catalog size is nor...
Index Storage (cont) <ul><li>Search database </li></ul><ul><ul><li>Contains metadata, ACLs, hit highlighting, crawl histor...
Index Capacity Limitations <ul><li>Supported limit for a single index server is 50 million documents </li></ul><ul><ul><li...
Index Scaling <ul><li>First scale up (recommended) </li></ul><ul><ul><li>Optimal ranking and user experience </li></ul></u...
Index Scaling <ul><li>Scale out </li></ul><ul><ul><li>Add multiple SSPs each crawling unique parts of the corpus </li></ul...
Test Environment <ul><li>Establish a starting point topology </li></ul><ul><li>Use monitoring to establish actual performa...
Real World Experiences <ul><li>Microsoft Intranet </li></ul><ul><li>Microsoft Technology Center PoC </li></ul>
Microsoft Intranet <ul><li>Environment </li></ul><ul><ul><li>Estimate of indexed content  Around 12 TB in SharePoint Conte...
Microsoft Technology Center PoC <ul><li>Objectives </li></ul><ul><ul><li>Indexing large numbers of secure files on file sh...
Topology Indexed corpus Search db Index catalog Propagated catalog 1TB 23GB 25GB
Results <ul><li>For the biggest test run, which included indexing 2.4 million secure files, here are the key metrics: </li...
Results (cont)
Summary of Known Limits and Restrictions <ul><li>Tested recommendation of 50 million items per farm </li></ul><ul><li>Hard...
Capacity Planning References <ul><li>Planning for performance and capacity: </li></ul><ul><ul><li>http://technet2.microsof...
Questions?
 

Microsoft Enterprise Search using SharePoint


Published in: Business
  • Microsoft Enterprise Search using SharePoint

    1. Microsoft Office SharePoint Server 2007 Search Workshop 游家德 Jade Yu 敦群數位科技股份有限公司
    2. Microsoft Office SharePoint Server 2007 Enterprise Search: Enterprise Search Advanced Training – Building and Implementing Enterprise Search Solutions
    3. Workshop Agenda
        • Day 1 – Search Overview
            • Microsoft Search Landscape
            • MOSS 2007 Walkthrough
            • Architecture and Deployment Scenarios
            • Crawl and Query Processes
            • Search Object Model
        • Day 2 – Customization and Management
            • Search Object Model
            • Business Data Catalog (BDC) Search
            • Extensibility and Integration
            • Administration
            • Capacity Planning
    4. Assumptions
        • Some knowledge and experience with Search functionality
        • Knowledge of the Business Data Catalog in general (new in Office 2007 System)
        • Office 2007 System Content Creation/Contribution experience
        • Knowledge of Web site creation and management in general
        • Knowledge of MS platform (Windows 2003 Server, ADS, IIS, SQL 2005 & Office Clients)
        • Knowledge of ASP.NET 2.0 and XSLT
    5. Workshop Objectives
        • Explain how to use the Office 2007 Search functionality
        • Interpret the Office 2007 System Search Terminology
        • Describe the rich feature set of Office 2007 System Search - Servers and Clients
        • Describe how to use the platform well enough to use its APIs to extend the products
        • Explain how Office 2007 System Search will solve enterprise business requirements
    6. Module 1 Enterprise Search Overview
    7. Module Agenda
        • Microsoft Enterprise Search
        • Client-side Search Platform
        • Client-side Comparison
        • Server-side Search Platform
        • Key Differences between WSS and MOSS
        • MOSS 2007 for Search Key Features
        • MOSS 2007 for Search and MOSS 2007 Comparison
    8. Microsoft Enterprise Search
        • Server-Side Search Platform
            • Line-of-business systems and structured data sources
            • Unstructured information
            • People, expertise
            • External Web sites
        • Client-Side Search Platform
            • E-mail messages, appointments, and instant messaging
            • Documents, programs, and media
    9. Client-Side Search Platform
        • Windows Desktop Search (WDS) for XP and Windows Server
            • You must install an additional program for Search
        • Vista – Integrated Desktop Search
            • Integration in the Operating System
            • Ability to search nearly anywhere
            • Virtual Folders
    10. Client-Side Comparison (columns: Microsoft® Windows® Desktop Search | Microsoft® Windows® Vista)
        • Rich, actionable interface: X | X
        • Integration with Microsoft Outlook: X | X
        • Polite indexing (pauses when computer is in use): X | X
        • Live icons & document previews: X | X
        • Advanced Search integrated into the Operating System: X
        • Save searches to search folders: X
        • Instant Search: X (on taskbar) | X (from start menu)
    11. Server-Side Search Platforms
        • Windows SharePoint Services v3
            • “Basic” index / search capabilities to support WSS collaboration and document management
        • Microsoft Office SharePoint Server (MOSS) 2007
            • Enterprise search and indexing features “unlocked”
            • Several SKUs to support different scenarios and customer needs
    12. Key Differences Between WSS and MOSS (columns: WSS v3 | Microsoft Office SharePoint Server (MOSS))
        • Can Index: Local SharePoint content X SharePoint sites / collections, Exchange Public Folders, File Shares, Web Content, Lotus Notes, LOB Apps, and others . . .
        • Rich, relevant results: X
        • Alerts, RSS, Did you mean, Duplicate collapsing: X
        • Scopes, Managed Properties: X
        • Best Bets, Result Removal, Query Reports: X
        • Search Center Tabs: X
        • BDC Search: X
        • APIs provided: Query | Query + Admin
    13. MOSS 2007 for Search
        • A Search-only solution for intranets and public-facing Web (Internet) sites
        • Two versions
            • Standard Edition limited to 500,000 docs
            • Enterprise Edition with unlimited docs
        • Includes
            • Out of the box search for file shares, Web sites, SharePoint sites, Exchange Public Folders, Lotus Notes databases
            • Extensibility to 3rd party document repositories and file types
    14. MOSS 2007 and MOSS FS Usage Scenarios
        • MOSS 2007
            • Description: An information management solution that includes enterprise search integrated with portal, collaboration, web content management, ECM, forms, and BI functionalities
            • Scenario: Customers who desire search as an integrated part of a broader information management solution
        • MOSS FS
            • Description: A core search-only solution for intranet and public-facing web sites
            • Scenario:
                • Customers who require a core search-only product that can be integrated into their existing infrastructure
                • Customers who require search functionality for their public-facing web (Internet) sites
    15. MOSS 2007 for Search and MOSS 2007 Features Comparison (columns: MOSS 2007 for Search (Standard Edition) | MOSS 2007 for Search (Enterprise Edition) | MOSS 2007 (Standard CAL) | MOSS 2007 (Standard plus Enterprise CAL))
        • File shares: X | X | X | X
        • Web sites: X | X | X | X
        • SharePoint sites: X | X | X | X
        • Microsoft Exchange Server public folders: X | X | X | X
        • Lotus Notes databases: X | X | X | X
        • Third party document repositories [1]: X | X | X | X
        • Secure content access control: X | X | X | X
        • Enhanced Search Center user interface: X X
        • Search for people and expertise: X X
        • Business Data Catalog (BDC): X
        • Search structured data sources: X
        • Document limit: 500,000 | No Limit [2] | No Limit [2] | No Limit [2]
    16. Questions?
    17. Module 2 Microsoft Office SharePoint Search 2007 – Walkthrough
    18. Module Agenda
        • End-User Improvements
            • Relevance
            • People and Expertise
            • Business Data Search
        • Administration Improvements
            • Design Goals
            • Indexing Management
            • Security
            • Customization
            • Query Reporting
        • Performance Improvements
        • Demo MOSS 2007
    19. End-User Improvements: Relevance
        • Dramatically improved relevance is the top goal of this release
        • New ingredients added including:
            • Anchor text
            • Click distance
            • URL depth
            • Missing metadata creation
        • Result is noticeably more relevant search
            • 100% better on all queries
            • 500% better on common queries
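Two of the relevance ingredients listed above, click distance and URL depth, are easy to illustrate. The sketch below is not the product's ranking algorithm, which the slides do not describe; it only shows one common way to compute the two quantities. The link graph, the notion of "start pages", and both function names are hypothetical.

```python
from collections import deque
from urllib.parse import urlparse


def url_depth(url):
    """Count the non-empty path segments of a URL (0 for a site root)."""
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])


def click_distance(links, start_pages):
    """Breadth-first search: fewest link clicks from any start page to each reachable page.

    links maps a page URL to the list of URLs it links to.
    """
    distance = {page: 0 for page in start_pages}
    queue = deque(start_pages)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in distance:
                distance[target] = distance[page] + 1
                queue.append(target)
    return distance


if __name__ == "__main__":
    # Hypothetical intranet link graph.
    links = {
        "http://portal/": ["http://portal/hr/", "http://portal/sales/"],
        "http://portal/hr/": ["http://portal/hr/policies/leave.docx"],
    }
    print(click_distance(links, ["http://portal/"]))
    print(url_depth("http://portal/hr/policies/leave.docx"))
```

Pages that are few clicks from a trusted starting point, or that sit at a shallow URL depth, are plausible candidates for a ranking boost; how much weight each signal gets is outside what the slides state.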
    20. End-User Improvements: People and Expertise
        • Bring people into the Search experience
            • Getting your job done means working with the right people
            • Find subject-matter experts based on their knowledge and contacts
        • Numerous improvements over SPS 2003
            • Index any LDAP V3 directory
            • Dedicated tab for finding people
            • Results grouped by “social distance” to you
    21. End-User Improvements: Business Data Search
        • Information in Line of Business (LOB) systems is often hard to access
        • MOSS 2007 can bring that data to your users
            • Data is accessed through the Business Data Catalog
            • Exposed to many features in SharePoint
        • Search can easily index the data
            • No need to write code
            • Highly customizable results
            • Integrated with scopes and Search Center
    22. Administration Improvements: Design Goals
        • Address SPS 2003 administration user interface pain points
        • Unify WSS and MOSS search
        • Enable full programmability via the object model
        • Even better scalability and performance
    23. Administration Improvements: Indexing Management
        • Streamlined experience and more control
        • One index per shared service; no need to worry about managing discrete indexes
        • Multiple start addresses per content source
        • MOSS indexes can drive the WSS search experience
            • Allow upgrade from WSS to MOSS
    24. Administration Improvements: Security
        • Query-time security trimming in SPS 2003
            • File shares, WSS/SPS 2003, Exchange, Lotus Notes (via mapping)
        • Now supports pluggable authentication for content in WSS/MOSS sites
            • Based on ASP.NET 2.0 model
        • Minimum required crawler permission is now just Full Read, not Administrator
            • Still provides the same security trimming functionality
        • Ability to remove single items
    25. 25. Administration Improvements Customization <ul><li>Search in every company is different </li></ul><ul><ul><li>Different metadata might matter: </li></ul></ul><ul><ul><ul><li>Documents: Title, Author, File location, Size </li></ul></ul></ul><ul><ul><ul><li>Records: Patient, Doctor, Healthcare provider, SSN… </li></ul></ul></ul><ul><ul><li>How users meaningfully scope searches differs: </li></ul></ul><ul><ul><ul><li>“All finance documents” </li></ul></ul></ul><ul><ul><ul><li>“All patient records” </li></ul></ul></ul><ul><ul><ul><li>“All published documents” </li></ul></ul></ul><ul><li>Customize results to “pop” metadata that matters </li></ul><ul><li>Customization offered at many levels </li></ul><ul><ul><li>Web Parts, XSLT/CSS, full object model… </li></ul></ul>
    26. 26. Administration Improvements Query Reporting <ul><li>Best way to improve Search is to understand current usage </li></ul><ul><li>New out-of-box usage reporting: </li></ul><ul><ul><li>Query volume trends, top queries, click-through rates, queries with zero results, etc. </li></ul></ul><ul><ul><li>At both site and service provider levels </li></ul></ul><ul><ul><li>Export data for extended reporting in Excel </li></ul></ul><ul><ul><li>Respond to feedback with configuration changes or editorial results </li></ul></ul>
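To make the reporting metrics concrete, the following sketch computes top queries, zero-result queries, and a click-through rate from a small in-memory log. The log layout and the `usage_report` helper are assumptions for illustration; they do not reflect how MOSS stores or exposes its usage data.

```python
from collections import Counter

# Hypothetical query log rows: (query text, result count, whether a result was clicked)
log = [
    ("vacation policy", 12, True),
    ("vacation policy", 12, False),
    ("expense form", 0, False),
    ("org chart", 5, True),
]

def usage_report(rows):
    """Summarize a query log into the kinds of figures listed on the slide."""
    queries = Counter(q for q, _, _ in rows)
    zero_results = sorted({q for q, n, _ in rows if n == 0})
    clicks = sum(1 for _, _, clicked in rows if clicked)
    return {
        "top_queries": queries.most_common(3),
        "zero_result_queries": zero_results,
        "click_through_rate": clicks / len(rows) if rows else 0.0,
    }

report = usage_report(log)
```

A query that appears in `zero_result_queries` is a natural candidate for a keyword with a Best Bet or for a new content source.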
    27. 27. Performance Improvements <ul><li>Key new features make the crawls faster so the content is fresher </li></ul><ul><ul><li>More efficient SharePoint crawling (Change Log Crawl) </li></ul></ul><ul><ul><li>Continuous propagation </li></ul></ul><ul><ul><li>Unified WSS and MOSS search </li></ul></ul><ul><ul><li>Security Change Only Crawl </li></ul></ul><ul><li>Maximum scale is 10s of millions of documents per indexer </li></ul>
    28. 28. Demo – MOSS 2007 <ul><li>Goal of demo is a high-level overview with focus on: </li></ul><ul><li>Search boxes and advanced search </li></ul><ul><li>Search results experience </li></ul><ul><li>Search Center </li></ul><ul><li>Admin experience </li></ul>
    29. 29. <ul><li>Questions? </li></ul>
    30. 30. Module 3 Architecture and Deployment Scenarios
    31. 31. Agenda <ul><li>Key concepts </li></ul><ul><ul><li>MS Search Architecture </li></ul></ul><ul><ul><li>Deployment Building Blocks </li></ul></ul><ul><ul><li>WSS v3 Search Topologies </li></ul></ul><ul><ul><li>MOSS 2007 Search Topologies </li></ul></ul><ul><li>Search Topology scenarios </li></ul><ul><ul><li>Small </li></ul></ul><ul><ul><li>Medium </li></ul></ul><ul><ul><li>Large </li></ul></ul><ul><ul><li>Geographically distributed </li></ul></ul><ul><li>Solution scenarios </li></ul><ul><ul><li>Collaboration sites </li></ul></ul><ul><ul><li>Enterprise portal </li></ul></ul><ul><ul><li>Internet facing portal </li></ul></ul>
    32. 32. Microsoft Search Architecture Notes Query Engine Index Engine Protocol Handlers iFilters Content Index OOB Search UI/Custom Search Apps Query OM and Web Service Information … Exchange Folders Network Shares External Web Sites SharePoint Sites Business Data Stemmers WordBreakers Results Query Content Sources Crawl Log Scopes Schema Best Bets Keywords Ranking Search Configuration Data
    33. 33. SharePoint Search Topologies: Deployment Building Blocks <ul><li>Physical building blocks: </li></ul><ul><ul><li>Web Front-End Servers </li></ul></ul><ul><ul><li>Application servers (Query, Index, Excel Services, etc.) </li></ul></ul><ul><ul><li>SQL Databases </li></ul></ul><ul><li>Search functionality segmented into two roles: </li></ul><ul><ul><li>Indexer </li></ul></ul><ul><ul><li>Query </li></ul></ul><ul><li>MOSS 2007 specific </li></ul><ul><ul><li>Shared Service Provider (SSP) </li></ul></ul><ul><ul><ul><li>Indexer </li></ul></ul></ul><ul><ul><ul><li>Web Application(s) </li></ul></ul></ul><ul><ul><ul><ul><li>Site Collection(s) </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Content Database(s) </li></ul></ul></ul></ul><ul><ul><ul><li>Virtual Server(s) (IIS) </li></ul></ul></ul>
    34. 34. WSS v3 Search Topology Basics <ul><li>WSS uses both server roles on the same machine (“Search Server”) </li></ul><ul><ul><li>Indexing </li></ul></ul><ul><ul><li>Query </li></ul></ul><ul><li>Ability to index local content only </li></ul><ul><ul><li>Site Collection (content database(s)) </li></ul></ul><ul><li>Content is automatically indexed </li></ul><ul><ul><li>minimal search administration </li></ul></ul><ul><li>Ability to query at a site and below it </li></ul><ul><li>stsadm command exposes some admin operations </li></ul><ul><li>Can Crawl Multiple content databases </li></ul>
    35. 35. Sample WSS v3 Topology
    36. 36. WSS v3 - Topology Considerations <ul><li>Scale out just like WSS </li></ul><ul><li>Add content databases for content </li></ul><ul><li>Add search servers for search </li></ul><ul><li>Each search server can serve up to 100 content databases </li></ul><ul><ul><li>Could be lower depending on the data in the content database </li></ul></ul>
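The sizing rule above reduces to simple arithmetic. The sketch below assumes the stated ceiling of 100 content databases per search server and lets you pass a lower figure when the data volume calls for it; the function name is illustrative.

```python
import math

MAX_DBS_PER_SEARCH_SERVER = 100  # upper bound stated on the slide

def search_servers_needed(content_databases: int,
                          dbs_per_server: int = MAX_DBS_PER_SEARCH_SERVER) -> int:
    """Minimum number of search servers for a given number of content databases."""
    if content_databases < 0 or dbs_per_server < 1:
        raise ValueError("counts must be non-negative and the per-server limit positive")
    return math.ceil(content_databases / dbs_per_server)
```

For example, 250 content databases need three search servers at the stated ceiling, or five if each server is planned for only 60 databases.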
    37. 37. MOSS 2007 Search Topology Basics <ul><li>Adds new functionality over base WSS Search </li></ul><ul><li>Application server roles can be separated: </li></ul><ul><ul><li>Indexer </li></ul></ul><ul><ul><li>Query server </li></ul></ul><ul><li>Propagation from indexer to query servers </li></ul><ul><li>Crawl local + external content </li></ul><ul><li>Enhanced administration experience </li></ul><ul><li>Ability to search across site collections </li></ul>
    38. 38. MOSS 2007 Search Topology Basics (cont) <ul><li>Query role can be assigned to one or more servers </li></ul><ul><li>Indexing role can only be assigned to a single server </li></ul><ul><li>Multiple query servers not allowed IF server is providing both indexing and query services </li></ul><ul><li>Only one index per SSP . . . although you can have multiple SSPs </li></ul>
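The role-assignment rules above can be expressed as a small validation check. This is a sketch for one SSP's topology; the `Server` class and `topology_problems` function are hypothetical planning helpers, not part of any SharePoint API.

```python
from dataclasses import dataclass, field

@dataclass
class Server:
    name: str
    roles: set = field(default_factory=set)  # subset of {"index", "query"}

def topology_problems(servers):
    """Return a list of rule violations for one SSP's search topology."""
    indexers = [s for s in servers if "index" in s.roles]
    query_servers = [s for s in servers if "query" in s.roles]
    problems = []
    if len(indexers) != 1:
        problems.append("exactly one server must hold the indexing role")
    if not query_servers:
        problems.append("at least one server must hold the query role")
    if any("query" in s.roles for s in indexers) and len(query_servers) > 1:
        problems.append("a combined index/query server cannot coexist with other query servers")
    return problems
```

A dedicated indexer with two query servers passes, while a dual-role server plus a second query server is flagged.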
    39. 39. Sample MOSS 2007 Topology Query servers separated from indexer Indexer crawling local + external content
    40. 40. MOSS 2007 – Search Topology Considerations <ul><li>Indexing operations are CPU intensive </li></ul><ul><li>Dedicated query servers *might* be better in a query heavy environment </li></ul><ul><li>MOSS / WSS crawls do involve making HTTP requests against the WFE(s) </li></ul><ul><li>Dual role, WFE / Query servers more efficient with security trimming </li></ul><ul><li>All servers should be on same network segment </li></ul>
    41. 41. MOSS 2007 – Search Topology Considerations (cont) <ul><li>Each farm can index up to 50 million items </li></ul><ul><li>Beyond this, add more farms </li></ul><ul><li>Hardware is important </li></ul>
    42. 42. Shared Search Service <ul><li>Shared Service Provider (SSP) – grouped high-value, resource-intensive services </li></ul><ul><li>Shared services are consumed by web applications (and sites within them) </li></ul><ul><li>“Always on” shared services – all sites in a web application use the same index </li></ul><ul><li>Resource-intensive operations controlled centrally </li></ul><ul><li>Some admin experience is manageable at site level </li></ul>
    43. 43. Search Shared Service Search service People service … Shared Service Provider (SSP) http://sales http://finance http://hr spsite spsite spsite spsite spsite spsite spweb spweb spweb spweb spweb spweb Virtual Servers Content Databases External content
    44. 44. Search Shared Service Search service People service … Shared Service Provider http://sales http://finance http://hr spsite spsite spsite spsite spsite spsite spweb spweb spweb spweb spweb spweb Virtual Servers Content Indexed Content Databases External content
    45. 45. Common Search Topologies <ul><li>Deployment scenarios </li></ul><ul><ul><li>Small </li></ul></ul><ul><ul><li>Medium </li></ul></ul><ul><ul><li>Large </li></ul></ul><ul><ul><li>Geographically Distributed (MOSS only) </li></ul></ul>
    46. 46. Small Search Deployment <ul><li>WSS </li></ul><ul><ul><li>Single Search Server with both roles </li></ul></ul><ul><ul><ul><li>Index </li></ul></ul></ul><ul><ul><ul><ul><li>Single Site Collection only! </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Single Set of Content Databases </li></ul></ul></ul></ul><ul><ul><ul><li>Query </li></ul></ul></ul><ul><li>MOSS </li></ul><ul><ul><li>Single Server </li></ul></ul><ul><ul><ul><li>Dual Role </li></ul></ul></ul><ul><ul><ul><ul><li>Index </li></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>SSP Based – Multiple Site Collections </li></ul></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>Multiple Set of Content Databases </li></ul></ul></ul></ul></ul><ul><ul><ul><ul><li>Query </li></ul></ul></ul></ul><ul><li>MOSS for Search </li></ul><ul><ul><li>Single Server / Dual Role (Index and Query) </li></ul></ul>
    47. 47. Medium Search Deployment <ul><li>WSS </li></ul><ul><ul><li>Multiple Search Servers with the following limitations </li></ul></ul><ul><ul><ul><li>Single Index Server </li></ul></ul></ul><ul><ul><ul><ul><li>Single Site Collection </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Single Set of Content Databases </li></ul></ul></ul></ul><ul><ul><ul><li>Multiple Query Servers </li></ul></ul></ul><ul><li>MOSS </li></ul><ul><ul><li>Three Servers </li></ul></ul><ul><ul><ul><li>One Index Server </li></ul></ul></ul><ul><ul><ul><li>Two Query Servers running on two Web Front-End servers </li></ul></ul></ul><ul><li>MOSS for Search </li></ul><ul><ul><li>Three Servers </li></ul></ul><ul><ul><ul><li>One Index Server </li></ul></ul></ul><ul><ul><ul><li>Two Query Servers </li></ul></ul></ul>
    48. 48. Large Search Deployment <ul><li>WSS </li></ul><ul><ul><li>Multiple Search Servers with the following limitations </li></ul></ul><ul><ul><ul><li>Multiple Index Servers (64-bit) </li></ul></ul></ul><ul><ul><ul><ul><li>Each Indexing a Single Site Collection with their own Set of Content Databases </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Index Servers are not redundant from one another. </li></ul></ul></ul></ul><ul><ul><ul><li>Multiple Query Servers each associated with their own single Index Server running on the same machine (64-bit) </li></ul></ul></ul><ul><ul><ul><ul><li>Query servers are not redundant from one another </li></ul></ul></ul></ul><ul><li>MOSS </li></ul><ul><ul><li>One Index Server (64-bit) </li></ul></ul><ul><ul><li>Many Separate Query servers (64-bit) </li></ul></ul><ul><li>MOSS for Search </li></ul><ul><ul><li>One Index Server (64-bit) </li></ul></ul><ul><ul><li>Many Separate Query servers (64-bit) </li></ul></ul>
    49. 49. Geographically Distributed Sites MOSS Search Deployment Other Locations Corp. Sites Search service People service --- Shared Service Provider (SSP) Index Corp, EMEA, APAC and other locations http://sales http://finance http://hr spsite spsite spsite spsite spsite spsite spweb spweb spweb spweb spweb spweb Virtual Servers External content Search service People service --- Shared Service Provider (SSP) Index APAC only http://apacsales http://apacfinance http://apachr spsite spsite spsite spsite spsite spsite spweb spweb spweb spweb spweb spweb Virtual Servers External content Search service People service --- Shared Service Provider (SSP) Index EMEA only http://emeasales http://emeafinance http://emeahr spsite spsite spsite spsite spsite spsite spweb spweb spweb spweb spweb spweb Virtual Servers External content
    50. 50. Deployment Scenarios <ul><li>Collaboration Environment (WSS v3) </li></ul><ul><li>Enterprise Portal (MOSS 2007) </li></ul><ul><li>Internet Facing Portal (MOSS 2007) </li></ul>
    51. 51. Collaboration Environment Scenario WSS v3 <ul><li>iTech – startup software consulting firm </li></ul><ul><li>Large number of disjoint teams working on projects of varying durations </li></ul><ul><li>Team sites used for collaboration and communication </li></ul><ul><li>No organizational needs across sites </li></ul>
    52. 52. Collaboration Environment Scenario WSS v3 (cont) <ul><li>WSS farm with single IIS virtual server http://team </li></ul><ul><li>Scales to large number of team sites </li></ul><ul><li>Content indexed automatically </li></ul><ul><li>WSS v3 standalone topology </li></ul><ul><ul><li>1 Search box (both roles) </li></ul></ul>
    53. 53. Collaboration Environment Scenario WSS v3 (cont) <ul><ul><li>Search – core feature of WSS </li></ul></ul><ul><ul><li>Contextual scopes – site and list </li></ul></ul><ul><ul><li>No search across sites </li></ul></ul>http://team team1 team2 spweb spweb Virtual Server team3 spweb spweb SPSites Content Databases
    54. 54. Enterprise Portal Scenario MOSS 2007 <ul><li>iTech – growing company with growing needs </li></ul><ul><li>iTech – needs a single point for information access for employees </li></ul><ul><li>They now need to search over other repositories: </li></ul><ul><ul><li>Personnel records – People search </li></ul></ul><ul><ul><li>Siebel sources – BDC search </li></ul></ul><ul><ul><li>File Shares / Web sites – other external data </li></ul></ul>
    55. 55. Enterprise Portal Scenario MOSS 2007 (cont) <ul><li>Upgrade from WSS to MOSS </li></ul><ul><li>Search is a shared service through the SSP </li></ul><ul><li>Central enterprise portal – http://itech </li></ul><ul><li>Existing virtual server http://team associated with SSP – search box switches to use MOSS </li></ul><ul><li>Base WSS search is not running – but search available to sites through shared search service </li></ul><ul><li>Indexes – local and external content </li></ul>
    56. 56. Enterprise Portal Scenario MOSS 2007 (cont) Farm http://team team1 team2 spweb spweb Virtual Server team3 spweb spweb SPSites Content Databases Search service People service … Shared Service Provider External content http://itech HR Sales spweb spweb Virtual Server Finance spweb spweb SPSites Content Databases
    57. 57. Enterprise Portal Scenario MOSS 2007 (cont) <ul><li>Topology with indexer and query servers </li></ul><ul><li>Load balanced query servers </li></ul><ul><li>Scale out and scale up – new SSP dimension </li></ul>Query Servers added for throughput Single indexer crawls logical SSP = local + external content
    58. 58. Internet Facing Portal Scenario - MOSS 2007 <ul><li>Internet facing site for customers – www.itech.com </li></ul><ul><li>High traffic focused on content presentation </li></ul><ul><li>Public access </li></ul><ul><li>More publishing and less collaboration </li></ul><ul><li>Controlled and tightly managed content </li></ul>
    59. 59. Internet Facing Portal Scenario - MOSS 2007 (cont) <ul><li>Two separate farms: Production and test farms </li></ul><ul><li>MOSS installation </li></ul><ul><li>Controlled publishing of content to production farm from test farm </li></ul><ul><li>Single shared service provider per farm </li></ul><ul><li>Shared search service in each farm crawls content in each farm independently </li></ul>
    60. 60. Internet Facing Portal Scenario - MOSS 2007 (cont) www.itech.com Services Customers spweb spweb Virtual Server About itech spweb spweb Content Databases SPSites Search service People service --- SSP Production farm http://itechtest Services Customers spweb spweb Virtual Server About itech spweb spweb Content Databases SPSites Search service People service --- SSP Test Farm
    61. 61. Questions?
    62. 62. Module 4 Crawl and Query Processes
    63. 63. Agenda <ul><li>The Crawl Process </li></ul><ul><ul><li>Crawl Walkthrough </li></ul></ul><ul><ul><li>Index Propagation </li></ul></ul><ul><li>The Query Process </li></ul>
    64. 64. Crawl Walkthrough <ul><li>When a crawl is requested . . . </li></ul><ul><li>Indexer grabs the start address of content source </li></ul><ul><li>Start address is prefixed with protocol associated with accessing the content </li></ul><ul><li>Appropriate protocol handler invoked to traverse the content source </li></ul><ul><li>During traversal, the handler will identify content nodes it needs to index </li></ul>
    65. 65. Crawl Walkthrough (cont) <ul><li>Protocol handler invokes IFilter associated with content node type </li></ul><ul><li>IFilter identifies and extracts properties from content node </li></ul><ul><li>Protocol handler supplements IFilter data with additional property information </li></ul><ul><li>Data associated with content node is added to index </li></ul><ul><li>Index “delta” propagates to search servers </li></ul>
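The crawl steps above can be sketched as a small pipeline: the protocol prefix selects a handler, the handler enumerates content nodes, a filter extracts properties, and the result lands in the index. Everything here (the handler and filter registries, the sample files, the dictionary used as an index) is a simplified stand-in chosen for illustration, not the real protocol handler or IFilter interfaces.

```python
from urllib.parse import urlparse

# Hypothetical handler: maps a start address to (node name, raw content) pairs.
def file_handler(start_address):
    return [("report.txt", "quarterly sales report"), ("notes.txt", "meeting notes")]

PROTOCOL_HANDLERS = {"file": file_handler}

# Hypothetical filter keyed by file extension; extracts properties from content.
def text_filter(content):
    return {"body": content, "word_count": len(content.split())}

FILTERS = {".txt": text_filter}

def crawl(start_address, index):
    scheme = urlparse(start_address).scheme            # protocol prefix picks the handler
    handler = PROTOCOL_HANDLERS[scheme]
    for name, content in handler(start_address):       # traverse the content source
        ext = name[name.rfind("."):]
        props = FILTERS[ext](content)                   # filter extracts properties
        props["url"] = start_address.rstrip("/") + "/" + name  # handler adds its own properties
        index[props["url"]] = props                     # add the node's data to the index
    return index

index = crawl("file://fileserver/share", {})
```

Supporting a new repository in this model means registering another handler, and supporting a new file type means registering another filter, which mirrors the extensibility points named in the walkthrough.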
    66. 66. Crawl Overview Diagram
    67. 67. Index Propagation Farm Sample Indexer Load Balancer Crawling Web front ends Index Propagation Query Servers User Requests
    68. 68. Index Propagation <ul><li>Propagation will occur only when the index and search components are on separate servers </li></ul><ul><li>Continuous propagation </li></ul><ul><ul><li>Changes sent incrementally to all query servers associated with the index server. </li></ul></ul><ul><ul><li>Merging of the index occurs on the query servers after propagation. </li></ul></ul><ul><ul><li>Query servers continue serving queries while propagation is in progress </li></ul></ul>
    69. 69. Index Propagation <ul><li>Index File Location </li></ul><ul><ul><li>Set in Office SharePoint Server Search Service settings </li></ul></ul><ul><ul><ul><li>Default location: C:\Program Files\Microsoft Office Servers\12.0\Data\Office Server\Applications </li></ul></ul></ul><ul><ul><ul><li>Can be programmatically set using the stsadm command </li></ul></ul></ul><ul><ul><ul><li>Index Server: </li></ul></ul></ul><ul><ul><ul><li>“stsadm.exe -o editssp -indexlocation index file path” </li></ul></ul></ul><ul><ul><ul><li>Query Server: </li></ul></ul></ul><ul><ul><ul><li>“stsadm.exe -o osearch -propagationlocation index file path” </li></ul></ul></ul>
    70. 70. The Query Process <ul><li>Query Initiation and Results Presentation </li></ul><ul><li>Query Execution </li></ul><ul><li>Query Walkthrough </li></ul>
    71. 71. Query Initiation and Results Presentation <ul><li>Typically, provided by the WSS / MOSS WFE role, through OOB WebParts </li></ul><ul><li>Could be an Office client or other custom application </li></ul><ul><li>Responsible for constructing the “full” query and communicating with the query execution services </li></ul>
    72. 72. Query Execution <ul><li>Always provided by a server tagged with the Query role </li></ul><ul><li>Consumes a query request </li></ul><ul><li>Executes the request using the query index on the file system as well as the SSP search database (if MOSS) </li></ul><ul><li>Handles OOB security trimming </li></ul><ul><li>Returns requested properties of the result set to the caller </li></ul>
    73. 73. Query Walkthrough (cont) <ul><li>When a query is requested . . . </li></ul><ul><li>Query terms collected </li></ul><ul><li>Terms supplemented with contextual information </li></ul><ul><li>Query formulated and issued through the Query OM or the Web Service </li></ul><ul><li>Query is executed against the index and property store </li></ul><ul><li>Query results returned </li></ul><ul><ul><li>Results are ordered according to their relevance to the query words </li></ul></ul><ul><ul><li>Trimmed based on the user’s permissions. </li></ul></ul>
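The last two steps of the walkthrough, ordering by relevance and trimming by permissions, can be sketched as follows. The `Hit` record and the group-based access check are simplifying assumptions; the real engine trims against the security descriptors captured at crawl time.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    url: str
    rank: float
    allowed_groups: frozenset   # hypothetical ACL stored with the indexed item

def run_query(hits, user_groups, limit=10):
    """Order hits by relevance, then drop those the user cannot read."""
    ordered = sorted(hits, key=lambda h: h.rank, reverse=True)
    visible = [h for h in ordered if h.allowed_groups & user_groups]
    return visible[:limit]

hits = [
    Hit("http://hr/salaries.xlsx", 0.95, frozenset({"hr"})),
    Hit("http://sales/pipeline.docx", 0.80, frozenset({"sales", "finance"})),
    Hit("http://team/faq.aspx", 0.40, frozenset({"everyone"})),
]
results = run_query(hits, user_groups={"sales", "everyone"})
```

The highest-ranked item never reaches this user because the trimming step removes it before results are returned.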
    74. 74. Questions?
    75. 75. Module 5 The Search End-User Experience
    76. 76. Module Agenda <ul><li>Introducing the Search End-User Experience </li></ul><ul><li>Customizing Search </li></ul><ul><li>People Search </li></ul>
    77. 77. Introducing the Search End-User Experience <ul><li>Complete Search experience </li></ul><ul><li>Search is everywhere </li></ul><ul><li>Tab-based user interface for easy navigation </li></ul><ul><li>Easy to extend and customize </li></ul>
    78. 78. Introducing the End-User Search Experience <ul><li>Search Boxes </li></ul><ul><li>Search Center </li></ul><ul><li>Search Web Parts </li></ul>
    79. 79. Query Results Http: Get Http: Post Search Box XML Web Parts XSL Transformation Query OM Advanced Search Hidden Object XML XML OOB Search UI/Custom Search Apps Query OM and Web Service
    80. 80. Search WebParts <ul><li>Nine Standard Search Web Parts </li></ul><ul><ul><li>Search Box </li></ul></ul><ul><ul><li>Core Results </li></ul></ul><ul><ul><li>High Confidence </li></ul></ul><ul><ul><li>Statistics </li></ul></ul><ul><ul><li>Pagination </li></ul></ul><ul><ul><li>Action Links </li></ul></ul><ul><ul><li>Matching Keywords and Best Bets </li></ul></ul><ul><ul><li>Search Summary (Did you mean?) </li></ul></ul><ul><ul><li>Advanced Search </li></ul></ul>
    81. 81. Result page infrastructure <ul><li>Data shared through hidden object </li></ul><ul><ul><li>All Search Web Parts within the same page share the same hidden object </li></ul></ul><ul><ul><li>Connection between Search Web Part is automatically done </li></ul></ul><ul><ul><li>Need only to Drag and Drop (or select) a Search Web Part on the page </li></ul></ul><ul><ul><li>Allows for rapid page design </li></ul></ul><ul><ul><li>Hidden Object is internal and cannot be used by custom Web Parts </li></ul></ul><ul><li>All Search Web Parts derive from Data Form Web Part </li></ul>
    82. 82. Advanced Search <ul><li>Allows power searchers to exercise greater control on how they query </li></ul><ul><li>A link from the search box </li></ul><ul><li>Control what is displayed in the page by modifying the XML stored in the web part property “Properties” </li></ul><ul><ul><li>e.g., can be used for displaying a new language check box </li></ul></ul><ul><li>Not provided by WSS Search UI </li></ul><ul><li>Implemented using the SQL syntax </li></ul>
    83. 83. Customizing the End User Experience <ul><li>Search in every company is different </li></ul><ul><ul><li>Different metadata might matter </li></ul></ul><ul><ul><ul><li>Documents: Title, Author, File location, size </li></ul></ul></ul><ul><ul><ul><li>Records: Patient, Doctor, Healthcare provider, SSN… </li></ul></ul></ul><ul><ul><li>Multi- or single-languages </li></ul></ul><ul><ul><li>How users meaningfully scope searches differs </li></ul></ul><ul><ul><ul><li>“All finance documents” </li></ul></ul></ul><ul><ul><ul><li>“All patient records” </li></ul></ul></ul><ul><ul><ul><li>“All published documents” </li></ul></ul></ul><ul><li>Customize results to “pop” metadata that matters </li></ul><ul><li>Customization offered at many levels </li></ul><ul><ul><li>Web Parts, XSLT/CSS, full Object Model… </li></ul></ul>
    84. 84. Customization Choices <ul><li>Search Center </li></ul><ul><ul><li>Simple Site with few pages </li></ul></ul><ul><ul><ul><li>Default Page </li></ul></ul></ul><ul><ul><ul><li>Result Page </li></ul></ul></ul><ul><ul><ul><li>Advanced Search Page </li></ul></ul></ul><ul><ul><ul><li>People Search Page </li></ul></ul></ul><ul><li>Results Pages </li></ul><ul><ul><li>All Sites Results Page </li></ul></ul><ul><ul><li>People Results Page </li></ul></ul><ul><li>Advanced Search Page and Web Part </li></ul><ul><ul><li>Show Scope Picker </li></ul></ul><ul><ul><ul><li>Scopes </li></ul></ul></ul><ul><ul><li>Property Picker </li></ul></ul><ul><ul><li>Languages </li></ul></ul><ul><li>Search Web Parts </li></ul>
    85. 85. Customizing Search <ul><li>Adding Search Center Tabs </li></ul><ul><li>Customizing Search Web Parts </li></ul><ul><li>Customizing Search Results </li></ul>
    86. 86. People Search <ul><li>Bring people into the search experience </li></ul><ul><ul><li>Getting your job done means working with the right people </li></ul></ul><ul><ul><li>Find subject matter experts based on their knowledge and contacts </li></ul></ul><ul><ul><li>People list can come from AD, SQL, others </li></ul></ul>Discovering Experts People are as important as data!
    87. 87. People Search <ul><li>People Results </li></ul><ul><li>Customizing Results </li></ul>
    88. 88. Refine Your People Search <ul><li>Refine by Job Title </li></ul><ul><ul><li>Searches for the selected Job Title </li></ul></ul><ul><li>Refine by Department </li></ul><ul><ul><li>Searches for the selected Department </li></ul></ul><ul><li>“ Show more options” link (6+) </li></ul><ul><li>Listed in order of frequency </li></ul>
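One way to picture the refinement list is as a frequency count over the current result set. The sketch below assumes that five values are shown before the “Show more options” link appears (an interpretation of the “6+” note) and uses a hypothetical list of people records.

```python
from collections import Counter

def refinements(people, field, visible=5):
    """Return (values to show, whether a 'Show more options' link is needed)."""
    counts = Counter(p[field] for p in people if p.get(field))
    ordered = [value for value, _ in counts.most_common()]  # most frequent first
    return ordered[:visible], len(ordered) > visible

people = [
    {"name": "A", "JobTitle": "Developer", "Department": "IT"},
    {"name": "B", "JobTitle": "Developer", "Department": "IT"},
    {"name": "C", "JobTitle": "Architect", "Department": "Consulting"},
]
titles, more = refinements(people, "JobTitle")
```

Selecting one of the returned values would then rerun the people search restricted to that job title or department.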
    89. 89. People Search Web Parts <ul><li>Two OOB People Search Web Parts </li></ul><ul><ul><li>People Search Box </li></ul></ul><ul><ul><li>People Search Core Results </li></ul></ul><ul><ul><ul><li>Inherit from the Search Core Results Web Part </li></ul></ul></ul><ul><li>Can be mixed on the same page with other Search Web Parts </li></ul>
    90. 90. People Results Search Web Parts <ul><li>Web Part properties such as: </li></ul><ul><li>(similar to Core Search WP) </li></ul><ul><ul><li>Formatting (e.g. width of the search box) </li></ul></ul><ul><ul><li>Number of Results per page </li></ul></ul><ul><ul><li>Display “Alert Me”, “RSS” links </li></ul></ul><ul><ul><li>Turn stemming on/off (default “off”) </li></ul></ul><ul><ul><li>Remove Duplicate Results on/off (default “on”) </li></ul></ul><ul><ul><li>Fixed keyword Query </li></ul></ul><ul><ul><li>Select Columns </li></ul></ul><ul><ul><li>Results formatting with XSL </li></ul></ul><ul><ul><li>Social Distance (view) </li></ul></ul>
    91. 91. Social Distance Colleagues <ul><li>Suggested Colleague list members are mined from: </li></ul><ul><ul><li>Microsoft Windows Messenger (IM) </li></ul></ul><ul><ul><li>Microsoft Office Outlook e-mail </li></ul></ul><ul><ul><li>(Outlook Add-In) </li></ul></ul>
    92. 92. Questions?
    93. 93. Module 6 Search Object Model
    94. 94. Workshop Agenda <ul><li>Scenarios for Extending Search </li></ul><ul><li>Query Syntax </li></ul><ul><li>Query Object Model </li></ul><ul><li>Query Web Service </li></ul>
    95. 95. Topic: Scenarios for Extending Search <ul><li>In this first section we will examine 2 scenarios for extending Search: </li></ul><ul><li>Integrate with Search Center </li></ul><ul><li>Integrate Search into 3rd party sites and applications </li></ul>
    96. 96. Integrate with MOSS Search Center <ul><li>Use cases: </li></ul><ul><li>Use Search URL request parameters to add predefined saved searches </li></ul><ul><li>Build custom search box Web parts for custom look and feel </li></ul><ul><li>Build custom search core result Web parts for own look and feel and customized querying </li></ul>Extending Search
    97. 97. Integrate MOSS Search into 3rd Party Sites and Applications <ul><li>Build 3rd party user interface which leverages MOSS Search through Web Services </li></ul><ul><li>Use cases </li></ul><ul><ul><li>Add MOSS Search features into existing Web sites </li></ul></ul><ul><ul><li>Add MOSS Search into existing line of business or custom applications </li></ul></ul>Extending Search
    98. 98. Topic: Query Syntax <ul><li>In this section we will examine the three types of search syntax for building search queries supported by MOSS: </li></ul><ul><li>Keyword </li></ul><ul><li>URL </li></ul><ul><li>SQL </li></ul>
    99. 99. Keyword Syntax Overview <ul><li>Used in standard Search Box </li></ul><ul><li>New keyword syntax </li></ul><ul><li>Simple and easy to use </li></ul><ul><li>Consistent property:value syntax across Office, Windows and Live search </li></ul>Example: gallery hinges -brass site:http://supportdesk scope:Products
    100. 100. Keyword Syntax Include/Exclude <ul><li>Built-in support for include and exclude terms </li></ul><ul><ul><li>Look for the term bike, but not related to fitness: bike -fitness </li></ul></ul><ul><ul><li>Look for the phrase “SharePoint Services” but not the term v2: +&quot;SharePoint Services&quot; -v2 </li></ul></ul><ul><ul><li>Include is implied when there is no (+/-) prefix </li></ul></ul>
    101. 101. Keyword Syntax Boolean Search <ul><li>Narrowing results by default </li></ul><ul><ul><li>Searches using “AND” between query terms </li></ul></ul><ul><li>Does not recognize logical operators like “OR”, “NEAR” as keywords – it treats them all as search terms </li></ul><ul><li>Does not support complex queries like (A AND B) OR (C AND D) </li></ul><ul><li>Complex Boolean searches are supported by the engine and the SQL syntax </li></ul>
    102. 102. Keyword Syntax Property restrictions <ul><li>Supports property:value as part of the keyword string </li></ul><ul><li>Can use any managed property </li></ul><ul><li>Supports the use of phrases </li></ul><ul><ul><li>Quoted phrases can be used for exact matches when the property value includes spaces </li></ul></ul><ul><ul><li>Without quotes, prefix matching is done; word stemming is supported </li></ul></ul>
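To show how the keyword rules fit together (implied include, the +/- prefixes, quoted phrases, and property:value restrictions), here is a small tokenizer sketch. It is a simplified reading of the syntax described on these slides, not the parser MOSS uses; for brevity it ignores a +/- prefix on property restrictions and only handles straight double quotes.

```python
import re

# Optional +/- sign, optional "property:" prefix, then a quoted phrase or a bare term.
TOKEN = re.compile(r'([+-]?)(?:(\w+):)?("([^"]*)"|\S+)')

def parse_keyword_query(query):
    parsed = {"include": [], "exclude": [], "properties": []}
    for sign, prop, raw, phrase in TOKEN.findall(query):
        value = phrase if raw.startswith('"') else raw
        if prop:
            parsed["properties"].append((prop, value))
        elif sign == "-":
            parsed["exclude"].append(value)
        else:  # "+" or no prefix: inclusion is implied
            parsed["include"].append(value)
    return parsed

example = parse_keyword_query("gallery hinges -brass site:http://supportdesk scope:Products")
```

All included terms are combined with an implicit AND, so the example asks for items containing both gallery and hinges, without brass, within the given site and scope.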
    103. 103. Keyword Syntax No wildcard support <ul><li>No wildcard support in Keyword Syntax </li></ul><ul><ul><li>The search box does not do wildcard searching; for example, ShareP* is not recognized as a wildcard search </li></ul></ul><ul><ul><li>Use Advanced Search property restrictions to look for parts of a word </li></ul></ul><ul><li>Requires new search results Web parts </li></ul><ul><li>Wildcards are supported by the engine and the SQL query syntax </li></ul>
    104. 104. URL Syntax <ul><li>Use Cases </li></ul><ul><ul><li>Launching a URL in a custom application </li></ul></ul><ul><ul><li>Saved searches </li></ul></ul><ul><ul><li>Custom search boxes </li></ul></ul><ul><li>Request Parameters </li></ul><ul><ul><li>Content: results.aspx?k=fish </li></ul></ul><ul><ul><li>Scopes: results.aspx?k=fish&s=BBC </li></ul></ul><ul><ul><li>Sort: </li></ul></ul><ul><ul><ul><li>results.aspx?v=date </li></ul></ul></ul><ul><ul><ul><li>results.aspx?v=relevance </li></ul></ul></ul><ul><ul><li>Page: results.aspx?start=21 </li></ul></ul>
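A saved search or custom search box only needs to assemble these request parameters into a link. The helper below is a minimal sketch using the parameter names from the slide (`k`, `s`, `v`, `start`); the function name and the base URL you pass in are up to your own site.

```python
from urllib.parse import urlencode

def search_url(base, keywords, scope=None, sort=None, start=None):
    """Build a results-page URL from the request parameters listed on the slide."""
    params = {"k": keywords}
    if scope:
        params["s"] = scope
    if sort:
        if sort not in ("date", "relevance"):
            raise ValueError("sort must be 'date' or 'relevance'")
        params["v"] = sort
    if start:
        params["start"] = start
    return base + "?" + urlencode(params)

link = search_url("results.aspx", "fish", scope="BBC")
```

Using `urlencode` ensures that spaces and reserved characters in the keywords are escaped correctly.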
    105. 105. SQL Syntax Overview <ul><li>SQL Syntax offers: </li></ul><ul><li>Consistent SQL across enterprise and desktop </li></ul><ul><li>Complex queries and Boolean searches </li></ul><ul><ul><li>Comparison operators </li></ul></ul><ul><ul><li>Arbitrary groupings for AND, OR, NOT </li></ul></ul><ul><ul><li>Freetext() </li></ul></ul><ul><ul><li>CONTAINS() </li></ul></ul><ul><ul><li>LIKE </li></ul></ul><ul><ul><li>ORDER BY ASC | DESC </li></ul></ul><ul><li>Custom SQL query statements </li></ul><ul><li>Wildcard support </li></ul>
    106. 106. SQL Syntax Complex Boolean Searches <ul><li>Write complex Boolean searches using AND, OR, NOT </li></ul>
    107. 107. SQL Syntax FREETEXT predicate <ul><li>Returns documents for which the following is true: </li></ul><ul><ul><li>Document contains all the search terms in at least one of the columns specified </li></ul></ul><ul><ul><li>One of the search terms must also be found in the Contents column </li></ul></ul><ul><li>Use only one FREETEXT predicate for optimal ranking </li></ul><ul><li>The FREETEXT predicate also supports (+/-) </li></ul>
    108. 108. SQL Syntax Wildcard Support <ul><li>Get wildcard support using the CONTAINS predicate: </li></ul><ul><ul><li>Wildcard: Words or phrases with an asterisk (*) added to the end. </li></ul></ul><ul><ul><ul><li>WHERE CONTAINS('&quot;compu*&quot; NEAR &quot;soft*&quot;') </li></ul></ul></ul>
    109. 109. SQL Syntax Removed from SQL syntax <ul><li>Removed in MOSS 2007 </li></ul><ul><li>Query property weights </li></ul><ul><li>UNION ALL </li></ul><ul><li>MATCHES </li></ul><ul><li>SELECT * </li></ul><ul><li>COALESCE TABLE   </li></ul>
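To see how the predicates combine into a full statement, the sketch below assembles a SQL syntax query string. The `FROM SCOPE()` clause, the `DefaultProperties` column group, and the `Rank` sort column are not shown on these slides; they are assumptions taken from the general shape of MOSS SQL queries, so verify them against the SQL syntax reference before relying on them.

```python
def build_sql_query(columns, freetext=None, contains=None, order_by="Rank", descending=True):
    """Assemble a SQL syntax query string (a sketch; no query is executed)."""
    where = []
    if freetext:
        # Single quotes inside the search text are doubled to keep the literal valid.
        where.append("FREETEXT(DefaultProperties, '{}')".format(freetext.replace("'", "''")))
    if contains:
        where.append("CONTAINS('{}')".format(contains.replace("'", "''")))
    sql = "SELECT {} FROM SCOPE()".format(", ".join(columns))  # explicit columns, no SELECT *
    if where:
        sql += " WHERE " + " AND ".join(where)
    sql += " ORDER BY {} {}".format(order_by, "DESC" if descending else "ASC")
    return sql

wildcard_sql = build_sql_query(["Title", "Path"], contains='"compu*" NEAR "soft*"')
```

The helper takes at most one FREETEXT argument, in line with the guidance to use a single FREETEXT predicate for ranking.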
    110. 110. Topic: Query Object Model <ul><li>In this section we will examine: </li></ul><ul><li>The Query Object Model </li></ul><ul><li>The Query Object Path </li></ul><ul><li>The Query Web Service </li></ul>
    111. 111. Query Object Model <ul><li>New object model </li></ul><ul><li>Use the query object model to: </li></ul><ul><ul><li>Build custom search user interface, like Web parts or ASPX applications </li></ul></ul><ul><ul><li>Gain direct access to query and results properties </li></ul></ul><ul><ul><li>Invoke custom queries </li></ul></ul><ul><li>2 types of query syntaxes: </li></ul><ul><ul><li>Keyword </li></ul></ul><ul><ul><li>SQL </li></ul></ul>
    112. 112. Query Object Model Features <ul><li>Managed code API </li></ul><ul><li>Single request – multiple results </li></ul><ul><li>Result Types </li></ul><ul><li>Relevant results </li></ul><ul><li>High confidence results </li></ul><ul><li>Special terms </li></ul><ul><li>Definitions </li></ul><ul><li>Optional parameters </li></ul><ul><li># of Sentences in Summary </li></ul><ul><li>Implicit - AND/OR </li></ul><ul><li>Number of results </li></ul><ul><li>Ignore noise words </li></ul><ul><li>Enable stemming </li></ul><ul><li>Language </li></ul>
    113. 113. Query Object Path Query OM Input Output SQL Query Optional Parameters Query Engine ResultTableCollection ResultTable: IDataReader Relevant results High confidence Special terms Definitions Site UI Custom Client Local Remote Keyword Query Execute()
    114. 114. Query Web Service Use and Methods <ul><li>Use Case </li></ul><ul><ul><li>Leverage Search in remote sites or application </li></ul></ul><ul><ul><li>Office Research Pane </li></ul></ul><ul><li>Methods </li></ul><ul><ul><li>Query </li></ul></ul><ul><ul><li>QueryEx </li></ul></ul><ul><ul><li>GetSearchMetaData </li></ul></ul><ul><ul><li>Registration </li></ul></ul><ul><ul><li>Status </li></ul></ul>
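As a rough sketch of what a remote client sends to the Query method, the snippet below builds a query packet as XML. The element names, the `urn:Microsoft.Search.Query` namespace, and the `STRING` / `MSSQLFT` query types are written from memory of the Search web service schema and should be checked against the service's own documentation; the function name `build_query_packet` is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "urn:Microsoft.Search.Query"  # assumed request namespace


def build_query_packet(text, start_at=1, count=10, sql=False):
    """Return a query packet string for a keyword or SQL-syntax query."""
    packet = ET.Element("QueryPacket", {"xmlns": NS})
    query = ET.SubElement(packet, "Query")
    formats = ET.SubElement(query, "SupportedFormats")
    ET.SubElement(formats, "Format").text = (
        "urn:Microsoft.Search.Response.Document:Document")
    context = ET.SubElement(query, "Context")
    query_text = ET.SubElement(context, "QueryText", {
        # Assumed type values: keyword syntax vs. SQL syntax.
        "type": "MSSQLFT" if sql else "STRING",
        "language": "en-us",
    })
    query_text.text = text
    result_range = ET.SubElement(query, "Range")
    ET.SubElement(result_range, "StartAt").text = str(start_at)
    ET.SubElement(result_range, "Count").text = str(count)
    return ET.tostring(packet, encoding="unicode")


packet_xml = build_query_packet("sharepoint")
print(packet_xml)
```

The resulting string would be passed as the single argument of the Query or QueryEx method; sending the request itself is left out of the sketch.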
    115. 115. Query Web Service Search Center Features <ul><li>Standard Search Center features not built into the Web service </li></ul><ul><ul><li>Hit highlighting </li></ul></ul><ul><ul><li>Search usage reporting </li></ul></ul><ul><ul><li>Search logging </li></ul></ul><ul><ul><li>Search statistics </li></ul></ul><ul><ul><li>Result type icons </li></ul></ul><ul><li>Using Query vs. QueryEx </li></ul><ul><li>Implementing hit highlighting </li></ul>
    116. 116. Questions?
    117. 117. Module 7 Administration
    118. 118. Module Agenda <ul><li>Administrative Architecture </li></ul><ul><ul><li>Farm Administration </li></ul></ul><ul><ul><li>SSP Administration </li></ul></ul><ul><ul><li>Site Collection Administration </li></ul></ul><ul><ul><li>Site Administration </li></ul></ul><ul><li>Search Usage Reporting </li></ul><ul><li>Administrative Tools </li></ul><ul><li>Lab: Adding Content Sources </li></ul><ul><li>Lab: Search Schema </li></ul>
    119. 119. Administrative Architecture <ul><li>Shared Services </li></ul><ul><li>Business unit IT </li></ul><ul><li>Service-level configuration </li></ul><ul><li>E.g. Create search content source, Search Scopes </li></ul><ul><li>Central Administration </li></ul><ul><li>IT Administrators </li></ul><ul><li>Farm-level </li></ul><ul><ul><ul><li>Status </li></ul></ul></ul><ul><ul><ul><li>Resource management </li></ul></ul></ul><ul><li>One per farm </li></ul><ul><li>E.g. Create new site </li></ul><ul><li>Site Settings </li></ul><ul><li>Business site owner </li></ul><ul><li>Site specific configuration and tasks </li></ul><ul><li>e.g. Create new list </li></ul>Three Tier Administration <ul><li>Web-based </li></ul><ul><li>Role- and Task-delineated </li></ul><ul><li>Controlled Delegation </li></ul><ul><li>Secure Isolation </li></ul>
    120. 120. Farm Management (IT Administrators)
    121. 121. SharePoint 3.0 Central Administration <ul><li>Common Tasks </li></ul><ul><ul><li>Manage Topology and Services </li></ul></ul><ul><ul><ul><li>Servers in Farm </li></ul></ul></ul><ul><ul><ul><li>Services in Server </li></ul></ul></ul><ul><ul><li>Security Configuration </li></ul></ul><ul><ul><ul><li>Update Farm Administrator’s Group </li></ul></ul></ul><ul><ul><li>Backup and Restore </li></ul></ul><ul><ul><ul><li>Index </li></ul></ul></ul><ul><ul><ul><li>Search Database </li></ul></ul></ul><ul><ul><li>Global Configuration </li></ul></ul><ul><ul><ul><li>Timer Job Definitions </li></ul></ul></ul><ul><ul><ul><li>Timer Job Status </li></ul></ul></ul><ul><ul><li>Manage Search Service </li></ul></ul>
    122. 122. Using Central Admin
    123. 123. Operations – Topology and Services Servers in Farm / Services on Server <ul><li>Query Server(s) </li></ul><ul><ul><li>Office SharePoint Server Search Service </li></ul></ul><ul><ul><ul><li>Stop / Start </li></ul></ul></ul><ul><ul><li>Office SharePoint Services Help Search Service </li></ul></ul><ul><ul><ul><li>Stop / Start </li></ul></ul></ul><ul><li>Index Server(s) </li></ul><ul><ul><li>Office SharePoint Server Search Service </li></ul></ul><ul><ul><ul><li>Stop / Start </li></ul></ul></ul>
    124. 124. Operations – Backup and Restore <ul><li>Perform a backup </li></ul><ul><li>Restore from backup </li></ul>
    125. 125. Operations – Global Configuration <ul><li>Timer Job Definitions </li></ul><ul><ul><li>SharePoint Services Search Refresh </li></ul></ul><ul><ul><ul><li>Disable / Enable (Change and update WSS search configuration) </li></ul></ul></ul><ul><ul><li>Indexing Schedule Manager on MOSS </li></ul></ul><ul><ul><ul><li>Disable / Enable </li></ul></ul></ul><ul><li>Timer Job Status </li></ul><ul><ul><li>Succeeded / Failed </li></ul></ul>
    126. 126. Search Application Management <ul><li>Manage Search Service </li></ul><ul><ul><li>Farm-level Search settings </li></ul></ul><ul><ul><li>Proxy Server settings </li></ul></ul><ul><ul><li>Query and Index Servers </li></ul></ul><ul><ul><li>Server Listing and their Search service </li></ul></ul><ul><ul><li>Shared Service Providers with Search enabled </li></ul></ul><ul><ul><li>SSP name listing </li></ul></ul><ul><ul><li>Crawler Impact Rules </li></ul></ul>
    127. 127. Crawler Impact Rules <ul><li>Configured through Central Administration </li></ul><ul><li>Allows “throttling” of the indexer to reduce impact of a crawl on a particular server </li></ul><ul><li>Supports wildcards </li></ul><ul><li>Used in conjunction with crawl schedules </li></ul>
    128. 128. Crawler Impact Rules (cont) <ul><li>Use * as the site name to apply the rule to all sites </li></ul><ul><li>Use *.* as the site name to apply the rule to sites with a dot in their name </li></ul><ul><li>Use *.site_name.com as the site name to apply the rule to all sites in the site_name.com domain </li></ul><ul><li>Use *.top-level_domain_name (such as *.com or *.net) as the site name to apply the rule to all sites that end with a specific top-level domain name </li></ul><ul><li>Use ? to replace any single character in a rule </li></ul>
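The wildcard behavior listed above can be approximated with shell-style pattern matching. This is only an illustration of which host names a pattern would cover, not the crawler's actual matching code; `rule_applies` and the sample host names are hypothetical.

```python
from fnmatch import fnmatchcase


def rule_applies(site_pattern, host):
    """Return True if the host name matches the rule's site name pattern.

    '*' matches any run of characters and '?' matches exactly one,
    mirroring the wildcards listed on the slide.
    """
    return fnmatchcase(host.lower(), site_pattern.lower())


for pattern, host in [
    ("*", "intranet"),                 # all sites
    ("*.*", "intranet"),               # no dot in the name, so no match
    ("*.*", "www.contoso.com"),        # sites with a dot in their name
    ("*.contoso.com", "www.contoso.com"),
    ("*.com", "www.contoso.net"),      # wrong top-level domain
    ("moss-0?", "moss-01"),            # ? replaces a single character
]:
    print(pattern, host, rule_applies(pattern, host))
```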
    129. 129. Shared Services Provider (SSP) Management (SSP Administrators) (Content Oriented Administration)
    130. 130. Common Tasks <ul><li>Configure Search Settings </li></ul><ul><ul><li>Content Sources </li></ul></ul><ul><ul><li>Crawl Settings </li></ul></ul><ul><ul><li>Authoritative Pages Settings </li></ul></ul><ul><ul><li>Scopes </li></ul></ul>
    131. 131. Content Sources <ul><li>Represent an arbitrary container of information </li></ul><ul><li>Require at least one start address, although multiple start addresses can be provided </li></ul><ul><li>Start address cannot be reused </li></ul><ul><li>Requires a registered protocol handler </li></ul><ul><li>Five out-of-box content source types are available, mapping to the five out-of-box protocol handlers </li></ul>
    132. 132. SharePoint Content Source <ul><li>Includes SPS 2003, MOSS 2007, WSS v2, and WSS v3 sites </li></ul><ul><li>Can limit crawl to only sites specified in start address or all sites found below one or more provided hostnames </li></ul><ul><li>Crawler will use target site’s APIs to include security information around content in the index </li></ul><ul><li>For SPS 2003 content sources, crawler account requires “change” rights, which necessitates the crawler having administrator rights </li></ul><ul><li>Examples: sps3://moss-01/ or http://moss-01/sitecollection/ </li></ul><ul><li>Content sources decoupled from scopes </li></ul>
    133. 133. Web Site Content Source <ul><li>Any content source available over HTTP or HTTPS </li></ul><ul><li>If a SharePoint URL is provided, the crawler will detect this and index it as though it were a SharePoint content source (this can be overridden with crawl rules) </li></ul><ul><li>Page depth and server hops can be controlled </li></ul>
    134. 134. Web Site Content Source (cont) <ul><li>Security information around content is not included in index </li></ul><ul><li>Dynamic personalization will result in the index being populated with what the crawler is presented with </li></ul><ul><li>Example: http://website or http://www.somesite.com </li></ul>
    135. 135. File Shares Content Source <ul><li>Any content visible over a Windows server shared folder </li></ul><ul><li>Some non-Windows shares *may* be crawled, if that share can be presented as a Windows share (for instance, Samba with Linux, Services for Unix) </li></ul><ul><li>Start address can be the share root or subfolders beneath it </li></ul><ul><li>Security information is picked up by the gatherer </li></ul>
    136. 136. Exchange Public Folders Content Source <ul><li>Allows the indexer to crawl a public folder that exists on Exchange </li></ul><ul><li>Requires Outlook Web Access, as crawl is done over HTTP </li></ul><ul><li>Includes messages, conversations, and other collaborative content </li></ul><ul><li>URL presented in the search results will point to a deep link within OWA </li></ul><ul><li>Example: http://owa/public/folder </li></ul>
    137. 137. Business Data Content Source <ul><li>Allows the indexer to crawl metadata exposed through the Business Data Catalog </li></ul><ul><li>Can elect to include all Business Data Applications or a selected number of them </li></ul>
    138. 138. Lotus Notes Content Source
    139. 139. Crawling Schedules <ul><li>Allow administrator to indicate the frequency at which a content source will be re-crawled (daily, weekly, monthly) </li></ul><ul><li>Can indicate what time the content source should be crawled </li></ul><ul><li>Schedule should be driven by: </li></ul><ul><ul><li>Anticipated change at the content source (is this static content or content that is constantly changing) </li></ul></ul><ul><ul><li>Business expectations around when content changes should be reflected in the index </li></ul></ul><ul><li>Schedule can always be modified </li></ul>
    140. 140. Maximum File Size <ul><li>Default file size limit is 16MB </li></ul><ul><li>To change the limit, add a new DWORD registry entry named MaxDownloadSize under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager </li></ul><ul><li>Increase the timeout value as well to avoid timeout exceptions </li></ul><ul><ul><li>Change the value using the Manage Search Service page in Central Administration </li></ul></ul>
    141. 141. Crawl Rules <ul><li>Define exceptions to the “typical” crawl process </li></ul><ul><ul><li>Addresses can be pattern matched for special treatment </li></ul></ul><ul><ul><li>Support exclusion </li></ul></ul><ul><ul><li>Support altering the authentication mechanism </li></ul></ul><ul><li>Examples of Crawl Rules </li></ul><ul><li>Testing of Crawl Rules </li></ul>
    142. 142. Search Result Removal (From Live Index) <ul><li>Typically used when someone discovers something in the index that shouldn’t be there </li></ul><ul><li>Permits administrator to immediately remove that content from the index </li></ul><ul><li>Crawl rule automatically created to prevent that content from being indexed in the future </li></ul><ul><li>Restoring that content requires dropping the crawl rule and re-indexing </li></ul>
    143. 143. Default Content Access Account <ul><li>Account used for crawling, by default </li></ul><ul><li>Can be overridden in the Crawl Rules </li></ul><ul><li>Set the default account to use when crawling content </li></ul><ul><ul><li>Minimum crawler permission is “Full Read” (still provides the same security trimming functionality) </li></ul></ul><ul><ul><li>Automatically configured for new sites </li></ul></ul><ul><ul><li>Do not use an administrator account, so that unpublished versions of documents are not crawled. </li></ul></ul>
    144. 144. Metadata Property Mappings
    145. 145. Server Name Mapping <ul><li>Override how MOSS displays search results </li></ul><ul><li>Hide file path </li></ul><ul><li>Sample: “file://moss/HOL” to “http://moss.litwareinc.com” </li></ul>
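The effect of a server name mapping can be pictured as a prefix substitution applied to result URLs at display time. The sketch below only illustrates that effect using the sample mapping from the slide; the function name and the document path are hypothetical, and this is not how the product implements the feature.

```python
def apply_server_name_mapping(url, mappings):
    """Rewrite a crawled address to the address shown in search results.

    mappings is a list of (address_in_index, address_in_results) pairs;
    the first pair whose source prefix matches is applied.
    """
    for crawled, displayed in mappings:
        if url.lower().startswith(crawled.lower()):
            return displayed + url[len(crawled):]
    return url  # no mapping applies, show the address as crawled


mappings = [("file://moss/HOL", "http://moss.litwareinc.com")]
shown = apply_server_name_mapping("file://moss/HOL/docs/plan.docx", mappings)
print(shown)
```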
    146. 146. Search-based Alerts <ul><li>Can be Activated / Deactivated </li></ul><ul><li>Deactivated after a reset of crawled content </li></ul><ul><li>Users can subscribe to an alert on a search query </li></ul><ul><li>Alert is triggered if there are new or changed items that satisfy the search query </li></ul><ul><li>An item is considered changed if its content or metadata has changed </li></ul><ul><li>Timer service is used to issue all alert notifications (See User Alerts in Site Settings) </li></ul><ul><li>Frequency can be set to Daily / Weekly </li></ul><ul><li>“Alert Me” and RSS links can be added/removed using their Web Part property </li></ul>
    147. 147. Reset Crawled Content <ul><li>Powerful action! </li></ul><ul><li>Will delete the content index! </li></ul><ul><li>Search results will no longer be available on the farm until the index has been rebuilt! </li></ul><ul><li>Search alerts are deactivated unless the administrator unchecks the check box. </li></ul><ul><li>Alerts should be reactivated after a full crawl has been performed. </li></ul>
    148. 148. Specify Authoritative Pages <ul><li>Helps prioritize search results by influencing relevance: results that are linked to the authoritative pages benefit from a boost in rank. </li></ul><ul><ul><li>Most authoritative </li></ul></ul><ul><ul><li>Second-level authoritative </li></ul></ul><ul><ul><li>Third-level authoritative </li></ul></ul><ul><ul><li>Sites to demote </li></ul></ul>
    149. 149. Scopes <ul><li>Scopes are filters applied to search results to narrow the results of a search query </li></ul><ul><li>Types of Scopes </li></ul><ul><li>Scope Rules and Behaviors </li></ul><ul><li>Single-rule Scopes </li></ul><ul><li>Multi-rule Scopes </li></ul>
    150. 150. Site Collection Management (Site Collection Administrators) (Application Administrators)
    151. 151. Site Collection Administration Options <ul><li>Common Tasks </li></ul><ul><ul><li>Search Settings </li></ul></ul><ul><ul><li>Search Scopes </li></ul></ul><ul><ul><li>Search Keywords </li></ul></ul>
    152. 152. Search Settings <ul><li>Two Options </li></ul><ul><ul><li>Use the Search Center and custom scopes in the dropdown </li></ul></ul><ul><ul><li>The way to change standard Search Center URL for search boxes </li></ul></ul><ul><ul><li>Do not use the Search Center – no custom scopes </li></ul></ul>
    153. 153. Site Level Scopes <ul><li>Site Level Scopes display all scopes associated with a Site Collection </li></ul><ul><li>Display Scopes are a site-level feature that is purely UI </li></ul><ul><ul><li>Administrator – Combine multiple scopes into one selectable item </li></ul></ul><ul><ul><li>Visitors – UI Search dropdown box (or check boxes for the Advanced Search page) populated with the scopes included in the display group </li></ul></ul>
    154. 154. Keywords and Best Bets <ul><li>Prominently present editorially selected search results </li></ul><ul><li>Keywords: Glossary of important terms within your organization </li></ul><ul><li>Best Bets are associated with particular search keywords </li></ul><ul><li>Not available across site collections </li></ul>
    155. 155. Search Settings for Fields - NoCrawl <ul><li>Set a NoCrawl attribute on one or more columns within the site collection </li></ul><ul><li>Column content will not be indexed! </li></ul><ul><li>Associated with Site Columns (Content Types) </li></ul>
    156. 156. Search Visibility <ul><li>Site level </li></ul><ul><ul><li>Allow or deny the site to appear in search results. </li></ul></ul><ul><ul><li>If denied, the site will not be indexed. </li></ul></ul><ul><ul><li>Control ASPX pages within the site for visibility. Will take into consideration item’s specific permissions. </li></ul></ul><ul><li>List Level </li></ul><ul><ul><li>Allow or deny the list to appear in search results. </li></ul></ul><ul><ul><li>If denied, the list will not be indexed. </li></ul></ul><ul><li>Document Libraries and Folder Level </li></ul><ul><ul><li>Allow or deny the document library or folder to appear in search results. </li></ul></ul><ul><ul><li>If denied, the Document Library (or folder) will not be indexed. </li></ul></ul>
    157. 157. Search Usage Reports
    158. 158. Benefits of Search Queries and Results Reporting <ul><li>Allows Site and SSP Administrators to: </li></ul><ul><ul><li>Have a visual look at end-user queries through charts and graphs </li></ul></ul><ul><ul><li>Quickly quantify the success or failure of the optimizations they can make to crawlers and indexes </li></ul></ul><ul><ul><li>Export data to Microsoft Excel to further analyze and mine </li></ul></ul>
    159. 159. To Improve the Overall Search Experience One Must… <ul><li>Best way to improve search is to understand visitors’ current search usage! </li></ul><ul><ul><li>Understand what visitors are searching for </li></ul></ul><ul><ul><ul><li>Products, features, services, general Information about the company, etc. </li></ul></ul></ul><ul><ul><li>Understand if their search was successful </li></ul></ul><ul><ul><ul><li>Have they clicked on one of the results? </li></ul></ul></ul><ul><ul><ul><li>Were there any results – does content exist? </li></ul></ul></ul><ul><ul><ul><li>Were they offered suggestions specifically associated with their query? </li></ul></ul></ul><ul><ul><ul><li>Have they misspelled the words within their query? </li></ul></ul></ul>
    160. 160. Reporting Tools <ul><li>Two sets of reports </li></ul><ul><ul><li>Search Query Reports </li></ul></ul><ul><ul><li>Search Results Reports </li></ul></ul><ul><li>Two different levels of reports </li></ul><ul><ul><li>Shared Service Provider (SSP) </li></ul></ul><ul><ul><li>Site Collection </li></ul></ul><ul><li>Enabled by default </li></ul><ul><li>Enabled within the SSP </li></ul><ul><li>Queries from the Search Web Service and from custom Web Parts are not logged </li></ul><ul><li>Note: Data is stored in the SSP database </li></ul>
    161. 161. Reporting Tools <ul><li>At the SSP level </li></ul><ul><li>For enterprise content oriented administrators </li></ul>
    162. 162. Reporting Tools <ul><li>At the Site Collection level </li></ul><ul><li>For Site Collection administrators </li></ul>
    163. 163. Search Query Reporting – SSP <ul><li>Tracks Queries that users issued for all sites managed by this SSP </li></ul><ul><li>Five Different Reports </li></ul><ul><ul><li>Queries Over Previous 30 Days </li></ul></ul><ul><ul><li>Queries Over Previous 12 Months </li></ul></ul><ul><ul><li>Top Query Origin Site Collection Over Previous 30 Days* </li></ul></ul><ul><ul><li>Query for Scopes Over Previous 30 Days </li></ul></ul><ul><ul><li>Top Queries Over Previous 30 Days </li></ul></ul><ul><li>Also has Tabular View for most reports </li></ul>* Specific to SSP
    164. 164. Search Query Reporting – Site Collection <ul><li>Tracks Queries issued within this Site Collection </li></ul><ul><li>Four Different Reports </li></ul><ul><ul><li>Queries Over Previous 30 Days </li></ul></ul><ul><ul><li>Queries Over Previous 12 Months </li></ul></ul><ul><ul><li>Top Queries Over Previous 30 Days </li></ul></ul><ul><ul><li>Query for Scopes Over Previous 30 Days </li></ul></ul><ul><li>Also has Tabular View for most reports </li></ul>
    165. 165. Search Results Reporting – SSP <ul><li>Tracks Result Click Selections by users within the sites managed by this SSP </li></ul><ul><li>Five Different Reports </li></ul><ul><ul><li>Search Results Top Destination Pages </li></ul></ul><ul><ul><li>Queries with Zero Results </li></ul></ul><ul><ul><li>Most Clicked Best Bets </li></ul></ul><ul><ul><li>Queries With Zero Best Bets </li></ul></ul><ul><ul><li>Queries With Low Click-through </li></ul></ul>
    166. 166. Search Results Reporting – Site Collection <ul><li>Tracks Result Click Selections by users for this Site Collection </li></ul><ul><li>Five Different Reports </li></ul><ul><ul><li>Search Results Top Destination Pages </li></ul></ul><ul><ul><li>Queries with Zero Results </li></ul></ul><ul><ul><li>Most Clicked Best Bets (Editorial Results) </li></ul></ul><ul><ul><li>Queries With Zero Best Bets </li></ul></ul><ul><ul><li>Queries With Low Click-through </li></ul></ul>Same list of reports as for the SSP, but for the Site Collection
    167. 167. Exporting Results <ul><li>Export data for extended reporting in Excel and/or Excel Services </li></ul>
    168. 168. Questions?
    169. 169. Module 8 Performance, Scalability, and Capacity Planning
    170. 170. Module Agenda <ul><li>Introduction </li></ul><ul><li>Search Capacity Planning in SPS 2003 </li></ul><ul><li>MOSS 2007 Search Capacity Planning </li></ul><ul><ul><li>Topology </li></ul></ul><ul><ul><li>Querying </li></ul></ul><ul><ul><li>Indexing </li></ul></ul><ul><ul><li>Test Environment </li></ul></ul><ul><li>Real World Experiences </li></ul><ul><ul><li>Microsoft Intranet </li></ul></ul><ul><ul><li>Microsoft Technology Center Proof of Concept (PoC) </li></ul></ul>
    171. 171. MOSS 2007 Search Capacity Planning <ul><li>Improvement highlights </li></ul><ul><ul><li>Topology restrictions removed </li></ul></ul><ul><ul><li>Indexing limitations improved </li></ul></ul><ul><ul><li>Continuous propagation </li></ul></ul>
    172. 172. Topology <ul><li>Deployment options </li></ul><ul><ul><li>Collapse index and query services on the same server </li></ul></ul><ul><ul><li>Enable index service on one server and query service on one or more different servers </li></ul></ul><ul><li>For both options you can have only one index server </li></ul><ul><li>Scale up versus scaling out </li></ul>
    173. 173. Topology (cont) <ul><li>Topology restrictions from v2 removed </li></ul><ul><ul><li>Can mix indexer/search roles </li></ul></ul><ul><ul><li>Service can be managed after initial setup or later on </li></ul></ul><ul><li>Use mixed x86 and x64 hardware architectures </li></ul><ul><ul><li>Ifilter, Protocol Handler limitations </li></ul></ul><ul><li>Index server is very CPU intensive </li></ul><ul><li>Plan for availability requirements </li></ul>
    174. 174. Topology (cont) <ul><li>Topology Scaling Recommendations (for Search): </li></ul><ul><ul><li>Query servers: 8 per farm </li></ul></ul><ul><ul><li>Front end servers: 8 per farm </li></ul></ul><ul><ul><li>Index servers: 4 per farm </li></ul></ul>
    175. 175. MOSS 2007 Search Topology Indexer Load Balancer Propagation of indexes Content databases External content Web front ends Query servers User Requests Query servers separated from indexer
    176. 176. Querying <ul><li>Performance parameters </li></ul><ul><li>Scaling factors </li></ul>
    177. 177. Querying – Performance Parameters <ul><li>The network always affects the query performance that end users experience: </li></ul><ul><ul><li>When querying the index catalog, a front-end server always hits the SQL database to get information on search results and for security trimming. </li></ul></ul><ul><ul><li>When querying the property store, the query server is not involved, since the property store is now in the SQL Search database. </li></ul></ul>
    178. 178. Querying – Performance Parameters
    179. 179. Querying – Performance Parameters <ul><li>Query server memory: </li></ul><ul><ul><li>The more memory is available, the less the Search service will have to access the hard disk to satisfy a given query. </li></ul></ul><ul><ul><li>Ideally, enough memory should be installed on the query servers to accommodate the entire index. </li></ul></ul><ul><li>Query server disk speed: </li></ul><ul><ul><li>RAID 10 is recommended. </li></ul></ul>
    180. 180. Querying – Scaling Factors <ul><li>Processor architecture </li></ul><ul><ul><li>Use 64-bit servers </li></ul></ul><ul><li>Planning for performance: separate query from front-end </li></ul><ul><ul><li>Dedicated processor time </li></ul></ul><ul><ul><li>More RAM available for caching </li></ul></ul><ul><li>Planning for availability: add more than one query server in your farm </li></ul><ul><ul><li>This will require a dedicated machine for index, as described before </li></ul></ul><ul><ul><li>Tested maximum of eight query servers </li></ul></ul>
    181. 181. Indexing <ul><li>Planning </li></ul><ul><li>Performance optimization </li></ul><ul><li>Storage </li></ul><ul><li>Limitations </li></ul><ul><li>Scaling </li></ul>
    182. 182. Indexing Planning <ul><li>Customer environment </li></ul><ul><ul><li>Number of users </li></ul></ul><ul><ul><li>Network and connectivity </li></ul></ul><ul><ul><li>Dispersed locations </li></ul></ul><ul><ul><li>Expected workloads </li></ul></ul><ul><ul><ul><li>Pilot </li></ul></ul></ul><ul><ul><ul><li>Rollout plan </li></ul></ul></ul><ul><ul><li>Estimate indexing window </li></ul></ul>
    183. 183. Indexing Planning (cont) <ul><li>Corpus definition: </li></ul><ul><ul><li>A corpus is defined as the sum of all content that is being indexed. </li></ul></ul><ul><ul><li>This includes all valid content sources, like Web pages, items, documents, BDC, and any metadata and security information associated with this content. </li></ul></ul>
    184. 184. Indexing Planning (cont) <ul><li>For each content source estimate: </li></ul><ul><ul><li>Number of items </li></ul></ul><ul><ul><li>Storage used </li></ul></ul><ul><ul><li>Types of items </li></ul></ul><ul><ul><li>Security </li></ul></ul><ul><ul><li>Latency requirements </li></ul></ul><ul><ul><li>Connectivity </li></ul></ul><ul><ul><li>Estimate indexing window </li></ul></ul><ul><ul><li>Expected yearly growth </li></ul></ul>
    185. 185. Indexing - Performance Optimization <ul><li>Use dedicated front-end for best indexing performance </li></ul><ul><ul><li>No other services allowed on that server </li></ul></ul><ul><li>Adjust the indexing performance level </li></ul><ul><ul><li>Use Maximum for best performance </li></ul></ul><ul><li>Use Crawler Impact Rules </li></ul><ul><ul><li>Carefully test impact </li></ul></ul><ul><li>Continuous propagation </li></ul><ul><ul><li>Average time is 3 to 27 seconds </li></ul></ul><ul><li>WSS Change log for incremental crawls </li></ul>
    186. 186. Indexing - Performance Optimization <ul><li>Index server CPU: </li></ul><ul><ul><li>The more processors are available, the faster the crawl </li></ul></ul><ul><li>Index server memory: </li></ul><ul><ul><li>The greater the memory capacity, the more documents the crawler can process in parallel </li></ul></ul><ul><ul><li>More available memory improves crawl speed </li></ul></ul><ul><li>Index server disk speed: </li></ul><ul><ul><li>RAID 10 with 2 ms access time and greater than 150 MB/sec write speed </li></ul></ul>
    187. 187. Index Storage <ul><li>Planning index storage as ratio of corpus </li></ul><ul><li>Sizing depends on content in corpus </li></ul><ul><ul><li>Type of content source </li></ul></ul><ul><ul><li>Document formats </li></ul></ul><ul><ul><li>Level of metadata and security information </li></ul></ul><ul><ul><li>Plan for expected growth rates </li></ul></ul>
    188. 188. Index Storage (cont) <ul><li>Index / Query Server disk space requirements: </li></ul><ul><ul><li>Index catalog size is normally in the range of 5% to 12% of corpus size </li></ul></ul><ul><ul><li>Recommended initial disk space is a minimum of 2.5 times the index catalog size </li></ul></ul><ul><ul><li>That means: recommended initial disk space is at least 30% of indexed corpus size (2.5 × 12%) </li></ul></ul>
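The sizing rule on this slide is simple arithmetic, sketched here for a corpus of a given size. The function name and the 1,000 GB example corpus are illustrative; the 5% to 12% range and the 2.5 multiplier come from the slide.

```python
def index_disk_estimate_gb(corpus_gb, catalog_ratio=0.12, headroom=2.5):
    """Estimate index catalog size and recommended initial disk space.

    catalog_ratio: index catalog as a fraction of corpus size
                   (the slide gives a normal range of 0.05 to 0.12).
    headroom:      recommended disk space as a multiple of catalog size.
    """
    catalog_gb = corpus_gb * catalog_ratio
    return catalog_gb, catalog_gb * headroom


# Worst case of the stated range for a 1,000 GB corpus.
catalog_gb, disk_gb = index_disk_estimate_gb(1000)
print(round(catalog_gb), round(disk_gb))  # about 120 GB catalog, 300 GB disk

# Best case of the stated range for the same corpus.
low_catalog_gb, low_disk_gb = index_disk_estimate_gb(1000, catalog_ratio=0.05)
print(round(low_catalog_gb), round(low_disk_gb))
```

Planning with the upper end of the range reproduces the 30% figure: 12% of the corpus times 2.5.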
    189. 189. Index Storage (cont) <ul><li>Search database </li></ul><ul><ul><li>Contains metadata, ACLs, hit highlighting, crawl history, and usage reports </li></ul></ul><ul><ul><li>Estimated 2K per crawled document </li></ul></ul><ul><ul><li>Sizing depends on corpus content </li></ul></ul><ul><ul><li>Requires more space than the index catalog </li></ul></ul><ul><ul><li>Recommended initial disk space is a minimum of 4 times the index catalog size </li></ul></ul>
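A small sketch of the two search database figures on this slide. It assumes that "2K per crawled document" means roughly 2 KB of database space per document, which the slide does not spell out; the function name and the example numbers are hypothetical.

```python
def search_db_estimates_gb(catalog_gb, crawled_docs,
                           kb_per_doc=2, catalog_multiplier=4):
    """Return (recommended_disk_gb, per_document_estimate_gb).

    recommended_disk_gb:      4 times the index catalog size (from the slide).
    per_document_estimate_gb: crawled_docs * 2 KB, under the assumption
                              stated above.
    """
    recommended_disk_gb = catalog_gb * catalog_multiplier
    per_document_estimate_gb = crawled_docs * kb_per_doc / (1024 * 1024)
    return recommended_disk_gb, per_document_estimate_gb


disk_gb, data_gb = search_db_estimates_gb(catalog_gb=120,
                                          crawled_docs=10_000_000)
print(disk_gb, round(data_gb, 1))
```

The two numbers answer different questions: the first is a disk provisioning recommendation, the second a rough data-size estimate, so they are not expected to agree.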
    190. 190. Index Capacity Limitations <ul><li>Supported limit for a single index server is 50 million documents </li></ul><ul><ul><li>In this scenario we recommend only one Index server per farm </li></ul></ul><ul><li>One index server per SSP </li></ul><ul><ul><li>More SSPs can use the same indexer </li></ul></ul><ul><li>All MOSS 2007 for Search Editions are limited to one SSP per farm </li></ul><ul><li>MOSS 2007 is limited to 20 SSPs per farm </li></ul><ul><li>MOSS 2007 for Search Standard Edition limited to 500,000 documents per farm </li></ul>
    191. 191. Index Scaling <ul><li>First scale up (recommended) </li></ul><ul><ul><li>Optimal ranking and user experience </li></ul></ul><ul><ul><li>Best manageability </li></ul></ul><ul><ul><li>Scale up system resources </li></ul></ul><ul><ul><ul><li>Use x64 architecture </li></ul></ul></ul><ul><ul><ul><li>Add more CPUs to increase performance </li></ul></ul></ul><ul><ul><ul><li>Plan for minimum 4GB of memory </li></ul></ul></ul><ul><ul><ul><li>RAID 10 is recommended for optimal disk speeds </li></ul></ul></ul>
    192. 192. Index Scaling <ul><li>Scale out </li></ul><ul><ul><li>Add multiple SSPs each crawling unique parts of the corpus </li></ul></ul><ul><ul><li>Complete isolation between SSPs </li></ul></ul><ul><ul><li>Querying across multiple SSPs to get a single relevant results set is not possible </li></ul></ul><ul><ul><li>Tested maximum of four index servers per farm </li></ul></ul><ul><li>Recommended limit per farm across all indexes is 50 million items </li></ul><ul><ul><li>For scenarios higher than 50 million items, add more farms </li></ul></ul>
    193. 193. Test Environment <ul><li>Establish a starting point topology </li></ul><ul><li>Use monitoring to establish actual performance and capacity data </li></ul><ul><ul><li>Use Performance Monitor to collect processor, memory, and disk information for each server </li></ul></ul><ul><ul><li>Look for resource bottlenecks </li></ul></ul><ul><ul><ul><li>Scale up available resources </li></ul></ul></ul><ul><ul><ul><li>Scale out server roles </li></ul></ul></ul>
    194. 194. Real World Experiences <ul><li>Microsoft Intranet </li></ul><ul><li>Microsoft Technology Center PoC </li></ul>
    195. 195. Microsoft Intranet <ul><li>Environment </li></ul><ul><ul><li>Estimate of indexed content: around 12 TB in SharePoint Content Databases (mix of 2003 / 2007), unknown size outside of this environment </li></ul></ul><ul><li>Total size of the index </li></ul><ul><ul><li>SSP search database ~282GB </li></ul></ul><ul><ul><li>SSP profiles database ~51GB </li></ul></ul><ul><ul><li>Index size on disk ~156GB </li></ul></ul><ul><li>Total number of objects </li></ul><ul><ul><li>23 million objects </li></ul></ul><ul><ul><li>30 content sources, 6 with daily crawls </li></ul></ul><ul><li>Typical 'real world' query response time from this implementation </li></ul><ul><ul><li>~2 seconds, although the product group is looking into ways we can optimize this for our environment </li></ul></ul>
    196. 196. Microsoft Technology Center PoC <ul><li>Objectives </li></ul><ul><ul><li>Indexing large numbers of secure files on file shares </li></ul></ul><ul><ul><li>Verify MOSS 2007 search architecture </li></ul></ul><ul><ul><li>Test and recommend capacity planning and scale </li></ul></ul>
    197. 197. Topology Indexed corpus Search db Index catalog Propagated catalog 1TB 23GB 25GB
    198. 198. Results <ul><li>For the biggest test run, which included indexing 2.4 million secure files, here are the key metrics: </li></ul><ul><ul><li>Full first-time indexing of entire corpus took 23.1 hours. </li></ul></ul><ul><ul><li>Incremental crawls, where 4.7% of the corpus was updated, took 3.7 hours. </li></ul></ul><ul><ul><li>The index was 2.4% of the corpus size, and the search database was 2.1%. </li></ul></ul><ul><ul><li>Average indexing rate during the full corpus crawl was 1,642 files/minute. </li></ul></ul>
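The metrics above can be turned into a rough indexing-window estimate, which is the kind of calculation the planning slides call for. The sketch uses the slide's rounded figures, so the full-crawl estimate lands near, not exactly on, the reported 23.1 hours; the function names are hypothetical.

```python
def crawl_hours(items, items_per_minute):
    """Estimated crawl duration in hours at a steady indexing rate."""
    return items / items_per_minute / 60.0


def items_per_minute(items, hours):
    """Average indexing rate for a crawl of known size and duration."""
    return items / (hours * 60.0)


corpus_items = 2_400_000           # rounded figure from the slide
full_crawl = crawl_hours(corpus_items, 1642)
print(round(full_crawl, 1))        # a little over 24 hours

# Incremental crawl: 4.7% of the corpus changed, crawl took 3.7 hours.
incremental_rate = items_per_minute(corpus_items * 0.047, 3.7)
print(round(incremental_rate))     # roughly 500 items per minute
```

Under these numbers the incremental crawl processed changed items at a lower per-minute rate than the full crawl, which is worth keeping in mind when estimating an incremental window from a full-crawl rate.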
    199. 199. Results (cont)
    200. 200. Summary of Known Limits and Restrictions <ul><li>Tested recommendation of 50 million items per farm </li></ul><ul><li>Hard limits: </li></ul><ul><ul><li>1 indexer per SSP </li></ul></ul><ul><ul><li>20 indexes per MOSS 2007 farm </li></ul></ul><ul><ul><li>1 index per MOSS 2007 for Search farm </li></ul></ul><ul><ul><li>500 content sources per SSP </li></ul></ul><ul><ul><li>500 start addresses per content source </li></ul></ul><ul><ul><li>500,000 documents limit for MOSS 2007 for Search Standard Edition </li></ul></ul>
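A minimal sketch of checking a planned SSP configuration against the limits summarized above. The limit values come from the slide; the data layout (a dictionary of content source names to start addresses), the function name, and the sample plan are assumptions made for illustration.

```python
# Limits as stated on the summary slide.
MAX_CONTENT_SOURCES_PER_SSP = 500
MAX_START_ADDRESSES_PER_SOURCE = 500
RECOMMENDED_ITEMS_PER_FARM = 50_000_000


def check_ssp_plan(content_sources, planned_items):
    """Return a list of human-readable problems; empty if the plan fits.

    content_sources: dict mapping content source name -> list of start addresses.
    planned_items:   expected number of indexed items for the farm.
    """
    problems = []
    if len(content_sources) > MAX_CONTENT_SOURCES_PER_SSP:
        problems.append("too many content sources: %d" % len(content_sources))
    for name, addresses in content_sources.items():
        if len(addresses) > MAX_START_ADDRESSES_PER_SOURCE:
            problems.append("too many start addresses in %s: %d"
                            % (name, len(addresses)))
    if planned_items > RECOMMENDED_ITEMS_PER_FARM:
        problems.append("planned items exceed the tested recommendation: %d"
                        % planned_items)
    return problems


plan = {"Intranet": ["http://moss-01/"], "Shares": ["file://fs-01/docs"]}
print(check_ssp_plan(plan, 23_000_000))   # fits within the limits
print(check_ssp_plan(plan, 60_000_000))   # flags the item count
```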
    201. 201. Capacity Planning References <ul><li>Planning for performance and capacity: </li></ul><ul><ul><li>http://technet2.microsoft.com/Office/en-us/library/eb2493e8-e498-462a-ab5d-1b779529dc471033.mspx </li></ul></ul><ul><li>Plan for software boundaries: </li></ul><ul><ul><li>http://technet2.microsoft.com/Office/en-us/library/6a13cd9f-4b44-40d6-85aa-c70a8e5c34fe1033.mspx </li></ul></ul><ul><li>Estimate performance and capacity requirements for search environments </li></ul><ul><ul><li>http://technet2.microsoft.com/Office/en-us/library/5465aa2b-aec3-4b87-bce0-8601ff20615e1033.mspx </li></ul></ul>
    202. 202. Questions?