MS Fast Search Server


Microsoft FAST Search Server 2010 for SharePoint


  • Example and a picture
  • Tony Hart & Mark Stone are working on user context. Keyword: for a manual approach. Advanced: rank profiles that are contextually aware.
  • Structure under the hood within SharePoint, done in layers. Content: collected, processed, indexed. Queries: federated, processed, searched, results passed back. Also, have developed a structure where people search is different: it uses the profile store structure, relevance tuned for people, and lots of cool stuff. Leveraged these layers to ADD FAST to SharePoint: shunt content, federate queries. Remember, will see this in the user experience, as IT Pros, as Developers. See this in admin: FAST connector.
  • Query Pipeline: define. OM path used when the user sends a query and results are returned. Web Service enhancements (FAST): refinement data; query suggestions; click logging; SQL/FQL syntax; query options (phonetic, nickname, relevance model).
  • Time: 1 minute. Speaker notes: Patent searches typically involve very precise query terms. Note the use of fielded search and Boolean operators. Sophisticated queries are not required, though. Note the use of visual refiners as bar charts. Also application code to Save Query, etc. And the rich search results with structured patent field attributes like Pub#, Assignee, Inventor, …
  • Using new Services architecture. Blue is core SharePoint. Orange is what is added with FAST.
  • Include both quad and six cores (FS14 loves multicores). Disk setups: There are now 1TB 7200 RPM 3.5” SAS disks out that are quite attractively priced, not much more than the 300GBs. We are soon testing e.g. a Dell R510 with 12x1TB in RAID10 (since we do not need 10+TB disk space anyhow). Since FS14 is less IOPS demanding than ESP, they should be able to hold up. I have a setup in RAID50@NPG2 now (12x2TB, 20TB effective) for the “how far can we get” testing, and it’s the CPU rather than IOPS that holds back performance. RAID 10 should be even better for perf. These disks (at least when combined with the Dell H700 RAID controller) also have an amazing read/write bandwidth in RAID, e.g. ~900MB/s sustained for bulk reads/writes on the RAID50 above.
  • MAIN COMPONENTS, NEXT SLIDE SHOWS HOW TO SCALE OUT. Web Analyzer: Explain that it’s there for improved relevancy (the PageRank algorithm, etc.)
  • Thomas M
  • Notice: Two alternatives on SP side: replicated components for HA; split SP/SQL environment for better performance.
  • Notice: Scaling of the main components is based on QPS and content volume. Scaling of the People Crawl component is based on QPS and # of people (employees). The FAST farm can potentially be 1 (all) + 1 (indexer/search), but that is not good for scaling out and can give poor performance if there is heavy work on the Web Analyzer process. Ref people crawl/query SSA: These cannot be split, but can be scaled across multiple servers. The scaling for these is fully independent of the content volume on the FS14 backend, and they should thus be scaled separately. Use (or point to) guidance from Search Server for scaling these along the two orthogonal axes: QPS and number of people. Additional servers on the Crawler side are only needed to ensure feeding redundancy.
  • Notes: Here you actually start to need 2x crawl in SP not only for redundancy, but also for network throughput (unless they have 10Gbit crawl <-> CD). A 3-column setup needs more than 1Gbit/s of data to utilize the system during feeds. “Scale out” strategy: columns for data volume and rows for redundancy/query volume. 3 columns for 45M docs; 1 adm; 1 WA. Database nodes represent databases and, depending on the throughput of the database, this can run on one database server.
  • SQL server needs to be scaled for increased IO. # of Web Analyzers depends on amount of clicks; 2 or 3 needed for 100M index.
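
The "columns for data volume, rows for redundancy/query volume" rule in the notes above can be turned into a rough sizing calculation. The Python sketch below is illustrative only: it uses the approximate per-node figures quoted on slide 40 of the transcript (normal mode ~15 million items and ~25 QPS; high density mode ~40 million items and ~7 QPS). It assumes that each search row serves roughly the per-node QPS figure and that a redundant farm needs at least two rows. The names `estimate_farm` and `FarmEstimate` are hypothetical, not part of any FAST tooling.

```python
from dataclasses import dataclass

# Approximate per-node figures from slide 40 of the deck.
MODES = {
    "normal": {"items_per_node": 15_000_000, "qps_per_row": 25},
    "high_density": {"items_per_node": 40_000_000, "qps_per_row": 7},
}


@dataclass
class FarmEstimate:
    columns: int      # index partitions, driven by content volume
    rows: int         # replicas of the whole index, driven by QPS and redundancy
    index_nodes: int  # total indexer/search nodes = columns * rows


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division without floating point."""
    return -(-a // b)


def estimate_farm(total_items: int, peak_qps: int,
                  mode: str = "normal", redundant: bool = True) -> FarmEstimate:
    spec = MODES[mode]
    columns = max(1, ceil_div(total_items, spec["items_per_node"]))
    # Assumption: one row handles about the quoted per-node QPS.
    rows = max(1, ceil_div(peak_qps, spec["qps_per_row"]))
    if redundant:
        # Assumption: redundancy means at least one extra copy of the index.
        rows = max(rows, 2)
    return FarmEstimate(columns, rows, columns * rows)


if __name__ == "__main__":
    # 45M documents in normal mode gives 3 columns, matching the notes above.
    print(estimate_farm(45_000_000, peak_qps=20, redundant=False))
```

The result covers only indexer/search nodes. Admin, Web Analyzer, crawl and database servers are sized separately, as the notes describe.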

Transcript

  • 1. FAST Search for SharePoint. Brad Freels, Technology Specialist, bfreels@microsoft.com
  • 2. Microsoft’s Search Vision & Strategy. Search is Everywhere: Desktop, Enterprise, Internet, Devices. Big Bet: Enterprise, Internet. Consumer Portals / Partner Portals; Employee Productivity; High Value Search; Marketing / B2B / …; Monetization; eDiscovery; People & Expertise; Connect to all your Content; Research Portals; 360° customer views; Interactive Visual Search; Competitive Intelligence; Personalization; Social Networks; … Best of Microsoft, Best of SharePoint, Best of High End Search
  • 3. Introducing FAST Search for SharePoint; OOB User Experience; Tailoring General Productivity; Search Platform and Architecture; Search Driven Applications; Deployment and Administration; Summary and Resources
  • 4. Microsoft Enterprise Search: The 2010 Wave. General productivity search; customized productivity search; light customization and search driven applications. Common across the product line: • UI Framework • Connector Framework (BDC) • Social search features and integration • APIs and developer experience • SharePoint platform integration • Admin & deployment capabilities • End user and site administrator enablement • Operations advantages (SCOM, scripting)
  • 5. SharePoint vs. FAST Search for SharePoint. User Interface; Central Administration; Crawler and Connector. • Query and Result Processing • Content Processing Pipeline • Customizability and Scalability
  • 6. SharePoint and FAST Search Server
  • 7. Introducing FAST Search for SharePoint; OOB User Experience; Tailoring General Productivity; Search Platform and Architecture; Search Driven Applications; Deployment and Administration; Summary and Resources
  • 8. Get better answers, faster, using a visual, interactive search experience. Sorting on any property; Query completion; Related searches & people; Document thumbnails; Scrolling previews; Read in Office Web Apps; Federated results
  • 9. Connect with people and expertise, and streamline how you find and collaborate with others. Filter by title, expertise & other attributes; Phonetic name lookup; Expertise matching; Real-time presence; Org browsing; Find recent content
  • 10. Deep Refinement and Sorting. Enables precise control of results. Out of the Box. Deep refinement: enables conversational experience across all of the results; you will never miss any content; enabling better findability and exploration; discover non-obvious relationships across the entire result set. Exact counts: shows relative weight; provides analytic view of your results; indicates priority and importance; the right lever to slice and dice your content. Sort on any field. Sorting options: empower the user to use the relevance model that best fits their needs; rearrange the result set to meet specific criteria; alphabetical, numeric, and date
  • 11. Demo
  • 12. Introducing FAST Search for SharePoint; OOB User Experience; Tailoring General Productivity; Search Platform and Architecture; Search Driven Applications; Deployment and Administration; Summary and Resources
  • 13. Customize search to meet your business needs. Deliver results that are contextually relevant; search in the language of your business; tune relevancy to improve accuracy; create structure from unstructured content; configure the UI to extend your application; similarity search
  • 14. Custom Query Suggestions
  • 15. Visual Best Bets. Identify static content that is always relevant. Callouts: Set Vertical Orientation; Visual Notification; Visual Best Bets. Built on SharePoint Keywords: matches keywords and synonyms that are contextually relevant to users; include banners, videos, external websites. Easy and quick to set up: point and click setup for site admins; set and forget with content expiration dates; Web Parts allow for easy page customization
  • 16. Audience-specific search experiences. Use User Context to meet the needs of diverse groups. Renee Lo, Engineering, Contoso Consulting: ”What should I know about implementing ERP?” Alan Brewer, Sales Manager, Contoso Consulting: ”What should I know about selling ERP consulting?” User context: Information context; Social context; Application context. Username & Group Memberships; Business Unit; Preferred Sites; Department; SharePoint Audiences; Location; Team; Interests & Current Projects; Languages; Time of Day; Context of Current Task
  • 17. Quickly build a contextual experience for your users. User based tools for creating results that are relevant. One-way synonyms: keywords map to other terms. Two-way synonyms: keywords become equivalent to other terms. Best Bets: highlights key resources that are always relevant to a keyword. Visual Best Bets: extend Best Bets with pictures, video, Silverlight controls. Document Promotion / Demotion: tailor specific document relevancy. Create new keywords; pick the right ingredients; create user contexts; match the proper terms and contexts to boost relevancy. Site administrators have powerful and simple tools to configure the search experience and ensure your users are always finding the right content. Create contexts based on user profiles to deliver relevant results to targeted groups of users
  • 18. Search in the language of your business. Identify what is important to improve the search experience. Use language that has specific meaning to your business: users can quickly refine content using familiar terms; build confidence that you found the correct answers the first time. Leverage corporate knowledge to make content findable: corporate taxonomies; business terminology; product names; competition; acronyms. Define custom rules to identify unique terms: handle complex terms such as part numbers or forms; searching for ”XXX 123 abc“ finds “XXX-123-abc“ and “GG^XXX-123-abc_HH“. Word cloud: mobile workforce, direct mail, revenue, merger, communications, Taxonomy, cloud computing, supply chain, audit, Productivity, best practices, XML, archive, acquisition, storage, cost savings, Social Media, Profit, Strategy Development, customer relations, market share, IP Telephony, quality, SOX compliance, risk, target markets, part numbers, brand management, Global presence, Disaster Recovery
  • 19. Introducing FAST Search for SharePoint; OOB User Experience; Tailoring General Productivity; Search Platform and Architecture; Search Driven Applications; Deployment and Administration; Summary and Resources
  • 20. FAST Search Extends SharePoint
  • 21. FAST Search for SharePoint: High Level Architecture
  • 22. FAST Search for SharePoint: High Level Architecture. Diagram labels: FAST Query SSA; FAST Connector SSA
  • 23. Extensible Content Processing. Enables search that has a deep understanding of your information. Stages: Properties Mapper; Vectorizer; Entity Extractor; Lemmatizer; Format Converter; …; Web Analyzer; Date/Time Normalizer; Word Breaker; Language Detector
  • 24. How does the pipeline work? A systematic approach to interpreting your content. Sequential stages perform specific tasks while ingesting content. Breaks down content to the smallest addressable chunks to build meaning. Understands file encoding, data formats, and written languages. Supports 400+ file formats, 80+ languages. Process your content to make it searchable. Normalizes content so that a consistent relevancy model can be applied. Identifies structured and unstructured metadata in your content. Maps document metadata to SharePoint Crawled Properties. Stages: Language Detection and Encoding; Format Conversion; Tokenization; Lemmatization; Entity Extraction; Date and Time Normalization; Web Link Analysis; Document Vector; Map Crawled Properties. Examples from the stage descriptions: lemmatization maps words such as "running" and "ran" to a single lemma ("run"); entity extraction ships with out-of-the-box dictionaries for People, Companies, and Locations and can be extended to any category; date and time normalization knows that 14-Mar-10 and March 14, 2010 are equivalent; tokenization can use custom word breakers for terms such as part numbers or telephone numbers.
  • 25. Extending Pipeline capabilities. Straightforward way to add custom text analysis functionality. Configure Optional Processing Steps: XML Properties mapper; Offensive Content Filter; Field Collapsing; Verbatim (whole word) extractor, which uses a dictionary for custom extraction; Pipeline Extensibility, which calls external applications for custom item processing. Add Custom Processing: Pipeline Extensibility is a specially defined stage that takes a set of crawled properties, as flat text, as input and maps output to another crawled property. Sandboxed execution: executable arguments and temporary files are automatically handled with timeouts. Runs just before the Crawled Property Mapper, providing accessibility within SharePoint
  • 26. Powerful Entity Extraction. Enables search-driven navigation that is relevant to your business. PRODUCT; CONCEPT; COMPANY
  • 27. Tune relevancy to improve accuracy. Changing content and user needs require a flexible solution. Start with great relevance OOB: tuned for great general productivity experience; automatically improves relevancy with social click-throughs and link text analysis. Create new relevance models: Standard Sorting Options; Multiple Rank Profiles (blend static and dynamic ranking parameters to instantly improve search results); Custom Rank Profiles (create with simple PowerShell commands; expose as new sorting options)
  • 28. Tunable Relevance. Enables unique, business-specific search results for diverse roles
  • 29. Search 2010 “Stack”: The platform for Search Customization
  • 30. Robust query language. Use FAST Query Language (FQL) for precise query development. FQL provides a robust and expressive query language. Wildcard support: *, ?. Data types (Integer, Float, Decimal, Datetime). Direct field access (e.g., title:othello, author:shakespeare). Operators: Numeric (COUNT, RANGE, <, <=, >, >=); Boolean (AND, OR, ANY, NOT); Rank (RANK, XRANK); Proximity (NEAR, ONEAR); Sorting (SORT, SORTFORMULA); String (operator support for strings); Boundary (starts-with, ends-with, equals); Filter
  • 31. Introducing FAST Search for SharePoint; OOB User Experience; Tailoring General Productivity; Search Platform and Architecture; Search Driven Applications; Deployment and Administration; Summary and Resources
  • 32. Search Driven Applications. Meet all the search application needs you have across your business. Sales: 360° Customer Insight; Services: Knowledge Browser; Marketing: Competitive Intelligence; Research & Development: Innovation Portal; Support: Call Center Advisor; Operations: Systems/Logistics Portal; Legal, HR, IT, Finance, … “How do I support the unique search needs of teams and work that impact our business?” To do so, you need a search platform that has • A deep understanding of your information • Flexible relevance to meet diverse needs • A customizable UX to
  • 33. real estate risk
  • 34. Callouts: Top information from Woodgrove… new market view report to send to clients. Set of customers to explore, with rollup. Experts to help, with availability and rating. View of information across different pivots, with drilldown. Immediate actions on selected items. News and external opinion to monitor and send to clients. Drilldown to single view with all clues about a customer: portfolio, holdings, communications, annual and quarterly customer plans, etc.
  • 35. How would you create this? Content crawling: bring in data from lots of places. OOB connectors to SharePoint (reports, account documents), Exchange public folders, shared files; BDC with customization in SPD (no code) for customer portfolio/holdings. Content processing: creating metadata. Names of holdings, offerings, key concepts, companies, people; synonyms for key concepts (real estate ~ REIT). OOB web parts configured for style: Federation, People Search, Search actions. Custom web parts for visual navigation. Roll-up configured via results collapsing. Custom relevance profile. SharePoint workflows for act-on-selected-items
  • 36. Introducing FAST Search for SharePoint; OOB User Experience; Tailoring General Productivity; Search Platform and Architecture; Search Driven Applications; Deployment and Administration; Summary and Resources
  • 37. Secure, unified access to information. Index or federate with content, applications, and services. Diagram labels: User Experience; OpenSearch Federation; Search Index; Indexing Connectors; Enterprise Content; Business Applications; Information Services
  • 38. FAST Search for SharePoint
  • 39. FAST Search HW: Best Practices. Admin / Processing Server: CPU: 2 x 2GHz+ (quad/six core); Memory: 24-48 GB; Disk: 2 x 300 GB, SAS, 10K RPM (RAID 1). Storage Server: CPU: 2 x 2GHz+ (quad/six core); Memory: 24-48 GB. Disk alternatives: 1.0 TB: 8 x 300 GB, SAS, 10K RPM (RAID 10); 1.8 TB: 8 x 300 GB, SAS, 10K RPM (RAID 5); 3.6 TB: 16 x 300 GB, SAS, 10K RPM (RAID 5+0); New: 7.2 TB: 16 x 600 GB, SAS, 10K RPM (RAID 5+0); SAN: configured for “database performance”
  • 40. FAST Search: Main Components. SharePoint Crawler (diagram: SP Crawl, People Crawl, Crawl DB): capacity ~30 mill items per crawler node; SQL server needs to be scaled for high IO. Web Analyzer (diagram: FAST-WA-1): CPU/disk footprint can vary by a factor of 10 depending on the content (number of links, length of links, internal cross link ratio); average capacity ~30 mill items per web analyzer node; can be deployed with the Indexer in normal scenarios. Indexer/search node (diagram: FAST-FSTIDX-11): two supported models. Normal mode: ~15 mill items per node, ~25 QPS. High Density Mode: ~40 mill items per node, ~7 QPS
  • 41. Rows and Columns. Columns give you more indexing: need more Doc Processors and Content Distributor roles. Rows give you more query and redundancy: more Query roles
  • 42. FAST Search: Pilot/Dev Deployment. SP2010 Farm: All roles. FAST Search for SP 2010 Farm: All roles
  • 43. FAST Search – Extra Small Farm SP2010 Farm FAST Search for SharePoint 2010 Farm Admin Web Front End Index (Search) Query Content Distributor SP Crawl Indexing Dispatcher People Crawl Web Analyzer SQL Server 4 Docprocs+ (Index) Search Web Front End Content Distributor Query Indexing Dispatcher SP Crawl Web Analyzer People Crawl 4 Docprocs+ SQL Server
  • 44. FAST Search – Small Deployment SP2010 Farm FAST Search for SharePoint 2010 Farm * Admin Index (Search) Web Front End Web Front End Content Distributor Content Distributor Query Query Indexing Dispatcher Indexing Dispatcher 12 Docprocs+ 12 Docprocs+ Web Analyzer Web Analyzer QR Server * * SP Crawl SP Crawl People Crawl People Crawl (Index) Search QR Server Search Admin DB Crawl DB SharePoint SQL 2008 Cluster Note: Servers marked with * are only needed for high availability
  • 45. FAST Search – Medium Deployment SP2010 Farm FAST Search for SharePoint 2010 Farm Admin Index (Search) Index (Search) Index (Search) WFE WFE Content Distributor Content Distributor Web Analyzer Web Analyzer Query SSA Query SSA Web Analyzer Web Analyzer Indexing Dispatcher Indexing Dispatcher 12 Docprocs+ 12 Docprocs+ 12 Docprocs+ 12 Docprocs+ SP Crawl SP Crawl (Index) Search (Index) Search (Index) Search People Crawl People Crawl QR Server QR Server QR Server Search Admin DB Crawl DB SharePoint DB SQL 2008 Cluster
  • 46. FAST Search – Large Deployment SP2010 Farm Web Front End Web Front End SP Crawl SP Crawl Query Query People Crawl People Crawl Search Admin DB Crawl DB SharePoint SQL 2008 Cluster FAST Search for SharePoint 2010 Farm Admin Index (Search) Index (Search) Index (Search) Index (Search) Index (Search) Index (Search) ConfigServer Content Distributor Indexing Dispatcher Indexing Dispatcher Web Analyzer Web Analyzer Web Analyzer Content Distributor Web Analyzer Web Analyzer Web Analyzer 12 Docprocs+ 12 Docprocs+ 12 Docprocs+ Web Analyzer 12 Docprocs+ 12 Docprocs+ 12 Docprocs+ 12 Docprocs+ (Index) Search (Index) Search (Index) Search (Index) Search (Index) Search (Index) Search QR Server QR Server QR Server QR Server QR Server QR Server
  • 47. Introducing FAST Search for SharePoint; OOB User Experience; Tailoring General Productivity; Search Platform and Architecture; Search Driven Applications; Deployment and Administration; Summary and Resources
  • 48. Tools: QR Server. Neil Richard’s blog post “Enabling the QR Server”: http://tinyurl.com/3b9ren4
  • 49. FAST Search for SharePoint Query Tool: http://fastforsharepoint.codeplex.com/ Connect to the web app running the FAST SSA (SP box). Use it to test FQL
  • 50. Useful Resources. FAST University Training. MSDN & TechNet. Blogs: Leonardo De Souza’s Blog, http://searchunleashed.wordpresss.com; Thomas Svensen’s Blog, http://blogs.msdn.com/b/thomsven/; Comperio Search Nuggets, http://nuggets.comperiosearch.com/. Books
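
The FQL operators listed on slide 30 compose as nested, function-style expressions, which makes them easy to generate from code before pasting into a query tool such as the one on slide 49. The Python sketch below only builds query strings; it does not talk to SharePoint. The helper names (`term`, `fql_and`, `fql_or`, `near`, `xrank`) are hypothetical. The backslash escaping of quotes and the `N=` and `boost=` argument forms are assumptions that should be checked against the FQL reference before use.

```python
from typing import Optional


def term(value: str, field: Optional[str] = None) -> str:
    """Render a string operand, optionally scoped to a managed property."""
    # Assumption: embedded quotes and backslashes are backslash-escaped.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    expr = f'string("{escaped}")'
    return f"{field}:{expr}" if field else expr


def fql_and(*operands: str) -> str:
    """Boolean AND over already-rendered operands."""
    return "and(" + ", ".join(operands) + ")"


def fql_or(*operands: str) -> str:
    """Boolean OR over already-rendered operands."""
    return "or(" + ", ".join(operands) + ")"


def near(*operands: str, n: int = 4) -> str:
    """Proximity match: operands within n tokens of each other, any order."""
    return "near(" + ", ".join(operands) + f", N={n})"


def xrank(main: str, boost_expr: str, boost: int = 100) -> str:
    """Keep the result set of `main`, but boost items matching `boost_expr`."""
    return f"xrank({main}, {boost_expr}, boost={boost})"


if __name__ == "__main__":
    # Fielded search from slide 30: title:othello AND author:shakespeare,
    # with matches that also mention "tragedy" ranked higher.
    base = fql_and(term("othello", field="title"),
                   term("shakespeare", field="author"))
    print(xrank(base, term("tragedy"), boost=500))
```

Keeping each operator as a small function that returns a string means queries nest the same way FQL does, so the structure of the Python call mirrors the structure of the generated expression.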