Carlos Valcarcel: Architecture-Fast Search Server 2010 For SharePoint
Published in: Technology

  • Tony Hart & Mark Stone are working on user context. Keyword: for a manual approach. Advanced: Rank Profiles that are contextually aware.
  • Overview of MSFT’s Search Solution (MSW is MSFT’s Intranet)
  • Can you please add a one-line description of each type of plugin in this slide, and explain how to enable/disable individual plugins?
  • This slide provides a more detailed architecture of content processing in FS14 and shows how content processing maps to concepts in SS14. A schema object model defines the schema of the properties. As in SS14, there are crawled properties (CP) and managed properties (MP), and they can be managed through the UI, PowerShell, or the schema OM. Updates to the schema are stored in the config server, and update tools are used to batch the update process. The document processing pipeline reads the schema information and performs activities such as identifying new CPs, mapping MPs to crawled properties, and extracting information from documents and managed properties that is used for deep navigation, sorting by custom properties, etc.
  • Transcript

    • 1. Architecture: Fast Search Server 2010 for SharePoint
      SharePoint Saturday
      Carlos Valcarcel
      Fast Technology Specialist, Fast, A Microsoft Subsidiary
    • 2. Demo: Fast Search Server 2010
      FAST: A Brief Time of History
      SharePoint 2010
      Search features
      Fast Search Server 2010
      Features
      Architecture
      Why Fast Search Server instead of SharePoint search?
      Agenda
    • 3. MSW – Microsoft Internal Web Site
      demo
    • 4. You’ve probably heard it all before.
      Fast was founded in 1997; it was 11 when the acquisition completed (2008).
      AllTheWeb.com – still an active site!
      Sold by Fast to Overture, then Overture bought by Yahoo!
      Fast invested in enterprise search
      Our flagship product, ESP, powers some of the largest sites on the web
      Dell, Best Buy, Scirus (Reed Elsevier), Financial Times, Oodle, Rakuten
      When we OEM’ed our product:
      Documentum
      Dell Message One (Email/eDiscovery)
      CommVault
      EMC Centera
      MatterSpace®
      Fast: A Brief Time of History
      Where did Fast come from?
    • 5. Linear scalability
      Support for more languages
      Better relevancy
      Support for 100 million documents per farm
      Federated results on one page (OpenSearch compliant)
      Navigators (navigator counts not displayed)
      Users can tag documents
      SharePoint follows clicks to boost relevancy
      Auto detect languages in documents
      User can increase boosting based on language
      Query completion
      Did you mean…?
      Sub second response time
      Synonym support (called Aliases)
      Phonetic matching (Sharten Mickleson → Kjartan Mikkelsen)
      Native 64-bit deployment
      Scaling along all dimensions
      Query processing across multiple servers
      Search dashboard
      Adding content
      Crawl rules
      PowerShell has 128 cmdlets for search, so everything you want to do for search can now be scripted.
      Merges results from multiple nodes
      SharePoint 2010 Search
      A Brief Look: Great New Features! Less Filling! Secret Ingredients from Norway!
    • 6. Almost everything available in SharePoint 2010
      Lemmatization/Stemming
      Document Thumbnail and Preview
      Visual Best Bets
      People Search with phonetic search
      Federated Search (OpenSearch)
      Single search (federated) across all content
      Relevancy per audience
      Custom GUI per audience is possible
      Location, Language, Role, and Search aware
      Document boosting and blocking (click-through relevancy)
      Document processing pipeline
      Synonyms
      Secure Search
      Dynamic navigators (OOTB and custom)
      Taxonomy
      Breadcrumb navigation
      Fast Search Server for 2010
      The Future of SharePoint Search: More and Better (did I mention with Secret Ingredients from Norway?)
    • 7. The GUI: Enhancing the Search Experience
      You’ve Got Your Search in My Collaboration Platform!
      FS4SP
    • 8. User Interface is visual and actionable
      Visual and conversational interaction with precise control
      Deep Refinement
      Thumbnails
      Sort on any field
      Similar Results
      Previews
      Built on SharePoint Search Center
      Leverages all of the innovations in SharePoint
      Open Web Parts, Federation, query suggestions, related queries, Did you mean?
      Visual results connect users with content
      Thumbnails for Word and PowerPoint
      Visual Best Bets highlight premium content
      Preview in browser without leaving the results
    • 9. Map metadata to Managed Properties
      Automatic association of metadata to content
      Crawled Properties
      Standard document metadata discovered by the crawler or extracted from the full text by the FAST Content Processing Pipeline.
      Managed Properties
      Map one or more Crawled Properties to a single field. Enables sorting, refinement, relevance tuning and fielded searching.
      Maps automatically or through Central Administration or PowerShell
      Any data can be found!!
      Index Profile
      Managed Properties
    • 10. How does it work?
      Put your terms in the out of the box extraction dictionaries by modifying an XML file
      Map the crawled property to a managed property
      Index your content
      Modify refinement panel web part
      Example: Create a custom entity extractor
      Customized Extraction Dictionary
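The idea behind a dictionary-driven extractor can be sketched in a few lines of Python. This is only an illustration of the concept, not the product's dictionary format or API: the dictionary contents, the function name `extract_entities`, and the matching rule (case-insensitive whole-phrase match) are all assumptions made for this example.

```python
import re

# Hypothetical dictionary: surface form (lowercase) -> canonical entity name.
PRODUCT_DICTIONARY = {
    "fast esp": "FAST ESP",
    "sharepoint 2010": "SharePoint 2010",
    "fs4sp": "FAST Search Server 2010 for SharePoint",
}


def extract_entities(text, dictionary):
    """Return canonical names of dictionary terms found in text,
    ordered by where each term first appears."""
    lowered = text.lower()
    hits = []
    for term, canonical in dictionary.items():
        # Whole-phrase match so "fs4sp" does not match inside a longer token.
        match = re.search(r"\b" + re.escape(term) + r"\b", lowered)
        if match:
            hits.append((match.start(), canonical))
    return [canonical for _, canonical in sorted(hits)]


entities = extract_entities("We moved from FAST ESP to FS4SP last year.",
                            PRODUCT_DICTIONARY)
print(entities)
```

The extracted values would then play the role of a crawled property that gets mapped to a managed property and exposed as a refiner.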
    • 11. How does it work?
      Built on a SharePoint List or custom extractor
      Edit the Search Center Results Page
      Modify the shared web part by adding tags to the refinement panel XML
      Create your own labels
      Save and Publish
      Custom Collections
      Add refiners to user interface
    • 12. Quickly build a contextual experience
      User based tools for creating results that are relevant to your users
      One-way synonyms
      Keywords map to other terms
      Two-way synonyms
      Keywords become equivalent to other terms
      Best Bets
      Highlights key resources that are always relevant to a keyword
      Visual Best Bets
      Extend Best Bets with pictures, video, Silverlight controls
      Document Promotion / Demotion
      Tailor specific document relevancy
      Pick the right ingredients
      Match the proper terms and contexts to boost relevancy for targeted users to ensure your users are always finding the right content
      Create new user contexts
      Site administrators create contexts based on user profiles to deliver relevant results to the right audiences
      Create new keywords
      Site Administrators have powerful and simple tools to configure the search experience for groups of users
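The difference between one-way and two-way synonyms is easiest to see as query expansion. The following Python sketch is a conceptual illustration only; the function `expand_query` and the sample terms are assumptions, not part of the product's keyword management API.

```python
def expand_query(term, one_way, two_way):
    """Return the set of terms to search for a single query term."""
    key = term.lower()
    expanded = {key}
    # One-way: the keyword maps to other terms, but not the reverse.
    expanded.update(one_way.get(key, ()))
    # Two-way: every term in a group is equivalent to every other term.
    for group in two_way:
        if key in group:
            expanded.update(group)
    return expanded


# Hypothetical synonym configuration.
ONE_WAY = {"car": {"sedan", "hatchback"}}
TWO_WAY = [{"reit", "real estate"}]

print(expand_query("car", ONE_WAY, TWO_WAY))    # expands to the mapped terms
print(expand_query("sedan", ONE_WAY, TWO_WAY))  # no expansion back to "car"
print(expand_query("REIT", ONE_WAY, TWO_WAY))   # equivalent in both directions
```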
    • 13. Deliver results that are contextually relevant
      with search that understands your business and role
      Role-specific relevance
      Targeted Best Bets / Visual Best Bets
      Business-driven refinement
      ”What should I know about selling ERP?”
      - Alan Brewer, Sales Lead
      ”What should I know about implementing ERP?”
      - Renee Lo, Consultant
    • 14. Rank Profiles
      Tune relevancy without impacting the default algorithm
      Out of the box relevancy
      Tuned for great general productivity experience, relevancy improves with click-throughs and link text analysis.
      Extend the default algorithms
      Create new default relevancy models. Blend static and dynamic ranking parameters to instantly improve search results.
    • 15. How to create a Rank Profile
      IT Pros are empowered to create new profiles quickly
      Rank Profiles created in PowerShell by extending the default relevancy algorithm…
      … and are exposed in the user interface by modifying the sorting web part.
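As a rough illustration of what "extending the default algorithm" means, the Python sketch below models a rank profile as a set of weights and derives a new profile without touching the default. It does not use the product's PowerShell cmdlets; the `RankProfile` class, the two-weight model, and the sample documents are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RankProfile:
    name: str
    static_weight: float   # weight of query-independent signals
    dynamic_weight: float  # weight of query-dependent signals


def extend_profile(base, name, **overrides):
    """Create a new profile from base; base itself is left unchanged."""
    return replace(base, name=name, **overrides)


def rank(docs, profile):
    """Return document ids ordered by blended score, best first."""
    def score(doc):
        return (profile.static_weight * doc["static"]
                + profile.dynamic_weight * doc["dynamic"])
    return [doc["id"] for doc in sorted(docs, key=score, reverse=True)]


DEFAULT = RankProfile("default", static_weight=0.5, dynamic_weight=0.5)
SALES = extend_profile(DEFAULT, "sales", static_weight=0.25, dynamic_weight=1.0)

DOCS = [
    {"id": "policy", "static": 0.9, "dynamic": 0.2},
    {"id": "pitch", "static": 0.1, "dynamic": 0.8},
]

print(rank(DOCS, DEFAULT))
print(rank(DOCS, SALES))
```

Because the new profile is a copy, users who stay on the default profile see the same ordering as before.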
    • 16. Back End Processing Tasks:
      Load content from many different places
      Out of the box connectors for SharePoint, Exchange public folders, and shared files
      SharePoint Designer to configure connection to customer portfolio/holdings database
      Create custom metadata with content processing pipeline
      Names of holdings, offerings, key concepts, companies, people
      Synonyms for key concepts (real estate ~ REIT)
      Roll-ups configured with optional results collapsing stage
      Create custom relevance profile
      Designers can stylize the User Interface
      Apply styles to web parts
      Federation, People Search, Search actions
      Build custom web parts for visual navigation
      Use SharePoint workflows to perform business specific actions
      Leveraging the platform to build applications
      Putting together all of the pieces to build search-driven applications
    • 17. Simplified, powerful administration
      A high-end enterprise search solution that’s easy to deploy and manage
      Manage efficiently with full support for Microsoft System Center and PowerShell scripting to automate tasks
      Deploy easily using wizard-driven installation, a topology designer, and native support for 64-bit virtualization
      Streamline administration with a simplified admin console that helps you manage search services across your enterprise
    • 18. Architecture
      FS4SP
    • 19. Microsoft’s 2010 Dog-Food Farm
      Description: Team Collaboration Portal & Social Networking
      Day to day work and internal experiments
      Data Set:
      Workload:
      Search Full Crawl generating ~75%
    • 20. FAST Search for SharePoint Scaleout
      Back-end with extreme and flexible scale out options
      Scale-out multiple “dimensions”
      Query Volume
      Content Volume
      Indexing freshness
      Redundancy options
      Search
      Indexing
      Performance targets*
      30M Docs/node
      50 QPS/node
      35 docs/sec
      Query Volume
      Search and Indexing
      Query and Result Processing
      Content Volume
      No theoretical upper bounds!
      Crawling and Content Processing
      *Depends on content and hardware specifics
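The per-node targets on this slide lend themselves to a back-of-the-envelope sizing calculation. The Python sketch below is a simplification under stated assumptions: content is split into partitions of at most 30M documents, each full copy of the index serves 50 QPS, and the function name `estimate_nodes` and its `redundancy` parameter are invented for this example. Real sizing depends on content and hardware, as the slide's footnote says.

```python
import math

DOCS_PER_NODE = 30_000_000  # performance target from the slide
QPS_PER_NODE = 50           # performance target from the slide


def estimate_nodes(total_docs, peak_qps, redundancy=1):
    """Return (partitions, replicas, total_nodes) for a simple grid layout.

    partitions scale with content volume, replicas scale with query volume;
    redundancy is the minimum number of copies of each partition.
    """
    partitions = max(1, math.ceil(total_docs / DOCS_PER_NODE))
    replicas = max(redundancy, math.ceil(peak_qps / QPS_PER_NODE), 1)
    return partitions, replicas, partitions * replicas


print(estimate_nodes(100_000_000, 120))
print(estimate_nodes(10_000_000, 10, redundancy=2))
```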
    • 21. SharePoint Server(s)
      FAST Search Server 2010
      FAST Server(s)
      Summary of architectural components
      Other Server(s)
      Site Collection Level Admin UI: Keyword Management, User Context Management, Site Promotion/Demotion
      PowerShell: Schema configuration, Admin configuration, Deployment configuration
      Central Administration UI: Property mapping, Property extraction, Spell-checking
      Administration and Schema Object Model
      [Diagram components: Advanced Content Processing, Linguistics, Web Link Analysis, Connectors, Security Access Module, Indexing, SharePoint Front-end, Custom Front-End, Query Object Model, Query and Result Processing, Search Core, Query Web Service, Federation Object Model, Monitoring Services, Content, Microsoft System Center Operations Manager, OpenSearch or Other Sources, People Search]
    • 35. Search LOB Systems via BDC/BCS
      Enhance SharePoint platform capabilities with out-of-box features, services, and tools that streamline development of solutions with deep integration of External Data and Services.
      [Diagram components: Office Apps, Cache, Offline Operations, BDC Client Runtime, SharePoint, SPD, Design Tools, VSTO, Web 2.0, LOB, Siebel, SAP, Dynamics]
    • 36. Document Processing Pipeline Stages
      Default stages:
      Format Conversion: iFilters, OutSideIn
      Language detection and encoding
      Lemmatizer: Linguistics normalization
      Tokenizer: Word breaking
      Entity Extraction: Persons, companies, locations, email, date/time, URL, prices, file names
      DateTimeNormalizer: Date normalization
      Vectorizer: Create document vector for similarity searching
      WebAnalyzer: Anchor text and link cardinality analysis
      PropertiesMapper: Map to crawled properties
      PropertiesReporter: Report detected properties
      Optional stages:
      XML Properties mapper
      Offensive Content Filter
      Verbatim extractor: Loads dictionary for custom extraction, e.g. product names
      Field Collapsing
      Mapper
      Configurable stages: Entity Extraction, Language Detection, Format Conversion
      The different plug-ins can be configured either from the UI or from config files
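A staged pipeline with default and optional stages can be pictured as an ordered list of functions, some of which only run when switched on. The Python sketch below is a toy model, not the product's pipeline: the three stage functions, the blocked-word list, and the `enabled_optional` parameter are all assumptions for illustration.

```python
def tokenizer(doc):
    """Word breaking: split the text into lowercase tokens."""
    doc["tokens"] = doc["text"].lower().split()
    return doc


def offensive_content_filter(doc):
    """Flag the document if it contains a blocked word (toy list)."""
    blocked = {"darn"}
    doc["offensive"] = any(token in blocked for token in doc.get("tokens", []))
    return doc


def properties_mapper(doc):
    """Expose what earlier stages found as crawled properties."""
    doc["crawled_properties"] = {"token_count": len(doc.get("tokens", []))}
    return doc


# (stage name, stage function, runs by default?)
PIPELINE = [
    ("Tokenizer", tokenizer, True),
    ("OffensiveContentFilter", offensive_content_filter, False),  # optional
    ("PropertiesMapper", properties_mapper, True),
]


def run_pipeline(doc, stages, enabled_optional=()):
    """Run default stages plus any optional stages named in enabled_optional."""
    doc = dict(doc)  # do not mutate the caller's document
    for name, stage, default_on in stages:
        if default_on or name in enabled_optional:
            doc = stage(doc)
    return doc


plain = run_pipeline({"text": "Hello darn world"}, PIPELINE)
filtered = run_pipeline({"text": "Hello darn world"}, PIPELINE,
                        enabled_optional={"OffensiveContentFilter"})
print(plain)
print(filtered)
```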
    • 37. Content Processing and Schema
      Admin UI
      Schema CmdLets
      Custom Client
      Extracted document attributes reported as Crawled Properties
      Crawled Properties mapped to Managed Properties
      Characteristics are defined for Managed Properties, e.g.
      Refiners
      Sorting
      Queryable
      Type
      Definition and mapping done via UI or PowerShell
      Schema Object Model
      Update configuration
      Schema Service (hosted in IIS)
      Report discovered crawled properties
      Update Tools
      Persistence
      Property backend
      bliss
      psctrl
      configserver
      Alert pipeline of updated schema
      Document Processing Pipeline
      PropertiesMapper
      PropertiesReporter
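The crawled-to-managed mapping described on this slide can be sketched as a small schema lookup. This Python example is illustrative only: the schema layout, the crawled property names (`ows_Author`, `dc:creator`, `size`), and the "first mapped source wins" rule are assumptions, not the behaviour of the Schema Object Model.

```python
# Hypothetical schema: managed property -> mapped crawled properties and
# the characteristics the slide lists (refiner, sorting, queryable, type).
SCHEMA = {
    "Author": {
        "sources": ["ows_Author", "dc:creator"],
        "refiner": True, "sortable": True, "queryable": True, "type": str,
    },
    "Size": {
        "sources": ["size"],
        "refiner": False, "sortable": True, "queryable": False, "type": int,
    },
}


def map_properties(crawled, schema):
    """Build managed property values from a document's crawled properties.

    For each managed property, the first mapped crawled property that is
    present on the document supplies the value, converted to the declared type.
    """
    managed = {}
    for name, spec in schema.items():
        for source in spec["sources"]:
            if source in crawled:
                managed[name] = spec["type"](crawled[source])
                break
    return managed


print(map_properties({"dc:creator": "Carlos", "size": "2048"}, SCHEMA))
```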
    • 38. Pipeline Extensibility API
      Motivation
      Straightforward way to add text analysis functionality
      Flexibility and supportability
      Example uses
      Sentiment analysis
      Translation
      Auto-Classification
      Mechanism
      Just before Mapper
      “any” binary
      Runs in sandbox with timeout
      Mapper
      Extensibility

      Standard processing
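The "any binary, with a timeout" mechanism can be approximated in Python with `subprocess.run`. This is a sketch of the general pattern, not the product's extensibility interface: the function `run_custom_stage` is hypothetical, text is passed over stdin/stdout for simplicity, and only the timeout is enforced here (no real sandboxing).

```python
import subprocess
import sys


def run_custom_stage(command, text, timeout_seconds=5.0):
    """Run an external program on text and return its stdout.

    Returns None if the program fails, cannot be started, or exceeds the
    timeout, so a misbehaving extension cannot stall the pipeline.
    """
    try:
        result = subprocess.run(
            command,
            input=text,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
            check=True,
        )
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        return None
    return result.stdout


# Stand-in "binaries": the current Python interpreter running one-liners.
UPPERCASE = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
TOO_SLOW = [sys.executable, "-c", "import time; time.sleep(5)"]

print(run_custom_stage(UPPERCASE, "great product"))
print(run_custom_stage(TOO_SLOW, "great product", timeout_seconds=0.5))
```

Returning `None` on failure lets standard processing continue with the document unchanged, which is one plausible design choice for a stage that sits just before the mapper.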
    • 39. Yeah, So What?
      SharePoint 2010:
      100 million documents per farm
      Refiners: use only the first 1000 results
      Search is restricted to one farm
      Fast Search Server 2010:
      40 million documents per server
      Refiners: exact count from the entire result set
      Content can be indexed and searched across farms
      3.6 TB of disk space per server (so far!) and support for NAS and SANs.
      Full support for VMs (Hyper-V and VMware)
      Tell Me Something Awesome
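The practical difference between refiners computed from the first 1000 results and refiners computed from the entire result set shows up as soon as a value only occurs further down the list. The Python sketch below uses made-up result data and a hypothetical `refiner_counts` helper to illustrate that effect.

```python
from collections import Counter


def refiner_counts(results, field, limit=None):
    """Count values of field over the first `limit` results (all if None)."""
    considered = results if limit is None else results[:limit]
    return Counter(doc[field] for doc in considered)


# Made-up result set: 1000 Word documents ranked ahead of 500 PowerPoint decks.
RESULTS = [{"type": "docx"}] * 1000 + [{"type": "pptx"}] * 500

shallow = refiner_counts(RESULTS, "type", limit=1000)  # first 1000 results only
deep = refiner_counts(RESULTS, "type")                 # entire result set

print(shallow)
print(deep)
```

With the shallow count, the PowerPoint refiner never appears even though 500 matching documents exist.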
    • 40. There is nothing wrong with SharePoint!
      SharePoint brings together a number of collaborative technologies that would otherwise not play well together
      As SharePoint adoption spreads the need for enterprise search only increases
      Search today is where RDBMSs were over 20 years ago
      Let me say that again: there is nothing wrong with SharePoint!
      Is Something Wrong With SharePoint?
    • 41. The Present
      SharePoint 2010 search addresses a host of previous issues
      No migration path from SP 2010 to Fast Search 2010
      The Future
      Where do you think Fast Search Server will be in 3 years (the next release of SharePoint)?
      Why Fast Search Instead of SharePoint Search?
    • 42. You’ve Got Questions
      I’ve probably got answers…
      Q and A
    • 43. Demo: Fast Search Server 2010
      FAST: A Brief Time of History
      SharePoint 2010
      Search features
      Fast Search Server 2010
      Features
      Architecture
      Why Fast Search Server instead of SharePoint search?
      Agenda
    • 44. The organizers of SharePoint Saturday
      To all of you for attending!
      Thanks
    • 45. Capacity Planning White Paper
      http://www.microsoft.com/downloads/details.aspx?FamilyID=65b799e3-825c-4398-8cd7-3311d3297997&displaylang=en
      RSS: FAST Search Server 2010 for SharePoint Newly Published Content
      If you bookmark only one RSS feed for Fast Search Server 2010 this is the one: http://services.social.microsoft.com/feeds/feed/FASTSearchServer2010NewContent
      Documentation
      TechNet: http://technet.microsoft.com/en-us/library/ee781286.aspx
      MSDN Blogs
      Enterprise Search: http://blogs.msdn.com/b/enterprisesearch/
      Steve Nicolaou, Fast Architect: http://blogs.msdn.com/b/stevennicolaou/
      Jørgen's FAST Search Blog: http://blogs.msdn.com/b/jorgeni/
      Dark Corners: http://blogs.msdn.com/b/dark_corners/
      Enterprise Search User Group
      Second Wednesday of every month! You missed July! Don’t miss August!
      Case Study: Search and the FBI Sentinel Program
      Author: Marti Hearst, Search User Interfaces (http://www.searchuserinterfaces.com/)
      Next Generation Tools: Content Transformation Service/Interaction Management Service
      References