Carlos Valcarcel: Architecture-Fast Search Server 2010 For SharePoint
© All Rights Reserved

  • Tony Hart & Mark Stone are working on user context. Keyword: for a manual approach. Advanced: Rank Profiles that are contextually aware.
  • Overview of MSFT’s Search Solution (MSW is MSFT’s Intranet)
  • Can you please add a one-line description of each type of plugin in this slide, and explain how to enable/disable individual plugins?
  • This slide provides a more detailed architecture of content processing in FS14 and how the content processing maps to concepts in SS14. There is a schema object model that defines the schema of the properties. As in SS14, there are crawled properties (CP) and managed properties (MP), and they can be managed through the UI, PowerShell, or the schema OM. Updates to the schema are stored in the config server, and update tools are used to batch the update process. The document processing pipeline reads the schema information and performs activities such as identifying new CP, mapping MP to crawled properties, and extracting information from documents and managed properties that is used for deep navigation, sorting by custom properties, etc.

Presentation Transcript

  • Architecture: Fast Search Server 2010 for SharePoint
    SharePoint Saturday
    Carlos Valcarcel
    Fast Technology Specialist, Fast, A Microsoft Subsidiary
  • Demo: Fast Search Server 2010
    FAST: A Brief Time of History
    SharePoint 2010
    Search features
    Fast Search Server 2010
    Features
    Architecture
    Why Fast Search Server instead of SharePoint search?
    Agenda
  • MSW – Microsoft Internal Web Site
    demo
  • You’ve probably heard it all before.
    Fast was founded in 1997; it was 11 when the acquisition completed (2008).
    AllTheWeb.com – still an active site!
    Sold by Fast to Overture, then Overture bought by Yahoo!
    Fast invested in enterprise search
    Our flagship product, ESP, powers some of the largest sites on the web
    Dell, Best Buy, Scirus (Reed Elsevier), Financial Times, Oodle, Rakuten
    When we OEM’ed our product:
    Documentum
    Dell Message One (Email/eDiscovery)
    CommVault
    EMC Centera
    MatterSpace®
    Fast: A Brief Time of History
    Where did Fast come from?
  • Linear scalability
    Support for more languages
    Better relevancy
    Support for 100 million documents per farm
    Federated results on one page (OpenSearch compliant)
    Navigators (navigator counts not displayed)
    Users can tag documents
    SharePoint follows clicks to boost relevancy
    Auto detect languages in documents
    User can increase boosting based on language
    Query completion
    Did you mean…?
    Sub second response time
    Synonym support (called Aliases)
    Phonetic matching (Sharten Mickleson matches Kjartan Mikkelsen)
    Native 64-bit deployment
    Scaling along all dimensions
    Query processing across multiple servers
    Search dashboard
    Adding content
    Crawl rules
    PowerShell has 128 cmdlets for search, so everything you want to do for search can now be scripted.
    Merges results from multiple nodes
    SharePoint 2010 Search
    A Brief Look: Great New Features! Less Filling! Secret Ingredients from Norway!
  • Almost everything available in SharePoint 2010
    Lemmatization/Stemming
    Document Thumbnail and Preview
    Visual Best Bets
    People Search with phonetic search
    Federated Search (OpenSearch)
    Single search (federated) across all content
    Relevancy per audience
    Custom GUI per audience is possible
    Location, Language, Role, and Search aware
    Document boosting and blocking (click-through relevancy)
    Document processing pipeline
    Synonyms
    Secure Search
    Dynamic navigators (OOTB and custom)
    Taxonomy
    Breadcrumb navigation
    Fast Search Server 2010
    The Future of SharePoint Search: More and Better (did I mention with Secret Ingredients from Norway?)
  • The GUI: Enhancing the Search Experience. You’ve Got Your Search in My Collaboration Platform!
    FS4SP
  • User Interface is visual and actionable
    Visual and conversational interaction with precise control
    Deep Refinement
    Thumbnails
    Sort on any field
    Similar Results
    Previews
    Built on SharePoint Search Center
    Leverages all of the innovations in SharePoint
    Open Web Parts, Federation, query suggestions, related queries, Did you mean?
    Visual results connect users with content
    Thumbnails for Word and PowerPoint
    Visual Best Bets highlight premium content
    Preview in browser without leaving the results
  • Map metadata to Managed Properties
    Automatic association of metadata to content
    Crawled Properties
    Standard document metadata discovered by the crawler or extracted from the full text by the FAST Content Processing Pipeline.
    Managed Properties
    Map one or more Crawled Properties to a single field. Enables sorting, refinement, relevance tuning and fielded searching.
    Maps automatically or through Central Administration or PowerShell
    Any data can be found!!
    Index Profile
    Managed Properties
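
The mapping described on this slide can be pictured with a small sketch. It is only an illustration of the idea, not the product's schema API: the property names are hypothetical, and the "first non-empty crawled value wins" rule is an assumption made for the example.

```python
from typing import Dict, List, Optional

# Hypothetical mapping: each managed property is fed by one or more crawled
# properties, listed in priority order. None of these names come from FS4SP.
MANAGED_PROPERTY_MAP: Dict[str, List[str]] = {
    "Author": ["ows_CreatedBy", "mail_from", "doc_author"],
    "Title": ["ows_Title", "html_title"],
}


def resolve_managed_properties(crawled: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Fill each managed property with the first non-empty crawled value."""
    resolved: Dict[str, Optional[str]] = {}
    for managed, sources in MANAGED_PROPERTY_MAP.items():
        resolved[managed] = next(
            (crawled[name] for name in sources if crawled.get(name)), None
        )
    return resolved


doc = {"mail_from": "renee.lo@example.com", "html_title": "ERP rollout"}
print(resolve_managed_properties(doc))
```

Once several crawled properties land in a single field like this, that field is what sorting, refinement and fielded search operate on.
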
  • How does it work?
    Put your terms in the out of the box extraction dictionaries by modifying an XML file
    Map the crawled property to a managed property
    Index your content
    Modify refinement panel web part
    Example: Create a custom entity extractor
    Customized Extraction Dictionary
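
As a rough picture of what a dictionary-driven extractor does, the sketch below matches a list of terms against document text. It assumes simple case-insensitive whole-word matching; the term list and function names are hypothetical, and the XML dictionary format used by the product is not reproduced here.

```python
import re
from typing import Callable, List


def build_extractor(terms: List[str]) -> Callable[[str], List[str]]:
    """Return a function that finds dictionary terms in text (case-insensitive)."""
    # Longest terms first so that a longer entry wins over a shorter prefix.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b", re.IGNORECASE
    )
    canonical = {t.lower(): t for t in terms}

    def extract(text: str) -> List[str]:
        found: List[str] = []
        for match in pattern.finditer(text):
            name = canonical[match.group(1).lower()]
            if name not in found:  # report each entity once, in order of appearance
                found.append(name)
        return found

    return extract


# Hypothetical product-name dictionary.
extract_products = build_extractor(["Contoso ERP", "Contoso CRM"])
print(extract_products("Upgrading contoso erp and Contoso CRM; Contoso ERP first"))
```

The extracted values would then play the role of a crawled property that is mapped to a managed property and exposed as a refiner.
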
  • How does it work?
    Built on a SharePoint List or custom extractor
    Edit the Search Center Results Page
    Modify the shared web part by adding tags to the refinement panel XML
    Create your own labels
    Save and Publish
    Custom Collections
    Add refiners to user interface
  • Quickly build a contextual experience
    User based tools for creating results that are relevant to your users
    One-way synonyms
    Keywords map to other terms
    Two-way synonyms
    Keywords become equivalent to other terms
    Best Bets
    Highlights key resources that are always relevant to a keyword
    Visual Best Bets
    Extend Best Bets with pictures, video, Silverlight controls
    Document Promotion / Demotion
    Tailor specific document relevancy
    Pick the right ingredients
    Match the proper terms and contexts to boost relevancy for targeted users to ensure your users are always finding the right content
    Create new user contexts
    Site administrators create contexts based on user profiles to deliver relevant results to the right audiences
    Create new keywords
    Site Administrators have powerful and simple tools to configure the search experience for groups of users
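
The difference between one-way and two-way synonyms can be sketched as query expansion. This is a minimal illustration under assumed data; the synonym tables and the `expand` function are hypothetical and do not reflect how the keyword tools store or apply synonyms.

```python
from typing import Dict, FrozenSet, List, Set

# One-way: the keyword also searches the mapped terms, but not the reverse.
ONE_WAY: Dict[str, Set[str]] = {"erp": {"enterprise resource planning"}}
# Two-way: every term in the group is treated as equivalent to the others.
TWO_WAY: List[FrozenSet[str]] = [frozenset({"real estate", "reit"})]


def expand(term: str) -> Set[str]:
    """Return the set of terms a query term would be expanded to."""
    key = term.lower()
    terms = {key}
    terms |= ONE_WAY.get(key, set())
    for group in TWO_WAY:
        if key in group:
            terms |= group
    return terms


print(sorted(expand("ERP")))
print(sorted(expand("enterprise resource planning")))
print(sorted(expand("REIT")))
```

Searching for "ERP" picks up the longer phrase, while searching for the longer phrase does not pick up "ERP"; the two-way pair expands in both directions.
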
  • Deliver results that are contextually relevant
    with search that understands your business and role
    Role-specific relevance
    Targeted Best Bets / Visual Best Bets
    Business driven refinement
    ”What should I know about selling ERP?”
    - Alan Brewer, Sales Lead
    ”What should I know about implementing ERP?”
    - Renee Lo, Consultant
  • Rank Profiles
    Tune relevancy without impacting the default algorithm
    Out of the box relevancy
    Tuned for great general productivity experience, relevancy improves with click-throughs and link text analysis.
    Extend the default algorithms
    Create new default relevancy models. Blend static and dynamic ranking parameters to instantly improve search results.
  • How to create a Rank Profile
    IT Pros are empowered to create new profiles quickly
    Rank Profiles created in PowerShell by extending the default relevancy algorithm…
    … and are exposed in the user interface by modifying the sorting web part.
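
To make "blend static and dynamic ranking parameters" concrete, here is a toy sketch that uses a simple weighted sum. The linear formula, the weights and the field names are assumptions for illustration only; the slides do not describe the product's actual ranking formula.

```python
from typing import Dict, List


def blended_score(static_rank: float, dynamic_rank: float, static_weight: float) -> float:
    """Weighted blend of a query-independent and a query-dependent score."""
    return static_weight * static_rank + (1.0 - static_weight) * dynamic_rank


def rank(docs: List[Dict], static_weight: float) -> List[str]:
    """Order document ids by blended score, highest first."""
    ordered = sorted(
        docs,
        key=lambda d: blended_score(d["static"], d["dynamic"], static_weight),
        reverse=True,
    )
    return [d["id"] for d in ordered]


docs = [
    {"id": "A", "static": 0.9, "dynamic": 0.2},  # authoritative, weak text match
    {"id": "B", "static": 0.1, "dynamic": 0.8},  # obscure, strong text match
]
print(rank(docs, static_weight=0.3))  # a profile that favours the text match
print(rank(docs, static_weight=0.8))  # a profile that favours authority
```

Two profiles with different weights order the same documents differently, which is the effect of letting users pick a profile in the sorting web part.
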
  • Back End Processing Tasks:
    Load content from many different places
    Out of the box connectors for SharePoint, Exchange public folders, and shared files
    SharePoint Designer to configure connection to customer portfolio/holdings database
    Create custom metadata with content processing pipeline
    Names of holdings, offerings, key concepts, companies, people
    Synonyms for key concepts (real estate ~ REIT)
    Roll-ups configured with optional results collapsing stage
    Create custom relevance profile
    Designers can stylize the User Interface
    Apply styles to web parts
    Federation, People Search, Search actions
    Build custom web parts for visual navigation
    Use SharePoint workflows to perform business specific actions
    Leveraging the platform to build applications
    Putting together all of the pieces to build search-driven applications
  • Simplified, powerful administration
    A high-end enterprise search solution that’s easy to deploy and manage
    Manage efficiently with full support for Microsoft System Center and PowerShell scripting to automate tasks
    Deploy easily using wizard-driven installation, a topology designer, and native support for 64-bit virtualization
    Streamline administration with a simplified admin console that helps you manage search services across your enterprise
  • Architecture
    FS4SP
  • Microsoft’s 2010 Dog-Food Farm
    Description: Team Collaboration Portal & Social Networking
    Day to day work and internal experiments
    Data Set:
    Workload:
    Search Full Crawl generating ~75%
  • FAST Search for SharePoint Scaleout
    Back-end with extreme and flexible scale out options
    Scale-out multiple “dimensions”
    Query Volume
    Content Volume
    Indexing freshness
    Redundancy options
    Search
    Indexing
    Performance targets*
    30M Docs/node
    50 QPS/node
    35 docs/sec
    Query Volume
    Search and Indexing
    Query and Result Processing
    Content Volume
    No theoretical upper bounds!
    Crawling and Content Processing
    *Depends on content and hardware specifics
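
The per-node targets above lend themselves to back-of-the-envelope sizing. The sketch below assumes that content volume and query volume scale independently and multiply together; that assumption, the function name and the example workload are not from the slides, and real sizing depends on content and hardware, as the footnote says.

```python
import math

# Per-node targets quoted on the slide; rough planning inputs only.
DOCS_PER_NODE = 30_000_000
QPS_PER_NODE = 50


def estimate_search_nodes(total_docs: int, peak_qps: float) -> dict:
    """Rough node count, assuming content and query volume scale independently."""
    content_partitions = max(1, math.ceil(total_docs / DOCS_PER_NODE))
    query_replicas = max(1, math.ceil(peak_qps / QPS_PER_NODE))
    return {
        "content_partitions": content_partitions,
        "query_replicas": query_replicas,
        "search_nodes": content_partitions * query_replicas,
    }


# Hypothetical workload: 100 million documents at a peak of 120 queries per second.
print(estimate_search_nodes(total_docs=100_000_000, peak_qps=120))
```

For this made-up workload the estimate is 4 content partitions times 3 query replicas, or 12 search nodes, before any redundancy is added.
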
  • SharePoint Server(s)
    FAST Search Server 2010
    FAST Server(s)
    Summary of architectural components
    Other Server(s)
    Site Collection Level Admin UI
    • Keyword Management
    • User Context Management
    • Site Promotion/Demotion
    PowerShell
    • Schema configuration
    • Admin configuration
    • Deployment configuration
    Central Administration UI
    • Property mapping
    • Property extraction
    • Spell-checking
    Administration and Schema Object Model
    Advanced Content Processing
    Linguistics
    Web Link Analysis
    Connectors
    • SharePoint
    • File Traverser
    • Web
    • BDC
    • Exchange
    • Notes
    • Documentum
    Security Access Module
    Indexing
    SharePoint Front-end
    Custom Front-End
    Query Object Model
    Query and Result Processing
    Search Core
    Query Web Service
    Connectors
    • Web Crawler
    • JDBC
    Federation Object Model
    Monitoring Services
    Content
    Microsoft System Center Operations Manager
    OpenSearch or Other Sources
    People Search
  • Search LOB Systems via BDC/BCS
    Enhance SharePoint platform capabilities with out-of-box features, services, and tools that streamline development of solutions with deep integration of External Data and Services.
    Office Apps
    Cache
    Offline Operations
    BDC Client Runtime
    SharePoint
    SPD
    Design Tools
    VSTO
    Web 2.0
    LOB
    Siebel
    SAP
    Dynamics
  • Document Processing Pipeline Stages
    Default
    Optional
    Format Conversion
    iFilters, OutSideIn
    Language detection and encoding
    Lemmatizer
    Linguistics normalization
    Tokenizer
    Word breaking
    Entity Extraction
    Persons, companies, locations, email, date/time, URL, prices, file names
    DateTimeNormalizer
    Date normalization
    Vectorizer
    Create document vector for similarity searching
    WebAnalyzer
    Anchor text and link cardinality analysis
    PropertiesMapper
    Map to crawled properties
    PropertiesReporter
    Report detected properties
    XML Properties mapper
    Offensive Content Filter
    Verbatim extractor
    Loads dictionary for custom extraction, e.g. product names
    Field Collapsing
    Mapper

    Configurable Stages
    EntityExtraction
    Language Detection
    Format Conversion
    The different plug-ins can be configured either from the UI or from config files
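
A document processing pipeline of this kind can be thought of as an ordered list of stages, some of which are switched on by default and some optional. The sketch below shows that shape only; the stage implementations are toy stand-ins (an email regex, a made-up blocked-word list) and the enable flags are a hypothetical configuration, not the product's.

```python
import re
from typing import Any, Callable, Dict, List, Tuple

Document = Dict[str, Any]
Stage = Callable[[Document], Document]


def tokenizer(doc: Document) -> Document:
    """Word breaking: split the text into lowercase tokens."""
    doc["tokens"] = re.findall(r"\w+", doc["text"].lower())
    return doc


def entity_extraction(doc: Document) -> Document:
    """Toy entity extraction: pull out things that look like email addresses."""
    doc["emails"] = re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", doc["text"])
    return doc


def offensive_content_filter(doc: Document) -> Document:
    """Toy optional stage: flag documents containing a blocked word."""
    blocked = {"darn"}  # hypothetical word list
    doc["offensive"] = any(token in blocked for token in doc.get("tokens", []))
    return doc


# (stage, enabled) pairs; the last stage is optional and switched off here.
PIPELINE: List[Tuple[Stage, bool]] = [
    (tokenizer, True),
    (entity_extraction, True),
    (offensive_content_filter, False),
]


def run_pipeline(doc: Document, pipeline: List[Tuple[Stage, bool]] = PIPELINE) -> Document:
    """Pass the document through every enabled stage, in order."""
    for stage, enabled in pipeline:
        if enabled:
            doc = stage(doc)
    return doc


result = run_pipeline({"text": "Contact renee.lo@example.com about ERP"})
print(result["emails"], "offensive" in result)
```

Each stage adds fields to the document, and a disabled stage simply leaves no trace, which is why the `offensive` field is absent in this run.
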
  • Content Processing and Schema
    Admin UI
    Schema CmdLets
    Custom Client
    Extracted document attributes reported as Crawled Properties
    Crawled Properties mapped to Managed Properties
    Characteristics are defined for Managed Properties, e.g.
    Refiners
    Sorting
    Queryable
    Type
    Definition and mapping done via UI or PowerShell
    Schema Object Model
    Update configuration
    Schema Service (hosted in IIS)
    Report discovered crawled properties
    Update Tools
    Persistence
    Property backend
    bliss
    psctrl
    configserver
    Alert pipeline of updated schema
    Document Processing Pipeline
    PropertiesMapper
    PropertiesReporter
  • Pipeline Extensibility API
    Motivation
    Straightforward way to add text analysis functionality
    Flexibility and supportability
    Example uses
    Sentiment analysis
    Translation
    Auto-Classification
    Mechanism
    Just before Mapper
    “any” binary
    Runs in sandbox with timeout
    Mapper
    Extensibility

    Standard processing
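
The "any binary, with a timeout" mechanism follows a common pattern: hand the document to an external process and give up if it takes too long. The sketch below shows that general pattern with Python's `subprocess` module. It is not the product's interface, it passes text over stdin/stdout purely as an assumption, and it offers no sandboxing beyond running a separate process.

```python
import subprocess
import sys
from typing import List, Optional


def run_external_stage(command: List[str], text: str, timeout_s: float = 5.0) -> Optional[str]:
    """Run an external program on the text; return None on timeout or failure."""
    try:
        completed = subprocess.run(
            command,
            input=text,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed if it exceeds this
            check=True,         # a non-zero exit code raises CalledProcessError
        )
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
        return None
    return completed.stdout


# Stand-in "binaries": the Python interpreter running one-line scripts.
upper_cmd = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
slow_cmd = [sys.executable, "-c", "import time; time.sleep(5)"]

print(run_external_stage(upper_cmd, "great product"))
print(run_external_stage(slow_cmd, "great product", timeout_s=0.5))
```

Treating a timeout as "no result" rather than an error lets standard processing continue, which fits the supportability goal stated on the slide.
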
  • Yeah, So What?
    100 million documents per farm
    Refiners: only uses the first 1000 results
    Search is restricted to one farm
    Tell Me Something Awesome
    SharePoint 2010
    Fast Search Server 2010
    40 Million Documents per server
    Refiners: exact count from the entire result set
    Content can be indexed and searched across farms
    3.6 TB of disk space per server (so far!) and support for NAS and SANs.
    Full support for VMs (Hyper-V and VMware)
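
The refiner difference is easy to see with made-up numbers. In the sketch below, counts computed over only the first 1000 results miss a value that the full result set contains; the data and the function name are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional


def refiner_counts(results: List[Dict[str, str]], field: str, limit: Optional[int] = None) -> Counter:
    """Count refiner values over the first `limit` results, or all if limit is None."""
    subset = results if limit is None else results[:limit]
    return Counter(r[field] for r in subset)


# Hypothetical ranked result set: 1000 Word documents followed by 500 PowerPoint files.
results = [{"type": "docx"}] * 1000 + [{"type": "pptx"}] * 500

shallow = refiner_counts(results, "type", limit=1000)
deep = refiner_counts(results, "type")
print(dict(shallow))
print(dict(deep))
```

With the 1000-result cutoff the `pptx` refiner never appears, even though 500 matching documents exist.
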
  • There is nothing wrong with SharePoint!
    SharePoint brings together a number of collaborative technologies that would otherwise not play well together
    As SharePoint adoption spreads the need for enterprise search only increases
    Search today is where RDBMSs were over 20 years ago
    Let me say that again: there is nothing wrong with SharePoint!
    Is Something Wrong With SharePoint?
  • The Present
    SharePoint 2010 search addresses a host of previous issues
    No migration path from SP 2010 to Fast Search 2010
    The Future
    Where do you think Fast Search Server will be in 3 years (the next release of SharePoint)?
    Why Fast Search Instead of SharePoint Search?
  • You’ve Got Questions. I’ve probably got answers…
    Q and A
  • The organizers of SharePoint Saturday
    To all of you for attending!
    Thanks
  • Capacity Planning White Paper
    http://www.microsoft.com/downloads/details.aspx?FamilyID=65b799e3-825c-4398-8cd7-3311d3297997&displaylang=en
    RSS: FAST Search Server 2010 for SharePoint Newly Published Content
    If you bookmark only one RSS feed for Fast Search Server 2010 this is the one: http://services.social.microsoft.com/feeds/feed/FASTSearchServer2010NewContent
    Documentation
    TechNet: http://technet.microsoft.com/en-us/library/ee781286.aspx
    MSDN Blogs
    Enterprise Search: http://blogs.msdn.com/b/enterprisesearch/
    Steve Nicolaou, Fast Architect: http://blogs.msdn.com/b/stevennicolaou/
    Jørgen's FAST Search Blog: http://blogs.msdn.com/b/jorgeni/
    Dark Corners: http://blogs.msdn.com/b/dark_corners/
    Enterprise Search User Group
    Second Wednesday of every month! You missed July! Don’t miss August!
    Case Study: Search and the FBI Sentinel Program
    Author: Marti Hearst, Search User Interfaces (http://www.searchuserinterfaces.com/)
    Next Generation Tools: Content Transformation Service/Interaction Management Service
    References