Real World Challenges in Enterprise Search


Enterprise search is complex, even in theory. But once you implement your search solution and theory meets reality, you’ll run into new challenges you have never seen before. In this session, I’ll share the biggest and most interesting challenges from my experience, including real-world customer scenarios and solutions. Whichever SharePoint version you use (SharePoint 2010, FAST Search for SharePoint, SharePoint 2013), this session is for you if you want to prepare for these “unexpected” scenarios.

Published in: Technology

  • No longer within the firewall; Relevance is critical; Search within the organization; “Transparent” Search; Search Driven Applications
  • Management by Walking Around
  • New Search Solution vs. Upgrading/Changing the existing one
  • Plan!! Research on SOURCE SYSTEM, involve the admins there!! Test: on source system; on search. Involve: source system key users; source system admins; test users (<7); more test users
  • Jeff

    1. Real World Challenges in Enterprise Search (COM 704), Agnes Molnar
    2. Introduction: Agnes Molnar
        • Independent Consultant: 10+ Years SharePoint Experience; Enterprise Content Management; Search
        • SharePoint Server MVP: 6 Years SharePoint Server MVP; 5+ Years Speaking at Conferences Around the World; Books, White Papers
        • Contact: E-mail; Blog; Twitter: @molnaragnes
    3. Agenda
    4. Enterprise Search: search technology that your organization owns and controls
    5. Search is more than Technology (Source:)
    6. Requirements Gathering
    7. Requirements Gathering: Search (Find, Collect, Analyze, Take Action)
    8. Requirements Gathering: Information-Seeking Patterns
        • “I know what I’m searching for and know how to do that”
        • “I know what I’m searching for but I don’t know how to do that”
        • “I don’t know what I’m searching for”
        • “Am I Searching?...”
    9. Requirements Gathering: Content Sources; Types of Content; Actions to Take; Metadata; Amount of Content; Users’ Behavior; Current Pain Points; MBWA (Management by Walking Around)
    10. Implementation Planning
    11. Assessing your needs: Expectations
        • Users: Interactive, Flexible, ‘Psychic’, Customized, Ergonomic
        • IT: Secure, Scalable, Standards-based, Flexible
        • Business: Accurate, Tunable, Flexible, Activity Reports
    12. Traditional System Integration
        • Approaches (each shown with Cost and Time Frame indicators):
            • Custom Integration. Challenge: Brittle code
            • Middleware. Challenge: $1,000,000+ costs with limited content sources
            • SOA. Challenge: Massive re-engineer
        • Questions from the business:
            • Sales & Marketing: “I need all information we have about my customer?”
            • Supply Chain: “Where can I see all my suppliers data in one place?”
            • Engineering/R&D: “Where do I have to look for research projects?”
            • Legal/Compliance: “Do we know our liability on this matter?”
        • Data Silos: Email & Messaging; ERP; CRM; ECM, Search, Collaboration; Structured Data (databases); Unstructured Data (file shares); Public Web Sites; Cloud/Office 365
    13. Search Driven Architecture
    14. Security
    15. The Search Security Paradox: As Search is deployed further and further into the Enterprise, the likelihood of having a security problem increases.
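One common way to contain this risk is to trim results at query time against the permissions of the person searching. The following is a minimal sketch of that idea only, not SharePoint's actual implementation; `IndexedItem`, `allowed_principals` and `security_trim` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedItem:
    """Hypothetical index entry that carries the ACL captured at crawl time."""
    title: str
    allowed_principals: frozenset  # user and group names allowed to read the item


def security_trim(results, user_principals):
    """Keep only the items that share at least one principal with the user.

    `user_principals` is the set containing the user's own name plus the
    names of every group the user belongs to (an assumption of this sketch).
    """
    return [item for item in results if item.allowed_principals & user_principals]


results = [
    IndexedItem("Public handbook", frozenset({"Everyone"})),
    IndexedItem("Salary review", frozenset({"HR"})),
    IndexedItem("Project plan", frozenset({"Engineering", "PMO"})),
]

visible = security_trim(results, {"Everyone", "Engineering"})
print([item.title for item in visible])
```

Every newly connected content source adds another permission model that has to be mapped into such a check, which is one way to read the paradox on the slide.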
    16. Search Federation
    17. Search Federation
    18. Federate vs. Build Index
        • Federate when:
            • you CANNOT or don’t want to crawl
            • the remote site’s robots.txt blocks SharePoint’s crawler
            • you need results only with specific keywords and/or keyword patterns in the query
            • content changes very often and immediate crawling is needed
            • queries run under a different security context
            • content is queried infrequently
            • there are >500 content sources
        • Build Index when:
            • you CAN crawl and index
            • you don’t have enough bandwidth to federate
            • content changes very often, but immediate crawling is NOT needed
            • the content is not indexed by the remote server
            • the remote server does not return results as RSS or Atom
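The criteria on this slide can be read as a small decision procedure. The sketch below is one possible encoding, assuming that "cannot crawl" and "cannot federate" are hard blockers and the remaining criteria are softer hints; that split, and every name in the code (`SourceProfile`, `recommend`, the field names), is an assumption made for illustration and not part of the talk.

```python
from dataclasses import dataclass


@dataclass
class SourceProfile:
    """Hypothetical description of one content source (illustrative fields)."""
    can_crawl: bool
    blocked_by_robots_txt: bool = False
    remote_returns_rss_or_atom: bool = True
    indexed_by_remote_server: bool = True
    enough_bandwidth_to_federate: bool = True
    needs_immediate_freshness: bool = False
    needs_remote_security_context: bool = False
    infrequently_queried: bool = False


def recommend(source):
    """Return ("federate" | "index" | "unclear", list of reasons)."""
    index_blockers = []
    federation_blockers = []
    if not source.can_crawl:
        index_blockers.append("content cannot be crawled")
    if source.blocked_by_robots_txt:
        index_blockers.append("robots.txt blocks the crawler")
    if not source.remote_returns_rss_or_atom:
        federation_blockers.append("remote server does not return RSS or Atom")
    if not source.indexed_by_remote_server:
        federation_blockers.append("content is not indexed by the remote server")
    if not source.enough_bandwidth_to_federate:
        federation_blockers.append("not enough bandwidth to federate")

    # Both options blocked: the slide gives no answer, so report that.
    if index_blockers and federation_blockers:
        return "unclear", index_blockers + federation_blockers
    if index_blockers:
        return "federate", index_blockers
    if federation_blockers:
        return "index", federation_blockers

    # Nothing is blocked: fall back to the softer hints that favour federation.
    hints = []
    if source.needs_immediate_freshness:
        hints.append("content changes often and must be fresh immediately")
    if source.needs_remote_security_context:
        hints.append("queries run under a different security context")
    if source.infrequently_queried:
        hints.append("content is queried infrequently")
    if hints:
        return "federate", hints
    return "index", ["content can be crawled and nothing calls for federation"]


print(recommend(SourceProfile(can_crawl=True, needs_immediate_freshness=True)))
```

In a real project the answer is rarely this mechanical; the value of writing the rules down is that each source gets assessed against the same list.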
    19. Sizing and Capacity Planning
    20. Scaling Factors: Content characteristics; Search features; Query performance; Document freshness; High availability
    21. Scale-Out Principles (to improve this: take these actions)
        • Index freshness / crawl times: Add more Indexer machines and/or Crawl components. Add additional Crawl DB on the same SQL server. Add additional SQL server(s) with additional Crawl DBs.
        • Query latency / throughput: Partition the index into smaller index partitions. Add Query components with mirror index partitions. Add additional Crawl DB on the same SQL server. Add additional SQL server(s) with additional Crawl DBs.
        • Query availability: Deploy redundant Query servers, redundant Index partitions and components. Use clustered or mirrored DB servers to host Property DBs.
        • Crawl / index availability: Use multiple Crawler components or redundant Index servers. Add Crawl DBs.
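When several of these goals apply at once, the recommended actions overlap. As a small illustration, the table can be kept as a lookup and merged per farm; the dictionary keys and the `actions_for` helper are hypothetical names, while the action texts are taken from the slide.

```python
# Goal -> actions, transcribed from the slide (keys are shortened labels).
SCALE_OUT_ACTIONS = {
    "index freshness": [
        "Add more Indexer machines and/or Crawl components",
        "Add additional Crawl DB on the same SQL server",
        "Add additional SQL server(s) with additional Crawl DBs",
    ],
    "query latency": [
        "Partition the index into smaller index partitions",
        "Add Query components with mirror index partitions",
        "Add additional Crawl DB on the same SQL server",
        "Add additional SQL server(s) with additional Crawl DBs",
    ],
    "query availability": [
        "Deploy redundant Query servers, redundant Index partitions and components",
        "Use clustered or mirrored DB servers to host Property DBs",
    ],
    "crawl availability": [
        "Use multiple Crawler components or redundant Index servers",
        "Add Crawl DBs",
    ],
}


def actions_for(goals):
    """Merge the actions for several goals, keeping order and dropping repeats."""
    merged = []
    for goal in goals:
        for action in SCALE_OUT_ACTIONS[goal]:
            if action not in merged:
                merged.append(action)
    return merged


for action in actions_for(["index freshness", "query latency"]):
    print("-", action)
```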
    22. Components: Scaling cheat sheet
        • Columns: CPU, Network, Disk, Memory
        • Rows: Search administration; Crawling; Content processing (CPC); Analytics processing (APC); Index; Query processing (QPC)
        • [rating symbols in the table cells not preserved]
    23. Conclusion
    24. Thank you for attending! Contact: E-mail: aghy@aghy.hu; Blog: http://aghy.hu; Twitter: @molnaragnes