Enterprise Search @EPAM

Transcript

  • 1. Enterprise Search
      • Best Practices
      • Connector Framework
      • Relevancy overview
    SharePoint User Group 2013, March 26, Minsk. Confidential
  • 2. EPAM has more than 100 systems
      • knowledgebase.epam.com
      • trainings.epam.co
      • ???????
      • HR file shares
      • Information.epam.com
      • XXX.epam.com
      • Bla.bla.bla.epam.co
      • YYY.epam.com
      • Jira.epam.com
  • 4. Little homework
  • 5. We started the POC in September 2012
  • 6. Available as search.epam.com in November 2012
      • SharePoint 2010
      • FAST Search for SharePoint
      • Branded Search Center
      • Custom connectors
      • Fine-tuned relevance to reflect the EPAM landscape
  • 8. We become stronger every day…
      • 550 000 searchable items
      • 30+ content sources
      • 400+ daily searches
      • Exposed to the internet
  • 9. … to help you search
  • 10. What we’ve learned
      1. Deploy a “painkiller” project as soon as possible
      2. Connect as many systems as possible (Cap O. speaking)
      3. Analyze
          • Watch search logs
          • Connect external analytics
          • Speak with users
          • Feedback forms suck
      4. Tune relevancy
          • Hot-fix bugs using best bets
      5. Work with departments to adopt their content
          • Basic SEO
  • 11. Search Connectors in SP2010/2013
      Diagram: Search Connectors split into Protocol Handlers and BCS. Labels shown: Lotus Notes, File Share, Exchange, Custom, SharePoint, Database, WebSite, WebService, People, .NET
  • 12. BCS Connectors in SP 2010/2013
      Stereotyped Operations
      • Get IDs
      • Get By ID
      • Describe Security
      • Read Stream
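
A rough sketch of how a crawler might drive the four stereotyped operations listed on this slide. It is written in Python for brevity and is not the BCS API: `Item`, `InMemoryConnector`, `crawl` and every method name are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Item:
    title: str
    allowed_users: List[str]
    body: bytes


class InMemoryConnector:
    """Hypothetical connector exposing the four operations named on the slide."""

    def __init__(self, items: Dict[str, Item]) -> None:
        self._items = items

    def get_ids(self) -> List[str]:                           # "Get IDs"
        return sorted(self._items)

    def get_by_id(self, item_id: str) -> Item:                # "Get By ID"
        return self._items[item_id]

    def describe_security(self, item_id: str) -> List[str]:   # "Describe Security"
        return list(self._items[item_id].allowed_users)

    def read_stream(self, item_id: str) -> bytes:             # "Read Stream"
        return self._items[item_id].body


def crawl(connector: InMemoryConnector) -> List[dict]:
    """Enumerate IDs first, then fetch metadata, ACL and content per item."""
    index_entries = []
    for item_id in connector.get_ids():
        item = connector.get_by_id(item_id)
        index_entries.append({
            "id": item_id,
            "title": item.title,
            "acl": connector.describe_security(item_id),
            "size": len(connector.read_stream(item_id)),
        })
    return index_entries
```

The point of the split is that enumeration is cheap, while metadata, security and content are fetched per item only when the crawler needs them.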
  • 13. EPAM Data Import Framework
      • ISource
          Tree DescribeTree()
          Node DownloadData(Node)
          Implementations: Atlassian Confluence, SVN, PMC
      • IImporter
          Workflow:
          1. Source to build tree
          2. Destination to build tree
          3. Diff trees
          4. Destination to import diff (add, remove)
          Timer Job
      • IDestination
          Tree DescribeTree()
          void Import(Tree)
          Implementations: SharePoint Library, File System
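
The four-step workflow on this slide (build both trees, diff them, import the diff) can be sketched as follows. This is an illustration in Python, not the framework's actual code. The tree is simplified to a mapping of path to content hash, changed items are treated as "add", and `Diff`, `diff_trees`, `DictDestination` and `run_import` are hypothetical names (`import_diff` stands in for `Import(Tree)`, since `import` is reserved in Python).

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Assumption: a tree is flattened to {path: content_hash}.
Tree = Dict[str, str]


@dataclass
class Diff:
    add: Tree = field(default_factory=dict)        # new or changed items
    remove: Set[str] = field(default_factory=set)  # items gone from the source


def diff_trees(source: Tree, destination: Tree) -> Diff:
    """Step 3: compare the source tree with the destination tree."""
    add = {path: h for path, h in source.items() if destination.get(path) != h}
    remove = {path for path in destination if path not in source}
    return Diff(add, remove)


class DictDestination:
    """Hypothetical destination that keeps its tree in memory."""

    def __init__(self, tree: Tree) -> None:
        self.tree = dict(tree)

    def describe_tree(self) -> Tree:               # step 2
        return dict(self.tree)

    def import_diff(self, diff: Diff) -> None:     # step 4
        for path in diff.remove:
            self.tree.pop(path, None)
        self.tree.update(diff.add)


def run_import(source_tree: Tree, destination: DictDestination) -> Diff:
    """Steps 1-4 in order; source_tree is the result of step 1."""
    diff = diff_trees(source_tree, destination.describe_tree())
    destination.import_diff(diff)
    return diff
```

Running the import a second time against an unchanged source yields an empty diff, which is what makes a periodic timer job cheap.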
  • 14. BCS vs DataImport
      Comparison                 Data Import   BCS
      Effort to build the same   +             +
      Document Previews          +             -
      Indexing Speed             +             +/-
      Customizable               +             -
      Storage Space              -             +
      Unit Testing               +             +/-
      Incremental crawl          +             +/-
  • 15. RELEVANCY
  • 16. Search is a two-step process
      0. User submits query
      1. Get candidates: all docs that match the query
      2. Predict relevancy
          • Query terms importance
          • Proximity of query terms
          • Hit location (managed property) importance
          • Freshness
          • Clicks
          • User rating
          • …
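
A toy illustration of the two steps above: first select every document that matches the query, then order the candidates by a predicted relevancy score. The scorer here is a plain term-frequency count chosen only for the example. Real engines combine the features listed on the slide, and all function names are hypothetical.

```python
from typing import Callable, Dict, List


def get_candidates(docs: Dict[str, str], query: str) -> List[str]:
    """Step 1: keep documents that contain every query term."""
    terms = query.lower().split()
    return [
        doc_id
        for doc_id, text in docs.items()
        if all(term in text.lower().split() for term in terms)
    ]


def term_frequency_score(text: str, query: str) -> float:
    """Placeholder relevancy predictor: count query-term occurrences."""
    words = text.lower().split()
    return float(sum(words.count(term) for term in query.lower().split()))


def search(
    docs: Dict[str, str],
    query: str,
    score: Callable[[str, str], float] = term_frequency_score,
) -> List[str]:
    """Step 2: rank the candidates by the predicted relevancy."""
    candidates = get_candidates(docs, query)
    return sorted(candidates, key=lambda d: score(docs[d], query), reverse=True)
```

Keeping the scorer as a parameter mirrors the split on the slide: matching decides what is eligible, ranking decides the order.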
  • 17. Relevancy in FAST Search
      • Linear combination of features
      • RankProfile
      • Weights are configured via PowerShell
      • Easy to understand via RankLog
      • Easy tuning
          – Content Source
          – Managed Property
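
"Linear combination of features" means the rank is a weighted sum, which is also why a per-feature log is easy to read. A minimal sketch, assuming made-up feature names and weights (these are not actual RankProfile settings):

```python
from typing import Dict, Tuple


def linear_rank(
    features: Dict[str, float], weights: Dict[str, float]
) -> Tuple[float, Dict[str, float]]:
    """Return the total score and each feature's contribution to it."""
    contributions = {
        name: weights.get(name, 0.0) * value for name, value in features.items()
    }
    return sum(contributions.values()), contributions


# Hypothetical weights and feature values, for illustration only.
weights = {"title_hit": 2.0, "freshness": 0.5, "proximity": 1.0}
doc_features = {"title_hit": 1.0, "freshness": 0.5, "proximity": 0.0}
total, breakdown = linear_rank(doc_features, weights)
```

Because every contribution is simply weight times value, raising one weight changes the score in a predictable way, unlike the nonlinear model used in SharePoint.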
  • 18. RankLog example (QueryLogger @codeplex)
  • 19. Relevancy in SharePoint
  • 20. Relevancy in SharePoint
      • Nonlinear combination of features. Two Neural Networks.
      • Ranking Model Schema described
          • http://www.google.com/patents/US8296292
          • http://www.google.com/patents/US7840569
      • Cmdlets to import/export
      • Default Ranking Model Features:
          Type             Instance
          BM25             BM25
          Static           UrlDepth
          BucketedStatic   InternalFileType
          BucketedStatic   Language
          Static           ClickDistance
          Static           QueryLogClicks
          Static           QueryLogSkips
          Static           LastClicks
          Static           EventRate
          MinSpan - soft   Title
          MinSpan - soft   Title
          MinSpan - soft   Title
          MinSpan - soft   Content
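
The first feature in the table is BM25. The slide does not say which variant SharePoint uses, so the sketch below is the textbook Okapi BM25 formula with a non-negative IDF, shown only to give a feel for what the feature measures. The defaults `k1=1.2` and `b=0.75` are common textbook values, not SharePoint settings.

```python
import math
from typing import List


def bm25(
    query_terms: List[str],
    doc: List[str],
    corpus: List[List[str]],
    k1: float = 1.2,
    b: float = 0.75,
) -> float:
    """Textbook Okapi BM25 score of one tokenized document for a query."""
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    score = 0.0
    for term in query_terms:
        tf = doc.count(term)
        if tf == 0:
            continue
        # Document frequency: how many documents contain the term.
        df = sum(1 for d in corpus if term in d)
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        # Length normalization: long documents get less credit per hit.
        norm = tf + k1 * (1.0 - b + b * len(doc) / avg_len)
        score += idf * tf * (k1 + 1.0) / norm
    return score
```

Repeated hits raise the score with diminishing returns, and rare terms count more than common ones.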
  • 21. ExplainRank page
      • Google for “explain rank sharepoint”
      • Parses RankDetail managed property
  • 22. Ranking Model Tuning
  • 23. Ranking Model Tuning
      Approach described by Microsoft – http://msdn.microsoft.com/en-us/library/bb499682(v=office.12).aspx
      1. Collect Query Judgments
      2. Use Machine Learning to train Neural Network
      • namespace Microsoft.Office.Server.Search.RankerTuning
      • Wait for tuning tool
  • 24. Query Judgment framework
  • 25. Manual relevancy tuning in SharePoint
      • Authoritative Pages
      • QueryRules
          – Best Bets
          – Understanding User Intent
      • Synonyms (cmdlets)
      • Entity Extractors
      • Spelling Corrections
      • Query Suggestions
      • ManagedMetadata
      • (!) Query Builder
  • 26. Manual relevancy tuning in SharePoint
  • 27. Manual relevancy tuning in SharePoint
  • 28. SP 2013 REST Query tool
      • http://sp2013searchtool.codeplex.com/
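
For context, the kind of request such a tool issues against the SharePoint 2013 Search REST endpoint (`/_api/search/query`) can be sketched as below. The helper only builds the URL and sends nothing. The host name is a placeholder, `build_search_url` is a hypothetical helper, and doubling single quotes inside the query text is an assumption based on the usual OData string-literal convention.

```python
from urllib.parse import quote


def build_search_url(site_url: str, query_text: str, row_limit: int = 10) -> str:
    """Build a GET URL for the SharePoint 2013 Search REST endpoint."""
    # Assumption: single quotes inside the literal are escaped by doubling.
    escaped = query_text.replace("'", "''")
    return (
        f"{site_url.rstrip('/')}/_api/search/query"
        f"?querytext='{quote(escaped)}'&rowlimit={row_limit}"
    )


# Placeholder host; replace with a real SharePoint site URL.
url = build_search_url("https://search.example.com/", "enterprise search")
```

An authenticated GET on such a URL returns the result table, which is what the tool visualizes.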
  • 29. Alexey Kozhemiakin, Solution Architect, Enterprise Search