Enterprise Search Summit Keynote: A Big Data Architecture for Search

This presentation was given by Search Technologies' CEO Kamran Khan at the November 2013 Enterprise Search Summit / KMWorld in Washington DC. He discussed how modern search engines are being combined with powerful, independent content processing pipelines and distributed processing technologies from big data to form a new enterprise search architecture, delivering results that in the past were available only to the biggest companies with the deepest pockets. For more information, visit http://www.searchtechnologies.com/.

Published in: Technology

  1. A Big Data Architecture for Search. Kamran Khan, CEO. The expert in the search space
  2. Search Technologies Overview. Locations: Ascot, UK; Karlsruhe, DE; Cincinnati, OH; Herndon, VA; San Diego, CA; San Jose, CR
     • The leading IT Services company dedicated to Enterprise Search & Search-based Applications
     • Implementation, Consulting, Managed Services
     • 120 employees and growing
     • Independent, working with all of the leading software vendors and open source alternatives
  3. 500+ Customers
  4. What Is Big Data?
  5. Where Did Modern Big Data Come From? (Diagram labels: Web Servers, Content)
  6. What is Big Data? (Diagram labels: Log Files)
  7. What is Big Data?
     • Too big for a single machine: physically impossible for a single machine
     • Data Aggregation & Analysis: simply transforming data records is not enough; must aggregate / “boil down” the data
     • Batch Processing: very long running jobs (not real-time)
     • Message: Lots of Data  “Big Data”
  8. Enabling Technologies. Big Data For Search: Hadoop; Elastic / Cloud Computing; Modern Statistical Analysis
  9. What is Big Data? (Diagram labels: Content, Hadoop)
  10. A Traditional Integrated Architecture. Does a lot of what we need for Enterprise Search. (Diagram labels: Content Sources (SharePoint, File System, RDBMS, Employee Directory, etc.); Aspire Connectors; Connector; Search Engine; Index Pipeline; Search Index)
     Limitations:
     • Limited support for modern analytics
     • Limited support for content processing
     • Re-indexing takes too long
     • Limits ability to do continuous improvement cycle
  11. Why Content Processing is Important. (Diagram labels: Content Sources (Employee Directory, File System, RDBMS, etc.); Aspire Connectors; Connector; Content Processing; Index Pipeline; Search Engine; Search Index)
     • Powerful & Complete Content Processing Service
     • Clean and consistent data and metadata
     • Ability to supplement metadata
     • Support for Continuous Improvement Cycle
     • Develop and maintain processing IP
     • Ability to easily migrate to new search engines
  12. A New Enterprise Search Architecture. (Diagram labels: Content Sources (Employee Directory, File System, RDBMS, etc.); Aspire Connectors; Connector; Content Processing & Tokenization; Secure Cache (Docs, Log files, Supplemental Data); Analytics; Index Pipeline; Search Engine; Search Index)
     • Integrated Platform (Docs, Log Files and External data)
     • Reduced Cost
     • Better Agility and Scalability
     • Fast Reindexing
     • Expanded Functionality
  13. Advanced Features & Analytics Enabled
     • Search and Match
     • Forward and Reverse Citation
     • Latent Semantic Analysis
     • More Precise Term Weighting Beyond TF/IDF
     • Near Duplicate Detection
     • Document Topic Tagging
     • Results ranking including popularity
     • Recommendations based on user behavior
     • Suggested queries based on user behavior
  14. In Summary: Big Data Technology Will Revolutionize Enterprise Search
     New architecture structured for search, providing better:
     • Analytics and other functionality
     • Content processing
     • Agility and scalability
     • Economics
     Big Data architectures will significantly move search forward
  15. For further information: www.searchtechnologies.com
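
Slides 11 and 12 describe content processing as a stage separate from the search engine, with a secure cache that makes fast re-indexing possible. The idea can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the Aspire or Search Technologies implementation: the in-memory cache, the toy inverted index, the `process` and `build_index` functions, and the sample documents are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessedDoc:
    """A document after content processing (hypothetical structure)."""
    doc_id: str
    tokens: list[str]
    metadata: dict[str, str] = field(default_factory=dict)


def process(doc_id: str, raw_text: str, source: str) -> ProcessedDoc:
    """Content processing stage: tokenize the text and attach source metadata."""
    tokens = raw_text.lower().split()
    return ProcessedDoc(doc_id, tokens, {"source": source})


def build_index(cache: dict[str, ProcessedDoc]) -> dict[str, set[str]]:
    """Index pipeline stage: build a toy inverted index from the cache only."""
    index: dict[str, set[str]] = {}
    for doc in cache.values():
        for token in doc.tokens:
            index.setdefault(token, set()).add(doc.doc_id)
    return index


# Hypothetical content that connectors might pull from two sources.
sources = {
    "file_system": {"fs-1": "Quarterly Sales Report"},
    "rdbms": {"db-7": "Sales pipeline forecast"},
}

# Fetch and process each document once, storing the result in the cache.
cache = {
    doc_id: process(doc_id, text, source)
    for source, docs in sources.items()
    for doc_id, text in docs.items()
}

index = build_index(cache)

# Re-indexing repeats only the last stage and never touches the sources.
reindexed = build_index(cache)
```

Because `build_index` reads only from the cache, changing the index logic or moving to a different search engine would not require crawling the content sources again, which is the property the slides call fast re-indexing.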
