A near real time search and alert engine powered by SolR Lucene

The trade-off between scale and update rate that search engines face on the Web 2.0. How enhanced indexing and smart filtering enable near-real-time engines. SolR Lucene ultra-fast search server and the user-defined "websphere" (feeds and filters).

Presentation Transcript

  • 1 A near-real-time search and alert service based on SolR Lucene. April 2013. www.visibium.com
  • 3 The need
    Questions:
      • What’s new with NFC technology?
      • What is said about my competitors?
      • What is said by my competitors?
      • What’s said about my brand?
      • What’s said about the key executives of my company?
      • What’s said about my last marketing campaign?
      • What’s said about my product launch?
      • What’s said about my last ad campaign?
    Use cases: industry watch, competition watch, brand protection, campaign impact analysis.
    I need to continuously search the Web 2.0 on certain topics.
  • 4 The need
    I need to continuously search the Web 2.0 on certain topics.
      • I know where to look.
      • I know what I’m looking for…
      • … and I want to get an alert when new matching content is posted.
      • Within minutes, not the day after.
  • 5 The Problem
    I need to continuously search the Web 2.0 on certain topics. I want to get an alert when new matching content is posted… within minutes, not the day after.
      • Some websites take days to get indexed by the major search engines (Google, Bing, Yahoo!…).
      • Alert services are only as good as their indexing rate. A day, not a minute, is the norm (except for breaking news and weather alerts).
      • Real “real-time search” engines (OneRiot, Wowd, Crowdeye, Collecta) failed because the technology involved massive R&D costs.
      • Google closed its real-time search service in 2011.
  • 8 The State of the Union
    … within minutes, not the day after.
    Narrow look, deep digging vs. broad look, shallow digging
    Social Web monitoring and trending solutions:
      • Look at big chunks of the Web.
      • Detect trends, mood, new topics, influencers, etc.
      • Typically can’t single out the contributions that match a user-defined query.
    Near-real-time search engines:
      • Typically look at the most popular content feeds, and run indexing at frequent intervals (hence the near-real-time).
      • Some offer powerful query tools to users.
  • 9 Let’s dig deep
    Deep digging is about using powerful query tools, which require full-text indexing (among other things).
    The less data, the “nearer” to real time.
    So… full-text indexing carries a trade-off between scale and update rate.
  • 10 Let’s dig deeper
    Deep digging is about using powerful query tools, which require full-text indexing (among other things).
    Full-text indexing carries a trade-off between scale and update rate.
    The less data, the “nearer” to real time.
    So… two directions for a nearer real time:
      • Enhanced indexing
      • Smart selection of data to index
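One way to see why less data brings the index "nearer" to real time is a toy steady-state model. This model is an assumption made for illustration and does not come from the deck: documents arrive at `arrival_rate` per second, the indexer processes `index_rate` per second, and each commit costs a fixed `commit_overhead` in seconds. A cycle of length T must then satisfy T = commit_overhead + (arrival_rate / index_rate) * T. The numbers in the usage lines are hypothetical.

```python
def cycle_time(commit_overhead: float, arrival_rate: float, index_rate: float) -> float:
    """Steady-state indexing cycle length in seconds under the toy model.

    Solves T = commit_overhead + (arrival_rate / index_rate) * T for T.
    """
    if commit_overhead < 0 or arrival_rate < 0 or index_rate <= 0:
        raise ValueError("rates and overhead must be non-negative, index_rate positive")
    load = arrival_rate / index_rate
    if load >= 1.0:
        # The indexer can never catch up, so no finite cycle length exists.
        raise ValueError("arrival rate must be below the indexing rate")
    return commit_overhead / (1.0 - load)


if __name__ == "__main__":
    # Hypothetical figures: indexing everything vs. indexing a filtered 10% slice.
    print("unfiltered:", cycle_time(5.0, 500.0, 1000.0))
    print("filtered:  ", cycle_time(5.0, 50.0, 1000.0))
```

In this model, a faster indexer (higher `index_rate`) and a smaller input (lower `arrival_rate`) both shorten the cycle, which corresponds to the two directions named on the slide.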
  • 11 Enhanced indexing
    What do Apple, Netflix, Wikipedia, LinkedIn, eBay and Twitter have in common?
  • 12 Enhanced indexing with SolR Lucene
    What do Apple, Netflix, Wikipedia, LinkedIn, eBay and Twitter have in common?
  • 13 Enhanced indexing with SolR Lucene
    Picking the right tools for the job
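The deck does not say how Solr is configured for frequent updates. As a hedged illustration, the sketch below builds (but does not send) an HTTP request to Solr's JSON update handler with the `commitWithin` parameter, which asks Solr to commit the posted documents within the given number of milliseconds. The host, the core name `feeds`, and the document fields are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_update_request(base_url: str, core: str, docs: list,
                         commit_within_ms: int = 1000) -> Request:
    """Build a POST request for Solr's /update handler without sending it."""
    query = urlencode({"commitWithin": commit_within_ms})
    url = f"{base_url.rstrip('/')}/solr/{core}/update?{query}"
    body = json.dumps(docs).encode("utf-8")  # JSON array of documents
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"},
                   method="POST")


if __name__ == "__main__":
    # Hypothetical document; field names depend on the core's schema.
    docs = [{"id": "tweet-1", "text": "NFC payments pilot launched"}]
    req = build_update_request("http://localhost:8983", "feeds", docs)
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

A shorter `commit_within_ms` makes new documents searchable sooner at the cost of more frequent commits, which is the update-rate side of the trade-off.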
  • 14 Limiting the indexed data
    Basic architecture
    Content feeds:
      • Twitter public stream (firehose)
      • Twitter private feeds
      • Facebook updates
      • Syndicated content (RSS)
      • Blogs, forums
      • News
    SEARCH:
      • Watch
      • Queries
    Matching results:
      • Alerts
      • Dispatch
  • 15 Limiting the indexed data
    Selective architecture
    Content feeds:
      • Twitter public stream (firehose)
      • Twitter private feeds
      • Facebook updates
      • Syndicated content (RSS)
      • Blogs, forums
      • News
    FILTERS:
      • Geo (e.g. local search engine)
      • Audience (e.g. most popular)
      • Buzz (e.g. #tags)
    Filtered data index
    SEARCH:
      • Watch
      • Queries
    Matching results:
      • Alerts
      • Dispatch
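The selective architecture on slide 15 places filters between the content feeds and the index. A minimal sketch of that stage follows. The `Item` fields, the filter helpers, and the choice to combine filters with a logical AND are all assumptions for illustration; the deck only names the filter families (geo, audience, buzz).

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Item:
    """A hypothetical feed item; field names are illustrative."""
    text: str
    country: str
    followers: int
    tags: frozenset = frozenset()


Filter = Callable[[Item], bool]


def geo_filter(countries: Iterable[str]) -> Filter:
    allowed = set(countries)
    return lambda item: item.country in allowed


def audience_filter(min_followers: int) -> Filter:
    return lambda item: item.followers >= min_followers


def buzz_filter(tags: Iterable[str]) -> Filter:
    wanted = {t.lower() for t in tags}
    return lambda item: any(t.lower() in wanted for t in item.tags)


def select_for_indexing(items: Iterable[Item], filters: Iterable[Filter]) -> List[Item]:
    """Keep only the items that pass every filter (assumed AND semantics)."""
    active = list(filters)
    return [item for item in items if all(f(item) for f in active)]


if __name__ == "__main__":
    feed = [
        Item("NFC pilot in Paris", "FR", 12000, frozenset({"NFC"})),
        Item("Lunch photo", "FR", 40, frozenset({"food"})),
        Item("NFC rollout", "US", 90000, frozenset({"nfc"})),
    ]
    kept = select_for_indexing(feed, [geo_filter(["FR"]), buzz_filter(["nfc"])])
    print([item.text for item in kept])
```

Only the items returned by `select_for_indexing` would be handed to the indexer, so the index stays small enough to refresh frequently.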
  • 16 Smart selection of data to index
    User-defined filters
    Content feeds:
      • Twitter public stream (firehose)
      • Twitter private feeds
      • Facebook updates
      • Syndicated content (RSS)
      • Blogs, forums
      • News
    FILTERS:
      • User-defined filters
    Filtered data index
    SEARCH:
      • Watch
      • Queries
      • Refined queries (reprocessing)
    Matching results:
      • Alerts
      • Dispatch
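The watch, query, and reprocessing boxes on slide 16 can be sketched with a small in-memory stand-in for the search server. Everything here is hypothetical: the `AlertEngine` class, the all-terms-must-match query semantics, and the tokenizer are illustration choices, not Visibium's or Solr's behavior. Standing "watch" queries are checked when a document is indexed (producing alerts), while `search` reruns a refined query over what is already indexed.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase alphanumeric terms; a deliberately simple analyzer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class AlertEngine:
    def __init__(self) -> None:
        self.docs: Dict[int, str] = {}
        self.postings: Dict[str, Set[int]] = defaultdict(set)  # term -> doc ids
        self.watches: Dict[str, Set[str]] = {}                 # name -> query terms

    def watch(self, name: str, query: str) -> None:
        """Register a standing query."""
        self.watches[name] = tokenize(query)

    def add(self, doc_id: int, text: str) -> List[str]:
        """Index a document and return the names of the watches it triggers."""
        terms = tokenize(text)
        self.docs[doc_id] = text
        for term in terms:
            self.postings[term].add(doc_id)
        return sorted(name for name, query in self.watches.items()
                      if query and query <= terms)

    def search(self, query: str) -> List[int]:
        """Reprocess the existing index: ids of docs containing every query term."""
        terms = tokenize(query)
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(hits)


if __name__ == "__main__":
    engine = AlertEngine()
    engine.watch("nfc-payments", "NFC payments")
    print(engine.add(1, "New NFC payments pilot launched"))
    print(engine.add(2, "NFC tags in retail"))
    print(engine.search("nfc retail"))
```

Matching watches at indexing time means an alert can be dispatched as soon as a document is added, without waiting for a separate query pass.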
  • 17 Visibium
      • A near-real-time search and alert service
      • User-defined feeds and filters
      • Full-text indexing
      • Advanced queries
      • Refined search reprocessing
      • Powered by SolR Lucene
    Monitor the slice of the web you really care about
    © Visibium, 2011-2013. www.visibium.com