# https://github.com/rubygems/gemcutter/blob/master/app/models/rubygem.rb#L29-33
#
def self.search(query)
  where("versions.indexed and (upper(name) like upper(:query) or upper(versions.description) like upper(:query))", {:query => "%#{query.strip}%"}).
    includes(:versions).
    order("rubygems.downloads desc")
end
Search (mostly) sucks.
Why?
ElasticSearch
WHY SEARCH SUCKS?
How do you implement search?

  class MyModel
    include Whatever::Search    # MAGIC
  end

  MyModel.search "whatever"
How do you implement search?

  Query   Results   Result   MAGIC

  def search
    @results = MyModel.search params[:q]
    respond_with @results
  end
A personal story...
Compare your search library with your ORM library

  MyModel.search "(this OR that) AND NOT whatever"

  Arel::Table.new(:articles).
    where(articles[:title].eq('On Search')).
    where(["published_on => ?", Time.now]).
    join(comments).
    on(articles[:id].eq(comments[:article_id])).
    take(5).
    skip(4).
    to_sql
Your data, your search.
HOW DOES SEARCH WORK?
A collection of documents

  file_1.txt   The ruby is a pink to blood-red colored gemstone ...
  file_2.txt   Ruby is a dynamic, reflective, general-purpose object-oriented programming language ...
  file_3.txt   "Ruby" is a song by English rock band Kaiser Chiefs ...
How do you search documents?

  File.read('file1.txt').include?('ruby')
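To see why scanning does not scale, here is a minimal sketch that reads every document on every query. The DOCUMENTS hash stands in for the three files above, and `scan_search` is a hypothetical helper name, not part of any library.

```ruby
# In-memory stand-ins for file_1.txt .. file_3.txt (text taken from the slide above).
DOCUMENTS = {
  'file_1.txt' => 'The ruby is a pink to blood-red colored gemstone',
  'file_2.txt' => 'Ruby is a dynamic, reflective, general-purpose object-oriented programming language',
  'file_3.txt' => '"Ruby" is a song by English rock band Kaiser Chiefs'
}.freeze

# Hypothetical helper: checks the full text of every document for every
# query, so the work grows with the total size of the collection.
def scan_search(query)
  needle = query.downcase
  DOCUMENTS.select { |_name, content| content.downcase.include?(needle) }.keys
end

p scan_search('ruby')
p scan_search('song')
```

Every call touches all the text again; nothing learned from one query helps the next.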
The inverted index

  TOKENS        POSTINGS
  ruby          file_1.txt  file_2.txt  file_3.txt
  pink          file_1.txt
  gemstone      file_1.txt
  dynamic       file_2.txt
  reflective    file_2.txt
  programming   file_2.txt
  song          file_3.txt
  english       file_3.txt
  rock          file_3.txt

http://en.wikipedia.org/wiki/Index_(search_engine)#Inverted_indices
Searching the inverted index

  MySearchLib.search "ruby"
    ruby          file_1.txt  file_2.txt  file_3.txt

  MySearchLib.search "song"
    song          file_3.txt
A naïve Ruby implementation

module SimpleSearch
  def index document, content
    tokens = analyze content
    store document, tokens
    puts "Indexed document #{document} with tokens:", tokens.inspect, "\n"
  end

  def analyze content
    # >>> Split content by words into "tokens"
    content.split(/\W/).
    # >>> Downcase every word
    map { |word| word.downcase }.
    # >>> Reject stop words, digits and whitespace
    reject { |word| STOPWORDS.include?(word) || word =~ /^\d+/ || word == '' }
  end

  def store document_id, tokens
    tokens.each do |token|
      # >>> Save the "posting"
      ( (INDEX[token] ||= []) << document_id ).uniq!
    end
  end

  def search token
    puts "Results for token #{token}:"
    # >>> Print documents stored in index for this token (none if the token is unknown)
    (INDEX[token] || []).each { |document| puts " * #{document}" }
  end

  INDEX = {}
  # NOTE: the stop word list is cut off in the source slide after "there t"
  STOPWORDS = %w|a an and are as at but by for if in is it no not of on or that the then there|

  extend self
end
Indexing documents

  SimpleSearch.index "file1", "Ruby is a language. Java is also a language."
  SimpleSearch.index "file2", "Ruby is a song."
  SimpleSearch.index "file3", "Ruby is a stone."
  SimpleSearch.index "file4", "Java is a language."

  Indexed document file1 with tokens:
  ["ruby", "language", "java", "also", "language"]
  Indexed document file2 with tokens:
  ["ruby", "song"]
  Indexed document file3 with tokens:
  ["ruby", "stone"]
  Indexed document file4 with tokens:
  ["java", "language"]

Words downcased, stopwords removed.
Search the index

  SimpleSearch.search "ruby"

  Results for token ruby:
  * file1
  * file2
  * file3
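A single-token lookup is one hash access. A query with several tokens can be answered by intersecting their postings lists. The sketch below assumes an index shaped like the one built by the four `index` calls above; `POSTINGS` and `search_all` are hypothetical names used only for this illustration.

```ruby
# Postings as they would look after indexing file1..file4 from the slide above.
POSTINGS = {
  'ruby'     => %w[file1 file2 file3],
  'language' => %w[file1 file4],
  'java'     => %w[file1 file4],
  'also'     => %w[file1],
  'song'     => %w[file2],
  'stone'    => %w[file3]
}.freeze

# Hypothetical helper: returns the documents that contain ALL given tokens
# by intersecting the postings list of each token.
def search_all(*tokens)
  lists = tokens.map { |token| POSTINGS.fetch(token.downcase, []) }
  lists.reduce(:&) || []
end

p search_all('ruby', 'language')
p search_all('java', 'language')
p search_all('ruby', 'python')
```

An unknown token contributes an empty list, so the intersection is empty as well.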
The inverted index, with posting counts

  TOKENS           POSTINGS
  ruby        3    file_1.txt  file_2.txt  file_3.txt
  pink        1    file_1.txt
  gemstone         file_1.txt
  dynamic          file_2.txt
  reflective       file_2.txt
  programming      file_2.txt
  song             file_3.txt
  english          file_3.txt
  rock             file_3.txt

http://en.wikipedia.org/wiki/Index_(search_engine)#Inverted_indices
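The numbers next to "ruby" and "pink" appear to be posting counts, that is, how many documents contain the token. A minimal sketch of deriving them from the postings, with `TOKEN_POSTINGS` and `document_frequencies` as hypothetical names:

```ruby
# A few rows of the table above (token => documents containing it).
TOKEN_POSTINGS = {
  'ruby' => %w[file_1.txt file_2.txt file_3.txt],
  'pink' => %w[file_1.txt],
  'song' => %w[file_3.txt]
}.freeze

# Hypothetical helper: the count for a token is the length of its postings list.
def document_frequencies(index)
  index.transform_values(&:size)
end

document_frequencies(TOKEN_POSTINGS).each do |token, count|
  puts "#{token.ljust(6)} #{count}"
end
```

A token found in every document, like "ruby" here, says little about any single document; rarer tokens discriminate better.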
It is very practical to know how search works.
For instance, now you know that the analysis step is very important.
Most of the time, it's more important than the search step.
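A small sketch of why analysis matters: the same documents, indexed with two different analyzers, give different answers to the same query. `build_index` and both analyzers are hypothetical helpers written for this illustration only.

```ruby
# Hypothetical helper: builds token => [document ids] using the given analyzer.
def build_index(documents, &analyzer)
  documents.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |(id, content), index|
    analyzer.call(content).uniq.each { |token| index[token] << id }
  end
end

DOCS = { 'file2' => 'Ruby is a song.', 'file3' => 'Ruby is a stone.' }.freeze

# Analyzer 1: split on whitespace only, keep case and punctuation.
whitespace_only = ->(text) { text.split }
# Analyzer 2: split on non-word characters and downcase.
downcased_words = ->(text) { text.split(/\W+/).reject(&:empty?).map(&:downcase) }

raw_index      = build_index(DOCS, &whitespace_only)
analyzed_index = build_index(DOCS, &downcased_words)

p raw_index.fetch('ruby', [])       # nothing: only "Ruby" was stored
p analyzed_index.fetch('ruby', [])  # both documents
p raw_index.key?('song')            # false: the stored token is "song."
```

The lookup code is identical in both cases; only the analysis differs.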
The Search Engine Textbook

  Search Engines: Information Retrieval in Practice
  Bruce Croft, Donald Metzler and Trevor Strohman
  Addison Wesley, 2009
  http://search-engines-book.com
SEARCH IMPLEMENTATIONS
The Baseline Information Retrieval Implementation

  Lucene in Action
  Michael McCandless, Erik Hatcher and Otis Gospodnetic
  July, 2010
  http://manning.com/hatcher3
http://elasticsearch.org
HTTP / JSON / Schema-free / Index as Resource / Distributed / Queries / Facets / Mapping / Ruby
ELASTICSEARCH FEATURES
HTTP / JSON / Schema-free / Index as Resource

  # Add document (URL path is INDEX / TYPE / ID)
  curl -X POST "http://localhost:9200/articles/article/1" -d '{ "title" : "One" }'

  # Query
  curl -X GET "http://localhost:9200/articles/_search?q=One"
  curl -X POST "http://localhost:9200/articles/_search" -d '{
    "query" : { "terms" : { "tags" : ["ruby", "python"], "minimum_match" : 2 } }
  }'

  # Delete index
  curl -X DELETE "http://localhost:9200/articles"

  # Create index with settings and mapping
  curl -X PUT "http://localhost:9200/articles" -d '{
    "settings" : { "index" : { "number_of_shards" : 3, "number_of_replicas" : 2 } },
    "mappings" : { "document" : { "properties" : {
      "body" : { "type" : "string", "analyzer" : "snowball" }
    } } }
  }'
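Because the interface is plain HTTP and JSON, the same requests can be built with Ruby's standard library alone. The sketch below prepares the "create index" request from the slide without sending it; combining settings and mappings into one body is an assumption about how the slide's two fragments fit together.

```ruby
require 'json'
require 'net/http'
require 'uri'

uri = URI('http://localhost:9200/articles')

# Settings and mapping from the slide, expressed as a Ruby hash.
body = {
  settings: { index: { number_of_shards: 3, number_of_replicas: 2 } },
  mappings: { document: { properties: { body: { type: 'string', analyzer: 'snowball' } } } }
}

# Build the PUT request; generating the JSON from a hash keeps it well-formed.
request = Net::HTTP::Put.new(uri.path, 'Content-Type' => 'application/json')
request.body = JSON.generate(body)

puts "#{request.method} #{request.path}"
puts request.body

# Sending it requires a server listening on localhost:9200:
#   Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```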
Distributed

Index A is split into 3 shards, and duplicated in 2 replicas.

            Replicas
          A1  A1  A1
  Shards  A2  A2  A2
          A3  A3  A3

  curl -XPUT "http://localhost:9200/A/" -d '{
    "settings" : {
      "index" : {
        "number_of_shards" : 3,
        "number_of_replicas" : 2
      }
    }
  }'
  SHARDS:   improve indexing performance
  REPLICAS: improve search performance
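A rough sketch of the idea behind sharding: each document is assigned to one shard by hashing its id modulo the number of shards, and every shard exists in several copies. `Zlib.crc32` is only a stand-in here, an assumption for illustration and not the hash function ElasticSearch actually uses; `shard_for` is a hypothetical helper.

```ruby
require 'zlib'

NUMBER_OF_SHARDS   = 3
NUMBER_OF_REPLICAS = 2

# Hypothetical helper: maps a document id to a shard number in 0...number_of_shards.
# CRC32 is a stand-in hash, chosen only because it ships with Ruby.
def shard_for(document_id, number_of_shards = NUMBER_OF_SHARDS)
  Zlib.crc32(document_id.to_s) % number_of_shards
end

# Each shard exists once as a primary plus once per replica,
# which matches the 3 x 3 grid of boxes on the slide.
total_copies = NUMBER_OF_SHARDS * (1 + NUMBER_OF_REPLICAS)
puts "Shard copies in the cluster: #{total_copies}"

%w[1 2 3 4 5].each do |id|
  puts "document #{id} -> shard A#{shard_for(id) + 1}"
end
```

Writes for different documents land on different shards, while reads for one shard can be served by any of its copies.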
Queries

  $ curl -X GET "http://localhost:9200/_search?q=<YOUR QUERY>"

  Terms       apple
              apple iphone
  Phrases     "apple iphone"
  Proximity   "apple safari"~5
  Fuzzy       apple~0.8
  Wildcards   app*
              *pp*
  Boosting    apple^10 safari
  Range       [2011/05/01 TO 2011/05/31]
              [java TO json]
  Boolean     apple AND NOT iphone
              +apple -iphone
              (apple OR iphone) AND NOT review
  Fields      title:iphone^15 OR body:iphone
              published_on:[2011/05/01 TO "2011/05/27 10:00:00"]

http://lucene.apache.org/java/3_1_0/queryparsersyntax.html
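Many of these queries contain spaces, quotes, colons or carets, so they need URL-encoding before they go into the `q` parameter. A minimal sketch using Ruby's standard library; `search_url` is a hypothetical helper, and the host is the one used in the curl example.

```ruby
require 'uri'

# Hypothetical helper: builds a _search URL with the query safely encoded.
def search_url(query, host: 'http://localhost:9200')
  "#{host}/_search?q=#{URI.encode_www_form_component(query)}"
end

puts search_url('apple AND NOT iphone')
puts search_url('"apple safari"~5')
puts search_url('title:iphone^15 OR body:iphone')
```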
Trigrams

  KROATIEN

  KRO  ROA  OAT  ATI  TIE  IEN
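The splitting shown above can be sketched in one line of Ruby. `ngrams` is a hypothetical helper name; this only illustrates the idea and is not how ElasticSearch implements its n-gram analysis.

```ruby
# Hypothetical helper: slides a window of `size` characters over the word.
def ngrams(word, size = 3)
  word.chars.each_cons(size).map(&:join)
end

p ngrams('KROATIEN')  # => ["KRO", "ROA", "OAT", "ATI", "TIE", "IEN"]
```

Indexing the trigrams instead of the whole word means two slightly different spellings still share most of their tokens.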
Ruby

With ActiveRecord:

  class Article < ActiveRecord::Base
    include Tire::Model::Search
    include Tire::Model::Callbacks
  end

With any other ORM:

  class Article
    include Whatever::ORM
    include Tire::Model::Search
    include Tire::Model::Callbacks
  end

In both cases:

  $ rake environment tire:import CLASS=Article

  Article.search do
    query { string 'love' }
    facet('timeline') { date :published_on, :interval => 'month' }
    sort { published_on 'desc' }
  end

http://github.com/karmi/tire
Try ElasticSearch and Tire with a one-line command.

  $ rails new tired -m "https://gist.github.com/raw/951343/tired.rb"

A “batteries included” installation.
Downloads and launches ElasticSearch.
Sets up a Rails application and launches it.
When you're tired of it, just delete the folder.
Thanks!