SphinxSE with MySQL

Published in: Technology, Design

  • Speaker note: show a dummy config file after this slide before moving on to the config options.
  • Transcript

    • 1.  
    • 2.
      • Introduction to Sphinx.
      • Sphinx Searching and Sorting Features.
      • Sphinx Implementation.
      • Demo.
    • 3.
    • 4.  
    • 5.
      • Open-source full-text search engine.
      • Developed by Andrew Aksyonoff.
      • Integrates well with MySQL.
      • Provides greatly improved full-text search.
      • Designed specifically for indexing database content.
    • 6.  
    • 7.  
    • 8.
      • Searching 500 MB of documents.
      • 3,000,000 documents in total.
      • Query: “internet web design” (match any).
      • Returns 134,000 documents.
    • 9.  
    • 10.  
    • 11.
      • Sphinx has two standalone programs:
      • indexer: pulls data from the database and builds indexes.
      • searchd: uses the indexes to answer queries.
      • Clients talk to searchd in one of two ways:
      • Via native APIs: PHP, Python, Perl, Ruby, and Java.
      • Via SphinxSE.
      • indexer periodically rebuilds the indexes:
      • Typically from cron jobs.
      • Searching keeps working during rebuilds (live updates).
    • 12.
      • Sphinx documents = records in the database.
      • A document is like a row in a database table and has its own unique ID.
      • Each document consists of fields and attributes.
      • Fields are the columns that are full-text searched.
      • Attributes can be used for filtering, sorting, and grouping.
    • 13.
      • The Sphinx search engine returns only unique document IDs (with match weights and attribute values), not the document text.
      • For example, a search for “Dominos” returns the unique IDs of the rows that contain it.
      • So after the search returns its results, you will most likely still need to fetch the document details for your final result page.
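The two-step lookup described on this slide can be sketched as follows. This is a minimal illustration, not the presenter's code: an in-memory SQLite table stands in for the MySQL table, the `restaurants` table and its rows are hypothetical, and `ranked_ids` is a hard-coded stand-in for the IDs a Sphinx search would return.

```python
import sqlite3


def fetch_details(conn, ranked_ids):
    """Fetch full rows for the IDs a search returned, keeping the search order."""
    if not ranked_ids:
        return []
    placeholders = ",".join("?" for _ in ranked_ids)
    rows = conn.execute(
        f"SELECT id, name FROM restaurants WHERE id IN ({placeholders})",
        ranked_ids,
    ).fetchall()
    by_id = {row[0]: row for row in rows}
    # IN (...) does not preserve order, so re-apply the ranking from the search.
    return [by_id[doc_id] for doc_id in ranked_ids if doc_id in by_id]


# Hypothetical data standing in for the real database table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE restaurants (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO restaurants VALUES (?, ?)",
    [(1, "Dominos Pizza, MG Road"), (2, "Cafe Blue"), (3, "Dominos Express")],
)

ranked_ids = [3, 1]  # assumption: the IDs the search step returned, best match first
details = fetch_details(conn, ranked_ids)
```

The re-ordering step matters because the database returns rows in its own order, while the result page should follow the ranking produced by the search.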
    • 14.
    • 15.
      • SELECT id
      • FROM sphinx_table
      • WHERE query = 'dominos;mode=ext2;weights=1000,100,10;sort=attr_asc:group_id';
      • dominos: the text you want to search for.
      • mode=ext2: the matching mode.
      • weights=1000,100,10: the weight distribution across fields.
      • sort=attr_asc:group_id: the sorting type and attribute.
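The query column takes a single string: the search text followed by semicolon-separated `key=value` options. A small helper can assemble that string; the sketch below is only an illustration, and the function name `build_sphinxse_query` is hypothetical rather than part of any Sphinx API.

```python
def build_sphinxse_query(text, **options):
    """Join search text and options into one SphinxSE query string."""
    if ";" in text:
        # Assumption for this sketch: reject semicolons instead of escaping them.
        raise ValueError("search text must not contain ';'")
    parts = [text]
    for key, value in options.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(str(item) for item in value)
        parts.append(f"{key}={value}")
    return ";".join(parts)


query = build_sphinxse_query(
    "dominos",
    mode="ext2",
    weights=(1000, 100, 10),
    sort="attr_asc:group_id",
)
# The string is then passed as a bound parameter, for example:
#   SELECT id FROM sphinx_table WHERE query = %s
```

Passing the assembled string as a bound parameter, rather than pasting it into the SQL text, avoids quoting problems when the search text comes from user input.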
    • 16.
      • SPH_MATCH_ALL: matches all keywords.
      • SPH_MATCH_ANY: matches any of the keywords.
      • SPH_MATCH_BOOLEAN: no relevance ranking; implicit Boolean AND between keywords unless specified otherwise.
      • 1. hello & world
      • 2. hello | world
      • 3. hello -world
      • SPH_MATCH_PHRASE: treats the query as a phrase and requires an exact match.
      • SPH_MATCH_EXTENDED: superseded by SPH_MATCH_EXTENDED2.
      • SPH_MATCH_EXTENDED2: supports the extended query syntax with additional operators.
    • 17.
      • Field search operator: @title hello @body world
      • Quorum matching operator: "world is wonderful place"/3
      • Proximity search operator: "hello world"~10
      • Strict order operator: black << cat
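To make the quorum operator concrete: a query with `/3` matches a document that contains at least 3 of the listed words. The toy function below illustrates only that rule, using naive whitespace tokenization; it is not how Sphinx implements matching, and `quorum_match` is a hypothetical name.

```python
def quorum_match(document, query_words, threshold):
    """Return True if at least `threshold` distinct query words occur in the document."""
    doc_words = set(document.lower().split())
    found = sum(1 for word in set(query_words) if word.lower() in doc_words)
    return found >= threshold


query_words = ["world", "is", "wonderful", "place"]

# "world", "is" and "wonderful" are present, so the quorum of 3 is met.
match_a = quorum_match("the world is a wonderful thing", query_words, 3)

# Only "world" is present, so the quorum of 3 is not met.
match_b = quorum_match("hello world", query_words, 3)
```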
    • 18.
      • Phrase ranking: documents containing the query as a matching phrase, such as “hello world”, rank higher.
      • Statistical ranking: word frequency matters more, i.e. a document containing more occurrences of “hello” and/or “world” gets a higher weight.
    • 19.
      • SPH_MATCH_BOOLEAN: no weighting is performed.
      • SPH_MATCH_ALL and SPH_MATCH_PHRASE: use phrase ranking.
      • SPH_MATCH_ANY: phrase rank * big value + statistical rank
      • (Multiplying by a big value guarantees that a higher phrase rank in any field ranks the match higher, even if that field's weight is low.)
      • SPH_MATCH_EXTENDED: (phrase rank + BM25) * 1000.
      • Custom weighting: set per-field weights with the “weights” option in your Sphinx query. This is typically used to give some of the searched columns more importance than others.
      • E.g. weights=1,2,3; (this is possible in mode=ext2).
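The effect of per-field weights can be illustrated with a toy weighted sum. This is a deliberately simplified stand-in (plain hit counts per field), not Sphinx's actual ranking formula; the field order title, summary, body is an assumption made for the example.

```python
def weighted_score(field_hits, weights):
    """Sum per-field hit counts, each scaled by that field's weight."""
    return sum(weight * hits for weight, hits in zip(weights, field_hits))


weights = (1000, 100, 10)  # hypothetical fields: title, summary, body

doc_a = (1, 0, 0)  # one hit in the title only
doc_b = (0, 3, 5)  # three hits in the summary, five in the body

score_a = weighted_score(doc_a, weights)  # 1 * 1000
score_b = weighted_score(doc_b, weights)  # 3 * 100 + 5 * 10
```

With weights spaced this widely, a single hit in the highest-weighted field outranks several hits in the lower-weighted ones, which is the usual reason for choosing values like 1000,100,10.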
    • 20.
      • SPH_SORT_RELEVANCE: sorts by relevance in descending order.
      • SPH_SORT_ATTR_DESC: sorts by an attribute in descending order.
      • SPH_SORT_ATTR_ASC: sorts by an attribute in ascending order.
      • SPH_SORT_TIME_SEGMENTS: sorts by time segment (last hour/day/week/month) in descending order.
      • SPH_SORT_EXTENDED: sorts by an SQL-like combination of columns, each in ascending or descending order.
      • SPH_SORT_EXPR: sorts by an arithmetic expression over columns.
    • 21.
    • 22.
      • Installation is usually straightforward.
      • Requirements:
      • A working C++ compiler.
      • A good make program.
      • Steps:
      • $ ./configure --prefix /path --with-mysql --with-pgsql
      • $ make
      • $ make install
    • 23. Checking SphinxSE Installation
    • 24.
      • Two components need to be set up before Sphinx is ready for searching:
      • Sphinx table
      • Configuration file (e.g. file_name.conf)
    • 25.
      • Requirements:
      • The first three columns must have the types INT, INT, and VARCHAR;
      • they map to the document ID, the match weight, and the search query.
      • The query column must be indexed; no other column may be indexed.
      • Any further columns correspond to the attributes declared in the source.
      • CREATE TABLE sphinx_table
      • (
      • id INT NOT NULL,
      • weight INT NOT NULL,
      • query VARCHAR(255) NOT NULL,
      • KEY (query)
      • ) ENGINE=SPHINX CONNECTION='sphinx://localhost:3313/city_search_cust_mess';
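The CONNECTION string packs three things into one URL-like value: the searchd host, the searchd port, and the name of the index to query. The sketch below simply splits the slide's example value into those parts with Python's standard URL parser; `parse_sphinx_connection` is a hypothetical helper, not something SphinxSE provides.

```python
from urllib.parse import urlparse


def parse_sphinx_connection(connection):
    """Split a sphinx://host:port/index string into (host, port, index)."""
    parts = urlparse(connection)
    if parts.scheme != "sphinx":
        raise ValueError("expected a sphinx:// connection string")
    return parts.hostname, parts.port, parts.path.lstrip("/")


host, port, index = parse_sphinx_connection(
    "sphinx://localhost:3313/city_search_cust_mess"
)
```

The port here has to match the port that searchd listens on, and the index name has to match an index defined in the configuration file.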
    • 26.
      • The configuration file has four sections to configure:
      • Source (multiple)
      • Index (multiple)
      • Indexer
      • Searchd
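A dummy configuration file with these four sections might look like the output of the sketch below. Every value is an assumption chosen for illustration (source name, credentials, SQL query, file paths); only the index name and port are taken from the CONNECTION string shown earlier in the deck. The helper `render_conf` is hypothetical and just prints blocks in the `name { key = value }` layout that sphinx.conf uses.

```python
def render_conf(sections):
    """Render (header, options) pairs in sphinx.conf block syntax."""
    blocks = []
    for header, options in sections:
        lines = [header, "{"]
        lines += [f"    {key} = {value}" for key, value in options]
        lines.append("}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


# All names, credentials and paths below are placeholders, not real settings.
sections = [
    ("source city_src", [
        ("type", "mysql"),
        ("sql_host", "localhost"),
        ("sql_user", "sphinx_user"),
        ("sql_pass", "secret"),
        ("sql_db", "city"),
        ("sql_query", "SELECT id, group_id, title, body FROM messages"),
        ("sql_attr_uint", "group_id"),
    ]),
    ("index city_search_cust_mess", [
        ("source", "city_src"),
        ("path", "/var/data/city_search_cust_mess"),
        ("docinfo", "extern"),
    ]),
    ("indexer", [
        ("mem_limit", "32M"),
    ]),
    ("searchd", [
        ("port", "3313"),
        ("log", "/var/log/searchd.log"),
        ("pid_file", "/var/run/searchd.pid"),
    ]),
]

conf_text = render_conf(sections)
print(conf_text)
```

Options are kept as a list of pairs rather than a dictionary so that a key can appear more than once, which a source section needs when it declares several attributes.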
    • 27.
    • 28.
      • Some of the options available in the source section of the configuration file:
      • Type:
      • type: data source type.
      • Possible values: mysql, pgsql, xmlpipe, xmlpipe2.
      • Connection info:
      • sql_host: SQL server host to connect to (mandatory).
      • sql_port: SQL server port to connect to (default 3306).
      • sql_user: SQL user to use when connecting to sql_host (mandatory).
      • sql_pass: SQL user password to use when connecting to sql_host (mandatory).
      • sql_db: SQL database to use.
      • sql_sock: socket name to connect to for local SQL servers.
    • 29.
      • Query info:
      • sql_query_pre: pre-fetch query, or pre-query.
      • e.g. sql_query_pre = SET NAMES utf8
      • sql_query: main document fetch query.
      • sql_query_post: post-fetch query.
      • e.g. sql_query_post = DROP TABLE my_tmp_table
      • sql_query_info: document info query, used only by the command-line search utility to display document details.
      • Attribute info:
      • sql_attr_xxx: attribute declaration (xxx: uint, bigint, float, str2ordinal, timestamp).
    • 30.
    • 31.
      • type: index type; optional (possible values: local, distributed).
      • source: adds a document source to a local index; multi-value.
      • path: index file path and file name (without extension).
      • docinfo: storage mode for document attribute values (inline, extern).
      • mlock: memory locking for cached data (optional, default 0).
      • min_word_len: minimum indexed word length (optional, default 1).
      • charset_type: character set encoding type.
    • 32.
      • Stemming options:
      • morphology: a list of morphology preprocessors to apply.
      • e.g. cars becomes car; running becomes run.
      • stopwords: list of stopword files (space separated).
      • e.g. files listing words such as the, is, are, an, a.
    • 33.
    • 34. Setting Configuration File: Indexer Section
      • mem_limit: indexing RAM usage limit (optional, default 32 MB).
      • max_iops: maximum I/O operations per second.
      • max_iosize: maximum allowed I/O operation size.
    • 35.
    • 36. Setting Configuration File: Searchd Section
      • address: IP address to bind to; the default 0.0.0.0 listens on all interfaces.
      • port: searchd TCP port number (mandatory, default 3312).
      • log: log file name (optional, default is empty).
      • query_log: query log file name (optional, default is empty).
      • pid_file: searchd process ID file name (mandatory).
      • max_matches: maximum number of matches that the daemon keeps in RAM for each index and can return to the client (optional, default 1000).
      • preopen_indexes: whether to forcibly preopen all indexes on startup (optional, default 0, i.e. do not preopen).
    • 37.  
    • 38.  
    • 39.
    • 40.  
    • 41.  
