SphinxSE with MySQL
Published in: Technology, Design
  • Show a dummy config file after this slide before moving on to the config options
  • Transcript

    • 1.  
    • 2.
      • Introduction to Sphinx.
      • Sphinx Searching and Sorting Features.
      • Sphinx Implementation.
      • Demo.
    • 3.
      • Introduction to Sphinx.
      • Sphinx Searching and Sorting Features.
      • Sphinx Implementation.
      • Demo.
    • 4.  
    • 5.
      • Open Source Search Engine.
      • Developed by Andrew Aksyonoff
      • Integrates well with MySQL.
      • Provides greatly improved full-text search.
      • Specially designed for indexing databases.
    • 6.  
    • 7.  
    • 8.
      • Search over 500 MB of docs.
      • 3,000,000 docs in total.
      • Looking for “internet web design” (match any).
      • Returning 134,000 docs.
    • 9.  
    • 10.  
    • 11.
      • Sphinx has two standalone programs:
      • indexer: pulls data from the DB and builds indexes.
      • searchd: uses the indexes and answers queries.
      • Clients interact with searchd in one of two ways:
      • Via native APIs: PHP, Python, Perl, Ruby, and Java.
      • Via SphinxSE.
      • The indexer periodically rebuilds the indexes:
      • Typically using cron jobs.
      • Searching keeps working during rebuilds (live updates).
    • 12.
      • Sphinx documents = records in the DB.
      • A document is just like a row in the DB, and it has its own unique ID.
      • Each document comprises fields and attributes.
      • Fields are the columns on which we want to search.
      • Attributes may be used for filtering, sorting, and grouping.
    • 13.
      • The Sphinx search engine returns only unique document IDs.
      • This means that if we search for “Dominos”, we get the unique IDs of the rows that contain it.
      • Hence, after the search returns results, you will likely still need to fetch the document details for your final result page.
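A minimal sketch of that second step, assuming the search has already produced a ranked list of IDs. The `documents` table, its columns, and the sample rows are hypothetical, and `sqlite3` stands in for MySQL only so that the example runs anywhere:

```python
import sqlite3

def fetch_details(conn, ids):
    """Fetch full rows for the IDs a search returned, keeping their rank order."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, title FROM documents WHERE id IN ({placeholders})",
        list(ids),
    ).fetchall()
    by_id = {row[0]: row for row in rows}
    # IN (...) does not preserve order, so re-apply the search ranking here.
    return [by_id[i] for i in ids if i in by_id]

# Stand-in data (hypothetical table and rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?)",
    [(1, "Pizza Hut"), (2, "Dominos Pizza"), (3, "Dominos Express")],
)

ranked_ids = [3, 2]  # hypothetical IDs, in the order the search returned them
details = fetch_details(conn, ranked_ids)
print(details)
```

With SphinxSE the same effect can usually be had in one statement by joining the Sphinx table to the data table on the ID column; the two-step form above is what a native API client would typically do.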
    • 14.
      • Introduction to Sphinx.
      • Sphinx Searching and Sorting Features.
      • Sphinx Implementation.
      • Demo.
    • 15.
      • SELECT id
      • FROM sphinx_table
      • WHERE query = 'dominos;mode=ext2;weights=1000,100,10;sort=attr_asc:group_id';
      • The query string holds, separated by semicolons: the text to search for (dominos), the matching mode (mode), the per-field weights (weights), and the sort order (sort).
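Since everything SphinxSE needs travels inside one string in the `query` column, client code often assembles that string from parts. The helper below is a rough sketch of that idea; `build_sphinxse_query` is a hypothetical name, and the `%s` placeholder assumes a MySQL driver that uses that parameter style:

```python
def build_sphinxse_query(text, **options):
    """Join the search text and key=value options into one SphinxSE query string."""
    parts = [text]
    for key, value in options.items():
        # Lists such as per-field weights are written comma-separated.
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        parts.append(f"{key}={value}")
    return ";".join(parts)

query = build_sphinxse_query(
    "dominos",
    mode="ext2",
    weights=[1000, 100, 10],
    sort="attr_asc:group_id",
)

# Pass the string as a bound parameter rather than pasting it into the SQL.
sql = "SELECT id FROM sphinx_table WHERE query = %s"
params = (query,)
print(query)
```

The sketch does no escaping of the search text itself, so text containing semicolons would need extra handling.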
    • 16.
      • SPH_MATCH_ALL: matches all keywords.
      • SPH_MATCH_ANY: matches any of the keywords.
      • SPH_MATCH_BOOLEAN: no relevance; implicit Boolean AND between keywords if not specified otherwise.
      • 1. hello & world
      • 2. hello | world
      • 3. hello -world
      • SPH_MATCH_PHRASE: treats the query as a phrase and requires a perfect match.
      • SPH_MATCH_EXTENDED: superseded by SPH_MATCH_EXTENDED2.
      • SPH_MATCH_EXTENDED2: matches the query as an expression in the extended query syntax, which provides a variety of operators.
    • 17.
      • FIELD SEARCH OPERATOR: @title hello @body world
      • QUORUM MATCHING OPERATOR: "world is wonderful place"/3
      • PROXIMITY SEARCH OPERATOR: "hello world"~10
      • STRICT ORDER OPERATOR: black << cat
    • 18.
      • Phrase Ranking: documents that contain the matching phrase, such as “hello world”, are ranked higher.
      • Statistical Ranking: preference is given to word frequency, i.e. a document containing more occurrences of “hello” and/or “world” is given more weight.
    • 19.
      • SPH_MATCH_BOOLEAN : No weighting performed.
      • SPH_MATCH_ALL and SPH_MATCH_PHRASE : Uses Phrase Ranking.
      • SPH_MATCH_ANY: phrase rank * big value + statistical rank (multiplying by a big value guarantees that a higher phrase rank wins even if its field weight is low).
      • SPH_MATCH_EXTENDED: (phrase rank + BM25) * 1000.
      • Personalized Weighting: this can be done using the “weights” keyword in your Sphinx query. It is generally used when some of the searched columns should count for more than others.
      • E.g. weights=1,2,3; this is possible in mode=ext2.
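The effect of the "big value" in the SPH_MATCH_ANY formula can be seen with a toy calculation. This is only an illustration of the arithmetic on the slide, not Sphinx's actual ranking code; the multiplier of 1000 and the sample ranks are assumptions:

```python
BIG = 1000  # assumed multiplier; the slide only says "big value"

def match_any_weight(phrase_rank, statistical_rank, big=BIG):
    """Combine the two ranks so that the phrase rank always dominates."""
    return phrase_rank * big + statistical_rank

# (phrase_rank, statistical_rank) for two hypothetical documents
docs = {
    "a": (2, 5),    # better phrase rank, low statistical rank
    "b": (1, 999),  # worse phrase rank, very high statistical rank
}
ranking = sorted(docs, key=lambda d: match_any_weight(*docs[d]), reverse=True)
print(ranking)
```

Document "a" sorts first because its phrase rank is higher. The guarantee holds only while the statistical rank stays below the multiplier.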
    • 20.
      • SPH_SORT_RELEVANCE : Sorts by Relevance in DESC order.
      • SPH_SORT_ATTR_DESC : Sorts by an Attribute in DESC order.
      • SPH_SORT_ATTR_ASC : Sorts by an Attribute in ASC order.
      • SPH_SORT_TIME_SEGMENTS: sorts by time segment (hour/day/week/month) in DESC order.
      • SPH_SORT_EXTENDED: sorts by an SQL-like combination of columns, each in ASC or DESC order.
      • SPH_SORT_EXPR: sorts by an arithmetic expression involving columns.
    • 21.
      • Introduction to Sphinx.
      • Sphinx Searching and Sorting Features.
      • Sphinx Implementation.
      • Demo.
    • 22.
      • Installation is usually straightforward:
      • REQUIREMENTS:
      • A working C++ compiler.
      • A good make program.
      • STEPS:
      • $ ./configure --prefix /path --with-mysql --with-pgsql
      • $ make
      • $ make install
    • 23. Checking SphinxSE Installation
    • 24.
      • There are 2 components that we need to set up before Sphinx is ready for searching:
      • Sphinx Table
      • Configuration File (e.g.: file_name.conf )
    • 25.
      • Requirements:
      • The data types of the first 3 columns must be INT, INT, VARCHAR; they are mapped to the document ID, the match weight, and the search query.
      • The query column must be indexed, and no other column may be indexed.
      • Any further columns correspond to attributes declared in the source.
      • CREATE TABLE sphinx_table
      • (
      • id INT NOT NULL,
      • weight INT NOT NULL,
      • query VARCHAR(255) NOT NULL,
      • KEY (query)
      • ) ENGINE=SPHINX CONNECTION='sphinx://localhost:3313/city_search_cust_mess';
    • 26.
      • A configuration file has 4 sections to configure:
      • Source (multiple)
      • Index (multiple)
      • Indexer
      • Searchd
    • 27.
      • A configuration file has 4 sections to configure:
      • Source (multiple)
      • Index (multiple)
      • Indexer
      • Searchd
    • 28.
      • The following are some of the options available in the source section of the configuration file:
      • TYPE:
      • type: data source type.
      • Possible options: mysql, pgsql, xmlpipe, xmlpipe2.
      • Connection Info:
      • sql_host: SQL server host to connect to (mandatory).
      • sql_port: SQL server port to connect to (default 3306).
      • sql_user: SQL user to use when connecting to sql_host (mandatory).
      • sql_pass: SQL user password to use when connecting to sql_host (mandatory).
      • sql_db: SQL DB to be used.
      • sql_sock: socket name to connect to for local SQL servers.
    • 29.
      • Queries Info:
      • sql_query_pre: pre-fetch query, or pre-query.
      • e.g.: sql_query_pre = SET NAMES utf8
      • sql_query: main document fetch query.
      • sql_query_post: post-fetch query.
      • e.g.: sql_query_post = DROP TABLE my_tmp_table
      • sql_query_info: document info query (similar to a comment in MySQL).
      • Attributes Info:
      • sql_attr_xxx: attribute declaration (xxx: uint, bigint, float, str2ordinal, timestamp).
    • 30.
      • A configuration file has 4 sections to configure:
      • Source (multiple)
      • Index (multiple)
      • Indexer
      • Searchd
    • 31.
      • type: index type. Optional (possible options: local, distributed).
      • source: adds a document source to a local index. Multi-value.
      • path: index file path and file name (without extension).
      • docinfo: storage mode for document attribute values (inline or extern).
      • mlock: memory locking for cached data (optional, default 0).
      • min_word_len: minimum indexed word length (optional, default 1).
      • charset_type: character set encoding type.
    • 32.
      • Stemming Options:
      • morphology: a list of morphology preprocessors to apply.
      • e.g.: cars → car; running → run.
      • stopwords: stopword file list (space separated).
      • e.g.: the, is, are, an, a, etc.
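To make the effect of these options concrete, here is a toy sketch of what happens to text before it is indexed. It is not Sphinx's tokenizer or stemmer: the stopword set and the stem lookup table are taken from the slide's examples, and the function name and processing order are assumptions made for illustration:

```python
STOPWORDS = {"the", "is", "are", "an", "a"}
STEMS = {"cars": "car", "running": "run"}  # toy lookup, not a real stemmer

def index_terms(text, min_word_len=1):
    """Return the terms that would survive stopword, length, and stemming steps."""
    terms = []
    for word in text.lower().split():
        # Skip stopwords and words shorter than the minimum indexed length.
        if word in STOPWORDS or len(word) < min_word_len:
            continue
        terms.append(STEMS.get(word, word))
    return terms

terms = index_terms("The cars are running")
print(terms)
```

Because queries go through the same processing, a search for "car" can match a document that only contains "cars".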
    • 33.
      • A configuration file has 4 sections to configure:
      • Source (multiple)
      • Index (multiple)
      • Indexer
      • Searchd
    • 34.
      • mem_limit: indexing RAM usage limit. Optional, default is 32 MB.
      • max_iops : maximum i/o operations per second.
      • max_iosize : maximum allowed i/o operation size.
      Setting Configuration File: Indexer Section
    • 35.
      • A configuration file has 4 sections to configure:
      • Source (multiple)
      • Index (multiple)
      • Indexer
      • Searchd
    • 36.
      • address: IP address to bind on. The default, 0.0.0.0, listens on all interfaces.
      • port: searchd TCP port number (mandatory, default is 3312).
      • log: log file name (optional, default is empty).
      • query_log: query log file name (optional, default is empty).
      • pid_file: searchd process ID file name (mandatory).
      • max_matches: maximum number of matches that the daemon keeps in RAM for each index and can return to the client (optional, default 1000).
      • preopen_indexes: whether to forcibly preopen all indexes on startup (optional, default 0, i.e. don't preopen).
      Setting Configuration File: Searchd Section
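Putting the four sections together, a dummy configuration file might look like the output of the sketch below. The option names come from the slides; every value (host, credentials, paths, the `documents` table, the names `src1` and `idx1`) is a placeholder assumption, and `render_section` is a hypothetical helper used only to print the blocks in the `name { key = value }` layout that sphinx.conf uses:

```python
def render_section(kind, name, options):
    """Render one config block such as `source src1 { ... }`."""
    header = f"{kind} {name}" if name else kind
    lines = [header, "{"]
    for key, value in options:
        lines.append(f"    {key} = {value}")
    lines.append("}")
    return "\n".join(lines)

sections = [
    ("source", "src1", [
        ("type", "mysql"),
        ("sql_host", "localhost"),
        ("sql_user", "user"),
        ("sql_pass", "secret"),
        ("sql_db", "test"),
        ("sql_port", "3306"),
        ("sql_query_pre", "SET NAMES utf8"),
        ("sql_query", "SELECT id, group_id, title, body FROM documents"),
        ("sql_attr_uint", "group_id"),
    ]),
    ("index", "idx1", [
        ("source", "src1"),
        ("path", "/var/data/idx1"),
        ("docinfo", "extern"),
        ("min_word_len", "1"),
    ]),
    ("indexer", None, [("mem_limit", "32M")]),
    ("searchd", None, [
        ("port", "3312"),
        ("log", "/var/log/searchd.log"),
        ("query_log", "/var/log/query.log"),
        ("pid_file", "/var/log/searchd.pid"),
        ("max_matches", "1000"),
    ]),
]

config_text = "\n\n".join(render_section(*s) for s in sections)
print(config_text)
```

In the main fetch query, the first column is the document ID, `group_id` is declared as an attribute, and the remaining columns (`title`, `body`) become searchable fields.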
    • 37.  
    • 38.  
    • 39.
      • Introduction to Sphinx.
      • Sphinx Searching and Sorting Features.
      • Sphinx Implementation.
      • Demo.
    • 40.  
    • 41.  
