
Real time indexes in Sphinx, Yaroslav Vorozhko


  1. Real-time indexes in Sphinx Search 1.10
     This presentation covers a new feature of Sphinx Search 1.10: real-time (RT) indexes.
  2. About Me
     ● Yaroslav Vorozhko
     ● Web developer at Ivinco
     ● Specializes in search engines and high-load systems
     ● E-mail: yaroslav@ivinco.com
  3. The problem with plain indexes
     ● New data requires updating the entire index
     ● The main + delta scheme requires index merges
     ● Depends on the indexer tool
     ● Not simple to manage
  4. Real Time indexes
     ● What is an RT index
     ● Index updates on the fly
     ● MySQL protocol support via SphinxQL
  5. Testing Environment
     ● Testing a read-write index
     ● Testing a read-only index
  6. Index schema comparison
     ● HDD usage
     [Chart: HDD usage, plain vs RT indexes. X axis: indexed records (10,000; 100,000; 1,000,000; 2,000,000). Y axis: GB (0 to 9). Series: plain index, real-time index.]
  7. Index schema comparison
     ● Single-query performance
     [Chart: SphinxAPI performance for a single query. X axis: records in index (10,000; 100,000; 1,000,000; 2,000,000). Y axis: query time, sec (0 to 0.25). Series: plain index, real-time index.]
  8. Index schema comparison
     ● Multi-query performance
     [Chart: SphinxAPI performance for multi-query. X axis: records in index (10,000; 100,000; 1,000,000; 2,000,000). Y axis: query time, sec (0 to 0.05). Series: plain index, real-time index.]
  9. Index schema comparison
     ● Single-query performance under load
     [Chart: SphinxAPI performance for a single query under insert load. X axis: records in index (10,000; 100,000; 1,000,000; 2,000,000). Y axis: query time, sec (0 to 0.2). Series: RT binlog off, RT binlog 0, RT binlog 1, RT binlog 2, plain index.]
  10. Demonstration
      ● Easy to create an index
          index rt
          {
              type = rt
              path = /usr/local/sphinx/data/rt
              rt_field = title
              rt_field = content
              rt_attr_uint = gid
          }
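The index block on this slide follows a regular pattern: one `rt_field` line per full-text field and one `rt_attr_uint` line per unsigned integer attribute. As a minimal sketch, assuming you want to generate such blocks from a script, the hypothetical helper `render_rt_index` below (not part of Sphinx) renders the same block from Python lists. It only uses the directives shown on the slide.

```python
def render_rt_index(name, path, fields, uint_attrs=()):
    """Return a sphinx.conf index block for a real-time index.

    Hypothetical helper for illustration; it emits only the directives
    that appear on the slide (type, path, rt_field, rt_attr_uint).
    """
    if not fields:
        # An RT index without any full-text field cannot be searched.
        raise ValueError("an RT index needs at least one rt_field")
    lines = [f"index {name}", "{", "    type = rt", f"    path = {path}"]
    lines += [f"    rt_field = {field}" for field in fields]
    lines += [f"    rt_attr_uint = {attr}" for attr in uint_attrs]
    lines.append("}")
    return "\n".join(lines)


# Reproduces the block from the slide.
print(render_rt_index("rt", "/usr/local/sphinx/data/rt",
                      ["title", "content"], ["gid"]))
```

The printed text can be pasted into the configuration file next to the existing index definitions.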
  11. Demonstration
      ● Easy to CRUD
          mysql -h 127.0.0.1 -P 9306
          INSERT INTO rt VALUES ....
          SELECT * FROM rt;
          DELETE FROM rt WHERE id=2;
          REPLACE INTO rt VALUES ....
      ● SphinxAPI support
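Because SphinxQL is spoken over the MySQL protocol, the statements above can also be sent from application code. The sketch below only builds the statement text with `%s` placeholders; it assumes a MySQL client library that substitutes parameters on the client side, and it assumes the column names `id`, `title`, `content` and `gid` from the index definition on the previous slide. The helper names `write_stmt` and `delete_stmt` are hypothetical.

```python
import re

# Index and column names cannot be passed as parameters, so they are
# validated before being placed into the statement text.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _check_ident(name):
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def write_stmt(index, columns, replace=False):
    """Build an INSERT (or REPLACE) statement with %s placeholders."""
    verb = "REPLACE" if replace else "INSERT"
    cols = ", ".join(_check_ident(c) for c in columns)
    marks = ", ".join(["%s"] * len(columns))
    return f"{verb} INTO {_check_ident(index)} ({cols}) VALUES ({marks})"


def delete_stmt(index):
    """Build a DELETE-by-document-id statement with a %s placeholder."""
    return f"DELETE FROM {_check_ident(index)} WHERE id = %s"


print(write_stmt("rt", ["id", "title", "content", "gid"]))
print(write_stmt("rt", ["id", "title", "content", "gid"], replace=True))
print(delete_stmt("rt"))
```

Each statement would then be executed together with its values, for example `(2, "Hello", "Hello world", 10)` for the insert and `(2,)` for the delete, on a connection opened to port 9306.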
  12. Migration
      ● Simple and easy using existing tools
          mysqldump -uroot blog users > users_dump.sql
          mysql -P9306 < users_dump.sql
  13. Migration
      ● A custom script replaces the "source" block
      ● Supports all "source" settings
          – Connection settings
          – SQL query along with pre and post SQL
          – SQL range queries
      ● Supports filling an index from scratch
      ● Supports index updates
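The range queries mentioned above split one large fetch into smaller steps, so the source table is read in bounded chunks. The sketch below is not the migration script itself; it is a minimal illustration, modelled on how ranged queries in a "source" block step from the minimum to the maximum document id. The generator name `id_ranges` and the table name `documents` are assumptions.

```python
def id_ranges(min_id, max_id, step):
    """Yield inclusive (start, end) id pairs covering min_id..max_id."""
    if step <= 0:
        raise ValueError("step must be positive")
    start = min_id
    while start <= max_id:
        # The last range is clamped so it never runs past max_id.
        end = min(start + step - 1, max_id)
        yield start, end
        start = end + 1


# Hypothetical fetch query; each pair would be bound to the placeholders.
QUERY = ("SELECT id, title, content, gid FROM documents "
         "WHERE id >= %s AND id <= %s")

for start, end in id_ranges(1, 2345, 1000):
    print(start, end)
# prints:
# 1 1000
# 1001 2000
# 2001 2345
```

Rows returned for each range would then be written to the RT index with INSERT or REPLACE statements before the next range is fetched.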
  14. Migration
      ● Support of mixed indexes
          index distributed
          {
              type = distributed
              local = plain_main_index
              local = real_time_increment_index
          }
  15. Problems with the LjSeek migration
      ● Huge memory usage
      ● Low search speed
  16. Questions?
  17. References
      ● Sphinx Search: http://sphinxsearch.com/
      ● Script for migrating from plain to real-time indexes: https://launchpad.net/migrate-sphinx-plain-indexes-into-real-time-indexes
      ● Ivinco blog, a good resource on Sphinx Search: http://www.ivinco.com/blog/
      ● My blog, another good resource on Sphinx Search, in Russian: http://pro100pro.com/
