Multi-Terabyte Sphinx HA cluster
Vyacheslav Kryukov
vkrukov@ivinco.com
Sphinx cluster
Sphinx HA cluster, requirements

● Incident tolerance and availability level
● Adaptive balancing
● Utilisation of redundant resources
● Easy deployment of new resources
Sphinx HA cluster architecture
Sphinx HA cluster, architecture #1
Sphinx HA cluster, architecture #2
Sphinx HA cluster, ha_strategy

● Simple balancing
  ● random
  ● roundrobin
● Adaptive balancing
  ● nodeads
  ● noerrors

http://sphinxsearch.com/docs/current.html#conf-ha-strategy
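
The two simple strategies can be illustrated with a minimal Python sketch. This is not Sphinx's implementation; it only shows the idea of picking one mirror per query. The mirror list and the function names are hypothetical.

```python
import itertools
import random

# Hypothetical mirror list for one agent (two replicas of the same shard).
MIRRORS = ["se01-1:3312", "se01-2:3312"]


def pick_random(mirrors, rng=random):
    """'random'-style choice: every mirror is equally likely on each query."""
    return rng.choice(mirrors)


def make_roundrobin(mirrors):
    """'roundrobin'-style choice: cycle through the mirrors in order."""
    cycle = itertools.cycle(mirrors)
    return lambda: next(cycle)


next_mirror = make_roundrobin(MIRRORS)
picks = [next_mirror() for _ in range(4)]
print(picks)
```

Neither strategy looks at mirror health, which is what the adaptive strategies add.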
Sphinx HA cluster, adaptive balancing

● Latency
● Query timeouts
● Connect timeouts
● Connect failures
● Network errors
● Wrong replies
● Unexpected closings
● Warnings
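
To make the idea of adaptive balancing concrete, here is a simplified Python sketch that turns per-mirror latency and error counters into selection probabilities. It is an assumption-laden illustration, not the algorithm searchd uses: the inverse-latency weighting, the `max_errors_in_a_row` threshold, the third mirror, and all numbers are made up for the example.

```python
import random


def mirror_weights(stats, max_errors_in_a_row=3):
    """Return {host: probability}, favouring low latency.

    stats maps host -> {"msecs_per_query": float, "errors_in_a_row": int}.
    Mirrors with too many consecutive errors are skipped (hypothetical rule).
    """
    alive = {
        host: s for host, s in stats.items()
        if s["errors_in_a_row"] < max_errors_in_a_row
    }
    if not alive:
        alive = stats  # nothing looks healthy: fall back to all mirrors
    inverse = {
        host: 1.0 / max(s["msecs_per_query"], 0.001)
        for host, s in alive.items()
    }
    total = sum(inverse.values())
    return {host: w / total for host, w in inverse.items()}


def pick_mirror(stats, rng=random):
    """Pick one mirror at random, proportionally to its weight."""
    weights = mirror_weights(stats)
    hosts = list(weights)
    return rng.choices(hosts, weights=[weights[h] for h in hosts], k=1)[0]


# Illustrative numbers only.
stats = {
    "se02-1:3312": {"msecs_per_query": 100.0, "errors_in_a_row": 0},
    "se02-2:3312": {"msecs_per_query": 300.0, "errors_in_a_row": 0},
    "se02-3:3312": {"msecs_per_query": 50.0, "errors_in_a_row": 5},
}
weights = mirror_weights(stats)
print(weights)
```

With these inputs the faster healthy mirror receives three quarters of the traffic, and the mirror with repeated errors receives none.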
Sphinx HA cluster, configuration
index some_index
{
    type = distributed
    agent = se01-1:3312|se01-2:3312:some_index_se01
    agent = se02-1:3312|se02-2:3312:some_index_se02
    agent = se03-1:3312|se03-2:3312:some_index_se03
    agent = se04-1:3312|se04-2:3312:some_index_se04
    ha_strategy = nodeads
}

searchd
{
    ...
    ha_ping_interval = 1000
    ha_period_karma = 60
    ...
}
http://sphinxsearch.com/docs/current.html#conf-ha-ping-interval
http://sphinxsearch.com/docs/current.html#conf-ha-period-karma
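
Each agent line above lists two mirrors separated by `|`, followed by the remote index name. As a rough illustration, the following Python sketch splits such a value into its parts. It handles only the form shown on this slide, and `parse_agent` is a hypothetical helper, not part of Sphinx.

```python
def parse_agent(value):
    """Split 'host:port|host:port:index' into (mirrors, index).

    Only the form shown on the slide is supported; other agent
    syntaxes accepted by Sphinx are out of scope for this sketch.
    """
    mirrors = []
    index = None
    for part in value.split("|"):
        fields = part.strip().split(":")
        if len(fields) == 3:
            host, port, index = fields
        elif len(fields) == 2:
            host, port = fields
        else:
            raise ValueError(f"unsupported agent mirror spec: {part!r}")
        mirrors.append((host, int(port)))
    if index is None:
        raise ValueError("no index name found in agent value")
    return mirrors, index


mirrors, index = parse_agent("se01-1:3312|se01-2:3312:some_index_se01")
print(mirrors, index)
```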
Sphinx HA cluster, SHOW AGENT STATUS
mysql> SHOW AGENT STATUS;
+-------------------------------------+--------------------+
| Key                                 | Value              |
+-------------------------------------+--------------------+
| status_period_seconds               | 60                 |
| status_stored_periods               | 15                 |
...
| ag_19_hostname                      | se02-1:3312        |
| ag_19_references                    | 13                 |
| ag_19_lastquery                     | 1.91               |
| ag_19_lastanswer                    | 1.86               |
| ag_19_lastperiodmsec                | 51                 |
| ag_19_errorsarow                    | 0                  |
| ag_19_1periods_query_timeouts       | 0                  |
| ag_19_1periods_connect_timeouts     | 0                  |
| ag_19_1periods_connect_failures     | 0                  |
| ag_19_1periods_network_errors       | 0                  |
| ag_19_1periods_wrong_replies        | 0                  |
| ag_19_1periods_unexpected_closings  | 0                  |
| ag_19_1periods_warnings             | 0                  |
| ag_19_1periods_succeeded_queries    | 101                |
| ag_19_1periods_msecsperqueryy       | 83.92              |
(the same for 5periods_ and 15periods_)
| ag_20_hostname                      | se02-2:3312        |
| ag_20_references                    | 13                 |
| ag_20_lastquery                     | 0.55               |
| ag_20_lastanswer                    | 0.49               |
| ag_20_lastperiodmsec                | 55                 |
| ag_20_errorsarow                    | 0                  |
| ag_20_1periods_query_timeouts       | 0                  |
| ag_20_1periods_connect_timeouts     | 0                  |
| ag_20_1periods_connect_failures     | 0                  |
| ag_20_1periods_network_errors       | 0                  |
| ag_20_1periods_wrong_replies        | 0                  |
| ag_20_1periods_unexpected_closings  | 0                  |
| ag_20_1periods_warnings             | 0                  |
| ag_20_1periods_succeeded_queries    | 55                 |
| ag_20_1periods_msecsperqueryy       | 86.08              |
(the same for 5periods_ and 15periods_)
...
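
Because the output is a flat key/value list, it helps to regroup it per agent before using it. The following Python sketch does that for rows already fetched as `(key, value)` pairs, for example through any MySQL client library connected to searchd's SphinxQL port. The function name and the sample rows (taken from the slide) are for illustration only.

```python
import re
from collections import defaultdict

# Keys look like ag_<number>_<field>; everything else is a global setting.
AGENT_KEY = re.compile(r"^ag_(\d+)_(.+)$")


def group_agent_status(rows):
    """Group (key, value) rows by agent number.

    Returns (global_fields, {agent_number: {field: value}}).
    """
    global_fields = {}
    agents = defaultdict(dict)
    for key, value in rows:
        match = AGENT_KEY.match(key)
        if match:
            agents[int(match.group(1))][match.group(2)] = value
        else:
            global_fields[key] = value
    return global_fields, dict(agents)


rows = [
    ("status_period_seconds", "60"),
    ("status_stored_periods", "15"),
    ("ag_19_hostname", "se02-1:3312"),
    ("ag_19_1periods_succeeded_queries", "101"),
    ("ag_20_hostname", "se02-2:3312"),
    ("ag_20_1periods_succeeded_queries", "55"),
]
summary, agents = group_agent_status(rows)
for number, fields in sorted(agents.items()):
    print(number, fields["hostname"], fields["1periods_succeeded_queries"])
```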
Sphinx HA cluster, balancing in real time
# cd /mnt/data
# iozone -i0 -i2 -s16g -r32k -f iozone.tmp
Sphinx HA cluster, data processing

● Data loading to permanent store
● Data indexing
● Indexes validation and synchronization (Rsync and NetCat)
● Update indexes from application
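
One possible way to validate that a synchronized index matches its source is to compare per-file checksums on both sides. The sketch below is an assumption about how such a check could look, not the tooling used in this cluster. The `*.sp*` glob (Sphinx index files use extensions such as `.spd` and `.spi`), the helper names, and the temporary directories are illustrative.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Checksum a file in chunks so large index files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(index_dir, pattern="*.sp*"):
    """Map file name -> checksum for the index files in one directory."""
    return {
        path.name: file_sha256(path)
        for path in sorted(Path(index_dir).glob(pattern))
        if path.is_file()
    }


def diff_manifests(source, replica):
    """Return names of files that are missing or different on the replica."""
    return sorted(
        name for name, digest in source.items() if replica.get(name) != digest
    )


# Demo with two temporary directories standing in for source and replica.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    Path(src, "some_index.spd").write_bytes(b"doclist")
    Path(src, "some_index.spi").write_bytes(b"dictionary")
    Path(dst, "some_index.spd").write_bytes(b"doclist")
    Path(dst, "some_index.spi").write_bytes(b"truncated")
    stale = diff_manifests(build_manifest(src), build_manifest(dst))
print(stale)
```

In practice the manifest would be computed on each host and only the small manifest shipped for comparison, so the index files themselves are transferred once.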
Sphinx HA cluster, performance and availability

● Provide performance with bandwidth
● What to monitor
  ● SHOW AGENT STATUS, node performance, disk space, I/O and CPU usage
  ● Errors, warnings, crashes
  ● Index synchronization, validity, freshness
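
As an illustration of turning SHOW AGENT STATUS counters into monitoring alerts, the following Python sketch checks one agent's fields for non-zero error counters and high latency. The thresholds and the `agent_alerts` helper are hypothetical. The field names follow the SHOW AGENT STATUS slide, including the `msecsperqueryy` spelling shown there.

```python
# Error counters reported per period by SHOW AGENT STATUS.
ERROR_COUNTERS = (
    "query_timeouts", "connect_timeouts", "connect_failures",
    "network_errors", "wrong_replies", "unexpected_closings", "warnings",
)


def agent_alerts(agent, period="1periods", max_msecs=500.0):
    """Return a list of human-readable problems for one agent.

    agent maps field name (without the ag_<n>_ prefix) -> string value.
    max_msecs is an arbitrary example threshold.
    """
    problems = []
    if int(agent.get("errorsarow", 0)) > 0:
        problems.append(f"errorsarow={agent['errorsarow']}")
    for name in ERROR_COUNTERS:
        count = int(agent.get(f"{period}_{name}", 0))
        if count:
            problems.append(f"{name}={count}")
    latency = float(agent.get(f"{period}_msecsperqueryy", 0))
    if latency > max_msecs:
        problems.append(f"slow: {latency} ms/query")
    return problems


healthy = {"errorsarow": "0", "1periods_msecsperqueryy": "83.92"}
degraded = {
    "errorsarow": "2",
    "1periods_connect_failures": "4",
    "1periods_msecsperqueryy": "912.5",
}
print(agent_alerts(healthy))
print(agent_alerts(degraded))
```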
Sphinx HA cluster, distributed indexer

● Automated
  ● distributed indexing
  ● indexes validation
  ● indexes delivery
● Failover
● Centralised Sphinx indexes configuration management
● Indexes rebalancing
Resources consumption accounting

● io ops
● io size
● fetched_docs
● fetched_hits
● fetched_skips
● total_found
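
A minimal sketch of how such per-query counters could be summed per client for accounting purposes. The record layout, the client names, and the `io_ops` / `io_size` key spellings are assumptions made for the example. How the counters are collected from searchd is not covered here.

```python
from collections import Counter, defaultdict

# Counter names follow the slide; io_ops and io_size are assumed spellings.
COUNTERS = (
    "io_ops", "io_size",
    "fetched_docs", "fetched_hits", "fetched_skips", "total_found",
)


def account(records):
    """Sum per-query counters per client.

    records is an iterable of (client, {counter_name: value}) pairs.
    Missing counters count as zero.
    """
    totals = defaultdict(Counter)
    for client, counters in records:
        for name in COUNTERS:
            totals[client][name] += int(counters.get(name, 0))
    return {client: dict(total) for client, total in totals.items()}


# Hypothetical per-query records for two clients.
records = [
    ("app_a", {"io_ops": 12, "io_size": 384, "fetched_docs": 1500, "total_found": 40}),
    ("app_a", {"io_ops": 3, "io_size": 96, "fetched_docs": 200, "total_found": 5}),
    ("app_b", {"fetched_docs": 10, "fetched_hits": 25, "total_found": 1}),
]
usage = account(records)
print(usage["app_a"])
```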
Rosette Linguistics Platform

● Used for analysis of unstructured text in CJK languages
● Better quality than using ngram options
● Slow indexer performance

http://www.basistech.com/text-analytics/rosette/
Questions?

vkrukov@ivinco.com
Vyacheslav Kryukov, Ivinco
