October 13-16, 2015 • Austin, TX
Rackspace Email’s solution for indexing 50k documents per second
George Bailey – Software Developer, Rackspace
Cameron Baker – Linux Engineer, Rackspace
george.bailey@rackspace.com
cameron.baker@rackspace.com
Who we are…
•  “Rackers” dedicated to Fanatical Support!
•  Based out of San Antonio, TX
•  Part of Cloud Office
•  Email infrastructure development and engineering
2008: Original Problem
•  1 million mailboxes
•  Support needs to track message delivery
•  We need event aggregation + search
•  Needed to provide Fanatical Support!
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data
Original System Design
•  Scribed: log aggregation, deposit into HDFS
•  Hadoop 0.20: index via mapreduce
•  Solr 1.4: search
•  Custom tools: index loader, mapreduce, scheduler
Original System Architecture
Past performance

Step                           Time
Transport                      < 1 minute
Index Generation (MapReduce)   10 minutes (cron)
Index Merge                    10 minutes (cron)
Searchable Events              20+ minutes
7 years later…
•  4+ million mailboxes
•  Still running Solr 1.4, Hadoop 0.20, Scribed
•  Scaling, maintenance issues
•  Grew to 100+ physical servers, 15 VMs
•  Events need to be used in other contexts
•  20+ minute time-to-search no longer acceptable
Time to modernize!
Goals
•  Improve customer experience – Fanatical Support!
•  Provide search results faster
•  Reduce technologies
•  Reduce the amount of custom code
•  Reduce the number of physical servers
New System - Components
•  Apache Flume: aggregation + processing
•  Solr 1.4 to 4.x/5.x: NRT indexing, distributed search
•  SolrCloud allowed us to reduce custom code by 75%
System architecture
Performance Tuning
Flume: backpressure + hop availability
•  Sinks may be unreachable or slow
•  File Channel = durable buffering
•  capacity: available disk space / average event size
•  transactionCapacity: match source and sink batch sizes
•  minimumRequiredSpace: stop accepting events when free disk drops below this
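A minimal sketch of a file channel sized along those lines, using Flume's standard FileChannel properties (the agent/channel names, paths, and numbers are illustrative, not our production values):

agent.channels.fileChannel.type = file
agent.channels.fileChannel.checkpointDir = /var/lib/flume/checkpoint
agent.channels.fileChannel.dataDirs = /var/lib/flume/data
# capacity ~ usable disk bytes / average event size
agent.channels.fileChannel.capacity = 10000000
# match the batch sizes used by the attached source and sink
agent.channels.fileChannel.transactionCapacity = 1000
# stop accepting puts before the data directory fills (bytes)
agent.channels.fileChannel.minimumRequiredSpace = 536870912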
Flume: batching and throughput
•  Batch size is important
•  File channels = slow
•  Memory channels = fast
•  “Loopback” flows
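A sketch of the fast path: a memory channel feeding a MorphlineSolrSink with the batch size turned up (names, sizes, and paths are illustrative):

agent.channels.memChannel.type = memory
agent.channels.memChannel.capacity = 100000
agent.channels.memChannel.transactionCapacity = 5000
agent.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent.sinks.solrSink.channel = memChannel
agent.sinks.solrSink.morphlineFile = /etc/flume/conf/morphline.conf
# larger batches amortize per-transaction overhead on the Solr side
agent.sinks.solrSink.batchSize = 5000
agent.sinks.solrSink.batchDurationMillis = 1000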
Flume: controlling the flows
•  One event, multiple uses
•  Channel selectors
•  Optional channels
•  Interceptors
agent.sources.avroSource.selector.type = multiplexing
agent.sources.avroSource.selector.header = eventType
agent.sources.avroSource.selector.default = defaultChannel
agent.sources.avroSource.selector.mapping.authEvent = authEventChannel
agent.sources.avroSource.selector.mapping.mailEvent = mailEventChannel
agent.sources.avroSource.selector.optional.authEvent = optionalChannel
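For completeness, the selector above assumes the source is wired to all of those channels, e.g.:

agent.sources.avroSource.channels = defaultChannel authEventChannel mailEventChannel optionalChannel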
Flume: Morphlines + Solr
•  Works with SolrCloud
•  Many helpful built-in commands
•  Scripting support for Java
•  Route to multiple collections
•  Validate, modify events in-flight
http://kitesdk.org/docs/current/morphlines/morphlines-reference-guide.html
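A minimal morphline sketch in that spirit, assuming JSON event bodies and the standard Kite/Solr commands (readJson, extractJsonPaths, sanitizeUnknownSolrFields, loadSolr); the ids, field paths, and SOLR_LOCATOR values are illustrative:

SOLR_LOCATOR : {
  collection : mailevents
  zkHost : "zk1:2181,zk2:2181,zk3:2181/solr"
}

morphlines : [
  {
    id : mailEvents
    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
    commands : [
      # parse the JSON event body
      { readJson {} }
      # copy selected JSON paths into record fields (paths are examples)
      { extractJsonPaths { flatten : true, paths : { eventType : /eventType, timestamp : /timestamp } } }
      # drop anything not in the Solr schema, then index
      { sanitizeUnknownSolrFields { solrLocator : ${SOLR_LOCATOR} } }
      { loadSolr { solrLocator : ${SOLR_LOCATOR} } }
    ]
  }
]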
Requirements for Solr
•  Near real time indexing of 30,000+ docs per sec
•  Few queries (< 10,000 per day)
•  Heavy distributed facet/group/sort queries
•  Support removing documents older than X days
•  Minimize JVM GC impact on indexing performance
Basic Solr install

(Diagram: one collection, 2 shards with 2 replicas each, one Solr replica on each of 4 servers: Server A, B, C, D)

~2,500 docs per second with this basic layout
Goal: 30,000 docs per second (30,000 / 2,500 = 12x the capacity)
12 x 4 servers = 48 total servers required
Consult the experts…
•  Days of talking and hundreds of emails with Rishi Easwaran
•  Recommendations from Shalin Mangar
•  solr-user@lucene.apache.org
Result:
•  Fewer physical servers
•  Faster indexing
Collections – Optimized for additions/deletions
collection-2015-10-11
collection-2015-10-12
collection-2015-10-13
collection-2015-10-14
collection-2015-10-15
collection-2015-10-16
•  Rolling collections by date
•  ~1 billion documents removed per day by dropping the oldest collection
•  Aliases for updates/queries
•  25 shards - 2 replicas per shard
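A sketch of how such a daily rollover can be driven with Solr's Collections API (alias names, config name, and dates are illustrative; a common pattern is a single-collection update alias plus a multi-collection search alias):

Create the next day's collection:
  /admin/collections?action=CREATE&name=collection-2015-10-17&numShards=25&replicationFactor=2&collection.configName=events
Repoint the update alias at the newest collection:
  /admin/collections?action=CREATEALIAS&name=events-update&collections=collection-2015-10-17
Repoint the query alias at the current search window:
  /admin/collections?action=CREATEALIAS&name=events-search&collections=collection-2015-10-17,collection-2015-10-16,collection-2015-10-15
Drop the oldest collection once it ages out:
  /admin/collections?action=DELETE&name=collection-2015-10-04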
JVM – Lean and mean
•  4GB max/min JVM heap size
•  5 Solr JVM processes per server
•  Using Concurrent Mark Sweep GC
•  GC only on very heavy queries
•  GC < 10ms; occurs < 10 times a day
•  No impact on index performance
•  Reads 28 indexes; writes 2 indexes
(Diagram: Server A running a single large Solr JVM vs. Server A running 5 small Solr JVMs)
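The slides don't show the exact flags, but a sketch of this kind of JVM sizing with the Solr 5.x startup script (solr.in.sh) and standard HotSpot CMS options would look like:

# 4GB fixed-size heap per Solr process
SOLR_JAVA_MEM="-Xms4g -Xmx4g"
# Concurrent Mark Sweep collector, tuned to start GC early and predictably
GC_TUNE="-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly"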
JVM Monitoring – before it’s too late
•  Proactive OOM monitoring
•  Memory not being released
•  Trapped in GC
•  Restart processes
•  Can impact entire cluster
autoCommit for near real time indexing
Tested autoCommit and autoSoftCommit settings of:
•  autoCommit 5 seconds to 5 minutes
•  autoSoftCommit 1 second to 1 minute
Result:
•  autoSoftCommit of 5 seconds and autoCommit of 1 minute balanced out memory usage and disk IO
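In solrconfig.xml those settings would look roughly like this (openSearcher=false on the hard commit is the usual pairing, not stated on the slide):

<autoCommit>
  <!-- hard commit every 60s: flushes the transaction log and segments to disk -->
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <!-- soft commit every 5s: makes new documents visible to searches -->
  <maxTime>5000</maxTime>
</autoSoftCommit>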
DocValues – Reduced OOM Errors
•  Struggled with OOME under heavy load
•  Automated restart for nodes trapped in GC cycle
•  Distributed facet/group/sort queries
Solution:
•  docValues="true" – for facet/group/sort fields
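A schema.xml sketch; the field names and types are illustrative, the point is docValues="true" on every field used for faceting, grouping, or sorting:

<field name="eventType" type="string" indexed="true" stored="true" docValues="true"/>
<field name="timestamp" type="tdate"  indexed="true" stored="true" docValues="true"/>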
Caching/Cache Warming – Measure and tune
•  filterCache/queryResultCache/documentCache/etc.
•  Very diverse queries (cache hits were too low)
•  Benefits for our workload did not justify the cost
•  Willing to accept slower queries
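For reference, these are the solrconfig.xml knobs in question; the sizes below are placeholders, and the lesson above is to measure hit rates before spending heap on them:

<filterCache      class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache"     size="512" initialSize="512" autowarmCount="0"/>
<documentCache    class="solr.LRUCache"     size="512" initialSize="512" autowarmCount="0"/>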
Configs - Keep it simple
•  Example configs show off advanced features
•  If you are not using the feature, turn it off
•  Start with a trimmed down config
•  Only add features as needed
Performance Comparison
Present performance
•  Sustained indexing of ~50,000 docs per sec
•  Each replica indexes ~1,000 docs per sec
•  New documents are searchable within 5 seconds
•  10,000 distributed facet/group/sort queries per day
•  1 billion new documents are indexed per day
•  13 billion documents are searchable
•  7TB of data across all indexes
Performance Comparison

Step              Performance (2008)   Performance (2015)
Transport         <1 minute            <1 second (NRT)
Index Generation  10 minutes           <5 seconds
Index Merge       10 minutes           N/A
Search            20+ minutes          <5 seconds
•  Faster transport
•  No more batch processing
•  No external index generation
•  NRT indexing with SolrCloud
Environment Comparison

Server Type            Servers (2008)               Servers (2015)
Transport              Physical: 4, Virtual: 15     Physical: 4, Virtual: 20
Storage / processing   Physical: 100+, Virtual: 0   Physical: 0, Virtual: 0
Search                 Physical: 12, Virtual: 0     Physical: 10, Virtual: 5
Total                  Physical: 100+, Virtual: 15  Physical: 14, Virtual: 25
•  Flume / Solr handle event storage and processing
•  No more Hadoop footprint
•  Over 80% reduction in servers
Future…
•  Dedicated Solr nodes with SSDs for indexing
•  Shard query collections for improved performance
•  Larger JVM size for query nodes
•  Multiple Datacenter SolrCloud (replication/mirroring)
Rackspace Email’s solution for indexing 50k documents per second
George Bailey – Software Developer, Rackspace
Cameron Baker – Linux Engineer, Rackspace
george.bailey@rackspace.com
cameron.baker@rackspace.com
Thank you
