George Bailey and Cameron Baker of Rackspace presented their solution for indexing over 50,000 documents per second for Rackspace Email. They modernized their system with Apache Flume for event processing and aggregation and SolrCloud for near-real-time search. The rebuild cut time-to-search from more than 20 minutes to under 5 seconds, reduced the physical server count from over 100 to 14, and now sustains indexing throughput above 50,000 documents per second across more than 13 billion searchable documents.
Rackspace Email's Solution for Indexing 50K Documents per Second: Presented by George Bailey & Cameron Baker, Rackspace
1. October 13-16, 2015 • Austin, TX
2. Rackspace Email’s solution for indexing 50k documents per second
George Bailey – Software Developer, Rackspace
Cameron Baker – Linux Engineer, Rackspace
george.bailey@rackspace.com
cameron.baker@rackspace.com
3. Who we are…
• “Rackers” dedicated to Fanatical Support!
• Based out of San Antonio, TX
• Part of Cloud Office
• Email infrastructure development and engineering
4. 2008: Original Problem
• 1 million mailboxes
• Support needs to track message delivery
• We need event aggregation + search
• Needed to provide Fanatical Support!
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data
5. Original System Design
• Scribed: log aggregation, deposits into HDFS
• Hadoop 0.20: indexing via MapReduce
• Solr 1.4: search
• Custom tools: index loader, MapReduce jobs, scheduler
8. 7 years later…
• 4+ million mailboxes
• Still running Solr 1.4, Hadoop 0.20, and Scribed
• Scaling, maintenance issues
• Grew to 100+ physical servers, 15 VMs
• Events need to be used in other contexts
• 20+ minute time-to-search no longer acceptable
17. Flume: Morphlines + Solr
• Works with SolrCloud
• Many helpful built-in commands
• Scripting support for Java
• Route to multiple collections
• Validate, modify events in-flight
http://kitesdk.org/docs/current/morphlines/morphlines-reference-guide.html
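For illustration, a minimal morphline sketch of such a pipeline, assuming hypothetical event fields and pipeline id; readJson, extractJsonPaths, sanitizeUnknownSolrFields, and loadSolr are standard Kite Morphlines commands:

    morphlines : [
      {
        id : emailEvents   # hypothetical pipeline name
        importCommands : ["org.kitesdk.**", "org.apache.solr.morphlines.**"]
        commands : [
          { readJson {} }   # parse the Flume event body as JSON
          # extract hypothetical event fields into Solr fields
          { extractJsonPaths { flatten : true, paths : { queue_id : /queue_id, event_type : /event_type } } }
          # validate in-flight: drop any field the Solr schema does not know
          { sanitizeUnknownSolrFields { solrLocator : ${SOLR_LOCATOR} } }
          { loadSolr { solrLocator : ${SOLR_LOCATOR} } }   # index into SolrCloud
        ]
      }
    ]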
18. Requirements for Solr
• Near real time indexing of 30,000+ docs per sec
• Few queries (< 10,000 per day)
• Heavy distributed facet/group/sort queries
• Support removing documents older than X days
• Minimize JVM GC impact on indexing performance
19. Basic Solr install
[Diagram: one collection with two shards; replicas spread across Servers A–D, one Solr replica per server]
~2,500 docs per second for a 4-server cluster
Goal: 30,000 docs per second (30,000 / 2,500 = 12 clusters)
12 clusters × 4 servers = 48 total servers
20. Consult the experts…
• Days of talking and hundreds of emails with Rishi Easwaran
• Recommendations from Shalin Mangar
• solr-user@lucene.apache.org
Result:
• Fewer physical servers
• Faster indexing
21. Collections – Optimized for additions/deletions
[Diagram: rolling daily collections, collection-2015-10-11 through collection-2015-10-16]
• Rolling collections by date
• ~1 billion documents removed
• Aliases for updates/queries
• 25 shards - 2 replicas per shard
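For illustration, a sketch of this rolling-collection pattern using the Solr Collections API (CREATE, CREATEALIAS, and DELETE are real API actions; the collection names, alias names, and config name are hypothetical):

    # Create the next day's collection
    curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=collection-2015-10-17&numShards=25&replicationFactor=2&collection.configName=events'

    # Point the update alias at the newest collection only
    curl 'http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=events-update&collections=collection-2015-10-17'

    # Point the query alias at the full retention window
    curl 'http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=events-query&collections=collection-2015-10-17,collection-2015-10-16,collection-2015-10-15'

    # Drop the collection that aged out of the window
    curl 'http://localhost:8983/solr/admin/collections?action=DELETE&name=collection-2015-10-11'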
22. JVM – Lean and mean
• 4GB max/min JVM heap size
• 5 Solr JVM processes per server
• Using Concurrent Mark Sweep GC
• GC only on very heavy queries
• GC < 10ms; occurs < 10 times a day
• No impact on index performance
• Reads 28 indexes; writes 2 indexes
[Diagram: Server A before (one Solr JVM) and after (five Solr JVM processes)]
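For illustration, a hypothetical launch line matching the heap and GC settings above (-Xms/-Xmx and -XX:+UseConcMarkSweepGC are standard HotSpot flags; the exact launch command and paths vary by Solr version and deployment):

    # One of five Solr JVMs on the server: fixed 4GB heap, CMS collector
    java -Xms4g -Xmx4g \
         -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
         -XX:+PrintGCDetails -Xloggc:/var/log/solr/gc-node1.log \
         -jar start.jar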
23. JVM Monitoring – before it’s too late
• Proactive OOM monitoring
• Memory not being released
• Trapped in GC
• Restart processes
• Can impact entire cluster
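A minimal watchdog sketch of the idea, assuming jstat from the JDK and a restart command for the node; the FGC column position varies by JDK version and the threshold here is illustrative:

    #!/bin/sh
    # Hypothetical check: if the full-GC count keeps climbing, the JVM is trapped in GC
    PID=$(pgrep -f 'solr' | head -1)
    BEFORE=$(jstat -gcutil "$PID" | awk 'NR==2 {print $9}')   # FGC column on Java 8
    sleep 60
    AFTER=$(jstat -gcutil "$PID" | awk 'NR==2 {print $9}')
    if [ "$((AFTER - BEFORE))" -gt 10 ]; then
        # Restart before the sick node drags down the whole cluster
        service solr restart
    fi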
24. autoCommit for near real time indexing
Tested autoCommit and autoSoftCommit settings of:
• autoCommit 5 seconds to 5 minutes
• autoSoftCommit 1 second to 1 minute
Result:
• autoSoftCommit of 5 seconds and autoCommit of 1 minute balanced out memory usage and disk IO
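For illustration, those settings as they would appear in solrconfig.xml (this is the standard syntax; openSearcher=false keeps hard commits from reopening searchers, which the soft commit already handles):

    <!-- Hard commit: flush to disk every 60s without opening a new searcher -->
    <autoCommit>
      <maxTime>60000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>

    <!-- Soft commit: make new documents searchable within 5s -->
    <autoSoftCommit>
      <maxTime>5000</maxTime>
    </autoSoftCommit>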
25. DocValues – Reduced OOM Errors
• Struggled with OOME under heavy load
• Automated restart for nodes trapped in GC cycle
• Distributed facet/group/sort queries
Solution:
• docValues="true" – for facet/group/sort fields
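For illustration, what that looks like in schema.xml (docValues is the real attribute; the field names and types here are hypothetical):

    <!-- Hypothetical fields used in facet/group/sort queries -->
    <field name="delivered_date" type="tdate"  indexed="true" stored="true" docValues="true"/>
    <field name="event_type"     type="string" indexed="true" stored="true" docValues="true"/>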
26. Caching/Cache Warming – Measure and tune
• filterCache/queryResultCache/documentCache/etc.
• Very diverse queries (cache hits were too low)
• Benefits for our workload did not justify the cost
• Willing to accept slower queries
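For illustration, a trimmed-down cache sketch in solrconfig.xml reflecting that decision (filterCache, queryResultCache, and documentCache are the standard elements; the sizes here are illustrative):

    <!-- Small caches, no autowarming: diverse queries made large caches a poor trade -->
    <filterCache      class="solr.FastLRUCache" size="64" initialSize="64" autowarmCount="0"/>
    <queryResultCache class="solr.LRUCache"     size="64" initialSize="64" autowarmCount="0"/>
    <documentCache    class="solr.LRUCache"     size="64" initialSize="64"/>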
27. Configs – Keep it simple
• Example configs show off advanced features
• If you are not using the feature, turn it off
• Start with a trimmed down config
• Only add features as needed
29. Present performance
• Sustained indexing of ~50,000 docs per sec
• Each replica indexes ~1,000 docs per sec
• New documents are searchable within 5 seconds
• 10,000 distributed facet/group/sort queries per day
• 1 billion new documents are indexed per day
• 13 billion documents are searchable
• 7TB of data across all indexes
30. Performance Comparison
Step               Performance (2008)   Performance (2015)
Transport          <1 minute            <1 second (NRT)
Index Generation   10 minutes           <5 seconds
Index Merge        10 minutes           N/A
Search             20+ minutes          <5 seconds
• Faster transport
• No more batch processing
• No external index generation
• NRT indexing with SolrCloud
31. Environment Comparison
Server Type          Servers (2008)                Servers (2015)
Transport            Physical: 4, Virtual: 15      Physical: 4, Virtual: 20
Storage/processing   Physical: 100+, Virtual: 0    Physical: 0, Virtual: 0
Search               Physical: 12, Virtual: 0      Physical: 10, Virtual: 5
Total                Physical: 100+, Virtual: 15   Physical: 14, Virtual: 25
• Flume / Solr handle event storage and processing
• No more Hadoop footprint
• Over 80% reduction in servers
32. Future…
• Dedicated Solr nodes with SSDs for indexing
• Shard query collections for improved performance
• Larger JVM size for query nodes
• Multiple Datacenter SolrCloud (replication/mirroring)
33. Rackspace Email’s solution for indexing 50k documents per second
George Bailey – Software Developer, Rackspace
Cameron Baker – Linux Engineer, Rackspace
george.bailey@rackspace.com
cameron.baker@rackspace.com
Thank you