Scaling to Millions of Concurrent SPARQL Queries on the Cloud
OWLIM Replication Cluster @ Amazon EC2





  1. Scaling to Millions of Concurrent SPARQL Queries on the Cloud: OWLIM Replication Cluster @ Amazon EC2, Sep 2010
  2. Goals
     • Test the scalability of OWLIM RC on a really large cluster
     • Can we break the million queries per hour barrier?
  3. INTRODUCTION
  4. Berlin SPARQL Benchmark (BSBM)
     • http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/
     • Evaluates the performance of RDF query engines in an e-commerce use case
       – Searching products and navigating related information
     • Randomized query mixes (25 SPARQL queries each) are evaluated continuously
     • Different dataset sizes and numbers of concurrent clients
       – 25M, 100M and 200M triples
  5. Benchmarking AWS
     • Extensive performance tests of EC2 instances
       – I/O, CPU, Network
       – BSBM (SPARQL), RDF materialisation
     • High Memory EC2 instances offer (surprisingly) good performance for RDF-related processing
       – Comparable to local non-virtualised hardware
  6. Benchmarking AWS – BSBM 100M results
     [Chart: query mixes / hour (0 to 5000) vs. concurrent clients (1, 4, 16, 32, 64); series: Local-L, L-ub, Local-XL, XL-ub, HM-XL-ub, HM-2XL-ub, Local-3XL, Local-3XL-SSD, HM-4XL-ub, HC-XL-ub]
  7. Benchmarking AWS – RDF materialisation
     [Chart: materialisation time (sec), 0 to 6000; series: UMBEL, DBP-SKOS]
  8. OWLIM Replication Cluster
     • Improves scalability with respect to concurrent user requests
     • How does it work?
       – Each write request is multiplexed to all repository instances
       – Each read request is dispatched to one instance only
       – To ensure load-balancing, read requests are sent to the instance with the shortest execution queue
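The routing rules on this slide can be illustrated with a small Python sketch. It is not the OWLIM implementation: the `Replica` and `Dispatcher` classes, the in-memory queues, and the node names are all hypothetical stand-ins used only to show the idea of "write to all, read from the least-loaded instance".

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for one repository instance."""
    name: str
    queue: list = field(default_factory=list)  # pending read requests
    log: list = field(default_factory=list)    # writes applied so far


class Dispatcher:
    """Illustrative master-side routing; an assumption, not OWLIM code."""

    def __init__(self, replicas):
        if not replicas:
            raise ValueError("at least one replica is required")
        self.replicas = list(replicas)

    def write(self, update):
        # Every write is multiplexed to all replicas so they stay identical.
        for replica in self.replicas:
            replica.log.append(update)

    def read(self, query):
        # A read goes to exactly one replica: the one with the shortest queue.
        target = min(self.replicas, key=lambda r: len(r.queue))
        target.queue.append(query)
        return target.name


replicas = [Replica("slave-1"), Replica("slave-2"), Replica("slave-3")]
dispatcher = Dispatcher(replicas)
dispatcher.write("INSERT DATA { <urn:a> <urn:b> <urn:c> }")
routed = [dispatcher.read(f"query-{i}") for i in range(6)]
```

In this toy run no query ever completes, so the queues only grow and the six reads spread evenly over the three replicas. A real dispatcher would also remove finished queries from the queue, which is what makes "shortest queue" track actual load.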
  9. OWLIM CLUSTER ON EC2 – BENCHMARKS
  10. AWS testbed setup
      • OWLIM Replication Cluster
        – One Master node, 10-100 Slave nodes
        – 100 million triples / 16 GB database size
      • BSBM 100M dataset
        – Each cluster node has a replica of the database
        – 1000 concurrent BSBM clients
      • Amazon EC2
        – Master node: HM-2XL (34 GB RAM, 4 x 3.25 ECU)
        – Slave nodes: HM-XL (17 GB RAM, 2 x 3.25 ECU)
        – Ubuntu (x64)
  11. Total QMpH (Query Mix per Hour)
      [Chart: BSBM-100M, 1000 concurrent clients; total QMpH (0 to 250,000) vs. cluster size (10 to 100 HM-XL nodes)]
  12. Total QMpH – summary
      • (Almost) linear scalability of the cluster
      • 20 nodes handle more than 1 million SPARQL queries per hour (40,000 QMpH)
        – 1 Query Mix = 25 SPARQL queries
      • 100 nodes handle 5 million SPARQL queries per hour (200,000 QMpH)
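The conversion between query mixes and individual SPARQL queries behind these figures is simple arithmetic. A minimal Python sketch, using only the numbers quoted on the slide (the function and constant names are illustrative):

```python
QUERIES_PER_MIX = 25  # from the slide: 1 Query Mix = 25 SPARQL queries


def queries_per_hour(qmph):
    """Convert query mixes per hour (QMpH) into SPARQL queries per hour."""
    return qmph * QUERIES_PER_MIX


twenty_nodes = queries_per_hour(40_000)    # 1,000,000 queries per hour
hundred_nodes = queries_per_hour(200_000)  # 5,000,000 queries per hour
```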
  13. QMpH per cluster node
      [Chart: BSBM-100M, 1000 concurrent clients; QMpH per node (1800 to 2400) vs. cluster size (10 to 100 HM-XL nodes); series: 1000 clients, trendline (Power)]
  14. QMpH per cluster node – summary
      • Low parallelisation overhead
        – Only 10% deterioration in QMpH per cluster node when the cluster grows 10 times (from 10 to 100 nodes)
        – Cluster nodes handle 2,000-2,300 QMpH (a standalone HM-XL node on EC2 handles ~2,500 QMpH)
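One way to read the per-node figures is as a fraction of what a standalone node achieves. The sketch below does that division in Python; the "efficiency" framing and the names are an illustration added here, and the inputs are the approximate values from the slide.

```python
STANDALONE_QMPH = 2500  # approximate standalone HM-XL figure from the slide


def efficiency(per_node_qmph, baseline=STANDALONE_QMPH):
    """Per-node throughput in the cluster as a fraction of a standalone node."""
    return per_node_qmph / baseline


low = efficiency(2000)   # lower end of the quoted 2,000-2,300 range
high = efficiency(2300)  # upper end of the quoted range
```

Under these rounded inputs, a cluster node delivers roughly 80% to 92% of the standalone throughput.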
  15. What about the cost?
      • 100,000 SPARQL queries per $1 on AWS
        – ~4,000 Query Mixes / $
          • 1 Query Mix = 25 SPARQL queries
        – EC2 pricing
          • Master node (on-demand HM-2XL): $1.00/hour
          • Slave node (on-demand HM-XL): $0.50/hour
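The cost figure can be reproduced from the quoted prices. The Python sketch below assumes that "100 nodes" means 100 Slave nodes plus one Master, and pairs that cluster size with the 200,000 QMpH reported in the Total QMpH summary; both the pairing and the function names are assumptions made for illustration.

```python
MASTER_USD_PER_HOUR = 1.00  # on-demand HM-2XL price quoted on the slide
SLAVE_USD_PER_HOUR = 0.50   # on-demand HM-XL price quoted on the slide
QUERIES_PER_MIX = 25        # 1 Query Mix = 25 SPARQL queries


def cluster_cost_per_hour(slaves):
    """Hourly EC2 cost of one Master plus the given number of Slaves."""
    return MASTER_USD_PER_HOUR + slaves * SLAVE_USD_PER_HOUR


def query_mixes_per_usd(total_qmph, slaves):
    """Query mixes executed per dollar of instance time."""
    return total_qmph / cluster_cost_per_hour(slaves)


# Assumed scenario: 100 Slaves delivering 200,000 QMpH in total.
mixes_per_usd = query_mixes_per_usd(200_000, 100)  # about 3,900 mixes per $
queries_per_usd = mixes_per_usd * QUERIES_PER_MIX  # about 98,000 queries per $
```

The result, 200,000 / 51 query mixes per dollar, is consistent with the rounded "~4,000 Query Mixes / $" and "100,000 SPARQL queries per $1" on the slide. The calculation covers instance hours only; storage and data transfer charges are not included.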
  16. What about the cost (2)
      [Chart: Query Mixes per 1 USD; Query Mixes / $ (3400 to 4600) vs. cluster size (10 to 100)]
  17. DETAILED CLUSTER METRICS
  18. Cluster monitoring
      • Amazon CloudWatch provides instance-level monitoring for EC2
        – CPU load, bandwidth utilisation, I/O, …
        – Minimum granularity of monitoring periods: 1 minute
      • OWLIM Cluster metrics
        – Monitor Master and a random Slave for ~180 min
        – Many test runs
          • A single run takes a few minutes
        – Idle CPU/IO/Network on the diagrams is the time between test runs
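For readers who want to collect similar metrics today, the sketch below shows one possible way to request 1-minute CPU averages over a 180-minute window from CloudWatch using the boto3 SDK. The slides do not say which tooling was used, so this is an assumption; the instance ID is a placeholder, and the 1-minute period assumes detailed monitoring is enabled on the instance.

```python
from datetime import datetime, timedelta, timezone


def cpu_request(instance_id, minutes=180, end=None):
    """Build the parameters for a CloudWatch CPUUtilization query."""
    end = end or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,  # seconds; the 1-minute granularity from the slide
        "Statistics": ["Average"],
    }


def fetch_cpu(instance_id, minutes=180):
    """Fetch (timestamp, average CPU %) pairs; needs boto3 and credentials."""
    import boto3  # third-party SDK, imported lazily so the sketch loads without it

    client = boto3.client("cloudwatch")
    response = client.get_metric_statistics(**cpu_request(instance_id, minutes))
    # CloudWatch does not guarantee datapoint order, so sort by timestamp.
    points = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
    return [(p["Timestamp"], p["Average"]) for p in points]


# Placeholder instance ID and a fixed end time, for illustration only.
request = cpu_request(
    "i-0123456789abcdef0",
    end=datetime(2010, 9, 1, 12, 0, tzinfo=timezone.utc),
)
```

Swapping `MetricName` for `NetworkIn`, `NetworkOut`, `DiskReadBytes` or `DiskWriteBytes` would cover the other kinds of metric listed on the slide.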
  19. CPU load (Master)
      [Chart: % CPU load (0 to 80) vs. time (min), 0 to 185]
  20. CPU load (Slave)
      [Chart: CPU load (random Slave); % CPU load (0 to 120) vs. time (min), 0 to 155]
  21. Network traffic (Master)
      [Chart: MB/s (0 to 35) vs. time (min), 0 to 185; series: inbound (MB/s), outbound (MB/s)]
  22. Network traffic (Slave)
      [Chart: Network traffic (random Slave); MB/s (0.00 to 0.12) vs. time (min), 0 to 155; series: inbound (MB/s), outbound (MB/s)]
  23. I/O (Slave)
      [Chart: I/O (random Slave); MB/s (0.00 to 3.50) vs. time (min), 0 to 170; series: Disk Read (MB/s), Disk Write (MB/s)]
  24. Q&A: Questions?
