Bleeding Edge Databases

On Aerospike, AlgebraixData and Google BigQuery for BigDataCampLA

Published in: Technology, Education

  • http://db-engines.com/en/ranking_trend
  • http://documentary.net/the-art-of-data-visualization/
  • http://www.aerospike.com/blog/aerospike-doubles-in-memory-nosql-database-performance/

    8 CPU & 32 GB RAM
  • Results by Thumbtack Technology
  • YCSB Benchmark
  • http://www.aerospike.com/free-aerospike-3-community-edition/
  • http://lod-cloud.net/versions/2011-09-19/lod-cloud.html
  • http://dbis.informatik.uni-freiburg.de/index.php?project=SP2B
    http://www.algebraixdata.com/algebraix-data-achieves-unrivaled-semantic-benchmark-performance/
  • http://demo.algebraixdata.com/#!/ss/math
  • Mathematics-based data management platform
    Kernel for any data model
    High performance
    High scalability
    Self-tuning
    Automatic data re-organization
    Small footprint
  • http://www.algebraixdata.com/
  • http://gdeltproject.org/
  • http://martinfowler.com/articles/bigQueryPOC.html
  • https://developers.google.com/bigquery/pricing#data

    http://g-calculator.appspot.com/bigtable.html
  • http://www.megapivot.com/blog/posts/redshift-vs-bigquery-vs-hadoop.html

    http://courses.cs.washington.edu/courses/cse544/13sp/final-projects/p18-lijl.pdf
  • http://bigqueri.es/categories
  • https://developers.google.com/bigquery/third-party-tools

    http://bigquery.bimeanalytics.com/
  • http://bigqueri.es/

    https://developers.google.com/bigquery/streaming-data-into-bigquery
  • https://cloud.google.com/developers/starterpack/
  • www.teachingkidsprogramming.org
  • Bleeding Edge Databases

    1. Bleeding Edge Databases @LynnLangit
    2. Unstructured Data
    3. Live Tweets on a Building
    4. What is Aerospike?
    5. Benchmark Results • 200,000 tps (read-write) & 300,000 tps (read-heavy) • 10X faster for R/W loads on SSDs
    6. DEMO
    7. More Benchmark Results • Config: 10G network; Aerospike 3; same hardware; 4-node CentOS • Data: 500GB; 50M records • Each record: 100 bytes; 23-byte key; 10 fields
    8. Aerospike Architecture
    9. Example Architecture
    10. How to try it out • Bare metal or pick a cloud, set up a VM • Get the free community edition • Go…
    11. Linked Open Data Cloud
    12. What is Algebraix Data? • IoT – Semantic Web • Super powerful: 1 billion triples on 1 node • Native mathematical engine • Triple store • RDF (graph)
    13. SPARQL Server™ • W3C & OGC compliant RDF / SPARQL semantic database • Natively built with proprietary math: Algebraix technology (and patents) • Runs on commodity hardware: in the cloud (or on premises); scales up and down • Significantly better benchmark performance over leading RDF databases
    14. Benchmark Results • SP2Bench SPARQL Performance Benchmark
    15. SP^2 Benchmark Visualized
    16. DEMO
    17. It’s the Math…
    18. Patents
    19. SPARQL Server™ • Runs on common hardware: any cloud or on premises • High performance & capacity: needs no indexes; works particularly well with sparse data • Self-tuning: retains results & intermediate sets; supports point-in-time queries
    20. Algebraix Solution Stack Data Algebra Database NoSQL Relational RDF Semantic Applications Meaning Organization Optimization & Execution Conceptual Data Loaders Query Translators • Modern abstract algebra • Zermelo-Fraenkel set theory • Mathematics-based data management platform • Universal data language • Collection of I.P. • SPARQL Server – RDF • A2DB - Relational • Search • Analytics • Business Intelligence • Data Integration Algebraix Platform
    21. How to try it out • Sign up on their website • Try out when notified (this July)
    22. What is Google BigQuery? • QaaS – interactive RESTful web service • SQL-like language • Queries data stored in Google cloud • Wide column tables • Uses OAuth for access control • Very fast: 750M rows in <10 secs
    23. Easy & Fast • Load it: text or JSON; up to 100k inserts/sec (streaming) • Query it: supports core SQL query concepts (SELECT, FROM, JOIN, WHERE, ORDER BY, GROUP BY); windowing functions (OVER / PARTITION); common aggregates (SUM, COUNT, MAX); includes ‘analytic’ SQL (STDDEV, VARIANCE, CORRELATION); REGEXP_MATCH • Pay (for) it: queries cost $5 per TB processed; storage is around $30 per TB per month
    24. Benchmark Results • TPC-H Benchmark
    25. DEMO
    26. Partners and BigQuery • Google Sheets • Tableau • QlikView • Bime • Excel
    27. How to try it out • Set up a Google Cloud account • Upload or stream data • Query
    28. Google Cloud Starter Pack • Use code “gde-in”
    29. Next steps • Try them out • @LynnLangit
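
Slide 23 quotes two BigQuery prices: $5 per TB processed by queries, and around $30 per TB per month for storage. The arithmetic behind a monthly bill can be sketched as below. This is a rough illustration only: the two rates are the figures from the slide (a snapshot from the talk, not necessarily current pricing), and the function name `estimate_monthly_cost` and the workload numbers are hypothetical.

```python
# Rates as quoted on slide 23; assumed to be a snapshot, not current pricing.
QUERY_USD_PER_TB = 5.0            # per TB processed by queries
STORAGE_USD_PER_TB_MONTH = 30.0   # the slide says "around $30"


def estimate_monthly_cost(stored_tb, scanned_tb_per_query, queries_per_month):
    """Return (storage_usd, query_usd, total_usd) for one month.

    Hypothetical helper: assumes every query scans the same amount of data
    and ignores free tiers, streaming-insert fees, and discounts.
    """
    storage_usd = stored_tb * STORAGE_USD_PER_TB_MONTH
    query_usd = scanned_tb_per_query * queries_per_month * QUERY_USD_PER_TB
    return storage_usd, query_usd, storage_usd + query_usd


if __name__ == "__main__":
    # Made-up workload: 2 TB stored, 100 queries a month scanning 0.5 TB each.
    storage, query, total = estimate_monthly_cost(2, 0.5, 100)
    print(f"storage ${storage:.2f} + queries ${query:.2f} = ${total:.2f}")
```

With this made-up workload the query charge is larger than the storage charge, which is why limiting the data each query scans tends to matter more than the size of the stored data.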
