GaianDB


Presentation I gave on GaianDB, a dynamic federated distributed database available on IBM alphaWorks.

The presentation won't make a lot of sense without speaker notes, which I've not written yet. Sorry about that.

Published in: Technology

Transcript

  • 1. GaianDB: a dynamic distributed federated database. Dale Lane, @dalelane
  • 2. A massively over-simplified view of data-warehousing...
  • 3. The “Internet of Things”
  • 4. GaianDB: a dynamic distributed federated database
  • 5. Federated data
  • 6. Network of distributed databases
  • 7. A dynamic network
  • 8. A dynamic network. Biologically-inspired self-organisation: exploit natural selection in nature to build better networks; robust self-organizing network architectures; frameworks and algorithms for robust fault-tolerant information dissemination; robust communications with minimal complexity or human control.
  • 9. Gaian database. [Diagram, repeated over four frames: an SQL query spreading across a network of nodes N0 to N11.] Queries are routed to all database nodes: a flood query, but retrieving only the data required to satisfy the query. This exchanges query traffic in the network for data traffic, aiming to minimize total traffic. Predicated on a "store data locally, read data from anywhere" paradigm.
  • 10. Architecture. [Diagram: one GaianDB node in the N0 to N11 network, shown expanded.] Derby Engine: parsing, compilation, execution. GaianPStmtNode VTI: executes queries on physical leaf nodes and propagates the original SQL (plus queryID and steps state info) to linked Gaian nodes. Diagram labels: "Instantiates"; "Invokes costing methods"; "Pushes columns and 'where' clause in a structure"; "Original SQL"; "propagate". Data sources shown: DB2, Oracle, MS SQLServer, Sybase, MySQL, flat files, in-memory tables, Derby tables, text index, MQ(tt) stream data. Multithreaded, breadth-first query propagation. Loop detection/handling: no duplicates.
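
The flood query on slide 9 and the breadth-first propagation with loop detection on slide 10 can be illustrated with a small simulation. This is only a sketch under stated assumptions, not GaianDB's implementation: it is single-threaded (the slide says the real propagation is multithreaded), and the `links` and `local_rows` dictionaries, the node names, and the meter rows are all hypothetical.

```python
from collections import deque


def flood_query(links, local_rows, start, predicate):
    """Breadth-first flood of one query over a graph of linked nodes.

    links:      dict of node name -> list of linked node names (hypothetical layout)
    local_rows: dict of node name -> list of rows stored at that node
    predicate:  function(row) -> bool, standing in for the SQL 'where' clause
    """
    seen = {start}          # nodes that have already accepted this query
    queue = deque([start])
    results = []
    while queue:
        node = queue.popleft()
        # Each node filters its own data, so only matching rows travel back.
        results.extend(row for row in local_rows.get(node, []) if predicate(row))
        for neighbour in links.get(node, []):
            if neighbour not in seen:   # loop detection: a node runs the query once
                seen.add(neighbour)
                queue.append(neighbour)
    return results, seen


# Hypothetical four-node network containing a cycle (N0 - N1 - N2 - N0).
links = {
    "N0": ["N1", "N2"],
    "N1": ["N0", "N2", "N3"],
    "N2": ["N0", "N1"],
    "N3": ["N1"],
}
local_rows = {
    "N0": [{"meter": "a", "kwh": 3}],
    "N1": [{"meter": "b", "kwh": 12}],
    "N2": [{"meter": "c", "kwh": 7}],
    "N3": [{"meter": "d", "kwh": 15}],
}
rows, visited = flood_query(links, local_rows, "N0", lambda r: r["kwh"] > 5)
```

The `seen` set plays the role that a query ID would play in a real network: a node that has already handled the query ignores later copies, so the cycle does not produce duplicate rows.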
  • 11. Performance – with 1,250 nodes. [Chart: "Query time for 1025 nodes, fetching up to 1025 rows from each"; x-axis: rows fetched per node; y-axis: time (milliseconds); series: Query Execute Time, Total Query Time, Linear (Total Query Time); fitted line y = 4.217x + 349.251.] [Chart: "Query Performance"; x-axis: number of nodes; y-axis: query time (milliseconds); series: Average Query Time, Predicted Max (Layers), Predicted Min (Layers).]
  • 12. Performance questions: (1) the time to propagate a query to all of the nodes in the database, as a function of the number of database nodes (N); (2) the time to fetch data from across the nodes of the database to a single node, as a function of the volume of data; (3) the time to fetch data from across the database to multiple nodes concurrently querying, as a function of the number of nodes concurrently querying.
  • 13. Graph metrics. The eccentricity ε(νi) of a graph vertex νi is the maximum graph distance between νi and any other vertex νj of G, i.e. the "longest shortest path" from νi to any other vertex of the graph. The maximum eccentricity is the graph diameter Gd. The minimum eccentricity is the graph radius Gr. We define the size of G as the number of vertices N, and the number of connections at each vertex as the vertex degree δi (1 ≤ i ≤ N).
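
The definitions on slide 13 can be checked on a toy graph. The following is a minimal sketch that assumes a connected, undirected graph stored as an adjacency dictionary; the function names and the five-vertex path graph are illustrative and do not come from the presentation.

```python
from collections import deque


def eccentricity(graph, source):
    """Longest shortest-path distance (in edges) from source to any other vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:           # first visit in BFS gives the shortest distance
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values())


def diameter_and_radius(graph):
    """Diameter is the maximum eccentricity; radius is the minimum eccentricity."""
    ecc = {v: eccentricity(graph, v) for v in graph}
    return max(ecc.values()), min(ecc.values())


# Path graph A - B - C - D - E (assumed connected).
path = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
gd, gr = diameter_and_radius(path)
```

For this path graph the end vertices have eccentricity 4 and the middle vertex has eccentricity 2, so the diameter is 4 and the radius is 2.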
  • 14. Biologically inspired self-organisation. Network growth by preferential attachment, using a fitness function at each node. Limit maximum vertex degree = 10. Gd = nint[(1+e) * ln(N)], Gr = nint[(1-e) * ln(N)], e = 0.24. [Chart: x-axis: number of nodes (N); y-axis: graph dimension (edges); series: Radius, Diameter, (1+e)ln(N), (1-e)ln(N).]
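
The two fitted expressions on slide 14 are easy to evaluate for a given network size. The sketch below assumes that "nint" means rounding to the nearest integer (halves rounded up); the function names are illustrative.

```python
import math

E = 0.24  # value of e given on slide 14


def nint(x):
    """Nearest integer, rounding halves up (an assumed reading of 'nint')."""
    return math.floor(x + 0.5)


def predicted_diameter(n, e=E):
    """Gd = nint[(1 + e) * ln(N)]"""
    return nint((1 + e) * math.log(n))


def predicted_radius(n, e=E):
    """Gr = nint[(1 - e) * ln(N)]"""
    return nint((1 - e) * math.log(n))


for n in (10, 100, 1025):
    print(n, predicted_radius(n), predicted_diameter(n))
```

For N = 1025, ln(N) is about 6.93, so the formulas give a predicted diameter of 9 and a predicted radius of 5.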
  • 15. Query propagation time. With TL = link latency and Tp = processor delay, the predicted maximum (Tmax) and minimum (Tmin) times to execute the flood query are Tmax = (Gd + 1)(TL + Tp) and Tmin = (Gr + 1)(TL + Tp), with the predicted execute query time from any node ν being Tν = (ε(ν) + 1)(TL + Tp). Hence, substituting ε(ν) = nint[B * ln(N)], with B lying between (1-e) and (1+e): Tν = (nint[B * ln(N)] + 1)(TL + Tp).
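
The timing model on slide 15 reduces to one multiplication per case. In the sketch below the latency and delay values are hypothetical placeholders, not measurements from the presentation, and the diameter and radius are the values the slide 14 formulas give for N = 1025.

```python
def propagation_time(layers, link_latency, processor_delay):
    """(layers + 1) * (TL + Tp): the per-node cost applied to each hop plus the querying node."""
    return (layers + 1) * (link_latency + processor_delay)


# Hypothetical per-hop costs in milliseconds (assumptions for illustration only).
T_L = 50.0   # link latency
T_P = 4.0    # processor delay

# Diameter and radius predicted by the slide 14 formulas for N = 1025.
G_D, G_R = 9, 5

t_max = propagation_time(G_D, T_L, T_P)   # query issued from a node with eccentricity Gd
t_min = propagation_time(G_R, T_L, T_P)   # query issued from a node with eccentricity Gr
t_node = propagation_time(7, T_L, T_P)    # a node with eccentricity 7, for comparison
```

With these placeholder values the model gives 540 ms at most and 324 ms at least, and any individual node falls between those bounds according to its eccentricity.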
  • 16. Measured query propagation. [Chart: "Individual Query Time Scalability"; x-axis: number of nodes (0 to 1200); y-axis: query time (ms); series: Average Query Time, Predicted Max (Diameter + 1), Predicted Min (Radius + 1), Queried node eccentricity + 1.] [Chart: "Individual Query Time Scalability"; x-axis: number of nodes (0 to 100); y-axis: query time (ms); series: Individual Query Times, Average Query Time, Queried node eccentricity + 1.]
  • 17. Measured data fetch. [Chart: "Query time to fetch 1 million rows"; x-axis: total rows fetched; y-axis: time (milliseconds); series: Total Query Time 1025 nodes, Total Query Time 1 node, Total Query Time 1 node indexed, Linear (Total Query Time 1025 nodes), Linear (Total Query Time 1 node); fitted lines y = 4.217x + 349.251 and y = 1.7383x + 678.141.]
  • 18. Example uses
  • 19. Smart Metering: centralised write
  • 20. Smart Metering: centralised read
  • 21. Smart Metering: distributed federated write
  • 22. Smart Metering: distributed federated read
  • 23. Other uses...
  • 24. http://www.alphaworks.ibm.com/tech/gaiandb
  • 25. Image credits. Background: YouTube video "The Internet of Things", IBM, http://www.youtube.com/watch?v=sfEbMV295Kk. Icons: DB and envelope icons, Tim Morgan, http://flickr.com/photos/timothymorgan/sets/1615269; Microsoft Excel icon, Vincent Garnier (courtesy of IconArchive), http://iconarchive.com/show/softdimension-icons-by-benjigarner/Excel-icon.html. Photo of car mechanics, Tomas, http://flickr.com/photos/tma/2264878. All other images original from GaianDB work.
