DEX presentation for GDM 2011

- Logical graph model
- Internal representation
- Software architecture
- Experimental results

Published in: Technology

  1. DEX: a High-Performance Graph Database Management System
     Authors: Norbert Martínez-Bazan, Sergio Gómez-Villamor, Francesc Escalé-Claveras
  2. Outline
     • Introduction
     • DEX
     • Logical graph model
     • Internal representation
     • Software architecture
     • Experimental results
     • Conclusions
     • Future work
  3. Introduction
     • [2006] DEX started by DAMA-UPC
     • [2010] Sparsity Technologies is a spin-out from DAMA-UPC
       • Sparsity commercializes DEX and provides services
       • DAMA-UPC does development and research
     • DEX versions
       • V2.0: March 2009
       • V3.0: October 2009
       • V4.0: November 2010
  4. DEX
     • DEX is a graph database:
       • Data and schema are both represented as a graph
       • Data operations are based on graph operations
       • Integrity constraints are graph-based
       (Renzo Angles and Claudio Gutierrez. 2008. Survey of graph database models. ACM Comput. Surv. 40, 1, Article 1 (February 2008).)
     • Focus:
       • Management of very large graphs
       • High performance on query operations
  5. Logical graph model
     • Labeled: nodes and edges are “typed”
     • Directed: edges can have a fixed direction
     • Attributed: nodes and edges can have multiple single-valued attributes
     • Multigraph: two nodes can be connected by multiple edges
  6. Internal representation
     Requirements:
     • Split the graph into smaller structures
       - Favour caching
       - Move only the significant parts into main memory
     • OIDs instead of objects
       - Reduce memory requirements
     • Specific structures to improve traversals
       - Index the edges of a node
     • Attributes fully indexed
       - Improve queries based on value filters
  7. Internal representation
     Our approach:
     • Map + Bitmaps → Link
     • Link: bidirectional association between values and OIDs
     • Two functionalities:
       - Given a value → a set of OIDs (a bitmap)
       - Given an OID → the value
     [Figure: example Link over the values a, b, c and OIDs 1 to 5, with panels labeled Map, Bitmaps and Link]
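The two lookups a Link offers can be illustrated with a minimal Python sketch. It is not DEX's implementation: the class name `Link`, its method names, the use of a Python `dict` for the Map, and the use of an unbounded `int` as an uncompressed bitmap are all assumptions made for illustration.

```python
from collections import defaultdict


class Link:
    """Hypothetical sketch of a Link: a two-way association between values and OIDs."""

    def __init__(self):
        # Map side: value -> bitmap of OIDs (bit i set means OID i holds this value).
        self._bitmaps = defaultdict(int)
        # Reverse side: OID -> value.
        self._values = {}

    def set(self, oid, value):
        # Attributes are single-valued, so clear the OID's bit under its old value first.
        if oid in self._values:
            self._bitmaps[self._values[oid]] &= ~(1 << oid)
        self._values[oid] = value
        self._bitmaps[value] |= 1 << oid

    def oids(self, value):
        """Given a value, return the set of OIDs that hold it."""
        bitmap = self._bitmaps.get(value, 0)
        return {i for i in range(bitmap.bit_length()) if bitmap >> i & 1}

    def value(self, oid):
        """Given an OID, return its value, or None if it has none."""
        return self._values.get(oid)
```

For example, after `link = Link()`, `link.set(1, "a")`, `link.set(2, "a")` and `link.set(3, "b")`, calling `link.oids("a")` yields the OIDs 1 and 2, and `link.value(3)` yields `"b"`.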
  8. Internal representation
     A graph as a combination of Bitmaps:
     • 1 Bitmap for each node or edge type
     • 1 Link for each attribute
     • 2 Links for each edge type:
       - Outgoing and incoming edges
     N. Martínez-Bazán, V. Muntés-Mulero, S. Gómez-Villamor, J. Nin, M. A. Sánchez-Martínez, and J. Larriba-Pey. Dex: high-performance exploration on large graphs for information retrieval. In Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management (CIKM '07).
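How these pieces could fit together is sketched below in Python. This is one possible reading of the slide, not DEX's code: the class `BitmapGraph`, the helper `to_oids`, and all method names are hypothetical. The sketch also assumes that the two Links of an edge type associate a node OID with the bitmap of its outgoing (or incoming) edge OIDs. Only the value-to-OIDs direction of each Link is modelled, and attribute overwrites are not handled.

```python
from collections import defaultdict


def to_oids(bitmap):
    """Expand an int used as a bitmap into the set of OIDs whose bits are set."""
    return {i for i in range(bitmap.bit_length()) if bitmap >> i & 1}


class BitmapGraph:
    """Hypothetical sketch: a graph held as per-type bitmaps plus value -> OID bitmaps."""

    def __init__(self):
        self._next_oid = 1
        # One bitmap per node or edge type: type name -> bitmap of member OIDs.
        self.type_bitmap = defaultdict(int)
        # One Link per attribute: attribute -> value -> bitmap of OIDs.
        self.attr_link = defaultdict(lambda: defaultdict(int))
        # Two Links per edge type: node OID -> bitmap of edge OIDs.
        self.out_link = defaultdict(lambda: defaultdict(int))
        self.in_link = defaultdict(lambda: defaultdict(int))

    def _new_object(self, type_name):
        oid = self._next_oid
        self._next_oid += 1
        self.type_bitmap[type_name] |= 1 << oid
        return oid

    def add_node(self, type_name):
        return self._new_object(type_name)

    def add_edge(self, type_name, tail, head):
        edge = self._new_object(type_name)
        self.out_link[type_name][tail] |= 1 << edge
        self.in_link[type_name][head] |= 1 << edge
        return edge

    def set_attribute(self, attr, oid, value):
        self.attr_link[attr][value] |= 1 << oid

    def select(self, type_name, attr, value):
        """OIDs of one type whose attribute equals value, found with a single bitwise AND."""
        return to_oids(self.type_bitmap[type_name] & self.attr_link[attr][value])
```

With this layout, a value filter restricted to a type is an intersection of two bitmaps, and the edges leaving a node are read directly from `out_link` without scanning other objects.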
  9. Software architecture
     • DEXCORE
       • Complete C++ library
         - Storage and query
       • Linux / Windows / MacOSX
       • 32 / 64 bits
     • JDEX
       • DEXCORE functionality provided as a Java library
  10. Software architecture
      DEXCORE:
      • IO:
        • Segment: logical space of pages
        • Pool: groups of segments
        • Storage: I/O device
        • Cache: I/O management
          - Replacement policy
      • Data:
        • Paged out-of-core structures
        • Bitmaps, Maps, Links, …
      • Graph:
        • A combination of structures
        • DbGraph and RGraphs
      • DEX:
        • Database and Session management
  11. Software architecture
      Implementation details:
      • 37-bit unsigned integer OIDs
        • More than 137 billion objects per graph
      • Bitmaps are compressed
        • Clusters of 32 consecutive bits
        • Only existing clusters are stored
      • Groups of OIDs for each type
        • Higher density of consecutive bits in the bitmaps
      • Maps are B+ trees
      • Compressed UTF-8 storage for Unicode strings
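The OID limit and the cluster idea can be checked with a short Python sketch. The constant and class names (`OID_BITS`, `CLUSTER_BITS`, `SparseBitmap`) are hypothetical, and a `dict` stands in for whatever paged structure DEX uses to locate clusters. The sketch only illustrates storing non-empty 32-bit clusters.

```python
OID_BITS = 37
CLUSTER_BITS = 32

# 2**37 distinct OIDs: 137,438,953,472, i.e. a little over 137 billion.
MAX_OIDS = 1 << OID_BITS


class SparseBitmap:
    """Hypothetical sketch: a bitmap that keeps only its non-empty 32-bit clusters."""

    def __init__(self):
        self._clusters = {}  # cluster index -> 32-bit word

    def set(self, oid):
        index, offset = divmod(oid, CLUSTER_BITS)
        self._clusters[index] = self._clusters.get(index, 0) | (1 << offset)

    def clear(self, oid):
        index, offset = divmod(oid, CLUSTER_BITS)
        word = self._clusters.get(index, 0) & ~(1 << offset)
        if word:
            self._clusters[index] = word
        else:
            # A cluster with no bits left is dropped rather than stored as zero.
            self._clusters.pop(index, None)

    def test(self, oid):
        index, offset = divmod(oid, CLUSTER_BITS)
        return bool(self._clusters.get(index, 0) >> offset & 1)

    def stored_clusters(self):
        return len(self._clusters)
```

Because OIDs of the same type are allocated in groups, the set bits of a type's bitmap tend to fall into the same clusters, so fewer clusters need to be stored for the same number of objects.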
  12. Experimental results
      Load tests:

                           IMDB      Wikipedia   RMAT (sf=28)
      DB                   2.4 GB    7.6 GB      83 GB
      Physical Mem         9 GB      9 GB        60 GB
      Load time            21 min    2h 6min     15h
      Nodes                13 M      19 M        230 M
      Edges                22 M      180 M       2147 M
      Values               48 M      283 M       230 M
      Insertions per sec   65 K      62 K        48 K
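The slide does not say how "Insertions per sec" is computed. One reading that roughly reproduces the reported figures is the total of nodes, edges and values divided by the load time. The Python sketch below works through that arithmetic. The formula is an assumption, and the inputs are the rounded figures from the table, so small deviations are expected.

```python
# Figures copied from the load-test table; counts are in millions, times in seconds.
LOADS = {
    "IMDB": {"nodes": 13, "edges": 22, "values": 48, "seconds": 21 * 60},
    "Wikipedia": {"nodes": 19, "edges": 180, "values": 283, "seconds": (2 * 60 + 6) * 60},
    "RMAT (sf=28)": {"nodes": 230, "edges": 2147, "values": 230, "seconds": 15 * 3600},
}


def insertions_per_second(load):
    """Assumed definition: (nodes + edges + values) / load time."""
    objects = (load["nodes"] + load["edges"] + load["values"]) * 1_000_000
    return objects / load["seconds"]


if __name__ == "__main__":
    for name, load in LOADS.items():
        print(f"{name}: {insertions_per_second(load) / 1000:.1f} K insertions/sec")
```

Under this assumption the rates come out at about 66 K, 64 K and 48 K insertions per second, close to the 65 K, 62 K and 48 K reported on the slide.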
  13. Experimental results
      Query tests:
      • IMDB database (2.4 GB)
      • Queries:
        - A: full extraction of a movie, multiple 1-hop traversals [4K edges]
        - B: distance between two actors [8K edges]
        - C: extract all movies that match a given pattern [315K edges]

      Query   In-memory   128 MB
      A       0.13 sec    0.13 sec
      B       1.52 sec    1.79 sec
      C       384 sec     385 sec
  14. Conclusions
      • We propose DEX, a high-performance graph database querying system for labeled and directed attributed multigraphs
      • We propose a graph representation based on the intensive use of bitmaps
      • We perform an experimental performance analysis to show the ability of DEX to store and query very large graphs
  15. Future work
      • Trillions of objects
      • Transactional system
      • Distributed system
      • Query language
      • Query optimization
      • High-level graph operations
      • Pattern matching
  16. Questions?
      sgomez@sparsity-technologies.com
      Sparsity Technologies: http://www.sparsity-technologies.com
      DAMA-UPC: http://www.dama.upc.edu
