Dex Technical Seminar (April 2011)
Dex Technical Seminar (April 2011):
- Introduction
- Database construction
- Query operations
- Graph algorithms

Transcript

  • 1. Technical Seminar
    April, 2011
    Sergio Gómez
  • 2. Index
    • Introduction
    • Basic Concepts
    • Database construction
    • Validate construction
    • Query database
    • Graph algorithms
    • Script loaders
    • Tips & tricks
  • Introduction
    Graph database
    • Graph databases focus on the structure of the model.
    • Nodes and edges instead of tables.
    • Implicit relation in the model.
    • DEX is a programming library for managing a graph database.
    • Very large datasets.
    • High performance query processing.
  • Introduction
    DEX Definition
    • Persistent and temporary graph management programming library.
    • Data model: typed and attributed directed multigraph.
      • Typed: node and edge instances belong to a type (label).
      • Attributed: node and edge instances may have attribute values.
      • Directed: edges can be directed or undirected.
      • Multigraph: multiple edges between two nodes.
    • Types of edges:
      • Materialized: directed and undirected.
      • Virtual: constrained by the values of two attributes (foreign keys).
        • Just for navigation.
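The data model above can be pictured with a small in-memory sketch. This is not DEX code: the class `TinyGraph` and every member of it are hypothetical, written in plain Java only to illustrate what "typed, attributed, directed multigraph" means in practice.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names belong to the DEX API.
class TinyGraph {
    // One edge instance: a type (label), two endpoints and a direction flag.
    static final class Edge {
        final String type;
        final long tail;
        final long head;
        final boolean directed;

        Edge(String type, long tail, long head, boolean directed) {
            this.type = type;
            this.tail = tail;
            this.head = head;
            this.directed = directed;
        }
    }

    private long nextOid = 1;
    final Map<Long, String> nodeTypes = new HashMap<>();               // typed
    final Map<Long, Map<String, Object>> attributes = new HashMap<>(); // attributed
    final List<Edge> edges = new ArrayList<>();                        // multigraph: parallel edges allowed

    long newNode(String type) {
        long oid = nextOid++;
        nodeTypes.put(oid, type);
        return oid;
    }

    void newEdge(String type, long tail, long head, boolean directed) {
        edges.add(new Edge(type, tail, head, directed));
    }

    void setAttribute(long oid, String name, Object value) {
        attributes.computeIfAbsent(oid, k -> new HashMap<>()).put(name, value);
    }

    // Counts edges of any type leading from a to b; undirected edges match both ways.
    long countEdgesBetween(long a, long b) {
        return edges.stream()
                .filter(e -> (e.tail == a && e.head == b)
                        || (!e.directed && e.tail == b && e.head == a))
                .count();
    }
}
```

Two directed "phones" edges plus one undirected "FRIEND" edge between the same pair of nodes are all kept as separate instances, which is the multigraph property.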
  • Introduction
    Graph Model
  • Basic Concepts
    • Java library: jdex.jar public API
    • Native library
      • Linux: libjdex.so
      • Windows: jdex.dll
    • System requirements:
      • Java Runtime Environment, v1.5 or higher.
      • Operating system:
        • Windows – 32 bits (64 bits to be supported in new release)
        • Linux – 32 and 64 bits
        • Mac OS to be announced soon
  • Basic Concepts
    Class Diagram
    [Figure: class diagram of DEX, GraphPool, Session, Graph, DbGraph, RGraph and Objects, joined by 1-to-1 and 1-to-N associations. Other labels in the figure: Graph factory, Persistent DB, Temporary, Set of OIDs.]
  • 48. Basic Concepts
    Main methods
    GraphPool
      newSession() → Session
    Session
      getDbGraph() → DbGraph
      newGraph() → RGraph
      close()
    DEX
      open(filename) → GraphPool
      create(filename) → GraphPool
      close()
    Objects
      add(long)
      exists(long)
      copy(objs)
      union(objs)
      intersection(objs)
      difference(objs)
    Graph
      newNodeType(name) → int
      newEdgeType(name) → int
      newNode(type) → long
      newEdge(type) → long
      newAttribute(type, name) → long
      setAttribute(oid, attr, value)
      getAttribute(oid, attr) → value
      select(type) → Objects
      select(attr, op, value) → Objects
      explode(oid, type) → Objects
    Objects.Iterator
      hasNext() → boolean
      next() → long
  • Database construction
    • Graph
      • DEX: loads the library and manages graph database instances.
      • GraphPool: manages a graph database instance.
      • Session: manages a set of queries and temporary data.
    • Nodes & Edges
      • Type:
        • DEX identifier (integer)
        • Public identifier (string)
      • Instance:
        • DEX identifier (long) – OID
        • Belongs to a type
        • Attributes
    • Attribute:
      • DEX identifier (long)
      • Public identifier (string)
      • Scope: type or global
  • Database construction
    Create a graph database
    GraphPool DEX#create(String name)
    Creates a new graph database instance.
    Returns the GraphPool instance to manage the new graph database instance.
    GraphPool DEX#open(String name)
    Opens an existing graph database instance.
    Returns the GraphPool instance to manage the existing graph database instance.
    Session GraphPool#newSession()
    Initiates a new user Session.
    DbGraph Session#getDbGraph()
    Gets the Graph instance representing the persistent graph database.
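  • 72. Database construction
    Create a graph database example
    import edu.upc.dama.dex.core.*;

    // Load the library and create a new graph database file
    DEX dex = new DEX();
    GraphPool gpool = dex.create("C:/image.dex");
    Session s = gpool.newSession();

    // ... work with the session here ...

    // Release resources in reverse order of creation
    s.close();
    gpool.close();
    dex.close();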
  • 73. Database construction
    Add nodes
    int Graph#newNodeType(String name)
    Creates a new node type with the given unique name.
    Returns the DEX node type identifier.
    long Graph#newNode(int nodeType)
    Creates a new node belonging to the given node type.
    Returns the DEX object identifier.
  • 74. Database construction
    Add edges
    int Graph#newEdgeType(String name, boolean directed)
    Creates a new edge type with the given unique name. The edge type is either directed or undirected.
    Returns the DEX edge type identifier.
    int Graph#newRestrictedEdgeType(String name, int srcNodeType, int dstNodeType)
    Creates a new directed edge type with the given unique name.
    (Integrity restriction) Source and destination of the edge instances are restricted to the given node types.
    Returns the DEX edge type identifier.
    long Graph#newEdge(long tail, long head, int edgeType)
    Creates a new edge belonging to the given edge type.
    Tail is the source and head is the target (iff directed).
    Returns the DEX edge identifier.
  • 75. Database construction
    Add nodes and edges example

    DbGraph dbg = s.getDbGraph();
    // One node type and three node instances
    int person = dbg.newNodeType("PERSON");
    long p1 = dbg.newNode(person);
    long p2 = dbg.newNode(person);
    long p3 = dbg.newNode(person);
    // Undirected edge type with two instances
    int friend = dbg.newUndirectedEdgeType("FRIEND");
    long e1 = dbg.newEdge(p1, p2, friend);
    long e2 = dbg.newEdge(p2, p3, friend);
    // A second edge type with one instance
    int loves = dbg.newEdgeType("LOVES");
    long e3 = dbg.newEdge(p1, p3, loves);

    [Figure: the graph with nodes p1, p2 and p3.]
  • 76. Database construction
    Manage attributes
    class Value
    Encapsulates a value and its domain (data type): String, Integer, Long, Double, Boolean, Timestamp.
    Use it to set and get attribute values for the objects.
    long Graph#newAttribute(int type, String name, short dataType, short kind)
    Creates a new attribute with the given unique name for the given node or edge type.
    Returns the DEX attribute identifier.
    “dataType” can be: Value#STRING, Value#INT, Value#LONG, Value#DOUBLE, Value#BOOL, Value#TIMESTAMP.
    “kind” can be:
    Graph#ATTR_KIND_BASIC. Basic attribute (just set and get values).
    Graph#ATTR_KIND_INDEXED. Indexed attribute (set and get values as well as select operations).
    Graph#ATTR_KIND_UNIQUE. Indexed attribute. Unique (PK).
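To make the three attribute kinds concrete, here is a minimal sketch in plain Java. It does not use or describe DEX internals: `AttributeStore` and its methods are hypothetical names, and the only assumption carried over from the slide is the behaviour of each kind (basic: set and get only; indexed: also selectable; unique: indexed and rejecting duplicate values).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; this is not how DEX stores attributes.
class AttributeStore {
    enum Kind { BASIC, INDEXED, UNIQUE }

    private final Kind kind;
    private final Map<Long, Integer> valueByOid = new HashMap<>();           // always kept: set/get
    private final TreeMap<Integer, Set<Long>> oidsByValue = new TreeMap<>(); // kept for INDEXED/UNIQUE

    AttributeStore(Kind kind) {
        this.kind = kind;
    }

    void set(long oid, int value) {
        if (kind == Kind.UNIQUE) {
            Set<Long> owners = oidsByValue.get(value);
            if (owners != null && !owners.contains(oid)) {
                throw new IllegalStateException("duplicate value for unique attribute: " + value);
            }
        }
        Integer old = valueByOid.put(oid, value);
        if (kind != Kind.BASIC) {
            if (old != null) {
                // Drop the stale index entry before adding the new one.
                Set<Long> previous = oidsByValue.get(old);
                previous.remove(oid);
                if (previous.isEmpty()) {
                    oidsByValue.remove(old);
                }
            }
            oidsByValue.computeIfAbsent(value, k -> new TreeSet<>()).add(oid);
        }
    }

    Integer get(long oid) {
        return valueByOid.get(oid);
    }

    // Select by equality; only possible when an index exists.
    Set<Long> selectEquals(int value) {
        if (kind == Kind.BASIC) {
            throw new UnsupportedOperationException("basic attributes cannot be selected");
        }
        return oidsByValue.getOrDefault(value, new TreeSet<>());
    }
}
```

The extra map is the cost of an index: every `set` has to maintain it, which is why a basic attribute can be cheaper when no select is ever needed.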
  • 77. Database construction
    Manage attributes
    Graph#setAttribute(long oid, long attr, Value v)
    Sets the given Value for the given attribute to the given object identifier.
    The given attribute must be defined for the object's type.
    The Value's data type must match the attribute's data type, or be NULL.
    Graph#getAttribute(long oid, long attr, Value v)
    Gets the Value for the given attribute and for the given object identifier.
    The given attribute must be defined for the object's type.
  • 78. Database construction
    Manage attributes example

    // Attributes on the PERSON node type
    long name = dbg.newAttribute(person, "NAME", Value.STRING);
    long age = dbg.newAttribute(person, "AGE", Value.INT);
    Value v = new Value();
    dbg.setAttribute(p1, age, v.setInt(18));
    dbg.setAttribute(p2, name, v.setString("KELLY"));
    dbg.setAttribute(p3, name, v.setString("MARY"));

    // Attribute on the FRIEND edge type
    long since = dbg.newAttribute(friend, "SINCE", Value.INT);
    dbg.setAttribute(e1, since, v.setInt(2000));
    dbg.setAttribute(e2, since, v.setInt(1995));

    [Figure: the graph after setting attributes: JOHN (18), KELLY and MARY, with FRIEND edges labeled 2000 and 1995.]
  • 79. Database construction
    Manage attributes example

    // Several edges of the same type between the same nodes (multigraph)
    int phones = dbg.newEdgeType("phones");
    long when = dbg.newAttribute(phones, "when", Value.STRING);
    long e4 = dbg.newEdge(p1, p3, phones);
    dbg.setAttribute(e4, when, v.setString("4pm"));
    long e5 = dbg.newEdge(p1, p3, phones);
    dbg.setAttribute(e5, when, v.setString("5pm"));
    long e6 = dbg.newEdge(p3, p2, phones);
    dbg.setAttribute(e6, when, v.setString("6pm"));

    System.out.println(dbg.getAttribute(p1, name));
    System.out.println(dbg.getAttribute(e4, when));
    System.out.println(dbg.getAttribute(e5, when));

    [Figure: the same graph with the phones edges labeled 4pm, 5pm and 6pm.]
  • Validate construction
    GraphPool#dumpData(File f)
    Dumps a summary of the logical content of the graph database.
    GraphPool#dumpStorage(File f)
    Dumps internal information about the storage content of the graph database.
    Graph#export(PrintWriter pw, short kind, Export e)
    Exports the graph to an external format.
    “kind” can be: GRAPHVIZ or YGRAPHML.
    Export defines the visualization (if null, default export).
  • 88. Validate construction
    Validate construction example
    import java.io.PrintWriter;
    import java.io.FileWriter;

    // The default export is used in the example by passing null.
    // A personalized export can be created by implementing class Export.
    PrintWriter stdOut = new PrintWriter(
        new FileWriter("out.graphml"));
    dbg.export(stdOut, Type.YGRAPHML, null);
    stdOut.close();
    // Dump data file
    gpool.dumpData("out.data");
    // Dump internal storage file
    gpool.dumpStorage("out.storage");
  • 89. Validate construction
    Data file example
    Edgetypes
    -- 1:2 = Friend
    -- 2:3 = Loves
    -- 3:4 = phones
    edgetypes = 3
    -- 4:1 = Person
    ------- #00000400 [Person]
    ATTR=1 //Person//age{18}
    TO [Friend]
    #00000401 [Person] ATTR=1 //Person//name{kelly}
    TO [Loves]
    #00000402 [Person] ATTR=1 //Person//name{mary}
    TO [phones]
    #00000402 [Person] ATTR=1 //Person//name{mary}
    #00000402 [Person] ATTR=1 //Person//name{mary}

    ------- 3 nodes.
    node types = 4

    Callouts in the slide:
    • # of edge types
    • # of node types
    • Node of type Person
    • Value for “age” attribute
    • “Phones” outgoing edges
  • 90. Validate construction
    Internal storage file example
    2011-02-09 12:12:45.330
    ...
    SIZE: 262KB / 2688KB
    NODES: 3
    EDGES: 6
    ...
    - NODES Person : N=3
    ...
    ATTR DATA: 4 ...
    - //Friend//since[INDEXED|INDEXED LINK] 2/2...
    [1995] - [2000]
    - //Person//age[INDEXED|INDEXED LINK] 1/1
    [18] - [18]
    ...

    Callouts in the slide:
    • Number of nodes
    • Number of edges
    • Number of “Person”s
    • Number of attributes
    • Name and kind of the attribute
    • Minimum and maximum value
    • # different values / # values
  • 91. Validate construction
    Export file example, yEd Graph Editor
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <graphml xmlns="http://graphml.graphdrawing.org/xmlns/graphml"
    xmlns:y="http://www.yworks.com/xml/graphml"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns/graphml
    http://www.yworks.com/xml/schema/graphml/1.0/ygraphml.xsd">
    <key id="d0" for="node" yfiles.type="nodegraphics"/>
    <key id="d1" for="edge" yfiles.type="edgegraphics"/>
    <graph id="DB" edgedefault="undirected">
    <node id="0">
    <data key="d0" >
    <y:ShapeNode>
    <y:Geometry height="19.0" width="20"/>
    <y:Fill color="#a5c3f6"/>
    <y:BorderStyle color="#a5c3f6"/>
    <y:NodeLabel fontSize="10" textColor="#000000">1024</y:NodeLabel>
    <y:Shape type="rectangle"/> </y:ShapeNode></data>
    </node>

    </graph>
    </graphml>
  • Query Database
    Objects
    class Objects
    Unordered set of OIDs for large collections.
    Implements Set<Long>, Iterable<Long>, Closeable.
    boolean Objects#add(long oid)
    Adds the given OID to the collection.
    Returns true if added, false if the OID was already in the collection.
    boolean Objects#exists(long oid)
    Returns true if the given OID exists in the collection, false otherwise.
    boolean Objects#remove(long oid)
    Removes the given OID from the collection.
    Returns true if removed, false if the OID was not in the collection.
  • 100. Query Database
    Objects
    long Objects#union(Objects objs)
    this = this UNION objs
    Returns the new size of the collection.
    long Objects#intersection(Objects objs)
    this = this INTERSECTION objs
    Returns the new size of the collection.
    long Objects#difference(Objects objs)
    this = this DIFFERENCE objs
    Returns the new size of the collection.
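The in-place semantics above ("this = this OP objs", returning the new size) can be sketched with the standard Java collections. `OidSet` is a hypothetical stand-in, not the DEX `Objects` class; it only mirrors the method names from the slide on top of `java.util.HashSet`.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the in-place set operations; not the DEX Objects class.
class OidSet {
    private final Set<Long> oids = new HashSet<>();

    boolean add(long oid) {
        return oids.add(oid);
    }

    boolean exists(long oid) {
        return oids.contains(oid);
    }

    // Each operation modifies the receiver and returns its new size.
    long union(OidSet other) {
        oids.addAll(other.oids);
        return oids.size();
    }

    long intersection(OidSet other) {
        oids.retainAll(other.oids);
        return oids.size();
    }

    long difference(OidSet other) {
        oids.removeAll(other.oids);
        return oids.size();
    }
}
```

Because the receiver is overwritten, a caller that still needs the original set has to copy it before combining.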
  • 101. Query Database
    Retrieve data
    Objects Graph#select(int t)
    Retrieves object identifiers belonging to the given node or edge type.
    Objects Graph#select(long attr, short op, Value v)
    Retrieves object identifiers which satisfy the query.
    “op” can be: Graph#OPERATION_{EQ|NE|GT|GE|LT|LE|LIKE|ERE}
    long Graph#findObj(long attr, Value v)
    Randomly retrieves an object identifier which has the given value for the given attribute (or INVALID_OID if not found).
    Useful for unique attributes.
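A comparison-based select over an indexed attribute can be sketched with a sorted map. This is an illustration under assumptions, not the DEX implementation: `RangeSelect` and its `Op` enum are hypothetical, only five of the operators listed above are covered, and string values are compared lexicographically (so "10pm" would sort before "4pm" in this sketch).

```java
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch of select over an indexed attribute; not the DEX implementation.
class RangeSelect {
    enum Op { EQ, GT, GE, LT, LE }

    // Sorted index: attribute value -> OIDs holding that value.
    private final TreeMap<String, Set<Long>> index = new TreeMap<>();

    void put(long oid, String value) {
        index.computeIfAbsent(value, k -> new TreeSet<>()).add(oid);
    }

    Set<Long> select(Op op, String value) {
        Iterable<Set<Long>> buckets;
        switch (op) {
            case EQ: buckets = index.subMap(value, true, value, true).values(); break;
            case GT: buckets = index.tailMap(value, false).values(); break;
            case GE: buckets = index.tailMap(value, true).values(); break;
            case LT: buckets = index.headMap(value, false).values(); break;
            default: buckets = index.headMap(value, true).values(); break; // LE
        }
        Set<Long> result = new TreeSet<>();
        for (Set<Long> bucket : buckets) {
            result.addAll(bucket);
        }
        return result;
    }
}
```

With three edges holding "4pm", "5pm" and "6pm", a GE select on "5pm" returns the last two, which matches the query used in the examples of this deck.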
  • 102. Query Database
    Navigation
    Objects Graph#explode(long oid, int edgeType, short direction)
    Retrieves outgoing or ingoing edges (or both) from or to the given node identifier and for the given edge type.
    “direction” can be: Graph#EDGES_IN, Graph#EDGES_OUT, Graph#EDGES_BOTH.
    Objects Graph#neighbors(long oid, int edgeType, short direction)
    Retrieves neighbor nodes to the given node identifier which can be reached through the given edge type and direction.
    “direction” can be: Graph#EDGES_IN, Graph#EDGES_OUT, Graph#EDGES_BOTH.
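The difference between the two navigation calls is what they return: explode yields edge identifiers, neighbors yields node identifiers. A minimal sketch, assuming a single edge type and a plain edge list; the `Navigation` class and its `Direction` enum are hypothetical and only borrow the method names from the slide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; the method names mirror the slide but this is not DEX code.
class Navigation {
    static final class Edge {
        final long oid;
        final long tail;
        final long head;

        Edge(long oid, long tail, long head) {
            this.oid = oid;
            this.tail = tail;
            this.head = head;
        }
    }

    enum Direction { IN, OUT, BOTH }

    private final List<Edge> edges = new ArrayList<>();

    void addEdge(long oid, long tail, long head) {
        edges.add(new Edge(oid, tail, head));
    }

    // explode: EDGE identifiers touching the node in the given direction.
    Set<Long> explode(long node, Direction d) {
        Set<Long> result = new TreeSet<>();
        for (Edge e : edges) {
            boolean out = e.tail == node && d != Direction.IN;
            boolean in = e.head == node && d != Direction.OUT;
            if (out || in) {
                result.add(e.oid);
            }
        }
        return result;
    }

    // neighbors: NODE identifiers found at the other end of those edges.
    Set<Long> neighbors(long node, Direction d) {
        Set<Long> result = new TreeSet<>();
        for (Edge e : edges) {
            if (e.tail == node && d != Direction.IN) {
                result.add(e.head);
            }
            if (e.head == node && d != Direction.OUT) {
                result.add(e.tail);
            }
        }
        return result;
    }
}
```

In a multigraph the two results can differ in size: two parallel edges from one node to another explode into two edge identifiers but contribute a single neighbor.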
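  • 103. Query Database
    Retrieve data example

    DbGraph dbg = s.getDbGraph();
    // All objects of the PERSON node type
    Objects persons = dbg.select(person);
    Objects.Iterator it = persons.iterator();
    while (it.hasNext()) {
        long p = it.next();
        // "name" is the attribute identifier, so the local variable needs another name
        String personName = dbg.getAttribute(p, name).toString();
    }
    it.close();
    persons.close();

    [Figure: the example graph with JOHN (18), KELLY and MARY, FRIEND edges labeled 2000 and 1995, and phones edges labeled 4pm, 5pm and 6pm.]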
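  • 104. Query Database
    Navigation & Objects operations example

    // Edges whose "when" attribute is >= "5pm"
    Objects objs1 = dbg.select(when, Graph.OPERATION_GE, v.setString("5pm"));
    // objs1 = { e5, e6 }
    // Outgoing phones edges of p1
    Objects objs2 = dbg.explode(p1, phones, Graph.EDGES_OUT);
    // objs2 = { e4, e5 }
    // intersection() works in place: objs1 = objs1 ∩ objs2
    objs1.intersection(objs2);
    // objs1 = { e5, e6 } ∩ { e4, e5 } = { e5 }

    objs1.close();
    objs2.close();

    [Figure: the example graph with JOHN (18), KELLY and MARY, FRIEND edges labeled 2000 and 1995, and phones edges labeled 4pm, 5pm and 6pm.]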
  • Graph Algorithms
    Traversals
    class Traversal
    Implements Iterator<Long>: hasNext(), next().
    Traverses the graph from a given node identifier.
    Node and edge types can be filtered out.
    class TraversalBFS
    Uses a BFS (breadth-first search) algorithm.
    class TraversalDFS
    Uses a DFS (depth-first search) algorithm.
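The iterator-style traversal described above can be sketched in plain Java. `BfsTraversal` is a hypothetical class, not the DEX `TraversalBFS`: it assumes the allowed edges have already been reduced to a forward adjacency map, and it visits each reachable node once in breadth-first order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical sketch of a BFS traversal exposed as an iterator; not the DEX class.
class BfsTraversal implements Iterator<Long> {
    private final Map<Long, List<Long>> adjacency;
    private final ArrayDeque<Long> queue = new ArrayDeque<>();
    private final Set<Long> visited = new HashSet<>();

    BfsTraversal(Map<Long, List<Long>> adjacency, long source) {
        this.adjacency = adjacency;
        queue.add(source);
        visited.add(source);
    }

    @Override
    public boolean hasNext() {
        return !queue.isEmpty();
    }

    @Override
    public Long next() {
        if (queue.isEmpty()) {
            throw new NoSuchElementException();
        }
        long node = queue.poll();
        // Enqueue unvisited neighbors so they are returned after the current level.
        for (long neighbor : adjacency.getOrDefault(node, Collections.<Long>emptyList())) {
            if (visited.add(neighbor)) {
                queue.add(neighbor);
            }
        }
        return node;
    }

    // Helper: builds forward-only adjacency from {tail, head} pairs.
    static Map<Long, List<Long>> adjacencyOf(long[][] edges) {
        Map<Long, List<Long>> adj = new HashMap<>();
        for (long[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
        }
        return adj;
    }
}
```

Swapping the queue for a stack (push and pop at the same end) turns the same loop into a depth-first traversal.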
  • 113. Graph Algorithms
    Traversals example
    Graph graph = … // graph instance
    long idsource = … // source node
    TraversalBFS bfs = new TraversalBFS(graph, idsource);
    // Adding the allowed edge types for traversing the graph
    bfs.addEdge(graph.findType("street"), Algorithm.NAVIGATION_FORWARD);
    bfs.addEdge(graph.findType("road"), Algorithm.NAVIGATION_FORWARD);
    // Adding the allowed node types for traversing the graph
    bfs.addAllNodes();
    // Running the algorithm
    long idnode;
    while (bfs.hasNext()) {
        idnode = bfs.next();
    }
    // Closing the traversal
    bfs.close();
  • 114. Graph Algorithms
    Shortest paths
    class SinglePairShortestPath
    Finds the shortest path between two given node identifiers.
    Node and edge types can be filtered out.
    class SinglePairShortestPathBFS
    Uses a BFS traversal to find the shortest path.
    For unweighted edges.
    class SinglePairShortestPathDijkstra
    Uses Dijkstra's algorithm to find the shortest path.
    For weighted edges.
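For the weighted case, a textbook Dijkstra search can be sketched as follows. This is a generic version under stated assumptions, not the DEX `SinglePairShortestPathDijkstra` class: `DijkstraSketch` is a hypothetical name, nodes are numbered from 0, weights are non-negative, and only the cost of the path is returned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch of a weighted single-pair shortest path; not the DEX implementation.
class DijkstraSketch {
    // edges[i] = {tail, head, weight}; nodes are 0..nodeCount-1; weights must be >= 0.
    // Returns the cost of the cheapest path, or -1 if the destination is unreachable.
    static long shortestPathCost(int nodeCount, long[][] edges, int source, int destination) {
        List<List<long[]>> adj = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            adj.add(new ArrayList<>());
        }
        for (long[] e : edges) {
            adj.get((int) e[0]).add(new long[] {e[1], e[2]});
        }

        long[] dist = new long[nodeCount];
        Arrays.fill(dist, Long.MAX_VALUE);
        dist[source] = 0;

        // Queue entries are {node, distance}, ordered by distance.
        PriorityQueue<long[]> queue = new PriorityQueue<>((a, b) -> Long.compare(a[1], b[1]));
        queue.add(new long[] {source, 0});
        while (!queue.isEmpty()) {
            long[] top = queue.poll();
            int node = (int) top[0];
            if (top[1] > dist[node]) {
                continue; // stale entry, a shorter path was already found
            }
            if (node == destination) {
                return dist[node];
            }
            for (long[] out : adj.get(node)) {
                int next = (int) out[0];
                long candidate = dist[node] + out[1];
                if (candidate < dist[next]) {
                    dist[next] = candidate;
                    queue.add(new long[] {next, candidate});
                }
            }
        }
        return -1;
    }
}
```

When every weight is 1 the result equals the hop count a BFS would find, which is why the unweighted case can use the cheaper BFS variant.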
  • 115. Graph Algorithms
    ShortestPath example
    Graph graph = … // graph instance
    long idsource = … // source node
    long iddestination = … // destination node
    SinglePairShortestPathBFS sp =
        new SinglePairShortestPathBFS(graph, idsource, iddestination);
    // Adding the allowed edge types for traversing the graph
    sp.addEdge(graph.findType("street"), Algorithm.NAVIGATION_FORWARD);
    sp.addEdge(graph.findType("road"), Algorithm.NAVIGATION_UNDIRECTED);
    // Setting the maximum hops
    sp.setMaximumHops(7);
    // Running the algorithm
    sp.run();
    // Retrieving the generated results
    if (sp.existsShortestPath()) {
        long[] spAsNodes = sp.getPathAsNodes();
        long[] spAsEdges = sp.getPathAsEdges();
        int hopsDone = sp.getCost();
    }
    // Closing the instance
    sp.close();
  • 116. Graph Algorithms
    Connected components
    class Connectivity
    Retrieves the connected components of the given graph.
    Node and edge types can be filtered out.
    Connected components can be materialized into an attribute.
    class StrongConnectivityGabow
    Retrieves strongly connected components in a directed graph.
    Uses the Gabow algorithm.
    class WeakConnectivityDFS
    Retrieves weakly connected components in an undirected graph or in a directed graph (ignoring directions).
    Uses a DFS algorithm.
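Weak connectivity via DFS can be sketched in a few lines of plain Java. `WeakComponents` is a hypothetical class, not the DEX `WeakConnectivityDFS`; it assumes nodes numbered from 0 and returns one component label per node, which plays the role of the attribute the components could be materialized into.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of weak connectivity via DFS; not the DEX implementation.
class WeakComponents {
    // edges[i] = {tail, head}; directions are ignored.
    // Returns, for each node, the 0-based id of its component.
    static int[] label(int nodeCount, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            adj.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]); // add the reverse arc to ignore direction
        }

        int[] component = new int[nodeCount];
        Arrays.fill(component, -1);
        int next = 0;
        for (int start = 0; start < nodeCount; start++) {
            if (component[start] != -1) {
                continue; // already labeled by an earlier search
            }
            ArrayDeque<Integer> stack = new ArrayDeque<>();
            stack.push(start);
            component[start] = next;
            while (!stack.isEmpty()) {
                int node = stack.pop();
                for (int neighbor : adj.get(node)) {
                    if (component[neighbor] == -1) {
                        component[neighbor] = next;
                        stack.push(neighbor);
                    }
                }
            }
            next++;
        }
        return component;
    }
}
```

Strong connectivity needs a different algorithm (such as the Gabow one named above) because there the edge directions matter.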
  • Script loaders
    Schema definition
    CREATE DBGRAPH alias INTO filename
    CREATE NODE node_type_name "("
        [attribute_name
        (INT|LONG|DOUBLE|STRING|BOOLEAN|TIMESTAMP|TEXT)
        [INDEXED|UNIQUE|BASIC]
        , ...]
    ")"
    CREATE [UNDIRECTED|VIRTUAL] EDGE edge_type_name
        [FROM node_type_name[.attribute_name]
        TO node_type_name[.attribute_name]] "("
        [attribute_name
        (INT|LONG|DOUBLE|STRING|BOOLEAN|TIMESTAMP|TEXT)
        [INDEXED|UNIQUE|BASIC]
        , ...]
    ")" [MATERIALIZE NEIGHBORS]
  • 125. Script loaders
    Load nodes
    LOAD NODES file_name
    COLUMNS attribute_name [alias_name], …
    INTO node_type_name
    [IGNORE (attribute_name|alias_name), …]
    [FIELDS
    [TERMINATED char]
    [ENCLOSED char]
    [ALLOW_MULTILINE]]
    [FROM num]
    [MAX num]
    [MODE (ROWS|COLUMNS [SPLIT [PARTITIONS num]])]
  • 126. Script loaders
    Load edges
    LOAD EDGES file_name
    COLUMNS attribute_name [alias_name], …
    INTO edge_type_name
    [IGNORE (attribute_name|alias_name), …]
    WHERE
    TAIL (attribute_name|alias_name) = node_type_name.attribute_name
    HEAD (attribute_name|alias_name) = node_type_name.attribute_name
    [FIELDS
    [TERMINATED char]
    [ENCLOSED char]
    [ALLOW_MULTILINE]]
    [FROM num]
    [MAX num]
    [MODE (ROWS|COLUMNS [SPLIT [PARTITIONS num]])]
  • Tips & tricks
    • Index or not?
      • Attributes:
        • Attributes used in select operations must be indexed.
        • Optionally, index once the attribute has been created/loaded.
      • Neighbors:
        • It is recommended to index edge types used in neighbors operations.
  • Tips & tricks
    • Others:
      • DB cross-platform format.
        • 32 – 64 bits, OS independent.
        • Just consider platform endianness.
      • Read only mode.
      • Configuration:
        • edu.upc.dama.dex.utils.DEXConfig
        • Set the maximum memory usage.
        • 0 means unlimited.
      • License.
        • No license means evaluation version.
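The endianness remark above can be illustrated with the standard `java.nio` classes. This sketch says nothing about the DEX file format itself; `EndianCheck` is a hypothetical helper that only shows why the same eight bytes decode to a different number when the byte order is read the wrong way round.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper; it does not read or write DEX files.
class EndianCheck {
    // Serializes a 64-bit identifier using the given byte order.
    static byte[] encode(long oid, ByteOrder order) {
        return ByteBuffer.allocate(Long.BYTES).order(order).putLong(oid).array();
    }

    // Reads a 64-bit identifier back, interpreting the bytes in the given order.
    static long decode(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getLong();
    }
}
```

`ByteOrder.nativeOrder()` reports the order of the machine the JVM is running on, which is the value to compare when a file moves between platforms.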
  • Thanks for your attention
    Any questions?
    SPARSITY-TECHNOLOGIES
    Jordi Girona, 1-3, Edifici K2M 08034 Barcelona
    info@sparsity-technologies.com
    http://www.sparsity-technologies.com