Odessapy2013 - Graph databases and Python


Page 10: "Я из Одессы я просто бухаю." translates to "I'm from Odessa, I just drink," meaning he drinks a lot of vodka ^_^ (@tuc @hackernews)
This is a local meme, used when someone asks a question and you would look stupid if you had no answer.


  1. 1. Graph databases and Python. Maksym Klymyshyn, CTO @ GVMachines Inc. (zakaz.ua)
  2. 2. What’s inside? ‣ PostgreSQL ‣ Neo4j ‣ ArangoDB
  3. 3. Python Frameworks ‣ Bulbflow ‣ py2neo ‣ NetworkX ‣ Arango-python
  4. 4. Relational to Graph model crash course: “Switching from relational to the graph model” by Luca Garulli http://goo.gl/z08qwk http://www.slideshare.net/lvca/switching-from-relational-to-the-graph-model
  5. 5. My motivation is quite simple:
  6. 6. “The best material model of a cat is another, or preferably the same, cat.” –Norbert Wiener
  7. 7. Good old Postgres
  8. 8.
        create table nodes (
            node integer primary key,
            name varchar(10) not null,
            feat1 char(1),
            feat2 char(1));

        create table edges (
            a integer not null references nodes(node)
                on update cascade on delete cascade,
            b integer not null references nodes(node)
                on update cascade on delete cascade,
            primary key (a, b));

        create index a_idx on edges(a);
        create index b_idx on edges(b);

        -- each pair of nodes may be connected only once, in either direction
        create unique index pair_unique_idx
            on edges (LEAST(a, b), GREATEST(a, b));

        -- and no self-loops
        alter table edges add constraint no_self_loops_chk check (a <> b);

        insert into nodes values (1, 'node1', 'x', 'y');
        insert into nodes values (2, 'node2', 'x', 'w');
        insert into nodes values (3, 'node3', 'x', 'w');
        insert into nodes values (4, 'node4', 'z', 'w');
        insert into nodes values (5, 'node5', 'x', 'y');
        insert into nodes values (6, 'node6', 'x', 'z');
        insert into nodes values (7, 'node7', 'x', 'y');

        insert into edges values
            (1, 3), (2, 1), (2, 4), (3, 4), (3, 5),
            (3, 6), (4, 7), (5, 1), (5, 6), (6, 1);

        -- directed graph
        select * from nodes n left join edges e on n.node = e.b where e.a = 2;

        -- undirected graph
        select * from nodes where node in
            (select case when a=1 then b else a end from edges where 1 in (a,b));
  9. 9. I'm from Odessa, I just drink.
  10. 10. Neo4j
  11. 11. The most famous graph database. • 1,333 mentions within repositories on GitHub • 1,140,000 results in Google • 26,868 tweets • Really nice admin interface • Awesome help tips
  12. 12. A lot of Python libraries: py2neo, neomodel, neo4django, bulbflow
  13. 13.
        // Create node1, node2 and a
        // RELATED relationship between the two nodes
        CREATE (node1 {name: "node1"}),
               (node2 {name: "node2"}),
               (node1)-[:RELATED]->(node2);
  14. 14. Neo4j is friendly and powerful. The only drawback is its somewhat complex query language, Cypher.
  15. 15. py2neo nodes
        from py2neo import neo4j, node, rel

        graph_db = neo4j.GraphDatabaseService(
            "http://localhost:7474/db/data/")

        # rel() indices refer to the positions of the nodes in this call
        die_hard = graph_db.create(
            node(name="Bruce Willis"),
            node(name="John McClane"),
            node(name="Alan Rickman"),
            node(name="Hans Gruber"),
            node(name="Nakatomi Plaza"),
            rel(0, "PLAYS", 1),
            rel(2, "PLAYS", 3),
            rel(1, "VISITS", 4),
            rel(3, "STEALS_FROM", 4),
            rel(1, "KILLS", 3))
  16. 16. py2neo paths
        from py2neo import neo4j, node

        graph_db = neo4j.GraphDatabaseService(
            "http://localhost:7474/db/data/")
        alice, bob, carol = node(name="Alice"), node(name="Bob"), node(name="Carol")
        abc = neo4j.Path(
            alice, "KNOWS", bob, "KNOWS", carol)
        abc.create(graph_db)
        abc.nodes
        # [node(**{'name': 'Alice'}),
        #  node(**{'name': 'Bob'}),
        #  node(**{'name': 'Carol'})]
  17. 17. Alice KNOWS Bob KNOWS Carol
  18. 18. bulbflow framework
        from bulbs.neo4jserver import Graph

        g = Graph()
        james = g.vertices.create(name="James")
        julie = g.vertices.create(name="Julie")
        g.edges.create(james, "knows", julie)
  19. 19. FlockDB OrientDB InfoGrid HyperGraphDB WAT?
  20. 20. ArangoDB
  21. 21. “In any investment, you expect to have fun and make profit.” –Michael Jordan
  22. 22. I’m the developer of the Python driver for ArangoDB
  23. 23. • NoSQL database storage • Graph of documents • AQL (ArangoDB Query Language) to execute graph queries • Edge data type to create edges between nodes (with properties) • Multiple edge collections to keep different kinds of edges • Support for the Gremlin graph query language
  24. 24. A small experiment with graphs and Twitter: I looked at my tweets and the people who added them to favorites. After that I looked at those people's tweets and did the same thing with the people who favorited their tweets.
  25. 25. 1-level depth
  26. 26. 2-level depth
  27. 27. 3-level depth
  28. 28. Code behind
        from arango import create

        arango = create(db="tweets_maxmaxmaxmax")
        arango.database.create()
        arango.tweets.create()
        arango.tweets_edges.create(
            type=arango.COLLECTION_EDGES)
  29. 29. Here we create an edge from from_doc to to_doc
        from arango.aql import F, V

        from_doc = arango.tweets.documents.create({})
        to_doc = arango.tweets.documents.create({})
        arango.tweets_edges.edges.create(from_doc, to_doc)

        # Getting edges for tweet 196297127
        query = arango.tweets_edges.query.over(
            F.EDGES(
                "tweets_edges",
                ~V("tweets/196297127"),
                ~V("outbound")))
  30. 30. Full example • Sample dataset with 10 users • Relations between users • Visualise within admin interface
  31. 31. Sample dataset
        from arango import create

        def dataset(a):
            a.database.create()
            a.users.create()
            a.knows.create(type=a.COLLECTION_EDGES)

            # Ten users: user_0 .. user_9
            for u in range(10):
                a.users.documents.create({
                    "name": "user_{}".format(u),
                    "age": u + 20,
                    "gender": u % 2 == 0})

        a = create(db="experiments")
        dataset(a)
  32. 32. Relations between users
        from arango import create

        def relations(a):
            rels = (
                (0, 1), (0, 2), (2, 3), (4, 3), (3, 5), (5, 1),
                (0, 5), (5, 6), (6, 7), (7, 8), (9, 8))

            # Look up a user document by its generated name
            get_user = lambda id: a.users.query.filter(
                "obj.name == 'user_{}'".format(id)).execute().first

            for f, t in rels:
                what = "user_{} knows user_{}".format(f, t)
                from_doc, to_doc = get_user(f), get_user(t)
                a.knows.edges.create(from_doc, to_doc, {"what": what})
                print("{}->{}: {}".format(from_doc.id, to_doc.id, what))

        a = create(db="experiments")
        relations(a)
  33. 33. Relations between users
        users/2744664487->users/2744926631: user_0 knows user_1
        users/2744664487->users/2745123239: user_0 knows user_2
        users/2745123239->users/2745319847: user_2 knows user_3
        users/2745516455->users/2745319847: user_4 knows user_3
        users/2745319847->users/2745713063: user_3 knows user_5
        users/2745713063->users/2744926631: user_5 knows user_1
        users/2744664487->users/2745713063: user_0 knows user_5
        users/2745713063->users/2745909671: user_5 knows user_6
        users/2745909671->users/2746106279: user_6 knows user_7
        users/2746106279->users/2746302887: user_7 knows user_8
        users/2746499495->users/2746302887: user_9 knows user_8
  34. 34. AQL, getting paths
        FOR p IN PATHS(users, knows, 'outbound')
            FILTER p.source.name == 'user_5'
            RETURN p.vertices[*].name

        from arango import create
        from arango.aql import F, V

        def querying(a):
            # Same query as the AQL above, built with the driver's query API
            query = (
                a.knows.query.over(
                    F.PATHS("users", "knows", ~V("outbound")))
                .filter("obj.source.name == '{}'".format("user_5"))
                .result("obj.vertices[*].name"))
            for data in query.execute(wrapper=lambda c, i: i):
                print(data)

        a = create(db="experiments")
        querying(a)
  35. 35. Paths output
        ['user_5']
        ['user_5', 'user_1']
        ['user_5', 'user_6']
        ['user_5', 'user_6', 'user_7']
        ['user_5', 'user_6', 'user_7', 'user_8']
  36. 36. Links • Arango paths: http://goo.gl/n2L3SK • Neo4j: http://goo.gl/au5y9I • Scraper: http://goo.gl/nvMFGk • Visualiser: http://goo.gl/Rzdwci
  37. 37. Thanks. Questions? @maxmaxmaxmax
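
The directed and undirected lookups from slide 8 can be tried without a Postgres server. The sketch below is only an approximation: it swaps in Python's built-in sqlite3 module with an in-memory database, drops the feat1/feat2 columns and the indexes, and wraps the two queries in helper functions (`successors` and `neighbours` are names made up for this example).

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Postgres, for portability only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table nodes (node integer primary key, name text not null);
    create table edges (
        a integer not null references nodes(node),
        b integer not null references nodes(node),
        primary key (a, b),
        check (a <> b)  -- no self-loops
    );
""")
conn.executemany("insert into nodes values (?, ?)",
                 [(i, "node{}".format(i)) for i in range(1, 8)])
conn.executemany("insert into edges values (?, ?)",
                 [(1, 3), (2, 1), (2, 4), (3, 4), (3, 5),
                  (3, 6), (4, 7), (5, 1), (5, 6), (6, 1)])


def successors(node):
    """Directed view: nodes at the far end of edges leaving `node`."""
    rows = conn.execute(
        "select b from edges where a = ? order by b", (node,))
    return [r[0] for r in rows]


def neighbours(node):
    """Undirected view: nodes sharing an edge with `node`, either direction."""
    rows = conn.execute(
        "select case when a = ? then b else a end as other "
        "from edges where ? in (a, b) order by other", (node, node))
    return [r[0] for r in rows]


print(successors(2))   # edges that start at node 2
print(neighbours(1))   # edges that touch node 1
```

The `case when a = ? then b else a end` expression is the same trick as on the slide: whichever column holds the node we asked about, return the other one.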
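
The 1-, 2- and 3-level depth pictures on slides 25 to 27 come from repeatedly expanding "who favorited this person's tweets". A minimal sketch of that expansion as a depth-limited breadth-first walk is shown below. It is not the scraper from the talk: the `expand` function and the `favorited_by` data are hypothetical, and a plain dict stands in for the Twitter API.

```python
from collections import deque


def expand(start, favorited_by, max_depth):
    """Breadth-first walk: map each reached user to their depth from `start`."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        user = queue.popleft()
        if depth[user] == max_depth:
            continue  # do not look past the requested depth
        for fan in favorited_by.get(user, ()):
            if fan not in depth:  # first visit gives the shortest depth
                depth[fan] = depth[user] + 1
                queue.append(fan)
    return depth


# Hypothetical data: user -> people who favorited that user's tweets.
favorited_by = {
    "me": ["ann", "bob"],
    "ann": ["cat"],
    "bob": ["ann", "dan"],
    "cat": ["eve"],
}

for level in (1, 2, 3):
    print(level, sorted(expand("me", favorited_by, level)))
```

Each extra level only adds users that were not already reached, which is why the graph grows outward from the starting account.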
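
To see why the paths output on slide 35 has exactly five rows, the outbound path enumeration can be imitated in plain Python over the `rels` tuples from slide 32. This is a sketch of the idea only, not of how ArangoDB implements `PATHS`; the `outbound_paths` helper and its cycle guard are assumptions made for this example.

```python
# Edge list copied from the relations slide: (from_user, to_user).
RELS = ((0, 1), (0, 2), (2, 3), (4, 3), (3, 5), (5, 1),
        (0, 5), (5, 6), (6, 7), (7, 8), (9, 8))


def outbound_paths(start, rels):
    """Yield every outbound path from `start`, zero-length path first."""
    adjacency = {}
    for src, dst in rels:
        adjacency.setdefault(src, []).append(dst)

    def walk(path):
        yield path
        for nxt in adjacency.get(path[-1], []):
            if nxt not in path:  # assumption: never revisit a vertex
                yield from walk(path + [nxt])

    yield from walk([start])


for path in outbound_paths(5, RELS):
    print(["user_{}".format(n) for n in path])
```

Starting from user_5, the only outgoing edges lead to user_1 (a dead end) and to the chain user_6, user_7, user_8, so the walk yields the start vertex alone plus four longer paths.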

