NoSQL, Neo4J for Java Developers, OracleWeek-2012
Neo4j for Java developers. Presented at OracleWeek 2012.
Created by Eugene Hanikblum @ AlphaCSP

  • An undirected graph is one in which edges have no orientation: the edge (a, b) is identical to the edge (b, a). A directed graph, or digraph, is an ordered pair D = (V, A) of vertices V and directed arcs A. A pseudo graph is a graph with loops. A multi graph allows multiple edges between the same nodes. A hyper graph allows an edge to join more than two nodes.
  • Best used: for graph-style, rich or complex, interconnected data. Neo4j is quite different from the others in this sense. For example: social relations, public transport links, road maps, network topologies.

    1. Seminar: Big Data, NoSQL graph database for Java developers
       Presenter: Evgeny Hanikblum
    2. Data is getting bigger: “Every 2 days we create as much information as we did up to 2003” – Eric Schmidt, Google
    3. Big Data Technologies
    4. NoSQL Overview
    5. NoSQL -> Not Only SQL
    6. Key Value Stores
       • Most based on Dynamo: Amazon’s Highly Available Key-Value Store
       • Data Model:
         – Global key-value mapping
         – Big scalable HashMap
         – Highly fault tolerant (typically)
       • Projects: [logos on slide]
    7. Key Value Stores
       • Pros:
         – Simple data model
         – Scalable
       • Cons:
         – Create your own “foreign keys”
         – Poor for complex data
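The "create your own foreign keys" drawback can be illustrated with a minimal sketch that treats the store as the "big scalable HashMap" from the slide. The key layout (`user:<id>:name`, `user:<id>:friends`) and all names below are assumptions made for this illustration, not the API of any particular key-value product.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a key-value store reduced to one in-memory map.
class KeyValueSketch {
    // The whole "database": a single global key -> value mapping.
    static final Map<String, String> store = new HashMap<>();

    // Follows a hand-made "foreign key": the value under "user:<id>:friends"
    // is a comma-separated list of other user ids (an assumed convention).
    static List<String> friendNames(String userId) {
        List<String> names = new ArrayList<>();
        String friendIds = store.get("user:" + userId + ":friends");
        if (friendIds == null || friendIds.isEmpty()) {
            return names;
        }
        for (String friendId : friendIds.split(",")) {
            // One extra lookup per reference; the store itself knows nothing about the link.
            names.add(store.get("user:" + friendId + ":name"));
        }
        return names;
    }

    // Illustrative sample data only.
    static void seed() {
        store.put("user:1:name", "Alice");
        store.put("user:2:name", "Bob");
        store.put("user:3:name", "Carol");
        store.put("user:1:friends", "2,3");
    }

    public static void main(String[] args) {
        seed();
        System.out.println(friendNames("1"));
    }
}
```

Every hop from one record to another is an application-level convention plus another lookup, which is why the slide rates this model as poor for complex, connected data.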
    8. Column Databases
       • Most based on BigTable: Google’s Distributed Storage System for Structured Data
       • Data Model:
         – A big table, with column families
         – Map Reduce for querying/processing
       • Projects: [logos on slide]
    9. Column Databases
       • Pros:
         – Supports semi-structured data
         – Naturally indexed (columns)
         – Scalable
       • Cons:
         – Poor for interconnected data
    10. Document Databases
        • Data Model:
          – A collection of documents
          – A document is a key-value collection
          – Index-centric, lots of map-reduce
        • Projects: [logos on slide]
    11. Document Databases
        • Pros:
          – Simple, powerful data model
          – Scalable
        • Cons:
          – Poor for interconnected data
          – Query model limited to keys and indexes
          – Map reduce for larger queries
    12. Graph Databases
        • Data Model:
          – Nodes and Relationships
        • Projects: [logos on slide]
    13. Graph Databases
        • Pros:
          – Powerful data model, as general as RDBMS
          – Connected data locally indexed
          – Easy to query
        • Cons:
          – Sharding (lots of people working on this); scales UP reasonably well
          – Requires rewiring your brain
    14. Why do you need a GraphDB?
    15. GraphDB Overview: because data expanded into relationships
    16. GraphDB Overview: because data became interconnected
    17. When should I use it?
    18. Use a graph DB if you have to deal with something like this:
    19. or this …
    20. or this …
    21. GraphDB Overview: data is more connected
        • Text (content)
        • HyperText (added pointers)
        • RSS (joined those pointers)
        • Blogs (added pingbacks)
        • Tagging (grouped related data)
        • RDF (described connected data)
        • GGG (content + pointers + relationships + descriptions)
    22. GraphDB Overview: data is less structured
        • If you tried to collect all the data of every movie ever made, how would you model it?
        • Actors, Characters, Locations, Dates, Costs, Ratings, Showings, Ticket Sales, etc.
    23. What is a Graph?
    24. What is a Graph?
        • An abstract representation of a set of objects where some pairs are connected by links.
        • Object (Vertex, Node)
        • Link (Edge, Arc, Relationship)
    25. Different Kinds of Graphs
        • Undirected Graph
        • Directed Graph
        • Pseudo Graph
        • Multi Graph
        • Hyper Graph
    26. More Kinds of Graphs
        • Weighted Graph
        • Labeled Graph
        • Property Graph
    27. What is a Graph DB?
    28. What is a Graph DB?
        • A database with an explicit graph structure
        • Each node knows its adjacent nodes
        • As the number of nodes increases, the cost of a local step (or hop) remains the same
        • Plus an index for lookups
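The points on this slide can be sketched in plain Java: each node holds direct references to its neighbours, so one hop is a reference dereference regardless of how many nodes exist, and a separate index is only used to find a starting node. This is a toy in-memory model with hypothetical class and method names; it does not show how Neo4j stores data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory sketch of "each node knows its adjacent nodes".
class AdjacencySketch {
    static class Node {
        final String name;
        // Direct references to neighbours: following one does not depend on graph size.
        final List<Node> adjacent = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    // The index is only used to locate a starting node by name.
    private final Map<String, Node> index = new HashMap<>();

    Node create(String name) {
        Node node = new Node(name);
        index.put(name, node);
        return node;
    }

    Node lookup(String name) {
        return index.get(name);
    }

    static void connect(Node from, Node to) {
        from.adjacent.add(to);
    }

    // Illustrative sample data only: Alice -> Bob -> Carol.
    static AdjacencySketch sample() {
        AdjacencySketch graph = new AdjacencySketch();
        Node alice = graph.create("Alice");
        Node bob = graph.create("Bob");
        Node carol = graph.create("Carol");
        connect(alice, bob);
        connect(bob, carol);
        return graph;
    }

    public static void main(String[] args) {
        Node start = sample().lookup("Alice");          // index lookup
        Node twoHops = start.adjacent.get(0).adjacent.get(0); // two local hops
        System.out.println(twoHops.name);
    }
}
```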
    29. Compared to Relational Databases
        • Relational databases: optimized for aggregation
        • Graph databases: optimized for connections
    30. What is Neo4j?
    31. What is Neo4j?
        • A Java-based graph database
        • Property Graph
        • Full ACID (atomicity, consistency, isolation, durability)
        • High Availability (with Enterprise Edition)
        • 32 Billion Nodes, 32 Billion Relationships, 64 Billion Properties
        • Embedded Server
        • REST API
    32. What is Neo4j?
        • Both nodes and relationships can have metadata.
        • Integrated pattern-matching-based query language (“Cypher”).
        • The “Gremlin” graph traversal language can also be used.
        • Indexing of nodes and relationships (Lucene).
        • Nice self-contained web admin.
        • Advanced path-finding with multiple algorithms.
        • Optimized for reads.
        • Has transactions (in the Java API).
        • Scriptable in Groovy.
        • Online backup, advanced monitoring and High Availability are AGPL/commercially licensed.
    33. Neo4j is good for:
        • Highly connected data (social networks)
        • Recommendations (e-commerce)
        • Path finding (how do I know you?)
        • A* (least-cost path)
        • Data-first schema (bottom-up, but you still need to design)
    34. How do I know you?
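The "how do I know you?" question is a shortest-path search over KNOWS links. A minimal sketch using breadth-first search on a plain adjacency map is given below; the people, the map layout and the method names are hypothetical, and the links are treated as directed for simplicity. It illustrates the idea only and is not Neo4j's path-finding API.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: breadth-first search for the shortest chain of acquaintances.
class HowDoIKnowYou {
    // Returns the shortest chain from 'from' to 'to', or an empty list if none exists.
    static List<String> shortestPath(Map<String, List<String>> knows, String from, String to) {
        Map<String, String> cameFrom = new HashMap<>(); // visited person -> predecessor
        Deque<String> queue = new ArrayDeque<>();
        cameFrom.put(from, null);
        queue.add(from);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(to)) {
                // Walk the predecessor links back to the start.
                LinkedList<String> path = new LinkedList<>();
                for (String step = to; step != null; step = cameFrom.get(step)) {
                    path.addFirst(step);
                }
                return path;
            }
            for (String next : knows.getOrDefault(current, Collections.emptyList())) {
                if (!cameFrom.containsKey(next)) {
                    cameFrom.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }

    // Illustrative sample data only.
    static Map<String, List<String>> sample() {
        Map<String, List<String>> knows = new HashMap<>();
        knows.put("me", Arrays.asList("anna", "ben"));
        knows.put("anna", Arrays.asList("carl"));
        knows.put("carl", Arrays.asList("you"));
        return knows;
    }

    public static void main(String[] args) {
        System.out.println(shortestPath(sample(), "me", "you"));
    }
}
```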
    35. How can I get there?
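"How can I get there?" is the least-cost path question on a weighted graph. The sketch below uses Dijkstra's algorithm, which behaves like the A* search named on the earlier slide when no distance heuristic is available. The place names, costs and method names are hypothetical, and this is plain Java rather than Neo4j's built-in graph algorithms.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch: Dijkstra's algorithm on a weighted, directed road map.
class LeastCostRoute {
    static class Step {
        final String node;
        final int cost;

        Step(String node, int cost) {
            this.node = node;
            this.cost = cost;
        }
    }

    // Returns the minimal total cost from 'from' to 'to', or -1 if 'to' is unreachable.
    static int cheapest(Map<String, Map<String, Integer>> roads, String from, String to) {
        Map<String, Integer> best = new HashMap<>(); // cheapest known cost per place
        PriorityQueue<Step> queue = new PriorityQueue<>(Comparator.comparingInt((Step s) -> s.cost));
        best.put(from, 0);
        queue.add(new Step(from, 0));
        while (!queue.isEmpty()) {
            Step step = queue.poll();
            if (step.node.equals(to)) {
                return step.cost;
            }
            if (step.cost > best.get(step.node)) {
                continue; // stale queue entry, a cheaper route was already found
            }
            Map<String, Integer> outgoing = roads.getOrDefault(step.node, Collections.emptyMap());
            for (Map.Entry<String, Integer> road : outgoing.entrySet()) {
                int cost = step.cost + road.getValue();
                Integer known = best.get(road.getKey());
                if (known == null || cost < known) {
                    best.put(road.getKey(), cost);
                    queue.add(new Step(road.getKey(), cost));
                }
            }
        }
        return -1;
    }

    // Illustrative sample data only.
    static Map<String, Map<String, Integer>> sample() {
        Map<String, Integer> fromHome = new HashMap<>();
        fromHome.put("station", 4);
        fromHome.put("airport", 20);
        Map<String, Integer> fromStation = new HashMap<>();
        fromStation.put("airport", 10);
        Map<String, Map<String, Integer>> roads = new HashMap<>();
        roads.put("home", fromHome);
        roads.put("station", fromStation);
        return roads;
    }

    public static void main(String[] args) {
        System.out.println(cheapest(sample(), "home", "airport"));
    }
}
```

In the sample data the direct road costs 20 while the detour via the station costs 4 + 10, so the search prefers the detour.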
    36. If you’ve ever
        • Joined more than 7 tables together
        • Modeled a graph in a table
        • Written a recursive CTE
        • Tried to write some crazy stored procedure with multiple recursive self and inner joins
        You should use Neo4j
    37. Rewiring your brain
        Relational model (three tables):
        • Language: language_code, language_name, word_count
        • LanguageCountry: language_code, country_code, primary
        • Country: country_code, country_name, flag_uri
        Graph model:
        • Language (name, code, word_count) -[IS_SPOKEN_IN (as_primary)]-> Country (name, code, flag_uri)
    38. Rewiring your brain
        As a property on one node:
        • name: “Canada”, languages_spoken: “[ 'English', 'French' ]”
        As nodes and relationships:
        • language: “English” -[spoken_in]-> name: “USA”, name: “Canada”
        • language: “French” -[spoken_in]-> name: “France”
    39. Rewiring your brain
        One wide table:
        • Country: name, flag_uri, language_name, number_of_words, yes_in_language, no_in_language, currency_code
        Graph model:
        • Country (name, flag_uri) -[SPEAKS]-> Language (name, number_of_words, yes, no)
        • Currency (code, name)
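The shift these slides describe, from a join table to a relationship that carries its own properties, can be sketched in plain Java. The classes below are a hypothetical toy property graph, and the sample values (including the `as_primary` flags) are illustrative only; this is not the Neo4j API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy property graph: nodes and relationships both carry properties.
class LanguageGraphSketch {
    static class Node {
        final Map<String, Object> properties = new HashMap<>();
        final List<Relationship> outgoing = new ArrayList<>();
    }

    static class Relationship {
        final String type;
        final Node end;
        // Replaces the extra columns of a join table such as LanguageCountry.primary.
        final Map<String, Object> properties = new HashMap<>();

        Relationship(String type, Node end) {
            this.type = type;
            this.end = end;
        }
    }

    static Node node(String name) {
        Node node = new Node();
        node.properties.put("name", name);
        return node;
    }

    static Relationship relate(Node start, String type, Node end) {
        Relationship relationship = new Relationship(type, end);
        start.outgoing.add(relationship);
        return relationship;
    }

    // Countries where the language is spoken as a primary language: no join, just a walk.
    static List<String> primaryCountries(Node language) {
        List<String> names = new ArrayList<>();
        for (Relationship rel : language.outgoing) {
            if (rel.type.equals("IS_SPOKEN_IN") && Boolean.TRUE.equals(rel.properties.get("as_primary"))) {
                names.add((String) rel.end.properties.get("name"));
            }
        }
        return names;
    }

    // Illustrative sample data only.
    static Node sampleEnglish() {
        Node english = node("English");
        relate(english, "IS_SPOKEN_IN", node("USA")).properties.put("as_primary", true);
        relate(english, "IS_SPOKEN_IN", node("France")).properties.put("as_primary", false);
        return english;
    }

    public static void main(String[] args) {
        System.out.println(primaryCountries(sampleEnglish()));
    }
}
```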
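    40. Show me the code!
        // Neo4j 1.x embedded API. PresentationTypes is assumed to be an enum implementing RelationshipType.
        GraphDatabaseService graphDb = new EmbeddedGraphDatabase("var/neo4j");
        Transaction tx = graphDb.beginTx(); // writes must run inside a transaction
        try {
            Node david = graphDb.createNode();
            Node andreas = graphDb.createNode();
            david.setProperty("name", "David Montag");
            andreas.setProperty("name", "Andreas Kollegger");
            Relationship presentedWith =
                david.createRelationshipTo(andreas, PresentationTypes.PRESENTED_WITH);
            presentedWith.setProperty("date", System.currentTimeMillis());
            tx.success();
        } finally {
            tx.finish();
        }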
    41. Neo4j data browser
    42. Neo4j data browser
    43. Neoclipse
    44. console.neo4j.org
        Try it right now:
            start n=node(*) match n-[r:LOVES]->m return n, type(r), m
        Notice the two nodes in red; they are your result set.
    45. Spring-Data-Neo4J
    46. Spring-Data-Neo4J
        • Focus on Spring Data Neo4j
        • VMware is collaborating with Neo Technology, the company behind the Neo4j graph database.
        • Improved programming model: annotation-based programming model for applications with rich domain models
        • Cross-store persistence: extend an existing JPA application with NoSQL persistence
        • Tagging (grouped related data)
        • RDF (described connected data)
    47. Spring-Data-Neo4J: @NodeEntity
        @NodeEntity
        public class Actor {
            private String name;
            private int age;
            private HairColor hairColor;
            private transient String nickname;
        }
    48. Spring-Data-Neo4J
        @NodeEntity
        public class Movie {
            @GraphId Long id;
            @Indexed(type = FULLTEXT, indexName = "search") String title;
            Person director;
            @RelatedTo(type = "ACTS_IN", direction = INCOMING) Set<Person> actors;
            @RelatedToVia(type = "RATED") Iterable<Rating> ratings;
            @Query("start movie=node({self}) match movie-->genre<--similar return similar")
            Iterable<Movie> similarMovies;
        }
    49. Spring-Data-Neo4J: @RelationshipEntity
        @RelationshipEntity
        public class Role {
            @StartNode private Actor actor;
            @EndNode private Movie movie;
            private String roleName;
        }
    50. Spring-Data-Neo4J
        @RelationshipEntity
        public class Role {
            @StartNode private Actor actor;
            @EndNode private Movie movie;
            private String roleName;
        }
        @NodeEntity
        public class Actor {
            @RelatedToVia(type = "ACTS_IN")
            private Iterable<Role> roles;
        }
    51. How did they do that?
    52. NoSQL -> Graph DB -> Neo4J
        Lecturer: Evgeny Hanikblum @ AlphaCSP, OracleWeek 2012, Israel
        Email: evgenyh@alphacsp.com
