Graph Analytics for big data


Published on

-Graph Analytics Frameworks & Trends
-Graph Databases & Trends
-IBM System G
-IBM Graph Store
-IBM System G & Graph Store in Bluemix

Published in: Data & Analytics

  1. 1. By Priyabrata Dash email: Graph Analytics for Big Data (Current Trends & IBM System G)
  2. 2. Agenda
     • Graph Analytics Frameworks & Trends
     • Graph Databases & Trends
     • IBM System G
     • IBM Graph Store
     • IBM System G & Graph Store in Bluemix
     • Demo
     • Q & A
  3. 3. Graph Analytics Frameworks
     • Processing extremely large graphs has been, and remains, a challenge, but recent advances in Big Data technologies have made the task more practical.
     • There are two classes of systems to consider:
       - Graph databases, for OLTP workloads that need quick, low-latency access to small portions of the graph data.
       - Graph processing engines, for OLAP workloads that batch-process large portions of a graph.
  4. 4. Graph Processing Engines
     • Graph problems have become mainstream, and in response to the growing popularity of graph analysis a large number of specialized graph engines have emerged.
     • A key advantage of these specialized engines is that they provide a vertex-centric way of graph programming, which is intuitive for the developers of graph analytics applications.
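To make "vertex-centric" concrete, here is a minimal, single-machine sketch in plain Python. It does not use the API of Giraph, GraphX, or any other engine listed in this deck. The names `compute` and `run_vertex_centric` are hypothetical, and min-label propagation for connected components is only one illustrative choice of vertex program. The developer writes only the per-vertex function; the driver loop stands in for the supersteps an engine would run.

```python
from collections import defaultdict


def compute(value, messages):
    """Hypothetical per-vertex program: adopt the smallest label received.

    Returns (new_value, changed).
    """
    best = min(messages, default=value)
    if best < value:
        return best, True
    return value, False


def run_vertex_centric(edges):
    """Toy driver: run synchronous supersteps until no messages remain."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    # Every vertex starts labelled with its own id.
    values = {v: v for v in neighbors}

    # Superstep 0: every vertex sends its label to all of its neighbours.
    inbox = defaultdict(list)
    for v in neighbors:
        for n in neighbors[v]:
            inbox[n].append(values[v])

    # Later supersteps: only vertices that received messages do any work,
    # and only vertices whose label changed send new messages.
    while inbox:
        outbox = defaultdict(list)
        for v, messages in inbox.items():
            new_value, changed = compute(values[v], messages)
            if changed:
                values[v] = new_value
                for n in neighbors[v]:
                    outbox[n].append(new_value)
        inbox = outbox
    return values


# Vertices 1-2-3 form one component, 4-5 another.
components = run_vertex_centric([(1, 2), (2, 3), (4, 5)])
print(components)
```

Each vertex ends up labelled with the smallest vertex id in its connected component. A real engine would partition the vertices across machines and deliver the messages over the network between supersteps.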
  5. 5. Graph Processing Engines
     • Apache Giraph
     • Apache Hama
     • GraphX for Apache Spark
     • Faunus
     • Apache TinkerPop
     • Gelly for Apache Flink
     • Dendrite
     • IBM System G
     • and many more
  6. 6. Graph Databases
     • In computing, a graph database is a database that uses graph structures for semantic queries, with nodes, edges and properties to represent and store data.
     • Graph databases tend to be optimized for graph-based traversal algorithms.
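The node, edge, and property model and the traversal-style query can be illustrated with a small in-memory sketch. This is a toy under stated assumptions, not the storage layout or query API of any graph database named in this deck. The `PropertyGraph` class, its methods, and the sample data are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PropertyGraph:
    """Hypothetical in-memory property graph (illustration only)."""
    nodes: dict = field(default_factory=dict)      # node id -> properties
    out_edges: dict = field(default_factory=dict)  # node id -> [(label, target, properties)]

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props
        self.out_edges.setdefault(node_id, [])

    def add_edge(self, source, label, target, **props):
        self.out_edges[source].append((label, target, props))

    def traverse(self, start, label, max_hops):
        """Breadth-first walk along edges carrying the given label."""
        seen = {start}
        frontier = deque([(start, 0)])
        reached = []
        while frontier:
            node, hops = frontier.popleft()
            if hops == max_hops:
                continue  # do not expand beyond the hop limit
            for edge_label, target, _props in self.out_edges[node]:
                if edge_label == label and target not in seen:
                    seen.add(target)
                    reached.append(target)
                    frontier.append((target, hops + 1))
        return reached


g = PropertyGraph()
g.add_node("alice", kind="person")
g.add_node("bob", kind="person")
g.add_node("carol", kind="person")
g.add_node("acme", kind="company")
g.add_edge("alice", "knows", "bob", since=2014)
g.add_edge("bob", "knows", "carol", since=2015)
g.add_edge("alice", "works_at", "acme")

# Everyone reachable from alice within two "knows" hops.
friends_of_friends = g.traverse("alice", "knows", max_hops=2)
print(friends_of_friends)
```

The query touches only the edges around the starting node, which is the kind of small, low-latency access pattern that traversal-oriented stores aim to serve.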
  7. 7. Graph Databases
  8. 8. Graph Database – Where is the graph?
  9. 9. Graph Databases - Trends
  10. 10. Which Graph is Used?
  11. 11. Apache TinkerPop
      • The TinkerPop stack provides a foundation for building high-performance graph applications of any size.
      • It supports applications ranging from simple ones to those that handle trillion-edge graphs scaled across a cluster of computers.
  12. 12. IBM System G
  13. 13. A missing pillar for Big Data
  14. 14. What is IBM System G?
  15. 15. System G Graph Computing Tools
  16. 16. 5 Key Use Case Categories
  17. 17. System G Application Use Cases
  18. 18. System G Native Store
      • System G Native Store represents graphs in memory and on disk:
        - Organizing graph data so that the stored graph holds the graph structure together with vertex properties and edge properties
        - Caching graph data in memory, either in batch mode or on demand, from the on-disk streaming graph data
        - Persisting graph updates, along with their time stamps, from the in-memory graph to the on-disk graph
        - Performing graph queries by loading graph structure and/or property data
  19. 19. System G Native Store Solution
  20. 20. System G Native Store Overview
      • Native Store offers not only persistent graph storage but also sequential, concurrent, and distributed graph runtimes.
      • It also includes a set of C++ graph programming APIs, a CLI command set (gShell), a socket client, a socket client GUI, and a visualization toolkit.
  21. 21. TinkerPop & SPARQL Over Native Store
      • IBM System G has a JNI layer to translate the Native Store graph APIs into the TinkerPop APIs.
      • Therefore, Java graph applications built on top of TinkerPop Blueprints can be ported onto the IBM System G Native Store, and various open source tools can be integrated into IBM System G.
      • System G provides TinkerPop Blueprints interfaces to both its high-performance C++ implementations and its HBase-based GBase graphs.
      • Since Native Store provides the TinkerPop Blueprints interface via JNI, Gremlin can run on Native Store.
      • A Jena-based SPARQL query engine is installed on top of the System G Native Store.
  22. 22. IBM System G on Bluemix
      • IBM System G on Bluemix, the cloud version of IBM System G, aims to help users get started with IBM System G graph database technologies, analytics, visualizations and solutions by interacting with the system in an online setting.
      • IBM® Graph Data Store enables you to build and work with powerful applications, using a fully managed graph database service accessible through a REST-based HTTP API.
      • IBM Graph Data Store is an experimental service. Data held in the Graph Data Store is not necessarily backed up. In particular, it should not currently be used for high-volume, high-performance, or production applications.
  23. 23. Q & A
  24. 24. Thank You