The document discusses switching from a relational database model to a graph database model. It begins with an overview of common objections to adopting NoSQL databases due to complexity of data modeling. It then provides a brief introduction to key-value stores, column-based databases, and document databases before arguing that graph databases are well-suited to complex domains. The remainder of the document focuses on explaining the graph database model and how it differs from and improves upon the relational model for managing relationships through index-free adjacency rather than joins. Code examples are provided for creating a simple graph representing customer and address data using the OrientDB graph database.
Switching from the Relational to the Graph model
Luca Garulli
One of the main objections RDBMS users raise against moving to a NoSQL product is the complexity of the model: OK, NoSQL products are great for Big Data and Big Scale, but what about the model?
1. Switching from the Relational to the Graph model
Luca Garulli, Founder and CEO @NuvolaBase Ltd
Author of OrientDB Doc/Graph DB
Nov 23rd 2012 in Oxford, UK
(c) Luca Garulli Licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License Page 1
www.orientechnologies.com
2. One of the main objections RDBMS users have to moving to a NoSQL product is the complexity of the model:
OK, NoSQL products are great for Big Data and Big Scale, but...
3. ...what about the model?
4. What is the NoSQL answer to managing complex domains?
Key-Value stores?
Column-Based?
Document database?
Graph database!
5. CAUTION!
This presentation will not use a social-like domain with the classic friend-of-friend^N paradigm, where graph databases are already widely used...
6. ...But rather we will explore how to think «graphically» about one of the most common domains in the enterprise world:
The old, classic CRM* domain
* today an RDBMS is used in 99% of cases
7. Every developer knows the Relational Model, but who knows the Graph one?
8. Back to school: Graph Theory crash course
9. Basic Graph
[Diagram: Luca --Likes--> All Your Base Conference]
10. Property Graph Model*
[Diagram: Luca --Likes--> All Your Base Conference]
Luca (vertex): name: Luca, surname: Garulli, company: NuvolaBase
Likes (edge): since: 2012
All Your Base Conference (vertex): date: Nov 23 2012
Edges are directed
Vertices and Edges can have properties
* https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model
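As a rough illustration of the property graph model on this slide, here is a minimal plain-Python sketch. It is not OrientDB or Blueprints code: the `Vertex` and `Edge` classes and the `out_vertex` / `in_vertex` field names are assumptions made for this example, and the `name` property on the conference vertex is added only so it can be printed.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """A vertex carrying a free-form bag of properties."""
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A directed, labelled edge going from out_vertex to in_vertex."""
    label: str
    out_vertex: Vertex  # where the edge starts
    in_vertex: Vertex   # where the edge points
    properties: dict = field(default_factory=dict)


# Vertices and their properties, taken from the slide.
luca = Vertex({"name": "Luca", "surname": "Garulli", "company": "NuvolaBase"})
# The "name" key is an assumption; the slide only shows the "date" property.
conference = Vertex({"name": "All Your Base Conference", "date": "Nov 23 2012"})

# The edge has a direction (Luca -> conference) and its own property.
likes = Edge("Likes", luca, conference, {"since": 2012})

print(likes.out_vertex.properties["name"], likes.label,
      likes.in_vertex.properties["name"], likes.properties)
```

The point of the sketch is only that an edge is a first-class record with a direction, a label and properties, rather than a foreign key column.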
11. Property Graph Model
[Diagram: Luca --Likes (since: 2012)--> All Your Base Conference; Luca --Speaks (title: «Switching...», abstract: «This talk presents...»)--> All Your Base Conference]
An Edge connects 2 vertices: use multiple edges to represent 1-N and N-M relationships
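The idea that 1-N and N-M relationships are just several edges can be sketched with a small adjacency structure. This is an illustrative assumption, not the OrientDB API: `add_edge` and `out` are hypothetical helpers, and the "Another Conference" vertex and John's `Likes` edge are made-up data.

```python
from collections import defaultdict

# Adjacency: vertex name -> list of (edge label, edge properties, target vertex name).
out_edges = defaultdict(list)


def add_edge(source, label, target, **properties):
    """Add one directed edge; calling it again adds another edge, it never overwrites."""
    out_edges[source].append((label, properties, target))


def out(source, label):
    """Return the targets reached from source over edges with the given label."""
    return [target for (lbl, _props, target) in out_edges[source] if lbl == label]


# Two edges between the same pair of vertices, as on the slide.
add_edge("Luca", "Likes", "All Your Base Conference", since=2012)
add_edge("Luca", "Speaks", "All Your Base Conference", title="Switching...")

# 1-N: one vertex with several outgoing edges of the same label (hypothetical data).
add_edge("Luca", "Likes", "Another Conference")

# N-M: a second vertex pointing at the same target (hypothetical data).
add_edge("John", "Likes", "All Your Base Conference")

print(out("Luca", "Likes"))
print(out("John", "Likes"))
```

No extra table or schema change is needed to go from 1-1 to 1-N or N-M here; each new relationship is one more edge.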
12. Property Graph Model
[Diagram: a property graph with the vertices Luca, John, Oxford and All Your Base Conference, connected by edges labelled Studies, Likes, located, FriendOf and Organizes]
13. Congratulations, this is your diploma in «Graph Theory»
14. Now back to our domain: the CRM
15. Domain: the super minimal CRM
[Diagram: Registry system: Customer, Address; Order system: Order, Stock]
16. Domain: the super minimal CRM
Registry system: Customer, Address
Order system: Order, Stock
How does a Relational DBMS manage relationships?
17. Relational World: 1-1 Relationships

Customer (primary key: Id, foreign key: Address)
Id  Name   Address
10  Luca   34
11  Mike   44
34  John   54
56  Mark   66
88  Steve  68

Address (primary key: Id)
Id  Location
34  Rome
44  London
54  Oxford
66  New Mexico
68  Palo Alto

JOIN Customer.Address -> Address.Id
18. Relational World: 1-N Relationships

Customer
Id  Name
10  Luca
11  Mike
34  John
56  Mark
88  Steve

Address
Id  Customer  Location
24  10        Rome
33  10        London
44  34        Oxford
66  56        Cologne
68  88        Palo Alto

Inverse JOIN Address.Customer -> Customer.Id
19. Relational World: N-M Relationships

Customer
Id  Name
10  Luca
11  Mike
34  John
56  Mark
88  Steve

CustomerAddress
Id  Address
10  24
10  33
11  44

Address
Id  Location
24  Rome
33  London
44  Oxford
66  Cologne
68  Palo Alto

Additional table with 2 JOINs
(1) CustomerAddress.Id -> Customer.Id and
(2) CustomerAddress.Address -> Address.Id
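To make the two JOINs concrete, here is a rough Java sketch that resolves a customer's addresses through the additional table. It is not how a real RDBMS executes a query: the tables are plain in-memory maps, and the names JoinSketch and locationsOf are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory copy of the three tables; names are illustrative only.
final class JoinSketch {
    // Primary-key indexes. TreeMap is a balanced tree, so get() costs O(log N).
    static final TreeMap<Integer, String> CUSTOMER = new TreeMap<>();
    static final TreeMap<Integer, String> ADDRESS = new TreeMap<>();
    // Rows of the join table: {customer id, address id}.
    static final int[][] CUSTOMER_ADDRESS = {{10, 24}, {10, 33}, {11, 44}};

    static {
        CUSTOMER.put(10, "Luca");
        CUSTOMER.put(11, "Mike");
        CUSTOMER.put(34, "John");
        CUSTOMER.put(56, "Mark");
        CUSTOMER.put(88, "Steve");
        ADDRESS.put(24, "Rome");
        ADDRESS.put(33, "London");
        ADDRESS.put(44, "Oxford");
        ADDRESS.put(66, "Cologne");
        ADDRESS.put(68, "Palo Alto");
    }

    // Locations of every address linked to the named customer.
    static List<String> locationsOf(String customerName) {
        List<String> locations = new ArrayList<>();
        for (Map.Entry<Integer, String> customer : CUSTOMER.entrySet()) {
            if (!customer.getValue().equals(customerName)) {
                continue;
            }
            // JOIN (1): CustomerAddress.Id -> Customer.Id (a plain scan in this sketch)
            for (int[] row : CUSTOMER_ADDRESS) {
                if (row[0] == customer.getKey()) {
                    // JOIN (2): CustomerAddress.Address -> Address.Id (index lookup)
                    locations.add(ADDRESS.get(row[1]));
                }
            }
        }
        return locations;
    }

    public static void main(String[] args) {
        System.out.println(locationsOf("Luca"));
    }
}
```

Every address reached costs one more key search in another table, which is the point the next slides build on.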
21. What’s wrong with the
Relational Model?
22. The JOIN is evil!

Customer
Id  Name
10  Luca
11  Mike
34  John
56  Mark
88  Steve

CustomerAddress
Id  Address
10  24
10  33
34  24

Address
Id  Location
24  Rome
33  London
44  Oxford
66  Cologne
68  Palo Alto

These are all JOINs executed
every time you traverse a
relationship!
23. A JOIN means searching for a key in
another table
The first rule to improve performance
is indexing all the keys
An index speeds up searches, but slows down
inserts, updates and deletes
24. So in the best case a JOIN is a lookup
into an index
This is done for every single JOIN!
If you traverse hundreds of relationships
you’re executing hundreds of JOINs
25. Index Lookup:
is it really that fast?
26. Index Lookup: how does it work?
A-Z -> A-L | M-Z
Think of an
Address Book
where we have to find
Luca’s phone
number
27. Index Lookup: how does it work?
A-Z -> A-L | M-Z
A-L -> A-D | E-L        M-Z -> M-R | S-Z
Most index algorithms are
similar and based on
balanced trees
28. Index Lookup: how does it work?
A-Z -> A-L | M-Z
A-L -> A-D | E-L        M-Z -> M-R | S-Z
A-D -> A-B | C-D        E-L -> E-G | H-L
29. Index Lookup: how does it work?
A-Z -> A-L | M-Z
A-L -> A-D | E-L        M-Z -> M-R | S-Z
A-D -> A-B | C-D        E-L -> E-G | H-L
E-G -> E-F | G          H-L -> H-J | K-L
30. Index Lookup: how does it work?
A-Z -> A-L | M-Z
A-L -> A-D | E-L        M-Z -> M-R | S-Z
A-D -> A-B | C-D        E-L -> E-G | H-L
E-G -> E-F | G          H-L -> H-J | K-L
K-L -> Luca
Found!
This lookup took 5 steps, and the number of steps
grows with the index size!
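A quick way to see the growth is to count the probes of a binary search, used here as a stand-in for a balanced-tree index. This is a sketch under that assumption, not the algorithm of any specific DBMS; IndexLookupSketch and its methods are hypothetical names.

```java
// Hypothetical stand-in for a balanced-tree index: binary search over sorted keys.
final class IndexLookupSketch {

    // Number of probes needed to find the key, or -1 if the key is absent.
    static int lookupSteps(int[] sortedKeys, int key) {
        int lo = 0;
        int hi = sortedKeys.length - 1;
        int steps = 0;
        while (lo <= hi) {
            steps++;
            int mid = (lo + hi) >>> 1;
            if (sortedKeys[mid] == key) {
                return steps;
            }
            if (sortedKeys[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return -1;
    }

    // Worst-case probes for an index holding n >= 1 keys: floor(log2 n) + 1.
    static int worstCaseSteps(int n) {
        return 32 - Integer.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1_000, 1_000_000, 1_000_000_000}) {
            System.out.println(n + " keys -> up to " + worstCaseSteps(n) + " steps per lookup");
        }
    }
}
```

A thousand keys need up to 10 steps, a million up to 20, a billion up to 30: the cost per lookup grows slowly, but it is paid again for every JOIN.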
31. An index lookup is executed
for each JOIN
Querying more tables can easily
produce millions of JOINs/Lookups!
Here is the rule: more entries
= more lookup steps = slower JOIN
32. Oh! This is why
the performance of my database
drops as
it becomes bigger,
and bigger,
and bigger!
33. Is there a better way to
manage relationships?
34. “A graph database is any
storage system
that provides
index-free adjacency”
- Marko Rodriguez
35. How does a GraphDB manage
index-free relationships?
36. OrientDB: an Open Source (Apache licensed)
document-graph NoSQL DBMS
supports: transactions, extended SQL,
Multi-Master replication, etc.
37. OrientDB: traverse a relationship
The Record ID (RID)
is the physical position
Vertex Luca, RID = #13:35: label: ‘Customer’, name: ‘Luca’, out: [#14:54]
Edge Lives, RID = #14:54: label: ‘Lives’, out: [#13:35], in: [#13:100]
Vertex Rome, RID = #13:100: label: ‘Address’, name: ‘Rome’, in: [#14:54]
38. A GraphDB handles a relationship as a
physical LINK to the record,
assigned when the edge is created
An RDBMS, on the other hand, computes the
relationship every time you query the database
Isn’t that crazy?!
39. This means jumping from an
O(log N) algorithm to a near O(1) one:
the traversal cost is no longer affected
by database size!
This is huge in the Big Data age
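The idea of a link as a physical position can be sketched in Java as follows. This is an assumption-laden toy, not OrientDB's storage engine: LinkSketch, its clusters and its methods are hypothetical, and RIDs are assigned in insertion order, so they do not match the numbers on the slides.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical storage layout inspired by the RID idea; not OrientDB's real engine.
final class LinkSketch {

    // Record ID: cluster number plus position inside that cluster.
    static final class Rid {
        final int cluster;
        final int position;

        Rid(int cluster, int position) {
            this.cluster = cluster;
            this.position = position;
        }

        @Override
        public String toString() {
            return "#" + cluster + ":" + position;
        }
    }

    static final class Record {
        final Map<String, Object> fields = new HashMap<>();
        final List<Rid> out = new ArrayList<>();
        final List<Rid> in = new ArrayList<>();
    }

    private final List<List<Record>> clusters = new ArrayList<>();

    // Append a record to a cluster; its RID is simply where it landed.
    Rid store(int cluster, Record record) {
        while (clusters.size() <= cluster) {
            clusters.add(new ArrayList<>());
        }
        List<Record> target = clusters.get(cluster);
        target.add(record);
        return new Rid(cluster, target.size() - 1);
    }

    // Loading by RID is two positional reads: no key search, whatever the size.
    Record load(Rid rid) {
        return clusters.get(rid.cluster).get(rid.position);
    }

    Rid createVertex(int cluster, String name) {
        Record vertex = new Record();
        vertex.fields.put("name", name);
        return store(cluster, vertex);
    }

    // The edge is a record too; the links are written once, at creation time.
    Rid createEdge(int cluster, String label, Rid from, Rid to) {
        Record edge = new Record();
        edge.fields.put("label", label);
        edge.out.add(from);
        edge.in.add(to);
        Rid edgeRid = store(cluster, edge);
        load(from).out.add(edgeRid);
        load(to).in.add(edgeRid);
        return edgeRid;
    }

    // vertex -> its outgoing edges -> the vertices those edges point to
    List<Record> outVertices(Rid vertex) {
        List<Record> result = new ArrayList<>();
        for (Rid edgeRid : load(vertex).out) {
            for (Rid targetRid : load(edgeRid).in) {
                result.add(load(targetRid));
            }
        }
        return result;
    }
}
```

Crossing from Luca to Rome costs the same handful of positional reads whether the database holds ten records or ten billion, which is where the near O(1) comes from.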
40. OrientDB in the Blueprints micro-benchmark,
on common hardware, with a hot cache,
traverses 29.6 million
records in less than 5 seconds:
about 6 million nodes traversed per second!
Do not try this at home
with an RDBMS*!
*unless you live in Google’s server farm
41. Create the graph in SQL
$luca> cd bin
$luca> ./console.sh
OrientDB console v.1.3.0-SNAPSHOT (www.orientdb.org)
Type 'help' to display all the commands supported.
orientdb> create vertex Customer set name = 'Luca'
Created vertex #13:35 in 0.03 secs
orientdb> create vertex Address set name = 'Rome'
Created vertex #13:100 in 0.02 secs
orientdb> create edge Lives from #13:35 to #13:100
Created edge #14:54 in 0.02 secs
42. Create the graph in Java
OGraphDatabase graph = new OGraphDatabase("local:/tmp/db/graph");
// Two vertices, each with a "name" property
ODocument luca = graph.createVertex("Customer");
luca.field("name", "Luca");
ODocument rome = graph.createVertex("Address");
rome.field("name", "Rome");
// The "Lives" edge goes from Luca to Rome
ODocument edge = graph.createEdge(luca, rome, "Lives");
edge.save();
graph.close();
43. Query the graph in SQL
orientdb> select in.out from Address where name = 'Rome'
---+------+---------+--------------------+--------------------+--------+
  #| RID  |@class   |label               |out                 |in      |
---+------+---------+--------------------+--------------------+--------+
  0| 13:35|Customer |Luca                |[#14:54]            |        |
---+------+---------+--------------------+--------------------+--------+
1 item(s) found. Query executed in 0.007 sec(s).
Incoming vertices
44. More on query power
orientdb> select sum( orders.total ) from Customer
where name = 'Luca'
orientdb> traverse friend from Customer while $depth <= 7
orientdb> select from (
traverse friend from Customer while $depth <= 7
) where city.name = 'Oxford'
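What a depth-limited traverse does can be approximated in plain Java. This sketch walks "friend" links breadth-first and stops after a maximum number of hops; it is only an illustration of the idea, it does not claim to reproduce the strategy OrientDB uses internally, and TraverseSketch is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical depth-limited walk over "friend" links; not OrientDB's TRAVERSE code.
final class TraverseSketch {

    // Everyone reachable from start in at most maxDepth hops, each visited once.
    static Set<String> traverse(Map<String, List<String>> friends, String start, int maxDepth) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> frontier = new ArrayDeque<>();
        visited.add(start); // the starting record sits at depth 0
        frontier.add(start);
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            Deque<String> next = new ArrayDeque<>();
            for (String person : frontier) {
                for (String friend : friends.getOrDefault(person, Collections.emptyList())) {
                    if (visited.add(friend)) { // false when already seen: cycles are safe
                        next.add(friend);
                    }
                }
            }
            frontier = next;
        }
        return visited;
    }
}
```

Filtering the returned set afterwards plays the role of the outer select around the traverse.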
45. Query vs traversal
Once you have a well-connected database
in the form of a Super Graph, you can
cross records instead of querying them!
All you need is a few root vertices
from which to start traversing
46. Query vs traversal
Root vertices: Special Customers, Customers, Stocks
(callout: This is a root vertex)
Customers: Luca, John, Sylvia
Stocks: White Soap
Orders: Order 2332, Order 8834
47. This is your database
48. Get the last customer who bought Whisky
select last(orders.customers) from Stock
where name = 'Whisky'
49. Get the customer's country
select city.country from #34:22
50. Get orders from that country
select orders from #55:12
51. NuvolaBase.com
HTTP/REST
The first Graph Database as a Service
on the Cloud
52. Do we have enough time for a demo?
53. Questions & (maybe) Answers
Luca Garulli
CEO at
Document-Graph NoSQL
Open Source project
Ltd, London UK
www.twitter.com/lgarulli
54. Summary
1) JOIN is heavy, especially on large databases
2) GraphDB uses LINKs as
direct pointers to records:
traversal times go from O(log N) to near O(1)
3) GraphDB has a query language specialized to
traverse relationships
55. Let’s move like a
Spider
on the web