The document discusses OrientDB's transition from a master-slave architecture to a new multi-master distributed architecture. In the new architecture any node can read and write, the cluster scales horizontally by adding nodes, and the write conflicts that multi-master replication introduces are resolved automatically. It is planned for release in OrientDB version 1.0 in December 2011.
OrientDB distributed architecture 1.1
1. rev 1.1
Distributed architecture
with a Multi-Master approach
Available in version 1.0
(planned for December 2011)
www.orientechnologies.com Licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License Page 1 of 41
2. Where is the previous OrientDB Master/Slave architecture?
4. After the first tests we decided to throw away the old Master-Slave architecture because it went against the OrientDB philosophy: it doesn't scale and it's hard to configure properly
5. So what's next?
We've redesigned the entire distributed architecture to work as Multi-Master*, to be released in version 1.0 (December 2011)
* http://en.wikipedia.org/wiki/Multi-master_replication
6. In the Multi-Master architecture
any node can read/write to the database
this scales horizontally
adding nodes is straightforward
Say wow!
8. ...but you have to fight with conflicts
9. Fortunately we found some smart ways to resolve conflicts without falling into a Blood Bath
10. The actors
Leader Node: only one per cluster. It checks the other nodes and notifies changes to the Peer Nodes. It can be any server node in the cluster, usually the first to start.
Peer Node: any server node in the cluster. It has a permanent connection to the Leader Node.
Client: clients are connected to Server Nodes, no matter if Leader or Peer.
Database: where data are stored.
Synchronous mode replication: the server node propagates changes, waits for the response from the remote server, then sends the ACK to the client.
Asynchronous mode replication: the server node propagates changes and sends the ACK to the client without waiting for the response from the remote server.
11. How is the cluster of nodes composed and managed?
12. Cluster auto-discovering
At start-up each Server Node sends an IP Multicast message to discover whether a Leader Node is available so that it can join the cluster. If one is available, the Leader Node connects to the new node, which becomes a Peer Node; otherwise the new node becomes the Leader Node.
[Diagram: Server #1 (Leader) and Server #2 (Peer), each with its own databases]
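The join-or-lead decision described on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not OrientDB's implementation: the IP Multicast exchange is replaced by a plain function argument, and `ServerNode`, `start_node` and `discovered_leader` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ServerNode:
    node_id: str                      # "<ip-address>:<port>"
    role: str = "unknown"             # becomes "leader" or "peer"
    peers: List[str] = field(default_factory=list)


def start_node(node: ServerNode, discovered_leader: Optional[ServerNode]) -> ServerNode:
    """Decide the role of a starting node and return the cluster's Leader.

    `discovered_leader` stands in for the answer to the multicast
    discovery message (an assumption): None means no Leader replied.
    """
    if discovered_leader is None:
        node.role = "leader"          # nobody answered: this node leads
        return node
    node.role = "peer"                # a Leader exists: join its cluster
    discovered_leader.peers.append(node.node_id)
    return discovered_leader


first = ServerNode("192.168.0.10:2424")
leader = start_node(first, None)      # first node to start becomes Leader
second = ServerNode("192.168.0.11:2424")
start_node(second, leader)            # later nodes join as Peers
```

The first node finds no Leader and takes the role itself; the second node finds one and registers as a Peer.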
13. One Leader Multiple Peers
The first node to start is always the Leader, but in case of failure any other node can be elected. The Leader Node polls all the servers to verify their status and alerts all the Peer Nodes at every change in the cluster composition.
[Diagram: Server #1 (Leader) with Server #2 (Peer) and Server #3 (Peer), each with its own databases]
14. Asymmetric clustering
Each database can be clustered across multiple server nodes. Databases can be moved across servers. The replication strategy has per-database/server granularity.
This means Server #2 could replicate database B asynchronously to Server #3 and database A synchronously to Server #1.
[Diagram: databases A, B and C distributed across Server #1 (Leader), Server #2 (Peer) and Server #3 (Peer)]
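One way to picture per-database/server granularity is a lookup table keyed by server and database. The sketch below only encodes the example from this slide; the table layout, the `"sync"`/`"async"` labels and the `replication_targets` helper are assumptions made for illustration and do not reflect OrientDB's configuration format.

```python
from typing import Dict, List, Tuple

# Hypothetical replication table:
# (source server, database) -> list of (target server, mode).
REPLICATION: Dict[Tuple[str, str], List[Tuple[str, str]]] = {
    ("Server #2", "B"): [("Server #3", "async")],
    ("Server #2", "A"): [("Server #1", "sync")],
}


def replication_targets(server: str, database: str) -> List[Tuple[str, str]]:
    """Return the (target, mode) pairs configured for one database on one server."""
    return REPLICATION.get((server, database), [])
```

Because the key includes both the server and the database, two databases hosted on the same server can follow different strategies.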
15. Distributed configuration
The cluster configuration is broadcast from the Leader Node to all the Peer Nodes.
Peer Nodes broadcast it to all the connected clients.
Everybody knows who has the database
[Diagram: Clients #1, #2 and #3 connected to a cluster made of Server #1 (Leader), Server #2 (Peer) and Server #3 (Peer)]
16. Security
To join a cluster, the Server Node must be configured with the cluster name and password
Broadcast messages are encrypted using the password
The password doesn't cross the network: it's stored in the configuration file
[Diagram: Server #2 (Peer) joins the cluster of Server #1 (Leader) ONLY if it knows the name and password]
17. Leader election
Each Peer Node continuously checks the connection with the Leader Node
If the connection is lost, the Peer Node tries to elect itself as the new Leader Node
A split network is resolved using a simple algorithm
[Diagram: Server #1 (192.168.0.10:2424) and Server #2 (192.168.10.27:2424) both act as Leader. Server #1 takes the leadership because it has the lower ID]
ID = <ip-address>:<port>
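The "lower ID wins" rule can be sketched in a few lines. The slide does not say how two IDs are compared, so the numeric comparison of IPv4 octets and port below is an assumption, and `node_key` and `resolve_leader` are hypothetical helper names.

```python
from typing import Iterable, Tuple


def node_key(node_id: str) -> Tuple[Tuple[int, ...], int]:
    """Turn "<ip-address>:<port>" into a sortable (octets, port) key."""
    ip, port = node_id.rsplit(":", 1)
    return tuple(int(octet) for octet in ip.split(".")), int(port)


def resolve_leader(candidates: Iterable[str]) -> str:
    """Among nodes that each believe they are Leader, the lowest ID wins."""
    return min(candidates, key=node_key)


# The two Leaders from the slide's split-network example.
winner = resolve_leader(["192.168.10.27:2424", "192.168.0.10:2424"])
```

With the two addresses from the slide, the node at 192.168.0.10:2424 keeps the leadership and the other one steps down to Peer.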
18. Multiple clusters
Multiple separate clusters can coexist in the same network
Clusters can't see each other: they are separate boxes
What identifies a cluster is name + password
[Diagram: Cluster 'A', password 'aaa', and Cluster 'B', password 'bbb', each with Server #1 (Leader), Server #2 (Peer) and Server #3 (Peer)]
19. Fail-over
Clients know about the other nodes, so they transparently switch to good servers. No error is sent to the client app.
Running transactions will be repeated transparently too (v1.2)
[Diagram: Clients #1 to #4 connected to Server #1 (DB-1) and Server #2 (DB-2)]
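Transparent fail-over on the client side can be illustrated with a loop over the known servers. This is only a sketch: the `send` callable, `fake_send` and `NoServerAvailable` are hypothetical, and a real client would also refresh its server list from the cluster configuration.

```python
from typing import Callable, List


class NoServerAvailable(Exception):
    """Raised only when every known server has failed."""


def send_with_failover(servers: List[str],
                       send: Callable[[str, str], str],
                       request: str) -> str:
    """Try each known server in turn and return the first successful reply."""
    for server in servers:
        try:
            return send(server, request)
        except ConnectionError:
            continue                  # switch silently to the next server
    raise NoServerAvailable(request)


# Toy transport (assumption): Server #1 is down, Server #2 answers.
def fake_send(server: str, request: str) -> str:
    if server == "Server #1":
        raise ConnectionError(server)
    return f"{server} handled {request}"


reply = send_with_failover(["Server #1", "Server #2"], fake_send, "read #10:20")
```

The application only sees the reply; the failed attempt against Server #1 never surfaces as an error.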
20. How does replication work?
21. Synchronous Replication
Guarantees the two databases are always consistent
More expensive than asynchronous replication because the first server waits for the second server's answer before sending the ACK back to the client. After the ACK the client is sure the data is stored on multiple nodes at the same time
[Diagram: Server #1 (DB-1) and Server #2 (DB-2)]
22. Synchronous Replication steps
1) Client #1 sends the update record request to Server #1
2) Server #1 updates the record in DB-1
3) Server #1 propagates the update to Server #2
4) Server #2 updates the record in DB-2
5) Server #2 sends back OK to Server #1
6) Server #1 sends back OK to Client #1
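The six synchronous steps can be sketched with two in-memory servers. This is an illustration only: the `Server` class and its methods are hypothetical, and a direct method call stands in for the network round trip.

```python
from typing import Dict, List


class Server:
    def __init__(self, name: str) -> None:
        self.name = name
        self.db: Dict[str, dict] = {}
        self.replicas: List["Server"] = []

    def apply(self, rid: str, record: dict) -> str:
        """Steps 4 and 5: store the record locally and answer OK."""
        self.db[rid] = dict(record)
        return "OK"

    def update_sync(self, rid: str, record: dict) -> str:
        """Steps 2, 3 and 6 on the server that received the request."""
        self.db[rid] = dict(record)                 # 2) update the local database
        for replica in self.replicas:               # 3) propagate the update
            if replica.apply(rid, record) != "OK":  # block until step 5 answers
                raise RuntimeError(f"{replica.name} did not acknowledge")
        return "OK"                                 # 6) ACK to the client


server1, server2 = Server("Server #1"), Server("Server #2")
server1.replicas.append(server2)
ack = server1.update_sync("#10:20", {"name": "Rome"})  # 1) client request
```

By the time the client receives the ACK, both databases already hold the record, which is the guarantee synchronous mode pays for with extra latency.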
23. Asynchronous Replication
Changes are propagated without waiting for the answer
The two databases could be inconsistent for a few ms
For this reason it's called “Eventually Consistent”
It's much less expensive than synchronous replication.
[Diagram: Server #1 (DB-1) and Server #2 (DB-2)]
24. Asynchronous Replication steps
(4a and 4b are executed in parallel)
1) Client #1 sends the update record request to Server #1
2) Server #1 updates the record in DB-1
3) Server #1 propagates the update to Server #2
4a) Server #1 sends back OK to Client #1
4b) Server #2 updates the record in DB-2
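The asynchronous steps can be sketched the same way. To keep the example deterministic, the parallel step 4b is modeled as an explicit outbox that is drained later; that queue, the `AsyncServer` class and `deliver_pending` are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class AsyncServer:
    def __init__(self, name: str) -> None:
        self.name = name
        self.db: Dict[str, dict] = {}
        self.outbox: Deque[Tuple["AsyncServer", str, dict]] = deque()

    def update_async(self, replica: "AsyncServer", rid: str, record: dict) -> str:
        self.db[rid] = dict(record)                 # 2) update the local database
        self.outbox.append((replica, rid, record))  # 3) hand the update over
        return "OK"                                 # 4a) ACK without waiting

    def deliver_pending(self) -> None:
        """4b) what the network would do in the background."""
        while self.outbox:
            replica, rid, record = self.outbox.popleft()
            replica.db[rid] = dict(record)


s1, s2 = AsyncServer("Server #1"), AsyncServer("Server #2")
ack = s1.update_async(s2, "#10:20", {"name": "Rome"})
stale = "#10:20" not in s2.db        # the replica is briefly behind
s1.deliver_pending()                 # after this the two databases agree
```

The `stale` flag captures the short window in which the client already has its ACK but the second database has not yet been updated.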
25. Error Management
During replication the second server could get an error due to a conflict (the record was modified at the same moment by another client) or an I/O problem. In this case the error is logged to disk to be fixed later.
1) Client #1 sends the update record request to Server #1
2) Server #1 updates the record in DB-1
3) Server #1 propagates the update to Server #2
4) Server #1 sends back OK to Client #1
5) Server #2 updates the record in DB-2
6) The error is logged to the Synch Log
26. Conflict Management
During replication conflicts could happen if two clients update the same record at the same time
The conflict resolution strategy can be plugged in by providing an implementation of the OConflictResolver interface
[Diagram: Server #2 applies the Conflict Strategy to DB-2]
27. Conflict Management
Default strategy
The default implementation merges the records: if the same fields are changed, the oldest document wins and the newest is written into the Synch Log
[Diagram: Server #2 applies the Default Conflict Strategy to DB-2 and writes to the Synch Log]
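The default "merge, oldest wins" rule might look like the sketch below. It is written in Python for brevity and is not the OConflictResolver interface, whose signature the slides do not show. It also assumes the caller already knows which version is the oldest; `merge_oldest_wins` and the list used as Synch Log are hypothetical.

```python
from typing import Dict, List

Document = Dict[str, object]


def merge_oldest_wins(oldest: Document, newest: Document,
                      synch_log: List[Document]) -> Document:
    """Merge two versions of a record; on a clash the oldest value is kept."""
    merged = dict(newest)             # fields only in `newest` survive
    clash = False
    for name, value in oldest.items():
        if name in newest and newest[name] != value:
            clash = True              # same field changed on both sides
        merged[name] = value          # oldest wins, or the field was untouched
    if clash:
        synch_log.append(dict(newest))  # keep the losing document for later review
    return merged


log: List[Document] = []
result = merge_oldest_wins({"name": "Rome", "zip": "00100"},
                           {"name": "Roma", "country": "Italy"}, log)
```

Fields touched by only one side are merged without loss, while the clashing `name` field keeps the oldest value and the newest document is saved in the log for manual handling.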
28. Manual control of conflicts like SVN/GIT tools
29. Display the diff of 2 databases
> compare database db1 db2
Copy a record across databases
> copy record #10:20@db1 to #10:20@db2
Copy an entire cluster across databases
> copy cluster city@db1 to city@db2
Merge two records across databases
> merge records #10:20@db1 #10:20@db2 to #10:20@db1
30. How are nodes re-aligned once they are up again after a failure, shutdown or network problem?
31. During replication all operations are logged using a unique op-id with the format <node>#<serial>
[Diagram: a Client updates a record; Server #1 and Server #2 both record Op-id 192.168.0.10:2424#123232 in their Operation Log, next to DB-1 and DB-2]
32. On restart the node asks the Leader which servers it has to synchronize with
op-ids are used to find the operations it missed
[Diagram: Server #1 (Operation Log, DB-1) with Op-id 192.168.1.11:2424#9569 and Server #2 (DB-2, Operation Log) with Op-id 192.168.0.10:2424#123232]
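How op-ids could be used to find missed operations is sketched below. The slides only give the `<node>#<serial>` format, so the assumption that serials grow monotonically per node, and the helpers `parse_op_id` and `missed_operations`, are illustrative guesses rather than the actual re-alignment algorithm.

```python
from typing import Dict, List, Tuple


def parse_op_id(op_id: str) -> Tuple[str, int]:
    """Split "<node>#<serial>" into (node, serial)."""
    node, serial = op_id.rsplit("#", 1)
    return node, int(serial)


def missed_operations(local_log: List[str], remote_log: List[str]) -> List[str]:
    """Return the remote op-ids newer than anything seen locally, per node."""
    last_seen: Dict[str, int] = {}
    for op_id in local_log:
        node, serial = parse_op_id(op_id)
        last_seen[node] = max(serial, last_seen.get(node, -1))
    missed = []
    for op_id in remote_log:
        node, serial = parse_op_id(op_id)
        if serial > last_seen.get(node, -1):   # assumes serials only grow
            missed.append(op_id)
    return missed


local = ["192.168.0.10:2424#123231"]
remote = ["192.168.0.10:2424#123231", "192.168.0.10:2424#123232",
          "192.168.1.11:2424#9569"]
todo = missed_operations(local, remote)
```

Here the restarted node has seen everything up to serial 123231 from one node and nothing from the other, so it has two operations to replay.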
33. To be consistent or not to be, that is the question
34. Always consistent
use it as a Master-Slave
Server #1 (Master, read + write): Read/Write. All changes happen on this server, avoiding conflicts
Server #2 (Synch Slave, read only): Read only, consistent. Leave it as a replica. Since it's always aligned it's the best candidate as new master if Server #1 is unavailable. Perfect for Analysis, Business Intelligence and Reports
Replication is one-way only
35. Read-only scaling
using many asynchronous replicas
Server #1 (Master, read + write): Read/Write. All changes happen on this server, avoiding conflicts
Server #2 (Synch Slave, read only)
Server #3 ... Server #N (Asynch Slaves, read only): Read only, eventually consistent. Replication cost close to zero
36. Read/Write scaling
Multi master + handling conflicts
[Diagram: Server #1, Server #2 and Server #3 are all Masters (read + write), each serving its own clients]
37. Read/Write scaling + sharding
Multi master, no conflict! :-)
[Diagram: one client writes on customers_usa to Server USA (Master, read + write); another client writes on customers_china to Server CHI (Master, read + write)]
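Sharding avoids conflicts because each set of records has a single writer. A minimal routing sketch, using the cluster and server names from the slide, follows; the `SHARD_OWNER` table and `route_write` are hypothetical and only illustrate the idea.

```python
from typing import Dict

# Hypothetical ownership table taken from the example: each cluster of
# records has exactly one server accepting writes for it.
SHARD_OWNER: Dict[str, str] = {
    "customers_usa": "Server USA",
    "customers_china": "Server CHI",
}


def route_write(cluster: str) -> str:
    """Return the only server allowed to write records of `cluster`."""
    try:
        return SHARD_OWNER[cluster]
    except KeyError:
        raise ValueError(f"no owner configured for cluster {cluster!r}") from None
```

Since two servers never write to the same cluster, concurrent updates to the same record from different Masters cannot occur.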
38. Multi-Master + Sharding = big scale with high availability and no conflicts
40. NuvolaBase.com (beta)
The first Graph Database on the Cloud
always available
a few seconds to set it up
use it from Web & Mobile apps
41. Luca Garulli
Author of the OrientDB and Roma <Meta> Framework Open Source projects
Member of JSR#12 (jdo 1.0) and JSR#243 (jdo 2.0)
CEO at Nuvola Base Ltd
@London, UK and @Rome, Italy
www.twitter.com/lgarulli