The slide deck for our RubyConf 2011 (New Orleans) talk.
Follow us on lanyrd to get the video and other material: http://lanyrd.com/profile/ponnappa and http://lanyrd.com/profile/niranjan_p
Spark Summit - Mobius C# Binding for Apache Spark - shareddatamsft
Slides used for the talk at Spark Summit West - https://spark-summit.org/2016/events/mobius-c-language-binding-for-spark.
With Mobius, developers can use .NET with Apache Spark. This talk covers writing a Spark driver program in C# using Mobius, the internal architecture of Mobius, observations of C# applications running in a Spark cluster, and recommended best practices. Mobius is open-sourced at http://github.com/Microsoft/Mobius.
Centralizing Kubernetes and Container Operations - Kublr
While developers see the benefits of Kubernetes (how it improves efficiency, saves time, and enables focus on the unique business requirements of each project), InfoSec, infrastructure, and software operations teams still face challenges when managing a new set of tools and technologies and integrating them into an existing enterprise infrastructure.
These meetup slides cover what’s needed for a general architecture of a centralized Kubernetes operations layer based on open source components such as Prometheus, Grafana, ELK Stack, and Keycloak, and how to set up reliable clusters and a multi-master configuration without a load balancer. They also outline how these components should be combined into an operations-friendly enterprise Kubernetes management platform with centralized monitoring and log collection, identity and access management, backup and disaster recovery, and infrastructure management capabilities. The presentation shows real-world use cases of open source projects for implementing an ops-friendly environment.
Check out this and more webinars in our BrightTalk channel: https://goo.gl/QPE5rZ
Apache Kafka Scalable Message Processing and more! - Guido Schmutz
Independent of the source of data, the integration of event streams into an Enterprise Architecture gets more and more important in the world of sensors, social media streams and Internet of Things. Events have to be accepted quickly and reliably, and they have to be distributed and analysed, often with many consumers or systems interested in all or part of the events. How can we make sure that all these events are accepted and forwarded in an efficient and reliable way? This is where Apache Kafka comes into play: a distributed, highly scalable messaging broker, built for exchanging huge amounts of messages between a source and a target.
This session starts with an introduction to Apache Kafka and presents its role in a modern data / information architecture and the advantages it brings to the table. Additionally, the Kafka ecosystem will be covered, as well as the integration of Kafka in the Oracle Stack, with products such as GoldenGate, Service Bus and Oracle Stream Analytics all able to act as a Kafka consumer or producer.
From a skunk-works project to running the entire enterprise
While developers see the benefits of Kubernetes (how it improves efficiency, saves time, and enables focus on the unique business requirements of each project), InfoSec, infrastructure, and software operations teams still face challenges when managing a new set of tools and technologies and integrating them into an existing enterprise infrastructure.
In this meetup, Chris, CTO at Tigera, and Oleg, CTO at Kublr, discussed the evolution of your Kubernetes cluster - from a skunk-works project to running the entire enterprise.
While developers see the benefits of Kubernetes (how it improves efficiency, saves time, and enables focus on the unique business requirements of each project), InfoSec, infrastructure, and software operations teams still face challenges when managing a new set of tools and technologies and integrating them into existing enterprise infrastructure. This is especially true for environments where security and governance requirements are strict enough to conflict with cloud-native reference architectures.
This deck outlines a plan that leverages Kubernetes as an infrastructure abstraction (hint: there is a lot more to it than just container orchestration!). Such an approach allows enterprises to untie themselves from infrastructure-provider-specific technology stacks and frees development teams to use whichever tool fits their use case best. But how do you implement open source cloud-native technologies while meeting enterprise security and governance requirements? We’ll summarize common prerequisites for running Kubernetes in production, how to leverage fine-grained controls and separation of responsibilities to meet enterprise governance and security needs, and what’s needed for a general architecture of a centralized Kubernetes operations layer based on open source components such as Prometheus, Grafana, ELK Stack, and Keycloak.
In today’s world it’s no longer enough to build systems that process large volumes of information. We now need applications that can handle large continuous streams of data with very low latency so we can react to the ever-changing environment around us. To handle such problems efficiently, we need to deploy a stream processing solution. During the talk we’ll explore one of the most popular frameworks for stream processing: Apache Flink. We’ll see what unique capabilities it provides and how they apply to some real-world problems. We’ll also explore how it works under the hood and how to get the scalable and fault-tolerant stream processing that Flink provides.
Writing Blazing Fast, and Production-Ready Kafka Streams apps in less than 30... - HostedbyConfluent
If you have worked on various Kafka Streams applications before, then you have probably found yourself rewriting the same piece of code again and again.
Whether it's to manage processing failures or bad records, to use interactive queries, to organize your code, or to deploy or monitor your Kafka Streams app, building in-house libraries to standardize common patterns across your projects seems unavoidable.
And if you're new to Kafka Streams, you might be interested in knowing which of those patterns to use for your next streaming project.
In this talk, I will introduce you to Azkarra, an open-source lightweight Java framework designed to provide most of that off the shelf by leveraging best-of-breed ideas and proven practices from the Apache Kafka community.
Containers and Kubernetes allow for code portability across on-premise VMs, bare metal or multiple cloud provider environments. Yet, despite this portability promise, developers may include configuration and application definitions that constrain or even eliminate application portability. In this meetup Oleg Chunikhin, CTO at Kublr, described best practices for “configuration as code” in a Kubernetes environment. He demonstrated how a properly constructed containerized app can be deployed to both Amazon and Azure using the Kublr platform, and how Kubernetes objects, such as persistent volumes, ingress rules and services, can be used to abstract from the infrastructure.
Enabling support for data processing, data analytics, and machine learning workloads in Kubernetes has been one of the goals of the open source community. During this online meetup we discussed the growing use of Kubernetes for data science and machine learning workloads. We examined how new Kubernetes extensibility features such as custom resources and custom controllers are used for application and framework integration. Apache Spark 2.3’s native Kubernetes support is the latest indication of this growing trend. We demoed a few examples of data science workloads running on Kubernetes clusters set up by our Kublr platform.
Connecting the Dots: Kong for GraphQL Endpoints - Julien Bataillé
GraphQL is a popular alternative to REST for front-end applications as it offers flexibility and developer-friendly tooling. In this talk, we will look into the differences between REST and GraphQL, how GraphQL API Management presents a new set of challenges, and finally, how we can address those challenges by leveraging Kong extensibility.
Microservices Architectures (aka Distributed Architectures) are the new paradigm for developing and deploying applications in Cloud environments. These architectures resolve several problems and improve the life cycle for DevOps teams; however, they bring new challenges that must be resolved or managed.
OpenShift Service Mesh (based on Istio, Kiali, and Jaeger) allows us to manage this new paradigm easily without changing our current applications.
These slides will introduce you to OpenShift Service Mesh as a new component on OpenShift to manage your microservices architectures. Carlos Vicens worked on it with me.
Slides used during a coordinated meetup between three different groups in Madrid:
- OpenShift Madrid Group: https://www.meetup.com/es/openshift_spain/events/258188248/
- Microservices Madrid Group: https://www.meetup.com/es-ES/Microservicios/events/258188068/
- Madrid Spring User Group: https://www.meetup.com/es/madrid-spring-user-group/events/258322835/
Centralizing Kubernetes Management in Restrictive Environments - Kublr
While developers see the benefits of Kubernetes (how it improves efficiency, saves time, and enables focus on the unique business requirements of each project), InfoSec, infrastructure, and software operations teams still face challenges when managing a new set of tools and technologies and integrating them into existing enterprise infrastructure.
This is especially true for environments where security and governance requirements are strict enough to conflict with cloud-native reference architectures.
During his presentation, Oleg will outline a plan that leverages open source cloud-native technologies while meeting enterprise security and governance requirements. He’ll summarize common prerequisites for running Kubernetes in production, how to leverage fine-grained controls and separation of responsibilities to meet enterprise governance and security needs, and what’s needed for a general architecture of a centralized Kubernetes operations layer based on open source components such as Prometheus, Grafana, ELK Stack, and Keycloak.
The presentation will cover basic requirements for audit, security, authentication, authorization, integration with existing identity management, logging, and monitoring. Additionally, the audience will learn whether cloud-hosted Kubernetes covers these requirements, how to integrate a compliant Kubernetes installation with their existing cloud infrastructure, the limitations of a bare-metal installation, interactions with vSphere’s API, achieving HA, reliability and disaster recovery, as well as handling OS upgrades, security patches, and Kubernetes upgrades.
Apache Kafka - Scalable Message-Processing and more! - Guido Schmutz
Independent of the source of data, the integration of event streams into an Enterprise Architecture gets more and more important in the world of sensors, social media streams and Internet of Things. Events have to be accepted quickly and reliably, and they have to be distributed and analysed, often with many consumers or systems interested in all or part of the events. How can we make sure that all these events are accepted and forwarded in an efficient and reliable way? This is where Apache Kafka comes into play: a distributed, highly scalable messaging broker, built for exchanging huge amounts of messages between a source and a target.
This session starts with an introduction to Apache Kafka and presents its role in a modern data / information architecture and the advantages it brings to the table. Additionally, the Kafka ecosystem will be covered, as well as the integration of Kafka in the Oracle Stack, with products such as GoldenGate, Service Bus and Oracle Stream Analytics all able to act as a Kafka consumer or producer.
ContainerConf 2019, November 2019, Mannheim: talk by Mario-Leander Reimer (@LeanderReimer, Chief Technologist at QAware)
== Please download the document if it appears blurry! ==
Abstract:
Not too long ago, microservice architectures revolutionized the way we build software systems: instead of monoliths, systems are now composed and run as autonomous services.
Serverless and FaaS are the next logical step in this evolution, reducing the complexity of developing and operating such systems.
FaaS platforms are currently springing up like mushrooms: Knative, OpenFaaS, Fission, and Nuclio are just a few examples. But which of them are already suitable for use in your next project? Can hybrid architectures be built with them, or does everything have to be fully functionless? Let's find out.
Scaling a backend for a big data and blockchain environment by Rafael Ríos at... - Big Data Spain
2gether is a financial platform based on Blockchain, Big Data and Artificial Intelligence that allows interaction between users and third-party services in a single interface.
https://www.bigdataspain.org/2017/talk/scaling-a-backend-for-a-big-data-and-blockchain-environment
Big Data Spain 2017
November 16th - 17th Kinépolis Madrid
Function Mesh: Complex Streaming Jobs Made Simple - Pulsar Summit NA 2021 - StreamNative
Pulsar Functions is a succinct computing abstraction that Apache Pulsar provides for expressing simple ETL and streaming tasks. Its simplicity is twofold: a simple interface and simple deployment. As it has been adopted, we realized that native support for organizing multiple functions into an integrated whole would be very beneficial. With such support, people can express and manage multi-stage jobs easily. In addition, this support opens up the possibility of a higher-level DSL abstraction to further simplify job composition. We call this new feature Function Mesh.
This talk aims to provide a thorough walkthrough of this new Function Mesh Feature, including its design, implementation, use cases, and examples, to help people seeking simple streaming solutions understand this newly created powerful tool in Apache Pulsar.
Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL... - confluent
Kafka Streams and the addition of KSQL have provided opportunities to do stateful processing of data. Sometimes, the biggest challenge is determining how you can join that data. Keying and windowing are core concepts that need to be understood in order to stream data properly and efficiently. In this presentation, Neil will use geospatial data to showcase non-trivial joining, particularly, but not limited to, distance comparisons. The stream processing will be written in Kafka Streams DSL and in KSQL, with the topologies being compared. KSQL 2.0 concepts of User Defined Functions (UDFs), nested AVRO structures, and the ‘insert into’ functionality of KSQL will be showcased.
The presentation will show a custom OpenSky Connector for obtaining real-time aircraft data, a Streams application for processing that data, a D3 topojson application to visualize the data, and an additional KSQL implementation of the Streams application for comparison. Expect a deep dive into the Streams DSL and KSQL implementations that will provide the basis for a discussion around Apache Kafka and stream processing.
Timelines at Scale (Raffi Krikorian - VP of Engineering at Twitter) - Chris Bolman
Presentation by Raffi Krikorian, VP of Engineering at Twitter, on scaling Twitter to over 150 million active users with Redis and other architectural approaches.
How to build 1000 microservices with Kafka and thrive - Natan Silnitsky
This talk is about the Wix ecosystem for event-driven architecture on top of Kafka.
I share the best practices, SDKs and tools we have created to scale our distributed system to more than 1000 microservices.
Developing Real-Time Data Pipelines with Apache Kafka - Joe Stein
Developing Real-Time Data Pipelines with Apache Kafka http://kafka.apache.org/ is an introduction for developers to why and how to use Apache Kafka. Apache Kafka is a publish-subscribe messaging system rethought as a distributed commit log. Kafka is designed to allow a single cluster to serve as the central data backbone. A single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients. A cluster can be elastically and transparently expanded without downtime. Data streams are partitioned and spread over a cluster of machines to allow data streams larger than the capability of any single machine and to allow clusters of coordinated consumers. Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages. For the Spring user, Spring Integration Kafka and Spring XD provide integration with Apache Kafka.
Understanding the Topic and Main Idea of Readings - 文章のトピックと主旨を理解するCOCOJUKU plus
Standard Reading - Level 5
Understanding the Topic and Main Idea of Readings
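The focus here is "understanding the topic and main idea of a text".
Building on the previous lesson on paragraphs, this is training for understanding the main idea of a whole text.
We explain how to read and interpret texts made up of multiple paragraphs.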
I can haz HTTP - Consuming and producing HTTP APIs in the Ruby ecosystemSidu Ponnappa
The Ruby ecosystem is pretty awesome when it comes to developing or
consuming HTTP APIs. On the publishing front, the Rails framework is
an attractive option because it supports publishing what are popularly
(but inaccurately) referred to as 'RESTful' APIs quickly and
effortlessly. On the consumer side, the Ruby ecosystem provides
several very fluent and powerful libraries that make it easy to
consume HTTP based APIs.
Since a significant proportion of projects today require that APIs be
both published and consumed, many of them wind up choosing Ruby as a
platform for the reasons mentioned above. This talk is targeted at
folks that are currently on such projects, or anticipate being on such
projects in the future.
We will cover:
Consuming HTTP APIs:
1) The basics of making HTTP calls with Ruby
2) The strengths and weaknesses of Ruby's Net::HTTP across 1.8, 1.9
and JRuby (possibly Rubinius if we have the time to do research)
3) Popular HTTP libraries that either make it easier to do HTTP by
providing better APIs, make it faster by using libCurl or both
4) Different approaches to deserializing popular encoding formats such
as XML and JSON and the pitfalls thereof
Producing HTTP APIs using Rails:
1) The basics of REST
2) What Rails gives you out of the box - content-type negotiation,
deserialization etc. and the limitations thereof
3) What Rails fails to give you out of the box - hypermedia controls etc.
4) What Rails does wrong - wrong PUT semantics, no support for PATCH,
error handling results in responses that violate the client's Accept
header constraints etc.
5) How one can achieve Level 2 on the Richardson Maturity Model of
REST using Rails
6) Writing tests for all of this
At the end of this, our audience will understand how you can both
consume and produce HTTP APIs in the Ruby ecosystem. They will also
have a clear idea of what the limitations of such systems are and what
they can do to work around the limitations.
Set your Data in Motion with Confluent & Apache Kafka Tech Talk Series LMEconfluent
Confluent Platform is supporting London Metal Exchange’s Kafka Centre of Excellence across a number of projects with the main objective to provide a reliable, resilient, scalable and overall efficient Kafka as a Service model to the teams across the entire London Metal Exchange estate.
Modern businesses have data at their core, and this data is changing continuously. How can we harness this torrent of information in real-time? The answer is stream processing, and the technology that has since become the core platform for streaming data is Apache Kafka. Among the thousands of companies that use Kafka to transform and reshape their industries are the likes of Netflix, Uber, PayPal, and AirBnB, but also established players such as Goldman Sachs, Cisco, and Oracle.
Unfortunately, today’s common architectures for real-time data processing at scale suffer from complexity: there are many technologies that need to be stitched and operated together, and each individual technology is often complex by itself. This has led to a strong discrepancy between how we, as engineers, would like to work vs. how we actually end up working in practice.
In this session we talk about how Apache Kafka helps you to radically simplify your data processing architectures. We cover how you can now build normal applications to serve your real-time processing needs — rather than building clusters or similar special-purpose infrastructure — and still benefit from properties such as high scalability, distributed computing, and fault-tolerance, which are typically associated exclusively with cluster technologies. Notably, we introduce Kafka’s Streams API, its abstractions for streams and tables, and its recently introduced Interactive Queries functionality. As we will see, Kafka makes such architectures equally viable for small, medium, and large scale use cases.
apidays LIVE Jakarta - REST the events: REST APIs for Event-Driven Architectu...apidays
apidays LIVE Jakarta 2021 - Accelerating Digitisation
February 24, 2021
REST the events: REST APIs for Event-Driven Architecture
Mark Teehan, Principal Solution Engineer at Confluent APAC
Apache Kafka - Scalable Message-Processing and more !Guido Schmutz
Independent of the source of data, the integration of event streams into an Enterprise Architecture gets more and more important in the world of sensors, social media streams and Internet of Things. Events have to be accepted quickly and reliably, they have to be distributed and analysed, often with many consumers or systems interested in all or part of the events. How can we make sure that all these events are accepted and forwarded in an efficient and reliable way? This is where Apache Kafka comes into play, a distributed, highly-scalable messaging broker, built for exchanging huge amounts of messages between a source and a target.
This session will start with an introduction to Apache Kafka and present the role of Apache Kafka in a modern data / information architecture and the advantages it brings to the table. Additionally the Kafka ecosystem will be covered as well as the integration of Kafka in the Oracle Stack, with products such as Golden Gate, Service Bus and Oracle Stream Analytics all being able to act as a Kafka consumer or producer.
apidays LIVE Singapore 2021 - REST the Events - REST APIs for Event-Driven Ar...apidays
apidays LIVE Singapore 2021 - Digitisation, Connected Services and Embedded Finance
April 21 & 22, 2021
REST the Events - REST APIs for Event-Driven Architecture
Mark Teehan, Principal Solution Engineer at Confluent APAC
What is Apache Kafka and What is an Event Streaming Platform?confluent
Speaker: Gabriel Schenker, Lead Curriculum Developer, Confluent
Streaming platforms have emerged as a popular, new trend, but what exactly is a streaming platform? Part messaging system, part Hadoop made fast, part fast ETL and scalable data integration. With Apache Kafka® at the core, event streaming platforms offer an entirely new perspective on managing the flow of data. This talk will explain what an event streaming platform such as Apache Kafka is and some of the use cases and design patterns around its use—including several examples of where it is solving real business problems. New developments in this area such as KSQL will also be discussed.
apidays LIVE India - REST the Events - REST APIs for Event-Driven Architectur...apidays
apidays LIVE India 2021 - Connecting 1.3 billion digital innovators
May 20, 2021
REST the Events - REST APIs for Event-Driven Architecture
Mark Teehan, Principal Solution Engineer at Confluent APAC
The Role of Integration in Microservice Architecture (MSA)Asanka Abeysinghe
Integration was treated as old-school technology when microservice architecture (MSA) was introduced. However, when theory became practice, the technologists who designed and implemented MSA identified the important role integration plays in this modern architecture paradigm. Common integration use cases across most enterprises include an architecture layer that connects many microservices and builds composite services, request dispatching and service routing, and connecting microservices with legacy services and cloud providers.
During this session, Asanka will discuss how integration fits into MSA and technologies that can be used to implement integration microservices.
Announcing the next-generation dA Platform 2, which includes open source Apache Flink and the new Application Manager. dA Platform 2 makes it easier than ever to operationalize your Flink-powered stream processing applications in production.
Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...InfluxData
Using FLaNK with InfluxDB for EdgeAI IoT at Scale
Timothy from StreamNative takes you on a hands-on deep dive into using Pulsar, Apache NiFi + Edge Flow Manager + MiniFi Agents with Apache MXNet, OpenVino, TensorFlow Lite, and other Deep Learning Libraries on the actual edge devices including Raspberry Pi with Movidius 2, Google Coral TPU and NVidia Jetson Nano. The team runs deep learning models on the edge devices, sends images, and captures real-time GPS and sensor data. Their low-code IoT applications provide easy edge routing, transformation, data acquisition and alerting before they decide what data to stream in real time to their data space. These edge applications classify images and sensor readings in real time at the edge and then send Deep Learning results to Flink SQL and Apache NiFi for transformation, parsing, enrichment, querying, filtering and merging data to InfluxDB.
Au delà des brokers, un tour de l’environnement Kafka | Florent Ramièreconfluent
During the Confluent Streaming event in Paris, Florent Ramière, Technical Account Manager at Confluent, goes beyond brokers, introducing a whole new ecosystem with Kafka Streams, KSQL, Kafka Connect, Rest proxy, Schema Registry, MirrorMaker, etc.
Concepts and Patterns for Streaming Services with KafkaQAware GmbH
Cloud Native Night March 2020, Mainz: Talk by Perry Krol (@perkrol, Confluent)
=== Please download slides if blurred! ===
Abstract: Proven approaches such as service-oriented and event-driven architectures are joined by newer techniques such as microservices, reactive architectures, DevOps, and stream processing. Many of these patterns are successful by themselves, but they provide a more holistic and compelling approach when applied together. In this session Confluent will provide insights how service-based architectures and stream processing tools such as Apache Kafka® can help you build business-critical systems. You will learn why streaming beats request-response based architectures in complex, contemporary use cases, and explain why replayable logs such as Kafka provide a backbone for both service communication and shared datasets.
Based on these principles, we will explore how event collaboration and event sourcing patterns increase safety and recoverability with functional, event-driven approaches, apply patterns including Event Sourcing and CQRS, and how to build multi-team systems with microservices and SOA using patterns such as “inside out databases” and “event streams as a source of truth”.
Modern serverless computing platforms are on everyone's lips. They provide a programming model in which the user no longer has to think about administering servers, storage, networking, virtual machines, high availability and scalability, and can instead concentrate on writing their own code. The code maps the business requirements modularly in the form of small function packages (Functions). Functions are the heart of the serverless computing platform. They read from the (often standard) input, perform their computations and produce an output. Function results that need to be kept are stored in a persistent datastore such as the Autonomous Database. The Autonomous Database has three properties, self-driving, self-repairing and self-securing, which are needed for a modern approach to application development.
Similar to Rails services in the walled garden (20)
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
This talk is based primarily on our experience building a suite of 9 different RESTful web services and multiple clients to manage a data center.
These services were built to replace a monolithic app which had become a maintenance nightmare. Adding any new feature to the app was painful and time-consuming.
We’ve subsequently worked on other projects that involved building APIs, but the first remains the biggest.
Let us quickly tour the two parts of our talk: SOA and Rails.
Service Oriented Architectures allow applications to be split into several self-contained services along lines that match the different business verticals involved. Each such service should be usable by itself by the people of that vertical, while also providing APIs for other services to integrate with in order to create organization-wide workflows.
Being focused on a business vertical helps limit the ripple effects caused by a change in a business requirement for a particular app, as long as the API remains stable. This makes a significant difference when building complex workflows that are specific to that vertical, because you no longer need to worry about other business verticals that you may not understand or care about. So long as your API is stable, you’re fine.
This allows independent evolution of each service based on the needs of the corresponding vertical.
Services can be deployed independently. So long as APIs are respected, teams no longer need to wait on other teams to release.
Only those services which see high traffic need to be scaled out.
Having multiple small teams working independently on separate codebases is much better than having one big team with everyone modifying the same codebase. It smooths out both development and deployment, because in a single codebase you have to keep a significant portion of the app (if not the whole) in your head while incorporating change requests.
While the list continues, there are a few nuances which you should pay attention to.
Services like to talk to other services, and the graph of HTTP requests can grow very quickly. This can potentially lead to...
...performance bottlenecks. Every call to a service comes with all the overhead introduced by both HTTP and the framework.
The user base has to be managed across all services, with each user granted appropriate privileges.
Managing ACID (Atomic, Consistent, Isolated, Durable) transactions across distributed databases is complex, even more so with distributed services.
API versioning comes up in almost every discussion about building APIs. While it is important, in a walled garden its impact can be curtailed because you control both producer and consumer.
Continuous integration of APIs is a difficult business at best, with no existing open source infrastructure to solve this for us.
These problems are common to any RESTful web services, not just Rails, so most of the gotchas we’ll discuss are generic, while the solutions may be more specific to Rails. But even before we go there, why should we develop these APIs in Rails?
Rails lends itself well to creating synchronous APIs.
Rails supports transparent format negotiation using the URL or the Accept header, and provides a mechanism to register custom formats.
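To make the idea of format negotiation concrete, here is a minimal plain-Ruby sketch. It is not how Rails implements it: the `FORMATS` table and the `negotiate_format` helper are hypothetical names, a URL extension is assumed to win over the Accept header, and q-values in the header are ignored for brevity.

```ruby
# Hypothetical table of formats this service can render (illustrative only).
FORMATS = {
  "json" => "application/json",
  "xml"  => "application/xml"
}.freeze

# Returns the format name for a request, or nil if nothing matches.
# Assumption: an explicit URL extension takes precedence over the Accept header.
def negotiate_format(path, accept_header)
  extension = File.extname(path).delete_prefix(".")
  return extension if FORMATS.key?(extension)

  # Walk the Accept header in the order given, ignoring parameters such as q-values.
  accept_header.to_s.split(",").each do |entry|
    mime = entry.split(";").first.to_s.strip
    name = FORMATS.key(mime)
    return name if name
  end
  nil
end
```

With this sketch, `negotiate_format("/servers/1.xml", "application/json")` picks the extension, while `negotiate_format("/servers/1", "application/json")` falls back to the header.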
It is a pleasure to write well-engineered backend code in Ruby.
It doesn’t matter whether we are using Rails, Sinatra or any other web framework; beautiful APIs can be built using Sinatra too.
Walled garden signifies that we control both producers and consumers, which allows us to establish conventions and potentially loosen the constraints.
Based on what’s demanded of the APIs and how much time and money is available, you can decide where to stop. RMM 2.5 is fairly easily done with Rails.
As you might have noticed, we have been talking about SOA and not REST all this while, because creating standard Rails web services does not necessarily mean creating RESTful web services.
We’ll try to be careful about this during the course of this talk, but it’s worth remembering that much of what we are talking about involves building APIs with Rails *as it is today*. This means that RMM 3 cannot be achieved without significant effort, effort that is often unnecessary inside an enterprise.
Often you want to restrict access to the various APIs you are building to a limited set of users, even if they are internal services.
This authentication needs to span multiple services in order to support single sign-on (SSO).
It also needs to allow a user to access multiple services behind the scenes, in order to manage a workflow which spans multiple services.
The simplest way to achieve this is to restrict access to these services to a range of IPs and allow all internal communication.
This will work as long as it doesn’t matter who within the organization is accessing the services, which is rarely the case. The next logical step is to create a centralized auth server, which can be backed by any existing data source such as LDAP or Active Directory.
OAuth2 provides a simple way to authenticate users against a centralized authentication system. There are multiple open-source implementations of an OAuth2 provider. You can choose any of these and tweak it to suit any specific requirements you might have.
With a decent HTTP library, an OAuth2 client can be hand-rolled in 30 minutes.
There’s an interesting catch though. ActiveResource doesn’t allow custom headers for individual requests to a server out of the box, and OAuth2 operates on information passed through headers.
This means that unless you are willing to open up ActiveResource and monkey patch it to set authentication information on every outbound request, you might want to reconsider OAuth2 or (better yet) skip ActiveResource in favour of a friendlier library.
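As a rough illustration of why per-request headers matter, the following sketch builds a request with Ruby's standard Net::HTTP and attaches an access token to it. The helper names `authorized_get` and `perform`, the example URL and the use of the `Bearer` scheme are our assumptions for this sketch; obtaining the token from the auth server is left out.

```ruby
require "net/http"
require "uri"

# Builds (but does not send) a GET request carrying an OAuth2 access token.
# Assumption: the provider expects the token in an "Authorization: Bearer" header.
def authorized_get(url, access_token)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri.request_uri)
  request["Authorization"] = "Bearer #{access_token}"
  request["Accept"] = "application/json"
  [uri, request]
end

# Sends a prepared request and returns the Net::HTTPResponse.
def perform(uri, request)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request)
  end
end
```

A caller would do something like `uri, request = authorized_get("https://projects.example.com/projects/1", token)` followed by `perform(uri, request)`, setting a different token on each outbound request if needed.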
If you become an OAuth2 provider, it will also be easier to offer authentication through external OAuth2 providers such as Google, Twitter or Facebook.
So now that we are restricting access to our services, the next step is to determine who can do what with these services. There are two broad levels at which we might want to control access.
Both these areas can easily be tackled with a role-based system. There are plenty of gems which allow you to specify access rules.
If we have pulled out a central authentication server, it is better to manage user roles centrally. We can query the central server to figure out what roles a user has in the context of a service.
But let every service manage its own access rules based on the roles returned by the user service. Managing authorization for all services in a central server can become messy, as every service can have a custom set of rules which are tied to its data.
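The split between centrally managed roles and per-service rules could look roughly like this. Everything here is hypothetical: the role names, the actions, the `ACCESS_RULES` table and the `allowed?` helper are made up for illustration, and the roles are passed in directly instead of being fetched from the central server.

```ruby
# Access rules owned by one service (say, a project service).
# Role names and actions are illustrative assumptions.
ACCESS_RULES = {
  "admin"  => %i[read create update destroy],
  "member" => %i[read create],
  "guest"  => %i[read]
}.freeze

# `roles` stands in for what the central user service would return
# for this user in the context of this service.
def allowed?(roles, action)
  roles.any? { |role| ACCESS_RULES.fetch(role, []).include?(action) }
end
```

The central server only answers "which roles does this user have here?", while the meaning of each role stays inside the service that owns the data.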
Services like to talk a lot, specifically among themselves. This can become a significant overhead. Consider the following scenario with two services.
The request graph can go wild with more services thrown into the mix. While it is difficult to reduce the chattiness between services, its impact can be reduced.
It is essential to set up a performance build early in the project and track the graph as services grow in number and complexity.
Set a target for performance. Say, the average GET request should not take more than 40ms.
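A performance build can start out as something as small as the sketch below, which times a block with Ruby's standard Benchmark module and compares the average against the target. The helper names `average_ms` and `within_target?` are hypothetical, and in a real build the block would issue the GET request against a deployed service.

```ruby
require "benchmark"

TARGET_MS = 40.0 # the example target mentioned above

# Runs the given block `runs` times and returns the average wall-clock time in ms.
def average_ms(runs)
  total = 0.0
  runs.times { total += Benchmark.realtime { yield } }
  (total / runs) * 1000.0
end

# True when the measured average meets the target.
def within_target?(avg_ms, target_ms = TARGET_MS)
  avg_ms <= target_ms
end
```

The build would then fail whenever `within_target?(average_ms(50) { ... })` returns false, which makes regressions visible as services are added.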
There are going to be a lot of HTTP calls with small response payloads, so you might want to optimize for that.
To reduce the time taken to serve frequently queried and time-consuming requests, we can introduce various kinds of server-side caching.
Fragment or action caching can be used to optimize response time for resources which need authorized access.
For publicly available resources, page caching can be used to avoid the Rails stack entirely.
ETags can be used effectively to check whether a requested resource has been modified. The catch here is that the client has to implement caching and respect ETags.
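The ETag exchange can be sketched in plain Ruby as follows. This is an illustration of the mechanism, not Rails' own implementation: `etag_for` and `conditional_get` are hypothetical helpers, the ETag is assumed to be an MD5 digest of the body, and the return value mimics a Rack-style `[status, headers, body]` triple.

```ruby
require "digest"

# Computes a quoted ETag from the response body (assumption: MD5 of the body).
def etag_for(body)
  %("#{Digest::MD5.hexdigest(body)}")
end

# `if_none_match` is the value of the client's If-None-Match header, if any.
def conditional_get(body, if_none_match)
  etag = etag_for(body)
  if if_none_match == etag
    [304, { "ETag" => etag }, ""]   # the client's cached copy is still current
  else
    [200, { "ETag" => etag }, body] # send the full body along with its ETag
  end
end
```

Note that in this sketch the server still builds the body in order to compute the digest, so the saving is in bandwidth and client-side parsing, not in server-side rendering.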
If possible, you should use a client which supports client-side caching. We hadn't come across any client which does it, so we started an open source project to build one. It ended up as a Ruby Net::HTTP wrapper which implements RFC 2616.
If you are using ActiveResource as a client, this is difficult to achieve without monkey patching ActiveResource, as it neither supports caching nor exposes request/response objects.
With any such library and ActiveModel it is possible to quickly hand-roll a simple client. It might not have a lot of the things ActiveResource supports out of the box, but those features can be introduced as and when needed.
Check out Varnish or Squid to introduce caching between services.\n
Introducing caching adds the overhead of figuring out when and how to expire the caches.
Expiring caches under the application's control is far easier than expiring caches maintained by Squid or Varnish.
Pagination of resources requires some metadata along with the array of resources, such as the total number of available resources and the number of resources included in each page. It can be achieved by either exposing a collection resource or adding additional attributes to the root node of the array XML. The latter is a better fit for ActiveResource.
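A minimal sketch of the metadata such a paginated response needs to carry. The `paginate` helper and the field names are assumptions for illustration; the same values could equally be emitted as attributes on the root node of the array XML.

```ruby
# Build a collection resource: the page of entries plus the metadata
# a client needs to render pagination controls.
def paginate(items, page:, per_page:)
  offset = (page - 1) * per_page
  {
    total_entries: items.size,
    per_page:      per_page,
    current_page:  page,
    total_pages:   (items.size.to_f / per_page).ceil,
    entries:       items[offset, per_page] || []
  }
end

# Hypothetical data: 45 projects, 10 per page.
projects = (1..45).map { |i| "project-#{i}" }
page = paginate(projects, page: 5, per_page: 10)

puts page[:total_pages]
puts page[:entries].inspect
```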
\n
\n
\n
\n
\n
\n
\n
\n
\n
If you’re using ActiveResource, what you really need is something like WillPaginate. Luckily, it so happens that we have just the thing...\n
\n
So far we have been talking about caching HTTP responses between services. In certain scenarios it becomes essential to have a local cache of the resources. Let's consider the following scenario.
We have a user management service which has users across multiple companies\n
and a project management service which has projects for individual users\n
A new user story comes in “As an admin user I want a paginated view of all projects for a given company” \n
In a typical monolithic app this can be easily solved by firing a database join query.\n
As we don't have the required information locally, we'll be forced to access it over HTTP.
* It will involve multiple calls if the service returns paginated results.
* It adds the overhead of serialization and deserialization.
* If there are a lot of users for a given company, the 'in' clause won't work due to the limit on query string size.

And so on and so forth.
One way to solve this problem is by allowing services to share their data with other services. \n
If one service starts writing in the database of the other service, it defeats the purpose of building separate services in the first place.\n
We can expose a read-only copy. We don't recommend it, although it is an option to keep in mind. It works in certain scenarios as long as everyone on the team understands that it should not be abused.
For this we can either use the same database for all our services or set up master-slave replication and read from the slave.
\n
\n
\n
As we said, sharing a database connection is equivalent to integrating services at the database level, and it comes with a lot of problems.
Suddenly, services which share databases start relying on the internal representation of a resource instead of what is exposed at the service level.
Behind the API resources might have computed fields which are not stored in the database or might be split into multiple tables etc. \n
\n
\n
We also tie ourselves to the internal stack of the service, as different services might have different kinds of data stores depending on their needs.
\n
Before we discuss how to tackle it let’s talk about another problem.\n\n
\n
Imagine that multiple services need to be notified when a particular user logs out of the system so that they can do a local cleanup.
One way to approach this problem is to allow services to register callback URIs with the user management service, either through configuration or programmatically at bootstrap time.
It will work as long as we have only a handful of events which can be registered against. But as we keep adding more events and more services interested in listening to them...
a. The response time for a simple action like logout increases due to the increasing number of callbacks it has to invoke.
b. The overall complexity of managing the callback configurations grows massively.
Obviously, by creating a background job for invoking callbacks we can guarantee a quick response.
We can use an MQ server for that. It provides an internal centralized bus for all services that want to broadcast messages.
It is a nice way of decoupling producers and consumers of the events. Producer can essentially fire the event and forget about it.\n
Any consumer interested in such events registers with the central bus for notifications and is solely responsible for acting upon those events as it sees fit.
One thing to remember, though, is that these calls are asynchronous. We should not make two consecutive calls to a service expecting that a consumer of the first event has already received and processed it.
Establish a few conventions for the exchange name and topic name a service uses to propagate a particular type of event, so that consumers can easily register for such events without massive configuration.
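A minimal sketch of such a naming convention, using an in-memory stand-in for a topic exchange. The `EventBus` class, the `<service>.<resource>.<event>` key format and the event names are all assumptions for illustration. Unlike a real MQ server, this stand-in delivers synchronously.

```ruby
# In-memory stand-in for an MQ topic exchange (illustrative only).
class EventBus
  def initialize
    @bindings = []
  end

  # Pattern words are separated by dots; "*" matches exactly one word.
  def subscribe(pattern, &handler)
    words = pattern.split(".").map do |word|
      word == "*" ? "[^.]+" : Regexp.escape(word)
    end
    regex = Regexp.new("\\A" + words.join("\\.") + "\\z")
    @bindings << [regex, handler]
  end

  # Fire and forget: the producer does not know who is listening.
  def publish(routing_key, payload)
    @bindings.each do |regex, handler|
      handler.call(routing_key, payload) if regex.match(routing_key)
    end
  end
end

# Assumed convention: "<service>.<resource>.<event>"
def routing_key(service, resource, event)
  [service, resource, event].join(".")
end

bus = EventBus.new
received = []

# A consumer interested in every session event from the user service.
bus.subscribe("users.session.*") { |key, payload| received << [key, payload] }

bus.publish(routing_key("users", "session", "logged_out"), { user_id: 42 })
bus.publish(routing_key("users", "account", "updated"), { user_id: 42 })

puts received.inspect
```

With a predictable key format, a consumer can bind to a whole family of events with one pattern instead of one configuration entry per event.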
A few slides back we spoke about having a local cache of resources, for which a shared database was one solution. If we implement an event system, we can use it to maintain a local cache of resources and use local database joins.
\n
\n
\n
It’s a local cache and should be treated like one\n
Due to the asynchronous nature of the system, this cache will not always reflect the latest data. It might hold a slightly older version. It can be safely used for resources which don't change frequently and for which eventual consistency is not a problem.
With caching comes the problem of cache expiry. For this, a consumer can listen to update and delete events. Needless to say, the producer has to trigger these events with an appropriate payload for the consumer to modify its local cache.
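A minimal sketch of a consumer keeping its local cache in step with such events. The `LocalUserCache` class, the event names and the payload shape are hypothetical. A hash stands in for what would normally be a local database table, so that joins stay local.

```ruby
# Local, eventually consistent copy of users owned by another service.
class LocalUserCache
  def initialize
    @users = {}
  end

  # Assumed event shape: { type: "user.updated", payload: { id: ..., ... } }
  def handle(event)
    payload = event[:payload]
    case event[:type]
    when "user.created", "user.updated"
      @users[payload[:id]] = payload
    when "user.deleted"
      @users.delete(payload[:id])
    end
  end

  # Local lookup that would otherwise need calls to the user service.
  def in_company(company_id)
    @users.values.select { |user| user[:company_id] == company_id }
  end
end

cache = LocalUserCache.new
cache.handle(type: "user.created", payload: { id: 1, name: "Asha", company_id: 7 })
cache.handle(type: "user.created", payload: { id: 2, name: "Ravi", company_id: 7 })
cache.handle(type: "user.updated", payload: { id: 1, name: "Asha K", company_id: 7 })
cache.handle(type: "user.deleted", payload: { id: 2 })

puts cache.in_company(7).inspect
```

The payload has to carry enough data for the consumer to update its copy without calling back to the producer.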
Services evolve over a period of time and API changes. But we still have to maintain backward compatibility as the clients depend on a contract of the API\n
This is more or less a solved problem. All major APIs, such as GitHub, Twitter, and Facebook, do it.
The only thing to keep in mind is that if we are developing both producers and consumers in a walled garden, we can deterministically predict the number of revisions any API needs to support.

Going a step further, we can weigh the cost of upgrading all clients when the API changes against the cost of introducing API versioning, and decide whether we want to support multiple versions at all.
Implementing ACID transactions across multiple databases is a complex problem and having to do so over services adds to the complexity\n
There's no framework in place which does it transparently. Databases solve this problem by introducing two-phase commit.
It's a hard problem. There is one solution we have seen working, but we haven't been involved in its development. We'll be willing to discuss it offline.
\n
\n
Local gem server\n
APIs are consumed by machines and web pages by humans, so they have different requirements.