Developing Realtime Data Pipelines With Apache Kafka
Apache Kafka is publish-subscribe messaging rethought as a distributed commit log. A single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients. Kafka is designed to allow a single cluster to serve as the central data backbone for a large organization. It can be elastically and transparently expanded without downtime. Data streams are partitioned and spread over a cluster of machines to allow data streams larger than the capability of any single machine and to allow clusters of coordinated consumers. Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages without performance impact. Kafka has a modern cluster-centric design that offers strong durability and fault-tolerance guarantees.
Scale changes everything. The number of connections and destinations went from dozens to thousands, and the number of messages increased by orders of magnitude. What once was quite adequate for enterprise messaging can't scale to support the "Internet of Things". We need new protocols, patterns and architectures to support this new world. This session will start with a basic introduction to the concept of the Internet of Things. Next it will discuss the general technical challenges involved and explain why the concept is becoming mainstream now. Then we're ready to start talking about solutions. We will introduce some messaging patterns (like telemetry and command/control) and protocols (such as MQTT and AMQP) used in these scenarios. Finally we will see how Apache ActiveMQ is gearing up for this race. We will show tips for horizontal and vertical scaling of the broker, related projects that can help with deployments, and what the future development road map looks like.
Deep Dive into the Pulsar Binary Protocol - Pulsar Virtual Summit Europe 2021 - StreamNative
To achieve maximum performance, some important choices have been made when designing the Pulsar binary protocol.
This session will explain how Pulsar implements all the features of a high-quality streaming protocol, such as frame multiplexing, session establishment, keep-alive, flow control, authentication and authorisation, encoding, zero-copy capabilities and more.
WebSocket MicroService vs. REST Microservice - Rick Hightower
Comparing the speed of RPC calls over WebSocket microservices versus REST-based microservices. Using wrk, QBit, and examples in Java, we show how much faster WebSocket is for doing RPC service calls.
Building a Messaging Solutions for OVHcloud with Apache Pulsar - Pierre Zemb - StreamNative
OVHcloud is the biggest European cloud provider. From dedicated servers to Managed Kubernetes, from VMware® based Hosted Private Cloud to OpenStack-based Public Cloud, we have over 1.4 million customers worldwide.
Internally, we have been running Apache Kafka for years, and despite all the skills obtained operating multiple clusters with millions of messages per second, we decided to shift and build the foundation of our 'topic-as-a-service' product called ioStream on Apache Pulsar.
In this talk, you will get insight into why we decided to use Apache Pulsar instead of Apache Kafka as the core of ioStream. We will tell you about our journey with Apache Pulsar, from our deployments to their management, what worked and what did not.
Kafka is a real-time, fault-tolerant, scalable messaging system.
It is a publish-subscribe system that connects various applications with the help of messages - producers and consumers of information.
Micro on NATS - Microservices with Messaging - Apcera
This is a talk given by Asim Aslam at the NATS London Meetup on May 10th, 2016. It explains what Micro is (a microservices toolkit), and how it uses NATS - a lightweight, high-performance open source messaging system for microservices, cloud native, and IoT networks, written in Golang.
You can learn more about NATS at http://www.nats.io
You can learn more about Micro at https://micro.mu/
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the... - Natan Silnitsky
Kafka is the bedrock of Wix's distributed microservices system. For the last 5 years we have learned a lot about how to successfully scale our event-driven architecture to roughly 1500 microservices.
We’ve managed to achieve higher decoupling and independence for our various services and dev teams that have very different use cases, while maintaining a single uniform infrastructure.
In these slides you will learn about 8 key decisions and steps you can take in order to safely scale-up your Kafka-based system. These include:
* How to increase dev velocity of event-driven style code.
* How to optimize working with Kafka in a polyglot setting.
* How to support a growing amount of traffic and a growing number of developers.
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ... - Lucas Jellema
Introduction to Apache Kafka - the open source platform for real-time message queuing and reliable, scalable, distributed event handling and high-volume pub/sub implementation.
see GitHub https://github.com/MaartenSmeets/kafka-workshop for the workshop resources.
Protecting your data at rest with Apache Kafka by Confluent and Vormetric - confluent
Learn how data in motion is secure within Apache Kafka and the broader Confluent Platform, while data at rest can be secured by solutions like Vormetric Data Security Manager.
The Easiest Way to Configure Security for Clients AND Servers (Dani Traphagen... - confluent
In this baller talk, we will be addressing the elephant in the room that no one ever wants to look at or talk about: security. We generally never want to talk about configuring security because if we do, we acknowledge the risk of penetration and expose ourselves to exploitation. However, this leads to a lot of confusion around proper Kafka security best practices and how to appropriately lock down a cluster when you are starting out. In this talk we will demystify the elephant in the room without deconstructing it limb by limb. We will give you a notion of how to configure the following for BOTH clients and servers:
* TLS or Kerberos authentication
* Encryption of your network traffic via TLS
* Authorization via access control lists (ACLs)
We will also demonstrate the above with a GitHub repo you can try out for yourself. Lastly, we will present a reference implementation of OAuth if that suits your fancy. All in all you should walk away with a pretty decent understanding of the necessary aspects required for a secure Kafka environment.
Quantum (OpenStack Meetup Feb 9th, 2012) - Dan Wendlandt
This is a talk I gave on Quantum at the Bay Area OpenStack Meetup on Feb 9th, 2012.
I added a few slides to try and address some of the questions people had during the talk.
A message broker is a way to distribute information across servers. Recently, message brokers have been used to build distributed systems and to scale up massive data distribution in this information era. Kafka is one of the message broker tools that has emerged recently for data streaming. These slides explain the benefits of message brokers in general and of Kafka in particular for high-quality data distribution.
This slide deck was exported from MS PowerPoint to PDF.
How Splunk Mission Control leverages various Pulsar subscription types - Pranav... - StreamNative
This talk describes why Splunk Mission Control (currently in private beta) chose Pulsar to power their distributed messaging use cases. Mission Control has been designed as a cloud-native, microservice architecture and has a varied set of use cases for distributed messaging. Our application needs diverse features - queueing, exclusive access to messages, and consumer affinity. Besides these, we have additional needs such as TTL of messages, ranging from no persistence to persistence for several hours. In this talk, we will elaborate on how we leveraged Pulsar and its features to satisfy these needs rather than using a medley of multiple infrastructures.
MoP (MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi... - StreamNative
MQTT (Message Queuing Telemetry Transport) is a message protocol based on the pub/sub model, with the advantages of a compact message structure, low resource consumption, and high efficiency, which makes it suitable for IoT applications with low bandwidth and unstable network environments.
This session will introduce MQTT on Pulsar, which allows users of the MQTT transport protocol to use Apache Pulsar. I will share the architecture, principles and future plans of MoP, to help you understand Apache Pulsar's capabilities and practices in the IoT industry.
Cassandra Day 2014: Re-envisioning the Lambda Architecture - Web-Services & R... - DataStax Academy
Do you want to expose a kick-ass REST API? Do you want interactions with that API to drive slick dashboards that answer hard questions? Do you want all of that in near real-time, distributed across a number of machines, and tolerant of system faults? Take one part Cassandra, stir in a bit of Paxos, and blend with Storm. Coat the rim of the glass with DropWizard. Sit back, relax, and enjoy the show. Come see how Health Market Science is using this mixology to fix our healthcare system.
Real-time streaming and data pipelines with Apache Kafka - Joe Stein
Get up and running quickly with Apache Kafka http://kafka.apache.org/
* Fast - A single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients.
* Scalable - Kafka is designed to allow a single cluster to serve as the central data backbone for a large organization. It can be elastically and transparently expanded without downtime. Data streams are partitioned and spread over a cluster of machines to allow data streams larger than the capability of any single machine and to allow clusters of coordinated consumers.
* Durable - Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages without performance impact.
* Distributed by Design - Kafka has a modern cluster-centric design that offers strong durability and fault-tolerance guarantees.
Detail behind the Apache Cassandra 2.0 release and what is new in it, including Lightweight Transactions (compare and swap), eager retries, improved compaction, triggers (experimental), CQL cursors, and more!
Data Pipeline with Kafka. These slides include:
Kafka Introduction, Topics / Partitions, Producer / Consumer, Quick Start, Offset Monitoring, Example Code, Camus
Will it Scale? The Secrets behind Scaling Stream Processing Applications - Navina Ramesh
This talk was presented at the Apache Big Data 2016, North America conference that was held in Vancouver, CA (http://events.linuxfoundation.org/events/archive/2016/apache-big-data-north-america/program/schedule)
Developing Real-Time Data Pipelines with Apache Kafka - Joe Stein
Developing Real-Time Data Pipelines with Apache Kafka http://kafka.apache.org/ is an introduction for developers about why and how to use Apache Kafka. Apache Kafka is a publish-subscribe messaging system rethought as a distributed commit log. Kafka is designed to allow a single cluster to serve as the central data backbone. A single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients. It can be elastically and transparently expanded without downtime. Data streams are partitioned and spread over a cluster of machines to allow data streams larger than the capability of any single machine and to allow clusters of coordinated consumers. Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages. For the Spring user, Spring Integration Kafka and Spring XD provide integration with Apache Kafka.
Using schedulers like Marathon and Aurora helps to get your applications scheduled and executing on Mesos. In many cases it makes sense to build a framework and integrate directly. This talk will break down what is involved in building a framework, how to do this with examples, and why you would want to do this. Frameworks are not only for generally available software applications (like Kafka, HDFS, Spark, etc.) but can also be used for custom internal R&D-built software applications too.
Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively.
Set your Data in Motion with Confluent & Apache Kafka Tech Talk Series LME - confluent
Confluent Platform is supporting London Metal Exchange’s Kafka Centre of Excellence across a number of projects with the main objective to provide a reliable, resilient, scalable and overall efficient Kafka as a Service model to the teams across the entire London Metal Exchange estate.
Presentation from the Kafka meetup on 13-SEP-2013, including some notes to clarify some slides. Enjoy!
Avi Levi
123avi@gmail.com
https://www.linkedin.com/in/leviavi/
Apache Kafka - Scalable Message-Processing and more! - Guido Schmutz
Independent of the source of data, the integration of event streams into an enterprise architecture gets more and more important in the world of sensors, social media streams and the Internet of Things. Events have to be accepted quickly and reliably, and they have to be distributed and analysed, often with many consumers or systems interested in all or part of the events. How can we make sure that all these events are accepted and forwarded in an efficient and reliable way? This is where Apache Kafka comes into play: a distributed, highly scalable message broker, built for exchanging huge amounts of messages between a source and a target.
This session will start with an introduction to Apache Kafka and present its role in a modern data/information architecture and the advantages it brings to the table. Additionally the Kafka ecosystem will be covered, as well as the integration of Kafka into the Oracle stack, with products such as GoldenGate, Service Bus and Oracle Stream Analytics all being able to act as a Kafka consumer or producer.
Stream Processing with Apache Kafka and .NET - confluent
Presentation from South Bay.NET meetup on 3/30.
Speaker: Matt Howlett, Software Engineer at Confluent
Apache Kafka is a scalable streaming platform that forms a key part of the infrastructure at many companies including Uber, Netflix, Walmart, Airbnb, Goldman Sachs and LinkedIn. In this talk Matt will give a technical overview of Kafka, discuss some typical use cases (from surge pricing to fraud detection to web analytics) and show you how to use Kafka from within your C#/.NET applications.
Real-Time Log Analysis with Apache Mesos, Kafka and Cassandra - Joe Stein
Slides for the solution we developed for using Mesos, Docker, Kafka, Spark, Cassandra and Solr (DataStax Enterprise Edition), all developed in Go, for doing realtime log analysis at scale. Many organizations either need or want log analysis in real time, where you can see within a second what is happening within your entire infrastructure. Today, with the hardware available and the software systems we have in place, you can develop, build and use these solutions as a service.
Apache Kafka - Scalable Message Processing and more! - Guido Schmutz
In the world of sensors and social media streams, the integration and handling of high-volume event streams is more important than ever. Events have to be handled both efficiently and reliably, and often many consumers or systems are interested in all or part of the events. How do we make sure that all these events are accepted and forwarded in an efficient and reliable way? Apache Kafka, a distributed, highly scalable message broker built for exchanging huge amounts of messages between a source and a target, can be of great help in such a scenario.
This session introduces Apache Kafka and its place in a modern architecture, shows its integration with Oracle Stack and presents the Oracle Event Hub cloud service, the managed Kafka service.
Big Data Open Source Security LLC: Realtime log analysis with Mesos, Docker, ... - DataStax Academy
We will be talking about the solution we developed for using Mesos, Docker, Kafka, Spark, Cassandra and Solr (DataStax Enterprise Edition), all developed in Go, for doing realtime log analysis at scale. Many organizations either need or want log analysis in real time, where you can see within a second what is happening within your entire infrastructure. Today, with the hardware available and the software systems we have in place, you can develop, build and use these solutions as a service.
Apache Kafka - Scalable Message-Processing and more! - Guido Schmutz
Independent of the source of data, the integration of event streams into an enterprise architecture gets more and more important in the world of sensors, social media streams and the Internet of Things. Events have to be accepted quickly and reliably, and they have to be distributed and analysed, often with many consumers or systems interested in all or part of the events. How can we make sure that all these events are accepted and forwarded in an efficient and reliable way? This is where Apache Kafka comes into play: a distributed, highly scalable message broker, built for exchanging huge amounts of messages between a source and a target.
This session will start with an introduction to Apache Kafka and present its role in a modern data/information architecture and the advantages it brings to the table. Additionally the Kafka ecosystem will be covered, as well as the integration of Kafka into the Oracle stack, with products such as GoldenGate, Service Bus and Oracle Stream Analytics all being able to act as a Kafka consumer or producer.
DevOps Fest 2020. Serhii Kalinets. Building Data Streaming Platform with Apac... - DevOps_Fest
Apache Kafka is hyped right now. More and more companies are starting to use it as a message bus. But Kafka can do far more than just act as transport. Its real power and beauty are revealed when Kafka becomes the central nervous system of your architecture. It is fast, reliable, and flexible enough for a wide range of use cases.
In this talk Serhii will share his experience building a data streaming platform. We will talk about how Kafka works, how it should be configured, and what trouble you can get into when Kafka is used suboptimally.
How to use Kafka for storing intermediate data and as a pub/sub model, with a deep look at each of the Producer/Consumer/Topic configs and the internal workings of it.
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo... - Guido Schmutz
Independent of the source of data, the integration and analysis of event streams gets more important in the world of sensors, social media streams and Internet of Things. Events have to be accepted quickly and reliably, they have to be distributed and analyzed, often with many consumers or systems interested in all or part of the events. In this session we compare two popular Streaming Analytics solutions: Spark Streaming and Kafka Streams.
Spark is a fast and general engine for large-scale data processing and has been designed to provide a more efficient alternative to Hadoop MapReduce. Spark Streaming brings Spark's language-integrated API to stream processing, letting you write streaming applications the same way you write batch jobs. It supports both Java and Scala.
Kafka Streams is the stream processing solution which is part of Kafka. It is provided as a Java library and by that can be easily integrated with any Java application.
Intro to Apache Kafka I gave at the Big Data Meetup in Geneva in June 2016. Covers the basics and gets into some more advanced topics. Includes demo and source code to write clients and unit tests in Java (GitHub repo on the last slides).
Similar to Developing Realtime Data Pipelines With Apache Kafka (20)
SMACK Stack 1.0 has been Spark, Mesos, Akka, Cassandra and Kafka working as different cohesive systems delivering different solutions for different use cases. Haven't heard about it before? Oh man! Where have you been? https://www.google.com/search?q=smack+stack+1.0
With SMACK Stack 1.1 we go a step further - Streaming, Mesos, Analytics, Cassandra and Kafka - and Joe Stein will walk through in detail some of the different viable options for streaming and analytics with Mesos, Kafka and Cassandra.
Get started with Developing Frameworks in Go on Apache Mesos - Joe Stein
Apache Mesos provides a platform for building distributed systems. Mesos is built using the same principles as the Linux kernel, only at a different level of abstraction. The Mesos kernel runs on every machine and provides applications (e.g., Hadoop, Spark, Kafka, Elastic Search) with APIs for resource management and scheduling across entire datacenter and cloud environments. How to use that platform and what to make of it becomes a complex task, requiring not only understanding of where the system has been but also where it is going. Using schedulers like Marathon and Aurora helps to get your applications scheduled and executing on Mesos. In many cases it makes sense to build a framework and integrate directly. This talk will break down what is involved in building a framework in Go, how to do this with examples, and why you would want to do this. Frameworks are not only for generally available software applications (like Kafka, HDFS, Spark, etc.) but should also be used for custom internal R&D-built software applications too.
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo - Joe Stein
In this talk we will walk through how Apache Kafka and Apache Accumulo can be used together to orchestrate a de-coupled, real-time distributed and reactive request/response system at massive scale. Multiple data pipelines can perform complex operations for each message in parallel at high volumes with low latencies. The final result will be inline with the initiating call. The architecture gains are immense. They allow for the requesting system to receive a response without the need for direct integration with the data pipeline(s) that messages must go through. By utilizing Apache Kafka and Apache Accumulo, these gains sustain at scale and allow for complex operations of different messages to be applied to each response in real-time.
2. Joe Stein
● Developer, Architect & Technologist
● Founder & Principal Consultant => Big Data Open Source Security LLC - http://stealth.ly
Big Data Open Source Security LLC provides professional services and product solutions for the collection, storage, transfer, real-time analytics, batch processing and reporting for complex data streams, data sets and distributed systems. BDOSS is all about the "glue" and helping companies to not only figure out what Big Data Infrastructure Components to use but also how to change their existing (or build new) systems to work with them.
● Apache Kafka Committer & PMC member
● Blog & Podcast - http://allthingshadoop.com
● Twitter @allthingshadoop
3. Overview
● What is Apache Kafka?
○ Data pipelines
○ Architecture
● How does Apache Kafka work?
○ Brokers
○ Producers
○ Consumers
○ Topics
○ Partitions
● How to use Apache Kafka?
○ Existing Integrations
○ Client Libraries
○ Out of the box API
○ Tools
16. How does Kafka do all of this?
● Producers - ** push **
○ Batching
○ Compression
○ Sync (Ack), Async (auto batch)
○ Replication
○ Sequential writes, guaranteed ordering within each partition
● Consumers - ** pull **
○ No state held by broker
○ Consumers control reading from the stream
● Zero Copy for producers and consumers to and from the broker http://kafka.apache.org/documentation.html#maximizingefficiency
● Messages stay on disk when consumed; deleted on TTL, with compaction available in 0.8.1 https://kafka.apache.org/documentation.html#compaction
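To make the push/pull bullets concrete, here is a minimal sketch (not from the deck) of how those knobs map onto the 0.8-era Scala producer configuration; the broker list and topic name are placeholders:
import java.util.Properties
import kafka.producer.{KeyedMessage, Producer, ProducerConfig}
val props = new Properties()
props.put("metadata.broker.list", "broker1:9092,broker2:9092") // bootstrap brokers (placeholder)
props.put("producer.type", "async") // async enables auto batching; "sync" blocks per request
props.put("batch.num.messages", "200") // batch size used in async mode
props.put("compression.codec", "gzip") // compress batches on the wire
props.put("request.required.acks", "-1") // wait for all in-sync replicas
val producer = new Producer[AnyRef, AnyRef](new ProducerConfig(props))
producer.send(new KeyedMessage[AnyRef, AnyRef]("my-topic", "hello".getBytes("UTF8")))
producer.close()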
22. Client Libraries
Community Clients https://cwiki.apache.org/confluence/display/KAFKA/Clients
● Python - Pure Python implementation with full protocol support. Consumer and Producer implementations included, GZIP and Snappy compression supported.
● C - High performance C library with full protocol support
● C++ - Native C++ library with protocol support for Metadata, Produce, Fetch, and Offset.
● Go (aka golang) - Pure Go implementation with full protocol support. Consumer and Producer implementations included, GZIP and Snappy compression supported.
● Ruby - Pure Ruby, Consumer and Producer implementations included, GZIP and Snappy compression supported. Ruby 1.9.3 and up (CI runs MRI 2.
● Clojure - Clojure DSL for the Kafka API
● JavaScript (NodeJS) - NodeJS client in a pure JavaScript implementation
Wire Protocol Developers Guide
https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
23. Really Quick Start
1) Install Vagrant http://www.vagrantup.com/
2) Install Virtual Box https://www.virtualbox.org/
3) git clone https://github.com/stealthly/scala-kafka
4) cd scala-kafka
5) vagrant up
Zookeeper will be running on 192.168.86.5
BrokerOne will be running on 192.168.86.10
All the tests in ./src/test/scala/* should pass (which is also /vagrant/src/test/scala/* in the vm)
6) ./gradlew test
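For a sense of what those tests exercise, here is a minimal round-trip sketch against the Vagrant VMs above, assuming the KafkaProducer and KafkaConsumer wrappers shown on the following slides plus send/read helpers like the repo's (the helper names and constructor defaults are assumptions, so check the repo before relying on them):
// produce a unique message to BrokerOne, then read it back via Zookeeper
val testMessage = java.util.UUID.randomUUID().toString
val producer = KafkaProducer("test-topic", "192.168.86.10:9092") // defaults: sync, gzip, acks=-1
producer.send(testMessage)
val consumer = new KafkaConsumer("test-topic", "group-1", "192.168.86.5:2181")
consumer.read { bytes => // blocks, invoking the callback once per message
  assert(new String(bytes, "UTF8") == testMessage)
}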
25. Producers
https://github.com/stealthly/scala-kafka/blob/master/src/main/scala/KafkaProducer.scala
case class KafkaProducer(
topic: String,
brokerList: String,
/** brokerList - This is for bootstrapping and the producer will only use it for
getting metadata (topics, partitions and replicas). The socket connections for
sending the actual data will be established based on the broker information
returned in the metadata. The format is host1:port1,host2:port2, and the list can
be a subset of brokers or a VIP pointing to a subset of brokers.
*/
26. Producer
clientId: String = UUID.randomUUID().toString,
/** clientId - The client id is a user-specified string sent in each request to help
trace calls. It should logically identify the application making the request. */
synchronously: Boolean = true,
/** synchronously - This parameter specifies whether the messages are sent
asynchronously in a background thread. Valid values are false for
asynchronous send and true for synchronous send. By setting the producer to
async we allow batching together of requests (which is great for throughput) but
open the possibility of a failure of the client machine dropping unsent data.*/
27. Producer
compress: Boolean = true,
/** compress -This parameter allows you to specify the compression codec for
all data generated by this producer. When set to true gzip is used. To override
and use snappy you need to implement that as the default codec for
compression using SnappyCompressionCodec.codec instead of
DefaultCompressionCodec.codec below. */
batchSize: Integer = 200,
/** batchSize -The number of messages to send in one batch when using
async mode. The producer will wait until either this number of messages are
ready to send or queue.buffer.max.ms is reached.*/
28. Producer
messageSendMaxRetries: Integer = 3,
/** messageSendMaxRetries - This property will cause the producer to
automatically retry a failed send request. This property specifies the number of
retries when such failures occur. Note that setting a non-zero value here can
lead to duplicates in the case of network errors that cause a message to be
sent but the acknowledgement to be lost.*/
29. Producer
requestRequiredAcks: Integer = -1
/** requestRequiredAcks
0) which means that the producer never waits for an acknowledgement from the broker
(the same behavior as 0.7). This option provides the lowest latency but the weakest
durability guarantees (some data will be lost when a server fails).
1) which means that the producer gets an acknowledgement after the leader replica has received the data. This option provides better durability as the client waits until the server acknowledges the request as successful (only messages that were written to the now-dead leader but not yet replicated will be lost).
-1) which means that the producer gets an acknowledgement after all in-sync replicas
have received the data. This option provides the best durability, we guarantee that no
messages will be lost as long as at least one in sync replica remains.*/
30. Producer
val props = new Properties()
val codec = if(compress) DefaultCompressionCodec.codec else NoCompressionCodec.codec
props.put("compression.codec", codec.toString)
http://kafka.apache.org/documentation.html#producerconfigs
props.put("request.required.acks", requestRequiredAcks.toString)
val producer = new Producer[AnyRef, AnyRef](new ProducerConfig(props))
def kafkaMesssage(message: Array[Byte], partition: Array[Byte]): KeyedMessage[AnyRef, AnyRef] = {
  if (partition == null) {
    new KeyedMessage(topic, message)
  } else {
    new KeyedMessage(topic, message, partition)
  }
}
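The slide builds the producer and the kafkaMesssage helper but stops short of the actual send; a minimal sketch of the missing step (the wrapper method name is an assumption, not from the slide) could look like:
// wire the pieces above together; in sync mode this blocks until the
// broker acks according to requestRequiredAcks
def send(message: Array[Byte], partition: Array[Byte] = null): Unit = {
  producer.send(kafkaMesssage(message, partition))
}
send("hello kafka".getBytes("UTF8"))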
32. High Level Consumer
https://github.com/stealthly/scala-kafka/blob/master/src/main/scala/KafkaConsumer.scala
class KafkaConsumer(
topic: String,
/** topic - The high-level API hides the details of brokers from the consumer
and allows consuming off the cluster of machines without concern for the
underlying topology. It also maintains the state of what has been consumed.
The high-level API also provides the ability to subscribe to topics that match a
filter expression (i.e., either a whitelist or a blacklist regular expression).*/
33. High Level Consumer
groupId: String,
/** groupId - A string that uniquely identifies the group of consumer processes
to which this consumer belongs. By setting the same group id multiple
processes indicate that they are all part of the same consumer group.*/
zookeeperConnect: String,
/** zookeeperConnect - Specifies the zookeeper connection string in the form
hostname:port where host and port are the host and port of a zookeeper server.
To allow connecting through other zookeeper nodes when that zookeeper
machine is down you can also specify multiple hosts in the form hostname1:
port1,hostname2:port2,hostname3:port3. The server may also have a
zookeeper chroot path as part of its zookeeper connection string which puts its
data under some path in the global zookeeper namespace. */
34. High Level Consumer
val props = new Properties()
props.put("group.id", groupId)
props.put("zookeeper.connect", zookeeperConnect)
props.put("auto.offset.reset", if(readFromStartOfStream) "smallest" else "largest")
val config = new ConsumerConfig(props)
val connector = Consumer.create(config)
val filterSpec = new Whitelist(topic)
val stream = connector.createMessageStreamsByFilter(filterSpec, 1, new DefaultDecoder(), new DefaultDecoder()).get(0)
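The returned KafkaStream is iterable, so consuming is just a loop over it; a minimal sketch (the read helper name is an assumption) of draining the stream created above:
// blocks waiting for messages; call connector.shutdown() to stop consuming
def read(write: (Array[Byte]) => Unit): Unit = {
  for (messageAndTopic <- stream) {
    write(messageAndTopic.message)
  }
}
read { bytes => println(new String(bytes, "UTF8")) }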