Although Cassandra is well known for its ability to scale and handle heavy load, the team at Abc Arbitrage chose instead to showcase its qualities as a distributed system.
In this presentation, Kévin Lovato, Software Engineer, focuses on the Directory of their home-made Service Bus, which relies on Cassandra to behave as a full-fledged distributed system.
A short introduction to the unexpected problems we encountered (and the solutions we designed) during the last two years of running our home-made service bus in production.
Building your own Distributed System The easy way - Cassandra Summit EU 2014
1.
2. Building Your Own Distributed System: The Easy Way
Kévin Lovato - @alprema
3. What this presentation will NOT talk about
• Gazillions of inserts per second
• Hundreds of nodes
• Migrations from old technology to C* that now go 100 times faster
4. What this presentation will talk about
• Servers that synchronize their state
• Out of order messages
• CQL Schema design
• Time measurement madness
6. • Hedge fund specialized in algorithmic trading
• ~80 employees
• Our C* usage:
  • Historical data (6+ TB)
  • Time series (Metrics)
  • Home-made Service Bus (Zebus)
7. Service Bus 101
• Network abstraction layer
• Allows communication between services (SOA)
• Communication is enabled using business-level messages (events)
• Usually relies on a broker
8. Zebus 101
• Developed in .Net
• P2P
• Lightweight
• CQRS oriented
• 1+ year of production experience
• ~150M messages / day
10. Terminology
• Peer: A program connected to the Bus
• Subscription: A message type a Peer is interested in
• Directory server: A Peer that knows all the Peers and their Subscriptions
11. [Diagram: Peer 1, Peer 2 and Peer 3 with Directory 1 and Directory 2]
Peer 1 is not connected and needs to register on the bus
17-19. The Directory servers must be identical (no master)
A peer can contact any of the Directory servers at any time
Directory servers can be updated/restarted at any time
Peers have to be able to add Subscriptions one at a time if needed
24. • Allows offloading state synchronization to Cassandra (Quorum everywhere)
• Makes restart / crash recovery easy
• Only « business » code in the Directory Server
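A minimal sketch of what « Quorum everywhere » looks like with the DataStax C# driver (contact point and keyspace names are illustrative, not taken from Zebus):

    using Cassandra;

    // Make QUORUM the default consistency level for every query, so a write
    // acknowledged through one Directory server is visible to reads made
    // through any other Directory server.
    var cluster = Cluster.Builder()
        .AddContactPoint("cassandra-node-1")
        .WithQueryOptions(new QueryOptions().SetConsistencyLevel(ConsistencyLevel.Quorum))
        .Build();
    var session = cluster.Connect("directory");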
26. Timestamps: naive implementation (server side)
[Diagram: Peer 1 with Directory 1 and Directory 2]
Peer 1 is already registered on the Bus and will need to do multiple Subscription updates
43. A Peer is already registered on the bus, and has subscribed to one event type
Initial subscriptions:
Peer ID | MessageType | Sub. Info
Peer.1 | CoolEvent | { misc. Info }
44. It now needs to add a new subscription
45. It will send all its current subscriptions + the new one
Peer ID | MessageType | Sub. Info
Peer.1 | CoolEvent | { misc. Info }
Peer.1 | OtherEvent (new) | { misc. Info }
46. Now imagine that the peer adds 10 000 subscriptions
47. … one at a time
48. Peer ID | MessageType | Sub. Info
Peer.1 | CoolEvent | { misc. Info }
Peer.1 | OtherEvent (new) | { misc. Info }
… 10 000 other events …
Peer.1 | NthEvent | { misc. Info }
(10 000x transfers)
49. Solution: Transfer subscriptions by message type, as sketched below
50. Peer ID | MessageType | Sub. Info
Peer.1 | NewEvent (1st) | { misc. Info }
51. Peer ID | MessageType | Sub. Info
Peer.1 | NewEvent (2nd) | { misc. Info }
And so on…
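A hypothetical shape for that per-type update message (illustrative, not the actual Zebus contract): the peer ships only the rows for the message type that changed, so adding the 10 001st subscription costs one small message instead of a full re-transfer.

    // Sketch of a per-message-type subscription update (hypothetical names):
    public class SubscriptionsForTypeUpdated
    {
        public string PeerId;
        public string MessageTypeId;
        public SubscriptionInfo[] Subscriptions; // empty = no more subscriptions for this type
        public DateTime TimestampUtc;            // client timestamp, used for conflict resolution
    }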
54. • We want to only do upserts (no read-before-write)
• We want Cassandra to use client timestamps to resolve out-of-order updates
• Subscriptions have to be updatable one by one
55. One subscription per row
Peer ID | MessageType | Subscription Info
Peer.18 | CoolEvent | { misc. Info }
• Primary Key (Peer Id, MessageType)
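A CQL sketch of this layout (table and column names are illustrative). The INSERT is a pure upsert and carries a client-provided timestamp, so Cassandra can resolve out-of-order updates per (peer, message type):

    CREATE TABLE subscriptions (
        peer_id text,
        message_type text,
        subscription_info blob,
        PRIMARY KEY (peer_id, message_type)
    );

    -- No read-before-write: writing the same primary key simply overwrites,
    -- and USING TIMESTAMP (in microseconds) decides which update wins.
    INSERT INTO subscriptions (peer_id, message_type, subscription_info)
    VALUES ('Peer.18', 'CoolEvent', 0x00)
    USING TIMESTAMP 1414656000000000;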
56. Peer 1 and Peer 2 need to register on the Bus
57. Peer 1 registers with 2 Subscriptions:
Peer ID | MessageType | Sub. Info
Peer.1 | CoolEvent | { misc. Info }
Peer.1 | OtherEvent | { misc. Info }
58. The Directory starts to write both rows to C*
59. Peer 2 registers while the Directory is still writing
60. Since the insertion was not over, Peer 2 gets an incomplete state:
Peer ID | MessageType | Sub. Info
Peer.1 | CoolEvent | { misc. Info }
61. All subscriptions in one row
Peer ID | All Subscriptions Blob
Peer.18 | { blob }
• Primary Key (Peer Id)
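A CQL sketch of this variant (illustrative names): a single-row write is atomic, so a reader can no longer observe half of a registration; but, as the next slides show, two concurrent read-modify-write cycles on the same blob can still overwrite each other.

    CREATE TABLE peers (
        peer_id text PRIMARY KEY,
        subscriptions blob   -- the whole subscription list, serialized by the client
    );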
62. Peer 1 is already registered on the Bus and needs to add two Subscriptions
65. A delay (again!) slows down Directory 1, causing both Subscriptions to be added simultaneously
66. Both Directories start from the same state (no subscriptions): Directory 1 gets the state to add Subscription 1, Directory 2 gets the state to add Subscription 2
67. They both store the updated state to C*: Directory 1 stores { Subscription 1 }, Directory 2 stores { Subscription 2 }
68. Stored: either Subscription 1 or Subscription 2, depending on which write was the slowest; each Directory stored only its own new subscription
69. Solution: Compromise
• We split subscriptions into Static and Dynamic subscriptions
• Static subscriptions cannot be updated one-by-one
• The Dynamic subscriptions list cannot be handled atomically
• Each type has its own Column Family
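A sketch of the two resulting column families (illustrative names): the static list keeps the atomicity of the single-blob design, while dynamic subscriptions keep the one-by-one updatability of the row-per-subscription design.

    -- Static subscriptions: written once at registration, one atomic blob per peer
    CREATE TABLE static_subscriptions (
        peer_id text PRIMARY KEY,
        subscriptions blob
    );

    -- Dynamic subscriptions: one row per (peer, message type), updatable individually
    CREATE TABLE dynamic_subscriptions (
        peer_id text,
        message_type text,
        subscription_info blob,
        PRIMARY KEY (peer_id, message_type)
    );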
73. DateTime.Now
• Calling DateTime.Now twice in a row can (and will) return the same value
• Its resolution is around 10 ms
• We had to create a unique timestamp provider (add 1 tick when called in the same « time bucket »)
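A minimal sketch of such a provider (illustrative, not the Zebus implementation): when the clock has not moved past the last value handed out, hand out the last value plus one tick.

    using System;
    using System.Threading;

    public static class UniqueTimestampProvider
    {
        private static long _lastTicks;

        // Returns a strictly increasing DateTime even when called much faster
        // than the ~10 ms resolution of the system clock.
        public static DateTime NextUtc()
        {
            while (true)
            {
                long now = DateTime.UtcNow.Ticks;
                long last = Interlocked.Read(ref _lastTicks);
                long next = now > last ? now : last + 1; // same « time bucket »: add 1 tick
                if (Interlocked.CompareExchange(ref _lastTicks, next, last) == last)
                    return new DateTime(next, DateTimeKind.Utc);
            }
        }
    }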
74. Cassandra timestamp
• .Net's DateTime.Ticks is more precise than Cassandra's timestamps (100 ns vs. 1 μs)
• Our custom time provider ensured uniqueness by adding 1 tick at a time, which was lost in translation
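The loss is easy to see: 1 tick is 100 ns while Cassandra timestamps are whole microseconds, so utc.Ticks / 10 maps ten distinct tick values onto one timestamp and collapses the +1-tick increments. One possible fix, sketched here as an assumption rather than the actual Zebus code (same usings as the previous sketch), is to enforce uniqueness directly at microsecond granularity:

    public static class UniqueMicrosecondProvider
    {
        private static long _lastMicros;

        // Strictly increasing value in whole microseconds, safe to hand to
        // Cassandra as a client timestamp without losing increments.
        public static long NextMicros()
        {
            while (true)
            {
                long now = DateTime.UtcNow.Ticks / 10; // ticks (100 ns) -> µs
                long last = Interlocked.Read(ref _lastMicros);
                long next = now > last ? now : last + 1; // collision: add 1 µs
                if (Interlocked.CompareExchange(ref _lastMicros, next, last) == last)
                    return next; // usable as a CQL USING TIMESTAMP value
            }
        }
    }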
75. « UselessKey »
• The Directory CF is really small and needs to be retrieved entirely and frequently
• We used a « bool UselessKey » PartitionKey to force sequential storage and squeeze out the last bits of speed we needed
76. « UselessKey »
UselessKey | Peer ID | MessageType | Subscription info
false | Peer.18 | UserCreated | { misc. Info }
• Primary Key (UselessKey, Peer Id, MessageType)
• You should bench (after a flush) with your real data
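A CQL sketch of the trick (illustrative names): the constant partition key puts the whole directory in a single partition, so it can be fetched with one sequential query. This deliberately gives up distribution of the data across nodes, which is only acceptable because the CF is tiny.

    CREATE TABLE subscriptions_flat (
        useless_key boolean,
        peer_id text,
        message_type text,
        subscription_info blob,
        PRIMARY KEY (useless_key, peer_id, message_type)
    );

    -- The whole directory in one query, read sequentially from one partition:
    SELECT * FROM subscriptions_flat WHERE useless_key = false;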
78-81. When you have multiple servers sharing a state, Cassandra can save you some headaches
The schema design is very critical; think it through thoroughly and make sure you understand what is atomic and what is not
Client-provided timestamps can be very useful, but be sure to generate unique timestamps
If you are not using Java, be well aware of data type differences between your language and Java
82. Want to see the code ?
www.github.com/Abc-Arbitrage
83. Want to see more code ?
jobs@abc-arbitrage.com