This document summarizes the design iterations of an automated home valuation system at Redfin that uses Redis and Kafka. It describes moving the system from AWS to Kafka and Samza for distributed streaming. Key points include prefetching estimate values for nearby properties to warm the cache, which increased the cache hit rate from 55% to 80%. Monitoring and decoupling of systems were identified as areas for further improvement. The system helped reduce response times and API calls for home valuations.
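The cache-warming idea above can be sketched in a few lines. This is a hypothetical illustration, not Redfin's implementation: `EstimateCache` stands in for Redis, and `nearby` is a placeholder for a real geospatial neighbor lookup.

```python
# Hypothetical sketch of warming a valuation cache by prefetching
# estimates for nearby properties. All names are illustrative.

class EstimateCache:
    def __init__(self, compute_estimate, nearby, prefetch_count=5):
        self._cache = {}                  # stands in for Redis
        self._compute = compute_estimate  # expensive valuation call
        self._nearby = nearby             # property_id -> neighbor ids
        self._prefetch_count = prefetch_count
        self.hits = 0
        self.misses = 0

    def get(self, property_id):
        if property_id in self._cache:
            self.hits += 1
            return self._cache[property_id]
        self.misses += 1
        value = self._cache[property_id] = self._compute(property_id)
        # Warm the cache: a request for one home is often followed by
        # requests for its neighbors, so estimate them eagerly.
        for neighbor in self._nearby(property_id)[: self._prefetch_count]:
            self._cache.setdefault(neighbor, self._compute(neighbor))
        return value


# Toy usage: consecutive ids are "neighbors".
cache = EstimateCache(
    compute_estimate=lambda pid: pid * 1000,
    nearby=lambda pid: [pid - 1, pid + 1],
)
cache.get(42)  # miss; prefetches 41 and 43
cache.get(43)  # hit, thanks to prefetching
print(cache.hits, cache.misses)  # 1 1
```

The hit-rate improvement reported above comes from exactly this effect: requests that would have been misses become hits because a neighboring lookup already populated the cache.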
High cardinality time series search: A new level of scale - Data Day Texas 2016 | Eric Sammer
Modern search systems provide incredible feature sets, developer-friendly APIs, and low latency indexing and query response. By some measures, these systems operate "at scale," but rarely is that quantified. Customers of Rocana typically look to push ingest rates in excess of 1 million events per second, retaining years of data online for query, with the expectation of sub-second response times for any reasonably sized subset of data.
We quickly found that the tradeoffs made by general purpose search systems, while right for common use cases, were less appropriate for these high cardinality, large scale use cases.
This session details the architecture, tradeoffs, and interesting implementation decisions made in building a new time series optimized distributed search system using Apache Lucene, Kafka, and HDFS. Data ingestion and durability, index and metadata organization, storage, query scheduling and optimization, and failure modes will be covered. Finally, a summary of the results achieved will be shown.
Apache Kafka, Apache Cassandra and Kubernetes are open source big data technologies that enable applications and business operations to scale massively and rapidly. While Kafka and Cassandra underpin the data layer of the stack, providing the capability to stream, disseminate, store and retrieve data at very low latency, Kubernetes is a container orchestration technology that helps with automated application deployment and scaling of application clusters. In this presentation, we will reveal how we architected a massive-scale deployment of a streaming data pipeline with Kafka and Cassandra to support an example anomaly detection application running on a Kubernetes cluster and generating and processing massive amounts of events. Anomaly detection is a method used to detect unusual events in an event stream. It is widely used in a range of applications such as financial fraud detection, security, threat detection, website user analytics, sensors, IoT, system health monitoring, etc. When such applications operate at massive scale, generating millions or billions of events, they impose significant computational, performance and scalability challenges on anomaly detection algorithms and data layer technologies. We will demonstrate the scalability, performance and cost effectiveness of Apache Kafka, Cassandra and Kubernetes, with results from our experiments allowing the anomaly detection application to scale to 19 billion anomaly checks per day.
Kafka is a high-throughput, fault-tolerant, scalable platform for building high-volume near-real-time data pipelines. This presentation is about tuning Kafka pipelines for high-performance.
Select configuration parameters and deployment topologies essential to achieving higher throughput and low latency across the pipeline are discussed, along with lessons learned in troubleshooting and optimizing a truly global data pipeline that replicates 100 GB of data in under 25 minutes.
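As a flavor of the kind of knobs such tuning involves, the fragment below lists producer settings commonly examined when trading throughput against latency. These are illustrative starting points, not the tuned values from this talk:

```properties
# Larger batches amortize per-request overhead.
batch.size=262144
# Wait briefly to fill batches (adds a little latency).
linger.ms=10
# Cheap compression shrinks network transfer.
compression.type=lz4
# Leader-only acks trade durability for lower latency.
acks=1
# Room for in-flight batches under bursty load.
buffer.memory=67108864
```

The right values depend on message size, network topology, and durability requirements, which is why the talk emphasizes measuring a real pipeline rather than copying defaults.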
Streaming in Practice - Putting Apache Kafka in Production | confluent
This presentation focuses on how to integrate all these components into an enterprise environment and what things you need to consider as you move into production.
We will touch on the following topics:
- Patterns for integrating with existing data systems and applications
- Metadata management at enterprise scale
- Tradeoffs in performance, cost, availability and fault tolerance
- Choosing which cross-datacenter replication patterns fit with your application
- Considerations for operating Kafka-based data pipelines in production
Tales from the four-comma club: Managing Kafka as a service at Salesforce | L... | HostedbyConfluent
Apache Kafka is a key part of the Big Data infrastructure at Salesforce, enabling publish/subscribe and data transport in near real-time at enterprise scale handling trillions of messages per day. In this session, hear from the teams at Salesforce that manage Kafka as a service, running over a hundred clusters across on-premise and public cloud environments with over 99.9% availability. Hear about best practices and innovations, including:
* How to manage multi-tenant clusters in a hybrid environment
* High volume data pipelines with Mirus replicating data to Kafka and blob storage
* Kafka Fault Injection Framework built on Trogdor and Kibosh
* Automated recovery without data loss
* Using Envoy as an SNI-routing Kafka gateway
We hope the audience will have practical takeaways for building, deploying, operating, and managing Kafka at scale in the enterprise.
Strategies and techniques for optimizing Kafka brokers and producers to minimize data loss under huge traffic volume, limited configuration options, and a less-than-ideal, constantly changing environment, while balancing against cost.
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with... | confluent
By Jun Rao
From the Bay Area Apache Kafka September 2016 Meetup.
Abstract: To manage the ever-increasing volume and velocity of data within your company you have successfully made the transition from single machines and one-off solutions to large, distributed stream infrastructures in your data center powered by Apache Kafka. But what needs to be done if one data center is not enough? In this session we describe building resilient data pipelines with Apache Kafka that span multiple data centers and points of presence. We provide an overview of best practices and common patterns while covering key areas such as architecture guidelines, data replication and mirroring as well as disaster scenarios and failure handling.
Apache Kafka lies at the heart of the largest data pipelines, handling trillions of messages and petabytes of data every day. Learn the right approach for getting the most out of Kafka from the experts at LinkedIn and Confluent. Todd Palino and Gwen Shapira demonstrate how to monitor, optimize, and troubleshoot performance of your data pipelines—from producer to consumer, development to production—as they explore some of the common problems that Kafka developers and administrators encounter when they take Apache Kafka from a proof of concept to production usage. Too often, systems are overprovisioned and underutilized and still have trouble meeting reasonable performance agreements.
Topics include:
- What latencies and throughputs you should expect from Kafka
- How to select hardware and size components
- What you should be monitoring
- Design patterns and antipatterns for client applications
- How to go about diagnosing performance bottlenecks
- Which configurations to examine and which ones to avoid
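One concrete instance of the monitoring theme above is consumer lag: how far a consumer group trails the end of each partition. The sketch below computes it from two offset maps; in practice both would come from Kafka's admin APIs, and the dicts here are hypothetical stand-ins.

```python
# Consumer lag per partition, from end offsets and committed offsets.
# The literal data below is illustrative, not from a real cluster.

def consumer_lag(end_offsets, committed_offsets):
    """Per-partition lag, plus the total across partitions."""
    lag = {
        tp: end_offsets[tp] - committed_offsets.get(tp, 0)
        for tp in end_offsets
    }
    return lag, sum(lag.values())

end = {("clicks", 0): 1_500, ("clicks", 1): 900}
committed = {("clicks", 0): 1_200, ("clicks", 1): 900}
per_partition, total = consumer_lag(end, committed)
print(per_partition[("clicks", 0)], total)  # 300 300
```

A steadily growing total is the classic sign that consumers cannot keep up with producers, which is why lag sits near the top of most Kafka monitoring checklists.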
Whether you are developing a greenfield data project or migrating a legacy system, there are many critical design decisions to be made. Often, it is advantageous to consider not only immediate requirements, but also the future requirements and technologies you may want to support. Your project may start out supporting batch analytics with the vision of adding real-time support. Or your data pipeline may feed data to one technology today, but tomorrow an entirely new system needs to be integrated. Apache Kafka can help decouple these decisions and provide a flexible core for your data architecture. This talk will show how building Kafka into your pipeline can provide the flexibility to experiment, evolve and grow. It will also cover a brief overview of Kafka, its architecture, and terminology.
Building an Event-oriented Data Platform with Kafka | Eric Sammer, confluent
While we frequently talk about how to build interesting products on top of machine and event data, the reality is that collecting, organizing, providing access to, and managing this data is where most people get stuck. Many organizations understand the use cases around their data – fraud detection, quality of service and technical operations, user behavior analysis, for example – but are not necessarily data infrastructure experts. In this session, we’ll follow the flow of data through an end to end system built to handle tens of terabytes an hour of event-oriented data, providing real time streaming, in-memory, SQL, and batch access to this data. We’ll go into detail on how open source systems such as Hadoop, Kafka, Solr, and Impala/Hive are actually stitched together; describe how and where to perform data transformation and aggregation; provide a simple and pragmatic way of managing event metadata; and talk about how applications built on top of this platform get access to data and extend its functionality.
Attendees will leave this session knowing not just which open source projects go into a system such as this, but how they work together, what tradeoffs and decisions need to be addressed, and how to present a single general purpose data platform to multiple applications. This session should be attended by data infrastructure engineers and architects planning, building, or maintaining similar systems.
Netflix recently changed its data pipeline architecture to use Kafka as the gateway for data collection for all applications, processing hundreds of billions of messages daily. This session will discuss the motivation for moving to Kafka, the architecture, and the improvements we have added to make Kafka work in AWS. We will also share lessons learned and future plans.
Jay Kreps is a Principal Staff Engineer at LinkedIn where he is the lead architect for online data infrastructure. He is among the original authors of several open source projects including a distributed key-value store called Project Voldemort, a messaging system called Kafka, and a stream processing system called Samza. This talk gives an introduction to Apache Kafka, a distributed messaging system. It will cover both how Kafka works, as well as how it is used at LinkedIn for log aggregation, messaging, ETL, and real-time stream processing.
An in-depth guide to VDI infrastructure delivering the best desktop/BYOD experience for your developers and other external knowledge workers. We will compare Amazon Workspaces with classic approaches to solving this challenge, and share best-practices for securing and managing a real-world production environment.
Speaker: Brett Looney, Solutions Architect, Amazon Web Services
AWS Meetup - Nordstrom Data Lab and the AWS Cloud | NordstromDataLab
The Nordstrom Data Lab is building out an API that powers product recommendations for our customers online and beyond. Recommendo, our flagship product, was built from the ground up using Node.js and AWS in a little over three months. Since launch in November 2013 we've served up over three billion recommendations and survived Black Friday and Cyber Monday without breaking a sweat. We'll be sharing our learnings from building and operating a high-traffic API on the AWS platform, focusing on Node.js, Elastic Beanstalk, and DynamoDB. Additionally, we'll discuss some of the cultural challenges and opportunities presented when adopting the public cloud at a large corporate IT organization. In short, we believe there are tremendous advantages to be had for enterprises willing to make the leap to the cloud.
AWS re:Invent 2016: The State of Serverless Computing (SVR311) | Amazon Web Services
Join us to learn about the state of serverless computing from Dr. Tim Wagner, General Manager of AWS Lambda. Dr. Wagner discusses the latest developments from AWS Lambda and the serverless computing ecosystem. He talks about how serverless computing is becoming a core component in how companies build and run their applications and services, and he also discusses how serverless computing will continue to evolve.
Stephen Liedig: Building Serverless Backends with AWS Lambda and API Gateway | Steve Androulakis
Stephen Liedig (Amazon Web Services) is a Public Sector Solutions Architect at AWS working closely with local and state governments, educational institutions, and non-profit organisations across Australia and New Zealand to design, and deliver, highly secure, scalable, reliable and fault-tolerant architectures in the AWS Cloud while sharing best practices and current trends, with a specific focus on DevOps, messaging, and serverless technologies.
SRV418 Deep Dive on Accelerating Content, APIs, and Applications with Amazon ... | Amazon Web Services
Attend this session to dive deeper into AWS's content delivery service, Amazon CloudFront. Learn how you can use CloudFront to accelerate the delivery of your APIs or applications, including content that cannot be cached, to global clients. We'll also walk you through how you can use Lambda@Edge, which gives you the ability to execute custom code inline with your CloudFront events to customize applications. With Lambda@Edge, you can now generate custom responses right at the edge, allowing you to leverage CloudFront to reduce end-to-end latency and more efficiently filter traffic to your back-end origin servers. We'll walk through Lambda@Edge use cases and a demo to show how this works.
How Disney Streaming Services and TrueCar Deliver Web Applications for Scale,... | Amazon Web Services
Disney Streaming Services is a Direct-To-Consumer (DTC) video streaming service that is part of Disney, and TrueCar is a digital automotive marketplace. You will learn about their different perspectives on how they built global applications for scale, performance, and availability. TrueCar will share how they moved internet operations off premises, from their datacenters to the cloud in AWS. Disney Streaming Services will dive deep into how they are leveraging Amazon CloudFront and Lambda@Edge to enable their content APIs to perform at scale through dynamic origin selection, latency reduction through edge caching, and guaranteed high availability.
AWS re:Invent 2016: Born in the Cloud; Built Like a Startup (ARC205) | Amazon Web Services
This presentation provides a comparison of three modern architecture patterns that startups are building their business around. It includes a realistic analysis of the cost, team management, and security implications of each approach. It covers Elastic Beanstalk, Amazon ECS, Docker, Amazon API Gateway, AWS Lambda, Amazon DynamoDB, and Amazon CloudFront.
General discussions
Why cloud?
The terminology: relating virtualization and cloud
Types of Virtualization and Cloud deployment model
Decisive factors in migration
Hands-on cloud deployment
Cloud for banks
Peng Kang, Software Engineer, Dropbox + Richi Gupta, Engineering Manager, Dropbox
As a scalable and reliable data streaming solution with a rich ecosystem, Kafka is widely adopted across Dropbox infrastructure in various scenarios. It is part of Dropbox’s analytics data pipeline, stream processing platform and other mission-critical systems. Jetstream is the team that provides Kafka as a service in Dropbox infrastructure. We manage the clusters, develop tooling, and enforce policies, so that our users can enjoy a highly available and reliable service. In this talk, we will share our experiences and learnings from running Kafka clusters, pipelines that enable high durability (direct writes to Kafka) and availability (goscribe), the policies we enforce for high reliability, the tooling we have for maintenance and stress testing, and finally an overview of Dropbox’s next-generation queueing service built on top of Kafka.
https://www.meetup.com/KafkaBayArea/events/266327152/
(GAM404) Hunting Monsters in a Low-Latency Multiplayer Game on EC2 | Amazon Web Services
Hear how Turtle Rock launched Evolve, their fast-paced mercenary-vs-monster first-person shooter (FPS), to millions of players using AWS regions around the globe. Turtle Rock provides an in-depth view into Evolve's architecture on AWS, including both their Amazon EC2 and Elastic Load Balancing web API stack, as well as their Crytek-based UDP game servers. Hear how they used Amazon VPC subnets, along with an RDS MySQL based server registration service, to balance players across Availability Zones and regions. Learn about Turtle Rock's innovative game server scaling logic, which maintains a pool of game server capacity while keeping costs in check. Finally, see Evolve’s Graphite and Grafana monitoring setup, which provides player count and server health status across their worldwide fleet.
Essentials of Automation: Optimizing FME Workflows with Parameters | Safe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024 | Tobias Schneck
As AI technology pushes into IT, I found myself wondering, as an “infrastructure container Kubernetes guy”, how this fancy AI technology gets managed from an infrastructure operations view. Is it possible to apply our beloved cloud-native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and lead you on a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premises strategy we may need to apply it to our own infrastructure from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I already have working for real.
DevOps and Testing slides at DASA Connect | Kari Kakkonen
Slides by me and Rik Marselis from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We ended with a lovely workshop in which participants tried to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... | UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Connector Corner: Automate dynamic content and events by pushing a button | DianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
JMeter webinar - integration with InfluxDB and Grafana – RTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring of JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Accelerate your Kubernetes clusters with Varnish Caching – Thijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... – BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... – James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to release software to market, combined with traditionally slow and manual security checks, has caused gaps in continuous security, an important piece of the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their application supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
PHP Frameworks: I want to break free (IPC Berlin 2024) – Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
• Performance
• AWS cost
• Data consistency/accuracy
Goals
• All products return the same estimate value for a given home at a given time
• Near-realtime data
Self Intro
An engineer on the Owner Engagement team at Redfin.
Redfin Intro
Technology-powered real estate brokerage.
Not only serve home information as a platform, but also employ real estate agents to provide professional home buying and selling services.
Stats
Every month, over 22 million users visit Redfin to check out homes across the country, find out how much their home is worth, keep up with neighborhood trends, and get in touch with a real estate professional for services.
Owner Engagement primarily focuses on homeowner experiences, providing tools to help owners understand home value better, and to keep their home information up-to-date
Redis at Redfin: I'll first briefly touch on the usage of Redis at Redfin; we're using the open-source version.
Two major use cases:
LRU cache
Rate limiting
Hoping to open up for more use cases soon
This talk will focus on a specific use case of Redis at Redfin and go through our thought process of designing the architecture around it; more specifically, a cache warmup pipeline powered by Kafka and Samza. Hopefully you will find it useful.
Agenda
Leading to avm…
Coming back to automated home valuation, or what we call Redfin Estimate
What is that? Redfin Estimate is a calculation of the market value of an individual home, or what we think your home is worth
We've commissioned analysis that shows that Redfin Estimate is the most accurate on the market
If you are a Redfin user, you probably have seen it on a listing details page before, either on the web, (next slide: or in the mobile app)
Screenshots
Desktop web
Or in the mobile app
Screenshots
Mobile app
We also use it extensively in email, internal tools for agents, as well as in a fairly new experiment that we are running in a few markets called Redfin Now
For example:
Redfin Now tries to buy homes directly from customers. The amount we pay for a home largely depends on the Redfin Estimate value.
One more place you will see Redfin Estimate values is when you zoom in on the map page.
I personally like this feature a lot because it gives you a good idea on the overall home prices in a neighborhood, not because I want to know if my neighbor’s home is worth more. :grin:
Different from the other use cases, this usage presents some unique technical challenges, and I will dive into that later in this talk.
Screenshots
Avm on map
Coming back, this was our tech stack before a cache layer was introduced in the system
On the right side we have an Amazon API Gateway set up on AWS serving raw estimate data
Each feature is responsible for fetching the estimate on its own.
This original setup has several issues:
Due to the complex nature of the estimate data and the fact that a property can be represented in many different forms, the estimate API takes multiple params.
It is possible for it to return a different estimate even for the same home based on the parameter that get passed in.
This inevitably led to misuse by other services when they provided undesired inputs, which in turn caused inconsistent data across different products.
For example, email to XDP could display different values
It also introduced unclear ownership between feature teams: who is responsible for the cache layer? Do teams add their own implementations? Duplicate effort, debugging pain.
Before adding the cache layer, unify the code path and create Estimate Service to take the responsibility
Provide simple API
Define clear ownership
Roll up the sleeves and resume the caching work
Spend a minute to talk about our goals here:
Why? Good software engineering practice. Without a clear goal, you may not be optimizing your time to solve the right problem
Perf: network round trip, db calls to fetch metadata
AWS cost: API Gateway is not free; it charges by number of API calls and amount of data transferred out
Data consistency matters a lot to us. To provide good UX.
All products should return the same estimate value for a given home at a given time
First proposal:
To simplify the diagram, consolidated intermediate services as part of the webserver
We naturally fitted a Redis cache in between the estimate service and API gateway.
Stores pairs of property id and its corresponding value
Now every time someone requests estimate data, the estimate service only performs the expensive fetch in case of a cache miss, and the results are cached in Redis
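The read path just described is the classic cache-aside pattern. A minimal sketch, with a plain dict standing in for Redis and `fetch_from_api_gateway` as a hypothetical stand-in for the expensive upstream call (none of these names are Redfin's actual code):

```python
# Cache-aside sketch: check the cache first, fall back to the expensive
# API Gateway fetch only on a miss, then populate the cache.
cache = {}          # stand-in for Redis; maps property_id -> estimate
fetch_calls = 0     # counts expensive upstream fetches

def fetch_from_api_gateway(property_id):
    """Stand-in for the expensive AWS API Gateway call."""
    global fetch_calls
    fetch_calls += 1
    return {"property_id": property_id, "estimate": 750_000}

def get_estimate(property_id):
    value = cache.get(property_id)       # cache hit: no network round trip
    if value is None:                    # cache miss: do the expensive fetch
        value = fetch_from_api_gateway(property_id)
        cache[property_id] = value       # populate so the next read is a hit
    return value

first = get_estimate(42)    # miss: one upstream call
second = get_estimate(42)   # hit: served from the cache
```

The second call never touches the upstream at all, which is exactly the round trip the design is trying to eliminate.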
So far so good. Caching is that simple… right?
How hard could it be to cache some values
:shrug:
Remember earlier we talked about goals, and the first one is to improve perf
Now that we eliminated the network round trip to API gateway, are we done though?
You may notice that there are still DB calls being made. What are they doing? Can we eliminate them?
We then took a look at what they were doing, and it turns out we store a bunch of “legal” rules in the database on whether or not we can show the estimate value
Some of these come from local MLS which stands for multiple listing services, which are databases that real estate agents use to list properties
Others come from listing agent’s settings
In the end, it boils down to a value called access level
Controls the accessibility of a piece of data based on user’s roles
Unregistered, registered, email verified, agent
So we put access level in the cache as well and eliminated the db call.
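The access-level gate can be sketched as a simple ordered check. The level names come from the slide above; the ordering and representation here are my assumptions, not Redfin's actual code:

```python
# Access levels ordered from least to most privileged; a cached entry
# stores the minimum level required to see the estimate.
ACCESS_LEVELS = ["unregistered", "registered", "email_verified", "agent"]

def can_view(user_level, required_level):
    """A user may see the estimate if their level is at least the required one."""
    return ACCESS_LEVELS.index(user_level) >= ACCESS_LEVELS.index(required_level)

# Caching the required level alongside the estimate lets the service
# answer the visibility question without a DB call.
ok = can_view("agent", "registered")
blocked = can_view("unregistered", "email_verified")
```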
So far so good. Caching is that simple… right??
How hard could it be to cache some values
:shrug:
Before popping open the champagne and celebrating, let's double-check our goals. There's a line item called consistency.
What does that mean? :thinking_face:
Have we achieved that? Someone might have already thought about it. It has something to do with expiration.
Turns out the housing market is pretty volatile
Home prices change frequently and dramatically
And I believe if you’ve gone through a home buying/selling process in the recent couple years, you would very well know what that means
What does this mean for us? For example, say the housing price fluctuates every hour, and we cache it for two hours, what would happen?
We'd end up with stale data
Can we simply cache everything for a short amount of time? We can, but we’ll be making unnecessary calls to API gateway, which we already know costs us a fortune
How can we strike the perfect balance between the cost and benefit?
Well, we addressed this by treating data with different characteristics differently
We split the data in two groups: active properties that are for sale and off market properties that are not for sale
We know that for-sale listings have more activity going on and their estimates update more often, so we give them a much shorter TTL
While off market properties don’t have too much going on, their estimates don’t update as often, thus it’s fine for them to live in the cache for longer
(this data is public)
And in fact, our estimate algorithm updates the estimate for active properties at least once a day, even more often for newer listings
For off-market ones, it updates only once a week
We ended up with this: caching estimates of active properties for 5 min, and 4 hrs for off-market properties
One question people often ask is: if the estimate value only updates once a week, why cache it for only 4 hours instead of a week?
Cache expiration is a broad topic, and the answer varies case by case. If you know exactly when the data will be updated, you can cache it for as long as you want until you’re notified of the arrival of new data
In our case, our side of the system acts on the assumption that it doesn’t know the exact time the update happens. So to be on the cautious side, we naturally experiment with a shorter TTL to make sure the data isn’t too stale.
You never know if new data happens to arrive right after you cache the old data right?
And here comes a handy command in Redis: setex(), or set expire. It both sets the key/value and sets the key to timeout after a given period of time.
It combines the set() command and expire() command in an atomic action
It is a very common operation when using Redis as a cache
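Putting the two TTLs and `setex` together, a minimal sketch. The in-memory class below only mimics SETEX semantics for illustration; with the real redis-py client the call has the same shape, e.g. `r.setex("estimate:42", 300, "750000")`:

```python
import time

ACTIVE_TTL_SECONDS = 5 * 60           # for-sale listings: estimates move daily or faster
OFF_MARKET_TTL_SECONDS = 4 * 60 * 60  # off-market homes: estimates update roughly weekly

def ttl_for(is_active):
    """Pick a TTL based on how volatile the estimate is."""
    return ACTIVE_TTL_SECONDS if is_active else OFF_MARKET_TTL_SECONDS

class TinyCache:
    """In-memory stand-in mimicking Redis SETEX (atomic set + expire)."""
    def __init__(self):
        self._data = {}

    def setex(self, key, ttl, value):
        # store the value together with its absolute expiry time
        self._data[key] = (value, time.time() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:  # lazily drop expired entries
            del self._data[key]
            return None
        return value

cache = TinyCache()
cache.setex("estimate:42", ttl_for(is_active=True), "750000")
```

The key point is that set and expire happen in one atomic step, so there is no window where the value exists without a timeout.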
With all goals covered, when we rolled it out to 100%, unsurprisingly, we observed the expected drop in response time, with an avg. hit rate of 53%
A graph illustrating the response time
A graph illustrating the drop in number of single calls to Estimate API
Alright, so far the setup works pretty well for most of the use cases where we only need estimate data for a single property such as this
What about situations where we need multiple estimate values at once?
Such as this
As i mentioned earlier, when you zoom in enough on the map, you will get to see the estimate values for homes that are not currently on the market in your viewport
And it turns out that we didn’t achieve much performance gain
Wuuuuut?
:shrug:
We observed that on the map, we can achieve an avg. of 55% hit rate, even slightly higher than the single fetch case
However, hit rate doesn’t matter as much when fetching multiple values at a time, at least when it’s only around the 50s
Say we are fetching 100 estimate values at once and 55 are retrieved from the cache; for the remaining 45, we still need to contact AWS for the data
Even though we speed up part of the call by hitting the cache, the entire call didn’t improve much because we didn’t eliminate the expensive part, which is the network roundtrip
So, as long as we’re still making the network call as part of the map request, we won’t see too much improvement, and we are not going to save much on AWS cost
We then asked ourselves: is it possible to eliminate the synchronous call at all?
The answer is yes we can, if the hit rate is high enough
So we introduced cache warmup
Here’s how it works:
Given a map area the user is looking at (what we call viewport), prefetch estimate values for the neighboring tiles
Populate the cache ahead of time
When the user is browsing homes in this box, or the viewport
Which is the orange tile in the center at a more zoomed out level
We identify all neighboring tiles of the orange tile
And find all properties that are not for sale on these neighboring tiles
Prefetch the estimate values for these homes, populate them into the Redis cache in a non-blocking background process, a process that is supposed to happen really fast
The expectation is that when the user pans around the map and enters the neighboring tiles, the estimate values on these tiles will ideally already exist in the cache
There’ll be no need to fetch synchronously from Amazon API gateway and db
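The tile prefetch step can be sketched as follows, assuming a simple integer x/y tile grid; the actual tiling scheme and zoom handling are internal details the talk doesn't specify:

```python
def neighboring_tiles(x, y):
    """The 8 tiles surrounding tile (x, y) at the same zoom level."""
    return [(x + dx, y + dy)
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]

# Given the viewport's tile, these are the tiles whose off-market
# properties get prefetched into the cache ahead of time.
tiles = neighboring_tiles(10, 20)
```

Whatever the real grid looks like, the idea is the same: warm the cache for every tile the user could pan into next, so the later reads are hits.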
It sounds pretty straightforward, so what does the system look like?
Remember this is our previous iteration, with a Redis cache sitting in between the estimate service and the estimate api on aws
How hard could it be to add a background cache warmup pipeline
Probably just a couple of boxes around it and fill in the details, right?
This is the final architecture we came up with. It consists of a cache warmup pipeline in the middle and a couple of invalidation pipelines at the bottom
Not far from the previous iteration right? :grin:
Exactly like how you draw an owl
Putting the owl-ful jokes aside, deep dive time
We built cache warmup with Kafka and Samza
The reason we chose stream processing to solve this problem is that it fits the stream processing paradigm well, which says:
Given a sequence of data (what we call a stream), a series of operations is applied to each element in the stream
Needs to be near-realtime
Work on messages one by one
More importantly, a stream processing pipeline built on top of Kafka and Samza is durable and scalable.
For those of you that aren’t quite familiar with these
Kafka is a distributed streaming and messaging platform
It was originally developed by LinkedIn and later open-sourced
Redfin adopted Kafka in its earlier days back in 2015 to scale up our fast growing notification needs
At the time it was mainly serving as a low-latency messaging system and that’s where Samza comes into the picture
Samza is an open-source distributed stream processing framework, similar to Storm, Spark Streaming, and Flink
It uses Kafka for messaging, and Hadoop YARN for fault tolerance, resource management, processor isolation, security, etc.
Provides scalable near-real-time event streaming and data processing
Without going into too much detail, here’s a graph illustrating the relationship between Kafka and Samza
Kafka: buffer between Samza apps
At the time we were evaluating multiple similar stream processing products; a couple of characteristics made us choose Samza over the other frameworks.
Differentiators:
Supports local storage out of the box. Because the state itself is modeled as a stream, in case of a machine failure the state stream survives and can be replayed to restore the state from before the crash
Streams are ordered, partitioned, and replayable, which provides scalability and durability
It takes advantage of YARN for processor isolation, security, and fault tolerance, and YARN provides a distributed environment for Samza containers to run in
All jobs are decoupled. If one job experiences issues, the rest of the system is unaffected.
As of today, Redfin has more than 70 Samza apps running in prod, processing millions of bytes of content per second.
Usage varies from tracking market activity to sending tour notification and sending listing updates in a timely fashion.
More than 90% of listing updates land in our users' inboxes within 10 minutes of being updated on the MLS, which is at least 10x faster than our competitors
And you all know how valuable it is to know about new listings faster than other potential home buyers in such a competitive market
Final architecture of estimate service with a caching pipeline
Let’s take a closer look at the big grey box
The caching pipeline consists of four components:
Each of them is a standalone samza app
The forecaster identifies properties on the neighboring tiles, sends them to the request filter
More specifically, it takes in a list of property ids, finds the bounding box of these ids, and identifies the surrounding boxes of that box
For each of the surrounding boxes, it then fetches all properties on it and sends them to the downstream app
It filters the incoming properties.
A couple of things it checks:
If the estimate value for a given property already exists in the cache, then there’s no need to fetch them again
Whether the given property can display an estimate value at all. Remember the “legal” rules stored in the database that I mentioned earlier? That's what this checks.
The eligible properties are then sent to the data fetcher, which batches them together and fetches the estimate data from API Gateway
Finally, the data writer populates this data into the Redis cache
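The four components, condensed into plain Python functions chained in-process. In production each is a standalone Samza app wired together through Kafka topics; all ids, values, and eligibility rules below are illustrative stand-ins:

```python
cache = {"p1": 500_000}                              # already-warm entries
eligibility = {"p1": True, "p2": True, "p3": False}  # "legal" display rules

def forecaster(viewport_property_ids):
    """Expand the viewport into candidate properties on neighboring tiles."""
    # Illustrative fan-out: the real app finds the bounding box of the ids,
    # its surrounding tiles, then lists the off-market properties on them.
    return ["p1", "p2", "p3"]

def request_filter(property_ids):
    """Drop properties already cached or not allowed to show an estimate."""
    return [p for p in property_ids
            if p not in cache and eligibility.get(p, False)]

def data_fetcher(property_ids):
    """Batch-fetch estimates from the upstream API (stubbed here)."""
    return {p: 600_000 for p in property_ids}

def data_writer(estimates):
    """Populate the warmed values into the cache."""
    cache.update(estimates)

data_writer(data_fetcher(request_filter(forecaster(["p1"]))))
```

Keeping each stage this narrow is what lets the real apps scale independently, as the next slides explain.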
The reason for making four standalone apps is that each app is responsible for one simple, specific task and nothing more.
Our goal is to complete the entire flow, from the request entering the pipeline to landing in cache, within 10 sec.
Separating the responsibilities allows us to achieve maximum horizontal scalability.
E.g. we know ahead of time that the request filter will have to handle a comparatively larger volume of incoming messages than the other apps, because the forecaster fans out multiple messages for every single incoming message,
so we made it a single app and gave it 4X containers and enabled multi-threading
E.g. for the data fetcher, because it is responsible for making requests to AWS, instead of making a request for every incoming message,
we utilized a Samza feature called local storage to batch the requests together in order to reduce the number of calls
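The batching idea can be sketched with a small accumulator. Samza's local storage plays the role of the in-memory buffer here; the batch size is an arbitrary illustrative value:

```python
class Batcher:
    """Accumulate property ids and flush them upstream in one call once
    the batch is full (the buffer role Samza's local storage plays)."""
    def __init__(self, flush_fn, batch_size=50):
        self.flush_fn = flush_fn
        self.batch_size = batch_size
        self.pending = []

    def add(self, property_id):
        self.pending.append(property_id)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.pending:
            self.flush_fn(list(self.pending))  # one upstream call per batch
            self.pending.clear()

calls = []
b = Batcher(flush_fn=calls.append, batch_size=3)
for pid in ["a", "b", "c", "d"]:
    b.add(pid)
# one full batch of three flushed; "d" is still pending
```

A real implementation would also flush on a timer so a partial batch never waits indefinitely; that detail is omitted here.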
With the new caching pipeline in place, when the new design was rolled out in prod, we saw good results with fetching multiple estimate values.
a significant bump in hit rate, from 55% to 80%
and a huge amount of perf gain
The orange line on the top is upper 90, and the yellow one on the bottom is median
Also, we didn't eliminate the API Gateway call 100%: in the case of a low hit rate, a sync call is still initiated to ensure a good UX
The load on API gateway decreased significantly
A graph illustrating the drop in number of batch calls to the Estimate API
Now that we have a fancy async pipeline running to update the cache, we piggybacked on that and built two additional streams to invalidate the cache when estimate visibility changes
What is the catch?
First thing that bit us: because we are sending messages to the streaming pipeline within the map request, the system coupled the webserver to the streaming infrastructure
It turns out that the supposedly asynchronous process isn't entirely async; there's one synchronous piece in the entire chain
What happened is that in a recent Kafka upgrade, due to an unexpected bug in the Kafka Python client, some of the nodes in the Kafka cluster or what’s called a Kafka broker experienced enormous latency in accepting messages.
Which trickled down and caused webservers to return slowly
Fortunately we were able to shut down the pipeline by flipping a feature toggle
Build thorough monitoring and alarming system
Another lesson we learnt is that as your system gets more complicated, it gets harder to detect where the issue is coming from when there’s a fire
Need to up your game in monitoring
For this feature alone, we built at least 4 dashboards to monitor the individual parts, including the cache itself, the caching pipeline, the estimate service, and the webserver endpoints
Tweak Kafka producer configs so requests to brokers time out faster, to better isolate Kafka issues from the webserver
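For the producer-side timeouts, the relevant knobs on the Kafka producer (Java-client property names) look like this; the values are illustrative, not the ones Redfin actually used:

```python
# Producer timeout settings that bound how long a send can block before
# the producer gives up, so a slow broker can't stall the webserver.
producer_overrides = {
    "max.block.ms": "2000",         # cap time send() may block on metadata/buffer space
    "request.timeout.ms": "5000",   # cap time waiting for a broker response per request
    "delivery.timeout.ms": "10000", # upper bound on total time to report a send result
}
```

The constraint to keep in mind is that `delivery.timeout.ms` must be at least `request.timeout.ms` plus `linger.ms`, or the producer will reject the config.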
Exploring Envoy to move from unsharded to sharded