This document summarizes the design iterations of an automated home valuation system at Redfin that uses Redis and Kafka. It describes moving the system from AWS to Kafka and Samza for distributed streaming. Key points include prefetching estimate values for nearby properties to warm the cache, which increased the cache hit rate from 55% to 80%. Monitoring and decoupling of systems were identified as areas for further improvement. The system helped reduce response times and API calls for home valuations.
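The cache-warming idea above can be sketched in a few lines. This is a hypothetical illustration, not Redfin's implementation: `EstimateCache` stands in for Redis, and `nearby` is a placeholder for a real geospatial neighbor lookup.

```python
# Hypothetical sketch of warming a valuation cache by prefetching
# estimates for nearby properties. All names are illustrative.

class EstimateCache:
    def __init__(self, compute_estimate, nearby, prefetch_count=5):
        self._cache = {}                  # stands in for Redis
        self._compute = compute_estimate  # expensive valuation call
        self._nearby = nearby             # property_id -> neighbor ids
        self._prefetch_count = prefetch_count
        self.hits = 0
        self.misses = 0

    def get(self, property_id):
        if property_id in self._cache:
            self.hits += 1
            return self._cache[property_id]
        self.misses += 1
        value = self._cache[property_id] = self._compute(property_id)
        # Warm the cache: a request for one home is often followed by
        # requests for its neighbors, so estimate them eagerly.
        for neighbor in self._nearby(property_id)[: self._prefetch_count]:
            self._cache.setdefault(neighbor, self._compute(neighbor))
        return value


# Toy usage: consecutive ids are "neighbors".
cache = EstimateCache(
    compute_estimate=lambda pid: pid * 1000,
    nearby=lambda pid: [pid - 1, pid + 1],
)
cache.get(42)  # miss; prefetches 41 and 43
cache.get(43)  # hit, thanks to prefetching
print(cache.hits, cache.misses)  # 1 1
```

The hit-rate improvement reported above comes from exactly this effect: requests that would have been misses become hits because a neighboring lookup already populated the cache.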
High cardinality time series search: A new level of scale - Data Day Texas 2016 | Eric Sammer
Modern search systems provide incredible feature sets, developer-friendly APIs, and low latency indexing and query response. By some measures, these systems operate "at scale," but rarely is that quantified. Customers of Rocana typically look to push ingest rates in excess of 1 million events per second, retaining years of data online for query, with the expectation of sub-second response times for any reasonably sized subset of data.
We quickly found that the tradeoffs made by general purpose search systems, while right for common use cases, were less appropriate for these high cardinality, large scale use cases.
This session details the architecture, tradeoffs, and interesting implementation decisions made in building a new time series optimized distributed search system using Apache Lucene, Kafka, and HDFS. Data ingestion and durability, index and metadata organization, storage, query scheduling and optimization, and failure modes will be covered. Finally, a summary of the results achieved will be shown.
Apache Kafka, Apache Cassandra and Kubernetes are open source big data technologies that enable applications and business operations to scale massively and rapidly. While Kafka and Cassandra underpin the data layer of the stack, providing the capability to stream, disseminate, store and retrieve data at very low latency, Kubernetes is a container orchestration technology that helps with automated application deployment and scaling of application clusters. In this presentation, we will reveal how we architected a massive-scale deployment of a streaming data pipeline with Kafka and Cassandra to support an example anomaly detection application running on a Kubernetes cluster and generating and processing massive amounts of events. Anomaly detection is a method used to detect unusual events in an event stream. It is widely used in a range of applications such as financial fraud detection, security, threat detection, website user analytics, sensors, IoT, system health monitoring, etc. When such applications operate at massive scale, generating millions or billions of events, they impose significant computational, performance and scalability challenges on anomaly detection algorithms and data layer technologies. We will demonstrate the scalability, performance and cost effectiveness of Apache Kafka, Cassandra and Kubernetes, with results from our experiments allowing the anomaly detection application to scale to 19 billion anomaly checks per day.
Kafka is a high-throughput, fault-tolerant, scalable platform for building high-volume near-real-time data pipelines. This presentation is about tuning Kafka pipelines for high-performance.
Select configuration parameters and deployment topologies essential to achieving higher throughput and low latency across the pipeline are discussed, along with lessons learned in troubleshooting and optimizing a truly global data pipeline that replicates 100 GB of data in under 25 minutes.
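As a flavor of the kind of knobs such tuning involves, the fragment below lists producer settings commonly examined when trading throughput against latency. These are illustrative starting points, not the tuned values from this talk:

```properties
# Larger batches amortize per-request overhead.
batch.size=262144
# Wait briefly to fill batches (adds a little latency).
linger.ms=10
# Cheap compression shrinks network transfer.
compression.type=lz4
# Leader-only acks trade durability for lower latency.
acks=1
# Room for in-flight batches under bursty load.
buffer.memory=67108864
```

The right values depend on message size, network topology, and durability requirements, which is why the talk emphasizes measuring a real pipeline rather than copying defaults.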
Streaming in Practice - Putting Apache Kafka in Production | confluent
This presentation focuses on how to integrate all these components into an enterprise environment and what things you need to consider as you move into production.
We will touch on the following topics:
- Patterns for integrating with existing data systems and applications
- Metadata management at enterprise scale
- Tradeoffs in performance, cost, availability and fault tolerance
- Choosing which cross-datacenter replication patterns fit with your application
- Considerations for operating Kafka-based data pipelines in production
Tales from the four-comma club: Managing Kafka as a service at Salesforce | L... | HostedbyConfluent
Apache Kafka is a key part of the Big Data infrastructure at Salesforce, enabling publish/subscribe and data transport in near real-time at enterprise scale handling trillions of messages per day. In this session, hear from the teams at Salesforce that manage Kafka as a service, running over a hundred clusters across on-premise and public cloud environments with over 99.9% availability. Hear about best practices and innovations, including:
* How to manage multi-tenant clusters in a hybrid environment
* High volume data pipelines with Mirus replicating data to Kafka and blob storage
* Kafka Fault Injection Framework built on Trogdor and Kibosh
* Automated recovery without data loss
* Using Envoy as an SNI-routing Kafka gateway
We hope the audience will have practical takeaways for building, deploying, operating, and managing Kafka at scale in the enterprise.
Strategies and techniques for optimizing Kafka brokers and producers to minimize data loss under huge traffic volume, limited configuration options, and a less-than-ideal, constantly changing environment, while balancing against cost.
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with... | confluent
By Jun Rao
From the Bay Area Apache Kafka September 2016 Meetup.
Abstract: To manage the ever-increasing volume and velocity of data within your company you have successfully made the transition from single machines and one-off solutions to large, distributed stream infrastructures in your data center powered by Apache Kafka. But what needs to be done if one data center is not enough? In this session we describe building resilient data pipelines with Apache Kafka that span multiple data centers and points of presence. We provide an overview of best practices and common patterns while covering key areas such as architecture guidelines, data replication and mirroring as well as disaster scenarios and failure handling.
Apache Kafka lies at the heart of the largest data pipelines, handling trillions of messages and petabytes of data every day. Learn the right approach for getting the most out of Kafka from the experts at LinkedIn and Confluent. Todd Palino and Gwen Shapira demonstrate how to monitor, optimize, and troubleshoot performance of your data pipelines—from producer to consumer, development to production—as they explore some of the common problems that Kafka developers and administrators encounter when they take Apache Kafka from a proof of concept to production usage. Too often, systems are overprovisioned and underutilized and still have trouble meeting reasonable performance agreements.
Topics include:
- What latencies and throughputs you should expect from Kafka
- How to select hardware and size components
- What you should be monitoring
- Design patterns and antipatterns for client applications
- How to go about diagnosing performance bottlenecks
- Which configurations to examine and which ones to avoid
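One concrete instance of the monitoring theme above is consumer lag: how far a consumer group trails the end of each partition. The sketch below computes it from two offset maps; in practice both would come from Kafka's admin APIs, and the dicts here are hypothetical stand-ins.

```python
# Consumer lag per partition, from end offsets and committed offsets.
# The literal data below is illustrative, not from a real cluster.

def consumer_lag(end_offsets, committed_offsets):
    """Per-partition lag, plus the total across partitions."""
    lag = {
        tp: end_offsets[tp] - committed_offsets.get(tp, 0)
        for tp in end_offsets
    }
    return lag, sum(lag.values())

end = {("clicks", 0): 1_500, ("clicks", 1): 900}
committed = {("clicks", 0): 1_200, ("clicks", 1): 900}
per_partition, total = consumer_lag(end, committed)
print(per_partition[("clicks", 0)], total)  # 300 300
```

A steadily growing total is the classic sign that consumers cannot keep up with producers, which is why lag sits near the top of most Kafka monitoring checklists.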
Whether you are developing a greenfield data project or migrating a legacy system, there are many critical design decisions to be made. Often, it is advantageous to consider not only immediate requirements, but also the future requirements and technologies you may want to support. Your project may start out supporting batch analytics with the vision of adding real-time support. Or your data pipeline may feed data to one technology today, but tomorrow an entirely new system needs to be integrated. Apache Kafka can help decouple these decisions and provide a flexible core for your data architecture. This talk will show how building Kafka into your pipeline can provide the flexibility to experiment, evolve and grow. It will also cover a brief overview of Kafka, its architecture, and terminology.
Building an Event-oriented Data Platform with Kafka | Eric Sammer, confluent
While we frequently talk about how to build interesting products on top of machine and event data, the reality is that collecting, organizing, providing access to, and managing this data is where most people get stuck. Many organizations understand the use cases around their data – fraud detection, quality of service and technical operations, user behavior analysis, for example – but are not necessarily data infrastructure experts. In this session, we’ll follow the flow of data through an end to end system built to handle tens of terabytes an hour of event-oriented data, providing real time streaming, in-memory, SQL, and batch access to this data. We’ll go into detail on how open source systems such as Hadoop, Kafka, Solr, and Impala/Hive are actually stitched together; describe how and where to perform data transformation and aggregation; provide a simple and pragmatic way of managing event metadata; and talk about how applications built on top of this platform get access to data and extend its functionality.
Attendees will leave this session knowing not just which open source projects go into a system such as this, but how they work together, what tradeoffs and decisions need to be addressed, and how to present a single general purpose data platform to multiple applications. This session should be attended by data infrastructure engineers and architects planning, building, or maintaining similar systems.
Netflix recently changed its data pipeline architecture to use Kafka as the gateway for data collection for all applications, processing hundreds of billions of messages daily. This session will discuss the motivation for moving to Kafka, the architecture, and the improvements we have added to make Kafka work in AWS. We will also share lessons learned and future plans.
Jay Kreps is a Principal Staff Engineer at LinkedIn where he is the lead architect for online data infrastructure. He is among the original authors of several open source projects including a distributed key-value store called Project Voldemort, a messaging system called Kafka, and a stream processing system called Samza. This talk gives an introduction to Apache Kafka, a distributed messaging system. It will cover both how Kafka works, as well as how it is used at LinkedIn for log aggregation, messaging, ETL, and real-time stream processing.
An in-depth guide to VDI infrastructure delivering the best desktop/BYOD experience for your developers and other external knowledge workers. We will compare Amazon Workspaces with classic approaches to solving this challenge, and share best-practices for securing and managing a real-world production environment.
Speaker: Brett Looney, Solutions Architect, Amazon Web Services
AWS Meetup - Nordstrom Data Lab and the AWS Cloud | NordstromDataLab
The Nordstrom Data Lab is building out an API that powers product recommendations for our customers online and beyond. Recommendo, our flagship product, was built from the ground up using Node.js and AWS in a little over three months. Since launch in November 2013 we've served up over three billion recommendations and survived Black Friday and Cyber Monday without breaking a sweat. We'll be sharing our learnings from building and operating a high-traffic API on the AWS platform, focusing on Node.js, Elastic Beanstalk, and DynamoDB. Additionally, we'll discuss some of the cultural challenges and opportunities presented when adopting the public cloud at a large corporate IT organization. In short, we believe there are tremendous advantages to be had for enterprises willing to make the leap to the cloud.
AWS re:Invent 2016: The State of Serverless Computing (SVR311) | Amazon Web Services
Join us to learn about the state of serverless computing from Dr. Tim Wagner, General Manager of AWS Lambda. Dr. Wagner discusses the latest developments from AWS Lambda and the serverless computing ecosystem. He talks about how serverless computing is becoming a core component in how companies build and run their applications and services, and he also discusses how serverless computing will continue to evolve.
Stephen Liedig: Building Serverless Backends with AWS Lambda and API Gateway | Steve Androulakis
Stephen Liedig (Amazon Web Services) is a Public Sector Solutions Architect at AWS working closely with local and state governments, educational institutions, and non-profit organisations across Australia and New Zealand to design, and deliver, highly secure, scalable, reliable and fault-tolerant architectures in the AWS Cloud while sharing best practices and current trends, with a specific focus on DevOps, messaging, and serverless technologies.
SRV418 Deep Dive on Accelerating Content, APIs, and Applications with Amazon ... | Amazon Web Services
Attend this session to dive deeper into AWS's content delivery service, Amazon CloudFront. Learn how you can use CloudFront to accelerate the delivery of your APIs or applications, including content that cannot be cached, to global clients. We'll also walk you through how you can use Lambda@Edge, which gives you the ability to execute custom code inline with your CloudFront events to customize applications. With Lambda@Edge, you can now generate custom responses right at the edge, allowing you to leverage CloudFront to reduce end-to-end latency and more efficiently filter traffic to your back-end origin servers. We'll walk through Lambda@Edge use cases and a demo to show how this works.
How Disney Streaming Services and TrueCar Deliver Web Applications for Scale,... | Amazon Web Services
Disney Streaming Services is a Direct-To-Consumer (DTC) video streaming service that is part of Disney, and TrueCar is a digital automotive marketplace. You will learn about their different perspectives on how they built global applications for scale, performance, and availability. TrueCar will share how they moved internet operations off premises, from their datacenters to the cloud in AWS. Disney Streaming Services will dive deep into how they are leveraging Amazon CloudFront and Lambda@Edge to enable their content APIs to perform at scale through dynamic origin selection, latency reduction through edge caching, and guaranteed high availability.
AWS re:Invent 2016: Born in the Cloud; Built Like a Startup (ARC205) | Amazon Web Services
This presentation provides a comparison of three modern architecture patterns that startups are building their business around. It includes a realistic analysis of the cost, team management, and security implications of each approach. It covers Elastic Beanstalk, Amazon ECS, Docker, Amazon API Gateway, AWS Lambda, Amazon DynamoDB, and Amazon CloudFront.
General discussions
Why cloud?
The terminology: relating virtualization and cloud
Types of Virtualization and Cloud deployment model
Decisive factors in migration
Hands-on cloud deployment
Cloud for banks
Peng Kang, Software Engineer, Dropbox + Richi Gupta, Engineering Manager, Dropbox
As a scalable and reliable data streaming solution with a rich ecosystem, Kafka is widely adopted across Dropbox infrastructure in various scenarios. It is part of Dropbox’s analytics data pipeline, stream processing platform and other mission-critical systems. Jetstream is the team that provides Kafka as a service in Dropbox infrastructure. We manage the clusters, develop tooling, and enforce policies, so that our users can enjoy a highly available and reliable service. In this talk, we will share our experiences and learnings from running Kafka clusters, pipelines that enable high durability (direct writes to Kafka) and availability (goscribe), the policies we enforce for high reliability, the tooling we have for maintenance and stress testing, and finally an overview of Dropbox’s next-generation queueing service built on top of Kafka.
https://www.meetup.com/KafkaBayArea/events/266327152/
(GAM404) Hunting Monsters in a Low-Latency Multiplayer Game on EC2 | Amazon Web Services
Hear how Turtle Rock launched Evolve, their fast-paced mercenary-vs-monster first-person shooter (FPS), to millions of players using AWS regions around the globe. Turtle Rock provides an in-depth view into Evolve's architecture on AWS, including both their Amazon EC2 and Elastic Load Balancing web API stack, as well as their Crytek-based UDP game servers. Hear how they used Amazon VPC subnets, along with an RDS MySQL based server registration service, to balance players across Availability Zones and regions. Learn about Turtle Rock's innovative game server scaling logic, which maintains a pool of game server capacity while keeping costs in check. Finally, see Evolve’s Graphite and Grafana monitoring setup, which provides player count and server health status across their worldwide fleet.
Essentials of Automation: Optimizing FME Workflows with Parameters | Safe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024 | Tobias Schneck
As AI technology pushes into IT, I found myself wondering, as an “infrastructure container Kubernetes guy”, how this fancy AI technology gets managed from an infrastructure operations view. Is it possible to apply our beloved cloud-native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and lead you on a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premises strategy we may need to apply it to our own infrastructure from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I already have working for real.
DevOps and Testing slides at DASA Connect | Kari Kakkonen
Slides by me and Rik Marselis from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We ended with a lovely workshop in which participants tried to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... | UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Connector Corner: Automate dynamic content and events by pushing a button | DianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
JMeter webinar - integration with InfluxDB and Grafana – RTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring of JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Accelerate your Kubernetes clusters with Varnish Caching – Thijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... – BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... – James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to release software to market, combined with traditionally slow and manual security checks, has caused gaps in continuous security, an important piece of the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their application supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
PHP Frameworks: I want to break free (IPC Berlin 2024) – Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
• Performance
• AWS cost
• Data consistency/accuracy
Goals
• All products return the same estimate value for a given home at a given time
• Near-realtime data
Self Intro
An engineer on the Owner Engagement team at Redfin.
Redfin Intro
Technology-powered real estate brokerage.
Not only serve home information as a platform, but also employ real estate agents to provide professional home buying and selling services.
Stats
Every month, over 22 million users visit Redfin to check out homes across the country, find out how much their home is worth, keep up with neighborhood trends, and get in touch with a real estate professional for services.
Owner Engagement primarily focuses on homeowner experiences, providing tools to help owners understand home value better, and to keep their home information up-to-date
Redis at Redfin: I'll first briefly touch on the usage of Redis at Redfin; we're using the open-source version.
Two major use cases:
LRU cache
Rate limiting
Hoping to open up for more use cases soon
This talk will focus on a specific use case of Redis at Redfin and go through our thought process of designing the architecture around it; more specifically, a cache warmup pipeline powered by Kafka and Samza. Hopefully you will find it useful.
Agenda
Leading to avm…
Coming back to automated home valuation, or what we call Redfin Estimate
What is that? Redfin Estimate is a calculation of the market value of an individual home, or what we think your home is worth
We've commissioned analysis that shows that Redfin Estimate is the most accurate on the market
If you are a Redfin user, you probably have seen it on a listing details page before, either on the web, (next slide: or in the mobile app)
Screenshots
Desktop web
Or in the mobile app
Screenshots
Mobile app
We also use it extensively in email, internal tools for agents, as well as in a fairly new experiment that we are running in a few markets called Redfin Now
For example:
Redfin Now tries to buy homes directly from customers. The amount we pay for a home largely depends on the Redfin Estimate value.
One more place you will see Redfin Estimate values is when you zoom in on the map page.
I personally like this feature a lot because it gives you a good idea on the overall home prices in a neighborhood, not because I want to know if my neighbor’s home is worth more. :grin:
Different from the other use cases, this usage presents some unique technical challenges, and I will dive into that later in this talk.
Screenshots
Avm on map
Coming back, this was our tech stack before a cache layer was introduced in the system
On the right side we have an Amazon API Gateway set up on AWS serving raw estimate data
Each feature is responsible for fetching the estimate on its own.
This original setup has several issues:
Due to the complex nature of the estimate data and the fact that a property can be represented in many different forms, the estimate API takes multiple params.
It is possible for it to return a different estimate even for the same home based on the parameter that get passed in.
This inevitably led to misuse by other services when they provided undesired inputs, which in turn caused inconsistent data across different products.
For example, email to XDP could display different values
It also introduced unclear ownership between feature teams: who is responsible for the cache layer? Do teams add their own implementations? Duplicate effort, debugging pain.
Before adding the cache layer, unify the code path and create Estimate Service to take the responsibility
Provide simple API
Define clear ownership
Roll up the sleeves and resume the caching work
Spend a minute to talk about our goals here:
Why? Good software engineering practice. Without a clear goal, you may not be optimizing your time to solve the right problem
Perf: network round trip, db calls to fetch metadata
AWS cost: API Gateway is not free; it charges by number of API calls and amount of data transferred out
Data consistency matters a lot to us. To provide good UX.
All products should return the same estimate value for a given home at a given time
First proposal:
To simplify the diagram, consolidated intermediate services as part of the webserver
We naturally fitted a Redis cache in between the estimate service and API gateway.
Stores pairs of property id and its corresponding value
Now every time someone requests estimate data, the estimate service only performs the expensive fetch in case of a cache miss, and the results are cached in Redis
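The read path just described is the classic cache-aside pattern. A minimal sketch, with a plain dict standing in for Redis and `fetch_from_api_gateway` as a hypothetical stand-in for the expensive upstream call (none of these names are Redfin's actual code):

```python
# Cache-aside sketch: check the cache first, fall back to the expensive
# API Gateway fetch only on a miss, then populate the cache.
cache = {}          # stand-in for Redis; maps property_id -> estimate
fetch_calls = 0     # counts expensive upstream fetches

def fetch_from_api_gateway(property_id):
    """Stand-in for the expensive AWS API Gateway call."""
    global fetch_calls
    fetch_calls += 1
    return {"property_id": property_id, "estimate": 750_000}

def get_estimate(property_id):
    value = cache.get(property_id)       # cache hit: no network round trip
    if value is None:                    # cache miss: do the expensive fetch
        value = fetch_from_api_gateway(property_id)
        cache[property_id] = value       # populate so the next read is a hit
    return value

first = get_estimate(42)    # miss: one upstream call
second = get_estimate(42)   # hit: served from the cache
```

The second call never touches the upstream at all, which is exactly the round trip the design is trying to eliminate.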
So far so good. Caching is that simple… right?
How hard could it be to cache some values
:shrug:
Remember earlier we talked about goals, and the first one is to improve perf
Now that we eliminated the network round trip to API gateway, are we done though?
You may notice that there are still DB calls being made. What are they doing? Can we eliminate them?
We then took a look at what they were doing, and it turns out we store a bunch of “legal” rules in the database on whether or not we can show the estimate value
Some of these come from local MLS which stands for multiple listing services, which are databases that real estate agents use to list properties
Others come from listing agent’s settings
In the end, it boils down to a value called access level
Controls the accessibility of a piece of data based on user’s roles
Unregistered, registered, email verified, agent
So we put access level in the cache as well and eliminated the db call.
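The access-level gate can be sketched as a simple ordered check. The level names come from the slide above; the ordering and representation here are my assumptions, not Redfin's actual code:

```python
# Access levels ordered from least to most privileged; a cached entry
# stores the minimum level required to see the estimate.
ACCESS_LEVELS = ["unregistered", "registered", "email_verified", "agent"]

def can_view(user_level, required_level):
    """A user may see the estimate if their level is at least the required one."""
    return ACCESS_LEVELS.index(user_level) >= ACCESS_LEVELS.index(required_level)

# Caching the required level alongside the estimate lets the service
# answer the visibility question without a DB call.
ok = can_view("agent", "registered")
blocked = can_view("unregistered", "email_verified")
```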
So far so good. Caching is that simple… right??
How hard could it be to cache some values
:shrug:
Before popping open the champagne and celebrating, let's double-check our goals. There's a line item called consistency.
What does that mean? :thinking_face:
Have we achieved that? Someone might have already thought about it. It has something to do with expiration.
Turns out the housing market is pretty volatile
Home prices change frequently and dramatically
And I believe if you’ve gone through a home buying/selling process in the recent couple years, you would very well know what that means
What does this mean for us? For example, say the housing price fluctuates every hour, and we cache it for two hours, what would happen?
We'd end up with stale data
Can we simply cache everything for a short amount of time? We can, but we’ll be making unnecessary calls to API gateway, which we already know costs us a fortune
How can we strike the perfect balance between the cost and benefit?
Well, we addressed this by treating data with different characteristics differently
We split the data in two groups: active properties that are for sale and off market properties that are not for sale
We know that for-sale listings have more activity going on and their estimates update more often, so we give them a much shorter TTL
While off market properties don’t have too much going on, their estimates don’t update as often, thus it’s fine for them to live in the cache for longer
(this data is public)
And in fact, our estimate algorithm updates the estimate for active properties at least once a day, even more often for newer listings
For off-market ones, it updates only once a week
We ended up with this: caching estimates of active properties for 5 min, and 4 hrs for off-market properties
One question people often ask is: if the estimate value only updates once a week, why cache it for only 4 hours instead of a week?
Cache expiration is a broad topic, and the answer varies case by case. If you know exactly when the data will be updated, you can cache it for as long as you want until you’re notified of the arrival of new data
In our case, our side of the system acts on the assumption that it doesn’t know the exact time the update happens. So to be on the cautious side, we naturally experiment with a shorter TTL to make sure the data isn’t too stale.
You never know if new data happens to arrive right after you cache the old data right?
And here comes a handy command in Redis: setex(), or set expire. It both sets the key/value and sets the key to timeout after a given period of time.
It combines the set() command and expire() command in an atomic action
It is a very common operation when using Redis as a cache
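Putting the two TTLs and `setex` together, a minimal sketch. The in-memory class below only mimics SETEX semantics for illustration; with the real redis-py client the call has the same shape, e.g. `r.setex("estimate:42", 300, "750000")`:

```python
import time

ACTIVE_TTL_SECONDS = 5 * 60           # for-sale listings: estimates move daily or faster
OFF_MARKET_TTL_SECONDS = 4 * 60 * 60  # off-market homes: estimates update roughly weekly

def ttl_for(is_active):
    """Pick a TTL based on how volatile the estimate is."""
    return ACTIVE_TTL_SECONDS if is_active else OFF_MARKET_TTL_SECONDS

class TinyCache:
    """In-memory stand-in mimicking Redis SETEX (atomic set + expire)."""
    def __init__(self):
        self._data = {}

    def setex(self, key, ttl, value):
        # store the value together with its absolute expiry time
        self._data[key] = (value, time.time() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:  # lazily drop expired entries
            del self._data[key]
            return None
        return value

cache = TinyCache()
cache.setex("estimate:42", ttl_for(is_active=True), "750000")
```

The key point is that set and expire happen in one atomic step, so there is no window where the value exists without a timeout.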
With all goals covered, when we rolled it out to 100%, unsurprisingly, we observed the expected drop in response time, with an avg. hit rate of 53%
A graph illustrating the response time
A graph illustrating the drop in number of single calls to Estimate API
Alright, so far the setup works pretty well for most of the use cases where we only need estimate data for a single property such as this
What about situations where we need multiple estimate values at once?
Such as this
As i mentioned earlier, when you zoom in enough on the map, you will get to see the estimate values for homes that are not currently on the market in your viewport
And it turns out that we didn’t achieve much performance gain
Wuuuuut?
:shrug:
We observed that on the map, we can achieve an avg. of 55% hit rate, even slightly higher than the single fetch case
However, hit rate doesn’t matter as much when fetching multiple values at a time, at least when it’s only around the 50s
Say we are fetching 100 estimate values at once and 55 are retrieved from the cache; for the remaining 45, we still need to contact AWS for the data
Even though we speed up part of the call by hitting the cache, the entire call didn’t improve much because we didn’t eliminate the expensive part, which is the network roundtrip
So, as long as we’re still making the network call as part of the map request, we won’t see too much improvement, and we are not going to save much on AWS cost
We then asked ourselves: is it possible to eliminate the synchronous call at all?
The answer is yes we can, if the hit rate is high enough
So we introduced cache warmup
Here’s how it works:
Given a map area the user is looking at (what we call viewport), prefetch estimate values for the neighboring tiles
Populate the cache ahead of time
When the user is browsing homes in this box, or the viewport
Which is the orange tile in the center at a more zoomed out level
We identify all neighboring tiles of the orange tile
And find all properties that are not for sale on these neighboring tiles
Prefetch the estimate values for these homes, populate them into the Redis cache in a non-blocking background process, a process that is supposed to happen really fast
The expectation is that when the user pans around the map and enters the neighboring tiles, the estimate values on these tiles will ideally already exist in the cache
There’ll be no need to fetch synchronously from Amazon API gateway and db
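The tile prefetch step can be sketched as follows, assuming a simple integer x/y tile grid; the actual tiling scheme and zoom handling are internal details the talk doesn't specify:

```python
def neighboring_tiles(x, y):
    """The 8 tiles surrounding tile (x, y) at the same zoom level."""
    return [(x + dx, y + dy)
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]

# Given the viewport's tile, these are the tiles whose off-market
# properties get prefetched into the cache ahead of time.
tiles = neighboring_tiles(10, 20)
```

Whatever the real grid looks like, the idea is the same: warm the cache for every tile the user could pan into next, so the later reads are hits.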
It sounds pretty straightforward, so what does the system look like?
Remember this is our previous iteration, with a Redis cache sitting in between the estimate service and the estimate api on aws
How hard could it be to add a background cache warmup pipeline
Probably just a couple of boxes around it and fill in the details, right?
This is the final architecture we came up with. It consists of a cache warmup pipeline in the middle and a couple of invalidation pipelines at the bottom
Not far from the previous iteration right? :grin:
Exactly like how you draw an owl
Putting the owl-ful jokes aside, deep dive time
We built cache warmup with Kafka and Samza
The reason we chose stream processing to solve this problem is that it fits the stream processing paradigm well, which says:
Given a sequence of data (what we call a stream), a series of operations is applied to each element in the stream
Needs to be near-realtime
Work on messages one by one
More importantly, a stream processing pipeline built on top of Kafka and Samza is durable and scalable.
For those of you that aren’t quite familiar with these
Kafka is a distributed streaming and messaging platform
It was originally developed by LinkedIn and later open-sourced
Redfin adopted Kafka in its earlier days back in 2015 to scale up our fast growing notification needs
At the time it was mainly serving as a low-latency messaging system and that’s where Samza comes into the picture
Samza is an open-source distributed stream processing framework, similar to Storm, Spark Streaming, and Flink
It uses Kafka for messaging, and Hadoop YARN for fault tolerance, resource management, processor isolation, security, etc.
Provides scalable near-real-time event streaming and data processing
Without going into too much detail, here’s a graph illustrating the relationship between Kafka and Samza
Kafka: buffer between Samza apps
At the time we were evaluating multiple similar stream processing products; a couple of characteristics made us choose Samza over the other frameworks.
Differentiators:
Supports local storage out of the box. Because the state itself is modeled as a stream, in case of a machine failure the state stream survives and can be replayed to restore the state from before the crash
Streams are ordered, partitioned, and replayable, which provides scalability and durability
It takes advantage of YARN for processor isolation, security, and fault tolerance, and YARN provides a distributed environment for Samza containers to run in
All jobs are decoupled. If one job experiences issues, the rest of the system is unaffected.
As of today, Redfin has more than 70 Samza apps running in prod, processing millions of bytes of content per second.
Usage varies from tracking market activity to sending tour notification and sending listing updates in a timely fashion.
More than 90% of listing updates land in our users' inboxes within 10 minutes of being updated on the MLS, which is at least 10x faster than our competitors
And you all know how valuable it is to know about new listings faster than other potential home buyers in such a competitive market
Final architecture of estimate service with a caching pipeline
Let’s take a closer look at the big grey box
The caching pipeline consists of four components:
Each of them is a standalone samza app
The forecaster identifies properties on the neighboring tiles, sends them to the request filter
More specifically, it takes in a list of property ids, finds the bounding box of these ids, and identifies the surrounding boxes of that box
For each of the surrounding boxes, it then fetches all properties on it and sends them to the downstream app
It filters the incoming properties.
A couple of things it checks:
If the estimate value for a given property already exists in the cache, then there’s no need to fetch them again
Whether the given property can display an estimate value at all. Remember the “legal” rules stored in the database that I mentioned earlier? That's what this checks.
The eligible properties are then sent to the data fetcher, which batches them together and fetches the estimate data from API Gateway
Finally, the data writer populates this data into the Redis cache
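The four components, condensed into plain Python functions chained in-process. In production each is a standalone Samza app wired together through Kafka topics; all ids, values, and eligibility rules below are illustrative stand-ins:

```python
cache = {"p1": 500_000}                              # already-warm entries
eligibility = {"p1": True, "p2": True, "p3": False}  # "legal" display rules

def forecaster(viewport_property_ids):
    """Expand the viewport into candidate properties on neighboring tiles."""
    # Illustrative fan-out: the real app finds the bounding box of the ids,
    # its surrounding tiles, then lists the off-market properties on them.
    return ["p1", "p2", "p3"]

def request_filter(property_ids):
    """Drop properties already cached or not allowed to show an estimate."""
    return [p for p in property_ids
            if p not in cache and eligibility.get(p, False)]

def data_fetcher(property_ids):
    """Batch-fetch estimates from the upstream API (stubbed here)."""
    return {p: 600_000 for p in property_ids}

def data_writer(estimates):
    """Populate the warmed values into the cache."""
    cache.update(estimates)

data_writer(data_fetcher(request_filter(forecaster(["p1"]))))
```

Keeping each stage this narrow is what lets the real apps scale independently, as the next slides explain.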
The reason for making four standalone apps is that each app is responsible for one simple, specific task and nothing more.
Our goal is to complete the entire flow, from the request entering the pipeline to landing in cache, within 10 sec.
Separating the responsibilities allows us to achieve maximum horizontal scalability.
E.g. we know ahead of time that the request filter will have to handle a comparatively larger volume of incoming messages than the other apps, because the forecaster fans out multiple messages for every single incoming message,
so we made it a single app and gave it 4X containers and enabled multi-threading
E.g. for the data fetcher, because it is responsible for making requests to AWS, instead of making a request for every incoming message,
we utilized a Samza feature called local storage to batch the requests together in order to reduce the number of calls
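The batching idea can be sketched with a small accumulator. Samza's local storage plays the role of the in-memory buffer here; the batch size is an arbitrary illustrative value:

```python
class Batcher:
    """Accumulate property ids and flush them upstream in one call once
    the batch is full (the buffer role Samza's local storage plays)."""
    def __init__(self, flush_fn, batch_size=50):
        self.flush_fn = flush_fn
        self.batch_size = batch_size
        self.pending = []

    def add(self, property_id):
        self.pending.append(property_id)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.pending:
            self.flush_fn(list(self.pending))  # one upstream call per batch
            self.pending.clear()

calls = []
b = Batcher(flush_fn=calls.append, batch_size=3)
for pid in ["a", "b", "c", "d"]:
    b.add(pid)
# one full batch of three flushed; "d" is still pending
```

A real implementation would also flush on a timer so a partial batch never waits indefinitely; that detail is omitted here.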
With the new caching pipeline in place, when the new design was rolled out in prod, we saw good results with fetching multiple estimate values.
a significant bump in hit rate, from 55% to 80%
and a huge amount of perf gain
The orange line on the top is upper 90, and the yellow one on the bottom is median
Also, we didn't eliminate the API Gateway call 100%: in the case of a low hit rate, a sync call is still initiated to ensure a good UX
The load on API gateway decreased significantly
A graph illustrating the drop in number of batch calls to the Estimate API
Now that we have a fancy async pipeline running to update the cache, we piggybacked on that and built two additional streams to invalidate the cache when estimate visibility changes
What is the catch?
First thing that bit us: because we are sending messages to the streaming pipeline within the map request, the system coupled the webserver to the streaming infrastructure
It turns out that the supposedly asynchronous process isn't entirely async; there's one synchronous piece in the entire chain
What happened is that in a recent Kafka upgrade, due to an unexpected bug in the Kafka Python client, some of the nodes in the Kafka cluster or what’s called a Kafka broker experienced enormous latency in accepting messages.
Which trickled down and caused webservers to return slowly
Fortunately we were able to shut down the pipeline by flipping a feature toggle
Build thorough monitoring and alarming system
Another lesson we learnt is that as your system gets more complicated, it gets harder to detect where the issue is coming from when there’s a fire
Need to up your game in monitoring
For this feature alone, we built at least 4 dashboards to monitor the individual parts, including the cache itself, the caching pipeline, the estimate service, and the webserver endpoints
Tweak Kafka producer configs so requests to brokers time out faster, to better isolate Kafka issues from the webserver
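For the producer-side timeouts, the relevant knobs on the Kafka producer (Java-client property names) look like this; the values are illustrative, not the ones Redfin actually used:

```python
# Producer timeout settings that bound how long a send can block before
# the producer gives up, so a slow broker can't stall the webserver.
producer_overrides = {
    "max.block.ms": "2000",         # cap time send() may block on metadata/buffer space
    "request.timeout.ms": "5000",   # cap time waiting for a broker response per request
    "delivery.timeout.ms": "10000", # upper bound on total time to report a send result
}
```

The constraint to keep in mind is that `delivery.timeout.ms` must be at least `request.timeout.ms` plus `linger.ms`, or the producer will reject the config.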
Exploring Envoy to move from unsharded to sharded