This document discusses using Redis and the Redis::Client Perl module to build scalable distributed job queues. It provides an overview of Redis, describing it as a key-value store that is simple, fast, and open-source. It then covers the various Redis data types like strings, lists, hashes, sets and sorted sets. Examples are given of how to work with these types using Redis::Client. The document discusses using Redis lists to implement job queues, with jobs added via RPUSH and popped via BLPOP. Benchmark results show the Redis-based job queue approach significantly outperforms using a MySQL jobs table with polling. Some caveats are provided about the benchmarks.
4. Redis Data Types
• Strings
• e.g. 'foo', '42', or a JSON blob
• Think Perl scalars
5. Redis Data Types
• Lists
• Zero or more strings, ordered
• RPUSH, RPOP, LPUSH, and LPOP == push, pop, unshift, and shift in Perl
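The analogy can be sketched in plain Perl with no Redis server at all; the array stands in for the Redis list (an illustration, not code from the deck):

```perl
# A Perl array mirrors a Redis list:
# RPUSH == push, RPOP == pop, LPUSH == unshift, LPOP == shift
my @list;
push @list, 'a', 'b';    # RPUSH mylist a b   -> ( a, b )
unshift @list, 'z';      # LPUSH mylist z     -> ( z, a, b )
my $tail = pop @list;    # RPOP mylist        -> 'b'
my $head = shift @list;  # LPOP mylist        -> 'z'
# @list is now ( 'a' )
```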
6. Redis Data Types
• Hashes
• Zero or more key-value pairs, unordered
• HDEL, HEXISTS, HKEYS, and HVALS == delete, exists, keys, and values in Perl
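Again in plain Perl, the same one-to-one mapping (illustrative only; the key names are made up):

```perl
# A Perl hash mirrors a Redis hash:
# HDEL == delete, HEXISTS == exists, HKEYS == keys, HVALS == values
my %hash = ( name => 'Redis', port => 6379 );

my $has_port = exists $hash{port};   # HEXISTS myhash port
my @fields   = sort keys %hash;      # HKEYS myhash
delete $hash{name};                  # HDEL myhash name
my @vals     = values %hash;         # HVALS myhash -> ( 6379 )
```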
7. Redis Data Types
• Some types are not directly analogous to Perl concepts
8. Redis Data Types
• Sets
• Zero or more keys, unordered
• No values
• Think of a Perl hash where all values are undef
9. Redis Data Types
• Sets
• SREM, SISMEMBER, SMEMBERS == delete, exists, and keys in Perl
10. Redis Data Types
• Set operations
• Union
• Intersection
• Add / remove
• Cardinality
11. Redis Data Types
• Set operations can be implemented with Perl hashes
• How?
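One possible answer to the slide's "How?", using hashes-as-sets with undef values (a sketch; the member names are invented):

```perl
# Sets as Perl hashes where all values are undef
my %a = map { $_ => undef } qw( red green blue );
my %b = map { $_ => undef } qw( green blue yellow );

# Union: merge the two key sets (cf. SUNION)
my %union = ( %a, %b );

# Intersection: keep keys of %a that also exist in %b (cf. SINTER)
my %inter = map { $_ => undef } grep { exists $b{$_} } keys %a;

# Cardinality: keys() in scalar context (cf. SCARD)
my $card = keys %inter;
```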
12. Redis Data Types
• Sorted Sets (zsets)
• Zero or more keys, each with a numeric “score”
• No values
• Can be modeled as a Perl hash where the values are the scores
13. Redis Data Types
• Sorted Sets (zsets)
• ZREM, ZRANK, and ZRANGE (loosely) == delete, exists, and keys in Perl
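The hash-with-scores model can also be sketched in plain Perl; sorting the keys by their values gives a rough equivalent of ZRANGE (illustrative members and scores):

```perl
# Sorted set modeled as a Perl hash: member => score
my %zset = ( alice => 300, bob => 150, carol => 225 );

# ZRANGE myzset 0 -1: members ordered by ascending score
my @by_score = sort { $zset{$a} <=> $zset{$b} } keys %zset;

# ZRANK myzset bob: position of a member in that ordering
my ($rank) = grep { $by_score[$_] eq 'bob' } 0 .. $#by_score;

# ZREM myzset bob
delete $zset{bob};
```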
14. Redis Data Types
• Sorted set operations
• Cardinality (within ranges)
• Union / intersection
• Lots more
15. Redis Data Types
• Many more commands for working with Redis types
• See http://redis.io/commands for the full list
20. Redis Protocol
• Old protocol problems:
• Command and data separated with whitespace
• Difficult to parse - escaping becomes an issue
• Difficult to deal with binary data or encoded text
• Inconsistent
22. Redis Protocol
• URP (Unified Request Protocol)
• Consistent command syntax
• All data are prefixed with a byte length
• No escaping or encoding/decoding required
• Binary round-trip safe
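As a concrete sketch of the length-prefixed framing, here is what a request like `SET mystring foo` looks like on the wire. This framing helper is illustrative; it is not taken from Redis::Client:

```perl
# Build a unified-protocol request: an array of N bulk strings,
# each prefixed with its length in BYTES, all terminated by CRLF.
sub urp_frame {
    my @args = @_;
    my $buf  = '*' . scalar(@args) . "\r\n";
    for my $arg (@args) {
        $buf .= '$' . length($arg) . "\r\n" . $arg . "\r\n";
    }
    return $buf;
}

my $frame = urp_frame( 'SET', 'mystring', 'foo' );
# "*3\r\n$3\r\nSET\r\n$8\r\nmystring\r\n$3\r\nfoo\r\n"
```

Because every argument carries its own byte count, arbitrary binary data round-trips safely with no escaping.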
29. Redis Dists on CPAN
• Information as of when I began work on Redis::Client, so some of this may have changed by now.
30. Redis Dists on CPAN
• Redis.pm
• Simple interface; works
• Does some odd things with encoding
• No newer Redis types like hashes
• AUTOLOAD :(
31. Redis Dists on CPAN
• Redis.pm
• Forces UTF-8 flag on returned data by default
• This is horribly broken
• It will be fixed
32. Redis Dists on CPAN
• Redis::hiredis
• Wrapper around the hiredis binary client
• Works well if you have hiredis available
• External binary dependency
• Slow IPC
33. Redis Dists on CPAN
• There was no Perl Redis module with:
• Native support for Redis hashes
• No outside binary dependencies
• Full URP support
42. Redis::Client
• Distribution structure
• Redis/Client/Role/URP.pm
• Moose role; implements the Unified Request Protocol
• Abstract enough to be used by other projects
46. Redis::Client
• Encoding Caveats
• Redis::Client makes no assumptions about your data encoding
• Character data MUST be encoded prior to being sent to Redis
• Redis URP relies on accurate BYTE counts, NOT character counts
• Data returned from Redis is NOT decoded
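The byte-vs-character distinction can be demonstrated with the core Encode module, no Redis required; the byte length is what a URP bulk-string prefix must carry:

```perl
use Encode qw( encode_utf8 );

# A string containing one non-ASCII character: "café"
my $str   = "caf\x{e9}";

my $chars = length $str;                  # 4 characters
my $bytes = length encode_utf8($str);     # 5 bytes once UTF-8 encoded

# The URP length prefix must state $bytes, not $chars;
# sending character counts corrupts the protocol stream.
```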
48. Redis::Client Examples
my $redis = Redis::Client->new;
# store a string
$redis->set( mystring => 'foo' );
# retrieve string
my $str = $redis->get('mystring');
49. Redis::Client Examples
# work with lists
$redis->lpush( 'mylist', 'one', 'two', 'three' );
my $tail = $redis->rpop('mylist');
# 'one' (LPUSH prepends each item, so 'one' ends up at the tail)
50. Redis::Client Examples
# work with hashes
$redis->hset( 'myhash', key => 42 );
my $val = $redis->hget( 'myhash', 'key' );
# 42
51. Redis::Client Examples
# work with tied classes
tie my %hash, 'Redis::Client::Hash',
    key => 'myhash', client => $redis;
my @keys = keys %hash;
if ( exists $hash{foo} ) { ... }
delete $hash{some_key};
while ( my ( $k, $v ) = each %hash ) { ... }
# etc.
52. Job Queues
• Goals
• Add jobs with arbitrary data to the queue
• Fetch and execute jobs as soon as possible
• Prevent duplicate job execution
• Thousands of jobs per hour
53. Job Queues
• Old model
• Jobs table in relational DB
• INSERT to add a job
• Poll DB to find new jobs
• Set a status field when a job is running
• Transactions to prevent duplicates
• Set status to 'done' or 'error' for historical data
54. Job Queues
• Problems:
• Relational DB is slow
• Jobs table grows quickly; not scalable
• Wrong tool for the job
55. Job Queues
• Using Redis:
• A single Redis list implements a queue
• LPOP (shift) jobs off the front of the queue
• RPUSH (push) jobs onto the end of the queue
• Use JSON to store job arguments and metadata
56. Job Queues
• Redis blocking push/pop
• LPOP returns undef if there are no jobs
• The BLPOP command blocks until an item to shift exists
• No need to poll the server; just wait
57. Job Queues
# Simple Example of Job / Dispatcher
use Redis::Client;
use JSON::XS;
use Module::Load;
use TryCatch;
my $redis = Redis::Client->new;
58. Job Queues
# Add a job
my $job = {
    class            => 'Some::Job',
    method           => 'do_something',
    constructor_args => { foo => 42 },
    method_args      => { bar => 43 },
};
$redis->rpush( jobs => encode_json $job );
59. Job Queues
# Simple dispatcher loop
my $job_str = $redis->blpop('jobs');
my $job     = decode_json $job_str;
my %c_args  = %{ $job->{constructor_args} };
my %m_args  = %{ $job->{method_args} };
my $class   = 'MyApp::' . $job->{class};
my $meth    = $job->{method};
60. Job Queues
load $class;
my $obj = $class->new( %c_args );
try {
$obj->$meth( %m_args );
} catch ( $err ) {
# store error data --> relational DB
};
# store success data --> relational DB
61. Benchmarks
• Disclaimer:
• These are bad benchmarks
• Dependent on highly system-specific architecture
• Didn’t try very hard
• Do your own evaluation
62. Benchmarks
• Old system (MySQL jobs table with polling, 5,000,000 dummy jobs populated)
• 1 worker: 88 jobs/minute avg.
• 10 workers: 710 jobs/minute avg.
• 100 workers: 3,892 jobs/minute avg.
63. Benchmarks
• New system (Redis queue, historical data only in MySQL, 5,000,000 dummy jobs populated)
• 1 worker: 92 jobs/minute avg.
• 10 workers: 1,011 jobs/minute avg.
• 100 workers: 9,222 jobs/minute avg.
64. Caveats
• These benchmarks are stupid
• Highly specific test; do your own tests
• Most job time is spent doing the job, not talking to the dispatcher
• But it works for my purposes
65. Thank You
Have fun with Redis
and Redis::Client
Mike Friedman
friedo@friedo.com