Redis is a popular, powerful, and—most of all—FAST database. Developers worldwide use Redis as a Cache, a session store, a message bus, and even as their regular database. In this session, we'll introduce Redis, and learn some of the key design patterns that you can use with Redis. Then we'll go over how Redis fits into the ecosystem, and how to use it in our applications—including some of the major gotchas to avoid.
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013mumrah
Apache Kafka is a new breed of messaging system built for the "big data" world. Coming out of LinkedIn (and donated to Apache), it is a distributed pub/sub system built in Scala. It has been an Apache TLP now for several months with the first Apache release imminent. Built for speed, scalability, and robustness, Kafka should definitely be one of the data tools you consider when designing distributed data-oriented applications.
The talk will cover a general overview of the project and technology, with some use cases, and a demo.
Redis is a popular, powerful, and—most of all—FAST database. Developers worldwide use Redis as a Cache, a session store, a message bus, and even as their regular database. In this session, we'll introduce Redis, and learn some of the key design patterns that you can use with Redis. Then we'll go over how Redis fits into the ecosystem, and how to use it in our applications—including some of the major gotchas to avoid.
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013mumrah
Apache Kafka is a new breed of messaging system built for the "big data" world. Coming out of LinkedIn (and donated to Apache), it is a distributed pub/sub system built in Scala. It has been an Apache TLP now for several months with the first Apache release imminent. Built for speed, scalability, and robustness, Kafka should definitely be one of the data tools you consider when designing distributed data-oriented applications.
The talk will cover a general overview of the project and technology, with some use cases, and a demo.
Introduction to memcached, a caching service designed for optimizing performance and scaling in the web stack, seen from perspective of MySQL/PHP users. Given for 2nd year students of professional bachelor in ICT at Kaho St. Lieven, Gent.
Redis is an advanced in memory key-value store designed for a world where "Memory is the new disk and disk is the new tape". Redis has some unique properties -- like blazing read and write speed, rich atomic operations and asynchronous persistence -- which make it ideally suited for a number of situations.
The tutorial includes an introduction to redis, data types used for redis, performance related to redis, sweet spots of redis, design consideration/best practices, adopters of redis. Begins with a section giving an introduction to redis which includes an introduction to redis and the features of redis. It also includes a brief history of redis, characteristics of redis, language support of redis. Following is a data types section. It includes the data types of redis like strings, lists, hashes, sets, sorted sets. It also includes not thinking of redis as an RDBMS, installation, atomicity of commands, key expiration.
Moreover, it also includes operations on lists, sets, sorted sets, hashes, keys, redis administration command. Alongside there is a section about performance of redis which includes the performance given by redis like hardware, payload size, batch size. It also includes benchmarks attained by redis, a demo version of redis, data durability and advantages of persistence. Then comes a section about sweet spots of redis. It includes sweet spots like cache server, tag cloud, auto completion, activity feeds and many more sweet spots. It also includes case studies as a video marketing platform, content publishing app etc.
A neighbouring section to this is about best practices which includes the design considerations and best practices of redis like avoid excessive long keys, have human readable keys, all data must fit in memory, polygot persistence is a smart choice and many more practices and design considerations. The last section of this tutorial includes some adopters of redis like stack overflow, craiglist, github, Instagram, blizzard entertainment.
Learning Rust the Hard Way for a Production Kafka + ScyllaDB PipelineScyllaDB
🎥 Sign up for upcoming webinars or browse through our library of on-demand recordings here: https://www.scylladb.com/resources/webinars/
About this webinar:
Numberly operates business-critical data pipelines and applications where failure and latency means "lost money" in the best-case scenario. Most of their data pipelines and applications are deployed on Kubernetes and rely on Kafka and ScyllaDB, with Kafka acting as the message bus and ScyllaDB as the source of data for enrichment. The availability and latency of both systems are thus very important for data pipelines. While most of Numberly’s applications are developed using Python, they found a need to move high-performance applications to Rust in order to benefit from a lower-level programming language.
Learn the lessons from Numberly’s experience, including:
- Rationale in selecting a lower-level language
- Developing using a lower-level Rust code base
- Observability and analyzing latency impacts with Rust
- Tuning everything from Apache Avro to driver client settings
- How to build a mission-critical system combining Apache Kafka and ScyllaDB
- Half a year Rust in production feedback
DB12c: All You Need to Know About the Resource ManagerMaris Elsins
This presentation is different from the previous uploads as SLOB was used for the testing.
Oracle Database 12c Multitenant provides the highest level of Oracle Database resource efficiency, driven by an improved resource manager. The 12c resource manager effectively allocates resources both within a single database and between multiple pluggable databases in a container. This presentation will review new features of the 12c resource manager, provide guidelines for migration of your current resource management plan to 12c, and will also look into how much overhead the resource manager introduces.
High-Volume Data Collection and Real Time Analytics Using Rediscacois
In this talk, we describe using Redis, an open source, in-memory key-value store, to capture large volumes of data from numerous remote sources while also allowing real-time monitoring and analytics. With this approach, we were able to capture a high volume of continuous data from numerous remote environmental sensors while consistently querying our database for real time monitoring and analytics.
* See more of my work at http://www.codehenge.net
Introduction to memcached, a caching service designed for optimizing performance and scaling in the web stack, seen from perspective of MySQL/PHP users. Given for 2nd year students of professional bachelor in ICT at Kaho St. Lieven, Gent.
Redis is an advanced in memory key-value store designed for a world where "Memory is the new disk and disk is the new tape". Redis has some unique properties -- like blazing read and write speed, rich atomic operations and asynchronous persistence -- which make it ideally suited for a number of situations.
The tutorial includes an introduction to redis, data types used for redis, performance related to redis, sweet spots of redis, design consideration/best practices, adopters of redis. Begins with a section giving an introduction to redis which includes an introduction to redis and the features of redis. It also includes a brief history of redis, characteristics of redis, language support of redis. Following is a data types section. It includes the data types of redis like strings, lists, hashes, sets, sorted sets. It also includes not thinking of redis as an RDBMS, installation, atomicity of commands, key expiration.
Moreover, it also includes operations on lists, sets, sorted sets, hashes, keys, redis administration command. Alongside there is a section about performance of redis which includes the performance given by redis like hardware, payload size, batch size. It also includes benchmarks attained by redis, a demo version of redis, data durability and advantages of persistence. Then comes a section about sweet spots of redis. It includes sweet spots like cache server, tag cloud, auto completion, activity feeds and many more sweet spots. It also includes case studies as a video marketing platform, content publishing app etc.
A neighbouring section to this is about best practices which includes the design considerations and best practices of redis like avoid excessive long keys, have human readable keys, all data must fit in memory, polygot persistence is a smart choice and many more practices and design considerations. The last section of this tutorial includes some adopters of redis like stack overflow, craiglist, github, Instagram, blizzard entertainment.
Learning Rust the Hard Way for a Production Kafka + ScyllaDB PipelineScyllaDB
🎥 Sign up for upcoming webinars or browse through our library of on-demand recordings here: https://www.scylladb.com/resources/webinars/
About this webinar:
Numberly operates business-critical data pipelines and applications where failure and latency means "lost money" in the best-case scenario. Most of their data pipelines and applications are deployed on Kubernetes and rely on Kafka and ScyllaDB, with Kafka acting as the message bus and ScyllaDB as the source of data for enrichment. The availability and latency of both systems are thus very important for data pipelines. While most of Numberly’s applications are developed using Python, they found a need to move high-performance applications to Rust in order to benefit from a lower-level programming language.
Learn the lessons from Numberly’s experience, including:
- Rationale in selecting a lower-level language
- Developing using a lower-level Rust code base
- Observability and analyzing latency impacts with Rust
- Tuning everything from Apache Avro to driver client settings
- How to build a mission-critical system combining Apache Kafka and ScyllaDB
- Half a year Rust in production feedback
DB12c: All You Need to Know About the Resource ManagerMaris Elsins
This presentation is different from the previous uploads as SLOB was used for the testing.
Oracle Database 12c Multitenant provides the highest level of Oracle Database resource efficiency, driven by an improved resource manager. The 12c resource manager effectively allocates resources both within a single database and between multiple pluggable databases in a container. This presentation will review new features of the 12c resource manager, provide guidelines for migration of your current resource management plan to 12c, and will also look into how much overhead the resource manager introduces.
High-Volume Data Collection and Real Time Analytics Using Rediscacois
In this talk, we describe using Redis, an open source, in-memory key-value store, to capture large volumes of data from numerous remote sources while also allowing real-time monitoring and analytics. With this approach, we were able to capture a high volume of continuous data from numerous remote environmental sensors while consistently querying our database for real time monitoring and analytics.
* See more of my work at http://www.codehenge.net
Noah Davis & Luke Melia of Weplay share a series of examples of Redis in the real world. In doing so, they cover a survey of Redis' features, approach, history and philosophy. Most examples are drawn from the Weplay team's experience using Redis to power features on Weplay.com, a social site for youth sports.
Redis Use Patterns (DevconTLV June 2014)Itamar Haber
An introduction to Redis for the SQL practitioner, covering data types and common use cases.
The video of this session can be found at: https://www.youtube.com/watch?v=8Unaug_vmFI
NoSQL databases are often touted for their performance and whilst it's true that they usually offer great performance out of the box, it still really depends on how you deploy your infrastructure. Dedicated vs cloud? In memory vs on disk? Spindal vs SSD? Replication lag. Multi data centre deployment.
This talk considers all the infrastructure requirements of a successful high performance infrastructure with hints and tips that can be applied to any NoSQL technology. It includes things like OS tweaks, disk benchmarks, replication, monitoring and backups.
Presented at NoSQL Roadshow Berlin 2013 by David Mytton.
MongoDB supports replication for failover and redundancy. In this session we will introduce the basic concepts around replica sets, which provide automated failover and recovery of nodes. We'll show you how to set up, configure, and initiate a replica set, and methods for using replication to scale reads. We'll also discuss proper architecture for durability.
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDBAmazon Web Services
If you’re familiar with relational databases, designing your app to use a NoSQL database like DynamoDB may be new to you. In this webinar, we’ll walk you through common data design patterns for a variety of applications to help you learn how to design a schema, then store and retrieve the data with DynamoDB. We will discuss the benefits of using DynamoDB to develop mobile, web, IoT, and gaming apps.
Learning Objectives:
Learn schema design best practices with DynamoDB across multiple use cases, including gaming, AdTech, IoT, and others
Who Should Attend:
Architects, Developers, and SysOps interested in learning how to design NoSQL schemas to support mobile, web, IoT, AdTech, and gaming apps.
Familiarity with DynamoDB is helpful
Speakers: Chris Larsen (Limelight Networks) and Benoit Sigoure (Arista Networks)
The OpenTSDB community continues to grow and with users looking to store massive amounts of time-series data in a scalable manner. In this talk, we will discuss a number of use cases and best practices around naming schemas and HBase configuration. We will also review OpenTSDB 2.0's new features, including the HTTP API, plugins, annotations, millisecond support, and metadata, as well as what's next in the roadmap.
MongoDB: Optimising for Performance, Scale & AnalyticsServer Density
MongoDB is easy to download and run locally but requires some thought and further understanding when deploying to production. At scale, schema design, indexes and query patterns really matter. So does data structure on disk, sharding, replication and data centre awareness. This talk will examine these factors in the context of analytics, and more generally, to help you optimise MongoDB for any scale.
Presented at MongoDB Days London 2013 by David Mytton.
In this session, we explore Amazon DynamoDB capabilities and benefits in detail and discusses how to get the most out of your DynamoDB database. We go over schema design best practices with DynamoDB across multiple use cases, including gaming, AdTech, IoT, and others. We also explore designing efficient indexes, scanning, and querying, and go into detail on a number of recently released features, including JSON document support, Streams, and more.
For the first time this year, 10gen will be offering a track completely dedicated to Operations at MongoSV, 10gen's annual MongoDB user conference on December 4. Learn more at MongoSV.com
MongoDB supports replication for failover and redundancy. In this session we will introduce the basic concepts around replica sets, which provide automated failover and recovery of nodes. We'll show you how to set up, configure, and initiate a replica set, and methods for using replication to scale reads. We'll also discuss proper architecture for durability.
Preview of Cassandra 2.2 and 3.0 features. Materialized views, user defined functions, user defined aggregations, new storage engine, rewritten hints, improved vnodes, native JSON support, updated garbage collector.
Building a Scalable Inbox System with MongoDB and Javaantoinegirbal
Many user-facing applications present some kind of news feed/inbox system. You can think of Facebook, Twitter, or Gmail as different types of inboxes where the user can see data of interest, sorted by time, popularity, or other parameter. A scalable inbox is a difficult problem to solve: for millions of users, varied data from many sources must be sorted and presented within milliseconds. Different strategies can be used: scatter-gather, fan-out writes, and so on. This session presents an actual application developed by 10gen in Java, using MongoDB. This application is open source and is intended to show the reference implementation of several strategies to tackle this common challenge. The presentation also introduces many MongoDB concepts.
Amazon DynamoDB is a fully managed, highly scalable NoSQL database service. We will deep dive into how DynamoDB scaling and partitioning works, how to do data modeling based on access patterns using primitives such as hash/range keys, secondary indexes, conditional writes and query filters. We will also discuss how to use DynamoDB Streams to build cross-region replication and integrate with other services (such as Amazon S3, Amazon CloudSearch, Amazon ElastiCache, Amazon Redshift) to enable logging, search, analytics and caching. You will learn design patterns and best practices on how to use DynamoDB to build highly scalable applications, with the right performance characteristics at the right cost.
Media owners are turning to MongoDB to drive social interaction with their published content. The way customers consume information has changed and passive communication is no longer enough. They want to comment, share and engage with publishers and their community through a range of media types and via multiple channels whenever and wherever they are. There are serious challenges with taking this semi-structured and unstructured data and making it work in a traditional relational database. This webinar looks at how MongoDB’s schemaless design and document orientation gives organisation’s like the Guardian the flexibility to aggregate social content and scale out.
Searching Billions of Documents with RedisDvir Volk
RediSearch is a powerful Redis module for full-text and secondary indexing, that can scale to billions of documents and deliver super fast performance for real time indexing and searching. An overview of how it works and how it can scale to huge clusters.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
UiPath Test Automation using UiPath Test Suite series, part 3
Kicking ass with redis
1. Kicking Ass With
Redis for real world problems
Dvir Volk, Chief Architect, Everything.me (@dvirsky)
2. O HAI! I CAN HAS REDIS?
Extremely Quick introduction to Redis
● Key => Data Structure server
● In memory, with persistence
● Extremely fast and versatile
● Rapidly growing (Instagr.am, Craigslist, Youporn ....)
● Open Source, awesome community
● Used as the primary data source in Everything.me:
○ Relational Data
○ Queueing
○ Caching
○ Machine Learning
○ Text Processing and search
○ Geo Stuff
3. Key => { Data Structures }
"I'm a Plain Text String!" Strings/Blobs/Bitmaps
Key1 Val1
Hash Tables (objects!)
Key2 Val 2
Key C B B A C Linked Lists
A B C D
Sets
Sorted Sets
A: 0.1 B: 0.3 C: 500 D: 500
4. Redis is like Lego for Data
● Yes, It can be used as a simple KV store.
● But to really Use it, you need to think of it as a tool set.
● You have a nail - redis is a hammer building toolkit.
● That can make almost any kind of hammer.
● Learning how to efficiently model your problem is the
Zen of Redis.
● Here are a few examples...
5. Pattern 1: Simple, Fast, Object Store
Our problem:
● Very fast object store that scales up well.
● High write throughput.
● Atomic manipulation of object members.
Possible use cases:
● Online user data (session, game state)
● Social Feed
● Shopping Cart
● Anything, really...
6. Storing users as HASHes
email john@domain.com
name John
users:1
Password aebc65feae8b
id 1
email Jane@domain.com
name Jane
users:2
Password aebc65ab117b
id 2
7. Redis Pattern 1
● Each object is saved as a HASH.
● Hash objects are { key=> string/number }
● No JSON & friends serialization overhead.
● Complex members and relations are stored
as separate HASHes.
● Atomic set / increment / getset members.
● Use INCR for centralized incremental ids.
● Load objects with HGETALL / HMGET
10. Pattern 2: Object Indexing
The problem:
● We want to index the objects we saved by
various criteria.
● We want to rank and sort them quickly.
● We want to be able to update an index
quickly.
Use cases:
● Tagging
● Real-Time score tables
● Social Feed Views
11. Indexing with Sorted Sets
k:users:email
email john@domain.com
user:1 => 1789708973
name John
users:1
score 300 user:2 => 2361572523
id 1
....
email Jane@domain.com
k:users:score
name Jane
user:2 => 250
users:2
score 250
user:1 => 300
id 2
user:3 => 300
12. Redis Pattern
● Indexes are sorted sets (ZSETs)
● Access by value O(1), by score O(log(N)). plus ranges.
● Sorted Sets map { value => score (double) }
● So we map { objectId => score }
● For numerical members, the value is the score
● For string members, the score is a hash of the string.
● Fetching is done with ZRANGEBYSCORE
● Ranges with ZRANGE / ZRANGEBYSCORE on
numeric values only (or very short strings)
● Deleting is done with ZREM
● Intersecting keys is possible with ZINTERSTORE
● Each class' objects have a special sorted set for ids.
13. Automatic Keys for objects
class User(RedisObject): > ZADD k:users:email 238927659283691 "1"
1
_keySpec = KeySpec(
UnorderedKey('email'), > ZADD k:users:name 9283498696113 "1"
UnorderedKey('name'), 1
OrderedNumericalKey('points') > ZADD k:users:points 300 "1"
)
.... 1
> ZREVRANGE k:users:points 0 20 withscores
#creating the users - now with points 1) "1"
user = User('user@domain.com', 'John', 2) "300"
'1234', points = 300)
> ZRANGEBYSCORE k:users:email 238927659283691
238927659283691
#saving auto-indexes 1) "1"
user.save()
redis 127.0.0.1:6379> HGETALL users:1
{ .. }
#range query on rank
users = User.getByRank(0,20)
#get by name
users = User.get(name = 'John')
14. Pattern 3: Unique Value Counter
The problem:
● We want an efficient way to measure
cardinality of a set of objects over time.
● We may want it in real time.
● We don't want huge overhead.
Use Cases:
● Daily/Monthly Unique users
● Split by OS / country / whatever
● Real Time online users counter
16. Redis Pattern
● Redis strings can be treated as bitmaps.
● We keep a bitmap for each time slot.
● We use BITSET offset=<object id>
● the size of a bitmap is max_id/8 bytes
● Cardinality per slot with BITCOUNT (2.6)
● Fast bitwise operations - OR / AND / XOR
between time slots with BITOP
● Aggregate and save results periodically.
● Requires sequential object ids - or mapping
of (see incremental ids)
17. Counter API (with redis internals)
counter = BitmapCounter('uniques', timeResolutions=(RES_DAY,))
#sampling current users
counter.add(userId)
> BITSET uniques:day:1339891200 <userId> 1
#Getting the unique user count for today
counter.getCount(time.time())
> BITCOUNT uniques:day:1339891200
#Getting the the weekly unique users in the past week
timePoints = [now() - 86400*i for i in xrange(7, 0, -1)]
counter.aggregateCounts(timePoints, counter.OP_TOTAL)
> BITOP OR tmp_key uniques:day:1339891200 uniques:day:1339804800 ....
> BITCOUNT tmp_key
18. Pattern 4: Geo resolving
The Problem:
● Resolve lat,lon to real locations
● Find locations of a certain class (restaurants)
near me
● IP2Location search
Use Cases:
● Find a user's City, ZIP code, Country, etc.
● Find the user's location by IP
19. A bit about geohashing
● Converts (lat,lon) into a single 64 bit hash (and back)
● The closer points are, their common prefix is generally bigger.
● Trimming more lower bits describes a larger bounding box.
● example:
○ Tel Aviv (32.0667, 34.7667) =>
14326455945304181035
○ Netanya (32.3336, 34.8578) =>
14326502174498709381
● We can use geohash as scores
in sorted sets.
● There are drawbacks such as
special cases near lat/lon 0.
20. Redis Pattern
● Let's index cities in a sorted set:
○ { cityId => geohash(lat,lon) }
● We convert the user's {lat,lon} into a goehash too.
● Using ZRANGEBYSCORE we find the N larger and N
smaller elements in the set:
○ ZRANGEBYSCORE <user_hash> +inf 0 8
○ ZREVRANGEBYSCORE <user_hash> -inf 0 8
● We use the scores as lat,lons again to find distance.
● We find the closest city, and load it.
● We can save bounding rects for more precision.
● The same can be done for ZIP codes, venues, etc.
● IP Ranges are indexed on a sorted set, too.
21. Other interesting use cases
● Distributed Queue
○ Workers use blocking pop (BLPOP) on a list.
○ Whenever someone pushes a task to the list (RPUSH) it will be
popped by exactly one worker.
● Push notifications / IM
○ Use redis PubSub objects as messaging channels between users.
○ Combine with WebSocket to push messages to Web Browsers, a-la
googletalk.
● Machine learning
○ Use redis sorted sets as a fast storage for feature vectors, frequency
counts, probabilities, etc.
○ Intersecting sorted sets can yield SUM(scores) - think log(P(a)) + log
(P(b))
22. Get the sources
Implementations of most of the examples in this
slideshow:
https://github.com/EverythingMe/kickass-redis
Geo resolving library:
http://github.com/doat/geodis
Get redis at http://redis.io