HBase release managers Lars Hofhansl, Andrew Purtell, Enis Soztutar, Michael Stack, and Liyin Tang jointly present highlights from their releases, and take your questions throughout.
Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team - Redis Labs
A presentation by Redis Labs' CTO, Yiftach Shoolman, given at the July 2nd meet up, hosted by I am OnDemand and IGT Cloud at the Microsoft ILDC Auditorium.
See the video at: https://www.youtube.com/watch?v=eymqHZaUOH4
In this session Yiftach shares tips on how the company manages 50,000+ scalable and highly available Redis databases over the 4 largest public clouds, 8 leading Platforms-as-a-Service, and across 10 geographical regions. He explains the service's back-end architecture, the open-source projects it uses, and which tools the company builds in-house. Shoolman also shares what Redis Labs' small DevOps team does automatically, and what it still does manually. Finally, he offers advice on how to build a strong R&D team that lives and breathes DevOps.
Since the company launched its Redis Cloud service, it has dealt with 150+ node failure events and a half-dozen complete data-center outages. In addition, its team has experienced many interesting scenarios, such as hard-to-believe scaling patterns like 0 to a few hundred gigabytes of in-memory data in just a few minutes, and 0 to 300K+ ops/sec in just a few seconds.
Elastic Data Processing with Apache Flink and Apache Pulsar - StreamNative
More and more applications are using Flink for low-latency data processing. Flink unifies batch and stream processing using one computation engine. In reality, however, truly unifying batch and stream processing requires a data system that offers one unified data representation for both batch and streaming data. Nowadays, streaming data is typically stored in a log storage or messaging system, while batch data is stored in distributed filesystems and object stores. That means that data scientists still need to write two different computing jobs to access the same data stored in different data systems.
Apache Pulsar is a next-generation messaging and streaming data system. It was originally built at Yahoo, and has graduated from the Apache Incubator to become a Top-Level Project. Pulsar separates message serving and data storage into two layers. This layered architecture provides high throughput and low latency while ensuring high availability and scalability. Pulsar’s segment-centric storage design, along with its layered architecture, makes Pulsar an unbounded streaming data system that fits well into Flink’s computation model.
In this talk, Sijie Guo from the Apache Pulsar PMC will introduce Pulsar, its layered architecture, and its segment-centric storage, detailing how this architecture integrates with Flink to provide elastic, unified batch and stream processing.
Red Hat Forum Tokyo - OpenStack Architecture Design - Dan Radez
This was my second session presented at the Red Hat Forum in Tokyo, November 2012. There's a lot of animation in this presentation. The animation doesn't show well, but the basic ideas kinda show through. This presentation takes participants from a basic 1 or 2 node install to a Highly Available, Load Balanced, horizontally scaled deployment.
Redis in a Multi Tenant Environment - High Availability, Monitoring & Much More! - Redis Labs
Running any application in a multi-tenant environment poses its challenges. This talk is focused around how we at Rackspace run Redis in a multi-tenant environment, ensuring security, performance, fault tolerance and high availability. This talk will cover: an architecture deep dive of multi-tenant Redis on the cloud, management of sentinels, monitoring and operations of a large-scale Redis deployment, introducing new Redis versions, scaling, security, and some lessons learnt. The target audience for this talk is anyone who is interested in the deployment/operational aspect of running Redis. This is relevant not only for those who want to run Redis themselves, but also for those interested in how a Redis provider might be doing it for them.
Cassandra Day SV 2014: Netflix’s Astyanax Java Client Driver for Apache Cassa... - DataStax Academy
Astyanax is the thrift protocol based C* driver widely used and open sourced by Netflix. It was recently integrated with the Java Driver released by DataStax. This talk focusses on the different options available with Astyanax and how it complements the Java Driver.
About Puneet Oberai, Senior Software Engineer at Netflix
Senior Software Engineer at Netflix and proud team member of Netflix CDE (Cloud Data Engineering).
Redis is an open source in-memory database which is easy to use. This introductory presentation discusses several features, including use cases. It covers the datatypes, publish/subscribe features, and persistence, along with client implementations in Node and Spring Boot. After this presentation, you will have a basic understanding of what Redis is and enough knowledge to get started with your first implementation!
An overview of building and serving Lucene indexes on a Hadoop cluster with Solr for text and parametric searching, as presented at Cleveland Hadoop User Group on 13 January 2014.
A Redis introduction and a customized framework based on StackExchange.Redis, updated to use the singleton pattern and JSON configuration mapping with a Redis instance group and name concept.
In the big data world, our data stores communicate over an asynchronous, unreliable network to provide a facade of consistency. However, to really understand the guarantees of these systems, we must understand the realities of networks and test our data stores against them.
Jepsen is a tool which simulates network partitions in data stores and helps us understand the guarantees of our systems and their failure modes. In this talk, I will help you understand why you should care about network partitions and how we can test data stores against partitions using Jepsen. I will explain what Jepsen is, how it works, and the kinds of tests it lets you create. We will try to understand the subtleties of distributed consensus and the CAP theorem, and demonstrate how different data stores such as MongoDB, Cassandra, Elastic and Solr behave under network partitions. Finally, I will describe the results of the tests I wrote using Jepsen for Apache Solr and discuss the kinds of rare failures which were found by this excellent tool.
High Performance Redis - Tague Griffith, GoPro - Redis Labs
High Performance Redis looks at a wide range of techniques - from programming to system tuning - to deploy and maintain an extremely high performing Redis cluster. From the operational perspective, the talk lays out multiple techniques for clustering (sharding) Redis systems and examines how the different approaches impact performance. The talk further examines system settings (Linux network parameters, Redis system) and how they impact performance (both good and bad). Finally, for the developer, we look at how different ways of structuring data demonstrate different performance characteristics.
Speed up your Symfony2 application and build awesome features with Redis - Ricard Clau
Redis is an extremely fast data structure server that can be easily added to your existing stack and act like a Swiss army knife to help solve many problems that would be extremely difficult to work around with a traditional RDBMS. In this session we will focus on what Redis is, how it works, what awesome features we can build with it, and how we can use it with PHP and integrate it with Symfony2 applications, making them blazing fast.
This presentation was used in a Redis talk that took place at ALT.NET Melbourne on February 25th, 2014. It is aimed at .NET developers, though almost all of the slides discuss Redis solely as the data store server, without reference to client libraries in general or .NET in particular.
Performance is key to a good user experience. Caching information is often the best way to achieve it.
Redis is far from the traditional cache, which deals only with key-value pairs. Built as an open-source project, it is accessible from multiple languages and supports atomic operations such as appending to a string, incrementing the value in a hash, pushing to a list, computing set intersection, union and difference, or getting the member with the highest ranking in a sorted set.
This session will introduce many features of the Azure Redis Cache service through a demo application.
(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices - Amazon Web Services
Get a look under the covers: Learn tuning best practices for taking advantage of Amazon Redshift's columnar technology and parallel processing capabilities to improve your delivery of queries and improve overall database performance. This session explains how to migrate from existing data warehouses, create an optimized schema, efficiently load data, use work load management, tune your queries, and use Amazon Redshift's interleaved sorting features. Finally, learn how TripAdvisor uses these best practices to give their entire organization access to analytic insights at scale.
Peek behind the scenes to learn about Amazon ElastiCache's design and architecture. See common design patterns of our Memcached and Redis offerings and how customers have used them for in-memory operations and achieved improved latency and throughput for applications. During this session, we review best practices, design patterns, and anti-patterns related to Amazon ElastiCache.
Solr Recipes provides quick and easy steps for common use cases with Apache Solr. Bite-sized recipes will be presented for data ingestion, textual analysis, client integration, and each of Solr’s features including faceting, more-like-this, spell checking/suggest, and others.
Amazon Redshift is a managed service that gives you a ready-to-use data warehouse. You only need to worry about loading data and using it. Infrastructure details such as servers, replication, and backups are managed by AWS.
2. Ready to Redis
• NoSQL database
• Key-value data store
• Keys are binary-safe strings
• Values can be strings, hashes, lists, sets and sorted sets
• Redis uses RAM as its data store
3. Key-Value Data Store
• Insert data with specific key
• Get data by key with O(1) complexity
• Values can contain structured strings such as JSON or XML
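As a rough illustration of the key-value model on this slide, the sketch below uses a plain Python dict as a stand-in for the Redis keyspace. No server is involved, and the helper names `set_value` and `get_value` and the key `user:1` are hypothetical. It only shows the idea: insert under a key, read back by key in constant average time, and keep structured data as a serialized JSON string.

```python
import json

# Assumption: a plain dict stands in for the Redis keyspace (no server involved).
store = {}

def set_value(key, value):
    """Insert data under a specific key, similar in spirit to SET."""
    store[key] = value

def get_value(key):
    """Look the key up in O(1) average time, similar to GET; None if missing."""
    return store.get(key)

# The store treats values as opaque strings, so structured data is serialized first.
set_value("user:1", json.dumps({"name": "alice", "visits": 3}))
user = json.loads(get_value("user:1"))
print(user["name"])
```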
4. List Control
• SET : LINSERT, LPUSH, RPUSH, LSET
• GET : LPOP, LRANGE, RPOP
• DEL : LREM
• ETC : LTRIM, LLEN
5. Pushing IDs instead of the actual data
$ redis-cli incr next.news.id
(integer) 1
$ redis-cli set news:1:title "Redis is simple"
OK
$ redis-cli set news:1:url "http://code.google.com/p/redis"
OK
$ redis-cli lpush submitted.news 1
(integer) 1
$ redis-cli lrange submitted.news 0 -1
1) "1"
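The pattern in the session above (generate an ID, store the fields under keys derived from it, push only the ID onto a list) can be sketched in Python. This is an in-memory simulation, not a Redis client: the dicts and the helpers `incr`, `lpush`, `lrange_all` and `submit_news` are assumptions made for illustration, and the second news item is invented sample data.

```python
from collections import deque

strings = {}   # stands in for string keys (assumption: no real server)
lists = {}     # stands in for list keys

def incr(key):
    """Increment the integer stored at key, starting from 0, like INCR."""
    strings[key] = str(int(strings.get(key, "0")) + 1)
    return int(strings[key])

def lpush(key, value):
    """Push onto the head of the list and return its new length, like LPUSH."""
    lists.setdefault(key, deque()).appendleft(str(value))
    return len(lists[key])

def lrange_all(key):
    """Return every element from head to tail, like LRANGE key 0 -1."""
    return list(lists.get(key, ()))

def submit_news(title, url):
    """Store the item under ID-based keys and push only the ID to the list."""
    news_id = incr("next.news.id")
    strings[f"news:{news_id}:title"] = title
    strings[f"news:{news_id}:url"] = url
    lpush("submitted.news", news_id)
    return news_id

submit_news("Redis is simple", "http://code.google.com/p/redis")
submit_news("Second story", "http://example.com/2")  # hypothetical sample item

# The list holds small IDs; the actual data is fetched by key when needed.
for news_id in lrange_all("submitted.news"):
    print(news_id, strings[f"news:{news_id}:title"])
```

Keeping only IDs in the list means the same item can appear in several lists without its data being copied.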
6. Set Control
• SET : SADD
• GET : SPOP, SRANDMEMBER, SMEMBERS
• DEL : SREM
• ETC : SINTER, SUNION, SCARD, SDIFF, SMOVE,
SISMEMBER
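For readers more familiar with Python than Redis, the set commands above map closely onto Python's built-in `set` type. The sketch below is only an analogy, not Redis client code; the variable names and the `smove` helper are hypothetical.

```python
# Assumption: two Python sets stand in for two Redis set keys.
tags_a = {"redis", "cache", "nosql"}   # like SADD tags:a redis cache nosql
tags_b = {"redis", "queue"}            # like SADD tags:b redis queue

inter = tags_a & tags_b        # SINTER: members present in both sets
union = tags_a | tags_b        # SUNION: members present in either set
diff = tags_a - tags_b         # SDIFF: members of the first set not in the second
card = len(tags_a)             # SCARD: number of members
is_member = "cache" in tags_a  # SISMEMBER: membership test

def smove(source, destination, member):
    """Move a member between sets, like SMOVE; True if it was moved."""
    if member not in source:
        return False
    source.remove(member)
    destination.add(member)
    return True

moved = smove(tags_a, tags_b, "cache")
print(sorted(inter), sorted(diff), card, is_member, moved)
```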
8. Sorted Set Control
• SET : ZADD, ZINCRBY
• GET : ZRANGE, ZRANGEBYSCORE, ZSCORE,
ZCARD, ZRANK, ZCOUNT
• DEL : ZREM, ZREMRANGEBYRANK,
ZREMRANGEBYSCORE
• ETC : ZINTERSTORE, ZUNIONSTORE
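A sorted set keeps unique members ordered by a numeric score. The Python sketch below imitates a few of the commands listed above with a dict and a sort on every read. That is a simplification chosen for brevity and says nothing about how Redis implements sorted sets. The function names mirror the commands but are local helpers, and the player names are made up.

```python
scores = {}  # member -> score; stands in for a single sorted-set key

def zadd(member, score):
    """Add a member or update its score, like ZADD."""
    scores[member] = float(score)

def zincrby(member, delta):
    """Increase a member's score and return the new score, like ZINCRBY."""
    scores[member] = scores.get(member, 0.0) + delta
    return scores[member]

def _ordered():
    # Ascending by score; ties are broken by member name.
    return sorted(scores, key=lambda m: (scores[m], m))

def zrange(start, stop):
    """Members between two ranks, inclusive; negative ranks count from the end."""
    members = _ordered()
    n = len(members)
    if start < 0:
        start = max(n + start, 0)
    if stop < 0:
        stop = n + stop
    if stop < 0:
        return []
    return members[start:stop + 1]

def zrank(member):
    """Zero-based rank of a member, or None if absent, like ZRANK."""
    return _ordered().index(member) if member in scores else None

zadd("alice", 50)
zadd("bob", 70)
zadd("carol", 60)
zincrby("alice", 25)
print(zrange(0, -1), zrank("alice"))
```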
10. Replication
• Master-slave replication
• A master can have multiple slaves
• Slaves can accept connections from other slaves
• Slaves can also be connected to other slaves in a graph-like structure
• Redis replication is non-blocking on the master side, but blocking on the slave side
11. How replication works
• To configure replication, add the line below to the slave's configuration file:
slaveof IP PORT
• Once configured, the slave sends a SYNC command upon connection
• The master starts a background save and collects all newly received commands that modify the dataset
• When the background save completes, the master transfers the dataset to the slave, then sends the collected commands
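The synchronization steps above can be modelled with a toy simulation. It is a sketch under heavy assumptions: the `Master` and `Slave` classes, their method names, and the in-process "transfer" are all hypothetical. It only shows why the master must buffer writes that arrive while the snapshot is being saved.

```python
import copy

class Master:
    def __init__(self):
        self.data = {}
        self.buffer = None  # write commands collected while a snapshot is in flight

    def write(self, key, value):
        self.data[key] = value
        if self.buffer is not None:
            self.buffer.append((key, value))

    def begin_sync(self):
        """A slave sent SYNC: snapshot the dataset and start buffering writes."""
        self.buffer = []
        return copy.deepcopy(self.data)

    def finish_sync(self):
        """Snapshot transferred: hand over the buffered writes, stop buffering."""
        pending, self.buffer = self.buffer, None
        return pending

class Slave:
    def __init__(self):
        self.data = {}

    def sync_from(self, master, writes_during_snapshot=()):
        snapshot = master.begin_sync()
        # Simulate clients writing to the master while the snapshot is saved.
        for key, value in writes_during_snapshot:
            master.write(key, value)
        self.data = snapshot                      # load the transferred dataset
        for key, value in master.finish_sync():   # then replay collected commands
            self.data[key] = value

master, slave = Master(), Slave()
master.write("a", "1")
slave.sync_from(master, writes_during_snapshot=[("b", "2"), ("a", "3")])
print(slave.data == master.data)
```

Without the replay step, the slave would end up with the stale snapshot value for `a` and no `b` at all.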
12. Publish / Subscribe
• Messaging pattern where senders of messages do not program the messages to be sent directly to specific receivers
• The SUBSCRIBE [channel] command creates a channel or subscribes to an existing one
• The PUBLISH [channel] [message] command sends a message via the channel
• Supports pattern-matching subscriptions with PSUBSCRIBE
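The decoupling described above can be illustrated with a small in-process broker. This is not how Redis is implemented and it is not client code. The `Broker` class and its methods are hypothetical, and Python's `fnmatchcase` is used as an approximation of the glob-style patterns accepted by PSUBSCRIBE.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

class Broker:
    """Toy publish/subscribe hub: publishers never address receivers directly."""

    def __init__(self):
        self.channels = defaultdict(list)   # exact channel -> callbacks
        self.patterns = defaultdict(list)   # glob pattern -> callbacks

    def subscribe(self, channel, callback):
        self.channels[channel].append(callback)

    def psubscribe(self, pattern, callback):
        self.patterns[pattern].append(callback)

    def publish(self, channel, message):
        """Deliver to every matching subscriber; return how many received it."""
        receivers = 0
        for callback in self.channels.get(channel, []):
            callback(channel, message)
            receivers += 1
        for pattern, callbacks in self.patterns.items():
            if fnmatchcase(channel, pattern):
                for callback in callbacks:
                    callback(channel, message)
                    receivers += 1
        return receivers

inbox = []
broker = Broker()
broker.subscribe("news.tech", lambda ch, msg: inbox.append(("exact", ch, msg)))
broker.psubscribe("news.*", lambda ch, msg: inbox.append(("pattern", ch, msg)))

delivered = broker.publish("news.tech", "hello")
ignored = broker.publish("sports", "nobody listens")
print(delivered, ignored, inbox)
```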
13. Pipelining
• Whether the network is fast or slow, there is always latency
• To reduce the impact of network latency, Redis supports sending multiple commands in one request
• Send the commands separated by a newline delimiter:
echo -en "PING\r\nPING\r\nPING\r\n" | nc localhost 6379
• The results are received after all commands are processed
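The same idea as the `nc` one-liner can be sketched in Python with a raw socket. The helper names `build_pipeline` and `send_pipeline` are hypothetical. `send_pipeline` assumes a Redis server is listening on localhost:6379, as on the slide, and reads the replies with a single `recv`, which is a simplification since replies may arrive in more than one chunk. Only `build_pipeline` runs without a server.

```python
import socket

def build_pipeline(commands):
    """Join inline commands with CRLF so they travel in a single request."""
    return "".join(command + "\r\n" for command in commands).encode("ascii")

def send_pipeline(commands, host="localhost", port=6379):
    """Send every command at once, then read the replies afterwards.

    Assumption: a server is reachable at host:port. One recv call is a
    simplification; a real client keeps reading until all replies arrive.
    """
    with socket.create_connection((host, port), timeout=2) as conn:
        conn.sendall(build_pipeline(commands))
        return conn.recv(4096)

payload = build_pipeline(["PING", "PING", "PING"])
print(payload)
```

The saving comes from paying the network round trip once for the whole batch instead of once per command.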