Redis introduction, plus a customized framework based on StackExchange.Redis, updated to use the singleton pattern and JSON configuration mapping with a Redis instance Group and Name concept.
An insight into NoSQL solutions implemented at RTV Slovenia and elsewhere, what problems we are trying to solve and an introduction to solving them with Redis.
Talk given at #wwwh @ Ljubljana, 30.1.2013 by me, Tit Petric
Redis is an open source in-memory database which is easy to use. This introductory presentation discusses several features and use cases: the data types are elaborated, and publish/subscribe and persistence are discussed, including client implementations in Node and Spring Boot. After this presentation, you will have a basic understanding of what Redis is and enough knowledge to get started with your first implementation!
Postgres & Redis Sitting in a Tree - Rimas Silkaitis, Heroku
Postgres and Redis Sitting in a Tree | In today’s world of polyglot persistence, it’s likely that companies will be using multiple data stores for storing and working with data based on the use case. Typically a company will
start with a relational database like Postgres and then add Redis for more high velocity use-cases. What if you could tie the two systems together to enable so much more?
Memcached or Redis? It's a question that nearly always arises in any discussion about squeezing more performance out of a modern, database-driven Web application. When performance needs to be improved, caching is often the first step employed, and Memcached and Redis are typically the first places to turn.
Provides an overview of Redis which is a Key Value NoSQL database and the different data types it supports. Also shows how to use Redis Client API from node.
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering - Erik Krogen
A presentation by CR Hota and Ekanth Sethuramalingam of Uber regarding their storage infrastructure, leveraging HDFS features like Router-Based Federation and Storage Tiering.
This is taken from the Apache Hadoop Contributors Meetup on January 30, hosted by LinkedIn in Mountain View.
Redis is a NoSQL technology that rides a fine line between database and in-memory cache. Redis also offers "remote data structures", which gives it a significant advantage over other in-memory databases. This session will cover several PHP clients for Redis, and how to use them for caching, data modeling and generally improving application throughput.
Background Tasks in Node - Evan Tahler, TaskRabbit
The talk gives an overview of some of the many ways you can perform background tasks in Node, which include: Foreground (in-line), Parallel (threaded-ish), Local Messages (fork-ish), Remote Messages, Remote Queues (Resque-ish), and Event Bus (Kafka-ish). For every section, we show an example and, more interestingly, note how Node makes every step better/faster/stronger... even the bad ideas! The idea for the talk came from a Twitter conversation with @dshaw, host of NodeUp, about how easy it was to have multiple Node workers in Node-Resque... Check out the presentation to learn how!
Hadoop Meetup Jan 2019 - Hadoop Encryption - Erik Krogen
A presentation by Wei-Chiu Chuang of Cloudera regarding the state of Hadoop encryption, with a particular eye towards the Key Management Service (KMS).
This is taken from the Apache Hadoop Contributors Meetup on January 30, hosted by LinkedIn in Mountain View.
Hadoop Meetup Jan 2019 - Mounting Remote Stores in HDFS - Erik Krogen
Virajith Jalaparti and Ashvin Agrawal of Microsoft present regarding their work to support mounting remote stores in HDFS. They show how HDFS can be used as a caching proxy to access remote stores such as ADLS and S3, enabling clients to be unaware of the location of their data, and increasing efficiency in the process.
This is taken from the Apache Hadoop Contributors Meetup on January 30, hosted by LinkedIn in Mountain View.
Tips to Drive MariaDB Cluster Performance for Nextcloud - Severalnines
Nextcloud requires a database to store administrative data for the platform. A poorly performing database can have a serious impact on performance and availability of Nextcloud. MariaDB Cluster is the recommended database backend for production installations that require high availability and performance.
This talk is a deep dive into how to design and optimize MariaDB Galera Cluster for Nextcloud. We will cover 5 tips on how to significantly improve performance and stability.
Agenda:
Overview of Nextcloud architecture
Database architecture design
Database proxy
MariaDB and InnoDB performance tuning
Nextcloud performance tuning
Q&A
ProxySQL - High Performance and HA Proxy for MySQL - René Cannaò
A high-availability proxy designed to solve real issues of MySQL setups, from small to very large production environments.
Presentation at Percona Live Amsterdam 2015
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby Node - Erik Krogen
Konstantin Shvachko and Chen Liang of LinkedIn team up with Chao Sun of Uber to present regarding the current state of and future plans for HDFS scalability, with an extended discussion on the newly introduced read-from-standby feature.
This is taken from the Apache Hadoop Contributors Meetup on January 30, hosted by LinkedIn in Mountain View.
Ravi Namboori - Hadoop & HDFS Architecture
HDFS Architecture: An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients.
The figure, presented by Cisco evangelist Ravi Namboori, explains all of this.
High Performance | Redis is popular among developers for its incredible performance, versatility and simplicity. The powerful combination of low-cost memory and high-performance Redis brings to life next-generation analytic uses, such as simultaneous real-time transaction and analytics processing. With Redis Labs' RLEC Flash on AWS SSD instances, you can get fantastic performance at up to 70% lower costs. Join this session to learn how next-generation Flash from leading memory provider Intel has made significant strides in performance while retaining its cost advantage over memory. Using a combination of AWS' powerful SSD instances and Redis Labs' RLEC Flash, you can achieve up to 3M ops/sec at sub-millisecond latencies with a combination of RAM and Flash. The session will also feature customer use cases from a large university, a large customer engagement company and a pioneer of online flash sales. Session sponsored by Redis Labs.
This session shows an overview of the features and architecture of SQL Server on Linux and Containers. It covers install, config, performance, security, HADR, Docker containers, and tools. Find the demos on http://aka.ms/bobwardms
Data processing at the speed of 100 Gbps @ Apache Crail (Incubating) - DataWorks Summit
Once the staple of HPC clusters, today high-performance network and storage devices are everywhere. For a fraction of the cost, one can rent 40/100 Gbps RDMA networks and high-end NVMe flash devices supporting tens of GB/s of bandwidth, less than 100 microseconds of latency, and millions of IOPS. How does one leverage this phenomenal performance for popular data processing frameworks such as Apache Spark, Flink, and Hadoop that we all know and love?
In this talk, I will introduce Apache Crail (Incubating), which is a fast, distributed data store designed specifically for high-performance network and storage devices. The goal of the project is to deliver the true hardware performance to Apache data processing frameworks in the most accessible way. With its modular design, Crail supports multiple storage back ends (DRAM, NVMe Flash, and 3D XPoint) and networking protocols (RDMA and TCP/sockets). Crail provides multiple flexible APIs (file system, KV, HDFS, streaming) for better integration with the high-level data access operations in Apache compute frameworks. As a result, on a 100 Gbps network infrastructure, Crail delivers all-to-all shuffle operations at 80+ Gbps, broadcast operations at less than 10 usec latencies, and more than 8M lookups per namenode. Moreover, Crail is a generic solution that integrates well with the Apache ecosystem, including frameworks like Spark, Hadoop, Hive, etc.
I will present the case for Crail, its current status, and future plans. As Crail is a young Apache project, we are seeking to build a community and expand its application to other interesting domains.
Speaker
Animesh Trivedi, IBM Research, Research Staff Member (RSM)
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam... - HostedbyConfluent
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam K Dey | Current 2022
Robinhood’s mission is to democratize finance for all. Data-driven decision making is key to achieving this goal. The data needed is hosted in various OLTP databases. Replicating this data in near real time, in a reliable fashion, to the data lakehouse powers many critical use cases for the company. At Robinhood, CDC is not only used for ingestion to the data lake but is also being adopted for inter-system message exchanges between different online microservices.
In this talk, we will describe the evolution of change data capture based ingestion at Robinhood, not only in terms of the scale of data stored and queries made, but also the use cases that it supports. We will go in depth into the CDC architecture built around our Kafka ecosystem using the open source systems Debezium and Apache Hudi. We will cover online inter-system message exchange use cases, along with our experience running this service at scale at Robinhood and lessons learned.
What's new with enterprise Redis - Leena Joshi, Redis Labs
Redis Labs manages over 160k HA databases and 10k clustered databases without data loss, in spite of one node failure a day and one data center outage per month. Using Redis Labs Enterprise Cluster (RLEC), Redis Labs delivers seamless zero-downtime scaling, true high availability with persistence, cross-rack/zone/datacenter replication and instant automatic failover. Learn how. Join this session for a deep dive into how enterprise Redis makes for no-hassle Redis deployments and the roadmap for new Redis capabilities. Discover new cost savings with Redis on Flash for cost-effective high-performance operations and analytics.
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac... - HostedbyConfluent
Apache Hudi is a data lake platform that provides streaming primitives (upserts/deletes/change streams) on top of data lake storage. Hudi powers very large data lakes at Uber, Robinhood and other companies, while being pre-installed on four major cloud platforms.
Hudi supports exactly-once, near real-time data ingestion from Apache Kafka to cloud storage, which is typically used in place of an S3/HDFS sink connector to gain transactions and mutability. While this approach is scalable and battle-tested, it can only ingest data in mini batches, leading to lower data freshness. In this talk, we introduce a Kafka Connect Sink Connector for Apache Hudi, which writes data straight into Hudi's log format, making the data immediately queryable, while Hudi's table services like indexing, compaction, and clustering work behind the scenes to further re-organize for better query performance.
5 Factors When Selecting a High Performance, Low Latency Database - ScyllaDB
There are hundreds of possible databases you can choose from today. Yet if you draw up a short list of critical criteria related to performance and scalability for your use case, the field of choices narrows and your evaluation decision becomes much easier.
In this session, we’ll explore 5 essential factors to consider when selecting a high performance low latency database, including options, opportunities, and tradeoffs related to software architecture, hardware utilization, interoperability, RASP, and Deployment.
OUG Scotland 2014 - NoSQL and MySQL - The best of both worlds - Andrew Morgan
Understand how you can get the benefits you're looking for from NoSQL data stores without sacrificing the power and flexibility of the world's most popular open source database - MySQL.
GCPUG.TW Meetup #25 - ASP.NET Core with GCP - Chen-Tien Tsai
Introduces ASP.NET Core and shares how to host an ASP.NET Core application on GCP with GCE, GAE and GKE.
[DEMO Code]
https://github.com/blackie1019/GCPUG-Meetup-Demo
[Blackie]
A solution architect interested in .NET, JavaScript and coding with excellent architecture.
[Blogs]
http://blackie1019.github.io
[Related Posts]
- [Blackie's Failed Notes - Google Cloud Platform]
http://blackie1019.github.io/categories/Google-Cloud-Platform/
- [Blackie's Failed Notes - .NET Core and ASP.NET Core Special Column]
http://blackie1019.github.io/dotnet/
[Slide Download]
https://drive.google.com/open?id=0ByZH69bRVHlzUDExUTEtTV81MUk
3. Who am I
Blackie Tsai
Senior IT consultant at Xuenn
Full-stack developer
Focused on developing low-latency, highly concurrent real-time transaction systems
Learning CI/CD and running with Agile & Lean
Blog
http://www.dotblogs.com.tw/blackie1019
5. Redis
Redis - a.k.a. Remote Dictionary Server. It is an open source (BSD licensed) NoSQL key-value,
in-memory data structure store, used as a database, cache and message broker. It also supports atomic
operations.
Most Popular NoSQL (http://techstacks.io/)
Linux is recommended for deployment.
Features
Pure
Simple
Single Thread
In-memory but persistent on disk database
Remote dictionary server
3.0.4 is the latest stable version.
Open-Source Database Redis: Real-World Experience Revealed (開源資料庫Redis實戰經驗大公開)
7. https://clusterhq.com/assets/pdfs/state-of-container-usage-june-2015.pdf
Over 70% would like to run a database or other stateful service in their
container environments, with MySQL and Redis the two leading choices
Important features for data management in container solutions were:
Integration of data management capabilities into existing container workflows and tools
Seamless movement of data between dev, test and production environments.
8. Who is using Redis
Facebook’s Instagram: Making the Switch to Cassandra from Redis, a 75% ‘Insta’ Savings
http://techstacks.io/tech/redis, http://redis.io/topics/whos-using-redis
10. Redis - Data Persistence
Server A runs the master Redis instance; Server B runs the slave Redis instance, which handles persistence to disk. Replication flows from master to slave.
IMPLEMENTING PERSISTENCE IN REDIS
• Master instance with no persistence
• Slave instance with AOF enabled
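A minimal sketch of the two configuration files behind this setup (the hostname, port, and file names are illustrative, not from the talk):

```conf
# redis-master.conf - master: no persistence, dedicated to serving clients
save ""
appendonly no

# redis-slave.conf - slave: replicates the master and owns the disk I/O
slaveof 192.168.0.10 6379
appendonly yes
appendfsync everysec
```

With this split, the master performs no background disk operations, and the slave's AOF can be used to restore the master after a disaster.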
12. Redis - Insight of Pitfalls
Server-side session with Redis
Redis has many eviction policies, but most of them are based on 'sampling'.
Alternative Solution
Use a database as another back end
Use Redis 3.0
Maximize CPU usage
Redis is single-threaded; one instance usually uses only one CPU
Redis, another step on the road
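The eviction behavior mentioned above is driven by a few settings; a sketch with illustrative values:

```conf
maxmemory 2gb
# approximate LRU among keys that have an expire set
maxmemory-policy volatile-lru
# eviction compares this many randomly sampled keys, not all keys
maxmemory-samples 5
```

Because eviction samples keys rather than scanning them all, server-side sessions can be evicted earlier or later than a strict LRU would predict.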
16. Stack Exchange
The world’s largest programming
community is growing
Stack Exchange is a network of 130+ Q&A communities
including Stack Overflow
Global traffic ranking 54th largest website
18. Stack Exchange - Info
Stack Overflow still uses Microsoft products.
Stack Overflow still uses a scale-up strategy with HA.
SQL Servers loaded with 384 GB of RAM and 2TB of SSD.
Stats
4 million users, 8 million questions, 40 million answers, 560 million pageviews a month.
Peak is more like 2600-3000 requests/sec on most weekdays.
25 servers, Stack Overflow has a 40:60 read-write ratio.
2 TB of SQL data all stored on SSDs, Each web server has 2x 320GB SSDs in a RAID 1.
DB servers average 10% CPU utilization, 11 web servers, using IIS.
2 load balancers, 1 active, using HAProxy
4 active database nodes, using MS SQL
2 machines for distributed cache and messaging using Redis
2 read-only SQL Servers used mainly for the Stack Exchange API
3 machines doing search with ElasticSearch
19. Stack Exchange - Caching
Caching
Cache all the things.
5 levels of caches.
1st: Caching in the browser, CDN, and proxies.
2nd: Using HttpRuntime.Cache, an in-memory, per-server cache.
3rd: Redis.
4th: SQL Server cache.
5th: SSD.
20. Stack Exchange - Lessons Learned
Why use Redis if you use MS products?
gabeech: It's not about OS evangelism. We run things on the platform they run best on. Period. C# runs best on a
windows machine, we use IIS. Redis runs best on a *nix machine we use *nix.
Overkill as a strategy
SSDs Rock
Know your read/write workload
Keeping things very efficient means new machines are not needed often
Don’t be afraid to specialize
Do only what needs to be done
Reinvention is OK
Go down to the bare metal
No bureaucracy.
Garbage collection driven programming
The cost of inefficient code can be higher than you think
21. StackExchange.Redis
Basic Usage - getting started and basic usage
Configuration - options available when connecting to redis
Pipelines and Multiplexers - what is a multiplexer?
Keys, Values and Channels - discusses the data-types used on the API
Transactions - how atomic transactions work in redis
Events - the events available for logging / information purposes
Pub/Sub Message Order - advice on sequential and concurrent processing
Scripting - running Lua scripts with convenient named parameter replacement
https://github.com/StackExchange/StackExchange.Redis
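A sketch of the Basic Usage pattern those docs describe, assuming a Redis server reachable on localhost (the key and value here are made up):

```csharp
using System;
using StackExchange.Redis;

class BasicUsage
{
    static void Main()
    {
        // The multiplexer is designed to be created once and shared
        ConnectionMultiplexer redis = ConnectionMultiplexer.Connect("localhost");
        IDatabase db = redis.GetDatabase();

        // Simple string set/get against the default database
        db.StringSet("demo:key", "hello");
        string value = db.StringGet("demo:key");
        Console.WriteLine(value);
    }
}
```

This requires the StackExchange.Redis package and a running server, so treat it as an outline rather than a drop-in program.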
25. RedisDemo
Features
Connection Mapping with Configuration
Configuration with Redis Instance Group and Name concept supported
Singleton pattern avoids resource waste
Dependency
StackExchange.Redis
FX.Configuration
Newtonsoft.Json
Log4Net(Optional)
Demo Version
https://github.com/blackie1019/RedisDemo
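The features above might be sketched like this; note this illustrates the pattern only, not the actual RedisDemo API, and it assumes the StackExchange.Redis and Newtonsoft.Json packages plus a reachable server:

```csharp
using System;
using Newtonsoft.Json;
using StackExchange.Redis;

public static class RedisStore
{
    // Lazy<T> gives a thread-safe singleton: one multiplexer per process,
    // avoiding the resource waste of reconnecting on every request
    private static readonly Lazy<ConnectionMultiplexer> LazyConnection =
        new Lazy<ConnectionMultiplexer>(() =>
            ConnectionMultiplexer.Connect("localhost:6379"));

    public static ConnectionMultiplexer Connection
    {
        get { return LazyConnection.Value; }
    }

    // Values are stored as JSON so any serializable type can be cached
    public static void SetJson<T>(string key, T value)
    {
        Connection.GetDatabase().StringSet(key, JsonConvert.SerializeObject(value));
    }

    public static T GetJson<T>(string key)
    {
        return JsonConvert.DeserializeObject<T>(Connection.GetDatabase().StringGet(key));
    }
}
```

In the real framework the connection string would come from the JSON configuration, keyed by the instance Group and Name rather than hard-coded.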
30. Redis Data Management
Redis Desktop Manager - a cross-platform desktop management GUI for macOS, Windows, Debian and
Ubuntu.
http://redisdesktop.com
32. Redis Server with Docker
Enable Virtualization Technology on Bios and install Docker Toolbox
Create a Docker container for Redis
Run the service
Create your web application container
If there is any problem, you can remove it and set it up again
docker-machine rm default
docker-machine --native-ssh create -d virtualbox default
Dockerizing a Redis service
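The steps above might look like this in practice (the container names and image tags are illustrative):

```shell
# create and start a Redis container, exposing the default port
docker run -d --name my-redis -p 6379:6379 redis:3.0

# verify the service responds
docker exec my-redis redis-cli ping

# run your web application container, linked to the Redis container
docker run -d --name my-web --link my-redis:redis my-web-image
```

These commands require a running Docker daemon (e.g. the Docker Toolbox VM created with docker-machine above).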
33. Reference
Scaling Stack Overflow (QCon NYC 2015)
Redis, another step on the road
Types of NoSQL databases
StackOverflow Update: 560M Pageviews A Month, 25 Servers, And It's All About Performance
Redis 设计与实现 (Redis: Design and Implementation)
《Redis 设计与实现》图片集 (figure collection for Redis: Design and Implementation)
Editor's Notes
The KISS Principle (the principle of simplicity)
The philosophy of "less is more". KISS is an acronym for the design principle
"Keep it simple, Stupid!", also rendered as
"keep it short and simple" or
"keep it simple and straightforward".
Purity
Written in ANSI C.
Almost no dependence on third-party libraries.
memcached uses libevent, which has a large codebase.
Redis implemented its own epoll event loop, referencing libevent.
The KISS principle
Each data structure is responsible only for what it should do.
Simplicity
No map-reduce.
No indexes.
No vector clocks.
Single-threaded
No thread context switches.
No thread race conditions.
No other complicated conditions.
An in-memory database, but data can be persisted to disk
Data is operated on in memory.
Data can be stored permanently on disk.
Not just a cache server
Queue
Statistics from DevOps.com & ClusterHQ.com
This report is based on the current and planned container usage patterns of 285 respondents. The survey was conducted over the latter half of May 2015.
Consider a setup as shown in the preceding image; that is:
Master instance with no persistence
Slave instance with AOF enabled
In this case, the master does not need to perform any background disk operations and is fully dedicated to serve client requests, except for a trivial slave connection. The slave server configured with AOF performs the disk operations. As mentioned before, this file can be used to restore the master in case of a disaster.
Persistence in Redis is a matter of configuration, balancing the trade-off between performance, disk I/O, and data durability. If you are looking for more information on persistence in Redis, you will find the article by Salvatore Sanfilippo at http://oldblog.antirez.com/post/redis-persistence-demystified.html interesting.
Teams:
SRE (System Reliability Engineering): 5 people
Core Dev (Q&A site): ~6-7 people
Core Dev Mobile: 6 people
Careers team that does development solely for the SO Careers product: 7 people
1st: Caching in the browser, CDN, and proxies.
2nd: Using HttpRuntime.Cache. An in-memory, per server cache.
3rd: Redis.
4th: SQL Server Cache.
5th: SSD.
For example, every help page is cached. Code to access a page is very terse:
Static methods and static classes are used. Really bad from an OOP perspective, but really fast and really friendly towards terse code. All code is directly addressed.
Caching is handled by a library layer of Redis and Dapper, a micro ORM.
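The caching layer described above can be sketched in C# with StackExchange.Redis and Newtonsoft.Json. This is a simplified cache-aside illustration, not Stack Overflow's actual code; the `GetOrSet` helper, the connection string, and the key format are all assumptions:

```csharp
// Illustrative cache-aside sketch; not Stack Overflow's actual library.
using System;
using StackExchange.Redis;
using Newtonsoft.Json;

public static class Cache
{
    // Single shared multiplexer, as the StackExchange.Redis docs recommend
    private static readonly Lazy<ConnectionMultiplexer> Muxer =
        new Lazy<ConnectionMultiplexer>(
            () => ConnectionMultiplexer.Connect("localhost:6379"));

    public static T GetOrSet<T>(string key, Func<T> load, TimeSpan ttl)
        where T : class
    {
        IDatabase db = Muxer.Value.GetDatabase();

        string cached = db.StringGet(key);
        if (cached != null)
            return JsonConvert.DeserializeObject<T>(cached);

        T value = load();  // e.g. a Dapper query against SQL Server
        db.StringSet(key, JsonConvert.SerializeObject(value), ttl);
        return value;
    }
}
```

On a hit, the JSON payload is deserialized from Redis; on a miss, the loader runs once and the result is cached with a TTL, so the next request is served from memory.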
To get around garbage collection problems, only one copy of each class used in templates is created and kept in a cache. Everything is measured, including GC operations; from statistics it is known that layers of indirection increase GC pressure to the point of noticeable slowness.
CDN hits vary; since the query-string hash is based on file content, a file is only re-fetched on a build. It's typically 30-50 million hits a day for 300 to 600 GB of bandwidth.
A CDN is not used for CPU or I/O load, but to help users find answers faster.
Why use Redis if you use MS products?
gabeech: It's not about OS evangelism. We run things on the platform they run best on. Period. C# runs best on a Windows machine, so we use IIS. Redis runs best on a *nix machine, so we use *nix.
Overkill as a strategy.
Nick Craver on why their network is over-provisioned: Is 20 Gb massive overkill? You bet your ass it is; the active SQL servers average around 100-200 Mb out of that 20 Gb pipe. However, things like backups and rebuilds can completely saturate it due to how much memory and SSD storage is present, so it does serve a purpose.
SSDs Rock.
The database nodes all use SSDs, and the average write time is 0 milliseconds.
Know your read/write workload.
Keeping things very efficient means new machines are not needed often.
Only when a new project comes along that needs different hardware for some reason is new hardware added. Typically memory is added, but other than that, efficient code and low utilization mean machines don't need replacing. So it's typically a matter of adding a) SSDs for more space, or b) new hardware for new projects.
Don’t be afraid to specialize.
SO uses complicated queries based on tags, which is why a specialized Tag Engine was developed.
Do only what needs to be done.
Tests weren't necessary because an active community did the acceptance testing for them. Add projects only when required. Add a line of code only when necessary. You Ain't Gonna Need It (YAGNI) really works.
Reinvention is OK.
Typical advice is don’t reinvent the wheel, you’ll just make it worse, by making it square, for example. At SO they don't worry about making a "Square Wheel". If developers can write something more lightweight than an already developed alternative, then go for it.
Go down to the bare metal.
Go into the IL (the assembly language of .NET). Some coding is done in IL, not C#. Look at SQL query plans. Take memory dumps of the web servers to see what is actually going on. They discovered, for example, that a Split call generated 2 GB of garbage.
No bureaucracy.
There’s always some tools your team needs. For example, an editor, the most recent version of Visual Studio, etc. Just make it happen without a lot of process getting in the way.
Garbage collection driven programming.
SO goes to great lengths to reduce garbage collection costs, skipping practices like TDD, avoiding layers of abstraction, and using static methods. While extreme, the result is highly performant code. When you're handling hundreds of millions of objects in a short window, you can actually measure pauses in the app domain while the GC runs. These have a pretty decent impact on request performance.
The cost of inefficient code can be higher than you think.
Efficient code stretches hardware further, reduces power usage, and makes code easier for programmers to understand.
Renaming Commands
A slightly unusual feature of Redis is that you can disable and/or rename individual commands. As per the previous example, this is done via the CommandMap.
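With StackExchange.Redis, for example, dangerous administrative commands can be disabled (or mapped to renamed server-side commands) when building the multiplexer; a sketch, where the connection string and the renamed command are placeholders:

```csharp
using System.Collections.Generic;
using StackExchange.Redis;

var options = ConfigurationOptions.Parse("localhost:6379");

// Disable commands entirely on this connection:
options.CommandMap = CommandMap.Create(
    new HashSet<string> { "FLUSHDB", "FLUSHALL", "KEYS" },
    available: false);

// ...or map client command names to commands renamed on the server
// (e.g. via rename-command in redis.conf):
// options.CommandMap = CommandMap.Create(
//     new Dictionary<string, string> { ["CONFIG"] = "renamed-config" });

var muxer = ConnectionMultiplexer.Connect(options);
```

Calling a disabled command on this connection then fails client-side, before anything reaches the server.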
Twemproxy
A tool that allows multiple Redis instances to be used as though they were a single server, with built-in sharding and fault tolerance (much like Redis Cluster, but implemented separately).
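A minimal twemproxy (nutcracker) configuration sketch, pooling two local Redis instances behind a single port; the pool name, addresses, and timeouts are illustrative:

```yaml
# nutcracker.yml - illustrative pool definition
redis_pool:
  listen: 127.0.0.1:22121
  hash: fnv1a_64
  distribution: ketama
  redis: true
  auto_eject_hosts: true
  server_retry_timeout: 30000
  server_failure_limit: 3
  servers:
    - 127.0.0.1:6379:1
    - 127.0.0.1:6380:1
```

Clients connect to port 22121 as if it were one Redis server; twemproxy shards keys across the listed instances and temporarily ejects hosts that keep failing.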