This document summarizes Marian Marinov's testing and experience with various distributed filesystems including CephFS, GlusterFS, MooseFS, OrangeFS, and BeeGFS. Some key findings are:
- CephFS requires significant resources but lacks redundancy for small clusters. GlusterFS offers redundancy but can have high CPU usage.
- MooseFS and OrangeFS were easy to setup but MooseFS offered better reliability and stats.
- Performance testing found MooseFS and NFS+Ceph to have better small file creation times than GlusterFS and OrangeFS. Network latency was identified as a major factor impacting distributed filesystem performance.
- Tuning efforts focused on NFS
As more workloads move to severless-like environments, the importance of properly handling downscaling increases. While recomputing the entire RDD makes sense for dealing with machine failure, if your nodes are more being removed frequently, you can end up in a seemingly loop-like scenario, where you scale down and need to recompute the expensive part of your computation, scale back up, and then need to scale back down again.
Even if you aren’t in a serverless-like environment, preemptable or spot instances can encounter similar issues with large decreases in workers, potentially triggering large recomputes. In this talk, we explore approaches for improving the scale-down experience on open source cluster managers, such as Yarn and Kubernetes-everything from how to schedule jobs to location of blocks and their impact (shuffle and otherwise).
DTrace and SystemTap are dynamic tracing frameworks available for Solaris and Linux respectively. This session will give an overview of the static DTrace probes available in both Drizzle and MySQL and show numerous examples of scripts that utilize these probes. Mixing dynamic and static probes will also be discussed.
As more workloads move to severless-like environments, the importance of properly handling downscaling increases. While recomputing the entire RDD makes sense for dealing with machine failure, if your nodes are more being removed frequently, you can end up in a seemingly loop-like scenario, where you scale down and need to recompute the expensive part of your computation, scale back up, and then need to scale back down again.
Even if you aren’t in a serverless-like environment, preemptable or spot instances can encounter similar issues with large decreases in workers, potentially triggering large recomputes. In this talk, we explore approaches for improving the scale-down experience on open source cluster managers, such as Yarn and Kubernetes-everything from how to schedule jobs to location of blocks and their impact (shuffle and otherwise).
DTrace and SystemTap are dynamic tracing frameworks available for Solaris and Linux respectively. This session will give an overview of the static DTrace probes available in both Drizzle and MySQL and show numerous examples of scripts that utilize these probes. Mixing dynamic and static probes will also be discussed.
Are your Oracle databases highly available? You have deployed Real Application Clusters (RAC), Data Guard, or Failover Clusters and are well protected against server failures? Great – the prerequisites for a highly available environment are given. However, to assure that backend infrastructure failures also remain transparent to the client, an appropriate configuration is a prerequisite.
This lecture will discuss the Oracle technologies that can be used to achieve automatic client failover functionality. What are the advantages, but also the limitations of these technologies?
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookThe Hive
This presentation describes the reasons why Facebook decided to build yet another key-value store, the vision and architecture of RocksDB and how it differs from other open source key-value stores. Dhruba describes some of the salient features in RocksDB that are needed for supporting embedded-storage deployments. He explains typical workloads that could be the primary use-cases for RocksDB. He also lays out the roadmap to make RocksDB the key-value store of choice for highly-multi-core processors and RAM-speed storage devices.
This ppt was used by Devrim at pgDay Asia 2017. He talked about some important facts about WAL - Transaction Logs or xlogs in PostgreSQL. Some of these can really come handy on a bad day
Meta/Facebook's database serving social workloads is running on top of MyRocks (MySQL on RocksDB). This means our performance and reliability depends a lot on RocksDB. Not just MyRocks, but also we have other important systems running on top of RocksDB. We have learned many lessons from operating and debugging RocksDB at scale.
In this session, we will offer an overview of RocksDB, key differences from InnoDB, and share a few interesting lessons learned from production.
Josh Berkus
You've heard that PostgreSQL is the highest-performance transactional open source database, but you're not seeing it on YOUR server. In fact, your PostgreSQL application is kind of poky. What should you do? While doing advanced performance engineering for really high-end systems takes years to learn, you can learn the basics to solve performance issues for 80% of PostgreSQL installations in less than an hour. In this session, you will learn: -- The parts of database application performance -- The performance setup procedure -- Basic troubleshooting tools -- The 13 postgresql.conf settings you need to know -- Where to look for more information.
Video: https://www.youtube.com/watch?v=FJW8nGV4jxY and https://www.youtube.com/watch?v=zrr2nUln9Kk . Tutorial slides for O'Reilly Velocity SC 2015, by Brendan Gregg.
There are many performance tools nowadays for Linux, but how do they all fit together, and when do we use them? This tutorial explains methodologies for using these tools, and provides a tour of four tool types: observability, benchmarking, tuning, and static tuning. Many tools will be discussed, including top, iostat, tcpdump, sar, perf_events, ftrace, SystemTap, sysdig, and others, as well observability frameworks in the Linux kernel: PMCs, tracepoints, kprobes, and uprobes.
This tutorial is updated and extended on an earlier talk that summarizes the Linux performance tool landscape. The value of this tutorial is not just learning that these tools exist and what they do, but hearing when and how they are used by a performance engineer to solve real world problems — important context that is typically not included in the standard documentation.
Best Practices for Becoming an Exceptional Postgres DBA EDB
Drawing from our teams who support hundreds of Postgres instances and production database systems for customers worldwide, this presentation provides real-real best practices from the nation's top DBAs. Learn top-notch monitoring and maintenance practices, get resource planning advice that can help prevent, resolve, or eliminate common issues, learning top database tuning tricks for increasing system performance and ultimately, gain greater insight into how to improve your effectiveness as a DBA.
PostgreSQL and ZFS were made for each other. This talk dives downstack into the internals and way that PostgreSQL consumes disk resources and tricks that are available if you run PostgreSQL on ZFS (ZFS on Linux, ZFS on FreeBSD, or ZFS on Illumos). Topics covered will include:
* Performance and sizing considerations
* Workload estimation heuristics
* Standard administrative practices that leverage ZFS
* Recovery using ZFS
* Performing database migrations using ZFS
"Extended" or "Stretched" Oracle RAC has been available as a concept for a while. Oracle RAC 12c Release 2 introduces an Oracle Extended Cluster configuration, in which the cluster understands the concept of sites and extended setups. This knowledge is used to more efficiently manage "Extended Oracle RAC", whether the nodes are 0.1 mile or 10 miles apart.
The presentation was last updated on August 7th 2017 to add a reference to the new MAA White Paper: "Installing Oracle Extended Clusters on Exadata Database Machine" - http://www.oracle.com/technetwork/database/availability/maa-extclusters-installguide-3748227.pdf and to correct some minor details.
Linux 4.x Tracing: Performance Analysis with bcc/BPFBrendan Gregg
Talk about bcc/eBPF for SCALE15x (2017) by Brendan Gregg. "BPF (Berkeley Packet Filter) has been enhanced in the Linux 4.x series and now powers a large collection of performance analysis and observability tools ready for you to use, included in the bcc (BPF Complier Collection) open source project. BPF nowadays can do system tracing, software defined networks, and kernel fast path: much more than just filtering packets! This talk will focus on the bcc/BPF tools for performance analysis, which make use of other built in Linux capabilities: dynamic tracing (kprobes and uprobes) and static tracing (tracepoints and USDT). There are now bcc tools for measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more. These lead to performance wins large and small, especially when instrumenting areas that previously had zero visibility. Tracing superpowers have finally arrived, built in to Linux."
Are your Oracle databases highly available? You have deployed Real Application Clusters (RAC), Data Guard, or Failover Clusters and are well protected against server failures? Great – the prerequisites for a highly available environment are given. However, to assure that backend infrastructure failures also remain transparent to the client, an appropriate configuration is a prerequisite.
This lecture will discuss the Oracle technologies that can be used to achieve automatic client failover functionality. What are the advantages, but also the limitations of these technologies?
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookThe Hive
This presentation describes the reasons why Facebook decided to build yet another key-value store, the vision and architecture of RocksDB and how it differs from other open source key-value stores. Dhruba describes some of the salient features in RocksDB that are needed for supporting embedded-storage deployments. He explains typical workloads that could be the primary use-cases for RocksDB. He also lays out the roadmap to make RocksDB the key-value store of choice for highly-multi-core processors and RAM-speed storage devices.
This ppt was used by Devrim at pgDay Asia 2017. He talked about some important facts about WAL - Transaction Logs or xlogs in PostgreSQL. Some of these can really come handy on a bad day
Meta/Facebook's database serving social workloads is running on top of MyRocks (MySQL on RocksDB). This means our performance and reliability depends a lot on RocksDB. Not just MyRocks, but also we have other important systems running on top of RocksDB. We have learned many lessons from operating and debugging RocksDB at scale.
In this session, we will offer an overview of RocksDB, key differences from InnoDB, and share a few interesting lessons learned from production.
Josh Berkus
You've heard that PostgreSQL is the highest-performance transactional open source database, but you're not seeing it on YOUR server. In fact, your PostgreSQL application is kind of poky. What should you do? While doing advanced performance engineering for really high-end systems takes years to learn, you can learn the basics to solve performance issues for 80% of PostgreSQL installations in less than an hour. In this session, you will learn: -- The parts of database application performance -- The performance setup procedure -- Basic troubleshooting tools -- The 13 postgresql.conf settings you need to know -- Where to look for more information.
Video: https://www.youtube.com/watch?v=FJW8nGV4jxY and https://www.youtube.com/watch?v=zrr2nUln9Kk . Tutorial slides for O'Reilly Velocity SC 2015, by Brendan Gregg.
There are many performance tools nowadays for Linux, but how do they all fit together, and when do we use them? This tutorial explains methodologies for using these tools, and provides a tour of four tool types: observability, benchmarking, tuning, and static tuning. Many tools will be discussed, including top, iostat, tcpdump, sar, perf_events, ftrace, SystemTap, sysdig, and others, as well observability frameworks in the Linux kernel: PMCs, tracepoints, kprobes, and uprobes.
This tutorial is updated and extended on an earlier talk that summarizes the Linux performance tool landscape. The value of this tutorial is not just learning that these tools exist and what they do, but hearing when and how they are used by a performance engineer to solve real world problems — important context that is typically not included in the standard documentation.
Best Practices for Becoming an Exceptional Postgres DBA EDB
Drawing from our teams who support hundreds of Postgres instances and production database systems for customers worldwide, this presentation provides real-real best practices from the nation's top DBAs. Learn top-notch monitoring and maintenance practices, get resource planning advice that can help prevent, resolve, or eliminate common issues, learning top database tuning tricks for increasing system performance and ultimately, gain greater insight into how to improve your effectiveness as a DBA.
PostgreSQL and ZFS were made for each other. This talk dives downstack into the internals and way that PostgreSQL consumes disk resources and tricks that are available if you run PostgreSQL on ZFS (ZFS on Linux, ZFS on FreeBSD, or ZFS on Illumos). Topics covered will include:
* Performance and sizing considerations
* Workload estimation heuristics
* Standard administrative practices that leverage ZFS
* Recovery using ZFS
* Performing database migrations using ZFS
"Extended" or "Stretched" Oracle RAC has been available as a concept for a while. Oracle RAC 12c Release 2 introduces an Oracle Extended Cluster configuration, in which the cluster understands the concept of sites and extended setups. This knowledge is used to more efficiently manage "Extended Oracle RAC", whether the nodes are 0.1 mile or 10 miles apart.
The presentation was last updated on August 7th 2017 to add a reference to the new MAA White Paper: "Installing Oracle Extended Clusters on Exadata Database Machine" - http://www.oracle.com/technetwork/database/availability/maa-extclusters-installguide-3748227.pdf and to correct some minor details.
Linux 4.x Tracing: Performance Analysis with bcc/BPFBrendan Gregg
Talk about bcc/eBPF for SCALE15x (2017) by Brendan Gregg. "BPF (Berkeley Packet Filter) has been enhanced in the Linux 4.x series and now powers a large collection of performance analysis and observability tools ready for you to use, included in the bcc (BPF Complier Collection) open source project. BPF nowadays can do system tracing, software defined networks, and kernel fast path: much more than just filtering packets! This talk will focus on the bcc/BPF tools for performance analysis, which make use of other built in Linux capabilities: dynamic tracing (kprobes and uprobes) and static tracing (tracepoints and USDT). There are now bcc tools for measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more. These lead to performance wins large and small, especially when instrumenting areas that previously had zero visibility. Tracing superpowers have finally arrived, built in to Linux."
Gluster for Geeks: Performance Tuning Tips & TricksGlusterFS
In this Gluster for Geeks technical webinar, Jacob Shucart, Senior Systems Engineer, will provide useful tips and tricks to make a Gluster cluster meet your performance requirements. He will review considerations for all different phases including planning, configuration, implementation, tuning, and benchmarking.
Topics covered will include:
• Protocols (CIFS, NFS, GlusterFS)
• Hardware configuration
• Tuning parameters
• Performance benchmarks
Find Site Performance from the server to WordPress. A look at how some good performance gains can be made in tuning MySQL and APC and getting the most of out W3 Total Cache.
Slide chia sẻ công nghệ về caching, thông qua slide này bạn sẽ trả lời được những câu hỏi như:
- Caching là gì
- Làm sao sử dụng cũng như xây dựng hệ thống caching
- Tại sao cache giúp tăng tốc ứng dụng lên vài chục, vài trăm lần
- Các hệ thống lớn của Facebook, Twitter, ... đang sử dụng cache thế nào
- ...
Slide chia sẻ về công nghệ về caching, thông qua slide này bạn sẽ trả lời được những câu hỏi như:
- Caching là gì
- Làm sao sử dụng cũng như xây dựng hệ thống caching
- Tại sao cache giúp tăng tốc ứng dụng lên vài chục, vài trăm lần
- Các hệ thống lớn của Facebook, Twitter, ... đang sử dụng cache thế nào
- ...
hbaseconasia2017: Large scale data near-line loading method and architectureHBaseCon
Shuaifeng Zhou
When we do real-time data loading to HBase, we use put/putlist interface. After receiving put request, regionserver will write WAL, write data into memory store, flush memory store to disk-store, then compact files again and again. That precedure occupies too much resource and causing read/write performance decrease. To solve the problem, we provide a kind of near-line loading method and architecture, greatly increase the loading bandwidth, and decrease the influence to read operations.
hbaseconasia2017 hbasecon hbase https://www.eventbrite.com/e/hbasecon-asia-2017-tickets-34935546159#
This presentation provides a basic overview of Ceph, upon which SUSE Storage is based. It discusses the various factors and trade-offs that affect the performance and other functional and non-functional properties of a software-defined storage (SDS) environment.
Windows Server 2012 R2 Software-Defined StorageAidan Finn
In this presentation I taught attendees how to build a Scale-Out File Server (SOFS) using Windows Server 2012 R2, JBODs, Storage Spaces, Failover Clustering, and SMB 3.0 Networking, suitable for storing application data such as Hyper-V and SQL Server.
Administering a Hadoop cluster isn't easy. Many Hadoop clusters suffer from Linux configuration problems that can negatively impact performance. With vast and sometimes confusing config/tuning options, it can can tempting (and scary) for a cluster administrator to make changes to Hadoop when cluster performance isn't as expected. Learn how to improve Hadoop cluster performance and eliminate common problem areas, applicable across use cases, using a handful of simple Linux configuration changes.
How to implement PassKeys in your applicationMarian Marinov
PassKeys is relatively new way of authentication. This presentation aims to provide a bit of guidance on how you can implement them in your own application.
Management of system administrators and devops teams is different then managing Developers.
This presentation shows key differences and what to worry about :)
MySQL security is not trivial. This presentation will walk you trough some of the more important decisions you have to take, when configuring a MySQL server instance
The French Revolution, which began in 1789, was a period of radical social and political upheaval in France. It marked the decline of absolute monarchies, the rise of secular and democratic republics, and the eventual rise of Napoleon Bonaparte. This revolutionary period is crucial in understanding the transition from feudalism to modernity in Europe.
For more information, visit-www.vavaclasses.com
How to Make a Field invisible in Odoo 17Celine George
It is possible to hide or invisible some fields in odoo. Commonly using “invisible” attribute in the field definition to invisible the fields. This slide will show how to make a field invisible in odoo 17.
Unit 8 - Information and Communication Technology (Paper I).pdfThiyagu K
This slides describes the basic concepts of ICT, basics of Email, Emerging Technology and Digital Initiatives in Education. This presentations aligns with the UGC Paper I syllabus.
Synthetic Fiber Construction in lab .pptxPavel ( NSTU)
Synthetic fiber production is a fascinating and complex field that blends chemistry, engineering, and environmental science. By understanding these aspects, students can gain a comprehensive view of synthetic fiber production, its impact on society and the environment, and the potential for future innovations. Synthetic fibers play a crucial role in modern society, impacting various aspects of daily life, industry, and the environment. ynthetic fibers are integral to modern life, offering a range of benefits from cost-effectiveness and versatility to innovative applications and performance characteristics. While they pose environmental challenges, ongoing research and development aim to create more sustainable and eco-friendly alternatives. Understanding the importance of synthetic fibers helps in appreciating their role in the economy, industry, and daily life, while also emphasizing the need for sustainable practices and innovation.
The Roman Empire A Historical Colossus.pdfkaushalkr1407
The Roman Empire, a vast and enduring power, stands as one of history's most remarkable civilizations, leaving an indelible imprint on the world. It emerged from the Roman Republic, transitioning into an imperial powerhouse under the leadership of Augustus Caesar in 27 BCE. This transformation marked the beginning of an era defined by unprecedented territorial expansion, architectural marvels, and profound cultural influence.
The empire's roots lie in the city of Rome, founded, according to legend, by Romulus in 753 BCE. Over centuries, Rome evolved from a small settlement to a formidable republic, characterized by a complex political system with elected officials and checks on power. However, internal strife, class conflicts, and military ambitions paved the way for the end of the Republic. Julius Caesar’s dictatorship and subsequent assassination in 44 BCE created a power vacuum, leading to a civil war. Octavian, later Augustus, emerged victorious, heralding the Roman Empire’s birth.
Under Augustus, the empire experienced the Pax Romana, a 200-year period of relative peace and stability. Augustus reformed the military, established efficient administrative systems, and initiated grand construction projects. The empire's borders expanded, encompassing territories from Britain to Egypt and from Spain to the Euphrates. Roman legions, renowned for their discipline and engineering prowess, secured and maintained these vast territories, building roads, fortifications, and cities that facilitated control and integration.
The Roman Empire’s society was hierarchical, with a rigid class system. At the top were the patricians, wealthy elites who held significant political power. Below them were the plebeians, free citizens with limited political influence, and the vast numbers of slaves who formed the backbone of the economy. The family unit was central, governed by the paterfamilias, the male head who held absolute authority.
Culturally, the Romans were eclectic, absorbing and adapting elements from the civilizations they encountered, particularly the Greeks. Roman art, literature, and philosophy reflected this synthesis, creating a rich cultural tapestry. Latin, the Roman language, became the lingua franca of the Western world, influencing numerous modern languages.
Roman architecture and engineering achievements were monumental. They perfected the arch, vault, and dome, constructing enduring structures like the Colosseum, Pantheon, and aqueducts. These engineering marvels not only showcased Roman ingenuity but also served practical purposes, from public entertainment to water supply.
Palestine last event orientationfvgnh .pptxRaedMohamed3
An EFL lesson about the current events in Palestine. It is intended to be for intermediate students who wish to increase their listening skills through a short lesson in power point.
Model Attribute Check Company Auto PropertyCeline George
In Odoo, the multi-company feature allows you to manage multiple companies within a single Odoo database instance. Each company can have its own configurations while still sharing common resources such as products, customers, and suppliers.
This is a presentation by Dada Robert in a Your Skill Boost masterclass organised by the Excellence Foundation for South Sudan (EFSS) on Saturday, the 25th and Sunday, the 26th of May 2024.
He discussed the concept of quality improvement, emphasizing its applicability to various aspects of life, including personal, project, and program improvements. He defined quality as doing the right thing at the right time in the right way to achieve the best possible results and discussed the concept of the "gap" between what we know and what we do, and how this gap represents the areas we need to improve. He explained the scientific approach to quality improvement, which involves systematic performance analysis, testing and learning, and implementing change ideas. He also highlighted the importance of client focus and a team approach to quality improvement.
1. Talk Title Here
Author Name, Company
Comparison of FOSS
distributed filesystems
Marian “HackMan” Marinov, SiteGround
2. Who am I?
• Chief System Architect of SiteGround
• Sysadmin since 96
• Teaching Network Security & Linux System
Administration in Sofia University
• Organizing biggest FOSS conference in Bulgaria
- OpenFest
3. • We are a hosting provider
• We needed shared filesystem between VMs and
containers
• Different size of directories and relatively small
files
• Sometimes milions of files in a single dir
Why were we considering shared FS?
4. • We started with DBRD + OCFS2
• I did a small test of cLVM + ATAoE
• We tried DRBD + OCFS2 for MySQL clusters
• We then switched to GlusterFS
• Later moved to CephFS
• and finally settled on good old NFS :)
but with the storage on Ceph RBD
The story of our storage endeavors
5. • CephFS
• GlusterFS
• MooseFS
• OrangeFS
• BeeGFS - no stats
I haven't played with Lustre or AFS
And because of our history with OCFS2 and GFS2 I skipped them
Which filesystems were tested
6. GlusterFS Tunning
● use distributed and replicated volumes
instead of only replicated
– gluster volume create VOL replica 2 stripe 2 ....
● setup performance parameters
8. What did I test?
• Setup
• Fault tollerance
• Resource requirements (client & server)
• Capabilities (Redundancy, RDMA, caching)
• FS performance (creation, deletion, random
read, random write)
9. Complexity of the setup
• CephFS - requires a lot of knowledge and time
• GlusterFS - relatively easy to setup and install
• OrangeFS - extremely easy to do basic setup
• MooseFS - extremely easy to do basic setup
• DRBD + NFS - very complex setup if you want HA
• BeeGFS -
10. CephFS
• Fault tollerant by desing...
– until all MDSes start crashing
● single directory may crash all MDSes
● MDS behind on trimming
● Client failing to respond to cache pressure
● Ceph has redundancy but lacks RDMA support
11. CephFS
• Uses a lot of memory on the MDS nodes
• Not suitable to run on the same machines as the
compute nodes
• Small number of nodes 3-5 is a no go
12. CephFS tunning
• Placement Groups(PG)
• mds_log_max_expiring &
mds_log_max_segments fixes the problem with
trimming
• When you have a lot of inodes, increasing
mds_cache_size works but increases the
memory usage
13. CephFS fixes
• Client XXX failing to respond to cache pressure
● This issue is due to the current working set size of the
inodes.
The fix for this is setting the mds_cache_size in
/etc/ceph/ceph.conf accordingly.
ceph daemon mds.$(hostname) perf dump | grep inode
"inode_max": 100000, - max cache size value
"inodes": 109488, - currently used
14. CephFS fixes
• Client XXX failing to respond to cache pressure
● Another “FIX” for the same issue is to login to current active
MDS and run:
● This way it will change currnet active MDS to another server
and will drop inode usage
/etc/init.d/ceph restart mds
15. GlusterFS
• Fault tollerance
– in case of network hikups sometimes the mount may
become inaccesable and requires manual remount
– in case one storage node dies, you have to manually
remount if you don't have local copy of the data
• Capabilities - local caching, RDMA and data
Redundancy are supported. Offers different
ways for data redundancy and sharding
16. GlusterFS
• High CPU usage for heavily used file systems
– it required a copy of the data on all nodes
– the FUSE driver has a limit of the number of
small operations, that it can perform
• Unfortunatelly this limit was very easy to be
reached by our customers
17. GlusterFS Tunning
volume set VOLNAME performance.cache-refresh-timeout 3
volume set VOLNAME performance.io-thread-count 8
volume set VOLNAME performance.cache-size 256MB
volume set VOLNAME performance.cache-max-file-size 300KB
volume set VOLNAME performance.cache-min-file-size 0B
volume set VOLNAME performance.readdir-ahead on
volume set VOLNAME performance.write-behind-window-size 100KB
18. GlusterFS Tunning
volume set VOLNAME features.lock-heal on
volume set VOLNAME cluster.self-heal-daemon enable
volume set VOLNAME cluster.metadata-self-heal on
volume set VOLNAME cluster.consistent-metadata on
volume set VOLNAME cluster.stripe-block-size 100KB
volume set VOLNAME nfs.disable on
20. MooseFS
• Reliability with multiple masters
• Multiple metadata servers
• Multiple chunkservers
• Flexible replication (per directory)
• A lot of stats and a web interface
22. OrangeFS
• No native redundancy uses corosync/pacemaker
for HA
• Adding new storage servers requires stopping of
the whole cluster
• It was very easy to break it
23. Ceph RBD + NFS
• the main tunning goes to NFS, by using the
cachefilesd
• it is very important to have cache for both
accessed and missing files
• enable FS-Cache by using "-o fsc" mount option
24. Ceph RBD + NFS
• verify that your mounts are with enabled cache:
•
# cat /proc/fs/nfsfs/volumes
NV SERVER PORT DEV FSID FSC
v4 aaaaaaaa 801 0:35 17454aa0fddaa6a5:96d7706699eb981b yes
v4 aaaaaaaa 801 0:374 7d581aa468faac13:92e653953087a8a4 yes
29. Sysctl tunning
# turn off selective ACK and timestamps
net.ipv4.tcp_sack = 0
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_low_latency = 1
# scalable or bbr
net.ipv4.tcp_congestion_control = scalable
30. Sysctl tunning
# Increase system IP port range to allow for more
concurrent connections
net.ipv4.ip_local_port_range = 1024 65000
# OS tunning
vm.swappiness = 0
# Increase system file descriptor limit
fs.file-max = 65535
31. Stats
• For small files, network BW is not a problem
• However network latency is a killer :(