Performance Tuning, Monitoring, Management: Getting the Most from SUSE Linux Enterprise Server

This session discusses and demonstrates the less-known tools and options in SUSE Linux Enterprise Server 11. These tools, while perhaps less obvious, can be very valuable to the experienced user or administrator.

This session will include an overview of:
1. Performance tuning
2. Kernel resource management
3. Built-in monitoring capabilities


  1. 1. Performance Tuning, Monitoring, Management Getting the Most out of SUSE Linux Enterprise Server ® Matthias G. Eckermann Senior Product Manager SUSE Linux Enterprise mge@novell.com
  2. 2. Agenda Performance Analysis and Tuning Kernel Resource Management with Control Groups Built-in Monitoring Capabilities 2 © Novell, Inc. All rights reserved.
  3. 3. Part I: Performance Analysis and Tuning
  4. 4. General Considerations (Hardware, Configuration,...)
  5. 5. Hardware and Configuration Ultimately, hardware and its configuration set the upper limits for our tuning efforts. Are we starting with the best possible (minimum needed) hardware platform and components? – CPU speed only critical for compute-intense tasks – RAM (amount and speed) and interconnects do matter – Bottleneck I/O: network bandwidth, disk,... Is the hardware configuration appropriate? The weakest link kills performance! 5 © Novell, Inc. All rights reserved.
  6. 6. (Hardware) Configuration Optimize storage configuration – Optimize distribution of data across controllers/disks – Put swap on an extra disk – Use RAID with striping Tune hardware setup (BIOS, EFI,...) – Only enable/probe what you have. – Tune for fast reboot vs. startup checks (if desired) – Carefully review all settings. Disable unneeded services # rc<SERVICE> stop 6 © Novell, Inc. All rights reserved.
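A minimal sketch of the service-cleanup step on SLES 11 (SysV init; "cups" here is only an example of a service you might not need):

# List services enabled in runlevel 3
chkconfig --list | grep '3:on'
# Stop a service now and remove it from the boot sequence
rccups stop
chkconfig cups off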
  7. 7. Identifying Problems
  8. 8. Where Has My Memory Gone!? Slab Cache – Structures of much less than one page in size – Generic slabs of predefined sizes (32, 64) plus slabs for specific data structures Page Cache – Pages with actual contents of files (or block device) usually the largest, by far Buffer Cache – File system metadata 8 © Novell, Inc. All rights reserved.
  9. 9. Identifying Problems Start by finding the bottleneck: I/O, disk, memory,... iostat to identify overloaded drives – package sysstat # iostat -x 1 vmstat for basic system usage:
# vmstat 1
 r  b   swpd  free   buff   cache  si  so  bi  bo    in    cs  us  sy  id  wa
 0  0  76804  8268  14996  167396   1   1  36  64   132   197   4   1  92   3
 0  0  76804  8268  14996  167396   0   0   0   0  1023   879   3   0  97   0
 0  0  76804  8300  14996  167396   0   0   0   0  1158  1134   2   0  98   0
slabtop for slab cache use 9 © Novell, Inc. All rights reserved.
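Putting those three tools together, a first-pass triage might look like this (a sketch; watch %util and await in iostat, and the wa column in vmstat):

# Install the tools, then sample extended per-device statistics every second
zypper install sysstat
iostat -x 1
# One-second system snapshots; a persistently high 'wa' column means
# the CPUs are mostly waiting on I/O
vmstat 1
# Interactive, top-like view of the biggest slab cache consumers
slabtop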
  10. 10. File Systems
  11. 11. Picking a File System Pick the right file system for the task – Indexed metadata – File sizes – Number of files – Workloads (database, mail server,...) – AccessPaths – Dump/Restore 11 © Novell, Inc. All rights reserved.
  12. 12. SUSE Linux Enterprise ® Filesystems

Feature                      Ext3     ReiserFS    XFS      OCFS2    btrfs
Data/metadata journaling     •/•      ○/•         ○/•      ○/•      N/A [3]
Journal internal/external    •/•      •/•         •/•      •/○      N/A
Offline extend/shrink        •/•      •/•         ○/○      •/○      •/•
Online extend/shrink         •/○      •/○         •/○      •/○      •/•
Inode allocation map         table    u. B*-tree  B+-tree  table    B-tree
Sparse files                 •        •           •        •        •
Tail packing                 ○        •           ○        ○        •
Defrag                       ○        ○           •        ○        •
ExtAttr / ACLs               •/•      •/•         •/•      •/•      •/•
Quotas                       •        •           •        •        •
Dump/Restore                 •        ○           •        ○        ○
Default block size           4 KiB    4 KiB       4 KiB    4 KiB    4 KiB
Max. filesystem size [1]     16 TiB   16 TiB      8 EiB    16 TiB   16 EiB
Max. file size [1]           2 TiB    1 EiB       8 EiB    1 EiB    16 EiB
Support status               SLES     SLES        SLES     SLE HA   Tech Preview

SUSE® Linux Enterprise was the first enterprise Linux distribution to support journaling filesystems and logical volume managers, back in 2000. Today we have customers running XFS and ReiserFS with more than 8 TiB in one filesystem, and the SUSE Linux Enterprise engineering team uses our three major Linux journaling filesystems on all of their servers. We are excited to add the OCFS2 cluster filesystem to the range of supported filesystems in SUSE Linux Enterprise. For large-scale filesystems, for example for file serving (e.g., with Samba, NFS, etc.), we recommend using XFS. (In this table "•" means "available/supported" and "○" means "unsupported".)
[1] The maximum file size can be larger than the filesystem's actual size due to the use of sparse blocks. Note also that unless a filesystem comes with large file support (LFS), the maximum file size on a 32-bit system is 2 GiB (2^31 bytes). Currently all of our standard filesystems (including ext3 and ReiserFS) have LFS, which gives a theoretical maximum file size of 2^63 bytes. The numbers in the table assume a 4 KiB block size; with other block sizes the results differ, but 4 KiB reflects the most common standard.
[2] In this document: 1024 bytes = 1 KiB; 1024 KiB = 1 MiB; 1024 MiB = 1 GiB; 1024 GiB = 1 TiB; 1024 TiB = 1 PiB; 1024 PiB = 1 EiB (see also http://physics.nist.gov/cuu/Units/binary.html)
[3] Btrfs is a copy-on-write, logging-style filesystem: rather than journaling changes before writing them in place, it writes them to a new location and then links the new location in. Until the last write, the new changes are not "committed."
12 © Novell, Inc. All rights reserved.
  13. 13. File Systems: ReiserFS Applications that use many small files – Mail servers – NFS servers – Database servers or other applications that use synchronous I/O 13 © Novell, Inc. All rights reserved.
  14. 14. File Systems: Ext3 Default file system in SUSE Linux Enterprise 11 ® Best suited for – Small (<100GiB) file systems When using Ext3 with many files in one directory, consider enabling btree support (enabled by default in SUSE Linux Enterprise Server 11 SP 1) # mkfs.ext3 -O dir_index When using Ext3 with multiple threads appending to files in the same directory, consider turning preallocation on # mount -o reservation 14 © Novell, Inc. All rights reserved.
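Combining both ext3 hints on a hypothetical mail-spool volume (/dev/sdb1 and the mount point are placeholders):

# Create ext3 with hashed b-tree directory indexes
mkfs.ext3 -O dir_index /dev/sdb1
# Mount with block preallocation for concurrent appenders
mount -o reservation /dev/sdb1 /var/spool/mail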
  15. 15. File Systems: XFS Best suited for – Medium (>100GiB) to very large file systems (> 1 TiB) – Large files/many files – Streaming multimedia (low latencies) Special features and capabilities – dump/restore – online filesystem-check – online-defragmentation 15 © Novell, Inc. All rights reserved.
  16. 16. Cluster File System: OCFS2 OCFS2 (Oracle Cluster File System) • Shared access by multiple nodes – Ensures data integrity in case of a node-failure – Scale-out for data access • Generic use – POSIX-compliant – Cluster-aware POSIX locking • Higher throughput – Parallel I/O • Disaster Tolerance – Integration with data replication for dual node 16 © Novell, Inc. All rights reserved.
  17. 17. Filesystems: btrfs • Integrated Volume Management • Support for copy on write • Powerful snapshot capabilities • Scalability • Data integrity (checksums) • Full community support • Technology preview in SUSE Linux Enterprise Server 11 SP 1 ® 17 © Novell, Inc. All rights reserved.
  18. 18. Barriers SUSE® Linux Enterprise defaults to a maximum data-integrity guarantee by enforcing barriers from the file system, so that reordering of journal writes cannot happen. This may cost some performance; tunable via mount option ReiserFS – enable with “barrier=flush” (default) – disable with “barrier=none” Ext3 – enable with “barrier=1” (default) – disable with “barrier=0” XFS – enable with “barrier” – disable with “nobarrier” 18 © Novell, Inc. All rights reserved.
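For example, disabling barriers on a scratch filesystem (a sketch; only consider this when the storage has a battery-backed write cache, and the devices are placeholders):

# ext3: turn write barriers off (the default is barrier=1)
mount -o barrier=0 /dev/sdc1 /scratch
# XFS equivalent
mount -o nobarrier /dev/sdd1 /scratch2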
  19. 19. Logging Modes Journaling file systems offer different modes to write the actual data For ReiserFS and Ext3, mount option data=<X> – data=ordered: use barriers for data; no risk of exposing old data (default) – data=writeback: no barriers for data; fastest in many workloads – data=journal: use the journal for data as well; generally slow, but can improve mail server workloads By default, SUSE® Linux Enterprise Server ensures data integrity at the cost of some performance 19 © Novell, Inc. All rights reserved.
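A hypothetical per-workload choice of logging mode (devices and mount points are placeholders):

# Build area: speed over crash-consistency of file contents
mount -o data=writeback /dev/sde1 /srv/build
# Mail spool: full data journaling can help fsync-heavy workloads
mount -o data=journal /dev/sdf1 /var/spool/mail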
  20. 20. Dedicated Logging Devices ReiserFS mkreiserfs -j /dev/xxx -s 8193 /dev/xxy reiserfstune --journal-new-device /dev/xxx -s 8193 Ext3 mke2fs -O journal_dev /dev/xxx mke2fs -j -J device=/dev/xxx,size=8193 /dev/xxy tune2fs -J device=/dev/xxx,size=8193 /dev/xxy XFS mkfs.xfs -l logdev=/dev/xxx,size=10000b /dev/xxy 20 © Novell, Inc. All rights reserved.
  21. 21. File System Tuning Split file systems based on data access patterns – Keep commit heavy data away from data that does not have to be synchronous – Keep streaming writes and reads on different spindles than random I/O Consider disabling atime updates on files and directories # mount -o noatime,nodiratime 21 © Novell, Inc. All rights reserved.
  22. 22. File System Tuning Optimize directory layout for the file system – Keep data that will be accessed together in the same subdirectories – Spread data out into different subdirectories to increase large file concurrency – Different file systems order directories differently 22 © Novell, Inc. All rights reserved.
  23. 23. Block Layer
  24. 24. I/O Scheduler Flexible, pluggable I/O scheduler Selectable via boot parameter elevator=<X> – noop – deadline – as (default in mainline kernels) – cfq (default in SUSE Linux Enterprise) ® I/O Scheduler per device – Check /sys/block/*DEV*/queue/iosched – Set echo SCHEDNAME > /sys/block/*DEV*/queue/scheduler 24 © Novell, Inc. All rights reserved.
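For instance, to inspect and switch one disk at runtime (sda is an example device):

# The active scheduler is shown in square brackets
cat /sys/block/sda/queue/scheduler
# Switch this device to deadline without rebooting
echo deadline > /sys/block/sda/queue/scheduler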
  25. 25. I/O Scheduler: Noop No reordering, just merging Best for storage with extensive caching and scheduling of its own, such as: MultiPathing Activated by boot parameter elevator=noop 25 © Novell, Inc. All rights reserved.
  26. 26. I/O Scheduler: Deadline Per-request service deadline – Caps maximum latency per request – Maintains good disk throughput Best for disk-intensive database applications Activated by boot parameter elevator=deadline 26 © Novell, Inc. All rights reserved.
  27. 27. I/O Scheduler: Anticipatory Similar to “deadline”, but anticipates reads by putting them in front of the queue and delays a few ms after every read – Maximizes throughput – At the cost of increasing latency Best for file servers and desktop workloads with single IDE/SATA disks. Default in mainline kernels Activated by boot parameter elevator=as 27 © Novell, Inc. All rights reserved.
  28. 28. I/O Scheduler: CFQ Complete Fair Queuing Treat all competing processes equally by keeping a unique request queue for each and giving equal bandwidth to each queue – Good compromise between throughput and latency – Minimal worst case latency on all reads and writes Suitable for a wide variety of applications Default in SUSE Linux Enterprise ® Activated by boot parameter elevator=cfq 28 © Novell, Inc. All rights reserved.
  29. 29. Block Layer Tuning Spreading the load across controllers – Per-target locking for SCSI – Software RAID bandwidth Battery backed caching 29 © Novell, Inc. All rights reserved.
  30. 30. Block Layer Tunables Block read ahead buffer /sys/block/<sdX/hdX>/queue/read_ahead_kb Default is 128. Increase to 512 for fast storage (SCSI disks or RAID) May speed up streaming reads a lot Number of requests /sys/block/<sdX/hdX>/queue/nr_requests Default is 128. Increase to 256 with CFQ scheduler for fast storage Increases throughput at minor latency expense 30 © Novell, Inc. All rights reserved.
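Applied to an example device sda, the two tunables look like this (values as suggested above):

# Larger read-ahead window for fast storage
echo 512 > /sys/block/sda/queue/read_ahead_kb
# Deeper request queue for CFQ on fast storage
echo 256 > /sys/block/sda/queue/nr_requests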
  31. 31. Memory Management (VM)
  32. 32. Buffer Flushing Controls how dirty pages are written to disk. This can be tuned by – /proc/sys/vm/dirty_ratio (40%) At this share of dirty memory, the generator of dirty data starts writeback itself. – /proc/sys/vm/dirty_background_ratio (10%) At this share, background writeback kicks in. – /proc/sys/vm/dirty_expire_centisecs (3000) How long may dirty pages remain dirty? – /proc/sys/vm/dirty_writeback_centisecs (500) How often do the flusher threads (pdflush) wake up? The defaults are pretty high, which is good for databases (but may result in lots of unreclaimable pagecache). For other workloads (HPC) you may want to lower these, as sketched below. 32 © Novell, Inc. All rights reserved.
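An HPC-leaning configuration might lower the knobs like this (the values are illustrative, not recommendations):

# Start background writeback earlier, block writers sooner
echo 5 > /proc/sys/vm/dirty_background_ratio
echo 20 > /proc/sys/vm/dirty_ratio
# Expire dirty pages after 15 seconds (the unit is centiseconds)
echo 1500 > /proc/sys/vm/dirty_expire_centisecs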
  33. 33. VM: Swappiness The kernel's eagerness to swap out process memory can be tuned via – /proc/sys/vm/swappiness Default is 60, which works well if you want to swap out daemons or programs which have not done a lot lately. Higher values provide more buffer/page cache; lower values wait longer to swap out idle processes. 33 © Novell, Inc. All rights reserved.
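For example, to favor keeping idle processes in RAM (20 is an arbitrary illustrative value):

# Check the current value (default: 60)
cat /proc/sys/vm/swappiness
# Lower it: swap later, at the cost of a smaller buffer/page cache
echo 20 > /proc/sys/vm/swappiness
# To persist across reboots, set vm.swappiness = 20 in /etc/sysctl.conf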
  34. 34. NUMA (1) NUMA = Non-Uniform Memory Access SUSE® Linux Enterprise detects and uses the NUMA topology automatically and – Prefers memory that is local to a node; – Evenly balances system data across nodes; – Gracefully handles CPU-less nodes; etc. Applications, too, can (and should) optimize for the NUMA topology! 34 © Novell, Inc. All rights reserved.
  35. 35. NUMA (2) The NUMA system can be tuned via `numactl $CMD`; the settings then apply to $CMD and all of its children – --preferred=255 – --membind=!0-1 – --cpunodebind=2-5 – --physcpubind=2-5 – --localalloc (always allocate from current node) Node 0 may be the most contended, so avoid it 35 © Novell, Inc. All rights reserved.
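A small sketch of both inspection and binding (/usr/local/bin/mydb is a placeholder application):

# Show nodes, their CPUs, and free memory per node
numactl --hardware
# Run the application on node 1's CPUs, allocating memory only from node 1
numactl --cpunodebind=1 --membind=1 /usr/local/bin/mydb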
  36. 36. Miscellaneous (Scheduler, Network)
  37. 37. Binding Processes/Interrupts to CPUs Problem: context switching costs CPU affinity: binding a process to specific CPUs can improve performance – taskset 0x3 [-p pid] [command] In this example, 0x3 is a bitmask selecting CPUs 0 and 1; 0x6 would select CPUs 1 and 2. Bind interrupts to CPUs – cat /proc/interrupts – echo 3 > /proc/irq/0/smp_affinity – Example: distribute NICs among CPUs. 37 © Novell, Inc. All rights reserved.
  38. 38. Network Improvements Gigabit Ethernet and 10 GbE – Significant interrupt overhead reduction – Consider jumbo frames (larger than 1500 bytes) – # ifconfig <DEV> mtu 9000 NFS modes – TCP (default) vs UDP – NFSv3 (default) vs NFSv4 – rsize=<X>/wsize=<X> – read/write in chunks of <X> bytes; default is 1024, use 8192 for higher throughput 38 © Novell, Inc. All rights reserved.
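For example (eth1, server:/export, and the mount point are placeholders; the switch and the NFS server must support the corresponding settings):

# Enable jumbo frames on a GbE interface
ifconfig eth1 mtu 9000
# NFS mount with 8 KiB read/write chunks over TCP
mount -t nfs -o tcp,rsize=8192,wsize=8192 server:/export /mnt/data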
  39. 39. Application Interplay
  40. 40. Async I/O, O_DIRECT Asynchronous I/O – Specific model for concurrency – Heavily used by databases Direct I/O (O_DIRECT) on block devices or files – Databases like to use raw disks. Historically /dev/raw was used, but O_DIRECT is more performant. – Files should be preallocated (no holes, no appending); the system falls back to buffered I/O otherwise. – In both cases: avoids polluting the page cache – Not specific to database workloads! 40 © Novell, Inc. All rights reserved.
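A quick way to observe direct I/O from the shell, without writing C (the file path is arbitrary):

# Write 100 MiB bypassing the page cache; compare 'free' before and after
# with a normal buffered run to see the cache-pollution difference
dd if=/dev/zero of=/srv/testfile bs=1M count=100 oflag=direct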
  41. 41. Part II: Kernel Resource Management with Control Groups
  42. 42. Control Groups • Understanding control groups: An in-depth overview – What Control Groups are designed to do – How Control Groups work • Using control groups in SUSE Linux Enterprise Server 11 ® – Understanding the components 42 © Novell, Inc. All rights reserved.
  43. 43. Understanding Control Groups An In-depth Overview
  44. 44. What Are Control Groups? Control Groups provide a mechanism for aggregating/partitioning sets of tasks, and all their future children, into hierarchical groups with specialized behavior. – cgroup is another name for Control Groups – Partition tasks (processes) into one or many groups organized as tree hierarchies – Associate the set of tasks in a group with a set of subsystem parameters – Subsystems provide the parameters that can be assigned – Tasks are affected by the assigned parameters 44 © Novell, Inc. All rights reserved.
  45. 45. Example of the Capabilities of a cgroup Consider a large university server with various users – students, professors, system tasks, etc. The resource planning for this server could be along the following lines (slide diagram, reconstructed):
CPU: a top cpuset split into CPUSet1 (professors) and CPUSet2 (students); system tasks attached to the top cpuset, limited to 20%
Memory: professors = 50%, students = 30%, system = 20%
Disk I/O: professors = 50%, students = 30%, system = 20%
Network I/O: WWW browsing = 20% (professors 15%, students 5%), Network File System = 60%, others = 20%
45 © Novell, Inc. All rights reserved. Source: /usr/src/linux/Documentation/cgroups/cgroups.txt
  46. 46. Control Group Subsystems Two types of subsystems • Isolation and special controls – cpuset, namespace, freezer, device, checkpoint/restart • Resource control – cpu(scheduler), memory, disk i/o, network Each subsystem can be mounted independently – mount -t cgroup -o cpu none /cpu – mount -t cgroup -o cpuset none /cpuset or all at once – mount -t cgroup none /cgroup 46 © Novell, Inc. All rights reserved. Source: http://jp.linuxfoundation.org/jp_uploads/seminar20081119/CgroupMemcgMaster.pdf •
  47. 47. Cpuset Subsystem (Isolation) Cpuset is for tying processes to CPUs and memory nodes. (Slide diagram: process groups A1, A2, and B, each bound to its own CPUs and memory nodes.) 47 © Novell, Inc. All rights reserved. Source: http://jp.linuxfoundation.org/jp_uploads/seminar20081119/CgroupMemcgMaster.pdf
  48. 48. Namespace Subsystem (Isolation) Namespace is for showing private view of system to processes in cgroup. Mainly used for OS-level virtualization. This subsystem itself has no special functions and just tracks changes in namespace. 48 © Novell, Inc. All rights reserved. Source: http://jp.linuxfoundation.org/jp_uploads/seminar20081119/CgroupMemcgMaster.pdf •
  49. 49. Freezer Subsystem (Control) Freezer cgroup is for freezing (stopping) all tasks in a group. mount -t cgroup none /freezer -o freezer ....put task into /freezer/tasks... echo FROZEN > /freezer/freezer.state echo RUNNING > /freezer/freezer.state 49 © Novell, Inc. All rights reserved. Source: http://jp.linuxfoundation.org/jp_uploads/seminar20081119/CgroupMemcgMaster.pdf •
  50. 50. Device Subsystem (Isolation) A system administrator can provide a list of devices that can be accessed by processes under cgroup – Allow/Deny Rule – Allow/Deny : READ/WRITE/MKNOD Limits access to device or file system on a device to only tasks in specified cgroup 50 © Novell, Inc. All rights reserved. Source: http://jp.linuxfoundation.org/jp_uploads/seminar20081119/CgroupMemcgMaster.pdf •
  51. 51. Checkpoint/Restart Subsystem (Control) • Saves the status of all processes in a cgroup to a dump file, to be restarted later (or just saves state and continues) • Allows a “saved container” to be moved between physical machines (as a VM can be) • Dumps all process images to a file 51 © Novell, Inc. All rights reserved. Source: http://jp.linuxfoundation.org/jp_uploads/seminar20081119/CgroupMemcgMaster.pdf
  52. 52. CPU Subsystem (Resource Control) • Share CPU bandwidth between groups via the group scheduling function of CFS (the scheduler) • Mechanically complicated (slide diagram: three groups with shares of 2000, 1000, and 4000) 52 © Novell, Inc. All rights reserved.
  53. 53. Memory Subsystem (Resource Control) • For limiting memory usage of user space processes. • Limit LRU (Least Recently Used) pages – Anonymous and file cache • No limits for kernel memory – Maybe in another subsystem if needed 53 © Novell, Inc. All rights reserved. Source: http://jp.linuxfoundation.org/jp_uploads/seminar20081119/CgroupMemcgMaster.pdf •
  54. 54. Disk I/O Subsystem (Resource Control) (Draft) • 3 proposals are currently being discussed – dm-ioband, io-throttle, io-controller • Consensus has not been reached, but io-controller seems to be taking the lead – Both dm-ioband and io-throttle suffer from a significant problem: they can defeat the policies (such as I/O priority) being implemented by the I/O scheduler. – Io-controller does bandwidth control at the I/O scheduler level – Designed to work with the mainline I/O schedulers (CFQ, deadline, anticipatory, and noop), but requires significant changes – Currently at v4 as of June 8, 2009, based on the 2.6.30-rc8 kernel Source: http://jp.linuxfoundation.org/jp_uploads/seminar20081119/CgroupMemcgMaster.pdf Source: http://lwn.net/Articles/331857/ Source: http://lwn.net/Articles/332839/ 54 © Novell, Inc. All rights reserved. Source: http://lkml.org/lkml/2009/6/8/580
  55. 55. Network I/O Subsystem (Resource Control) (Draft) • Like the Disk I/O subsystem, it seems the jury is still out on the implementation of this subsystem • Kernel developers are talking about traffic control – cgroup_tc - This patch provides a simple resource controller which uses traffic control (tc) features already in the Linux kernel – Not much discussion on this topic since late 2008 Source: http://lkml.org/lkml/2008/7/22/361 Source: https://lists.linux-foundation.org/pipermail/containers/2008-August/012419.html 55 © Novell, Inc. All rights reserved. Source: https://lists.linux-foundation.org/pipermail/containers/2008-August/012512.html
  56. 56. Reading More on cgroups Remember to install kernel source!! – /usr/src/linux/Documentation/cgroups/cgroup.txt – /usr/src/linux/Documentation/cpusets.txt – /usr/src/linux/Documentation/controllers/* – /usr/src/linux/Documentation/scheduler/sched-design-CFS.txt – /usr/src/linux/Documentation/kernel-parameters.txt Additional RPM packages – libcgroup1 - /usr/share/doc/packages/libcgroup1/README* – cpuset (Alex Tsariounov) - /usr/share/doc/packages/cpuset/cset*.txt 56 © Novell, Inc. All rights reserved.
  57. 57. Reading More on cgroups (continued) Manpages – man cpuset – man cset On the web – http://lkml.org/lkml/2009/2/9/372 – http://lkml.org/lkml/2009/2/10/140 – http://lkml.org/lkml/2008/1/29/60 – http://kerneltrap.org/mailarchive/linux-kernel/2008/6/18/2161114/thread 57 © Novell, Inc. All rights reserved.
  58. 58. Using Control Groups in SUSE Linux Enterprise Server 11 ®
  59. 59. Preparing SUSE Linux Enterprise Server 11 ® • Start with a patched SLES 11 installation • Add the following packages – libcgroup1 – cpuset – libcpuset1 – kernel-source (for documentation purposes) – gcc (needed to compile the stress tool) 59 © Novell, Inc. All rights reserved.
  60. 60. What Subsystems Are Available? One way to figure this out (sketched below): – mount -t cgroup none /cgroup – cat /proc/mounts Current subsystems in SUSE Linux Enterprise Server 11 – rw,freezer,devices,cpuacct,cpu,ns,cpuset – memory – disabled by default > Add a kernel parameter: cgroup_enable=memory Possible future subsystems in SLES 11 – Disk and network subsystem controllers 60 © Novell, Inc. All rights reserved.
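A short sketch of the discovery step (the /cgroup mount point is arbitrary):

mkdir -p /cgroup
mount -t cgroup none /cgroup
grep cgroup /proc/mounts
# The kernel also lists known controllers and their status here
cat /proc/cgroups
# To enable the memory controller, append cgroup_enable=memory to the
# kernel line in /boot/grub/menu.lst and reboot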
  61. 61. Generating Load on SUSE Linux Enterprise Server 11 ® Search the Web for “linux load generator” - Results: • http://devin.com/lookbusy/ • http://www.ibm.com/developerworks/linux/library/l-stress/index.html – Good article • http://ltp.sourceforge.net/ – Powerful toolkit for Linux developers • http://hardware.slashdot.org/article.pl?sid=05/04/06/218233 – Simple scripting examples • http://weather.ou.edu/~apw/projects/stress/ – Probably best choice – Available (community driven) also at: http://software.opensuse.org/search?baseproject=SUSE:SLE-11&p=1&q=stress 61 © Novell, Inc. All rights reserved.
  62. 62. Crash Course on CPUSETs The Hard Way • Determine the number of CPUs and memory nodes – Look at /proc/cpuinfo and /proc/zoneinfo • Creating the CPUSET hierarchy:
mkdir /dev/cpuset
mount -t cpuset cpuset /dev/cpuset
cd /dev/cpuset
mkdir Charlie
cd Charlie
/bin/echo 2-3 > cpus
/bin/echo 1 > mems
/bin/echo $$ > tasks
# The current shell is now running in cpuset Charlie
# The next line should display '/Charlie'
cat /proc/self/cpuset
• Removing the CPUSET (move any remaining tasks first!):
cat /dev/cpuset/Charlie/tasks
rmdir /dev/cpuset/Charlie
62 © Novell, Inc. All rights reserved.
  63. 63. Crash Course on CPUSETs The Easy Way – Thanks to Alex Tsariounov of Novell ® • Determine the number of CPUs and Memory Nodes – cset set --list • Creating the CPUSET hierarchy – cset set --cpu=2-3 --mem=1 --set=Charlie • Starting processes in a CPUSET – cset proc --set Charlie --exec -- stress -c 1 & • Moving existing processes to a CPUSET – cset proc --move --pid PID --toset=Charlie • List task in a CPUSET – cset proc --list --set Charlie • Removing a CPUSET – cset set --destroy Charlie 63 © Novell, Inc. All rights reserved.
  64. 64. Follow It Up with cgroups The Hard Way • Creating the cgroup hierarchy:
mkdir /dev/cgroup
mount -t cgroup cgroup /dev/cgroup
cd /dev/cgroup
mkdir priority
cd priority
cat cpu.shares
• Understanding cpu.shares, relative to a single competing group left at the default (more in sched-design-CFS.txt):
– 1024 is the default = 50% utilization
– 1524 = 60% utilization
– 2048 = 67% utilization
– 512 = 33% utilization
• Changing cpu.shares:
/bin/echo 1024 > cpu.shares
64 © Novell, Inc. All rights reserved.
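An end-to-end sketch of the shares arithmetic with two competing groups (group names are arbitrary; assumes the cgroup mount from the slide above and the stress tool installed earlier):

cd /dev/cgroup
mkdir low high
echo 512 > low/cpu.shares     # ~33% against a 1024-share sibling
echo 1024 > high/cpu.shares
# Pin one CPU burner per group to the same CPU so the groups actually
# compete, then compare their CPU time with top(1)
taskset -c 0 stress -c 1 & echo $! > low/tasks
taskset -c 0 stress -c 1 & echo $! > high/tasks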
  65. 65. More cgroup Functionality to Learn The libcgroup1 package • Basic tools in user space to simplify resource management functionality – uid, gid or exec rules for placement of a task – /etc/init.d/cgconfig – setup cgroup filesystem based on /etc/cgconfig.conf • UID/GID rules – Managed in /etc/cgrules.conf by root user • EXEC rules – Fully managed by a user in a config file in their home directory • Methods used to place task in proper cgroup – pam_cgroup (at login); cgexec (task start); cgclassify (task move) – User space daemon (cgred in /etc/init.d and /etc/sysconfig) 65 © Novell, Inc. All rights reserved.
  66. 66. Linux Containers LXC • Built upon cgroups and specific kernel settings; use “lxc-checkconfig” to check compliance • Fully enabled in SUSE Linux Enterprise Server 11 SP1 • Basic functionality: lxc-execute --name=NAME -- COMMAND • Function overview – lxc-start / lxc-execute / lxc-stop – lxc-freeze / lxc-unfreeze – Monitoring: lxc-ps, lxc-info, lxc-netstat, lxc-monitor – Modifying cgroup parameters: lxc-cgroup 66 © Novell, Inc. All rights reserved.
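For instance, combining a container start with a cgroup tweak (names and values are examples):

# Run a command in a fresh container named 'test'
lxc-execute --name=test -- /bin/sh
# From the host: inspect the container and halve its CPU shares
lxc-info --name=test
lxc-cgroup --name=test cpu.shares 512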
  67. 67. Part III: Built-in Monitoring Capabilities
  68. 68. Monitoring Overview and Hands-on • Low Level – smartmontools - Monitor for S.M.A.R.T. Disks and Devices – sensors - Hardware health monitoring for Linux – iptraf - TCP/IP Network Monitor – pcp - Performance Co-Pilot (system-level performance monitoring) – sysstat - Sar and Iostat Commands for Linux – perfmon – blktrace, ltrace, strace – systemtap - Instrumentation System • High Level – argus – network auditing tool – nagios 68 © Novell, Inc. All rights reserved.
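Two of the low-level tools in one-liner form, as a starting point (the intervals and the device are examples):

# CPU utilization: 3 samples, 5 seconds apart (sysstat package)
sar -u 5 3
# Overall SMART health verdict for a disk (smartmontools package)
smartctl -H /dev/sda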
  69. 69. Unpublished Work of Novell, Inc. All Rights Reserved. This work is an unpublished work and contains confidential, proprietary, and trade secret information of Novell, Inc. Access to this work is restricted to Novell employees who have a need to know to perform tasks within the scope of their assignments. No part of this work may be practiced, performed, copied, distributed, revised, modified, translated, abridged, condensed, expanded, collected, or adapted without the prior written consent of Novell, Inc. Any use or exploitation of this work without authorization could subject the perpetrator to criminal and civil liability. General Disclaimer This document is not to be construed as a promise by any participating company to develop, deliver, or market a product. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. Novell, Inc. makes no representations or warranties with respect to the contents of this document, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. The development, release, and timing of features or functionality described for Novell products remains at the sole discretion of Novell. Further, Novell, Inc. reserves the right to revise this document and to make changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes. All Novell marks referenced in this presentation are trademarks or registered trademarks of Novell, Inc. in the United States and other countries. All third-party trademarks are the property of their respective owners.
