Database Resource Manager (DBRM) is a crucial feature for managing database load, yet real-world
practice shows that it is not often used. This talk aims to change that and demystify the feature by
explaining in detail how it works in different scenarios, the CPU math behind it, how to measure it in
real time using Python and SQL, and by exploring more complex features to understand its behaviour.
Special attention is given to its internals whenever possible.
3. Agenda
What are we going to talk about?
Luís Marques - @drune - http://lcmarques.com
4. About Database Resource Manager, with a lot of questions, charts, arrows, screenshots and a Python script
5. Hand Raising
Is there a simple picture that summarizes Resource Manager CPU scheduling?
6. Before Database Resource Manager
[Diagram: CPU #1 and CPU #2; an OS run-queue containing OS processes, Oracle processes (P#n) and the background processes PMON, LGWR, SMON and DBWR.]
• Quantum defined by OS
• Priority can be changed by OS
• All Oracle user sessions have the same priority to be selected for CPU
7. After Database Resource Manager
[Diagram: sessions (S#n) wait for selection in the DBRM internal queue, which is priority aware according to the DBRM plan. CPU #1 and CPU #2 are fed by the OS run-queue, which holds OS processes, selected sessions (S#n), PMON and LGWR. The OS scheduler decides between the processes in the run-queue.]
8. More about DBRM scheduler…
• DBRM scheduler is not database-workload agnostic
• Priority-based round-robin algorithm
• Fixed quantum (time slice) of 100 ms given to each process (_dbrm_quantum)
• More intelligent scheduling:
• Aware of Oracle internal structures (e.g. mutexes, latches)
• Has code to avoid problems like priority inversion
• No CPU starvation of critical background processes
• 2 background processes: VKRM and DBRM
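The idea of a priority-based round robin with a fixed 100 ms quantum can be illustrated with a toy model. This is only a sketch and not Oracle's implementation: the function name `simulate`, the `shares` weights and the rule "serve the group that is furthest behind its share" are all assumptions made for illustration, and every session is assumed to be runnable all the time on a single CPU.

```python
from collections import deque

QUANTUM_MS = 100  # fixed time slice quoted on the slide (_dbrm_quantum)


def simulate(groups, shares, total_ms):
    """Toy priority-aware round robin on one CPU.

    groups: dict of group name -> list of session names (always runnable)
    shares: dict of group name -> relative weight (hypothetical plan values)
    Returns a dict of session name -> CPU milliseconds received.
    """
    queues = {g: deque(sessions) for g, sessions in groups.items() if sessions}
    used = {g: 0 for g in queues}
    cpu = {s: 0 for q in queues.values() for s in q}

    for _ in range(total_ms // QUANTUM_MS):
        # Pick the group that has consumed the least relative to its share;
        # ties are broken by group name so the result is deterministic.
        group = min(queues, key=lambda g: (used[g] / shares[g], g))
        session = queues[group].popleft()   # round robin inside the group
        cpu[session] += QUANTUM_MS
        used[group] += QUANTUM_MS
        queues[group].append(session)       # back to the end of the queue
    return cpu


if __name__ == "__main__":
    result = simulate(
        groups={"RISK": ["r1", "r2"], "ADHOC": ["a1"]},
        shares={"RISK": 3, "ADHOC": 1},
        total_ms=4000,
    )
    print(result)
```

With a 3:1 weighting, the RISK sessions together receive three quanta for every quantum given to ADHOC, and the two RISK sessions alternate between themselves.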
9. Hand Raising
Interesting! How do you prove that there are internal queues, and how do the processes there get chosen to be on CPU?
10. DBRM – Scheduling (VKRM)
• If a process must yield, the VKRM background process determines which process goes onto the OS run-queue next
• perf Linux profiler output: [screenshot not preserved]
kgskrunnext - the function responsible for choosing the next process for the OS run-queue?
11. DBRM – Scheduling (VKRM)
• Suspending VKRM leaves all your sessions waiting for CPU indefinitely.
• SQL> ORADEBUG SETOSPID 16568
Oracle pid: 10, Unix process pid: 16568, image: oracle@baco (VKRM)
• SQL> ORADEBUG SUSPEND
[Chart annotations: between ORADEBUG SUSPEND and ORADEBUG RESUME, 100% of the activity is resmgr:cpu quantum]
12. DBRM – Scheduling (CPU run-queue)
• vmstat data with DBRM disabled:
• The OS run-queue grows as the number of sessions increases: 41 sessions at the end for 2 CPUs
[Chart annotation: as soon as sessions increase, the OS run-queue increases]
13. DBRM – Scheduling (CPU run-queue)
• Oracle maintains an internal queue for DBRM:
• vmstat data with DBRM active
• Number of sessions increased gradually
[Chart annotation: the OS run-queue doesn't increase even with 41 sessions and 2 CPUs]
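To compare the two vmstat runs numerically, the run-queue column can be pulled out of captured vmstat text. The following is a minimal sketch: the function name `run_queue_samples` is hypothetical, the sample text is made up for illustration (it is not the data from the slides), and it assumes the usual Linux vmstat layout in which `r` is the number of runnable processes.

```python
def run_queue_samples(vmstat_output):
    """Return the values of the 'r' (runnable processes) column as integers."""
    samples = []
    r_index = None
    for line in vmstat_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "r" in fields and "b" in fields:
            # Column header line; vmstat repeats it periodically.
            r_index = fields.index("r")
            continue
        if r_index is not None and fields[r_index].isdigit():
            samples.append(int(fields[r_index]))
    return samples


# Made-up sample in the usual Linux vmstat layout (illustration only).
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 811232  10240 402112    0    0     1     3  120  250 48  2 50  0  0
41  0      0 809104  10240 402200    0    0     0     8 2100 3900 97  3  0  0  0
"""

if __name__ == "__main__":
    queue = run_queue_samples(SAMPLE)
    print("max run-queue:", max(queue))
```

Running the same function over a capture taken with DBRM disabled and one taken with a plan active gives two series that can be plotted side by side.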
14. Hand Raising
Nice theory, but…
I have a database with several schemas with different priorities.
How do I handle Resource Management?
15. presman – DBRM monitor script
• DBRM monitoring tool written in Python 2.x and cx_Oracle
• Runs on Windows, Linux and OS X
• Usage: ./presman.py -m measure -o filename -c column_id -p
• Available measures: CPU, SESSION_IO, PARALLEL, EMPHASIS
• Download: http://lcmarques.com/presman-dbrm-monitor/
• Available on GitHub: https://github.com/lcmarques/presman
17. Hand Raising
Hmm… but the sum of all allocations across all levels is way over 100%?
How do I know the minimum CPU allocated per consumer group?
18. Emphasis - The Minimum CPU formula
• Minimum CPU is relative to all DBRM-managed sessions; it is not a host minimum CPU allocation
• Minimum CPU:
[Formula image not preserved. Its labels: minimum % of CPU for consumer group "n"; the value specified in plan directive mgmt_pn; product of a sequence from k = mgmt_p1 to n = mgmt_pn; the sum of the mgmt_p allocations at level (n-1).]
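The labels that survive from this slide suggest the following shape for the formula. This is a reconstruction under assumptions, not the author's exact notation: $p_k(j)$ stands for the mgmt_pk directive of consumer group $j$ written as a fraction, and group $g$ is assumed to have its allocation on a single level $n$.

```latex
\mathrm{MinCPU}(g) \;=\; p_n(g) \times \prod_{k=1}^{n-1} \Bigl( 1 - \sum_{j} p_k(j) \Bigr)
```

In words: a group at level $n$ is guaranteed its own percentage of whatever the levels above it left unallocated.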
19. Emphasis - The Minimum CPU formula
[Table: the mgmt_p1, mgmt_p2 and mgmt_p3 cell values were not preserved]
Consumer Group    Maximum CPU    Minimum CPU
RISK              100%           65%
RSK_REPORT        100%           17,5%
ADHOC             60%            14%
OTHER_GROUPS      100%           3,5%
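The minimum-CPU arithmetic can be checked with a few lines of Python. This is a sketch under stated assumptions: the rule used is "own percentage times the share left unallocated by every higher level", the function name `minimum_cpu` is hypothetical, and the mgmt_p values in `plan` are not taken from the slides. They are illustrative values picked because they are consistent with the 65%, 17,5%, 14% and 3,5% figures shown here.

```python
def minimum_cpu(plan):
    """Compute the minimum CPU % per consumer group.

    plan: dict of group name -> (level, percent), with levels starting at 1
          and each group allocated on exactly one level (an assumption).
    Returns a dict of group name -> minimum % of the CPU used by
    DBRM-managed sessions.
    """
    # Total percentage handed out at each level.
    level_totals = {}
    for level, pct in plan.values():
        level_totals[level] = level_totals.get(level, 0.0) + pct

    result = {}
    for group, (level, pct) in plan.items():
        leftover = 1.0
        # Multiply the unallocated fraction of every level above this one.
        for k in range(1, level):
            leftover *= 1.0 - level_totals.get(k, 0.0) / 100.0
        result[group] = round(pct * leftover, 2)
    return result


# Hypothetical directive values (assumed, not from the slides).
plan = {
    "RISK": (1, 65),
    "RSK_REPORT": (2, 50),
    "ADHOC": (2, 40),
    "OTHER_GROUPS": (3, 100),
}

if __name__ == "__main__":
    for group, pct in minimum_cpu(plan).items():
        print(group, pct)
```

With these assumed directives, level 1 leaves 35% unallocated and level 2 leaves 10% of that, which is how a 100% directive at level 3 ends up as a 3,5% minimum.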
20. Hand Raising
Great stuff! Let's go test the Resource Manager plan, OK?
21. Test #1 – UTILIZATION_LIMIT
• ADHOC consumer group with UTILIZATION_LIMIT = 60%
• CPU burner: burn_cpu_adhoc.sql
• UTILIZATION_LIMIT is not a host CPU limit!
• UTILIZATION_LIMIT applies to Oracle user sessions managed by DBRM
[Screenshot annotation: us ~66%, sys ~7%]
22. Hand Raising
Hey, hey, so how do I measure it easily?
23. Test #1 – UTILIZATION_LIMIT
• Query v$rsrcmgrmetric and v$osstat and do some math:
(cpu_consumed_time_sec / (60 * CPU_count)) * 100
• $ presman.py -m cpu -o oracle_cpu.csv -c 7 -p
[Chart: Oracle CPU in % by consumer group]
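The percentage formula on this slide is simple enough to sketch directly. The helper names below are hypothetical and this is not presman's code. The 60 in the formula is taken to be the length of the metric interval in seconds, and the millisecond conversion assumes that v$rsrcmgrmetric reports CPU_CONSUMED_TIME in milliseconds, which should be checked against the Oracle reference for your version.

```python
def oracle_cpu_pct(cpu_consumed_time_sec, cpu_count, interval_sec=60):
    """(cpu_consumed_time_sec / (interval_sec * cpu_count)) * 100"""
    return cpu_consumed_time_sec / (interval_sec * cpu_count) * 100.0


def cpu_pct_by_group(rows, cpu_count, interval_sec=60):
    """rows: iterable of (consumer_group_name, cpu_consumed_time_ms).

    Assumes the consumed time is given in milliseconds (see lead-in).
    Returns a dict of consumer group -> % of total CPU capacity used.
    """
    return {
        group: round(oracle_cpu_pct(ms / 1000.0, cpu_count, interval_sec), 2)
        for group, ms in rows
    }


if __name__ == "__main__":
    # Made-up numbers: 144 CPU seconds in one minute on a 4-CPU host.
    print(oracle_cpu_pct(144, 4))
    print(cpu_pct_by_group([("ADHOC", 144000), ("RISK", 24000)], cpu_count=4))
```

A group that burns 144 CPU seconds in a 60-second interval on 4 CPUs is using 60% of the machine's CPU capacity.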
24. Hand Raising
That is easy!
How do I test my plan's CPU allocation?
25. Test #2 – Oracle CPU Consumption
• Step 0 - Start presman to measure CPU by consumer group
• $ presman.py -m cpu -o oracle_cpu.csv -c 5
• Step 1 - Fire up 3 sessions in the ADHOC consumer group
• Almost 100% of the CPU used by all consumer groups goes to ADHOC
26. Test #2 – Oracle CPU Consumption
• Step 2 - Fire up 10 sessions in consumer group RISK
• RISK has many more sessions and a higher priority
• No UTILIZATION_LIMIT directive on the RISK consumer group
• ADHOC consumer group CPU drops to almost 20% of all consumer group CPU activity
27. Test #2 - Oracle CPU Consumption
• Step 3 - Fire up 5 sessions in consumer group RSK_REPORT
• ADHOC queries were canceled by the CANCEL_SQL directive
• RISK and RSK_REPORT are consuming almost every CPU cycle.
28. Test #2 - Oracle CPU Consumption
• Step 4 - Fire up 3 sessions in consumer group ADHOC
• Real-world test vs. plan directives CPU allocation
Consumer Group   Minimum CPU   Test Minimum CPU   Sessions
RISK             65%           66.74%             10
RSK_REPORT       17.5%         18.23%             5
ADHOC            14%           14.81%             3 + 3
OTHER_GROUPS     3.5%          0.22%              No sessions
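The comparison in the table can be automated once the measurements are in a CSV. The sketch below hard-codes the figures from the table instead of reading a file; the dictionary layout and the helper name `check_minimums` are assumptions for this example.

```python
# Minimum CPU guaranteed by the plan vs. the share measured in the test (in %).
planned = {"RISK": 65.0, "RSK_REPORT": 17.5, "ADHOC": 14.0, "OTHER_GROUPS": 3.5}
measured = {"RISK": 66.74, "RSK_REPORT": 18.23, "ADHOC": 14.81, "OTHER_GROUPS": 0.22}
active = {"RISK", "RSK_REPORT", "ADHOC"}  # OTHER_GROUPS had no sessions


def check_minimums(planned, measured, active):
    """Return {group: measured - planned} for the groups that had sessions."""
    return {g: round(measured[g] - planned[g], 2) for g in planned if g in active}


deltas = check_minimums(planned, measured, active)
for group, delta in deltas.items():
    status = "OK" if delta >= 0 else "BELOW MINIMUM"
    print(f"{group:<13} {delta:+.2f} points  {status}")
```

Groups without sessions are skipped on purpose: a minimum is a guarantee under contention, so an idle group consuming less than its minimum is not a violation.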
30. Hand Raising
Clarified!
With so many sessions for a 4-CPU database, you surely have throttling, right?
My hand hurts…
31. Test #3 - Throttling by Wait Event
• Throttling by Resource Manager can be monitored through the wait event "resmgr:cpu quantum" (wait class Scheduler)
• Without Resource Manager, the time spent in "resmgr:cpu quantum" would instead be spent waiting on the operating system run queue.
• In 11g, the AWR report's indication of high waits on the run queue comes from the server load numbers
• 12c AWR has more information on CPU Wait
• "resmgr:cpu quantum" doesn't necessarily mean you have an overloaded CPU (e.g. it also shows up with a UTILIZATION_LIMIT directive)
32. Test #3 - Throttling by Wait Event
• SQL> alter system set resource_manager_plan=''
• CPU available = 4 x 10.04 x 60 = 2409.6 sec
• Consumed CPU = 2053.9 sec (85%)
• % of CPU Wait = 99.79% - 42.7% = 57.09% of DB Time spent on the OS run queue
33. Test #3 - Throttling by Wait Event
• SQL> alter system set resource_manager_plan='DBRM_PLAN'
• CPU available = 4 x 9.03 x 60 = 2167.2 sec
• Consumed CPU = 1820.9 sec (84%)
• 63% of DB Time is spent waiting in the Resource Manager internal queue
• % of CPU Wait = 36.64% - 28.1% = only 8.54% of DB Time spent on the OS run queue
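The numbers for both runs follow from the same small calculation, sketched below. The slides do not name the two percentages that are subtracted, so the parameter names `cpu_plus_wait_pct` and `on_cpu_pct` are my reading of them and should be treated as assumptions, as is the interpretation of 10.04 and 9.03 as elapsed minutes.

```python
def cpu_available_sec(cpu_count, elapsed_min):
    """CPU seconds the host could deliver over the measured interval."""
    return cpu_count * elapsed_min * 60


def run_queue_pct(cpu_plus_wait_pct, on_cpu_pct):
    """% of DB Time spent waiting for CPU on the OS run queue (assumed meaning)."""
    return cpu_plus_wait_pct - on_cpu_pct


tests = {
    "no plan":   dict(cpus=4, elapsed_min=10.04, consumed=2053.9,
                      cpu_plus_wait=99.79, on_cpu=42.7),
    "DBRM_PLAN": dict(cpus=4, elapsed_min=9.03, consumed=1820.9,
                      cpu_plus_wait=36.64, on_cpu=28.1),
}
results = {}
for name, t in tests.items():
    available = cpu_available_sec(t["cpus"], t["elapsed_min"])
    results[name] = (
        round(available, 1),                      # CPU seconds available
        round(100 * t["consumed"] / available),   # % of available CPU consumed
        round(run_queue_pct(t["cpu_plus_wait"], t["on_cpu"]), 2),
    )
    print(name, results[name])
```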
34. Hand Raising
Good! I’ve read that we can handle parallel execution.
Handling all the parallel servers seems hard to me!
35. The DW for reporting - Plan #2
Consumer Group    RATIO   PARALLEL_DEGREE_LIMIT   SWITCH_TIME   S_GROUP           PARALLEL_SERVER_LIMIT   PARALLEL_QUEUE_TIMEOUT
OTHER_GROUPS      10      0                       120 sec       SHORT_REPORTING   -                       -
SHORT_REPORTING   5       -                       900 sec       LONG_REPORTING    50%                     -
LONG_REPORTING    1       -                       -             -                 50%                     3600 sec
• RATIO was used on create_plan()
• Priority statements in OTHER_GROUPS have to execute serially
• To limit the parallel servers used by a consumer group, use the parallel_server_limit directive
36. Hand Raising
Hey hey... WAIT! Now you used plan directives with a thing called RATIO or SHARE! What is that?
37. Ratio - The Minimum CPU formula
$$\mathrm{MinCPU}_n = \frac{\mathrm{mgmt\_p}_n}{\sum \text{all ratios}} \times 100\%$$
where MinCPU_n is the minimum % of CPU for consumer group "n", mgmt_pn is the value specified in its plan directive, and the denominator is the sum of all ratios.
Consumer Group    mgmt_p1 Ratio   Ratio as Emphasis
OTHER_GROUPS      10              10 / 16 = 62.5%
SHORT_REPORTING   5               5 / 16 = 31.25%
LONG_REPORTING    1               1 / 16 = 6.25%
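The conversion from ratios to emphasis-style percentages is a one-liner; the sketch below spells it out with the three ratios of this plan. The function name `ratio_to_emphasis` is made up for this example.

```python
def ratio_to_emphasis(ratios):
    """Convert single-level RATIO values into minimum CPU percentages."""
    total = sum(ratios.values())
    if total <= 0:
        raise ValueError("the sum of all ratios must be positive")
    return {group: 100.0 * ratio / total for group, ratio in ratios.items()}


shares = ratio_to_emphasis(
    {"OTHER_GROUPS": 10, "SHORT_REPORTING": 5, "LONG_REPORTING": 1}
)
for group, pct in shares.items():
    print(f"{group:<16} {pct:.2f}%")
```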
38. Hand Raising
Can you go forward with the plan testing? I’m interested in the parallel details!
40. Test #1 - PARALLEL_DEGREE_LIMIT without Auto DOP
• Generating a PARALLEL plan when the execution is serial is more expensive
• A large difference between the DOP assumed at optimization time (hard parse time) and the actual DOP at execution time might lead to suboptimal execution plans
41. Test #1 - PARALLEL_DEGREE_LIMIT with Auto DOP
• Auto DOP is enabled via parallel_degree_policy = AUTO (or ADAPTIVE in 12c)
• Only the new Auto DOP code path negotiates with DBRM
• alter session set "_px_trace"="high",all;
• $ burn_me.sh (1 session)
42. Test #2 - PARALLEL_SERVER_LIMIT
• The PARALLEL_SERVER_LIMIT directive is a percentage of the parallel_servers_target parameter
• It prevents a low-priority user or consumer group from taking all the parallel servers
• When a consumer group reaches its percentage of parallel servers, its statements are queued
• Auto DOP must be enabled to enable Parallel Statement Queuing
Consumer Group    PARALLEL_SERVERS_TARGET   PARALLEL_SERVER_LIMIT
LONG_REPORTING    64                        50%
SHORT_REPORTING   64                        50%
43. Test #2 - PARALLEL_SERVER_LIMIT
• $ burn_me.sql (19 sessions) to LONG_REPORTING
• SQL> alter system set parallel_servers_target = 64
• $ presman.py -m parallel
Result: 16 statements running, 3 statements queued
32 parallel servers = 50% of parallel_servers_target
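The running/queued split can be reproduced with simple arithmetic. The sketch below assumes every statement asks for the same number of parallel servers; the value 2 is inferred from 32 servers shared by 16 running statements and is not stated on the slide. The function name `admit_statements` is hypothetical.

```python
def admit_statements(n_statements, dop, servers_target, limit_pct):
    """Split statements into running and queued under a PARALLEL_SERVER_LIMIT.

    Assumes each statement requests `dop` parallel servers and is queued once
    the group's share of `servers_target` would be exceeded.
    Returns (running, queued, servers_used).
    """
    limit = servers_target * limit_pct // 100  # servers the group may use
    running = min(n_statements, limit // dop)
    return running, n_statements - running, running * dop


running, queued, servers_used = admit_statements(
    19, dop=2, servers_target=64, limit_pct=50
)
print(f"{running} running, {queued} queued, {servers_used} parallel servers in use")
```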
44. Hand Raising
Clear! What about giving more or less priority to my parallel statements when they are queued?
45. Test #3 - Priority of the Parallel Statement Queue
Diagram: path of a SQL statement through the parallel statement queue
• The statement is parsed and Auto DOP is calculated
• Enough parallel servers and PARALLEL_SERVER_LIMIT not reached: the statement executes in parallel
• Not enough parallel servers or limit reached: the statement waits in a FIFO statement queue per consumer group
• Dequeuing priority is based on the RATIO / SHARES or EMPHASIS values of the consumer group; a dequeued statement then executes in parallel
46. Test #3 - Priority of the Parallel Statement Queue
• 35 sessions each for the SHORT_REPORTING and LONG_REPORTING consumer groups
• $ burn_me_all_same_time.sh
• $ presman.py -m parallel -o queue_time.csv -c 4
• Step 1 - 16 statements running and 19 queued for each consumer group
47. Test #3 - Priority of the Parallel Statement Queue
• Step 2 - Dequeuing of parallel statements starts
• Step 3 - Dequeuing continues as soon as some statements finish
• Step 4 - Almost every statement is done. No queued statements
48. Test #3 - Priority of the Parallel Statement Queue
• SHORT_REPORTING queue time: 7719385 milliseconds
• LONG_REPORTING queue time: 11375129 milliseconds
SHORT_REPORTING queue time is about 67.9% of LONG_REPORTING's (roughly 32% less)
• The SHORT_REPORTING ratio is 5 against 1 for LONG_REPORTING
• SHORT_REPORTING is 5 times more likely than LONG_REPORTING to get a statement dequeued.
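To get a feel for what "5 times more likely" means for queue order, here is a toy simulation. It is not Oracle's algorithm: it simply assumes that each dequeue picks a consumer group at random, weighted by its ratio among the groups that still have queued statements, and serves that group's queue in FIFO order. The function name `dequeue_order` and the fixed seed are choices made for this sketch.

```python
import random
from collections import deque


def dequeue_order(queues, ratios, rng):
    """Drain per-group FIFO queues, choosing the next group at random with a
    probability proportional to its ratio among the non-empty queues."""
    order = []
    while any(queues.values()):
        candidates = [g for g in queues if queues[g]]
        weights = [ratios[g] for g in candidates]
        group = rng.choices(candidates, weights=weights, k=1)[0]
        order.append((group, queues[group].popleft()))  # FIFO inside the group
    return order


ratios = {"SHORT_REPORTING": 5, "LONG_REPORTING": 1}
queues = {g: deque(range(19)) for g in ratios}  # 19 queued statements per group
order = dequeue_order(queues, ratios, random.Random(42))

# Average position in the dequeue order: lower means dequeued earlier.
avg_position = {
    g: sum(i for i, (grp, _) in enumerate(order) if grp == g) / 19 for g in ratios
}
print(avg_position)
```

Under these assumptions SHORT_REPORTING statements leave the queue much earlier on average, which is consistent in direction with the shorter queue time measured in the test.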
49. Hand Raising
What if I have some critical reports that need to bypass the queue?
50. Critical Parallel Statement Queues
• Oracle 12c introduced parallel_stmt_critical on plan directives
• Allows one value: BYPASS_QUEUE
• Statements will start immediately and will not wait in the queue.
• The parallel_max_servers init parameter is the hard threshold, and critical statements can run with a lower number of PX servers
dbms_resource_manager.create_plan_directive(
  plan => 'REPORTS_PLAN',
  group_or_subplan => 'CRITICAL_REPORT',
  comment => 'CRITICAL Reporting Queries',
  parallel_stmt_critical => 'BYPASS_QUEUE');
51. Q & A
I bet we don’t have time for it
52. Want to know more?
• Dump the state of DBRM with:
• SQL> oradebug setmypid
• SQL> oradebug dump DBSCHEDULER 1
• Trace wait events with the 12c interface:
• SQL> alter session set events 'wait_event["resmgr:cpu quantum"] trace("%s\n", shortstack())';
• SQL> exec DBMS_MONITOR.SESSION_TRACE_ENABLE(waits => true, binds => false, plan_stat => 'NEVER');
Editor's Notes
How many of you are using DBRM?
Underestimated because it is very powerful, yet not very well understood and poorly used.
Part 1: Theory on DBRM scheduler details with CPU session scheduling in mind
Part 2: A more practical RM plan, testing and validating the most interesting features
A lot of images and arrows will appear during the presentation
The presentation will be driven by a guy that is constantly interrupting it and asking questions
Database Resource Manager is basically a scheduler like the one you will find in your operating system. The difference is that it knows your workload and Oracle very well, because it runs inside the database.
Priority decay can happen: if your mutex holder is consuming a lot of CPU, its priority can be lowered, causing a priority inversion issue.
1 – BLUE: Processes in the DBRM internal queue, waiting to be selected and placed on the operating system run queue.
This process selection is made according to your RM plan, and the VKRM background process places the next process on the OS run queue.
The operating system is then responsible for placing everything on the CPUs.
2 – ORANGE: Please note that DBRM takes care of the priority of PMON (and other Oracle background processes) and tries to avoid any type of CPU starvation for them, even if that means that your session must wait a little longer.
Instead of waiting in CPU runqueue, processes will wait on an RM internal queue – I will prove that to you later on
A background process called VKRM will be responsible for placing your next foreground session on OS runqueue
Priority Round Robin scheduling: It retains the advantage of round robin in reducing starvation and also integrates the advantage of priority scheduling.
The quantum that RM gives to your session is 100 ms by default, and it is basically a slice of CPU time. You will learn that you can play with it.
PMON starvation will also cause some stability problems (PMON frees dead processes). PMON will have the same priority as your foreground sessions, which may be an issue.
Remember that latches and mutexes are just memory structures in the SGA – the OS doesn’t have a clue about them.
If the mutex holder is off the CPU and other processes that get on the CPU want the same mutex, it can be a very complex issue, since they can’t get it and will spin waiting for it and then sleep (not in 10.2).
In Oracle 11g the mutex getters do sleep instead of just yielding the CPU.
You may ask: if the mutex holder is off the CPU, it should come back to the CPU very fast, right?
Answer: Yes, if all the processes have the same priority. But priority decay can happen: if your mutex holder is consuming a lot of CPU, its priority can be lowered, causing a priority inversion issue.
I will prove 2 different things:
- The VKRM job that places the next process on the OS run queue
- The existence of a Resource Manager internal queue
I will not get into much detail here but:
This is part of the output of perf on the VKRM process. Perf is a Linux profiler and was run against the VKRM background process.
We are able to see some Oracle kernel functions here, but the function that stands out is kgskrunnext, and it gives us a hint about what VKRM's job is.
QUESTION: Let’s think: if we can somehow stop or suspend this background process, what will happen?
Test Case:
1 – Oracle Database Resource Manager is throttling with UTILIZATION_LIMIT
2 – ORADEBUG SUSPEND will cause 100% of resmgr:cpu quantum waits: no more sessions are scheduled onto the OS run queue
3 – RESUME will resume the normal behavior
Only works if DBRM is actively throttling your session.
Test case: Increase the number of sessions and watch the operating system run queue size using vmstat
- When DBRM is not enabled, the OS run queue size increases as soon as you start to increase your number of sessions.
- At the end your run queue is 42 and the CPUs are totally busy, with 0 idle time
Same test case
The OS run queue stays at the same values even as your number of sessions increases – this shows that your sessions are not placed directly on the operating system run queue
The internal queue holds all your “waiting for selection” sessions.
Here is an example of a schema consolidation plan that I will use to demonstrate some testing
1 – switch_io_logical - Number of logical I/O for the length of the session
2 – LOG_ONLY – No action, just record the event to SQL Monitoring (12c only)
3 – ACTIVE_SESS_POOL_p1 – 5 maximum active sessions
CANCEL_SQL – If the statement runs for more than 120 seconds it will be canceled
Please note that UTILIZATION_LIMIT replaces MAX_UTILIZATION_LIMIT (on 11.2)
As of 12c UTILIZATION_LIMIT also limits I/O if you are on Exadata, and parallel servers as a percentage of parallel_servers_target.
- Fairly easy for many of you, but I will make it complex
- If you sum the previous values of all levels you will end up with much more than 100%
EMPHASIS – For multilevel plans that use percentages – This is the default if you define a plan.
You may ask: WHAT IS THIS? Are you crazy?
The symbol is Pi (Π), and it is basically the product of a sequence
Remember:
L1: 65%
L2: 50% and 40%
L3: 100%
Quick question – How many of you test your DBRM plans after creating them?
Way over 60% on CPU utilization – Difficult to test if Resource Manager is respecting your directive.
- As of 12c UTILIZATION_LIMIT also limits I/O if you are on Exadata, and parallel servers as a percentage of parallel_servers_target.
A little over 60% here, but this is the best way you have to test whether Resource Manager is respecting your UTILIZATION_LIMIT directive.
That value is what Oracle thinks each consumer group is consuming.
I will show a sequence of events: the first event is firing up 3 sessions in the ADHOC consumer group, which has a UTILIZATION_LIMIT of 60, as you can see in the screenshot
Only sessions from ADHOC are running, with a limit of 60% of Oracle CPU, consuming almost 100% of the CPU used by all consumer groups.
- The RISK consumer group has a minimum CPU guarantee of 65%
120 seconds passed and the ADHOC sessions' queries got canceled
Every bit of CPU is now consumed by RISK and RSK_REPORT
That is how you should test your Resource Manager CPU allocation: fire up your workload and measure it.
If you are not satisfied with your results, you should go back to the drawing board, because defining an RM plan is an iterative process – you create it, test it, check the results, and if they are not what you expect you re-create it.
In the end you can just take your CSV, open Excel and build this kind of chart. It helps a lot in visualizing what is consuming your CPU.
Values are in percentage
Explain why
- parallel_server_limit is a percentage of the parallel_servers_target parameter
RATIO only works for plans with one level and shows the relation between consumer groups
Expected: PX_COORDINATOR FORCED SERIAL
- Running at a very low DOP (for example DOP = 2) might actually be less efficient than running serially, because Parallel Execution comes with some (implementation) overhead that can make a Parallel Execution slower than a serial counterpart
SQL statements enter the database
Statements are parsed and Oracle determines the automatic DOP
If the active parallel servers reach the value of PARALLEL_SERVER_LIMIT, parallel statements are queued
Statements are queued in FIFO statement queues, controlled by the plan directives: mgmt_pn or ratio priority
After everything is checked, and depending on dequeue priority and the availability of parallel servers, the statement is allowed to run.
After 11.2.0.2 we have a different FIFO queue per consumer group.