This document provides an overview of Kyle Hailey's background and expertise in Oracle performance tuning. It discusses Kyle's experience since 1990 working with Oracle support, benchmarking, and tools development. The document outlines an agenda for a presentation on advanced Oracle performance tuning, covering topics like the wait interface, ASHMON/SASH tools, SQL tuning, and using graphics to simplify complex performance data. It emphasizes making performance issues clear and understandable to DBAs, developers and managers.
Learn how, in a few simple steps, you can break down a complex SQL statement into an easy-to-understand visual diagram that enables you to quickly identify a good candidate for the best execution plan. Once you have a good plan, you will see how to make Oracle use it and how to test your results versus the default optimization done by Oracle.
Knowing how to quickly analyze and tune a query is essential when handling queries that, for whatever reason, Oracle does not optimize correctly by default.
This slideshow provides an overview of best practices for visual analysis within Tableau. It is intended for anyone who wants to tell more compelling stories with their data.
Recent world #1 Kaggle Grandmaster and Research Data Scientist at H2O.ai, Marios Michailidis, will delve into the competitive edge that Driverless AI brings out of the box.
Driverless AI can easily score in the top 5% in popular data science challenges against thousands of participants in a matter of minutes with limited processing power.
Apart from the actual predictions, one can use Driverless AI data munging and derived knowledge of the data to build even more powerful models.
This webinar discusses how Driverless AI can get competitive scores in popular Kaggle challenges. Also, Marios will explain the concepts of hyper-parameter tuning and stacking and how they help to make stronger predictions.
Bio:
Former world no. 1 Kaggle Grandmaster, Marios Michailidis, is now a Research Data Scientist at H2O.ai. He is finishing his PhD in machine learning at University College London (UCL) with a focus on ensemble modeling; he also holds a B.Sc. in Accounting and Finance from the University of Macedonia in Greece and an M.Sc. in Risk Management from the University of Southampton. He has gained exposure to the marketing and credit sectors in the UK market and has successfully led multiple analytics projects on a wide array of themes.
Before H2O.ai, Marios was a Senior Personalization Data Scientist at dunnhumby, where his main role was to improve existing algorithms, research the benefits of advanced machine learning methods, and provide data insights. He created a matrix factorization library in Java along with a demo version of a personalized search capability. Prior to dunnhumby, Marios held positions at iQor, Capita, British Pearl, and Ey-Zein.
On a personal level, he is the creator and administrator of KazAnova, a freeware GUI for quick credit scoring and data mining written entirely in Java. He is also the creator of the StackNet Meta-Modelling Framework.
Castle is an open-source project that provides an alternative to the lower layers of the storage stack -- RAID and POSIX filesystems -- for big data workloads, and distributed data stores such as Apache Cassandra.
This presentation from Berlin Buzzwords 2012 provides a high-level overview of Castle and how it is used with Cassandra to improve performance and predictability.
Every business today wants to leverage data to drive strategic initiatives with machine learning, data science and analytics — but runs into challenges from siloed teams, proprietary technologies and unreliable data.
That’s why enterprises are turning to the lakehouse because it offers a single platform to unify all your data, analytics and AI workloads.
Join our How to Build a Lakehouse technical training, where we’ll explore how to use Apache Spark™, Delta Lake, and other open source technologies to build a better lakehouse. This virtual session will include concepts, architectures and demos.
Here’s what you’ll learn in this 2-hour session:
How Delta Lake combines the best of data warehouses and data lakes for improved data reliability, performance and security
How to use Apache Spark and Delta Lake to perform ETL processing, manage late-arriving data, and repair corrupted data directly on your lakehouse
Instaclustr has a diverse customer base, including ad tech, IoT and messaging applications, ranging from small start-ups to large enterprises. In this presentation we share our experiences, common issues, diagnosis methods, and some tips and tricks for managing your Cassandra cluster.
About the Speaker
Brooke Jensen VP Technical Operations & Customer Services, Instaclustr
Instaclustr is the only provider of fully managed Cassandra as a Service in the world. Brooke Jensen manages our team of engineers that maintains the operational performance of our diverse fleet of clusters, as well as providing 24/7 advice and support to our customers. Brooke has over 10 years' experience as a software engineer, specializing in performance optimization of large systems, and has extensive experience managing and resolving major system incidents.
Information Visualization for Medical Informatics
Lifelines, Lifelines2, LifeFlow, treemaps, networks
(slide file: Shneiderman info vismedical-georgetown-v1 )
MySQL for Oracle Developers and the companion MySQL for Oracle DBA's were two presentations for the 2006 MySQL Conference and Expo. These were specifically designed for Oracle resources to understand the usage, syntax and differences between MySQL and Oracle.
Successfully convince people with data visualization
Kyle Hailey
video of presentation available at https://www.youtube.com/watch?v=3PKjNnt14mk
from Data by the Bay conference
8. Clearer
[chart: damage count vs. temperature, x-axis 30–80, y-axis 4–12]
1. Include successes
2. Mark Differences
3. Normalize same temp
4. Scale known vs unknown
Copyright 2006 Kyle Hailey
9. Difficult
NASA Engineers Fail
Congressional Investigators Fail
Data Visualization is Difficult
But …
Lack of Clarity can be devastating
10. Solutions
Clear Identification
Know how to identify problems and issues
Access to details
Provide solutions and/or information to address the issues
Graphics
Easy understanding, effective communication and discussion
11. First Step: Graphics
“The humans … are exceptionally good at parsing visual information, especially when that information is coded by color and/or _____.”
motion
Knowledge representation in cognitive science. Westbury, C. & Wilensky, U. (1998)
12. Why Use Graphics
“You can't imagine how many times I was told that nobody wanted or would use graphics …”
-- Jef Raskin, creator of the Macintosh
InFocus (the overhead-projector maker) cited a study that humans can parse graphical information 400,000 times faster than textual data.
16. Tuning the Database
Complex
What does a day in the life look like for a DBA who has performance issues?
Averages
Anscombe's Quartet
                      I           II          III         IV
                    x     y     x     y     x     y     x     y
Average             9     7.5   9     7.5   9     7.5   9     7.5
Standard Deviation  3.31  2.03  3.31  2.03  3.31  2.03  3.31  2.03
Linear Regression   1.33        1.33        1.33        1.33
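The point of Anscombe's Quartet is that four very different datasets share nearly identical summary statistics, which is why averages alone hide what a graph reveals. A minimal check of dataset I (using its published values) confirms the mean and standard deviation shown in the table:

```python
from statistics import mean, stdev

# Dataset I of Anscombe's quartet (published values).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]

print(round(float(mean(x)), 2))  # 9.0
print(round(mean(y), 2))         # 7.5
print(round(stdev(x), 2))        # 3.32
print(round(stdev(y), 2))        # 2.03
```

Datasets II, III and IV reproduce the same four numbers despite wildly different shapes when plotted, which is exactly the slide's argument for graphics over averages.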
17. How Can We Open the Black Box?
[screenshot: load (activity) chart with a Max CPU line as the yardstick; click a spike to get details on SQL, Events and Sessions]
18. How Can We Open the Black Box?
OEM, ASHMON/SASH, DB Optimizer
• Powerful - identifies issues quickly
• Interactive - allows exploring the data
• Easy - understandable by everyone: DBAs, developers and managers!
21. Do You Want?
Engineering Data?
22. Do You Want?
Pretty Pictures
23. Do You Want?
Clean and Clear
24. Imagine Trying to Drive your Car
Would you want your dashboard to look like this, updated once an hour?
Or would you like it to look …
29. When to Tune
1. Machine
a) CPU
Response times skewed
100% CPU might be fine
Users waiting in the run queue => machine underpowered
b) Memory
Paging
Wait times skewed (ex: latch free)
Erratic response times (ex: ls)
2. Oracle
1) Waits > CPU? => tune waits
2) AAS on CPU > # of CPUs? => tune top CPU SQL
3) Else => it's the application
[diagram: Oracle load (AAS) vs. Max CPU line on the host; drill-downs: Top Session, Top Wait, Top SQL; Object Detail, SQL Detail, Wait Detail, Session Detail, File Detail]
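The decision tree above can be sketched in a few lines. This is only an illustration of the triage order the slide describes, not a real tool; the function name and inputs (AAS split into on-CPU and waiting components) are hypothetical:

```python
def tune_next(aas_cpu, aas_wait, n_cpus):
    """Triage sketch of the slide's decision tree (hypothetical helper).

    aas_cpu  - average active sessions on CPU (e.g. from ASH samples)
    aas_wait - average active sessions in wait events
    n_cpus   - number of CPU cores on the host
    """
    if aas_wait > aas_cpu:
        return "tune waits"           # waits dominate the database load
    if aas_cpu > n_cpus:
        return "tune top CPU SQL"     # demand for CPU exceeds supply
    return "it's the application"     # DB is not the bottleneck; look upstream

print(tune_next(aas_cpu=4, aas_wait=1, n_cpus=2))  # tune top CPU SQL
```

The ordering matters: waits are checked first because a wait-dominated system can also show moderate CPU, and CPU saturation is judged against the CPU count, not against 100% utilization.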
30. Machine
Make sure the machine is healthy before tuning Oracle
CPU => use run queue, < 2 * #CPU
Memory => page out
VMSTAT
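The two machine-health checks above (run queue under 2 × #CPU, no page-outs) map directly onto the `r` and `so` columns of Linux `vmstat` output. A minimal sketch, assuming a Linux-style column layout; the sample line and the `N_CPUS` value are invented for illustration:

```python
# Hedged sketch: evaluate the slide's two machine-health rules from one
# line of Linux `vmstat` output. Sample data is made up for illustration.

N_CPUS = 4

header = "r b swpd free buff cache si so bi bo in cs us sy id wa st"
sample = "9 0 0 812340 10240 523456 0 120 100 200 500 900 85 10 5 0 0"

cols = dict(zip(header.split(), sample.split()))
run_queue = int(cols["r"])    # processes waiting for CPU
page_out  = int(cols["so"])   # memory swapped out to disk per second

cpu_ok = run_queue < 2 * N_CPUS
mem_ok = page_out == 0

print(cpu_ok, mem_ok)  # False False: run queue 9 >= 8, and the box is paging out
```

With this sample, both rules fail: the run queue exceeds twice the CPU count and the machine is paging out, so per the slide you would fix the machine before tuning Oracle.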
31. Summary
1. Machine - vmstat
Memory, CPU (we can see IO response in Oracle)
2. Database - AAS
Use wait interface and graphics
Identify machine, application, database or SQL
3. SQL - VST
Visual SQL Tuning
Indexes, stats, execution path
32. How Can We Open the Black Box?
OEM, ASHMON/SASH, DB Optimizer
• Powerful - identifies issues quickly
• Interactive - allows exploring the data
• Easy - understandable by everyone: DBAs, developers and managers!
Editor's Notes
In Oracle Support I learned more, faster, than I think I could have anywhere. Porting gave me my first appreciation for the shareable nature of Oracle code, and also a bit of disbelief that it worked as well as it did. Oracle France gave me an opportunity to concentrate on the Oracle kernel.
At Oracle France I had three amazing experiences. The first was being sent to the Europcar site, where I met a couple of the people who would later become founding members of the OakTable, James Morle and Anjo Kolk. The Europcar site also introduced me to a fellow named Roger Sanders, who showed me the wait interface before anyone knew what it was. Roger not only used it but read it directly from shared memory without using SQL.
Soon after Europcar I began to be sent often to benchmarks at Digital Europe. These benchmarks were some of my favorite work at Oracle. They usually consisted of installing some unknown Oracle customer application and then having a few days to make it run as fast as possible. I first started using TCL/TK and direct shared memory access (DMA) at Digital Europe and got solid hands-on tuning experience, testing things like striping redo and proving it was faster long before people gave up arguing that this was bad from a theoretical point of view.
Finally, in France, my boss, Jean Yves Caleca, was by far the best boss I ever had. On top of that, he was wonderful at exploring the depths of Oracle and explaining it to others, teaching me much about the internals of block structure, UNDO, REDO and freelists.
I came back from France wanting to do performance work, especially graphical monitoring. The kernel performance group had limited scope in that domain, so I left for a dot-com where I had my first run as the sole DBA for everything: backup, recovery, performance, installation and administration. I was then called away by Quest, who had my favorite performance tool, Spotlight.
It turned out, though, that the scope for expanding Spotlight was limited, so I jumped at the chance in 2002 to restructure Oracle OEM. I'm proud of the work at OEM, but I still want to do much more to make performance tuning faster, easier and more graphical.
The dials and knobs that allow Oracle to be the fastest, most robust database on the market also make it one of the most complicated. For the few users who know and love the thousands of performance options in Oracle, access to those options through SQL and C code is the de facto standard. For seasoned users it is often difficult to understand why anyone would want to be limited by a GUI, which can be slower and almost always lacks the latest commands available through interfaces such as SQL and C. On the OEM side, when a new kernel tuning feature is externalized in OEM (Oracle Enterprise Manager), it is often difficult to see the forest for the trees, or where and how that new feature fits into the grand scheme of things.
This presentation was a challenge to put together because many of the concepts were like the chicken and the egg: in order to explain one, I needed the other. I've tried to keep the subjects as linear as possible, but sometimes I interweave from one to another.
In order to tune an Oracle database, the first step in a complete analysis is to verify the machine, because there are two factors that can only be clearly determined by looking at machine statistics: memory usage and CPU usage. Memory and CPU problems will have telltale repercussions on Oracle performance statistics and thus can be deduced from the Oracle statistics alone, but it is simplest to start with the machine statistics.
CPU: we check the run queue, which is the number of processes that are ready to run but have to wait for the CPU. A machine free of CPU contention would have a run queue of 0 and could have CPU usage near 100% at the same time; high CPU usage can be a good sign that the system is being utilized fully. On the other hand, a high run queue indicates that there is more demand for CPU than CPU power available. A high run queue can also be detected via Oracle statistics by looking at ASH data and seeing if more sessions are marked as being on the CPU than the number of CPUs available. For example, if there are on average 4 sessions active on CPU in the ASH data and only 2 CPUs, then the machine is CPU bound. Solutions for a high run queue are either to add more processors or to reduce the load on the CPU. If the CPU is mainly being used by Oracle, that will mean tuning the application and its SQL queries.
Memory: if the machine is paging out to disk, there is a memory crunch that can dramatically slow down Oracle. Oracle will sometimes indicate a paging problem through a spike in "latch free" waits, but the only guaranteed method of diagnosing this problem is looking at the machine statistics. Machines have statistics for paging and free memory, and often there can be some free memory even while paging out, because machines start paging out before memory is completely filled. Solutions if the machine is paging out are either to add more memory or to reduce memory usage, which can be done by reducing Oracle cache sizes or reducing Oracle session memory usage.
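The note's ASH rule of thumb (average sessions on CPU vs. number of CPUs) is a simple average over samples. A minimal sketch using the note's own example numbers; the sample data below is invented, where real rows would come from v$active_session_history:

```python
# Sketch of the note's rule: if the average number of sessions sampled
# "ON CPU" exceeds the number of CPUs, the machine is CPU bound.
# ash_samples is invented for illustration.

N_CPUS = 2

# Each inner list: session states captured in one ASH sample interval.
ash_samples = [
    ["ON CPU", "ON CPU", "ON CPU", "ON CPU"],
    ["ON CPU", "ON CPU", "ON CPU", "ON CPU", "db file sequential read"],
    ["ON CPU", "ON CPU", "ON CPU", "ON CPU"],
]

on_cpu = sum(s.count("ON CPU") for s in ash_samples)
aas_on_cpu = on_cpu / len(ash_samples)   # average active sessions on CPU

print(aas_on_cpu)            # 4.0
print(aas_on_cpu > N_CPUS)   # True -> CPU bound, per the note's example
```

This matches the example in the note: 4 sessions on average active on CPU against 2 CPUs means roughly two runnable processes are always queued behind the processors.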