A collection of screenshots, design mockups and conceptual prototypes for systems monitoring user interfaces and performance data visualizations spanning 20 years of my career from Savant to Oracle to the present. Wacky stuff.
Proactive performance monitoring with adaptive thresholds, by John Beresniewicz
Presentation given at the UKOUG 2008 conference on the Adaptive Thresholds technology in Oracle Database 10.2+ and Enterprise Manager 11. Adaptive Thresholds lets users monitor performance consistently and effectively across systems and architectures by using statistical characterization of metric streams to set and adapt monitoring thresholds automatically, independent of application workload.
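As a rough illustration of deriving a threshold from a metric's own history, here is a minimal Python sketch. It is not Oracle's implementation: the nearest-rank percentile method, the function names, and the sample values are all assumptions made for illustration.

```python
def percentile_threshold(baseline, pct=99):
    """Return the nearest-rank percentile of the baseline observations.

    `pct` is an integer percentage (hypothetical parameter, e.g. 95 or 99).
    """
    if not baseline:
        raise ValueError("baseline must contain at least one observation")
    ordered = sorted(baseline)
    # Nearest-rank method: rank = ceil(pct * n / 100), in integer arithmetic.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def is_alert(value, baseline, pct=99):
    """Flag a new observation that exceeds the baseline percentile."""
    return value > percentile_threshold(baseline, pct)


# Hypothetical baseline: 100 past observations of some metric.
baseline = list(range(1, 101))
threshold = percentile_threshold(baseline, pct=99)
print(threshold)                 # 99
print(is_alert(100, baseline))   # True
print(is_alert(99, baseline))    # False
```

Because the threshold is recomputed from the baseline, it moves with the workload instead of being fixed by hand, which is the general idea the abstract describes.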
This presentation from 2008 is a good summary of Design by Contract and its application to PL/SQL, an approach I have adopted and recommend others try as well.
An experimental and advanced usage of Oracle Event Histogram and ASH data to answer the question: has ASH sampled any latency outlier events? We use Event Histogram to characterize the probability distribution of event latencies and then join with ASH to find if high significance (low probability) events have been sampled. Presented at UKOUG in 2011.
Awr1page - Sanity checking time instrumentation in AWR reports, by John Beresniewicz
Discusses Oracle time-based performance instrumentation as presented in AWR reports and inconsistencies between instrumentation sources that can cause confusion as conflicting information is presented. The cognitive load of investigating and reasoning about such conundrums is very high, discouraging even senior performance experts. A program (AWR1page) is discussed that consumes an AWR report and produces a 1-page normalized time summary by instrumentation source, precisely designed for reasoning about instrumentation inconsistencies in AWR reports.
AWR Ambiguity: Performance reasoning when the numbers don't add up, by John Beresniewicz
A close look at an AWR report where DB Time is exceeded by the sum of DB CPU and foreground wait time. We recall core Oracle performance principles and instrumentation design on the way to untangling the confusion.
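The arithmetic behind this kind of check can be sketched in a few lines of Python. The function name and the figures below are hypothetical and not taken from any real AWR report; the sketch only compares DB Time with the sum of DB CPU and foreground wait time.

```python
def time_accounting_gap(db_time_s, db_cpu_s, fg_wait_s):
    """Return how far DB CPU plus foreground wait time exceeds DB Time.

    A positive result means the components add up to more than DB Time,
    which is the situation the presentation examines.
    """
    return (db_cpu_s + fg_wait_s) - db_time_s


# Hypothetical values in seconds, for illustration only.
gap = time_accounting_gap(db_time_s=1000.0, db_cpu_s=700.0, fg_wait_s=450.0)
print(gap)  # 150.0
```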
Awr1page - Sanity checking time instrumentation in AWR reports, by John Beresniewicz
The presentation discusses issues with Oracle timing instrumentation and introduces AWR1page, a program that produces high-level instrumentation sanity checks from AWR text reports, on a single page.
DB Time, Average Active Sessions, and ASH Math - Oracle performance fundamentals, by John Beresniewicz
RMOUG 2020 abstract:
This session will cover core concepts for Oracle performance analysis first introduced in Oracle 10g and forming the backbone of many features in the Diagnostic and Tuning packs. The presentation will cover the theoretical basis and meaning of these concepts, as well as illustrate how they are fundamental to many user-facing features in both the database itself and Enterprise Manager.
ASHviz - Data visualization research experiments using ASH data, by John Beresniewicz
RMOUG Training Days 2020 abstract:
The Active Session History (ASH) mechanism is a rich source of fine-grained data about database activity, and is the lynchpin for many database performance management features in the Diagnostic and Tuning packs. Many interesting stories about happenings in the database are buried in ASH waiting to be revealed, and data visualization is key to sifting these out from the high dimensionality and volume of ASH data. The session will cover a number of data visualization experiments conducted using a single ASH dump with an emphasis on the iterative process of discovering useful data visualizations.
Lightning talk given at OakTableWorld 2018 in San Francisco. It discusses why NoSQL databases are pretty much antithetical to fundamental relational database principles, and why claims of "Relational-NoSQL" are therefore absurd on their face.
This short presentation is about the deeper meaning of the core Oracle performance metric "Average Active Sessions" as the time derivative of the DB Time function, which explains why the Enterprise Manager DB Performance Page is literally a picture of DB Time (as the integral of AAS) as well as why "ASH Math" works to estimate DB Time (it's a Riemann sum as in first-year calculus.) Also, the relationship of AAS to Little's Law in queueing theory is briefly mentioned.
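The "ASH Math" estimate mentioned here can be sketched as a Riemann sum in Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the input is a list of active-session counts (one per ASH sample), and the sampling interval is a parameter that defaults to one second.

```python
def estimate_db_time(active_counts, sample_interval_s=1.0):
    """Riemann-sum estimate of DB Time in seconds.

    Each sample contributes (active sessions at that instant) times
    (the sampling interval), approximating the integral of AAS over time.
    """
    return sum(active_counts) * sample_interval_s


def average_active_sessions(active_counts, sample_interval_s=1.0):
    """AAS over the sampled period: estimated DB Time divided by elapsed time."""
    elapsed_s = len(active_counts) * sample_interval_s
    if elapsed_s == 0:
        return 0.0
    return estimate_db_time(active_counts, sample_interval_s) / elapsed_s


# Hypothetical active-session counts for five consecutive samples.
samples = [2, 3, 1, 0, 4]
print(estimate_db_time(samples))         # 10.0
print(average_active_sessions(samples))  # 2.0
```

With a longer sampling interval, each sample simply stands for more time, so the same sum is scaled by the interval.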
Small updates for this version presented at OakTableWorld 2018
Discusses Oracle time-based performance instrumentation as presented in AWR reports and inconsistencies between instrumentation sources that can cause confusion as conflicting information is presented. The cognitive load of investigating and reasoning about such conundrums is very high, discouraging even senior performance experts. A program (AWR1page) is discussed that consumes an AWR report and produces a 1-page normalized time summary by instrumentation source, precisely designed for reasoning about instrumentation inconsistencies in AWR reports.
Understanding Average Active Sessions (AAS) is critical to understanding Oracle performance at the systemic level. This is my first presentation on the topic, given at RMOUG Training Days in 2007. Later I will upload a more recent presentation on AAS from 2013.
This is the presentation on ASH that I gave with Graham Wood at RMOUG 2014. It represents the final best effort to capture essential and advanced ASH content, building on a presentation Uri Shaft and I gave at a small conference in Denmark, perhaps sometime in 2012. The presentation is also available publicly through the RMOUG website, so I felt at liberty to post it here myself. If it disappears, it will likely be because Oracle has asked me to remove it.
2. explanation
A collection of screenshots, design mockups, and concept prototypes dredged up from old archives. The tools used were often very crude and not meant to appear polished. Mockups especially are low-tech conceptual experiments.
A version with explanations for each image is underway but very time-consuming, so I am publishing the raw content.
If it doesn’t make sense, it’s OK, you are probably normal.
7. Instance widget graphically shows 3 “temperature” metrics and 3 “text” metrics.
Multi-instance viewer allowed widgets to be grouped together and quickly scanned for alerts and associated messages.