These are the slides used by Dilip Kumar of EnterpriseDB for his presentation at pgDay Asia 2016, Singapore. He talked about scalability and performance improvements in PostgreSQL v9.6, which is expected to be released in Dec/2016 - Jan/2017.
Query Parallelism in PostgreSQL: What's coming next? | PGConf APAC
This talk was given by Dilip Kumar (a PostgreSQL contributor) at pgDay Asia 2017. It covers the parallel query features released in v9.6, the infrastructure for parallel query that was built in previous versions, and the roadmap for parallel query.
This presentation reviews the top ten new features that will appear in the Postgres 9.5 release.
Postgres 9.5 adds many features designed to enhance the productivity of developers: UPSERT, CUBE, ROLLUP, JSONB functions, and PostGIS improvements. For administrators, it has row-level security, a new index type, and performance enhancements for large servers.
Toro DB - Open-source, MongoDB-compatible database, built on top of PostgreSQL | InMobi Technology
Toro DB - Open-source, MongoDB-compatible database, built on top of PostgreSQL
By Álvaro Hernández at India PostgreSQL UserGroup Meetup, Bangalore
at InMobi.
http://technology.inmobi.com/events/india-postgresql-usergroup-meetup-bangalore
Agenda
• Technical cases in PostgreSQL
• Database Monitoring Methods
By Rohit Vyas at India PostgreSQL UserGroup Meetup, Bangalore at InMobi.
http://technology.inmobi.com/events/india-postgresql-usergroup-meetup-bangalore
Always upgrade! There are hundreds of fixes between each PostgreSQL release, and a significant number of them are security fixes! Logical replication allows making major upgrades with minimal downtime and manageable drawbacks.
This webinar covered:
- PostgreSQL releases
- Upgrade options
- What is Pglogical?
- Major upgrades
Oracle Database In-Memory introduces a number of new features in the query optimizer. The aim of this presentation is to describe and demonstrate how they work.
PostgreSQL Enterprise Class Features and Capabilities | PGConf APAC
These are the slides used by Venkar from Fujitsu for his presentation at pgDay Asia 2016. He spoke about some of the enterprise-class features of the PostgreSQL database.
These slides were used by Devrim at pgDay Asia 2017. He talked about some important facts about WAL (transaction logs, or xlogs) in PostgreSQL. Some of these can really come in handy on a bad day.
Agenda:
- Spark on Yarn
- Auto scaling Spark Apps and Cluster management
- Hive Integration with Spark
- Persistent History Server
By Rajat Gupta and Bharath Bhushan at Big Data Meetup at InMobi.
http://technology.inmobi.com/events/big-data-may-meetup
Lessons PostgreSQL learned from commercial databases, and didn’t | PGConf APAC
These are the slides used by Illay for his presentation at pgDay Asia 2016, "Lessons PostgreSQL learned from commercial databases, and didn’t". The talk takes you through some of the things that PostgreSQL has done really well and some things that PostgreSQL can learn from other databases.
Troubleshooting Complex Performance issues - Oracle SEG$ contention | Tanel Poder
From Tanel Poder's Troubleshooting Complex Performance Issues series - an example of Oracle SEG$ internal segment contention due to some direct path insert activity.
Countdown to PostgreSQL v9.5 - Foreign Tables can be part of Inheritance Tree | Ashnikbiz
Distributed databases and horizontal scaling are among today's key demands. PostgreSQL already had some vertical scaling features and horizontal scale-up by adding disks and table partitioning/child tables. With the release of v9.5, PostgreSQL will get the basic foundation for a native sharding capability. From v9.5, foreign tables will be able to participate in an inheritance tree as a child or parent table, i.e. one can have table partitions residing on different systems.
In our countdown to v9.5 series of hangouts, we will be covering some of the great features of PostgreSQL v9.5 and their real-life applicability. In the first hangout in this series we will be talking about:
- The feature of foreign partitions/child tables
- Syntax and usage
- EXPLAIN plan demo
- Use cases and benefits
Join us for more and send us your queries at success@ashnik.com
Ilya Kosmodemiansky - An ultimate guide to upgrading your PostgreSQL installa... | PostgreSQL-Consulting
Even an experienced PostgreSQL DBA cannot always say that upgrading between major versions of Postgres is an easy task, especially if there are special requirements, such as downtime limitations, or if something goes wrong. For less experienced DBAs, anything more complex than dump/restore can be frustrating.
In this talk I will describe why we need a special procedure to upgrade between major versions, how that can be achieved and what sort of problems can occur. I will review all possible ways to upgrade your cluster, from classical pg_upgrade to old-school Slony or modern methods like logical replication. For each approach, I will give a brief explanation of how it works (limited by the scope of this talk, of course), examples of how to perform the upgrade and some advice on potentially problematic steps. I will also touch upon topics such as the integration of upgrade tools and procedures with other software: connection brokers, operating system package managers, automation tools, etc. This talk would not be complete if I did not cover cases when something goes wrong and how to deal with them.
Size can creep up on you. Some day you may wake up to a multi-terabyte Postgres system handling over 3000 tps staring you down. Learn the best ways to manage these systems as they grow, and find out what new features in 9.0 have made life easier for administrators and application developers working with big data.
This talk will lead you through solutions to problems Postgres faces when it gets big: backups, transaction wraparound, bloat, huge catalogs and upgrades. You need to monitor the right things, find the gems in DBA-friendly database functions and catalog tables, and know the right places to look to spot problems early. We’ll also go over monitoring best practices and open source tools to get the job done.
Working with multiple versions of Postgres back to version 8.2 will be included, as well as tips on making the most of new features in 9.0. War stories will be taken from real-world work with Emma, an email marketing company with a few large databases.
It has been just a few months since PostgreSQL 9.5 was released. We have got some of our customers excited about the great new features and performance enhancements in v9.5. But here we are, already taking a peek at the next version, and we find it awesome! One of the most awaited features, parallelism, makes it to Postgres. The infrastructure for parallelism has been added over the last few releases, but the first parallel operation in query execution will be seen only in v9.6.
A sneak peek at what's coming in PostgreSQL 9.4. Many more things will be added before the beta is released. But hopefully this will give an idea what's already there.
In September 2016, the PostgreSQL community is rolling out PostgreSQL 9.6 which includes improvements in parallelism for query performance, overall performance improvements and the integration of foreign data sources.
This presentation introduces the new features of 9.6 and how they will benefit you.
- Parallel sequential scans, joins and aggregates
- Elimination of repetitive scanning of old data by autovacuum
- Synchronous replication now allows multiple standby servers for increased reliability
- Full-text search for phrases
- Support for remote joins, sorts, and updates in postgres_fdw
- Substantial performance improvements, especially in the area of improving scalability on many-CPU servers
If you have any questions on how to get started with Postgres, please email sales@enterprisedb.com
The One Coin One Brick (OCOB) project is organized by the Vietnamese Students Subcommittee in National University of Singapore (VSS). It is one of our fundraising activities for the Vietnam JUMP project, which aims to raise 15,000 SGD for the construction of a music room and a medical room for a needy local kindergarten in Quang Nam province, Vietnam. Inspired by what VietnamJUMP will do in Quang Nam, OCOB is going to build a classroom model with 5137 50-cent coins. Every 50-cent coin donated will be put on the model by hand, which resembles the act of laying a brick to build a real classroom. Besides the elevated purpose, the model could be considered a Singapore record when finished.
For more information please visit our website: http://vietnamjump.vncnus.net/2009/fundraising.html
Why PostgreSQL for Analytics Infrastructure (DW)? | Huy Nguyen
Talk given at Grokking TechTalk 14 - Database Systems
Why start out with Postgres as a reporting/analytics database.
Huy Nguyen
CTO of Holistics.io - BI & Infrastructure SaaS.
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics | mason_s
In this talk we introduce Postgres-XL for scaling out PostgreSQL. We cover its architecture, how tables are distributed, and include a sample configuration for a small local test cluster. Finally, we discuss the differences from PostgreSQL and Postgres-XL community building.
This presentation introduces the following functionalities of pgAdmin and PEM that make database management more efficient:
1. Examining the performance of a query using the explain plan visualizer in pgAdmin’s Query Tool
2. Examining the performance of a process or session consisting of multiple queries in PEM’s SQL Profiler
3. 24/7 monitoring of Postgres and the underlying host system
4. Capacity management and reporting
5. Alerting the DBA or System Administrator to potential problems
Postgres has the unique ability to act as a powerful data aggregator or information hub in many IT centers bringing together data from different databases and in different formats.
This presentation reviews Postgres' extensibility, foreign data wrappers, and ability to work with structured relational and unstructured NoSQL-like information such as documents and key-value data.
The Postgres capabilities are unrivaled in enabling a complete view of customers or businesses, analyzing disparate data together, and breaking down data silos within the enterprise.
If you would like to listen to the recording please visit EnterpriseDB > Resources > Webcasts > Ondemand Webcasts.
To speak to someone about EnterpriseDB's solutions and services please email sales@enterprisedb.com.
Maximizing Database Tuning in SAP SQL Anywhere | SAP Technology
This session illustrates the different tools available in SQL Anywhere to analyze performance issues, as well as describes the most common types of performance problems encountered by database developers and administrators. We also take a look at various tips and techniques that will help boost the performance of your SQL Anywhere database.
Based on the popular blog series, join me in taking a deep dive and a behind the scenes look at how SQL Server 2016 “It Just Runs Faster”, focused on scalability and performance enhancements. This talk will discuss the improvements, not only for awareness, but expose design and internal change details. The beauty behind ‘It Just Runs Faster’ is your ability to just upgrade, in place, and take advantage without lengthy and costly application or infrastructure changes. If you are looking at why SQL Server 2016 makes sense for your business you won’t want to miss this session.
Apache Spark 3.0: Overview of What’s New and Why Care | Databricks
Continuing with the objective of making Spark faster, easier, and smarter, Apache Spark 3.0 extends its scope with more than 3000 resolved JIRAs. We will talk about the exciting new developments in Spark 3.0 as well as some other initiatives that are coming in the future. In this talk, we want to share with the Bogota Spark community an overview of Spark 3.0 features and enhancements.
In particular, we will touch upon the following areas:
* Performance Improvement Features
* Improved Usability Features
* ANSI SQL Compliance
* Pandas UDFs
* Compatibility and migration considerations
* Spark Ecosystem: Delta Lake, Project Hydrogen, and Project Zen
The Practice of Presto & Alluxio in E-Commerce Big Data Platform | Alluxio, Inc.
Data Orchestration Summit 2020 organized by Alluxio
https://www.alluxio.io/data-orchestration-summit-2020/
The Practice of Presto & Alluxio in E-Commerce Big Data Platform
Wenjun Tao, Sr. Software Engineer, JD.com
About Alluxio: alluxio.io
Engage with the open source community on slack: alluxio.io/slack
Just like you can't defeat the laws of physics there are natural laws that ultimately decide software performance. Even the latest technology beta is still bound by Newton's laws, and you can't change the speed of light, even in the cloud!
[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdf | MukundThakur22
Since 2006 the world of big data has moved from terabytes to hundreds of petabytes, from local clusters to remote cloud storage, yet the original Apache Hadoop POSIX-based file APIs have barely changed.
It is wonderful that these APIs have worked so well, but we can do a lot better with remote object stores, by providing new operations which suit them better, targeted at columnar data libraries such as ORC and Spark. Only a few libraries need to migrate to these APIs for significant speedups of all big data applications.
This talk introduces a new Hadoop Filesystem API called "vectored read", coming in Hadoop 3.4. An extension of the classic FSDataInputStream it is automatically offered by all filesystem clients.
The S3A connector is the first object store connector to provide a custom implementation, reading different blocks of data in parallel. In Apache Hive benchmarks with a modified ORC library, we saw a 2x speedup compared to using the classic S3A connector through the POSIX APIs.
We will introduce the API spec, the S3A implementation, and the benchmarks, and show how to use it in your own applications. We will also cover our ongoing work on providing similar speedups with other object stores, and the use of the API in other applications.
SQL on Hadoop benchmarks using TPC-DS query set | Kognitio
Sharon Kirkham, VP Analytics & Consulting at Kognitio, ran the TPC-DS query set using Impala, SparkSQL and Kognitio, to test for speed, reliability and concurrency for different SQL on Hadoop solutions. Standard Hive was originally investigated as part of this benchmark but lack of SQL support and poor single thread performance meant it was removed.
Tech Talk - JPA and Query Optimization - publish | Gleydson Lima
Behind JPA and SQL Query Optimizations. Talk about PostgreSQL Indexes and Query Planner and Java Persistence API Performance Tips. Hibernate. Java. PostgreSQL. Spring Boot. Spring JPA.
Webinar | Building Apps with the Cassandra Python Driver | DataStax Academy
With the new Python driver for Cassandra it is easy to build integrations and apps that use Cassandra seamlessly as a back end. This session will explore what it takes to build the app and the features available with the new Python driver.
Mastering MongoDB Atlas: Essentials of Diagnostics and Debugging in the Cloud... | Mydbops
Diving deep into the essentials of MongoDB Atlas diagnostics and debugging helps you ensure optimal performance for your cloud-based databases. Join us as we explore key strategies and best practices for effective database management in the cloud environment. Get ready to elevate your MongoDB Atlas experience and unlock the full potential of your cloud databases.
OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by... | NETWAYS
Open source is at the heart of what we do at Grafana Labs and there is so much happening! The intent of this talk is to update everyone on the latest developments when it comes to Grafana, Pyroscope, Faro, Loki, Mimir, Tempo and more. Everyone has at least heard about Grafana, but maybe some of the other projects mentioned above are new to you? Welcome to this talk 😉 Besides the update on what is new, we will also quickly introduce them during this talk.
Similar to PostgreSQL 9.6 Performance-Scalability Improvements (20)
PGConf APAC 2018: Sponsored Talk by Fujitsu - The growing mandatory requireme... | PGConf APAC
Speaker: Rajni Baliyan
As the volume of personal data grows and the commodification of collected and analysed information increases, so does the focus on privacy and data security. Many countries are examining international and domestic laws in order to protect consumers and organisations alike.
The Australian Senate has recently passed a bill containing mandatory requirements to notify the privacy commissioner and consumers when data is at risk of causing serious harm in the case of a data breach occurring.
Europe has also announced new laws that allow consumers more control over their data. These laws allow consumers to tell companies to erase any data held about them.
These new laws will have a significant impact on organisations that store personal information.
This talk will examine some of these legislative changes and how specific PostgreSQL features can assist organisations in meeting their obligations and avoid heavy fines associated with breaching them.
While physical replication in PostgreSQL is quite robust, it doesn’t fit well in the picture when:
- You need partial replication only
- You want to replicate between different major versions of PostgreSQL
- You need to replicate multiple databases to the same target
- Transformation of the data is needed
- You want to replicate in order to upgrade without downtime
The answer to these use cases is logical replication.
This talk will discuss and cover these use cases followed by a logical replication demo.
PGConf APAC 2018 - A PostgreSQL DBAs Toolbelt for 2018 | PGConf APAC
There's no need to re-invent the wheel! Dozens of people have already tried...and succeeded. This talk is a categorized and illustrated overview of the most popular and/or useful PostgreSQL-specific scripts, utilities and whole toolsets that DBAs should be aware of for solving daily tasks, including performance monitoring, log management/analysis, and identifying/fixing the most common administration problems in the areas of general performance metrics, tuning, locking, indexing and bloat, leaving out high-availability topics. Covered are venerable oldies from wiki.postgresql.org as well as my newer favourites from GitHub.
Speaker: Alexander Kukushkin
Kubernetes is a solid leader among different cloud orchestration engines and its adoption rate is growing on a daily basis. Naturally people want to run both their applications and databases on the same infrastructure.
There are a lot of ways to deploy and run PostgreSQL on Kubernetes, but most of them are not cloud-native. Around one year ago Zalando started to run an HA setup of PostgreSQL on Kubernetes managed by Patroni. Those experiments were quite successful and produced a Helm chart for Patroni. That chart was useful, albeit with a single problem: Patroni depended on Etcd, ZooKeeper or Consul.
Few people look forward to deploying two applications instead of one and supporting them later on. In this talk I would like to introduce Kubernetes-native Patroni. I will explain how Patroni uses the Kubernetes API to run a leader election and store the cluster state. I’m going to live-demo a deployment of an HA PostgreSQL cluster on Minikube and share our own experience of running more than 130 clusters on Kubernetes.
Patroni is a Python open-source project developed by Zalando in cooperation with other contributors on GitHub: https://github.com/zalando/patroni
PGConf APAC 2018 - High performance json postgre-sql vs. mongodb | PGConf APAC
Speakers: Dominic Dwyer & Wei Shan Ang
This talk was presented at Percona Live Europe 2017. However, we did not have enough time to test against more scenarios. We will be giving an updated talk with more comprehensive tests and numbers. We hope to run it against citusDB and MongoRocks as well to provide a comprehensive comparison.
https://www.percona.com/live/e17/sessions/high-performance-json-postgresql-vs-mongodb
PGConf APAC 2018 - Monitoring PostgreSQL at Scale | PGConf APAC
Speaker: Lukas Fittl
Your PostgreSQL database is one of the most important pieces of your architecture, yet the level of introspection available in Postgres is often hard to work with. It's easy to get very detailed information, but what should you really watch out for, send reports on and alert on?
In this talk we'll discuss how query performance statistics can be made accessible to application developers, critical entries one should monitor in the PostgreSQL log files, how to collect EXPLAIN plans at scale, how to watch over autovacuum and VACUUM operations, and how to flag issues based on schema statistics.
We'll also talk a bit about monitoring multi-server setups, first going into high availability and read standbys, logical replication, and then reviewing what monitoring looks like for sharded databases like Citus.
The talk will primarily describe free/open-source tools and statistics views readily available from within Postgres.
PGConf APAC 2018 - Where's Waldo - Text Search and Pattern in PostgreSQL | PGConf APAC
Speaker: Joe Conway
There are many use cases for text search and pattern matching, and there are also a wide variety of techniques available in PostgreSQL to perform text search and pattern matching. Figuring out the best "match" between use case and technique can be confusing. This talk will review the possibilities and provide guidance regarding when to use what method, and especially how to properly deal with the related index methods to ensure speedy searches. This talk covers:
* The primary available search methods
* Examples illustrating when to use each
* Extensive discussion of index use
* Timing comparisons using realistic examples
PGConf APAC 2018 - Managing replication clusters with repmgr, Barman and PgBo... | PGConf APAC
Speaker: Ian Barwick
PostgreSQL and reliability go hand-in-hand - but your data is only truly safe with a solid and trusted backup system in place, and no matter how good your application is, it's useless if it can't talk to your database.
In this talk we'll demonstrate how to set up a reliable replication cluster using open source tools closely associated with the PostgreSQL project. The talk will cover the following areas:
- how to set up and manage a replication cluster with `repmgr`
- how to set up and manage reliable backups with `Barman`
- how to manage failover and application connections with `repmgr` and `PgBouncer`
Ian Barwick has worked for 2ndQuadrant since 2014, and as well as making various contributions to PostgreSQL itself, is lead `repmgr` developer. He lives in Tokyo, Japan.
PGConf APAC 2018 - PostgreSQL HA with Pgpool-II and whats been happening in P... | PGConf APAC
Speaker: Muhammad Usama
Pgpool-II has been around to complement PostgreSQL for over a decade and provides many features like connection pooling, failover, query caching, load balancing, and HA. High availability (HA) is critical to most enterprise applications: clients need the ability to automatically reconnect to a secondary node when the master node goes down.
This is where the Pgpool-II watchdog feature comes in: the watchdog is the core Pgpool-II feature that provides HA by eliminating the single point of failure (SPOF). The watchdog has been around for a while, but it went through a major overhaul and enhancements in recent releases. This talk aims to explain the watchdog feature and the recent enhancements that went into it, and to describe how it can be used to provide PostgreSQL HA and automatic failover.
There is a rising trend of enterprise deployments shifting to cloud-based environments, and Pgpool-II can be used in the cloud without any issues. In this talk we will give some ideas of how Pgpool-II is used to provide PostgreSQL HA in a cloud environment.
Finally, we will summarise the major features that have been added in the recent major release of Pgpool-II and what's in the pipeline for the next major release.
PGConf APAC 2018 - PostgreSQL performance comparison in various cloudsPGConf APAC
Speaker: Oskari Saarenmaa
Aiven PostgreSQL is available in five different public cloud providers' infrastructure in more than 60 regions around the world, including 18 in APAC. This has given us a unique opportunity to benchmark and compare performance of similar configurations in different environments.
We'll share our benchmark methods and results, comparing various PostgreSQL configurations and workloads across different clouds.
About a year ago I was caught up in line-of-fire when a production system started behaving abruptly
- A batch process which would finish in 15minutes started taking 1.5 hours
- We started facing OLTP read queries on standby being cancelled
- We faced a sudden slowness on the Primary server and we were forced to do a forceful switch to standby.
We were able to figure out that some peculiarities of the application code and batch process were responsible for this. But we could not fix the application code (as it is packaged application).
In this talk I would like to share more details of how we debugged, what was the problem we were facing and how we applied a work around for it. We also learnt that a query returning in 10minutes may not be as dangerous as a query returning in 10sec but executed 100s of times in an hour.
I will share in detail-
- How to map the process/top stats from OS with pg_stat_activity
- How to get and read explain plan
- How to judge if a query is costly
- What tools helped us
- A peculiar autovacuum/vacuum Vs Replication conflict we ran into
- Various parameters to tune autvacuum and auto-analyze process
- What we have done to work-around the problem
- What we have put in place for better monitoring and information gathering
This presentation was used by Blair during his talk on Aurora and PostgreSQl compatibility for Aurora at pgDay Asia 2017. The talk was part of dedicated PostgreSQL track at FOSSASIA 2017
PostgreSQL is one of the most loved databases and that is why AWS could not hold back from offering PostgreSQL as RDS. There are some really nice features in RDS which can be good for DBA and inspiring for Enterprises to build resilient solution with PostgreSQL.
2. Who am I?
Dilip Kumar
Currently working at EnterpriseDB.
I have developed various features on PostgreSQL (for internal projects) as well as on another in-house database at Huawei.
I have also contributed patches to the community.
I hold around 14 patents in various database technologies.
I presented a paper at PGCon 2015.
3. Contents
Journey From 9.1 to 9.5
What's In 9.6?
MVCC Scalability Improvement
Clog Scalability
Bulk Load Scalability
Parallel Query
Sorting Improvement
Hash Lock Improvement
Index Only Scan with Partial Index
Partial Sort
Cache the Snapshot
Buffer Header Spin Lock
5. What's In 9.6?
ProcArray Lock Contention Solved : Committed
Parallel Sequential Scan : Committed
Parallel NLJ and HJ : Committed
Clog Control Lock Issue : In Progress
Bulk Load Scalability : In Progress
Buffer Header Lock Issue : In Progress
Hash Header Lock Contention : In Progress
Sorting Improvement using Quick Sort : In Progress
Checkpoint Continuous Flushing : In Progress
Index Only Scan with Partial Index : In Progress
Caching the Snapshot : In Progress
Partial Sort : In Progress
6. MVCC Scalability Improvement
ProcArrayLock:
ProcArrayLock was the major contention point reported in 9.5, blocking read-write workloads from scaling beyond 30 cores in TPC-C.
With pgbench too, scalability was not linear beyond 30 cores.
[Chart: pgbench -M prepared, median of 30-minute runs, synchronous_commit = on. X axis: Clients (1, 8, 16, 32, 64, 128); Y axis: TPS (0 to 30000); series: Head.]
7. MVCC Scalability Improvement
Many solutions were tried to overcome this problem, such as CSN snapshots and incremental snapshots.
Finally, group clearing of XIDs in ProcArrayEndTransaction was committed for 9.6, and scaling is almost linear up to 64 clients on a 64-thread machine.
[Chart: pgbench -M prepared, median of 30-minute runs, synchronous_commit = on. X axis: Clients (1, 8, 16, 32, 64, 128); Y axis: TPS (0 to 35000); series: Head, Patch.]
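The group-clear idea can be illustrated with a small sketch. This is not the PostgreSQL implementation: the class `GroupClear`, its fields, and the use of Python threads are all assumptions made for illustration, and the short `list_guard` lock only emulates the atomic list push that a real implementation would use. The real code also tries to take the lock directly first, which this sketch omits. What it shows is the core pattern: backends queue their XID, and one leader takes the contended lock once and clears the whole batch.

```python
import threading


class GroupClear:
    """Toy model: one leader clears the XIDs of every queued backend."""

    def __init__(self):
        self.big_lock = threading.Lock()    # stands in for the contended ProcArrayLock
        self.list_guard = threading.Lock()  # emulates an atomic push onto the pending list
        self.pending = []                   # (xid, event) pairs waiting to be cleared
        self.active_xids = set()            # XIDs that still look "in progress"
        self.lock_acquisitions = 0          # how often the contended lock was taken

    def end_transaction(self, xid):
        done = threading.Event()
        with self.list_guard:
            # Whoever pushes onto an empty list becomes the leader.
            is_leader = not self.pending
            self.pending.append((xid, done))
        if not is_leader:
            done.wait()  # a leader will clear our XID and wake us up
            return
        with self.big_lock:
            self.lock_acquisitions += 1
            with self.list_guard:
                batch, self.pending = self.pending, []
            for member_xid, member_done in batch:
                self.active_xids.discard(member_xid)
                member_done.set()


# Eight hypothetical backends finish their transactions concurrently.
group = GroupClear()
group.active_xids = set(range(8))
workers = [
    threading.Thread(target=group.end_transaction, args=(xid,))
    for xid in range(8)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

After the run, every XID has been cleared, and `lock_acquisitions` is at most the number of backends; under contention it tends to be lower because followers never touch the contended lock.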
8. Clog Control Improvement
CLogControlLock:
After reducing ProcArrayLock contention, CLogControlLock became the next visible contention point.
The contention has two main causes. First, writing the transaction status to the CLOG acquires CLogControlLock in exclusive mode, which contends with every other transaction that accesses the CLOG to check transaction status. Second, when a CLOG page is not found in the CLOG buffers, CLogControlLock must be acquired in exclusive mode, which again contends with the shared lockers that access the transaction status.
The solution used for CLogControlLock is similar to the ProcArray group clear XID.
10. Bulk Load Scalability
Relation Extension Lock:
Currently the relation extension lock becomes a bottleneck when a relation is extended in parallel.
Both COPY and INSERT suffer from the same problem.
We have been working on this recently; various solutions have been tried and the work is still in progress (lock-free extension, extending in multiple blocks with a user knob, group-extending the relation, extending in a multiple of the number of lock waiters).
11. Bulk Load Scalability
[Chart: COPY of 10000 records (4 bytes each), data fits in shared buffers. X axis: Clients (1, 2, 4, 8, 16, 32, 64); Y axis: TPS (0 to 1200); series: Base, Patch.]
[Chart: INSERT of 1000 records (1K each), data does not fit in shared buffers. X axis: Clients (1, 2, 4, 8, 16, 32, 64); Y axis: TPS (0 to 300); series: Base, Patch.]
12. Parallel Query
Parallel query is a great win for 9.6: it allows a single query to run in parallel across multiple workers.
Parallel sequential scan and parallel join are already committed, and others, such as parallel index scan and parallel aggregate, are in progress.
Parallel sequential scan and parallel join have shown great improvement for a single query.
13. Parallel Query
[Chart: TPC-H queries Q3, Q4, Q5, Q7 and Q8 at scale factor 5. Y axis: time in ms (0 to 14000).]
14. Sorting Improvement
Using quicksort for every external sort:
This uses quicksort instead of replacement selection.
Quicksort is a cache-conscious algorithm, so the replacement improves performance significantly.
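As a rough illustration of the approach, an external sort can split the input into runs that fit in memory, sort each run in memory, and then merge the sorted runs. The sketch below is an assumption-laden toy, not the PostgreSQL code: `external_sort` and `run_size` are hypothetical names, `run_size` plays the role of work_mem, and Python's built-in `sorted` stands in for quicksort.

```python
import heapq


def external_sort(values, run_size):
    """Sort `values` by building fixed-size sorted runs and merging them."""
    # Each run is at most `run_size` items and is sorted entirely in memory.
    runs = [
        sorted(values[i:i + run_size])
        for i in range(0, len(values), run_size)
    ]
    # A k-way merge combines the sorted runs into the final order.
    return list(heapq.merge(*runs)), len(runs)


data = [5, 1, 4, 2, 3, 9, 0]
ordered, run_count = external_sort(data, run_size=3)
```

Replacement selection, by contrast, builds runs through a heap, which tends to access memory in a less cache-friendly pattern than sorting each run in one pass.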
15. Sorting Improvement
[Chart: time for using quicksort for all external sorts, work_mem = default (4MB). X axis: data size in GB (1.7, 3.5, 7, 14); Y axis: time in ms (0 to 400000); series: Head, Patch.]
16. Hash Header Lock Contention
Hash header mutex contention:
PostgreSQL's internal hash table is used for many key operations.
Heavyweight locks are managed in hash tables.
The buffer pool is managed by hash tables.
But whenever a hash table needs to allocate an element from the free list or release one to it, there is one common lock, and that becomes the main contention point.
As part of this work, this is converted to a partition-level freelist: each partition now has a separate freelist and a separate lock.
Freelist elements can be shared across partitions.
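The partitioned freelist can be sketched as follows. This is only a minimal model under stated assumptions: `PartitionedFreeList`, its methods, and the choice of mapping a hash code to a partition with a modulo are hypothetical, and Python locks stand in for the per-partition mutexes. It shows each partition having its own lock and freelist, with allocation borrowing from another partition when the local freelist is empty.

```python
import threading


class PartitionedFreeList:
    """Toy freelist split into partitions, each with its own lock."""

    def __init__(self, num_partitions, elements_per_partition):
        self.locks = [threading.Lock() for _ in range(num_partitions)]
        self.freelists = [
            list(range(p * elements_per_partition,
                       (p + 1) * elements_per_partition))
            for p in range(num_partitions)
        ]

    def alloc(self, hashcode):
        n = len(self.freelists)
        start = hashcode % n
        # Try the home partition first, then borrow from the others.
        for offset in range(n):
            p = (start + offset) % n
            with self.locks[p]:
                if self.freelists[p]:
                    return self.freelists[p].pop()
        return None  # every partition is exhausted

    def free(self, hashcode, element):
        p = hashcode % len(self.freelists)
        with self.locks[p]:
            self.freelists[p].append(element)


pool = PartitionedFreeList(num_partitions=4, elements_per_partition=2)
a = pool.alloc(0)  # served from partition 0
b = pool.alloc(0)  # served from partition 0, which is now empty
c = pool.alloc(0)  # borrowed from the next partition
pool.free(0, c)    # returned to partition 0, so elements migrate
```

Because callers hashing to different partitions take different locks, they no longer serialize on one common mutex.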
18. Index only scan with partial index
What is a partial index?
A partial index is an index built over a subset of a table; the subset is
defined by a conditional expression (called the predicate of the partial
index). The index contains entries for only those table rows that satisfy the
predicate.
A major motivation for partial indexes is to avoid indexing common values.
Since a query searching for a common value (one that accounts for more
than a few percent of all the table rows) will not use the index anyway, there
is no point in keeping those rows in the index at all. This reduces the size of
the index, which will speed up queries that do use the index.
19. Index only scan with partial index
Problem:
Currently, partial indexes end up not using index-only scans in most cases.
Unless you include all columns from the index predicate in the index, the planner will decide that an index-only scan is not possible.
Adding columns that are not needed at runtime only increases the index size.
Solution:
Select the index-only scan when the only keys missing from the index are ones that are not really required at runtime.
20. Partial Sort
In PostgreSQL, if all the sort keys are part of an index, then an index scan will be selected for sorting.
Otherwise, all tuples are scanned from the heap and sorted completely.
This patch mixes both methods:
get results from the index in an order that partially meets our requirements;
do the rest of the work from the heap.
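The mixed approach can be sketched in a few lines. This is an illustrative toy, not the patch itself: `partial_sort` is a hypothetical name, and the input list models rows that an index on the first column already returns in order. Only rows that share the same leading key need to be sorted on the second key.

```python
from itertools import groupby
from operator import itemgetter


def partial_sort(rows_sorted_by_a):
    """Yield (a, b) rows ordered by (a, b), given input already ordered by a."""
    for _, group in groupby(rows_sorted_by_a, key=itemgetter(0)):
        # Each group has one value of `a`; sort just that group on `b`.
        yield from sorted(group, key=itemgetter(1))


# Rows as an index on `a` might return them: ordered by a, unordered in b.
rows = [(1, 9), (1, 3), (2, 7), (2, 1), (2, 5), (3, 4)]
result = list(partial_sort(rows))
```

Each small group is sorted independently, so the first rows can be returned before the whole input has been read, and no single large sort is needed.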
21. Cache The Snapshot
With this improvement, whenever a backend takes a snapshot, it is stored in a shared cache.
Whenever any transaction commits, the saved snapshot is marked invalid.
Under a mixed read and write load, if there is no write transaction between multiple reads, the snapshot can be reused.
When a write transaction invalidates the snapshot, the first transaction taking a snapshot recalculates it and updates the cached snapshot.
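The caching rule above can be modelled with a small single-threaded sketch. Everything here is an assumption for illustration: `SnapshotCache` and its methods are hypothetical, a snapshot is reduced to a pair of the next XID and the set of running XIDs, and the shared-memory and locking aspects of a real implementation are left out.

```python
class SnapshotCache:
    """Toy model: reuse one computed snapshot until a commit invalidates it."""

    def __init__(self):
        self.next_xid = 1
        self.running = set()  # XIDs of in-progress write transactions
        self.cached = None    # last computed snapshot, or None if invalid
        self.recomputed = 0   # how many times a snapshot had to be rebuilt

    def begin_write(self):
        xid = self.next_xid
        self.next_xid += 1
        self.running.add(xid)
        return xid

    def commit(self, xid):
        self.running.discard(xid)
        self.cached = None    # a commit marks the saved snapshot invalid

    def get_snapshot(self):
        if self.cached is None:
            # First taker after an invalidation recomputes and stores it.
            self.recomputed += 1
            self.cached = (self.next_xid, frozenset(self.running))
        return self.cached


cache = SnapshotCache()
xid = cache.begin_write()
s1 = cache.get_snapshot()  # computed
s2 = cache.get_snapshot()  # reused: no commit happened in between
cache.commit(xid)
s3 = cache.get_snapshot()  # recomputed after the commit
```

In this model two consecutive reads share one computation, and only the commit forces a second one, which is the saving the slide describes for read-heavy mixed workloads.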
22. Cache The Snapshot
[Chart: pgbench, scale factor 300, shared_buffers = 8GB. X axis: Clients (64, 88, 128, 256); Y axis: TPS (0 to 40000); series: Base, Only CLOG changes, CLOG + save snapshot.]
23. Buffer Header lock Improvement
Convert pin/unpin spinlock to atomic operation:
The spinlock inside the buffer pin and unpin paths is also one of the bottlenecks when testing scalability on big NUMA machines.
With this patch, the spinlock is converted to an atomic operation, and increasing or decreasing the reference count is done using just one atomic operation.
Other operations, which need to do more work than just updating the state variable, take a lock, and that lock is itself implemented with an atomic operation.
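The shape of the change can be sketched as a compare-and-swap retry loop on a single state word. This is only an emulation under loud assumptions: Python has no hardware atomics, so the `AtomicWord` class below fakes compare-and-swap with an internal lock, and the bit layout (`REFCOUNT_MASK`, `LOCKED_BIT`) is hypothetical rather than taken from PostgreSQL. The point is the structure of `pin` and `unpin`: one successful atomic update changes the reference count, with no separate spinlock acquire and release around it.

```python
import threading

REFCOUNT_MASK = (1 << 18) - 1  # hypothetical: low bits hold the pin count
LOCKED_BIT = 1 << 31           # hypothetical: one bit marks the header as locked


class AtomicWord:
    """Stand-in for a hardware atomic integer; the lock only emulates CAS."""

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        with self._guard:
            return self._value

    def compare_and_swap(self, expected, new):
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False


def pin(state):
    """Increment the reference count with a single successful CAS."""
    while True:
        old = state.load()
        if old & LOCKED_BIT:
            continue  # a slower-path operation holds the header lock; retry
        if state.compare_and_swap(old, old + 1):
            return (old + 1) & REFCOUNT_MASK


def unpin(state):
    """Decrement the reference count with a single successful CAS."""
    while True:
        old = state.load()
        if state.compare_and_swap(old, old - 1):
            return (old - 1) & REFCOUNT_MASK


def pin_many(state, times):
    for _ in range(times):
        pin(state)


state = AtomicWord()
threads = [
    threading.Thread(target=pin_many, args=(state, 1000)) for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
pinned = state.load() & REFCOUNT_MASK
```

Operations that need more than a counter update would set `LOCKED_BIT` with the same compare-and-swap primitive, which is how the lock itself ends up being an atomic operation.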