The document discusses common database-related performance problems and their solutions. It covers issues such as lock contention, missing indexes, slow queries, and the N+1 query problem. It shows how to reduce lock contention using techniques such as the Hi/Lo ID-generation algorithm and asynchronous updates, and it also discusses database connection management and transaction isolation levels. Payment and URL-shortener systems are used as examples to illustrate strategies for improving database performance.
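As a rough illustration of the Hi/Lo idea mentioned above, here is a minimal sketch (hypothetical names; an in-memory counter stands in for the real database sequence): callers only hit the shared sequence once per block of IDs, which is how the algorithm reduces contention on the sequence.

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal Hi/Lo sketch: the database sequence is only hit once per block
// of `blockSize` ids, which reduces contention on the shared sequence.
// `fetchNextHi` is a stand-in for a real `SELECT nextval(...)` round trip.
class HiLoGenerator {
    private final int blockSize;
    private long hi = -1;      // current "hi" block, fetched from the "database"
    private int lo;            // position inside the current block
    private final AtomicLong sequence = new AtomicLong(); // simulated DB sequence

    HiLoGenerator(int blockSize) {
        this.blockSize = blockSize;
        this.lo = blockSize;   // force a fetch on the first call
    }

    private long fetchNextHi() {
        return sequence.getAndIncrement(); // imagine: one DB round trip
    }

    synchronized long nextId() {
        if (lo >= blockSize) {             // current block exhausted
            hi = fetchNextHi();
            lo = 0;
        }
        return hi * blockSize + lo++;      // blockSize ids per sequence fetch
    }
}
```

With a block size of 10, the first ten `nextId()` calls are served from a single sequence fetch.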
JDBC has been the de facto standard for accessing relational databases for a long time. Times are, however, changing. In cloud environments the pay-per-use model is popular: if you can use resources more efficiently, you can save money! In addition, when running applications at cloud scale, the number of concurrent requests that hit your services can skyrocket. Can JDBC handle such concurrency efficiently? No. The time has come to look beyond JDBC!
For services, reactive frameworks are becoming more popular. These frameworks can make more efficient use of resources due to their non-blocking nature, especially at high concurrency. Now, with R2DBC, relational databases can also be accessed using a reactive API! This means more efficient use of CPU and memory, and better response times and throughput at high concurrency.
A tempting story, but of course there are many questions:
- Is R2DBC mature enough to implement?
- Which R2DBC drivers are available?
- Is framework support available?
- What do you need to do in order to implement R2DBC?
- Does it improve performance enough to make the switch worthwhile?
- Do I need to have a completely non-blocking stack to benefit from using R2DBC?
To answer these questions and more, I've created several implementations using R2DBC and JDBC with Spring Web MVC and Spring WebFlux and put them to the test. I looked at how to implement R2DBC and measured resource usage, throughput, and response times. Interested in the results? Hint: R2DBC is pretty cool!
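The throughput argument above comes down to thread economics: a blocking JDBC call pins a thread for the whole database round trip, while a non-blocking driver frees it. A toy sketch using only the JDK, with CompletableFuture as a rough stand-in for an R2DBC publisher (this is not the actual R2DBC API):

```java
import java.util.List;
import java.util.concurrent.*;
import java.util.stream.*;

// Toy illustration of blocking vs non-blocking data access.
// `blockingQuery` pins the calling thread for the whole "query"; the async
// variant registers a callback and returns immediately, so a tiny thread
// pool can keep many in-flight queries going at once.
class BlockingVsAsync {
    static String blockingQuery(int id) throws InterruptedException {
        Thread.sleep(50);                 // stand-in for a slow DB round trip
        return "row-" + id;
    }

    static CompletableFuture<String> asyncQuery(int id, ScheduledExecutorService timer) {
        CompletableFuture<String> f = new CompletableFuture<>();
        // complete later without holding any caller thread in the meantime
        timer.schedule(() -> { f.complete("row-" + id); }, 50, TimeUnit.MILLISECONDS);
        return f;
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        // 100 concurrent "queries" complete in roughly one round-trip time,
        // even though only a single timer thread is involved.
        List<CompletableFuture<String>> futures =
            IntStream.range(0, 100).mapToObj(i -> asyncQuery(i, timer)).collect(Collectors.toList());
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        System.out.println(futures.get(0).join()); // prints "row-0"
        timer.shutdown();
    }
}
```

Doing the same with `blockingQuery` would need 100 threads to achieve the same latency, which is exactly the resource cost the reactive approach avoids.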
Training Webinar: Enterprise application performance with distributed caching (OutSystems)
2nd Session - Distributed Caching:
- What is Distributed Caching
- Performance hurdles solved by Distributed Caching
- When to use Distributed Caching
- Patterns to Populate a Distributed Cache
- How to use Distributed Caching in OutSystems
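One common pattern for populating a distributed cache is cache-aside (lazy loading). A minimal sketch, with a ConcurrentHashMap standing in for the distributed cache client and a hypothetical `loadFromDb` loader:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Cache-aside ("lazy loading") sketch: on a read, try the cache first and
// only go to the database on a miss, populating the cache on the way back.
// A ConcurrentHashMap stands in for a distributed cache client here.
class CacheAside {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> loadFromDb; // hypothetical DB loader

    CacheAside(Function<String, String> loadFromDb) {
        this.loadFromDb = loadFromDb;
    }

    String get(String key) {
        // 1. try the cache; 2. on a miss, load from the database and populate
        return cache.computeIfAbsent(key, loadFromDb);
    }

    void invalidate(String key) {
        cache.remove(key); // on writes, evict so the next read reloads fresh data
    }
}
```

Repeated reads of the same key hit the database only once until the entry is invalidated.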
Free Online training: https://www.outsystems.com/learn/courses/
Follow us on Twitter http://www.twitter.com/OutSystemsDev
Like us on Facebook http://www.Facebook.com/OutSystemsDev
End-to-End Reactive Data Access Using R2DBC with RSocket and Proteus (VMware Tanzu)
The lack of asynchronous relational database drivers in Java has been a barrier to writing scalable, data-driven applications for many. R2DBC seeks to change this with a new API designed from the ground up for reactive programming against relational databases; its intent is to support reactive data access built on natively asynchronous, non-blocking SQL database drivers.
How does this change the game for data access in the cloud? Used in conjunction with RSocket and Proteus, it is now possible to write applications benefiting from reactive streaming end-to-end, from the browser all the way to the database. No more fiddling with paging APIs, polling for updates, or writing complex logic to merge data from multiple sources: reactive streams can handle all of this for you!
RSocket is an open-source, reactive networking protocol that is a collaborative development initiative of Netifi with Pivotal, Facebook, and others. Proteus is a freely available broker for RSocket that is designed to handle the challenges of communication between complex networks of services—both within the data center and over the internet—extending to mobile devices and browsers.
Attend this webinar to learn how to use Pivotal Cloud Foundry with R2DBC and Proteus to build reactive microservices that return large amounts of data in a streaming fashion over RSocket.
Speakers: Ryland Degnan, co-founder and CTO of Netifi and Dan Baskette, Pivotal host
Lessons PostgreSQL learned from commercial databases, and didn’t (PGConf APAC)
These are the slides used by Illay for his presentation at pgDay Asia 2016, "Lessons PostgreSQL learned from commercial databases, and didn’t". The talk takes you through some of the things PostgreSQL has done really well, and some things that PostgreSQL can learn from other databases.
Grokking Techtalk #37: Data intensive problem (Grokking VN)
At some point in your software engineering career, you will have to deal with data, and your success depends on how much data your software can handle. Starting from a simple problem that requires processing a large amount of data, this talk presents how to approach this kind of issue and how to design and choose an efficient solution.
About speaker:
Hồ is a Senior Software Engineer at AXON, where he helps design and develop complex distributed systems, including image and video encoding and a distributed file-conversion system. Besides coding, Hồ likes to read manga and meet friends in his free time.
Comparing high availability solutions with percona xtradb cluster and percona... (Marco Tusa)
Percona XtraDB Cluster (PXC) is currently the most popular HA solution in the MySQL ecosystem, and Galera-based solutions such as PXC have been the only viable option when looking for a high grade of HA using synchronous replication.
But Oracle has worked intensively on making Group Replication more solid and easy to use.
It is time to identify if Group Replication and attached solutions, like InnoDB cluster, can compete or even replace solutions based on Galera.
This presentation will focus on comparing the two solutions and how they behave when serving basic HA problems.
Attendees will be able to get a clearer understanding of which solutions will serve them better, and in which cases.
Postgres Vision 2018: WAL: Everything You Want to Know (EDB)
The Write-Ahead Log (WAL) in PostgreSQL is a central feature of the database and it's relied upon to achieve critical functions, like backup, replication, and others. In this presentation delivered at Postgres Vision 2018, Devrim Gündüz, Principal Systems Engineer at EnterpriseDB, explains WAL and what the average database administrator needs to know.
Tungsten Connector / Proxy is truly the secret sauce for the Tungsten Clustering solution. Watch this webinar to learn how the Tungsten Connector enables zero-downtime MySQL maintenance via the manual switch operation, and gain an understanding of the various configuration options for doing local reads in remote composite clusters.
AGENDA
- Review the cluster architecture
- Understand the role of the Connector
- Describe Connector deployment best practices (app, dedicated with lb, db with lb)
- Explore zero-downtime MySQL maintenance using the manual role switch procedure
- Learn about Connector routing patterns inside a composite cluster
- Illustrate a manual site switch
- Explain read affinity and the vast performance improvement of local reads
- Examine Connector multi-cluster support
PostgreSQL Enterprise Class Features and Capabilities (PGConf APAC)
These are the slides used by Venkar from Fujitsu for his presentation at pgDay Asia 2016. He spoke about some of the Enterprise Class features of PostgreSQL database.
Ten query tuning techniques every SQL Server programmer should know (Kevin Kline)
From the noted database expert and author of 'SQL in a Nutshell': SELECT statements have a reputation for being very easy to write, but hard to write very well. This session will take you through ten of the most problematic patterns and anti-patterns when writing queries and how to deal with them all. Loaded with live demonstrations and useful techniques, this session will teach you how to take your SQL Server queries from mundane to masterful.
Tempto is a product test framework that allows developers to write and execute tests for SQL databases running on Hadoop. Individual test requirements, such as data generation, HDFS copy/storage of generated data, and schema creation, are expressed declaratively and are automatically fulfilled by the framework. Developers can write tests in Java (using a TestNG-like paradigm and AssertJ-style assertions) or by providing query files with expected results. We will show how we use it for Presto product tests.
Benchto is a benchmark framework that provides an easy and manageable way to define, run, and analyze macro benchmarks in a clustered environment. Understanding the behavior of distributed systems is hard and requires good visibility into the state of the cluster and the internals of the tested system. This project was developed for repeatable benchmarking of Hadoop SQL engines, most importantly Presto.
Grokking TechTalk 9 - Building a realtime & offline editing service from scra... (Grokking VN)
https://www.youtube.com/watch?v=_Wqy1B8PXD4&feature=youtu.be
Talk presented by Vu Nguyen, CTO @ Liti Book (Vietnamese)
Brief intro: In this talk, I would like to share how we built a system for LitiBook that can handle (1) real-time editing, (2) offline editing, (3) synchronizing between devices, and (4) conflicts between different editing sessions. There are not many applications out there that can do all of the above (Evernote does not resolve conflicts; Hackpad, Trello, and Asana do not support offline). So the challenge is really interesting.
About speaker: Vu Nguyen is a young and passionate engineer who founded Liti Book with his friend. Liti Book aims to develop the next generation of productivity tools, supporting more collaboration, more real-time editing, ...
www.grokking.org
ProxySQL - High Performance and HA Proxy for MySQL (René Cannaò)
A high-availability proxy designed to solve real issues of MySQL setups, from small to very large production environments.
Presentation at Percona Live Amsterdam 2015
Introduction to Prometheus Monitoring, Singapore Meetup (Arseny Chernov)
Presented at inaugural Singapore Prometheus Meetup, videos on https://www.meetup.com/Singapore-Prometheus-Meetup/events/240844291/
Links to original slides from various blogposts provided.
Zero Downtime Architectures based on the JEE platform. Almost every big enterprise with an online business tries to design its applications so that they are always online. But is that also the case when we upgrade the database cluster? When we switch the whole data center? Based on a customer project, we present common architecture principles that enable you to do all this without any service interruption and, most importantly, without any stress.
How to get the maximum performance from your AEP server. This talk will discuss ways to improve the execution time of short-running jobs and how to properly configure the server depending on the expected number of users as well as the average size and duration of individual jobs. Included will be examples of making use of job pooling, database connection sharing, and parallel subprotocol tuning. Determining when to make use of clustered, grid, or load-balanced configurations, along with memory and CPU sizing guidelines, will also be discussed.
Dive deep into specific OSS packages and examine the top issues in the enterprise with two of our most qualified OSS architects. Bill Crowell and Vince Cox walk through their day-to-day work in OSS packages, ways to fix reported issues, and why you can't expect in-house developers to handle issues in OSS packages.
The venerable Servlet Container still has some performance tricks up its sleeve - this talk will demonstrate Apache Tomcat's stability under high load, describe some do's (and some don'ts!), explain how to performance test a Servlet-based application, troubleshoot and tune the container and your application and compare the performance characteristics of the different Tomcat connectors. The presenters will share their combined experience supporting real Tomcat applications for over 20 years and show how a few small changes can make a big, big difference.
Web Component Development Using Servlet & JSP Technologies (EE6) - Chapter 1... (WebStackAcademy)
Servlet technology is used to create web applications. It uses the Java language, so web applications built with servlets are secure, scalable, and robust.
Web applications are helper applications that reside at the web server and build dynamic web pages. A dynamic page could be anything, like a page that randomly chooses a picture to display, or even a page that displays the current time.
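The "page that displays the current time" example can be sketched with the JDK's built-in HTTP server as a lightweight stand-in for a servlet container (a real servlet would put the same logic in HttpServlet#doGet):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.time.LocalTime;

// A dynamic "current time" page, served by the JDK's built-in HTTP server.
// The handler plays the role a servlet's doGet method would: it computes the
// response body on every request, which is what makes the page "dynamic".
class TimePage {
    static HttpServer start(int port) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/time", exchange -> {
            byte[] body = ("It is now " + LocalTime.now()).getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) { os.write(body); }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        start(8080); // then visit http://localhost:8080/time
    }
}
```

Every request recomputes the body, so two visits a second apart show different times, which is the defining property of a dynamic page.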
Training Webinar: Detect Performance Bottlenecks of Applications (OutSystems)
In this webinar we look at how to detect and troubleshoot server-side performance bottlenecks.
Free Online training: https://www.outsystems.com/learn/courses/
Follow us on Twitter http://www.twitter.com/OutSystemsDev
Like us on Facebook http://www.Facebook.com/OutSystemsDev
In order to obtain the best possible performance from your AEP server, the core architecture provides methods to reuse job processes multiple times. This talk will cover how the mechanism functions, what performance improvements you might expect, what potential problems you might encounter, how to use pooling in protocols and applications, and how administrators or package developers can configure and debug specialized job pools for their particular applications.
Migrate from Oracle to Aurora PostgreSQL: Best Practices, Design Patterns, & ... (Amazon Web Services)
In this session, we show you how to set up the source Oracle database environment, the target PostgreSQL environment, and the parameter group configuration. We also recommend database parameters to disable foreign keys and triggers. Finally, we discuss best practices for using AWS Database Migration Service (AWS DMS) and the AWS Schema Conversion Tool (AWS SCT), and show you how to choose the instance type and configure AWS DMS.
Introduction to the Gatling performance-testing tool and how we used it to test Zonky's REST API. Includes an example of running distributed performance tests in AWS Fargate with real-time monitoring using a Logstash/Elasticsearch/Kibana stack.
Slow things down to make them go faster [FOSDEM 2022] (Jimmy Angelakos)
Talk from FOSDEM 2022
It's easy to get misled into overconfidence based on the performance of powerful servers, given today's monster core counts and RAM sizes. However, the reality of high concurrency usage is often disappointing, with less throughput than one would expect. Because of its internals and its multi-process architecture, PostgreSQL is very particular about how it likes to deal with high concurrency and in some cases it can slow down to the point where it looks like it's not performing as it should. In this talk we'll take a look at potential pitfalls when you throw a lot of work at your database. Specifically, very high concurrency and resource contention can cause problems with lock waits in Postgres. Very high transaction rates can also cause problems of a different nature. Finally, we will be looking at ways to mitigate these by examining our queries and connection parameters, leveraging connection pooling and replication, or adapting the workload.
Topics:
1. Understand what we mean by high concurrency.
2. Understand ACID & MVCC in Postgres.
3. Understand how high concurrency affects Postgres performance.
4. Understand how locks/latches affect Postgres performance.
5. Understand how high transaction rates can affect Postgres.
6. Mitigation strategies for high concurrency scenarios.
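One mitigation from the list above, connection pooling/limiting, can be sketched in plain Java: cap the concurrency the database sees and queue excess work on the application side instead. This is a hypothetical stand-in with no real Postgres involved; production deployments would use a pooler such as PgBouncer or HikariCP.

```java
import java.util.concurrent.Semaphore;

// Minimal connection-limiting sketch: instead of letting N clients open
// N Postgres connections (and fight over locks, latches, and CPU), a
// semaphore caps concurrency at `maxConnections` and queues the rest
// application-side, which keeps the database in its efficient range.
class BoundedDbAccess {
    private final Semaphore permits;

    BoundedDbAccess(int maxConnections) {
        this.permits = new Semaphore(maxConnections, true); // fair FIFO queueing
    }

    <T> T withConnection(java.util.function.Supplier<T> query) throws InterruptedException {
        permits.acquire();            // wait for a free "connection"
        try {
            return query.get();       // stand-in for the actual SQL round trip
        } finally {
            permits.release();        // always return the permit to the pool
        }
    }
}
```

The counterintuitive effect the talk title alludes to: making some clients wait briefly for a permit often yields higher total throughput than letting everyone hit the database at once.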
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase (Michael Stack)
Pradeep S, Mallikarjun V of Flipkart
Track 1: Internals
https://open.mi.com/conference/hbasecon-asia-2019
THE COMMUNITY EVENT FOR APACHE HBASE™
July 20th, 2019 - Sheraton Hotel, Beijing, China
https://hbase.apache.org/hbaseconasia-2019/
Grails has great performance characteristics but as with all full stack frameworks, attention must be paid to optimize performance. In this talk Lari will discuss common missteps that can easily be avoided and share tips and tricks which help profile and tune Grails applications.
PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story (vanphp)
Bill Monkman, Lead Engineer at Hootsuite, presenting on how Hootsuite went from zero to hundreds of millions of requests per day with its PHP codebase, and how dealing with that growth has shaped its future direction. Tips, optimizations, and horror stories from a rapidly-scaling PHP startup.
Video: https://www.youtube.com/watch?v=TZGeBAIMPII
Life In The FastLane: Full Speed XPages (Ulrich Krause)
Using XPages out of the box lets you build good-looking and well-performing applications. However, as XPages applications become bigger and more complex, performance can become an issue, and when it comes to scalability and speed optimization, there are a couple of things to take into consideration.
Learn how to use partial refresh and partial execution mode, and how to monitor their execution using a JSF LifeCycle monitor to avoid multiple re-calculations of controls. We will show tools that let you profile your code, readily available from OpenNTF, along with a demonstration of how to use them to improve the speed of your code.
Still writing SSJS and encountering a significant slowdown when using script libraries? See how you can improve the speed of your application by using Java instead of JS, JSON, and even @Formulas.
GraphSummit Paris - The art of the possible with Graph Technology (Neo4j)
Sudhir Hasbe, Chief Product Officer, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Enhancing Research Orchestration Capabilities at ORNL (Globus)
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
How Recreation Management Software Can Streamline Your Operations (wottaspaceseo)
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
OpenMetadata Community Meeting - 5th June 2024 (OpenMetadata)
The OpenMetadata Community Meeting was held on June 5th, 2024. In this meeting, we discussed the data quality capabilities that are integrated with the Incident Manager, providing a complete solution for your data observability needs. Watch the end-to-end demo of the data quality features.
* How to run your own data quality framework
* What is the performance impact of running data quality frameworks
* How to run the test cases in your own ETL pipelines
* How the Incident Manager is integrated
* Get notified with alerts when test cases fail
Watch the meeting recording here - https://www.youtube.com/watch?v=UbNOje0kf6E
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet they often turn into annoying tasks riddled with frustration, hostility, unclear feedback, and a lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll have the knowledge on how to organize and improve your code review proces
Understanding Globus Data Transfers with NetSageGlobus
NetSage is an open privacy-aware network measurement, analysis, and visualization service designed to help end-users visualize and reason about large data transfers. NetSage traditionally has used a combination of passive measurements, including SNMP and flow data, as well as active measurements, mainly perfSONAR, to provide longitudinal network performance data visualization. It has been deployed by dozens of networks world wide, and is supported domestically by the Engagement and Performance Operations Center (EPOC), NSF #2328479. We have recently expanded the NetSage data sources to include logs for Globus data transfers, following the same privacy-preserving approach as for Flow data. Using the logs for the Texas Advanced Computing Center (TACC) as an example, this talk will walk through several different example use cases that NetSage can answer, including: Who is using Globus to share data with my institution, and what kind of performance are they able to achieve? How many transfers has Globus supported for us? Which sites are we sharing the most data with, and how is that changing over time? How is my site using Globus to move data internally, and what kind of performance do we see for those transfers? What percentage of data transfers at my institution used Globus, and how did the overall data transfer performance compare to the Globus users?
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of the work the team is investigating ways to speedup the time to solution for many different parts of the DIII-D workflow including how they run jobs on HPC systems. One of these routes is looking at Globus Compute as a way to replace the current method for managing tasks and we describe a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
How to Position Your Globus Data Portal for Success Ten Good PracticesGlobus
Science gateways allow science and engineering communities to access shared data, software, computing services, and instruments. Science gateways have gained a lot of traction in the last twenty years, as evidenced by projects such as the Science Gateways Community Institute (SGCI) and the Center of Excellence on Science Gateways (SGX3) in the US, The Australian Research Data Commons (ARDC) and its platforms in Australia, and the projects around Virtual Research Environments in Europe. A few mature frameworks have evolved with their different strengths and foci and have been taken up by a larger community such as the Globus Data Portal, Hubzero, Tapis, and Galaxy. However, even when gateways are built on successful frameworks, they continue to face the challenges of ongoing maintenance costs and how to meet the ever-expanding needs of the community they serve with enhanced features. It is not uncommon that gateways with compelling use cases are nonetheless unable to get past the prototype phase and become a full production service, or if they do, they don't survive more than a couple of years. While there is no guaranteed pathway to success, it seems likely that for any gateway there is a need for a strong community and/or solid funding streams to create and sustain its success. With over twenty years of examples to draw from, this presentation goes into detail for ten factors common to successful and enduring gateways that effectively serve as best practices for any new or developing gateway.
May Marketo Masterclass, London MUG May 22 2024.pdfAdele Miller
Can't make Adobe Summit in Vegas? No sweat because the EMEA Marketo Engage Champions are coming to London to share their Summit sessions, insights and more!
This is a MUG with a twist you don't want to miss.
Developing Distributed High-performance Computing Capabilities of an Open Sci...Globus
COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and its broad response from the scientific community has forged new relationships among public health practitioners, mathematical modelers, and scientific computing specialists, while revealing critical gaps in exploiting advanced computing systems to support urgent decision making. Informed by our team’s work in applying high-performance computing in support of public health decision makers during the COVID-19 pandemic, we present how Globus technologies are enabling the development of an open science platform for robust epidemic analysis, with the goal of collaborative, secure, distributed, on-demand, and fast time-to-solution analyses to support public health.
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeAftab Hussain
Understanding variable roles in code has been found to be helpful by students
in learning programming -- could variable roles help deep neural models in
performing coding tasks? We do an exploratory study.
- These are slides of the talk given at InteNSE'23: The 1st International Workshop on Interpretability and Robustness in Neural Software Engineering, co-located with the 45th International Conference on Software Engineering, ICSE 2023, Melbourne Australia
Top 7 Unique WhatsApp API Benefits | Saudi ArabiaYara Milbes
Discover the transformative power of the WhatsApp API in our latest SlideShare presentation, "Top 7 Unique WhatsApp API Benefits." In today's fast-paced digital era, effective communication is crucial for both personal and professional success. Whether you're a small business looking to enhance customer interactions or an individual seeking seamless communication with loved ones, the WhatsApp API offers robust capabilities that can significantly elevate your experience.
In this presentation, we delve into the top 7 distinctive benefits of the WhatsApp API, provided by the leading WhatsApp API service provider in Saudi Arabia. Learn how to streamline customer support, automate notifications, leverage rich media messaging, run scalable marketing campaigns, integrate secure payments, synchronize with CRM systems, and ensure enhanced security and privacy.
Graspan: A Big Data System for Big Code AnalysisAftab Hussain
We built a disk-based parallel graph system, Graspan, that uses a novel edge-pair centric computation model to compute dynamic transitive closures on very large program graphs.
We implement context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases such as Linux shows that their Graspan implementations scale to millions of lines of code and are much simpler than their original implementations.
These analyses were used to augment the existing checkers; these augmented checkers found 132 new NULL pointer bugs and 1308 unnecessary NULL tests in Linux 4.4.0-rc5, PostgreSQL 8.3.9, and Apache httpd 2.2.18.
- Accepted in ASPLOS ‘17, Xi’an, China.
- Featured in the tutorial, Systemized Program Analyses: A Big Data Perspective on Static Analysis Scalability, ASPLOS ‘17.
- Invited for presentation at SoCal PLS ‘16.
- Invited for poster presentation at PLDI SRC ‘16.
2. www.luxoft.com
Application Performance: Database-related Problems
● Application performance;
● Common performance problems and their solutions;
● Database-related problems;
● Lock contention;
● Locking mechanism;
● Transaction isolation level;
● URL shortener example;
● Hi/Lo algorithms;
● Payment system example.
3.
Application Performance
● Key performance metrics:
- Request processing time;
- Throughput;
● Poor performance:
- Long time to process single requests;
- Low number of requests processed per second.
10.
Query Execution Time Is Too Long
● Missing indexes;
● Slow SQL queries (subqueries, too many JOINs, etc.);
● Slow SQL queries generated by ORM;
● Suboptimal JDBC fetch size;
● Lack of proper data caching;
● Lock contention.
11.
Missing Indexes
To find out which indexes to create, look at the query execution plan:
EXPLAIN PLAN FOR
SELECT isbn FROM book;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY());
12.
TABLE ACCESS FULL
● In a full table scan, every row of the table is read sequentially and
the columns are checked against the query condition;
● In most cases, a full table scan is the slowest way to read a table;
● Create the missing indexes so the database can search by index
instead of performing a full table scan.
13.
Slow SQL Queries
● Slow SQL queries (subqueries, too many JOINs, etc.):
Solution: Rewrite the query.
● Slow SQL queries generated by ORM:
- JPQL/HQL and Criteria API queries are translated to SQL;
Solutions:
- Rewrite the JPQL/HQL or Criteria API queries;
- Replace them with plain SQL queries.
14.
Suboptimal JDBC Fetch Size
JDBC allows you to specify the number of rows fetched in each
database round-trip for a query; this number is referred to as the
fetch size.
Solutions:
● java.sql.Statement.setFetchSize(rows)
● hibernate.jdbc.fetch_size property
16.
Lock Contention
Operations wait a long time to acquire locks because of high lock
contention.
Solution:
Revise the application logic and implementation:
● Update asynchronously;
● Replace updates with inserts (inserts do not block).
17.
Too Many Queries per Single Business Function
● Insert/update queries executed in a loop;
● The "SELECT N+1" problem;
● Solution: Reduce the number of calls hitting the database.
18.
Insert/Update Queries Executed in a Loop
● Use JDBC batch (keep batch size less than 1000);
● hibernate.jdbc.batch_size property;
● Periodically flush changes and clear Session/EntityManager
to control first-level cache size.
19.
JDBC Batch Processing
// JDBC batch: queue several parameter sets, then execute them in one round-trip
PreparedStatement preparedStatement = connection.prepareStatement(
        "UPDATE book SET title=? WHERE isbn=?");
preparedStatement.setString(1, "Patterns of Enterprise Application Architecture");
preparedStatement.setString(2, "007-6092019909");
preparedStatement.addBatch();
preparedStatement.setString(1, "Enterprise Integration Patterns");
preparedStatement.setString(2, "978-0321200686");
preparedStatement.addBatch();
int[] affectedRecords = preparedStatement.executeBatch();

// Hibernate: flush and clear periodically to keep the first-level cache small
for (int i = 1; i <= 100000; i++) {
    Book book = new Book(.....);
    session.save(book);
    if (i % 20 == 0) { // 20, same as the JDBC batch size
        // flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}
20.
"SELECT N+1" Problem
● The first query selects the root entities only; each associated
collection is then loaded with an additional query.
● So the persistence provider generates N+1 SQL queries, where N is
the number of root entities in the result list of the user's query.
21.
"SELECT N+1" Problem
Solutions:
● Use a different fetching strategy or an entity graph;
● Make child entities aggregate roots and use DAO methods to
fetch them:
- Replace bidirectional one-to-many mappings with unidirectional ones;
● Enable the second-level and query caches.
23.
Database Connection Management Problems
● The application uses too many DB connections:
- The application does not close connections after use.
Solution: Close every connection after use.
- The DB cannot handle as many connections as the application opens.
Solution: Use connection pooling.
● The application waits too long to get a connection from the pool.
Solution: Increase the pool size.
24.
JVM Performance Problems
Excessive JVM garbage collection slows the application down.
Solutions:
● Analyze garbage collector logs:
- Send GC data to a log file, enable GC log rotation:
-Xloggc:gc.log -XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=1M
-XX:+PrintGCTimeStamps
● Tune GC:
- Use Garbage-First Collector: -XX:+UseG1GC
25.
Application Specific Performance Problems
Resource-consuming computations:
● Algorithms with complexity O(N^2) or O(2^N);
● Asymmetric RSA encryption;
● Bcrypt hashing during authentication;
● Etc.
Solution: Horizontal scalability. Increase the number of instances
capable of processing requests and balance the load (create a cluster).
26.
Network-related Problems
● Network latency;
● Timeouts not configured:
- mail.smtp.connectiontimeout - socket connection timeout; the
default is an infinite timeout.
- mail.smtp.timeout - socket read timeout; the default is an infinite timeout.
27.
Reducing Lock Contention
● Database-related problems
- Query execution time is too long
• Lock contention
Solutions:
● Use the Hi/Lo algorithm;
● Update asynchronously;
● Replace updates with inserts.
28.
Locking Mechanism
Locks are mechanisms that prevent destructive interaction
between transactions accessing the same resource.
In general, multi-user databases use some form of data locking to
solve the problems associated with:
● data concurrency,
● consistency,
● integrity.
29.
Isolation Levels vs Locks
● Transaction isolation level does not affect the locks that are
acquired to protect data modifications.
● A transaction always gets an exclusive lock on any data it
modifies and holds that lock until the transaction completes,
regardless of the isolation level set for that transaction.
● For read operations transaction isolation levels primarily define
the level of protection from the effects of modifications made
by other transactions.
30.
Preventable Read Phenomena
● Dirty reads - A transaction reads data that has been written by
another transaction that has not been committed yet.
● Nonrepeatable reads - A transaction rereads data it has
previously read and finds that another committed transaction
has modified or deleted the data.
● Phantom reads - A transaction reruns a query returning a set
of rows that satisfies a search condition and finds that another
committed transaction has inserted additional rows that satisfy
the condition.
32.
Isolation Levels vs Read Phenomena
                  Dirty reads   Nonrepeatable reads   Phantom reads
Read uncommitted  Possible      Possible              Possible
Read committed    Not possible  Possible              Possible
Repeatable read   Not possible  Not possible          Possible
Serializable      Not possible  Not possible          Not possible
36.
URL Shortener Example
Requirements:
● Receives a URL and returns a "shortened" version;
● E.g. post "http://github.com" to "http://url-shortener/s/" and get
back "http://url-shortener/s/2Bi";
● The shortened URL can be resolved to the original URL. E.g.
"http://url-shortener/s/2Bi" will return "http://github.com";
● Shortened URLs that have not been accessed for longer than some
specified amount of time should be deleted.
37.
URL Shortener Example
● Each time a URL is submitted, a new record is inserted into the
database;
● Insert operations do not introduce locks in the database;
● A database sequence is used for primary key generation;
● The Hi/Lo algorithm reduces the number of database hits and
improves performance.
38.
URL Shortener Example
● The original URL's primary key is converted to radix 62:
- The radix-62 alphabet contains digits and lower- and upper-case
letters: 10000 in radix 10 = 2Bi in radix 62;
● The string identifying the original URL is converted back to radix 10
to get the primary key value, and the original URL can be found by ID.
39.
URL Shortener Example
E.g. the URL "http://github.com/" shortened to "http://url-shortener/s/2Bi":
● Insert a new record with id 10000 for the original URL
"http://github.com/", representing the "shortened" URL;
● Convert id 10000 to radix 62: 2Bi.
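The conversion above can be sketched in plain Java. The alphabet order (digits, then lower-case, then upper-case letters) is an assumption chosen so that id 10000 maps to "2Bi" as in the example; the class and method names are illustrative.

```java
// Radix-62 conversion between primary keys and short-URL path segments.
// Alphabet order is an assumption consistent with the 10000 -> "2Bi" example.
class Base62 {
    private static final String ALPHABET =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Convert a primary key to its radix-62 string (the short-URL path segment).
    static String encode(long id) {
        if (id == 0) return "0";
        StringBuilder sb = new StringBuilder();
        while (id > 0) {
            sb.append(ALPHABET.charAt((int) (id % 62)));
            id /= 62;
        }
        return sb.reverse().toString();
    }

    // Convert a radix-62 string back to the primary key used for the DB lookup.
    static long decode(String s) {
        long id = 0;
        for (char c : s.toCharArray()) {
            id = id * 62 + ALPHABET.indexOf(c);
        }
        return id;
    }
}
```

With this alphabet, `encode(10000)` yields "2Bi" and `decode("2Bi")` yields 10000, matching the worked example.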
40.
URL Shortener Example
● Each time a shortened URL is resolved, the last-view timestamp is
updated in the database and the total-views column is incremented;
● These updates should be asynchronous so that lock contention does
not reduce performance;
● The absence of update operations on the resolve path gives the
application better scalability and throughput.
41.
Update Asynchronously
● When a URL is resolved, a JMS message is sent to a queue;
● The application consumes messages from the queue and updates
the records in the database;
● During URL resolving there are no update operations.
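The queue-based pattern above can be sketched in-process. Here a `BlockingQueue` stands in for the JMS queue and a `ConcurrentHashMap` for the database table; all names are illustrative, and a real deployment would use a JMS listener issuing SQL updates instead.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// In-process sketch of asynchronous view-count updates:
// resolving enqueues an event; a consumer applies updates later.
class AsyncViewCounter {
    private final BlockingQueue<Long> viewEvents = new LinkedBlockingQueue<>();
    private final Map<Long, Long> viewCounts = new ConcurrentHashMap<>();

    // Resolving a URL only enqueues an event -- no database update, no lock.
    void onUrlResolved(long urlId) {
        viewEvents.add(urlId);
    }

    // Background consumer: drains the queue and applies the updates
    // (in the real system this is the JMS listener updating the DB).
    void drain() {
        Long id;
        while ((id = viewEvents.poll()) != null) {
            viewCounts.merge(id, 1L, Long::sum);
        }
    }

    long viewsOf(long urlId) {
        return viewCounts.getOrDefault(urlId, 0L);
    }
}
```

The key property is that `onUrlResolved` never touches the "table", so the hot resolve path takes no database locks.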
43.
Hi/Lo Algorithms
● JPA mapping:
@SequenceGenerator(name = "MY_SEQ", sequenceName = "MY_SEQ",
allocationSize = 50)
allocationSize = N - fetch the next value from the database once every
N persist calls and locally (in-memory) increment the value in between.
● Sequence DDL:
CREATE SEQUENCE MY_SEQ INCREMENT BY 50 START WITH 50;
INCREMENT BY should match allocationSize;
START WITH should be greater than or equal to allocationSize.
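The allocation behavior described above can be sketched as a small in-memory generator. This is a simplification -- real JPA providers implement several hi/lo variants -- and the class name and the counter simulating the sequence are assumptions.

```java
// Simplified in-memory sketch of Hi/Lo allocation with allocationSize = 50:
// one simulated sequence fetch buys a whole block of ids.
class HiLoGenerator {
    private final int allocationSize;
    private long sequenceValue = 0;   // "sequence" starting at 0, stepping by allocationSize
    private long nextId = 1;
    private long hi = 0;              // upper bound of the current id block
    private int sequenceFetches = 0;  // simulated database round-trips

    HiLoGenerator(int allocationSize) {
        this.allocationSize = allocationSize;
    }

    // Simulated database round-trip (SELECT MY_SEQ.NEXTVAL in the real system).
    private long nextSequenceValue() {
        sequenceFetches++;
        sequenceValue += allocationSize;
        return sequenceValue;
    }

    long nextId() {
        if (nextId > hi) {             // current block exhausted
            hi = nextSequenceValue();  // one DB hit buys allocationSize ids
            nextId = hi - allocationSize + 1;
        }
        return nextId++;
    }

    int sequenceFetches() { return sequenceFetches; }
}
```

In this sketch, generating 100 ids costs only two simulated sequence fetches, which is the whole point of the allocation.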
44.
Payment System Example
Requirements:
● Users can add funds to their accounts (add funds);
● Users can pay shops with funds from their accounts (payment);
● Users and shops can withdraw money from their accounts
(withdraw funds);
● The account balance must always be up to date.
47.
Simple solution 1 - Queries
-- Add funds / payment / withdrawal: adjust the stored balance in place
UPDATE ACCOUNT_BALANCE SET
BALANCE = BALANCE + :amount
WHERE ACCOUNT_ID = :account

-- Read the current balance
SELECT ACCOUNT_ID,
BALANCE
FROM ACCOUNT_BALANCE
WHERE ACCOUNT_ID = :account
48.
Simple solution 1 - Problems
● Update operations introduce locks;
● During the Christmas holidays users can make hundreds of
payments simultaneously;
● Due to lock contention, payments will be slow;
● The system has low throughput.
49.
Simple solution 2
● Do not store account balance at all;
● Store details of each transaction;
● Calculate balance dynamically based on transaction log;
● Advantages:
- Still simple enough;
- No update operations at all.
51.
Simple solution 2 - Queries
INSERT INTO TRANSACTION_LOG(TX_ID, TX_TYPE, TX_DATE,
ACCOUNT_ID, TX_AMOUNT)
VALUES(:id, :type, :date, :account, :amount)

SELECT ACCOUNT_ID,
SUM(TX_AMOUNT) AS BALANCE
FROM TRANSACTION_LOG
WHERE ACCOUNT_ID = :account
GROUP BY ACCOUNT_ID
52.
Simple solution 2 - Problems
● Users can make thousands of transactions per day;
● During the Christmas holidays users can make thousands of
payments per hour;
● The number of transactions grows continuously;
● The more records in the TRANSACTION_LOG table, the slower the
balance query becomes.
53.
Better solution
● Store yesterday's balance in a table;
● Update the account balance once a day in the background;
● Store the details of each transaction;
● Calculate the balance dynamically from yesterday's stored balance
plus today's transactions from the transaction log.
54.
Better solution - Data model
Table ACCOUNT_BALANCE:
ACCOUNT_ID | BALANCE_DATE | BALANCE

Table TRANSACTION_LOG:
TX_ID | TX_TYPE | TX_DATE | ACCOUNT_ID | TX_AMOUNT
55.
Better solution - Queries
INSERT INTO TRANSACTION_LOG(TX_ID, TX_TYPE, TX_DATE,
ACCOUNT_ID, TX_AMOUNT)
VALUES(:id, :type, :date, :account, :amount)
-- Executed once a day at midnight
UPDATE ACCOUNT_BALANCE SET
BALANCE = BALANCE + :transactionLogSum,
BALANCE_DATE = :lastTransactionLogDate
WHERE ACCOUNT_ID = :account
56.
Better solution - Queries
SELECT ACCOUNT_ID,
BALANCE_DATE,
BALANCE AS CACHED_BALANCE
FROM ACCOUNT_BALANCE
WHERE ACCOUNT_ID = :account

SELECT ACCOUNT_ID,
MAX(TX_DATE) AS LAST_TX_LOG_DATE,
SUM(TX_AMOUNT) AS TX_LOG_SUM
FROM TRANSACTION_LOG
WHERE ACCOUNT_ID = :account
AND TX_DATE > :balanceDate
GROUP BY ACCOUNT_ID
-- BALANCE = CACHED_BALANCE + TX_LOG_SUM
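The final comment line can be illustrated with a small Java sketch: the current balance is the cached balance as of BALANCE_DATE plus the sum of transactions dated strictly after it. The record and method names are illustrative, not from the slides.

```java
import java.time.LocalDate;
import java.util.List;

// Sketch of the balance computation above: cached balance as of BALANCE_DATE
// plus the sum of transaction amounts with TX_DATE > :balanceDate.
class BalanceCalculator {
    // Stands in for a TRANSACTION_LOG row; amounts in cents to avoid floating point.
    record Transaction(LocalDate txDate, long amountCents) {}

    static long currentBalance(long cachedBalanceCents,
                               LocalDate balanceDate,
                               List<Transaction> transactionLog) {
        long txLogSum = transactionLog.stream()
                .filter(tx -> tx.txDate().isAfter(balanceDate)) // TX_DATE > :balanceDate
                .mapToLong(Transaction::amountCents)
                .sum();
        return cachedBalanceCents + txLogSum; // CACHED_BALANCE + TX_LOG_SUM
    }
}
```

Because only transactions newer than the cached date are summed, the work per query stays bounded no matter how large the log grows.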
57.
Better solution - Advantages
● No updates during payment operations - no locks;
● No locks - better throughput;
● The number of rows summed per query is bounded (at most one
day of transactions);
● Near-constant query execution time.