The document provides an overview of various OLTP performance benchmarks including pgbench, sysbench, dbt2, BenchmarkSQL, and DVDStore. It discusses the benchmarks' workloads and metrics. It also outlines the development of new benchmarks like TPC-V and how ensuring they run well on PostgreSQL would benefit the open source database.
2. About Me
§ Currently Product Manager of vFabric Postgres
§ Performance Engineer, vFabric Data Director
§ Previously with Sun Microsystems (2000-2010)
§ Team member that delivered the first published mainstream benchmark with PostgreSQL
§ Blog at: http://jkshah.blogspot.com
2 2012 Copyright VMware Inc
4. Introduction
§ Why do we need benchmarks?
• Reference data points
• Stress Test for “Too Big to Fail” scenarios
§ Uses of Benchmark
• Improve Product Quality
• Understand code path usage
• Performance Characteristics
• Baseline metrics (Reference points)
• Release to release
• Against other technologies performing the same business operation
§ Abuses of Benchmarks
• Benchmarketing
• Fixating only on results that are favorable
6. Pgbench
§ Based on TPC-B workload (circa 1990)
§ Not an OLTP but stress benchmark for database
§ Ratio is 1 Branch : 10 Tellers : 100,000 Accounts (per scale factor)
§ Tables: branches, tellers, accounts, history
§ Default mode is TPC-B sort-of
• Account transactions also impact teller and branch balances
• Branch table becomes the biggest bottleneck
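The 1 : 10 : 100,000 ratio translates directly into row counts per scale factor. A minimal sketch (table names as created by recent pgbench versions, which prefix them with `pgbench_`):

```python
# Row counts that `pgbench -i -s SCALE` creates, following the
# 1 branch : 10 tellers : 100,000 accounts ratio per scale unit.
def pgbench_rows(scale):
    return {
        "pgbench_branches": scale,
        "pgbench_tellers": 10 * scale,
        "pgbench_accounts": 100_000 * scale,
    }

print(pgbench_rows(100))
```

At scale factor 100 this means ten million account rows but only one hundred branch rows, which is why the branch table becomes the contention point in the default mode.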
7. Pgbench
§ Hints
• PGSSLMODE=disable (unless you want to measure SSL communication overhead; the default depends on your distribution)
• -M prepared (unless you want to measure overhead of parsing)
§ Various modes of benchmark
§ Default TPC-B sort-of
• Account transactions also impact teller and branch balances
• Branch table becomes the biggest bottleneck
§ -N Simple Update (with select, insert)
• Account update, balance select, history insert
• Account table update becomes the biggest bottleneck
§ -S read only test
• AccessShareLock on the Accounts table and its primary index becomes the bottleneck
• Fixed in 9.2 (Thanks Robert Haas)
8. PGBench Select Test
[Chart: PGBench select-only test, transactions per second (0 to 80,000) vs. number of clients (0 to 120), comparing PG 9.1 and PG 9.2.]
9. PGBench TPC-B Like Test
[Chart: PGBench TPC-B-like test, transactions per second (0 to 8,000) vs. number of clients (0 to 120), comparing PG 9.1 and PG 9.2.]
11. sysbench
§ Originally developed to test system components (CPU, file I/O, threads)
§ Has an OLTP component, originally based on MySQL
§ Creates a table sbtest with a primary key
§ Various Modes of OLTP operation
§ Simple Read Only (web site primary key lookup)
§ Complex Read Only
§ Complex Read Write test
15. Sysbench – Complex R/W Note
§ In 9.0 it was impossible to run the sysbench complex r/w test without hitting the error: ERROR: duplicate key value violates unique constraint "sbtest_pkey"
§ In 9.1, SSI (Serializable Snapshot Isolation) was introduced and occurrences went down drastically
§ In 9.2, the error has not been encountered
17. Dbt2
§ Fair-use implementation of TPC-C
§ Implemented in C using stored procedures, with a driver -> client -> database server architecture
§ Nine tables: Warehouses, District, Customer, History, Item, Stock, Orders, New Orders, Order Lines
§ Five transactions:
• New-Order (NOTPM) 45%
• Payment 43%
• Delivery 4%
• Order Status 4%
• Stock Level 4%
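The weighted mix above can be illustrated with a tiny driver loop. The percentages come from the slide; the function name and structure are hypothetical, not from the dbt2 kit:

```python
import random

# TPC-C-style transaction mix from the slide (percent weights).
MIX = {
    "New-Order": 45,
    "Payment": 43,
    "Delivery": 4,
    "Order Status": 4,
    "Stock Level": 4,
}

def pick_transaction(rng):
    """Pick the next transaction type according to the weighted mix."""
    return rng.choices(list(MIX), weights=list(MIX.values()), k=1)[0]

rng = random.Random(42)
counts = {name: 0 for name in MIX}
for _ in range(10_000):
    counts[pick_transaction(rng)] += 1
print(counts["New-Order"] / 10_000)  # close to 0.45
```

Because New-Order is roughly 45% of the mix, the NOTPM metric captures just under half of all database transactions issued.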
18. Dbt2 -
§ Why is it not TPC-C compliant?
• Not audited by TPC
• No Terminal emulator
• Official kit requires commercial Transaction Manager
• Doesn’t cover ACID tests
§ Two versions Available
• Libpq
• ODBC
§ One potential problem is 3 network round trips per transaction, which causes "Idle in Transaction" states at high load
• BEGIN, SELECT StoredProcedure(), END pattern of transactions
19. Dbt2 – Postgres 9.1
§ Cached runs (data in buffer pool)
§ Short runs (limited checkpoint and vacuum impacts)
§ NOTPM = 45% of all DB transactions
§ DB transaction rate about 3,000 TPS
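As a back-of-envelope check on the two figures above (a sketch; the 3,000 TPS rate is the approximate value quoted on the slide):

```python
# New-Order is 45% of the mix; NOTPM counts new orders per minute,
# and the slide quotes roughly 3,000 DB transactions per second.
db_tps = 3000
new_order_share = 0.45

notpm = new_order_share * db_tps * 60  # transactions/sec -> per minute
print(notpm)  # 81000.0
```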
21. BenchmarkSQL
§ Another implementation using the TPC-C schema
§ Implemented using JDBC
§ Nine tables: Warehouses, District, Customer, History, Item, Stock, Orders, New Orders, Order Lines
§ Five transactions:
• New-Order (NOTPM) 45%
• Payment 43%
• Delivery 4%
• Order Status 4%
• Stock Level 4%
§ Surprisingly can do better than the dbt2 implementation, but still shows "idle in transaction" states, which means it is bottlenecked at the network/client level
23. DVDStore
§ Implementation of an online DVD store
§ Postgres support contributed by VMware
§ Implemented using various stacks:
• JSP/Java/JDBC (supports Postgres)
• Linux/Apache/PHP/MySQL (supports Postgres)
• ASP.NET (not yet implemented for Postgres)
• Stored procedures (supports Postgres via Npgsql)
§ Eight tables: Customers, Cust_hist, Product, Inventory, Categories, Orders, Order Lines, Reorder
§ Main transactions:
• New Customers 0-10% (configurable)
• Customer Login
• DVD Browse (by category, by actor, by title)
• Purchase Order (metric: Orders Per Minute)
• Stock Reorder (via triggers)
24. DVDStore
§ JSP/Java JDBC Implementation
• Tomcat may need tuning
§ PHP-Postgres Implementation
• Suffers from one connection per SQL command
• Needs pg_bouncer (on the same server as the web server); configure local connections to pg_bouncer, which pools connections to the actual Postgres server
§ Stored Procedure Implementation
• Fastest Implementation (> 100,000 orders per minute)
• Idle in transactions can still occur.
§ Metric is Orders Per Minute
• DB Transactions = (6-7 * OPM/60) ~ 10K – 11K TPS
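A quick sanity check of that conversion (a sketch; 100,000 OPM is the stored-procedure rate quoted above):

```python
# Each order produces roughly 6 to 7 database transactions (per the slide),
# so Orders Per Minute converts to DB transactions per second like this:
opm = 100_000  # orders per minute

tps_low = 6 * opm / 60
tps_high = 7 * opm / 60
print(f"{tps_low:.0f} to {tps_high:.0f} TPS")
```

This reproduces the ~10K to 11K TPS range the slide quotes.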
26. Genesis of TPC-V
§ Users are demanding benchmarks to measure performance of
databases in a virtual environment
• Existing virtualization benchmarks model consolidation:
• Many VMs
• Small VMs
• Non-database workloads
§ TPC is developing a benchmark to satisfy that demand: TPC-V
• An OLTP workload typical of TPC benchmarks
• Fewer, larger VMs
• Cloud characteristics:
• Variability: mix of small and large VMs
• Elasticity: load driven to each VM varies by 10X
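The elasticity property can be made concrete with a hypothetical two-VM load schedule (an illustration only, not taken from the TPC-V spec): each VM's share swings 10X across phases while the total offered load stays constant.

```python
# Hypothetical elasticity schedule for two VMs: each VM's load varies 10X
# across phases, while the total stays constant (illustration, not the spec).
TOTAL = 1100  # arbitrary units of offered load

phases = [
    (1000, 100),  # VM1 at peak, VM2 at trough
    (550, 550),   # balanced phase
    (100, 1000),  # roles reversed
]

for vm1, vm2 in phases:
    assert vm1 + vm2 == TOTAL  # total offered load never changes

swing = max(p[0] for p in phases) / min(p[0] for p in phases)
print(swing)  # 10.0 -> a 10X swing per VM
```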
27. Benchmark requirements
§ Satisfies the industry need for a benchmark that:
• Has a database-centric workload
• Stresses virtualization layer
• Moderate # of VMs, exercising enterprise applications
• Healthy storage and networking I/O content; emphasizes I/O in a virtualized
environment
• NOT many app environments in an app consolidation scenario
§ Timely development cycle (1-2 years)
• Based on the TPC-E benchmark and borrows a lot from it
28. What is TPC-E
§ TPC-E is the TPC’s latest OLTP benchmark
• More complex than TPC-C
• Less I/O than TPC-C
• A lot of the code is TPC-supplied
§ Models a brokerage firm
• Customers, brokers, and the market interact with the database
§ Transactions, invoked against customer, brokerage, and market data:
• READ-WRITE: Market-Feed, Trade-Order, Trade-Result, Trade-Update
• READ-ONLY: Broker-Volume, Customer-Position, Market-Watch, Security-Detail, Trade-Lookup, Trade-Status
29. Abstraction of the Functional Components in an OLTP Environment
[Diagram: the modeled business spans user interfaces, presentation services and business logic, and application and database services, linked by networks. Customers and a market exchange (stock market) interact with the system. Legend: sponsor-provided components.]
30. Functional Components of TPC-E Test Configuration
[Diagram: sponsor-provided driving and reporting (EGenDriver with CE, MEE, and DM drivers) connects through a sponsor-provided EGenDriver connector and network to EGenTxnHarness (TPC-provided TPC-E logic and frame calls), then through a sponsor-provided frame implementation and database interface to a commercial DBMS with sponsor-provided database logic. TPC-defined interfaces separate the layers; the legend distinguishes sponsor-provided, TPC-provided, and commercial-product components.]
31. How does this all matter to the PostgreSQL community?
§ TPC is developing a benchmarking kit for TPC-V
• First time TPC has gone beyond publishing a functional specification
• Full, end-to-end functionality
• Publicly available kit
• Produces the variability and load elasticity properties of the benchmark
• Users need not worry about complexities of simulating cloud characteristics
• Runs against an open source database
• A “reference” kit; companies are allowed to develop their own kit
§ Anyone can install the kit and pound on the server with a cloud
database workload
• Removes the high cost of entry typical to TPC benchmarks
§ The reference kit will run on PostgreSQL
• ODBC interface allows running the workload against other databases
§ Tentative plans to also release a TPC-E kit
• We started out with a kit to run TPC-E; now adding the TPC-V properties
32. Our dependence on PostgreSQL
§ This reference kit will be a very successful new benchmark
• But only if its performance on the open source database is at least decent compared to commercial databases
§ PostgreSQL can benefit a lot from being the reference database for a major new benchmark
• But only if its performance is decent!
§ Running the TPC-E prototype on PGSQL 8.4 on RHEL 6.1, we are at
~20% of published TPC-E results
• Very early results
• Current testbed is memory challenged
• Good news: Query plans for the 10 queries implemented look good
• Long, mostly-read queries => issue is the basic execution path, not redo log,
latch contention, etc.
33. Benchmark Development Status
§ TPC-V Development Subcommittee
• 9 member companies
• 3-4 engineers working actively on the reference kit
• On version 0.12 of the draft spec
• Worked through a lot of thorny issues
• Betting the farm on the reference kit
• But if we produce a good kit, TPC-V will be an immediate success
§ We expect to make a kit available to member companies in Q3 or
Q4
Bottom line: Cooperating to make the TPC-E/TPC-V reference kits
run well on PostgreSQL will greatly benefit all of us
34. Acknowledgements!
§ VMware Performance Team
• Dong, Reza
§ VMware vFabric Postgres Team
• David Fetter, Dan, Alex, Nikhil, Scott, Yolanda