Recently we have found that MySQL is holding us back from our vision of a sustainable and scalable database. This talk will discuss the process Rant & Rave followed in order to migrate our core database. By discussing some of the challenges we overcame, from mapping datatypes to differences in syntax, we hope other MySQL users will be better equipped to make the move to PostgreSQL.
With 9.4 came logical decoding but what is it and how can it be used? Besides being a precursor to bi-directional replication there are plenty of use cases for this and many don't even require you to implement a plugin. We'll look at trigger-less auditing, partial replication and full statement replication.
This document summarizes strategies for scaling a Magento installation. It discusses code management techniques like using good IDE tools and avoiding modifying core files. It outlines hardware profiles including networks, databases, caches and utility servers. It describes the team structure with 16 committers across 5 departments and 31 vendors. Effective communication practices and documentation are emphasized. Release processes, deployments, community participation and collaboration practices are also summarized.
This document provides an overview of using Apache Pulsar as a streaming data source for Apache Flink SQL queries. It discusses how Pulsar can serve as a unified storage layer for both streaming and batch data. The document then demonstrates setting up a Pulsar source in Flink SQL, writing a continuous query over tweet data to calculate tweet counts in tumbling windows, and sinking the results to another Pulsar topic. It concludes by providing references to documentation on Flink SQL, the Flink-Pulsar connector, and related resources.
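The tumbling-window count at the heart of that demo can be sketched in plain Python (the event tuples and 10-second window below are illustrative; a real Flink SQL job expresses this declaratively with a GROUP BY over a TUMBLE window):

```python
from collections import Counter

def tumbling_counts(events, window_seconds):
    """Count events per key in fixed, non-overlapping time windows.

    `events` is an iterable of (timestamp_seconds, key) pairs; each event
    belongs to exactly one window, identified by its start timestamp.
    """
    counts = Counter()
    for ts, key in events:
        window_start = (ts // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)

events = [(0, "flink"), (3, "flink"), (7, "pulsar"), (12, "flink")]
print(tumbling_counts(events, 10))
# {(0, 'flink'): 2, (0, 'pulsar'): 1, (10, 'flink'): 1}
```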
Building a Modern Website for Scale (QCon NY 2013) - Sid Anand
LinkedIn uses several technologies to scale its services and infrastructure to support over 200 million members. It uses a dynamic discovery and client-side load balancing approach for its web services to improve fault tolerance. The presentation tier is composed of various front-end frameworks while business logic is encapsulated in services. LinkedIn's databases Espresso and Oracle are scaled using techniques like data replication, read replicas and change data capture via Databus. Databus provides a consistent, real-time stream of database changes to power services like search, recommendations and standardization. Messaging is handled using Apache Kafka which provides pub-sub streaming capabilities.
Why we love pgpool-II and why we hate it! - PGConf APAC
Pgpool is middleware that works between PostgreSQL clients and servers to provide connection pooling, replication, and load balancing. The presenter's company deployed pgpool in various architectures including master-slave replication and load balancing configurations. They experienced some issues with pgpool like connection errors when using application pooling, lack of guaranteed connection reuse, and bugs. Tips are provided like ensuring synchronized server times and restricting health check users. Pgpool may not be best when automatic node rejoining is needed or during network instability.
Get Your Insecure PostgreSQL Passwords to SCRAM - Jonathan Katz
Passwords: they just seem to work. You connect to your PostgreSQL database and you are prompted for your password. You type in the correct character combination, and presto! you're in, safe and sound.
But what if I told you that all was not as it seemed? What if I told you there was a better, safer way to use passwords with PostgreSQL? What if I told you it was imperative that you upgraded, too?
PostgreSQL 10 introduced SCRAM (Salted Challenge Response Authentication Mechanism), defined in RFC 5802, as a way to securely authenticate passwords. Using a series of cryptographic methods, the SCRAM algorithm lets a client and server validate a password without ever sending the password, whether plaintext or a hashed form of it, to each other.
In this talk, we will look at:
* A history of the evolution of password storage and authentication in PostgreSQL
* How SCRAM works with a step-by-step deep dive into the algorithm (and convince you why you need to upgrade!)
* SCRAM channel binding, which helps prevent MITM attacks during authentication
* How to safely set and modify your passwords, as well as how to upgrade to SCRAM-SHA-256 (which we will do live!)
all of which will be explained by some adorable elephants and hippos!
At the end of this talk, you will understand how SCRAM works, how to ensure your PostgreSQL driver supports it, how to upgrade your passwords to SCRAM-SHA-256, and why you want to tell other PostgreSQL password mechanisms to SCRAM!
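As a rough illustration of why the password never needs to cross the wire, here is a simplified Python sketch of the SCRAM-SHA-256 key derivation and proof check from RFC 5802 (the auth_message, salt, and iteration count are placeholders; real SCRAM also exchanges base64-encoded nonces and channel binding data):

```python
import hashlib
import hmac
import os

def scram_keys(password: bytes, salt: bytes, iterations: int):
    """Derive SCRAM key material: SaltedPassword -> ClientKey -> StoredKey."""
    salted = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    stored_key = hashlib.sha256(client_key).digest()
    return client_key, stored_key

def client_proof(client_key: bytes, stored_key: bytes, auth_message: bytes) -> bytes:
    """Client proves knowledge of the password: ClientProof = ClientKey XOR ClientSignature."""
    signature = hmac.new(stored_key, auth_message, hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(client_key, signature))

def server_verify(stored_key: bytes, auth_message: bytes, proof: bytes) -> bool:
    """Server recovers ClientKey from the proof and checks its hash against StoredKey."""
    signature = hmac.new(stored_key, auth_message, hashlib.sha256).digest()
    recovered_key = bytes(a ^ b for a, b in zip(proof, signature))
    return hashlib.sha256(recovered_key).digest() == stored_key

# Demo: only the proof crosses the wire, never the password or a hash of it.
salt, iterations = os.urandom(16), 4096
auth_message = b"n=user,r=nonce..."  # concatenation of the exchanged handshake messages
ck, sk = scram_keys(b"secret", salt, iterations)
assert server_verify(sk, auth_message, client_proof(ck, sk, auth_message))
```

Note that the server only ever stores StoredKey, from which neither the password nor ClientKey can be recovered.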
7 September 2017 - At ION Conference Durban, South Africa, Andrew Alston on how Liquid Telecom deployed IPv6 and how other organizations can do the same.
OSDC 2014: Devdas Bhagat - Graphite: Graphs for the modern age - NETWAYS
Graphite is a timeseries data charting package, similar to MRTG and Cacti. This talk will cover Graphite starting from the basics to how booking.com scaled it to millions of datapoints per second.
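For context, Graphite's carbon daemon accepts datapoints over a simple plaintext protocol, one "path value timestamp" line per metric (the metric name, host, and port below are illustrative):

```python
import socket
import time

def graphite_line(path, value, timestamp=None):
    """Format one datapoint in Graphite's plaintext protocol."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"

def send_metric(host, port, line):
    """Ship a formatted line to carbon's plaintext listener (conventionally port 2003)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))

print(graphite_line("app.web.requests", 42, 1400000000), end="")
# prints: app.web.requests 42 1400000000
```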
MinervaDB offers PostgreSQL consulting, support, and remote DBA services. Their PostgreSQL performance audit reviews hardware, OS configuration, vacuum and disk bloat management, parameter tuning, HA/DR solutions, application performance, security, and bulk data loading. Support plans include recommendations, capacity planning, security, upgrades, and 24/7 support starting at $25,000/year. Remote DBA services fully manage the PostgreSQL infrastructure for monitoring, installation, performance, backups, HA, security, upgrades, and on-call support starting at $1,600/month.
A walkthrough of various application performance tuning tools and a good workflow for where to start, from a presentation at WindyCityRails 2011 in Chicago, IL.
See the video, and more Web and Ruby/Rails Performance info at www.RailsPerformance.com
-John McCaffrey
Architecting a next-generation data platform - hadooparchbook
This document discusses a high-level architecture for analyzing taxi trip data in real-time and batch using Apache Hadoop and streaming technologies. The architecture includes ingesting data from multiple sources using Kafka, processing streaming data using stream processing engines, storing data in data stores like HDFS, and enabling real-time and batch querying and analytics. Key considerations discussed are choosing data transport and stream processing technologies, scaling and reliability, and processing both streaming and batch data.
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac... - StreamNative
Despite what the Ghostbusters said, we’re going to go ahead and cross (or, join) the streams. This session covers getting started with streaming data pipelines, maximizing Pulsar’s messaging system alongside one of the most flexible streaming frameworks available, Apache Flink. Specifically, we’ll demonstrate the use of Flink SQL, which provides various abstractions and allows your pipeline to be language-agnostic. So, if you want to leverage the power of a high-speed, highly customizable stream processing engine without the usual overhead and learning curves of the technologies involved (and their interconnected relationships), then this talk is for you. Watch the step-by-step demo to build a unified batch and streaming pipeline from scratch with Pulsar, via the Flink SQL client. This means you don’t need to be familiar with Flink, (or even a specific programming language). The examples provided are built for highly complex systems, but the talk itself will be accessible to any experience level.
PostgreSQL - scaling in fashion, Valentine Gogichashvili (Zalando SE) - Ontico
This document discusses how PostgreSQL helped Zalando, one of Europe's largest online fashion retailers, scale their operations. Some key points:
- Zalando uses PostgreSQL to store data for over 13.7 million customers and 150,000+ products, and to handle 640+ million visits annually.
- They access data through stored procedures to maintain consistency and independence from schema changes. Stored procedures also allow easy data modeling and schema changes without downtimes.
- Zalando shards their database without limits through their Java stored procedure wrapper which directs calls to the appropriate shard transparently.
- They closely monitor PostgreSQL using custom tools like PGObserver and have a dedicated team to ensure high performance and availability.
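The transparent shard routing described above might be sketched as follows (a simplified Python stand-in for Zalando's Java stored-procedure wrapper; the hashing scheme and DSN strings are illustrative):

```python
import hashlib

class ShardRouter:
    """Route a sharding key to one of N database connections."""

    def __init__(self, connections):
        self.connections = connections  # e.g. one DSN or connection pool per shard

    def shard_for(self, key):
        # A stable hash so the same key always lands on the same shard;
        # Python's built-in hash() is randomized per process, so use md5.
        digest = hashlib.md5(key.encode()).hexdigest()
        return self.connections[int(digest, 16) % len(self.connections)]

router = ShardRouter(["shard0-dsn", "shard1-dsn", "shard2-dsn"])
print(router.shard_for("customer-42"))  # always the same shard for this key
```

Because callers only see the router, shards can be added behind it without application changes, which is the kind of transparency the wrapper provides.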
Analytical DBMS to Apache Spark Auto Migration Framework with Edward Zhang an... - Databricks
The document discusses an auto migration framework developed by the Data Service & Solution team at eBay to migrate 90% of their batch workload from ADBMS to Apache Spark. The framework includes components like a migration planner, metadata store, SQL converter, job optimizer, and pipeline generator. It aims to migrate over 5,000 tables and 40PB of relational data over 1 year. Key challenges include collecting comprehensive metadata and validating data quality across platforms after migration. The team has also contributed over 50 issues to the Spark community to improve features around SQL parsing, filter pushdown, and data type handling during migration.
Moving advanced analytics to your SQL Server databases - Enrico van de Laar
This document discusses moving advanced analytics capabilities into SQL Server databases. It describes how traditionally analytics involved extracting data from databases, performing modeling and scoring elsewhere, and sending results back. However, this can be slow for large datasets and require data movement. The document outlines how SQL Server 2016 and 2017 enable "in-database analytics" by bringing models to the data instead. It provides an overview of different methods for building and exploiting in-database models like sp_execute_external_script, sp_rxPredict, and PREDICT. It also demonstrates how to connect a SQL Server database to predictive models hosted on Azure Machine Learning for rapid, in-database scoring.
This document appears to be slides from a presentation on concurrency in Ruby applications. The slides discuss different concurrency models including blocking threads, callbacks, reactors, and fibers. They explore when concurrency is useful based on factors like context switching costs. Linear and mixed data dependencies are presented as examples to illustrate different concurrency interfaces and implementations using threads or asynchronous callbacks.
Continuous Deployment of Architectural Change - Matt Graham
Continuous deployment has proven to be a successful and even addicting part of Etsy's engineering culture. See where it's applicable, some of the tools that make it easy, and the kind of architectural change that it makes possible.
MariaDB Server Compatibility with MySQL - Colin Charles
At the MariaDB Server Developer's meeting in Amsterdam, Oct 8 2016. This was the deck to talk about what MariaDB Server 10.1/10.2 might be missing from MySQL versions up to 5.7. The focus is on compatibility of MariaDB Server with MySQL.
The Power of RxJS in NativeScript + Angular - Tracy Lee
Learn the basics of use and power of RxJS in NativeScript & Angular in this presentation given at NativeScript Developer Days in New York City September 2017
Roman Novikov, "Best Practices for MySQL Performance & Troubleshooting with th..." - Fwdays
Troubleshooting performance issues can be a bit tricky, especially when you’re given a broad statement that the database is slow.
Learn to direct your attention to the correct moving pieces and fix what needs your attention.
Learn how all this is done at Percona, what we monitor and track, and the tools we use.
Introduction to Data Engineer and Data Pipeline at Credit OK - Kriangkrai Chaonithi
The document discusses the role of data engineers and data pipelines. It begins with an introduction to big data and why data volumes are increasing. It then covers what data engineers do, including building data architectures, working with cloud infrastructure, and programming for data ingestion, transformation, and loading. The document also explains data pipelines, describing extract, transform, load (ETL) processes and batch versus streaming data. It provides an example of Credit OK's data pipeline architecture on Google Cloud Platform that extracts raw data from various sources, cleanses and loads it into BigQuery, then distributes processed data to various applications. It emphasizes the importance of data engineers in processing and managing large, complex data sets.
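A minimal ETL pipeline of the kind described can be sketched as follows (the record field names are hypothetical, and a plain list stands in for a destination such as BigQuery):

```python
import json

def extract(raw_lines):
    """Extract: parse raw JSON records from source systems."""
    return [json.loads(line) for line in raw_lines]

def transform(records):
    """Transform: cleanse and normalise records before loading."""
    return [
        {"customer_id": r["id"], "amount": round(float(r["amount"]), 2)}
        for r in records
        if r.get("amount") is not None  # drop records with missing amounts
    ]

def load(rows, sink):
    """Load: append the cleansed rows to the destination."""
    sink.extend(rows)
    return len(rows)

sink = []
raw = ['{"id": 1, "amount": "2.50"}', '{"id": 2, "amount": null}']
load(transform(extract(raw)), sink)
print(sink)
# [{'customer_id': 1, 'amount': 2.5}]
```

A streaming variant would run the same transform continuously over incoming events instead of over a finished batch.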
The document discusses REST and gRPC APIs. It describes REST as a design pattern that uses HTTP and represents resources with JSON or XML. While REST is widely used, it has disadvantages like bloated data payloads and lack of a formal contract. gRPC is introduced as an alternative that is high performance, uses protocol buffers for compact payloads, and has generated client/server code in many languages. The document also describes how grpc-gateway can be used to expose existing gRPC services through a RESTful JSON API via HTTP to support existing clients.
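The payload-size argument can be made concrete with a toy comparison (this uses a plain struct layout rather than actual protocol buffers, purely to illustrate why schema-driven binary encodings are more compact than JSON):

```python
import json
import struct

# The same record as a JSON document and as a fixed binary layout.
record = {"user_id": 1234, "score": 87, "active": True}

# JSON repeats every field name in every payload.
json_payload = json.dumps(record).encode("utf-8")

# With a shared schema, only the values need to be sent:
# two unsigned 32-bit ints and one byte, little-endian.
binary_payload = struct.pack(
    "<IIB", record["user_id"], record["score"], int(record["active"])
)

print(len(json_payload), len(binary_payload))  # the binary form is far smaller
```

Protocol buffers add varint encoding and optional fields on top of this idea, but the saving comes from the same place: the contract lives in the schema, not in the bytes on the wire.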
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021 - StreamNative
You may be familiar with the Presto plugin used to run fast interactive queries over Pulsar using ANSI SQL, with results that can be joined with other data sources. This plugin will soon get a rename to align with the rename of the PrestoSQL project to Trino. What is the purpose of this rename and what does it mean for those using the Presto plugin? We cover the history of the community shift from PrestoDB to PrestoSQL, as well as the future plans for the Pulsar community to donate this plugin to the Trino project. One of the connector maintainers will then demo the connector and show what is possible when using Trino and Pulsar!
From Warehouses to Lakes: The Value of Streams - Mike Fowler
You have a beautifully modelled warehouse and a lake with all your business data but you’re still looking at the past and making decisions from what happened yesterday. In this talk we’ll look at how you can really get value from your data and make decisions as events happen and even before they do.
From Warehouses to Lakes: The Value of Streams - Mike Fowler
Every business has a wealth of data but getting value from data is hard. We've tried Data Warehouses and Data Lakes, and while both give us insights we are after, they present their own challenges. Perhaps most challenging of all is making decisions based on yesterday's data. In this talk we'll look at how you can start using your data to make decisions as events happen in your business and how we can even make predictions too. Best of all, we can populate our Data Lakes and Data Warehouses at the same time keeping all the historic analytics in place.
Similar to Migrating Rant & Rave to PostgreSQL
Getting Started with Machine Learning on AWS - Mike Fowler
Machine Learning (ML) is an exciting field that Cloud Computing has helped to accelerate. AWS has played a big part in this with its continually expanding range of services, from the simply named Machine Learning through to SageMaker. But how do you get started? Thankfully you don't need to become an expert in linear algebra or statistics; all you need to begin is a good idea of the life-cycle of a ML project and a passing familiarity with these AWS services. In this talk we'll outline a typical ML project and review services such as SageMaker and Rekognition so that you can begin to make use of them in your own projects.
This talk looks at converting an existing GCP serverless application into one built using Firebase. Firebase helps to simplify deployment, particularly around simple web hosting. The talk also looks at how easy it is to use GCP services integrated with Firebase, such as authentication and Cloud Firestore.
Reducing Pager Fatigue Using a Serverless ML Bot - Mike Fowler
Being woken up at 3 am by the pager is never fun, but seeing an incident resolve before you've even left the bed is maddening. Sleepily, the next day, you tune the alert for a better night's sleep, yet more untuned alerts sing to you in your sleep. After a few rounds of alert-tuning whack-a-mole you wonder: could I predict if an incident will resolve itself?
This is the story of how a weary engineer used a Cloud ML model with Cloud Functions to reduce pager noise. Recounting some of the challenges faced, we’ll explore training a model with a limited data set & continual training in a serverless environment. We’ll also explore the implications of using a bot as a first responder to a pager.
Have you got data in AWS but don’t know how to get started with Machine Learning? My talk will help you make sense of AWS’ offerings and show you how to use them without having to become a mathematician first. See the full talk on YouTube: https://youtu.be/3phjk1CxhXM
Debezium is a Kafka Connect plugin that performs Change Data Capture from your database into Kafka. This talk demonstrates how this can be leveraged to move your data from one database platform such as MySQL to PostgreSQL. A working example is available on GitHub (github.com/gh-mlfowler/debezium-demo).
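A sketch of the consuming side of such a pipeline might look like this (a heavily simplified Debezium change-event envelope; real events also carry schema and source metadata, and a production sink would use parameterised queries rather than string formatting):

```python
def debezium_to_sql(event, table):
    """Translate a simplified Debezium change event into a PostgreSQL statement.

    Debezium events carry an `op` field ('c' create, 'r' snapshot read,
    'u' update, 'd' delete) plus `before`/`after` row images. This assumes
    an `id` primary-key column for brevity.
    """
    payload = event["payload"]
    op, before, after = payload["op"], payload.get("before"), payload.get("after")
    if op in ("c", "r"):
        cols = ", ".join(after)
        vals = ", ".join(repr(v) for v in after.values())
        return f"INSERT INTO {table} ({cols}) VALUES ({vals})"
    if op == "u":
        sets = ", ".join(f"{k} = {v!r}" for k, v in after.items())
        return f"UPDATE {table} SET {sets} WHERE id = {after['id']!r}"
    if op == "d":
        return f"DELETE FROM {table} WHERE id = {before['id']!r}"
    raise ValueError(f"unknown op {op!r}")

event = {"payload": {"op": "c", "before": None, "after": {"id": 1, "name": "Mike"}}}
print(debezium_to_sql(event, "users"))
# INSERT INTO users (id, name) VALUES (1, 'Mike')
```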
Leveraging Automation for a Disposable Infrastructure - Mike Fowler
Moving from the Iron Age to the Cloud Age in computing is supposed to save us money, but many migrations seem to cost more in the long run and result in an infrastructure that is as complex to manage as the one we had before. This is often due to the so called “lift & shift” approach many take – it’s a short term win that doesn’t address why you wanted to move to the cloud in the first place.
The Cloud Age affords us the opportunity to not treat our infrastructure as something special, but as something disposable. By applying the practices of Continuous Integration and delivery to our infrastructure and configuration management, we can build truly scalable infrastructures to host our application’s wildest dreams.
In this talk, we will look at the tools and processes that can be adopted to truly make use of the possibilities of the Cloud.
Managing your own PostgreSQL servers is sometimes a burden your business does not want. In this talk we will provide an overview of some of the public cloud offerings available for hosted PostgreSQL and discuss a number of strategies for migrating your databases with a minimum of downtime.
Terraform is an open source tool that helps you control your infrastructure configuration through code. This talk will serve as a primer showing how to build a basic infrastructure in the Google Cloud and how we can re-use our code to construct multiple, identical environments.
These days you can't go far without encountering XML or JSON and in the world of the web these data types are ubiquitous. Since version 8.3 XML has been supported as a data type and JSON support was introduced in 9.2. We'll be looking at what advantages there are in storing your data with these data types and how we can query and manipulate our data once it's stored.
1. Migrating Rant & Rave to PostgreSQL
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2014
2. Overview
● Who and what is Rant & Rave?
● The journey to PostgreSQL
– Why migrate?
– Migration requirements
– Moving the data
– Adjusting the queries
3. About Me
● Technical Lead/ScrumMaster at Rant & Rave
● Been using PostgreSQL for over 10 years
● Contributed some XML support
– XMLEXISTS/xpath_exists()
– xml_is_well_formed()
● Buildfarm member piapiac
– Amazon EC2 based build for JDBC driver
– Has led to a number of bugfix patches for JDBC
http://www.pgbuildfarm.org/cgi-bin/show_status.pl?member=piapiac
4. What do we do?
We are a technology company providing customer engagement solutions with a difference to more than 250 global corporations.
Customer engagement is “the ongoing interactions between customer and the company, offered by the company, chosen by the customer.”
– Paul Greenberg, ZDNet, credited with inventing the term ‘CRM’
5. What makes us different?
1. Everyone else is just listening. We help businesses act before and after every interaction.
2. Everything we do is in real-time for the frontline. We call them ‘Moments of Truth’® – those emotional opportunities to create Ravers from customers and your staff.
3. Rant & Rave is for all your customers, wherever they are, however they touch your business, whenever they touch your business.
6. The Journey
7. The Platform
8. Migrating Rant & Rave to PostgreSQL
9. Some of our customers...
10. Some of our customers' feedback...
11. The Rant & Rave Database
● 8189 tables with approximately 114 million rows
– 242 'system' tables
– Remaining tables are customer specific, though many similarities exist
● 2930 views
– Views are used to customise the data end users wish to see
● e.g. a team leader might be restricted to seeing only the feedback relating to them and their team
● No stored routines, functions, procedures or triggers
12. Why migrate?
● MySQL has oddities that routinely catch us out
– UNIONs are faster than ORs with JOINs
– Transactions need retrying when under load
– GROUP BYs allowing non-aggregated columns
– “” = 0
– Nightly database restores sometimes fail for no reason
● MySQL replication leaves a lot to be desired
– Statement replication and timestamps
– Manual does not inspire confidence
13. Migration Process Requirements
● Minimise downtime
● Ensure database equality
– Schema equality
– Data equality
● Minimise software changes
– Client query compatibility
14. Minimising Downtime
1) Have a complete clone ready
+ Fastest option
- Must manage data changes from when clone was built
- Clients must be built to support both database servers
2) Stop all writers, dump & restore
- Slowest option
+ Easiest option
- Downtime becomes a function of database size and server I/O speed
15. Minimising Dump & Restore
● The smaller the database, the faster the process
– Remove unused tables & data
– Create the schema in PostgreSQL
– Move unchanging tables ahead of time
● Archived data
● Inactive customer tables
● Stream the dump & restore
– Intermediary files are expensive
16. Introducing convert.pl
● Perl script that manipulates the mysqldump format into a format psql can execute
● Can be executed as part of a stream:
mysqldump | convert.pl | psql
● All conversion rules are in one place, allowing for easy refinement as the differences between MySQL and PostgreSQL are identified and resolved
● Aim to make it publicly available
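The streaming approach can be sketched in a few lines. This is an illustrative Python stand-in, not the actual convert.pl (which is Perl and unreleased); the two rules shown are examples drawn from the type-mapping slides that follow:

```python
import re
import sys

# Ordered (pattern, replacement) rules kept in one place, so refinements
# are easy to add as new MySQL/PostgreSQL differences are identified.
RULES = [
    (re.compile(r'`([^`]*)`'), r'"\1"'),                      # backtick-quoted identifiers
    (re.compile(r'\bint\(\d+\)', re.IGNORECASE), 'INTEGER'),  # drop display widths
]

def convert(line):
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line

if __name__ == '__main__':
    # Usable in a stream: mysqldump ... | ./convert.py | psql ...
    for line in sys.stdin:
        sys.stdout.write(convert(line))
```

Keeping the rules as a single ordered list mirrors the "all conversion rules in one place" design goal.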
17. The mysqldump Format
● SQL script
● Lines beginning -- are comments
● Lines beginning /*! and ending */ are MySQL version-specific commands (e.g. /*!40101 SET NAMES utf8 */;)
● Every section is started with a comment header:
--
-- Table structure for table `system_users`
--
18. The mysqldump Format
● Four different operations are found in the dump
– Table schema
-- Table structure for table `table_name`
– Temporary table schema for views
-- Temporary table structure for view `view_name`
– View creation
-- Final view structure for view `view_name`
– Data
-- Dumping data for table `table_name`
19. Dealing with differences
● MySQL uses ` to quote identifiers, PostgreSQL uses "
– `table_name` => "table_name"
● CREATE TABLE & CREATE VIEW syntax is surprisingly similar
● Most data types need only minor adjustment
– Numeric types: SMALLINT, INT, BIGINT, DOUBLE
– Character types: TINYTEXT, VARCHAR, LONGTEXT
– Time types: DATE, DATETIME, TIMESTAMP
● Some data types need careful adjustment
– TINYINT, BLOB
20. Numeric data types
MySQL allows for a variable display size when declaring a numeric column; this is not storage size:
`field_id` int(11)

MySQL Type     PostgreSQL Type
SMALLINT       SMALLINT
INT            INTEGER
INT UNSIGNED   BIGINT
BIGINT         BIGINT
DOUBLE         DOUBLE PRECISION
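The mapping above can be expressed as a lookup that first strips the display width. A hypothetical sketch, not the actual convert.pl rule:

```python
import re

# Type mapping from the table above; INT UNSIGNED is widened to BIGINT
# because PostgreSQL has no unsigned integer types.
NUMERIC_TYPES = {
    'SMALLINT': 'SMALLINT',
    'INT': 'INTEGER',
    'INT UNSIGNED': 'BIGINT',
    'BIGINT': 'BIGINT',
    'DOUBLE': 'DOUBLE PRECISION',
}

def map_numeric(mysql_type):
    # int(11) -> INT: the (11) is display width, not storage size
    base = re.sub(r'\(\d+\)', '', mysql_type).strip().upper()
    return NUMERIC_TYPES.get(base, base)
```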
21. Character data types
As there is no performance difference between PostgreSQL's character data types, we mapped all MySQL types to TEXT.

MySQL Type   PostgreSQL Type
TINYTEXT     TEXT
VARCHAR      TEXT
LONGTEXT     TEXT
22. Time data types
● mysqldump writes times as UTC without timezone information; specify --skip-tz-utc to keep the timezone
● MySQL CREATE TABLE allows one timestamp column to have an update trigger:
updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP

MySQL Type   PostgreSQL Type
DATE         DATE
DATETIME     TIMESTAMP WITH TIME ZONE
TIMESTAMP    TIMESTAMP WITH TIME ZONE
23. TINYINT data type
● MySQL does not have a BOOLEAN type but uses a special case of TINYINT(1)
● TINYINT(1) defaults can be adjusted for better readability

MySQL Type               PostgreSQL Type
TINYINT(1)               BOOLEAN
TINYINT(1) DEFAULT '0'   BOOLEAN DEFAULT FALSE
TINYINT(1) DEFAULT '1'   BOOLEAN DEFAULT TRUE
TINYINT(4)               SMALLINT
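Applied most-specific-rule-first, the table above can be sketched as follows (an illustrative stand-in, not the actual conversion code):

```python
import re

def map_tinyint(decl):
    """Rewrite TINYINT declarations per the table above; the DEFAULT
    variants must be handled before the bare TINYINT(1) rule."""
    decl = re.sub(r"TINYINT\(1\) DEFAULT '0'", 'BOOLEAN DEFAULT FALSE', decl, flags=re.IGNORECASE)
    decl = re.sub(r"TINYINT\(1\) DEFAULT '1'", 'BOOLEAN DEFAULT TRUE', decl, flags=re.IGNORECASE)
    decl = re.sub(r'TINYINT\(1\)', 'BOOLEAN', decl, flags=re.IGNORECASE)
    decl = re.sub(r'TINYINT\(\d+\)', 'SMALLINT', decl, flags=re.IGNORECASE)
    return decl
```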
24. BLOB data type
● MySQL's various BLOB types can all be mapped to BYTEA
● The default binary output is the source of many invalid multi-byte escape sequences; specify --hex-blob to output the blobs as hexadecimal

MySQL Type   PostgreSQL Type
TINYBLOB     BYTEA
BLOB         BYTEA
MEDIUMBLOB   BYTEA
LONGBLOB     BYTEA
25. AUTO_INCREMENT
MySQL uses the AUTO_INCREMENT keyword to allow a column to take the next value from a sequence. This keyword is independent of the numeric data type used. PostgreSQL embeds this behaviour in the SERIAL data type.

MySQL Type                      PostgreSQL Type
TINYINT … AUTO_INCREMENT        SERIAL
SMALLINT … AUTO_INCREMENT       SERIAL
INT … AUTO_INCREMENT            SERIAL
INT UNSIGNED … AUTO_INCREMENT   BIGSERIAL
BIGINT … AUTO_INCREMENT         BIGSERIAL
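The SERIAL mapping can be sketched like this; an illustrative rule that assumes one type keyword per declaration, not the actual convert.pl logic:

```python
import re

def map_auto_increment(decl):
    """Fold AUTO_INCREMENT into SERIAL/BIGSERIAL per the table above."""
    if not re.search(r'AUTO_INCREMENT', decl, re.IGNORECASE):
        return decl
    # BIGINT and INT UNSIGNED both need the 64-bit sequence type
    serial = 'BIGSERIAL' if re.search(r'BIGINT|UNSIGNED', decl, re.IGNORECASE) else 'SERIAL'
    # Replace the numeric type (and any display width / UNSIGNED marker) ...
    decl = re.sub(r'\b(TINYINT|SMALLINT|INT|BIGINT)\b(\(\d+\))?(\s+UNSIGNED)?',
                  serial, decl, count=1, flags=re.IGNORECASE)
    # ... then drop the AUTO_INCREMENT keyword itself
    return re.sub(r'\s*AUTO_INCREMENT', '', decl, flags=re.IGNORECASE)
```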
26. Column Annotations
MySQL allows you to annotate columns in CREATE TABLE:
CREATE TABLE t (col INT COMMENT 'My commented column' …);
PostgreSQL does not support this, however you can achieve the same after creating the table:
CREATE TABLE t (col INTEGER …);
COMMENT ON COLUMN t.col IS 'My commented column';
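One way to automate this split, operating on a single column-definition line; a sketch with an assumed input shape, not the actual conversion code:

```python
import re

def comment_to_statement(table, column_def):
    """Strip COMMENT 'text' from one column definition and return the
    cleaned definition plus the COMMENT ON COLUMN statement (or None)."""
    m = re.search(r"^\s*[`\"]?(\w+)[`\"]?(.*?)\s+COMMENT\s+'([^']*)'(.*)$", column_def)
    if not m:
        return column_def.strip(), None
    name, middle, comment, rest = m.groups()
    statement = "COMMENT ON COLUMN %s.%s IS '%s';" % (table, name, comment)
    return (name + middle + rest).strip(), statement
```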
27. Stuff to ignore
mysqldump outputs commands that are not relevant to PostgreSQL, as well as some extensions that are not needed:
● CREATE TABLE (…) ENGINE=InnoDB AUTO_INCREMENT=156583 DEFAULT CHARSET=latin1;
● LOCK TABLES;
● UNLOCK TABLES;
28. INSERTs
mysqldump generates a multi-row version of INSERT. PostgreSQL supports this syntax, but COPY is better suited to large inserts:
INSERT INTO `field_mappings` VALUES (1,892,'YMC','A9'),(2,892,'WIG','A81')
becomes
COPY field_mappings FROM STDIN WITH NULL AS 'NULL' CSV QUOTE AS '''';
1,892,'YMC','A9'
2,892,'WIG','A81'
\.
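The rewrite can be sketched as follows; an illustrative transformation that naively assumes the sequence `),(` never appears inside the data itself:

```python
import re

def insert_to_copy(insert_sql):
    """Turn a multi-row mysqldump INSERT into COPY ... FROM STDIN input."""
    m = re.match(r"INSERT INTO [`\"]?(\w+)[`\"]? VALUES \((.*)\);?\s*$", insert_sql)
    table, body = m.group(1), m.group(2)
    lines = ["COPY %s FROM STDIN WITH NULL AS 'NULL' CSV QUOTE AS '''';" % table]
    lines.extend(body.split('),('))   # one CSV line per row
    lines.append('\\.')               # COPY end-of-data marker
    return lines
```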
29. Escaped data
● MySQL escapes quotes and backslashes: \', \\
● Escaped carriage returns (\r) and newlines (\n) need to be replaced with their real values
● MySQL escapes hexadecimals with 0x

MySQL Value   PostgreSQL Value
\'            ''
\r            (literal carriage return)
\n            (literal newline)
0xFFFF        E'\\xFFFF'
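The first three rows translate mechanically; a sketch in which full handling of doubled backslashes and hex literals is deliberately omitted:

```python
def fix_escapes(value):
    """Rewrite common MySQL escape sequences for PostgreSQL CSV input."""
    value = value.replace("\\'", "''")   # \' -> SQL-standard doubled quote
    value = value.replace('\\r', '\r')   # escaped CR -> real carriage return
    value = value.replace('\\n', '\n')   # escaped LF -> real newline
    return value
```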
30. Bizarre data
● The zero timestamp '0000-00-00 00:00:00'
– Can be converted to NULL
● > '0000-00-00 00:00:00' becomes IS NOT NULL
● DEFAULT '0000-00-00 00:00:00' NOT NULL is removed
– Or use an alternate odd value (e.g. '1970-01-01 00:00:00')
● Character encoding is painful
– Specifying --default-character-set=utf8 helps
– Dependent on shell environment
– May need to map some sequences (Â => \x{FFFD})
31. VIEWs
Syntax is subject to the same caveats as SELECT queries.
Extracting the VIEW from mysqldump is a little more involved than CREATE TABLE:
/*!50001 CREATE ALGORITHM=UNDEFINED */
/*!50013 DEFINER=`root`@`%` SQL SECURITY DEFINER */
/*!50001 VIEW `view_name` AS select `col1`.`col2` FROM `table_name` WHERE (`col3` = '9636') */;
32. mysqldump
For those taking notes, the accumulated mysqldump command looks like this:
mysqldump --single-transaction --skip-tz-utc --hex-blob --default-character-set=utf8 -uuser -ppassword db_name
33. Dealing with queries
● Identify queries that use functions (e.g. isnull())
– Recreate functions in pl/pgsql
– Rewrite query to produce same results
● MySQL is usually case insensitive
● PostgreSQL >= 8.3 does not support implicit casting
– WHERE clauses are prime source
– JOINs need verifying
● MySQL GROUP BY behaviour is non-standard
34. Case Insensitivity: LIKE
LIKE has a case-insensitive alternative, ILIKE. ILIKE does not perform too well; however, combining lower()/upper() with LIKE is better.

DB      WHERE clause                        Time
MySQL   customername LIKE '%Mike%'          20ms
PSQL    customername ILIKE '%mike%'         33ms
PSQL    lower(customername) LIKE '%mike%'   25ms
35. Case Insensitivity: Equality
Equality can be achieved by using lower() or upper(). Obviously slower than plain equality tests; however, in our set-up PostgreSQL initially outperforms MySQL.

DB      WHERE clause                          1st Time   2nd Time
MySQL   customername = 'Mike Fowler'          31ms       0ms
PSQL    lower(customername) = 'mike fowler'   23ms       19ms
36. Functional Indexes
PostgreSQL allows the creation of indexes based on the computed results of functions. This improves query times at the expense of INSERT time.
CREATE INDEX idxname ON table (lower(customername));

DB                WHERE clause                          Time
MySQL             customername = 'Mike Fowler'          31ms
PSQL (no index)   lower(customername) = 'mike fowler'   23ms
PSQL (index)      lower(customername) = 'mike fowler'   0.15ms
37. GROUP BY
● MySQL allows non-aggregated columns not named in the GROUP BY clause
– When a non-aggregated column has differing values, MySQL will not choose between them in a deterministic way
– Appending non-aggregated columns to the GROUP BY clause could generate extra rows compared to MySQL
● MySQL group_concat(col) function
– Aggregates all rows into a comma-separated list
– PostgreSQL equivalent: string_agg(col, ',')
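The group_concat() rewrite is mechanical enough to automate; an illustrative regex that assumes no nested parentheses in the argument:

```python
import re

def rewrite_group_concat(query):
    """Rewrite MySQL group_concat(col) to PostgreSQL string_agg(col, ',')."""
    return re.sub(r'group_concat\(\s*([^)]+?)\s*\)', r"string_agg(\1, ',')",
                  query, flags=re.IGNORECASE)
```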
38. Implicit Casting
● Look for queries with numeric operands
– WHERE col > 56 becomes WHERE col::INTEGER > 56
– WHERE col != 95 becomes WHERE col::INTEGER != 95
– WHERE col IN (345,347) becomes WHERE col::INTEGER IN (345,347)
● You could cast everything as TEXT
WHERE col1 = col2 becomes WHERE col1::TEXT = col2::TEXT
39. Implicit Casting
● JOINs with a USING clause will need rewriting
SELECT * FROM table1 JOIN table2 USING (id) becomes
SELECT * FROM table1 t1 JOIN table2 t2 ON t1.id::TEXT = t2.id::TEXT
● You could permanently ALTER the column data type
ALTER TABLE table ALTER COLUMN col TYPE INTEGER USING col::INTEGER
– Saves modifying lots of WHERE clauses, but INSERTs and UPDATEs will need checking
40. Introducing MyPGJDBC
● Simple wrapper over the existing PG-JDBC driver
● Idea is to rewrite & log MySQL-style queries so they work in PostgreSQL
– Gives us a safety net for queries we miss
– Identifies the queries so we can repair them
● Aim to make it publicly available
41. Lessons from our experience
● Allow more time than you expect
● Moving the schema and data is the easiest part
● Identifying, verifying and reworking queries is what takes the time
42. Questions?
Mike Fowler
PGDay UK 2014
43. Thank you!
Mike Fowler
PGDay UK 2014