2. Who am I and why do I think I know about it?
Jakob Lorberblatt
Open source database consultant and enthusiast
- Teaches for a Triangle-area (here) girls' coding club (shameless plug)
- Has a 20-year-long obsession with open source technology.
- Loves to talk about software and Linux
- Works as a database consultant for Pythian, a top global provider of database management services and cloud enablement that totally loves your data!
3. The confusion with versions
Major versions:
Surprise! They aren’t all that logical!
5.1 - No longer supported
5.5 - Oldest version still officially supported
5.6 - Current stable (most common)
5.7 - First in the newer series, less compatible with older usage.
8.0 - Latest release, great feature set, complete redesign, not all tools work here yet.
MariaDB versions do not directly correspond to MySQL versions.
10.0 - is 5.5 with some extras
10.1 - is 5.6-ish for most considerations.
10.2 - has some 5.7 in it.
10.3 - 5.7 and 8.0 as well as MariaDB-specific functionality beyond the scope of this document
Percona Server versions correspond directly to MySQL versions
4. The types of dangers you may encounter
- Deprecated / changed configuration parameters: will prevent server startup
- New reserved keywords: errors on execution
- Deprecated syntax: warnings
- Eliminated syntax: failed statements
- Performance issues with specific queries
- New configuration required
- Old configuration no longer valid in new version
- Removed functionality or storage types referenced (8.0 moves the system tables from MyISAM to InnoDB and drops support for partitioned MyISAM tables!)
5. What did we start with to get in this mess?
Your old software was great! Upgrading doesn’t mean losing what you had.
- Ease of repair
- Efficiency with existing usage pattern
- Connectivity through the same libraries
- Software security and stability
- Sleek originality: you’re not currently in search of a new solution
6. What are you trying to get to?
- Modern feature set
- Current patches and update cycle
- Efficiency gains and improvements
- Specific features and enablements
- Support for modern hardware or cloud-based hosting
- Continued odds of finding your problem’s solution on Stack Overflow, or at least a group of others to rant about it with online
7. The Basics, how do you get there?
ALWAYS BACK UP YOUR DATA FIRST!!
1) Remove or unlink your existing mysql install.
2) Install the upgraded packages. At this point you may only advance one major version, e.g. 5.1 to 5.5 or 5.5 to 5.6.
3) Start mysql and examine the error log, as some my.cnf options may have been deprecated or changed, preventing system startup.
4) Adjust until fixed.
5) Run ‘mysql_upgrade -u root_like_user -p’ (from MySQL 8.0.16 on, the server performs this step itself at startup).
6) Although a restart is not absolutely required for all versions, it is needed for some, so I recommend restarting mysql again at this point.
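The "one version at a time" rule in step 2 can be planned up front. The following is a minimal Python sketch, not part of the original deck: the `MYSQL_SERIES` list simply mirrors the major versions named on the versions slide, and `upgrade_path` is a hypothetical helper name.

```python
# Major MySQL release series in upgrade order, as listed earlier in this deck.
MYSQL_SERIES = ["5.1", "5.5", "5.6", "5.7", "8.0"]


def upgrade_path(current: str, target: str) -> list[tuple[str, str]]:
    """Return the (from, to) hops needed to advance one major version at a time."""
    start = MYSQL_SERIES.index(current)  # raises ValueError for an unknown series
    end = MYSQL_SERIES.index(target)
    if end < start:
        raise ValueError("downgrades are not covered by this sketch")
    return [(MYSQL_SERIES[i], MYSQL_SERIES[i + 1]) for i in range(start, end)]


if __name__ == "__main__":
    for src, dst in upgrade_path("5.1", "5.7"):
        print(f"upgrade {src} -> {dst}, check the error log, run mysql_upgrade, restart")
```

Each hop printed here stands for one full pass through steps 1 to 6 above.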
8. What’s Stopping you?
- Version Compatibility
- Infrastructure Compatibility
- Replication Compatibility
- Business Process Changes
- Development cycles
- Confusion about any of the above
- Inability to resolve issues with any of
the above
- Difficulty in testing any of the above.
9. How to bridge the gap to address these issues?
- pt-upgrade
- ProxySQL
- pt-query-digest
- Higher version replicas
- Lower testing environments for applications
- PMM or other time series analysis
- Black hole relays, in special circumstances.
10. pt-upgrade fundamentals
- Create a clone of your existing data.
- Set up the clone server with a newer version of mysql
- Run mysql_upgrade on the clone
- Set long_query_time to 0 on the existing server
- Collect a slow query log
- Transfer the slow log to the clone
- Run pt-upgrade on the clone
- Keep in mind the queries are run in serial, so this will take some time
- Analyze the differences.
Additional Information:
Official Site: https://www.percona.com/doc/percona-toolkit/LATEST/pt-upgrade.html
Required Configuration: https://dev.mysql.com/doc/refman/5.6/en/slow-query-log.html
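The "set long_query_time to 0 and collect a slow query log" steps boil down to three server variables. This is a hedged Python sketch that only builds the SQL statements; the helper names and the log path are illustrative assumptions, and you would run the statements through whatever client you already use.

```python
def slow_log_capture_statements(log_path: str) -> list[str]:
    """Build the SQL that makes the server log every query to log_path."""
    if "'" in log_path or "\\" in log_path:
        raise ValueError("keep the log path free of quotes and backslashes")
    return [
        f"SET GLOBAL slow_query_log_file = '{log_path}'",
        # 0 seconds means every statement counts as "slow" and gets logged.
        # A changed GLOBAL value only applies to connections opened afterwards.
        "SET GLOBAL long_query_time = 0",
        "SET GLOBAL slow_query_log = 'ON'",
    ]


def slow_log_restore_statements(previous_long_query_time: float) -> list[str]:
    """Build the SQL that puts the threshold back once enough traffic is captured."""
    return [f"SET GLOBAL long_query_time = {previous_long_query_time}"]


if __name__ == "__main__":
    # "/var/lib/mysql/upgrade-capture.log" is a made-up example path.
    for statement in slow_log_capture_statements("/var/lib/mysql/upgrade-capture.log"):
        print(statement + ";")
```

Logging everything grows the log quickly on a busy server, so note the previous threshold and restore it once the sample is large enough.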
11. What they don’t tell you
What it does do:
- Provides a one-to-one comparison between queries on one version of the software and a newer one
- Provides content of error messages if they occur
- Helps scan code for issues with new keywords or deprecated statements
What it doesn’t do:
- Provide any sort of load testing.
- Have any ability to analyze queries that were not supplied in the log file.
- Provide the results in an easy-to-parse format
- Slow log is applied in a single-threaded fashion
- Results are compared on a one-to-one basis and dumped to the filesystem
- Many samples of the same query are all compared separately
Things to watch out for:
Saving results will save the ENTIRE result set for each iteration of a query. If this is the intended effect, make sure you have enough space in the target volume.
13. Interpreting the Results
Where do I look for the results of pt-upgrade?
All of the output is dumped into a directory; the lines are mapped one for one, statement executed to result.
“grep ERROR results” should give you a lot of feedback.
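If you would rather collect those ERROR lines programmatically than read grep output, something like the following Python sketch works. It assumes nothing about the file layout inside the results directory beyond "text files that may contain the word ERROR"; `find_error_lines` is a hypothetical helper name.

```python
from pathlib import Path


def find_error_lines(results_dir: str, needle: str = "ERROR") -> list[tuple[str, int, str]]:
    """Return (file name, line number, line) for every line containing needle."""
    hits = []
    for path in sorted(Path(results_dir).rglob("*")):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a result set holds binary data.
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    hits.append((path.name, number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    for name, number, line in find_error_lines("results"):
        print(f"{name}:{number}: {line}")
```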
What do I do with what went wrong?
- sql_mode is a powerful variable; it can revert behaviors.
- Rewrite queries
- Remove deprecated syntax usage
- Quote in backticks any usages of newly introduced reserved keywords.
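Backtick quoting is mechanical enough to script. A minimal Python sketch follows; the `NEW_RESERVED` set is only an illustrative handful of words that became reserved in MySQL 8.0, not the full list, and both function names are hypothetical.

```python
# Illustrative subset of words that became reserved in MySQL 8.0 (not exhaustive).
NEW_RESERVED = {"rank", "groups", "window", "lateral", "recursive"}


def quote_identifier(name: str) -> str:
    """Wrap an identifier in backticks, doubling any backtick inside it."""
    return "`" + name.replace("`", "``") + "`"


def quote_reserved_columns(columns: list[str]) -> list[str]:
    """Quote only the column names that collide with NEW_RESERVED."""
    return [quote_identifier(c) if c.lower() in NEW_RESERVED else c for c in columns]


if __name__ == "__main__":
    columns = quote_reserved_columns(["id", "rank", "name"])
    print(f"SELECT {', '.join(columns)} FROM players")
```

Quoting every identifier unconditionally with `quote_identifier` is also safe and avoids maintaining a keyword list at all.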
14. ProxySQL
- Amazing mirror functionality: allows a complete copy of traffic to be applied to a second server in real time. Real-life load testing!
- Dynamic routing of traffic based on custom rule sets, allowing reads to be directed to a newer version
- New and cool ways to test and move upgrades into production.
http://proxysql.com
15. How Mirroring works and why it is so cool
- Provides a real-time duplication of traffic to the new server
- Allows you to collect data about the new version under realistic loads using your real schema and data without affecting existing production servers!
- Does not check the results or guarantee that every transaction is mirrored!
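In ProxySQL, mirroring is configured through a query rule whose `mirror_hostgroup` points at the new server. The sketch below only builds the admin-interface statements as strings; the rule id, the hostgroup numbers and the `^SELECT` pattern are assumptions for illustration, so check them against your own ProxySQL configuration before using anything like it.

```python
def mirror_rule_statements(rule_id: int, prod_hostgroup: int, mirror_hostgroup: int,
                           match_digest: str = "^SELECT") -> list[str]:
    """Build ProxySQL admin statements that mirror matching queries to a second hostgroup."""
    if "'" in match_digest:
        raise ValueError("keep the pattern free of single quotes in this sketch")
    insert = (
        "INSERT INTO mysql_query_rules "
        "(rule_id, active, match_digest, destination_hostgroup, mirror_hostgroup, apply) "
        f"VALUES ({rule_id}, 1, '{match_digest}', {prod_hostgroup}, {mirror_hostgroup}, 1)"
    )
    # Rules only take effect once loaded to runtime; saving to disk survives a restart.
    return [insert, "LOAD MYSQL QUERY RULES TO RUNTIME", "SAVE MYSQL QUERY RULES TO DISK"]


if __name__ == "__main__":
    # Hostgroup 10 = current production, hostgroup 20 = upgraded server (made-up numbers).
    for statement in mirror_rule_statements(rule_id=1, prod_hostgroup=10, mirror_hostgroup=20):
        print(statement + ";")
```

The client still gets its answer from the production hostgroup; the mirrored copy is fire-and-forget, which matches the last bullet above.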
16. Halfway betwixt and testing environments
And through the drifts the snowy clifts
Did send a dismal sheen:
Nor shapes of men nor beasts we ken—
The ice was all between.
- Samuel Taylor Coleridge
Things to remember:
- You can always replicate from an older version to the next newer major version; this is a fully supported behavior.
- Data formats may change. Once upgraded there is no going back for the physical data; it would require a logical dump.
- Replication format can strongly affect behavior. MIXED or ROW format replication is STRONGLY recommended
- Tables without primary keys in combination with ROW based replication can result in high lag on the slave server
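Since tables without primary keys are the ones that hurt under ROW based replication, it helps to list them before you start. This is a sketch, assuming you have any Python DB-API cursor connected to the server; the query uses only standard information_schema tables, and `tables_without_primary_key` is a hypothetical helper name.

```python
NO_PRIMARY_KEY_QUERY = """
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
  ON c.table_schema = t.table_schema
 AND c.table_name = t.table_name
 AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND c.constraint_name IS NULL
  AND t.table_schema NOT IN
      ('mysql', 'information_schema', 'performance_schema', 'sys')
ORDER BY t.table_schema, t.table_name
"""


def tables_without_primary_key(cursor) -> list[str]:
    """Run the query on a DB-API cursor and return 'schema.table' names."""
    cursor.execute(NO_PRIMARY_KEY_QUERY)
    return [f"{schema}.{table}" for schema, table in cursor.fetchall()]
```

Anything this returns is a candidate for adding a primary key before switching the replication format.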
17. Black Hole Relays
What is a black hole relay? Why would you want to blackhole your data?
- A black hole table will accept inserts but stores no data (like /dev/null)
- The black hole insert still gets written to the binary log, allowing a slave to duplicate the transaction.
- The binary log is read and written by the version of the server hosting the black hole tables.
- Version-to-version replication is officially supported only to the next major version. If you intend to jump from 5.1 to 5.7, you would place “blackhole” instances for 5.5 and 5.6 in between.
18. How to build a relay and how it works
- Take a full backup of the 5.1 server, restore it on the 5.7 slave, and upgrade it to 5.5, then 5.6 and finally 5.7
- “mysqldump --no-data --all-databases” will get you a schema dump from the 5.1 master; then run sed -i 's/ENGINE=InnoDB/ENGINE=BLACKHOLE/' yourdumpfile.sql (mysqldump writes the engine name as InnoDB, and sed matches case-sensitively)
- Load this schema dump into the 5.5 and 5.6 servers with log-slave-updates on.
- Set up 5.5 with the replication positions from the backup
- Set up 5.6 with the SHOW MASTER STATUS output from 5.5
- Set up 5.7 with the SHOW MASTER STATUS output from 5.6
- Now start replication on each.
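The engine swap on the schema dump can also be done in Python if sed is not handy. This is a minimal sketch: it rewrites only `ENGINE=InnoDB` clauses, in any letter case, and leaves other engines alone so that MyISAM system tables in the dump are not touched. The function names are hypothetical.

```python
import re

# Matches the engine clause mysqldump writes on CREATE TABLE, in any letter case.
_INNODB_ENGINE = re.compile(r"\bENGINE=InnoDB\b", re.IGNORECASE)


def to_blackhole(schema_sql: str) -> str:
    """Return the schema dump with every InnoDB table switched to BLACKHOLE."""
    return _INNODB_ENGINE.sub("ENGINE=BLACKHOLE", schema_sql)


def convert_dump(src_path: str, dst_path: str) -> None:
    """Read a schema-only dump and write the BLACKHOLE version next to it."""
    with open(src_path, encoding="utf-8", errors="replace") as src:
        converted = to_blackhole(src.read())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(converted)
```

Writing to a second file instead of editing in place keeps the original dump around in case a relay needs to be rebuilt.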
19. Using PMM to track down issues
https://www.percona.com/doc/percona-monitoring-and-management/index.html
PMM is a tool that collects metrics on the database instance every second.
This is done using an agent on the database server called an exporter.
The metrics are collected in Prometheus, a time series database.
The results are then displayed using Grafana for very pretty visualizations.
It also does real-time query analytics!
20. Some things are going to break, what do I do?
sql_mode - many things that are “deprecated” are actually just disabled by a flag here.
Keywords can be wrapped in backticks
Default values can be changed
Queries can be restructured
FORCE INDEX can be a way to reverse optimizer plan changes
Indexes can be added / removed
Performance can be tuned through configuration
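As an example of the sql_mode point: sql_mode is a comma-separated list of flags, so reverting one behavior means removing one flag and setting the rest back. A hedged Python sketch follows; the helper names are made up, and ONLY_FULL_GROUP_BY is used only because it is a common post-upgrade surprise on 5.7.

```python
def without_modes(sql_mode: str, *unwanted: str) -> str:
    """Return sql_mode with the unwanted flags removed, keeping the original order."""
    drop = {mode.strip().upper() for mode in unwanted}
    kept = [mode.strip() for mode in sql_mode.split(",")
            if mode.strip() and mode.strip().upper() not in drop]
    return ",".join(kept)


def set_sql_mode_statement(sql_mode: str) -> str:
    """Build the statement that applies the adjusted sql_mode server-wide."""
    return f"SET GLOBAL sql_mode = '{sql_mode}'"


if __name__ == "__main__":
    # Paste the output of SELECT @@GLOBAL.sql_mode here; this value is an example.
    current = "ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION"
    print(set_sql_mode_statement(without_modes(current, "ONLY_FULL_GROUP_BY")) + ";")
```

Treat a relaxed sql_mode as a bridge while queries get rewritten, and put the same value in my.cnf if it needs to survive a restart.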