This document provides an agenda for a training on MySQL Cluster presented by Severalnines.com. The training will cover topics such as MySQL Cluster architecture, installation, performance tuning, management and administration, disk data, designing a cluster, and troubleshooting. Hands-on lab exercises are included to reinforce the concepts taught. Prerequisites for the training include basic SQL and database knowledge as well as laptop hardware requirements.
Severalnines Self-Training: MySQL® Cluster - Part II (Severalnines)
Part II of our free self-training slides on MySQL Cluster.
In this part we cover 'Detailed Concepts':
* Data Distribution & Partitioning
* Two Phase Commit Protocol
* Transaction Resources
Severalnines Self-Training: MySQL® Cluster - Part VII (Severalnines)
Part VII of our free self-training slides on MySQL Cluster.
In this installment, we cover 'Management and Administration':
* Backup and Restore
* Geographical Redundancy
* Online and Offline Operations
* Ndbinfo tables
* Reporting
* Single User Mode
* Scaling MySQL Cluster
Severalnines Self-Training: MySQL® Cluster - Part VI (Severalnines)
Part VI of our free self-training slides on MySQL Cluster.
In this part we cover 'Configuration and Installation':
* Data Node configuration
* SQL Node configuration
* Important parameters
* Installation
* Upgrading
Severalnines Training: MySQL Cluster - Part X (Severalnines)
Part X of our self-training slides on MySQL Cluster, focused on troubleshooting MySQL Cluster
Topics:
- common problems encountered by users
- error logs and trace files
- recovery and escalation procedures
Conference tutorial: MySQL Cluster as NoSQL (Severalnines)
Slides from the 'MySQL Cluster as NoSQL' tutorial at Percona Live MySQL Conference 2012 in London.
Tutorial covers:
* MySQL Cluster administration
* NoSQL options for MySQL Cluster and when to use what
* Memcached (installation and configuration)
* Cluster/J
* NDBAPI
* Benchmarking of different access methods on a live cluster
This Introduction to GlusterFS webinar introduces and reviews the GlusterFS architecture and key functionality. Learn how GlusterFS is deployed in the datacenter, in the cloud, or between the two. We'll also give a brief update on GlusterFS v3.3, which is currently in beta.
Webinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS Storage (GlusterFS)
Gluster has partnered with Redapt, Inc., an innovative data center architecture and infrastructure solutions provider, to integrate GlusterFS with hardware, providing customers with highly scalable NAS storage technology for on-premise, virtual and cloud environments. Gluster's storage technology enables Redapt to offer a comprehensive, cost-effective storage solution delivering the scalability, performance and reliability that companies need to effectively run their data centers.
This webinar will provide an overview of the partnership and the benefits of the joint solution, and include use cases showing how customers are deploying it today.
Slides presented at the Great Indian Developer Summit 2016 in the session 'MySQL: What's New' on April 29, 2016.
Contains information about the new MySQL Document Store released in April 2016.
MySQL Infrastructure Testing Automation at GitHub (Ike Walker)
The database team at GitHub is tasked with keeping the data available and with maintaining its integrity. Our infrastructure automates away much of our operation, but automation requires trust, and trust is gained by testing. This session highlights three examples of infrastructure testing automation that help us sleep better at night:
- Backups: scheduling backups; making backup data accessible to our engineers; auto-restores and backup validation. What metrics and alerts we have in place.
- Failovers: how we continuously test our failover mechanism, orchestrator. How we set up a failover scenario, what defines a successful failover, how we automate away the cleanup. What we do in production.
- Schema migrations: how we ensure that gh-ost, our schema migration tool, which keeps rewriting our (and your!) data, does the right thing. How we test new branches in production without putting production data at risk.
Ben Golub gives insight into the latest storage trends, including EMC's recent acquisition of Isilon.
http://blog.gluster.com/2010/11/storage-is-sexy-again/
Spectrum Scale - Diversified analytic solution based on various storage servi... (Wei Gong)
These slides describe diversified analytic solutions based on Spectrum Scale in various deployment modes, such as storage-rich servers, shared storage, IBM DeepFlash 150 and Elastic Storage Server. They dive deep into several advanced data management features and solutions for BD&A workloads derived from Spectrum Scale.
Big Data and virtualization are two of the most exciting trends in the industry today. In this session you will learn about the components of Big Data systems, and how real-time, interactive and distributed processing systems like Hadoop integrate with existing applications and databases. The combination of Big Data systems with virtualization gives Hadoop and other Big Data technologies the key benefits of cloud computing: elasticity, multi-tenancy and high availability. A new open source project that VMware will announce at the Hadoop Summit will make it easy to deploy, configure and manage Hadoop on a virtualized infrastructure. We will discuss reference architectures for key Hadoop distributions and discuss future directions of this new open source project.
The 10 Best PostgreSQL Replication Strategies for Your Business (EDB)
This webinar helps you understand the differences between the various replication approaches, recognize the requirements of each strategy, and get a clear picture of what each one can achieve. You should then be better placed to work out which types of PostgreSQL replication your system really needs.
- How physical and logical replication work in PostgreSQL
- Differences between synchronous and asynchronous replication
- Advantages, disadvantages and challenges of multi-master replication
- Which replication strategy is better suited to which use cases
Speaker:
Borys Neselovskyi, Regional Sales Engineer DACH, EDB
Database as a Service on the Oracle Database Appliance Platform (Maris Elsins)
Speaker: Marc Fielding, Co-speaker: Maris Elsins.
Oracle Database Appliance provides a robust, highly available, cost-effective, and surprisingly scalable platform for database-as-a-service environments. By leveraging Oracle Enterprise Manager's self-service features, databases can be provisioned on a self-service basis to a cluster of Oracle Database Appliance machines. Discover how multiple ODA devices can be managed together to provide both high availability and incremental, cost-effective scalability. Hear real-world lessons learned from successful database consolidation implementations.
This presentation was written by Wagner Bianchi for the Oracle Consulting Team/Professional Services meeting that took place in San Francisco, CA.
Gluster for Geeks: Performance Tuning Tips & Tricks (GlusterFS)
In this Gluster for Geeks technical webinar, Jacob Shucart, Senior Systems Engineer, will provide useful tips and tricks to make a Gluster cluster meet your performance requirements. He will review considerations for all different phases including planning, configuration, implementation, tuning, and benchmarking.
Topics covered will include:
• Protocols (CIFS, NFS, GlusterFS)
• Hardware configuration
• Tuning parameters
• Performance benchmarks
Embracing Database Diversity: The New Oracle / MySQL DBA - UKOUG (Keith Hollman)
Classic Oracle DBAs are somewhat starved for the "big overview" knowledge that will make them better decision makers and less hesitant to use MySQL.
The aim is to allow an existing Oracle DBA to get to grips with a MySQL environment, concentrating on the real focus points and highlighting the similarities between the two RDBMSs.
And both worlds provide the necessary tools to avoid a sleepless night.
Inter connect2016 yss1841-cloud-storage-options-v4 (Tony Pearson)
This session will cover private and public cloud storage options, including flash, disk and tape, to address the different types of cloud storage requirements. It will also explain the use of Active File Management for local space management and global access to files, and support for file-and-sync.
"SQL Server Storage Configuration for SharePoint" presented to the Silicon Valley SQL Server User Group on January 13, 2010
Presenter: Burzin Patel, author and Solutions Architect at StorSimple
Learn about the Top Five SQL Server storage configuration best practices for SharePoint, including:
• Disk sizing and configuration
• Externalizing BLOB storage
• Common maintenance tasks
• Performance tuning
New Ceph capabilities and Reference Architectures (Kamesh Pemmaraju)
Have you heard about Inktank Ceph and are you interested in learning some tips and tricks for getting started quickly and efficiently with Ceph? Then this is the session for you!
In this two-part session you will learn details of:
• the very latest enhancements and capabilities delivered in Inktank Ceph Enterprise such as a new erasure coded storage back-end, support for tiering, and the introduction of user quotas.
• best practices, lessons learned and architecture considerations founded in real customer deployments of Dell and Inktank Ceph solutions that will help accelerate your Ceph deployment.
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About? (Red_Hat_Storage)
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About? By: Kamesh Pemmaraju, Neil Levine
Have you heard about Inktank Ceph and are you interested in learning some tips and tricks for getting started quickly and efficiently with Ceph? Then this is the session for you! In this two-part session you will learn details of:
• the very latest enhancements and capabilities delivered in Inktank Ceph Enterprise such as a new erasure coded storage back-end, support for tiering, and the introduction of user quotas.
• best practices, lessons learned and architecture considerations founded in real customer deployments of Dell and Inktank Ceph solutions that will help accelerate your Ceph deployment.
DIY DBaaS: A guide to building your own full-featured DBaaS (Severalnines)
More so than ever, businesses need to ensure that their databases are resilient, secure, and always available to support their operations. Database-as-a-Service (DBaaS) solutions have become a popular way for organizations to manage their databases efficiently, leveraging cloud infrastructure and advanced set-and-forget automation.
However, consuming DBaaS from providers comes with many compromises. In this guide, we’ll show you how you can build your own flexible DBaaS, your way. We’ll demonstrate how it is possible to get the full spectrum of DBaaS capabilities along with workload access and portability, and avoid surrendering control to a third-party.
From architectural and design considerations to operational requirements, we’ll take you through the process step-by-step, providing all the necessary information and guidance to help you build a DBaaS solution that is tailor-made to your unique use case. So get ready to dive in and learn how to build your own custom DBaaS solution from scratch!
We created this guide to help developers understand:
- Traditional vs. Sovereign DBaaS implementation models
- The DBaaS environment, elements and design principles
- Using a Day 2 operations framework to develop your blueprint
- The 8 key operations that form the foundation of a complete DBaaS
- Bringing the Day 2 ops framework to life with a provisional architecture
- How you can abstract the orchestration layer with Severalnines solutions
Cloud's future runs through Sovereign DBaaS (Severalnines)
Sovereign DBaaS is a new way to do DBaaS that allows you to reliably scale your open-source database ops without being limited to a specific environment or ceding control of your infrastructure to third-party service providers.
With Sovereign DBaaS, users can leverage the benefits of modern deployment strategies, e.g. public cloud, hybrid, etc., with additional security, compliance, and risk mitigation. So what exactly is Sovereign DBaaS and why should you choose one?
Presented by Sanjeev Mohan, Principal Analyst at SanjMo and former Gartner Research VP, and Vinay Joosery, CEO of Severalnines, this webinar dives into the future of the cloud and database management and introduces a new solution, Sovereign DBaaS.
Agenda:
- The state of the cloud and its current challenges
- What is Sovereign DBaaS?
- Key features of Sovereign DBaaS
- Why you should choose a Sovereign DBaaS
- How you can implement Sovereign DBaaS with Severalnines
- Q&A
Tips to drive MariaDB Cluster performance for Nextcloud (Severalnines)
Nextcloud requires a database to store administrative data for the platform. A poorly performing database can have a serious impact on performance and availability of Nextcloud. MariaDB Cluster is the recommended database backend for production installations that require high availability and performance.
This talk is a deep dive into how to design and optimize MariaDB Galera Cluster for Nextcloud. We will cover 5 tips on how to significantly improve performance and stability.
Agenda:
- Overview of Nextcloud architecture
- Database architecture design
- Database proxy
- MariaDB and InnoDB performance tuning
- Nextcloud performance tuning
- Q&A
Working with the Moodle Database: The Basics (Severalnines)
Managing the database behind Moodle is key to improving performance and achieving uptime for your users. In this training video we will talk about the Moodle database including topics like configuration, monitoring, and schema management as well as show you how ClusterControl can help with the management of your eLearning LMS systems.
SysAdmin Working from Home? Tips to Automate MySQL, MariaDB, Postgres & MongoDB (Severalnines)
Are you a SysAdmin who is now responsible for your company's database operations? Then this is the webinar for you. Learn from a Senior DBA the basics you need to know to keep things up and running, and how automation can help.
(slides) Polyglot persistence: utilizing open source databases as a Swiss poc... (Severalnines)
Over the past few years, VidaXL has become a European market leader in the online retail of slow moving consumer goods. When a company has achieved over 50% year over year growth for the past 9 years, there is hardly enough time to overhaul existing systems. This means existing systems will be stretched to the maximum of their capabilities, and often additional performance will be gained by utilizing a large variety of datastores.
Webinar slides: How to Migrate from Oracle DB to MariaDB (Severalnines)
Watch this webinar replay as we walk you through all you need to know to plan and execute a successful migration from Oracle database to MariaDB.
Over the years MariaDB has gained Enterprise support and maturity to run critical and complex data transaction systems. With the recent version, MariaDB has added some great new features such as SQL_Mode=Oracle compatibility, making the transition process easier than ever before.
Whether you’re planning to migrate from Oracle database to MariaDB manually or with the help of a commercial tool to automate the entire migration process, you need to know all the possible bottlenecks and methods involved in the process and the results validation.
Migrating from Oracle database to MariaDB can come with a number of benefits: lower cost of ownership, access to and use of an open source database engine, tight integration with the web, wide circle of MariaDB database professionals and more.
Find out how it could benefit your organisation!
AGENDA
- A brief introduction to the platform
- Oracle vs MariaDB
  - Platform support
  - Installation process
  - Database access
  - Backup process
  - Controlling query execution
  - Security
  - Replication options
- Migration
  - Planning and development strategy
  - Assessment or preliminary check
  - Data type mapping
  - Migration tools
  - Migration process
  - Testing
- Post-migration
  - Monitoring and Alerting
  - Performance Management
  - Backup Management
  - High availability
  - Upgrades
  - Scaling
  - Staff training
SPEAKER
Bartlomiej Oles, Senior Support Engineer at Severalnines, is a MySQL and Oracle DBA, with over 15 years experience in managing highly available production systems at IBM, Nordea Bank, Acxiom, Lufthansa, and other Fortune 500 companies. In the past five years, his focus has been on building and applying automation tools to manage multi-datacenter database environments.
Running PostgreSQL in production comes with the responsibility for a business critical environment; this includes high availability, disaster recovery, and performance. Ops staff worry whether databases are up and running, if backups are taken and tested for integrity, whether there are performance problems that might affect end user experience, if failover will work properly in case of server failure without breaking applications, and the list goes on.
ClusterControl can be used to operationalize your PostgreSQL footprint across your enterprise. It offers a standard way of deploying high-availability replication setups with auto-failover, integrated with load balancers offering a single endpoint to applications. It provides constant health and performance monitoring through rich dashboards, as well as backup management and point-in-time recovery.
See how much time and effort can be saved, as well as risks mitigated, with the help of a unified management platform over the more traditional, manual methods.
We’ve seen a 152% increase in ClusterControl installations by PostgreSQL users last year, so make sure you don’t miss out on the trend!
AGENDA
- Managing PostgreSQL “the old way”:
  - Common challenges
  - Important tasks to perform
  - Tools that are available to help
- PostgreSQL automation and management with ClusterControl:
  - Deployment
  - Backup and recovery
  - HA setups
  - Failover
  - Monitoring
  - Live Demo
SPEAKER
Sebastian Insausti, Support Engineer at Severalnines, has loved technology since his childhood, when he took his first computer course (Windows 3.11). From that moment he knew what his profession would be. He has since built up experience with MySQL, PostgreSQL, HAProxy, WAF (ModSecurity), Linux (RedHat, CentOS, OL, Ubuntu server), Monitoring (Nagios), Networking and Virtualization (VMware, Proxmox, Hyper-V, RHEV).
Prior to joining Severalnines, Sebastian worked as a consultant to state companies in security, database replication and high availability scenarios. He's also a speaker and has given a few talks locally on InnoDB Cluster and MySQL Enterprise together with an Oracle team. Before that, he worked for a Mexican company as head of the sysadmin department as well as for a local ISP (Internet Service Provider), where he managed customers' servers and connectivity.
Webinar slides: How to Manage Replication Failover Processes for MySQL, Maria... (Severalnines)
Failover is the process of moving to a healthy standby component, during a failure or maintenance event, in order to preserve uptime. The quicker it can be done, the faster you can be back online. However, failover can be tricky for transactional database systems as we strive to preserve data integrity - especially in asynchronous or semi-synchronous topologies. There are risks associated, from diverging datasets to loss of data. Failing over due to incorrect reasoning, e.g., failed heartbeats in the case of network partitioning, can also cause significant harm.
This webinar replay gives a detailed overview of what failover processes may look like in MySQL, MariaDB and PostgreSQL replication setups. It covers the dangers related to the failover process and discusses the tradeoffs between failover speed and data integrity. It shows how to shield applications from database failures with the help of proxies. Finally, it looks at how ClusterControl manages the failover process, and how it can be configured for both assisted and automated failover.
So if you're looking at minimizing downtime and meeting your SLAs through an automated or semi-automated approach, then this webinar replay is for you!
AGENDA
- An introduction to failover - what, when, how
  - in MySQL / MariaDB
  - in PostgreSQL
- To automate or not to automate
- Understanding the failover process
- Orchestrating failover across the whole HA stack
- Difficult problems
  - Network partitioning
  - Missed heartbeats
  - Split brain
- From assisted to fully automated failover with ClusterControl
- Demo
SPEAKER
Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard.
What if …
- Traditional, labour-intensive backup and archive practices for your MySQL, MariaDB, MongoDB and PostgreSQL databases were a thing of the past?
- You could have one backup management solution for all your business data?
- You could ensure integrity of all your backups?
- You could leverage the competitive pricing and almost limitless capacity of cloud-based backup while meeting cost, manageability, and compliance requirements from the business?
Welcome to our webinar on Backup Management with ClusterControl.
ClusterControl’s centralized backup management for open source databases provides you with hot backups of large datasets, point in time recovery in a couple of clicks, at-rest and in-transit data encryption, data integrity via automatic restore verification, cloud backups (AWS, Google and Azure) for Disaster Recovery, retention policies to ensure compliance, and automated alerts and reporting.
Whether you are looking at rebuilding your existing backup infrastructure, or updating it, this webinar is for you!
AGENDA
- Backup and recovery management of local or remote databases
  - Logical or physical backups
  - Full or Incremental backups
  - Position or time-based Point in Time Recovery (for MySQL and PostgreSQL)
  - Upload to the cloud (Amazon S3, Google Cloud Storage, Azure Storage)
  - Encryption of backup data
  - Compression of backup data
- One centralized backup system for your open source databases (Demo)
  - Schedule, manage and operate backups
  - Define backup policies, retention, history
  - Validation - Automatic restore verification
  - Backup reporting
SPEAKER
Bartlomiej Oles, Senior Support Engineer at Severalnines, is a MySQL and Oracle DBA, with over 15 years experience in managing highly available production systems at IBM, Nordea Bank, Acxiom, Lufthansa, and other Fortune 500 companies. In the past five years, his focus has been on building and applying automation tools to manage multi-datacenter database environments.
Bart Oles - Severalnines AB
Organizations need an appropriate disaster recovery plan to mitigate the impact of downtime. But how much should a business invest? Designing a highly available system comes at a cost, and not all businesses and indeed not all applications need five 9's availability.
We will explain fundamental disaster recovery concepts and walk you through the relevant options from the MySQL & MariaDB ecosystem to meet different tiers of disaster recovery requirements, and demonstrate how to automate an appropriate disaster recovery plan.
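To make the "five 9's" figure in the abstract above concrete, here is a minimal sketch of the arithmetic behind availability targets. It is not taken from the talk; the function name `allowed_downtime_seconds` and the 365-day year are assumptions made for illustration.

```python
# Illustrative arithmetic only: converts an availability target expressed
# as a number of "nines" into the downtime it permits per year.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def allowed_downtime_seconds(nines: int) -> float:
    """Return the yearly downtime budget for an availability of N nines."""
    if nines < 1:
        raise ValueError("nines must be a positive integer")
    unavailability = 10 ** -nines  # e.g. 5 nines -> 0.00001
    return SECONDS_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in range(2, 6):
        minutes = allowed_downtime_seconds(n) / 60
        print(f"{n} nines: about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, five nines leaves a budget of a little over five minutes per year, while two nines allows more than three and a half days, which is why the required tier drives how much a disaster recovery design costs.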
Krzysztof Ksiazek - Severalnines AB
So, you are a developer or sysadmin and have shown some ability in dealing with database issues. And now, you have been elected to the role of DBA. And as you start managing the databases, you wonder…
* How do I tune them to make best use of the hardware?
* How do I optimize the Operating System?
* How do I best configure MySQL or MariaDB for a specific database workload?
If you're asking yourself these questions when it comes to optimally running your MySQL or MariaDB databases, then this talk is for you!
We will discuss some of the settings that are most often tweaked and which can bring you significant improvement in the performance of your MySQL or MariaDB database. We will also cover some of the variables which are frequently modified even though they should not.
Performance tuning is not easy, especially if you're not an experienced DBA, but you can go a surprisingly long way with a few basic guidelines.
Performance Tuning Cheat Sheet for MongoDBSeveralnines
Bart Oles - Severalnines AB
Database performance affects organizational performance, and we tend to look for quick fixes when under stress. But how can we better understand our database workload and factors that may cause harm to it? What are the limitations in MongoDB that could potentially impact cluster performance?
In this talk, we will show you how to identify the factors that limit database performance. We will start with the free MongoDB Cloud monitoring tools. Then we will move on to log files and queries. To be able to achieve optimal use of hardware resources, we will take a look into kernel optimization and other crucial OS settings. Finally, we will look into how to examine performance of MongoDB replication.
Advanced MySql Data-at-Rest Encryption in Percona ServerSeveralnines
Iwo Panowicz - Percona & Bart Oles - Severalnines AB
The purpose of the talk is to present data-at-rest encryption implementation in Percona Server for MySQL.
Differences between Oracle's MySQL and MariaDB implementation.
- How it is implemented?
- What is encrypted:
- Tablespaces?
- General tablespace?
- Double write buffer/parallel double write buffer?
- Temporary tablespaces? (KEY BLOCKS)
- Binlogs?
- Slow/general/error logs?
- MyISAM? MyRocks? X?
- Performance overhead.
- Backups?
- Transportable tablespaces. Transfer key.
- Plugins
- Keyrings in general
- Key rotation?
- General-Purpose Keyring Key-Management Functions
- Keyring_file
- Is useful? How to make it profitable?
- Keyring Vault
- How does it work?
- How to make a transition from keyring_file
Polyglot Persistence Utilizing Open Source Databases as a Swiss Pocket KnifeSeveralnines
Art Van Scheppingen - vidaXL & Bart Oles - Severalnines AB
Over the past few years, VidaXL has become a European market leader in the online retail of slow moving consumer goods. When a company achieved over 50% year over year growth for the past 9 years, there is hardly enough time to overhaul existing systems. This means existing systems will be stretched to the maximum of their capabilities, and often additional performance will be gained by utilizing a large variety of datastores.
Polyglot persistence reigns in rapidly growing environments and the traditional one-size-fits-all strategy of monoglots is over.
VidaXL has a broad landscape of datastores, ranging from traditional SQL data stores, like MySQL or PostgreSQL alongside more recent load balancing technologies such as ProxySQL, to document stores like MongoDB and search engines such as SOLR and Elasticsearch.
Webinar slides: Free Monitoring (on Steroids) for MySQL, MariaDB, PostgreSQL ...Severalnines
Traditional server monitoring tools are not built for modern distributed database architectures. Let’s face it, most production databases today run in some kind of high availability setup - from simpler master-slave replication to multi-master clusters fronted by redundant load balancers. Operations teams deal with dozens, often hundreds of services that make up the database environment.
This is why we built ClusterControl - to address modern, highly distributed database setups based on replication or clustering. We wanted something that could provide a systems view of all the components of a distributed cluster, including load balancers.
Watch this replay of a webinar on free database monitoring using ClusterControl Community Edition. We show you how to monitor all your MySQL, MariaDB, PostgreSQL and MongoDB systems from a single point of control - whether they are deployed as Galera Clusters, sharded clusters or replication setups across on-prem and cloud data centers. We also see how to use Advisors in order to improve performance.
AGENDA
- Requirements for monitoring distributed database systems
- Cloud-based vs On-prem monitoring solutions
- Agent-based vs Agentless monitoring
- Deepdive into ClusterControl Community Edition
- Architecture
- Metrics Collection
- Trending
- Dashboards
- Queries
- Performance Advisors
- Other features available to Community users
SPEAKER
Bartlomiej Oles is a MySQL and Oracle DBA, with over 15 years experience in managing highly available production systems at IBM, Nordea Bank, Acxiom, Lufthansa, and other Fortune 500 companies. In the past five years, his focus has been on building and applying automation tools to manage multi-datacenter database environments.
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLSeveralnines
To operate PostgreSQL efficiently, you need to have insight into database performance and make sure it is at optimal levels.
With that in mind, we dive into monitoring PostgreSQL for performance in this webinar replay.
PostgreSQL offers many metrics through various status overviews and commands, but which ones really matter to you? How do you trend and alert on them? What is the meaning behind the metrics? And what are some of the most common causes for performance problems in production?
We discuss this and more in ordinary, plain DBA language. We also have a look at some of the tools available for PostgreSQL monitoring and trending; and we’ll show you how to leverage ClusterControl’s PostgreSQL metrics, dashboards, custom alerting and other features to track and optimize the performance of your system.
AGENDA
- PostgreSQL architecture overview
- Performance problems in production
- Common causes
- Key PostgreSQL metrics and their meaning
- Tuning for performance
- Performance monitoring tools
- Impact of monitoring on performance
- How to use ClusterControl to identify performance issues
- Demo
SPEAKER
Sebastian Insausti, Support Engineer at Severalnines, has loved technology since his childhood, when he did his first computer course (Windows 3.11). And from that moment he was decided on what his profession would be. He has since built up experience with MySQL, PostgreSQL, HAProxy, WAF (ModSecurity), Linux (RedHat, CentOS, OL, Ubuntu server), Monitoring (Nagios), Networking and Virtualization (VMWare, Proxmox, Hyper-V, RHEV).
Prior to joining Severalnines, Sebastian worked as a consultant to state companies in security, database replication and high availability scenarios. He’s also a speaker and has given a few talks locally on InnoDB Cluster and MySQL Enterprise together with an Oracle team. Previous to that, he worked for a Mexican company as chief of sysadmin department as well as for a local ISP (Internet Service Provider), where he managed customers' servers and connectivity.
This webinar builds upon a related blog post by Sebastian: https://severalnines.com/blog/performance-cheat-sheet-postgresql.
Webinar slides: Our Guide to MySQL & MariaDB Performance TuningSeveralnines
If you’re asking yourself the following questions when it comes to optimally running your MySQL or MariaDB databases:
- How do I tune them to make best use of the hardware?
- How do I optimize the Operating System?
- How do I best configure MySQL or MariaDB for a specific database workload?
Then this replay is for you!
We discuss some of the settings that are most often tweaked and which can bring you significant improvement in the performance of your MySQL or MariaDB database. We also cover some of the variables which are frequently modified even though they should not.
Performance tuning is not easy, especially if you’re not an experienced DBA, but you can go a surprisingly long way with a few basic guidelines.
This webinar builds upon blog posts by Krzysztof from the ‘Become a MySQL DBA’ series.
AGENDA
- What to tune and why?
- Tuning process
- Operating system tuning
- Memory
- I/O performance
- MySQL configuration tuning
- Memory
- I/O performance
- Useful tools
- Do’s and do not’s of MySQL tuning
- Changes in MySQL 8.0
SPEAKER
Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard.
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDBSeveralnines
Galera Cluster is a mainstream option for high availability MySQL and MariaDB. And though it has established itself as a credible replacement for traditional MySQL master-slave architectures, it is not a drop-in replacement.
While Galera Cluster has some characteristics that make it unsuitable for certain use cases, most applications can still be adapted to run on it.
The benefits are clear: multi-master InnoDB setup with built-in failover and read scalability.
But how do you migrate? Does the schema or application change? What are the limitations? Can a migration be done online, without service interruption? What are the potential risks?
In this webinar, Severalnines Support Engineer Bart Oles walks you through what you need to know in order to migrate from standalone or a master-slave MySQL/MariaDB setup to Galera Cluster.
AGENDA
- Application use cases for Galera
- Schema design
- Events and Triggers
- Query design
- Migrating the schema
- Load balancer and VIP
- Loading initial data into the cluster
- Limitations:
- Cluster technology
- Application vendor support
- Performing Online Migration to Galera
- Operational management checklist
- Belts and suspenders: Plan B
- Demo
SPEAKER
Bartlomiej Oles is a MySQL and Oracle DBA, with over 15 years experience in managing highly available production systems at IBM, Nordea Bank, Acxiom, Lufthansa, and other Fortune 500 companies. In the past five years, his focus has been on building and applying automation tools to manage multi-datacenter database environments.
Webinar slides: How to Measure Database Availability?Severalnines
Database availability is notoriously hard to measure and report on, although it is an important KPI in any SLA between you and your customer. We often define availability in terms of 9’s (e.g. 99.9% or 99.999%), although there is often a lack of understanding of what these numbers might mean, or how we can measure them.
Is the database available if an instance is up and running, but it is unable to serve any requests? Or if response times are excessively long, so that users consider the service unusable? Is the impact of one longer outage the same as multiple shorter outages? How do partial outages affect database availability, where some users are unable to use the service while others are completely unaffected?
Not agreeing on precise definitions with your customer might lead to dissatisfaction. The database team might be reporting that they have met their availability goals, while the customer is dissatisfied with the service. In this webinar, we will discuss the different factors that affect database availability. We will then see how you can measure your database availability in a realistic way.
AGENDA
- Defining availability targets
- Critical business functions
- Customer needs
- Duration and frequency of downtime
- Planned vs unplanned downtime
- SLA
- Measuring the database availability
- Failover/Switchover time
- Recovery time
- Upgrade time
- Queries latency
- Restoration time from backup
- Service outage time
- Instrumentation and tools to measure database availability:
- Free & open-source tools
- CC's Operational Report
- Paid tools
SPEAKER
Bartlomiej Oles is a MySQL and Oracle DBA, with over 15 years experience in managing highly available production systems at IBM, Nordea Bank, Acxiom, Lufthansa, and other Fortune 500 companies. In the past five years, his focus has been on building and applying automation tools to manage multi-datacenter database environments.
Severalnines Self-Training: MySQL® Cluster - Part VIII
1. MySQL Cluster Training
presented by severalnines.com
Address:
Severalnines AB
c/o SICS, Box 1263
Isafjordsgatan 22
SE-164-29 Kista
Contact:
services@severalnines.com
Copyright 2011 Severalnines AB Control your database infrastructure 1
2. Full Training Agenda (1/4)
• MySQL Cluster Introduction
– MySQL ecosystem
– Scale up, scale out, and sharding
– MySQL Cluster Architecture
– Use cases
– Features
– Node types and Roles
• Detailed Concepts
– Data Distribution
– Verifying data distribution
– Access Methods
– Partitioning
– Node failures and failure detection
– Network Partitioning
– Transactions and Locking
– Consistency Model
– Redo logging and Checkpointing
• Internals
– NDB Design Internals
3. Agenda (2/4)
• Installing MySQL Cluster
– Setting up MySQL Cluster
– Starting/stopping nodes
– Recovery and restarts
– Upgrading configuration
– Upgrading Cluster
• Performance Tuning (instructor-led only; contact us at services@severalnines.com)
– Differences compared to InnoDB/MyISAM
– Designing efficient and fast applications
– Identifying bottlenecks
– Tweaking configuration (OS and MySQL Cluster)
– Query Tuning
– Schema Design
– Index Tuning
4. Agenda (3/4)
• Management and Administration
– Backup and Restore
– Geographical Replication
– Online and offline operations
– Ndbinfo tables
– Reporting
– Single user mode
– Scaling Cluster
• Disk Data
– Use cases
– Limitations
– Best practice configuration
• Designing a Cluster
– Capacity Planning and Dimensioning
– Hardware recommendations
– Best practice Configuration
– Storage calculations
5. Agenda (4/4)
• Resolving Issues
– Common problems
– Error logs and Tracefiles
– Recovery and Escalation procedures
• Connectivity Overview
– NDBAPI
– Cluster/J
– LDAP
• Severalnines Tools
– Monitoring and Management
– Benchmarking
– Sandboxes
– Configuration and capacity planning
• Conclusion
6. Agenda: Lab Exercises
(only applicable to instructor-led training classes)
• Lab Exercises
– Installing and Loading data into MySQL Cluster
– Starting/stopping nodes, recovery
– Query tuning
– Backup and Restore
– Configuration Upgrade
• Would you like to try something in particular?
– This is possible too; speak with your instructor
7. Prerequisites
• Readers / Participants have an understanding of SQL and basic database concepts.
• Laptops/PCs for hands-on exercises
• Linux: 1GB RAM
• Windows: 2GB RAM
• Approx. 20GB disk space and VirtualBox installed.
• VirtualBox can be downloaded for free at http://www.virtualbox.org/wiki/Downloads
• MySQL Cluster version 7.1 or later
8. 8th Installment
Severalnines Cluster Self-Training
Part 7: Disk Data
9. Topics covered in Installment 8
• Use cases
• Limitations
• Best practice configuration
10. Use Cases For Disk Data
• By default, MySQL Cluster tables store data in RAM.
• If there is more data than RAM, disk data tables can be used.
• Disk data tables work in a similar way to InnoDB with innodb_flush_log_at_trx_commit=2
– Both use a buffer pool (innodb_buffer_pool_size vs DiskPageBufferMemory) to cache data (LRU).
– InnoDB tables usually perform better for index scans that hit the disk.
11. Use Cases For Disk Data
• Avoid long-running transactions and big UPDATEs/DELETEs. Instead, divide the work into chunks:
– DELETE … LIMIT 1000;
• Or use TRUNCATE TABLE to empty the table
– UPDATE nnn SET .. WHERE x=y LIMIT 1000;
• JOINs, even pushed-down ones, will be slow on Disk Data tables.
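The chunking advice above can be sketched as follows. This is a minimal illustration rather than part of the original slides: the table log_archive and its created column are hypothetical names, and the batch size of 1000 simply mirrors the LIMIT used above.

```sql
-- Hypothetical table and column names; adjust to your own schema.
-- Run with autocommit enabled so that each batch is its own short transaction.
DELETE FROM log_archive
WHERE created < '2011-01-01'
LIMIT 1000;

-- ROW_COUNT() reports how many rows the previous DELETE removed.
-- The application repeats the DELETE until this value reaches 0.
SELECT ROW_COUNT() AS rows_deleted;
```

Small batches keep every transaction short, so locks and transaction resources on the data nodes are released quickly.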
12. In-Memory vs Disk Data Tables
• Transparency
– Accessing disk data and in-memory tables is the same from the user's point of view
• Performance
– In-memory tables should be used for real-time (RT) traffic
– Queries hitting the disk will be slow, as in any database. Avoid the disk with caches and/or use SSDs.
13. Use Cases For Disk Data
• Disk data tables are usually used for applications that do not require <5ms response times
• Examples:
– Log / archival data
– MMS/SMS messages
• There are sites with > 3TB of data in disk data tables.
• It is possible to have both in-memory and disk data tables at the same time.
• Don't store e.g. images/videos in BLOBs, regardless of whether you use disk data tables or in-memory tables. They are better kept on the filesystem, and only metadata should be kept inside the database.
14. Concepts
• Data is stored in TABLESPACEs
– There can be several tablespaces
• A TABLESPACE contains
– One or more data files
– More data files can be added online
• One LOGFILE GROUP is associated with each tablespace and contains
– One UNDO BUFFER
• The size cannot be changed later on!
– One or more UNDO LOG files
– There can be only one LOGFILE GROUP
– More undo files can be added online
15. Concepts
• Non-indexed columns are stored on disk.
• Indexed columns are stored in RAM (DataMemory).
• VAR* columns on disk are treated as fixed size (up to and including MySQL Cluster 7.1)
– A VARBINARY(1024) column uses 1024B of storage, whatever the actual length of the value

INDEXED  COLUMN  DATATYPE
YES      ID      BIGINT
NO       DATA1   VARBINARY(1024)
YES      TS      TIMESTAMP

ID  TS    DATA1
1   1111  aaaa
2   2222  bbbbbb
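As a rough worked example of what fixed-size storage means for dimensioning, the disk space taken by the DATA1 column alone can be estimated with simple arithmetic. The row count of 100 million is an assumption chosen for illustration, and page and row overhead are ignored.

```sql
-- Assumption: 100,000,000 rows, each DATA1 value occupying the full
-- 1024 bytes on disk regardless of its actual length.
-- Result is in GiB: about 95.4 for these numbers.
SELECT 100000000 * 1024 / POW(1024, 3) AS data1_gib;
```

Even short values such as 'aaaa' cost the full declared width, so it pays to declare VAR* columns no wider than needed.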
16. Disk Data Tables
• DiskPageBufferMemory is an LRU page cache
[Diagram: Data Node 0 with the blocks LQH, PGMAN, LGMAN, ACC and TUP, and the memory areas DiskPageBufferMemory (#ref, DATA1), DataMemory / IM (ID, TS, #ref; row 1, 1111, &11 in P0) and REDO BUFFER. On disk: REDO LOG, LCP, TABLESPACE (&11 aaaa) and UNDO LOG.]
17. Disk Data Tables - READ
• Cache miss – read page from TABLESPACE
[Diagram: same layout as the previous slide. Page P0 with the entry (&11, aaaa) now appears in DiskPageBufferMemory, read from the TABLESPACE.]
18. Disk Data Tables - WRITE
• Cache miss – read page from TABLESPACE
[Diagram: same layout as the previous slide. Row 2 (2, 2222, &22) is added in DataMemory, the entry (&22, bbbb) is added to page P0 in DiskPageBufferMemory, and entries for row 2 appear in the REDO LOG, TABLESPACE and UNDO LOG.]
19. Configuration Parameters
• DiskPageBufferMemory=4096M
– Similar to innodb_buffer_pool_size
– The bigger it is, the more records can be cached; a higher probability that the data is cached means better performance
– Updates/inserts are performed in the cache and later checkpointed to the tablespace.
• SharedGlobalMemory=512M
– Allocate between 64M and 128M for the UNDO_BUFFER
– Metadata structures are also allocated from this resource.
20. CREATING THE LOGFILE GROUP
CREATE LOGFILE GROUP lg ADD UNDOFILE
'undo_0.dat' INITIAL_SIZE=20G
UNDO_BUFFER_SIZE=128M ENGINE=NDB;
• This creates a LOGFILE GROUP with one UNDOFILE that is 20GB in size.
– More undo files can be added later (online, no downtime):
– ALTER LOGFILE GROUP lg ADD UNDOFILE
'undo_1.dat' INITIAL_SIZE=20G ENGINE=NDB;
• The UNDO_BUFFER_SIZE is set to 128M of RAM; the buffer is allocated from SharedGlobalMemory (512M in this configuration).
21. CREATING THE TABLESPACE
CREATE TABLESPACE ts_0 ADD DATAFILE
'data_0.dat' USE LOGFILE GROUP lg
INITIAL_SIZE=100G ENGINE=NDB;
• This creates a TABLESPACE ts_0 with one DATAFILE that is 100GB in size.
• More data files and more tablespaces can be created/added online:
– ALTER TABLESPACE ts_0 ADD DATAFILE
'data_1.dat' INITIAL_SIZE=100G ENGINE=NDB;
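To decide when another data file is needed, the free space in a tablespace can be checked from the SQL node. The query below is a sketch, not part of the original slides: it reads INFORMATION_SCHEMA.FILES and assumes the tablespace name ts_0 used above. Each data file is expected to be reported once per data node.

```sql
-- EXTENT_SIZE is in bytes; the EXTRA column identifies the data node
-- (CLUSTER_NODE=<id>) that the row refers to.
SELECT FILE_NAME,
       EXTRA,
       TOTAL_EXTENTS * EXTENT_SIZE / 1024 / 1024 AS total_mb,
       FREE_EXTENTS  * EXTENT_SIZE / 1024 / 1024 AS free_mb
FROM INFORMATION_SCHEMA.FILES
WHERE TABLESPACE_NAME = 'ts_0'
  AND FILE_TYPE = 'DATAFILE';
```

When free_mb approaches zero on any node, add another data file with ALTER TABLESPACE as shown above.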
22. CREATING A TABLE IN A TABLESPACE
CREATE TABLE t1 (
a INTEGER AUTO_INCREMENT PRIMARY KEY,
b VARBINARY(2048))
ENGINE=ndb
TABLESPACE ts_0 STORAGE DISK;
• Column 'b' will be stored in tablespace ts_0. Column
'a' is the primary key, and indexed columns are always
stored in DataMemory.
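Storage can also be chosen per column rather than for the whole table. The following is a sketch only: the table t2 and its column c are hypothetical, and it assumes the tablespace ts_0 created earlier.

```sql
-- Hypothetical table t2: only the large column goes to disk.
CREATE TABLE t2 (
  a INTEGER AUTO_INCREMENT PRIMARY KEY,  -- indexed, always in DataMemory
  b VARBINARY(2048) STORAGE DISK,        -- stored in tablespace ts_0
  c INTEGER STORAGE MEMORY               -- small, frequently read, kept in RAM
) ENGINE=NDB
  TABLESPACE ts_0;
```

Keeping small, frequently read columns in memory avoids a disk page lookup for queries that never touch column b.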
23. Storage Subsystem
• There are four parts that need to be written to disk:
– LCP
– REDO LOG (the same redo log files are used for disk data
tables and in-memory tables)
– UNDO LOG
– TABLESPACE
• In many cases you can put:
– On one disk partition:
• LCP/REDO LOG/ UNDO LOG
– On another disk partition
• TABLESPACE
• If you are going to READ a lot from the TABLESPACE, then
use SSD for the TABLESPACE data files
24. Storage Subsystem
• SAS 10K RPM or 15K RPM disks are proven to work
very well.
• A good disk layout can look like:
– RAID 1+0 (4 disks) for LCP/REDO/UNDO
– RAID 1+0 (4 disks) for TABLESPACE
OR
– RAID 1+0 (4 disks) for LCP/REDO
– RAID 1+0 (4 disks SAS) for TABLESPACE/UNDO
• The main point here is that LCP/REDO/UNDO files
are only written to during normal operation.
TABLESPACE files are both read and written, and
reads are most commonly random.
– Again, use SSD or a separate set of disks for TABLESPACE
data if you will issue READ requests.
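The first layout above (LCP/REDO/UNDO on one volume, TABLESPACE on another) could be expressed in config.ini roughly as follows. This is a sketch only: the mount points /data/ndb_log and /data/ndb_ts are hypothetical, and the parameters should be checked against the documentation for the cluster version in use.

```ini
[ndbd default]
# LCP and REDO LOG files go here (hypothetical mount point, volume 1)
FileSystemPath=/data/ndb_log
# UNDO LOG files: same volume as LCP/REDO
FileSystemPathUndoFiles=/data/ndb_log
# TABLESPACE data files: separate volume, ideally SSD (volume 2)
FileSystemPathDataFiles=/data/ndb_ts
```

For the second layout, FileSystemPathUndoFiles would point at the TABLESPACE volume instead.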
25. Monitor DiskData with ClusterControl
• ClusterControl monitors your tablespace
26. Summary
• UNDOLOG – written “all the time”
– Circular, WAL (write ahead logging)
• Tablespace – pages checkpointed from
DiskPageBufferMemory
– First DataMemory is LCPd
– Then DiskPageBufferMemory
• A large DiskPageBufferMemory cache is very
important for performance
– After dimensioning, as much RAM as possible should be
allocated to the DiskPageBufferMemory.
• Use in-memory tables when response times are
critical.
• A good fast disk subsystem is needed
27. Coming next in Installment 9:
Designing a Cluster
28. We hope these training slides are
useful to you!
Please visit our website to view the
next section of this training.
For any questions, comments, feedback or to
book a training class, please contact us at:
services@severalnines.com
Thank you!
Severalnines has been offering its products free of charge since 2007, while the founders were employed at MySQL. These products are the de-facto standard tools to assist MySQL customers and users in deploying their MySQL clusters. More information about Severalnines at www.severalnines.com