Do you need Ops in your new startup? If not now, then when? And...what is Ops?
Learn how to scale Ruby-based distributed software infrastructure in the cloud to serve 4,000 requests per second, handle 400 updates per second, and achieve 99.97% uptime, all while building the product at the speed of light.
Unimpressed? Now try doing all of the above without an Ops team, while growing your traffic 100x in 6 months and deploying 5-6 times a day!
It could be a dream, but luckily it's a reality that could be yours.
From Obvious to Ingenius: Incrementally Scaling Web Apps on PostgreSQL | Konstantin Gredeskoul
In this exciting and informative talk, presented at PgConf Silicon Valley 2015, Konstantin cuts through the theory to deliver a clear set of practical solutions for scaling applications atop PostgreSQL, eventually supporting millions of active users, tens of thousands of them concurrently, with an application stack that responds to requests in 100 ms on average. He shares how his team solved one of the biggest challenges they faced: effectively storing and retrieving over 3B rows of "saves" (a Wanelo equivalent of Instagram's "like" or Pinterest's "pin"), all in PostgreSQL, with highly concurrent random access.
Over the last three years, the team at Wanelo optimized the hell out of their application and database stacks. Using PostgreSQL version 9 as their primary data store and Joyent Public Cloud as a hosting environment, the team re-architected their backend for rapid expansion several times over as the unrelenting traffic kept climbing. This ultimately resulted in a highly efficient, horizontally scalable, fault-tolerant application infrastructure. Unimpressed? Now try getting there without Ops or DBA teams, all while deploying seven times per day to production, with an application measuring 99.999% uptime over the last 6 months.
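To put the uptime figures quoted in these abstracts in perspective, here is a minimal sketch that converts an uptime percentage into allowed downtime. The function name `downtime_minutes` and the window lengths are assumptions made for illustration: six months is taken as 182.5 days, and the 99.97% figure is evaluated over a 30-day month because the abstract does not state its measurement window.

```python
def downtime_minutes(uptime_percent: float, days: float) -> float:
    """Return the minutes of downtime implied by an uptime percentage over a window of `days`."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


# Assumption: "the last 6 months" is approximated as 182.5 days.
six_months = downtime_minutes(99.999, 182.5)
# Assumption: a 30-day window, since no window is given for the 99.97% figure.
one_month = downtime_minutes(99.97, 30)

print(f"99.999% over ~6 months: {six_months:.2f} minutes of downtime")
print(f"99.97% over 30 days:    {one_month:.2f} minutes of downtime")
```

Under those assumptions, 99.999% over six months leaves under three minutes of total downtime, while 99.97% over a month leaves about thirteen.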
Most enterprise cloud adoption has relied on virtual machines and infrastructure as a service. However, there is a lot to love about the other approach to clouds: platform as a service. In a PaaS model, you worry about your code, and the systems take care of the rest. True Platform-as-a-Service not only reduces the cost of hardware infrastructure, but also reduces the complexity of the software stack that runs on it. PaaS promises to trim development and deployment time from months and years to days and weeks, but what are the signs of a true PaaS powerhouse? Is it simply free of servers or software to manage? Does it provide automatic upgrades and elasticity? Can you develop in multiple languages and across multiple device platforms? Many informed analysts think PaaS is the inevitable consequence of true utility computing. In this session, Patrick Chanezon of VMware explains why PaaS may be the future of the enterprise.
The Hadoop Distributed File System is the foundational storage layer in typical Hadoop deployments. Performance and stability of HDFS are crucial to the correct functioning of applications at higher layers in the Hadoop stack. This session is a technical deep dive into recent enhancements committed to HDFS by the entire Apache contributor community. We describe real-world incidents that motivated these changes and how the enhancements prevent those problems from recurring. Attendees will leave this session with a deeper understanding of the implementation challenges in a distributed file system and will be able to identify helpful new metrics to monitor in their own clusters.
Hugh Brien of AppDynamics shares the top 10 application issues he sees on a daily basis.
The list covers:
- Application Performance Monitoring
- Database Monitoring
- Java, .NET, Node.js, PHP, and Python Monitoring
- I/O
- And much more
Building Reactive Fast Data & the Data Lake with Akka, Kafka, Spark | Todd Fritz
In this session, we will discuss:
* reactive architecture tenets
* distributed “fast data” streams
* application and analytics focused Data Lake
Enterprise level concerns and the importance of holistic governance, operational management, and a Metadata Lake will be conceptually investigated. The next level of detail will be to explore what a prospective architecture looks like at scale with Terabytes of ingestion per day, how scale puts pressure on an architecture, and how to be successful without losing data in a mission critical system via resilient, self-healing, scalable technologies. DevOps and application architecture concerns will be first-class themes throughout.
Reactive principles and technology will be the second act of this talk. Kafka. Akka. Spark. Various streaming technologies (Kafka Streams, Akka Streams, Spark Streaming) will be reviewed to identify what they are best suited for. The fast data pipeline discussion will center around Kafka, Akka, and Apache Flink (Lightbend Fast Data platform). We’ll also walk through an exciting addition to the Akka family, Alpakka, which is a Camel equivalent for Enterprise Integration Patterns.
The final act will be to dive into the Data Lake, from both an analytics and application development perspective. Technologies used to explain concepts will include Amazon and Hadoop. A Data Lake may service multiple analytics consumers with various “views” (and access levels) of data. It may also be a participant of various applications, perhaps by acting as a centralized source for reference data or common middleware (in turn feeding the analytics aspect). The concept of the Metadata Lake to apply structure, meaning and purpose will be an over-arching success factor for a Data Lake. The difference between the Data Lake and Metadata Lake is conceptually similar to a Halocline… Various technologies (Iglu/Snowplow and more) will be discussed from a feature standpoint to flesh out the technology capabilities needed for Data Lake governance.
Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env... | Clustrix
For high-value, high-throughput sites, downtime can cost hundreds of thousands to millions of dollars. Service architectures have baked lots of resiliency into apps, but databases and their system of record design are often vulnerable to single points of failure, bringing down entire systems. Worse still, when the database is recovered, there can be missing data. How many database transactions can your workload handle losing if your primary database goes down?
There are many strategies to minimize MySQL downtime, usually using replication and redundant hardware. Often these systems involve some manual intervention and potential downtime as failover protocols take hold. Also, these strategies may be expensive and require redundant hardware.
At Clustrix, we think there are alternative strategies that may be a better fit for modern apps in a MySQL environment.
In our final Tech Talk in this series on scaling MySQL, we evaluate multiple HA strategies. We also discuss the following topics:
- The difference between fault tolerance and high availability
- Best practices for achieving high availability with MySQL
- What are the costs of achieving HA? What can be the most cost-effective strategy?
- How is it possible to survive a multi-node failure in MySQL?
View the webcast of this Tech Talk on our YouTube channel.
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL? | Clustrix
At Clustrix, we think sharding is like stepping in quicksand. Once you make that step, you are stuck constantly maintaining it.
If you are trying to decide to shard or not to shard your MySQL database, or if you are just sick of living with sharding, give our webinar a listen. We’ll walk you through how to think about the problem at hand, and how to avoid getting mired in that quicksand down the road by answering these questions:
- Why do DBAs think sharding is the only end-game?
- What are the long-term costs of sharding?
- What is a better alternative to sharding MySQL?
- How real is it? Is it too good to be true?
View the webcast of this Tech Talk on our YouTube channel.
Tech Talk Series, Part 3: Why is your CFO right to demand you scale down MySQL? | Clustrix
Many web businesses enjoy a spike in traffic at some point in the year. Whether it's Black Friday, the NFL draft day, or Mother’s Day, your app needs to be able to scale and capture customer value when it is most needed. Downtime is not an option.
For a database, that means having enough capacity to ensure transaction latency stays within acceptable limits. For high capacity apps using MySQL, this means you may need to deploy triple the normal capacity usage to sustain traffic for one day. But what do you do with that hardware for the rest of the year? Do you leave it idling? That unused capacity is costing you an arm and a leg, and wasted expenses make CFOs grumpy.
In Part 3 of our Tech Talk series, we discuss what the options are for scaling down MySQL, as well as explore answers to the following questions:
- How do I figure out the costs of not scaling down?
- How does ClustrixDB scale down differently from MySQL?
- How real is elastically scaling in ClustrixDB? What are the catches?
View the webcast of this Tech Talk on our YouTube channel.
Apache Kafka lies at the heart of the largest data pipelines, handling trillions of messages and petabytes of data every day. Learn the right approach for getting the most out of Kafka from the experts at LinkedIn and Confluent. Todd Palino and Gwen Shapira demonstrate how to monitor, optimize, and troubleshoot performance of your data pipelines—from producer to consumer, development to production—as they explore some of the common problems that Kafka developers and administrators encounter when they take Apache Kafka from a proof of concept to production usage. Too often, systems are overprovisioned and underutilized and still have trouble meeting reasonable performance agreements.
Topics include:
- What latencies and throughputs you should expect from Kafka
- How to select hardware and size components
- What you should be monitoring
- Design patterns and antipatterns for client applications
- How to go about diagnosing performance bottlenecks
- Which configurations to examine and which ones to avoid
#GeodeSummit Keynote: Creating the Future of Big Data Through "The Apache Way" | PivotalOpenSourceHub
Keynote at Geode Summit 2016 by Dr. Justin Erenkrantz, Bloomberg LP: creating the future of Big Data through "The Apache Way" and why this matters to the community.
DataEngConf SF16 - Methods for Content Relevance at LinkedIn | Hakka Labs
Learn how LinkedIn makes article recommendations for its users. Talk by Ajit Singh, LinkedIn. To hear about future conferences go to http://dataengconf.com
In this webinar by Jonas Bonér, creator of Akka and CTO/Co-Founder of Lightbend, we take a look at Cloudstate, an OSS tool built on Akka, gRPC, Knative, GraalVM, and Kubernetes. Cloudstate lets you model, manage, and scale stateful services while preserving responsiveness by designing for resilience and elasticity.
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day | C4Media
Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/2mAKgJi.
Ian Nowland and Joel Barciauskas talk about the challenges Datadog faces as the company has grown its real-time metrics systems that collect, process, and visualize data to the point they now handle trillions of points per day. They also talk about how the architecture has evolved, and what they are looking to in the future as they architect for a quadrillion points per day. Filmed at qconnewyork.com.
Ian Nowland is the VP Engineering Metrics and Alerting at Datadog. Joel Barciauskas currently leads Datadog's distribution metrics team, providing accurate, low latency percentile measures for customers across their infrastructure.
Deep Dive Into How To Monitor MySQL or MariaDB Galera Cluster / Percona XtraD... | Severalnines
MySQL provides hundreds of status counters, but how do you make sense of all that monitoring data?
If you’re in Operations and your job is to monitor the health of MySQL/MariaDB Galera Cluster or Percona XtraDB Cluster, then this webinar is for you. Setting up a Galera Cluster is fairly straightforward, but keeping it in good shape and knowing what to look for when it’s having production issues can be a challenge.
Status counters can be tricky to read …
Which of them are more important than others?
How do you find your way in a labyrinth of different variables?
Which of them can make a significant difference?
How might a host’s health impact MySQL performance?
How do you identify problematic nodes in your cluster?
To find out more, read these webinar slides (or watch the replay).
Our colleague Krzysztof Książek provided a deep-dive session on what to monitor in Galera Cluster for MySQL & MariaDB. Krzysztof is a MySQL DBA with experience in managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard.
Amongst other things, Krzysztof discussed why having a good monitoring system is a must, covering the following topics:
Galera monitoring
• cluster status
• flow control
Host metrics and their impact on MySQL
• CPU
• memory
• I/O
InnoDB metrics
• CPU-related
• I/O-related
Blockchain for the DBA and Data Professional | Karen Lopez
With all the hype around blockchain, why should a DBA or other data professional care? In this session, we will cover the basics of blockchain as it applies to data and database processes:
Immutability
Verification
Distribution
Cryptography
Transactions
Trust
We will look at current offerings for blockchain features in Azure and in database and data stores. Finally, we'll help you identify the types of business requirements that need blockchain technologies.
You will learn:
The valid uses of blockchain approaches in databases
How current technologies support blockchain approaches
The costs, benefits, and risks of blockchain
[@IndeedEng] Boxcar: A self-balancing distributed services protocol | indeedeng
Video available at: http://www.youtube.com/watch?v=E1ok08TVxDw
Indeed's flagship job search product has evolved over the years to meet new challenges. It began as a single, monolithic web application. This grew larger and increasingly complex as we built new features. To remedy this growing problem, we implemented a service-oriented architecture to improve system availability, scalability, and maintainability. We examined common practices for service-oriented architectures, and we discovered ways to improve on the state of the art. We developed these ideas into a new framework called Boxcar. In this talk, we will discuss the scaling problems we solved, the innovative ideas behind Boxcar, and how we built the scalable architecture that we now use throughout our systems.
R.B. Boyer is a Software Engineer who has been with Indeed since late 2007. Over the years he has worked on a variety of projects, including distributed storage, authentication, and service architectures.
It's a wrap - closing keynote for nlOUG Tech Experience 2017 (16th June, The ... | Lucas Jellema
Closing keynote for the Tech Experience 2017 conference in Amersfoort, The Netherlands (16th June 2017). Touches upon the role of the Oracle Database in a changing landscape with NoSQL, CQRS, REST & JSON, Hadoop and Elasticsearch. Discusses the gaps that Oracle professionals have to bridge in order to broaden their horizons and prepare for the (near) future. The session discusses the cloud and how it will impact most organizations and Oracle specialists. It summarizes the main topics and themes from the Tech Experience 2017 conference.
For enterprises, it's rarely a single function causing your OSS problem; it's a combination of architecture, packages, or networks. Using three real-world examples, these slides from our recent webinar walk through identifying the infrastructure needs, the technology stack selection process, and the final architected solution for each environment (e-commerce, PaaS, and HPC machine learning).
Dive deep into specific OSS packages to examine the top issues in the enterprise with two of our most qualified OSS architects. Bill Crowell and Vince Cox walk through their day-to-day work in OSS packages, ways to fix reported issues, and why you can’t expect in-house developers to handle issues in OSS packages.
Big Data means big hardware, and the less of it we can use to do the job properly, the better the bottom line. Apache Kafka makes up the core of the data pipelines at many organizations, including LinkedIn, and we are on a perpetual quest to squeeze as much as we can out of our systems, from Zookeeper, to the brokers, to the various client applications. This means we need to know how well the system is running, and only then can we start turning the knobs to optimize it. In this talk, we will explore how best to monitor Kafka and its clients to ensure they are working well. Then we will dive into how to get the best performance from Kafka, including how to pick hardware and the effect of a variety of configurations in both the broker and clients. We’ll also talk about setting up Kafka for no data loss.
The Future of Services: Building Asynchronous, Resilient and Elastic Systems | Lightbend
In this talk by Jamie Allen, noted author, speaker and Senior Director of Global Solutions Architects at Lightbend, we will focus on how to build elastic, resilient service-based applications that can handle tremendous amounts of data in real time, and to introduce the new Lightbend framework for Microservices-based applications called "Lagom."
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ... | confluent
Apache Kafka is now nearly ubiquitous in modern data pipelines and use cases. While the Kafka development model is elegantly simple, operating Kafka clusters in production environments is a challenge. It’s hard to troubleshoot misbehaving Kafka clusters, especially when there are potentially hundreds or thousands of topics, producers and consumers and billions of messages.
The root cause of lag in real-time applications may be an application problem, like poor data partitioning or load imbalance, or a Kafka problem, like resource exhaustion or suboptimal configuration. Getting the best performance, predictability, and reliability for Kafka-based applications can therefore be difficult. In the end, the operation of your Kafka-powered analytics pipelines could itself benefit from machine learning (ML).
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?Clustrix
At Clustrix, we think sharding is like stepping in quicksand. Once you make that step, you are stuck constantly maintaining it.
If you are trying to decide to shard or not to shard your MySQL database, or if you are just sick of living with sharding, give our webinar a listen. We’ll walk you through how to think about the problem at hand, and how to avoid getting mired in that quicksand down the road by answering these questions:
- Why do DBAs think sharding is the only end-game?
- What are the long-term costs of sharding?
- What is a better alternative to sharding MySQL?
- How real is it? Is it too good to be true?
View the webcast of this Tech Talk on our YouTube channel.
Tech Talk Series, Part 3: Why is your CFO right to demand you scale down MySQL?Clustrix
Many web businesses enjoy a spike in traffic at some point in the year. Whether it's Black Friday, the NFL draft day, or Mother’s Day, your app needs to be able to scale and capture customer value when it is most needed. Downtime is not an option.
For a database, that means having enough capacity to ensure transaction latency stays within acceptable limits. For high capacity apps using MySQL, this means you may need to deploy triple the normal capacity usage to sustain traffic for one day. But what do you do with that hardware for the rest of the year? Do you leave it idling? That unused capacity is costing you an arm and a leg, and wasted expenses make CFOs grumpy.
In Part 3 of our Tech Talk series, we discuss what the options are for scaling down MySQL, as well as explore answers to the following questions:
- How do I figure out the costs of not scaling down?
- How does ClustrixDB scale-down differently than MySQL?
- How real is elastically scaling in ClustrixDB? What are the catches?
View the webcast of this Tech Talk on our YouTube channel.
Apache Kafka lies at the heart of the largest data pipelines, handling trillions of messages and petabytes of data every day. Learn the right approach for getting the most out of Kafka from the experts at LinkedIn and Confluent. Todd Palino and Gwen Shapira demonstrate how to monitor, optimize, and troubleshoot performance of your data pipelines—from producer to consumer, development to production—as they explore some of the common problems that Kafka developers and administrators encounter when they take Apache Kafka from a proof of concept to production usage. Too often, systems are overprovisioned and underutilized and still have trouble meeting reasonable performance agreements.
Topics include:
- What latencies and throughputs you should expect from Kafka
- How to select hardware and size components
- What you should be monitoring
- Design patterns and antipatterns for client applications
- How to go about diagnosing performance bottlenecks
- Which configurations to examine and which ones to avoid
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"PivotalOpenSourceHub
Keynote at Geode Summit 2016 by Dr. Justin Erenkrantz, Bloolmberg LP. Creating the Future of Big Data Through "The Apache Way" and why this matters to the community
DataEngConf SF16 - Methods for Content Relevance at LinkedInHakka Labs
Learn how LinkedIn makes article recommendations for its users. Talk by Ajit Singh, LinkedIn. To hear about future conferences go to http://dataengconf.com
In this webinar by Jonas Bonér, creator of Akka and CTO/Co-Founder of Lightbend, we take a look at Cloudstate, an OSS tool built on Akka, gRPC, Knative, GraalVM, and Kubernetes. Cloudstate lets you model, manage, and scale stateful services while preserving responsiveness by designing for resilience and elasticity.
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayC4Media
Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/2mAKgJi.
Ian Nowland and Joel Barciauskas talk about the challenges Datadog faces as the company has grown its real-time metrics systems that collect, process, and visualize data to the point they now handle trillions of points per day. They also talk about how the architecture has evolved, and what they are looking to in the future as they architect for a quadrillion points per day. Filmed at qconnewyork.com.
Ian Nowland is the VP Engineering Metrics and Alerting at Datadog. Joel Barciauskas currently leads Datadog's distribution metrics team, providing accurate, low latency percentile measures for customers across their infrastructure.
Deep Dive Into How To Monitor MySQL or MariaDB Galera Cluster / Percona XtraD...Severalnines
MySQL provides hundreds of status counters, but how do you make sense of all that monitoring data?
If you’re in Operations and your job is to monitor the health of MySQL/MariaDB Galera Cluster or Percona XtraDB Cluster, then this webinar is for you. Setting up a Galera Cluster is fairly straightforward, but keeping it in a good shape and knowing what to look for when it’s having production issues can be a challenge.
Status counters can be tricky to read …
Which of them are more important than others?
How do you find your way in a labyrinth of different variables?
Which of them can make a significant difference?
How might a host’s health impact MySQL performance?
How to identify problematic nodes in your cluster?
To find out more, read these webinar slides (or watch the replay).
Our colleague Krzysztof Książek provided a deep-dive session on what to monitor in Galera Cluster for MySQL & MariaDB. Krzysztof is a MySQL DBA with experience in managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard.
Amongst other things, Krzysztof discussed why having a good monitoring system is a must, covering the following topics:
Galera monitoring
• cluster status
• flow control
Host metrics and their impact on MySQL
• CPU
• memory
• I/O
InnoDB metrics
• CPU-related
• I/O-related
Blockchain for the DBA and Data ProfessionalKaren Lopez
With all the hype around blockchain, why should a DBA or other data professional care? In this session, we will cover the basics of blockchain as it applies to data and database processes:
Immutability
Verification
Distribution
Cryptography
Transactions
Trust
We will look at current offerings for blockchain features in Azure and in database and data stores. Finally, we'll help you identify the types of business requirements that need blockchain technologies.
You will learn:
Understand the valid uses of blockchain approaches in databases
How current technologies support blockchain approaches
Understand the costs, benefits, and risks of blockchain
[@IndeedEng] Boxcar: A self-balancing distributed services protocol indeedeng
Video available at: http://www.youtube.com/watch?v=E1ok08TVxDw
Indeed's flagship job search product has evolved over the years to meet new challenges. It began as a single, monolithic web application. This grew larger and increasingly complex as we built new features. To remedy this growing problem, we implemented a service-oriented architecture to improve system availability, scalability, and maintainability. We examined common practices for service-oriented architectures, and we discovered ways to improve on the state of the art. We developed these ideas into a new framework called Boxcar. In this talk, we will discuss the scaling problems we solved, the innovative ideas behind boxcar, and how we built the scalable architecture that we now use throughout our systems.
R.B. Boyer is a Software Engineer who has been with Indeed since late 2007. Over the years he has worked on a variety of projects, including distributed storage, authentication, and service architectures.
It's a wrap - closing keynote for nlOUG Tech Experience 2017 (16th June, The ... - Lucas Jellema
Closing keynote for the Tech Experience 2017 conference in Amersfoort, The Netherlands (16th June 2017). Touches upon the role of The Oracle Database in a changing landscape with NoSQL, CQRS, REST & JSON, Hadoop and Elastic Search. Discusses the gaps that Oracle professionals have to bridge in order to broaden their horizon and prepare for the (near) future. The session discusses the cloud - and how it will impact most organizations and Oracle specialists. It summarizes the main topics and themes from the Tech Experience 2017 conference.
For enterprises, it's rarely a single function causing your OSS problem, it's a combination of architecture, packages, or networks. Using three real-world examples, these slides, from our recent webinar, walk through identifying the infrastructure needs, the technology stack selection process, and the final architected solution for each environment (e-commerce, PaaS, and HPC machine learning.)
Dive deep into specific OSS packages to examine the top issues in the enterprise with two of our most qualified OSS architects. Bill Crowell and Vince Cox walk through their day-to-day work in OSS packages, ways to fix reported issues, and why you can’t expect in-house developers to handle issues in OSS packages.
Big Data means big hardware, and the less of it we can use to do the job properly, the better the bottom line. Apache Kafka makes up the core of our data pipelines at many organizations, including LinkedIn, and we are on a perpetual quest to squeeze as much as we can out of our systems, from Zookeeper, to the brokers, to the various client applications. This means we need to know how well the system is running, and only then can we start turning the knobs to optimize it. In this talk, we will explore how best to monitor Kafka and its clients to assure they are working well. Then we will dive into how to get the best performance from Kafka, including how to pick hardware and the effect of a variety of configurations in both the broker and clients. We’ll also talk about setting up Kafka for no data loss.
The Future of Services: Building Asynchronous, Resilient and Elastic Systems - Lightbend
In this talk by Jamie Allen, noted author, speaker and Senior Director of Global Solutions Architects at Lightbend, we will focus on how to build elastic, resilient service-based applications that can handle tremendous amounts of data in real time, and to introduce the new Lightbend framework for Microservices-based applications called "Lagom."
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ... - confluent
Apache Kafka is now nearly ubiquitous in modern data pipelines and use cases. While the Kafka development model is elegantly simple, operating Kafka clusters in production environments is a challenge. It’s hard to troubleshoot misbehaving Kafka clusters, especially when there are potentially hundreds or thousands of topics, producers and consumers and billions of messages.
When a real-time application lags, the root cause may be an application problem – like poor data partitioning or load imbalance – or a Kafka problem – like resource exhaustion or suboptimal configuration. Getting the best performance, predictability, and reliability for Kafka-based applications can therefore be difficult. In the end, the operation of your Kafka-powered analytics pipelines could itself benefit from machine learning (ML).
Brand Storytelling - Why use paid advertising for content distribution... - Péter Tóth-Czere
The short answer: because otherwise nobody will see it!
More fundamentally, though, because it is the most precisely targetable tool in marketing, one that lets you reach consumers in a personalized way.
New Declassified Report Exposes Hamas Human Shield Policy - IsraelDefenseForces
A new report released by the IDF utilizes intelligence maps, photographic and video evidence to mount a serious case against Hamas’ illegal use of public infrastructure during Operation Protective Edge.
There’s trouble brewing. The Trump Administration and GOP-controlled Congress have joined old-line lobbyists to wipe out regulations that protect our environment, our workplace safety and our financial system's integrity—purportedly to help business. Responsible business leaders know better. Government regulations keep big firms from foisting their costs onto smaller firms and taxpayers, let the market choose winners fairly, reward healthy innovation and give companies of all sizes a chance to grow. Broadly shared prosperity, the market economy and democracy itself depend on fair regulation.
Join Celinda Lake, President of Lake Research Partners and Bryan McGannon, ASBC’s Policy Director for a March 21st webinar. The discussion highlighted polling data and insights on how regulation matters and what changes in health care and energy mean for America’s small businesses.
Collective navigation of complex networks: Participatory greedy routing - Kolja Kleineberg
Many networks are used to transfer information or goods, in other words, they are navigated. The larger the network, the more difficult it is to navigate efficiently. Indeed, information routing in the Internet faces serious scalability problems due to its rapid growth, recently accelerated by the rise of the Internet of Things. Large networks like the Internet can be navigated efficiently if nodes, or agents, actively forward information based on hidden maps underlying these systems. However, in reality most agents will refuse to forward messages, since forwarding has a cost, and navigation becomes impossible. Can we design appropriate incentives that lead to participation and global navigability? Here, we present an evolutionary game where agents share the value generated by successful delivery of information or goods. We show that global navigability can emerge, but its complete breakdown is possible as well. Furthermore, we show that the system tends to self-organize into local clusters of agents who participate in the navigation. This organizational principle can be exploited to favor the emergence of global navigability in the system.
Confoo-Montreal-2016: Controlling Your Environments using Infrastructure as Code - Steve Mercier
Slides from my talk at ConFoo Montreal, February 2016. A presentation on how to apply configuration management (CM) principles for your various environments, to control changes made to them. You apply CM on your code, why not on your environments content? This presentation will present the infrastructure as code principles using Chef and/or Ansible. Topics discussed include Continuous Integration, Continuous Delivery/Deployment principles, Infrastructure As Code and DevOps.
This is a presentation I gave to 100+ people at Rev1 Ventures in Columbus, OH. The presentation was about how to define DevOps. Like any new concept, there are multiple and sometimes competing definitions. I've found that implementations of DevOps can change but there are some very common anti-patterns. Lastly, I talk about how we implement DevOps at Bold Penguin.
DevOps Cardiff - Monitoring Automation for DevOps - Outlyer
Our Co-Founder Steven Acreman presented at DevOps Cardiff on our view on monitoring and why Self-Service is critical for DevOps & Micro-Services, and did a demo of Dataloop.IO. Here are the slides
Systems Monitoring with Prometheus (Devops Ireland April 2015) - Brian Brazil
Monitoring means many things to many people. This talk looks at Systems Monitoring, that is how to keep an eye on a given system and use this as part of overall management of a system. This talk will cover Why one monitors, What to monitor, How to monitor, the general design of a monitoring system and how Prometheus is a good fit for this in terms of instrumentation, consoles, alerts, general system health and sanity.
Prometheus is a next-generation monitoring system publicly announced earlier this year, developed by companies including SoundCloud, locals Boxever and Docker. Since launch there has been wide-spread interest, and many community contributions.
For more information see http://prometheus.io or http://www.boxever.com/tag/monitoring
The Business Value of PaaS Automation - Kieron Sambrook-Smith - Presentation ... - eZ Systems
Kieron Sambrook-Smith, Chief Commercial Officer at Platform.sh spoke at eZ Conference 2017 in London about the business value of Platform as a Service (PaaS) Automation.
He covers the many aspects of the advantages of using a PaaS. The business value you can expect to reap will range from hosting cost savings, better workflow and team productivity, new project delivery concepts, and greater competitive advantage. Discover a more advanced implementation of your service offering.
Mapping Life Science Informatics to the Cloud - Chris Dagdigian
Infrastructure cloud platforms such as those offered by Amazon Web Services are not designed and built with scientific research as the primary use case. These presentation slides cover the current state of mapping life science research and HPC technique onto “the cloud” and how to work around the common engineering, orchestration and data movement problems.
[Note: I've replaced the 2011 version of this talk deck with a slightly updated version as delivered at the AIRI Petabyte Challenge Meeting]
In this webinar, Michael Nash of BoldRadius explores the Typesafe Reactive Platform.
The Typesafe Reactive Platform is a suite of technologies and tools that support the creation of reactive applications, that is, applications that handle the kind of responsiveness requirements, data volume, and user load that was out of practical reach only a few years ago.
From analysis of the human genome to wearable technology to communications at a massive scale, BoldRadius has the premier team of experts with decades of collective experience in designing and building these types of applications, and in helping teams adopt these tools.
Handling 1 Billion Requests/hr with Minimal Latency Using Docker - Matomy
Head of Mobfox DevOps, David Spitzer, explains how Mobfox used Docker to scale both the services and development team to achieve low latency networking and auto scaling. He discusses the ecosystem back in early 2015 and today, what were the challenges, and how Mobfox overcame them.
Large organizations are increasingly turning to DevOps and Continuous Delivery principles, often with the goal of shipping better software faster. However, they're then faced with important considerations for scaling these processes across teams and in diverse environments while still maintaining the visibility and control necessary for compliance.
This presentation from Matt Meservey, Director of Product Management at SaltStack and Andrew Phillips, VP of DevOps Strategy at XebiaLabs discusses:
Practical advice and tips gleaned from the large organizations they have helped implement and scale DevOps and Continuous Delivery initiatives for
How to focus your initiatives around practicing improvement not just practicing “DevOps”
How the combination XebiaLabs and SaltStack accelerates the software cycle, delivers advanced automation capabilities, enables data-driven improvement and provides continuous insight into your end-to-end software release process in a way other tools simply cannot
Trent Hornibrook recently gave a talk at the Infracoders meet-up, playing a thought experiment with the audience: 'What would your tech decisions be if you were given a blank cheque at a startup?'
Trent, who recently worked for a start-up, then shared what decisions he made, and why.
De facto DevOps, de facto Agile. Today DevOps is the Manufacturing Revolution of Our Age. There is no escape for us. When got a DevOps, you got a DevOps.
DevOps is simply the combination of cultural philosophies, practices, and tools that increase an organization’s ability to deliver applications and services at high velocity: evolving and improving products at a faster pace than organizations using traditional software development and infrastructure management processes.
Enabling your DevOps culture with AWS-webinar - Aaron Walker
This presentation shows you how the benefits of AWS technologies can be combined with a new approach to Development and Operations.
It’s all about delivering new features and functionality faster, without compromising reliability, stability and performance.
* Understand the challenges faced by traditional Development and Operations teams
* Apply Continuous Integration/Delivery processes and tools to enable change
* Appreciate how various AWS technologies can be used to facilitate DevOps
ER (Entity Relationship) Diagram for online shopping - TAE - Himani415946
https://bit.ly/3KACoyV
The ER diagram for the project is the foundation for the building of the database of the project. The properties, datatypes, and attributes are defined by the ER diagram.
Multi-cluster Kubernetes Networking - Patterns, Projects and Guidelines - Sanjeev Rampal
Talk presented at Kubernetes Community Day, New York, May 2024.
Technical summary of Multi-Cluster Kubernetes Networking architectures with focus on 4 key topics.
1) Key patterns for Multi-cluster architectures
2) Architectural comparison of several OSS/ CNCF projects to address these patterns
3) Evolution trends for the APIs of these projects
4) Some design recommendations & guidelines for adopting/ deploying these solutions.
1. Wireless Communication System_Wireless communication is a broad term that i... - JeyaPerumal1
Wireless communication involves the transmission of information over a distance without the help of wires, cables or any other forms of electrical conductors.
Wireless communication is a broad term that incorporates all procedures and forms of connecting and communicating between two or more devices using a wireless signal through wireless communication technologies and devices.
Features of Wireless Communication
The evolution of wireless technology has brought many advancements with its effective features.
The transmitted distance can be anywhere between a few meters (for example, a television's remote control) and thousands of kilometers (for example, radio communication).
Wireless communication can be used for cellular telephony, wireless access to the internet, wireless home networking, and so on.
1. Konstantin Gredeskoul
CTO, wanelo.com
DevOps without the “Ops”
A fallacy? A dream? A ________?
@kig
@kigster
How Wanelo handles thousands of writes per second with
99.97% uptime without an operations team
@kig
2. Proprietary and
Wanelo is the digital mall of the future, and a place to find
the most amazing products.
3.
4. People often ask…
What are you running on?
No really, what’s your stack?
Are you on Mongo?
No!?!?!??
or…
You running ruby? WTF? It’s slow!
You are running Erlang? WTF? It’s in Swedish!
etc.
6. How much traffic does your app get?
• If you are building an internal web-site in Rails you’d be lucky to get 100 requests per minute (RPM) – your users are only a limited set of employees
• Semi-popular sites with up to a few hundred concurrent users can expect about 1K-2K RPM
• When you cross the 100K RPM mark, you have joined the “small big boys” :)
• When you are Pinterest, Facebook or Twitter… you are probably doing 1-10M RPM
7. So what is this talk about?
• Review Operations, DevOps, and the Cloud, and how
the new technologies are changing the landscape
• Learn some key points and patterns that
dramatically reduce stress and pain associated
with running a site, particularly ruby and/or rails
• Discuss if modern startups really need a dedicated
operations team, and if so – at what point?
9. What the heck is DevOps?
• DevOps is a software development method that stresses communication, collaboration, integration, automation and measurement of cooperation between software developers and other information-technology (IT) professionals. [1]
• “Today, many organizations are confused on what DevOps means for them..” [2]
[1] WikiPedia article on DevOps
[2] FORRESTER: “Eliminate DevOps Myths With Situational-Awareness-Based Performance”. John Rakowski, October 10, 2014
10. DevOps, however, works…
“…Efficient teams are deploying code 30 times more frequently with 50 percent fewer failures in 2014…” [3]
“…DevOps practices correlate strongly with high organizational performance” [3]
[3] Source: PuppetLabs “State of DevOps Report”, 2014
11. Traditional “Heavy” Agile
• Traditional Ops responsibilities were often in conflict
with product development: stability versus change.
[Diagram: Product, Dev, QA, Operations]
12. Traditional Operations
• Uptime, stability and reliability
• On-call, fixing site at night
• Backups and disaster recovery
• Security, patching, OpenSSL :)
• Hardware
• Networking
• Colocation / DC
13. “The Cloud” changed things
• Uptime, stability and reliability
• On-call, fixing site at night
• Backups and disaster recovery
• Security, patching, OpenSSL :)
• Hardware
• Networking
• Colocation / DC
14. So the Cloud is a big part of
what makes DevOps possible
15. Let’s talk about a simpler and more
friendly way to build and deploy
software.
16. Early Company Goals (based on Wanelo)
• Maximize iteration speed
• Practice “aggro-agile”™
• Scale up as we go, keep the app fast
• Break things, learn, move on
• Enable, empower and inspire our team
• Remain in control of our infrastructure
17. And while moving really fast…
We just never hired Ops
But we did hire several brilliant engineers who
actually enjoyed infrastructure / platform work.
Except they approach it like … code.
18. Not having Ops meant
• We had to deploy our app to the cloud, and learn
how to provision the nodes we needed, as well as:
• How to provision load balancers and app servers
• How to configure new Solr masters and replicas
• How to install and tune PostgreSQL databases
• memcaches, redis shards, twemproxy, haproxy
19. Fast forward to today
• 100% cloud hosted (Joyent Cloud)
• 100% automated (Chef)
• 10,000% traffic growth in 6 months and survived
• 99.97% uptime (without trying very hard)
• on call engineers get 1-2 pages per week
• 80% of engineers are on call rotation, including
iOS & Android developers
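To put the 99.97% figure in perspective, here is a back-of-the-envelope sketch of how much downtime that uptime level leaves room for. The function name and the 365-day year are assumptions made for the illustration, not numbers from the talk.

```python
def allowed_downtime_minutes(uptime_percent: float, days: float = 365.0) -> float:
    """Minutes of downtime permitted over `days` at the given uptime percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


yearly = allowed_downtime_minutes(99.97)
print(f"{yearly:.0f} minutes per year, about {yearly / 60:.1f} hours")
```

At 99.97% that works out to roughly 2.6 hours of downtime per year.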
22. 1. Automation and Deployment
• Infrastructure is a first class citizen
• Pairs deliver user stories which include automation
• Did I mention we pair program? It rocks!
• We run Chef continuously in production
• I want to trust my tools, and if they break, fix them
• Partition staging and production environments
23. Incremental Deployment
• Roll code out everywhere, restart 2% of servers
• Watch errors, latency, other anomalies
• When satisfied continue rolling all servers
• Ensure old and new code can co-exist
• Ensure no “drop/rename” migrations happen on live tables
• Ensure no migrations take exclusive locks (e.g. build indexes with create index concurrently)
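The rollout steps above can be sketched as a small control loop. This is a minimal illustration, not Wanelo's deploy tooling: `restart` and `healthy` are hypothetical callbacks standing in for "restart the app on this server" and "errors and latency look normal", and the 2% default mirrors the slide.

```python
from typing import Callable, List


def canary_rollout(servers: List[str],
                   restart: Callable[[str], None],
                   healthy: Callable[[], bool],
                   canary_percent: int = 2) -> List[str]:
    """Restart a small canary slice, check health, then restart the rest.

    Returns the servers that were actually restarted.
    """
    # Always restart at least one server, even in a tiny fleet.
    canary_count = max(1, len(servers) * canary_percent // 100)
    canaries, rest = servers[:canary_count], servers[canary_count:]

    restarted: List[str] = []
    for server in canaries:
        restart(server)
        restarted.append(server)

    # Stop here if the canaries show errors or latency anomalies;
    # the remaining servers keep running the old code.
    if not healthy():
        return restarted

    for server in rest:
        restart(server)
        restarted.append(server)
    return restarted
```

Halting after the canaries only works if old and new code can run side by side, which is why the co-existence and migration rules above matter.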
24. 2. Fault tolerant infrastructure
• Ensure aggressive client timeouts
• Achieving fault tolerance today is much cheaper
than ever before! It’s a crime not to do it :)
• Put haproxy in front of everything, literally
• Stateless services only
• Put makara, twemproxy, Dalli in front of
database, redis and memcached
25. Let’s look at a couple of recipes for resilience
Resilience keeps you sleeping at night
26. Where is everything? HAProxy + Chef Search + Stateless
App talks to http://127.0.0.1:8000 and http://127.0.0.1:8001
[Diagram: App and HAProxy inside one Virtual Zone / Server; HAProxy backends labelled Backend 1 and Backend 2, pointing at Solr, Web Service and ElasticSearch]
27. This pattern allows us to have one place that
knows about everything else, in Chef
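As an illustration of this pattern, here is a minimal sketch of how a configuration run might render a local HAProxy `listen` block from a list of discovered backends. The node addresses, the port 8983, and the `render_listen_block` helper are hypothetical; in the setup described above the list would come from Chef Search, and this Python only stands in for a Chef template.

```python
from typing import List, Tuple


def render_listen_block(name: str, local_port: int,
                        backends: List[Tuple[str, int]]) -> str:
    """Render an HAProxy `listen` section bound to localhost.

    The app only ever talks to 127.0.0.1:<local_port>; HAProxy
    spreads requests over whichever backends are currently listed.
    """
    lines = [
        f"listen {name}",
        f"    bind 127.0.0.1:{local_port}",
        "    balance roundrobin",
    ]
    for index, (host, port) in enumerate(backends, start=1):
        # `check` enables HAProxy health checks for this server.
        lines.append(f"    server {name}{index} {host}:{port} check")
    return "\n".join(lines)


# Hypothetical search result: two Solr nodes.
solr_nodes = [("10.0.0.11", 8983), ("10.0.0.12", 8983)]
print(render_listen_block("solr", 8000, solr_nodes))
```

Because the app's own configuration never changes, adding or removing a backend becomes a re-render of this file plus an HAProxy reload.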
28. What the hell Makara?
• Makara is a simple database routing tool for
ActiveRecord that has been in production on
Wanelo and TaskRabbit for years
• https://github.com/wanelo/makara (PostgreSQL)
• https://github.com/taskrabbit/makara (MySQL)
29.
• Was the simplest library to
understand, and port to
• Worked in the multi-threaded
environment of Sidekiq
Background Workers
• automatically retries if
replica goes down
• load balances with weights
• Was running in production
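The two behaviours listed above, weighted load balancing and automatic retry when a replica goes down, can be sketched in a few lines. This is an illustration of the idea only, not Makara's API: `read_with_failover`, `ReplicaDown`, and the replica names are all hypothetical.

```python
import random
from typing import Callable, Dict, Optional, TypeVar

T = TypeVar("T")


class ReplicaDown(Exception):
    """Raised by `run_query` when the chosen replica is unreachable."""


def read_with_failover(replica_weights: Dict[str, int],
                       run_query: Callable[[str], T],
                       rng: Optional[random.Random] = None) -> T:
    """Pick a replica at random in proportion to its weight and run the
    query there; if that replica is down, drop it and retry on the rest."""
    rng = rng or random.Random()
    remaining = dict(replica_weights)
    while remaining:
        names = list(remaining)
        weights = [remaining[name] for name in names]
        choice = rng.choices(names, weights=weights, k=1)[0]
        try:
            return run_query(choice)
        except ReplicaDown:
            # Blacklist the dead replica for this call and try another one.
            del remaining[choice]
    raise ReplicaDown("all replicas are down")
```

A caller would pass something like `{"replica-1": 3, "replica-2": 1}` to send about three quarters of reads to the larger replica while still surviving its loss.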
30. Replicate everything that replicates
[Diagram: Web / API Requests and the Background Worker Queue feed the App; the App goes through HAProxy (Backend 1, Backend 2) to three Solr Replicas and a Solr Master; arrows labelled reads and writes]
31. Degraded State, but still up!
Many replicas can be down
[Diagram: same topology as slide 30: App, HAProxy (Backend 1, Backend 2), three Solr Replicas, Solr Master, Web / API Requests, Background Worker Queue; arrows labelled reads and writes]
32. Replicas are great because they are easy to add
and often ok to ignore when they die/reboot/etc.
33. Don’t buy an expensive load balancer
[Diagram: example.com points at two Load Balancers (haproxy + nginx) at 200.200.234.145 and 200.200.234.146, in front of six App Servers]
34. You can build a decent one with DNS
[Diagram: a DNS Provider pings two Load Balancers (haproxy + nginx) at 200.200.234.145 and 200.200.234.146, each in front of three App Servers]
DNS auto-failover is offered with some enterprise DNS services, e.g. from DNSMadeEasy
35. When LB goes down, it is removed from the DNS pool
[Diagram: a DNS Provider pings two Load Balancers (haproxy + nginx) at 200.200.234.145 and 200.200.234.146, each in front of three App Servers]
36. It works pretty well
[Diagram: example.com; the DNS Provider pings the one live Load Balancer (haproxy + nginx); the other is marked Dead Load Balancer; IPs 200.200.234.145 and 200.200.234.146; six App Servers]
This works best with a short TTL
Configure LBs in pairs, each acting as the other’s failover, to account for network partitioning
37. This pattern allows us to tolerate reboots and
maintenance with minimal effect on our users
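The health-check side of this pattern can be sketched as a function that decides which load balancer IPs stay in the DNS pool. This is a simplified illustration, not any DNS provider's API: `is_alive` is a hypothetical ping callback, and keeping the full pool when every check fails is a design assumption (the checker itself may be the one that is cut off).

```python
from typing import Callable, List


def dns_pool(load_balancers: List[str],
             is_alive: Callable[[str], bool]) -> List[str]:
    """Return the load balancer IPs that should be published in DNS."""
    alive = [ip for ip in load_balancers if is_alive(ip)]
    # Assumption: if every check fails, suspect the checker's own network
    # and leave the pool unchanged rather than publishing an empty record.
    return alive if alive else list(load_balancers)


lbs = ["200.200.234.145", "200.200.234.146"]
# Simulate the second load balancer being down.
print(dns_pool(lbs, lambda ip: ip != "200.200.234.146"))
```

A short TTL on the record bounds how long clients keep resolving to the removed IP.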
38. Failover to the overflow pattern
Two queues: large primary, small secondary
The primary distributes jobs to a large set of specialized workers, assigned to specific queues
The failover queue has only a small number of overflow workers, but they will accept any work
[Diagram: App → HAProxy with Primary Backend 1 (Redis Primary Queue, served by the primary Background Workers) and Failover Backend 2 (Redis Failover, served by the "Overflow" Workers)]
39. During spikes in traffic, this pattern allows our
application to continue enqueuing jobs when the
primary is overwhelmed
This is useful in situations when you can’t easily
round robin between multiple shards.
Example: Sidekiq with a “Unique Job” extension.
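In the talk this failover is done by HAProxy between two Redis backends; the same decision can be sketched in application code to show the logic. Everything named here is hypothetical: `QueueUnavailable`, and the two push callbacks standing in for "enqueue on the primary Redis" and "enqueue on the small failover Redis".

```python
from typing import Any, Callable


class QueueUnavailable(Exception):
    """Raised when a queue backend times out or refuses the job."""


def enqueue_with_overflow(job: Any,
                          primary_push: Callable[[Any], None],
                          overflow_push: Callable[[Any], None]) -> str:
    """Try the primary queue first; fall back to the overflow queue.

    Returns which queue accepted the job.
    """
    try:
        primary_push(job)
        return "primary"
    except QueueUnavailable:
        # The overflow workers accept any job type, so nothing is lost;
        # it just gets processed by a smaller pool.
        overflow_push(job)
        return "overflow"
```

The request path keeps succeeding during a spike, at the cost of slower processing for the jobs that land on the overflow workers.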
40. 3. Alert only on what’s important
• Nagios is great for visibility
• Not great for knowing when to drop everything because the site is on fire
• Some tools allow alerting on the first derivative of an observed metric.
• This is what we want: rapid drop (or increase) in a key metric to generate an alert.
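Alerting on the first derivative means looking at how fast a metric changes rather than at its absolute value. A minimal sketch, assuming samples arrive as (timestamp in seconds, value) pairs; the function name, the threshold, and the sample numbers are made up for the illustration and do not reflect any monitoring tool's API.

```python
from typing import List, Tuple


def rate_of_change_alerts(samples: List[Tuple[float, float]],
                          max_change_per_sec: float) -> List[Tuple[float, float]]:
    """Return (timestamp, slope) for every interval whose slope,
    up or down, exceeds the allowed change per second."""
    alerts = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        slope = (v1 - v0) / (t1 - t0)  # discrete first derivative
        if abs(slope) > max_change_per_sec:
            alerts.append((t1, slope))
    return alerts


# Hypothetical "saves per second" metric sampled once a minute.
saves = [(0, 100.0), (60, 98.0), (120, 40.0), (180, 41.0)]
print(rate_of_change_alerts(saves, max_change_per_sec=0.5))
```

Only the sudden drop between the second and third samples crosses the threshold; the slow drift before it and the flat line after it stay quiet.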
41. Alerting examples
• We never page on “host down”
Because, who cares?
The host is likely redundant, and will be back.
…Probably.
• We only page for things like “sudden drop in product saves per second”, or a spike in error rate, etc.
• Monitoring / alerting tool Circonus supports this
42. 4. Obsessive monitoring
• Modern tools offer unprecedented visibility
• Real time application monitoring
• Real time business stats monitoring
• Real time network monitoring
• Dashboards, TV Monitor, alerts
• Real time, real time, real time.
44. 5. Cloud vendor is your partner
• We get phenomenal customer support from Joyent
• Our Cloud Partner, in a way, is our Ops
• Joyent is innovative in that they develop and run
their own cloud stack: from the OS layer (SmartOS)
to the data center management software
• They offer a unique option to take our “cloud” in-
house when that time comes
45. 6. DevOps, really, is just code
• Hire folks who write code, so that they don’t have to
repeat the same task twice
• Everyone will be happier that way.
46. So here is how to reduce stress!
1. Insist on 100% automation
2. Deploy fault tolerant patterns wherever possible
3. Page only on what’s important to the business
4. Monitor everything else obsessively
5. Choose a cloud provider that can be your partner
6. Infrastructure work is software engineering