Colin talks about how he architected and built a high-performance time series database from the ground up at Dataloop.io, handling hundreds of thousands of metrics per second. One of the objectives was to provide real-time graphing and alerting. If you're 'rolling your own' metrics, are interested in Node.js and highly scalable architectures, and like listening to plenty of war stories, you should enjoy this talk.
Video: http://youtu.be/vx6Ms5TNtqo
DevOps Exchange Meetup Group: http://bit.ly/doxlonmeetup
Migrating from RDBMS to MongoDB Atlas - Texas American Resources Company (TARC) - MongoDB
See how Texas American Resources, an oil and gas company, modernized their architecture by moving from a legacy SQL database to MongoDB. They discuss their migration to MongoDB and then their move to the cloud with MongoDB Atlas.
A follow-up to Back to Basics: An Introduction to NoSQL and MongoDB.
• Covers more advanced topics:
Storage Engines
• What storage engines are and how to pick them
Aggregation Framework
• How to deploy advanced analytics processing right inside the database
The BI Connector
• How to create visualizations and dashboards from your MongoDB data
Authentication and Authorisation
• How to secure MongoDB, both on-premise and in the cloud
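The aggregation bullet above can be made concrete with a tiny sketch. In real code the `pipeline` list below would be passed to PyMongo's `collection.aggregate(...)`; here the documents and field names are made up for illustration, and the same two stages are evaluated in plain Python to show what `$match` and `$group` compute.

```python
# Hypothetical documents; in MongoDB these would live in a collection.
orders = [
    {"status": "shipped", "qty": 2, "price": 10.0},
    {"status": "shipped", "qty": 1, "price": 25.0},
    {"status": "pending", "qty": 5, "price": 3.0},
]

# The pipeline as MongoDB would express it (normally: db.orders.aggregate(pipeline)).
pipeline = [
    {"$match": {"status": "shipped"}},
    {"$group": {"_id": "$status",
                "revenue": {"$sum": {"$multiply": ["$qty", "$price"]}}}},
]

# Evaluating the same two stages by hand to show their semantics:
matched = [d for d in orders if d["status"] == "shipped"]     # $match
revenue = sum(d["qty"] * d["price"] for d in matched)         # $group + $sum
result = [{"_id": "shipped", "revenue": revenue}]
print(result)  # [{'_id': 'shipped', 'revenue': 45.0}]
```

The point of the aggregation framework is that this grouping and arithmetic runs inside the database, next to the data, rather than in application code as above.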
Iceberg: a modern table format for big data (Ryan Blue & Parth Brahmbhatt, Netflix)
Presto Summit 2018 (https://www.starburstdata.com/technical-blog/presto-summit-2018-recap/)
At the beginning of 2021, Shopify Data Platform decided to adopt Apache Flink to enable modern stateful stream-processing. Shopify had a lot of experience with other streaming technologies, but Flink was a great fit due to its state management primitives.
After about six months, Shopify now has a flourishing ecosystem of tools, tens of prototypes from many teams across the company and a few large use-cases in production.
Yaroslav will share a story about not just building a single data pipeline but building a sustainable ecosystem. You can learn about how they planned their platform roadmap, the tools and libraries Shopify built, the decision to fork Flink, and how Shopify partnered with other teams and drove the adoption of streaming at the company.
Iceberg: A modern table format for big data (Strata NY 2018) - Ryan Blue
Hive tables are an integral part of the big data ecosystem, but the simple directory-based design that made them ubiquitous is increasingly problematic. Netflix uses tables backed by S3 that, like other object stores, don’t fit this directory-based model: listings are much slower, renames are not atomic, and results are eventually consistent. Even tables in HDFS are problematic at scale, and reliable query behavior requires readers to acquire locks and wait.
Owen O’Malley and Ryan Blue offer an overview of Iceberg, an Apache-licensed open source project that defines a new table layout addressing the challenges of current Hive tables, with properties specifically designed for cloud object stores such as S3. It specifies a portable table format and standardizes many important features, including:
* All reads use snapshot isolation without locking.
* No directory listings are required for query planning.
* Files can be added, removed, or replaced atomically.
* Full schema evolution supports changes in the table over time.
* Partitioning evolution enables changes to the physical layout without breaking existing queries.
* Data files are stored as Avro, ORC, or Parquet.
* Support for Spark, Pig, and Presto.
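The first three bullets follow from one design idea, sketched below in a minimal conceptual model (this is not the Iceberg API): table state lives in immutable snapshot metadata, and a commit atomically swaps a single pointer to the latest snapshot, so readers keep a consistent view without locks or directory listings.

```python
# Conceptual sketch of snapshot isolation via an atomic metadata pointer.
# Class and field names are invented for illustration.

class Table:
    def __init__(self, files):
        # Snapshots are immutable: a tuple of data files plus an id.
        self.current = {"snapshot_id": 1, "files": tuple(files)}

    def commit(self, files):
        # A commit builds a brand-new snapshot and replaces the pointer in one
        # step; readers holding the old snapshot object are unaffected.
        self.current = {"snapshot_id": self.current["snapshot_id"] + 1,
                        "files": tuple(files)}

table = Table(["a.parquet", "b.parquet"])
reader_view = table.current                # a reader pins the current snapshot
table.commit(["a.parquet", "c.parquet"])   # a writer replaces b with c atomically

assert reader_view["files"] == ("a.parquet", "b.parquet")    # reader's view is stable
assert table.current["files"] == ("a.parquet", "c.parquet")  # new readers see the commit
```

Because planning reads the snapshot's file list directly, no S3 directory listing (with its slow, eventually consistent semantics) is ever needed.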
Netflix’s Big Data Platform team manages a data warehouse in Amazon S3 with over 60 petabytes of data and writes hundreds of terabytes of data every day. With a data warehouse at this scale, it is a constant challenge to keep improving performance. This talk will focus on Iceberg, a new table metadata format designed for managing huge tables backed by S3 storage. Iceberg decreases job planning time from minutes to under a second, while also isolating reads from writes to guarantee jobs always use consistent table snapshots.
In this session, you'll learn:
• Some background about big data at Netflix
• Why Iceberg is needed and the drawbacks of the current tables used by Spark and Hive
• How Iceberg maintains table metadata to make queries fast and reliable
• The benefits of Iceberg's design and how it is changing the way Netflix manages its data warehouse
• How you can get started using Iceberg
Speaker
Ryan Blue, Software Engineer, Netflix
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir... - HostedbyConfluent
If a real-time dashboard takes 5 minutes to refresh, it’s not real-time. With data lakes increasingly enabling massive amounts of unprocessed data sets, delivering low-latency analytics is not for the faint-hearted. Learn how to stream massive amounts of data which used to be impossible to handle from Kafka, to serve real-time applications using lake-scale optimized approaches to storage and indexing.
Visualize some of Austin's open source data using Elasticsearch with Kibana. ObjectRocket's Steve Croce presented this talk on 10/13/17 at the DBaaS event in Austin, TX.
A Walkthrough of InfluxCloud 2.0 by Tim Hall - InfluxData
Tim Hall, VP of Products at InfluxData, demonstrates how to set up and use the new InfluxCloud 2.0 in this InfluxDays NYC 2019 presentation. He provides a brief history of InfluxCloud followed by an overview of InfluxDB 2.0 and a demo.
From Raghu Ramakrishnan's presentation "Key Challenges in Cloud Computing and How Yahoo! is Approaching Them" at the 2009 Cloud Computing Expo in Santa Clara, CA, USA. Here's the talk description on the Expo's site: http://cloudcomputingexpo.com/event/session/510
Presto talk @ Global AI Conference 2018 Boston - kbajda
Presented at Global AI Conference in Boston 2018:
http://www.globalbigdataconference.com/boston/global-artificial-intelligence-conference-106/speaker-details/kamil-bajda-pawlikowski-62952.html
Presto, an open source distributed SQL engine, is widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources. Proven at scale in a variety of use cases at Facebook, Airbnb, Netflix, Uber, Twitter, LinkedIn, Bloomberg, and FINRA, Presto has experienced unprecedented growth in popularity in both on-premises and cloud deployments over the last few years. Presto is truly a SQL-on-Anything engine: a single query can access data from Hadoop, S3-compatible object stores, RDBMS, NoSQL, and custom data stores. This talk will cover some of the best use cases for Presto and recent advancements in the project, such as the Cost-Based Optimizer and geospatial functions, and will discuss the roadmap going forward.
Unifying Frontend and Backend Development with Scala - ScalaCon 2021 - Taro L. Saito
Scala can be used for developing both frontend (Scala.js) and backend (Scala JVM) applications. A missing piece has been bridging these two worlds using Scala. We built Airframe RPC, a framework that uses Scala traits as a unified RPC interface between servers and clients. With Airframe RPC, you can build HTTP/1 (Finagle) and HTTP/2 (gRPC) services just by defining Scala traits and case classes. It simplifies web application design, as you only need to care about Scala interfaces, without using existing web standards like REST, Protocol Buffers, OpenAPI, etc. Airframe's Scala.js support also enables building interactive web applications that dynamically render DOM elements while talking to Scala-based RPC servers. With Airframe RPC, Scala developers can deliver much more value across both frontend and backend.
Setting Up InfluxDB for IoT by David G Simmons - InfluxData
David will be walking you through a typical data architecture for an IoT device. Then, it will be a hands-on workshop to gather data from the device, display it on a dashboard and trigger alerts based on thresholds that you set. View this InfluxDays NYC 2019 presentation to learn about setting up InfluxDB for IoT.
InfluxData builds a time series platform primarily deployed for DevOps and IoT monitoring. This talk presents several lessons learned while scaling the platform across a large number of deployments—from single server open source instances to highly available high-throughput clusters.
This talk presents a number of failure conditions that informed subsequent design choices. Ryan Betts (Director of Engineering at InfluxData) will discuss designing backpressure in an AP system with tens of thousands of resource-limited writers; trade-offs between monolithic and service-oriented database implementations; and lessons learned implementing multiple query processing systems.
InfluxDB 2.0: Dashboarding 101 by David G. Simmons - InfluxData
InfluxDB 2.0 has some new dashboarding and querying capabilities that will make using a time series database even easier. This InfluxDays NYC 2019 presentation by David G. Simmons (Senior Developer Evangelist at InfluxData) walks you through how to set up your first dashboard.
Convergent Replicated Data Types in Riak 2.0 - Big Data Spain
Talk by Gordon Guthrie, Senior Software Engineer at Basho
Summary
A review of the CAP theorem and the difficulties of resolving conflicts in highly distributed systems, covering the issues and various theories on how to resolve them, including the use of CRDTs in Riak.
Details
CRDTs are used to replicate data across multiple computers in a network, executing updates without the need for remote synchronisation. In systems using conventional eventual consistency technology this can lead to merge conflicts, but CRDTs are designed such that conflicts are mathematically impossible. Under the constraints of the CAP theorem, they provide the strongest consistency guarantees for available/partition-tolerant (AP) settings.
The CRDT concept was first formally defined in 2007 by Marc Shapiro and Nuno Preguiça in terms of operation commutativity, and development was initially motivated by collaborative text editing. The concept of semilattice evolution of replicated states was first defined by Baquero and Moura in 1997, and development was initially motivated by mobile computing. The two concepts were later unified in 2011.
Basho has worked with the EU and Marc Shapiro's team to push CRDTs into distributed systems. Riak v2.x is the first commercial product to include this functionality.
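As a concrete example of why conflicts cannot arise, here is a grow-only counter (G-Counter), one of the simplest state-based CRDTs, sketched in Python. This illustrates the general idea only, not Riak's implementation: each replica increments its own slot, and merge takes the element-wise maximum, which is commutative, associative, and idempotent, so replicas converge regardless of delivery order.

```python
# G-Counter: a state-based CRDT. Updates are applied locally with no
# coordination; merging replica states in any order converges.

class GCounter:
    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica_id -> count contributed by that replica

    def increment(self, n=1):
        # A replica only ever increments its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def merge(self, other):
        # Element-wise max: commutative, associative, idempotent.
        for rid, c in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), c)

    def value(self):
        return sum(self.counts.values())

a, b = GCounter("a"), GCounter("b")
a.increment(3)   # concurrent local updates, no synchronisation
b.increment(2)
a.merge(b)       # exchange state in either direction...
b.merge(a)
assert a.value() == b.value() == 5   # ...and both replicas agree
```

Riak 2.0's data types (counters, sets, maps) generalize this pattern; deletion requires the slightly richer PN-Counter, which tracks increments and decrements separately.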
Near Real-Time Analytics with Apache Spark: Ingestion, ETL, and Interactive Q... - Databricks
Near-real-time analytics has become a common requirement for many data teams as the technology has caught up to the demand. One of the hardest aspects of enabling near-real-time analytics is making sure the source data is ingested and deduplicated often enough to be useful to analysts while writing the data in a format that is usable by your analytics query engine. This is usually the domain of many tools, since there are three different aspects of the problem: streaming ingestion of data, deduplication using an ETL process, and interactive analytics. With Spark, this can be done with one tool. This talk will walk you through how to use Spark Streaming to ingest change-log data, use Spark batch jobs to perform major and minor compaction, and query the results with Spark SQL. At the end of this talk you will know what is required to set up near-real-time analytics at your organization, the common gotchas including file formats and distributed file systems, and how to handle the unique data integrity issues that arise from near-real-time analytics.
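The dedup-and-compaction step described above can be sketched in plain Python. The record shape and field names below are made up for illustration; in the talk's setting this logic would run as a Spark batch job over ingested change-log files, keeping only the newest version of each key and dropping deletes.

```python
# Hypothetical change-log records as ingested by the streaming job.
changelog = [
    {"id": 1, "seq": 1, "op": "insert", "value": "alice"},
    {"id": 2, "seq": 2, "op": "insert", "value": "bob"},
    {"id": 1, "seq": 3, "op": "update", "value": "alicia"},
    {"id": 2, "seq": 4, "op": "delete", "value": None},
]

def compact(records):
    # Replay in sequence order so the latest record per key wins.
    latest = {}
    for r in sorted(records, key=lambda r: r["seq"]):
        latest[r["id"]] = r
    # Drop delete tombstones so analysts query only live rows.
    return [r for r in latest.values() if r["op"] != "delete"]

print(compact(changelog))
# [{'id': 1, 'seq': 3, 'op': 'update', 'value': 'alicia'}]
```

"Minor" compaction applies this to a small window of recent files; "major" compaction periodically rewrites the whole table so query engines read few, well-formed files.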
Using Riak for Events storage and analysis at Booking.com - Damien Krotkine
At Booking.com, we have a constant flow of events coming from various applications and internal subsystems. This critical data needs to be stored for real-time, medium and long term analysis. Events are schema-less, making it difficult to use standard analysis tools. This presentation will explain how we built a storage and analysis solution based on Riak. The talk will cover: data aggregation and serialization, Riak configuration, solutions for lowering the network usage, and finally, how Riak's advanced features are used to perform real-time data crunching on the cluster nodes.
EM12c: Capacity Planning with OEM Metrics - Maaz Anjum
Some of my thoughts and adventures encapsulated in a presentation regarding capacity planning, resource utilization, and Enterprise Manager's collected metrics.
DrupalSouth 2015 - Performance: Not an Afterthought - Nick Santamaria
Nick Santamaria's performance and scalability presentation from DrupalSouth 2015.
https://melbourne2015.drupal.org.au/session/performance-not-afterthought
This presentation will discuss scalability best practices with MongoDB. We will review how the following affect scalability: schema design, locking granularity within versions and engines, scaling vertically or horizontally, and collection sharding. Understanding how these topics can affect your application will help you avoid complications as your data and workload grows.
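As a small illustration of why shard key choice matters once a collection is sharded, here is a hashed-routing sketch using a stdlib hash. MongoDB's actual hashed shard keys and chunk assignment work differently; this only shows the idea that hashing a monotonically increasing key scatters writes across shards instead of concentrating them on one "hot" shard.

```python
import hashlib

# Hypothetical three-shard cluster.
SHARDS = ["shard0", "shard1", "shard2"]

def route(shard_key_value):
    # Hash the shard key value and map it to a shard; a stand-in for
    # MongoDB's hashed shard key + chunk ranges.
    h = int(hashlib.sha256(str(shard_key_value).encode()).hexdigest(), 16)
    return SHARDS[h % len(SHARDS)]

# A monotonically increasing key (e.g. an auto-incrementing id) would send
# all new writes to one shard under range sharding; hashed, it spreads out.
placement = {uid: route(uid) for uid in range(20)}
assert len(set(placement.values())) >= 2   # writes land on multiple shards
assert route(7) == route(7)                # routing is deterministic
```

The trade-off: hashed keys balance writes but turn range queries on the key into scatter-gather queries across all shards.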
NoSQL databases like MongoDB, Elasticsearch, and Cassandra are synonymous with scalability, search, and developer agility. But there’s a downside...having to give up the ease and comfort of SQL.
Or do you?
Join this webcast to learn how the newest databases, like CrateDB and CockroachDB, deliver the benefits of NoSQL with the ease of SQL by building SQL engines on top of custom NoSQL technology stacks. Database industry veteran Andy Ellicott, who helped launch Vertica, VoltDB, and Cloudant, and is now with Crate.io, will provide a no-BS view of current DBMS architectures and predictions for the future of data.
If you’re a DBMS user, this webcast will help you make sense of a very crowded DBMS market and make better-informed decisions for your new tech stacks.
Oracle GoldenGate and Apache Kafka: A Deep Dive Into Real-Time Data Streaming - Michael Rainey
We produce quite a lot of data! Much of the data consists of business transactions stored in a relational database. Increasingly, though, the data is non-structured, high-volume, and rapidly changing, known in the industry as Big Data. The challenge for data integration professionals is to combine and transform the data into useful information. Not just that, but it must also be done in near real-time and using a target system such as Hadoop. The topic of this session, real-time data streaming, provides a great solution for this challenging task. By integrating GoldenGate, Oracle’s premier data replication technology, and Apache Kafka, the latest open-source streaming and messaging system, we can implement a fast, durable, and scalable solution. Presented at KScope16.
Performance Tuning RocksDB for Kafka Streams’ State Stores - confluent
Performance Tuning RocksDB for Kafka Streams’ State Stores, by Bruno Cadonna, Contributor to Apache Kafka & Software Developer at Confluent, and Dhruba Borthakur, CTO & Co-founder of Rockset
Meetup link: https://www.meetup.com/Berlin-Apache-Kafka-Meetup-by-Confluent/events/273823025/
Murat Karslioglu, VP Solutions @ OpenEBS - Containerized storage for containe... - Outlyer
What is wrong with stateful workloads on containers today? What is happening in the Linux kernel to improve the security of containers as a platform for storage? Could containers and Kubernetes become the foundations of a new approach to storage? Includes a quick demo of the OpenEBS project.
Video: https://youtu.be/rhx_TnZe_E4
This talk is from the DevOps Exchange San Francisco September Meetup: https://www.meetup.com/DevOps-Exchange-SanFrancisco
Feature flags are a valuable DevOps technique to deliver better, more reliable software faster. Feature flags can be used both for release management (dark launches, canary rollouts, betas) and for long-term control (entitlement management, user segmentation, personalization).
However, if not managed properly, feature flags can become very destructive technical debt. Feature flags need to be managed with visibility and control for both engineering and business users.
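A canary rollout like the ones mentioned above is often implemented by bucketing users deterministically. The sketch below is a generic illustration, not any particular flag service's API; the function name and parameters are made up.

```python
import hashlib

def flag_enabled(flag_name, user_id, rollout_percent):
    # Hash flag+user together so each flag rolls out to an independent
    # slice of users, and the same user always gets the same answer.
    h = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    bucket = int(h, 16) % 100          # stable bucket in [0, 100)
    return bucket < rollout_percent

# Deterministic: a user sees a stable decision across requests,
# and widening rollout_percent (5 -> 25 -> 100) widens the canary.
assert flag_enabled("new-checkout", "user-42", 100) is True   # fully launched
assert flag_enabled("new-checkout", "user-42", 0) is False    # fully dark
```

The "technical debt" warning applies here too: once `rollout_percent` reaches 100 and stays there, the flag and its dead branch should be deleted.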
Why You Need to Stop Using "The" Staging Server - Outlyer
The old staging methodology is broken for modern development. In fact, the staging server is a leftover from when we built monolithic applications. Find out why microservice architectures are driving ephemeral testing environments and why every sized dev shop should deliver true continuous deployment.
Staging servers slow down development with merge conflicts, slow iteration loops, and man-hour-intensive processes. To build better software faster, containers and infrastructure as code are key in 2017. DevOps professionals miss this talk at their own peril.
How GitHub combined with CI empowers rapid product delivery at Credit Karma - Outlyer
Amit and Kashyap will discuss how GitHub and self-service continuous integration (CI) help Credit Karma rapidly deliver new features to over 60 million members. They will review how Credit Karma streamlined and scaled growing CI needs stemming from an army of engineers decomposing a monolith into services.
Docker is often used as an end-to-end solution where services are packaged using a Dockerfile, pushed to a container registry, and then deployed to a container orchestrator like Kubernetes. In this talk, I would like to show you how Nix, the purely functional package manager, can replace and improve on Docker in the development and build phases of the application lifecycle.
Minimum Viable Docker: our journey towards orchestration - Outlyer
While Kubernetes and Mesos are all the rage, you don't necessarily need a complex orchestration layer to start using and benefiting from Docker. We will present how Babylon Health is running its dockerised AI microservices in production, pros and cons, and what we have in store for the future.
Ops is the past! DevOps is the present ! SRE is for giants! NoOps is the future! Fowler even says that a DevOps Engineer is an anti-pattern!
So will our job disappear in 10 years? What can we do about it? What is the next set of skills that we need? A startup is often a precursor to larger changes. I'll tell you what we are trying to do at Curve, a Fintech startup where developers build Kubernetes clusters and the SRE team codes microservices.
The service mesh: resilient communication for microservice applications - Outlyer
Modern application architecture is shifting from monolith to microservices: componentized, containerized, and orchestrated with systems like Kubernetes, Mesos, and Docker Swarm. While this environment is resilient to many failures of both hardware and software, applications require more than this to be truly resilient. In this talk, we introduce the notion of a "service mesh": a userspace infrastructure layer designed to manage service-to-service communication in microservice applications, including handling partial failures and unexpected load, while reducing tail latencies and degrading gracefully in the presence of component failure.
Microservices: Why We Did It (and should you?) - Outlyer
Mason will present a skeptical, humorous, and practical look at whether companies should consider microservices, and why (or why not). The story includes the reasons why Credit Karma made the move, the approach they took, and some of their learnings so far.
Renan Dias: Using Alexa to deploy applications to Kubernetes - Outlyer
It's time to bring voice commands into continuous deployment pipelines. In this talk, Renan will walk you through the steps of setting up a powerful and cutting-edge continuous deployment pipeline, which will allow you to deploy your products to Kubernetes clusters using just your voice. "Alexa, deploy API to production". If you have never imagined yourself doing that, or you have but don't know where to start, this talk is definitely for you.
Alex Dias: how to build a Docker monitoring solution - Outlyer
Alex will be talking about how docker container monitoring was built at Outlyer. He'll be diving into the details behind how you actually monitor everything in such an environment and the challenges that come with it. Namely, how the Docker API, Cgroups, and the Netlink Linux kernel interface can be leveraged to get specific metrics for each container.
How to build a container monitoring solution - David Gildeh, CEO and Co-Found... - Outlyer
David will be talking about how he's built the container monitoring at Outlyer. He'll also be diving into the details behind how you actually monitor everything in a container environment and the challenges that come with it.
Heresy in the church of - Corey Quinn, Principal at The Quinn Advisory Group - Outlyer
Docker (and by extension, microservices based architecture) has expanded our horizons with respect to how the industry builds and supports applications at scale. It’s changed the way we think about our code, what production looks like, and how we live. But in our rush to embrace this exciting new paradigm, are we throwing away the lessons of the past?
In this entertaining and somewhat irreverent talk, Corey presents the ”other side” of the containerization craze: how configuration management fits into a world consumed by the Docker Docker Docker madness, how ”containers all the way down” can let you down when you least expect it, and how promising technologies should perhaps be vetted a bit more thoroughly before you try to run critical services on top of them.
Anatomy of a real-life incident - Alex Solomon, CTO and Co-Founder of PagerDuty - Outlyer
Major incidents can be very stressful, frustrating and chaotic experiences, especially if the on-call responders lack the proper process, training and coordination.
In this talk, we will walk through a real incident from PagerDuty’s own history, to illustrate what an effective incident response looks like. We will recreate the incident timeline step by step and go over all of the different roles involved, including the incident commander, scribe, customer/business liaison and subject matter experts. We will also cover the process and tooling needed to respond quickly and effectively to major incidents in order to minimize customer and business impact.
A Holistic View of Operational Capabilities - Roy Rapoport, Insight Engineering, Netflix
Roy Rapoport will discuss the framework Insight Engineering at Netflix uses to think about the real-time operational insight space, the capabilities that any successful organization will eventually need in that space, and what Netflix has done in pursuit of addressing these needs at extremely large scale.
The Network Knows - Avi Freedman, CEO & Co-Founder of Kentik
Apps generate the traffic, but the network delivers it. Many devops and netops stacks are completely separate, but it doesn't have to be that way!
In this talk we'll cover network traffic telemetry - sources, tools, and methods - and show how that data can be linked to metric, log, and APM systems.
Building a production-ready, fully-scalable Docker Swarm using Terraform & Pa...
Bobby is a Consultant DevOps Engineer who currently works with UK Cloud’s clients to help them understand DevOps, how to improve their automation and migrate to a cloud-native environment. Bobby has over twenty years of experience working with the web and has most recently been working with public sector clients on their latest projects.
On the surface, the tech behind a payments API may look like any other startup’s. You'll probably find some Rails apps, a database, and a bunch of stuff off to the sides to glue it together. GoCardless found it's mostly not the tech that differs, but the approach.
Using their high-availability Postgres cluster as a running example, they explore how reliability became so important to them, and dive into the most recent feature they built into the cluster: zero-downtime patch upgrades.
DOXLON November 2016: Facebook Engineering on cgroupv2
Cgroupv1 (or just "cgroups") has helped revolutionize the way that we manage and use containers over the past 8 years. In kernel 4.5, a complete overhaul is coming -- cgroupv2. This talk will go into why a new control group system was needed, the changes from cgroupv1, and practical uses that you can apply to improve the level of control you have over the processes on your servers.
DOXLON November 2016 - ELK Stack and Beats
Jon Hammant, Head of Cloud & DevOps for UK & EU for Epam Systems, presented an overview of using the ELK stack together with the Beats Plugin data shippers to provide detailed system metrics, network traffic, file analysis, and more. In addition, he provided an overview of how to monitor multiple Docker containers in a cloud native environment, with logs sent back to a central host.
Multi-cluster Kubernetes Networking - Patterns, Projects and Guidelines - Sanjeev Rampal
Talk presented at Kubernetes Community Day, New York, May 2024.
Technical summary of multi-cluster Kubernetes networking architectures, with a focus on four key topics:
1) Key patterns for multi-cluster architectures
2) Architectural comparison of several OSS/CNCF projects that address these patterns
3) Evolution trends for the APIs of these projects
4) Design recommendations and guidelines for adopting and deploying these solutions
6. www.dataloop.io | @dataloopio | info@dataloop.io
Riak - Our New Hope
• Scales
• Ops Friendly
• Actually works
• No random JVM crashes here
Objectives
• Handle the load
• Semi-arbitrary queries
• Data retention windows
• Low latency
Data structure
• Resolution/rollup based queries
• Minimum 24 hours at 1 second resolution
• Second, minute and hour resolution
Data structure
• 86,400 data points per resolution
• 1 second -> 24 hour retention
• 1 minute -> 60 day retention
• 1 hour -> 10 year retention
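The retention windows above imply a fixed point budget per metric and resolution. A minimal sketch of that arithmetic, using the window and step values from the bullets above:

```javascript
// Retention window and step size per resolution, from the slide above.
const retention = {
  second: { stepSec: 1,    windowSec: 24 * 3600 },           // 24 hours at 1s
  minute: { stepSec: 60,   windowSec: 60 * 24 * 3600 },      // 60 days at 1m
  hour:   { stepSec: 3600, windowSec: 10 * 365 * 24 * 3600 } // 10 years at 1h
};

// Points stored per metric at a given resolution = window / step.
function pointsPerResolution(res) {
  const { stepSec, windowSec } = retention[res];
  return windowSec / stepSec;
}

console.log(pointsPerResolution('second')); // 86400
console.log(pointsPerResolution('minute')); // 86400
console.log(pointsPerResolution('hour'));   // 87600
```

Each resolution holds roughly the same number of points (about 86,400), which is what makes a uniform block layout practical.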
Data structure
• per metric -> 250k data points
• 1000 metric per host -> 2.5M data points
• 300 hosts per user -> 750M data points
• 1000 customers -> 750B data points!!!!!
Simple Riak Storage
• Timestamp keyed object per metric value
• 2i and MapReduce are too slow
• Especially across millions of keys
• Writes would soon cripple our Riak cluster
Intelligent Riak Storage
• Units of storage: time based data blocks
• Compute keys
• Mutable data windows
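The slides don't spell out the key scheme, but the GET paths on the next slide suggest one Riak object per metric per time block. A hypothetical sketch of computing such a block key directly from a timestamp (the block sizes here are assumptions, not from the talk):

```javascript
// Assumed block spans: how much wall-clock time one Riak object covers
// at each resolution. These values are illustrative only.
const BLOCK_SECONDS = { second: 3600, minute: 86400, hour: 604800 };

// Because the key is a pure function of (metric, resolution, timestamp),
// reads never need a secondary index (2i) or a MapReduce scan.
function blockKey(metric, resolution, tsSec) {
  const span = BLOCK_SECONDS[resolution];
  const blockStart = Math.floor(tsSec / span) * span; // align to block boundary
  return `${metric}_${resolution}_${blockStart}`;
}
```

Any timestamp inside the same hour maps to the same 1-second-resolution block: `blockKey('cpu', 'second', 7261)` and `blockKey('cpu', 'second', 7200)` both yield `cpu_second_7200`.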
Query
Get cpu metrics for host A for period t1-t4 at 1 second resolution
• Pull the correct blocks from Riak, based on block boundaries
• GET /buckets/host_a/keys/cpu_second_t1b
• GET /buckets/host_a/keys/cpu_second_t2b
• GET /buckets/host_a/keys/cpu_second_t3b
• GET /buckets/host_a/keys/cpu_second_t4b
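Continuing the hypothetical key scheme from above, a sequence of GETs like this can be generated by walking the block boundaries that cover the query range:

```javascript
// Assumed block span per resolution (illustrative, not from the talk).
const BLOCK_SECONDS = { second: 3600, minute: 86400, hour: 604800 };

// List every block key whose window overlaps [t1, t2].
function blockKeysForRange(metric, resolution, t1, t2) {
  const span = BLOCK_SECONDS[resolution];
  const keys = [];
  for (let start = Math.floor(t1 / span) * span; start <= t2; start += span) {
    keys.push(`${metric}_${resolution}_${start}`);
  }
  return keys;
}

// A query spanning ~2.3 hours at 1s resolution touches three hourly blocks:
blockKeysForRange('cpu', 'second', 3700, 12000);
// -> ['cpu_second_3600', 'cpu_second_7200', 'cpu_second_10800']
```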
Query
• Filter points outside of our query range
• Aggregate all the data points
• Perform further operations for more complex queries
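A minimal sketch of that post-fetch step, assuming each fetched block is an array of `[timestampSec, value]` pairs:

```javascript
// Simple average aggregator for the common case.
const avg = xs => xs.reduce((a, b) => a + b, 0) / xs.length;

// Merge the fetched blocks, trim points outside [t1, t2], then aggregate.
function queryRange(blocks, t1, t2, aggregate = avg) {
  const points = blocks
    .flat()                                   // one array of [ts, value] pairs
    .filter(p => p[0] >= t1 && p[0] <= t2);   // drop points outside the range
  return aggregate(points.map(p => p[1]));
}

// Two fetched blocks, query window [2, 4]:
queryRange([[[1, 10], [2, 20]], [[3, 30], [5, 50]]], 2, 4); // -> 25
```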
Expiring
• Cleanup worker
• Removes keys out of retention window
• Host-keyed, making it easier to clear a whole host's or account's data
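Under the same hypothetical key scheme sketched earlier, a cleanup worker can decide expiry from a key alone, with no read of the data (retention values from the earlier slides, block spans assumed):

```javascript
// Retention per resolution in seconds: 24 hours, 60 days, 10 years.
const RETENTION_SEC = { second: 86400, minute: 5184000, hour: 315360000 };
// Assumed block span per resolution (illustrative only).
const BLOCK_SECONDS = { second: 3600, minute: 86400, hour: 604800 };

// A block is expired once even its newest possible point has aged out.
function isExpired(key, nowSec) {
  // Hypothetical "<metric>_<resolution>_<blockStart>" key layout.
  const parts = key.split('_');
  const resolution = parts[parts.length - 2];
  const blockStart = Number(parts[parts.length - 1]);
  const blockEnd = blockStart + BLOCK_SECONDS[resolution];
  return blockEnd < nowSec - RETENTION_SEC[resolution];
}
```

A worker can then iterate a host's keys and delete every key for which `isExpired` returns true.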
Our cluster
• Riak 2.0
• 5 nodes on LevelDB
• Each with 2 x 500GB striped SSDs
• Average 1ms GET and PUT latencies
Comments
• Awesome, especially for ops
• A bit more work in application tier
• Always compute keys; avoid 2i and MapReduce
• Looking forward to using the new data types