This was my talk at QConNY 2018, which was part of the container orchestration track, about how Chick-fil-A uses Kubernetes at the Edge in our restaurants and how we have engineered some solutions to solve problems that are unique to our scale.
Everything You Need to Know About Big Data: From Architectural Principles to ... (Amazon Web Services)
In this session, we discuss architectural principles that help simplify big data analytics. We'll apply principles to various stages of big data processing: collect, store, process, analyze, and visualize. We'll discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architectures, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
A Thorough Comparison of Delta Lake, Iceberg and Hudi (Databricks)
Recently, a set of modern table formats such as Delta Lake, Hudi, and Iceberg has emerged. Along with Hive Metastore, these table formats aim to solve long-standing problems in traditional data lakes through their declared features such as ACID, schema evolution, upsert, time travel, and incremental consumption.
Cosco: An Efficient Facebook-Scale Shuffle Service (Databricks)
Cosco is an efficient shuffle-as-a-service that powers Spark (and Hive) jobs at Facebook warehouse scale. It is implemented as a scalable, reliable and maintainable distributed system. Cosco is based on the idea of partial in-memory aggregation across a shared pool of distributed memory. This provides vastly improved efficiency in disk usage compared to Spark's built-in shuffle. Long term, we believe the Cosco architecture will be key to efficiently supporting jobs at ever larger scale. In this talk we'll take a deep dive into the Cosco architecture and describe how it's deployed at Facebook. We will then describe how it's integrated to run shuffle for Spark, and contrast it with Spark's built-in sort-based shuffle mechanism and SOS (presented at Spark+AI Summit 2018).
Apache Kylin: Speed Up Cubing with Apache Spark with Luke Han and Shaofeng Shi (Databricks)
Apache Kylin is a distributed OLAP engine on Hadoop that provides sub-second query latency over datasets scaling to petabytes. Kylin’s superior query performance relies on pre-calculated multi-dimensional Cubes, which are often time-consuming to build. By default, Kylin uses the MapReduce Cube Engine, built atop the Hadoop MapReduce framework, to aggregate huge amounts of source data. The MR Engine has been well tuned over the years and has proven stable in hundreds of production deployments. Recently, the Kylin team has been trying to further speed up cube building by replacing MR with Spark. Kyligence has initiated the new Spark Cube Engine, benchmarked Spark against MR over different datasets, and received some promising results. Hear about their results and experiences in moving Cube building, which is a huge computing task, to Spark.
Large Scale Lakehouse Implementation Using Structured Streaming (Databricks)
Business leads, executives, analysts, and data scientists rely on up-to-date information to make business decisions, adjust to the market, meet the needs of their customers, or run effective supply chain operations.
Come hear how Asurion used Delta, Structured Streaming, AutoLoader and SQL Analytics to improve production data latency from day-minus-one to near real time. Asurion’s technical team will share battle-tested tips and tricks you only get with certain scale. Asurion's data lake executes 4000+ streaming jobs and hosts over 4000 tables in its production Data Lake on AWS.
Parallelizing with Apache Spark in Unexpected Ways (Databricks)
Out of the box, Spark provides rich and extensive APIs for performing in-memory, large-scale computation across data. Once a system has been built and tuned with Spark Datasets/Dataframes/RDDs, have you ever been left wondering if you could push the limits of Spark even further? In this session, we will cover some of the tips learned while building retail-scale systems at Target to maximize the parallelization that you can achieve from Spark in ways that may not be obvious from current documentation. Specifically, we will cover multithreading the Spark driver with Scala Futures to enable parallel job submission. We will talk about developing custom partitioners to leverage the ability to apply operations across understood chunks of data and what tradeoffs that entails. We will also dive into strategies for parallelizing scripts with Spark that might have nothing to do with Spark, to support environments where peers work in multiple languages or where a different language/library is just the best thing to get the job done. Come learn how to squeeze every last drop out of your Spark job with strategies for parallelization that go off the beaten path.
Time series Analytics - a deep dive into ADX Azure Data Explorer @Data Saturd... (Riccardo Zamana)
Time series Analytics - a deep dive into ADX Azure Data Explorer. Let’s discover with a step-by-step approach the entire ecosystem of features driven by Azure Data eXplorer.
Right-Sizing your SQL Server Virtual Machine (heraflux)
Virtualizing your top-tier production SQL Servers is not as easy as P2V’ing them. Sometimes allocating more resources to the VM is the wrong approach, and getting it wrong will silently hurt performance. What is the most effective method for determining the ‘right’ amount of resources to allocate? What happens if the workload changes a month from now?
This session addresses methods for understanding the performance of your mission-critical SQL Servers, gathered over the past ten years of SQL Server virtualization, and demonstrates valuable processes for collecting and analyzing performance statistics. Come learn how to properly ‘right-size’ the resources allocated to a VM, improve the performance of your SQL Servers, and keep it maximized well into the future.
Technological Geeks Video 13 :-
Video Link :- https://youtu.be/mfLxxD4vjV0
FB page Link :- https://www.facebook.com/bitwsandeep/
Contents :-
Hive Architecture
Hive Components
Limitations of Hive
Hive data model
Difference with traditional RDBMS
Type system in Hive
The latest version of the Netflix Cloud Architecture story was given at Gluecon on May 23rd, 2012. Gluecon rocks, and lots of Van Halen references were added for the occasion. The tradeoff between a developer-driven, high-functionality, AWS-based PaaS and an operations-driven, low-cost, portable PaaS is discussed. The three sections cover the developer view, the operator view and the builder view.
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov... (Spark Summit)
Data lineage tracking is one of the significant problems that financial institutions face when using modern big data tools. This presentation describes Spline – a data lineage tracking and visualization tool for Apache Spark. Spline captures and stores lineage information from internal Spark execution plans and visualizes it in a user-friendly manner.
Amazon EC2 provides a broad selection of instance types to accommodate a diverse mix of workloads. In this session, we provide an overview of the Amazon EC2 instance platform, key platform features, and the concept of instance generations. We dive into the current-generation design choices of the different instance families, including General Purpose, Compute Optimized, Storage Optimized, Memory Optimized, and GPU instances. We also detail best practices and share performance tips for getting the most out of your Amazon EC2 instances.
Apache Spark on K8S Best Practice and Performance in the Cloud (Databricks)
As of Spark 2.3, Spark can run on clusters managed by Kubernetes. We will describe best practices for running Spark SQL on Kubernetes on Tencent Cloud, including how to deploy Kubernetes on a public cloud platform to maximize resource utilization and how to tune Spark configurations to take advantage of the Kubernetes resource manager and achieve the best performance. To evaluate performance, the TPC-DS benchmarking tool will be used to analyze the performance impact of queries across configuration sets.
Speakers: Junjie Chen, Junping Du
How Adobe Does 2 Million Records Per Second Using Apache Spark! (Databricks)
Adobe’s Unified Profile System is the heart of its Experience Platform. It ingests TBs of data a day and is PBs large. As part of this massive growth we have faced multiple challenges in our Apache Spark deployment which is used from Ingestion to Processing.
Performance Optimizations in Apache Impala (Cloudera, Inc.)
Apache Impala is a modern, open-source MPP SQL engine architected from the ground up for the Hadoop data processing environment. Impala provides low latency and high concurrency for BI/analytic read-mostly queries on Hadoop, which batch frameworks such as Hive or Spark do not deliver. Impala is written from the ground up in C++ and Java. It maintains Hadoop’s flexibility by utilizing standard components (HDFS, HBase, Metastore, Sentry) and is able to read the majority of the widely used file formats (e.g. Parquet, Avro, RCFile).
To reduce latency, such as that incurred from utilizing MapReduce or by reading data remotely, Impala implements a distributed architecture based on daemon processes that are responsible for all aspects of query execution and that run on the same machines as the rest of the Hadoop infrastructure. Impala employs runtime code generation using LLVM to improve execution times and uses static and dynamic partition pruning to significantly reduce the amount of data accessed. The result is performance that is on par with or exceeds that of commercial MPP analytic DBMSs, depending on the particular workload. Although initially designed for running on-premises against HDFS-stored data, Impala can also run on public clouds and access data stored in various storage engines such as object stores (e.g. AWS S3), Apache Kudu and HBase. In this talk, we present Impala's architecture in detail and discuss the integration with different storage engines and the cloud.
Hardware planning & sizing for sql server (Davide Mauri)
Purchasing a dedicated server for SQL Server is still a necessary operation. The cloud is a great choice, but if you need to create a data warehouse of non-trivial size, or if you need optimal performance and control of your production database server, an on-premises server is still an optimal choice. So, how do you avoid throwing away money on unnecessary hardware? In this session we will see how each component works together to form a balanced hardware configuration (this is the key word!), without bottlenecks, maximizing the investment made. We'll talk about SAN, CPU, HBA, Fibre Channel, Memory and everything you thought you knew well...
Learn how Apache Atlas is being enhanced to provide a universal open metadata and governance platform for all data processing across the enterprise. With open metadata, multiple metadata repositories, potentially from different vendors, can operate collaboratively to create an enterprise catalog of data that can be located, understood, used and governed. In this talk we will provide a detailed description of the extensions to the type system, new APIs, the connector framework, metadata discovery framework, governance action framework and the interoperability that we are adding to Apache Atlas. We will show examples of these features in operation. For example, (1) how metadata is discovered and gathered into Apache Atlas, (2) how applications and tools access metadata, (3) how enforcement engines such as Apache Ranger keep synchronized with the latest governance requirements and (4) how to build an adapter so that other vendors' metadata repositories can exchange metadata with Apache Atlas repositories. We will also explain how these features can be deployed together to support the Hadoop platform, and the enterprise beyond. This session will be presented by Nigel Jones - IBM & Ferd Schapers - ING Chief Information Architect.
Speaker:
Nigel Jones, Software Architect, IBM Analytics Group, IBM
Introduction to Amazon EMR design patterns such as using Amazon S3 instead of HDFS, taking advantage of Spot EC2 instances to reduce costs, and other Amazon EMR architectural best practices.
Migrating from Self-Managed Kubernetes on EC2 to a GitOps Enabled EKS (Weaveworks)
Did your company start down the path of building a cloud native platform using Kubernetes with the goal of enabling developers to innovate faster and increase productivity, but then run into challenges keeping it operating in an optimal way?
In this session, Weaveworks will discuss how to migrate from self-managed Kubernetes on EC2 to a GitOps-managed Shared Services Platform (SSP) on EKS. An SSP built on EKS and managed with Weave GitOps provides developers and operators with common workflows to update both applications and infrastructure. With every change in version control, full audit trails are available and security is enforced, while rollbacks become easier and mean-time-to-recovery (MTTR) becomes faster. In short, a Weave GitOps-managed SSP increases developer velocity while boosting stability.
How to operate a hybrid Kubernetes architecture, using managed EKS in the AWS Cloud and EKS-Distro on premises.
How to structure your infrastructure repository to efficiently manage multiple teams.
How to use Kubernetes RBAC to provide secure cluster multi-tenancy.
How to use GitOps to promote releases across a hybrid set of independent clusters.
How to accomplish data and operational sovereignty.
At Netweb we believe that innovation is a critical business need. As data analytics, high-performance computing and artificial intelligence continue to evolve, we are building solutions to help you keep pace with the constantly evolving landscape.
One Kubernetes to rule them all (ZEUS 2019 Keynote) (Simon Harrer)
In 2015, Google open sourced the core of their internal container clustering system under the name Kubernetes. Teams that previously relied upon IaaS and PaaS to run their applications quickly adopted Kubernetes instead. Today, only a few years later, Kubernetes is key to many companies and runs applications with literally billions of users. Kubernetes has become the de facto standard for deploying and running cloud native applications. We’ll give an overview of what Kubernetes is today and share our experiences from using Kubernetes in an e-commerce and an IoT application. The future of Kubernetes could not look better. The Kubernetes ecosystem is growing, allowing us to provision professionally managed databases directly within the cluster, run functions in a serverless fashion, and even host the code, the build pipeline and the application itself on Kubernetes. In the future, there might be only one Kubernetes to rule them all.
Lc3 beijing-june262018-sahdev zala-guangya (Sahdev Zala)
Our slide deck, used at LinuxCon+ContainerCon+CLOUDOPEN China 2018, on Kubernetes cluster design considerations and our journey to a single cluster of 1000+ nodes with IBM Cloud.
Service-Level Objective for Serverless Applications (alekn)
Deploying commercial applications that meet their expected business needs is challenging due to the differences between how business goals are specified and how the system is evaluated. Furthermore, business goals are dynamic, requiring deployments to change constantly over time. Such difficulties make it costly to maintain application quality, as the underlying infrastructure is not always fast enough to keep up with business changes. Serverless now opens up a new approach to building applications. By abstracting away the deployment details, serverless applications can be implemented with minimal deployment effort. Serverless also reduces maintenance cost through auto-scaling and pay-as-you-go pricing. Such abilities make us believe that by adopting serverless, we can build applications that meet and quickly adapt to business goals.
However, simply writing applications with serverless is not sufficient. Due to best-effort invocation mechanisms and the lack of application structure awareness, serverless performance is highly variable and often fails to support applications with rigorous quality of service requirements. In this study, we aim to mitigate such limitations by coupling serverless deployment with business needs. In particular, we define a Serverless Service-Level Objective (SLO) interface that allows developers to describe their application structure and business goals in terms of software-level objectives. We implement an SLO enforcer, which uses this information in combination with system performance metrics to decide on a proper serverless deployment and resource allocation for meeting business goals. The Serverless SLO leverages a blueprint model, which allows developers to describe an application's architecture and runtime characteristics, to map the application description to a serverless function deployment on top of Knative. We deploy our proposed system on KinD, a tool for running Kubernetes clusters in local Docker containers, and evaluate it with different system configurations. Evaluation results show that SLO definition and enforcement help serverless applications use resources in accordance with business goals.
Cost is often the conversation starter when customers think about moving to the cloud. AWS helps lower costs for customers through its “pay only for what you use” pricing model, frequent price drops, and pricing model choice to support variable & stable workloads. In this session, you will learn about the financial considerations of owning and operating a traditional data center or managed hosting provider versus utilizing AWS. We will detail our TCO methodology and showcase cost comparisons for some common customer use-cases. We’ll also cover a few AWS cost optimization areas, including Spot and Reserved Instances, EC2 Auto Scaling, and consolidated billing.
Presenters:
Amit Sharma, Solution Architect, Amazon Internet Services
Krishnenjit Roy, Director IT Operations, Freshdesk
Kubernetes Community Growth and Use Case (Chris Gaun)
The Kubernetes community has undergone tremendous growth over the past two years. A number of K8S use cases are currently being implemented by organizations.
Simplify Your Way To Expert Kubernetes ManagementDevOps.com
Kubernetes is a deep and complex technology that is evolving fast with new functionality and a growing ecosystem of cloud-native solutions. While the public cloud delivers an almost frictionless user experience, configuring and managing a production Kubernetes environment is an enormous technical challenge for the majority of enterprises that choose to do so on premises. Without the right approach, operationalizing Kubernetes in the data center can take upwards of 6 months, jeopardizing developer productivity and speed-to-market.
In this webinar, you’ll learn from Nutanix cloud native experts on how to fast-track your way to operationalizing a production-ready Kubernetes environment on-prem.
Specifically, we’ll talk about:
How containerized applications use IT resources (and why legacy infrastructure isn’t built for Kubernetes);
The main advantages of running Kubernetes on prem (as part of a multi-cloud strategy);
Key aspects of Kubernetes lifecycle management that greatly benefit from automation.
Driving Digital Transformation With Containers And Kubernetes Complete DeckSlideTeam
Introducing Kubernetes Concepts And Architecture PowerPoint Presentation Slides. This readily available open-source architecture PPT infographics well explains the concept of containers. You can also depict the architecture of containers and microservices with the help of a visually appealing PPT slideshow. Our content-ready containers PPT slideshow allow you to showcase the reasons for opting for Kubernetes by an organization. Depict the roadmap for installing Kubernetes in the organization in a presentable manner by using this slide design. The major advantages of Kubernetes, such as the stability of application run, improving productivity, and many more can be presented in this slide deck. Cover 30 60 90 days plan to implement Kubernetes in the organization with this thoroughly researched PowerPoint templates. Discuss the key components of Kubernetes with a diagram using this modern-designed cluster architecture PowerPoint layouts. Describe each element’s functionality using these PowerPoint visuals. Hence manage the clusters efficiently by downloading Kubernetes architecture PPT slides. https://bit.ly/3p6xEoS
Database as a Service (DBaaS) on KubernetesObjectRocket
Learn about ObjectRocket's adventures in Kubernetes. We'll cover why we chose Kubernetes for our DBaaS platform, the challenges we faced, and how we overcame them. A presentation for DevWeek Austin 2018.
Amazon Elastic Compute Cloud (Amazon EC2) provides resizable compute capacity in the cloud and makes web scale computing easier for customers. Amazon EC2 provides a wide variety of compute instances suited to every imaginable use case, from static websites to high performance supercomputing on-demand, available via highly flexible pricing options. Amazon EC2 works with Amazon Elastic Block Store (Amazon EBS) and Auto Scaling to make it easy for you to get the performance and availability you need for your applications. This session will introduce the key features and different instance types offered by Amazon EC2, demonstrate how you can get started and provide guidance on choosing the right types of instance and purchasing options.
Introduction of Kubernetes - Trang NguyenTrang Nguyen
This presentation provides the basic concepts of the Kubernetes for Beginners.
1) Introduction of Kubernetes
Before Kubernetes
What is Kubernetes
What Kubernetes can do?
What Kubernetes can't do?
Features of Kubernetes
Kubernetes Architecture
Kubernetes vs Docker Swarm
Kubernetes 7 use cases
...
2) Kubernetes Component
What is Kubelet?
What is Kubectl?
What is Kubeadm?
3) Nodes in Kubernetes
What is a node in Kubernetes?
Master node
Worker node
4) Kubernetes Development Process
What is blue green deployment?
How to automate the deployment?
5) Networking in Kubernetes
Kubernetes networking model
Ingress networking in Kubernetes
6) Security Measures in Kubernetes
Best security measures in Kubernetes
DCEU 18: Desigual Transforms the In-Store Experience with Docker Enterprise C...Docker, Inc.
Mathias Kriegel - IT Operations, Desigual
Joan Anton Sances - Software Architect, Desigual
Desigual, a $1-billion-dollar fashion retailer headquartered in Barcelona, operates over 500 stores worldwide. The company is on a digital transformation journey touching every aspect of the customer experience. In this session, IT Operations and Software Architecture teams, will explain how Desigual built an in-store “assistant shopping” that transformed the customer experience adopting modern architecture models leveraging Docker Enterprise for containerization. In the session, you’ll learn: ● How Desigual is leveraging containers with Docker Enterprise, micro services, API´s, CI/CD and hybrid cloud to create an excellent customer experience. ● How to use a container platform to accelerate time-to-market for new applications. ● How Desigual changed its traditional IT operational model, focusing on bringing a PaaS like model for Developer teams, and what they learned along the way. ● How Dev and Ops teams aligned together in the process. ● How Developer productivity increased by adopting modern architecture models.
2. What to expect from the session
• Intro
• How is CFA using K8s?
• What does our architecture look like?
• How are we engineering around K8s for our business?
• Q&A
4. AT PEAK HOUR
1 sandwich every 16 seconds
1 box of nuggets every 25 seconds
1 order of waffle fries every 14 seconds
1 car through the drive thru every 22 seconds
267 total transactions
12. Engineering Around K8s
• How we build and repair bare metal clusters
• SRE Lessons Learned
• How we deploy applications to thousands of clusters
13. Challenges of Bare Metal K8s clustering at scale
• Goal: #code2prod
• Simple enough for a non-technologist to install
• Manageable remotely
• Automated device discovery and self-clustering
• Self healing & HA
14. How we Bare Metal Cluster K8s at scale
Highlander, Hooves Up
TOOLS
Sherlock, Fleet, RKE, Image
PROCESS
15. Bootstrapping Clusters
• Highlander
– Node coordination and clustering leader election using UDP
– Execute clustering (RKE)
– Swap KubeDNS for CoreDNS
– Base OAuth identity negotiation
– Controller Pods (control plane activity/Istio)
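The slide only says that Highlander elects a clustering leader over UDP; it does not describe the protocol. The following is a minimal sketch of one way such an election could work, not Highlander's actual implementation. The message format, the `priority` field, the port number, and all function names are assumptions made for illustration: every node broadcasts a small JSON announcement, and after a listen window each node applies the same deterministic rule (lowest priority, then lowest node ID) so that all nodes agree on the leader without further coordination.

```python
import json
import socket
from typing import Iterable, Optional


def encode_announcement(node_id: str, priority: int) -> bytes:
    """Serialize a node's announcement (hypothetical wire format)."""
    return json.dumps({"node_id": node_id, "priority": priority}).encode("utf-8")


def decode_announcement(datagram: bytes) -> Optional[dict]:
    """Parse a datagram; return None for anything malformed."""
    try:
        msg = json.loads(datagram.decode("utf-8"))
    except ValueError:  # covers both bad UTF-8 and bad JSON
        return None
    if (
        isinstance(msg, dict)
        and isinstance(msg.get("node_id"), str)
        and isinstance(msg.get("priority"), int)
    ):
        return msg
    return None


def elect_leader(datagrams: Iterable[bytes]) -> Optional[str]:
    """Pick the leader from all announcements heard during the listen window.

    The rule is deterministic, so every node that hears the same set of
    announcements reaches the same answer.
    """
    candidates = [m for m in map(decode_announcement, datagrams) if m is not None]
    if not candidates:
        return None
    winner = min(candidates, key=lambda m: (m["priority"], m["node_id"]))
    return winner["node_id"]


def broadcast_announcement(node_id: str, priority: int, port: int = 50000) -> None:
    """Send one announcement to the local broadcast address (assumed port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(encode_announcement(node_id, priority), ("255.255.255.255", port))
```

Because UDP is lossy, a real implementation would likely repeat announcements and re-run the election when the leader stops announcing; the sketch leaves that out.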
16. Initializing Clusters
What we considered
• kops = love it, no bare metal
• Kubespray = slow + brittle
• kubeadm = maybe in the future
• RKE = fairly simple, works for us
Future State?
• Stick w/ RKE, kubeadm, or roll our own to meet our needs
17. Resetting Cluster State
• Requirement: Need to be able to re-image remotely
• Solution: Overlay FS + HAMS
– Manages wiping clusters and restoring to base
18. Hooves Up
• Self-healing AWS SSM Registration
• Free even for non-AWS deployments
• Able to do remote commands and patch reporting/management
19. Lessons learned
• Use K8s feature set and don’t reinvent the wheel
• MVP. MVP. MVP.
• Ensure aggregated and searchable logging
• Deep health checks are a must --> Use /healthz
• Every service needs a “/metrics” endpoint
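The last two lessons can be illustrated with a small service that exposes both endpoints. This is a minimal sketch using only the Python standard library, not code from the talk: the dependency checks, the counter names, and the port are placeholders. The "deep" part is that `/healthz` returns 200 only when every dependency check passes, rather than merely confirming that the process is running, and `/metrics` emits counters in the Prometheus text exposition format.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict, Tuple


# Hypothetical dependency checks; a real service would probe its database,
# message broker, downstream APIs, and so on.
def check_database() -> bool:
    return True


def check_message_queue() -> bool:
    return True


CHECKS: Dict[str, Callable[[], bool]] = {
    "database": check_database,
    "message_queue": check_message_queue,
}

# Illustrative in-process counters (names are placeholders).
COUNTERS: Dict[str, int] = {"orders_processed_total": 0, "orders_failed_total": 0}


def run_health_checks(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, str]:
    """Run every check; report 200 only if all dependencies are healthy."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises counts as an unhealthy dependency.
            results[name] = False
    status = 200 if all(results.values()) else 503
    return status, json.dumps(results, sort_keys=True)


def render_metrics(counters: Dict[str, int]) -> str:
    """Render counters in the Prometheus text exposition format."""
    lines = []
    for name in sorted(counters):
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {counters[name]}")
    return "\n".join(lines) + "\n"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path == "/healthz":
            status, body = run_health_checks(CHECKS)
            content_type = "application/json"
        elif self.path == "/metrics":
            status, body = 200, render_metrics(COUNTERS)
            content_type = "text/plain; version=0.0.4"
        else:
            status, body, content_type = 404, "not found\n", "text/plain"
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Start the server; blocks until interrupted."""
    HTTPServer(("", port), Handler).serve_forever()
```

Calling `serve()` starts the server. A Kubernetes liveness or readiness probe could then target `/healthz`, and a Prometheus-compatible scraper could collect `/metrics`.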
20. How do we deploy to our restaurants?
• Large number of deployment targets
• Complex success/fail criteria
• Array of application types
• What approaches did we consider?
– kubectl
21. Introducing Fleet
• Design Goals
– Simple to use / reason about
– Use declarative approach
– Support for variety of deployment models (canary, blue/green)
– Rollout over flexible time period
– Sane rollback behaviors
– Leverage standard k8s API
– Full visibility
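Two of these goals, canary support and rollout over a flexible time period, can be made concrete with a small scheduling sketch. This is not Fleet's algorithm, which the slides do not describe; the `Wave` type, the `plan_rollout` function, and its parameters are hypothetical. The idea is to deploy to a few canary clusters immediately and then spread the remaining clusters evenly across a number of waves within the rollout window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Sequence


@dataclass(frozen=True)
class Wave:
    starts_at: datetime
    clusters: List[str]


def plan_rollout(
    clusters: Sequence[str],
    canary_count: int,
    wave_count: int,
    start: datetime,
    duration: timedelta,
) -> List[Wave]:
    """Plan a canary wave at `start`, then spread the rest over `duration`."""
    if canary_count < 0 or wave_count < 1:
        raise ValueError("canary_count must be >= 0 and wave_count >= 1")

    canaries = list(clusters[:canary_count])
    rest = list(clusters[canary_count:])

    waves: List[Wave] = []
    if canaries:
        waves.append(Wave(start, canaries))
    if not rest:
        return waves

    step = duration / wave_count          # time between waves
    size = -(-len(rest) // wave_count)    # ceiling division: clusters per wave
    for i in range(wave_count):
        chunk = rest[i * size:(i + 1) * size]
        if chunk:
            waves.append(Wave(start + step * (i + 1), chunk))
    return waves
```

A controller built on such a plan could check the success criteria after each wave and stop or roll back before the next one starts, which is one way to get the "sane rollback behaviors" listed above.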
22. Fleet Ecosystem Components
• Fleet Client
– Git webhook, REST call, CLI
• Fleet Server API
– Code generation for deployment, service, ingress files
– Git management for cluster repositories
– Deployment status tracking
• Atlas
– Repository of deploy-ready, k8s compliant application files
• Vessel
– Deployed on cluster, git pull, kubectl apply, report status
• Dashboards
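The Vessel bullet describes a pull-based loop: fetch the cluster's repository, apply the manifests, and report the outcome. A minimal sketch of one iteration of that loop follows. It is an illustration under assumptions rather than Vessel's code: the function names, the directory arguments, and the shape of the status report are hypothetical, and only the `git pull` and `kubectl apply -f` commands themselves are standard.

```python
import subprocess
from typing import Callable, Sequence


def run_command(args: Sequence[str]) -> subprocess.CompletedProcess:
    """Run a command and capture its output as text."""
    return subprocess.run(list(args), capture_output=True, text=True)


def sync_once(
    repo_dir: str,
    manifest_dir: str,
    run: Callable[[Sequence[str]], subprocess.CompletedProcess] = run_command,
    report: Callable[[dict], None] = print,
) -> bool:
    """Pull the cluster repository, apply its manifests, and report status.

    `run` and `report` are injectable so the loop can be tested without git
    or kubectl, and so `report` can be replaced by a call to a status API.
    """
    steps = [
        ["git", "-C", repo_dir, "pull", "--ff-only"],
        ["kubectl", "apply", "-f", manifest_dir],
    ]
    for args in steps:
        result = run(args)
        if result.returncode != 0:
            # Stop at the first failure and say which step failed.
            report({"ok": False, "step": args[0], "detail": (result.stderr or "").strip()})
            return False
    report({"ok": True, "step": "done", "detail": ""})
    return True
```

Running `sync_once` on a timer inside each cluster means the cluster pulls its desired state, so no central system needs inbound network access to thousands of restaurants.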
29. Where you can find us
www.linkedin.com/in/brian-chambers
www.linkedin.com/in/calebrhurd
@brianchambers21
@calebrhurd
https://medium.com/@cfatechblog
https://github.com/chick-fil-a