The document discusses deploying Apache Spark on Kubernetes. It provides an overview of Kubernetes and Spark architectures, and describes how to configure Spark applications to run on Kubernetes, including using DaemonSets for the shuffle service, StatefulSets for HDFS, and a staging server for resources. Examples are given of SparkPi and GroupByTest submissions using Kubernetes. Challenges of running HDFS on Kubernetes are also mentioned.
Apache Spark on Kubernetes Anirudh Ramanathan and Tim ChenDatabricks
Kubernetes is a fast growing open-source platform which provides container-centric infrastructure. Conceived by Google in 2014, and leveraging over a decade of experience running containers at scale internally, it is one of the fastest moving projects on GitHub with 1000+ contributors and 40,000+ commits. Kubernetes has first class support on Google Cloud Platform, Amazon Web Services, and Microsoft Azure.
Unlike YARN, Kubernetes started as a general purpose orchestration framework with a focus on serving jobs. Support for long-running, data intensive batch workloads required some careful design decisions. Engineers across several organizations have been working on Kubernetes support as a cluster scheduler backend within Spark. During this process, we encountered several challenges in translating Spark considerations into idiomatic Kubernetes constructs. In this talk, we describe the challenges and the ways in which we solved them. This talk will be technical and is aimed at people who are looking to run Spark effectively on their clusters. The talk assumes basic familiarity with cluster orchestration and containers.
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed Spark Summit
This presentation describes the journey we went through in containerizing Spark workload into multiple elastic Spark clusters in a multi-tenant kubernetes environment. Initially we deployed Spark binaries onto a host-level filesystem, and then the Spark drivers, executors and master can transparently migrate to run inside a Docker container by automatically mounting host-level volumes. In this environment, we do not need to prepare a specific Spark image in order to run Spark workload in containers. We then utilized Kubernetes helm charts to deploy a Spark cluster. The administrator could further create a Spark instance group for each tenant. A Spark instance group, which is akin to the Spark notion of a tenant, is logically an independent kingdom for a tenant’s Spark applications in which they own dedicated Spark masters, history server, shuffle service and notebooks. Once a Spark instance group is created, it automatically generates its image and commits to a specified repository. Meanwhile, from Kubernetes’ perspective, each Spark instance group is a first-class deployment and thus the administrator can scale up/down its size according to the tenant’s SLA and demand. In a cloud-based data center, each Spark cluster can provide a Spark as a service while sharing the Kubernetes cluster. Each tenant that is registered into the service gets a fully isolated Spark instance group. In an on-prem Kubernetes cluster, each Spark cluster can map to a Business Unit, and thus each user in the BU can get a dedicated Spark instance group. The next step on this journey will address the resource sharing across Spark instance groups by leveraging new Kubernetes’ features (Kubernetes31068/9), as well as the Elastic workload containers depending on job demands (Spark18278). Demo: https://www.youtube.com/watch?v=eFYu6o3-Ea4&t=5s
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Amazon Web Services
"This is a technical architect's case study of how Loggly has employed the latest social-media-scale technologies as the backbone ingestion processing for our multi-tenant, geo-distributed, and real-time log management system. This presentation describes design details of how we built a second-generation system fully leveraging AWS services including Amazon Route 53 DNS with heartbeat and latency-based routing, multi-region VPCs, Elastic Load Balancing, Amazon Relational Database Service, and a number of pro-active and re-active approaches to scaling computational and indexing capacity.
The talk includes lessons learned in our first generation release, validated by thousands of customers; speed bumps and the mistakes we made along the way; various data models and architectures previously considered; and success at scale: speeds, feeds, and an unmeltable log processing engine."
Apache Spark on Kubernetes Anirudh Ramanathan and Tim ChenDatabricks
Kubernetes is a fast growing open-source platform which provides container-centric infrastructure. Conceived by Google in 2014, and leveraging over a decade of experience running containers at scale internally, it is one of the fastest moving projects on GitHub with 1000+ contributors and 40,000+ commits. Kubernetes has first class support on Google Cloud Platform, Amazon Web Services, and Microsoft Azure.
Unlike YARN, Kubernetes started as a general purpose orchestration framework with a focus on serving jobs. Support for long-running, data intensive batch workloads required some careful design decisions. Engineers across several organizations have been working on Kubernetes support as a cluster scheduler backend within Spark. During this process, we encountered several challenges in translating Spark considerations into idiomatic Kubernetes constructs. In this talk, we describe the challenges and the ways in which we solved them. This talk will be technical and is aimed at people who are looking to run Spark effectively on their clusters. The talk assumes basic familiarity with cluster orchestration and containers.
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed Spark Summit
This presentation describes the journey we went through in containerizing Spark workload into multiple elastic Spark clusters in a multi-tenant kubernetes environment. Initially we deployed Spark binaries onto a host-level filesystem, and then the Spark drivers, executors and master can transparently migrate to run inside a Docker container by automatically mounting host-level volumes. In this environment, we do not need to prepare a specific Spark image in order to run Spark workload in containers. We then utilized Kubernetes helm charts to deploy a Spark cluster. The administrator could further create a Spark instance group for each tenant. A Spark instance group, which is akin to the Spark notion of a tenant, is logically an independent kingdom for a tenant’s Spark applications in which they own dedicated Spark masters, history server, shuffle service and notebooks. Once a Spark instance group is created, it automatically generates its image and commits to a specified repository. Meanwhile, from Kubernetes’ perspective, each Spark instance group is a first-class deployment and thus the administrator can scale up/down its size according to the tenant’s SLA and demand. In a cloud-based data center, each Spark cluster can provide a Spark as a service while sharing the Kubernetes cluster. Each tenant that is registered into the service gets a fully isolated Spark instance group. In an on-prem Kubernetes cluster, each Spark cluster can map to a Business Unit, and thus each user in the BU can get a dedicated Spark instance group. The next step on this journey will address the resource sharing across Spark instance groups by leveraging new Kubernetes’ features (Kubernetes31068/9), as well as the Elastic workload containers depending on job demands (Spark18278). Demo: https://www.youtube.com/watch?v=eFYu6o3-Ea4&t=5s
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Amazon Web Services
"This is a technical architect's case study of how Loggly has employed the latest social-media-scale technologies as the backbone ingestion processing for our multi-tenant, geo-distributed, and real-time log management system. This presentation describes design details of how we built a second-generation system fully leveraging AWS services including Amazon Route 53 DNS with heartbeat and latency-based routing, multi-region VPCs, Elastic Load Balancing, Amazon Relational Database Service, and a number of pro-active and re-active approaches to scaling computational and indexing capacity.
The talk includes lessons learned in our first generation release, validated by thousands of customers; speed bumps and the mistakes we made along the way; various data models and architectures previously considered; and success at scale: speeds, feeds, and an unmeltable log processing engine."
Slides presented during the Strata SF 2019 conference. Explaining how Lyft is building a multi-cluster solution for running Apache Spark on kubernetes at scale to support diverse workloads and overcome challenges.
Capacity planning is a difficult challenge faced by most companies. If you have too few machines, you will not have enough compute resources available to deal with heavy loads. On the other hand, if you have too many machines, you are wasting money. This is why companies have started investing in automatically scaling services and infrastructure to minimize the amount of wasted money and resources.
In this talk, Nathan will describe how Yelp is using PaaSTA, a PaaS built on top of open source tools including Docker, Mesos, Marathon, and Chronos, to automatically and gracefully scale services and the underlying cluster. He will go into detail about how this functionality was implemented and the design designs that were made while architecting the system. He will also provide a brief comparison of how this approach differs from existing solutions.
This session will examine the many options the data scientist has for running Spark clusters in public and private clouds. We will discuss various environments employing AWS, Mesos, containers, docker, and BlueData EPIC technologies and the benefits and challenges of each.
Speakers:
Tom Phelan, Co-founder and Chief Architect - BlueData Inc. Tom has spent the last 25 years as a senior architect, developer, and team lead in the computer software industry in Silicon Valley. Prior to co-founding BlueData, Tom spent 10 years at VMware as a senior architect and team lead in the core R&D Storage and Availability group. Most recently, Tom led one of the key projects – vFlash, focusing on integration of server-based Flash into the vSphere core hypervisor. Prior to VMware, Tom was part of the early team at Silicon Graphics that developed XFS, one of the most successful open source file systems. Earlier in his career, he was a key member of the Stratus team that ported the Unix operating system to their highly available computing platform. Tom received his Computer Science degree from the University of California, Berkeley.
Scylla Summit 2022: What’s New in ScyllaDB Operator for KubernetesScyllaDB
ScyllaDB has traveled a long way since introducing the Scylla Operator 1.0 at the previous Scylla Summit. This session walks you through the most important changes made over the last year. We take a close look at the latest performance features, as well as new security, stability, and user experience updates. There’s also a glimpse into our vision for the future. Plus, for all those who need an introduction or recap, we’ll also cover Operator basics.
To watch all of the recordings hosted during Scylla Summit 2022 visit our website here: https://www.scylladb.com/summit.
Spark Pipelines in the Cloud with Alluxio with Gene PangSpark Summit
Organizations commonly use Apache Spark to gain actionable insight from their large amounts of data. Often, these analytics are in the form of data processing pipelines, where there are a series of processing stages, and each stage performs a particular function, and the output of one stage is the input of the next stage. There are several examples of pipelines, such as log processing, IoT pipelines, and machine learning. The common attribute among different pipelines is the sharing of data between stages. It is also common for Spark pipelines to process data stored in the public cloud, such as Amazon S3, Microsoft Azure Blob Storage, or Google Cloud Storage. The global availability and cost effectiveness of these public cloud storage services make them the preferred storage for data. However, running pipeline jobs while sharing data via cloud storage can be expensive in terms of increased network traffic, and slower data sharing and job completion times. Using Alluxio, a memory speed virtual distributed storage system, enables sharing data between different stages or jobs at memory speed. By reading and writing data in Alluxio, the data can stay in memory for the next stage of the pipeline, and this result in great performance gains. In this talk, we discuss how Alluxio can be deployed and used with a Spark data processing pipeline in the cloud. We show how pipeline stages can share data with Alluxio memory for improved performance benefits, and how Alluxio can improves completion times and reduces performance variability for Spark pipelines in the cloud.
Dev ops for big data cluster management toolsRan Silberman
What are the tools that we can find to day to manage Hadoop cluster and its ecosystem?
There are two tools ready today:
Cloudera Manager and Ambari from Hortonworks.
In this presentation I explain what they do and why to use them, as well as Pros. and Cons.
Homologous Apache Spark Clusters Using Nomad with Alex DadgarDatabricks
Nomad is a modern cluster manager by HashiCorp, designed for both long-lived services and short-lived batch processing workloads. The Nomad team has been working to bring a native integration between Nomad and Apache Spark.
By running Spark jobs on Nomad, both Spark developers and the engineering organization benefit. Nomad’s architecture allows it to have an incredibly high scheduling throughput. To demonstrate this, HashiCorp scheduled 1 million containers in less than five minutes. That speed means that large Spark workloads can be immediately placed, minimizing job runtime and job start latencies.
For an organization, Nomad offers many benefits. Since Nomad was designed for both batch and services, a single cluster can service both an organization’s Spark workload and all service-oriented jobs. That, coupled with the fact that Nomad uses bin-packing to place multiple jobs on each machine, means that organizations can achieve higher density. Which saves money and makes capacity planning easier.
In the future, Nomad will also have the ability to enforce quotas and apply chargebacks, allowing multi-tenant clusters to be easily managed. To further increase the performance of Spark on Nomad, HashiCorp would like to ingest HDFS locality information to place the compute by the data.
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache AccumuloJoe Stein
In this talk we will walk through how Apache Kafka and Apache Accumulo can be used together to orchestrate a de-coupled, real-time distributed and reactive request/response system at massive scale. Multiple data pipelines can perform complex operations for each message in parallel at high volumes with low latencies. The final result will be inline with the initiating call. The architecture gains are immense. They allow for the requesting system to receive a response without the need for direct integration with the data pipeline(s) that messages must go through. By utilizing Apache Kafka and Apache Accumulo, these gains sustain at scale and allow for complex operations of different messages to be applied to each response in real-time.
Deploy an Elastic, Resilient, Load-Balanced Cluster in 5 Minutes with SenlinQiming Teng
This is a talk from the Austin OpenStack summit. It demonstrates how a resilient, elastic and load-balanced cluster can be deployed using senlin, heat, ceilometer, lbaas v2, nova.
You’ve heard all of the hype, but how can SMACK work for you? In this all-star lineup, you will learn how to create a reactive, scaling, resilient and performant data processing powerhouse. Bringing Akka, Kafka and Mesos together provides a foundation to develop and operate an elastically scalable actor system. We will go through the basics of Akka, Kafka and Mesos and then deep dive into putting them together in an end2end (and back again) distrubuted transaction. Distributed transactions mean producers waiting for one or more of consumers to respond. We'll also go through automated ways to failure induce these systems (using LinkedIn Simoorg) and trace them from start to stop through each component (using Twitters Zipkin). Finally, you will see how Apache Cassandra and Spark can be combined to add the incredibly scaling storage and data analysis needed in fast data pipelines. With these technologies as a foundation, you have the assurance that scale is never a problem and uptime is default.
An experience sharing of the OpenStack deployment at Suning.com, a large online retailer in China. The talk presents the challenges and opportunities on orchestrating the enterprise workloads using Heat.
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...Yahoo Developer Network
Apache Hadoop YARN is a modern resource-management platform that handles resource scheduling, isolation and multi-tenancy for a variety of data processing engines that can co-exist and share a single data-center in a cost-effective manner.
In the first half of the talk, we are going to give a brief look into some of the big efforts cooking in the Apache Hadoop YARN community.
We will then dig deeper into one of the efforts - supporting Docker runtime in YARN. Docker is an application container engine that enables developers and sysadmins to build, deploy and run containerized applications. In this half, we'll discuss container runtimes in YARN, with a focus on using the DockerContainerRuntime to run various docker applications under YARN. Support for container runtimes (including the docker container runtime) was recently added to the Linux Container Executor (YARN-3611 and its sub-tasks). We’ll walk through various aspects of running docker containers under YARN - resource isolation, some security aspects (for example container capabilities, privileged containers, user namespaces) and other work in progress features like image localization and support for different networking modes.
Speakers:
Vinod Kumar Vavilapalli is the Hadoop YARN and MapReduce guy at Hortonworks. He is a long term Hadoop contributor at Apache, Hadoop committer and a member of the Apache Hadoop PMC. He has a Bachelors degree from Indian Institute of Technology Roorkee in Computer Science and Engineering. He has been working on Hadoop for nearly 9 years and he still has fun doing it. Straight out of college, he joined the Hadoop team at Yahoo! Bangalore, before Hortonworks happened. He is passionate about using computers to change the world for better, bit by bit.
Sidharta Seethana is a software engineer at Hortonworks. He works on the YARN team, focussing on bringing new kinds of workloads to YARN. Prior to joining Hortonworks, Sidharta spent 10 years at Yahoo! Inc., working on a variety of large scale distributed systems for core platforms/web services, search and marketplace properties, developer network and personalization.
(Nina Hanzlikova, Zalando) Kafka Summit SF 2018
My team at Zalando fell in love with KStreams and their programming model straight out of the gate. However, as a small team of developers, building out and supporting our infrastructure while still trying to deliver solutions for our business has not always resulted in a smooth journey.
Can a small team of a couple of developers run their own Kafka infrastructure confidently and still spend most of their time developing code?
In this talk, we will dive into some of the problems we experienced while running Kafka brokers and Kafka Streams applications, as well as the consultations we had with other teams around this matter. We will outline some of the pragmatic decisions we made regarding backups, monitoring and operations to minimize our time spent administering our Kafka brokers and various stream applications.
Cloud-native .NET Microservices mit KubernetesQAware GmbH
BASTA! 2017, Mainz: Talk von Mario-Leander Reimer (@LeanderReimer, Cheftechnologe bei QAware).
Cloud-Größen wie Google, Twitter und Netflix haben die Kernbausteine ihrer Infrastruktur quelloffen verfügbar gemacht. Das Resultat aus vielen Jahren Cloud-Erfahrung ist nun frei zugänglich, und jeder kann seine eigenen Cloud-nativen Anwendungen entwickeln – Anwendungen, die in der Cloud zuverlässig laufen und fast beliebig skalieren. Die einzelnen Bausteine wachsen zu einem großen Ganzen zusammen, dem Cloud-Native-Stack. In dieser Session stellen wir die wichtigsten Konzepte und aktuellen Schlüsseltechnologien kurz vor. Anschließend implementieren wir einen einfachen Microservice mit .NET Core und Steeltoe OSS und bringen ihn zusammen mit ausgewählten Bausteinen für Service-Discovery und Konfiguration schrittweise auf einem Kubernetes-Cluster zum Laufen.
Slides presented during the Strata SF 2019 conference. Explaining how Lyft is building a multi-cluster solution for running Apache Spark on kubernetes at scale to support diverse workloads and overcome challenges.
Capacity planning is a difficult challenge faced by most companies. If you have too few machines, you will not have enough compute resources available to deal with heavy loads. On the other hand, if you have too many machines, you are wasting money. This is why companies have started investing in automatically scaling services and infrastructure to minimize the amount of wasted money and resources.
In this talk, Nathan will describe how Yelp is using PaaSTA, a PaaS built on top of open source tools including Docker, Mesos, Marathon, and Chronos, to automatically and gracefully scale services and the underlying cluster. He will go into detail about how this functionality was implemented and the design designs that were made while architecting the system. He will also provide a brief comparison of how this approach differs from existing solutions.
This session will examine the many options the data scientist has for running Spark clusters in public and private clouds. We will discuss various environments employing AWS, Mesos, containers, docker, and BlueData EPIC technologies and the benefits and challenges of each.
Speakers:
Tom Phelan, Co-founder and Chief Architect - BlueData Inc. Tom has spent the last 25 years as a senior architect, developer, and team lead in the computer software industry in Silicon Valley. Prior to co-founding BlueData, Tom spent 10 years at VMware as a senior architect and team lead in the core R&D Storage and Availability group. Most recently, Tom led one of the key projects – vFlash, focusing on integration of server-based Flash into the vSphere core hypervisor. Prior to VMware, Tom was part of the early team at Silicon Graphics that developed XFS, one of the most successful open source file systems. Earlier in his career, he was a key member of the Stratus team that ported the Unix operating system to their highly available computing platform. Tom received his Computer Science degree from the University of California, Berkeley.
Scylla Summit 2022: What’s New in ScyllaDB Operator for KubernetesScyllaDB
ScyllaDB has traveled a long way since introducing the Scylla Operator 1.0 at the previous Scylla Summit. This session walks you through the most important changes made over the last year. We take a close look at the latest performance features, as well as new security, stability, and user experience updates. There’s also a glimpse into our vision for the future. Plus, for all those who need an introduction or recap, we’ll also cover Operator basics.
To watch all of the recordings hosted during Scylla Summit 2022 visit our website here: https://www.scylladb.com/summit.
Spark Pipelines in the Cloud with Alluxio with Gene PangSpark Summit
Organizations commonly use Apache Spark to gain actionable insight from their large amounts of data. Often, these analytics are in the form of data processing pipelines, where there are a series of processing stages, and each stage performs a particular function, and the output of one stage is the input of the next stage. There are several examples of pipelines, such as log processing, IoT pipelines, and machine learning. The common attribute among different pipelines is the sharing of data between stages. It is also common for Spark pipelines to process data stored in the public cloud, such as Amazon S3, Microsoft Azure Blob Storage, or Google Cloud Storage. The global availability and cost effectiveness of these public cloud storage services make them the preferred storage for data. However, running pipeline jobs while sharing data via cloud storage can be expensive in terms of increased network traffic, and slower data sharing and job completion times. Using Alluxio, a memory speed virtual distributed storage system, enables sharing data between different stages or jobs at memory speed. By reading and writing data in Alluxio, the data can stay in memory for the next stage of the pipeline, and this result in great performance gains. In this talk, we discuss how Alluxio can be deployed and used with a Spark data processing pipeline in the cloud. We show how pipeline stages can share data with Alluxio memory for improved performance benefits, and how Alluxio can improves completion times and reduces performance variability for Spark pipelines in the cloud.
Dev ops for big data cluster management toolsRan Silberman
What are the tools that we can find to day to manage Hadoop cluster and its ecosystem?
There are two tools ready today:
Cloudera Manager and Ambari from Hortonworks.
In this presentation I explain what they do and why to use them, as well as Pros. and Cons.
Homologous Apache Spark Clusters Using Nomad with Alex DadgarDatabricks
Nomad is a modern cluster manager by HashiCorp, designed for both long-lived services and short-lived batch processing workloads. The Nomad team has been working to bring a native integration between Nomad and Apache Spark.
By running Spark jobs on Nomad, both Spark developers and the engineering organization benefit. Nomad’s architecture allows it to have an incredibly high scheduling throughput. To demonstrate this, HashiCorp scheduled 1 million containers in less than five minutes. That speed means that large Spark workloads can be immediately placed, minimizing job runtime and job start latencies.
For an organization, Nomad offers many benefits. Since Nomad was designed for both batch and services, a single cluster can service both an organization’s Spark workload and all service-oriented jobs. That, coupled with the fact that Nomad uses bin-packing to place multiple jobs on each machine, means that organizations can achieve higher density. Which saves money and makes capacity planning easier.
In the future, Nomad will also have the ability to enforce quotas and apply chargebacks, allowing multi-tenant clusters to be easily managed. To further increase the performance of Spark on Nomad, HashiCorp would like to ingest HDFS locality information to place the compute by the data.
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache AccumuloJoe Stein
In this talk we will walk through how Apache Kafka and Apache Accumulo can be used together to orchestrate a de-coupled, real-time distributed and reactive request/response system at massive scale. Multiple data pipelines can perform complex operations for each message in parallel at high volumes with low latencies. The final result will be inline with the initiating call. The architecture gains are immense. They allow for the requesting system to receive a response without the need for direct integration with the data pipeline(s) that messages must go through. By utilizing Apache Kafka and Apache Accumulo, these gains sustain at scale and allow for complex operations of different messages to be applied to each response in real-time.
Deploy an Elastic, Resilient, Load-Balanced Cluster in 5 Minutes with SenlinQiming Teng
This is a talk from the Austin OpenStack summit. It demonstrates how a resilient, elastic and load-balanced cluster can be deployed using senlin, heat, ceilometer, lbaas v2, nova.
You’ve heard all of the hype, but how can SMACK work for you? In this all-star lineup, you will learn how to create a reactive, scaling, resilient and performant data processing powerhouse. Bringing Akka, Kafka and Mesos together provides a foundation to develop and operate an elastically scalable actor system. We will go through the basics of Akka, Kafka and Mesos and then deep dive into putting them together in an end2end (and back again) distrubuted transaction. Distributed transactions mean producers waiting for one or more of consumers to respond. We'll also go through automated ways to failure induce these systems (using LinkedIn Simoorg) and trace them from start to stop through each component (using Twitters Zipkin). Finally, you will see how Apache Cassandra and Spark can be combined to add the incredibly scaling storage and data analysis needed in fast data pipelines. With these technologies as a foundation, you have the assurance that scale is never a problem and uptime is default.
An experience sharing of the OpenStack deployment at Suning.com, a large online retailer in China. The talk presents the challenges and opportunities on orchestrating the enterprise workloads using Heat.
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...Yahoo Developer Network
Apache Hadoop YARN is a modern resource-management platform that handles resource scheduling, isolation and multi-tenancy for a variety of data processing engines that can co-exist and share a single data-center in a cost-effective manner.
In the first half of the talk, we are going to give a brief look into some of the big efforts cooking in the Apache Hadoop YARN community.
We will then dig deeper into one of the efforts - supporting Docker runtime in YARN. Docker is an application container engine that enables developers and sysadmins to build, deploy and run containerized applications. In this half, we'll discuss container runtimes in YARN, with a focus on using the DockerContainerRuntime to run various docker applications under YARN. Support for container runtimes (including the docker container runtime) was recently added to the Linux Container Executor (YARN-3611 and its sub-tasks). We’ll walk through various aspects of running docker containers under YARN - resource isolation, some security aspects (for example container capabilities, privileged containers, user namespaces) and other work in progress features like image localization and support for different networking modes.
Speakers:
Vinod Kumar Vavilapalli is the Hadoop YARN and MapReduce guy at Hortonworks. He is a long term Hadoop contributor at Apache, Hadoop committer and a member of the Apache Hadoop PMC. He has a Bachelors degree from Indian Institute of Technology Roorkee in Computer Science and Engineering. He has been working on Hadoop for nearly 9 years and he still has fun doing it. Straight out of college, he joined the Hadoop team at Yahoo! Bangalore, before Hortonworks happened. He is passionate about using computers to change the world for better, bit by bit.
Sidharta Seethana is a software engineer at Hortonworks. He works on the YARN team, focussing on bringing new kinds of workloads to YARN. Prior to joining Hortonworks, Sidharta spent 10 years at Yahoo! Inc., working on a variety of large scale distributed systems for core platforms/web services, search and marketplace properties, developer network and personalization.
(Nina Hanzlikova, Zalando) Kafka Summit SF 2018
My team at Zalando fell in love with KStreams and their programming model straight out of the gate. However, as a small team of developers, building out and supporting our infrastructure while still trying to deliver solutions for our business has not always resulted in a smooth journey.
Can a small team of a couple of developers run their own Kafka infrastructure confidently and still spend most of their time developing code?
In this talk, we will dive into some of the problems we experienced while running Kafka brokers and Kafka Streams applications, as well as the consultations we had with other teams around this matter. We will outline some of the pragmatic decisions we made regarding backups, monitoring and operations to minimize our time spent administering our Kafka brokers and various stream applications.
Cloud-native .NET Microservices mit KubernetesQAware GmbH
BASTA! 2017, Mainz: Talk von Mario-Leander Reimer (@LeanderReimer, Cheftechnologe bei QAware).
Cloud-Größen wie Google, Twitter und Netflix haben die Kernbausteine ihrer Infrastruktur quelloffen verfügbar gemacht. Das Resultat aus vielen Jahren Cloud-Erfahrung ist nun frei zugänglich, und jeder kann seine eigenen Cloud-nativen Anwendungen entwickeln – Anwendungen, die in der Cloud zuverlässig laufen und fast beliebig skalieren. Die einzelnen Bausteine wachsen zu einem großen Ganzen zusammen, dem Cloud-Native-Stack. In dieser Session stellen wir die wichtigsten Konzepte und aktuellen Schlüsseltechnologien kurz vor. Anschließend implementieren wir einen einfachen Microservice mit .NET Core und Steeltoe OSS und bringen ihn zusammen mit ausgewählten Bausteinen für Service-Discovery und Konfiguration schrittweise auf einem Kubernetes-Cluster zum Laufen.
Kubernetes is designed to be an extensible system. But what is the vision for Kubernetes Extensibility? Do you know the difference between webhooks and cloud providers, or between CRI, CSI, and CNI? In this talk we will explore what extension points exist, how they have evolved, and how to use them to make the system do new and interesting things. We’ll give our vision for how they will probably evolve in the future, and talk about the sorts of things we expect the broader Kubernetes ecosystem to build with them.
Extending DevOps to Big Data Applications with KubernetesNicola Ferraro
DevOps, continuous delivery and modern architectural trends can incredibly speed up the software development process. Big Data applications cannot be an exception and need to keep the same pace.
In this one day workshop, we will introduce Spark at a high level context. Spark is fundamentally different than writing MapReduce jobs so no prior Hadoop experience is needed. You will learn how to interact with Spark on the command line and conduct rapid in-memory data analyses. We will then work on writing Spark applications to perform large cluster-based analyses including SQL-like aggregations, machine learning applications, and graph algorithms. The course will be conducted in Python using PySpark.
Kubernetes is exploding in popularity right now and has all the buzz and cargo-culting that Docker enjoyed just a few years ago. But what even is Kubernetes? How do I run my PHP apps in it? Should I run my PHP apps in it ?
Linux-Stammtisch Juli 2019, Munich: Talk by Mario-Leander Reimer (@LeanderReimer, Principal Software Architect at QAware)
=== Please download slides if blurred! ===
Abstract: Only a few years ago the move towards microservice architecture was the first big disruption in software engineering: instead of running monoliths, systems were now build, composed and run as autonomous services. But this came at the price of added development and infrastructure complexity. Serverless and FaaS seem to be the next disruption, they are the logical evolution trying to address some of the inherent technology complexity we are currently faced when building cloud native apps.
FaaS frameworks are currently popping up like mushrooms: Knative, Kubeless, OpenFn, Fission, OpenFaas or Open Whisk are just a few to name. But which one of these is safe to pick and use in your next project? Let's find out. This session will start off by briefly explaining the essence of Serverless application architecture. We will then define a criteria catalog for FaaS frameworks and continue by comparing and showcasing the most promising ones.
Best Practices for Running Kafka on Docker ContainersBlueData, Inc.
Docker containers provide an ideal foundation for running Kafka-as-a-Service on-premises or in the public cloud. However, using Docker containers in production environments for Big Data workloads using Kafka poses some challenges – including container management, scheduling, network configuration and security, and performance.
In this session at Kafka Summit in August 2017, Nanda Vijyaydev of BlueData shared lessons learned from implementing Kafka-as-a-Service with Docker containers.
https://kafka-summit.org/sessions/kafka-service-docker-containers
Google and Intel speak on NFV and SFC service delivery
The slides are as presented at the meet up "Out of Box Network Developers" sponsored by Intel Networking Developer Zone
Here is the Agenda of the slides:
How DPDK, RDT and gRPC fit into SDI/SDN, NFV and OpenStack
Key Platform Requirements for SDI
SDI Platform Ingredients: DPDK, IntelⓇRDT
gRPC Service Framework
IntelⓇ RDT and gRPC service framework
The overall evolution towards microservices has caused a lot of IT leaders to radically rethink architectures and platforms. One can hardly keep up with the rapid onslaught on new distributed technologies. The same people who just asked yesterday "how can we deploy Docker containers?", are now asking "how can we operate Kubernetes-as-a-Service on-premise?", and are about to start asking "how can we operate the open source frameworks of our choice, such as Spark, TensorFlow, HDFS, and more, as a service across hybrid clouds?”. This session will discuss: Challenges of orchestrating and operating.
The overall evolution towards microservices has caused a lot of IT leaders to radically rethink architectures and platforms. One can hardly keep up with the rapid onslaught on new distributed technologies. The same people who just asked yesterday "how can we deploy Docker containers?", are now asking "how can we operate Kubernetes-as-a-Service on-premise?", and are about to start asking "how can we operate the open source frameworks of our choice, such as Spark, TensorFlow, HDFS, and more, as a service across hybrid clouds?”. This session will discuss: Challenges of orchestrating and operating
Reliable Performance at Scale with Apache Spark on KubernetesDatabricks
Kubernetes is an open-source containerization framework that makes it easy to manage applications in isolated environments at scale. In Apache Spark 2.3, Spark introduced support for native integration with Kubernetes. Palantir has been deeply involved with the development of Spark’s Kubernetes integration from the beginning, and our largest production deployment now runs an average of ~5 million Spark pods per day, as part of tens of thousands of Spark applications.
Over the course of our adventures in migrating deployments from YARN to Kubernetes, we have overcome a number of performance, cost, & reliability hurdles: differences in shuffle performance due to smaller filesystem caches in containers; Kubernetes CPU limits causing inadvertent throttling of containers that run many Java threads; and lack of support for dynamic allocation leading to resource wastage. We intend to briefly describe our story of developing & deploying Spark-on-Kubernetes, as well as lessons learned from deploying containerized Spark applications in production.
We will also describe our recently open-sourced extension (https://github.com/palantir/k8s-spark-scheduler) to the Kubernetes scheduler to better support Spark workloads & facilitate Spark-aware cluster autoscaling; our limited implementation of dynamic allocation on Kubernetes; and ongoing work that is required to support dynamic resource management & stable performance at scale (i.e., our work with the community on a pluggable external shuffle service API). Our hope is that our lessons learned and ongoing work will help other community members who want to use Spark on Kubernetes for their own workloads.
betterCode Workshop: Effizientes DevOps-Tooling mit GoQAware GmbH
bettterCode, 24.06.2021, Online: Workshop of Mario-Leander Reimer (@LeanderReimer, Principal Software Architect at QAware) & Markus Zimmermann (@markus_zm, Senior Software Engineer at QAware)
== Please download slides in case they are blurred! ===
Use the right tool and language for the job! Especially in the DevOps tooling area, Go has established itself as a simple, reliable and efficient programming language. In this workshop, we learned about suitable application areas and implementing quite a few tools.
MongoDB.local DC 2018: MongoDB Ops Manager + KubernetesMongoDB
MongoDB Ops Manager is an enterprise-grade end-to-end database management, monitoring, and backup solution. Kubernetes has clearly won the orchestration-platform "wars". In this session we'll take a deep dive on how you can leverage both these technologies to host your MongoDB deployments within your Kubernetes infrastructure whether that's OpenShift, PKS, Azure AKS, or just upstream. This talk will review the core technologies, such as containers, Kubernetes, and MongoDB Ops Manager. You'll also have a chance to see real-live demos of MongoDB running on Kubernetes and managed with MongoDB Ops Manager with the MongoDB Enterprise Kubernetes Operator.
Strata Singapore 2017 business use case section
"Big Telco Real-Time Network Analytics"
https://conferences.oreilly.com/strata/strata-sg/public/schedule/detail/62797
Quarkus Hidden and Forbidden ExtensionsMax Andersen
Quarkus has a vast extension ecosystem and is known for its subsonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting - quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
Globus Connect Server Deep Dive - GlobusWorld 2024Globus
We explore the Globus Connect Server (GCS) architecture and experiment with advanced configuration options and use cases. This content is targeted at system administrators who are familiar with GCS and currently operate—or are planning to operate—broader deployments at their institution.
Experience our free, in-depth three-part Tendenci Platform Corporate Membership Management workshop series! In Session 1 on May 14th, 2024, we began with an Introduction and Setup, mastering the configuration of your Corporate Membership Module settings to establish membership types, applications, and more. Then, on May 16th, 2024, in Session 2, we focused on binding individual members to a Corporate Membership and Corporate Reps, teaching you how to add individual members and assign Corporate Representatives to manage dues, renewals, and associated members. Finally, on May 28th, 2024, in Session 3, we covered questions and concerns, addressing any queries or issues you may have.
For more Tendenci AMS events, check out www.tendenci.com/events
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Globus
Large Language Models (LLMs) are currently the center of attention in the tech world, particularly for their potential to advance research. In this presentation, we'll explore a straightforward and effective method for quickly initiating inference runs on supercomputers using the vLLM tool with Globus Compute, specifically on the Polaris system at ALCF. We'll begin by briefly discussing the popularity and applications of LLMs in various fields. Following this, we will introduce the vLLM tool, and explain how it integrates with Globus Compute to efficiently manage LLM operations on Polaris. Attendees will learn the practical aspects of setting up and remotely triggering LLMs from local machines, focusing on ease of use and efficiency. This talk is ideal for researchers and practitioners looking to leverage the power of LLMs in their work, offering a clear guide to harnessing supercomputing resources for quick and effective LLM inference.
Into the Box Keynote Day 2: Unveiling amazing updates and announcements for modern CFML developers! Get ready for exciting releases and updates on Ortus tools and products. Stay tuned for cutting-edge innovations designed to boost your productivity.
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxrickgrimesss22
Discover the essential features to incorporate in your Winzo clone app to boost business growth, enhance user engagement, and drive revenue. Learn how to create a compelling gaming experience that stands out in the competitive market.
Accelerate Enterprise Software Engineering with PlatformlessWSO2
Key takeaways:
Challenges of building platforms and the benefits of platformless.
Key principles of platformless, including API-first, cloud-native middleware, platform engineering, and developer experience.
How Choreo enables the platformless experience.
How key concepts like application architecture, domain-driven design, zero trust, and cell-based architecture are inherently a part of Choreo.
Demo of an end-to-end app built and deployed on Choreo.
Unleash Unlimited Potential with One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
Large Language Models and the End of ProgrammingMatt Welsh
Talk by Matt Welsh at Craft Conference 2024 on the impact that Large Language Models will have on the future of software development. In this talk, I discuss the ways in which LLMs will impact the software industry, from replacing human software developers with AI, to replacing conventional software with models that perform reasoning, computation, and problem-solving.
A Comprehensive Look at Generative AI in Retail App Testing.pdfkalichargn70th171
Traditional software testing methods are being challenged in retail, where customer expectations and technological advancements continually shape the landscape. Enter generative AI—a transformative subset of artificial intelligence technologies poised to revolutionize software testing.
Cyaniclab : Software Development Agency Portfolio.pdfCyanic lab
CyanicLab, an offshore custom software development company based in Sweden,India, Finland, is your go-to partner for startup development and innovative web design solutions. Our expert team specializes in crafting cutting-edge software tailored to meet the unique needs of startups and established enterprises alike. From conceptualization to execution, we offer comprehensive services including web and mobile app development, UI/UX design, and ongoing software maintenance. Ready to elevate your business? Contact CyanicLab today and let us propel your vision to success with our top-notch IT solutions.
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of the work the team is investigating ways to speedup the time to solution for many different parts of the DIII-D workflow including how they run jobs on HPC systems. One of these routes is looking at Globus Compute as a way to replace the current method for managing tasks and we describe a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
top nidhi software solution freedownloadvrstrong314
This presentation emphasizes the importance of data security and legal compliance for Nidhi companies in India. It highlights how online Nidhi software solutions, like Vector Nidhi Software, offer advanced features tailored to these needs. Key aspects include encryption, access controls, and audit trails to ensure data security. The software complies with regulatory guidelines from the MCA and RBI and adheres to Nidhi Rules, 2014. With customizable, user-friendly interfaces and real-time features, these Nidhi software solutions enhance efficiency, support growth, and provide exceptional member services. The presentation concludes with contact information for further inquiries.
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...Juraj Vysvader
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I didn't get rich from it but it did have 63K downloads (powered possible tens of thousands of websites).
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet, they often turn into annoying tasks riddled with frustration, hostility, unclear feedback and lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll have the knowledge on how to organize and improve your code review proces
Developing Distributed High-performance Computing Capabilities of an Open Sci...Globus
COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and its broad response from the scientific community has forged new relationships among public health practitioners, mathematical modelers, and scientific computing specialists, while revealing critical gaps in exploiting advanced computing systems to support urgent decision making. Informed by our team’s work in applying high-performance computing in support of public health decision makers during the COVID-19 pandemic, we present how Globus technologies are enabling the development of an open science platform for robust epidemic analysis, with the goal of collaborative, secure, distributed, on-demand, and fast time-to-solution analyses to support public health.
How to Position Your Globus Data Portal for Success Ten Good PracticesGlobus
Science gateways allow science and engineering communities to access shared data, software, computing services, and instruments. Science gateways have gained a lot of traction in the last twenty years, as evidenced by projects such as the Science Gateways Community Institute (SGCI) and the Center of Excellence on Science Gateways (SGX3) in the US, The Australian Research Data Commons (ARDC) and its platforms in Australia, and the projects around Virtual Research Environments in Europe. A few mature frameworks have evolved with their different strengths and foci and have been taken up by a larger community such as the Globus Data Portal, Hubzero, Tapis, and Galaxy. However, even when gateways are built on successful frameworks, they continue to face the challenges of ongoing maintenance costs and how to meet the ever-expanding needs of the community they serve with enhanced features. It is not uncommon that gateways with compelling use cases are nonetheless unable to get past the prototype phase and become a full production service, or if they do, they don't survive more than a couple of years. While there is no guaranteed pathway to success, it seems likely that for any gateway there is a need for a strong community and/or solid funding streams to create and sustain its success. With over twenty years of examples to draw from, this presentation goes into detail for ten factors common to successful and enduring gateways that effectively serve as best practices for any new or developing gateway.
5. Kubernetes provides a common API and
self-healing framework which
automatically handles machine failures
and application deployments, logging,
and monitoring.
5
10. Clusters - set of compute, storage, network
resource
Pods - colocated group of application containers
that share volumes and a networking stack
Replication Controllers - ensure a specific number
of pods, manage pods, status updates
Services - cluster wide service discovery
10
11. Node #1 192.168.0.2
Pod #1
10.0.0.2
Node #5 192.168.0.6
Volume
Network
Pod #2
10.0.0.3
Volume
Network
Pod #8
10.0.0.9
Volume
Network
8080 8080 8080
11
12. Support for Event Stream Processing
Fast Data Queries in Real Time
Improved Programmer Productivity
Fast Batch Processing of Large Data Set
12
13. Driver Process that contains the SparkContext
Executor Process that executes one or more Spark tasks
Master Process that manages applications across the cluster
Worker Process that manages executors on a particular node
13
http://spark.apache.org/docs/latest/cluster-overview.html
30. Node Manager # 1…N
external
shuffle plugin
RDD
(IntermediateFile)
RDD
(IntermediateFile)
External Shuffle
30
Executor
Long-Running ETL jobs
Interactive application or Server
Any application with large shuffles
Executor