This presentation, delivered at Cisco Live 2015, describes how Hortonworks is delivering Hadoop on Docker for a cloud-agnostic deployment approach.
Lessons Learned Running Hadoop and Spark in Docker Containers - BlueData, Inc.
Many initiatives for running applications inside containers have been scoped to run on a single host. Using Docker containers for large-scale production environments poses interesting challenges, especially when deploying distributed big data applications like Apache Hadoop and Apache Spark. This session at Strata + Hadoop World in New York City (September 2016) explores various solutions and tips to address the challenges encountered while deploying multi-node Hadoop and Spark production workloads using Docker containers.
Some of these challenges include container life-cycle management, smart scheduling for optimal resource utilization, network configuration and security, and performance. BlueData is "all in" on Docker containers—with a specific focus on big data applications. BlueData has learned firsthand how to address these challenges for Fortune 500 enterprises and government organizations that want to deploy big data workloads using Docker.
This session by Thomas Phelan, co-founder and chief architect at BlueData, discusses how to securely network Docker containers across multiple hosts and ways to achieve high availability across distributed big data applications and hosts in your data center. Since we’re talking about very large volumes of data, performance is a key factor, so Thomas shares some of the storage options implemented at BlueData to achieve near bare-metal I/O performance for Hadoop and Spark using Docker, as well as lessons learned and some tips and tricks on how to Dockerize your big data applications in a reliable, scalable, and high-performance environment.
http://conferences.oreilly.com/strata/hadoop-big-data-ny/public/schedule/detail/52042
Big Data in Container; Hadoop Spark in Docker and Mesos - Heiko Loewe
Three examples of containerized Big Data analytics:
1. Installation with Docker and Weave, for small and medium deployments
2. Hadoop on Mesos with Apache Myriad
3. Spark on Mesos
This session will examine the many options the data scientist has for running Spark clusters in public and private clouds. We will discuss various environments employing AWS, Mesos, containers, docker, and BlueData EPIC technologies and the benefits and challenges of each.
Speakers:
Tom Phelan, Co-founder and Chief Architect - BlueData Inc. Tom has spent the last 25 years as a senior architect, developer, and team lead in the computer software industry in Silicon Valley. Prior to co-founding BlueData, Tom spent 10 years at VMware as a senior architect and team lead in the core R&D Storage and Availability group. Most recently, Tom led one of the key projects – vFlash, focusing on integration of server-based Flash into the vSphere core hypervisor. Prior to VMware, Tom was part of the early team at Silicon Graphics that developed XFS, one of the most successful open source file systems. Earlier in his career, he was a key member of the Stratus team that ported the Unix operating system to their highly available computing platform. Tom received his Computer Science degree from the University of California, Berkeley.
How to Protect Big Data in a Containerized Environment - BlueData, Inc.
Every enterprise spends significant resources to protect its data. This is especially true in the case of big data, since some of this data may include sensitive or confidential customer and financial information. Common methods for protecting data include permissions and access controls as well as the encryption of data at rest and in flight.
The Hadoop community has recently rolled out Transparent Data Encryption (TDE) support in HDFS. Transparent Data Encryption refers to the process whereby data is transparently encrypted by the big data application writing the data; it is not decrypted again until it is accessed by another application. The data is encrypted during its entire lifespan—in transit and at rest—except when it is being specifically accessed by a processing application.
TDE is an excellent approach for protecting data stored in data lakes built on the latest versions of HDFS. However, it does have its challenges and limitations. Systems that want to use TDE require tight integration with enterprise-wide Kerberos Key Distribution Center (KDC) services and Key Management Systems (KMS). This integration isn’t easy to set up or maintain. These issues can be even more challenging in a virtualized or containerized environment where one Kerberos realm may be used to secure the big data compute cluster and a different Kerberos realm may be used to secure the HDFS filesystem accessed by this cluster.
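For context, here is a minimal sketch of how an HDFS encryption zone is typically created. It assumes a running Hadoop KMS and a kerberized HDFS client; the key and path names are illustrative. The commands are standard HDFS TDE commands, wrapped in Python for readability:

    import subprocess

    def run(cmd):
        # Run a command and fail loudly if it errors.
        subprocess.run(cmd, check=True)

    # 1. Create an encryption key in the KMS.
    run(["hadoop", "key", "create", "reportsKey"])
    # 2. Create an empty directory and mark it as an encryption zone.
    run(["hdfs", "dfs", "-mkdir", "/data/secure"])
    run(["hdfs", "crypto", "-createZone", "-keyName", "reportsKey",
         "-path", "/data/secure"])
    # 3. Verify: files written under /data/secure are now encrypted
    # transparently, and readers need access to reportsKey via the KMS.
    run(["hdfs", "crypto", "-listZones"])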
BlueData has developed significant expertise in configuring, managing, and optimizing access to TDE-protected HDFS. This session at the Strata Data Conference in March 2018 (by Thomas Phelan, co-founder and chief architect at BlueData) offers a detailed overview of how transparent data encryption works with HDFS, with a particular focus on containerized environments.
You’ll learn how HDFS TDE is configured and maintained in an environment where many big data frameworks run simultaneously (e.g., in a hybrid cloud architecture using Docker containers). Moreover, you’ll learn how KDC credentials can be managed in a Kerberos cross-realm environment to provide data scientists and analysts with the greatest flexibility in accessing data while maintaining complete enterprise-grade data security.
https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63763
Structor - Automated Building of Virtual Hadoop Clusters - Owen O'Malley
Discusses Vagrant scripts to set up and deploy a working multi-node Hadoop cluster, with or without security. All source code is available at https://github.com/hortonworks/structor .
How to deploy Apache Spark in a multi-tenant, on-premises environment - BlueData, Inc.
Adoption of Apache Spark in the enterprise is increasing rapidly - it's become one of the fastest growing and most popular technologies in the Big Data ecosystem.
However, implementing an enterprise-ready, on-premises Spark deployment can be very complex, and it requires expertise that is generally not available to all.
BlueData makes it easier to deploy Apache Spark on-premises. With BlueData, you can spin up virtual Spark clusters within minutes – providing secure, self-service, on-demand access to Big Data analytics and infrastructure. You can deploy Spark in standalone mode or with Hadoop / YARN. You can also build analytical pipelines and create Spark clusters using our RESTful APIs, and use web-based Zeppelin notebooks for interactive data analytics.
BlueData’s software platform leverages virtualization and Docker containers – combined with our own patent-pending innovations – to make it faster and more cost-effective for enterprises to get up and running with a multi-tenant Spark deployment on-premises.
Learn more at www.bluedata.com
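As an illustration of the RESTful-API approach mentioned above, here is a hedged Python sketch. The endpoint, payload fields, and authentication header are hypothetical stand-ins; BlueData's actual EPIC API is not reproduced here:

    import requests

    EPIC = "https://epic.example.com/api/v1"  # hypothetical controller URL
    session = requests.Session()
    session.headers["X-Auth-Token"] = "..."   # token acquisition omitted

    # Hypothetical payload: a 4-node standalone Spark cluster.
    spec = {
        "name": "spark-dev",
        "flavor": "spark-standalone",  # illustrative cluster template
        "node_count": 4,
    }
    resp = session.post(EPIC + "/clusters", json=spec)
    resp.raise_for_status()
    print("cluster id:", resp.json().get("id"))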
Kubernetes is an open source container cluster orchestration platform founded by Google. This presentation covers an overview of its main concepts, plus how it fits into Google Cloud Platform. It was delivered by Kit Merker at DevNexus 2015 in Atlanta.
Bare-metal performance for Big Data workloads on Docker containers - BlueData, Inc.
In a benchmark study, Intel® compared the performance of Big Data workloads running on a bare-metal deployment versus running in Docker* containers with the BlueData® EPIC™ software platform.
This in-depth study shows that performance ratios for container-based Hadoop workloads on BlueData EPIC are equal to — and in some cases, better than — bare-metal Hadoop. For example, benchmark tests showed that the BlueData EPIC platform demonstrated an average 2.33% performance gain over bare metal, for a configuration with 50 Hadoop compute nodes and 10 terabytes (TB) of data. These performance results were achieved without any modifications to the Hadoop software.
This is a revolutionary milestone, and the result of an ongoing collaboration between Intel and BlueData software engineering teams.
This white paper describes the software and hardware configurations for the benchmark tests, as well as details of the performance benchmark process and results.
YARN Containerized Services: Fading The Lines Between On-Prem And Cloud - DataWorks Summit
Apache Hadoop YARN is the modern distributed operating system for big data applications. In Apache Hadoop 3.1.0, YARN added a service framework that supports long-running services. This new capability goes hand in hand with the recent improvements in YARN to support Docker containers. Together these features have made it significantly easier to bring new applications and services to YARN.
In this talk you will learn about YARN service framework, its new containerization capabilities and how it lays the foundation for a hybrid and uniform architecture for compute and storage across on-prem and multi-cloud environments. This will include examples highlighting how easy it is to bring applications to the YARN service framework as well as how to containerize applications.
Here's what to expect in this talk:
- Motivation for YARN service framework and containerization
- YARN service framework overview
- YARN service examples
- Containerization overview
- Containerization for Big Data and non-Big Data workloads - wait, that's everything
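To make the service framework concrete, here is a minimal sketch of launching a Dockerized long-running service through the YARN Services REST API introduced alongside Hadoop 3.1. The ResourceManager address, image, and resource sizes are illustrative, and the cluster must have the Docker runtime and the services API enabled:

    import requests

    service = {
        "name": "redis-service",
        "version": "1.0.0",
        "components": [{
            "name": "redis",
            "number_of_containers": 2,
            "artifact": {"id": "library/redis", "type": "DOCKER"},
            "launch_command": "",  # use the image's default entrypoint
            "resource": {"cpus": 1, "memory": "256"},
        }],
    }
    # POST the spec to the ResourceManager's services endpoint.
    resp = requests.post(
        "http://resourcemanager.example.com:8088/app/v1/services",
        json=service)
    resp.raise_for_status()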
Using Ansible to deploy a 6-node Hortonworks Data Platform (Hadoop) cluster on AWS with the ObjectRocket ansible-hadoop playbook.
Presented at the Ansible NOVA MeetUp on February 23, 2017: https://www.meetup.com/Ansible-NOVA/events/236853616/
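For context, a hedged sketch of driving such a playbook run from Python; the inventory path, playbook name, and extra variable are illustrative rather than the exact files in the ObjectRocket repo:

    import subprocess

    # Illustrative invocation: a static inventory listing the six AWS
    # nodes and a site playbook that installs Ambari and deploys HDP.
    subprocess.run(
        ["ansible-playbook", "-i", "inventory/static", "site.yml",
         "--extra-vars", "cluster_name=hdp-demo"],
        check=True,
    )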
HPC and cloud distributed computing, as a journey - Peter Clapham
Introducing an internal cloud brings new paradigms, tools and infrastructure management. When placed alongside traditional HPC, the new opportunities are significant. But getting to the new world with micro-services, autoscaling and auto-healing is a journey that cannot be achieved in a single step.
CBlocks - POSIX-compliant file systems for HDFS - DataWorks Summit
With YARN running Docker containers, it is possible to run applications that are not HDFS-aware inside these containers. It is hard to customize these applications since most of them assume a POSIX file system with rewrite capabilities. In this talk, we will dive into how we created a block storage layer, how it is being tested internally, and the storage containers that make it all possible.
The storage container framework was developed as part of Ozone (HDFS-7240). This talk will also explore the current state of Ozone along with CBlocks, the architecture of storage containers, how replication is handled, scaling to millions of volumes, and I/O performance optimizations.
As Hadoop becomes the de facto big data platform, enterprises deploy HDP across a wide range of physical and virtual environments spanning private and public clouds. This session will cover key considerations for cloud deployment and showcase Cloudbreak for simple and consistent deployment across the cloud providers of your choice.
Apache Toree: A Jupyter Kernel for Spark, by Marius van Niekerk - Spark Summit
Many data scientists are already making heavy use of the Jupyter ecosystem for analyzing data using interactive notebooks.
Apache Toree (incubating) is a Jupyter kernel designed to act as a gateway to Spark by enabling users to run Spark from standard Jupyter notebooks. This allows users to easily integrate Spark into their existing Jupyter deployments and to move between languages and contexts without needing to switch to a different set of tools.
Apache Toree is designed expressly for interactive work. It supports interpreters in Scala, Python, and R.
In this talk, I will cover the design of Toree, how it interacts with the Jupyter ecosystem and various ways in which users can extend the functionality of Apache Toree via a powerful plugin system.
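For context, a minimal sketch of getting Toree running locally; the SPARK_HOME path is illustrative:

    import subprocess

    # Install Toree and register its kernels with Jupyter.
    subprocess.run(["pip", "install", "toree"], check=True)
    subprocess.run(
        ["jupyter", "toree", "install",
         "--spark_home=/usr/local/spark",
         "--interpreters=Scala,PySpark"],
        check=True,
    )
    # New "Apache Toree" kernels then appear in the notebook launcher.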
Cloud Foundry and OpenStack - A Marriage Made in Heaven! (Cloud Foundry Summit) - VMware Tanzu
Business Track presented by Animesh Singh, Lead Architect and Strategist at IBM.
Bringing the world's best IaaS to the world's best PaaS: in this talk, IBM and Rackspace share their experiences of running Cloud Foundry on OpenStack. The talk focuses on how Cloud Foundry and OpenStack complement each other, how they technically integrate using the Cloud Provider Interface (CPI), how OpenStack setup can be automated for Cloud Foundry deployments, and some of the best practices for configuring a scalable environment.
Dev ops for big data cluster management tools - Ran Silberman
What tools can we find today to manage a Hadoop cluster and its ecosystem?
There are two tools ready today: Cloudera Manager, and Ambari from Hortonworks.
In this presentation I explain what they do and why to use them, as well as their pros and cons.
Big Data Step-by-Step: Infrastructure 3/3: Taking it to the cloud... easily - Jeffrey Breen
Part 3 of 3 in a series focusing on the infrastructure aspect of getting started with Big Data. This presentation demonstrates how to use Apache Whirr to launch a Hadoop cluster on Amazon EC2, easily.
Presented at the Boston Predictive Analytics Big Data Workshop, March 10, 2012. Sample code and configuration files are available on github.
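For context, a minimal sketch of the Whirr workflow the talk demonstrates: write a recipe of properties, then launch. The cluster shape is illustrative, and AWS credentials are read from the environment:

    import os
    import subprocess

    recipe = [
        "whirr.cluster-name=hadoop-demo",
        # One master plus three workers (illustrative sizing).
        "whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,"
        "3 hadoop-datanode+hadoop-tasktracker",
        "whirr.provider=aws-ec2",
        "whirr.identity=" + os.environ["AWS_ACCESS_KEY_ID"],
        "whirr.credential=" + os.environ["AWS_SECRET_ACCESS_KEY"],
    ]
    with open("hadoop.properties", "w") as f:
        f.write("\n".join(recipe) + "\n")

    subprocess.run(
        ["whirr", "launch-cluster", "--config", "hadoop.properties"],
        check=True)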
Managing Docker Containers In A Cluster - Introducing Kubernetes - Marc Sluiter
Containerising your applications with Docker is attracting more and more attention. While managing your Docker containers on your developer machine or on a single server is not a big hassle, it can get uncomfortable very quickly when you want to deploy your containers in a cluster, whether in the cloud or on premises. How do you provide high availability, scaling and monitoring? Fortunately there is a rapidly growing ecosystem around Docker, and there are tools available to support you with this. In this session I want to introduce you to Kubernetes, the Docker orchestration tool started and open sourced by Google. Based on the experience with their data centers, Google uses some interesting declarative concepts in Kubernetes, like pods, replication controllers and services, which I will explain to you. While Kubernetes is still a quite young project, it reached its first stable version this summer, thanks to many contributions by Red Hat, Microsoft, IBM and many more.
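To illustrate the declarative concepts the talk covers, here is a minimal sketch of a replication controller that asks Kubernetes to keep three nginx pods running, posted straight to the API server. The insecure local API address is illustrative; real clusters use authenticated access:

    import requests

    rc = {
        "apiVersion": "v1",
        "kind": "ReplicationController",
        "metadata": {"name": "web"},
        "spec": {
            "replicas": 3,                  # desired pod count
            "selector": {"app": "web"},     # pods this RC manages
            "template": {                   # pod template to stamp out
                "metadata": {"labels": {"app": "web"}},
                "spec": {"containers": [{
                    "name": "nginx",
                    "image": "nginx",
                    "ports": [{"containerPort": 80}],
                }]},
            },
        },
    }
    resp = requests.post(
        "http://127.0.0.1:8080/api/v1/namespaces/default/replicationcontrollers",
        json=rc,
    )
    resp.raise_for_status()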
Hortonworks Technical Workshop: What's New in HDP 2.3 - Hortonworks
The recently launched HDP 2.3 is a major advancement of Open Enterprise Hadoop. It represents the best of community-led development, with innovations spanning Apache Hadoop, Apache Ambari, Ranger, HBase, Spark and Storm. In this session we will provide an in-depth overview of the new functionality and discuss its impact on new and ongoing big data initiatives.
A Tight Ship: How Containers and SDS Optimize the Enterprise - Eric Kavanagh
The Briefing Room with Dez Blanchfield and Red Hat
Think of containers as the drones of modern computing. They're small, agile, and can carry a significant payload. In many ways, they represent the fruition of the last two major paradigm shifts in enterprise software: SOA and virtualization. However, for companies to fully leverage this innovative approach, a persistent storage platform is needed that is as flexible and scalable as containers themselves.
Register for this episode of The Briefing Room to hear Bloor Group Data Scientist Dez Blanchfield, who will explain the significance of container technology, and the relevance of software-defined storage (SDS) in a constantly evolving IT world. He'll be briefed by Steve Watt and Sayan Saha of Red Hat, who will demonstrate how open-source technology can help organizations take advantage of this brave new world of enterprise computing. They will explain how containers are the next step in the evolution of the operating system, and why SDS is now the optimal solution.
PaaS Anywhere - Deploying an OpenShift PaaS into your Cloud Provider of Choice - Isaac Christoffersen
Choice matters. And OpenShift Enterprise by Red Hat is the only Platform-as-a-Service (PaaS) product that runs in the hosting environment of your choice.
While true PaaS products provide a level of infrastructure abstraction so developers can be more productive, choosing the right environment to host your PaaS product is critical. You must take many factors into consideration, like cost, availability, scalability, and manageability. Having a PaaS product that’s truly environment-agnostic can maximize your options and help you make the choice that best fits your needs.
This talk has a demo of deploying OpenShift Enterprise to multiple public cloud providers and within local virtual machines. By using common provisioning tools and the various cloud vendor APIs, you’ll see the realization of open, hybrid PaaS. You’ll also learn about considerations when running OpenShift Enterprise in these different environments, and monitoring strategies for proactively maintaining the health of an OpenShift Enterprise environment.
How to build "AutoScale and AutoHeal" systems using DevOps practices by using modern technologies.
A complete build pipeline and the process of architecting a nearly unbreakable system were part of the presentation.
These slides were presented at the 2018 DevOps conference in Singapore: http://claridenglobal.com/conference/devops-sg-2018/
[Srijan Wednesday Webinars] How to Build a Cloud Native Platform for Enterpri... - Srijan Technologies
Drupal has been a consistent leader in the Gartner Magic Quadrant for Web Content Management. However, enterprises leveraging Drupal have traditionally relied on PaaS providers for their hosting, scaling and lifecycle management. And that usually leads to enterprise applications being locked-in with a particular cloud or vendor.
As container and container orchestration technologies disrupt the cloud and platform landscape, there’s a clear way to avoid this state of affairs. In this webinar, we discuss why it's important to build a cloud-native Drupal platform, and exactly how to do that.
Join the webinar to understand how you can avoid vendor lock-in, and create a secure platform to manage, operate and scale your Drupal applications in a multi-cloud portable manner.
Key Takeaways:
- Why you need a cloud-native Drupal platform and how to build one
- How to craft an idiomatic development workflow
- Understanding infrastructure and cloud engineering - under the hood
- Demystifying the art and science of Docker and Kubernetes: deep dive into scaling the LAMP stack
- Exploring cost optimization and cloud governance
- Understand portability of applications
- A hands-on demo of how the platform works
Building Cloud Native Applications with Oracle Autonomous Database - Oracle Developers
In this session, Manish Kapur from the Oracle Application Development Cloud Platform team will provide an overview of Oracle's Cloud-Native Application Development platform. He will talk about developing and deploying cloud-native applications like Microservices and Serverless functions using Continuous Integration and Delivery Pipelines. This will include a demonstration of how to use the CI/CD approach to build and deploy a simple Node.js based microservices application that uses Oracle Autonomous Transaction Processing (ATP) database for persistence.
9. Cloudbreak
• Developed by SequenceIQ
• Open source with Apache 2.0 license [Apache project soon]
• Deploys selected services to public and private cloud via Ambari Blueprints
• Elastic – can spin up any number of nodes, add/remove on the fly
• Provides full cloud lifecycle management post-deployment
10. Launch HDP on Any Cloud for Any Application
1. Pick a Blueprint
2. Choose a Cloud
3. Launch HDP!
Example Ambari Blueprints: IoT Apps (Storm, HBase, Hive), BI / Analytics (Hive), Data Science (Spark), Dev / Test (all HDP services)
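To make the "pick a Blueprint" step concrete, here is a minimal sketch of registering a blueprint through Ambari’s REST API. The host, credentials, and the one-node host group are illustrative; a real blueprint needs the full complement of components:

    import requests

    blueprint = {
        "Blueprints": {"stack_name": "HDP", "stack_version": "2.2"},
        "host_groups": [{
            "name": "master",
            "cardinality": "1",
            "components": [
                {"name": "NAMENODE"},
                {"name": "RESOURCEMANAGER"},
                {"name": "ZOOKEEPER_SERVER"},
            ],
        }],
    }
    requests.post(
        "http://ambari.example.com:8080/api/v1/blueprints/datascience",
        json=blueprint,
        auth=("admin", "admin"),
        headers={"X-Requested-By": "ambari"},  # header required by Ambari
    ).raise_for_status()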
11. Hadoop in Cloud Provisioning with Cloudbreak
Create Templates → Provide Blueprint → Associate Credentials → Launch Cluster
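A hedged Python sketch of those four steps as REST calls follows. The endpoint paths and payload fields are hypothetical stand-ins, not Cloudbreak’s actual API:

    import requests

    cb = "https://cloudbreak.example.com/api"  # hypothetical base URL
    s = requests.Session()
    s.headers["Authorization"] = "Bearer ..."  # token acquisition omitted

    # 1. Create a template (node size, network setup).
    tpl = s.post(cb + "/templates",
                 json={"name": "worker", "instanceType": "m3.xlarge"}).json()
    # 2. Provide a blueprint (an Ambari blueprint registered with Cloudbreak).
    bp = s.post(cb + "/blueprints",
                json={"name": "datascience", "ambariBlueprint": "..."}).json()
    # 3. Associate cloud credentials.
    cred = s.post(cb + "/credentials",
                  json={"name": "dev", "cloudPlatform": "AWS"}).json()
    # 4. Launch the cluster from the three pieces above.
    s.post(cb + "/stacks", json={
        "name": "spark-cluster",
        "credentialId": cred["id"],
        "blueprintId": bp["id"],
        "instanceGroups": [{"templateId": tpl["id"], "nodeCount": 4,
                            "group": "worker"}],
    }).raise_for_status()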
17. Optimize cloud usage via Elastic Clusters
[Diagram: an autoscaling policy applied to example clusters such as BI / Analytics (Hive), IoT Apps (Storm, HBase, Hive), Dev / Test (all HDP services), and Data Science (Spark).]
• Policies based on any Ambari metrics
• Coordinates with YARN
• Policies are based on Metrics or Time
• Scaling can be service or component type specific
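As a hedged sketch of what such a policy might look like, consider a rule that adds worker nodes when a YARN metric stays above a threshold. The endpoint and field names below are hypothetical, chosen only to mirror the "Ambari metric crosses a threshold, so scale the host group" idea:

    import requests

    policy = {
        "alert": {
            "metric": "pendingContainers",  # an Ambari/YARN metric
            "threshold": 75,
            "period_minutes": 10,
        },
        "scaling": {
            "hostGroup": "worker",
            "adjustment": "+3",             # add three nodes when triggered
            "cooldown_minutes": 30,
        },
    }
    requests.post("https://cloudbreak.example.com/api/clusters/42/policies",
                  json=policy).raise_for_status()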
19. Provisioning – How it works
1. Start VMs with a running Docker daemon
2. Cloudbreak Bootstrap: start a Consul cluster, then start a Swarm cluster (using Consul for discovery)
3. Start Ambari servers/agents via the Swarm API
4. Ambari services registered in Consul (via Registrator)
5. Post the Blueprint
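For context, once Registrator has published the Ambari containers into Consul, any node can discover them through Consul’s standard HTTP catalog API. A minimal sketch (the service name is illustrative; Registrator typically derives it from the image and port):

    import requests

    nodes = requests.get(
        "http://127.0.0.1:8500/v1/catalog/service/ambari-8080"
    ).json()
    for n in nodes:
        # Each entry carries the host and port where the service listens.
        print(n["Node"], n["Address"], n["ServicePort"])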
21. Docker is a “Shipping Container” System for Code
An engine that enables any payload to be encapsulated as a lightweight, portable, self-sufficient container. [Diagram: the multiplicity of stacks (static website, web frontend, user DB, queue, analytics DB) crossed with the multiplicity of hardware environments (development VM, QA server, public cloud, contributor’s laptop, production cluster, customer data center).]
22. Docker
• Lightweight, portable
• Build once, run anywhere
• VM – without the overhead of a VM
• Isolated containers
• Automated and scripted
23. Why Is Docker So Exciting?
For Developers: Build once…run anywhere
• A clean, safe, and portable runtime environment for your app
• No missing dependencies, packages etc.
• Run each app in its own isolated container
• Automate testing, integration, packaging
• Reduce/eliminate concerns about compatibility on different platforms
• Cheap, zero-penalty containers to deploy services
For DevOps: Configure once…run anything
• Make the entire lifecycle more efficient, consistent, and repeatable
• Eliminate inconsistencies between SDLC stages
• Support segregation of duties
• Significantly improves the speed and reliability of CI/CD
• Significantly lightweight compared to VMs
24. Docker: Containers vs. VMs
[Diagram: a VM stack (server → host OS → Type 2 hypervisor → per-app guest OS + bins/libs for App A, App A’, App B) next to a Docker stack (server → host OS kernel → containers holding just app binaries and libraries).]
Containers are isolated and share only the kernel; the result is significantly faster deployment, much less overhead, easier migration, and faster restart.
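A small sketch of the "much less overhead, faster restart" point: starting a container is a single API call against the Docker daemon, with no guest OS to boot. This assumes a local Docker daemon and the Docker SDK for Python (pip install docker):

    import time

    import docker

    client = docker.from_env()
    t0 = time.time()
    # Runs the command in a fresh container and returns its output.
    out = client.containers.run("alpine", ["echo", "hello from a container"])
    print(out.decode().strip(), "after %.2fs" % (time.time() - t0))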
26. HDP as Docker Containers via Cloudbreak – Run Hadoop as Docker Containers
• Running Ambari Cluster in Containers
• Use Blueprint to define services
• All HDP services share a single container
[Diagram: Cloudbreak provisions VMs (each running Docker on Linux) from cloud providers or bare metal, installs Ambari on the VMs, and then instructs Ambari to build the HDP cluster.]
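Continuing the earlier Ambari blueprint sketch, the "instructs Ambari to build the HDP cluster" step maps concrete hosts onto the blueprint’s host groups via another Ambari REST call (host names and credentials are illustrative):

    import requests

    cluster = {
        "blueprint": "datascience",  # the blueprint registered earlier
        "host_groups": [
            {"name": "master", "hosts": [{"fqdn": "node1.example.com"}]},
        ],
    }
    requests.post(
        "http://ambari.example.com:8080/api/v1/clusters/spark-cluster",
        json=cluster,
        auth=("admin", "admin"),
        headers={"X-Requested-By": "ambari"},
    ).raise_for_status()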
31. Benefits of running Hadoop on Docker
• Quick installation with pre-pulled rpms
• Same process/images for dev/qa/prod
• Same process for single/multi-node
43. Cisco and Hortonworks’ Partnership
• 100% open source Hadoop Distribution, Support and Training
• Integrated Infrastructures for Big Data
Cisco and Hortonworks are partnering to help you build your big data solution and reach massive scalability, superior efficiency and dramatically lower total cost of ownership, thanks to a validated joint architecture.
44. Results of the collaboration
• Efficient Hadoop as a service
• Adoption of Docker for enterprise Hadoop deployment

Task                                 Cisco InterCloud   Public Cloud Provider
HDP installation                     15:04 mins         11:55 mins
Teragen (avg of 3 executions)        7:08 mins          22:15 mins
Terasort (avg of 3 executions)       32:09 mins         60:12 mins
Teravalidate (avg of 3 executions)   2:31 mins          10:40 mins
45. Observations / Future Collaboration
Observations:
• Docker is maturing inside enterprises
• Interest in running Docker on top of bare metal
• Big data app developers are leaning towards containerization of apps
• YARN is becoming an application deployment platform beyond big data apps
• Demand for natively containerized, fully managed apps on YARN
Future collaboration:
• Run Docker natively on OpenStack
• Run Docker on YARN
• OpenStack bare metal
47. Learn More
• Download the Hortonworks Sandbox
• Learn Hadoop
• Build Your Analytic App
• Try Hadoop 2
More about Cisco & Hortonworks: http://hortonworks.com/partner/cisco/
More about Hortonworks’ Acquisition of SequenceIQ: http://bit.ly/1R1ktxO
Editor's Notes
Deploying Hadoop on OpenStack has never been easier, and the Hortonworks and Cisco collaboration over the last few months makes it completely automated and seamless.
This is a cautionary statement, as this presentation may include product and collaboration direction that is subject to change.
We were founded in 2011 by 24 developers from Yahoo where Hadoop was conceived to address data challenges at internet scale. What we now know of as Hadoop really started in 2005, when a team at Yahoo was directed to build out a large-scale data storage and processing technology that would allow them to improve their most critical application, Search.
Their challenge was essentially two-fold. First, they needed to capture and archive the contents of the internet, and then process the data so that users could search through it effectively and efficiently. Clearly traditional approaches were both technically (due to the size of the data) and commercially (due to the cost) impractical. The result was the Apache Hadoop project that delivered large-scale storage (HDFS) and processing (MapReduce).
Today we are over 600 employees and have partnered with over 1000 companies who are the leaders in the data center
We have also been very fortunate to achieve very significant customer adoption with over 330 customers as of the end of 2014, spanning nearly every vertical.
Hortonworks was founded with the sole intent to make Hadoop an enterprise data platform. With YARN as its foundation, HDP delivers a centralized architecture with true multi-tenancy for data processing and shared services for security, governance and operations to satisfy enterprise requirements, all deeply integrated and certified with leading datacenter technologies.
We are uniquely focused on this transformation of Hadoop and doing our work completely in open source. This is all predicated on our leadership in the community, which enables us not only to best support users but also to uniquely represent customer requirements within this open, thriving community.
Hortonworks’ approach is quite clear… we are focused on delivering enterprise-grade Hadoop as a reliable data platform that will enable your transition to a modern data architecture. To this end, we work solely within the broad open source community, with a focus on innovation at the core of Apache Hadoop, with YARN as a foundation, and then within all the related projects that deliver on the key requirements for the enterprise such as governance, security and operations.
Since our inception just three years ago, we have grown to more than 450 employees and have partnered closely with the leaders in the datacenter, all of whom share this vision: to enable a modern data architecture with Hadoop in order to allow their customers to address the architectural challenge that they all are facing due to exploding data volumes.
Hortonworks’ open platform approach enables us to partner and co-exist with other data center technologies. Our deep engineering relationship with data center leaders like Cisco makes it possible for customers to augment their data center with Hadoop technologies for their next-generation modern data architecture.
Hortonworks’ Hadoop platform already enabled deploying Hadoop in any environment, from Linux to Windows and bare metal to cloud, so that the deployment environment is a business decision rather than a technical one. Continuing this Hadoop Everywhere vision, Hortonworks’ recent acquisition of SequenceIQ added a provisioning and auto-scaling toolset which makes it even easier to deploy Hadoop in private and public clouds, accelerating time-to-value for Hadoop deployments.
Cloudbreak was developed by SequenceIQ, a company from the beautiful city of Budapest. Hortonworks acquired them in April.
Cloudbreak is open source with an Apache 2.0 license and uses many other open source technologies as its building blocks, including Docker.
It is a Hadoop cluster deployment and management tool which can deploy an app- or use-case-specific Hadoop cluster to public and private cloud environments in a matter of minutes.
It also provides ongoing cluster infrastructure management, including policy-based auto-scaling of clusters to optimize infrastructure usage.
Cloudbreak enables launching a Hadoop cluster in four easy steps.
The template captures your Hadoop cluster infrastructure definition: node size, network setup.
Cloudbreak supports heterogeneous instances for building the Hadoop cluster, since not all services or service components have the same resource requirements.
Cloudbreak not only simplifies Hadoop cluster provisioning in the Cisco OpenStack cloud but also automatically scales Hadoop clusters based on SLA- or time-based policies. SLAs are monitored through Hadoop service metrics captured by Ambari. This way Cloudbreak gives you elastic Hadoop clusters very quickly in the Cisco OpenStack cloud.
Cloudbreak actively monitors Ambari metrics to assess the health of every Hadoop service. It allows defining policies based on these metrics for every cluster deployed and enabled for auto-scaling. Based on these metrics and user-defined policies, Cloudbreak can scale clusters or services by adding nodes or allocating more YARN containers, depending on the type of Hadoop service.
A view from 10,000 feet.
The only thing it needs is a running Docker daemon. All cloud providers are moving towards Docker, including Cisco Intercloud.
Quick question: how many of you have used Docker before?
Docker is a container-based virtualization framework. It is an open platform for developers and admins to build, ship, and run distributed applications. Consisting of Docker Engine, a portable, lightweight runtime and packaging tool, and Docker Hub, a cloud service for sharing applications and automating workflows, Docker enables apps to be quickly assembled from components and eliminates the friction between development, QA, and production environments. Docker is like a lightweight, portable VM, but without the overhead of a VM.
Unlike traditional virtualization, Docker is fast, lightweight and easy to use. Docker allows you to create containers holding all the dependencies for an application. Each container is kept isolated from any other, and nothing gets shared.
Steps:
Swarm can spin up Docker containers remotely on hosts, considering:
1. Resource management – aware of the cluster resources (e.g. can schedule with bin packing, placing a container anywhere 1 GB of memory is available) or randomly
2. Constraints using labels (label a node and start the container based on labels)
3. Affinity – containers can be co-scheduled (link, volumes-from, net=container on the same host)
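A minimal sketch of those three scheduling hints using classic Docker Swarm’s constraint and affinity expressions, which are passed as environment variables on docker run (node labels, images, and container names are illustrative):

    import subprocess

    # 1. Resource management: reserve 1 GB so Swarm's bin-packing strategy
    #    places the container where that much memory is free.
    subprocess.run(["docker", "run", "-d", "-m", "1g", "redis"], check=True)
    # 2. Constraint: only run on nodes labeled storage=ssd.
    subprocess.run(["docker", "run", "-d",
                    "-e", "constraint:storage==ssd", "redis"], check=True)
    # 3. Affinity: co-schedule next to the container named "frontend".
    subprocess.run(["docker", "run", "-d",
                    "-e", "affinity:container==frontend", "busybox"],
                   check=True)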
The best of Hadoop, Docker and OpenStack in a single cloud platform for our joint customers.
Description        Texas 3            GCP
VM types           GP2-2Xlarge        n1-standard-8
Cores              8                  8
Memory             32 GB              30 GB
Volume size        2 x 400GB          2 x 400GB
Volume type        HDD (magnetic)     generic (magnetic)
Data nodes count   10                 10
HDFS size          8 TB               8 TB
Yarn memory        240 GB             240 GB
HDP blueprint      multinode-hdfs-yarn
We are expanding our Cloud strategy to meet Enterprise customer demand.
Look at the top first. We’ve done a great job of taking our platform for Private Cloud and provisioning Enterprise workloads. We’ve done a great job with UCS, with VBlock, with FlexPod. As a matter of fact, we are the leader in converged infrastructure today, and that market is expanding as customers look to Cisco and our Partners to deliver the Enterprise workloads and the benefits of Private Cloud. They’re also asking for Dev/Ops models. They want to create truly native applications for the Public Cloud. They want to harness the value of Hadoop and Big Data Analytics and Hana. And they want to leverage the collaborative platform present today. We are the leader in Private Cloud infrastructure.
Along the left-hand side, our Partners have done some amazing things. 3 million seats of HCS, the IaaS platforms that they’ve invested in, small, medium, large, local community-based infrastructure platforms. Some Partners have enabled the PaaS platform. Some Partners are hosting Microsoft applications, like Dimension Data does today… globally around the world. Some Partners have managed to build a Citrix or VMware virtual desktop offer.
So what Cisco Cloud Services offer is an engine to generate more services to augment capabilities we’ve invested in, and to do so in a way that only we could do together. You’ll see us leverage the extensions through innovations in the WebEx platform. You’ll see that Meraki is a very powerful model to continue to expand. You’ll hear more about the portfolio of Unified Threat Defense, and comprehensive threat defense that we think only we can bring to the cloud.
You’ll see more about analytics, and the platforms that we have in store. You’ll soon see more about HANA-as-a-Service. And all the capabilities we can bring will be an acceleration of those offers that we can bring to you. Why not accelerate all of our capabilities together, using our capabilities in a way that no one else has. And by the way, we can’t ignore the big Public Clouds. Let’s use the Intercloud Fabric manager when appropriate to just move a workload out to that Public Cloud. I don’t care if it’s Azure, or Amazon or Google. Only Cisco can do this through some of the innovations that we have.
How are we going to do this?