Kerberos and Health Checks and Bare
Metal, Oh My!
Updates to OpenStack Sahara in Newton
Vitaly Gridnev, Sahara PTL (Mirantis)
Elise Gafford, Sahara Core (Red Hat)
Nikita Konovalov, Sahara Core (Mirantis)
Agenda
1. Sahara overview
2. Health checks and management improvements
3. Kerberos integration for clusters
4. Image generation improvements
5. Bare metal clusters
6. What is NEW in NEWton
7. Q&A
Sahara: The Use Cases
● Data Processing Cluster Management
○ On-demand, scalable, configurable, persistent clusters
○ Supports multiple plugins (Apache, Ambari, CDH, MapR...)
○ Integrates with Heat, Glance, Nova, Neutron, and Cinder
● EDP (Elastic Data Processing)
○ Supports multiple job types (Java, MR, Hive, Pig, Spark, Storm...)
○ Supports transient clusters (spin up, process, shut down) or
persistent clusters
○ Integrates with Swift and/or Manila (optionally)
Sahara: The API
Sahara: The Project
● Cluster provisioning plugins:
○ Cloudera Distribution of Hadoop (using Cloudera Manager)
○ Hortonworks Data Platform (using Apache Ambari)
○ MapR
○ “Vanilla” Apache Hadoop, Spark, and Storm
● EDP job types:
○ MapReduce, Java, Hive, and Pig jobs (using Apache Oozie)
○ Spark, Spark Streaming, and Storm jobs (using Apache Spark and Apache Storm)
● Image packing repository (sahara-image-elements)
● Framework to validate Sahara installation (sahara-tests)
● UI plugin
● OpenStackClient plugin
Event log for clusters
● Cluster provisioning events show the current status of
cluster provisioning, or the reasons for a failure
● Available since Newton for clusters created
using Ambari
● Supported in the CLI since Newton, with a full dump of
all steps and events
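As a rough illustration of how such a dump of steps and events could be consumed, the sketch below turns a cluster dictionary into one summary line per provisioning step. The field names (`provision_progress`, `step_name`, `total`, `events`, `successful`, `event_info`) are assumptions about the shape of the data made for this example, not a confirmed schema.

```python
from typing import Any, Dict, List


def summarize_progress(cluster: Dict[str, Any]) -> List[str]:
    """Return one human-readable line per provisioning step.

    The dictionary layout is an assumed shape, for illustration only.
    """
    lines = []
    for step in cluster.get("provision_progress", []):
        events = step.get("events", [])
        total = step.get("total", 0)
        # Count events that completed successfully.
        done = sum(1 for e in events if e.get("successful"))
        # An event explicitly marked unsuccessful explains the failure.
        failed = [e for e in events if e.get("successful") is False]
        if failed:
            state = "FAILED: " + failed[0].get("event_info", "unknown reason")
        elif done >= total:
            state = "done"
        else:
            state = "in progress"
        lines.append(f"{step['step_name']}: {done}/{total} {state}")
    return lines


# Hypothetical sample data, not real Sahara output.
example_cluster = {
    "provision_progress": [
        {"step_name": "Wait for instances", "total": 2,
         "events": [{"successful": True}, {"successful": True}]},
        {"step_name": "Configure cluster", "total": 2,
         "events": [{"successful": True},
                    {"successful": False, "event_info": "timeout"}]},
    ]
}

for line in summarize_progress(example_cluster):
    print(line)
```

A summary like this surfaces the first failure reason per step, which is the "reasons for a failure" use case the slide describes.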
Health checks for clusters
● Users want to monitor cluster state after
provisioning; this is vital for long-lived
clusters
● Sahara in Liberty doesn't have any monitoring of
the health of cluster processes. A cluster can be
broken or unavailable but Sahara will still think
that it is in ACTIVE status.
Health checks for clusters
● Cluster health checks have been available since
Mitaka
● Available for clusters deployed using Ambari and
Cloudera Manager; coverage is more limited for vanilla
clusters
● Since Newton, checks are also available for the MapR
plugin
● Health results can be set to notify Ceilometer
● Easy to recheck health
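To make the idea of per-cluster health results concrete, here is a minimal sketch that reduces a list of individual check results to one overall status. The GREEN/YELLOW/RED status names, their ordering, and the `name`/`status` keys are assumptions chosen for this example; the check names are made up.

```python
from typing import Dict, List

# Severity order assumed for this sketch: GREEN < YELLOW < RED.
SEVERITY = {"GREEN": 0, "YELLOW": 1, "RED": 2}


def overall_health(checks: List[Dict[str, str]]) -> str:
    """Return the worst status among the individual checks."""
    if not checks:
        return "UNKNOWN"  # nothing has been checked yet
    return max((c["status"] for c in checks), key=SEVERITY.__getitem__)


def failing_checks(checks: List[Dict[str, str]]) -> List[str]:
    """Return the names of checks that are not GREEN."""
    return [c["name"] for c in checks if c["status"] != "GREEN"]


# Hypothetical check results.
sample_checks = [
    {"name": "HDFS availability", "status": "GREEN"},
    {"name": "YARN availability", "status": "RED"},
    {"name": "Oozie availability", "status": "YELLOW"},
]

print(overall_health(sample_checks))
print(failing_checks(sample_checks))
```

Taking the worst individual result as the cluster status is one simple policy; a real deployment might weight checks differently.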
Health checks for clusters
Next steps are:
● More detailed health checks
○ Particular datanode/slave failure
○ Not enough space in HDFS
● Suggestions/actions to repair health:
○ Datanode replacement
○ New nodes
○ Restarting services
● More flexible configuration of health checks (advanced health
checks, enabling or disabling individual checks as needed)
Security improvements
● Security is an important part of created clusters
● Previously, security could be enabled only by
calling the managers (Ambari or Cloudera
Manager) directly, but in that case
Sahara cannot perform auth operations
and EDP does not work
● Security is important not just for clusters, but for
Sahara itself
Security improvements
In Newton, the following Kerberos security features were implemented:
● MIT KDC can be preconfigured (or an existing KDC can be used)
● Oozie client was re-implemented to support auth operations with Kerberos
● Spark job executions are also supported
● Keys are distributed on nodes for system users (hdfs, hadoop, spark)
● Supported for clusters deployed using Ambari and Cloudera Manager
● Note: Be sure that the latest hadoop-swift jars are in place for Swift data sources!
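To picture what distributing keys for system users across nodes involves, the sketch below lists one Kerberos principal per system user and host, using the standard `primary/instance@REALM` principal form. The host names and realm are hypothetical, and the per-host naming scheme is an assumption for illustration; it is not claimed to be the scheme Sahara itself uses.

```python
from typing import Dict, Iterable, List

# System users named on the slide.
SYSTEM_USERS = ("hdfs", "hadoop", "spark")


def principals_for_cluster(
    hosts: Iterable[str],
    realm: str,
    users: Iterable[str] = SYSTEM_USERS,
) -> Dict[str, List[str]]:
    """Map each host to the principals whose keys would be placed on it."""
    users = tuple(users)
    return {
        host: [f"{user}/{host}@{realm}" for user in users]
        for host in hosts
    }


# Hypothetical cluster nodes and realm.
plan = principals_for_cluster(
    ["master-0.example.org", "worker-0.example.org"], "EXAMPLE.ORG"
)
for host, principals in plan.items():
    print(host, principals)
```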
Security improvements
● Bandit tests per commit
● Improved secret storage
(using Barbican and
Castellan) was implemented
in the previous release
Where we were
Sahara had 2 flows that were relevant to image manipulation:
● Pre-Nova spawn image packing
○ Used sahara-image-elements repository to generate images (to store in Glance)
● Post-Nova spawn cluster generation from “clean” (OS-only) images
○ Logic maintained in Sahara process within plugins
● Pre-Configuration validation of images by plugins
○ Remember how I said we had 2 flows relevant to image manipulation?
○ We didn’t do this at all.
Where We Were: Problems
● Duplication of logic
○ Steps required for packing images and “clean” image clusters were often identical, but had to
be expressed separately (in DIB and in Python).
● Poor validation
○ Plugins did not validate that images provided to them met their needs.
○ Failures due to image contents were late and sometimes difficult to understand.
● Poor encapsulation
○ Image generation and cluster provisioning logic for any one plugin are really one application
○ Maintaining them in two places allows versionitis and dependency problems
○ Having one monolithic repo for all plugins makes them less pluggable
Our Dream Implementation
● All flows share common logic:
○ Image packing
○ Image validation
○ Clean image cluster gen
● Image manipulation is stored and versioned within plugins
● The user can still generate images with a CLI...
● But they can also use an API to generate images in clean build environments
● ... And both dev test cycles and user retries are as quick and painless as
possible
The plan
1. Build a validation engine that ensures that images meet a specification
a. YAML-based spec definition
2. Extend that engine to optionally modify images to spec
3. Build a CLI to expose this functionality
4. Create and test specifications for each plugin to support this method
5. Deprecate sahara-image-elements (only when this method proves stable)
6. Build an API to:
a. Spawn a clean tenant-plane image build environment
b. Download a base image from Glance and modify it to spec
c. Push the new image back to Glance and register it for use by Sahara
Where we are
1. Build a validation engine that ensures that images meet a specification
a. YAML-based spec definition
2. Extend that engine to optionally modify images to spec
3. Build a CLI to expose this functionality
4. Create and test specifications for each plugin to support this method
5. Deprecate sahara-image-elements (only when this method proves stable)
6. Build an API to:
a. Spawn a clean tenant-plane image build environment
b. Download a base image from Glance and modify it to spec
c. Push the new image back to Glance and register it for use by Sahara
What it looks like: the specs
● YAML-based definitions
● Argument definitions for
configurability
● Idempotent resource
declarations
○ Scripts must be written
idempotently, as always in
resource declarations
● Logical control operators (any,
all, os_case, etc.)
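The combination of idempotent resources and logical control operators can be sketched in a few lines of Python. Everything here is a toy model under stated assumptions: the `Image`, `Package`, `All`, `Any`, and `OsCase` classes are hypothetical names invented for this example, and the package names are placeholders. The point is only to show how a single declaration can serve both validation (test-only) and modification.

```python
from typing import Dict, Set


class Image:
    """Toy stand-in for an image: an OS family and a set of packages."""

    def __init__(self, distro: str, packages: Set[str]):
        self.distro = distro
        self.packages = set(packages)


class Package:
    """Idempotent resource: ensure a package is present."""

    def __init__(self, name: str):
        self.name = name

    def apply(self, image: Image, test_only: bool) -> bool:
        if self.name in image.packages:
            return True   # already satisfied, nothing to do
        if test_only:
            return False  # validation mode never modifies the image
        image.packages.add(self.name)
        return True


class All:
    """Succeeds only if every child succeeds (stops at the first failure)."""

    def __init__(self, *validators):
        self.validators = validators

    def apply(self, image: Image, test_only: bool) -> bool:
        return all(v.apply(image, test_only) for v in self.validators)


class Any:
    """Succeeds if at least one child succeeds."""

    def __init__(self, *validators):
        self.validators = validators

    def apply(self, image: Image, test_only: bool) -> bool:
        # First check whether any branch already holds, changing nothing.
        if any(v.apply(image, True) for v in self.validators):
            return True
        if test_only:
            return False
        # Otherwise try to bring the image to spec, one branch at a time.
        return any(v.apply(image, False) for v in self.validators)


class OsCase:
    """Pick a child by OS family; no matching case counts as success
    (an assumption of this sketch)."""

    def __init__(self, cases: Dict[str, object]):
        self.cases = cases

    def apply(self, image: Image, test_only: bool) -> bool:
        validator = self.cases.get(image.distro)
        return validator is None or validator.apply(image, test_only)


spec = All(
    OsCase({"redhat": Package("nfs-utils"), "debian": Package("nfs-common")}),
    Any(Package("openjdk"), Package("oracle-java")),
)

image = Image("debian", {"oracle-java"})
print(spec.apply(image, test_only=True))   # validate without changing
print(spec.apply(image, test_only=False))  # modify to spec
print(sorted(image.packages))
```

Because each resource checks before it acts, running the spec a second time changes nothing, which is what makes retries cheap.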
What it looks like: the CLI
Command structure:
sahara-image-pack --image ./image.qcow2 PLUGIN VERSION [plugin arguments]
Features:
● Auto-generates help text from arguments
● Idempotent and modifies images in-place
○ Very fast test cycles and retries
● Allows freeform bash scripts and more
structured resources
○ Though it’s on you to make your scripts
idempotent
● Test-only mode to validate without change
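Generating help text from argument definitions is straightforward with Python's standard `argparse` module. The sketch below is not the real tool: the program name `image-pack-sketch`, the `--test-only` flag spelling, and the `java-distro` argument definition are all assumptions made for illustration.

```python
import argparse

# Hypothetical argument definitions, as a plugin spec might declare them.
ARGUMENTS = {
    "java-distro": {
        "description": "The Java distribution to install.",
        "default": "openjdk",
        "choices": ["openjdk", "oracle-java"],
    },
}


def build_parser(arguments: dict) -> argparse.ArgumentParser:
    """Build a CLI parser whose options and help come from the definitions."""
    parser = argparse.ArgumentParser(prog="image-pack-sketch")
    parser.add_argument(
        "--image", required=True,
        help="Path to the image to modify in place.")
    parser.add_argument(
        "--test-only", action="store_true",
        help="Validate the image without changing it.")
    for name, spec in arguments.items():
        parser.add_argument(
            "--" + name,
            default=spec.get("default"),
            choices=spec.get("choices"),
            help=spec["description"],
        )
    return parser


parser = build_parser(ARGUMENTS)
args = parser.parse_args(
    ["--image", "./image.qcow2", "--java-distro", "oracle-java"])
print(args.image, args.java_distro, args.test_only)
print(parser.format_help())
```

Adding a new argument to the definitions is enough to make it appear in both the parser and the help output.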
What it’s doing
The images module runs a sequence
of steps against a remote machine
● Validation uses the Sahara SSH remote in
read-only mode
● Clean image gen uses the SSH remote
● Image packing uses a libguestfs Python
API image handle
All three use the same logic,
contained in the appropriate plugin
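One way to picture "the same logic against different targets" is a small interface that both an SSH connection and an image handle could satisfy. The sketch below uses an in-memory fake so that it runs anywhere; the `Remote` protocol, the `FakeRemote` class, and the `check`/`install` command strings are hypothetical and stand in for real SSH or libguestfs calls.

```python
from typing import List, Protocol, Set, Tuple


class Remote(Protocol):
    """Anything that can run a command and return (exit code, output)."""

    def run(self, command: str) -> Tuple[int, str]:
        ...


class FakeRemote:
    """In-memory stand-in so the sketch runs without SSH or libguestfs."""

    def __init__(self, installed: Set[str]):
        self.installed = set(installed)
        self.log: List[str] = []

    def run(self, command: str) -> Tuple[int, str]:
        self.log.append(command)
        verb, _, pkg = command.partition(" ")
        if verb == "check":
            return (0 if pkg in self.installed else 1, "")
        if verb == "install":
            self.installed.add(pkg)
            return (0, "")
        return (127, "unknown command")


def ensure_packages(remote: Remote, packages: List[str],
                    read_only: bool) -> List[str]:
    """Return the packages that are still missing after the run.

    With read_only=True this is pure validation; otherwise missing
    packages are installed through the same remote.
    """
    missing = []
    for pkg in packages:
        code, _ = remote.run(f"check {pkg}")
        if code == 0:
            continue
        if read_only:
            missing.append(pkg)
            continue
        code, _ = remote.run(f"install {pkg}")
        if code != 0:
            missing.append(pkg)
    return missing


remote = FakeRemote({"java"})
print(ensure_packages(remote, ["java", "hadoop"], read_only=True))
print(ensure_packages(remote, ["java", "hadoop"], read_only=False))
```

The step logic never needs to know which kind of target it is talking to, so validation, clean-image provisioning, and image packing can share it.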
Plugin implementation targeting O (Ocata)!
Ironic integration
Why should you run Bare Metal in OpenStack:
● Big Data workload originates from Bare Metal installations
● Quick cluster scalability may have lower priority than long-running stability
and persistence
● Best performance by design, no virtualization overhead
● The ability to manage a baremetal cluster with the OpenStack API
Bare Metal compared to Virtualized
● Cluster size flexibility
○ Bare metal (Ironic): nodes are dedicated completely
○ Virtual machines: flavor-based scheduling
● Resource utilization
○ Bare metal (Ironic): the host is 100% utilized
○ Virtual machines: KVM has memory overhead; other VMs may abuse the host’s resources
● Data locality
○ Bare metal (Ironic): data is accessible directly from the local disks
○ Virtual machines: locality may be achieved by proper resource scheduling
● Live migration
○ Bare metal (Ironic): a host may be lost completely
○ Virtual machines: supported for some target daemons
Some tips before running Bare Metal
● Scheduling is not trivial. The Cloud operator may need to specify additional
Flavors, Availability Zones, or other metadata
● Storage is not backed by Cinder for Bare Metal
○ Sahara does disk discovery on its own
○ Disks other than the one holding the root mount are dedicated to HDFS
● Non-standard hardware will require drivers built into the provisioning image
● Network tenant isolation is achievable through manual hardware switch
configurations
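The disk selection rule in the storage tip can be sketched as a simple filter: keep every disk that does not carry the root mount. The device list format below (`name`, `mountpoint`, `partitions`) is a hypothetical structure made up for this example, and the rule itself is a reading of the slide rather than a description of Sahara's actual discovery code.

```python
from typing import Any, Dict, List


def hdfs_candidate_disks(devices: List[Dict[str, Any]]) -> List[str]:
    """Return the names of whole disks that do not hold the root filesystem."""
    candidates = []
    for dev in devices:
        # Collect mount points of the disk itself and of its partitions.
        mounts = [dev.get("mountpoint")]
        mounts += [p.get("mountpoint") for p in dev.get("partitions", [])]
        if "/" in mounts:
            continue  # this disk carries the root mount, leave it alone
        candidates.append(dev["name"])
    return candidates


# Hypothetical inventory of a bare metal node.
node_disks = [
    {"name": "sda", "partitions": [
        {"name": "sda1", "mountpoint": "/boot"},
        {"name": "sda2", "mountpoint": "/"},
    ]},
    {"name": "sdb"},
    {"name": "sdc", "mountpoint": None},
]

print(hdfs_candidate_disks(node_disks))
```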
What is NEW in NEWton
● Designate integration
● API improvements: pagination for list operations, API to
manage/enable/disable plugins
● New plugin versions
○ HDP 2.4 supported
○ MapR 5.2.0
○ CDH 5.7.x
○ Vanilla + Spark on YARN
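Pagination for list operations is typically consumed with a marker-and-limit loop. The sketch below shows that general pattern against a fake in-memory backend; the `fetch_page` callable, the `items`/`next` keys, and the sample cluster IDs are assumptions for illustration and do not describe the exact request or response format of the Sahara API.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iterate_all(fetch_page: Callable[[Optional[str], int], Page],
                limit: int = 2) -> Iterator[dict]:
    """Yield every item by following 'next' markers page by page."""
    marker = None
    while True:
        page = fetch_page(marker, limit)
        yield from page["items"]
        marker = page.get("next")
        if marker is None:
            break  # no further pages


# Fake backend standing in for a paginated list endpoint.
CLUSTERS = [{"id": f"c{i}"} for i in range(5)]


def fake_fetch(marker: Optional[str], limit: int) -> Page:
    start = 0
    if marker is not None:
        ids = [c["id"] for c in CLUSTERS]
        start = ids.index(marker) + 1  # resume after the marker item
    items = CLUSTERS[start:start + limit]
    has_more = start + limit < len(CLUSTERS)
    return {"items": items, "next": items[-1]["id"] if has_more else None}


print([c["id"] for c in iterate_all(fake_fetch, limit=2)])
```

The caller never holds more than one page in memory, which is the practical benefit of paginated listing on large deployments.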
What is NEW in Newton
● Sahara tests framework to validate
environment readiness for Sahara’s
clusters
○ Sahara tempest plugin with more tests (CLI,
API)
○ Sahara scenario framework with a bunch of
templates
○ Published on PyPI
https://pypi.python.org/pypi/sahara-tests
Q&A
Useful links and materials
● Sahara wiki https://wiki.openstack.org/wiki/Sahara
● Sahara specs https://specs.openstack.org/openstack/sahara-specs/
● Sahara docs http://docs.openstack.org/developer/sahara/
● Sahara images http://sahara-files.mirantis.com/images/upstream/newton/

More Related Content

What's hot

OpenStack Nova - Developer Introduction
OpenStack Nova - Developer IntroductionOpenStack Nova - Developer Introduction
OpenStack Nova - Developer IntroductionJohn Garbutt
 
Autoscaling with magnum and senlin
Autoscaling with magnum and senlinAutoscaling with magnum and senlin
Autoscaling with magnum and senlinQiming Teng
 
OpenStack Architecture and Use Cases
OpenStack Architecture and Use CasesOpenStack Architecture and Use Cases
OpenStack Architecture and Use CasesJalal Mostafa
 
CERN OpenStack Cloud Control Plane - From VMs to K8s
CERN OpenStack Cloud Control Plane - From VMs to K8sCERN OpenStack Cloud Control Plane - From VMs to K8s
CERN OpenStack Cloud Control Plane - From VMs to K8sBelmiro Moreira
 
OpenStack Telco Architecture: OpenStack Summit Boston 2017
OpenStack Telco Architecture: OpenStack Summit Boston 2017OpenStack Telco Architecture: OpenStack Summit Boston 2017
OpenStack Telco Architecture: OpenStack Summit Boston 2017Christian "kiko" Reis
 
What's new in OpenStack Liberty
What's new in OpenStack LibertyWhat's new in OpenStack Liberty
What's new in OpenStack LibertyStephen Gordon
 
Senlin deep dive 2016
Senlin deep dive 2016Senlin deep dive 2016
Senlin deep dive 2016Qiming Teng
 
The Battle of the distros - OS Summit Atlanta2014
The Battle of the distros - OS Summit Atlanta2014The Battle of the distros - OS Summit Atlanta2014
The Battle of the distros - OS Summit Atlanta2014Edgar Magana
 
Suning OpenStack Cloud and Heat
Suning OpenStack Cloud and HeatSuning OpenStack Cloud and Heat
Suning OpenStack Cloud and HeatQiming Teng
 
10 Years of OpenStack at CERN - From 0 to 300k cores
10 Years of OpenStack at CERN - From 0 to 300k cores10 Years of OpenStack at CERN - From 0 to 300k cores
10 Years of OpenStack at CERN - From 0 to 300k coresBelmiro Moreira
 
2012 CloudStack Design Camp in Taiwan--- CloudStack Overview-2
2012 CloudStack Design Camp in Taiwan--- CloudStack Overview-22012 CloudStack Design Camp in Taiwan--- CloudStack Overview-2
2012 CloudStack Design Camp in Taiwan--- CloudStack Overview-2tcloudcomputing-tw
 
Solr Lucene Conference 2014 - Nitin Presentation
Solr Lucene Conference 2014 - Nitin PresentationSolr Lucene Conference 2014 - Nitin Presentation
Solr Lucene Conference 2014 - Nitin PresentationNitin Sharma
 
Openstack architure part 1
Openstack architure part 1Openstack architure part 1
Openstack architure part 1Nhan Cao Thanh
 
Future Science on Future OpenStack
Future Science on Future OpenStackFuture Science on Future OpenStack
Future Science on Future OpenStackBelmiro Moreira
 
Managing Container Clusters in OpenStack Native Way
Managing Container Clusters in OpenStack Native WayManaging Container Clusters in OpenStack Native Way
Managing Container Clusters in OpenStack Native WayQiming Teng
 
Cern Cloud Architecture - February, 2016
Cern Cloud Architecture - February, 2016Cern Cloud Architecture - February, 2016
Cern Cloud Architecture - February, 2016Belmiro Moreira
 
PaaSTA: Autoscaling at Yelp
PaaSTA: Autoscaling at YelpPaaSTA: Autoscaling at Yelp
PaaSTA: Autoscaling at YelpNathan Handler
 

What's hot (20)

OpenStack Nova - Developer Introduction
OpenStack Nova - Developer IntroductionOpenStack Nova - Developer Introduction
OpenStack Nova - Developer Introduction
 
Autoscaling with magnum and senlin
Autoscaling with magnum and senlinAutoscaling with magnum and senlin
Autoscaling with magnum and senlin
 
OpenStack Architecture and Use Cases
OpenStack Architecture and Use CasesOpenStack Architecture and Use Cases
OpenStack Architecture and Use Cases
 
CERN OpenStack Cloud Control Plane - From VMs to K8s
CERN OpenStack Cloud Control Plane - From VMs to K8sCERN OpenStack Cloud Control Plane - From VMs to K8s
CERN OpenStack Cloud Control Plane - From VMs to K8s
 
OpenStack Telco Architecture: OpenStack Summit Boston 2017
OpenStack Telco Architecture: OpenStack Summit Boston 2017OpenStack Telco Architecture: OpenStack Summit Boston 2017
OpenStack Telco Architecture: OpenStack Summit Boston 2017
 
What's new in OpenStack Liberty
What's new in OpenStack LibertyWhat's new in OpenStack Liberty
What's new in OpenStack Liberty
 
Senlin deep dive 2016
Senlin deep dive 2016Senlin deep dive 2016
Senlin deep dive 2016
 
The Battle of the distros - OS Summit Atlanta2014
The Battle of the distros - OS Summit Atlanta2014The Battle of the distros - OS Summit Atlanta2014
The Battle of the distros - OS Summit Atlanta2014
 
Suning OpenStack Cloud and Heat
Suning OpenStack Cloud and HeatSuning OpenStack Cloud and Heat
Suning OpenStack Cloud and Heat
 
OpenStack 101
OpenStack 101OpenStack 101
OpenStack 101
 
10 Years of OpenStack at CERN - From 0 to 300k cores
10 Years of OpenStack at CERN - From 0 to 300k cores10 Years of OpenStack at CERN - From 0 to 300k cores
10 Years of OpenStack at CERN - From 0 to 300k cores
 
2012 CloudStack Design Camp in Taiwan--- CloudStack Overview-2
2012 CloudStack Design Camp in Taiwan--- CloudStack Overview-22012 CloudStack Design Camp in Taiwan--- CloudStack Overview-2
2012 CloudStack Design Camp in Taiwan--- CloudStack Overview-2
 
Solr Lucene Conference 2014 - Nitin Presentation
Solr Lucene Conference 2014 - Nitin PresentationSolr Lucene Conference 2014 - Nitin Presentation
Solr Lucene Conference 2014 - Nitin Presentation
 
Openstack architure part 1
Openstack architure part 1Openstack architure part 1
Openstack architure part 1
 
Future Science on Future OpenStack
Future Science on Future OpenStackFuture Science on Future OpenStack
Future Science on Future OpenStack
 
Managing Container Clusters in OpenStack Native Way
Managing Container Clusters in OpenStack Native WayManaging Container Clusters in OpenStack Native Way
Managing Container Clusters in OpenStack Native Way
 
OpenStack Super Bootcamp.pdf
OpenStack Super Bootcamp.pdfOpenStack Super Bootcamp.pdf
OpenStack Super Bootcamp.pdf
 
Cern Cloud Architecture - February, 2016
Cern Cloud Architecture - February, 2016Cern Cloud Architecture - February, 2016
Cern Cloud Architecture - February, 2016
 
OpenStack 101 update
OpenStack 101 updateOpenStack 101 update
OpenStack 101 update
 
PaaSTA: Autoscaling at Yelp
PaaSTA: Autoscaling at YelpPaaSTA: Autoscaling at Yelp
PaaSTA: Autoscaling at Yelp
 

Viewers also liked

OpenStack Tempest and REST API testing
OpenStack Tempest and REST API testingOpenStack Tempest and REST API testing
OpenStack Tempest and REST API testingopenstackindia
 
Juegos olímpicos
Juegos olímpicosJuegos olímpicos
Juegos olímpicossonia-soler
 
Reflexiones paralelas al relevamiento de necesidades para grupos de apoyo
Reflexiones paralelas al relevamiento de necesidades para grupos de apoyoReflexiones paralelas al relevamiento de necesidades para grupos de apoyo
Reflexiones paralelas al relevamiento de necesidades para grupos de apoyoCaep2016
 
SOBRE A ASSISTÊNCIA SOCIAL
SOBRE A ASSISTÊNCIA SOCIALSOBRE A ASSISTÊNCIA SOCIAL
SOBRE A ASSISTÊNCIA SOCIALMaíra B. Melo
 
Rico Zulkarnain RZ9 - Futsal CV New
Rico Zulkarnain RZ9 - Futsal CV NewRico Zulkarnain RZ9 - Futsal CV New
Rico Zulkarnain RZ9 - Futsal CV NewRico Zulkarnain
 
id card printer
id card printerid card printer
id card printerxpressid
 
Entro así y salgo asa
Entro así y salgo asaEntro así y salgo asa
Entro así y salgo asa1000Charona
 
Leithold louis-el-calculos-7ed-1380-pag
Leithold louis-el-calculos-7ed-1380-pagLeithold louis-el-calculos-7ed-1380-pag
Leithold louis-el-calculos-7ed-1380-pagJimmy Arch
 
Thấy gì qua lễ hội thể thao thường niên của trẻ em nhật
Thấy gì qua lễ hội thể thao thường niên của trẻ em nhậtThấy gì qua lễ hội thể thao thường niên của trẻ em nhật
Thấy gì qua lễ hội thể thao thường niên của trẻ em nhậtAnhcdby03
 
Az átadáskor megjelent prospektus az Újpesti Fürdőről
Az átadáskor megjelent prospektus az Újpesti FürdőrőlAz átadáskor megjelent prospektus az Újpesti Fürdőről
Az átadáskor megjelent prospektus az Újpesti Fürdőrőlurbanistablog
 
Sở hữu tướng tay này, sống sướng hơn tiên sau kết hôn
Sở hữu tướng tay này, sống sướng hơn tiên sau kết hônSở hữu tướng tay này, sống sướng hơn tiên sau kết hôn
Sở hữu tướng tay này, sống sướng hơn tiên sau kết hônAnhcdby03
 

Viewers also liked (20)

OpenStack Tempest and REST API testing
OpenStack Tempest and REST API testingOpenStack Tempest and REST API testing
OpenStack Tempest and REST API testing
 
Juegos olímpicos
Juegos olímpicosJuegos olímpicos
Juegos olímpicos
 
Negociacion ces3
Negociacion ces3Negociacion ces3
Negociacion ces3
 
The Hurt Locker
The Hurt LockerThe Hurt Locker
The Hurt Locker
 
Reflexiones paralelas al relevamiento de necesidades para grupos de apoyo
Reflexiones paralelas al relevamiento de necesidades para grupos de apoyoReflexiones paralelas al relevamiento de necesidades para grupos de apoyo
Reflexiones paralelas al relevamiento de necesidades para grupos de apoyo
 
SOBRE A ASSISTÊNCIA SOCIAL
SOBRE A ASSISTÊNCIA SOCIALSOBRE A ASSISTÊNCIA SOCIAL
SOBRE A ASSISTÊNCIA SOCIAL
 
Taller 2 noviembre 1
Taller 2 noviembre 1Taller 2 noviembre 1
Taller 2 noviembre 1
 
Rico Zulkarnain RZ9 - Futsal CV New
Rico Zulkarnain RZ9 - Futsal CV NewRico Zulkarnain RZ9 - Futsal CV New
Rico Zulkarnain RZ9 - Futsal CV New
 
id card printer
id card printerid card printer
id card printer
 
Entro así y salgo asa
Entro así y salgo asaEntro así y salgo asa
Entro así y salgo asa
 
Autopista en bolivia
Autopista en boliviaAutopista en bolivia
Autopista en bolivia
 
Ficha robotica
Ficha roboticaFicha robotica
Ficha robotica
 
Arduino comic es
Arduino comic esArduino comic es
Arduino comic es
 
Leithold louis-el-calculos-7ed-1380-pag
Leithold louis-el-calculos-7ed-1380-pagLeithold louis-el-calculos-7ed-1380-pag
Leithold louis-el-calculos-7ed-1380-pag
 
Tics
TicsTics
Tics
 
Esp with thepic16f877
Esp with thepic16f877Esp with thepic16f877
Esp with thepic16f877
 
Thấy gì qua lễ hội thể thao thường niên của trẻ em nhật
Thấy gì qua lễ hội thể thao thường niên của trẻ em nhậtThấy gì qua lễ hội thể thao thường niên của trẻ em nhật
Thấy gì qua lễ hội thể thao thường niên của trẻ em nhật
 
Az átadáskor megjelent prospektus az Újpesti Fürdőről
Az átadáskor megjelent prospektus az Újpesti FürdőrőlAz átadáskor megjelent prospektus az Újpesti Fürdőről
Az átadáskor megjelent prospektus az Újpesti Fürdőről
 
Cantantes
CantantesCantantes
Cantantes
 
Sở hữu tướng tay này, sống sướng hơn tiên sau kết hôn
Sở hữu tướng tay này, sống sướng hơn tiên sau kết hônSở hữu tướng tay này, sống sướng hơn tiên sau kết hôn
Sở hữu tướng tay này, sống sướng hơn tiên sau kết hôn
 

Similar to -Kerberos and Health Checks and Bare Metal, Oh My! Updates to OpenStack Sahara in Newton

Scylla Summit 2022: What’s New in ScyllaDB Operator for Kubernetes
Scylla Summit 2022: What’s New in ScyllaDB Operator for KubernetesScylla Summit 2022: What’s New in ScyllaDB Operator for Kubernetes
Scylla Summit 2022: What’s New in ScyllaDB Operator for KubernetesScyllaDB
 
Cinder project update at OpenStack Boston Summit May 2017
Cinder project update at OpenStack Boston Summit May 2017Cinder project update at OpenStack Boston Summit May 2017
Cinder project update at OpenStack Boston Summit May 2017Miroslav Halas
 
Database as a Service (DBaaS) on Kubernetes
Database as a Service (DBaaS) on KubernetesDatabase as a Service (DBaaS) on Kubernetes
Database as a Service (DBaaS) on KubernetesObjectRocket
 
Top 10 Kubernetes Native Java Quarkus Features
Top 10 Kubernetes Native Java Quarkus FeaturesTop 10 Kubernetes Native Java Quarkus Features
Top 10 Kubernetes Native Java Quarkus Featuresjclingan
 
NetflixOSS Open House Lightning talks
NetflixOSS Open House Lightning talksNetflixOSS Open House Lightning talks
NetflixOSS Open House Lightning talksRuslan Meshenberg
 
Sanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansSanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansPeter Clapham
 
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIX
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIXCassandraSummit2015_Cassandra upgrades at scale @ NETFLIX
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIXVinay Kumar Chella
 
My sql cluster case study apr16
My sql cluster case study apr16My sql cluster case study apr16
My sql cluster case study apr16Sumi Ryu
 
The road to enterprise ready open stack storage as service
The road to enterprise ready open stack storage as serviceThe road to enterprise ready open stack storage as service
The road to enterprise ready open stack storage as serviceSean Cohen
 
Introduction to Chainer
Introduction to ChainerIntroduction to Chainer
Introduction to ChainerShunta Saito
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1Ruslan Meshenberg
 
Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...
Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...
Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...ShapeBlue
 
To Build My Own Cloud with Blackjack…
To Build My Own Cloud with Blackjack…To Build My Own Cloud with Blackjack…
To Build My Own Cloud with Blackjack…Sergey Dzyuban
 
Why Kubernetes as a container orchestrator is a right choice for running spar...
Why Kubernetes as a container orchestrator is a right choice for running spar...Why Kubernetes as a container orchestrator is a right choice for running spar...
Why Kubernetes as a container orchestrator is a right choice for running spar...DataWorks Summit
 
Sanger OpenStack presentation March 2017
Sanger OpenStack presentation March 2017Sanger OpenStack presentation March 2017
Sanger OpenStack presentation March 2017Dave Holland
 

Similar to -Kerberos and Health Checks and Bare Metal, Oh My! Updates to OpenStack Sahara in Newton (20)

Scylla Summit 2022: What’s New in ScyllaDB Operator for Kubernetes
Scylla Summit 2022: What’s New in ScyllaDB Operator for KubernetesScylla Summit 2022: What’s New in ScyllaDB Operator for Kubernetes
Scylla Summit 2022: What’s New in ScyllaDB Operator for Kubernetes
 
Cinder project update at OpenStack Boston Summit May 2017
Cinder project update at OpenStack Boston Summit May 2017Cinder project update at OpenStack Boston Summit May 2017
Cinder project update at OpenStack Boston Summit May 2017
 
Database as a Service (DBaaS) on Kubernetes
Database as a Service (DBaaS) on KubernetesDatabase as a Service (DBaaS) on Kubernetes
Database as a Service (DBaaS) on Kubernetes
 
Top 10 Kubernetes Native Java Quarkus Features
Top 10 Kubernetes Native Java Quarkus FeaturesTop 10 Kubernetes Native Java Quarkus Features
Top 10 Kubernetes Native Java Quarkus Features
 
NetflixOSS Open House Lightning talks
NetflixOSS Open House Lightning talksNetflixOSS Open House Lightning talks
NetflixOSS Open House Lightning talks
 
Welcome to icehouse
Welcome to icehouseWelcome to icehouse
Welcome to icehouse
 
Sanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansSanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticians
 
Flexible compute
Flexible computeFlexible compute
Flexible compute
 
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIX
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIXCassandraSummit2015_Cassandra upgrades at scale @ NETFLIX
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIX
 
My sql cluster case study apr16
My sql cluster case study apr16My sql cluster case study apr16
My sql cluster case study apr16
 
The road to enterprise ready open stack storage as service
The road to enterprise ready open stack storage as serviceThe road to enterprise ready open stack storage as service
The road to enterprise ready open stack storage as service
 
Introduction to Chainer
Introduction to ChainerIntroduction to Chainer
Introduction to Chainer
 
Introduction to Chainer
Introduction to ChainerIntroduction to Chainer
Introduction to Chainer
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1
 
Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...
Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...
Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...
 
To Build My Own Cloud with Blackjack…
To Build My Own Cloud with Blackjack…To Build My Own Cloud with Blackjack…
To Build My Own Cloud with Blackjack…
 
Why Kubernetes as a container orchestrator is a right choice for running spar...
Why Kubernetes as a container orchestrator is a right choice for running spar...Why Kubernetes as a container orchestrator is a right choice for running spar...
Why Kubernetes as a container orchestrator is a right choice for running spar...
 
Sanger OpenStack presentation March 2017
Sanger OpenStack presentation March 2017Sanger OpenStack presentation March 2017
Sanger OpenStack presentation March 2017
 
[OSS Upstream Training] 5 open stack liberty_recap
[OSS Upstream Training] 5 open stack liberty_recap[OSS Upstream Training] 5 open stack liberty_recap
[OSS Upstream Training] 5 open stack liberty_recap
 
open stackliberty_recap_by_VietOpenStack
open stackliberty_recap_by_VietOpenStackopen stackliberty_recap_by_VietOpenStack
open stackliberty_recap_by_VietOpenStack
 

-Kerberos and Health Checks and Bare Metal, Oh My! Updates to OpenStack Sahara in Newton

  • 1. Kerberos and Health Checks and Bare Metal, Oh My! Updates to OpenStack Sahara in Newton Updates to OpenStack Sahara in Newton Vitaly Gridnev, Sahara PTL (Mirantis) Elise Gafford, Sahara Core (Red Hat) Nikita Konovalov, Sahara Core (Mirantis)
  • 2. Agenda 1. Sahara overview 2. Health checks and management improvements 3. Kerberos integration for clusters 4. Image generation improvements 5. Bare metal clusters 6. What is NEW in NEWton 7. Q&A
  • 3. Agenda 1. Sahara overview 2. Health checks and management improvements 3. Kerberos integration for clusters 4. Image generation improvements 5. Bare metal clusters 6. What is NEW in NEWton 7. Q&A
  • 4. Sahara: The Use Cases ● Data Processing Cluster Management ○ On-demand, scalable, configurable, persistent clusters ○ Supports multiple plugins (Apache, Ambari, CDH, MapR...) ○ Integrates with Heat, Glance, Nova, Neutron, and Cinder ● EDP (Elastic Data Processing) ○ Supports multiple job types (Java, MR, Hive, Pig, Spark, Storm...) ○ Supports transient clusters (spin up, process, shut down) or persistent clusters ○ Integrates with Swift and/or Manila (optionally)
  • 6. Sahara: The Project ● Cluster provisioning plugins: ○ Cloudera Distribution of Hadoop (using Cloudera Manager) ○ Hortonworks Data Platform (using Apache Ambari) ○ MapR ○ “Vanilla” Apache Hadoop, Spark, and Storm ● EDP job types: ○ MapReduce, Java, Hive, and Pig jobs (using Apache Oozie) ○ Spark, Spark Streaming, and Storm jobs (using Apache Spark and Apache Storm) ● Image packing repository (sahara-image-elements) ● Framework to validate Sahara installation (sahara-tests) ● UI plugin ● OpenStackClient plugin
  • 7. Agenda 1. Sahara overview 2. Health checks and management improvements 3. Kerberos integration for clusters 4. Image generation improvements 5. Bare metal clusters 6. What is NEW in NEWton 7. Q&A
  • 8. Event log for clusters ● Cluster events about provisioning: allows to understand what is the current status of cluster provisioning, or reasons of failure ● Available since Newton for clusters created by using Ambari ● Supported in CLI since Newton, with full dump of all steps and events
  • 9. Event log for clusters
  • 10. Event log for clusters
  • 11. Health checks for clusters ● Users are interested in monitoring cluster state after cluster provisioning: vital for long living clusters ● Sahara in Liberty doesn't have any monitoring of the health of cluster processes. A cluster can be broken or unavailable but Sahara will still think that it is in ACTIVE status.
  • 12. Health checks for clusters ● Clusters health checks have been implemented since Mitaka ● Available for clusters deployed using Ambari and Cloudera Manager. Less availability for vanilla clusters ● Since Newton checks are available for the MapR plugin ● Health results can be set to notify Ceilometer ● Easy to recheck health
  • 13. Health checks for clusters
  • 14. Health checks for clusters
  • 15. Health checks for clusters
  • 16. Health checks for clusters Next steps are: ● More detailed health checks ○ Particular datanode/slave failure ○ No enough space in HDFS ● Suggestions/actions to repair health: ○ Datanode replacement ○ New nodes ○ Restarting services ● More flexible configuration of health checks (advanced health checks, on disabling/enabling health checks for some reason)
  • 17. Agenda 1. Sahara overview 2. Health checks and management improvements 3. Kerberos integration for clusters 4. Image generation improvements 5. Bare metal clusters 6. What is NEW in NEWton 7. Q&A
  • 18. Security improvements ● Security is an important part of created clusters ● Previously, security could be enabled only by working with the managers (Ambari and Cloudera Manager) directly, but then Sahara cannot perform auth operations, and EDP does not work ● Security is important not just for clusters, but for Sahara itself
  • 19. Security improvements In Newton the following Kerberos security features were implemented: ● MIT KDC can be preconfigured (or an existing KDC can be used) ● Oozie client was re-implemented to support auth operations with Kerberos ● Spark job executions are also supported ● Keys are distributed to nodes for system users (hdfs, hadoop, spark) ● Supported for clusters deployed using Ambari and Cloudera Manager ● Note: Be sure that the latest hadoop-swift jars are in place for Swift data sources!
  • 21. Security improvements ● Bandit tests per commit ● Improved secret storage (using Barbican and Castellan) was implemented in the previous release
  • 23. Where we were Sahara had 2 flows that were relevant to image manipulation: ● Pre-Nova spawn image packing ○ Used sahara-image-elements repository to generate images (to store in Glance) ● Post-Nova spawn cluster generation from “clean” (OS-only) images ○ Logic maintained in Sahara process within plugins ● Pre-Configuration validation of images by plugins ○ Remember how I said we had 2 flows relevant to image manipulation? ○ We didn’t do this at all.
  • 24. Where We Were: Problems ● Duplication of logic ○ Steps required for packing images and “clean” image clusters were often identical, but had to be expressed separately (in DIB and in Python). ● Poor validation ○ Plugins did not validate that images provided to them met their needs. ○ Failures due to image contents were late and sometimes difficult to understand. ● Poor encapsulation ○ Image generation and cluster provisioning logic for any one plugin are really one application ○ Maintaining them in two places allows versionitis and dependency problems ○ Having one monolithic repo for all plugins makes them less pluggable
  • 25. Our Dream Implementation ● All flows share common logic: ○ Image packing ○ Image validation ○ Clean image cluster gen ● Image manipulation is stored and versioned within plugins ● The user can still generate images with a CLI... ● But they can also use an API to generate images in clean build environments ● ... And both dev test cycles and user retries are as quick and painless as possible
  • 26. The plan 1. Build a validation engine that ensures that images meet a specification a. YAML-based spec definition 2. Extend that engine to optionally modify images to spec 3. Build a CLI to expose this functionality 4. Create and test specifications for each plugin to support this method 5. Deprecate sahara-image-elements (only when this method proves stable) 6. Build an API to: a. Spawn a clean tenant-plane image build environment b. Download a base image from Glance and modify it to spec c. Push the new image back to Glance and register it for use by Sahara
  • 27. Where we are 1. Build a validation engine that ensures that images meet a specification a. YAML-based spec definition 2. Extend that engine to optionally modify images to spec 3. Build a CLI to expose this functionality 4. Create and test specifications for each plugin to support this method 5. Deprecate sahara-image-elements (only when this method proves stable) 6. Build an API to: a. Spawn a clean tenant-plane image build environment b. Download a base image from Glance and modify it to spec c. Push the new image back to Glance and register it for use by Sahara
  • 28. What it looks like: the specs ● YAML-based definitions ● Argument definitions for configurability ● Idempotent resource declarations ○ Scripts must be written idempotently, as always in resource declarations ● Logical control operators (any, all, os_case, etc.)
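To make the control operators more concrete, here is a minimal sketch of how a validator might walk such a spec. The spec is written as the nested Python structure a YAML loader would produce; the `package` resource, the image dictionary, and the `check` function are hypothetical and only illustrate the `all`, `any`, and `os_case` operators named on the slide.

```python
def check(node, image):
    """Recursively validate `image` against one spec node.

    A node is a one-key dict: either a control operator (`all`, `any`,
    `os_case`) or a leaf resource (`package`, hypothetical here).
    """
    (kind, body), = node.items()
    if kind == "all":
        return all(check(child, image) for child in body)
    if kind == "any":
        return any(check(child, image) for child in body)
    if kind == "os_case":
        # body is a list of one-key dicts mapping an OS name to child
        # nodes; only the branch matching the image's OS is evaluated.
        for case in body:
            (os_name, children), = case.items()
            if os_name == image["os"]:
                return all(check(child, image) for child in children)
        return True  # no branch applies to this OS
    if kind == "package":
        return body in image["packages"]
    raise ValueError(f"unknown spec node: {kind}")


spec = {"all": [
    {"package": "java"},
    {"os_case": [
        {"ubuntu": [{"package": "ntp"}]},
        {"centos": [{"any": [{"package": "ntp"}, {"package": "chrony"}]}]},
    ]},
]}
image = {"os": "centos", "packages": {"java", "chrony"}}
print(check(spec, image))
```

Because every node only reads the image state, running the same spec twice gives the same answer, which is the property the slide asks of resource declarations.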
  • 29. What it looks like: the CLI Command structure: sahara-image-pack --image ./image.qcow2 PLUGIN VERSION [plugin arguments] Features: ● Auto-generates help text from arguments ● Idempotent and modifies images in-place ○ Very fast test cycles and retries ● Allows freeform bash scripts and more structured resources ○ Though it’s on you to make your scripts idempotent ● Test-only mode to validate without change
  • 30. What it’s doing The images module runs a sequence of steps against a remote machine ● Validation uses the Sahara SSH remote in read-only mode ● Clean image gen uses the SSH remote ● Image packing uses a libguestfs Python API image handle All three use the same logic, contained in the appropriate plugin. Plugin implementation is targeting the O (Ocata) release!
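The idea of one piece of logic serving validation, clean image generation, and image packing can be sketched as a single idempotent step with a test-only switch. `FakeImage` stands in for whichever remote or image handle is in use, and `ensure_package` is a hypothetical step; neither name comes from Sahara.

```python
class FakeImage:
    """In-memory stand-in for an SSH remote or image handle (hypothetical)."""

    def __init__(self, packages):
        self.packages = set(packages)

    def has_package(self, name):
        return name in self.packages

    def install_package(self, name):
        self.packages.add(name)


def ensure_package(image, name, test_only=False):
    """Idempotent step: succeed if present, install only when allowed."""
    if image.has_package(name):
        return "ok"
    if test_only:
        # Validation mode never modifies the target; it only reports.
        raise RuntimeError(f"image is missing package {name!r}")
    image.install_package(name)
    return "installed"


img = FakeImage({"java"})
print(ensure_package(img, "ntp"))  # first run changes the image
print(ensure_package(img, "ntp"))  # second run is a no-op
```

Checking before acting is what makes retries cheap: a rerun after a partial failure skips everything that already succeeded.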
  • 32. Ironic integration Why should you run Bare Metal in OpenStack: ● Big Data workloads originate from Bare Metal installations ● Quick cluster scalability may have lower priority than long-running stability and persistence ● Best performance by design, no virtualization overhead ● The ability to manage a bare metal cluster with the OpenStack API
  • 33. Bare Metal compared to Virtualized ● Cluster size flexibility ○ Bare metal (Ironic): Dedicating nodes completely. ○ Virtual machines: Flavor-based scheduling. ● Resource utilization ○ Bare metal (Ironic): The host is 100% utilized. ○ Virtual machines: KVM has memory overhead. Other VMs may abuse the host’s resources. ● Data locality ○ Bare metal (Ironic): Data is accessible directly from the local disks. ○ Virtual machines: Locality may be achieved by proper resource scheduling. ● Live migration ○ Bare metal (Ironic): A host may be lost completely. ○ Virtual machines: Supported for some target daemons.
  • 34. Some tips before running Bare Metal ● Scheduling is not trivial. The Cloud operator may need to specify additional Flavors, Availability Zones, or other metadata ● Storage is not backed by Cinder for Bare Metal ○ Sahara does disk discovery on its own ○ Disks other than the one holding the root mount are dedicated to HDFS ● Non-standard hardware will require drivers built into the provisioning image ● Network tenant isolation is achievable through manual hardware switch configurations
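Reading the storage tip as "disks that do not hold the root mount are given to HDFS", the selection rule can be sketched as below. The input format and the `hdfs_candidate_disks` helper are assumptions for illustration; the slide does not describe how Sahara actually performs discovery on the node.

```python
def hdfs_candidate_disks(disks):
    """Return names of disks on which nothing is mounted at the root `/`.

    `disks` maps a device name to the list of mount points found on that
    disk or its partitions (hypothetical input format).
    """
    return sorted(name for name, mounts in disks.items() if "/" not in mounts)


# Made-up inventory of one bare metal node.
disks = {"sda": ["/", "/boot"], "sdb": [], "sdc": []}
print(hdfs_candidate_disks(disks))
```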
  • 36. What is NEW in NEWton ● Designate integration; ● API Improvements: pagination for list operations, API to manage/enable/disable plugins; ● New plugin versions ○ HDP 2.4 supported ○ MapR 5.2.0 ○ CDH 5.7.x ○ Vanilla + Spark on YARN
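Pagination for list operations is commonly done with a page size plus a marker pointing at the last item already seen. The sketch below shows that pattern over an in-memory list; the `limit` and `marker` names and the `list_page` function are assumptions for this example rather than a description of Sahara's API.

```python
def list_page(items, limit, marker=None):
    """Return (page, next_marker) for a list of dicts that have an `id`.

    `marker` is the id of the last item on the previous page; None starts
    from the beginning. `next_marker` is None on the last page.
    """
    start = 0
    if marker is not None:
        ids = [item["id"] for item in items]
        start = ids.index(marker) + 1
    page = items[start:start + limit]
    has_more = start + limit < len(items)
    next_marker = page[-1]["id"] if page and has_more else None
    return page, next_marker


# Walk through five made-up clusters, two per page.
clusters = [{"id": f"c{i}"} for i in range(5)]
pages, marker = [], None
while True:
    page, marker = list_page(clusters, 2, marker)
    pages.append([c["id"] for c in page])
    if marker is None:
        break
print(pages)
```

A marker survives insertions and deletions elsewhere in the list better than a numeric offset does, which is one reason the pattern is popular for long listings.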
  • 37. What is NEW in NEWton ● Sahara tests framework to validate environment readiness for Sahara’s clusters ○ Sahara tempest plugin with more tests (CLI, API) ○ Sahara scenario framework with a bunch of templates ○ Published on PyPI https://pypi.python.org/pypi/sahara-tests
  • 38. Q&A
  • 39. Useful links and materials ● Sahara wiki https://wiki.openstack.org/wiki/Sahara ● Sahara specs https://specs.openstack.org/openstack/sahara-specs/ ● Sahara docs http://docs.openstack.org/developer/sahara/ ● Sahara images http://sahara-files.mirantis.com/images/upstream/newton/