Introduction to Stacki
Greg Bruno, PhD
VP Engineering, StackIQ
Open Source Stack Installer
Stacki is a very fast and ultra reliable Linux server provisioning tool … at scale.
With zero prerequisites for taking systems from bare metal to a ping and prompt.
PayPal
Hadoop @ PayPal
Data Science Infrastructure Footprint
•  3,000 nodes and growing
•  60+ initial server racks
•  Heterogeneous HW across multiple DCs
Datacenters: Bay Area, Salt Lake City, Las Vegas
Rack configurations:
•  12 x 2TB SATA data drives, 48 nodes each rack, 1GBE-10GBE NICs
•  24 x 900GB 6G SAS 10K data drives, 24 nodes each rack, 10GBE NIC
•  8 x 4TB NR-SAS data drives, 48 nodes each rack, 10GBE NIC
Automation Challenge
Spinout creates some datacenter automation challenges …
•  Smaller team but even more to do
•  Rethink automation
•  Distributed systems have tons of local drives, which require time-consuming disk formatting and partitioning, and hardware RAID config on master nodes
•  New provisioning solution needs to integrate easily and flexibly with other commercial, open source, and homegrown management tools
•  Can 100s or 1000s of nodes be (re)provisioned as quickly as one or a few? (e.g., drive failures mean replacing entire host from O/S to disk to network to firmware to … etc.)
Stacki @ PayPal
Ambari HDP
Health Detection
Integration
IPMI/iLO, OS, Disk, Network
DHCP / DNS / TFTP
Ansible
- Disk Array Controller Configuration
- Disk Partitioning Configuration
“Stacki + Ansible = Happiness. :D” – Stacki mailing list 8/11/15
Quick, Early Success
14 Minutes* to Fully Provision 6 Racks of Bare Metal (288 Servers)
•  Includes wiping all disks, then fully partitioning & formatting ~3500 drives
And Now…
•  Upgrades all firmware automatically
•  Executes Ansible scripts on all hosts
•  Hadoop packages installed
* Versus hours with other hyperscale management tools, or days to weeks with traditional tools and processes
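As a rough sanity check on the figures above, the following Python sketch works out the implied totals. The 48-nodes-per-rack figure is taken from the rack description earlier in the deck, and the drive count is the approximate "~3500" quoted on the slide, so the per-server results are estimates only.

```python
# Back-of-the-envelope arithmetic for the PayPal provisioning run.
RACKS = 6
NODES_PER_RACK = 48      # from the rack description earlier in the deck
MINUTES = 14
APPROX_DRIVES = 3500     # the slide says "~3500", so this is approximate

servers = RACKS * NODES_PER_RACK
drives_per_server = APPROX_DRIVES / servers
servers_per_minute = servers / MINUTES   # aggregate throughput, not per-host time

print(f"servers provisioned: {servers}")
print(f"approx. drives per server: {drives_per_server:.1f}")
print(f"approx. servers per minute: {servers_per_minute:.1f}")
```

The server count matches the 288 quoted on the slide, and roughly 12 drives per server is consistent with the 12-drive rack configuration listed earlier.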
How We Solve the Problem
History
• San Diego Supercomputer Center
•  1986 - National Science Foundation
•  Along with NCSA, one of only two non-classified centers
•  Mission: serve computational scientists
• Rocks
•  2000 - First cluster group inside SDSC
•  Version 1.0 released that November as open source
•  10k+ clusters world-wide
• StackIQ
•  2006 - Commercial support for Rocks
•  2011 - Venture Backed
•  Focus on next generation clustered systems (Data, Cloud)
• Stacki - 2015
•  June – released as open source
•  July – first hyper-scale user
Must Haves
•  Make it – Automatic
◦  Think about it, test it, deploy it.
◦  People don’t scale; software does. Free your people: let ops staff become ops/analysis staff, and move them from a single-machine view to a global-machine view.
•  Make it – Repeatable
◦  The state of the environment is guaranteed. This does not require homogeneity of hardware or functionality: make compute environments homogeneous on heterogeneous hardware and software.
◦  Really, nothing is homogeneous. The environment may be, but its behavior on different machines, while predictable, will not be the same across all hardware. Stacki gets you flexibility and predictability.
•  Make it – Reliable
◦  You always get what you want when you want it. You can make reasonable estimates of need because you’ve made the environment predictable and repeatable. Just like science!
•  Make it – Comprehensive
◦  Manage application layer(s) down to kernels and device configuration with one tool. Never hit the network unconfigured.
◦  Provide turn-key deployment with reasonable default settings and the ability to customize / re-wire as desired.
Stacki Positioning
DevOps / Configuration Tool
DHCP / DNS / TFTP
Network, Disk, OS
In-house developed deployment tools
- Disk Array Controller Configuration
- Disk Partitioning Configuration
Datacenter Architecture
[Diagram: one frontend and four backend hosts, each connected to a shared network via em1]
Download and Boot the ISO
Go to www.stacki.com and download the ISO
◦  It’s 1.8 GB
◦  “stacki” pallet plus stripped down CentOS 6.6

Boot the ISO on the host that will be your frontend
Frontend Services
Services to build backend nodes
◦  DHCP
◦  TFTP
◦  Named (optional)
Services to access backend nodes
◦  SSH key management
◦  Parallel execution shell
Host Configuration Spreadsheet
[Diagram: one frontend and four backend hosts, each connected to a shared network via em1]
Backend Installation
Save your Host Configuration spreadsheet as a CSV

Import CSV on frontend
◦  “stack load hostfile file=hosts.csv”

Tell backend nodes to install on their next PXE boot
◦  “stack set host boot backend action=install”

PXE boot all backend nodes

Done!
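The spreadsheet step above can be sketched in Python as follows. This is a minimal illustration, not Stacki's documented format: the column names, host naming scheme, IP range, MAC addresses, and network name are all hypothetical, and only the em1 interface name and the "backend" appliance name are taken from the slides.

```python
import csv
import io

# Hypothetical column names and values, for illustration only; check the
# Stacki documentation for the columns your version actually expects.
COLUMNS = ["NAME", "APPLIANCE", "RACK", "RANK", "IP", "MAC", "INTERFACE", "NETWORK"]


def build_hostfile(rack, count, subnet="10.1.1", first_octet=10):
    """Return CSV text describing `count` backend hosts in one rack."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for rank in range(count):
        writer.writerow({
            "NAME": f"backend-{rack}-{rank}",
            "APPLIANCE": "backend",
            "RACK": rack,
            "RANK": rank,
            "IP": f"{subnet}.{first_octet + rank}",
            "MAC": f"00:00:00:00:{rack:02x}:{rank:02x}",  # placeholder MACs
            "INTERFACE": "em1",
            "NETWORK": "private",
        })
    return buf.getvalue()


hostfile_text = build_hostfile(rack=0, count=4)
print(hostfile_text)
```

The resulting text could be saved as hosts.csv and imported with the "stack load hostfile" command quoted above.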
BitTorrent-Inspired Package Installation
Stacki
Customizing Your Hosts
Advanced Networking
Via Host Configuration spreadsheet, you can configure:
◦  Bonded interfaces
◦  VLANs
◦  Bridging
◦  Any combo of the above
Manage hosts in multiple subnets
◦  Build a single cluster from hosts in multiple subnets
◦  Manage hosts in multiple datacenters
Host Configuration Spreadsheet
Disk Controller Configuration Spreadsheet
Disk Partition Configuration Spreadsheet
Multiple Distributions
A frontend houses a default distribution
◦  Based on stripped down CentOS 6.6 or 7.1
◦  Used to build backend nodes

Can add any number of new distributions to a frontend
◦  E.g., RHEL 6.x based distro, CentOS 6.5, etc.
Assign any backend node to any distro
Why is this hard and important?
Datacenter Architecture
[Diagram: one frontend and four backend hosts, each connected to a shared network via em1]
Datacenter Host Software Stack
DevOps / Configuration Tool
DHCP / DNS / TFTP
Network, Disk, OS
In-house developed deployment tools
- Disk Array Controller Configuration
- Disk Partitioning Configuration
The “Step 0” Problem
•  Check namenodes are empty
•  Format/start HDFS
•  Create all directories
•  Create all metastores
•  Start services (HBase, Hive, Oozie, Sqoop, Impala, etc.)
•  Deploy client configuration
•  Configure database
•  Setup/assign monitors (activity, services, and host)
•  Test database connections
•  Validate/resolve hostnames
•  Consistent host timezones
•  No bad kernel versions running
•  (CDH) version consistency
•  Java version consistency
•  Daemons versions consistency
•  Mgmt Agents versions consistency
•  Host specification/SSH ports
•  MUCH MORE …
•  DHCP Server/Client setup
•  TFTP/PXE configuration
•  Server OS installation
•  Node OS Install
•  RAID configuration
•  Boot configuration
•  System/data disk partitioning
•  Monitoring system setup and config
•  Lights Out/IPMI setup
•  User accounts added and synced
•  SSH keys on all hosts
•  Network node configuration
•  Config Mgmt install and configuration
•  Route configuration
•  OS upgrades/updates
•  Site specific software and configuration
•  Host specification/SSH ports
•  Security
•  Firewall setup
•  Cluster Mgmt utility
•  Database install and config
•  Multiple network config
•  Package installation
•  MUCH MORE …
Clusters are Different
Adding new servers does require coordination
Newly added servers must:
•  Have same software stack as original
servers
•  Have same configuration as original
servers
•  Know about original servers
And, original servers must:
•  Know about new servers
Result: The management complexity added to the
Operations staff is “exponential”
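One way to picture the coordination cost described above is to count the "knows about" relationships. The sketch below is a simplified model and an assumption, not something stated in the deck: it treats a general data center as independent servers and a cluster as a set where every server must know about every other one. Under this model the growth is quadratic rather than literally exponential, which is one reason the slide puts "exponential" in quotes.

```python
def independent_effort(n):
    """General data center: each server is configured on its own."""
    return n


def cluster_relationships(n):
    """Cluster: every server must know about every other server."""
    return n * (n - 1)


# Compare the two models at a few sizes, up to the 288 servers mentioned earlier.
for n in (2, 10, 48, 288):
    print(f"{n:>3} servers: independent={independent_effort(n):>3}, "
          f"cluster={cluster_relationships(n)}")
```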
Exponential Complexity
[Figure: management complexity vs. number of servers, for a general data center and for clusters]
The Pain Curve
[Figure: management complexity vs. number of servers, for a general data center and for clusters, with a region labeled PAIN]
The Pain Threshold
The pain threshold differs for every
organization
Function of:
•  cluster(s) size
•  number of people in Operations
•  Operations staff cluster expertise
Moore’s Law
[Figure: density vs. time (years), showing an 18 month doubling]
Moore’s Law and Infrastructure Value
What it Means for You
[Figure: value (%) vs. time (years); 90% value at 3 months, 50% value at 18 months]
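The two data points on this slide are consistent with a simple halving model. The sketch below assumes that relative infrastructure value halves every 18 months, mirroring the 18 month doubling of density; the formula is an interpretation of the chart, not something the deck states explicitly.

```python
def relative_value(months, halving_period=18.0):
    """Fraction of original value left, assuming it halves every `halving_period` months."""
    return 0.5 ** (months / halving_period)


for months in (0, 3, 18, 36):
    print(f"{months:>2} months: {relative_value(months):.0%}")
```

Under this assumption the value after 3 months is about 89%, which the slide rounds to 90%, and the value after 18 months is exactly 50%.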
Time is Money
The clock starts ticking when hosts land on your loading dock.
Without your applications online, you have a paperweight that consumes power, cooling, and management’s attention.
Try It Out
stacki.com
Download - www.stacki.com
Source & Docs - github.com/StackIQ/stacki/wiki
Discuss - groups.google.com/forum/#!forum/stacki
PayPal’s Options
•  Bring what we used at former parent company eBay with us: Not Possible
•  Build our own soup-to-nuts bespoke bare metal provisioning tool: Not Optimal
•  Find the perfect open source tool that we can use and grow with: Not Likely
Quick, Early Success
2 Weeks Instead of 2 Years
To Build a Scale-out Management Solution
PayPal development with Stacki includes:
1.  Installed Stacki Frontend (base management server) and ran test installations of backend servers
    1.  Single Server test
    2.  Full Rack test (48 nodes)
2.  Updated distribution (CentOS 6.6) to install additional packages
3.  Integrated IPMI information into Stacki
    1.  Can now ssh into all IPMI consoles from the Stacki frontend host using <hostname>.ipmi
4.  Re-ran with PayPal kickstart changes/additions and was able to image 6 racks in 14 minutes, including:
    1.  Nuking disks/partitions and running a full format of all data drives
5.  Updated the Stacki post-boot piece to do the following:
    1.  Upgrade firmware if host needs it
    2.  Run PayPal Ansible playbook, which:
        1.  Installs additional packages
        2.  Creates user accounts
        3.  Disables unused services
        4.  Sets up resolver/ntp/syslog-ng/sudoers/limits.d/sysctl/etc.
        5.  Installs/configures Ambari agents
        6.  Checks data drive mounts, fstab
        7.  Prepares the rack to be added to a Hadoop cluster
DevOps Agnostic
DevOps / Configuration Tool
DHCP / DNS / TFTP
Network, Disk, OS
In-house developed deployment tools
- Disk Array Controller Configuration
- Disk Partitioning Configuration
The “Step 0” Problem
App Config
Site Config
HW Install
System Performance
Validation
Bare Metal Installers
Hadoop Mgmt Tool
Upgrades/Patching
Disk Configuration
Monitoring Tool
Configuration Tool
Network/Site Config Tools
Systems Mgmt Tool
Others …
MANUAL
SEMI-AUTOMATED
TOOLCHAIN
(w/o StackIQ)
w/StackIQ
FULLY AUTOMATED
StackIQ Boss
Configuration Database
•  Server appliance types (e.g. data, namenode, tomcat, …)
•  Number of CPUs
•  Disk partitioning
•  Hardware RAID config
•  PCI bus information
•  …
•  And other System Attributes
Attributes
•  Global
◦  stack set attr
•  Appliance
◦  stack set appliance attr
•  OS
◦  stack set os attr
•  Host
◦  stack set host attr
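The four attribute scopes above suggest a layered lookup. The following Python sketch shows one plausible way such scopes could combine, with more specific scopes overriding more general ones. The precedence order is an assumption, since the slide names the scopes but does not say how conflicts are resolved, and the attribute names are hypothetical.

```python
# Assumed precedence, least to most specific. The slide names the four scopes
# but does not state how conflicts between them are resolved.
PRECEDENCE = ("global", "os", "appliance", "host")


def resolve_attrs(scoped_attrs):
    """Merge per-scope attribute dicts; later scopes override earlier ones."""
    resolved = {}
    for scope in PRECEDENCE:
        resolved.update(scoped_attrs.get(scope, {}))
    return resolved


# Hypothetical attribute names and values, for illustration only.
example = {
    "global": {"timezone": "UTC", "nukedisks": "false"},
    "appliance": {"nukedisks": "true"},
    "host": {"timezone": "America/Los_Angeles"},
}
resolved = resolve_attrs(example)
print(resolved)
```

A layered scheme like this keeps site-wide defaults in one place while still allowing a single appliance type or host to differ.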
Kickstart Profiles
Zoom In
Starting from the Empty Set
  { }
{ os }
© 2009 UC Regents
{ os, core }
{ os, core, kernel }
{ os, core, kernel, mapr }
Manage the Deltas
{os, core, kernel, mapr} {os, core, kernel, horton}
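The two sets above differ in a single element, and that difference is the "delta" to manage. A minimal Python sketch using built-in set operations makes this concrete; the variable names are illustrative only.

```python
# The two stacks exactly as written on the slide.
mapr_stack = {"os", "core", "kernel", "mapr"}
horton_stack = {"os", "core", "kernel", "horton"}

shared = mapr_stack & horton_stack        # common base
only_mapr = mapr_stack - horton_stack     # delta for the first stack
only_horton = horton_stack - mapr_stack   # delta for the second stack

print("shared:", sorted(shared))
print("only in first stack:", sorted(only_mapr))
print("only in second stack:", sorted(only_horton))
```

Everything in the shared set can be built and tested once, leaving one element per stack to vary.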
stacki.com
 @masonkatz

From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
КАТЕРИНА АБЗЯТОВА «Ефективне планування тестування ключові аспекти та практ...
КАТЕРИНА АБЗЯТОВА  «Ефективне планування тестування  ключові аспекти та практ...КАТЕРИНА АБЗЯТОВА  «Ефективне планування тестування  ключові аспекти та практ...
КАТЕРИНА АБЗЯТОВА «Ефективне планування тестування ключові аспекти та практ...
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 

Introduction to Stacki - World's fastest Linux server provisioning Tool

  • 1. Introduction to Stacki Greg Bruno, PhD VP Engineering, StackIQ
  • 2. Open Source Stack Installer Stacki is a very fast and ultra reliable Linux server provisioning tool … at scale. With zero prerequisites for taking systems from bare metal to a ping and prompt.
  • 4. Hadoop @ PayPal: Data Science Infrastructure Footprint. Rack types: 12 x 2TB SATA data drives, 48 nodes each rack, 1GBE-10GBE NICs; 24 x 900GB 6G SAS 10K data drives, 24 nodes each rack, 10GBE NIC; 8 x 4TB NR-SAS data drives, 48 nodes each rack, 10GBE NIC. Datacenters: Bay Area, Salt Lake City, Las Vegas. • 3,000 nodes and growing • 60+ initial server racks • Heterogeneous HW across multiple DCs
  • 5. Automation Challenge Spinout creates some datacenter automation challenges … • Smaller team but even more to do • Rethink automation • Distributed systems have tons of local drives which require time-consuming disk formatting and partitioning, and hardware RAID config on masternodes • New provisioning solution needs to easily, flexibly integrate w/ other commercial, open source, and homegrown management tools • Can 100s or 1000s of nodes be (re)provisioned as quickly as one or a few? (e.g., drive failures mean replacing entire host from O/S to disk to network to firmware to … etc)
  • 6. Stacki @ PayPal Ambari HDP Health Detection Integration IPMI/iLO OS Disk Network DHCP / DNS / TFTP Ansible - Disk Array Controller Configuration - Disk Partitioning Configuration “Stacki + Ansible = Happiness. :D” – Stacki mailing list 8/11/15
  • 7. Quick, Early Success 14 Minutes* To Fully Provision 6 Racks of Bare Metal (288 Servers) Includes wiping all disks then fully partitioning & formatting ~3500 drives And Now… Upgrades all firmware automatically Executes Ansible scripts on all hosts Hadoop packages installed * Versus hours with other hyperscale management tools, or days to weeks with traditional tools and processes
  • 8. How We Solve the Problem
  • 9. History • San Diego Supercomputer Center •  1986 - National Science Foundation •  Along with NCSA only two non-classified centers •  Mission: serve computational scientists • Rocks •  2000 - First cluster group inside SDSC •  Version 1.0 released that November as open source •  10k+ clusters world-wide • StackIQ •  2006 - Commercial support for Rocks •  2011 - Venture Backed •  Focus on next generation clustered systems (Data, Cloud) • Stacki - 2015 •  June – released as open source •  July – first hyper-scale user
  • 10. Must Haves Make it – Automatic ◦ Think about it, test it. Deploy it. ◦ People don’t scale, software does. Free your people – allow ops guys to be ops/analysis guys, move them from single machine view to global machine view. Make it – Repeatable ◦ State of the environment is guaranteed. Does not require homogeneity of hardware or functionality. Make compute environments homogeneous on heterogeneous hardware and software. ◦ Really, nothing is homogeneous. The environment may be, but the behavior of that environment on different machines, while predictable, will not be the same across all hardware. Stacki gets you flexibility and predictability. Make it – Reliable ◦ You always get what you want when you want it. You can make reasonable estimates of need because you’ve made the environment predictable and repeatable. Just like science! Make it – Comprehensive ◦ Manage application layer(s) down to kernels and device configuration with one tool. Never hit the network unconfigured. ◦ Provide turn-key deployment with reasonable default settings and ability to customize / re-wire as desired.
  • 11. Stacki Positioning DevOps / Configuration Tool DHCP / DNS / TFTP Network Disk OS In-house developed deployment tools - Disk Array Controller Configuration - Disk Partitioning Configuration
  • 12. Datacenter Architecture Frontend Network Backend Backend Backend Backend em1 em1 em1 em1 em1
  • 13. Download and Boot the ISO Go to www.stacki.com and download the ISO ◦  It’s 1.8 GB ◦  “stacki” pallet plus stripped down CentOS 6.6 Boot the ISO on the host that will be your frontend
  • 20. Frontend Services Services to build backend nodes ◦  DHCP ◦  TFTP ◦  Named (optional) Services to access backend nodes ◦  SSH key management ◦  Parallel execution shell
  • 22. Frontend Network Backend Backend Backend Backend em1 em1 em1 em1 em1 Backend Installation Save your Host Configuration spreadsheet as a CSV Import CSV on frontend ◦  “stack load hostfile file=hosts.csv” Tell backend nodes to install on their next PXE boot ◦  “stack set host boot backend action=install” PXE boot all backend nodes Done!
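A minimal sketch of preparing the CSV for the backend installation step above. The column names, host names, addresses, and the "private" network label are assumptions made for illustration and are not taken from the slides; the real Host Configuration spreadsheet layout should be checked against the Stacki documentation. Only the `em1` interface name and the two `stack` commands in the closing comment come from the slides.

```python
import csv
import io

# Hypothetical column layout; the real Stacki host spreadsheet may differ.
COLUMNS = ["NAME", "APPLIANCE", "RACK", "RANK", "IP", "MAC", "INTERFACE", "NETWORK"]


def build_hostfile(rack, count, subnet="10.1.0", mac_prefix="52:54:00:00:00"):
    """Return CSV text describing `count` backend nodes in one rack."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(COLUMNS)
    for rank in range(count):
        writer.writerow([
            f"backend-{rack}-{rank}",    # hypothetical naming scheme
            "backend",                   # appliance type
            rack,
            rank,
            f"{subnet}.{rank + 10}",     # made-up address plan
            f"{mac_prefix}:{rank:02x}",  # made-up MAC addresses
            "em1",                       # interface name shown on the slide
            "private",                   # assumed network name
        ])
    return out.getvalue()


hosts_csv = build_hostfile(rack=0, count=4)
print(hosts_csv)

# After saving the text as hosts.csv on the frontend, the slide's commands apply:
#   stack load hostfile file=hosts.csv
#   stack set host boot backend action=install
```

Generating the file from a script rather than by hand keeps rack and rank numbering consistent when many racks are added at once.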
  • 25. Advanced Networking Via Host Configuration spreadsheet, you can configure: ◦  Bonded interfaces ◦  VLANs ◦  Bridging ◦  Any combo of the above Manage hosts in multiple subnets ◦  Build a single cluster from hosts in multiple subnets ◦  Manage hosts in multiple datacenters
  • 29. Multiple Distributions A frontend houses a default distribution ◦  Based on stripped down CentOS 6.6 or 7.1 ◦  Used to build backend nodes Can add any number of new distributions to a frontend ◦  E.g., RHEL 6.x based distro, CentOS 6.5, etc. Assign any backend node to any distro
  • 30. Why is this hard and important?
  • 31. Datacenter Architecture Frontend Network Backend Backend Backend Backend em1 em1 em1 em1 em1
  • 32. Datacenter Host Software Stack DevOps / Configuration Tool DHCP / DNS / TFTP Network Disk OS In-house developed deployment tools - Disk Array Controller Configuration - Disk Partitioning Configuration
  • 33. The “Step 0” Problem Check namenodes are empty Format/start HDFS Create all directories Create all metastores Start services (Hbase, Hive, Oozie, Sqoop, Impala, etc) Deploy client configuration Configure database Setup/assign monitors (activity, services, and host) Test database connections Validate/resolve hostnames Consistent host timezones No bad kernel versions running (CDH) version consistency Java version consistency Daemons versions consistency Mgmt Agents versions consistency Host specification/SSH ports MUCH MORE … DHCP Server/Client setup TFTP/PXE configuration Server OS installation Node OS Install RAID configuration Boot configuration System/data disk partitioning Monitoring system setup and config Lights Out/IPMI setup User accounts added and synced SSH keys on all hosts Network node configuration Config Mgmt install and configuration Route configuration OS upgrades/updates Site specific software and configuration Host specification/SSH ports Security Firewall setup Cluster Mgmt utility Database install and config Multiple network config Package installation MUCH MORE …
  • 34. Clusters are Different Adding new servers does require coordination Newly added servers must: •  Have same software stack as original servers •  Have same configuration as original servers •  Know about original servers And, original servers must: •  Know about new servers Result: The management complexity added to the Operations staff is “exponential”
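A rough way to quantify the coordination described above, under the assumption (not stated on the slide) that each pair of servers that must know about each other counts as one relationship to keep consistent. Under that assumption the count grows quadratically, which is steep but not literally exponential; the slide itself puts “exponential” in quotes.

```python
def pairwise_links(n):
    """Number of server pairs that must know about each other among n servers."""
    return n * (n - 1) // 2


def links_added(existing, new):
    """Extra relationships created when `new` servers join `existing` ones."""
    return pairwise_links(existing + new) - pairwise_links(existing)


# Server counts taken from the deck: one rack, six racks, the whole footprint.
for n in (48, 288, 3000):
    print(f"{n:>5} servers -> {pairwise_links(n):>9} pairwise relationships")

# Adding a single server to a 48-node rack touches all 48 existing nodes.
print(links_added(48, 1))
```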
  • 35. Exponential Complexity [Figure: management complexity vs. number of servers; curves for General Data Center and Clusters]
  • 36. The Pain Curve [Figure: management complexity vs. number of servers; curves for General Data Center and Clusters, with a region marked PAIN]
  • 37. The Pain Threshold The pain threshold differs for every organization Function of: •  cluster(s) size •  number of people in Operations •  Operations staff cluster expertise
  • 38. Moore’s Law [Figure: density vs. time (years); 18 month doubling]
  • 39. Moore’s Law and Infrastructure Value
  • 40. What it Means for You [Figure: value (%) vs. time (years); 3 months: 90% value; 18 months: 50% value]
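The two data points on this slide are consistent with a simple halving model, sketched here. The formula is an assumption inferred from the 18 month doubling on the Moore’s Law slide, not something the deck states: remaining value is taken to halve every 18 months.

```python
def remaining_value(months, half_life_months=18.0):
    """Fraction of original value left after `months`, assuming value
    halves every `half_life_months` (an assumed model, not from the deck)."""
    return 0.5 ** (months / half_life_months)


for months in (0, 3, 6, 12, 18):
    print(f"{months:>2} months idle -> {remaining_value(months):.1%} of value left")
```

Under this model, hardware that sits unprovisioned for 3 months retains a little under 90% of its value, and hardware idle for 18 months retains exactly half, matching the figures on the slide.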
  • 41. Time is Money The clock starts ticking when hosts land on your loading dock. Without your applications online, you have a paperweight that consumes power, cooling, and management’s attention.
  • 43. stacki.com Download - www.stacki.com Source & Docs - github.com/StackIQ/stacki/wiki Discuss - groups.google.com/forum/#!forum/stacki
  • 44. PayPal’s Options Bring what we used at former parent company eBay with us: Not Possible. Build our own soup-to-nuts bespoke bare metal provisioning tool: Not Optimal. Find the perfect open source tool that we can use and grow with: Not Likely.
  • 45. Quick, Early Success 2 Weeks Instead of 2 Years To Build a Scale-out Management Solution PayPal development with Stacki includes: 1. Installed Stacki Frontend (base management server) and ran test installations of backend servers: (a) single server test, (b) full rack test (48 nodes) 2. Updated distribution (CentOS 6.6) to install additional packages 3. Integrated IPMI information into Stacki; can now ssh into all IPMI consoles from the Stacki frontend host using <hostname>.ipmi 4. Re-ran with PayPal kickstart changes/additions and was able to image 6 racks in 14 minutes, including nuking disks/partitions and running a full format of all data drives 5. Updated the Stacki post-boot piece to: (a) upgrade firmware if host needs it, (b) run the PayPal Ansible playbook, which: installs additional packages; creates user accounts; disables unused services; sets up resolver/ntp/syslog-ng/sudoers/limits.d/sysctl/etc.; installs/configures Ambari agents; checks data drive mounts, fstab; prepares the rack to be added to a Hadoop cluster
  • 46. DevOps Agnostic DevOps / Configuration Tool DHCP / DNS / TFTP Network Disk OS In-house developed deployment tools - Disk Array Controller Configuration - Disk Partitioning Configuration
  • 47. The “Step 0” Problem Check namenodes are empty Format/start HDFS Create all directories Create all metastores Start services (Hbase, Hive, Oozie, Sqoop, Impala, etc) Deploy client configuration Configure database Setup/assign monitors (activity, services, and host) Test database connections Validate/resolve hostnames Consistent host timezones No bad kernel versions running (CDH) version consistency Java version consistency Daemons versions consistency Mgmt Agents versions consistency Host specification/SSH ports MUCH MORE … DHCP Server/Client setup TFTP/PXE configuration Server OS installation Node OS Install RAID configuration Boot configuration System/data disk partitioning Monitoring system setup and config Lights Out/IPMI setup User accounts added and synced SSH keys on all hosts Network node configuration Config Mgmt install and configuration Route configuration OS upgrades/updates Site specific software and configuration Host specification/SSH ports Security Firewall setup Cluster Mgmt utility Database install and config Multiple network config Package installation MUCH MORE … App Config Site Config HW Install System Performance Validation Bare Metal Installers Hadoop Mgmt Tool Upgrades/Patching Disk Configuration Monitoring Tool Configuration Tool Network/Site Config Tools Systems Mgmt Tool Others … MANUAL SEMI-AUTOMATED TOOLCHAIN (w/o StackIQ) w/StackIQ FULLY AUTOMATED
  • 49. Configuration Database  Server appliance types (e.g. data, namenode, tomcat, …)  Number of CPUs  Disk partitioning  Hardware RAID config  PCI bus information  …  And other System Attributes
  • 50. Attributes  Global ◦  stack set attr  Appliance ◦  stack set appliance attr  OS ◦  stack set os attr  Host ◦  stack set host attr
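The four attribute scopes above suggest a layered lookup. The sketch below shows one plausible resolution rule, in which the most specific scope wins (host over appliance over OS over global). That precedence order, the function name, and the attribute names and values are all assumptions for illustration; the slide lists only the scopes and their `stack set … attr` commands.

```python
def resolve_attrs(global_attrs, os_attrs, appliance_attrs, host_attrs):
    """Merge attribute layers; later (more specific) layers override earlier ones.

    The override order is an assumption, not confirmed by the slides.
    """
    resolved = {}
    for layer in (global_attrs, os_attrs, appliance_attrs, host_attrs):
        resolved.update(layer)
    return resolved


# Hypothetical attribute names and values.
attrs = resolve_attrs(
    global_attrs={"timezone": "UTC", "nukedisks": "false"},
    os_attrs={"kernel_args": "quiet"},
    appliance_attrs={"nukedisks": "true"},   # e.g. set for all data nodes
    host_attrs={"timezone": "US/Pacific"},   # one-off override for a host
)
print(attrs)
```

With these inputs the host keeps its own timezone, inherits the appliance-level disk setting, and falls back to the OS and global layers for everything else.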
  • 53. Starting from the Empty Set   { }
  • 54. { os } © 2009 UC Regents
  • 55. { os, core }
  • 56. { os, core, kernel }
  • 57. { os, core, kernel, mapr }
  • 58. Manage the Deltas {os, core, kernel, mapr} {os, core, kernel, horton}
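The “manage the deltas” idea can be illustrated with plain set operations on the two stacks shown on the slide. This is only a sketch of the concept; it does not represent how Stacki stores or compares its software sets internally.

```python
# The two stacks from the slide, built up from the empty set.
mapr_stack = {"os", "core", "kernel", "mapr"}
horton_stack = {"os", "core", "kernel", "horton"}

shared = mapr_stack & horton_stack        # common base, managed once
only_mapr = mapr_stack - horton_stack     # delta for the first stack
only_horton = horton_stack - mapr_stack   # delta for the second stack

print("shared base:", sorted(shared))
print("mapr delta:", sorted(only_mapr))
print("horton delta:", sorted(only_horton))
```

Three of the four components are identical across both stacks, so only the single differing element in each needs separate attention.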