2012 Fall OpenStack Bare-metal Speaker Session

General
Bare-metal
Provisioning
Framework

Mikyung Kang, USC/ISI
David Kang, USC/ISI
Ken Igarashi, NTT docomo
Mana Kenoko, NTT docomo
Hiromichi Ito, Virtual Tech Japan
Arata Notsu, Virtual Tech Japan

2
Overview
¡  Why General Bare-metal Provisioning? (USC/ISI)
¡  Why Bare-metal provisioning?
¡  OpenStack Bare-metal History: Essex – Folsom – Grizzly
¡  Bare-metal Provisioning Framework
¡  Bare-metal Release Plan

¡  Bare-metal provisioning support? (NTT Docomo)
¡  Instance Request
¡  Nova-compute Selection
¡  Image Provisioning
¡  Network Isolation
¡  Nova-volume Attachment
¡  VNC Access
¡  Snapshot

General Bare-Metal Provisioning Framework (Speaker Session)

3
Why Bare-metal Provisioning?
¡  Manage Bare-metal Machines using OpenStack

Virtual Machines Bare-‐‑‒Metal Machines

Real-‐‑‒time Analysis Various CPU Management
support using OpenStack

Open
Stack


4
¡  Difference between VM and Bare-metal Machines
¡  Virtual Machines
¡  Hypervisor exists between physical resources and virtual machines
¡  Image provisioning, VM’s power management, volume isolation
(iSCSI), console access (VNC), VM’s snapshot

Virtual Machine Bare-Metal Machine
Hypervisor 　(OpenStack)
NW Storage ＮＷ OS
NW StorageＮＷ OS
Host OS
HW iSCSI VLAN imag
HW iSCSI VLAN imag CPU MEM HDD NIC
eDB
CPU MEM HDD NIC
eDB

¡  Bare-metal Machines
¡  There is no hypervisor
¡  Bare-metal machine can access physical resources freely
¡  Need to achieve same security level as virtual environments


5
¡  Virtual machine vs. Bare-metal machine instances

bare-metal m1.tiny m1.medium m1.large
Driver Hypervisor 　
CPU

Aggregate Host OS
MEM
resources bm1.medium HW
CPU MEM HDD NIC
HDD
bm1.tiny

Nova-Compute Nova-Compute
(virtual)

Bare-metal machine Virtual machine


6
OpenStack Bare-metal History

Essex Release: April 2012
• Non-PXE Tilera multi-core bare-metal machines

Folsom Release: Sept. 2012
• Non-PXE Tilera multi-core bare-metal machines
• Pending review: PXE support & bare-metal MySQL DB

Grizzly Release: April 2013
• Finish review à merge to upstream: basic functions
• New features including fault-tolerance and security
enhancement as well as scheduler changes

General Bare-metal Provisioning Framework (Speaker Session)

7
OpenStack Bare-metal History
¡  Initial design for Tilera (Non-PXE) Image Provisioning (TFTP/NFS)

Essex

Folsom


8
Bare-metal Provisioning Framework
Compute Node w/
bare-metal plugin
nova/virt/
--- libvirt/
Grizzly
--- driver.py LibvirtDriver
--- baremetal/ (nova/virt/libvirt/driver.py)
--- driver.py
--- tilera.py
--- tilera_pdu.py BareMetalDriver
--- pxe.py (nova/virt/baremetal/driver.py)
--- ipmi.py

1 compute_driver=baremetal.driver.BareMetalDriver

baremetal_driver
2 ={baremetal.tilera.TILERA Non-PXE (TILERA) PXE
| baremetal.pxe.PXE } (nova/virt/baremetal/tilera.py) (nova/virt/baremetal/pxe.py)

power_manager
3 ={baremetal.tilera_pdu.Pdu PDU IPMI
| baremetal.ipmi.Ipmi } (nova/virt/baremetal/tilera_pdu.py) (nova/virt/baremetal/ipmi.py)

instace_type_extra_specs
4 =cpu_arch:xxx
Tilepro64 ARM x86_64


9
¡  Bare-metal nova-compute vs. back-end machines

x86_64 BM Farm
PXE:
X86_64

cpu_arch=* ARM BM Farm
hypervisor_type PXE:
=baremetal ARM

Tilera BM Farm
Non-PXE:
Tilera

Nova-compute w/
Nova-scheduler bare-metal plug-in Bare-metal back-end


10

Registers
bare-metal
bare-metal resources
Driver Essex
CPU

Aggregate Folsom
Bare-metal Filter:
cpu_arch &
MEM
resources bm1.medium
hypervisor_type
HDD
bm1.tiny

Nova-Compute TEXT

Homogeneous Capability Bare-metal
nodes
Nova-Scheduler Maximum Capability information
Including total number of bare-metal machines


11

Registers
bare-metal
bare-metal resources
Driver
CPU Grizzly
Aggregate
Bare-metal Filter:
cpu_arch &
MEM
resources bm1.medium
hypervisor_type
HDD
bm1.tiny

Nova-Compute

Multiple Capabilities Bare-metal
MySQL DB
Nova-Scheduler

baremetal_sql_connection = mysql://
$ID:$Password@$IP/nova_bm


12
Bare-metal Release Plan

Grizzly-1: Nov. 22nd

Grizzly-3: Feb. 21st


General Bare-Metal Provisioning
Framework

Ken Igarashi
Mana Kaneko
(NTT docomo Inc.)

Copyright©2011 NTT DOCOMO, INC. All rights reserved.

Benchmarking
o  CPU (Coremark) o  Context Switch (LMBench)
180000
Baremetal Virtual 70
160000 Better
Baremetal Virtual
60
140000 worse
50

Time [µS]
120000
100000 40
80000 30
60000 20
40000 10
20000 0
0 2 4 8 16 24 32 64 96
Number of Process

o  TCP Throughput (Netperf)
o  Ping
Baremetal Virtual SR-IOV 0.4 Baremetal SR-IOV Virtual
10000
Throughput [Mbps]

Better
worse
Latency [ms]
8000 0.3
6000
0.2
4000
0.1
2000

0 0
64 1024 1500
transmit receive Packet Size [bytes]
DOCOMO, INC All Rights Reserved 14

VM Provisioning Procedure in Nova
1. Instance Request

Nova-
Nova-API
Scheduler

Hypervisor Hypervisor Hypervisor
Host OS
Host OS
Host OS

Nova-Compute
Nova-Compute
Nova-Compute

Glance

USER1 USER1
Storage
Storage
Vol-13 Vol-14

USER2 USER2
Storage
Storage
Vol-11
Vol-12

Nova-Volume

1. Instance Request
2. Choose Nova-Compute
Nova-
Nova-API
Scheduler

Host OS
Host OS
Host OS

Nova-Compute
Nova-Compute
Nova-Compute

Glance

USER1 USER1
Storage
Storage
Vol-13 Vol-14

USER2 USER2
Storage
Storage
Vol-11
Vol-12

Nova-Volume

1. Instance Request
Nova-
Nova-API
Scheduler

3. Image Provisioning
VM
VM
VM
VM

Host OS
Host OS
Host OS

Nova-Compute
Nova-Compute
Nova-Compute

Glance

USER1 USER1
Storage
Storage
Vol-13 Vol-14

USER2 USER2
Storage
Storage
Vol-11
Vol-12

Nova-Volume

1. Instance Request
Nova-
Nova-API
Scheduler

VM
VM
VM
VM

Host OS
Host OS
Host OS

Nova-Compute
Nova-Compute
Nova-Compute

4. Network Isolation
Glance

USER1 USER1
Storage
Storage
Vol-13 Vol-14

USER2 USER2
Storage
Storage
Vol-11
Vol-12

Nova-Volume

1. Instance Request
Nova-
Nova-API
Scheduler

VM
VM
VM
VM

Host OS
Host OS
Host OS

Nova-Compute
Nova-Compute
Nova-Compute

Glance

USER1 USER1
Storage
Storage
Vol-13 Vol-14

USER2 USER2
Storage
Storage
Vol-11
Vol-12
5. Nova-Volume Attachment
Nova-Volume

1. Instance Request
Nova-
Nova-API
Scheduler

VM
VM
VM
VM

6. VNC Access
Host OS
Host OS
Host OS

Nova-Compute
Nova-Compute
Nova-Compute

Glance

USER1 USER1
Storage
Storage
Vol-13 Vol-14

USER2 USER2
Storage
Storage
Vol-11
Vol-12
Nova-Volume

1. Instance Request
AMI
AMI
Nova- glance
Nova-API
Scheduler

7. Snapshot
VM
VM
VM
VM

6. VNC Access
Host OS
Host OS
Host OS

Nova-Compute
Nova-Compute
Nova-Compute

Glance

USER1 USER1
Storage
Storage
Vol-13 Vol-14

USER2 USER2
Storage
Storage
Vol-11
Vol-12
Nova-Volume

Bare-Metal Provisioning Functions
o  We need to implement same functions for bare-metal
provisioning
1.  Instance Request – Description for bare-metal machine instances
2.  Choose Nova-Compute – Scheduler for bare-metal machines
3.  Image Provisioning – Turn on/off and deploy images to bare-metal
machines
4.  Network Isolation – Create private LAN among bare-metal
machines
5.  Nova-Volume Attachment – Provide secure iSCSI access
6.  VNC Access – Provide console access to bare-metal servers
7.  Snapshot – Create new AMI from a running VM
How to achieve those functions without hypervisor?

Keep
Compatibility Less impact to
(Same API)
Nova


1. Instance Request
o  Create instance types for bare-metal machines
Name
Id
memory_mb
VCPUS
local_gb
m1.tiny
1 512
1
40
m1.medium
2
4096
2
80
b1.tiny
3
512
1
40
b1.medium
4
4096
2
80

o  bare-metal machine instances have
“instance_type_extra_specs”

Id
key
value
3
cpu_arch
tilepro64
4
cpu_arch
x86_64

Ø  euca-run-instances –t m1.tiny -> Create virtual instance
Ø  euca-run-instances –t b1.tiney -> Create bare-metal instance

2. Choose Nova-Compute (Sceduler)
o  Create pseudo Nova-Computes for bare-metal machines
CPU

bare-metal MEM
m1.tiny
1.midium
m1.large
m
Driver
HDD Hypervisor　
m1.large
Host OS
b1.midium
HW
CPU MEM HDD NIC
Nova-Compute b1.tiny
Nova-Compute
(virtual)

o  Filter scheduler can classify virtual and bare-metal machines
Hypervisor　 Hypervisor　
Filter
m1.large
Host OS
Host OS
Scheduler
HW
Virtual
HW
m1.tiny
CPU MEM HDD NIC CPU MEM HDD NIC

Nova-
m1.midium
b1.tiny
Scheduler
b1.tiny
Nova-API
Bare-Metal
b1.midium

3. Image Provisioning (x86_64)
0. Preparation

Create “kernel + ramdisk”, Run bare-metal
and register them to glance deployment servers
“baremetal-mkinitrd.sh”
AKI
ARI
- dnsmasq (PXE server)
nova-compute
- bm_deploy_server
glance

Specify nova-
Edit nova.conf
compute type

compute_driver=nova.virt.baremetal.driver.BareMetalDriver Driver for nova-compute
baremetal_driver=nova.virt.baremetal.pxe.PXE and power manager
power_manager=nova.virt.baremetal.ipmi.Ipmi
baremetal_deploy_ramdisk = 843adb6d-e0f8-452d-9a60-d8c883a0983c
baremetal_deploy_kernel = 7dfd792c-fc85-480e-8d07-7d9b20d58c24
AKI and ARI
for 1st boot

1.  1st Boot
Nova- nova-compute/
Scheduler
PXE server
PXE boot
Use kernel/ramdisk
b1.tiny
for the deployment
Bare-Metal
Machines
AKI ARI
Nova-API
(deploy)
(deploy)

euca-run-instances –t b1.tinyl --ramdisk
ari-bare (–kernel aki-bare) ami-bare

2.  System Setup
Send AMI via iSCSI
AMI-
nova-compute/ bare
bm_deploy_server
Read Configuration (Nova-Network)
MAC and IP Address
1.  Create File system (SWAP)
2.  Configure MAC and IP address
3.  Setup PXE for 2nd boot
4.  Reboot

3.  2nd Boot
AMI-
PXE boot bare
Use kernel/ramdisk for the
provisioning
nova-compute/
PXE server
aki-Bare ari-Bare

aki-Bare ari-Bare

Boot from Local HDD

euca-run-instances –t b1.tinyl --ramdisk
ari-bare (–kernel aki-bare) ami-bare
Bare-Metal Instance


o  Virtual Machines o  Bare-Metal Machines
Ø  Hypervisor checks addresses (IP Ø  Use can change address and VLAN
and MAC), and puts VLAN tag
tag freely

IP address MAC, IP address,
spoofing! VLAN spoofing!
(pretend others)
(pretending others)

APL-d APL-d APL-d
MW-d MW-d MW-d
OS-d OS-d OS-d

Hypervisor
Hypervisor
HW

HW
HW

OK
NG


4. Network Isolation (β version)
o  Use Quantum – NEC’s Trema + OpenFlow Switch
Ø  Protect against address spoofing (MAC and IP)
Ø  Create a private network among instances
of_in_port=<switch’s port> src_mac !=
<Instance's MAC> -> DROP
Nova-Compute of_in_port=<switch’s port> src_ip !=
Quantum
<Instance's IP> -> DROP
(bare-metal)
of_in_port=* dst_ip=<Instance's IP> protocol
and dst_port Allowed by security group ->
OpenFlow Controller ALLOW
(Trema from NEC)
of_in_port=* dst_ip=<BROADCAST> protocol
and dst_port Allowed by security group ->
ALLOW
Security Group A

OpenFlow
Security Group B
Switch
Security Group B

DOCOMO, INC All Rights Reserved Security Group A
29

o  Virtual Machines o  Bare-Metal Machines
Ø  Nova-Volume is transparent to Ø  Use can see all Nova-Volumes
users

iscsiadm –m iscsiadm –m
discovery
discovery

APL-d APL-d APL-d
MW-d MW-d MW-d
Don’t work!
Can see all
OS-d OS-d OS-d
the volumes
HW
Hypervisor
Hypervisor

HW
HW

USER1 USER2 USER3 USER4
Storage
Storage
Storage
Storage
Vol-14 Vol-13 Vol-12
Vol-11

Nova-Volume

5. Nova-Volume Attachment (β version)
o  Use Nova-Compute as a proxy of Nova-Volume
Ø  Separate Nova-Volume network and provide ACL using CHAP

2. Provide ACL for each
bare-metal machines
1. Isolate iSCSI netowrk
OpenFlow
Switch

Server A

USER1
Storage
USER2
Storage
Server B
Vol-13 Vol-14

USER3 USER4
Storage
Storage
Vol-11
Vol-12

Nova-Volume
Server C
Server D
Nova-Volume Network Bare-Metal Nova Volume Network


6. VNC Access (β version)
o  Provide console access by Serial over LAN (SOL)
Nova-Compute
Bare-metal

SOL

Serial Console

o  Use Ajax Console (shellinabox)

DOCOMO, INC All Rights Reserved http://code.google.com/p/shellinabox/
32

Bare-Metal Provisioning
1.  Instance Request
- Create new instance type with “extra_specs = bare-metal”
2.  Choose Nova-Compute
- Create new scheduler called “Heterogeneous Scheduler”
3.  Image Provisioning
- Use Intel vPro and IPMI to Turn on/off bare-metal machines
4.  Network Isolation
- Use Quantum (OpenFlow) to protect against address spoofing and create
a private LAN within a security group
5.  Nova Volume Attachment
- Network ACL (VLAN and CHAP)
6.  VNC Access
-  Serial over LAN
7.  Snapshot
- TBD

Libvirt and Bare-Metal Driver
o  Compare operations supported by Horizon
Category
Operation
Libvirt
Bare-Metal

Activate
O O (IPMI)

Reboot
O
O (IPMI)

Suspend
O
X
Instance
Terminate
O
O (IPMI)

MAC/IP
O
O (Deploy Ramdisk)
Address
Floating IP
O
O
Snapshot
O
X
Security
O
O (OpenFlow)
Security
Groups
Keypair
O
O

Console
O (VNC)
△ (SOL)

Demonstration

Mana Kaneko
(NTT docomo Inc.)


Implemented Functions


Scaling the Nova-Compute
using Zabbix

General Bare-Metal
Provisioning Framework


Bare-Metal Machine Provisioning
o  Manage Bare-Metal Machines same as Virtual Machines

Virtual Bare-Metal
Machines
Machines

Ø  Run an instance through OpenStack API
ü euca-run-instances –t b1.tinyl --ramdisk ari-bare (–kernel aki-bare) ami-bare

Utilize all the ecosystem Management
created on top of OpenStack using OpenStack

Open
Stack

Auto-Scaling

Auto-Scaling of the Nova-Compute
o  Change resources dynamically based on load

Common
Computing Pool

Common
Computing Pool


How Does Zabbix Scale a Nova-Compute?
Nova-Compute
Zabbix
Information
from Libvirt
ITEM
Management
VM
VM's CPU load
Item1, Item2
Total vCPUs VM
VM
V VM’s Memory
M
VM’s Disk etc…

Zabbix
Libvirt Collectd
Plugin
TRIGGER
Plugin
“Item2” = Total vCPUs Scale-out Trigger
Scale-in Trigger
Zabbix argent
“Item1” = Total CPUs
H
O
ACTION
S
System Information
Scale-out Action
T Total CPUs Scale-in Action
Total Memory
Total Disk etc…


Trigger & Action for scaling the Nova-Compute
Item List
Item1
Total CPUs
Item2
Total vCPUs

Trigger List
Expression
Value
True : PROBLEM
Scale-out
Total vCPUs.ave(60) > Total CPUs
False : OK
Total vCPUs.ave(180) True : PROBLEM
Scale-in
< Total CPUs - number of CPUs per
server
False : OK

Action List
Value Status
Operation
Execute “euca-run-instances~”
Scale-out
PROBLEM
command to Nova-api

Execute “euca-terminate-instances~”
Scale-in
PROBLEM
command to Nova-api


Scaling Nova-Compute


Bare-metal codes for submission
o  Updated scheduler and compute for multiple bare-metal
capabilities
Ø  https://review.openstack.org/13920
o  Added separate bare-metal MySQL DB
o  A script for bare-metal node management
Ø  https://review.openstack.org/#/c/11366/
o  Updated bare-metal provisioning framework
o  Added PXE back-end bare-metal
o  Added bare-metal host manager


Bare-metal docs 44

OpenStack Wiki
•  http://wiki.openstack.org/
GeneralBareMetalProvisioningFramework

OpenStack Source
•  nova/virt/baremetal/docs/*.rst
•  README and installation documents

The Latest Github branch
•  https://github.com/NTTdocomo-openstack/
nova/


Bare-metal provisioning interests
o  Contact: USC/ISI & NTT Docomo
o  Interested companies: collaboration / testing

Tuesday
@4:30-5:10pm
[Emma AB]

summit
session
Design & Implementation
meetup


2012 Fall OpenStack Bare-metal Speaker Session

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to 2012 Fall OpenStack Bare-metal Speaker Session

Similar to 2012 Fall OpenStack Bare-metal Speaker Session (20)

2012 Fall OpenStack Bare-metal Speaker Session