The document discusses general bare-metal provisioning frameworks in OpenStack. It provides an overview of why bare-metal provisioning is needed compared to virtual machines. It describes the history of bare-metal support in OpenStack from the Essex to Grizzly releases. It also outlines the key components of the bare-metal provisioning framework, including the bare-metal driver, power manager, and instance type specifications. Finally, it discusses the bare-metal provisioning workflow and release plan.
RISC-V and OpenPOWER open-ISA and open-HW - a Swiss army knife for HPC (Ganesan Narayanasamy)
To cope with the slowing of Moore's law and the end of Dennard scaling, the world of High-Performance Computing is rapidly evolving toward high-throughput architectures with specialized hardware for vector and tensor operations, in conjunction with sophisticated power-management subsystems. The RISC-V ISA and open hardware can prove their effectiveness in fostering innovation in the HPC market, as they have done in the embedded one. In this talk, I will introduce a set of building blocks for future HPC systems that we have been designing at ETH Zurich and the University of Bologna.
IBM recently announced the POWER10 processor. POWER10 brings a rich set of architecture capabilities into the processor core. Features like prefix instruction support and Matrix Multiply Assist (MMA), which were introduced in OpenPOWER ISA V3.01, are implemented in the POWER10 processor. MMA is an on-chip AI acceleration capability that accelerates matrix-multiplication compute. This talk will cover two key concepts introduced in POWER ISA V3.01: 1) prefix instructions, and how they can help extend the POWER ISA in the next generation, and 2) the Matrix Multiply Assist architecture and its implementation in POWER10.
Devconf2017 - Can VM networking benefit from DPDK? (Maxime Coquelin)
DPDK brings high-performance/low-latency virtualization networking capabilities thanks to its Vhost/Virtio support. The session will first introduce DPDK and its Vhost/Virtio implementations, showing the audience examples of possible uses and the challenges that need to be addressed to achieve high performance, functionality and reliability. Then, Vhost/Virtio improvements introduced in the last DPDK release will be covered, such as receive-path optimizations, Virtio's indirect descriptors support, and transmit zero copy, to name a few. The speakers will explain which problems they aim to address and how they address them, mentioning their limitations.
Finally, the speakers, who are active DPDK's Virtio/Vhost contributors, will expose what new developments are in the pipe to tackle the remaining challenges.
The session will be presented so that DPDK developers and users find useful information on current developments and status. People not familiar with DPDK will get an overview, and can exchange ideas with other projects.
Ariel Waizel discusses the Data Plane Development Kit (DPDK), an API for developing fast packet processing code in user space.
* Who needs this library? Why bypass the kernel?
* How does it work?
* How good is it? What are the benchmarks?
* Pros and cons
Ariel worked on kernel development at the IDF, Ben Gurion University, and several companies. He is interested in networking, security, machine learning, and basically everything except UI development. Currently a Solution Architect at ConteXtream (an HPE company), which specializes in SDN solutions for the telecom industry.
The need for immediate responsiveness of VMs in virtualized environments has been on the rise. Several services at SKT also require soft real-time support for virtual machines, so they can substitute for physical machines while achieving high utilization and adaptability. However, multiple consolidated OSes and irregular external events can cause the hypervisor to infringe on a VM's promptness. As a solution to this problem, we are improving Xen's credit scheduler by introducing RT_PRIORITY, which guarantees that a VM runs at any given point in time as long as it has credits left to burn. This increases quality of service and makes a VM's behavior predictable in a consolidated environment. In addition, we extend our approach to multi-core environments, and even to a large number of physical machines, by using live migration.
Early Benchmarking Results for Neuromorphic Computing (Desmond Yuen)
An update on the Intel Neuromorphic Research Community’s growth and benchmark results, including the addition of new corporate members and numerous new benchmarking updates computed on Intel’s neuromorphic test chip, Loihi.
One of the presentations used in a discussion meeting about GlusterFS held on Sep. 14, 2011 in Japan.
Ust: http://www.ustream.tv/channel/glusterfs
Togetter: http://togetter.com/li/188183
Slides at OpenStack Summit 2017 Sydney
Session Info and Video: https://www.openstack.org/videos/sydney-2017/100gbps-openstack-for-providing-high-performance-nfv
Slides at OpenStack Summit 2018 Vancouver
Session Info and Video: https://www.openstack.org/videos/vancouver-2018/can-we-boost-more-hpc-performance-integrate-ibm-power-servers-with-gpus-to-openstack-environment
In this talk Jiří Pírko discusses the design and evolution of the VLAN implementation in Linux, the challenges and pitfalls as well as hardware acceleration and alternative implementations.
Jiří Pírko is a major contributor to kernel networking and the creator of libteam for link aggregation.
Summit 16: OPNFV on ARM - Hardware Freedom of Choice Has Arrived! (OPNFV)
Freedom of choice is one of the key concepts in the SDN and NFV revolution we are seeing today. OPNFV is at the heart of this revolution yet very limited freedom of choice has existed on the hardware architecture side. However, with the work done in the Armband project, ARM servers are now an alternative hardware architecture for Brahmaputra deployments. The Armband team has ported the OPNFV Fuel Project to support deployments on ARM servers. The necessary code changes have been upstreamed through the OPNFV armband project. End users are now able to download or build their own Brahmaputra OPNFV ISO ready for ARM and install it using available OPNFV documentation. In addition to this and to further the OPNFV VNF ecosystem, a full specification OPNFV Pharos lab based on ARM servers was built by Enea for running continuous integration (CI) and continuous deployment (CD). In this presentation, we will walk you through the experiences gained in this process, the challenges and how they were overcome and what is coming next.
Join-fu is the art of performance-tuning your application's SQL. Join Jay in a fun, irreverent look at the common ways application developers misuse and abuse their database.
In a traditional Xen configuration domain 0 is used for a large number of different functions including running the toolstack(s), backends for network and disk I/O, running the QEMU device model instances, driving the physical devices in the system, handling guest console/framebuffer I/O and miscellaneous monitoring and management functions. Having all these functions in one domain produces a complex environment which is susceptible to shared fate on the failure of any one function, has complex interactions between functions (including resource contention) which makes it difficult to predict performance, and has limited flexibility (such as requiring the same kernel for all device drivers).
"Domain 0 disaggregation" has been discussed for some time as a way to break domain 0's functions out into separate domains. Doing this enables each domain to be tailored to its function, such as using a different kernel or operating system to drive different physical devices. Splitting functions into separate domains removes some of the unintentional interactions, such as in-domain resource contention, and reduces the system impact of the failure of a single function, such as a device driver crash.
Although domain 0 disaggregation is not new, it is seldom used in practice, and much of its use is focused on providing enhanced security. Citrix XenServer will be moving towards a disaggregated domain 0 in order to provide better security, scalability, performance, reliability, supportability and flexibility. This talk will describe XenServer's "Windsor" architecture and explain how it will provide the above benefits to customers and users. We will present an overview of the architecture and some early experimental measurements showing the benefits.
QPACE QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)
2012 Fall OpenStack Bare-metal Speaker Session
1. General Bare-metal Provisioning Framework
Mikyung Kang, USC/ISI
David Kang, USC/ISI
Ken Igarashi, NTT docomo
Mana Kenoko, NTT docomo
Hiromichi Ito, Virtual Tech Japan
Arata Notsu, Virtual Tech Japan
3. Why Bare-metal Provisioning?
• Manage bare-metal machines using OpenStack
[Diagram: virtual machines and bare-metal machines (e.g., for real-time analysis and machines with various CPUs) are both managed and supported through OpenStack]
General Bare-Metal Provisioning Framework (Speaker Session)
4. Why Bare-metal Provisioning?
• Difference between VMs and bare-metal machines
• Virtual machines
  - A hypervisor exists between the physical resources and the virtual machines
  - Image provisioning, VM power management, volume isolation (iSCSI), console access (VNC), VM snapshots
• Bare-metal machines
  - There is no hypervisor
  - A bare-metal machine can access physical resources freely
  - Need to achieve the same security level as virtualized environments
[Diagram: a virtual-machine stack (Host OS and hypervisor handling iSCSI, VLANs and the image DB on top of CPU/MEM/HDD/NIC) next to a bare-metal stack where the OS runs directly on the hardware]
5. Why Bare-metal Provisioning?
• Virtual machine vs. bare-metal machine instances
[Diagram: a bare-metal driver aggregates CPU/MEM/HDD resources and exposes bare-metal instance types (bm1.tiny, bm1.medium) through Nova-Compute, alongside a (virtual) Nova-Compute whose hypervisor on a Host OS exposes virtual instance types (m1.tiny, m1.medium, m1.large)]
6. OpenStack Bare-metal History
Essex Release: April 2012
• Non-PXE Tilera multi-core bare-metal machines
Folsom Release: Sept. 2012
• Non-PXE Tilera multi-core bare-metal machines
• Pending review: PXE support & bare-metal MySQL DB
Grizzly Release: April 2013
• Finish review → merge to upstream: basic functions
• New features including fault tolerance and security enhancements, as well as scheduler changes
7. OpenStack Bare-metal History
• Initial design for Tilera (non-PXE) image provisioning (TFTP/NFS)
[Diagram: Essex and Folsom provisioning flows]
10. Bare-metal Provisioning Framework (Essex/Folsom)
[Diagram: the bare-metal driver registers bare-metal resources (CPU/MEM/HDD) and aggregates them into instance types (bm1.tiny, bm1.medium); Nova-Scheduler applies a bare-metal filter on cpu_arch & hypervisor_type; nodes are homogeneous, and Nova-Compute reports the maximum capability information, including the total number of bare-metal machines]
11. Bare-metal Provisioning Framework (Grizzly)
[Diagram: as in the Essex/Folsom framework, but nodes may have multiple capabilities, and bare-metal node information is kept in a separate MySQL DB]
baremetal_sql_connection = mysql://$ID:$Password@$IP/nova_bm
12. 12
Bare-metal Release Plan
Grizzly-1: Nov. 22nd
Grizzly-3: Feb. 21st
14. Benchmarking
• CPU (Coremark), Context Switch (LMBench), TCP Throughput (Netperf), and Ping, comparing bare-metal, virtual, and SR-IOV configurations
[Charts: bare-metal scores higher than virtual on Coremark and has lower LMBench context-switch time (µs, for 2-96 processes); Netperf transmit/receive throughput (Mbps) and ping latency (ms, for 64-1500-byte packets) compare Baremetal, SR-IOV, and Virtual, with bare-metal best and plain virtual worst]
DOCOMO, INC All Rights Reserved
15.-21. VM Provisioning Procedure in Nova
1. Instance Request
2. Choose Nova-Compute
3. Image Provisioning
4. Network Isolation
5. Nova-Volume Attachment
6. VNC Access
7. Snapshot
[Diagram, built up one step per slide: Nova-API passes the instance request to the Nova-Scheduler, which chooses one of several Nova-Compute hosts (Host OS + hypervisor); Glance provisions the VM image; the network is isolated per user; Nova-Volume attaches each user's volumes (Vol-11/Vol-12 for USER2, Vol-13/Vol-14 for USER1); the user accesses the VM over VNC; finally, a snapshot of the running VM is registered in Glance as a new AMI]
22. Bare-Metal Provisioning Functions
• We need to implement the same functions for bare-metal provisioning:
1. Instance Request - Description for bare-metal machine instances
2. Choose Nova-Compute - Scheduler for bare-metal machines
3. Image Provisioning - Turn bare-metal machines on/off and deploy images to them
4. Network Isolation - Create a private LAN among bare-metal machines
5. Nova-Volume Attachment - Provide secure iSCSI access
6. VNC Access - Provide console access to bare-metal servers
7. Snapshot - Create a new AMI from a running VM
How do we achieve these functions without a hypervisor, while keeping compatibility (same API) and minimizing the impact on Nova?
23. 1. Instance Request
• Create instance types for bare-metal machines

Name        Id  memory_mb  VCPUS  local_gb
m1.tiny     1   512        1      40
m1.medium   2   4096       2      80
b1.tiny     3   512        1      40
b1.medium   4   4096       2      80

• Bare-metal machine instances have "instance_type_extra_specs"

Id  key       value
3   cpu_arch  tilepro64
4   cpu_arch  x86_64

- euca-run-instances -t m1.tiny -> creates a virtual instance
- euca-run-instances -t b1.tiny -> creates a bare-metal instance
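The flavor tables above can be sketched as data: a request targets bare metal precisely when its flavor carries extra specs such as cpu_arch. This is a minimal illustration with hypothetical helper names, not Nova's actual data model.

```python
# Flavor table from the slide; the dict layout is illustrative only.
FLAVORS = {
    "m1.tiny":   {"memory_mb": 512,  "vcpus": 1, "local_gb": 40, "extra_specs": {}},
    "m1.medium": {"memory_mb": 4096, "vcpus": 2, "local_gb": 80, "extra_specs": {}},
    "b1.tiny":   {"memory_mb": 512,  "vcpus": 1, "local_gb": 40,
                  "extra_specs": {"cpu_arch": "tilepro64"}},
    "b1.medium": {"memory_mb": 4096, "vcpus": 2, "local_gb": 80,
                  "extra_specs": {"cpu_arch": "x86_64"}},
}

def is_baremetal_request(flavor_name):
    """A request targets bare metal when its flavor has a cpu_arch extra spec."""
    return "cpu_arch" in FLAVORS[flavor_name]["extra_specs"]
```

So `euca-run-instances -t b1.tiny` maps to a bare-metal deployment, while `-t m1.tiny` stays on the virtual path.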
24. 2. Choose Nova-Compute (Scheduler)
• Create pseudo Nova-Computes for bare-metal machines
[Diagram: one Nova-Compute fronts the bare-metal driver and its aggregated CPU/MEM/HDD resources (b1.tiny, b1.medium), while another (virtual) Nova-Compute fronts a hypervisor on a Host OS offering virtual flavors (m1.tiny, m1.medium, m1.large)]
• The filter scheduler can classify virtual and bare-metal machines
[Diagram: Nova-API sends requests to the Nova-Scheduler, whose filter routes m1.* flavors to the virtual hosts and b1.* flavors to the bare-metal hosts]
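The filtering idea above can be sketched in a few lines: keep a host only if its advertised capabilities match the instance type's extra specs (cpu_arch, hypervisor_type). This mirrors the spirit of Nova's filter scheduler, but the host/spec dictionaries are hypothetical stand-ins for Nova's real classes.

```python
# Simplified sketch of the bare-metal scheduling filter; not Nova's code.
def host_passes(host_caps, extra_specs):
    """A host passes when every requested extra spec (e.g. cpu_arch,
    hypervisor_type) equals the host's advertised capability."""
    return all(host_caps.get(key) == value for key, value in extra_specs.items())

def filter_hosts(hosts, extra_specs):
    """Return the names of hosts that satisfy the instance type."""
    return [name for name, caps in hosts.items() if host_passes(caps, extra_specs)]

hosts = {
    "bm-host":  {"cpu_arch": "x86_64", "hypervisor_type": "baremetal"},
    "kvm-host": {"cpu_arch": "x86_64", "hypervisor_type": "QEMU"},
}
```

With these example hosts, a b1.* request carrying `hypervisor_type=baremetal` selects only `bm-host`, while an empty spec matches both.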
25. 3. Image Provisioning (x86_64)
0. Preparation
- Create "kernel + ramdisk" with "baremetal-mkinitrd.sh" and register them to Glance (the AKI and ARI for the 1st boot)
- Run the bare-metal deployment servers on the nova-compute host: dnsmasq (PXE server) and bm_deploy_server
- Edit nova.conf to specify the nova-compute type:

compute_driver=nova.virt.baremetal.driver.BareMetalDriver   # driver for nova-compute
baremetal_driver=nova.virt.baremetal.pxe.PXE                # bare-metal driver
power_manager=nova.virt.baremetal.ipmi.Ipmi                 # power manager
baremetal_deploy_ramdisk = 843adb6d-e0f8-452d-9a60-d8c883a0983c
baremetal_deploy_kernel = 7dfd792c-fc85-480e-8d07-7d9b20d58c24
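The power_manager flag above selects an IPMI-based power manager. As a rough sketch of what such a manager does, the class below builds ipmitool command lines ("chassis power on/off" is real ipmitool syntax); the class name, fields, and injectable runner are illustrative, not Nova's actual interface.

```python
import subprocess

# Hypothetical sketch of an IPMI power manager like the one selected by
# nova.conf's power_manager flag. Only the ipmitool syntax is real.
class IpmiPowerManager:
    def __init__(self, host, user, password, run=None):
        self._base = ["ipmitool", "-I", "lanplus",
                      "-H", host, "-U", user, "-P", password]
        # A runner can be injected for testing; by default, shell out.
        self._run = run or (lambda argv: subprocess.check_output(argv))

    def _chassis(self, action):
        argv = self._base + ["chassis", "power", action]
        self._run(argv)
        return argv  # returned for inspection/logging

    def power_on(self):
        return self._chassis("on")

    def power_off(self):
        return self._chassis("off")
```

In the provisioning flow, nova-compute would call such a manager to power-cycle the node before each PXE boot.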
26. 3. Image Provisioning (x86_64)
1. 1st Boot
- Nova-API passes the request to the Nova-Scheduler, which picks a bare-metal machine (b1.tiny); the machine PXE-boots from the nova-compute / PXE server using the deployment kernel/ramdisk (AKI/ARI)
- euca-run-instances -t b1.tiny --ramdisk ari-bare (--kernel aki-bare) ami-bare
2. System Setup
- nova-compute / bm_deploy_server sends the AMI via iSCSI and reads the configuration (MAC and IP address) from Nova-Network
- On the node: 1. create the file system (swap), 2. configure the MAC and IP address, 3. set up PXE for the 2nd boot, 4. reboot
27. 3. Image Provisioning (x86_64)
3. 2nd Boot
- The machine PXE-boots again from the nova-compute / PXE server, this time using the provisioning kernel/ramdisk (aki-bare, ari-bare), and then boots from the local HDD
- euca-run-instances -t b1.tiny --ramdisk ari-bare (--kernel aki-bare) ami-bare
- Result: a running bare-metal instance
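The two-stage flow above hinges on rewriting the node's PXE configuration between the deploy boot and the provision boot. A minimal sketch of generating such a pxelinux.cfg entry (the config syntax is standard PXELINUX; the helper, image names, and kernel arguments are illustrative assumptions):

```python
# Sketch of the per-node PXE config rewritten between 1st and 2nd boot.
def pxe_config(kernel, ramdisk, extra_args=""):
    """Render a minimal pxelinux.cfg entry for one boot stage."""
    return (
        "default boot\n"
        "label boot\n"
        f"  kernel {kernel}\n"
        f"  append initrd={ramdisk} {extra_args}\n"
    )

# 1st boot: deployment images; 2nd boot: the instance's own images.
deploy_cfg = pxe_config("aki-deploy", "ari-deploy", "deploy_mode=1")
final_cfg = pxe_config("aki-bare", "ari-bare", "root=/dev/sda1")
```

The deployment ramdisk writes the AMI to disk, then step 3 of "System Setup" swaps in the second config so the reboot lands on the provisioned system.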
28. 4. Network Isolation
• Virtual machines
- The hypervisor checks addresses (IP and MAC) and applies the VLAN tag, so IP address spoofing (pretending to be others) fails (OK)
• Bare-metal machines
- The user can change the address and VLAN tag freely, so MAC, IP address, and VLAN spoofing are possible (NG)
[Diagram: application/middleware/OS stacks running on hypervisors vs. directly on hardware]
29. 4. Network Isolation (β version)
• Use Quantum - NEC's Trema + an OpenFlow switch
- Protect against address spoofing (MAC and IP)
- Create a private network among instances
Rules installed by the OpenFlow controller (Trema from NEC), driven from Quantum via Nova-Compute (bare-metal):
of_in_port=<switch's port> src_mac != <Instance's MAC> -> DROP
of_in_port=<switch's port> src_ip != <Instance's IP> -> DROP
of_in_port=* dst_ip=<Instance's IP> protocol and dst_port allowed by security group -> ALLOW
of_in_port=* dst_ip=<BROADCAST> protocol and dst_port allowed by security group -> ALLOW
[Diagram: the OpenFlow switch partitions servers into security groups A and B]
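The rule set above can be sanity-checked with a toy packet classifier: spoofed sources are dropped first, and only destinations permitted by the security group are allowed. Real enforcement happens in the OpenFlow switch; the packet fields and function here are illustrative.

```python
# Toy, in-order evaluation of the slide's anti-spoofing flow rules.
def classify(pkt, bound_mac, bound_ip, allowed_dst_ports):
    """Return DROP for spoofed sources, ALLOW only for permitted ports."""
    if pkt["src_mac"] != bound_mac:        # src_mac != <Instance's MAC> -> DROP
        return "DROP"
    if pkt["src_ip"] != bound_ip:          # src_ip != <Instance's IP> -> DROP
        return "DROP"
    if pkt["dst_port"] in allowed_dst_ports:  # allowed by security group -> ALLOW
        return "ALLOW"
    return "DROP"
```

A packet with the instance's bound MAC/IP heading for an allowed port passes; changing either source field, as a bare-metal user freely could, gets it dropped at the switch.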
30. 5. Nova-Volume Attachment
• Virtual machines
- Nova-Volume is transparent to users: "iscsiadm -m discovery" from inside a VM doesn't work
• Bare-metal machines
- The user can run "iscsiadm -m discovery" and see all Nova-Volumes
[Diagram: with a hypervisor, guests cannot reach the iSCSI network; a bare-metal machine sees every user's volumes (Vol-11 to Vol-14 of USER1-USER4)]
31. 5. Nova-Volume Attachment (β version)
• Use Nova-Compute as a proxy for Nova-Volume
- Separate the Nova-Volume network and provide ACLs using CHAP
1. Isolate the iSCSI network (an OpenFlow switch sits between the Nova-Volume network and the bare-metal Nova-Volume network)
2. Provide an ACL for each bare-metal machine
[Diagram: servers A-D reach only their own users' volumes (Vol-11 to Vol-14 of USER1-USER4) through the isolated network]
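The per-machine ACL idea reduces to a mapping: each bare-metal node gets CHAP credentials scoped to exactly the volumes its user owns. A toy sketch of that mapping (names are hypothetical; the real ACL lives in the iSCSI target configuration and the OpenFlow switch):

```python
# Illustrative per-server volume ACL for the CHAP-based scheme above.
ACL = {
    "server-a": {"user": "USER1", "volumes": {"vol-13", "vol-14"}},
    "server-b": {"user": "USER2", "volumes": {"vol-11", "vol-12"}},
}

def may_attach(server, volume):
    """Attachment is allowed only for volumes in the server's ACL entry."""
    entry = ACL.get(server)
    return entry is not None and volume in entry["volumes"]
```

With this in place, a discovery from server-a can no longer lead to attaching USER2's volumes, which is exactly the leak the non-proxied setup allowed.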
32. 6. VNC Access (β version)
• Provide console access via Serial over LAN (SOL): Nova-Compute bridges SOL to the bare-metal machine's serial console
• Use an Ajax console (shellinabox): http://code.google.com/p/shellinabox/
33. Bare-Metal Provisioning
1. Instance Request - Create a new instance type with "extra_specs = bare-metal"
2. Choose Nova-Compute - Create a new scheduler called the "Heterogeneous Scheduler"
3. Image Provisioning - Use Intel vPro and IPMI to turn bare-metal machines on/off
4. Network Isolation - Use Quantum (OpenFlow) to protect against address spoofing and create a private LAN within a security group
5. Nova-Volume Attachment - Network ACL (VLAN and CHAP)
6. VNC Access - Serial over LAN
7. Snapshot - TBD
34. Libvirt and Bare-Metal Driver
• Compare operations supported by Horizon

Category   Operation        Libvirt   Bare-Metal
Instance   Activate         O         O (IPMI)
Instance   Reboot           O         O (IPMI)
Instance   Suspend          O         X
Instance   Terminate        O         O (IPMI)
Address    MAC/IP           O         O (Deploy Ramdisk)
Address    Floating IP      O         O
           Snapshot         O         X
Security   Security Groups  O         O (OpenFlow)
Security   Keypair          O         O
           Console          O (VNC)   △ (SOL)
38. Bare-Metal Machine Provisioning
• Manage bare-metal machines the same as virtual machines
- Run an instance through the OpenStack API:
  euca-run-instances -t b1.tiny --ramdisk ari-bare (--kernel aki-bare) ami-bare
- Utilize the whole ecosystem created on top of OpenStack (e.g., auto-scaling) for bare-metal management
39. Auto-Scaling of the Nova-Compute
• Change resources dynamically based on load
[Diagram: machines move in and out of a common computing pool]
40. How Does Zabbix Scale a Nova-Compute?
[Diagram: on the Nova-Compute host, a Zabbix agent gathers items from two sources: VM information from libvirt via a Zabbix libvirt plugin (each VM's CPU load, total vCPUs, VM memory, VM disk, etc.) and host system information via a collectd plugin (total CPUs, total memory, total disk, etc.). Zabbix management then evaluates triggers on those items: when "Item2" (total vCPUs) reaches "Item1" (total CPUs), the scale-out trigger fires its scale-out action, and the matching scale-in trigger/action shrinks the pool]
41. Trigger & Action for Scaling the Nova-Compute

Item List
Item1: Total CPUs
Item2: Total vCPUs

Trigger List
Scale-out: Total vCPUs.ave(60) > Total CPUs  (True: PROBLEM / False: OK)
Scale-in:  Total vCPUs.ave(180) < Total CPUs - number of CPUs per server  (True: PROBLEM / False: OK)

Action List
Scale-out (on PROBLEM): execute the "euca-run-instances~" command against Nova-api
Scale-in  (on PROBLEM): execute the "euca-terminate-instances~" command against Nova-api
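The trigger expressions above translate directly into code. This sketch uses the slide's averaging windows (60 s and 180 s worth of samples) and thresholds; the function names and sample lists are illustrative, not Zabbix's actual API.

```python
# Sketch of the scale-out / scale-in trigger logic from the slide.
def avg(samples):
    return sum(samples) / len(samples)

def scale_out_problem(vcpu_samples_60s, total_cpus):
    """Total vCPUs.ave(60) > Total CPUs: the host is over-committed,
    so launch another Nova-Compute via euca-run-instances."""
    return avg(vcpu_samples_60s) > total_cpus

def scale_in_problem(vcpu_samples_180s, total_cpus, cpus_per_server):
    """Total vCPUs.ave(180) < Total CPUs - CPUs per server: a whole
    server's capacity is idle, so terminate one via euca-terminate-instances."""
    return avg(vcpu_samples_180s) < total_cpus - cpus_per_server
```

Note the asymmetric windows: the longer 180-second average for scale-in damps oscillation, so a brief dip in load does not immediately tear a server down.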
43. Bare-metal code submitted for review
• Updated scheduler and compute for multiple bare-metal capabilities
  - https://review.openstack.org/13920
• Added a separate bare-metal MySQL DB
  - https://review.openstack.org/10726
• A script for bare-metal node management
  - https://review.openstack.org/#/c/11366/
• Updated bare-metal provisioning framework
  - https://review.openstack.org/11354
• Added PXE back-end bare-metal
  - https://review.openstack.org/11088
• Added bare-metal host manager
  - https://review.openstack.org/11357
44. Bare-metal docs
OpenStack Wiki
• http://wiki.openstack.org/GeneralBareMetalProvisioningFramework
OpenStack Source
• nova/virt/baremetal/docs/*.rst
• README and installation documents
The latest GitHub branch
• https://github.com/NTTdocomo-openstack/nova/
45. Bare-metal provisioning interests
• Contact: USC/ISI & NTT Docomo
• Interested companies: collaboration / testing
• Summit session: Tuesday @4:30-5:10pm [Emma AB]
• Design & implementation meetup