Deploying Baremetal Instances with OpenStack
    Presentation Transcript

    • Deploying Baremetal Instances with OpenStack (Ver. 1.1, 2013/02/10, Etsuji Nakai)
    • $ who am i
      – Etsuji Nakai
        • Senior solution architect and cloud evangelist at Red Hat.
        • Working for NII (National Institute of Informatics, Japan) as a cloud technology consultant.
        • Author of the "Professional Linux Systems" series: "Technology for Next Decade," "Deployment and Management," and "Network Management." (Available only in Japanese. Translation offers from publishers are welcome ;-)
    • Background of the project
    • Why does baremetal matter?
      – General use cases
        • I/O-intensive applications (RDB)
        • Realtime applications (deterministic latency)
        • Native processor features
        • etc.
      – Specific use case in the "Academic Research Cloud (ARC)" of NII
        • Flexible extension of existing server clusters.
        • Flexible extension of existing cloud infrastructure.
    • Academic Research Cloud (ARC) in NII, today
      – This is a prototype of the Japan-wide research cloud. It is now running in NII's laboratories, and will be extended into a Japan-wide research cloud.
      – Research labs can extend their existing clusters (HPC clusters, cloud infrastructures, etc.) by attaching baremetal servers from the resource pool.
      [Diagram: a baremetal resource pool connected over L2 (VLAN) to an existing HPC cluster and an existing cloud infrastructure; a self-service portal provides on-demand provisioning/de-provisioning for flexible extension of existing clusters.]
    • Future plan of the ARC
      – ARC will be extended into a Japan-wide cloud with SINET4 WAN connection. SINET4 is an MPLS-based wide-area Ethernet service for academic facilities in Japan, operated by NII (http://www.sinet.ad.jp/index_en.html).
      [Diagram: the baremetal resource pool, an existing HPC cluster, and an existing cloud infrastructure interconnected over the MPLS-based wide-area Ethernet.]
    • Overview of dodai-compute1.0
      – What is dodai-compute? A baremetal driver extension of Nova, currently used in ARC.
        • Designed and developed by NII in 2012.
        • Based on Diablo with Ubuntu 11.10.
        • Source code: https://github.com/nii-cloud/dodai-compute
      – Upside: a simple extension aimed at the specific use case :-)
      – Downside: unsuitable for the general use case :-(
        • Cannot manage a mixed environment of baremetal and hypervisor hosts.
        • One-to-one mapping from instance flavor to baremetal host (no scheduling logic to select a suitable host automatically).
        • Nonstandard use of availability zones (used for host status management).
      – The most outstanding issue: it's not merged in the upstream. No community support, no future!
    • Planning of the ARC baremetal provisioning feature
      – It should be designed based on the framework in the upstream.
        • Existing framework: GeneralBareMetalProvisioningFramework, so-called "NTTdocomo-openstack."
        • Blueprint: http://wiki.openstack.org/GeneralBareMetalProvisioningFramework
        • Source code: https://github.com/NTTdocomo-openstack/nova
      – As a first step, we compared the architectures of "dodai-compute" and "NTTdocomo-openstack" and considered the following:
        • What's common and what's different?
        • What can be further generalized in "NTTdocomo-openstack"?
        • What should be added so it can be used for ARC?
      – The goal of the "dodai-compute2.0" project: extend the upstream framework for ARC, and stay in the upstream rather than becoming a private branch.
      – Note: the NTTdocomo-openstack branch has been merged in the upstream with many modifications. Although this slide deck is based on the NTTdocomo-openstack branch, future extension will be done directly on the upstream.
    • By the way, what does "dodai" stand for?
      – 1. Base, foundation, framework, etc.
      – 2. A sub-flight system (SFS) featured in Mobile Suit Gundam.
    • Comparison of dodai-compute1.0 and NTTdocomo-openstack
    • Today's Topics
      1. Coupling structure with Nova Scheduler.
      2. OS provisioning mechanism.
      3. Network virtualization.
    • Coupling Structure with Nova Scheduler
    • General flow of instance launch
      – Question: how can we apply baremetal servers in place of VM instances in this structure?
      [Diagram: compute drivers register their hosts with the Nova Scheduler; the scheduler selects a host for a new instance and asks that host's compute driver to launch the VM.]
    • A1. Register a "baremetal pool" as an "instance host"
      – dodai-compute takes this approach. Its driver acts as a single host which accommodates multiple baremetal servers.
      [Diagram: drivers register baremetal pools with the Nova Scheduler; the scheduler selects a pool for the new instance, and the pool's compute driver selects a baremetal server within it and launches it.]
    • A2. Register each baremetal server as a "single instance host"
      – NTTdocomo-openstack takes this approach. Its driver acts as a proxy for the baremetal servers, each of which accommodates just one instance.
      [Diagram: the compute driver registers each baremetal server as a host with the Nova Scheduler; the scheduler selects a baremetal server for the new instance, and the driver launches the selected server.]
    • Class structure for coupling with Nova
      – dodai-compute1.0 and NTTdocomo-openstack have basically the same class structure in terms of coupling with Nova.
      – The drawing shows the dodai-compute1.0 case; NTTdocomo-openstack uses "BareMetalDriver" in place of "DodaiConnection".
      [Class diagram: a common base class for the different kinds of virtualization drivers, with a driver for libvirt-managed hypervisors (KVM/LXC) and a driver for baremetal management.]
      https://github.com/nii-cloud/dodai-compute/wiki/Developer-guide
      (A minimal driver sketch follows this slide.)
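    To make the class structure concrete, here is a minimal sketch of a baremetal driver plugging into Nova's driver hierarchy. Only the nova.virt.driver.ComputeDriver base class and the method names follow Nova's interface of that era; the machine registry and all method bodies are illustrative assumptions, not code from either project.

        # Hedged sketch: a baremetal driver as a ComputeDriver subclass.
        from nova.virt import driver

        # Illustrative machine registry; both projects keep this in a database.
        ENROLLED = [{'uuid': 'bm-node-01', 'ipmi_address': '10.0.0.101'}]

        class BareMetalSampleDriver(driver.ComputeDriver):
            """Proxy driver: each enrolled machine hosts at most one instance."""

            def get_available_nodes(self):
                # NTTdocomo-style: report every enrolled machine as a separate
                # node, so the scheduler can place one instance on each.
                return [m['uuid'] for m in ENROLLED]

            def spawn(self, context, instance, image_meta, injected_files,
                      admin_password, network_info=None, block_device_info=None):
                # Prepare a PXE boot image from the Glance image, then power
                # the selected machine on via IPMI to start the installation.
                node = ENROLLED[0]  # placeholder for the scheduler's choice
                self._prepare_pxe_image(node, image_meta)
                self._ipmi_power_on(node)

            def _prepare_pxe_image(self, node, image_meta):
                pass  # stub: stage kernel/ramdisk and the installation script

            def _ipmi_power_on(self, node):
                pass  # stub: e.g. shell out to ipmitool chassis power on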
    • How does Nova Scheduler see baremetal servers? (dodai-compute)
      – dodai-compute's driver acts as a single host which accommodates multiple baremetal servers.
      – It is like representing a baremetal pool as a single "host" which runs baremetal servers as its "VMs".
      – The scheduling policy is implemented on the driver side; Nova Scheduler has no choice of hosts. The driver (DodaiConnection) chooses the host to provision by referring to the dodai database of baremetal server information.
      [Diagram: Nova API and Nova Scheduler, where the scheduler recognizes the pool as a single host, and Nova Compute (DodaiConnection) backed by the dodai DB.]
    • How does Nova Scheduler see baremetal servers? (NTTdocomo-openstack)
      – The NTTdocomo-openstack driver acts as a proxy for all baremetal hosts.
      – Each baremetal server is seen as an independent host which can accommodate up to one instance; the driver registers all hosts by referring to the baremetal database of server information.
      – The scheduling policy is implemented as part of Nova Scheduler. It uses "extra_specs" metadata (e.g. extra_specs=cpu_arch:x86_64) to distinguish baremetal hosts from hypervisor hosts. (An example of setting this metadata on a flavor follows below.)
      [Diagram: Nova API and Nova Scheduler, where the scheduler recognizes all baremetal hosts, and Nova Compute (BareMetalDriver) backed by the baremetal DB.]
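    As an illustration, the extra_specs shown above could be attached to a flavor with python-novaclient of that era; the credentials, endpoint, and flavor name here are placeholders, not values from the slides.

        # Hedged example: tag a flavor so the Filter Scheduler routes it
        # to baremetal hosts. Placeholder credentials and flavor name.
        from novaclient.v1_1 import client

        nova = client.Client('admin', 'PASSWORD', 'admin',
                             'http://keystone.example.com:5000/v2.0/')
        flavor = nova.flavors.find(name='baremetal')  # assumed flavor name
        flavor.set_keys({'cpu_arch': 'x86_64'})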
    • Considerations on the Nova Scheduler coupling
      – dodai-compute: scheduling (the server selection logic) is up to the driver.
        • Currently, there's no intelligence in the driver's scheduler; one-to-one mappings between physical servers and instance types are pre-defined.
        • However, this enables users to choose a baremetal server explicitly.
      – NTTdocomo-openstack: scheduling (the server selection logic) is up to Nova Scheduler.
        • Currently, the standard "Filter Scheduler" is used.
        • "instance_type_extra_specs=cpu_arch:x86_64" is used to distinguish baremetal hosts from hypervisor hosts.
        • Users cannot explicitly choose which baremetal server to use. This must be addressed for the ARC use case. We may use additional "labels" in instance_type_extra_specs, e.g. "instance_type_extra_specs=cpu_arch:x86_64,racklocation:a32" (a sketch of a filter honoring such a label follows below).
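    A minimal sketch of how the proposed label could be honored by a custom Filter Scheduler filter. The BaseHostFilter interface and the host_passes() signature follow Nova's filter framework of that era; the "racklocation" key, and the assumption that the driver reports it among its capabilities, are hypothetical.

        # Sketch of a custom filter for the hypothetical "racklocation" label.
        from nova.scheduler import filters

        class RackLocationFilter(filters.BaseHostFilter):
            """Pass only hosts matching the flavor's requested rack location."""

            def host_passes(self, host_state, filter_properties):
                instance_type = filter_properties.get('instance_type') or {}
                extra_specs = instance_type.get('extra_specs') or {}
                wanted = extra_specs.get('racklocation')
                if not wanted:
                    return True  # flavor does not constrain the rack location
                # Assumption: the baremetal driver reports its rack location
                # among the capabilities it registers with the scheduler.
                capabilities = host_state.capabilities or {}
                return capabilities.get('racklocation') == wanted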
    • OS Provisioning Mechanism
    • OS installation mechanism of dodai-compute1.0
      – The management (IPMI) addresses of the baremetal servers are stored in the database; the driver prepares a boot image and an installation script, and the actual installation work is handled by the script:
        (1) The driver fetches the target image from Glance (a tarball of the root filesystem contents) and prepares the installation script.
        (2) The driver passes the installation script URL to the PXE boot server as a kernel parameter.
        (3) The baremetal server PXE-boots, fetches the installation script, and runs it.
        (4) The script fetches the image tarball from the OS installation server and expands it onto the local disk.
    • OS installation mechanism of NTTdocomo-openstack
      – The management (IPMI) addresses of the baremetal servers are stored in the database; the driver prepares a boot image and an installation script, and the actual installation work is handled by the script:
        (1) The driver fetches the target image from Glance (a dd image of the root filesystem) and prepares the installation script.
        (2) The installation script is embedded into the init script of the PXE boot image.
        (3) The baremetal server exports its local disk as an iSCSI LUN and asks the installation service to fill it.
        (4) The OS installation server attaches the iSCSI LUN and fills it with the dd image.
    • OS installation mechanism: comparison
      – The basic framework is the same for both: the management (IPMI) addresses of the baremetal servers are stored in the database, the driver prepares a PXE boot image to start OS installation, and the actual installation work is handled by scripts in the boot image.
      – The difference lies only in the actual installation method (sketched in code below).
        • Installation script of dodai-compute1.0: make partitions and filesystems on the local disk, fetch the tar.gz image and unpack it directly into the local filesystem, and install grub on the local disk.
        • Installation script of NTTdocomo-openstack: start tgtd (the iSCSI target daemon) and export the local disk as an iSCSI LUN, then ask the external "installation server" to install the OS into that LUN. The installation server attaches the LUN and copies the "dd" image to it. Grub is not installed; the baremetal server relies on PXE boot even for bootstrapping the OS provisioned on the local disk. So...
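    A condensed sketch of the two installation paths compared above, not either project's actual scripts; device names, mount points, and URLs are assumptions.

        # Illustrative sketch of the two installation methods.
        import subprocess

        def install_from_tarball(tarball_url, disk='/dev/sda'):
            # dodai-compute style: partition the local disk, unpack the
            # tarball into a fresh filesystem, and install grub so the
            # machine boots from its own disk afterwards.
            subprocess.check_call(['parted', '-s', disk, 'mklabel', 'msdos',
                                   'mkpart', 'primary', 'ext4', '1MiB', '100%'])
            subprocess.check_call(['mkfs.ext4', disk + '1'])
            subprocess.check_call(['mount', disk + '1', '/mnt'])
            subprocess.check_call('curl -s %s | tar xzf - -C /mnt' % tarball_url,
                                  shell=True)
            subprocess.check_call(['grub-install', '--root-directory=/mnt', disk])

        def install_from_dd_image(image_path, lun_device):
            # NTTdocomo style, as seen from the installation server: the node
            # has already exported its disk via tgtd; after attaching the
            # iSCSI LUN (e.g. with iscsiadm --login), fill it with the dd
            # image. Grub is not installed; the node keeps relying on PXE boot.
            subprocess.check_call(['dd', 'if=' + image_path,
                                   'of=' + lun_device, 'bs=1M'])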
    • Considerations on the OS installation mechanism
      – We could provide a more general framework which allows multiple installation methods.
      – Registered machine images would need metadata to specify the type of installation service and the installation service's FQDN; we may use the "properties" attribute of the image.
      [Diagram: (1) the BareMetal driver prepares the target image in the corresponding installation service; (2) it prepares a PXE boot image/initrd script corresponding to the selected installation service; (3) the script in the initrd starts the installation using the selected installation service, chosen among installation servers A, B, ...]
    • Considerations on the OS installation mechanism (continued)
      – Candidates for the installation service:
        • The existing ones from dodai-compute and NTTdocomo-openstack.
        • We'd like to add a Kickstart method, too: the image contains a ks.cfg file instead of an actual binary image, and the installation service installs the OS on the baremetal server using Kickstart. Kickstart gives more flexibility and ease of use for customizing image contents. (An example of tagging an image for this framework follows below.)
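    A hypothetical example of tagging a Glance image with the metadata such a framework would read. The property names (install_type, install_server) are assumptions, not an existing convention, and the endpoint and token are placeholders; the update-with-properties call follows the glanceclient v1 API of that era.

        # Hedged example: mark an image as Kickstart-installable.
        from glanceclient import Client

        glance = Client('1', endpoint='http://glance.example.com:9292',
                        token='ADMIN_TOKEN')

        # A Kickstart-style image would carry a ks.cfg instead of a binary
        # image, plus properties naming the installation service to use.
        glance.images.update(
            'IMAGE_UUID',
            properties={'install_type': 'kickstart',
                        'install_server': 'ks.example.com'})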
    • Network Virtualization
    • Network configuration of dodai-compute1.0
      – L2 separation is done by VLAN.
        • Each lab has its own fixed VLAN ID assigned on SINET4.
        • dodai-compute asks the OpenFlow controller to set up a SINET4 port/VLAN mapping; the VLAN is explicitly specified by the user.
        • Mappings between baremetal NICs and the associated switch ports are stored in the database.
      – OS-side configuration is done by a local agent.
        • The service IP and NIC bonding are configured by the local agent based on the request from dodai-compute (a sketch of the agent-side bonding configuration follows below).
        • NIC bonding is configured for redundancy; it is mandatory in ARC.
      [Diagram: a baremetal server with bonded NICs trunked to two service-network switches; a fixed management IP on the PXE boot/operations management network; the OpenFlow controller programs the port/VLAN mapping on behalf of dodai-compute.]
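    A hedged sketch of what the local agent might write for the mandatory NIC bonding, in the Debian/Ubuntu interfaces style matching dodai-compute's Ubuntu base; the interface names, address, and file path are placeholders.

        # Hedged sketch of agent-side bonding configuration.
        BONDING_TEMPLATE = """\
        auto bond0
        iface bond0 inet static
            address %(service_ip)s
            netmask 255.255.255.0
            bond-slaves eth0 eth1
            bond-mode active-backup
            bond-miimon 100
        """

        def write_bonding_config(service_ip,
                                 path='/etc/network/interfaces.d/bond0'):
            # The agent would write this and bring the bond up on request
            # from dodai-compute.
            with open(path, 'w') as f:
                f.write(BONDING_TEMPLATE % {'service_ip': service_ip})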
    • Network configuration of NTTdocomo-openstack
      – The virtual network is managed by the Quantum API and the NEC OpenFlow plugin.
        • L2 separation is done by port-based packet separation using flowtable entries.
        • Mappings between baremetal NICs and the associated switch ports are stored in the database.
        • When a user specifies more than two NICs, the driver chooses unused NICs from the database and sets up the flowtable entries for the associated ports.
      – VLAN-based separation needs to be added for the ARC use case.
      – A NIC bonding mechanism needs to be added for the ARC use case.
      [Diagram: a baremetal server with a service IP on the service-network switch and a fixed management IP on the PXE boot management network; the OpenFlow controller programs the flowtable on behalf of the BareMetalDriver.]
    • How will the Quantum API be used for the ARC use case?
      – Using the Quantum API and plugins is the preferable choice for ARC, but we need some modifications/extensions, too.
      – VLAN-based separation needs to be added for the ARC use case.
        • Our plan is to add a BareMetal VLAN plugin which configures port/VLAN mappings using flowtable entries, or directly configures port VLANs on Cisco switches.
        • This enables not only SINET4 VLAN connection but also interconnection with VM instances using the OVS plugin (via VLAN).
      – A NIC bonding mechanism needs to be added for the ARC use case (see the NIC registry sketch below).
        • As all NICs of the baremetal servers are registered in the database, we may add redundancy information there (e.g. NIC-A should be paired with NIC-B for bonding).
        • We may still need a local agent to perform the actual bonding configuration.
      [Diagram: SINET4 VLAN trunking into two service-network switches; the BareMetal VLAN plugin handles port VLANs for baremetal servers while the OVS plugin handles hypervisor hosts, enabling VLAN interconnection.]
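    A sketch of the redundancy information proposed above for the NIC registry; the schema, values, and helper are assumptions for illustration, not an existing dodai or Quantum table.

        # Hedged sketch: NIC registry with bonding-peer information.
        BAREMETAL_NICS = [
            {'node': 'bm01', 'nic': 'eth0', 'port': 'sw1/12', 'bond_peer': 'eth1'},
            {'node': 'bm01', 'nic': 'eth1', 'port': 'sw2/12', 'bond_peer': 'eth0'},
        ]

        def pick_bond_pair(node):
            """Return a (primary, peer) NIC pair to bond for the given node."""
            nics = {n['nic']: n for n in BAREMETAL_NICS if n['node'] == node}
            for nic in nics.values():
                peer = nics.get(nic['bond_peer'])
                if peer is not None:
                    return nic, peer
            raise LookupError('no bondable NIC pair registered for %s' % node)

        # A local agent would then create the bonding device for the pair,
        # and the plugin would program the VLAN on both associated switch ports.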
    • Summary
    • Summary
      – Target areas for future extension:
        1. Scheduler extension for grouping of baremetal servers: allow users to specify the baremetal servers to be used.
        2. Multiple OS provisioning methods: allow multiple types of OS images, such as dd image (NTTdocomo-openstack style), tarball (dodai-compute style), and Kickstart installation (new feature).
        3. A baremetal Quantum plugin for VLAN interconnection: allow interconnection to existing VLAN networks and NIC bonding configuration.
      – As the NTTdocomo-openstack branch has been merged in the upstream, future extension will be done directly on the upstream.
    • Thank You! Etsuji Nakai (Twitter: @enakai00)