Bare Metal Provisioning for Big Data
Vol.01 2016/12/1
Woosuk, Choi
EC Core Technology Department
http://www.rakuten.co.jp/
2
First, a Simple Comparison
Virtualization VS Bare Metal
3
What’s the best choice for a Big Data system?
• At Rakuten, we always need ample resources to process huge volumes of data
• In terms of performance/cost, the winner is Bare Metal
                   Bare Metal            Virtualization (Cloud)
Infra Management   Difficult             Easy
Performance        Best Effort           Bottleneck Storage/Network ..
Solutions          Many legacy way ..    AWS, OpenStack ..
Virtualization VS Bare Metal
4
H/W Selection Approach
Low Price
Micro Server
Local Storage
A Large number of
servers
DAS
High Performance
Storage
5
Quanta S910-X31E
Micro Server
FEATURE HIGHLIGHTS
• Chassis
• Nodes : 9 nodes
• Network - Inner Switch
• 1Set, No redundancy
• 10Gbps x 2 uplink
• 1Gbps x 24 internal
• Power
• 2 redundant PSUs
• Per Node
• Processor
• Xeon E3-1200 v3 (4C/1S/8T)
• Memory
• DDR3 16GB/32GB
• Network
• 1Gbps x 2 via Inner Switch
• Storage - No Raid
• Controller : SATAIII
• SSD : 2.5” 120GB/480GB
• HDD : 2.5” 1TB
6
Quanta 210-X12RS/D51B-1U
1U type Server
FEATURE HIGHLIGHTS
• Processor
• Xeon E5-2620/2640 v3
• 8C/1S/32T
• Memory
• DDR3/4 128GB
• Network
• 10Gbps x 1 , No redundancy
• Power
• 1 PSU , No redundancy
• Storage
• Raid Controller : MegaRAID
• SSD : 2.5” 480/600GB x 4
• HDD : 2.5” 1TB x 2
7
Quanta 210-X22RQ/D51B-2U
2U type Server for Hadoop
FEATURE HIGHLIGHTS
• Processor
• Xeon E5-2620/2640 v3
• 8C/2S/32T
• Memory
• DDR3/4 128GB/256GB
• Network
• 10Gbps x 1 , No redundancy
• PSU
• 1 PSU , No redundancy
• Storage
• Controller : SATAIII ( no RAID )
• SSD : 2.5” 120GB x 2
• HDD : 3.5” 4/6/8TB x 12
8
Bare Metal, but is it really good for everything?
So, we can build a Bare Metal management system
that behaves as much like a cloud solution as possible
9
History
Bare Metal System Project for Big Data
2014 3Q: 1st Revision Launch
• Scratch
• By Admin
• GUI/No API
2015 2Q: 2nd Phase Start
2015 4Q: 2nd Revision Launch
• More Open Source
• By User
• Resource Management
• GUI/API
2016: For Next ….
• More Global
• Not only Big Data
• Expand for others
10
US Data Center
200 servers
Boston Office
EU Data Center
300 Servers
Paris Office
JP Data Center
5000 Physical Servers
India Operation Center
24x7 Trouble Tier1 Support
US Data Center
500 servers
Tokyo Office
Where are we ?
11
Physical Server Delivery Flow
Acceptance
• Racking/Cabling
• HW Check
• Stress Test
• BIOS Configuration
• Error on BMC
Ready to Provisioning
• Registration to OS Provisioning System
• Test / Building OS image for each type
• Base Chef recipes
Provisioning
• Provisioning at once
• OS Installation
• Application Deployment
• Set up monitoring
Connected System
Server Assign
Bare Metal Management
Facility
Infra Administrator
Application Platform Admin
12
Racking Rule – for Infra Hostname
Rack = SW NAME : XXXX
Region : JP
Infra Hostname Rule: Region - SW Host (= Rack Name) - Port Num (= Position) - Server Type
Server   Region   SW Host   Port Num   Server Type   Infra Hostname
A        JP       XXXX      01         2U            JP-XXXX-01-2u
B        JP       XXXX      03         1U            JP-XXXX-03-1u
• Rack position maps to the port number on the switch
• The OS provisioning engine can generate the infra hostname automatically
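The naming rule on this slide lends itself to a small helper. A minimal Python sketch, assuming the port number is zero-padded to two digits and the server type is lower-cased, as in the two examples on the slide; the function name `infra_hostname` is hypothetical and not part of the actual provisioning engine:

```python
def infra_hostname(region: str, switch_host: str, port: int, server_type: str) -> str:
    """Build an infra hostname as Region-SwitchHost-Port-ServerType.

    Assumption: the port is zero-padded to two digits and the server type
    is lower-cased, matching the slide's examples (e.g. JP-XXXX-01-2u).
    """
    if port < 0:
        raise ValueError("port number must be non-negative")
    return f"{region.upper()}-{switch_host}-{port:02d}-{server_type.lower()}"


print(infra_hostname("JP", "XXXX", 1, "2U"))  # JP-XXXX-01-2u
print(infra_hostname("JP", "XXXX", 3, "1U"))  # JP-XXXX-03-1u
```

Because the hostname is derived only from where the server is racked and cabled, no manual naming step is needed at provisioning time.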
13
HW Check - Data Flow
OCP Server (DHCP / TFTP / SCP) and BDD Servers
1. Get IP address and boot loader file name
2. Get boot loader file (pxelinux.0)
3. Get boot option configuration (pxelinux.cfg/default)
4. Start kernel
5. Get boot image
6. Start boot image
7. Perform BDD OCP scripts
8. Scp log files
9. Analyze logs and get certification result
Certify the whole rack of servers as a new delivery
14
HW Check - Function
No.  Script Name                           Note
1    Check for initial delivery            Collect HW information and check whether it meets our expectations
2    HW full stress test for DOA           Check stability of HW
3    Initial FW / BIOS update              Check BIOS version; downgrade / upgrade BIOS version when necessary
4    Initial standard BIOS configuration   Mandatory configuration in BIOS, e.g. dedicated network for BMC
5    Over provisioning for SSD             Better write performance
6    Initial L2 Switch configuration       Basic configuration on ToR / Inner Switch
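One way to picture the certification step is that a rack is accepted only when every server passes every check. The Python sketch below illustrates that idea under stated assumptions: the check names are shortened from the HW Check list, every item is treated as a per-server check for simplicity, and the result format (server name mapped to per-check pass/fail) is hypothetical rather than the real log format.

```python
from typing import Dict, List

# Check names abbreviated from the HW Check list; order follows the slide.
CHECKS: List[str] = [
    "initial_delivery",
    "stress_test",
    "fw_bios_update",
    "bios_configuration",
    "ssd_over_provisioning",
    "l2_switch_configuration",
]


def failed_checks(results: Dict[str, bool]) -> List[str]:
    """Return the checks that failed or are missing for one server."""
    return [name for name in CHECKS if not results.get(name, False)]


def certify_rack(rack: Dict[str, Dict[str, bool]]) -> Dict[str, List[str]]:
    """Map each failing server to its failed checks; an empty dict means certified."""
    report = {}
    for server, results in rack.items():
        failures = failed_checks(results)
        if failures:
            report[server] = failures
    return report


all_pass = {name: True for name in CHECKS}
rack = {
    "JP-XXXX-01-2u": dict(all_pass),
    "JP-XXXX-03-1u": {**all_pass, "stress_test": False},
}
report = certify_rack(rack)
print("certified" if not report else f"rejected: {report}")
```

Treating a missing result as a failure keeps a server whose logs never arrived from being certified by accident.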
15
Ready to Provisioning
- Bare Metal Provisioning System
16
Mission
Role / Mission / Detail
Redundancy / Capacity Planning
• Self Provisioning: All provisioning, including application deployment, should be handled by users in one go.
• Full Automation: Fire and forget.
• Pool/Resource Management: All users can control their own resources from the pool. This helps with capacity planning as well as troubleshooting.
Platform Quality
• Full Stack Management: All users can do all system management by themselves.
• Easy Operation: Support not only GUI but also API for easy maintenance.
• Fast Delivery: Faster, faster, faster, like cloud.
17
Installation Engine - MAAS
Management Tool - Chef
OS provisioning system
App
OS
App
OS
GSP Hadoop
Bare Metal Management
App
OS
App
OS
Cluster for Redundancy
Pool
Platform Admin /
Application Owner
Infra Admin
Self provisioning
Resource management
Developing and Maintenance
Concept
18
Components with Open Source
Component                 Detail
Provisioning Controller   Built by Rakuten to control all provisioning processes and data. It stores Rakuten's own organization data and controls MAAS, Chef, and PowerDNS for full automation.
MAAS                      Metal as a Service, from Canonical
Chef                      Management tool
PowerDNS                  Master DNS
BIND                      Slave DNS
Shinken                   Alerting system, compatible with Nagios
Graphite                  Time series DB
19
Self Provisioning with Full Automation
Dash Board
Organization
Role/Recipe
Host name
Custom
data
Provisioning System
OS Provisioning with Chef
API
Worker
Controller
Installation Engine
Management
Monitoring
Operation System
Configuration
All Operation by Chef
App Deploy
Monitoring
Kick OS Provisioning for your system
With Recipes
Recipe for Application
Full Automation
Control Control
Connect
Chef
MAAS
PowerDNS
Shinken
Graphite
Operation by
each User
20
Pool
APP 1
Deployed
Pool
APP 2
Deployed
After
Commissioning
Assignment
Provisioning
Destroy OK
Take over
OK
OK OK
OK
NG
OK
OK
Resource/Pool Management : Reservation
Infra Admin
Admin User 1 ADMIN
Authority
Authentication
Chef
Organization
= Tenant
APP 1 APP2
Admin User 1 User 2
Only Admin
NG
NG
NG
OK
NG
User 2
Only possible by build user
21
APP 1
Deployed
APP 2
Deployed
Provisioning
Destroy OK
Take over
OK
OK OK
OK
NG
OK
OK
Admin User 1 ADMIN
Authority
Authentication
Chef
Organization
= Tenant
APP 1 APP2
Admin User 1 User 2
NG
NG
NG
OK
NG
User 2
Only possible by build user
After Commissioning
Shared Pool
Quota for each group/platform
Self pick-up from the shared pool
Resource/Pool Management : Shared Pool
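The shared-pool idea can be sketched in a few lines of Python. This is an illustration only: it reads "Quotation" on the slide as a per-group quota, and the class `SharedPool`, its methods, and the server names are all hypothetical, not the real system's API.

```python
from typing import Dict, List


class QuotaExceeded(Exception):
    """Raised when a tenant asks for more servers than its quota allows."""


class SharedPool:
    def __init__(self, servers: List[str], quotas: Dict[str, int]) -> None:
        self.free = list(servers)      # commissioned, unassigned servers
        self.quotas = dict(quotas)     # tenant -> maximum number of servers
        self.assigned: Dict[str, List[str]] = {t: [] for t in quotas}

    def pick_up(self, tenant: str) -> str:
        """Assign one free server to the tenant, respecting its quota."""
        if tenant not in self.quotas:
            raise KeyError(f"unknown tenant: {tenant}")
        if len(self.assigned[tenant]) >= self.quotas[tenant]:
            raise QuotaExceeded(f"{tenant} has used its quota")
        if not self.free:
            raise RuntimeError("shared pool is empty")
        server = self.free.pop(0)
        self.assigned[tenant].append(server)
        return server

    def release(self, tenant: str, server: str) -> None:
        """Return a server to the shared pool, for example after Destroy."""
        self.assigned[tenant].remove(server)
        self.free.append(server)


pool = SharedPool(["srv-1", "srv-2", "srv-3"], {"APP1": 1, "APP2": 2})
print(pool.pick_up("APP1"))  # srv-1
```

A second `pool.pick_up("APP1")` would raise `QuotaExceeded`, which is the point of the quota: users serve themselves, but only up to the share agreed for their group.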
22
App Monitoring
• Designed by Application
App Deployment
• Designed by Application
• Custom Package by Application
OS Configuration
• Custom Configuration on OS by Application
• Server Account based on Chef Organization
Infra Monitoring
• Default OS monitoring
OS Configuration
• Default Configuration on OS
• Basic Packages
OS Installation
• Simple image
• Pattern-based Partitioning / RAID configuration
Inventory Data
• Detailed H/W Spec
• Custom Information for BDD
Full Stack Management
Role/Recipe
Infra Base
App XX
Application/Platform Admin
Infra Admin
Responsibility
Chef
Organization
OS Provisioning
Criteria
App YY
Role/Recipe
App ZZ
MAAS
OS Images
23
Easy Provisioning
1st Step
• Choose Server
2nd Step
• Choose Action
• Install
• Destroy
3rd Step
• Any hostname you want
• No DNS operation needed
• OS distribution/version
• Your tenant and environment
• Choose the recipes for your application
Finally, click and get it
Hey, I want a new
server
Just Do It
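The three steps above map naturally onto a single request object, whether it is filled in through the GUI or sent to the API. The Python sketch below is an assumption: the class `ProvisionRequest` and all field names are hypothetical, chosen to mirror the items on this slide, and do not describe the real API schema.

```python
from dataclasses import asdict, dataclass, field
from typing import List

ACTIONS = ("install", "destroy")  # the two actions listed on the slide


@dataclass
class ProvisionRequest:
    server: str                   # 1st step: the chosen server
    action: str                   # 2nd step: install or destroy
    hostname: str = ""            # 3rd step: any hostname; DNS is handled for you
    os_release: str = ""          # OS distribution/version
    tenant: str = ""              # Chef organization (= tenant)
    environment: str = ""
    recipes: List[str] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the request is usable."""
        problems = []
        if self.action not in ACTIONS:
            problems.append(f"unknown action: {self.action}")
        if self.action == "install":
            for name in ("hostname", "os_release", "tenant", "environment"):
                if not getattr(self, name):
                    problems.append(f"missing {name}")
        return problems


req = ProvisionRequest(
    server="JP-XXXX-01-2u",
    action="install",
    hostname="app1-web01",
    os_release="centos-7",
    tenant="APP1",
    environment="DEV",
    recipes=["app1::web"],
)
print(req.validate())
print(asdict(req))
```

Validating the whole request up front fits the "click and get it" goal: the user learns about a missing field before any long-running installation starts.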
24
Install OS / Setup OS / Setup Env
Provisioning Process and Fast Delivery
Provisioning System
Default
Infra Role
App Role
Manage recipes for app
Default Infra
Monitoring
App Monitoring
Basic Install
DNS entry
OS / APP
Configuration
Monitoring
Configuration
Task
Worker
Approximately 30 min
Request via GUI/API
Operation System
Chef
Chef
MAAS
PowerDNS
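The worker on this slide can be pictured as an ordered pipeline of tasks. The Python sketch below is a rough illustration, not the real worker: the stage names follow the task boxes on the slide, the pairing of each stage with MAAS, PowerDNS, or Chef is an assumption, and `placeholder` stands in for calls to those tools that are deliberately not shown.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[str], None]]


def run_pipeline(hostname: str, stages: List[Stage]) -> List[str]:
    """Run stages in order; stop at the first failure and return the completed stage names."""
    done = []
    for name, action in stages:
        try:
            action(hostname)
        except Exception as exc:
            print(f"{hostname}: stage '{name}' failed: {exc}")
            break
        done.append(name)
    return done


def placeholder(label: str) -> Callable[[str], None]:
    """Stand-in for a real call to MAAS, PowerDNS, or Chef (hypothetical)."""
    def action(hostname: str) -> None:
        print(f"{hostname}: {label}")
    return action


STAGES: List[Stage] = [
    ("basic_install", placeholder("install OS image via MAAS")),
    ("dns_entry", placeholder("register DNS entry in PowerDNS")),
    ("os_app_configuration", placeholder("apply default infra role and app role with Chef")),
    ("monitoring_configuration", placeholder("set up infra and app monitoring with Chef")),
]

print(run_pipeline("JP-XXXX-01-2u", STAGES))
```

Stopping at the first failed stage means a server never reaches monitoring setup with a half-installed OS, and the list of completed stages shows where a retry should resume.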
25
Chef-Client
Collectd
Yum
Nrpe
Collector
Graphite
Graphite Web
Time Series DB
Inventory DB
OCS Inventory
Inventory
GLPI agent
Shinken
Thruk
Monitoring
Config files
Docker
PagerDuty
Alert
Repository
Package
JIRA
Issue
Tracking
Chef Enterprise
Configuration
DataBag
Account
Organization
Cookbook
Cookbook
Cookbook
Elasticsearch
Graylog2
Log Analysis
GIT
Source
Management
Jenkins
Graylog Web
Grafana
Dash Board
Core Engine
OS Provisioning
MAAS
PowerDNS
DNS
MySQL
OS Install
Entry
Finally, you can get a server with one click
Operation
Support
26
What’s next? Is it done? No…
27
Global Hosting for BCP
US 1 US 2 EU JP 1 JP 2
Zone 1
Zone 2
Zone 3
Global Zone
Prov
System
Global Dash Board
Prov
System
Prov
System
Prov
System
Prov
System
Local
System
Data
Center
Local
Zone
28
Platform : APP 1
Environment : DEV
Management
API
Application
Application
HaProxy
Network
Layer
Float IP
NAT
Private Network
for User
Platform : APP 2
Environment : PROD
.
.
Network Control
Bare Metal
Management
Another
Application
ACL
Block by ACL
Permit
Permit
Permit
Permit
X
User
API
Application
Application
Permit
29
Data Base
DB 1
DB XX
Application
Layer
.
.
Storage Layer
API
App 1
App 2
Internal SLB
HaProxy
Access
Layer
Access
Web 1
Web2
External SLB
HaProxy
Permit
Permit
ACL
Separated Network
30
Separating Functionality
Like OpenStack Concept
• The current system focuses only on OS provisioning, as a single worker
• The current system was designed only for a dedicated DC
• Functional requirements keep increasing
31
Installation Engine
Management Tool - Chef
OS provisioning system
App
OS
Bare Metal App
Bare Metal Management
App
OS
App
OS
Cluster for Redundancy
Pool
New feature for Cloud
Docker
OS OS OS
VM
Cloud
OpenStack/Swarm/Mesos.. Global Dash Board
Cost
Allocation
32
Thank you
33
We are hiring!
Join us in building the core part of Rakuten
Just email me; I can help you with your application
Woosuk.choi@rakuten.com
http://global.rakuten.com/corp/careers/

Bare Metal Provisioning for Big Data - OpenStack Latest Information Seminar (December 2016)


Editor's Notes

  • #2 OK, before we start: I’ll mainly introduce the OS provisioning system, but this is not a technical deep dive. I just want to introduce the background of this system and our mission for solving our problems and reaching our goals. If you want a more technical explanation, please contact us.
  • #4 There is just one reason for bare metal: resources and performance, because we are Big Data.
  • #5 The blessing: we can have more resources on the same budget than with a cloud system or a formal vendor. The curse: without a good solution it really kills us, because we have to manage everything from the lowest layer, damn it.
  • #10 Here is our history. At the end of 2012, two new graduates and I started the GSP operation team in the development group. In the middle of 2013 (if my memory is correct), we kicked off the OS provisioning project for GSP (as it was at that moment). Anyway, the first revision was launched in the middle of 2014, in Lux, JPE1, and OCDC. But we soon started the 2nd phase (from the middle of 2015) because many goals and circumstances had changed, and we launched it at the end of 2015. Now we’re about to dive into the 3rd phase. Back to GSP: it’s growing with the BDD infrastructure, and their collaboration is in good shape. As I described, the number of versions over there is really large, with at least two versions launched per year; we couldn’t do this without our provisioning system. At the end of 2014, Hadoop joined our group with C2000 running on Ops. I guess management and operation followed the same order as for GSP. Anyway, C4000 and C5000, the new Hadoop clusters, were designed with the BDD infrastructure, and they are also in good shape.
  • #12 Ba
  • #17 So these missions are mandatory goals to make our lives easier. OK, let’s take a journey through the details of the system to see how we clear these missions. BUT!!!
  • #18 There are two kinds of members: the platform admin is the user of this system, and Infra supports them through the system. There is automated OS provisioning connected to the operation tool, Chef. All users can do everything by themselves.
  • #20 The left side shows operations by the user. The right side shows the architecture connecting each component: installation, your recipes, even monitoring. Your Chef recipes manage not only your current system; they are also used for new servers whenever you want them. We call it the OS provisioning system, but in the wider view this system involves everything and controls it.
  • #24 Of course, we have an API for the same operation.
  • #25 Your recipes are used in OS provisioning.
  • #26 Finally, you can get a new server, with everything done automatically. Blue is done; red is... yeah, we’re going to make it one day.
  • #31 Currently, the fixed components are: dashboard (global), authentication, core provisioning engine, and monitoring control. Network (SDN functionality, SLB).
  • #32 We’re trying to move toward a cloud solution, with cost allocation as well.