sean666666@gmail.com, 2013 P1
Cloud OS development
Sean Chang
2013/6
Agenda
• Software Defined XXX
• What we did in Cloud OS
• What we actually did in Cloud
• Yesterday & Today
• Conclusion
What the hell is cloud computing?
• 5 essential characteristics in the NIST definition
o On-demand self-service
o Broad network access
o Resource pooling
o Rapid elasticity
o Measured service
Who cares…
Cloud computing is… (personal definition)
Apps, the web… anything that provides a fantastic user experience
anytime, anywhere, on any device.
Leading characters (legacy…)
Photo:
- www.bulandwaterandair.com
- www.countryfarm.url.tw
IaaS with Software
• Software Defined XXX
IaaS Roadmap
[Figure: roadmap stages labeled Virtualization, Storage, Networking, …, Data Center]
Photo:
- www.slashgear.com
- softwarestrategiesblog.com/
Our Cloud OS
COS (Cloud OS)
Targets
• Design criteria
o Adapt to enterprises of any scale, from 1 physical
machine to thousands of machines.
o Fast deployment
– One disc, plus scripts and inventories
o Tightly integrated but loosely coupled software
components, usable both as an all-in-one, one-stop
solution and as a single patent module.
o Allow cloud computing resources to scale
horizontally and vertically on the fly without
service downtime.
Physical machine roles
• Web
o Description: UI; RESTful API; server load balancing
o Implementations: Pylons 1.0; Nginx; Spring MVC; jQuery 1.3.2
• Center
o Description: master node; more than 2 nodes suggested for HA; scales out dynamically; global DB
o Implementations: Python 2.6; CentOS 5.4; Spring IoC; PostgreSQL 8.4; Libvirt 0.8.8
• Host
o Description: computing node; multi-hypervisor support; hosts provide HA for each other
o Implementations: Xen 3.4; KVM 0.11
• Storage
o Description: shared storage; volumes hold images, vdisks, metadata, etc.
o Implementations: XML; NFS; iSCSI; SAN
Development date: 2010
COS Deployment
[Figure: deployment diagram with four layers (Web Layer, Center Layer, Host Layer, Storage Layer), DB instances forming a DB cluster, persistent data, and PXE, DHCP, and RPM services]
COS operations
• Software-defined Cloud OS
• Runs wherever a Python environment is available
• Fast, automated deployment
• Each machine learns its role by asking the Center, based on configuration
• Machine roles can be converted dynamically
• Machine HA via heartbeat
• VM services also have HA
• Master & slave
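A minimal sketch of heartbeat-based failure detection, as one way to picture the machine HA bullet. The class name, the timeout value, and the machine names are assumptions made for illustration; the slides do not describe how COS implements its heartbeat.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each machine (hypothetical helper)."""

    def __init__(self, timeout_seconds=10.0, clock=time.monotonic):
        # timeout_seconds is an assumed value; COS's real setting is not given.
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        self.last_seen = {}

    def beat(self, machine):
        """Record that a heartbeat just arrived from this machine."""
        self.last_seen[machine] = self.clock()

    def dead_machines(self):
        """Return, sorted by name, the machines whose heartbeat is overdue."""
        now = self.clock()
        return sorted(
            machine
            for machine, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )
```

The Center would call beat() whenever a machine reports in and poll dead_machines() to decide when to trigger failover.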
Role changeable
[Figure: cluster diagrams of machines labeled Center, Computing Node, and Storage]
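Dynamic role conversion could look roughly like the sketch below. The policy (promote a live computing node whenever fewer than two Center nodes are alive) is an assumption loosely based on the HA suggestion for Center nodes; the function name, role strings, and machine names are hypothetical.

```python
def rebalance_roles(roles, alive, min_centers=2):
    """Return a new role map, promoting computing nodes to center if needed.

    roles: dict of machine name -> "center" | "computing" | "storage"
    alive: set of machine names that still send heartbeats
    min_centers: assumed lower bound on live Center nodes
    """
    new_roles = dict(roles)  # leave the caller's map untouched
    centers = [m for m in sorted(new_roles)
               if new_roles[m] == "center" and m in alive]
    candidates = [m for m in sorted(new_roles)
                  if new_roles[m] == "computing" and m in alive]
    while len(centers) < min_centers and candidates:
        promoted = candidates.pop(0)
        new_roles[promoted] = "center"
        centers.append(promoted)
    return new_roles
```

Returning a new map instead of mutating the input keeps the old assignment available for comparison or rollback.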
Web architecture
[Figure: architecture diagram. Browser (HTML, CSS, JavaScript, jQuery, AJAX) connects over HTTP/HTTPS to COS Web (Pylons; View with mako, Controller, Model; RESTful), which reaches COS Center (Python, SpringPython, COSLib, AAA) through Pyro RPC]
COS modules
[Figure: module map across Web, Center, and Storage. Modules: Cloud Manager, Hypervisor Control, Image Repository, Snapshot, DB, HA, Monitor, Auto Scaling, Alarm, Portal, AAA, UI (Drag & Drop), Deploy]
Network segment configuration
• Service network
o Public IPs of VMs
• Function network
o Intranet for communication between dev machines and servers
• Storage network
o Intranet for accessing storage
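The three segments can be written down as a small address plan. The subnets below are placeholders chosen for illustration (the slides name the segments but give no addresses), and segment_of is a hypothetical helper.

```python
import ipaddress

# Hypothetical address plan; only the three segment names come from the slides.
SEGMENTS = {
    "service": ipaddress.ip_network("203.0.113.0/24"),  # public IPs of VMs
    "function": ipaddress.ip_network("10.0.1.0/24"),    # dev/server intranet
    "storage": ipaddress.ip_network("10.0.2.0/24"),     # storage intranet
}


def segment_of(address):
    """Return the name of the segment containing this address, or None."""
    ip = ipaddress.ip_address(address)
    for name, network in SEGMENTS.items():
        if ip in network:
            return name
    return None
```

Keeping storage traffic on its own segment means heavy image or vdisk I/O does not compete with VM-facing traffic.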
Troubles
• jQuery effects and animations were not so
fancy back then.
• Fire in the hole
o Unstable open source software.
• Show me the money
• Very few established practices to learn from.
• A new team without established processes.
Poor man’s DEV/Test approach
• Every developer owned fewer than 2 machines.
• How did we do integration testing?
[Figure: COS Web (COS/web) and COS Center (COSl/center) connected through a COMS system call on a DEV port. Files shown: COS/springWeb.xml, dynavirtual_3_0/dev-springWeb.xml, config/dev-service.ini, config/service.ini]
Configured individually by each developer
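One way to picture the per-developer setup is a loader that picks dev-service.ini or service.ini depending on a development flag. Only the two file names come from the slide; the function names and any section or key names are assumptions.

```python
import configparser
import os


def pick_config(config_dir, dev_mode):
    """Choose the developer file in dev mode, the shared file otherwise."""
    name = "dev-service.ini" if dev_mode else "service.ini"
    return os.path.join(config_dir, name)


def load_settings(path):
    """Read an INI file into a plain dict of {section: {key: value}}."""
    parser = configparser.ConfigParser()
    with open(path) as handle:
        parser.read_file(handle)
    return {section: dict(parser[section]) for section in parser.sections()}
```

With this split, each developer can point Web and Center at private ports on the same machine without touching the shared service.ini.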
Poor man’s teamwork approaches
• Free Git server
o Customized it
– Combined it with requirement forms
– Built a configuration management process
• Free issue tracking system
o Integrated with Git
Attitude toward using open source
• Open source is a fantastic kind of solution,
but it can also be a poison.
• Taiwan should learn not only how to use it,
but also the spirit behind it.
o Write great code
o Design & architecture
o Vision
VM running status Log from Libvirt
Good artists copy, great artists steal
• Steal the solution from VirtManager
>>> dir(conn)
o listDomainsID
o listDefinedDomains
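The two calls listed above are enough to enumerate VMs: listDomainsID returns the IDs of running domains, and listDefinedDomains returns the names of defined but inactive ones. A short sketch, assuming the libvirt Python bindings are installed; the connection URI and the function names are illustrative choices, not taken from the slides.

```python
def list_vms(conn):
    """Return (running, stopped) VM names from a libvirt connection."""
    # Running domains are listed by numeric ID, so resolve each to a name.
    running = [conn.lookupByID(dom_id).name() for dom_id in conn.listDomainsID()]
    # Defined but inactive domains are already listed by name.
    stopped = list(conn.listDefinedDomains())
    return running, stopped


def main(uri="qemu:///system"):
    """Print VM names; the URI is an assumed default for a local KVM host."""
    import libvirt  # imported here so list_vms stays usable without it

    conn = libvirt.openReadOnly(uri)
    try:
        running, stopped = list_vms(conn)
        print("running:", running)
        print("stopped:", stopped)
    finally:
        conn.close()
```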
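VM VLAN switching
• No API in the past
o Fell back to the command line: # brctl
[Figure: Xen bridge layout with physical interfaces peth0 and peth1, bridges xenbr0 and xenbr1, dom0 (eth0 via vif0.0), and dom1 (eth0 via vif1.0, eth1 via vif1.1)]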
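Without an API, switching a VM's network comes down to detaching its virtual interface from one bridge and attaching it to another with brctl. A hedged sketch that only builds the commands; the function names are hypothetical, and whether COS did it exactly this way is not stated in the slides.

```python
import subprocess


def move_vif_commands(vif, old_bridge, new_bridge):
    """Build the brctl commands that move a vif from one bridge to another."""
    return [
        ["brctl", "delif", old_bridge, vif],  # detach from the old bridge
        ["brctl", "addif", new_bridge, vif],  # attach to the new bridge
    ]


def apply_commands(commands, runner=subprocess.check_call):
    """Run each command in order; needs root and brctl on a real host."""
    for command in commands:
        runner(command)
```

For example, move_vif_commands("vif1.0", "xenbr0", "xenbr1") yields the two commands that would move dom1's first interface onto xenbr1.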
Snapshot
• No snapshot API in the past
o VHD2
o Xen 4.0
• Handle memory and disk separately
o Libvirt API
o qcow2 + command line
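One plausible reading of handling memory and disk separately: save the VM's memory state with virsh save, snapshot the qcow2 disk with qemu-img while the VM is stopped, then restore the VM. The sketch below only builds the command lines; the function name and all argument values are illustrative, and the slides do not say which exact commands COS used.

```python
def snapshot_commands(domain, disk_image, snapshot_name, state_file):
    """Build commands for a memory-plus-disk snapshot of one VM."""
    return [
        # Memory: virsh save writes the RAM state to a file and stops the VM.
        ["virsh", "save", domain, state_file],
        # Disk: create an internal qcow2 snapshot while the image is idle.
        ["qemu-img", "snapshot", "-c", snapshot_name, disk_image],
        # Resume the VM from the saved memory state.
        ["virsh", "restore", state_file],
    ]
```

Saving memory first matters because taking a qemu-img snapshot of an image that a running VM is still writing to risks corrupting it.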
Hardware stress issues
• Pay peanuts, get monkeys
o Network cards
– We broke 3 Ethernet cards in one day…
o Storage
– NFS?
– ZFS?
– NetApp?
– Microsoft Storage Server 2012?
Future
• Add more modules
o Analyzer
o Image transformer
o Automated testing
• Software-Defined Networking
• Software-Defined Storage
o Disaster recovery
Do we really make money?
• Business model?
Photo: abcnews.go.com
ROI of pure public cloud service
[Figure: chart comparing CAPEX, OPEX, and yearly revenue at a scale of 300 VMs (2 cores, 2 GB memory each); vertical axis 0 to 1200, unit: $; series: Employees, IDC, Guest OS, Hypervisor, Hardware, Income]
ROI in large scale
[Figure: chart comparing CAPEX, OPEX, and yearly revenue at a scale of 30,000 VMs; vertical axis 0 to 80000, unit: $; series: Employees, IDC, Guest OS, Hypervisor, Hardware, Income]
CAPEX analysis
[Figure: CAPEX breakdown across Hypervisor, Hardware, and Guest OS; shares shown: 30%, 30%, 40%]
Photo: freephotoshop.org
Communities today
Photo: www.qyjohn.net
Conclusion
• Heaven or hell depends on what kind of hardware
you use.
• Large scale, please!
o Cheng Hui: Sina SAE cloud computing platform architecture
• Hardware is not the critical part of Taiwan’s cloud supply
chain. Software could give Taiwan leverage across the whole
IT world. Better late than never.
• Aim for grade-A players.
• Keep learning.
Keep walking …
