Xiaobin Zhang, zhangxbk@cnsuning.com
Long Jin, jinlongb@cnsuning.com
Qiming Teng, tengqim@cn.ibm.com
• Suning Overview
• Suning OpenStack Journey
• Suning Cloud Workload
• Lessons Learnt
• Wishlists
• Basic Information
• Established in 1990
• The largest commercial enterprise in China
• Top 3 Chinese private enterprises
• The 50th among the Top 500 enterprises in China
• Business Lines
• retail, logistics, supply chain, real estate, investments, ...
• By the end of 2012
• Suning has stores in 700+ cities in China and other countries.
• The total number of staff is 180,000
• 4 R&D centers: Beijing, Shanghai, Nanjing, Silicon Valley
• Brand value of $ 13 B, annual revenue $ 37 B
• $175 billion, 43.9% year-on-year growth (1H 2014, China)
• Opportunities
• Improve Efficiency, Collaboration Paradigms and Business Models
• Traditional e-Commerce -> O2O (Online-to-Offline) -> Cloudified Whole Value Chain
• Personalized shopping experiences
• Efficient merchandising and supply network
• Transform and optimize operations
• Operating efficiency
• Revenue growth
• Ecosystem domination
Suning Private Cloud
• Multiple Data Centers
• 1000s of Hosts
• Tens of thousands of Virtual Machines
• Rich & Customized Middleware
• Automated Deployment / Operation
• Workflow Consolidation
Suning Public Cloud
• Cloud Server
• VPC
• Shared, Object Storage
• Cloud Database
• Fast Deployment
• Monitoring and Billing
• The journey started in early 2013
• single deployment -> multi-region deployment across data centers
• R&D test workloads -> Internet/production workloads
Domain        Status
Compute       256 GB memory, 4 GB NICs, 64 cores; Windows/Linux guests
Network       Isolated networks for admin, data and storage; OVS bonding; HW LB
Storage       LVM and GlusterFS resource pool with QoS support; Cinder multi-backend
Container     Docker resource pool and Docker repo with HA enabled
Deployment    Cobbler, Puppet
Management    3-node HA setup for controllers; RabbitMQ cluster
Monitoring    Proprietary resource/service monitoring tools, guest agents for data collection; RabbitMQ portal and LogStash
Optimization  Resource scheduling for standalone, clustered and layered applications
• 100+ applications of diverse characteristics
• Mixed CPU-intensive and I/O-intensive workloads:
• CPU-intensive, long-running mobile application compilation and builds
• large storage volumes (800 GB to 1 TB)
• search engine compilation
• big data analytics, e.g. sentiment analysis
• thumbnail generation
• Different software stacks for Internet applications
• Apache + JBoss + MySQL
• IHS + WAS + DB2
• Others
Web / Front-End
• Optional clustering
• Optional auto-scaling
• Dispatch to different hosts, regions, networks, ...
AppServer / Middle-Tier
• Optional clustering
• Optional auto-scaling
• Schedule to different hosts, regions, networks, ...
• Short upgrade cycle: 1-4 weeks (not whole system)
Database / Back-End
• Optional Active/Passive
• Dispatch to different hosts, networks, regions, ...
Web to AppServer
• Dynamic discovery, live registration, request granularity
AppServer to Database
• Dynamic discovery, live registration, transaction dispatching
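The dynamic discovery and live registration links between tiers can be illustrated with a small in-memory registry. This is only a sketch under assumptions: the `ServiceRegistry` class, its method names and the heartbeat TTL are hypothetical and are not taken from Suning's tooling.

```python
import time


class ServiceRegistry:
    """In-memory registry: instances register under a role and must renew within a TTL."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the sketch can be exercised without sleeping
        self._entries = {}  # (role, address) -> timestamp of the last heartbeat

    def register(self, role, address):
        """Live registration; calling it again acts as a heartbeat."""
        self._entries[(role, address)] = self.clock()

    def deregister(self, role, address):
        """Remove an instance explicitly, e.g. on a clean shutdown."""
        self._entries.pop((role, address), None)

    def discover(self, role):
        """Dynamic discovery: addresses for `role` whose heartbeat is still fresh."""
        now = self.clock()
        return sorted(
            addr
            for (r, addr), seen in self._entries.items()
            if r == role and now - seen <= self.ttl
        )
```

A front-end would call `discover("jboss")` before dispatching a request, while each middle-tier instance calls `register` periodically; instances that stop renewing drop out of the results once the TTL passes.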
• A component/service may play different roles
• Apache: web-server and/or reverse-proxy and/or load-balancer
• JBoss: front-end, back-end or both
• Master agent, Host agent, JBoss instances
• Service discovery and registration is complex
• IT requirements like SSH key, service user, password, directory, package repository…
• Legacy scripts and automation tools (adopt or discard)
• Workload distribution has to be planned ahead
• traditional process forking is not acceptable on a virtualized platform
• VMs become the management unit on cloud
• tuning specs: quota, profile, application characteristics
• scaling VMs instead of forking new processes
• An orchestrator sitting above compute, storage and network
• Template based VM provisioning, aka. stack creation
• Heat’s auto-scaling solution is valuable for Suning's Internet applications
• A standardized approach to cloud application deployment and orchestration?
• milk powder: 5 million cans
• milk: 100 containers
• promotion season: 3 days
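To make template-based provisioning with auto-scaling more concrete, the sketch below builds a minimal HOT-style template as a Python dict and prints it as JSON. The resource types and property names follow Heat's resource documentation of that period as best recalled (later releases replaced `OS::Ceilometer::Alarm` with an Aodh equivalent), so they should be checked against the deployed release. The image name, flavor name, threshold and group sizes are placeholders, and this is not Suning's actual template.

```python
import json


def build_autoscaling_template(image, flavor, min_size=1, max_size=5):
    """Return a HOT-style template (as a dict) for a CPU-driven scaling group."""
    return {
        "heat_template_version": "2013-05-23",
        "resources": {
            # Group of identical servers that Heat grows or shrinks.
            "web_group": {
                "type": "OS::Heat::AutoScalingGroup",
                "properties": {
                    "min_size": min_size,
                    "max_size": max_size,
                    "resource": {
                        "type": "OS::Nova::Server",
                        "properties": {"image": image, "flavor": flavor},
                    },
                },
            },
            # Policy that adds one member each time it is triggered.
            "scale_out_policy": {
                "type": "OS::Heat::ScalingPolicy",
                "properties": {
                    "adjustment_type": "change_in_capacity",
                    "auto_scaling_group_id": {"get_resource": "web_group"},
                    "cooldown": 60,
                    "scaling_adjustment": 1,
                },
            },
            # Alarm that calls the policy when average CPU stays high.
            "cpu_high_alarm": {
                "type": "OS::Ceilometer::Alarm",
                "properties": {
                    "meter_name": "cpu_util",
                    "statistic": "avg",
                    "period": 60,
                    "evaluation_periods": 1,
                    "threshold": 70,
                    "comparison_operator": "gt",
                    "alarm_actions": [
                        {"get_attr": ["scale_out_policy", "alarm_url"]}
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    # "web-image" and "m1.small" are placeholder names.
    print(json.dumps(build_autoscaling_template("web-image", "m1.small"), indent=2))
```

Because the policy adjusts capacity by one member at a time with a cooldown, a three-day promotion peak would grow the group gradually up to `max_size` rather than in one step.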
• Standard images and software packages
• Post-launch configuration
• creation of user accounts
• key distribution and revocation
• VM roles assignment
• package update or upgrade
• middleware install and configuration
• application install and configuration
• monitoring tools install and configuration
• service discovery and registration (?)
• ...
• Deployment and Orchestration
• Heat based deployment only covers part of the story
• Cloud-init is only concerned with the initial deployment
• What we need is an integrated end-to-end tool chain that covers runtime/maintenance orchestration as well
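For context on what "initial deployment" covers, here is a rough sketch of generating cloud-init user data for a few first-boot tasks (user account, key distribution, package update and install). It assumes the `#cloud-config` keys `users`, `ssh_authorized_keys`, `package_update` and `packages`, and it emits the body as JSON on the assumption that cloud-init's YAML parser accepts it; the user name, key and package list are placeholders.

```python
import json


def render_user_data(user, ssh_public_key, packages):
    """Build a #cloud-config document for first-boot setup only."""
    config = {
        # Create a service account and install its public key.
        "users": [{"name": user, "ssh_authorized_keys": [ssh_public_key]}],
        # Refresh the package index, then install the listed packages.
        "package_update": True,
        "packages": list(packages),
    }
    # JSON is used to avoid a third-party YAML dependency (see the assumption above).
    return "#cloud-config\n" + json.dumps(config, indent=2) + "\n"


if __name__ == "__main__":
    print(render_user_data("svcuser", "ssh-rsa AAAA... placeholder", ["httpd"]))
```

Nothing in this document runs again after the first boot, which is the gap the slide points at: later key revocation, upgrades and re-registration need a separate runtime mechanism.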
• Orchestration is not thoroughly tested in the community (e.g. auto-scaling)
• involves Heat, Ceilometer, Nova, Keystone...
• rolling-update may not work as expected
• scaling out may jump from one to many directly
• Ceilometer alarm evaluator may not work
• ...
• fixing these is not an easy job
• Triggers for scaling
• network metrics (packets processed, bytes transferred) sound interesting, but
• CPU and memory are still the primary bottlenecks
• Scaling may be triggered by a combination of CPU, memory, disk I/O and/or network I/O, using a customized algorithm
• Rolling update is of critical importance
• ensure a given number of instances are always online when performing updates
• Deletion policy for resource groups
• sometimes it is preferable to delete the newest members rather than the oldest, considering that
• old members may have state cached, may have proved to be stable, ...
• Fast detection and fast scaling (seconds level)
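The combined scaling trigger mentioned above could look roughly like the following. The weights, thresholds and the `Sample` type are hypothetical; the slides do not describe the actual algorithm. The sketch weights CPU and memory most heavily and moves by at most one instance per decision, which also avoids jumping from one member to many.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One utilization reading per instance, each value in the range 0.0-1.0."""
    cpu: float
    memory: float
    disk_io: float
    net_io: float


# Hypothetical weights: CPU and memory dominate, disk and network I/O contribute less.
WEIGHTS = {"cpu": 0.4, "memory": 0.4, "disk_io": 0.1, "net_io": 0.1}


def load_score(sample):
    """Weighted sum of the four utilization figures."""
    return (
        WEIGHTS["cpu"] * sample.cpu
        + WEIGHTS["memory"] * sample.memory
        + WEIGHTS["disk_io"] * sample.disk_io
        + WEIGHTS["net_io"] * sample.net_io
    )


def scaling_decision(samples, current, min_size, max_size, high=0.75, low=0.30):
    """Return the new group size, changing it by at most one instance."""
    if not samples:
        return current  # no data: hold the current size
    average = sum(load_score(s) for s in samples) / len(samples)
    if average > high:
        return min(current + 1, max_size)
    if average < low:
        return max(current - 1, min_size)
    return current
```

Calling `scaling_decision` every few seconds with fresh samples would approximate the seconds-level detection asked for above, at the cost of more frequent metric collection.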
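The rolling-update requirement (keep a given number of instances online while updating) can be sketched as a batch planner. The function and its `min_in_service` parameter name are illustrative choices, not a description of how Heat implements rolling updates.

```python
def rolling_update_batches(members, min_in_service):
    """Split members into update batches so at least min_in_service stay online.

    Members in one batch are taken offline together; the next batch starts
    only after the previous one is back in service.
    """
    batch_size = len(members) - min_in_service
    if batch_size < 1:
        raise ValueError("cannot update: no member can be taken offline")
    return [members[i:i + batch_size] for i in range(0, len(members), batch_size)]
```

With five members and `min_in_service=3`, the plan updates two members at a time, so three are always serving traffic.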
• Availability
• Storage Reliability/Availability
• Hard disk errors are common
• VM High-Availability (aka. the "Pets" story)
• it does not look like a mission for a single project
• host failure, network failure, storage failure, guest failure, application failure ...
• may need to get Nova, Ceilometer (Zaqar?), Heat, Keystone to work together
• AutoScaling
• Semi-AutoScaling (Scale at a given point in time)
• Smarter VM placement, aka. Global Scheduling
• e.g. 3 Apache servers per host is okay, but 9 per host is risky
• VM placement is mainly concerned with service availability
• Scaling across availability zones, across regions
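The placement concern above (a few Apache servers per host is fine, many is risky) amounts to a per-host cap for each role. A minimal sketch, assuming a hypothetical `place_instance` helper and a cap of three taken from the slide's example:

```python
from collections import Counter


def place_instance(role, hosts, placements, max_per_host=3):
    """Pick the host running the fewest instances of `role`.

    placements is a list of (host, role) pairs for instances already running.
    Returns None when every host has reached the cap for this role.
    """
    counts = Counter(host for host, r in placements if r == role)
    candidates = [h for h in hosts if counts[h] < max_per_host]
    if not candidates:
        return None
    # Prefer the least-loaded host; break ties by host name for determinism.
    return min(candidates, key=lambda h: (counts[h], h))
```

Spreading instances of one role this way limits how much of a service is lost when a single host fails, which is the availability argument made on this slide.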
• Application Profile and Management
• Each application has a unique architecture where some components are reusable
• Most components are capable of playing different roles (e.g. front-end vs back-end)
• domain role, slave role, host role, etc.
• Combinations are difficult to predict and manage
• Solum? Murano?
• Provider Templates?
+ promote template reusability
+ facilitate fine granularity version control
- difficult to reference resources (attributes) from outer/inner templates
- difficult to get dependencies done right
- Tools to standardize Heat template collections
• Configurable frequency for Heat engine calls
• mostly from os-xxx-config
• may need a short interval during bootup, then switch to a longer interval
• Tools and guidance for the establishment of standard workflows
• need to abstract away common features and parameters
• need to simplify the deployment, management process
• need to adapt to new technologies
• e.g. transitioning from shared disk volumes to cloud storage
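The first wishlist item, a short polling interval during bootup followed by a longer one, can be expressed in a few lines. The window and interval values below are made-up defaults, and `poll_interval` is a hypothetical helper rather than an existing os-xxx-config option.

```python
def poll_interval(seconds_since_boot, bootup_window=300.0, short=5.0, long=60.0):
    """Return how many seconds to wait before the next call to the Heat engine."""
    # Poll frequently while the instance is still being configured,
    # then back off to reduce load on the engine.
    return short if seconds_since_boot < bootup_window else long
```

An agent loop would sleep for `poll_interval(elapsed)` between calls, so a large fleet in steady state generates far fewer requests than it does while booting.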
Thank You!

Suning OpenStack Cloud and Heat

  • 1. Xiaobin Zhang, zhangxbk@cnsuning.com Long Jin, jinlongb@cnsuning.com Qiming Teng, tengqim@cn.ibm.com
  • 2. • Suning Overview • Suning OpenStack Journey • Suning Cloud Workload • Lessons Learnt • Wishlists 2
  • 3. • Basic Information • Established in 1990 • The largest commercial enterprise in China • Top 3 Chinese private enterprises • The 50th among the Top 500 enterprises in China • Business Lines • retail, logistics, supply chain, real estates, investments, ... • By the end of 2012 • Suning has stores in 700+ cities in China and other countries. • The total number of staff is 180,000 • 4 R&D centers: Beijing, Shanghai, Nanjing, Silicon Valley • Brand value of $ 13 B, annual revenue $ 37 B 3 $ 175 Billion 43.9% YTY Growth 1H 2014, China
  • 4. • Opportunities • Improve Efficiency, Collaboration Paradigms and Business Models • Traditional e-Commerce  O2O (Online-to-Offline)  Cloudified Whole Value Chain 4 PERSONALIZED SHOPPING EXPERIENCES EFFICIENT MERCHANDISING AND SUPPLY NETWORK TRANSFORM AND OPTIMIZE OPERATIONS OPERATING EFFICIENCY REVENUE GROWTH ECOSYSTEM DOMINATION
  • 5.
      • Suning Private Cloud
          • Multiple Data Centers
          • 1000s of Hosts
          • 10x1000s of Virtual Machines
          • Rich & Customized Middleware
          • Automated Deployment / Operation
          • Workflow Consolidation
      • Suning Public Cloud
          • Cloud Server
          • VPC
          • Shared, Object Storage
          • Cloud Database
          • Fast Deployment
          • Monitoring and Billing
  • 6.
      • The journey started in early 2013
          • single deployment -> multi-region deployment across data centers
          • R&D workload testing -> Internet/production workload
      • Domain: Status
          • Compute: 256 GB memory, 4 GB NICs, 64 cores; Windows/Linux guests
          • Network: Isolated networks for admin, data and storage; OVS bonding; HW LB
          • Storage: LVM and GlusterFS resource pool with QoS support; Cinder multi-backend
          • Container: Docker resource pool and Docker repo with HA enabled
          • Deployment: Cobbler, Puppet
          • Management: 3-node HA setup for controllers; RabbitMQ cluster
          • Monitoring: Proprietary resource/service monitoring tools, guest agents for data collection; RabbitMQ portal and LogStash
          • Optimization: Resource scheduling for standalone, clustered and layered applications
  • 7.
      • 100+ applications of diverse characteristics
      • Mixed CPU-intensive and I/O-intensive workloads:
          • CPU-intensive, hours-long compilation and building of mobile applications
          • huge storage and volume (800G ~ 1T)
          • search engine compilation
          • big data analytics, e.g. sentiment analysis
          • thumbnail generation
      • Different software stacks for Internet applications
          • Apache + JBoss + MySQL
          • IHS + WAS + DB2
          • Others
  • 8.
      • Web / Front-End
          • Optionally Clustering
          • Optionally Auto-Scaling
          • Dispatch to different hosts, regions, networks, ...
      • AppServer / Middle-Tier
          • Optionally Clustering
          • Optionally Auto-Scaling
          • Schedule to different hosts, regions, networks, ...
          • Short upgrade cycle: 1-4 weeks (not whole system)
      • Database / Back-End
          • Optionally Active/Passive
          • Dispatch to different hosts, networks, regions, ...
      • Diagram labels between tiers
          • Dynamic Discovery, Live Registration, Request Granularity
          • Dynamic Discovery, Live Registration, Transaction Dispatching
  • 9.
      • A component/service may play different roles
          • Apache: web-server and/or reverse-proxy and/or load-balancer
          • JBoss: front-end, back-end or both
          • Master agent, Host agent, JBoss instances
      • Service discovery and registration is complex
          • IT requirements like SSH key, service user, password, directory, package repository…
          • Legacy scripts and automation tools (adopt or discard)
      • Workload distribution has to be planned ahead
          • traditional process forking is not acceptable on a virtualized platform
      • VMs become the management unit on cloud
          • tuning specs: quota, profile, application characteristics
          • scaling VMs instead of forking new processes
  • 10.
      • An orchestrator sitting above compute, storage and network
      • Template-based VM provisioning, aka. stack creation
      • Heat’s auto-scaling solution is valuable for Suning's Internet applications
      • A standardized approach to cloud application deployment and orchestration?
      • Figure callout: milk powder: 5 million cans; milk: 100 containers; promotion season: 3 days
  • 11.
      • Standard images and software packages
      • Post-launch configuration
          • creation of user accounts
          • key distribution and revocation
          • VM role assignment
          • package update or upgrade
          • middleware install and configuration
          • application install and configuration
          • monitoring tools install and configuration
          • service discovery and registration ???
          • ...
  • 12.
      • Deployment and Orchestration
          • Heat-based deployment only covers part of the story
          • Cloud-init is only concerned with the initial deployment
          • What we need is an integrated end-to-end tool chain that covers runtime/maintenance orchestration as well
      • Orchestration is not thoroughly tested in the community (e.g. Auto-scaling)
          • involves Heat, Ceilometer, Nova, Keystone...
          • rolling-update may not work as expected
          • scaling out may jump from one to many directly
          • ceilometer alarm evaluator may not work
          • ...
          • fixing these is not an easy job
  • 13.
      • Triggers for scaling
          • network metrics (packets processed, bytes transferred) sound interesting, but CPU and memory are still the primary bottlenecks
          • Scaling may be triggered by a combination of CPU, memory, disk I/O and/or network I/O, using a customized algorithm
      • Rolling update is of critical importance
          • ensure a given number of instances are always online when performing updates
      • Deletion policy for resource groups
          • sometimes deleting the newest members is preferable to deleting the oldest, considering that old members may have state cached, may have proved to be stable, ...
      • Fast detection and fast scaling (seconds level)
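The three wishes on this slide (a combined-metric trigger, a rolling update that keeps a minimum number of instances online, and a newest-first deletion policy) can be illustrated with a minimal Python sketch. It is not Heat's implementation: the weights, thresholds and every function name below are hypothetical, chosen only to make the ideas concrete.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    """Average utilisation across the group, each value in the range 0.0-1.0."""
    cpu: float
    memory: float
    disk_io: float
    network_io: float


# Hypothetical weights: CPU and memory dominate, as the slide suggests.
WEIGHTS = {"cpu": 0.4, "memory": 0.4, "disk_io": 0.1, "network_io": 0.1}


def load_score(m):
    """Combine the four metrics into one weighted score."""
    return sum(getattr(m, name) * weight for name, weight in WEIGHTS.items())


def scaling_decision(m, current, min_size, max_size, high=0.75, low=0.30):
    """Return the desired group size, changing it by at most one instance."""
    score = load_score(m)
    if score > high:
        return min(current + 1, max_size)
    if score < low:
        return max(current - 1, min_size)
    return current


def rolling_update_batches(members, min_in_service):
    """Split members into batches so min_in_service members always stay online."""
    batch_size = len(members) - min_in_service
    if batch_size < 1:
        raise ValueError("group too small to keep min_in_service members online")
    return [members[i:i + batch_size] for i in range(0, len(members), batch_size)]


def pick_victims(members, count, newest_first=True):
    """Choose members to delete; members is a list of (name, created_at) pairs."""
    ordered = sorted(members, key=lambda item: item[1], reverse=newest_first)
    return [name for name, _ in ordered[:count]]
```

For example, `rolling_update_batches(["a", "b", "c", "d", "e"], 3)` yields `[["a", "b"], ["c", "d"], ["e"]]`, so no more than two members are offline at any time. Stepping the group size by one per evaluation is a deliberate choice here; a real policy might scale by a percentage or add a cooldown.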
  • 14.
      • Availability
          • Storage Reliability/Availability
              • Hard disk errors are common
          • VM High-Availability (aka. the "Pets" story)
              • it doesn't seem like a single-project mission
              • host failure, network failure, storage failure, guest failure, application failure ...
              • may need to get Nova, Ceilometer (Zaqar?), Heat, Keystone to work together
      • AutoScaling
          • Semi-AutoScaling (scale at a given point in time)
          • Smarter VM placement, aka. Global Scheduling
              • e.g. 3 Apache servers per host is okay, but 9 Apache servers per host is risky
              • VM placement is mainly concerned with service availability
          • Scaling across availability zones, across regions
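The placement concern above (3 Apache servers per host is fine, 9 is risky) amounts to a per-host cap on instances of the same role. The sketch below shows one way such a rule could look. It is an illustration only, not the Nova scheduler; `pick_host`, the `placements` mapping and the default cap of 3 are all assumptions.

```python
from collections import Counter


def pick_host(placements, hosts, role, max_per_host=3):
    """Pick the host running the fewest instances of role, or None if all are full.

    placements maps a host name to the list of roles already running on it.
    """
    if not hosts:
        return None
    counts = {h: Counter(placements.get(h, []))[role] for h in hosts}
    # Prefer the least-loaded host; break ties by name for determinism.
    host = min(hosts, key=lambda h: (counts[h], h))
    if counts[host] >= max_per_host:
        return None  # every host is at the cap; the caller must add capacity
    return host


def place_many(hosts, role, count, max_per_host=3):
    """Place count instances one by one; return (placements, number rejected)."""
    placements = {h: [] for h in hosts}
    rejected = 0
    for _ in range(count):
        host = pick_host(placements, hosts, role, max_per_host)
        if host is None:
            rejected += 1
        else:
            placements[host].append(role)
    return placements, rejected
```

Spreading instances first and rejecting placements beyond the cap keeps a single host failure from taking out most of one service, which is the availability argument the slide makes.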
  • 15.
      • Application Profile and Management
          • Each application has a unique architecture where some components are reusable
          • Most components are capable of playing different roles (e.g. front-end vs back-end)
              • domain role, slave role, host role, etc.
          • Combinations are difficult to predict and manage
          • Solum? Murano?
      • Provider Templates?
          + promote template reusability
          + facilitate fine-grained version control
          - difficult to reference resources (attributes) from outer/inner templates
          - difficult to get dependencies done right
      • Tools to standardize Heat template collections
  • 16.
      • Configurable frequency for Heat engine calls
          • mostly from os-xxx-config
          • may need a short interval during bootup, then switch to a longer interval
      • Tools and guidance for the establishment of standard workflows
          • need to abstract away common features and parameters
          • need to simplify the deployment and management process
          • need to adapt to new technologies
              • e.g. transition from using shared disk volumes to using storage cloud
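The request for a short polling interval during bootup followed by a longer one can be sketched as a simple two-phase schedule. This is an illustration under stated assumptions, not a description of any existing os-*-config option: `boot_window`, `boot_interval` and `steady_interval` are hypothetical parameter names and values.

```python
def poll_interval(seconds_since_boot, boot_window=300.0,
                  boot_interval=5.0, steady_interval=60.0):
    """Return how long to wait before the next poll of the orchestration API."""
    if seconds_since_boot < 0:
        raise ValueError("seconds_since_boot must be non-negative")
    # Poll often while the instance is still being configured, then back off.
    if seconds_since_boot < boot_window:
        return boot_interval
    return steady_interval


def poll_schedule(total_seconds, **kwargs):
    """List the times (seconds since boot) at which polls would be issued."""
    times = []
    t = 0.0
    while t < total_seconds:
        times.append(t)
        t += poll_interval(t, **kwargs)
    return times
```

With a 10-second boot window, a 5-second boot interval and a 10-second steady interval, `poll_schedule(30, boot_window=10, boot_interval=5, steady_interval=10)` returns `[0.0, 5.0, 10.0, 20.0]`: frequent calls while configuration is in flight, fewer once the instance has settled, which reduces load on the Heat engine.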