Leveraging OpenStack at Scale: How the Elastic Cloud Drives Innovation Velocity
1. Leveraging OpenStack at Scale: How the Elastic Cloud Drives Innovation Velocity
Jonathan Chiang
Comcast
August 23, 2016
2.
High Speed Internet
Video
IP Telephony
Home Security / Automation
Universal Parks
Media Properties
3. Stretching the Comcast Elastic Cloud | Our Journey with OpenStack
• Petabyte of Memory and One Million vCPU Cores in 2016
• Multi-Petabyte Ceph Block and Object Storage
• Multi-Terabyte SSD
• Deployed across 34 Regions
• National and Regional Data Centers
• Icehouse Release Today, Moving Directly to Mitaka
4. Community Contributions
• Lines of code: 73,000
• Commits: 1100
• Core Developers and Reviewers on Multiple Projects
• Since Vancouver Summit (Kilo), Comcast has increased its contributions by 50%
5. Comcast Elastic Cloud: A Powerful Platform for Critical Services
• X1
• Residential e-mail
• Big data and network telemetry
• Product development
• Virtual network functions
6. Xfinity X1
• Customized Apps
• Social Media Integration
• Personalized and Dynamic TV Experience
• Cloud DVR
7. Residential Email Services and Product Development
• Tens of Millions of Users
• Terabytes of data
• Hundreds of Applications
• Critical Part of Multiple CI/CD Pipelines
• Accelerating Innovation
• Container Orchestration
8. Big Data on Elastic Cloud
• Big data analysis of network performance and telemetry
• Leverages CEC compute instances with ephemeral storage
• Swift used for data lake/unified central storage
[Diagram labels: Cluster Node VM; OpenStack Swift (Data Lake); Root Volume; Ceph; libvirt; Ephemeral (Local Disk); Cinder Volume (HDFS)]
9. Challenges to Overcome in Order to Increase Business Velocity
• Converge our infrastructure to meet the demands of modern workloads
• Increase operational efficiency
• Performance and scalability
10. Converging Our Infrastructure to Meet the Demands of Modern Workloads: Storage
Modern storage requirements demand:
• Provisioned IOPs
• High read/write throughput
• Quality of Service
• Persistent container storage
But add the complexity of:
• Noisy neighbors
• Convergence vs disaggregation
• Operational awareness
• Cost
11. Converging Our Infrastructure to Meet the Demands of Modern Workloads: NFV
VNFs have complex requirements:
• High bandwidth, IO, while exhibiting extremely low latency
• Predictable and consistent CPU performance
• High availability and resiliency
But introduce the challenges of:
• Multi-tenancy
• Operational complexity
• Vendor reliance
• Cost
12. Increase Operational Efficiency
An ideal operational environment would encompass:
• Clear and concise reference architecture
• Automated and repeatable deployment model
• Consistent upgrade path and backwards compatibility
• Visibility into the full stack to understand issues and make better decisions
Introduce the challenges:
• Multiple deployment methodologies
• Difficulty upgrading
• Unsupported APIs
• Robust instrumentation and monitoring
13. Performance and Scalability
Growth of our Elastic Cloud:
• Year over year increase in demand
• Increase in data acquisition and retention
Are testing the limits of:
• OpenStack networking
• Control plane capabilities
14. Addressing the Challenges
• Collaborate with large scale operators
• Continue to contribute to the community
• Embrace a chaotic environment
I’m here to share with you Comcast’s journey with OpenStack thus far.
I’ll talk about the successful use cases that have really accelerated the growth of our Elastic Cloud.
And I’ll identify some of the areas where we are currently challenged and the opportunities we have as a community to address them.
Some context around Comcast as a business.
Comcast Cable is one of the nation's largest video, high-speed Internet, and phone providers to residential customers under the XFINITY brand.
We also provide these services to businesses.
With the merger of NBCUniversal, the company is made up of:
10 TV and movie production studios including Universal Pictures
20 cable channels including E! and the Golf Channel
11 regional broadcast TV stations
15 Telemundo stations
9 regional sports cable networks
2 sports teams: the 76ers and the Flyers
Digital media – Fandango and a large stake in Hulu
We have fairly substantial OpenStack deployments.
To date, we have a petabyte of memory and over 1 million vCPU cores.
We have multi-tiered storage offerings, including:
Many petabytes of Ceph block and object storage
Many terabytes of SSD arrays
We’re deployed across 34 National and Regional Data Centers
We distribute our network and deploy our OpenStack fabric to our regional data centers, with single-digit network latencies, in order to deliver high-performance compute and storage to customers.
We are still running Icehouse today, but are moving directly into Mitaka in our new regions
We believe in giving back to the community by contributing code, advancing the real-world needs of OpenStack operators within the community, and making our team members available to help advance community goals
As of today, we’ve contributed over 73K lines of code
And have made 1100 commits
We have core developers and reviewers on multiple projects
Since the Vancouver summit, we have increased our contributions to the OpenStack community by 50%.
Comcast set out to build a platform that can meet diverse critical workloads while:
Reducing overall cost
Consolidating infrastructure efforts
Reducing duplication
Improving utilization
For some of our major workloads, OpenStack has delivered.
We’re extremely proud as a company of our X1 Entertainment Operating System, and I’m extremely proud that millions of our customers’ cloud-enabled X1 boxes are powered by OpenStack.
During the Olympics, X1 customers saw the convergence of technology and content
Every event, in every sport, every athlete, every medal ceremony all in one place
Request and search using the voice commands on the remote
The cloud DVR in the image here allows you to take your recorded content wherever you go
Our residential email service, which millions of customers use daily, runs on CEC
We accelerated innovation and time to market for applications
Our product development teams increasingly are leveraging container orchestration technologies, such as Mesos, to deploy and operate new consumer offerings.
The platform also serves as a critical part of multiple CI/CD pipelines
Comcast has big data. We collect telemetry from millions of devices in millions of homes.
Our analytics teams run Hadoop and other tools on Elastic Cloud to optimize network infrastructure and monitor performance metrics
This work leverages CEC compute instances, ephemeral storage, and Swift as the backend.
Although OpenStack has provided tremendous value for Comcast, there are still a number of challenges to overcome in order to increase our business velocity and value:
Converge our infrastructure to meet the demands of modern workloads
Reduce operational complexity
Improve performance and scalability
I’ll discuss each of these challenges in detail and hopefully provide some insight into the real issues we are facing.
An area of convergence we are focusing on is storage.
Our engineers are leveraging modern tools like Spark, Kafka, Hadoop, and Tableau to analyze and visualize these data sets and identify areas where we can improve for our customers.
However, these tools have created a new set of requirements for our infrastructure. Our engineers are asking us for more diverse storage solutions such as SSD for high IOPS workloads.
Or high-performing object storage for distributed messaging systems.
Or direct-attached magnetic disks for Hadoop and ELK clusters.
To address these challenges, we leverage a combination of storage vendors and open source technologies.
But these all add operational complexity.
We are at a crossroads of convergence vs disaggregation. It’s cost-effective to add direct-attached disks to compute nodes, but scalability and utilization are impacted.
Putting workloads on disaggregated storage adds to noisy-neighbor issues, and without appropriate instrumentation, it’s difficult to identify the culprit.
The issue of cost adds more complications. Our users compare us to AWS, which offers SSD-backed volumes for $0.10/GB/month.
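The noisy-neighbor point above can be illustrated with a minimal sketch. It assumes per-tenant IOPS samples are already being collected from a shared storage backend, which is exactly the instrumentation the notes say is hard to come by. The tenant names, sample values, function name, and the 50% share threshold are all hypothetical and are not taken from Comcast's tooling.

```python
from collections import defaultdict

# Hypothetical samples: (tenant, IOPS observed on one shared storage backend).
SAMPLES = [
    ("tenant-a", 1200),
    ("tenant-b", 300),
    ("tenant-a", 1500),
    ("tenant-c", 250),
    ("tenant-b", 350),
]


def noisy_neighbors(samples, share_threshold=0.5):
    """Return tenants whose share of total observed IOPS exceeds the threshold.

    The threshold is an illustrative assumption, not a recommended value.
    """
    totals = defaultdict(int)
    for tenant, iops in samples:
        totals[tenant] += iops

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []  # No load observed, so nobody can be flagged.

    return sorted(
        tenant for tenant, total in totals.items()
        if total / grand_total > share_threshold
    )


if __name__ == "__main__":
    print(noisy_neighbors(SAMPLES))
```

Without per-tenant samples like these, the aggregate load on the backend is visible but the tenant responsible for it is not.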
VNFs require high bandwidth and lots of IO, all while exhibiting extremely low latency
Predictable and consistent CPU performance
High availability and resiliency
But they also introduce the challenges of multi-tenancy
Running multiple concurrent workloads strains the CPU and network performance requirements of our infrastructure
This adds operational complexity due to the lack of metrics (StatsD for Swift is all we have)
SDNs are vendor-specific and very opinionated
Cost is always a concern
As a cloud service provider, we are constantly being compared to AWS in cost, performance, and reliability.
AWS has the volume to buy hardware at incredible discounts, and that’s rapidly shrinking the delta between the cost of running workloads in AWS and on premises.
The area where we can effectively reduce the cost of running in OpenStack is by reducing the resources needed to deploy, operate, and scale it.
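The earlier remark that StatsD for Swift is the only metrics source hints at how lightweight that kind of instrumentation is. As a rough sketch, a StatsD metric is a single text line of the form `<name>:<value>|<type>` sent over UDP. The metric name, the host, and the helper function names below are assumptions made for illustration; port 8125 is the conventional StatsD default.

```python
import socket

# StatsD metric types used here: counter, gauge, timer (milliseconds).
VALID_TYPES = ("c", "g", "ms")


def format_metric(name, value, metric_type):
    """Build one StatsD line: <name>:<value>|<type>."""
    if metric_type not in VALID_TYPES:
        raise ValueError(f"unsupported metric type: {metric_type!r}")
    return f"{name}:{value}|{metric_type}"


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget send over UDP. Host and port are assumptions."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric name; any dotted name the collector accepts works.
    line = format_metric("cec.api.request_time", 42, "ms")
    print(line)
    send_metric(line)
```

Because the send is a single UDP datagram with no reply, emitting a metric adds very little overhead to the service being measured.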
An ideal operational environment would encompass:
Clear and concise reference architecture
Automated and repeatable deployment model
Consistent upgrade path and backwards compatibility
Visibility into the full stack to understand issues and make better decisions
However, we face the challenges of:
Multiple deployment methodologies
Difficulty upgrading
Unsupported APIs
Robust instrumentation and monitoring
We continue to see year over year growth of our Elastic Cloud service. That demand is being driven by the increase in data acquisition and retention.
As we continue to keep up with demand, we are currently testing the limits of OpenStack networking and control plane capabilities.
So why am I on this stage today? It’s to extend an offer to the community: similar-scale providers need to collaborate to reduce the technical debt we have in common.
Scale, performance, and operational complexity are all issues plaguing large deployments.
Comcast is actively engaging large-scale operators to create a community to address these challenges.
We realize that many smaller enterprises look to larger enterprises to help validate OpenStack, so we have to contribute all of our solutions back to the community.
We also need to embrace a chaotic environment that has both hyper-converged and disaggregated infrastructure from a variety of vendors. We have to pick the right tool for the right job at the right time.
I want to acknowledge and thank Mark Muehl (who really should’ve been giving this talk). Comcast’s success with OpenStack is a result of his vision and leadership.