I am going to talk about a number of topics today relating to policy virtualization. The first is why you would want to virtualize network systems in the first place and what advantages you can expect. I will then discuss some of the key technical challenges which must be addressed before virtualizing a network system such as a policy system. We have spent a lot of time looking into our architecture and seeing how to evolve it to ensure that it can meet the needs of both current and future networks. I’ll then illustrate how these enhancements have led us to deploy one of the largest distributed policy systems in the world, with the entire system running virtualized. Finally, I’ll talk about the future of the network and how virtualization, if it is done right, can enable network evolution towards what we call programmable networks, which are networks that take advantage of NFV and SDN principles and technologies.
So firstly, why? If you look at the computing world as a whole, and particularly the software industry, it has gone through rapid changes over the past 10 years or so. Many of these changes have been enabled through extensive use of virtualization and cloud philosophies.
It has never been easier to build, deploy, manage and upgrade software. It has never been easier to get your software or service into the hands of consumers, be it with an app in one of the app stores, or using SaaS. It has also never been cheaper, due to pay-as-you-grow business models and hardware that is cheaper and easier to manage.
Not only is it cheaper and easier to deploy software, these changes have also facilitated a lot of innovation, as services can test and add new features quickly and easily, again through an update in an app store or by updating a SaaS-type application.
I think it is fair to say that mobile networks haven’t adapted to the same degree as other industries. There are numerous reasons for this. First and foremost, mobile networks are complex things: it takes a tremendous amount of engineering intelligence and know-how to build a scalable and robust mobile (or fixed line for that matter) network. The kinds of services offered in your typical cloud application are far less complex by comparison. Mobile networks also tend to contain legacy equipment, and there has not really been widespread adoption of cloud/virtualization technologies.
And all this is fine, there is nothing inherently wrong with this. There are plenty of examples of industries where software is run and managed in this way, with long-lived systems providing robust and scalable services. And this was all fine in the telecommunications industry when there was relatively little competition. Ok, networks would compete with each other for customers etc, but all of the competition was within the industry, using similar architectures and products. So, while it is always important to keep costs down, adaptable networks weren’t top of the priority list because, well, all networks were similar and had similar problems. In short, there was little or no evolutionary pressure on the network to adapt.
This is not the case anymore, as OTT applications are outmanoeuvring mobile networks and improving on many of the services that mobile networks offer. They can do this partly because these OTT apps are designed and built using virtualization and cloud technologies. There is now significant evolutionary pressure on mobile operators to adapt to this new environment. So, the question really is, how do you adapt?
TODO: DISTILL THIS INTO A SINGLE IMAGE OR SOME TEXT WITH AN IMAGE. I CAN TALK THROUGH THE POINTS ON HERE
In Google/Amazon data centres, the ratio of staff to servers is low; in mobile networks the staff to server ratio is quite high.
From: http://gigaom.com/2013/10/07/the-2020-network-how-our-communications-infrastructure-will-evolve/ :
“The key metric here is number of servers operated by a single system administrator. Today in mobile networks that ratio is around 40:1, this involves individual servers with unique installs, low levels of automation, compliance requirements and time-intensive support requests. To reduce the marginal cost of adding an application, carriers would need to migrate to cloud architectures. Cloud systems offer a unified platform for applications and allow for high levels of automation with server to system administrator ratios greater than 5000:1. The higher the ratio, the more the system administrator’s role becomes that of a high-level software developers – instead of hitting a reset switch they’re finding bugs with the help of custom firmware. The consequence is a massive competency shift in the operations team.”
For a long time, this didn’t matter. It didn’t matter that networks were static or complex or that there was expensive equipment with high operational costs. But all this matters now. What we are seeing here is evolution in action, and the survival of the fittest. For a long time it didn’t matter that networks were unfit, because they were coming under no evolutionary pressure to be fit. Now, however, there is pressure in the form of OTT applications, reduced profitability and commoditisation of connectivity. The advances that have been seen in other companies (like Google, Facebook, Microsoft, etc) have not been seen in mobile networks. To some degree, this was fine when they were just other companies, but now they are direct competitors, so it is not a level playing field. If a smart person at WhatsApp, Skype or wherever comes up with an idea, it will be in their systems in a month or so. That is not true of mobile networks.
So how do you adapt? Well, to many, the adoption of virtualization is seen as a key technology for making networks simpler, cheaper, more efficient, and more adaptable. And, while I don’t think virtualization in itself is a panacea, I do think that virtualization, if it is done correctly, can deliver on many of these promises. What is fundamentally key to this is that virtualization must be done with an eye to the future. It is fairly straightforward to just take some software and run it on a hypervisor. Anybody can virtualize software; that is not a challenge. What is a challenge is ensuring that the software can operate in the manner that we believe is required for current and future networks. So, what are the things that need to be considered?
Well, here are some examples of the kinds of things that our engineering teams have been looking closely at over the past number of years. For example, the management of stateful session data. This is a key difference between a system like a policy system and a request/response web page or application. Sessions on a mobile network may last a long time (hours or days), and each transaction in a policy system must understand the context of that session and what is going on in it. That data therefore needs to be accessible from any virtual machine that can process the session, which means that you need to ensure that your storage mechanisms and your database technology facilitate this.
Go round the houses (briefly). Virtualisation presents a number of challenges and opportunities. Some of the challenges centre around the database technology you use, how you are going to manage the systems, etc. The opportunities that can be taken advantage of are great though. For example, you can use virtualisation to improve how you scale, how you upgrade your systems and how you achieve high availability.
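To make the session-data point concrete, here is a minimal sketch in Python. It is an illustration only, not how our product is implemented: the class names, the 100 MB quota and the in-process dictionary standing in for a shared in-memory database are all assumptions made for the example. The point it shows is that two different virtual machines can process transactions for the same long-lived session because the context lives in a shared store rather than inside either VM.

```python
class SharedSessionStore:
    """Stand-in (assumption) for a shared in-memory database every VM can reach."""

    def __init__(self):
        self._sessions = {}

    def load(self, session_id):
        # Return a copy so callers must save explicitly, as with a real store.
        default = {"used_mb": 0, "events": 0}
        return dict(self._sessions.get(session_id, default))

    def save(self, session_id, state):
        self._sessions[session_id] = dict(state)


class PolicyWorker:
    """One virtual machine processing policy transactions (hypothetical)."""

    def __init__(self, name, store, quota_mb=100):
        self.name = name
        self.store = store
        self.quota_mb = quota_mb  # illustrative quota, not from the talk

    def handle_usage(self, session_id, used_mb):
        state = self.store.load(session_id)   # fetch the session context
        state["used_mb"] += used_mb
        state["events"] += 1
        self.store.save(session_id, state)    # make it visible to other VMs
        return "throttle" if state["used_mb"] > self.quota_mb else "allow"


store = SharedSessionStore()
vm_a = PolicyWorker("vm-a", store)
vm_b = PolicyWorker("vm-b", store)

first = vm_a.handle_usage("session-1", 60)   # handled by one VM
second = vm_b.handle_usage("session-1", 60)  # same session, different VM
print(first, second)
```

The second transaction is throttled even though it lands on a different VM, because the decision is made against the shared session context rather than anything held locally.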
So, how have these factors affected the way that we build our systems? Well, I’ve picked a couple of aspects here that I felt were interesting.
LSU
The Logical Scalable Unit (or LSU) is a key component in our architecture. Each LSU is a self-contained building block of a deployment. An LSU contains a number of virtual machines running Openet software (in this case our policy manager PCRF product). This is the core building block of a virtualized deployment. The LSU has been designed from the ground up to maximise the benefits of virtualization and cloud technologies. Essential factors such as scaling, high availability and upgrades were taken into account when designing the internals of an LSU. Each LSU has well-defined performance characteristics, meaning that it is very easy to understand how many LSUs are required to meet current and future demand.
DATA AND STORAGE
The choice of database technology is a key one when we consider how we scale and also how we manage data inside an LSU and between LSUs. We also closely examined our database and storage mechanisms to favour in-memory storage, both to ensure low latency processing and to facilitate scaling.
ROUTING
Intelligent Diameter routing is central to being able to scale and spread load between LSUs and virtual machines. High availability mechanisms and upgrades both use Diameter routing to allow for seamless handover, and for upgrades that can be done in-service without requiring any additional hardware.
MANO
One of the key components of an NFV architecture is the management and orchestration of the virtual images. As an aside, I think that the success of NFV hinges on the success of the Management and Orchestration group and the functionality that they are currently looking to define. We have built a number of new components that make deployment and management of our systems easier. As this area is likely to be one where there is significant change in the coming years, we have made extensive use of standard APIs to allow integration with other systems.
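The routing idea can be sketched in a few lines of Python. This is a hedged illustration and not our Diameter routing implementation: it uses rendezvous (highest-random-weight) hashing, which is a technique I have swapped in for the example, and the `Router` class and LSU names are hypothetical. It shows the two properties that matter here: a session always routes to the same LSU, and draining one LSU for an in-service upgrade moves only the sessions that were on that LSU.

```python
import hashlib


def pick_lsu(session_id, lsus):
    """Rendezvous hashing: choose the LSU with the highest hash weight."""
    def weight(lsu):
        digest = hashlib.sha256(f"{lsu}:{session_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(lsus, key=weight)


class Router:
    """Hypothetical session-aware router in front of a set of LSUs."""

    def __init__(self, lsus):
        self.active = set(lsus)

    def drain(self, lsu):
        # Take an LSU out of rotation, for example during an upgrade.
        self.active.discard(lsu)

    def restore(self, lsu):
        self.active.add(lsu)

    def route(self, session_id):
        if not self.active:
            raise RuntimeError("no active LSU available")
        return pick_lsu(session_id, sorted(self.active))


router = Router(["lsu-1", "lsu-2", "lsu-3"])
sessions = [f"session-{i}" for i in range(200)]

before = {s: router.route(s) for s in sessions}
router.drain("lsu-2")                      # start an in-service upgrade
after = {s: router.route(s) for s in sessions}

moved = [s for s in sessions if before[s] != after[s]]
print(f"{len(moved)} of {len(sessions)} sessions moved during the drain")
```

For the moved sessions to carry on without loss of service, their state has to be reachable from the LSU they land on, which is why the routing and the data and storage choices go together.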
So what are the benefits of these architectural choices? Well, these have resulted in easier to manage systems owing to our orchestration components. We can also rapidly deploy and upgrade the virtual machines on our LSUs, with deployment of LSUs taking less than 5 hours and in-service upgrades taking approximately 3 hours with no loss of service. The LSU model allows our systems to be hugely scalable, with a sample deployment architected (and tested) to handle 1.1 million TPS in a deployment spread across 11 data centres. This is a massive deployment, and one that is made significantly easier to manage through the use of the technologies I mentioned previously.
What is important to remember here is that these are real, tangible benefits. And this is a central point as I get to the future trends and where we are going in the next couple of years. It is not enough to just look towards NFV or SDN as some panacea for solving all the world’s problems. Virtualization is a key step in achieving the goals of NFV/SDN, but it is a step that provides real tangible benefits, as you can see here.
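Because each LSU has well-defined performance characteristics, sizing a deployment like this comes down to simple arithmetic. The sketch below works through it in Python. The 1.1 million TPS and 11 data centres are the figures quoted above; the per-LSU throughput of 25,000 TPS, the even spread across sites and the 25% headroom are assumptions I have made up purely for the example.

```python
import math


def lsus_required(target_tps, tps_per_lsu, headroom=0.0):
    """Number of LSUs needed to carry target_tps with optional spare capacity."""
    return math.ceil(target_tps * (1 + headroom) / tps_per_lsu)


TOTAL_TPS = 1_100_000          # from the sample deployment
DATA_CENTRES = 11              # from the sample deployment
ASSUMED_TPS_PER_LSU = 25_000   # hypothetical figure, not a product number

# Assumes load is spread evenly across sites, which a real network may not do.
per_site_tps = TOTAL_TPS // DATA_CENTRES

lsus_per_site = lsus_required(per_site_tps, ASSUMED_TPS_PER_LSU)
lsus_with_headroom = lsus_required(per_site_tps, ASSUMED_TPS_PER_LSU, headroom=0.25)

print(per_site_tps)        # 100000
print(lsus_per_site)       # 4
print(lsus_with_headroom)  # 5
```

The same function answers the future-demand question: change the target TPS and the answer is the number of additional building blocks to deploy.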
As I mentioned before, we have applied these principles at a large Tier 1 operator in North America. This is a very progressive customer of ours who is strongly adopting the principles of NFV, and our systems enable them to get there. So, it is possible to do this, and it is possible to do it at extremely large scale.
Here are some nice things that analysts have said about our systems. Some of these people may be sitting in this room, so I’m sure you can ask them further about it.
We use the term programmable networks to describe where we feel mobile networks need to go. Really, a programmable network is a combination of NFV and SDN which enables innovation. This image is adapted from the original NFV white paper, which I’m sure many of you have read at this point. To me, this illustrates how we can use these principles to create a network that is cheaper, easier to manage, and fundamentally allows mobile networks to adapt to any environment.
Virtualization is at the core of our vision for future networks. It is absolutely key that virtualization is adopted in the correct manner to enable these future networks. Without virtualization (or if it is done poorly), many of these improvements will fall flat.
================
Both SDN and NFV are enabling technologies for creating “Programmable Networks” and to varying degrees these build on virtualization. Programmable networks should enable true innovation as well as ease of configuration and management. In essence the network becomes a resource for new services to be created. The areas of Virtualization, NFV, SDN and Programmable Networks are all tightly linked and can be combined to greater effect.
Pundits have long insisted that widespread private cloud buildouts are a foregone conclusion, but reality begs to differ. While the technology to build a private cloud has been available for several years, uptake has been slow.
At first glance, private cloud adoption figures seem relatively robust. Forrester Research, for example, said that 31% of its enterprise customers claim to have a private cloud in place, and another 17% plan to build one in the next 12 months.
But upon closer examination, only 13% of those organizations that report having a private cloud have a "true" private cloud, said Lauren Nelson, an analyst with the firm.
"Most have adopted a software solution that improves their management capabilities," Nelson said. More often than not, those so-called private clouds don't include some key characteristics of a cloud, for example multi-tenancy, end-user self-service or metered usage.
From (http://searchcloudcomputing.techtarget.com/news/2240186850/Misleading-private-cloud-adoption-stats-hide-underlying-problems).
The current wave of virtualization will be succeeded by NFV, which will mean adapting the virtualization infrastructure to adhere to the outputs of NFV.
It is very clear that NFV is not simply about virtualizing your software. That’s the easy part. The hard part is around the management and orchestration of these systems. That means that the software needs to be easy to deploy, manage and upgrade, as well as being rapidly scalable. I think that the Management and Orchestration group (MANO) in NFV is central to the success of NFV. The outputs that they produce will dictate how software is deployed. If the management and orchestration layer is not successful, then a network will end up with a bunch of separate orchestration points, which defeats the aim of having a single centralised orchestration layer.
If NFV is done correctly, then you can start to move towards a programmable network. A programmable network is really what you will get when NFV and SDN combine. It abstracts the complexity of the underlying network and presents a consolidated API to the application layer. This allows common applications and services to be built that are not dependent on the underlying infrastructure. SDN and NFV are enabling technologies in a programmable network. Virtualization and NFV will allow network nodes to be created and configured on the fly, as well as scaled on demand, while protocols like OpenFlow can configure how data is moved around the network. To continue the computing analogy, the control layer acts as a compiler, meaning that generic code can be written that is not dependent on the individual network nodes, the equivalent of writing Java code rather than assembly. The control layer (the Management and Orchestration layer in NFV) is key to the success of this approach. Most vendors will move towards virtualization, but will (initially at least) create their own management/orchestration functionality. While this will ease management, without a common management/orchestration function it will still lead to a complex and difficult to manage network (like what we have now!).
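The compiler analogy can be shown with a small Python sketch. Everything in it is hypothetical: the `ControlLayer` and driver classes, the node names and the command strings are invented for the example and do not correspond to any real vendor interface or to an NFV MANO specification. What it illustrates is the shape of the idea: the application makes one generic request against a consolidated API, and the control layer translates it into whatever each underlying node understands.

```python
from abc import ABC, abstractmethod


class NodeDriver(ABC):
    """Adapter between the control layer and one vendor's node (hypothetical)."""

    @abstractmethod
    def scale(self, function, instances):
        """Return the vendor-specific command for a generic scale request."""


class VendorADriver(NodeDriver):
    def scale(self, function, instances):
        return f"vendor-a: set {function} instances={instances}"


class VendorBDriver(NodeDriver):
    def scale(self, function, instances):
        return f"vendor-b: resize --fn {function} --count {instances}"


class ControlLayer:
    """Consolidated API: applications call this, never the nodes directly."""

    def __init__(self):
        self._drivers = {}

    def register(self, node_name, driver):
        self._drivers[node_name] = driver

    def scale(self, function, instances):
        # "Compile" one generic request into per-node commands.
        return {
            name: driver.scale(function, instances)
            for name, driver in self._drivers.items()
        }


control = ControlLayer()
control.register("site-1", VendorADriver())
control.register("site-2", VendorBDriver())

# The application code is the same whatever sits underneath.
commands = control.scale("pcrf", 4)
for node, command in commands.items():
    print(node, "->", command)
```

If every vendor ships its own equivalent of `ControlLayer`, the application is back to dealing with each one separately, which is the fragmentation risk described above.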
In terms of adoption, here is what we are seeing. At the moment, we are seeing a drive towards virtualisation of the ‘Traditional Software’ components such as PCRF, OCS, OFCS, IMS etc. Many of these are the products that Openet supplies, so we have invested significantly to ensure that our products can be deployed in this manner. This provides a number of advantages such as reduced TCO, scalability, and easier management and upgrades.
In the medium term, we think that virtualization support will extend to more traditional hardware nodes such as the gateways and routers. This will yield the same advantages as virtualization of the software nodes (reduced TCO, scalability etc.). In addition to this, we feel that some elements of SDN/Programmable Networks will start to bleed into mobile networks. This could be the creation of adjunct programmable network cores to support functions like M2M or enterprise services, or enabling a subset of the overall core network.
In the long term, networks will aim to be fully programmable. This will require OpenStack-like platforms that expose common APIs for the whole network. NFV and SDN will push the network and IT sides closer together, as the aim is that they can use common platforms. Note that some of these timelines could shift due to some compelling event which leads to a shift in introducing new services.
The final point that I want to emphasise here is about the industry as a whole. I believe that NFV, and the principles that it advocates, are vital for this industry to evolve in order to reduce costs, to generate new revenue streams, to make networks more adaptable and to compete with new competitors. Unless we, as an industry, embrace this change we simply won’t be able to compete with OTT services or adapt to the needs of future networks.
I really do like the NFV terminology, the way they describe the movement as a ‘call to arms’, and I think that we as an industry must embrace this call. Thank you.
Policy Virtualization: Realizing the Potential
Broadband Traffic Management
13 Nov 2013, London
Openet CTO Office