In this blog post, we discuss Next-Gen Management Frameworks (NGMF) and what 6WIND is doing to make our vRouters ready for them.
Why Next-Gen Management Frameworks
During my career at 6WIND, I have witnessed the transformation of the networking and telecom industry. vRouters have progressively replaced hardware routers, because standard servers running high-performance networking software now provide the same level of features and performance for a fraction of the price.
As a result, more and more network deployments happen on standard servers in datacenters. Both service providers and enterprise customers leverage their existing infrastructure to host numerous and complex network functions and decrease CapEx by sharing hardware resources. The IT and networking worlds converge.
One of the benefits brought by IT management is automation, which also decreases OpEx. Datacenter admins require centralized management interfaces. This is standard for SaaS applications, and the same requirements naturally extend to networking applications such as SD-WAN, vCPE and vEPC.
Such a Next-Gen Management Framework has to take care of inventory, deployment, configuration, monitoring and lifecycle management for the routers, regardless of their brand, whether they are physical or virtual, and the infrastructure management platform they run on.
In the next section, I will give an overview of NGMF features and the matching requirements to make vRouters ready. Then, I will present the corresponding 6WIND vRouter roadmap.
Features of Next-Gen Management Frameworks
The NGMF is deployed privately on premises by the customer, or in the cloud, either for a single customer or with multi-tenant support (each customer logs in with their own credentials). It must be able to interact with bare metal servers and virtual machines in different virtual infrastructure environments (OpenStack, AWS, Google Cloud, VMware vCloud, etc.).
It provides the following features:
Deployment: installation of software appliances on physical servers or cloud environment, with inventory management
License management/enforcement: license key management with usage control; appliances contact the NGMF at startup to check their entitlement
Monitoring/Analytics: recording and monitoring of statistics, alerts and faults; gives a graphical view of the overall status of the system
Configuration: intent-based networking configuration of the devices, based on standard APIs and protocols (YANG, NETCONF, RESTCONF)
Lifecycle management: start/stop/restart, update, scaling and healing of network functions
APIs: provided for all features for integration with third-party, custom or legacy management frameworks and orchestration tools
Deployment takes care of Day-0 / Day-1 in the lifecycle of the software appliance.
Day-0 consists of the first installation of the appliance in the target environment and sets the ground for Day-1. Different technologies are involved depending on the target environment, as shown in the figure above: PXE/ISO for bare metal deployment, QCOW and OVF/OVA packages for private clouds using OpenStack/KVM or VMware, and cloud-provider-specific APIs for Amazon Web Services, Google Cloud Platform or Microsoft Azure.
Day-1 is the initial system and network configuration for basic access to the device; it sets the ground for Day-2, which is detailed later in this post. Day-1 mostly relies on cloud-init for initial networking and remote access configuration.
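To make the Day-1 step more concrete, here is a minimal Python sketch that renders a cloud-init user-data document. The host name, user name and key are hypothetical placeholders, and the choice of settings (host name, one admin user with an SSH key, password login disabled) is an assumption about what "basic access" might mean, not the actual 6WIND Day-1 configuration. The sketch emits JSON after the `#cloud-config` header, relying on the fact that this simple JSON structure is also valid YAML.

```python
import json


def render_user_data(hostname, admin_user, ssh_public_key):
    """Render a cloud-init user-data document for basic remote access.

    All values are illustrative; a real deployment would add network
    settings that match the target environment.
    """
    config = {
        "hostname": hostname,
        # Disable SSH password authentication; access is key-based only.
        "ssh_pwauth": False,
        "users": [
            {
                "name": admin_user,
                "sudo": "ALL=(ALL) NOPASSWD:ALL",
                "ssh_authorized_keys": [ssh_public_key],
            }
        ],
    }
    # cloud-init recognizes cloud-config data by this first line.
    return "#cloud-config\n" + json.dumps(config, indent=2) + "\n"


if __name__ == "__main__":
    print(render_user_data(
        "vrouter-01", "admin", "ssh-ed25519 AAAAexample admin@example"
    ))
```

The resulting text would typically be passed as user data when the VM is created, through whatever mechanism the target platform offers.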
The NGMF also provides access to the server / OpenStack / cloud platform inventory for deployment. Appliances can be seen on a map and clicked to access their details and manage them individually.
vRouters register with the NGMF on startup. License entitlement is checked (token-based to support VMs), and the license is delivered and installed (for example using Ansible).
In conjunction with monitoring, the NGMF can also check that the customer does not exceed the limits of the license (throughput, number of tunnels), and even track actual usage, with the customer's agreement of course, to enable innovative business models (pay per use with a fine-grained feature set).
Monitoring is key in controlling the status of the network and anticipating and analyzing issues. vRouters must provide statistics and alerts and it must be possible to track these values over time.
To that end, vRouters export key performance indicators (KPIs) to a time series database (TSDB) for analysis through a web-based frontend.
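As an illustration of what exporting a KPI can look like, the sketch below formats one sample in the InfluxDB line protocol (InfluxDB is the TSDB mentioned later in this post). The measurement, tag and field names are made up for the example, and the escaping covers only the common cases (commas, spaces and equals signs); it is a sketch of the format, not the exporter shipped in the vRouter.

```python
def _escape(text):
    """Escape characters that are special in tag/field keys and tag values."""
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def _format_field(value):
    """Format a field value according to its Python type."""
    if isinstance(value, bool):  # must be tested before int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an 'i' suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build one line: measurement,tags fields timestamp."""
    tag_part = "".join(
        f",{_escape(k)}={_escape(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{_escape(k)}={_format_field(v)}" for k, v in sorted(fields.items())
    )
    name = measurement.replace(",", "\\,").replace(" ", "\\ ")
    return f"{name}{tag_part} {field_part} {timestamp_ns}"


if __name__ == "__main__":
    # Hypothetical interface counters for one vRouter.
    print(to_line_protocol(
        "ifstats",
        {"router": "vr1", "ifname": "eth0"},
        {"rx_bytes": 1200, "up": True},
        1700000000000000000,
    ))
```

Tags identify the series (which router, which interface), while fields carry the measured values, which is what lets the frontend track a KPI per device over time.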
In addition to KPIs, alert/fault management is required and involves several mechanisms, both on-device (logs, health check) and as part of the analytics frontend (thresholds on KPIs).
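The threshold mechanism on the analytics side can be sketched in a few lines of Python. The KPI names, limits and severities below are hypothetical; a real frontend would usually evaluate such rules over a time window rather than a single sample.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    kpi: str        # name of the KPI to watch (hypothetical names below)
    limit: float    # alert when the sampled value is strictly above this
    severity: str   # e.g. "warning" or "critical"


def evaluate(sample, thresholds):
    """Return one alert per threshold exceeded by the latest KPI sample.

    KPIs missing from the sample are skipped rather than treated as faults.
    """
    alerts = []
    for threshold in thresholds:
        value = sample.get(threshold.kpi)
        if value is not None and value > threshold.limit:
            alerts.append({
                "kpi": threshold.kpi,
                "value": value,
                "limit": threshold.limit,
                "severity": threshold.severity,
            })
    return alerts


if __name__ == "__main__":
    rules = [
        Threshold("cpu_percent", 90, "warning"),
        Threshold("ipsec_tunnels_down", 0, "critical"),
    ]
    print(evaluate({"cpu_percent": 93, "ipsec_tunnels_down": 0}, rules))
```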
Day-0 and Day-1 configuration are part of deployment, as explained previously. Once the vRouter is deployed and running, the next step (Day-2) is to configure it for a particular service (a CPE, a BNG, a security gateway).
This can be done per device, for example through a Web interface or the CLI. In that case, the administrator has to log in to each device (or develop a script that does it) to enter configuration commands.
This can be acceptable for a couple of vRouters and a pretty static configuration, but as soon as more than a handful of devices have to be managed, or when the configuration varies, programmatic interfaces like NETCONF and RESTCONF provide a higher level of abstraction with a YANG-based data model for integration with third-party management tools.
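To show what a programmatic interface looks like in practice, the Python sketch below builds a NETCONF `<edit-config>` request with the standard library. Since this post does not show the vRouter's own YANG model, the example uses the standard ietf-interfaces model as a stand-in, and it assumes the interface already exists and that the device accepts the chosen target datastore. Sending the request over a NETCONF session is left out.

```python
import xml.etree.ElementTree as ET

NC_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"
IF_NS = "urn:ietf:params:xml:ns:yang:ietf-interfaces"


def build_edit_config(message_id, if_name, description, enabled=True,
                      target="running"):
    """Build an <edit-config> RPC that updates one existing interface."""
    rpc = ET.Element(f"{{{NC_NS}}}rpc", {"message-id": str(message_id)})
    edit = ET.SubElement(rpc, f"{{{NC_NS}}}edit-config")

    # Datastore to modify, e.g. "running" or "candidate".
    target_el = ET.SubElement(edit, f"{{{NC_NS}}}target")
    ET.SubElement(target_el, f"{{{NC_NS}}}{target}")

    # Configuration payload, structured according to the YANG data model.
    config = ET.SubElement(edit, f"{{{NC_NS}}}config")
    interfaces = ET.SubElement(config, f"{{{IF_NS}}}interfaces")
    interface = ET.SubElement(interfaces, f"{{{IF_NS}}}interface")
    ET.SubElement(interface, f"{{{IF_NS}}}name").text = if_name
    ET.SubElement(interface, f"{{{IF_NS}}}description").text = description
    ET.SubElement(interface, f"{{{IF_NS}}}enabled").text = (
        "true" if enabled else "false"
    )
    return ET.tostring(rpc, encoding="unicode")


if __name__ == "__main__":
    print(build_edit_config(101, "eth0", "uplink to core"))
```

Because the payload follows a published data model, the same function works for any device that implements that model, which is the abstraction a management tool needs.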
And as the industry moves from per-device management (e.g. in a SeGW use case, connecting to each tunnel endpoint to configure IPsec) to intent-driven management (e.g. specifying the tunnel characteristics and letting the NGMF take care of per-device configuration), TOSCA and orchestrators enter the game. There is however no direct impact on the vRouter itself, as the orchestrator ends up using NETCONF/YANG or, still in many cases, the CLI through an SSH session.
Start/stop/restart and scaling mostly apply to Virtual Network Functions (VNFs), that is, vRouters that run as network functions. ETSI NFV specifies how to package VNFs by wrapping the appliance image in a descriptor that states its requirements in terms of CPU, memory, storage, etc. The VNF also has to provide the right APIs to be started and stopped.
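The difference between the two styles can be sketched as follows: the operator states a single tunnel intent, and the framework derives one configuration per endpoint. The data structure and field names are hypothetical and kept deliberately small; a real orchestrator would then push each result to its device, for instance over NETCONF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TunnelIntent:
    """What the operator wants, with no per-device detail."""
    name: str
    endpoint_a: str    # management address of the first vRouter
    endpoint_b: str    # management address of the second vRouter
    esp_proposal: str  # e.g. "aes256-sha256"


def expand_intent(intent):
    """Derive the per-device IPsec settings from a single intent.

    Both ends share the tunnel parameters; only the local/remote roles
    are mirrored.
    """
    common = {"tunnel": intent.name, "esp_proposal": intent.esp_proposal}
    return {
        intent.endpoint_a: {
            **common, "local": intent.endpoint_a, "remote": intent.endpoint_b
        },
        intent.endpoint_b: {
            **common, "local": intent.endpoint_b, "remote": intent.endpoint_a
        },
    }


if __name__ == "__main__":
    intent = TunnelIntent("hq-branch1", "192.0.2.1", "198.51.100.1",
                          "aes256-sha256")
    for device, config in expand_intent(intent).items():
        print(device, config)
```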
Scaling can be implemented in many ways depending on the use case, from flexible core allocation to dynamic VM spawning with a load balancer when the VNF is composed of several Virtual Network Function Components (VNFC).
Healing is also part of lifecycle management. A radical way to heal a VNF may be to restart it, especially as a VM does not take long to start up. A more subtle approach, better suited to bare metal deployments, combines auto-healing on the device with healing from the NGMF in reaction to an alert received in the analytics framework (e.g. triggering IKE renegotiation on both ends when an IPsec tunnel fails).
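The two healing strategies can be expressed as a small decision function. The alert format and action names are invented for this sketch; it only shows the idea of mapping a known fault to a targeted action and falling back to a restart otherwise.

```python
def plan_healing(alert):
    """Map an alert to a list of (action, device) healing steps.

    A failed IPsec tunnel triggers IKE renegotiation on both endpoints;
    any other alert falls back to restarting the affected VNF.
    """
    if alert["type"] == "ipsec_tunnel_down":
        return [("renegotiate_ike", endpoint) for endpoint in alert["endpoints"]]
    return [("restart_vnf", alert["device"])]


if __name__ == "__main__":
    print(plan_healing({
        "type": "ipsec_tunnel_down",
        "device": "vr1",
        "endpoints": ["vr1", "vr2"],
    }))
    print(plan_healing({"type": "health_check_failed", "device": "vr3"}))
```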
As we see, the NGMF leverages multiple APIs and tools from the vRouter to implement its features. We have listed Web, CLI, Linux shell, NETCONF, RESTCONF, YANG, cloud-init, KPIs and TSDB, Ansible.
The NGMF itself is very likely to be integrated with other frameworks, from legacy NMS to state-of-the-art cloud management systems, by way of NFVO/VNFM and VIMs. Therefore, it has to expose its capabilities for deployment, license management, monitoring and lifecycle management through APIs too.
So, What’s The Impact on 6WIND vRouters?
At 6WIND, we are making our vRouters compliant with NGMF.
As an example, you can check this blog summarizing the features of our latest release to see how we have introduced support for Telemetry and integration with InfluxDB/Grafana. Our next release will provide extensive alert/fault monitoring to help the user identify faulty or diverging behavior and take remedial action before something worse occurs.
We are also working on NETCONF/YANG support for the next release, and RESTCONF will follow. This will replace our current XML-based management, which is quite efficient yet proprietary. By relying on YANG data models, we will ease the integration with external management tools. Internally, we will also simplify our developments: for example, the CLI will be automatically derived from the YANG files.
We also plan to extend our current license management system to provide token-based management with a central licensing server, allowing fully automated instantiation and deployment of vRouters, without any manual intervention to install the license file.
Lifecycle management will come next as we progress with our telco customers and partners. This is still a work in progress.
I hope this topic caught your interest. We would be happy to get your feedback.
Yann Rapaport is Vice President of Product Management for 6WIND.