Universal CPE (uCPE) is an industry hot topic as it brings more flexibility in the subscription and deployment of value-added services for end-customers through virtualization.

It consists of a single platform running virtual network functions (VNFs) in place of multiple dedicated appliances. The hardware is built from commercial off-the-shelf (COTS) servers, which brings another strong advantage: a single, commonly used hardware platform can be sourced globally.

Many industry articles list the benefits of uCPE within the growing trend towards a flexible and programmable network. However, one of the main objections to uCPE from Service Providers is: “will the uCPE meet my customer’s performance requirements?”

This question is valid because the uCPE must handle network-intensive tasks, including switching between VNFs at the host level. When VNFs are chained, more switching operations are needed. Additionally, performance must be deterministic to avoid transient issues that are difficult to analyze and troubleshoot.

All-in-one uCPE appliances are very price-sensitive, and a Telecom Equipment Manufacturer (TEM) entering the uCPE business cannot afford to build its solution on a platform that won’t match Service Providers’ TCO expectations. I have personally been involved in many of these discussions with TEMs, and understand how critical the most efficient switching/routing stack is to keeping the best possible price/performance ratio and staying competitive in this market.

Routing is a must-have feature for uCPE. To optimize the design, some TEMs decide to embed the mandatory vRouter directly in the infrastructure. This has multiple advantages: it avoids additional switching operations to and from a dedicated vRouter VNF, and it allows end-to-end Quality of Service (QoS) to be managed properly if congestion occurs along the VNF chain.

TEMs may also want to add a low-cost physical CPE design, without virtualization capabilities, based on the same architecture, routing software and management.

This blog post details a test that compares the efficiency of the Linux networking stack with 6WIND’s vRouter stack for uCPE solutions. The goal is to determine the routing capability of a single Atom C2000 CPU core in a uCPE use case, and so to understand the savings an efficient fast path stack brings over Linux. No VNFs are used in this test, for the sake of simplicity; the aim is to showcase a low-cost CPE design based on a COTS server. Note that 6WIND’s vRouter can also combine switching features (by offloading the Linux Bridge or Open vSwitch data plane from the kernel) with routing. The test concludes that 6WIND vRouter delivers 7X the performance of Linux for uCPE.

6WIND vRouter: Fast Path Networking Stack for uCPE

6WIND’s networking stack is designed as an acceleration engine for the Linux networking stack, offloading network processing from the Linux stack into what we refer to as our fast path. 6WIND’s fast path runs on a dedicated set of cores (only one in this example). It has very limited impact on the system’s management: standard Linux commands are used to configure networking, and we demonstrate that our fast path properly reflects the Linux networking stack’s state. The advantages of such a design for uCPE are huge, and summarized below.

One of the drawbacks with other projects such as OVS-DPDK or VPP is that they are designed as standalone stacks. This makes it very complex to mix Ethernet interfaces natively managed by those technologies with LTE or USB interfaces that may not be usable with those stacks. 6WIND’s vRouter natively supports this mix by design.

uCPE Benchmark: 6WIND vRouter vs Linux

The test was performed using an Intel Atom(TM) CPU C2758 running at 2.40GHz. A single core of this CPU is dedicated to the forwarding plane for testing in the fast path configuration (FP_MASK):

root@alicante:~# fp-conf-tool -D -S

: ${FP_MASK:=1}

: ${FP_PORTS:='0000:00:14.0 0000:00:14.1 0000:00:14.2 0000:00:14.3'}

Three 1Gbps ports are used as LAN ports, while a single 1Gbps port is used for WAN.

These 4x1Gbps ports are connected to an IXIA to generate the traffic and measure the performance.
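For context when reading packet-rate figures on 1Gbps ports, the theoretical ceiling can be worked out from the link speed and the frame size. The sketch below is only illustrative: the helper name `line_rate_pps` and the frame sizes are assumptions, not values taken from this test. It relies on the fact that every Ethernet frame occupies an extra 20 bytes on the wire (preamble, start-of-frame delimiter and inter-frame gap).

```shell
#!/bin/sh
# Theoretical maximum packets per second on an Ethernet link.
# Each frame occupies frame_bytes + 20 bytes on the wire:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
# line_rate_pps is a hypothetical helper name used for illustration only.
line_rate_pps() {
    link_bps=$1      # link speed in bits per second
    frame_bytes=$2   # Ethernet frame size in bytes, including FCS
    echo $(( link_bps / ((frame_bytes + 20) * 8) ))
}

# Frame sizes below are common benchmark sizes, assumed for illustration.
for size in 64 512 1518; do
    echo "$size-byte frames: $(line_rate_pps 1000000000 "$size") pps per 1Gbps port"
done
```

With 64-byte frames, a single 1Gbps port tops out at roughly 1.49 million packets per second in each direction, which is why small-packet routing is the demanding case for a single CPU core.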

1: Let’s start with the uCPE configuration

# Create LAN bridge

brctl addbr lan

brctl addif lan enp0s20f0

brctl addif lan enp0s20f1

brctl addif lan enp0s20f2

# Rename WAN port

ip link set dev enp0s20f3 name wan

# Change MACs

ip link set dev lan address 00:0C:C3:11:22:33

ip link set dev wan address 00:0C:C3:44:55:66

# Interfaces up

ip link set dev enp0s20f0 up

ip link set dev enp0s20f1 up

ip link set dev enp0s20f2 up

ip link set dev lan up

ip link set dev wan up

# IP address on LAN

ip address add 192.168.1.254/24 dev lan

# IP address on WAN

ip address add 172.16.1.254/24 dev wan

# Add NAT on WAN interface

iptables -t nat -A POSTROUTING -o wan -j MASQUERADE

# Add 3 fake hosts on LAN

ip neighbor add 192.168.1.1 lladdr 00:0c:c3:11:11:11 dev lan nud permanent

ip neighbor add 192.168.1.2 lladdr 00:0c:c3:22:22:22 dev lan nud permanent

ip neighbor add 192.168.1.3 lladdr 00:0c:c3:33:33:33 dev lan nud permanent

# Add fake neighbor on WAN

ip neighbor add 172.16.1.100 lladdr 00:0c:c3:44:44:44 dev wan nud permanent

# Create a strict priority scheduler on wan egress interface, and shape @ 1Gbps

fp-cli qos-sched-add wan prio 4 rate 1g

# Mark voice traffic as prio 1

iptables -t mangle -A POSTROUTING -m dscp --dscp 0x2e -j MARK --set-xmark 0x1

# Mark video traffic as prio 2

iptables -t mangle -A POSTROUTING -m dscp --dscp 0x0a -j MARK --set-xmark 0x2
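The two DSCP values matched by the marking rules above are given in hexadecimal, which can be hard to relate to what a traffic generator or packet capture displays. As a small aid, the sketch below converts a DSCP value to decimal and to the corresponding ToS byte (DSCP occupies the upper six bits of that byte). The helper name `dscp_info` is hypothetical.

```shell
#!/bin/sh
# Print a DSCP value in decimal together with the ToS / Traffic Class byte
# that carries it (the DSCP field is the upper 6 bits, hence the shift by 2).
# dscp_info is a hypothetical helper name used for illustration only.
dscp_info() {
    dscp=$(( $1 ))
    printf 'dscp=%d tos=0x%02x\n' "$dscp" $(( dscp << 2 ))
}

dscp_info 0x2e   # voice rule: DSCP 46, the Expedited Forwarding (EF) code point
dscp_info 0x0a   # video rule: DSCP 10, the AF11 code point
```

So a test stream meant to hit the voice rule should carry DSCP 46 (ToS byte 0xb8), and one meant to hit the video rule should carry DSCP 10 (ToS byte 0x28).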

2: Let’s dump the fast path states to check everything is properly in sync

We first check that the Linux Bridge has been created in the fast path and that the three LAN ports have been added to it:

root@alicante:~# fp-cli bridge

Bridge interfaces:

lan-vr0:

nf_call_iptables off:

nf_call_ip6tables off:

enp0s20f2-vr0: master lan-vr0

state: forwarding

features: learning flooding

enp0s20f1-vr0: master lan-vr0

state: forwarding

features: learning flooding

enp0s20f0-vr0: master lan-vr0

state: forwarding

features: learning flooding

We then validate that the IP addresses and routes are synchronized:

root@alicante:~# fp-cli addr4 lan

number of ip address: 1

192.168.1.254 [2]

root@alicante:~# fp-cli addr4 wan

number of ip address: 1

172.16.1.254 [1]

root@alicante:~# fp-cli route4 type all table 254

# - Preferred, * - Active, > - selected

(254) 0.0.0.0/0 [03] NEIGH gw 10.16.18.9 via mgmt0-vr0 (11)

(254) 10.16.18.0/24 [07] CONNECTED via mgmt0-vr0 (12)

(254) 172.16.1.0/24 [40] CONNECTED via wan-vr0 (43)

(254) 192.168.1.0/24 [38] CONNECTED via lan-vr0 (41)
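When this check is repeated often, it can help to reduce the dump to a plain list of prefixes that is easy to compare with the kernel's view. The following is a minimal sketch that assumes the output format shown above, where each route line starts with the table ID in parentheses followed by the prefix. The helper name `fp_prefixes` is hypothetical, and the sample input is the output above.

```shell
#!/bin/sh
# Extract destination prefixes from "fp-cli route4"-style output on stdin.
# Assumes route lines look like "(254) 10.16.18.0/24 [07] CONNECTED via ...";
# legend and prompt lines do not match and are skipped.
# fp_prefixes is a hypothetical helper name used for illustration only.
fp_prefixes() {
    awk '$1 ~ /^\([0-9]+\)$/ { print $2 }' | sort
}

fp_prefixes <<'EOF'
# - Preferred, * - Active, > - selected
(254) 0.0.0.0/0 [03] NEIGH gw 10.16.18.9 via mgmt0-vr0 (11)
(254) 10.16.18.0/24 [07] CONNECTED via mgmt0-vr0 (12)
(254) 172.16.1.0/24 [40] CONNECTED via wan-vr0 (43)
(254) 192.168.1.0/24 [38] CONNECTED via lan-vr0 (41)
EOF
```

On a live system the same function could read from `fp-cli route4 type all table 254` directly, and the result could be compared with `ip -4 route show table main` (table 254 is the Linux main table). Note that `ip` prints the default route as `default` rather than `0.0.0.0/0`, so that entry needs to be normalized before comparing.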

Finally, on the filtering side, we have our NAT rule, and our two classification rules:

root@alicante:~# fp-cli nf4-rules nat

Chain PREROUTING (policy ACCEPT 0 packets 0 bytes)