Problem to tackle
In modern cloud-native environments such as Sylva, Kubernetes cluster management benefits from a high degree of automation. Achieving similar flexibility and automation at the datacenter network layer, particularly within IP and VLAN switch fabrics, remains a significant challenge. Network Functions typically have very complex network connectivity profiles because they must integrate with many existing partner systems, often part of the telco legacy domain. In this context, integrating Network Functions on the datacenter network fabric turns out to be a laborious and error-prone task that nobody is happy with: not the Network Function team, not the platform team, not the network team.
Host Based Routing solution
Sylva recognized this problem early on and started experimenting with Host Based Routing (HBR) in 2024. The concept places a containerized router instance on every Kubernetes host (node). The node then peers directly with an EVPN/VXLAN-capable fabric, thanks to the BGP control plane capabilities of the containerized router. This is a complete change compared to the traditional approach of connecting nodes to the network fabric with static VLAN configuration. Nodes become an integral part of the fabric and participate directly in routing decisions. This opens the possibility of using the Kubernetes Resource Model (KRM) to configure all necessary connections to different VPNs / VRFs and target networks in the data center network, including route leaking configurations, solely within the Kubernetes cluster, without touching the network. The network can thus remain relatively static from the perspective of day 2 operations, as the entire relevant network configuration stays within the Sylva cluster. The containerized router uses the dynamic properties of BGP to provide flexible, declarative connectivity management that answers the connectivity needs of telco workloads.
In Host Based Routing, connectivity demand is declaratively configured through Kubernetes custom resources (1). They are reconciled by the Host Network Operator, which uses standard interfaces (2) to translate this demand into the configuration of the Containerized Routing Agent (CRA). The CRA creates the necessary logical interfaces on the node and programs the routing table in the Linux kernel via Netlink (3). Based on this, the necessary connectivity is established on layer 3 towards the network fabric (4) and on layer 2 towards the pods/containers (5). This eliminates the need for complicated network automation, since its main use cases are covered within the Kubernetes cluster. Furthermore, it enables flexible and dynamic allocation of compute nodes within the data center, without constraints imposed by pre-configured switches.
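The translation step from declarative demand to router configuration can be illustrated with a minimal sketch. It is not the HBR implementation: the `VrfAttachment` fields, the `FakeRoutingAgent` class and the `reconcile` function are hypothetical names invented for this illustration, and the actual HBR custom resource schema and operator-to-CRA interface are not described here. The rendered text follows FRR-style syntax for an L3 VNI and VRF route leaking, as one plausible target format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VrfAttachment:
    """Hypothetical connectivity demand: attach a node to one fabric VRF."""
    name: str                # VRF name on the node
    vni: int                 # L3 VNI identifying the VRF in the EVPN fabric
    import_from: tuple = ()  # VRFs whose routes are leaked into this one


def render_frr_config(local_asn: int, vrfs: list) -> str:
    """Translate the declarative demand into FRR-style configuration text."""
    lines = []
    # Sort by name so the same demand always renders to the same text.
    for vrf in sorted(vrfs, key=lambda v: v.name):
        lines += [f"vrf {vrf.name}", f" vni {vrf.vni}", "exit-vrf", "!"]
        lines += [f"router bgp {local_asn} vrf {vrf.name}",
                  " address-family ipv4 unicast"]
        lines += [f"  import vrf {src}" for src in vrf.import_from]
        lines += [" exit-address-family",
                  " address-family l2vpn evpn",
                  "  advertise ipv4 unicast",
                  " exit-address-family",
                  "exit", "!"]
    return "\n".join(lines)


class FakeRoutingAgent:
    """Stand-in for a CRA; a real agent would also program the kernel."""

    def __init__(self):
        self.running_config = ""
        self.reloads = 0

    def apply(self, config: str) -> None:
        self.running_config = config
        self.reloads += 1


def reconcile(agent: FakeRoutingAgent, local_asn: int, vrfs: list) -> bool:
    """Apply the desired config only if it differs; return True on change."""
    desired = render_frr_config(local_asn, vrfs)
    if desired == agent.running_config:
        return False
    agent.apply(desired)
    return True
```

Calling `reconcile` twice with the same demand changes the agent only once, which is the property that lets connectivity be managed declaratively: the demand is the source of truth, and the node converges to it without any change on the switches.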
HBR Origins and Evolution
Originally conceptualized and developed internally by Deutsche Telekom and based on Free Range Routing (FRR) as the CRA, HBR has been in production since 2022. It is the default networking and connectivity solution for more than 300 bare metal Kubernetes clusters that host critical network functions such as the 5G SA Core, enabling zero-touch configuration of the most complex connectivity scenarios.
To extend this ecosystem, Deutsche Telekom, 6WIND and Orange initiated a collaboration within Sylva to prove the usability and viability of HBR beyond its use at Deutsche Telekom. The Proof-of-Concept was carried out hands-on in the Sylva Validation Center in Paris. It was executed in two phases, both of which were recently closed successfully.
The first success of this collaboration was the integration of 6WIND’s Virtual Service Router (VSR) as the Containerized Routing Agent within the HBR setup, although without full integration with the Host Network Operator. The goal was to show that an enterprise-grade virtual router like VSR can easily take the role of the CRA and even enable advanced functionalities.
“As virtual router pioneer and early contributor to Free Range Routing project, we have been keen to show that our product VSR can successfully work in HBR solution and add new possibilities on top of providing connectivity, such as filtering and egress capabilities” – Jean Mickaël Guerin, 6WIND CTO
“We have implemented EVPN/VXLAN based fabric within our lab, deployed HBR solution and tested numerous scenarios and complex connectivity setups both functionally and performance wise over several months. Not only they all worked very well, but during that entire time we did not need to touch the network fabric configuration even once.” – Vlad Onutu, Orange Cloud Network Lead
In the second phase, the latest HBR version 2.0 was deployed, including full integration of the Host Network Operator, this time using the FRR-CRA reference implementation. The goal was to demonstrate the declarative model of fulfilling connectivity demand via HBR custom resources and the possibilities that come with it (e.g. GitOps-based configuration or connectivity self-service).
“We have recently released version 2.0 of HBR which enables seamless pluggability of different CRAs (e.g. enterprise versions), by offering stable integration API with host network operator. It also contains open-source reference implementation based on FRR container, to facilitate experimentation and testing. The phase two of HBR PoC was proof for us that HBR is mature enough to be used in cloud environments beyond Deutsche Telekom, like Sylva.” – Christopher Dziomba, Lead HBR Architect, Deutsche Telekom.
“HBR 2.0 solution proved to be easy to install and use, things working after the setup of the values. Pods with macvlan interfaces in various vlans can communicate by leveraging the EVPN/VXLAN fabric built between the cluster (frr-cra pods) and the switches.” – Vlad Onutu, Orange Cloud Network Lead
Next steps
- Deutsche Telekom, as a founding member of Sylva, intends to move the HBR development activities to Sylva and make the code available under the Apache 2.0 license, fostering open collaboration and adoption.
- An extension of HBR capabilities from IP layer (BGP) routing to Layer 2 VLAN automation is planned, to enable the use of part of its functionality in traditional network fabric designs.
Takeaway
HBR aims to enhance Telco Cloud environments by automating network connectivity for telco workloads and enabling seamless, BGP-based IP datacenter interconnections.