

## Segment Routing Proxy Device Implemented Using P4 on FPGA with Zero CPU Overhead

#### Mirek Walukiewicz miroslaw.Walukiewicz@intel.com

Big Thanks to Pavel Benacek, Jan Kubalek and Bert Klaps from Intel

## Segment Routing v6 overview



SRv6 is a new powerful protocol with many emerging use-case. It is considered as successor of well-known GTP-U protocol. It is very useful for implementing 5G network slicing.



## Software implementation of the SRv6-proxy



Using P4 and FPGA based SmartNICs we can make SRv6-proxy system more efficient

## Why P4 for SRv6-proxy implementation using FPGA?

There are SmartNIC that can implement any reasonable networking pipeline.

P4 allows to define many classes of networking pipelines.

Programming FPGAs is complex, in general.

Very RTL skilled engineers are necessary to make FPGA projects.

FPGA projects with complexity like SRv6 proxy can take months to complete.

Using High Level Synthesis approach like P4 can reduce development time to weeks.

No RTL skilled engineers are required to complete such work on SmartNIC.

## So do it with FPGA and P4

## HW platform for SRv6-proxy implementation in P4



No CPU cores spent on SRv6 proxy processing! Everything done in HW.



# Host->Network Transformation Table



Biggest HW acceleration challenge -

Big exact match tables containing millions of entries

Table Key is simple, but actions parameters are very big due to SRv6 related definitions

# Network->Host Transformation Table



Big challenge for HW acceleration: very big exact match tables (2M) and large size keys due to full SRv6 header.

#### **SRv6 test configurations - BIDIRECTIONAL TRAFFIC**





#### Achievements to share:

- Full SRv6 proxy acceleration presented supporting millions of SRv6
  paths using FPGA based SmartNIC
- No SW overhead for SRv6 processing by host application connects the embedded NIC SR-IOV interface.

#### Learnings from the project:

- FPGA acceleration made by engineers without Verilog/VHDL knowledge thanks to use P4 translation.
- Possibility to demonstrate use of very large exact match tables in DDR4 difficult to achieve by ASICs.
- FPGA-based SmartNIC development is much faster with P4.



## Thank You!

Mirek Walukiewicz miroslaw.Walukiewicz@intel.com