Euro P4 2022
Dec 09, 2022 | Rome, Italy
EuroP4 2022 took place on December 9, 2022, in conjunction with CoNEXT 2022 (December 6-9, 2022) in Rome, Italy.
The 5th European P4 Workshop (EuroP4) brought together networking researchers to discuss cutting-edge P4-based research and projects, P4-based tools, and the needs of the community. The workshop created an opportunity to forge connections between researchers, introduce more networking researchers to the P4 community, and seed future top-tier publications and innovation.
View proceedings from the Euro P4 Workshop.
Networks and the services they support form the communication backbone of our society, and it is important that potential Distributed Denial of Service (DDoS) attacks are detected quickly, in order to avoid or minimize the impact they may have on the availability of services. Recent technological advances in programmable networks, specifically the programmability of data planes in switches and routers, have made available new ways of detecting such attacks. Building on this capability, this paper proposes using a Random Forest (RF) to quickly and accurately detect DDoS attacks in a programmable switch. Random forests use several classification trees, each of which independently classifies an input as one of a set of classes. Here, each decision tree classifies a network flow as either potentially malicious, i.e. part of a DDoS attack, or a legitimate user flow. Despite using multiple classification trees to improve accuracy, random forests are relatively lightweight, with each tree requiring only a few simple computations to arrive at a classification. Our results show that even small RFs, requiring as few as 63 match+action table entries, can achieve F1-Scores of over 90%.
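To make the per-flow voting concrete, here is a minimal sketch of a random forest whose trees are stored as small lookup tables, loosely mirroring how tree nodes could map to match+action entries. The feature names (pkts_per_sec, avg_pkt_len), thresholds, and tree shapes are hypothetical and are not taken from the paper; only the majority-vote principle comes from the abstract.

```python
from collections import Counter

# Hypothetical trees: each is a table keyed by node id. An entry is either
#   ("split", feature, threshold, next_if_le, next_if_gt) or ("leaf", label).
# Features and thresholds are illustrative assumptions, not the paper's.
TREES = [
    {0: ("split", "pkts_per_sec", 1000, 1, 2),
     1: ("leaf", "benign"),
     2: ("leaf", "attack")},
    {0: ("split", "avg_pkt_len", 100, 1, 2),
     1: ("leaf", "attack"),
     2: ("leaf", "benign")},
    {0: ("split", "pkts_per_sec", 500, 1, 2),
     1: ("leaf", "benign"),
     2: ("split", "avg_pkt_len", 200, 3, 4),
     3: ("leaf", "attack"),
     4: ("leaf", "benign")},
]


def classify_tree(tree, flow):
    """Walk one tree from the root (node 0) until a leaf is reached."""
    node = 0
    while True:
        entry = tree[node]
        if entry[0] == "leaf":
            return entry[1]
        _, feature, threshold, next_le, next_gt = entry
        node = next_le if flow[feature] <= threshold else next_gt


def classify_forest(trees, flow):
    """Each tree votes independently; the majority label wins."""
    votes = Counter(classify_tree(tree, flow) for tree in trees)
    return votes.most_common(1)[0][0]
```

Using an odd number of trees avoids ties when there are only two classes, which is why the sketch uses three.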
A slow HTTP POST attack is an application-layer distributed denial-of-service attack targeting web servers. The attacker simulates a legitimate user with a slow network speed and continues to send requests, resulting in server resources being unavailable for a long time to other users. The similarity to legitimate behavior makes it challenging to identify such attack traffic. To address this issue, this paper proposes a responsive defense mechanism that exploits programmable network devices to identify attack traffic based on HTTP headers. With information that is not available from legacy network devices, this method can identify different types of requests and apply limitations. This approach achieves a distributed, source-based defense capability by utilizing data plane programmability, making it a scalable solution. The simulation results show that the approach is effective and accurate against slow HTTP POST attacks.
This paper presents an implementation of a practical cryptographic primitive based on ChaCha on a Tofino programmable switch. A key challenge is optimizing the implementation by leveraging the structure of ChaCha operations and the hardware features of Tofino. Our implementation outperforms the AES-based approach in performance, has a smaller memory footprint, and achieves up to 203 Gbps of throughput.
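For readers unfamiliar with the structure of ChaCha operations, the sketch below shows the standard ChaCha quarter round as defined in RFC 8439, which uses only 32-bit addition, XOR, and fixed rotations. It is a plain software reference for the primitive, not the paper's Tofino implementation, and the helper names are ours.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) & MASK32) | (x >> (32 - n))


def quarter_round(a, b, c, d):
    """ChaCha quarter round (RFC 8439, Section 2.1): add, XOR, rotate."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d


# Test vector from RFC 8439, Section 2.1.1.
result = quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)
print([hex(word) for word in result])
# -> ['0xea2a92f4', '0xcb1cf8ce', '0x4581472e', '0x5881c4bb']
```

Because the round avoids table lookups and multiplications, it is a plausible fit for pipeline hardware that offers simple ALU operations.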
SESSION: Architecture and Language
Reducing P4 Language's Voluminosity using Higher-Level Constructs
Albert Gran Alcoz (ETH Zürich), Coralie Busse-Grawitz (ETH Zürich), Eric Marty (ETH Zürich), Laurent Vanbever (ETH Zürich)
Over the last few years, P4 has positioned itself as the primary language for data-plane programming. Despite its constant evolution, the P4 language still suffers from one significant limitation: the voluminosity of its code. Today, P4 users overcome this limitation by relying on templating tools, hand-crafted scripts, and complicated macros. Unfortunately, these methods are not optimal: they make the development process difficult and do not generalize well beyond one codebase.
In this work, we propose reducing the voluminosity of P4 code by introducing higher-level language constructs. We present O4, an extended version of P4, that includes three such constructs: arrays (which group same-type entities together), loops (which reduce simple repetitions), and factories (which enable code parametrization).
Compiling Packet Programs to dRMT Switches: Theory and Algorithms
Balázs Vass (Budapest University of Technology and Economics), Ádám Fraknói (Eötvös Loránd University), Erika Bérczi-Kovács (Eötvös Loránd University), Gábor Rétvári (Budapest University of Technology and Economics)
A critical step in P4 compilation is finding an efficient mapping of the high-level P4 source code constructs to the physical resources exposed by the underlying hardware, while meeting data and control flow dependencies in the program. In this paper, we take a new look at the algorithmic aspects of this problem, with the motivation to understand the fundamental theoretical limits and obtain better P4 pipeline embeddings in the dRMT (disaggregated Reconfigurable Match-Action Table) switch architecture. We report mixed results. We find that optimizing P4 program embedding for maximizing throughput is computationally intractable even when some architectural constraints are relaxed, and there is no hope for a tractable approximation with arbitrary precision unless P = NP. At the same time, we find that the maximal throughput embedding is approximable in quasi-linear time with a small constant bound. Our evaluations show that the proposed algorithm outperforms the heuristics of prior work both in terms of throughput and compilation speed.
Recent P4 research has motivated the need for in-network fractional calculations to support functions in Networking (for calculations related to active queue management and load balancing) and in Machine Learning. The P4 language and ASICs do not natively support fractional types (e.g., float).
This paper rethinks the foundation of in-network fractional calculation and proposes a new approach that is more resource-conscious and straightforward to encode in P4. Instead of floating-point, it uses a fixed-point encoding of numerals, and instead of sampling functions into tables, it uses Taylor approximation to reduce data-plane calculations to simple arithmetic over pre-calculated coefficients, requiring constant space and linear time. The paper describes and evaluates a P4 code synthesis algorithm that allows users to trade off switch resources for accuracy, grounded in an application of a well-understood mathematical theory. It describes how to encode π and various functions including cos, log and exp.
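The idea of combining fixed-point numerals with Taylor coefficients can be illustrated with a short software sketch. It assumes a 16-bit fractional part, eight terms of the exp series, and Horner-style evaluation; these choices and all names are ours for illustration and are not the encoding or synthesis algorithm described in the paper.

```python
import math

FRAC_BITS = 16            # assumed fixed-point format: 16 fractional bits
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0


def to_fixed(x):
    """Encode a real number as a fixed-point integer."""
    return int(round(x * ONE))


def to_float(q):
    """Decode a fixed-point integer back to a float."""
    return q / ONE


def fixed_mul(a, b):
    """Multiply two fixed-point values and rescale the product."""
    return (a * b) >> FRAC_BITS


# Pre-calculated Taylor coefficients 1/k! for exp(x) around 0 (8 terms assumed).
EXP_COEFFS = [to_fixed(1 / math.factorial(k)) for k in range(8)]


def taylor_eval(coeffs, x_fixed):
    """Evaluate sum(coeffs[k] * x**k) with Horner's rule.

    Uses one multiply and one add per coefficient (linear time)
    and a single accumulator (constant space).
    """
    acc = coeffs[-1]
    for coeff in reversed(coeffs[:-1]):
        acc = fixed_mul(acc, x_fixed) + coeff
    return acc


approx = to_float(taylor_eval(EXP_COEFFS, to_fixed(0.5)))
print(approx, math.exp(0.5))
```

Adding terms or fractional bits improves accuracy at the cost of more operations and wider values, which is the kind of resource-versus-accuracy trade the abstract refers to.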
We introduce a formal semantics of P4 for the HOL4 interactive theorem prover. We exploit properties of the language, like the absence of call by reference and the copy-in/copy-out mechanism, to define a heapless small-step semantics that is abstract enough to simplify verification, but that covers the main aspects of the language: interaction with the architecture via externs, table match, and parsers. Our formalization is written in the Ott metalanguage, which allows us to export definitions to multiple interactive theorem provers. The exported HOL4 semantics allows us to establish machine-checkable proofs regarding the semantics, properties of P4 programs, and soundness of analysis tools.
SESSION: Monitoring and Applications
Current approaches to network observability rely on techniques like active probing, packet sampling, and path-level telemetry, which only provide a partial view. This paper presents causal telemetry, a new model that adapts ideas from distributed systems to the network setting. Causal telemetry captures causal relationships between events, including those that take place on physically separated devices. We motivate causal telemetry through examples, we show how it can be used to diagnose anomalies and faults, and we present algorithms for constructing the needed causal graphs from network executions. We develop a P4-based prototype implementation, CoCaTel, and discuss a case study that uses causal telemetry to detect Priority-Based Flow Control (PFC) deadlocks.
Network monitoring is a fundamental task for proper network troubleshooting and performance management. Recently, in-band Network Telemetry (INT) has been demonstrated as a powerful and efficient network monitoring framework. Using INT, hop-by-hop network information can be collected directly from the data plane by carrying it in production traffic. However, INT data collection is limited by available packet size and processing overhead, making it critical to choose what data to collect and when to collect it. In this demo, we propose In-band Inter Packet Gap Network Telemetry (IPGNET) for per-hop monitoring. We argue that by monitoring the inter-packet gap (IPG) hop-by-hop, it is possible to correlate the data and identify: (i) network problems like congestion and delays, along with their root cause, and (ii) microbursts and their contributing flows. Our preliminary results show that IPGNET can detect microbursts on multiple queues and report all the contributing flows with high efficiency in terms of control/data plane overhead.
This work presents the implementation of a tabular interpolation approach to estimate empirical Shannon entropy on programmable data plane ASICs using P4. The technique transforms the complex computations of the random projection into fast lookups over pre-computed tables in the match-action pipeline. The interpolation heuristic further reduces the table size substantially, so more tables can be accommodated, achieving higher estimation accuracy. Simulations based on real-world network traffic traces are performed to evaluate the estimation accuracy. The scheme is deployed in a Barefoot Tofino2 switch connected to the International Center for Advanced Internet Research (iCAIR) national testbed. The system can estimate the entropy of network traffic accurately at 400 Gbps throughput.
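As a point of reference for what is being estimated, the sketch below computes the empirical Shannon entropy of a packet trace exactly, from per-flow packet counts. This is the quantity an in-switch estimator approximates, not the random-projection or table-interpolation scheme itself; the function name and the use of flow identifiers as plain strings are our assumptions.

```python
import math
from collections import Counter


def empirical_entropy(flow_ids):
    """Exact empirical Shannon entropy (in bits) of a sequence of flow IDs.

    H = -sum_i (m_i / m) * log2(m_i / m), where m_i is the number of
    packets in flow i and m is the total number of packets.
    """
    counts = Counter(flow_ids)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Two equally active flows carry exactly one bit of entropy.
print(empirical_entropy(["f1", "f1", "f2", "f2"]))
```

The exact computation needs a counter per flow, which is what makes it impractical in a switch pipeline and motivates estimation.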
Beamforming is now an integral feature of modern wireless communication systems, and its implementation calls for accurate beam alignment by estimating the direction of signal arrival. However, this estimation is computationally complex, especially in a dynamic environment where a user is constantly on the move. In this paper, we propose a user-assisted in-network method to optimally approximate the angle of arrival by segmenting the cell area into an exponentially binned grid and making use of the advantages offered by programmable data planes and their match-action table (MAT) logic. The proposed method is implemented in P4 and runs on a Tofino ASIC. Our evaluation proves a theoretical bound on the absolute error of the proposed MAT-based angle approximation and shows that it is in accordance with the empirical error distributions.
There is great interest in using P4 for in-network computing along with programmable data planes. This use is emerging as a new network paradigm that can reduce not just complexity but delay as well. Beamforming is now an integral feature of modern wireless communication systems, and its implementation calls for accurate beam alignment by estimating the direction of signal arrival. However, this estimation is computationally complex, especially in a dynamic environment where a user is constantly on the move.
In this paper, we propose a user-assisted in-network method to optimally approximate the angle of arrival by segmenting the cell area into an exponentially binned grid and making use of the advantages offered by programmable data planes and their match-action table (MAT) logic. The method expects location messages periodically reported by user equipment, processes them in the network, and reconfigures the base station antennas accordingly, implementing user-assisted in-network beam control. The proposed method is implemented in P4 and runs on a Tofino ASIC.