VHPC 2022
17th Workshop on Virtualization in High-Performance Cloud Computing
held in conjunction with ISC-HPC
Hamburg, Germany, May 29th - June 2nd, 2022
Keynote Talks
rtla: finding the sources of OS noise on Linux
Speaker: Daniel Bristot De Oliveira, Senior Principal Software Engineer in the real-time kernel team at Red Hat.
Abstract: Currently, Real-time Linux is evaluated using a black-box approach. While this method provides an overview of the system, it fails to provide a root cause analysis for unexpected values. Developers have to use kernel trace features to debug these cases, requiring extensive knowledge about the system and a painstaking tracing setup and teardown. Such analysis will become even more important after the PREEMPT_RT merge. To support these cases, since version 5.17 the Linux kernel includes a new tool named rtla, which stands for Real-time Linux Analysis. rtla is a meta-tool consisting of a set of commands that analyze the real-time properties of Linux. Instead of testing Linux as a black box, rtla leverages kernel tracing capabilities to provide precise information about latencies and the root causes of unexpected results. In this talk, Daniel will present two tools provided by rtla: the timerlat tool, which measures IRQ and thread latency for interrupt-driven applications, and the osnoise tool, which evaluates the ability of Linux to isolate a workload from interference caused by the rest of the system. The presentation includes examples of how to use the tools to find root causes and collect extra tracing information directly from the tool.
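
As a rough illustration of the workflow described above (not part of the talk itself), the following Python sketch drives the two rtla subcommands via subprocess; the exact flags used are assumptions and should be checked against `rtla timerlat --help` and `rtla osnoise --help` on the target kernel.

    # Minimal sketch: driving rtla (Linux >= 5.17) from Python.
    # Assumes rtla is installed and the script runs as root; the flags
    # below (-c for a CPU list, -d for a duration) are assumptions.
    import subprocess

    def run_rtla(subcommand, *args):
        """Run an rtla subcommand in 'top' mode and return its report."""
        cmd = ["rtla", subcommand, "top", *args]
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return result.stdout

    if __name__ == "__main__":
        # Measure IRQ and thread latency on CPUs 0-3 for 60 seconds.
        print(run_rtla("timerlat", "-c", "0-3", "-d", "60s"))
        # Evaluate how well the same CPUs are isolated from OS noise.
        print(run_rtla("osnoise", "-c", "0-3", "-d", "60s"))
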
DynamoDB: NoSQL database services for predictable HPC workloads
Speakers: Akshat Vig, Principal Software Engineer at AWS, and Amit Purohit, Director/General Manager of Database Services at AWS.
Abstract: Data is growing at an exponential pace, and organizations are using it to make effective decisions; one of the key challenges is making those decisions fast. In this talk, we will look at how AWS continuously innovates on behalf of its customers to make it easy to access data and build data-driven applications in the cloud. We will also look at the journey of what we learned as we built Amazon DynamoDB, a NoSQL data store in AWS used by hundreds of thousands of customers and delivering consistent performance at any scale. As DynamoDB has evolved over the last ten years, it has faced numerous challenges in delivering newer offerings such as Indexing, Encryption, Backups, Streams, and Global Tables without sacrificing predictable performance. We will share the lessons learned and how we continuously evolved the service, improving its performance characteristics while adding new features.
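
For readers unfamiliar with the programming model the talk refers to, the following minimal sketch (not part of the talk) shows the key-value access pattern DynamoDB serves with consistent performance at any scale, using the boto3 library; the table name and key schema are hypothetical placeholders.

    # Minimal sketch, assuming boto3 and an existing DynamoDB table named
    # "JobResults" with partition key "job_id" (both placeholders).
    import boto3

    table = boto3.resource("dynamodb").Table("JobResults")

    # Write one item ...
    table.put_item(Item={"job_id": "run-0042", "status": "done", "wall_time_s": 731})

    # ... and read it back by key; the per-request work is independent
    # of how large the table grows.
    item = table.get_item(Key={"job_id": "run-0042"}).get("Item")
    print(item)
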
Accepted Paper Presentations
eBPF-based Extensible Paravirtualization
Authors: Luigi Leonardi, Giuseppe Lettieri, Giacomo Pellicci (University of Pisa)
Abstract: High-performance applications usually need to give the OS kernel many hints about their requirements. For example, CPU affinity is commonly used to pin processes to cores and avoid the cost of CPU migration, isolate performance-critical tasks, bring related tasks together, and so on. However, when running inside a Virtual Machine, the (guest) OS kernel can only assign virtual resources, e.g., pinning a guest process to a virtual CPU (vCPU); the host thread backing that vCPU, however, is still freely scheduled by the host hypervisor, which is unaware of the guest application's requests. This semantic gap is usually overcome either by statically allocating virtual resources to their hardware counterparts, which is costly and inflexible, or via paravirtualization, i.e., by modifying the guest kernel to pass the hints to the host, which is cumbersome and difficult to extend. We propose to use host-injected eBPF programs as a way for the host to obtain this kind of information from the guest in an extensible way, without modifying the guest (Linux) kernel and without statically allocating resources. We apply this idea to the example of CPU affinity and run experiments to show its effect on several microbenchmarks. Finally, we discuss the implications for confidential computing.
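
As a purely illustrative sketch of the kind of guest-side hint the paper discusses (not the authors' implementation), the following Python program uses the bcc toolkit to attach a small eBPF probe that traces sched_setaffinity() calls; in the paper's setting a comparable program would be injected by the host rather than loaded locally.

    # Illustrative only: trace CPU-affinity hints with a bcc eBPF probe.
    # Requires root and the bcc toolkit (python3-bcc).
    from bcc import BPF

    prog = r"""
    #include <uapi/linux/ptrace.h>

    // Fired whenever a task asks the kernel to change a CPU affinity mask.
    int on_setaffinity(struct pt_regs *ctx)
    {
        u32 pid = bpf_get_current_pid_tgid() >> 32;
        bpf_trace_printk("sched_setaffinity hint from pid %d\n", pid);
        return 0;
    }
    """

    b = BPF(text=prog)
    # Attach to the architecture-specific syscall entry point.
    b.attach_kprobe(event=b.get_syscall_fnname("sched_setaffinity"),
                    fn_name="on_setaffinity")

    print("Tracing sched_setaffinity() ... Ctrl-C to stop")
    b.trace_print()
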
Virtual Clusters: Isolated, Containerized HPC Environments in Kubernetes
Authors: George Zervas, Anthony Chazapis, Yannis Sfakianakis, Christos Kozanitis, Angelos Bilas (FORTH & University of Crete)
Abstract: Today, Cloud and HPC workloads tend to use different approaches for managing resources. However, as more and more applications require a mixture of both high-performance and data processing computation, convergence of Cloud and HPC resource management is becoming a necessity. Cloud-oriented resource management strives to share physical resources across applications to improve infrastructure efficiency. On the other hand, the HPC community prefers to rely on job queueing mechanisms to coordinate among tasks, favoring dedicated use of physical resources by each application. In this paper, we design a combined Slurm-Kubernetes system that is able to run unmodified HPC workloads under Kubernetes, alongside other, non-HPC applications. First, we containerize the whole HPC execution environment into a virtual cluster, giving each user a private HPC context, with common libraries and utilities built-in, like the Slurm job scheduler. Second, we design a custom Slurm-Kubernetes protocol that allows Slurm to dynamically request resources from Kubernetes. Essentially, in our system the Slurm controller delegates placement and scheduling decisions to Kubernetes, thus establishing a centralized resource management endpoint for all available resources. Third, our custom Kubernetes scheduler applies different placement policies depending on the workload type. We evaluate the performance of our system compared to a native Slurm-based HPC cluster and demonstrate its ability to allow the joint execution of applications with seemingly conflicting requirements on the same infrastructure with minimal interference.
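
As a rough, hypothetical sketch of the delegation step described above (not the authors' actual protocol), the following Python fragment shows how a Slurm-side component could request resources from Kubernetes by creating a pod to be placed by a custom scheduler; all names, images, and namespaces are placeholders. It uses the official kubernetes Python client.

    # Hypothetical sketch of Slurm delegating placement to Kubernetes.
    from kubernetes import client, config

    def request_resources_for_job(job_id: str, cpus: int, mem_gib: int):
        config.load_kube_config()  # or load_incluster_config() inside a pod
        pod = client.V1Pod(
            metadata=client.V1ObjectMeta(name=f"slurm-job-{job_id}"),
            spec=client.V1PodSpec(
                scheduler_name="vcluster-scheduler",  # placeholder custom scheduler
                restart_policy="Never",
                containers=[client.V1Container(
                    name="slurmd",
                    image="example.org/virtual-cluster/slurmd:latest",  # placeholder
                    resources=client.V1ResourceRequirements(
                        requests={"cpu": str(cpus), "memory": f"{mem_gib}Gi"},
                    ),
                )],
            ),
        )
        client.CoreV1Api().create_namespaced_pod(namespace="virtual-clusters", body=pod)
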
On the use of Linux Real-Time Features for RAN Packet Processing in Cloud Environments
Authors: Luca Abeni, Tommaso Cucinotta, Balazs Pinczel, Peter Matray, Murali Srinivasan, Tobias Lindquist (Scuola Superiore Sant'Anna & Ericsson)
Abstract: This paper shows how to use a Linux-based operating system as a real-time processing platform for low-latency and predictable packet processing in cloudified radio-access network (cRAN) scenarios. This use case exhibits challenging end-to-end processing latencies, on the order of milliseconds for the most time-critical layers of the stack. A significant portion of the variability and instability in the end-to-end performance observed in this domain is due to the power-saving capabilities of modern CPUs, often at odds with the low-latency and high-performance requirements of this type of application. We discuss how to properly configure the system for this scenario and evaluate the proposed configuration on a synthetic application designed to mimic the behavior and computational requirements of typical software components implementing baseband processing in production environments.
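
The paper's exact configuration is not reproduced here, but the following Python sketch illustrates two knobs such a setup typically involves on Linux: real-time scheduling with CPU pinning for the packet-processing thread, and the /dev/cpu_dma_latency PM QoS interface to keep CPUs out of deep power-saving states. The CPU number and priority are arbitrary examples; the script must run as root.

    # Illustrative sketch (not the paper's exact setup).
    import os
    import struct

    def make_realtime(cpu: int, priority: int = 80) -> None:
        """Pin the calling process to `cpu` and switch it to SCHED_FIFO."""
        os.sched_setaffinity(0, {cpu})
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))

    def forbid_deep_cstates() -> int:
        """Request ~0us wakeup latency; effective while the fd stays open."""
        fd = os.open("/dev/cpu_dma_latency", os.O_WRONLY)
        os.write(fd, struct.pack("i", 0))
        return fd  # keep open for the lifetime of the workload

    if __name__ == "__main__":
        qos_fd = forbid_deep_cstates()
        make_realtime(cpu=2)
        # ... run the latency-critical baseband processing loop here ...
        os.close(qos_fd)
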
Analyzing Unikernel Support for HPC: Experimental Study of OpenMP
Author: Abdulrahman Azab (University of Oslo & Mansoura University)
Abstract: Unikernels are single-application operating systems designed to run as virtual machines. They are popular in the cloud domain and are considered a good alternative to containers due to the benefits they provide in terms of performance, low resource consumption, and security. This paper investigates the use of unikernels as a platform for HPC applications, considering both the potential advantages and the limitations of this novel OS model. The performance and stability of two unikernel platforms (HermitCore and HermiTux) are experimentally evaluated on standard, representative HPC OpenMP benchmarks. We observe that unikernels markedly reduce the overhead of system calls, leading to a significant speedup (up to 77%) in system-bound applications. For applications that are not system-intensive, we observe only minor performance differences between the unikernel and vanilla Linux executions. It should be noted that modern unikernel projects are not yet fully mature and exhibit stability issues when running some OpenMP benchmarks.
Accepted Lightning Talks
ToroV, a kernel in user-space to deploy server-less applications
Speaker: Matias Vara (Vates SAS)
This talk covers the deployment of server-less applications using the ToroV Virtual Machine Monitor with POSIX API support.
Keep an eye on your busy-waiting
Speaker: Remo Andreoli (Scuola Superiore Sant'Anna)
Authors: Remo Andreoli, Tommaso Cucinotta (Scuola Superiore Sant'Anna), Daniel Bristot De Oliveira (Red Hat)
This talk discusses busy-waiting techniques in high-performance libraries and their impact on system performance.
Unified workload deployment on heterogeneous HPC clusters
Speaker: Anastassios Nanos (Nubificus LTD)
Authors: Anastassios Nanos, George Ntoutsos, Charalampos Mainas (Nubificus LTD)
This talk proposes a unified approach to workload deployment in HPC environments based on the OCI image specification.
Invited Talks
Mobility and Elasticity in High Performance Cloud Computing
Speaker: Eric Jul (University of Oslo)
Platform-level resource orchestration and abstraction layers in continuum platforms: the challenge of heterogeneity
Speaker: Massimo Coppola (ISTI, CNR)
Scalable containers in HPC and the convergence toward HPC clouds
Speaker: Azat Khuziyakhmetov (GWDG)
Virtualisation Overhead Analyzed Using IO500
Speaker: Julian Kunkel (GWDG)
Enroot and Pyxis – Lightweight, Unprivileged Containers for Slurm Clusters
Speaker: Luke Yeager (NVIDIA)
MPI-based Graph Computing with Serverless Fault Tolerance in Clouds
Speakers: Peng Lin and Lin Ma (ByteDance)
State of Kubernetes in Container Orchestration for Scientific Computing
Speaker: Michael Alexander (BOKU Vienna and OeAW, Austria)
General Information
The 17th Workshop on Virtualization in High-Performance Cloud Computing (VHPC 2022) aims to bring together researchers and industrial practitioners facing the challenges posed by virtualization in HPC/Cloud scenarios, in order to foster discussion, collaboration, and mutual exchange of knowledge and experience, and to enable research that ultimately provides novel solutions for the virtualized computing systems of tomorrow.
The workshop is a one-day event, held in conjunction with the International Supercomputing Conference - High Performance (ISC) 2022 on June 2nd, 2022, in Hamburg, Germany.
For more information, refer to the VHPC 2022 or ISC 2022 web pages.