Reliable and sustainable computations: An application-driven approach

In this talk, Roman Iakymchuk presents his work on strategies that ensure accuracy and reproducibility in parallel iterative solvers, two properties that may not hold due to the non-associativity of floating-point operations. These strategies primarily rely on guarding every bit of the result until the final rounding, so they can be costly. Energy consumption constraints on large-scale computing encourage scientists to revise not only the architecture design of hardware but also applications, algorithms, and the underlying working/storage precision. The main aim is to make the computing cost sustainable and apply the lagom principle (“not too much, not too little, the right amount”), especially when it comes to working/storage precision. Thus, he will introduce an approach to address the issue of sustainable, but still reliable, computations from the perspective of computer arithmetic tools. Before lowering precision, one must ensure that the simulation is numerically correct, e.g. by relying on alternative floating-point models or rounding modes to pinpoint numerical bugs and to estimate the accuracy. We employ VerifiCarlo and its variable precision backend to identify the parts of the code that benefit from smaller floating-point formats. Finally, we show preliminary results on proxy applications.
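The non-associativity mentioned in the abstract can be seen in a few lines. The following is a minimal Python sketch, not the method presented in the talk: the helper `naive_sum` and the sample values are made up for illustration, and the standard library's `math.fsum` merely stands in for the idea of keeping the result exact until a single final rounding.

```python
import math

def naive_sum(values):
    """Left-to-right summation; every addition rounds to double precision."""
    total = 0.0
    for v in values:
        total += v
    return total

# Hypothetical data: the same four numbers in two different orders,
# as could happen when a parallel reduction changes its summation order.
values = [1e16, 1.0, -1e16, 1.0]
reordered = [1e16, -1e16, 1.0, 1.0]

left_to_right = naive_sum(values)     # the first 1.0 is absorbed by 1e16
after_reorder = naive_sum(reordered)  # the large terms cancel first

# math.fsum tracks the exact sum internally and rounds only once,
# so its result does not depend on the order of the inputs.
exact_a = math.fsum(values)
exact_b = math.fsum(reordered)

print(left_to_right, after_reorder, exact_a, exact_b)
```

The two naive sums differ although the inputs are identical up to ordering, while the correctly rounded sum is the same for both orders. That order independence is what makes such strategies reproducible, and the extra bookkeeping is why they can be costly.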

Neko: A Modern, Portable, and Scalable Framework for High-Fidelity Computational Fluid Dynamics

Recent trends and advancements including more diverse and heterogeneous hardware in High-Performance Computing are challenging scientific software developers in their pursuit of good performance and efficient numerical methods. As a result, the well-known maxim “software outlives hardware” may no longer necessarily hold true, and researchers are today forced to refactor their codes to leverage these powerful new heterogeneous systems. We present Neko – a portable framework for high-fidelity spectral element flow simulations. Unlike prior works, Neko adopts a modern object-oriented Fortran 2008 approach, allowing multi-tier abstractions of the solver stack and facilitating various hardware backends ranging from general-purpose processors and accelerators down to exotic vector processors and Field-Programmable Gate Arrays (FPGAs) via Neko’s device abstraction layer. Focusing on the performance and accuracy of Neko, we show the first direct numerical simulation (DNS) of a Flettner rotor submerged in a turbulent boundary layer, observing excellent agreement of lift with experimental data. Using a mesh with five million spectral elements, which corresponds to more than a billion unique degrees of freedom, the simulation requires less than three days to complete on accelerated systems compared to weeks on traditional non-accelerated systems. Finally, we present performance measurements on a wide range of accelerated computing platforms, including the EuroHPC pre-exascale system LUMI, where Neko achieves excellent parallel efficiency for a large DNS of turbulent fluid flow using up to 80% of the entire LUMI supercomputer.

MS4F – Cross-Cutting Aspects of Exploiting Exascale Platforms for High-Fidelity CFD in Turbulence Research

This minisymposium was chaired by a CEEC consortium member and included a presentation by another CEEC consortium member. The arrival of exascale computing has opened up unprecedented simulation capabilities for Computational Fluid Dynamics (CFD) applications. While these systems offer high theoretical peak performance and high memory bandwidth, exploiting them efficiently necessitates complex programming models and significant programming investments.

Sustainable and Reliable Computing with Tools: Analyzing Precision Appetites of CFD Applications with VerifiCarlo

Energy consumption constraints for large-scale computing encourage scientists to revise not only the architecture design of hardware but also applications, algorithms, and the underlying working/storage precision. I will introduce an approach to address the issue of sustainable, but still reliable, computations from the perspective of computer arithmetic tools. We employ VerifiCarlo and its variable precision backend to identify the parts of the code that benefit from smaller floating-point formats. Finally, we show preliminary results on proxies of CFD applications.

VPREC to analyze the precision appetites and numerical abnormalities of several proxy applications

This is the third in a series of presentations by Roman Iakymchuk on work using tools to investigate mixed-precision possibilities. He and his co-author Pablo de Oliveira Castro introduce an approach to address the issue of sustainable computations with computer arithmetic tools. They use the variable precision backend (VPREC) to identify parts of code that can benefit from smaller floating-point formats and show preliminary results on several proxy applications.
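The idea of probing a code's "precision appetite" can be sketched without any tooling. The Python below is an illustration only and does not use or reproduce the VPREC interface: the helpers `round_to_precision`, `dot_reduced`, and `smallest_sufficient_precision` are hypothetical names. The sketch assumes that rounding each double-precision intermediate result to a given number of significant bits is an acceptable approximation of computing in a smaller format, and it ignores reduced exponent ranges.

```python
import math

def round_to_precision(x, mantissa_bits):
    """Round x to `mantissa_bits` significant bits (ties to even)."""
    if x == 0.0 or not math.isfinite(x):
        return x
    m, e = math.frexp(x)               # x == m * 2**e with 0.5 <= |m| < 1
    scale = 2.0 ** mantissa_bits
    # Scaling by a power of two is exact; round() on a float rounds ties to even.
    return math.ldexp(round(m * scale) / scale, e)

def dot_reduced(xs, ys, mantissa_bits):
    """Dot product with every intermediate result rounded to reduced precision."""
    acc = 0.0
    for x, y in zip(xs, ys):
        prod = round_to_precision(x * y, mantissa_bits)
        acc = round_to_precision(acc + prod, mantissa_bits)
    return acc

def smallest_sufficient_precision(xs, ys, rel_tol):
    """First precision (in significant bits) whose result meets rel_tol."""
    reference = math.fsum(x * y for x, y in zip(xs, ys))
    for bits in range(2, 54):
        approx = dot_reduced(xs, ys, bits)
        if abs(approx - reference) <= rel_tol * abs(reference):
            return bits
    return 53  # fall back to full double precision

# Hypothetical usage: how many significant bits does this kernel need?
xs = [1.0, 2.0]
ys = [3.0, 4.0]
print(smallest_sufficient_precision(xs, ys, rel_tol=1e-3))
```

A kernel that meets its tolerance with, say, 11 or 24 significant bits is a candidate for half or single precision. The error need not decrease monotonically with the number of bits, so a real study would check a range of precisions rather than stop at the first one that passes.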

Precision in linear algebra solvers and its cascade impact on applications

Until recent years, linear algebra solvers predominantly operated in double precision (binary64). With the advent of AI, in particular deep learning, lower-precision floating-point formats were introduced to accommodate the needs of such computations; now the spectrum has shifted to fixed-point and integer arithmetic, expanding to block floating-point arithmetic. This change […]

Code of the Month vol.6 “Neko by CEEC” — PUBLIC event

Online

Recent trends and advancements including more diverse and heterogeneous hardware in High-Performance Computing are challenging scientific software developers in their pursuit of good performance and efficient numerical methods. As a result, the well-known maxim “software outlives hardware” may no longer necessarily hold true, and researchers are today forced to refactor their codes to leverage these powerful new heterogeneous systems. We present Neko – a portable framework for high-fidelity spectral element flow simulations. Unlike prior works, Neko adopts a modern object-oriented Fortran 2008 approach, allowing multi-tier abstractions of the solver stack and facilitating various hardware backends ranging from general-purpose processors and accelerators down to exotic vector processors and Field-Programmable Gate Arrays (FPGAs) via Neko’s device abstraction layer. Focusing on Neko’s performance and exascale readiness, we outline the optimisation and algorithmic work necessary to ensure scalability and performance portability across a wide range of platforms. Finally, we present performance measurements on a wide range of accelerated computing platforms, including the EuroHPC pre-exascale systems LUMI and Leonardo, where Neko achieves excellent parallel efficiency for an extreme-scale direct numerical simulation (DNS) of turbulent thermal convection using up to 80% of the entire LUMI supercomputer.