What is Mixed Precision?
Computers have been getting faster for as long as they’ve existed. However, not every computer part has been speeding up at the same rate. For example, processors have been speeding up much faster than memory, which you may know as the RAM in your laptop. The gap opened in the 1980s, when central processing units (CPUs) began to get faster than memory. Then, by 2008, graphics processing units (GPUs) started being used in high-performance computing, just as they too began to outpace memory. For supercomputers built of hundreds or thousands of processors with memory to match, this meant that simulations couldn’t run as fast as the processors. Instead, they had to wait on memory each time a calculation was completed before processing the next one.
As supercomputers get larger both in terms of calculation power (exascale) and sheer size (electrical signals have to travel along longer circuits to reach memory), the amount of time lost while waiting for memory is going up. As a result, it’s getting harder to exploit the full potential of exascale supercomputers. Since our goal is to use entire exascale computers to run our Lighthouse cases, finding solutions to this memory bottleneck is a top priority.
One way to solve this mismatch between the speed of processors and the speed of memory is to rewrite codes so that they access memory less frequently and transfer less data. This way, processors can chug along at their faster speed with fewer interruptions. This is where mixed precision can help: it reduces the movement of data to and from memory. When done correctly, mixed precision matches the precision of each calculation to what that calculation actually needs, which saves time and energy as we’ll explain below.
First, let’s review some basics of floating point numbers and integers, and the way computers and humans do arithmetic. When most people add or subtract numbers, we use the digits 0 through 9 in a system based on groups of 10 (base 10). However, computers only use 0 and 1 (base 2) to represent all the same numbers. Check out this article for a refresher on counting like a computer using your fingers. As you’ll see, treating each of your 10 fingers as a binary digit lets you count up to 1,023, far more than the 10 you reach by counting one finger at a time. In fact, computers generally use 32 or 64 bits as their metaphorical fingers, which lets them count signed whole numbers as large as 2,147,483,647 or 9,223,372,036,854,775,807, respectively.
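To make the finger-counting idea concrete, here is a minimal Python sketch. The helper names fingers_to_number and max_signed are made up for illustration, and the sketch assumes signed integers, where one bit is set aside for the sign.

```python
def fingers_to_number(fingers: str) -> int:
    """Read a string of 0s (finger down) and 1s (finger up) as a base-2 number."""
    return int(fingers, 2)


def max_signed(bits: int) -> int:
    """Largest value of a signed integer with the given number of bits.

    One bit is used for the sign, which leaves bits - 1 for the magnitude.
    """
    return 2 ** (bits - 1) - 1


print(fingers_to_number("0000000101"))  # two fingers up: 4 + 1 = 5
print(fingers_to_number("1111111111"))  # all ten fingers up: 1023
print(f"{max_signed(32):,}")            # 2,147,483,647
print(f"{max_signed(64):,}")            # 9,223,372,036,854,775,807
```

The two large values match the limits quoted above for 32 and 64 bits.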
Now a brief refresher on integers and floating point numbers: an integer is a whole number like 2, 35, or 0, whereas a floating point number can have a fractional part, like 3.2, 17.6, or 0.225. When it comes to storing and doing arithmetic on integers, the difference between base 10 and base 2 isn’t a problem. If the computer can represent the number with however many bits the software allows, the operation works just like it would for a human and is exact. If the number is too big, the result overflows: depending on the programming language, the computer either reports an error or silently wraps around to a wrong value.
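What actually happens when an integer is too big varies by language and tool. The sketch below uses two modules from Python’s standard library to mimic a 32-bit signed integer: ctypes, which wraps around silently the way C-style integers typically do in practice, and struct, which refuses and raises an error. Python’s own integers grow as needed, so the fixed 32-bit width here is an assumption chosen purely for illustration.

```python
import ctypes
import struct

INT32_MAX = 2**31 - 1  # largest 32-bit signed integer

# Within range, integer arithmetic is exact.
exact_sum = 2 + 35
print(exact_sum)  # 37

# One past the largest 32-bit signed integer.
too_big = INT32_MAX + 1

# ctypes does no overflow checking, so the value silently wraps around
# to the most negative 32-bit number.
wrapped = ctypes.c_int32(too_big).value
print(wrapped)  # -2147483648

# struct refuses to squeeze the value into 4 bytes and raises an error.
try:
    struct.pack("<i", too_big)
    overflow_reported = False
except struct.error:
    overflow_reported = True
print(overflow_reported)  # True
```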
The problems start when we need to represent numbers with decimal points in computers, particularly when these numbers get very large or very small, as they do in large simulations. In these cases, the computer only stores as much of the number as will fit in the bits that the software allows. The rest is rounded off and lost, which leads to errors that can accumulate over the course of a full simulation. Also, because each number is slightly inaccurate, the order in which you add or multiply numbers changes how the errors combine, and that can change the final results. Since another strategy for making calculations run quickly is running them in parallel over hundreds or thousands of processors simultaneously, it’s impossible to guarantee a specific order of operations without slowing down the entire system. Hence, it has been standard practice to use double precision, which uses double the “normal” 32 bits (for a total of 64), to represent numbers more accurately when doing floating point operations.
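Both effects described above, rounding and sensitivity to the order of operations, can be seen in a few lines of Python. This is a minimal sketch using only the standard library; to_float32 is a hypothetical helper that rounds a 64-bit Python float to 32 bits by packing it into 4 bytes and reading it back.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# 0.1 has no exact base-2 representation, so both formats store a nearby value.
print(f"{0.1:.20f}")              # 64-bit approximation of 0.1
print(f"{to_float32(0.1):.20f}")  # 32-bit approximation, further from 0.1

# Rounding makes addition depend on the order of operations.
a, b, c = 1e16, -1e16, 1.0
left_to_right = (a + b) + c  # the big numbers cancel first, then 1.0 is added
right_to_left = a + (b + c)  # 1.0 is rounded away inside b + c
print(left_to_right, right_to_left)  # 1.0 0.0
```

The same three numbers give 1.0 or 0.0 depending only on which pair is added first. In a parallel simulation, where the order of additions is not fixed, the same kind of difference can show up between runs.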
However, at exascale, moving all those bits between memory and processors can really slow things down, as described above. This is where mixed precision has the potential to help. Strictly speaking, a mixed precision algorithm is an algorithm that effectively and efficiently combines two or more floating-point and/or integer precisions, aiming for better time-to-solution and/or energy-to-solution while preserving the final accuracy of the results. In other words, the algorithm varies how many bits are used to represent numbers depending on how much rounding them at a given point in a simulation will affect the final result. The algorithm tries to save as much electricity and time as possible while still calculating accurately. Hypothetically, an algorithm could tell a simulation to use half precision (16 bits) at one time point and extended precision (up to 128 bits or more) later. The challenge with mixed precision is judging what’s effective and efficient. When is it safe to decrease precision, and when is it necessary to increase it?
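As a toy illustration of that trade-off (not the method used in our CFD codes), the Python sketch below stores numbers in one precision and adds them up in another. The names FORMATS, bytes_per_value, round_to, and accumulate are hypothetical helpers written for this example, and the precisions are emulated with the standard struct module rather than real hardware.

```python
import struct

# struct format codes for IEEE 754 half, single, and double precision.
FORMATS = {"half": "<e", "single": "<f", "double": "<d"}


def bytes_per_value(precision: str) -> int:
    """Bytes that must move to or from memory for one stored number."""
    return struct.calcsize(FORMATS[precision])


def round_to(x: float, precision: str) -> float:
    """Round x to the nearest value representable in the given precision."""
    fmt = FORMATS[precision]
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def accumulate(values, storage: str, accumulator: str) -> float:
    """Sum values that are stored in one precision and added up in another."""
    total = 0.0
    for v in values:
        stored = round_to(v, storage)                  # what memory holds
        total = round_to(total + stored, accumulator)  # what the sum keeps
    return total


values = [0.1] * 10_000  # the exact sum would be 1000

all_half = accumulate(values, storage="half", accumulator="half")
mixed = accumulate(values, storage="half", accumulator="double")
all_double = accumulate(values, storage="double", accumulator="double")

print(bytes_per_value("half"), bytes_per_value("double"))  # 2 8
print(all_half, mixed, all_double)
```

Using half precision everywhere moves a quarter of the bytes, but the sum stalls at 256.0: beyond that point the gap between neighboring half-precision numbers is larger than the values being added. The mixed version still stores only 2 bytes per value yet reaches 999.755859375, so its remaining error comes only from storing 0.1 in 16 bits. Whether that error is acceptable is exactly the kind of judgment a mixed precision algorithm has to make.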
That’s just one of the questions we’re working to answer for CFD simulations. Look out for more in our news feed like the work Roman presented in the summer of 2023 at ISC and PASC!