Rounding for fixed point calculations
by Toshihiro Horie
April 25, 2005
updated July 17, 2017

Legend:
-------
* = endpoint is included in range
o = endpoint is excluded in range

On Intel SSE2, there is a CVTSD2SI instruction whose rounding mode is controlled through the MXCSR register.
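
As a portable illustration of a conversion whose result follows a rounding-mode setting, the sketch below uses the standard <cfenv> interface rather than writing MXCSR directly. The helper name convert_with_mode is hypothetical. That std::lrint is carried out by CVTSD2SI on x86-64 is an assumption about the compiler and library; the code does not depend on it.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: convert x to an integer under the given rounding
// mode (FE_TONEAREST, FE_UPWARD, FE_DOWNWARD or FE_TOWARDZERO), then
// restore whatever mode was active before the call.
long convert_with_mode(double x, int mode) {
    const int saved = std::fegetround();
    std::fesetround(mode);
    // volatile keeps the compiler from folding the conversion at compile
    // time or moving it across the mode changes.
    volatile double in = x;
    volatile long out = std::lrint(in);
    std::fesetround(saved);
    return out;
}
```

For example, convert_with_mode(2.5, FE_TONEAREST) gives 2, while convert_with_mode(2.5, FE_UPWARD) gives 3.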

Throughout, k denotes an arbitrary integer.

Rounding to Nearest Integer (Symmetric Arithmetic Rounding)
C++: round(x) (C99/C++11); by hand, floor(x+0.5) for x >= 0 and ceil(x-0.5) for x < 0, which fails when |x| = 0.49999... [see JDK bug below]
feature: almost ideal rounding, except for bias at k+0.5 where k is integer
feature: this is probably what you learned in elementary school
feature: symmetric about the origin
Intel x64: not possible in a single instruction; the hardware default is round to closest even
ARM AArch64: FCVTAS round to nearest with ties to away
-------------------------------------

                       rint(x)
                          |
                          |
                      3.0 +              *=====o
                          |                    
                          |                    
                      2.0 +        *=====o     
                          |                    
                          |                    
                      1.0 +  *=====o           
                          |                    
                          |                    
     .--+--.--+--.--+--o=====o--+--.--+--.--+--.  x
      -3.0  -2.0  -1.0    |    1.0    2.0  3.0  
                          |
                 o=====*  + -1.0
                          |
                          |
           o=====*        + -2.0
                          |
                          |
     o=====*              + -3.0
                          |
                          |
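
The step function drawn above can be sketched in C++ as follows. std::round is the standard library function that rounds halfway cases away from zero. The name round_symmetric_manual is hypothetical and only spells out the by-hand variant.

```cpp
#include <cmath>

// Library version: std::round rounds halfway cases away from zero.
double round_symmetric(double x) {
    return std::round(x);
}

// Hand-written version (hypothetical name). It shares the floor(x+0.5)
// weakness just below 0.5, so prefer std::round when it is available.
double round_symmetric_manual(double x) {
    return x < 0.0 ? std::ceil(x - 0.5) : std::floor(x + 0.5);
}
```

Both functions map 0.5 to 1 and -0.5 to -1, matching the included endpoints (*) in the diagram.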




Intel x87's Round to Closest (Nearest) Even Integer
Also known as Banker's Rounding
feature: default starting mode in Win32 apps
feature: ideal rounding from a numerical perspective
feature: MS DirectX uses this mode, along with single precision mode
feature: almost the same as ideal rounding except at 0.5+k points, where k is an integer
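C++: difficult to implement by hand without relying on platform-specific behavior (see notes at the end); since C99/C++11, nearbyint(x) and rint(x) round this way under the default FE_TONEAREST mode
ARM NEON (ARMv8): int32x2_t  vcvtn_s32_f32(float32x2_t a);  // VCVTN.S32.F32 d0, d0; plain vcvt_s32_f32 (VCVT.S32.F32) truncates toward zero
Intel x64 (SSE4.1): _mm_round_sd(x, x, _MM_FROUND_TO_NEAREST_INT)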
ARM AArch64: FCVTNS
------------------------------------------------------------------

                      rint_even(x)
                          |
                          |
                      3.0 +              o=====o
                          |              
                          |              
                      2.0 +        *=====*
                          |              
                          |               
                      1.0 +  o=====o      
                          |               
                          |               
     .--+--.--+--.--+--*=====*--+--.--+--.--+--.  x
      -3.0  -2.0  -1.0    |    1.0    2.0  3.0  
                          |
                 o=====o  + -1.0
                          |
                          |
           *=====*   -2.0 +
                          |
                          |
     o=====o         -3.0 +
                          |
                          |
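
A minimal sketch of round-to-nearest-even in standard C++, assuming a C++11 library with <cfenv>. The function name round_half_even is hypothetical. It forces FE_TONEAREST for the duration of the call so that the result does not depend on the caller's current rounding mode.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: round x to the nearest integer, ties to even.
double round_half_even(double x) {
    const int saved = std::fegetround();
    std::fesetround(FE_TONEAREST);
    // volatile keeps the compiler from folding the call at compile time
    // or moving it across the mode changes.
    volatile double in = x;
    volatile double out = std::nearbyint(in);
    std::fesetround(saved);
    return out;
}
```

With this helper, 0.5 and 2.5 round to 0 and 2, while 1.5 and 3.5 round to 2 and 4, which is the alternating endpoint pattern in the diagram.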


Rounding to Nearest Integer, but biased towards positive infinity
feature: easy to implement using floor
feature: for integer arguments, use add and shift right
C++: rint_pos(x) = floor(x+0.5), except when x= k + 0.49999...
Intel x64 (SSE4.1): no dedicated mode; compute y = x + 0.5, then _mm_round_sd(y, y, _MM_FROUND_TO_NEG_INF) (_MM_FROUND_TO_POS_INF by itself is ceil)
ARM AArch64: ???
--------------------------------------------------------------------

                      rint_pos(x)
                          |
                          |
                      3.0 +              *=====o
                          |                    
                          |                    
                      2.0 +        *=====o     
                          |                    
                          |                    
                      1.0 +  *=====o           
                          |                    
                          |                    
     .--+--.--+--.--+--*=====o--+--.--+--.--+--.  x
      -3.0  -2.0  -1.0    |    1.0    2.0  3.0  
                          |
                 *=====o  + -1.0
                          |
                          |
           *=====o        + -2.0
                          |
                          |
     *=====o              + -3.0
                          |
                          |
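
The "add and shift right" trick for integer arguments can be sketched as below. The function name and the choice of a 32-bit value with frac_bits fractional bits are assumptions made for illustration. The code also assumes that >> on a negative signed value is an arithmetic shift, which is guaranteed since C++20 and is what mainstream compilers did before that.

```cpp
#include <cstdint>

// q is a signed fixed-point value with frac_bits fractional bits
// (1 <= frac_bits <= 31). Adding half of one integer step and shifting
// right computes floor(x + 0.5). The sum is formed in 64 bits so that
// the added bias cannot overflow.
std::int32_t fixed_round_half_up(std::int32_t q, int frac_bits) {
    const std::int64_t half = std::int64_t{1} << (frac_bits - 1);
    return static_cast<std::int32_t>((std::int64_t{q} + half) >> frac_bits);
}
```

With 8 fractional bits, 1.5 is stored as 384 and rounds to 2, while -1.5 is stored as -384 and rounds to -1, so ties go toward positive infinity as in the diagram.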

Round Towards Zero (chop, fix, truncation)
feature: default for floating point to integer conversion in ANSI C
C++: (int)x
feature: toward minus infinity for positive numbers, 
          toward positive infinity for negative numbers
feature: has a larger deadzone at zero: every x in (-1, 1) maps to 0, twice the width of any other step
Intel x87: SSE3 added the FISTTP instruction for this; SSE2 has CVTTSD2SI
Intel x64 (SSE4.1): _mm_round_sd(x, x, _MM_FROUND_TO_ZERO)
ARM AArch64: FCVTZS
--------------------------------------------------------
                     int_cast(x)
                          |
                          |
                      3.0 +
                          |
                          |
                      2.0 +           *=====o
                          |
                          |
                      1.0 +     *=====o
                          |
                          |
     .--+--.--+--.--o==.=====.==o--.--+--.--+--. x
      -3.0  -2.0  -1.0    |    1.0    2.0  3.0  
                          |
              o=====*     + -1.0
                          |
                          |
        o=====*      -2.0 +
                          |
                          |
                     -3.0 +
                          |
                          |
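
A short sketch of truncation for both floating point and fixed point values. The function names are hypothetical. The fixed point version relies on the fact that integer division truncates toward zero since C++11, and assumes 0 <= frac_bits <= 30.

```cpp
#include <cstdint>

// Floating point: the cast discards the fraction. The behavior is
// undefined if the truncated value does not fit in an int.
int truncate_to_int(double x) {
    return static_cast<int>(x);
}

// Fixed point with frac_bits fractional bits: integer division truncates
// toward zero, unlike >> which rounds toward minus infinity.
std::int32_t fixed_truncate(std::int32_t q, int frac_bits) {
    return q / (std::int32_t{1} << frac_bits);
}
```

Note that -0.9 and 0.9 both map to 0, which is the double-width step around the origin in the diagram.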

Round towards Minus Infinity (floor)
feature: for integer or fixed point arguments, this is easy in hardware (use arithmetic shift right)
C++: floor(x)
Intel x64 (SSE4.1): _mm_round_sd(x, x, _MM_FROUND_TO_NEG_INF)
ARM AArch64: FCVTMS
--------------------------------------------------------------

                        floor(x)
                          |
                          |
                      3.0 +
                          |
                          |
                      2.0 +           *=====o
                          |
                          |
                      1.0 +     *=====o
                          |
                          |
     .--+--.--+--.--+--.--*=====o--.--+--.--+--.  x
      -3.0  -2.0  -1.0    |    1.0    2.0  3.0  
                          |
                    *=====o -1.0
                          |
                          |
              *=====o     + -2.0
                          |
                          |
        *=====o      -3.0 +
                          |
                          |
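
For fixed point arguments, the arithmetic shift right mentioned above is the whole implementation. The sketch below uses a hypothetical function name and assumes that >> on a negative signed value is an arithmetic shift (guaranteed since C++20, common practice before).

```cpp
#include <cstdint>

// q is a signed fixed-point value with frac_bits fractional bits.
// Shifting out the fraction rounds toward minus infinity for both
// positive and negative inputs.
std::int32_t fixed_floor(std::int32_t q, int frac_bits) {
    return q >> frac_bits;
}
```

With 8 fractional bits, -1.5 (stored as -384) floors to -2, whereas truncation would give -1.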


Ceiling (ceil)
C++: ceil(x) = -floor(-x)
feature: almost the mirror image of floor(x)
Intel x64 (SSE4.1): _mm_round_sd(x, x, _MM_FROUND_TO_POS_INF)
ARM AArch64: FCVTPS
----------------------------------------

                       ceil(x)
                          |
                          |
                      3.0 +           o=====*
                          |                  
                          |                  
                      2.0 +     o=====*        
                          |                   
                          |                   
                      1.0 o=====*              
                          |                     
                          |                   
     .--+--.--+--.--o==.==*--.--+--.--+--.--+--.  x
      -3.0  -2.0  -1.0    |    1.0    2.0  3.0  
                          |
              o=====*     + -1.0
                          |
                          |
        o=====*      -2.0 +
                          |
                          |
                     -3.0 +
                          |
                          |
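
The identity ceil(x) = -floor(-x) and a fixed point counterpart can be sketched as follows. Both function names are hypothetical, and the fixed point version assumes an arithmetic shift for negative values (guaranteed since C++20) and 1 <= frac_bits <= 31.

```cpp
#include <cmath>
#include <cstdint>

// Ceiling expressed through floor, as in the identity above.
double ceil_via_floor(double x) {
    return -std::floor(-x);
}

// Fixed point with frac_bits fractional bits: add one integer step minus
// one LSB, then shift. The sum is formed in 64 bits to avoid overflow.
std::int32_t fixed_ceil(std::int32_t q, int frac_bits) {
    const std::int64_t bias = (std::int64_t{1} << frac_bits) - 1;
    return static_cast<std::int32_t>((std::int64_t{q} + bias) >> frac_bits);
}
```

With 8 fractional bits, 257 (just above 1.0) rounds up to 2, while exactly 1.0 (256) stays at 1.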

Notes:
------
General information: Custom Rounding.
More information: Speed Optimizations.
The Pentium rounding unsignaled overflow bug is documented by Microsoft.
More about C99 rounding modes: FreeBSD mailing list.
It turns out that floor(x+0.5) is not equal to round(x) when x = 0.49999999999999994, the largest double below 0.5: the sum x+0.5 rounds up to exactly 1.0, so floor returns 1 instead of 0. This was the cause of a Java JDK bug.
More information on rounding bugs is available here.
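
The floor(x+0.5) pitfall can be reproduced with a few lines. The function names are hypothetical, and the sketch assumes that double arithmetic is evaluated in double precision (as on x86-64 with SSE2 or on AArch64), not in extended x87 precision.

```cpp
#include <cmath>

// The naive formula: add one half, then round down.
double naive_round(double x) {
    return std::floor(x + 0.5);
}

// Largest double strictly below 0.5 (0.49999999999999994).
double just_below_half() {
    return std::nextafter(0.5, 0.0);
}
```

Under the stated assumption, naive_round(just_below_half()) returns 1.0 because the addition itself rounds up to 1.0, whereas std::round(just_below_half()) returns 0.0.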