Math Options
This topic lists and describes the different math options available.
-h fplevel
Default: -h fp2
The -h fp option controls floating-point optimization. The level argument sets how much freedom the compiler has to optimize floating-point operations: 0 gives the compiler minimum freedom, while 4 gives it maximum freedom. The higher the level, the less closely the floating-point operations conform to the IEEE standard.
Each level, 1-4, includes the optimizations at the previous level.
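To see why more optimization freedom can mean less IEEE conformance, note that floating-point addition is not associative, so an optimization that reorders a sum can change its result. The sketch below is illustrative only: the function names are hypothetical, and the two-lane version simply mimics by hand the kind of reordering a vectorized reduction might perform. It is not a description of the code the compiler actually generates.

```c
#include <stddef.h>

/* Sum the array strictly left to right, as written in the source. */
double sum_in_order(const double *v, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Sum using two interleaved partial sums that are combined at the end.
 * This is a hand-written stand-in for a reordered (vectorized)
 * reduction; it is algebraically the same sum. */
double sum_two_lanes(const double *v, size_t n)
{
    double s0 = 0.0, s1 = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += v[i];
        s1 += v[i + 1];
    }
    if (i < n)          /* odd element left over */
        s0 += v[i];
    return s0 + s1;
}
```

For the input {1e100, 1.0, -1e100, 1.0}, and assuming the compiler evaluates both functions as written, sum_in_order returns 1.0 while sum_two_lanes returns 2.0: in the first case the early 1.0 is absorbed by 1e100 and lost, in the second the two small values are added to each other before meeting the large ones.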
- 0
- Use this level only when the code pushes the limits of IEEE accuracy or requires strong IEEE standard conformance. The resulting executable code conforms more closely to the IEEE floating-point standard than the default mode (-h fp2). Many identity optimizations are disabled. Vectorization of floating-point and complex reductions is disabled. Executable code is slower than at higher floating-point optimization levels.
- 1
- Use this option only when the code pushes the limits of IEEE accuracy or requires strong IEEE standard conformance. This option performs generally safe, non-conforming IEEE optimizations, such as folding a == a to true, where a is a floating-point object. At this level, a scaled complex divide mechanism is enabled that increases the range of complex values that can be handled without producing an underflow. Rewrite of division into multiplication by reciprocal is inhibited.
- 2
- Default.
- 3
- Use when performance is more critical than the level of IEEE standard conformance provided by fp2. The -h fp3 option is an acceptable level of optimization for many applications.
- 4
- Use if the application uses algorithms which are tolerant of reduced precision.
| Optimization Type | fp0 | fp1 | fp2 (default) | fp3 | fp4 |
|---|---|---|---|---|---|
| Safety | Maximum | High | High | Moderate | Low |
| Complex divisions | Accurate and slower | Accurate and slower | Fast [1] | Fast [1] | Fast [1] |
| Exponentiation rewrite | None | None | When optimization benefit is very high [2] | Always [2], [3] | Always [2], [3] |
| Strength reduction | None | None | Fast | Fast | Fast |
| Rewrite division as reciprocal equivalent [4] | None | None | Yes | Aggressive | Aggressive |
| Floating point reductions | Slow | Fast | Fast | Fast | Fast |
| Expression factoring | None | Yes | Yes | Yes | Yes |
| Expression tree balancing | None | None | Yes | Yes | Yes |
| Inline 32-bit operations [5] | No | No | No | Yes | Yes |
| Fused multiply-add [6] | No | Yes | Yes | Yes | Yes |

1. Algebraically correct but may lack precision in boundary cases.
2. Rewriting values raised to a constant power into an algebraically equivalent series of multiplications and/or square roots.
3. Rewriting exponentiations (a^b) not previously optimized into the algebraically equivalent form exp(b * ln(a)).
4. For example, x/y is transformed to x * (1.0/y).
5. 32-bit division, square root, and reciprocal square root use very fast but less precise code sequences.
6. Uses fused multiply-add instructions on architectures that support it.
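The reciprocal rewrite of division and the rewrite of a constant power into multiplications can both be written out by hand to see what they change. The sketch below does that; the function names are hypothetical and the code only imitates the transformed expressions in source form, so it should be read as an illustration of the arithmetic rather than of the compiler's output.

```c
/* Division as written in the source: a single correctly rounded
 * operation. */
double div_direct(double x, double y)
{
    return x / y;
}

/* The reciprocal form: 1.0 / y is rounded first, then the product is
 * rounded again, so the result can differ from x / y in the last bit. */
double div_reciprocal(double x, double y)
{
    double r = 1.0 / y;
    return x * r;
}

/* A value raised to the constant power 3, written as the series of
 * multiplications that an exponentiation rewrite could produce. */
double cube_by_multiplication(double x)
{
    return x * x * x;
}
```

With IEEE double precision and each operation rounded to double, div_direct(49.0, 49.0) is exactly 1.0, while div_reciprocal(49.0, 49.0) is slightly less than 1.0, because 1.0/49.0 is not exactly representable and the rounding error survives the multiplication. Differences of this size are the price of the faster sequence, which is why the rewrite is inhibited at fp0 and fp1.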