
## 4.10 Functions on floating-point numbers

Recall that a floating-point number consists of a sign `s`, an exponent `e` and a mantissa `m`. The value of the number is `(-1)^s * 2^e * m`.

Each of the classes `cl_F`, `cl_SF`, `cl_FF`, `cl_DF`, `cl_LF` defines the following operations.

`type scale_float (const type& x, sintC delta)`
`type scale_float (const type& x, const cl_I& delta)`

Returns `x*2^delta`. This is more efficient than an explicit multiplication because it only copies `x` and adjusts the exponent.
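
As a rough analogy outside CLN, the C++ standard library offers the same kind of operation for built-in types. The sketch below uses `std::ldexp` on `double`; the helper name `scale_double` is hypothetical and is not part of CLN.

```cpp
#include <cmath>

// Hypothetical helper (not part of CLN): scale a double by 2^delta.
// std::ldexp adjusts the binary exponent instead of performing a
// general floating-point multiplication.
double scale_double(double x, int delta) {
    return std::ldexp(x, delta);
}
```

With this helper, `scale_double(0.75, 4)` yields `12.0`, and scaling back by `-4` recovers `0.75` exactly, since only the exponent changes.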

The following functions provide an abstract interface to the underlying representation of floating-point numbers.

`sintE float_exponent (const type& x)`

Returns the exponent `e` of `x`. For `x = 0.0`, this is 0. For `x` non-zero, this is the unique integer with `2^(e-1) <= abs(x) < 2^e`.
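
The exponent convention above can be checked on ordinary `double` values. This is a minimal sketch using only the C++ standard library, not CLN; the helper name `exponent_of` is an assumption made for illustration.

```cpp
#include <cmath>

// Hypothetical helper (not part of CLN): the integer e with
// 2^(e-1) <= |x| < 2^e for finite non-zero x, and 0 for x == 0,
// following the convention described in the text.
int exponent_of(double x) {
    if (x == 0.0) return 0;
    return std::ilogb(x) + 1;  // ilogb(x) is floor(log2(|x|))
}
```

For example, `exponent_of(10.0)` is 4 because `2^3 <= 10 < 2^4`, and `exponent_of(0.5)` is 0 because `2^-1 <= 0.5 < 2^0`.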

`sintL float_radix (const type& x)`

Returns the base of the floating-point representation. This is always `2`.

`type float_sign (const type& x)`

Returns the sign of `x` as a float, that is, `(-1)^s`. The value is 1 for `x` >= 0 and -1 for `x` < 0.

`uintC float_digits (const type& x)`

Returns the number of mantissa bits in the floating-point representation of `x`, including the hidden bit. The value only depends on the type of `x`, not on its value.

`uintC float_precision (const type& x)`

Returns the number of significant mantissa bits in the floating-point representation of `x`. Since denormalized numbers are not supported, this is the same as `float_digits(x)` if `x` is non-zero, and 0 if `x` = 0.

The complete internal representation of a float is encoded in the type `decoded_float` (or `decoded_sfloat`, `decoded_ffloat`, `decoded_dfloat`, `decoded_lfloat`, respectively), defined by

```
struct decoded_typefloat {
        type mantissa; cl_I exponent; type sign;
};
```

and returned by the function

`decoded_typefloat decode_float (const type& x)`

For `x` non-zero, this returns `(-1)^s`, `e`, `m` with `x = (-1)^s * 2^e * m` and `0.5 <= m < 1.0`. For `x` = 0, it returns `(-1)^s`=1, `e`=0, `m`=0. `e` is the same as returned by the function `float_exponent`.
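
The normalization `0.5 <= m < 1.0` is the same one `std::frexp` uses for built-in floating-point types. The following sketch decodes a `double` in that style. It is an analogy, not CLN code: `DecodedDouble` and `decode_double` are hypothetical names, and the exponent is a plain `int` rather than a `cl_I`.

```cpp
#include <cmath>

// Hypothetical struct (not CLN's decoded_float): mantissa, exponent and
// sign of a double, with 0.5 <= mantissa < 1 for non-zero input.
struct DecodedDouble {
    double mantissa;
    int exponent;
    double sign;
};

// Hypothetical helper: split x into sign * 2^exponent * mantissa.
DecodedDouble decode_double(double x) {
    int e = 0;
    // frexp returns m with |x| == m * 2^e; for x == 0 it gives m == 0, e == 0.
    double m = std::frexp(std::fabs(x), &e);
    return DecodedDouble{m, e, x < 0.0 ? -1.0 : 1.0};
}
```

Decoding `-10.0` this way gives sign -1, exponent 4 and mantissa 0.625, since `10 = 2^4 * 0.625`.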

A complete decoding in terms of integers is provided as type

```
struct cl_idecoded_float {
        cl_I mantissa; cl_I exponent; cl_I sign;
};
```

by the following function:

`cl_idecoded_float integer_decode_float (const type& x)`

For `x` non-zero, this returns `(-1)^s`, `e`, `m` with `x = (-1)^s * 2^e * m` and `m` an integer with `float_digits(x)` bits. For `x` = 0, it returns `(-1)^s`=1, `e`=0, `m`=0. WARNING: The exponent `e` is not the same as the one returned by the functions `decode_float` and `float_exponent`.
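
To see why an integer mantissa forces a different exponent, consider the same decomposition on a built-in `double`. This is a sketch under stated assumptions: it uses only the C++ standard library, `IntDecodedDouble` and `integer_decode_double` are hypothetical names, and a 64-bit integer stands in for `cl_I`. In this sketch the exponent is smaller than the fractional-mantissa exponent by exactly the number of mantissa bits; the text above only states that the CLN exponents differ, so treat the exact offset as a property of this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical struct (not CLN's cl_idecoded_float).
struct IntDecodedDouble {
    std::int64_t mantissa;
    int exponent;
    int sign;
};

// Hypothetical helper: x == sign * mantissa * 2^exponent, mantissa an integer.
IntDecodedDouble integer_decode_double(double x) {
    assert(std::isfinite(x));  // NaN and infinities have no such decoding
    constexpr int digits = std::numeric_limits<double>::digits;  // 53 for IEEE 754 binary64
    if (x == 0.0) return IntDecodedDouble{0, 0, 1};
    int e = 0;
    double m = std::frexp(std::fabs(x), &e);  // 0.5 <= m < 1
    // Shifting the fraction left by `digits` bits makes it an exact integer,
    // so the exponent must drop by the same amount.
    auto im = static_cast<std::int64_t>(std::ldexp(m, digits));
    return IntDecodedDouble{im, e - digits, x < 0.0 ? -1 : 1};
}
```

On a platform with 53-bit `double` mantissas, decoding `10.0` gives mantissa `5 * 2^50` and exponent -49, whereas the fractional decoding of the same value has exponent 4.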

One further function is implemented only for class `cl_F`:

`cl_F float_sign (const cl_F& x, const cl_F& y)`

This returns a floating-point number whose precision and absolute value are those of `y` and whose sign is that of `x`. If `x` is zero, it is treated as positive. The same applies to `y`.
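
The sign-transfer rule can be sketched for built-in `double` values as follows. This is an illustration only, not CLN code, and `sign_transfer` is a hypothetical name. It deliberately avoids `std::copysign`, which honors the sign bit of a negative zero, whereas the rule above treats zero as positive.

```cpp
#include <cmath>

// Hypothetical helper (not part of CLN): magnitude of y, sign of x.
// A zero x counts as positive, and a zero y yields positive zero.
double sign_transfer(double x, double y) {
    double mag = std::fabs(y);
    if (mag == 0.0) return 0.0;
    return x < 0.0 ? -mag : mag;
}
```

For example, `sign_transfer(-2.0, 3.5)` is `-3.5`, while `sign_transfer(0.0, -1.0)` is `1.0`.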


This document was generated on August 27, 2013 using texi2html 5.0.
