4.10 Functions on floating-point numbers
Recall that a floating-point number consists of a sign s, an exponent e and a mantissa m. The value of the number is (-1)^s * 2^e * m.
Each of the classes cl_F, cl_SF, cl_FF, cl_DF, cl_LF defines the following operations.
type scale_float (const type& x, sintC delta)
type scale_float (const type& x, const cl_I& delta)
Returns x*2^delta. This is more efficient than an explicit multiplication because it copies x and modifies the exponent.
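As a rough illustration of the idea behind scale_float, here is a sketch on a plain double using the standard library, not CLN itself. The name scale_double is hypothetical; std::ldexp plays the role of the exponent adjustment described above.

```cpp
#include <cmath>

// Hypothetical stand-in for scale_float on a built-in double (not CLN code).
// std::ldexp(x, delta) computes x * 2^delta by adjusting the binary exponent.
double scale_double(double x, int delta) {
    return std::ldexp(x, delta);
}
```

For example, scale_double(3.0, 4) yields 48.0 and scale_double(1.0, -1) yields 0.5.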
The following functions provide an abstract interface to the underlying representation of floating-point numbers.
sintE float_exponent (const type& x)
Returns the exponent e of x. For x = 0.0, this is 0. For x non-zero, this is the unique integer with 2^(e-1) <= abs(x) < 2^e.
sintL float_radix (const type& x)
Returns the base of the floating-point representation. This is always 2.
type float_sign (const type& x)
Returns the sign s of x as a float. The value is 1 for x >= 0, -1 for x < 0.
uintC float_digits (const type& x)
Returns the number of mantissa bits in the floating-point representation of x, including the hidden bit. The value only depends on the type of x, not on its value.
uintC float_precision (const type& x)
Returns the number of significant mantissa bits in the floating-point representation of x. Since denormalized numbers are not supported, this is the same as float_digits(x) if x is non-zero, and 0 if x = 0.
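The definitions above can be mirrored on a built-in double with the standard library. This is only a sketch under the assumption that double is an IEEE binary64 type; the function names are hypothetical and none of this is CLN code. Unlike the CLN types, double does have denormalized values, which precision_of below ignores.

```cpp
#include <cmath>
#include <limits>

// Analogue of float_exponent: the unique e with 2^(e-1) <= |x| < 2^e,
// and 0 for x == 0. std::frexp stores exactly this exponent.
int exponent_of(double x) {
    int e = 0;
    (void)std::frexp(x, &e);
    return e;
}

// Analogue of float_radix: the base of the representation.
int radix_of(double) {
    return std::numeric_limits<double>::radix;
}

// Analogue of float_sign: 1 for x >= 0, -1 for x < 0, as a float.
double sign_of(double x) {
    return x >= 0 ? 1.0 : -1.0;
}

// Analogue of float_digits: mantissa bits including the hidden bit.
// Depends only on the type, not on the value.
int digits_of(double) {
    return std::numeric_limits<double>::digits;
}

// Analogue of float_precision: digits_of(x) for non-zero x, 0 for zero.
// Assumption: denormalized doubles are not treated specially here.
int precision_of(double x) {
    return x == 0 ? 0 : digits_of(x);
}
```

For instance, exponent_of(8.0) is 4 because 2^3 <= 8 < 2^4, and exponent_of(0.75) is 0 because 2^-1 <= 0.75 < 2^0.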
The complete internal representation of a float is encoded in the type decoded_float (or decoded_sfloat, decoded_ffloat, decoded_dfloat, decoded_lfloat, respectively), defined by
struct decoded_typefloat { type mantissa; cl_I exponent; type sign; };
and returned by the function
decoded_typefloat decode_float (const type& x)
For x non-zero, this returns (-1)^s, e, m with x = (-1)^s * 2^e * m and 0.5 <= m < 1.0. For x = 0, it returns (-1)^s = 1, e = 0, m = 0. e is the same as returned by the function float_exponent.
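To make the decomposition concrete, the following sketch performs the same kind of decoding on a built-in double. It is an analogue written with std::frexp, not the CLN implementation; DecodedDouble and decode_double are hypothetical names chosen to echo the struct layout shown above.

```cpp
#include <cmath>

// Hypothetical analogue of decoded_typefloat for a built-in double.
struct DecodedDouble {
    double mantissa;  // m, with 0.5 <= m < 1.0 for non-zero input
    int exponent;     // e
    double sign;      // (-1)^s as a float: 1.0 or -1.0
};

// Splits x into sign, exponent and mantissa so that
// x == sign * 2^exponent * mantissa. Zero decodes to (0, 0, 1).
DecodedDouble decode_double(double x) {
    if (x == 0) {
        return {0.0, 0, 1.0};
    }
    int e = 0;
    double m = std::frexp(std::fabs(x), &e);
    return {m, e, x < 0 ? -1.0 : 1.0};
}
```

Decoding -6.0 this way gives sign -1, exponent 3 and mantissa 0.75, since 6 = 2^3 * 0.75.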
A complete decoding in terms of integers is provided as type
struct cl_idecoded_float { cl_I mantissa; cl_I exponent; cl_I sign; };
by the following function:
cl_idecoded_float integer_decode_float (const type& x)
For x non-zero, this returns (-1)^s, e, m with x = (-1)^s * 2^e * m and m an integer with float_digits(x) bits. For x = 0, it returns (-1)^s = 1, e = 0, m = 0. WARNING: The exponent e is not the same as the one returned by the functions decode_float and float_exponent.
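The difference between the two exponents follows from the two definitions: scaling a mantissa in [0.5, 1) up to an integer with float_digits(x) bits multiplies it by 2^float_digits(x), so the exponent must drop by the same amount. The sketch below shows this on a built-in double, assuming a 53-bit IEEE binary64 mantissa. The names are hypothetical, and a 64-bit integer stands in for cl_I.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical analogue of cl_idecoded_float for a built-in double.
struct IntDecodedDouble {
    std::int64_t mantissa;  // integer with 53 bits for non-zero input
    int exponent;           // shifted down by the number of mantissa bits
    int sign;               // (-1)^s: 1 or -1
};

// Splits x so that x == sign * 2^exponent * mantissa with an integer
// mantissa. Zero decodes to (0, 0, 1).
IntDecodedDouble integer_decode_double(double x) {
    if (x == 0) {
        return {0, 0, 1};
    }
    const int digits = std::numeric_limits<double>::digits;
    int e = 0;
    double m = std::frexp(std::fabs(x), &e);  // 0.5 <= m < 1
    // m * 2^digits is an exact integer below 2^digits.
    auto mantissa = static_cast<std::int64_t>(std::ldexp(m, digits));
    return {mantissa, e - digits, x < 0 ? -1 : 1};
}
```

For 6.0 this yields mantissa 3 * 2^51 and exponent -50, whereas the fractional decoding of the same value has exponent 3.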
One further function is implemented only for class cl_F:
cl_F float_sign (const cl_F& x, const cl_F& y)
This returns a floating-point number whose precision and absolute value are those of y and whose sign is that of x. If x is zero, it is treated as positive. The same holds for y.
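The sign-transfer behaviour can be sketched on built-in doubles as follows. This is an illustration, not CLN code, and sign_from is a hypothetical name. std::copysign is deliberately not used here, because it would propagate the sign of a negative zero, whereas the description above treats zero as positive.

```cpp
#include <cmath>

// Hypothetical analogue of the two-argument float_sign on doubles:
// the result has the absolute value of y and the sign of x.
// A zero x counts as positive.
double sign_from(double x, double y) {
    double magnitude = std::fabs(y);
    return x < 0 ? -magnitude : magnitude;
}
```

For example, sign_from(-1.0, 3.0) is -3.0, while sign_from(2.0, -3.0) and sign_from(0.0, -3.0) are both 3.0.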
This document was generated on August 27, 2013 using texi2html 5.0.