mirror of
https://github.com/Samsung/escargot.git
synced 2026-06-22 10:01:50 +00:00
161 lines
7.2 KiB
Text
161 lines
7.2 KiB
Text
Tiny Big Float library
|
|
----------------------
|
|
Copyright (c) 2017-2020 Fabrice Bellard
|
|
|
|
LibBF is a small library to handle arbitrary precision binary or
|
|
decimal floating point numbers. Its compiled size is about 90 KB of
|
|
x86 code and has no dependency on other libraries. It is not the
|
|
fastest library nor the smallest but it tries to be simple while using
|
|
asymptotically optimal algorithms. The basic arithmetic operations
|
|
have a near linear running time.
|
|
|
|
The TinyPI example computes billions of digits of Pi using the
|
|
Chudnovsky formula.
|
|
|
|
1) Features
|
|
-----------
|
|
|
|
- Arbitrary precision floating point numbers in base 2 using the IEEE
|
|
754 semantics (including subnormal numbers, infinities and
|
|
NaN).
|
|
- All operations are exactly rounded using the 5 IEEE 754 rounding
|
|
modes (round to nearest with ties to even or away from zero, round
|
|
to zero, -/+ infinity). The additional non-deterministic faithful
|
|
rounding mode is supported when a lower or deterministic running
|
|
time is necessary.
|
|
- Stateless API (each function takes as input the rounding mode,
|
|
mantissa and exponent precisions in bits and return the IEEE status
|
|
flags).
|
|
- The basic arithmetic operations (addition, subtraction,
|
|
multiplication, division, square root) have a near linear running
|
|
time.
|
|
- Multiplication using a SIMD optimized Number Theoretic Transform.
|
|
- Exactly rounded floating point input and output in any base between
|
|
2 and 36 with near linear runnning time. Floating point output can
|
|
select the smallest amount of digits to get the required precision.
|
|
- Transcendental functions are supported (exp, log, pow, sin, cos, tan,
|
|
asin, acos, atan, atan2).
|
|
- Operations on arbitrarily large integers are supported by using a
|
|
special "infinite" precision. Integer division with remainder and
|
|
logical operations (assuming two complement binary representation)
|
|
are implemented.
|
|
- Arbitrary precision floating point numbers in base 10 corresponding
|
|
to the IEEE 754 2008 semantics with the limitation that the mantissa
|
|
is always normalized. The basic arithmetic operations, output and
|
|
input are supported with a quadratic running time.
|
|
- Easy to embed: a few C files need to be copied, the memory allocator
|
|
can be redefined, the memory allocation failures are tested.
|
|
- MIT license.
|
|
|
|
2) Compilation
|
|
--------------
|
|
|
|
Edit the top of the Makefile to select the build options. By default,
|
|
the MPFR library is used to compile the test tools (bftest and
|
|
bfbench) but it is not needed to build libbf. The included SoftFP code
|
|
(softfp* files) is only used by the bftest test tool.
|
|
|
|
TinyPI example: the "tinypi" executable uses the portable code. The
|
|
"tinypi-avx2" executable uses the AVX2 implementation. An x86 CPU of
|
|
at least the Intel Haswell generation is necessary for AVX2.
|
|
|
|
3) Design principles
|
|
--------------------
|
|
|
|
- Base 2 and IEEE 754 semantics were chosen so that it is possible to
|
|
get good performance and to compare the results with other libraries
|
|
or hardware implementations. Moreover, base 2 arbitrary precision is
|
|
easier to analyse and implement.
|
|
|
|
- The support of subnormal numbers and of a configurable number of
|
|
bits for the exponent allows the exact emulation of IEEE 754
|
|
floating hardware.
|
|
|
|
- The stateless API ensures that there is no global state to save and
|
|
restore between operations. The rounding mode, subnormal flag and
|
|
number of exponent bits are ored to a single "flags" parameter to
|
|
limit the verbosity of the API. The number of exponent bits 'n' is
|
|
specified as '(M-n)' where M is the maximum number of exponent bits
|
|
so that '0' always indicates the maximum number of exponent bits.
|
|
|
|
- All the IEEE 754 status flags are returned by each operation. The
|
|
user can easily or them when necessary.
|
|
|
|
- Unlike other libraries (such as MPFR [2]), the numbers have no
|
|
attached precision. The general rule is that each operation is
|
|
internally computed with infinite precision and then rounded with
|
|
the precision and rounding mode specified for the operation.
|
|
|
|
- In many computations it is necessary to use arbitrarily large
|
|
integers. LibBF support them without adding another number type by
|
|
providing a special "infinite" precision. There is a small overhead
|
|
of course because they are manipulated as floating point numbers but
|
|
there is no cost to convert between floating point numbers and
|
|
integers.
|
|
|
|
- The faithful rounding mode (i.e. the result is rounded to - or
|
|
+infinity non deterministically) is supported for all operations. It
|
|
usually gives a faster and deterministic running time. The
|
|
transcendental functions, inverse or inverse square root are
|
|
internally implemented to give a faithful rounding. When a
|
|
non-faithful rounding is requested by the user, the Ziv rounding
|
|
algorithm is invoked.
|
|
|
|
4) Implementation notes
|
|
-----------------------
|
|
|
|
- The code was tested on a 64 bit x86 CPU. It should be portable to
|
|
other CPUs. The portable version handles numbers with up to 4*10^16
|
|
digits. The AVX2 version handles numbers with up to 8*10^12 digits.
|
|
|
|
- 32 bits: the code compiles on 32 bit architectures but it is not
|
|
designed to be efficient nor scalable in this case. The size of the
|
|
numbers is limited to about 10 million digits.
|
|
|
|
- The Number Theoretic Transform is not the fastest algorithm for
|
|
small to medium numbers (i.e. a few million digits), but it gets
|
|
better when the size of the numbers grows. There is no round-off
|
|
errors as with Fast Fourier Transform, the memory usage is much
|
|
smaller and it is potentially easier to parallelize. This code
|
|
contains an original SIMD (AVX2 on x86) implementation using 64 bit
|
|
floating point numbers. It relies on the fact that the fused
|
|
multiply accumulate (FMA) operation gives access to the full
|
|
precision of the product of two 64 bit floating point numbers. The
|
|
portable code relies on the fact that the C compiler supports a
|
|
double word integer type (i.e. 128 bit integers on 64 bit). The
|
|
modulo operations were replaced with multiplications which are
|
|
usually faster.
|
|
|
|
- Base conversion: the algorithm is not the fastest one but it is
|
|
simple and still gives a near linear running time.
|
|
|
|
- This library reuses some ideas from TachusPI (
|
|
http://bellard.org/pi/pi2700e9/tpi.html ) . It is about 4 times
|
|
slower to compute Pi but is much smaller and simpler.
|
|
|
|
5) Known limitations
|
|
--------------------
|
|
|
|
- In some operations (such as the transcendental ones), there is no
|
|
rigourous proof of the rounding error. We expect to improve it by
|
|
reusing ideas from the MPFR algorithms. Some unlikely
|
|
overflow/underflow cases are also not handled in exp or pow.
|
|
|
|
- The transcendental operations are not speed optimized and do not use
|
|
an asymptotically optimal algorithm (the running time is in
|
|
O(n^(1/2)*M(n)) where M(n) is the time to multiply two n bit
|
|
numbers). A possible solution would be to implement a binary
|
|
splitting algorithm for exp and sin/cos (see [1]) and to use a
|
|
Newton based inversion to get log and atan.
|
|
|
|
- Memory allocation errors are not always correctly reported for the
|
|
transcendental operations.
|
|
|
|
6) References
|
|
-------------
|
|
|
|
[1] Modern Computer Arithmetic, Richard Brent and Paul Zimmermann,
|
|
Cambridge University Press, 2010
|
|
(https://members.loria.fr/PZimmermann/mca/pub226.html).
|
|
|
|
[2] The GNU MPFR Library (http://www.mpfr.org/)
|