Accelerate(7)        BSD Miscellaneous Information Manual        Accelerate(7)

NAME

     Accelerate vecLib vImage AltiVec vMathLib BLAS LAPACK vDSP vBigNum
     vBasicOps Vector Computation Velocity Engine Extended Math Library --
     This man page introduces the vector instruction set extension to the Pow-
     erPC architecture known as Velocity Engine (or AltiVec), the Accelerate
     umbrella framework, its constituent libraries and programming support in
     Mac OS X.

DESCRIPTION

     The PowerPC vector instruction set architecture is based on a separate
     SIMD-style execution unit with inherently high data parallelism.  This
     parallelism is extended by superscalar dispatch to multiple execution
     units and by pipelining within the execution units.  All vector
     instructions are designed to be easily pipelined, with pipeline
     latencies no greater than those of the scalar double-precision
     floating-point fused multiply-add class of instructions.  There are no
     operating mode switches that would preclude fine-grained interleaving of
     vector instructions with the existing floating-point and integer
     instructions.  Parallelism with the integer and floating-point
     instructions is simplified by the fact that the vector unit never
     generates an exception and has few shared resources or communication
     paths that would require tight synchronization with the other units.

Highlights

     Fixed vector length of 128 bits (16 8-bit elements, 8 16-bit elements,
     or 4 32-bit elements).
     Signed and unsigned 8-, 16-, and 32-bit integers, and IEEE single-preci-
     sion floats.
     Saturation arithmetic.
     32-register namespace.
     Vector register file architecturally separate from floating-point and
     integer registers.
     No mode switching that would increase the overhead of using the instruc-
     tions.
     4-operand, non-destructive instructions (3 source, 1 result).
     Operations selected based on utility to digital signal processing algo-
     rithms (including 2D and 3D image processing).
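
     As a rough illustration of two items above, the following portable C
     sketch emulates a 128-bit vector of sixteen unsigned 8-bit elements and
     contrasts saturating addition with ordinary modular addition.  It is a
     scalar model for exposition only, not how the vector unit is programmed;
     the type vec_u8x16 and both function names are hypothetical.  On the
     vector unit the saturating form is a single instruction, which the
     AltiVec C interface exposes as vec_adds.

```c
#include <stdint.h>

#define LANES_U8 16   /* 128 bits / 8 bits per element */

/* Hypothetical stand-in for one 128-bit vector register viewed as
 * sixteen unsigned 8-bit elements. */
typedef struct { uint8_t lane[LANES_U8]; } vec_u8x16;

/* Element-wise saturating add: sums above 255 clamp to 255 instead of
 * wrapping around modulo 256. */
vec_u8x16 add_saturate_u8(vec_u8x16 a, vec_u8x16 b)
{
    vec_u8x16 r;
    for (int i = 0; i < LANES_U8; i++) {
        unsigned sum = (unsigned)a.lane[i] + (unsigned)b.lane[i];
        r.lane[i] = (uint8_t)(sum > UINT8_MAX ? UINT8_MAX : sum);
    }
    return r;
}

/* Same inputs with ordinary modular arithmetic, for comparison. */
vec_u8x16 add_modulo_u8(vec_u8x16 a, vec_u8x16 b)
{
    vec_u8x16 r;
    for (int i = 0; i < LANES_U8; i++)
        r.lane[i] = (uint8_t)(a.lane[i] + b.lane[i]);
    return r;
}
```

     Adding 200 and 100 in every lane gives 255 with add_saturate_u8() and 44
     (300 modulo 256) with add_modulo_u8().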

Who benefits?

     Many of the services provided by Mac OS X (e.g., Quartz, QuickTime,
     OpenGL, CoreAudio) already exploit the vector acceleration available on
     Macintosh G4 and G5 computers.  All Mac OS X users enjoy these benefits.

     Many applications that run on Mac OS X (e.g., iTunes, iMovie) have
     already been coded to use the vector libraries and vector instruction
     set.  Users of these applications enjoy the benefits of vector
     acceleration.

     Software developers who would like their code to use the vector facility
     on Macintosh G4 and G5 computers may choose either or both of the
     following:
     (1) Make explicit calls to entry points in the Accelerate framework.
     Apple has optimized many of these routines for the vector engine (see
     the framework discussion that follows).
     (2) Program directly to the vector unit using the "Programming Interface
     Model."

     Note that a programmer must take explicit action (as above) to engage
     the vector engine; otherwise it remains idle.

Where to go from here:

     Browse a comprehensive introduction to vector programming:
     http://developer.apple.com/hardware/ve

     Examine the prototypes for functions you can invoke:

     /System/Library/Frameworks/vecLib.framework/Headers/*.h

     /System/Library/Frameworks/Accelerate.framework/Frameworks/vImage.framework/Headers/*.h

     Include the interfaces in the code you write:

     #include <Accelerate/Accelerate.h>

     Compile and link your code:

     cc -faltivec -framework Accelerate file.c
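
     The steps above can be sketched with a small source file.  The example
     below assumes a hypothetical opt-in macro, USE_ACCELERATE, and a
     hypothetical wrapper name, dot_product; neither is part of the
     framework.  With the macro defined, the wrapper calls the standard C
     BLAS routine cblas_sdot; without it, a plain scalar loop is used so the
     file also builds where the framework is unavailable.

```c
#ifdef USE_ACCELERATE            /* hypothetical opt-in macro, see text */
#include <Accelerate/Accelerate.h>
#endif

/* Dot product of two float arrays of length n (hypothetical wrapper). */
float dot_product(const float *x, const float *y, int n)
{
#ifdef USE_ACCELERATE
    /* Standard C BLAS entry point; both arrays are read with stride 1. */
    return cblas_sdot(n, x, 1, y, 1);
#else
    /* Portable scalar fallback so the sketch builds anywhere. */
    float sum = 0.0f;
    for (int i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
#endif
}
```

     To take the framework path, add the macro to the compile command, for
     example: cc -DUSE_ACCELERATE -faltivec -framework Accelerate file.c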

Accelerate Umbrella Framework

     The Accelerate umbrella framework encompasses all the libraries provided
     with Mac OS X that Apple has optimized for high performance vector and
     numerical computing.  Subsequent sections describe the sub-frameworks
     that make up the Accelerate framework.

vImage Framework

     A collection of basic image processing filters such as convolution,
     morphological operations, and geometric transforms.  Alpha compositing
     and histogram operations are also supported.

vecLib Framework

     The vecLib framework is a collection of facilities covering digital
     signal processing (vDSP), matrix computations (BLAS), numerical linear
     algebra (LAPACK), mathematical routines (vMathLib), basic operations
     (vBasicOps) and large number calculations (vBigNum).

     The vDSP, BLAS and LAPACK components of vecLib run in both the scalar
     and vector domains.  vecLib automatically detects the presence of the
     vector engine and uses it.  vMathLib mirrors the existing scalar libm on
     the vector engine, and vBasicOps is meant to complement the processor by
     providing more functionality, such as a 32x32 vector integer multiply.
     vBigNum, vBasicOps and vMathLib run only on the vector engine.

     There is also another matrix computation package in vecLib called
     vectorOps.  It works in much the same spirit as the BLAS.  It is best
     suited for small problems where availability of source is preferred.  It
     can also be used as an educational tool to gain insight into the working
     of the PowerPC vector unit.  In most cases, the use of BLAS instead of
     vectorOps is recommended.

vDSP

     The vDSP Library provides mathematical functions for applications such as
     speech, sound, audio, and video processing, diagnostic medical imaging,
     radar signal processing, seismic analysis, and scientific data process-
     ing.

     The vDSP functions operate on real and complex data types. The functions
     include data type conversions, fast Fourier transforms (FFTs), and vec-
     tor-to-vector and vector-to-scalar operations.
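
     To make the last two categories concrete, here is a scalar reference
     sketch of one vector-to-vector operation (element-wise add) and one
     vector-to-scalar operation (sum of elements).  The names ref_vadd and
     ref_sum are hypothetical, and the stride parameters reflect an
     assumption of this sketch about how such routines are parameterized;
     consult vDSP.h for the actual prototypes.

```c
#include <stddef.h>

/* Vector-to-vector: c[i*ic] = a[i*ia] + b[i*ib] for i in [0, n).
 * ia, ib and ic are element strides (1 means contiguous). */
void ref_vadd(const float *a, ptrdiff_t ia,
              const float *b, ptrdiff_t ib,
              float *c, ptrdiff_t ic, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c[(ptrdiff_t)i * ic] = a[(ptrdiff_t)i * ia] + b[(ptrdiff_t)i * ib];
}

/* Vector-to-scalar: *result = sum of a[i*ia] for i in [0, n). */
void ref_sum(const float *a, ptrdiff_t ia, float *result, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += a[(ptrdiff_t)i * ia];
    *result = sum;
}
```

     With a stride of 2 on the first input, ref_vadd reads every other
     element of that array, which is one way to address a single channel of
     interleaved data.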

     The vDSP functions have been implemented in two ways: as vectorized code
     (for single precision only), which uses the vector unit on the PowerPC G4
     and G5 microprocessors, and as scalar code, which runs on Macintosh mod-
     els that have a G3 microprocessor.

     It is noteworthy that vDSP's FFTs are among the fastest available
     implementations of the discrete Fourier transform.

     The vDSP Library itself is included as part of vecLib in Mac OS X.  The
     header file, vDSP.h, defines data types used by the vDSP functions and
     symbols accepted as flag arguments to vDSP functions.

     vDSP functions are available in single and double precision.  Note that
     only the single-precision functions are vectorized, because the
     instruction set architecture of the vector engine on G4 and G5
     processors supports single-precision floating point only.

     For more information about vDSP, download the manual at
     <http://developer.apple.com/hardware/ve/downloads/vDSP.sit.hqx>

BLAS

     The Basic Linear Algebra Subprograms (BLAS) are high quality routines for
     performing basic vector and matrix operations.  Level 1 BLAS consists of
     vector-vector operations, Level 2 BLAS consists of matrix-vector
     operations, and Level 3 BLAS consists of matrix-matrix operations.  The
     efficiency, portability, and wide adoption of the BLAS have made them
     commonplace in the development of high quality linear algebra software
     such as LAPACK and in other technologies requiring fast vector and matrix
     calculations.  All the industry standard FORTRAN BLAS entry points and
     the standard C BLAS entry points are exported from the vecLib framework
     (the latter are commonly denoted the legacy C BLAS).  For more
     information refer to <http://www.netlib.org/blas/faq/>.
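
     The three levels can be illustrated with naive reference loops.  These
     are sketches under simplifying assumptions (column-major storage, unit
     strides, no transposition, and no scaling factors other than alpha in
     the first routine), with hypothetical names.  They correspond loosely to
     the standard routines DAXPY, DGEMV and DGEMM and are not the optimized
     implementations in vecLib.

```c
/* Level 1 (vector-vector): y <- alpha * x + y */
void ref_axpy(int n, double alpha, const double *x, double *y)
{
    for (int i = 0; i < n; i++)
        y[i] += alpha * x[i];
}

/* Level 2 (matrix-vector): y <- A * x, where A is m-by-n, column-major. */
void ref_gemv(int m, int n, const double *a, const double *x, double *y)
{
    for (int i = 0; i < m; i++) {
        double sum = 0.0;
        for (int j = 0; j < n; j++)
            sum += a[i + j * m] * x[j];
        y[i] = sum;
    }
}

/* Level 3 (matrix-matrix): C <- A * B, where A is m-by-k, B is k-by-n and
 * C is m-by-n, all column-major. */
void ref_gemm(int m, int n, int k,
              const double *a, const double *b, double *c)
{
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            double sum = 0.0;
            for (int p = 0; p < k; p++)
                sum += a[i + p * m] * b[p + j * k];
            c[i + j * m] = sum;
        }
    }
}
```

     For the 2-by-2 matrix with rows (1 2) and (3 4), stored column-major as
     1, 3, 2, 4, ref_gemv with x = (1, 1) gives (3, 7), and ref_gemm of the
     matrix with itself gives rows (7 10) and (15 22).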

LAPACK

     LAPACK provides routines for solving systems of simultaneous linear equa-
     tions, least-squares solutions of linear systems of equations, eigenvalue
     problems, and singular value problems.  The associated matrix factoriza-
     tions (LU, Cholesky, QR, SVD, Schur, generalized Schur) are also pro-
     vided, as are related computations such as reordering of the Schur fac-
     torizations and estimating condition numbers. Dense and banded matrices
     are handled, but not general sparse matrices. In all areas, similar func-
     tionality is provided for real and complex matrices, in both single and
     double precision.  LAPACK in vecLib makes full use of the optimized BLAS
     and fully benefits from their performance.  All the industry standard
     FORTRAN LAPACK entry points are exported from the vecLib framework.  C
     programs may make calls to the FORTRAN entry points using the prototypes
     set out in
     "/System/Library/Frameworks/vecLib.framework/Headers/clapack.h".

     For more information refer to <http://www.netlib.org/lapack/index/>.

     Note that vecLib's LAPACK was built using the FORTRAN-to-C converter
     called f2c.  Users must be aware that:

     All arguments must be passed by reference, including scalar arguments
     such as the matrix dimensions M and N.

     A two-dimensional array is arranged differently in memory in Fortran
     (column-major order) than in C (row-major order).

     For more information refer to <http://www.netlib.org/clapack/readme>.
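
     The two points above can be sketched in C.  The routine ref_element_
     below is a hypothetical stand-in written in the f2c calling style; it is
     not a LAPACK entry point.  It takes every argument by pointer and reads
     from column-major storage, and to_column_major (also hypothetical)
     repacks a C array into the layout such a routine expects.

```c
/* Hypothetical Fortran-style routine (not a LAPACK entry point).
 * Every argument is a pointer, including the scalar indices i and j and
 * the leading dimension lda.  Stores A(i, j) in *value, using one-based
 * Fortran indices into column-major storage. */
int ref_element_(const int *i, const int *j,
                 const double *a, const int *lda, double *value)
{
    *value = a[(*i - 1) + (*j - 1) * (*lda)];
    return 0;
}

/* Copy a row-major C array (rows x cols) into column-major storage. */
void to_column_major(int rows, int cols,
                     const double *row_major, double *col_major)
{
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            col_major[r + c * rows] = row_major[r * cols + c];
}
```

     For the 2-by-3 C array {{1, 2, 3}, {4, 5, 6}}, the repacked storage is
     1, 4, 2, 5, 3, 6.  Asking for A(1,2) then yields 2; passing the C array
     unconverted with lda = 2 would yield 3 instead.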

vBasicOps

     A collection of basic operations such as add, subtract, multiply and
     divide that complement the vector processor's basic operations up to 128
     bits.  Consult
     "/System/Library/Frameworks/vecLib.framework/Headers/vBasicOps.h" for
     further information.
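
     As an illustration of the kind of arithmetic involved, the sketch below
     builds a 128-bit unsigned add from four 32-bit limbs and keeps the full
     64-bit result of a 32x32 multiply.  It is portable scalar C with
     hypothetical names (u128_sketch, u128_add, mul_32x32), not the vBasicOps
     interface; see the header for the real prototypes.

```c
#include <stdint.h>

/* Hypothetical 128-bit unsigned integer stored as four 32-bit limbs,
 * with limb[0] the least significant. */
typedef struct { uint32_t limb[4]; } u128_sketch;

/* r = a + b modulo 2^128; returns the carry out of the top limb. */
uint32_t u128_add(const u128_sketch *a, const u128_sketch *b, u128_sketch *r)
{
    uint64_t carry = 0;
    for (int i = 0; i < 4; i++) {
        /* The widest possible sum, (2^32 - 1) * 2 + 1, fits in 64 bits. */
        uint64_t sum = (uint64_t)a->limb[i] + b->limb[i] + carry;
        r->limb[i] = (uint32_t)sum;   /* low 32 bits stay in this limb */
        carry = sum >> 32;            /* high bit moves to the next limb */
    }
    return (uint32_t)carry;
}

/* Full 32x32 -> 64-bit product, kept without truncation. */
uint64_t mul_32x32(uint32_t a, uint32_t b)
{
    return (uint64_t)a * (uint64_t)b;
}
```

     Adding 1 to the value 2^64 - 1 carries through the two low limbs and
     sets the third limb to 1, with no carry out of the top limb.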

vBigNum

     Routines for calculations on large numbers, starting at 128 bits.
     Consult "/System/Library/Frameworks/vecLib.framework/Headers/vBigNum.h"
     for further information.

Darwin                           June 6, 2002                           Darwin

Mac OS X 10.4.6 - Generated Sun Apr 16 13:38:10 CDT 2006