Intel MKL version 10.1.1.019 is installed on the compute cluster. Set up the environment with:
source /opt/intel/mkl/10.1.1.019/tools/environment/mklvarsem64t.sh
Example in C
(....) #include <mkl_blas.h> (....)
Example compilation
icc -openmp -I/opt/intel/mkl/10.1.1.019/include source.c -o example -L/opt/intel/mkl/10.1.1.019/lib/em64t/ -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lm -lpthread
For the exact set of libraries required for your use case, see:
software.intel.com/en-us/articles/intel-mkl-link-line-advisor/
Overview
The Intel(R) Math Kernel Library (Intel(R) MKL) provides developers of
scientific, engineering and financial software with a set of linear
algebra routines, fast Fourier transforms, and vectorized math and
random number generation functions, all optimized for the latest
Intel(R) Pentium(R) 4 processors, Intel(R) Xeon(R) processors with
Streaming SIMD Extensions 3 (SSE3) and Intel(R) Extended Memory 64
Technology (Intel(R) EM64T), and Intel(R) Itanium(R) 2 processors.
This software also performs well on non-Intel (x86) processors.
Intel(R) MKL provides linear algebra functionality with LAPACK
(solvers and eigensolvers) plus level 1, 2, and 3 BLAS offering the
vector, vector-matrix, and matrix-matrix operations needed for complex
mathematical software computations. Users who prefer the FORTRAN
90/95 programming language may call LAPACK driver and computational
subroutines via special interfaces with reduced numbers of arguments.
Intel(R) MKL provides ScaLAPACK (Scalable LAPACK) and support
functionality including the Parallel Basic Linear Algebra Subprograms
(PBLAS). For solving sparse systems of equations, Intel(R) MKL
provides direct and iterative sparse solvers as well as a supporting
set of sparse BLAS (levels 1, 2, and 3).
Intel(R) MKL offers multidimensional fast Fourier transforms (1D, 2D,
3D) with mixed radix support (not limited to sizes of powers of 2).
Intel(R) MKL also provides distributed versions of these functions for
use on clusters. For the solution of partial differential equations
(PDE), Intel(R) MKL provides a few preconditioners to help with the
convergence of our iterative solvers. Optimization [Trust Region]
solvers provide efficient routines for solving nonlinear least squares
problems with and without boundary constraints.
Intel(R) MKL also includes a set of vectorized transcendental
functions (called the Vector Math Library (VML)) offering both greater
performance and excellent accuracy compared to the libm (scalar)
functions. The Vector Statistical Library (VSL) offers high
performance vectorized random number generators for a number of
probability distributions as well as convolution and correlation
routines.
The BLAS, LAPACK, direct sparse solver (DSS/PARDISO), FFT, VML library
functions, and optimization solvers in Intel(R) MKL are threaded using
OpenMP*. All of Intel(R) MKL is thread-safe (with the exception of
the deprecated ?lacon, ?lasq3, and ?lasq4 LAPACK routines; see the
reference manual for more information).
New in Intel(R) MKL 10.1
* Performance Improvements in the BLAS:
  * Performance improvements on Quad-Core Intel(R) Xeon(R) processor
    5400 series systems with 64-bit OS's:
    * SGEMM: 2% on 1 thread and 6% on 8 threads
    * DGEMM: 7% on 8 threads
    * CGEMM: 2% on 1 thread and 10% on 8 threads
    * ZGEMM: 7% on 1 thread and 11% on 8 threads
  * Performance improvements on Quad-Core Intel(R) Xeon(R) processor
    5400 series systems with 32-bit OS's:
    * SGEMM: 7-15% on 8 threads
    * DGEMM: 7-15% on 8 threads
  * Performance improvement on Intel(R) Core(TM) i7 processors with
    64-bit OS's:
    * SGEMM: 50% on 1 thread and 50% on 8 threads
    * DGEMM: 11% on 1 thread and 12% on 8 threads
    * CGEMM: 2-3% on 1 thread and 2-3% on 8 threads
    * ZGEMM: 2% on 1 thread
    * DTRSM: 20% on 1 thread and 20% on 8 threads for some cases.
* Improvements to the direct sparse solver (DSS/PARDISO):
  * The performance of out-of-core PARDISO was improved by 35% on
    average.
  * Support of separate backward/forward substitution for
    DSS/PARDISO has been added.
  * A new parameter for turning off iterative refinement for the DSS
    interface has been introduced.
  * A new parameter for checking sparse matrix structure has been
    introduced for the PARDISO interface.
  * The sparse solver functionality has now been integrated into the
    core math library and it is no longer necessary to link a
    separate solver library. See the user guide for more
    information.
  * The sparse solver functionality can now be linked dynamically.
* The capability to track and/or interrupt the progress of lengthy
LAPACK computations has been added via a callback function
mechanism. A function called mkl_progress can be defined in a user
application, which will be called regularly from a subset of the
MKL LAPACK routines. See the LAPACK Auxiliary and Utility Routines
chapter in the reference manual for more information. Refer to the
specific function descriptions to see which LAPACK functions
support the feature.
* Transposition functions have been added to Intel MKL. See the
"BLAS-like Extensions" section of chapter 2 in the reference manual
for further detail.
* The C++ std::complex type can now be used instead of MKL-specific
complex types.
* Improvements to the Discrete Fourier Transform Interface (DFTI)
  * Addition of the DftiCopyDescriptor convenience function
  * Reduction in the size of statically linked executables calling
    DFTI functions
  * Support for DFTI_REAL_REAL storage (i.e., real and imaginary
    parts in separate arrays) in complex-to-complex transforms
* An implementation of the Boost uBLAS matrix-matrix multiplication
routine is now provided which will make use of the highly
optimized version of DGEMM in the Intel MKL BLAS. See the user
guide for more information.
* Improvements to the sparse BLAS:
  * Support for all data types (single precision, complex and double
    complex) has been added.
  * Routines for computing the sum and product of two sparse
    matrices, both stored in the compressed sparse row format,
    have been added.
  * Routines for converting between the different supported sparse
    matrix formats have been added.
* ScaLAPACK functionality can now be dynamically linked.
* Optimized versions of the Cumulative Normal Distribution
(CdfNorm), its inverse (CdfNormInv), and the inverse complementary
error function (ErfcInv) have been added to the Vector Math
Library.
Author: Alexander Fitterling, last updated: 06.08.2010 18:19