Intel® Math Kernel Library (Intel® MKL) offers highly optimized, extensively threaded math routines for scientific, engineering, and financial applications that require maximum performance.
Intel MKL is available with the Intel® C++ and Fortran Compilers Professional Editions and Intel® Cluster Toolkit, and also as a standalone product.
Intel MKL provides high performance, future-proofing for applications, and productivity for developers. Intel MKL is highly optimized for current multicore x86 platforms and will continue to be optimized for future platforms so that applications benefit from the latest architecture enhancements.
Microsoft Visual Studio* Developers: Build robust technical applications more efficiently using the premier library of high-speed implementations of BLAS, LAPACK, FFTs and Statistics functions from within Microsoft Visual Studio 2003, 2005 and 2008.

- Version 10.1 Now Available!
See below for a list of the new features and performance improvements.
- See the System Requirements page for the full list of supported operating systems, compilers, and processors.
- Check out WhatIf.intel.com for interesting new technologies related to Intel MKL.
Outstanding performance on Intel® processors
Achieve leadership performance with the math library that is highly optimized for Intel® Xeon®, Intel® Core™, Intel® Itanium®, and Intel® Pentium® 4 processor-based systems. Special attention has been paid to optimizing multi-threaded performance for the quad-core Intel® Xeon® processors and the new quad-core Intel® Core™ i7 processors. Intel MKL strives for performance competitive with that of other math software packages on non-Intel processors.
Multicore ready - Excellent scaling on multicore and multiprocessor systems [1]
Use the built-in parallelism of Intel MKL to automatically obtain excellent scaling on multicore and multiprocessor systems, including Intel® Xeon® 7400 and the latest dual- and quad-core systems. Intel MKL BLAS, Fast Fourier Transforms, and Vector Math, among many other routines, are threaded using OpenMP*.
Thread-Safety
All Intel MKL functions are thread-safe. A non-threaded sequential version of Intel MKL is also provided.
Automatic runtime processor detection
A runtime check is performed so that processor-specific optimized code is executed, ensuring that your application achieves optimal performance on whatever system it runs on.
Support for C and Fortran interfaces
Intel MKL includes both C and Fortran interfaces, unlike some alternative math libraries that require you to purchase multiple products.
Support for all Intel® processors in one package
Intel MKL includes support for Intel® Xeon®, Intel® Core™, Intel® Pentium® 4, and Intel® Itanium® architectures in a single package. Alternative math libraries require you to purchase multiple products to cover all supported processors.
Royalty-free distribution rights
Redistribute unlimited copies of the Intel MKL runtime libraries with your software.
Intel® Premier Support
Receive one year of world-class technical support with every purchase of Intel MKL. During this period, you can download product upgrades free of charge, including major version releases. For more information, visit the Intel Registration Center. Additionally, the user forum is a great place to get community support.
User forum
Share experiences with other users of Intel MKL at the Intel-moderated Intel MKL Discussion Forum.
Linear Algebra - BLAS and LAPACK
Employ BLAS and LAPACK routines that are highly optimized for Intel processors and that provide significant performance improvements over alternative implementations. Intel MKL 10.0 is compliant with the new 3.1 release of LAPACK.
Linear Algebra - ScaLAPACK
The Intel MKL implementation of ScaLAPACK can provide significant performance improvements over the standard NETLIB implementation.
Linear Algebra - Sparse Solvers
Solve large, sparse linear systems of equations with the PARDISO Direct Sparse Solver, an easy-to-use, thread-safe, high-performance, and memory-efficient software library licensed from the University of Basel. Intel MKL also includes Conjugate Gradient and FGMRES iterative sparse solvers.
Fast Fourier Transforms (FFT)
Utilize our multi-dimensional FFT routines (1D up to 7D) with a modern, easy-to-use C and Fortran interface. Intel MKL supports distributed memory clusters with the same API, enabling you to improve performance by distributing the work over a large number of processors with minimal effort. Intel MKL also provides compatibility with the FFTW 2.x and 3.0 interfaces, making it easy for current FFTW users to plug Intel MKL into their existing applications.
Vector Math Library
Increase application performance with vectorized implementations of computationally intensive core mathematical functions (power, trigonometric, exponential, hyperbolic, logarithmic, and more).
Vector Random Number Generators
Speed up your simulations using our vector random number generators, which can provide substantial performance improvements over scalar random number generator alternatives.
LINPACK Benchmark
Intel provides free LINPACK benchmark packages built with Intel MKL to help you obtain the highest possible benchmark results for your Intel® architecture-based systems.
This release of Intel Math Kernel Library (Intel MKL 10.1) provides optimized multi-threaded performance for the newest Intel® processors (Intel® Xeon® 7400 series, Intel® Core™). Intel MKL 10.0 introduced a new “layered” architecture to better support the varied usage models of our users, and merged the standard and cluster editions into a single comprehensive package.
Optimizations for the new Intel® Xeon® and Intel® Core™ Processors
For more information, see the section “Performance Improvements in Intel MKL 10.1” below.
"Layered" Architecture introduced in Intel MKL 10.0
Intel MKL 10.0 introduced a re-architected product with multiple layers so that the base Intel MKL package supports numerous configurations of interfaces, compilers, and processors in a single package. Many other library vendors have specific versions that must first be found, downloaded, installed, and tested depending on the particular configuration of your development environment. This new Intel MKL architecture is intended to provide maximum support for our customers’ varied needs while minimizing the effort it takes to obtain and use the performance of Intel MKL. For more information, please refer to the “Using Intel MKL Parallelism” section of the Intel MKL User’s Guide.
Computational Layer
This layer forms the heart of Intel MKL. A runtime check is performed so that processor-specific optimized code is executed. Users can build custom shared objects that include only the specific code needed, reducing the size of this layer if size is an issue.
PARDISO Direct Sparse Solver
- Out-of-core memory implementation for solving larger problems on SMP systems.
- Support for separate backward/forward substitution for DSS/PARDISO.
- A new parameter for turning off iterative refinement in the DSS interface.
- A new parameter for checking sparse matrix structure in the PARDISO interface.
- The sparse solver functionality is now integrated into the core math library, so it is no longer necessary to link a separate solver library.
- The sparse solver functionality can now be linked dynamically.
Sparse BLAS
- Added routines for computing the sum and product of two sparse matrices stored in compressed sparse row format.
- Added routines for converting between different sparse matrix formats.
- Added support for all data types (single precision, complex and double complex).
- Added sparse 0-based indexing.
- Added single precision support.
- Threaded Level-3 Sparse BLAS triangular solvers.
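To make the compressed sparse row (CSR) sum and the 0-based indexing items above concrete, here is a minimal pure-Python sketch of adding two CSR matrices. It is an illustration of the storage format only, not the Intel MKL routine; the function name `csr_add` and the `(values, col_idx, row_ptr)` triple layout are assumptions made for this example, and column indices within each row are assumed to be sorted.

```python
def csr_add(a, b):
    """Return C = A + B for two CSR matrices with the same number of rows.

    Each matrix is a (values, col_idx, row_ptr) triple using 0-based
    indexing. Column indices inside every row are assumed to be sorted
    in ascending order. (Hypothetical helper, not an Intel MKL API.)
    """
    a_val, a_col, a_ptr = a
    b_val, b_col, b_ptr = b
    if len(a_ptr) != len(b_ptr):
        raise ValueError("matrices must have the same number of rows")

    c_val, c_col, c_ptr = [], [], [0]
    for row in range(len(a_ptr) - 1):
        i, i_end = a_ptr[row], a_ptr[row + 1]
        j, j_end = b_ptr[row], b_ptr[row + 1]
        # Merge the two sorted column lists of this row.
        while i < i_end or j < j_end:
            if j == j_end or (i < i_end and a_col[i] < b_col[j]):
                # Entry present only in A.
                c_col.append(a_col[i])
                c_val.append(a_val[i])
                i += 1
            elif i == i_end or b_col[j] < a_col[i]:
                # Entry present only in B.
                c_col.append(b_col[j])
                c_val.append(b_val[j])
                j += 1
            else:
                # Same column in both matrices: add the values.
                c_col.append(a_col[i])
                c_val.append(a_val[i] + b_val[j])
                i += 1
                j += 1
        c_ptr.append(len(c_val))
    return c_val, c_col, c_ptr


if __name__ == "__main__":
    # A = [[1, 0, 2], [0, 3, 0]],  B = [[0, 4, 5], [6, 0, 0]]
    A = ([1, 2, 3], [0, 2, 1], [0, 2, 3])
    B = ([4, 5, 6], [1, 2, 0], [0, 2, 3])
    print(csr_add(A, B))
```

With 0-based indexing, `row_ptr[r]` is the position of the first stored entry of row `r`, which is the convention most C callers expect; a 1-based (Fortran-style) layout would shift every index by one.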
LAPACK
- The capability to track and/or interrupt the progress of lengthy LAPACK computations has been added via a callback function mechanism. A function called mkl_progress can be defined in a user application, and it will be called regularly from a subset of the MKL LAPACK routines. Refer to the specific function descriptions to see which LAPACK functions support the feature.
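The callback idea described above can be sketched in a few lines of Python. This is only an illustration of the general pattern, not the mkl_progress interface itself: `long_computation`, `make_limit_callback`, the `(step, stage)` arguments, and the convention that a nonzero return value requests an interrupt are all assumptions made for this example. Consult the Intel MKL reference for the actual signature and semantics.

```python
def long_computation(n_steps, progress=None):
    """Hypothetical stand-in for a lengthy library routine.

    Before each block of work the routine calls progress(step, stage).
    In this sketch, a nonzero return value asks the routine to stop early.
    Returns (completed_steps, interrupted).
    """
    completed = 0
    for step in range(n_steps):
        if progress is not None and progress(step, "factorize") != 0:
            return completed, True
        completed += 1  # placeholder for one block of real work
    return completed, False


def make_limit_callback(max_steps, log):
    """Build a callback that records every call and stops at max_steps."""
    def callback(step, stage):
        log.append((step, stage))
        return 1 if step >= max_steps else 0
    return callback


if __name__ == "__main__":
    calls = []
    done, interrupted = long_computation(10, make_limit_callback(3, calls))
    print(done, interrupted, len(calls))
```

The same hook serves both purposes named in the text: a callback that always returns zero only tracks progress, while one that returns nonzero under some condition (a timeout, a user cancel) interrupts the computation.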
Discrete Fourier Transform Interface (DFTI)
- Added the DftiCopyDescriptor function for convenience when using the FFTs.
- The size of statically linked executables calling DFTI has been reduced significantly.
- Complex storage is now available for real-to-real transforms.
Iterative Solver Preconditioner
- ILUT accelerator/preconditioner for the Intel MKL RCI iterative solvers.
Vector Math Functions
- New Mul, Conj, MulByConj, CIS, and Abs functions.
- New “Enhanced Performance” (EP) mode, intended for applications where math function inaccuracies do not dominate parameter inaccuracies (e.g., Monte Carlo simulations and media applications).
- All VML functions are now threaded.
- Optimized versions of the Cumulative Normal Distribution (CdfNorm), its inverse (CdfNormInv), and the inverse complementary error function (ErfcInv) have been added to the Vector Math Library.
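For readers unfamiliar with the functions named in the last item, the cumulative normal distribution can be written in terms of the complementary error function as Φ(x) = 0.5 · erfc(−x / √2). The sketch below is a plain scalar reference built on the Python standard library, useful for checking results; it is not the vectorized Intel MKL implementation, and the helper names `cdf_norm` and `cdf_norm_inv` are made up for this example.

```python
import math
from statistics import NormalDist


def cdf_norm(xs):
    """Element-wise standard normal CDF: 0.5 * erfc(-x / sqrt(2))."""
    return [0.5 * math.erfc(-x / math.sqrt(2.0)) for x in xs]


def cdf_norm_inv(ps):
    """Element-wise inverse of the standard normal CDF for 0 < p < 1."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(p) for p in ps]


if __name__ == "__main__":
    xs = [-1.0, 0.0, 1.0]
    ps = cdf_norm(xs)
    print(ps)
    print(cdf_norm_inv(ps))  # should approximately recover xs
```

A vector library applies the same mathematical definition to whole arrays at once, which is where the performance benefit over a scalar loop like this one comes from.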
User’s Guide
- We have greatly improved our Intel MKL User’s Guide. It is an indispensable tool for working with Intel MKL. Visit the Documentation page to download it or view it online.
Compiler Support
- Support for new compilers, including the new Intel® compilers 11.0 and PGI* compilers.
Performance Improvements in Intel MKL 10.1
We improved performance in all areas of the library. Below are some specific measured performance gains. A list of performance improvements in past versions of Intel MKL is available on the Performance improvements page. Performance improvements are illustrated for each Intel MKL product domain (BLAS/LAPACK, FFT, VML, VSL, etc.).
- BLAS
  - 32-bit improvements
    - Up to 50% improvement for (Z,C)GEMM on Quad-Core Intel® Xeon® processor 5300 series
    - 10% improvement for all (D,S,Z,C)GEMM code on Quad-Core Intel® Xeon® processor 5400 series
  - 64-bit improvements
    - 50% improvement for SGEMM on the Intel® Core™ i7 processor.
    - 30% improvement for right-side cases of DTRSM on the Intel® Core™ i7 processor.
- Direct sparse solver (DSS/PARDISO)
  - 35% performance improvement on average for out-of-core PARDISO.
- VML and VSL
  - Optimizations on the Intel® Core™ i7 processor:
    - Up to 17% improvement for the following VML functions: Asin, Asinh, Acos, Acosh, Atan, Atan2, Atanh, Cbrt, CIS, Cos, Cosh, Conj, Div, ErfInv, Exp, Hypot, Inv, InvCbrt, InvSqrt, Ln, Log10, MulByConj, Sin, SinCos, Sinh, Sqrt, Tanh.
    - Up to 67% improvement for uniform random number generation.
    - Up to 10% improvement for VSL distribution generators based on Wichmann-Hill, Sobol, and Niederreiter BRNGs (64-bit only).
Performance Improvements in Version 10.0
BLAS
- Threading of DGEMM was improved for small and medium sizes: outer product sizes by 10%, square sizes by 80%
- DGEMM/SGEMM large square and large outer product sizes were improved by 4-5% on 1 thread and 10-15% on 8 threads
- DTRSM, DTRMM, and DSYRK were improved by 5-30%
- Other level 3 real functions were improved by 2-4% on large sizes
LAPACK
- We dramatically improved the performance of several linear equation solvers (?spsv/?hpsv/?ppsv, ?pbsv/?gbsv, ?gtsv/?ptsv, ?sysv/?hesv). Banded and packed storage formats and cases with multiple right-hand sides see speed-ups of up to 100 times.
- All symmetric eigensolvers (?syev/?heev, ?syevd/?heevd, ?syevx/?heevx, ?syevr/?heevr) have significantly improved, since the tridiagonalization routine (?sytrd/?hetrd) is now up to 4 times faster.
- All symmetric eigensolvers in packed storage (?spev/?hpev, ?spevd/?hpevd, ?spevx/?hpevx) have significantly improved, since the tridiagonalization routine in packed storage (?sptrd/?hptrd) is now up to 3 times faster.
- Up to 2 times improvement for a number of routines applying orthogonal/unitary transformations (?ormqr/?unmqr, ?ormrq/?unmrq, ?ormql/?unmql, ?ormlq/?unmlq).
FFTs
- Improved single-threaded performance by up to 1.8 times on complex 1D FFTs for power-of-two sizes.
- On Intel® 64 architecture-based systems running in 64-bit mode, single-precision complex backward 1D FFTs for data sizes greater than 2^22 elements are up to 2 times faster on 4 threads, and up to 2.4 times faster on 8 threads on Intel® Itanium® processors.
VML/VSL
- Performance of VSL functions is improved on non-Intel processors by approximately 2 times on average
- Performance of VML vdExp, vdSin, and vdCos functions is improved on non-Intel processors by 18% on average
- Performance of VSL functions is improved on IA-32 and Intel® 64 architecture by 7% on average
Operating Systems
Intel MKL 10.1 supports Linux*, Windows* (including HPC Server 2008), and Mac OS* X. Linux variants include Red Hat*, Suse*, Debian*, Ubuntu*, Asianux*, and other Linux Standard Base 3.1 variants. For a complete list, please see the System Requirements page.
Development Environments
Intel MKL is easily used and integrated with popular development tools and environments, such as Microsoft Visual Studio*, Xcode*, Eclipse*, and the GNU Compiler Collection (GCC).
Processors
Intel MKL 10.1 supports all Intel Architecture compatible processors and is specifically optimized for:
- Intel® Xeon® processor family
- Intel® Core™ processor family
- Intel® Itanium® processor family
- Intel® Pentium® processor family
- AMD Opteron* and Athlon* processor families
For a complete list, please see the System Requirements page.
NOTE: Intel MKL for Mac OS* X is not available as a standalone product. It is only available with the Intel C++ Compiler Professional Edition and Intel Fortran Compiler Professional Edition.
Every purchase of an Intel® Software Development Product includes a year of support services, which provides access to Intel Premier Support and all product updates during that time. Intel Premier Support gives you online access to the Intel MKL discussion forum, technical notes, application notes, and documentation. Install the product, and then register to get support and product update information.
[1] Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing.
Get more information on performance tests and on the performance of Intel products.