\begin{verbatim}
PARDISO - Release Notes - version 6.2.0
---------------------------------------

PARDISO Contents
----------------

PARDISO - a direct and iterative sparse linear solver library - is a tuned
math solver library designed for high performance on homogeneous multicore
machines and clusters of multicores including Intel Xeon, Intel Itanium, AMD
Opteron, and IBM Power processors, and includes both 32-bit and 64-bit
library versions. Different versions are available for Linux, Mac OS X, and
Windows 64-bit operating systems.

A full suite of sparse direct linear solver routines for all kinds of sparse
matrices is provided, and key data structures have been designed for high
performance on various multicore processors and clusters of multicore
processors in 64-bit mode.

A selected suite of iterative linear solvers for real and complex symmetric
indefinite matrices is provided that takes advantage of a new algebraic
multilevel incomplete factorization that is especially designed for good
performance in very large-scale applications.

New Features
------------

New features of release PARDISO 6.2.0:

(o) Added incremental LU updates, multiple-rank parallel update algorithms
    for sparse LU factors. The incremental update has proven to be very
    useful for transient simulations in circuit simulation.

(o) Added a much faster internal block factorization method.

New features of release PARDISO 6.1.0:

(o) Added approximate minimum degree orderings.

New features of release PARDISO 6.0.0:

(o) Added support for the R-INLA project.

(o) Added acceleration of block orderings for symmetric indefinite matrices.

(o) Significantly improved the reordering time for matrices including dense
    columns.

(o) Added METIS 5.1 as an additional preprocessing method.

(o) Improved scalability of the factorization on higher numbers of cores.

(o) Added an out-of-core option for real and complex symmetric indefinite
    matrices.

(o) New internal data structure to simplify future developments.

New features of release PARDISO 5.0.0:

(o) Switch to host-free license for all 64-bit libraries. This allows all
    users to use the PARDISO software within a cluster environment.

(o) Full support of multi-threaded Schur-complement computations for all
    kinds of matrices.

(o) Full support of multi-threaded parallel selected inversion to compute
    selected entries of the inverse of A.

(o) Faster multi-threaded code for the solution of multiple right-hand
    sides.

(o) Full support of 32-bit sequential and parallel factorizations for all
    kinds of matrices.

New features of release PARDISO 4.1.3:

(o) Bug fix in the computation of the log of the determinant.

New features of release PARDISO 4.1.2:

(o) Support of different licensing types
    - Evaluation license for academic and commercial use.
    - Academic license (host-unlocked, user-locked, 1 year).
    - Commercial single-user license (host-unlocked, user-locked, 1 year).
    - Commercial license (host-unlocked, user-unlocked, redistributable,
      1 year).

New features of release PARDISO 4.1.0:

(o) New MPI-based numerical factorization and parallel forward/backward
    substitution on distributed-memory architectures for symmetric
    indefinite matrices. PARDISO 4.1.0 has the unique feature among all
    solvers that it can compute the exact bit-identical solution on
    multicores and clusters of multicores. Here are some results for a
    nonlinear FE model with 800'000 elements from automobile sheet metal
    forming simulations.

    CPUs per node: 4 x Intel(R) Xeon(R) CPU E5620 @ 2.40GHz (8 cores in total)
    Memory per node: 12 GiB, Interconnect: Infiniband 4xQDR

    (t_fact: factorization in seconds, t_solve: solve in seconds)

    PARDISO 4.1.0 (deterministic):
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     t_fact  | 1 core   4 cores   8 cores
    ---------+--------+---------+---------
     1 host  | 92.032   23.312    11.966
     2 hosts | 49.051   12.516     7.325
     4 hosts | 31.646    8.478     5.018

     t_solve | 1 core   4 cores   8 cores
    ---------+--------+---------+---------
     1 host  |  2.188    0.767     0.545
     2 hosts |  1.205    0.462     0.358
     4 hosts |  0.856    0.513     0.487

    Intel MKL 10.2 (non-deterministic):
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     t_fact  | 1 core   4 cores   8 cores
    ---------+--------+---------+---------
     1 host  | 94.566   27.266    14.018

     t_solve | 1 core   4 cores   8 cores
    ---------+--------+---------+---------
     1 host  |  2.223    2.183     2.207

    - The MPI version is only available for academic research purposes.

(o) New host-unlimited licensing mechanism integrated into PARDISO 4.1.0.
    We now have two different options available:
    - a time-limited user-host locked license, and
    - a time-limited user-locked (host-free) license.

(o) 32-bit sequential and parallel factorization and solve routines for
    real unsymmetric matrices (matrix_type = 11). Mixed-precision refinement
    can be used for these 32-bit sparse direct factorizations.

(o) Additional routines that can check the input data (matrix,
    right-hand side) (contribution from Robert Luce, TU Berlin).

New features of release 4.0.0 of PARDISO since version 3.3.0:

(o) Due to the new features the interface to PARDISO and PARDISOINIT has
    changed! This version is not backward compatible!

(o) Reproducibility of exact numerical results on multi-core architectures.
    The solver is now able to compute the exact bit-identical solution
    independent of the number of cores without affecting the scalability.
    Here are some results for a nonlinear FE model with 500'000 elements.

    Intel MKL PARDISO 10.2:
    1 core  - factor: 17.980 sec., solve: 1.13 sec.
    2 cores - factor:  9.790 sec., solve: 1.13 sec.
    4 cores - factor:  6.120 sec., solve: 1.05 sec.
    8 cores - factor:  3.830 sec., solve: 1.05 sec.

    U Basel PARDISO 4.0.0:
    1 core  - factor: 16.820 sec., solve: 1.09 sec.
    2 cores - factor:  9.021 sec., solve: 0.67 sec.
    4 cores - factor:  5.186 sec., solve: 0.53 sec.
    8 cores - factor:  3.170 sec., solve: 0.43 sec.

    This method currently works only for symmetric indefinite matrices.

(o) 32-bit sequential and parallel factorization and solve routines for
    real symmetric indefinite matrices, for symmetric complex matrices and
    for structurally symmetric matrices. Mixed-precision refinement is used
    for these 32-bit sparse direct factorizations.

(o) Internal 64-bit integer data structures for the numerical factors allow
    solving very large sparse matrices with over 2^32 nonzeros in the sparse
    direct factors.

(o) Work has been done to significantly improve the parallel performance of
    the sparse direct solver, which results in much better scalability for
    the numerical factorization and solve on multicore machines. At the same
    time, the workspace memory requirements have been substantially reduced,
    making the PARDISO direct routine better able to deal with large problem
    sizes.

(o) Integration of a parallel multi-threaded METIS reordering that helps to
    accelerate the reordering phase (contributed by Stefan Roellin,
    ETH Zurich).

(o) Integration of a highly efficient preconditioning method that is based
    on a multi-recursive incomplete factorization scheme and stabilized with
    a new graph-pivoting algorithm. The method has been selected by the SIAM
    Journal on Scientific Computing as a very important milestone in the
    area of new solvers for symmetric indefinite matrices and the related
    paper appeared as a SIGEST SIAM Paper in 2008. This preconditioner is
    highly effective for large-scale matrices with millions of equations.

    [1] O. Schenk, M. Bollhoefer, and R. Roemer, On large-scale
        diagonalization techniques for the Anderson model of localization.
        Featured SIGEST paper in the SIAM Review selected "on the basis of
        its exceptional interest to the entire SIAM community".
        SIAM Review 50 (2008), pp. 91--112.

(o) Support of 32-bit and 64-bit Windows operating systems (based on the
    Intel Professional Compiler Suite and the Intel MKL Performance Library).

(o) A new extended interface to the direct and iterative solvers.
    Double-precision parameters are passed to the solver in a dparm array.
    The interface allows for greater flexibility in the storage of input and
    output data within supplied arrays through the setting of increment
    arguments.

(o) Note that the interface to PARDISO and PARDISOINIT has changed and that
    this version is not backward compatible.

(o) Computation of the determinant for symmetric indefinite matrices.

(o) Solve A^Tx=b using the factorization of A.

(o) The solution process, e.g. LUx=b, can be performed in several phases
    that the user can control.

(o) This version of PARDISO is compatible with the interior-point
    optimization package IPOPT version 3.7.0.

(o) A new MATLAB interface has been added that allows a flexible use of all
    direct and iterative solvers.

Contributions
-------------

The following colleagues have contributed to the solver (in alphabetical
order):

    Peter Carbonetto (UBC Vancouver, Canada)
    Radim Janalik (USI Lugano, Switzerland)
    George Karypis (U Minnesota, US)
    Arno Liegmann (ETHZ, Switzerland)
    Esmond Ng (LBNL, US)
    Stefan Roellin (ETHZ, Switzerland)
    Michael Saunders (Stanford, US)

Applicability
-------------

Different PARDISO versions are provided for use with the GNU compilers
gfortran/gcc, with the Intel ifort compiler, and (for use under Solaris)
with the Sun f95/cc compilers.

Required runtime libraries under Microsoft Windows
--------------------------------------------------

PARDISO version 4.0.0 and later link with the standard runtime library
provided by the Microsoft Visual Studio 2008 compilers.
This requires that the machine PARDISO runs on either has VS2K8 (or the
Windows SDK for Windows Server 2008) installed, or has the runtime libraries
installed, which can be downloaded separately from the appropriate Microsoft
platform links provided below:

    Visual Studio 2K8 Redist: x86 x64

Bug Reports
-----------

Bugs should be reported to info@pardiso-project.org with the string
"PARDISO-Support" in the subject line.
\end{verbatim}