Category Archives: Numerical Libraries

Minimal Metrics has partnered with Reservoir Labs to work on the prototyping of an entirely new computer architecture and the development of algorithms to run on it. About Reservoir Labs: Reservoir solves the critical technology challenges of high performance computing. Our advanced computing and communications products, thought-leading research and novel technologies have made Reservoir a trusted and respected partner of corporate clients, government agencies, and leading researchers. We thrive on opportunities to learn and create as we develop and deliver groundbreaking science and security solutions. This particular project is indeed groundbreaking science, and as such, we can’t say more other than that we’re thrilled to be part of this effort, and to be working with our friends at Reservoir Labs once again.

Sandia National Laboratories has just signed Minimal Metrics to help with the performance analysis of their next-generation high-performance computing platforms. In particular, the Application Performance Modeling and Analysis team is interested in studying the effects of stall cycles on application performance, specifically for parallel scientific simulation kernels running on Intel’s Sandy Bridge systems. Stall cycles, or periods of time where the processor is not producing any results, are notoriously difficult to account for. While the hardware architects have added some hardware instrumentation to help accomplish this, using that instrumentation requires extensive background in and understanding of the specific microprocessor’s architecture. Sandia, while possessing leading expertise in application performance, is looking to leverage Minimal Metrics’ unparalleled experience in the field in an attempt to further their research. Minimal Metrics will be working closely with Sandia’s application developers and systems teams to ensure that their code and systems are working as optimally as…


Texas Instruments has renewed their contract with Minimal Metrics to deliver optimized numerical libraries for their next generation of high-performance microprocessors. The Keystone 2 is a revolutionary new product that integrates four cores of an ARM A15 with 8 cores of the C6678 DSP on the same die. Offload to the DSPs can be accomplished either through the use of OpenCL or a subset of the OpenMP 4 specification, OpenMPACC. For this contract, Minimal Metrics will be providing optimized, hybrid, ARM+DSP-accelerated versions of the following libraries, which are critical elements in the middleware of high-performance numerical simulations: FFTW (fast Fourier transform); BLAS/ATLAS (vector/vector, matrix/vector and matrix/matrix arithmetic); LAPACK (dense linear algebra); and LIBFLAME (dense linear algebra). Applications are accelerated using new versions of the libraries transparently, without requiring any changes to the source or object code. In this way, HPC applications ported to the ARM gain vast increases in performance by…


Minimal Metrics has successfully completed their work with Texas Instruments to deliver an optimized and complete BLAS (Basic Linear Algebra Subprograms) library. The BLAS are the building blocks of many high performance numerical algorithms, and are known as the keystone to high performance in scientific simulations. The Minimal Metrics team worked closely with the ATLAS group in order to provide a complete and optimized implementation much faster than could be coded by hand. The target platform for the work was the C6678 DSP, aka Keystone, an 8-core DSP with industry-leading floating point performance per watt. This library will also be used for the Keystone 2, a revolutionary new product that integrates the ARM A15 CPU with 8 cores of the C6678. Coding for the DSP architecture is substantially different from coding for “traditional” RISC/CISC processors. The Keystone series provides a number of architectural features to enhance performance – including hardware-assisted software…


Minimal Metrics has teamed up with the authors of ATLAS to bring their ultra high performance implementation of the BLAS, or Basic Linear Algebra Subprograms, to Texas Instruments’ next-generation, multi-core, digital signal processing (DSP) platform. BLAS is used throughout the industry to achieve performance on a number of low-level matrix and vector operations. It is best known as the building block of LAPACK, a programming library invented by Dr. Jack Dongarra in the 70’s to solve problems consisting of simultaneous systems of linear equations. Solving such systems is at the core of many uses of high performance computers today, whether it be in the simulation sciences like weather forecasting, aircraft design or car crash simulation, or web technologies such as data mining, web analytics and natural language processing. TI’s new generation of processors, beginning with the 8-core C6678, a.k.a. “Shannon”, raises the bar on the performance per watt for both integer…

