POPPUR爱换
Views: 1394 | Replies: 3
ISC 2010 "LINPACK ON FUTURE MANYCORE & GPU BASED SYSTEMS"

#1
Posted 2010-05-26 12:00
http://www.supercomp.de/isc10/Tu ... e-GPU-Based-Systems

LINPACK ON FUTURE MANYCORE & GPU BASED SYSTEMS
Sunday, May 30, 2010, 1:30pm – 6:00pm
Hamburg University, Main Building, Room M

Presenters

Prof. Dr. Jack Dongarra, University Distinguished Professor of Computer Science, University of Tennessee & Oak Ridge National Laboratory, USA
Dr. Jakub Kurzak, Research Scientist, University of Tennessee, USA
We will provide a brief historical look at the development of dense linear algebra libraries, from LINPACK, to LAPACK, to ScaLAPACK. These packages served the community well for many years. Today we see new computer architectures emerging, namely manycore processors and accelerators, which will cause another change to the software landscape and will again necessitate changes to the linear algebra libraries. We have been developing two packages, PLASMA and MAGMA, for just these architectures.

PLASMA (Parallel Linear Algebra Software for Multicore Architectures) is a software library for solving dense systems of linear equations and linear least squares problems, designed to achieve high efficiency on homogeneous multicore processors and multi-socket systems of multicore processors.

Currently, PLASMA significantly outperforms vendor libraries such as MKL, ACML, and ESSL, and drastically outperforms its academic predecessors LAPACK and ScaLAPACK, on top-of-the-line server-size multicore systems. It provides, however, only a small subset of the functionality of LAPACK and ScaLAPACK, and unlike ScaLAPACK it does not support distributed memory systems.

This tutorial will attempt to present PLASMA from the perspective of an average user, from the perspective of a potential community contributor of numerical routines and from the perspective of the designers of PLASMA environment and runtime mechanisms. Similarities and differences between PLASMA and legacy software will be covered, as well as issues related to interactions between PLASMA, user code and other libraries. Future directions will be outlined, including support for GPUs and distributed memory systems.

Level of Tutorial
Introductory: 30%
Intermediate: 50%
Advanced: 20%

Prerequisites
Attendees with a basic understanding of numerical computing and programming principles, i.e. some familiarity with the fundamentals of parallel programming, will benefit most from this tutorial. A genuine interest in dense linear algebra (Gaussian elimination) and numerical libraries in general is an advantage. Because PLASMA is mostly written in C, with both FORTRAN and C interfaces, most of the examples are programmed in C, so attendees should have a basic understanding of the C language as a prerequisite.
#2
Posted by OP, 2010-05-26 12:04
http://www.supercomp.de/isc10/Co ... he-TOP500-Yardstick
FOCUSING LINPACK: THE TOP500 YARDSTICK
Thursday, June 3, 2010, 9:00am – 10:30am, Hall B

Chair

Dr. Erich Strohmaier, Head of Future Technology Group, Lawrence Berkeley National Laboratory, USA
LINPACK Benchmark with Time Limits on Multicore & GPU Based Accelerators
Prof. Dr. Jack Dongarra, University Distinguished Professor of Computer Science, University of Tennessee & Oak Ridge National Laboratory, USA
The original LINPACK Benchmark is, in some sense, an accident. It was originally designed to assist users of the LINPACK package by providing information on execution times required to solve a system of linear equations. The first “LINPACK Benchmark” report appeared as an appendix in the LINPACK Users’ Guide in 1979. The appendix comprised data for one commonly used path in the LINPACK software package. Results were provided for a matrix problem of size 100, on a collection of widely used computers (23 computers in all). This was done so users could estimate the time required to solve their matrix problem by extrapolation. Over the years additional performance data was added, more as a hobby than anything else, and today the collection includes over 1300 different computer systems. In addition to the number of computers increasing, the scope of the benchmark has also expanded. Today one form of the LINPACK benchmark is the basis of the TOP500 listing. We will look at how and why the benchmark has changed over the past 30 years and discuss the plans for another change to accommodate new technology and limitations.

Linpack on Multicores and GPUs
We will provide a brief historical look at the development of dense linear algebra libraries, from LINPACK, to LAPACK, to ScaLAPACK. These packages served the community well for many years. Today we see new computer architectures emerging, which will cause another change to the software landscape, namely many core and accelerators. These changes will necessitate changes again to the linear algebra libraries. We have been developing two packages, PLASMA and MAGMA, for just these architectures.

The main motivation for the PLASMA (Parallel Linear Algebra Software for Multicore Architectures) project is to create a new generation of dense linear algebra libraries that achieve the fastest possible time to an accurate solution on multicore systems. Specifically, PLASMA aims at outperforming ScaLAPACK and LAPACK on distributed and shared memory systems, as well as leading vendor implementations (e.g. Intel’s MKL and AMD’s ACML), on top-of-the-line multicore systems. It is also a main goal of PLASMA to provide a unified framework for different memory architectures, e.g. distributed memory systems (traditional clusters and tightly coupled MPPs), shared memory systems (traditional socket-level SMPs, multicores or CMPs, NUMA systems), as well as accelerator-based computing.

The following are the main goals to be accomplished by the PLASMA project:

Dynamic Scheduling PLASMA will relieve the programmer of task scheduling by implementing dependency-driven (data-driven) dynamic scheduling. Tasks will be scheduled as their dependencies are satisfied and their input data becomes available.
Communication & Memory Management PLASMA shall separate the algorithm developer from the specifics of particular memory architecture. In particular, PLASMA will relieve the programmer from explicit message passing on a distributed memory system and the allocation/management of communication data buffers.
The MAGMA project aims to develop a dense linear algebra library similar to LAPACK but for heterogeneous/hybrid architectures, starting with current "Multicore+GPU" systems.

The MAGMA research is based on the idea that, to address the complex challenges of the emerging hybrid environments, optimal software solutions will themselves have to hybridize, combining the strengths of different algorithms within a single framework. Building on this idea, we aim to design linear algebra algorithms and frameworks for hybrid manycore and GPU systems that enable applications to fully exploit the power that each of the hybrid components offers.
#3
Posted by OP, 2010-05-26 12:07
#4
Posted 2010-05-26 15:06
It's all in a foreign language again — could you summarize the main points?
Powered by Discuz! X3.4

© 2001-2017 POPPUR.

快速回复 返回顶部 返回列表