Par4All 1.0 released: Tesla C2050 runs SPEC CPU2006 410.bwaves at 4.5x the speed of dual X5670

Edison · posted on 2010-8-5 13:29

Hyantes
Description
Hyantes is a library to compute neighbourhood population potential with scale control. It is developed by the Mescal team from the Laboratoire Informatique de Grenoble (France), as part of the Hypercarte project. The Hypercarte project aims to develop new methods for the cartographic representation of human distributions (population density, population increase, etc.) with various smoothing functions and opportunities for time-scale animations of maps. Hyantes provides one of the smoothing methods related to multiscalar neighbourhood density estimation. It is a C library that takes sets of geographic data as inputs and computes a smoothed representation of this data, taking into account the influence of the neighbourhood.
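
To make the idea of a neighbourhood potential concrete, here is a minimal sketch in Python. It is not the Hyantes API: the function name `smoothed_potential`, the inverse-distance weight 1 / (1 + d), the planar distance and the sample data are all assumptions chosen for illustration. The `radius_km` cutoff plays the role of the scale control.

```python
import math

def smoothed_potential(grid_points, towns, radius_km):
    """For each grid point, sum the populations of nearby towns,
    weighted by 1 / (1 + distance). Towns at or beyond radius_km
    (the scale parameter) are ignored.
    The weighting function is an assumption, not Hyantes's own."""
    result = []
    for gx, gy in grid_points:
        total = 0.0
        for tx, ty, population in towns:
            # Planar distance in km (assumption; real data would be lat/long).
            d = math.hypot(gx - tx, gy - ty)
            if d < radius_km:
                total += population / (1.0 + d)
        result.append(total)
    return result

# Hypothetical data: (x_km, y_km, population)
towns = [(0.0, 0.0, 1000.0), (3.0, 4.0, 600.0)]
grid = [(0.0, 0.0), (10.0, 10.0)]
potential = smoothed_potential(grid, towns, radius_km=6.0)
print(potential)
```

Every grid point is computed independently of the others, which is the kind of loop nest that lends itself to OpenMP or CUDA parallelization.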

For more information: http://hyantes.gforge.inria.fr
Results
We measure the wall-clock time, which includes startup time, data load time and output write time, that is, the real time experienced by users. Measuring kernel time only would give a better speed-up, but one less representative of the real application (Amdahl...).
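
The Amdahl remark can be illustrated with a small sketch. The fractions and kernel speed-up below are hypothetical values picked for the example, not measurements from Hyantes; the point is only that the serial part (startup, I/O) caps the wall-clock speed-up however fast the kernel becomes.

```python
def amdahl_speedup(parallel_fraction, kernel_speedup):
    """Overall speed-up when only `parallel_fraction` of the run time
    is accelerated by a factor `kernel_speedup` (Amdahl's law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / kernel_speedup)

# Hypothetical: 99% of sequential time is in the kernel, kernel runs 200x faster.
overall = amdahl_speedup(parallel_fraction=0.99, kernel_speedup=200.0)

# Upper bound on the overall speed-up, even with an infinitely fast kernel.
limit = 1.0 / (1.0 - 0.99)

print(f"overall speed-up: {overall:.1f}, upper bound: {limit:.0f}")
```

With these assumed numbers a 200x kernel speed-up yields only about 67x in wall-clock terms, which is why kernel-only timings look better than what users actually see.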

On one of our WildNodes with 2 Intel Xeon X5670 @ 2.93GHz (12 cores) and a Tesla C2050 (Fermi), running Linux/Ubuntu 10.04, gcc 4.4.3 and CUDA 3.1, we measure in production:

Sequential execution time on CPU: 30.355s
OpenMP parallel execution time on CPUs: 3.859s, speed-up: 7.87
CUDA parallel execution time on GPU: 0.441s, speed-up: 68.8
To test it yourself on the main computational part, go to the examples/P4A_accel/Hyantes directory of par4all, or look it up in the git repository.
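
The speed-ups quoted above follow directly from the measured wall-clock times. A short check in Python (the variable names are mine; the GPU-versus-OpenMP ratio is derived here and is not stated in the original):

```python
# Wall-clock times from the Hyantes results, in seconds.
sequential_s = 30.355
openmp_s = 3.859
cuda_s = 0.441

# Speed-up = sequential time / parallel time.
openmp_speedup = sequential_s / openmp_s
cuda_speedup = sequential_s / cuda_s

# Derived figure: how much faster the GPU run is than the 12-core OpenMP run.
gpu_vs_openmp = openmp_s / cuda_s

print(f"OpenMP: {openmp_speedup:.2f}x, CUDA: {cuda_speedup:.1f}x, "
      f"GPU vs OpenMP: {gpu_vs_openmp:.2f}x")
```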
SPEC CPU2006 410.bwaves
bwaves is a computational fluid dynamics Fortran program that simulates blast waves in three-dimensional transonic transient laminar viscous flow.

More information on the program itself at http://www.spec.org/cpu2006/Docs/410.bwaves.html

On an HPC Project WildNode, we get a speed-up of 4.5 with 2 Intel Xeon X5670 @ 2.93GHz (12 cores).

Matrix multiplication
The classical Hello World in Fortran :-) can be found in the examples/F77_matmul_OpenMP directory of par4all, or in the git repository, so you can try it yourself.

On an HPC Project WildNode, we get a speed-up of 12.1 (thanks to cache effects) with 2 Intel Xeon X5670 @ 2.93GHz (12 cores).

http://www.par4all.org/documentation/benchmarks

金莎 · posted on 2010-11-5 02:45

Notice: the author has been banned or deleted; the content is hidden automatically.

杂草 · posted on 2010-11-5 04:12

Little bits and pieces picked up.
