POPPUR爱换

Views: 2193 | Replies: 11

Stanford University releases the FAH GPU3 client for NVIDIA

1#
Posted 2010-5-26 07:11
http://folding.stanford.edu/English/FAQ-NVIDIA-GPU3

A Brief History of FAH: From Tinker to Gromacs to GPU to GPU2 to GPU3

NVIDIA GTX 480

Introduction

Since 2000, Folding@home (FAH) has led to a major jump in the capabilities of molecular simulation. By joining together hundreds of thousands of PCs throughout the world, FAH has made calculations that were previously considered impossible routine. FAH has targeted the study of protein folding and protein folding diseases, and numerous scientific advances have come from the project.

In 2006, we began looking forward to another major advance in capabilities. This advance utilizes the new, high performance Graphics Processing Units (GPUs) from ATI to achieve performance previously only possible on supercomputers. With this new technology, as well as the new Cell processor in Sony's PlayStation 3, we will soon be able to attain performance on the 100 gigaflop scale per computer. With this new software and hardware, we will be able to push Folding@home a major step forward.

In 2008, we developed a second generation GPU core (GPU2). This core is much more sophisticated than the original, with higher reliability, greater ease of use, and far more scientific calculation capability.

Our goal is to apply this new technology to dramatically advance the capabilities of Folding@home, applying our simulations to further study of protein folding and related diseases, including Alzheimer's Disease, Huntington's Disease, and certain forms of cancer. With these computational advances, coupled with new simulation methodologies to harness the new techniques, we will be able to address questions previously considered impossible to tackle computationally, and make even greater impacts on our knowledge of folding and folding related diseases.

Folding@home debuts with the Tinker core (October 2000)

In October 2000, Folding@home was officially released. The main software core engine was the Tinker molecular dynamics (MD) code. Tinker was chosen as the first scientific core due to its versatility and well laid out software design. In particular, Tinker was the only code to support a wide variety of MD force fields and solvent models. With the Tinker core, we were able to make several advances, including the first folding of a small protein starting purely from sequence (subsequently published in Nature).

A major step forward: the Gromacs core (May 2003)

After many months of testing, Folding@home officially rolled out a new core based on the Gromacs MD code in May 2003. Gromacs is the fastest MD code available, and likely one of the most optimized scientific codes in the world. By using hand-tuned assembly code and the SSE instructions available in many PCs and Intel-based Macs, Gromacs was about 10x faster than most MD codes and approximately 20x to 30x faster than Tinker (which was written for flexibility and functionality, but not for speed).

However, while Gromacs is faster than Tinker, it has limits to what it can do; for example, it does not support many implicit solvent models, which play a key role in our folding simulations with Tinker. Thus, while Gromacs significantly sped up certain calculations, it was not a replacement for Tinker, and so the Tinker core will continue to play an important role in Folding@home (including a recent paper in Science). For these reasons, points for Gromacs WUs were set to be consistent with points for Tinker WUs, as both play an important role in the science of FAH. Moreover, we switched the benchmark machine to a 2.8 GHz Pentium 4 (from a 500 MHz Celeron) in order to allow us to fairly benchmark these types of WUs (as the benchmark machine needed to have hardware support for SSE).

The next major step forward: Streaming Processor cores (September 2006)

Just as the Gromacs core gave Folding@home a 20x to 30x speed increase by exploiting new hardware (SSE) in PCs, in 2006 Folding@home developed a new streaming processor core to exploit another new generation of hardware: GPUs with programmable floating-point capability. By writing highly optimized, hand-tuned code to run on ATI X1900 class GPUs, the science of Folding@home will see another 20x to 30x speed increase over its previous software (Gromacs) for certain applications. This speed increase is achieved by running essentially the complete molecular dynamics calculation on the GPU; while this is a challenging software development task, it appears to be the way to achieve the highest speed improvement on GPUs.

In addition, through collaboration with the Pande Group, Sony has developed an analogous core for the PS3's Cell processor (another streaming processor), which should also bring a significant speed increase for the science over the types of calculations we could previously do on an x86/SSE Gromacs core. Following what we did with the introduction of Gromacs, we will now switch benchmark machines and include an ATI X1900XT GPU in order to be able to benchmark streaming WUs (which cannot be run on non-GPU machines). This machine will also benchmark CPU units (which continue to be of value since GPUs work only for certain simulations) without using its GPU.

The second generation GPU core, aka GPU2, for ATI hardware (April 2008)

After running the original GPU core for quite some time and analyzing its results, we have learned a lot about running GPGPU software. For example, it has become clear that a GPGPU approach via DirectX (DX) is not sufficiently reliable for what we need to do. Also, we've learned a great deal about GPU algorithms and improvements. One of the really exciting aspects of GPUs is that not only can they accelerate existing algorithms significantly, they can also open doors to new algorithms that we would never think to run on CPUs at all (because they are very slow on CPUs, but not on GPUs).

After much effort, we have taken all we've learned about GPUs from the first generation client and produced a second generation client. This new client appears to be faster, more reliable, and has more scientific functionality. The preliminary results so far from it look very exciting, and we're excited to now open up the client for FAH donors to run.

The second generation GPU core (GPU2) for NVIDIA (June 2008)

In collaboration with NVIDIA, we have released a GPU2 core for NVIDIA hardware.

The third generation GPU core (GPU3) for NVIDIA (May 2010)

Due to its great computational abilities, our GPU2 client has had a great scientific impact so far. In our most recent FAH paper (also see the movie), the GPU clients play a star role in allowing Folding@home to push to unprecedented levels, simulating protein folding on the millisecond timescale in an atomistic model.

GPU3 brings several key new features to Folding@home. In particular, GPU3 will allow for greatly enhanced science, including more accurate models, new kinds of science, 2x faster execution of the science, more stable simulations, OpenCL support for run-time science optimizations, and greater flexibility for adding new scientific capability. This is accomplished through the use of the OpenMM GPU library (http://simtk.org/home/openmm/), which originally came from FAH GPU code but has been significantly enhanced by Simbios staff.

GPU3 also lays down the foundation for future incorporation of OpenMM's support of OpenCL, which will also bring some very important new scientific features, especially in terms of on-the-fly runtime optimizations of the scientific code. However, at the moment, OpenCL is not supported in the current GPU3 NVIDIA client.

General instructions

This web page will serve as the FAQ and Release Notes for this new client, and we will update this page as more information becomes available.

The FAH GPU Client installer should do everything one needs. It installs the new v6.x SysTray style client, as well as DLL files used by this new client. Download the client from the High Performance Client Download Page for folding experts. The Windows GPU Guide can help you install the GPU3 client.

Basic Requirements:

    * a GeForce, Quadro, or Tesla card that supports CUDA (G80 or later for the most part)
    * a CUDA 2.2+ capable driver: version 185.55 or newer is recommended; use 195.62 for GTX 2xx cards (available for Win XP, Win XP 64 bit, Vista/Win7, and Vista/Win7 64 bit) and 197.41 for GTX 4xx cards.
    * Windows operating system (32 or 64 bit), XP or newer.

In some cases, donors have found problems with older drivers, so to be safe, we strongly recommend that donors use a more recent driver (197.45).
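As a rough illustration of the driver requirements above, here is a minimal Python sketch that checks a driver version string against the minimum listed for a card family. The function names and the name-matching rule are hypothetical and not part of any FAH or NVIDIA tool; the sketch also assumes driver versions always have the form `major.minor` with a two-digit minor part, as in the versions quoted above.

```python
import re

# Minimum driver versions quoted in the requirements list above.
MIN_DRIVER = {
    "GTX 2xx": (195, 62),
    "GTX 4xx": (197, 41),
    "other": (185, 55),  # general CUDA 2.2+ recommendation
}
# The "to be safe" version mentioned after the requirements list.
RECOMMENDED_DRIVER = (197, 45)


def parse_driver_version(text):
    """Turn a string such as '197.45' into a comparable tuple (197, 45)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def card_family(card_name):
    """Hypothetical classifier: map a card name to a row of MIN_DRIVER."""
    match = re.search(r"GTX\s*([24])\d\d", card_name, re.IGNORECASE)
    if match:
        return "GTX {}xx".format(match.group(1))
    return "other"


def driver_ok(card_name, driver_version):
    """True if the driver meets the minimum listed for this card family."""
    minimum = MIN_DRIVER[card_family(card_name)]
    return parse_driver_version(driver_version) >= minimum


def driver_recommended(driver_version):
    """True if the driver is at least the recommended 197.45."""
    return parse_driver_version(driver_version) >= RECOMMENDED_DRIVER
```

For example, `driver_ok("GTX 480", "195.62")` is false because GTX 4xx cards need 197.41 or newer, while `driver_ok("GTX260+", "195.62")` is true.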

While the GPU3 client is not beta, the core is still a beta release and we expect there will be bugs, flaws, problems, etc. To minimize problems, we have been testing the cores extensively in house and they run well there. However, it's our experience that running in the controlled setup in our lab and running "out in the wild" are very different situations.

As in the use of any beta software, please make sure to back up your hard drive, and do not run this client on any machine which cannot tolerate even the slightest instability or problems.


Download link: http://foldingforum.org/viewtopic.php?f=24&t=14671
2#
Posted 2010-5-26 08:41
Last edited by cool_exorcist on 2010-5-26 08:46

http://folding.typepad.com/

Open beta release of the GPU3 client/core

We have a new GPU core (core 15) going
into an open beta test for NVIDIA clients. This core requires a new
client (see below) as well as the latest drivers (197.45). This core is
the first run of the GPU3 technology, derived from the OpenMM project at
Stanford (http://simtk.org/home/openmm). You can find more information
in our GPU3 FAQ (see url below).

While this release is for NVIDIA only to start, we are actively pushing
ATI support (with the help of AMD/ATI), although we have no ETA at the
moment.

This is the first open beta test of this new client and core, so there
are likely bugs to be found as more donors try this out on more diverse
sets of hardware. Also, the documentation (GPU3 FAQ) is new too and there
are possibly some errors there too. However, the client has been QA'd both
internally at Stanford and with our closed group of beta testers and is
looking pretty good so far.

Some testers in the closed beta test have found problems with 8800 and
9800 class GPUs (we are working on this).

Please post bugs or issues in the new GPU3 section of this forum.

NVIDIA Client download:
SYSTRAY:
http://www.stanford.edu/~friedrim/.Folding@home-systray-632.msi
(md5sum=effd87ba12c96be28e252bccbe776ff9)
VISTA CONSOLE:
http://www.stanford.edu/~friedrim/.Folding@home-Win32-GPU_Vista-631.zip
(md5sum=b41301886881958c64c1907b3ed6acae)
XP CONSOLE:
http://www.stanford.edu/~friedrim/.Folding@home-Win32-GPU_XP-631.zip
(md5sum=885e36a477d247487f8009335bd4e3cc)
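The md5sum values above can be used to confirm that a download arrived intact. A minimal Python sketch, assuming the file was saved under the same name that appears in its URL (the helper names are made up for illustration):

```python
import hashlib

# Checksums exactly as posted above, keyed by the file name in each URL.
POSTED_MD5 = {
    ".Folding@home-systray-632.msi": "effd87ba12c96be28e252bccbe776ff9",
    ".Folding@home-Win32-GPU_Vista-631.zip": "b41301886881958c64c1907b3ed6acae",
    ".Folding@home-Win32-GPU_XP-631.zip": "885e36a477d247487f8009335bd4e3cc",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_posted_md5(path, file_name):
    """Compare a local file against the checksum posted for file_name."""
    return md5_of_file(path) == POSTED_MD5[file_name].lower()
```

A call such as `matches_posted_md5("downloads/.Folding@home-systray-632.msi", ".Folding@home-systray-632.msi")` (the local path is only an example) returns `True` when the file matches. An MD5 match guards against a corrupted download, not against deliberate tampering.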

GPU3 FAQ:
http://folding.stanford.edu/English/FAQ-NVIDIA-GPU3

Posted at 01:40 PM in code development


3#
Posted 2010-5-26 08:55
Notice: the author has been banned or deleted; the content is hidden automatically.

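4#
Posted 2010-5-26 09:01
They've been computing for so many years now; I remember it started back when I was a college junior. Are there any interim results yet?
380, posted 2010-5-26 08:55

At least it's more meaningful than that search-for-aliens project. Still, these projects consume a lot of energy worldwide yet have produced few substantive results, which does not look very "harmonious" amid today's low-carbon trend.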

5#
Posted 2010-5-26 09:08
The search-for-aliens one really is **

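6#
Posted 2010-5-26 09:19
At least it's more meaningful than that search-for-aliens project. Still, these projects consume a lot of energy worldwide yet have produced few substantive results, which amid today's low-carbon ...
toshibacom, posted 2010-5-26 09:01

STU has a list of the papers that cite the project's results; go take a look.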

7#
Posted 2010-5-26 09:22
Beta... yet another beta...

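8#
Posted 2010-5-26 09:51
Word of mouth this morning has it that the GTX470 can reach 14640 PPD (points per day), and the GTX480 is estimated at 18000 PPD.
If those figures hold, Fermi is quite efficient at single-precision math. (The GPU3 client released this time still does not support double precision.)

At least with the Gromacs engine, in the current new client:
the 9600GSO (G92) reaches 4500 PPD;
the GTX260+ gets about 5510 PPD;
and the GTX470 gets 14640 PPD.

These numbers are still being revised; after all, STU has not published point values for the work units.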
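To put the figures in this post side by side, here is a small Python sketch that expresses each card's PPD relative to a chosen baseline. The numbers are the unconfirmed, secondhand values quoted above; the GTX480 figure is kept separate because it is only an estimate, and the function name is made up for illustration.

```python
# PPD figures as reported in this post (secondhand and still being revised).
REPORTED_PPD = {
    "9600GSO (G92)": 4500,
    "GTX260+": 5510,
    "GTX470": 14640,
}
# Kept apart from the reported values: the post calls this an estimate.
ESTIMATED_PPD = {"GTX480": 18000}


def relative_ppd(ppd_by_card, baseline):
    """Return each card's PPD divided by the baseline card's PPD."""
    base = ppd_by_card[baseline]
    return {card: ppd / base for card, ppd in ppd_by_card.items()}


if __name__ == "__main__":
    ratios = relative_ppd(REPORTED_PPD, "9600GSO (G92)")
    for card, ratio in sorted(ratios.items(), key=lambda item: item[1]):
        print("{:<14} {:.2f}x".format(card, ratio))
```

On these numbers the GTX470 comes out at roughly 3.25x the 9600GSO and about 2.66x the GTX260+.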

9#
Posted 2010-5-26 09:56
Notice: the author has been banned or deleted; the content is hidden automatically.

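10#
Posted 2010-5-26 10:07
The points rules have changed.
And none of the new hardware features are being used: the cache is still treated as shared memory, and double precision is still missing.

After a mathematical transformation, the current PPD gap is roughly the same as the gap between the score the offline package got under GPU2 rules and the other cards of that time.
All Stanford has done is change the rules so that, while keeping the quantified scores at a certain level, the gap ratio simply loses a zero.
I hope I've miscalculated; could someone with a really weak card run it and see?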

11#
Posted 2010-5-26 10:18
Protein folding, oh, folding.

12#
Posted 2010-5-26 11:23
Open beta release of the GPU3 client/core

We have a new GPU core (core 15) going
into an op ...
cool_exorcist, posted 2010-5-26 08:41

There are results, just not the kind that would make headlines with the general public. Quite a few papers have been published, though:
http://folding.stanford.edu/English/Papers
