POPPUR爱换

IBM teams up with Los Alamos National Laboratory to build a peta-scale supercomputer; an upgraded double-precision Cell comes into view

1#
Posted 2006-9-6 12:38
IBM has won a bid to build a supercomputer called Roadrunner that will include not just conventional Opteron chips but also the Cell processor used in the Sony PlayStation, CNET News.com has learned. The supercomputer, for the Los Alamos National Laboratory, will be the world's fastest machine and is designed to sustain a performance level of a "petaflop," or 1 quadrillion calculations per second, said U.S. Sen. Pete Domenici earlier this year. Bidding for the system opened in May, when a congressional subcommittee allocated $35 million for the first phase of the project, said Domenici, a Republican from New Mexico, where the nuclear weapons lab is located.

Now sources familiar with the machine have said that IBM has won the contract and that the National Nuclear Security Administration is expected to announce the deal in coming days. The system is expected to be built in phases, beginning in September and finishing by 2007 if the government chooses to build the full petaflop system.

There's plenty of competition in the high-end supercomputing race, though. Japan's Institute of Physical and Chemical Research, called RIKEN, announced in June that it had completed its Protein Explorer supercomputer. The Protein Explorer reached the petaflop level, RIKEN said, though not using the conventional Linpack supercomputing speed test.

Representatives of IBM and Los Alamos declined to comment for this story. The NNSA, which oversees U.S. nuclear weapons work at Los Alamos and other sites, didn't immediately respond to a request for comment.

Hybrid supercomputers
The Roadrunner system, along with the Protein Explorer and the seventh-fastest supercomputer, Tokyo Institute of Technology's Tsubame system built by Sun Microsystems, illustrates a new trend in supercomputing: combining general-purpose processors with special-purpose accelerator chips.

"Roadrunner is emphasizing acceleration technologies. Coprocessor acceleration is intrinsic to that particular design," said John Gustafson, chief technology officer of start-up ClearSpeed Technologies, which sells the accelerator add-ons used in the Tsubame system. (Gustafson was referring to the Roadrunner project in general, not to IBM's winning bid, of which he disclaimed knowledge.)

IBM's BladeCenter systems are amenable to the hybrid approach. A single chassis can accommodate both general-purpose Opteron blade servers and Cell-based accelerator systems. The BladeCenter chassis includes high-speed communications links among the servers, and one source said the blades will be used in Roadrunner.

Advanced Micro Devices' Opteron processor is used in supercomputing "cluster" systems that spread computing work across numerous small machines joined with a high-speed network. In the case of Roadrunner, the Cell processor, designed jointly by IBM, Sony and Toshiba, provides the special-purpose accelerator.

Cell originally was designed to improve video game performance in the PlayStation 3 console. The single chip's main processor core is augmented by eight special-purpose processing cores that can help with calculations such as simulating the physics of virtual worlds. Those engines also are amenable to scientific computing tasks, IBM has said.

Using accelerators "expands dramatically" the amount of processing a computer can accomplish for a given amount of electrical power, Gustafson said.

"If we keep pushing traditional microprocessors and using them as high-performance computing engines, they waste a lot of energy. When you get to the petascale regions, you're talking tens of megawatts when using traditional x86 processors" such as Opteron or Intel's Xeon, he said.

"A watt is about a dollar a year if you have the things on all the time," so a system that draws 10 megawatts continuously costs about $10 million a year to run, Gustafson said.

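Gustafson's rule of thumb can be checked with a little arithmetic. The sketch below is only an illustration: the electricity price of $0.114 per kilowatt-hour is an assumption chosen so that one watt costs about a dollar a year, and the function name `annual_cost_usd` is made up for this example.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years

def annual_cost_usd(power_watts, usd_per_kwh):
    """Yearly electricity cost of a load that runs continuously."""
    kwh_per_year = power_watts / 1000 * HOURS_PER_YEAR
    return kwh_per_year * usd_per_kwh

# Assumed price, not from the article: picked so that 1 W is roughly $1 a year.
PRICE_USD_PER_KWH = 0.114

one_watt = annual_cost_usd(1, PRICE_USD_PER_KWH)
ten_megawatts = annual_cost_usd(10_000_000, PRICE_USD_PER_KWH)

print(f"1 W for a year:   ${one_watt:.2f}")
print(f"10 MW for a year: ${ten_megawatts:,.0f}")
```

At a different electricity price the totals scale linearly, so the "dollar per watt-year" figure is best read as an order-of-magnitude guide.
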
A new partnership
The Los Alamos-IBM alliance is noteworthy for another reason as well. The Los Alamos lab has traditionally favored supercomputers from manufacturers other than IBM, including Silicon Graphics, Compaq and Linux Networx. Its sister lab and sometimes rival, Lawrence Livermore, has had the Big Blue affinity, housing the current top-ranked supercomputer, Blue Gene/L.

Livermore also houses earlier Big Blue behemoths such as ASC Purple, ASCI White and ASCI Blue Pacific. (ASCI stood for the Accelerated Strategic Computing Initiative, a federal effort to hasten supercomputing development to perform nuclear weapons simulation work, but has since been modified to the Advanced Simulation and Computing program.)

Blue Gene/L has a sustained performance of 280 teraflops, just more than one-fourth of the way to the petaflop goal.
The U.S. government has become an avid supercomputer customer, using the machines for simulations to ensure nuclear weapons will continue to work even as they age beyond their original design lifespans. Such physics simulations have grown increasingly sophisticated, moving from two to three dimensions, but more is better. Los Alamos expects Roadrunner will increase the detail of simulations by a factor of 10, one source said.

For the twice-yearly ranking of supercomputers called the Top500 list, computers are ranked on the basis of a benchmark called Linpack that measures how many floating-point operations per second--"flops"--a machine can perform. Linpack is a convenient but incomplete representation of a machine's total ability, but it's nevertheless widely watched.
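To make the "flops" figure concrete: Linpack times the solution of a dense system of linear equations and divides a nominal operation count by the elapsed time. The toy sketch below does the same in pure Python, using the commonly quoted count of roughly 2/3·n³ + 2·n² operations for an n×n solve. It is not the real benchmark code; the function names `solve_dense` and `measure_flops` and the problem size are assumptions for illustration.

```python
import random
import time

def solve_dense(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] for row in a]  # work on copies so the inputs stay intact
    b = b[:]
    for k in range(n):
        # Swap in the row with the largest pivot candidate to limit rounding error.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
            b[i] -= f * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x

def measure_flops(n, seed=0):
    """Time one n x n solve; return (flops estimate, largest residual)."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-1, 1) for _ in range(n)]
    start = time.perf_counter()
    x = solve_dense(a, b)
    elapsed = time.perf_counter() - start
    ops = (2 / 3) * n**3 + 2 * n**2  # nominal operation count, not a measured one
    residual = max(
        abs(sum(a[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n)
    )
    return ops / elapsed, residual

rate, residual = measure_flops(100)
print(f"{rate / 1e6:.1f} Mflops, max residual {residual:.2e}")
```

The residual check matters as much as the timing: a run only counts if the computed solution actually satisfies the equations.
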

IBM has dominated the Top500 list with its Blue Gene/L supercomputing designs. But U.S. models haven't always led, and there's been some international rivalry: A Japanese system, NEC's Earth Simulator, topped the list for years.

IBM and petaflop computing are no strangers. Although customers can buy the current Blue Gene/L systems or rent their processing power from IBM, Blue Gene actually began as a research project in 2000 to reach the petaflop supercomputing level.
2#
Posted 2006-9-6 12:42
quadrillion
I don't even recognize this number anymore :o :o :o

3#
Posted 2006-9-6 12:46
Damn!

4#
OP | Posted 2006-9-6 12:49
Originally posted by roadrunner on 2006-9-6 12:46:
Damn!


hello, Cell.

5#
Posted 2006-9-6 12:56
A system like this, with both x86 compatibility and Cell's specialized compute capability, is very interesting.

6#
Posted 2006-9-6 12:56
Opteron + external accelerator :thumbsup:

7#
Posted 2006-9-6 13:45
Honestly, Cell's benchmark numbers never feel quite real to me.
It's like an extremely high-clocked 486.

potomac (this user has been deleted)
8#
Posted 2006-9-6 14:07
Notice: the author has been banned or deleted; the content is hidden automatically.

9#
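Posted 2006-9-6 14:15
Originally posted by potomac on 2006-9-6 14:07:

Too bad IBM grabbed the cake.

Wouldn't it be better if AMD offered Opteron + R600 and kept the whole cake to itself? :lol:


Opteron + R600 would only be good for playing games.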

10#
Posted 2006-9-6 14:17
If AMD does put the CPU and GPU on a shared socket, the goal is certainly not gaming.

11#
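Posted 2006-9-6 14:20
Originally posted by potomac on 2006-9-6 14:07:

Too bad IBM grabbed the cake.

Wouldn't it be better if AMD offered Opteron + R600 and kept the whole cake to itself? :lol:



It's not the same thing; this way there is more cake to go around. Integrating an accelerator chip may be a good idea.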

12#
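Posted 2006-9-6 15:18
As I recall, one Cell chip peaks at about 0.25 TFLOPS in single precision.

To reach 1 PFLOPS you would plug in 4,000 Cell chips. What a "high-end" blade cluster, haha.

The Opterons handle the data paths and the Cells do the computing, and that only just adds up to a 1 PFLOPS single-precision peak.



I can easily guess the rough structure, and I'm willing to bet on it:

N x86 blades form a cluster (connected through network cards).

On each blade, M Opterons and X Cells are connected over HTT (probably not directly, but through a bridge).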
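
The chip count in this post follows from one division. A minimal sketch, taking the poster's recalled figure of 0.25 TFLOPS single-precision peak per Cell as given (the helper name `chips_needed` is made up for this example):

```python
import math

TARGET_FLOPS = 1e15            # 1 petaflop
CELL_SP_PEAK_FLOPS = 0.25e12   # per-chip single-precision peak, as recalled in this post

def chips_needed(target_flops, per_chip_flops):
    """Smallest whole number of chips whose combined peak reaches the target."""
    return math.ceil(target_flops / per_chip_flops)

print(chips_needed(TARGET_FLOPS, CELL_SP_PEAK_FLOPS))
```

This counts peak throughput only; a sustained petaflop on a real workload would need more chips than the peak arithmetic suggests.
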

13#
Posted 2006-9-6 15:21
That is why the build schedule is so fast: work starts in September and it can be finished next year. It is probably a stopgap design.

14#
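Posted 2006-9-6 15:31
Here is one possible design that would add up to 1 PFLOPS (no new Opteron or Cell needs to be designed, only an HTT bridge):

Cluster interconnect: network cards, Ethernet fabric

1,024 blades (dual-Opteron blades)

Each blade: two Opterons, bridged to 4 Cells over 4 HTT links.

A junk-heap technology.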
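
The proposed layout can be tallied up as follows. This is a sketch of the poster's guess, not of the real machine; the 0.25 TFLOPS per-Cell single-precision peak is the figure quoted earlier in the thread, and the `BladeLayout` class is a hypothetical helper.

```python
from dataclasses import dataclass

@dataclass
class BladeLayout:
    blades: int
    opterons_per_blade: int
    cells_per_blade: int

    def total_cells(self):
        return self.blades * self.cells_per_blade

    def total_opterons(self):
        return self.blades * self.opterons_per_blade

    def cell_peak_tflops(self, tflops_per_cell):
        # Counts only the Cell accelerators; the Opterons' own flops are ignored.
        return self.total_cells() * tflops_per_cell

CELL_SP_TFLOPS = 0.25  # assumed per-chip peak, taken from the earlier post

proposal = BladeLayout(blades=1024, opterons_per_blade=2, cells_per_blade=4)
print(proposal.total_cells(), "Cells,", proposal.total_opterons(), "Opterons")
print(proposal.cell_peak_tflops(CELL_SP_TFLOPS), "TFLOPS single-precision peak")
```

With 1,024 blades the Cell peak lands just over the 1,000 TFLOPS mark, which is why the round binary blade count works out.
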

15#
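OP | Posted 2006-9-6 15:36
What is meant here is clearly double precision, not single precision. The double-precision-enhanced Cell may be finished in 2007; according to an earlier IBM paper, its double-precision performance can reach half of its single-precision performance. On the current DD3 Cell, double-precision performance is 1/14 (or, by another count, 1/10) of single precision.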
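
The precision ratio changes the chip count a great deal. A rough sketch, assuming the 0.25 TFLOPS single-precision peak per Cell mentioned earlier in the thread and the three double-precision ratios given in this post (the function name `cells_for_target` is made up here):

```python
import math
from fractions import Fraction

TARGET_TFLOPS = 1000                 # 1 PFLOPS expressed in TFLOPS
CELL_SP_TFLOPS = Fraction(1, 4)      # assumed single-precision peak per chip

# Double-precision throughput as a fraction of single precision.
dp_ratios = {
    "enhanced Cell (1/2 of SP)": Fraction(1, 2),
    "DD3 Cell (1/10 of SP)": Fraction(1, 10),
    "DD3 Cell (1/14 of SP)": Fraction(1, 14),
}

def cells_for_target(dp_ratio):
    """Chips needed to reach the target at double-precision peak."""
    per_chip_dp = CELL_SP_TFLOPS * dp_ratio
    return math.ceil(TARGET_TFLOPS / per_chip_dp)

for label, ratio in dp_ratios.items():
    print(f"{label}: {cells_for_target(ratio):,} chips")
```

Exact fractions are used so the division does not pick up floating-point rounding before the ceiling is taken.
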

16#
Posted 2006-9-6 15:36
Originally posted by hopetoknow2 on 2006-9-6 15:31:
A junk-heap technology.

As long as it isn't Intel technology, it must be junk.......

17#
Posted 2006-9-6 15:42
IBM's technology is quite good.

18#
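Posted 2006-9-6 15:56

Reply to Edison's post #16

Why must it be 1 PFLOPS in double precision? Couldn't it be 1 PFLOPS in single precision, or 1 peta integer operations per second?

Work starts in September and will be finished soon after. Given that schedule, even if they wanted to use a newly designed Cell, there would not be enough time.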

19#
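Posted 2006-9-6 15:59
Originally posted by hopetoknow2 on 2006-9-6 15:31:
Here is one possible design that would add up to 1 PFLOPS (no new Opteron or Cell needs to be designed, only an HTT bridge):

Cluster interconnect: network cards, Ethernet fabric

1,024 blades (dual-Opteron blades)

Each blade: two Opterons, bridged to 4 Cells over 4 HTT links. ...


Since Cell is the main source of FLOPS, single-Opteron blades may be enough; there is no need for dual Opterons, or for dual-core Opterons either. That way one Opteron can connect to 3 Cells, instead of two Opterons connecting to 4. A blade with 3 Cells plus one Opteron is complicated enough already.

Ethernet is also not the style of IBM or the US Department of Energy; it would be InfiniBand at the very least.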
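
The two blade layouts trade blades against host processors. A small comparison, assuming the 4,000-Cell total estimated earlier in the thread (the helper `blades_and_opterons` is hypothetical):

```python
import math

CELLS_REQUIRED = 4000  # assumed total, from the earlier single-precision estimate

def blades_and_opterons(cells_per_blade, opterons_per_blade):
    """Blades and host Opterons needed to house CELLS_REQUIRED Cell chips."""
    blades = math.ceil(CELLS_REQUIRED / cells_per_blade)
    return blades, blades * opterons_per_blade

single = blades_and_opterons(cells_per_blade=3, opterons_per_blade=1)
dual = blades_and_opterons(cells_per_blade=4, opterons_per_blade=2)

print("1 Opteron + 3 Cells per blade:", single)
print("2 Opterons + 4 Cells per blade:", dual)
```

Under these assumptions the single-Opteron layout needs about a third more blades but roughly a third fewer Opterons.
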

20#
OP | Posted 2006-9-6 16:07
Cell DP and the DD3 Cell are pin-compatible, so the board basically does not need to be redesigned; Blue Gene went through something similar back then.