|
Well, we now have more accurate information.
WASHINGTON, DC - 06 Sep 2006: The U.S. Department of Energy's National Nuclear Security Administration (NNSA) has selected IBM to design and build the world's first supercomputer to harness the immense power of the Cell Broadband Engine™ (Cell B.E.) processor aiming to produce a machine capable of a sustained speed of up to 1,000 trillion calculations per second, or one petaflop.
The 'hybrid' supercomputer, codenamed Roadrunner, will be installed at DOE's Los Alamos National Laboratory. In a first-of-a-kind design, Cell B.E. chips -- originally designed for video game platforms -- will work in conjunction with systems based on x86 processors from Advanced Micro Devices, Inc. (AMD).
Designed specifically to handle a broad spectrum of scientific and commercial applications, the supercomputer design will include new, highly sophisticated software to orchestrate over 16,000 AMD Opteron™ processor cores and over 16,000 Cell B.E. processors in tackling some of the most challenging problems in computing today. The revolutionary supercomputer will be capable of a peak performance of over 1.6 petaflops (or 1.6 thousand trillion calculations per second).
The machine is to be built entirely from commercially available hardware and based on the Linux® operating system. IBM® System x™ 3755 servers based on AMD Opteron technology will be deployed in conjunction with IBM BladeCenter® H systems with Cell B.E. technology. Each system used is designed specifically for high performance implementations.
Designed also with space and power consumption issues in mind, the system will employ advanced cooling and power management technologies and will occupy only 12,000 square feet of floor space, or approximately the size of three basketball courts.
New Era of Industry Supercomputing
Roadrunner's construction will involve the creation of advanced "Hybrid Programming" software which will orchestrate the Cell B.E.-based system and AMD system and will inaugurate a new era of heterogeneous technology designs in supercomputing. These innovations, created collaboratively among IBM and LANL engineers will allow IBM to deploy mixed-technology systems to companies of all sizes, spanning industries such as life sciences, financial services, automotive and aerospace design.
How it Works
Roadrunner's hybrid design will allow the system to segment complex mathematical equations, routing each segment to the part of the system that can most efficiently handle it. Typical compute processes, file IO, and communication activity will be handled by AMD Opteron processors while more complex and repetitive elements -- ones that traditionally consume the majority of supercomputer resources -- will be directed to the more than 16,000 Cell B.E. processors. Designed originally for gaming platforms, where intense graphics and real-time responsiveness are key, the Cell B.E. processor is ideal to speed Roadrunner through intense mathematical problems.
"This new supercomputer demonstrates a commitment to achieve a major advance in technological capability that will help enable scientists and businesses solve the most challenging problems," said Bill Zeitler, senior vice president, IBM Systems and Technology Group. "Los Alamos is a valued partner as we embark on this exciting journey."
"This installation with Los Alamos and IBM demonstrates the compelling benefits from industry leaders innovating around an open platform; in this case IBM and AMD collaborating in the use of AMD Opteron and the Cell B.E. processor to build powerful systems for highly specific Los Alamos Labs workloads," said Marty Seyer, senior vice president, Commercial Segment, AMD. "This is an excellent demonstration of Torrenza in action -- building on the performance and performance-per-watt advantages AMD delivers to create incredible value in leveraging HyperTransport technology to redefine how different systems, based on different processor platforms, can communicate with each other to solve some of the most complex computing problems."
IBM will begin shipping the new supercomputer to the DOE facility at the Los Alamos National Laboratory later this year, with completion of the installation and acceptance anticipated in 2008.
Based on the Power Architecture™, the Cell B.E. processor was developed in collaboration with IBM, Sony Corporation, Sony Computer Entertainment Inc. (Sony and Sony Computer Entertainment collectively referred to as Sony Group), and Toshiba Corporation.
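This second official press release gives us a few concrete figures:
1. The supercomputer will contain more than 16,000 Opterons and more than 16,000 CELLs.
2. The theoretical peak is 1.6 PFLOPS.
3. Installation is to be completed in 2008.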
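Some facts we already know:
Assume the K8 is Rev. G, i.e. the FP unit stays as it is today. Double precision: 2.8 GHz * 2 FLOPS/cycle * 2 cores = 11.2 GFLOPS. 16,000 Opterons at 2.8 GHz then give 179,200 GFLOPS = 179.2 TFLOPS of double-precision throughput.
Assume the CELL is still the DD3.1 revision from November 2005. Single precision: 2.8 GHz * 8 FLOPS/cycle * 8 cores = 179.2 GFLOPS, so 16,000 DD3.1 CELLs at 2.8 GHz give 2,867,200 GFLOPS = 2,867.2 TFLOPS = 2.87 PFLOPS. The DD3.1 CELL's double-precision performance is only 1/10 of its single-precision performance, so 16,000 DD3.1 CELLs deliver roughly 0.29 PFLOPS in double precision.
Combining today's dual-core K8 with today's CELL, the total is only about 0.47 PFLOPS.
Clearly, 16,000 of today's dual-core K8s plus 16,000 of today's CELLs cannot reach the planned 1.6 PFLOPS target.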
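The arithmetic above can be checked with a short Python sketch. Every input here (2.8 GHz clock, 16,000 chips of each kind, FLOPS per cycle, the 1/10 double-precision ratio) is this post's assumption rather than a confirmed machine spec, and the function and variable names are made up for illustration.

```python
# Peak double-precision estimate for the "today's hardware" scenario.
# All figures are the post's assumptions, not confirmed specifications.

CLOCK_GHZ = 2.8      # assumed clock for both chip types
CHIPS = 16_000       # assumed count of each chip type


def chip_gflops(clock_ghz, flops_per_cycle, cores):
    """Peak GFLOPS of one chip: clock * FLOPS per cycle * core count."""
    return clock_ghz * flops_per_cycle * cores


# Rev. G dual-core K8: 2 double-precision FLOPS/cycle per core.
opteron_dp_gflops = chip_gflops(CLOCK_GHZ, 2, 2)

# DD3.1 CELL: 8 single-precision FLOPS/cycle on each of 8 cores,
# with double precision assumed to run at 1/10 of that rate.
cell_sp_gflops = chip_gflops(CLOCK_GHZ, 8, 8)
cell_dp_gflops = cell_sp_gflops / 10

# 1 PFLOPS = 1,000,000 GFLOPS.
opteron_pflops = opteron_dp_gflops * CHIPS / 1e6
cell_pflops = cell_dp_gflops * CHIPS / 1e6
total_pflops = opteron_pflops + cell_pflops

print(f"Opteron: {opteron_pflops:.4f} PFLOPS")
print(f"CELL DP: {cell_pflops:.4f} PFLOPS")
print(f"Total:   {total_pflops:.2f} PFLOPS")  # about 0.47, far below 1.6
```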
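According to information AMD and IBM have disclosed in official documents, the CELL actually used in this machine is most likely CELL DP, i.e. a double-precision-enhanced CELL, whose double-precision performance reaches 1/2 of its single-precision performance (the same single/double ratio as most current general-purpose processors). In that case the numbers are:
65nm Rev. G dual-core K8: 2.8 GHz * 2 FLOPS/cycle * 2 cores = 11.2 GFLOPS; 11.2 GFLOPS * 16,000 = 179,200 GFLOPS = 179.2 TFLOPS
CELL DP: 2.8 GHz * 8 FLOPS/cycle * 8 cores = 179.2 GFLOPS; 16,000 CELL DPs in single precision = 2,867,200 GFLOPS = 2,867.2 TFLOPS = 2.87 PFLOPS, and in double precision 1.4336 PFLOPS.
So 16,000 CELL DPs + 16,000 65nm Rev. G dual-core K8s can reach 1.4336 + 0.1792 = 1.6128 PFLOPS.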
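To see how sensitive the total is to the CELL's double-precision ratio, here is a minimal Python sketch of the same estimate with that ratio as a parameter. The function name and its parameters are hypothetical, and the clock speed, chip counts and FLOPS-per-cycle figures are still this post's guesses.

```python
# Double-precision peak of the whole system as a function of the assumed
# CELL DP/SP ratio. All inputs are the post's assumptions, not real specs.

TARGET_PFLOPS = 1.6  # peak figure quoted in the press release


def system_dp_pflops(cell_dp_ratio, opteron_flops_per_cycle=2,
                     chips=16_000, clock_ghz=2.8):
    """Estimated peak DP PFLOPS for `chips` Opterons plus `chips` CELLs."""
    # Dual-core Opteron: clock * FLOPS/cycle * 2 cores.
    opteron_gflops = clock_ghz * opteron_flops_per_cycle * 2
    # CELL: 8 SP FLOPS/cycle on 8 cores, scaled by the DP/SP ratio.
    cell_gflops = clock_ghz * 8 * 8 * cell_dp_ratio
    return (opteron_gflops + cell_gflops) * chips / 1e6


cell_dp_estimate = system_dp_pflops(cell_dp_ratio=0.5)
print(f"CELL DP scenario: {cell_dp_estimate:.4f} PFLOPS")
print("Meets target:", cell_dp_estimate >= TARGET_PFLOPS)
```

Calling `system_dp_pflops(cell_dp_ratio=0.1)` reproduces the roughly 0.47 PFLOPS figure for the DD3.1 CELL, which shows that under these assumptions nearly all of the gap to 1.6 PFLOPS is closed by the CELL's double-precision ratio rather than by the Opterons.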
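If the Opterons are instead K8L-class dual-core parts, the roughly 1.61 PFLOPS above becomes 1.792 PFLOPS. This corresponds to the Opteron contribution doubling to 0.3584 PFLOPS, for example at 4 FLOPS/cycle per core. |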
|