|
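This thread was opened specifically to discuss Larrabee, so please keep the discussion technical.
What is currently confirmed about Larrabee:
1. Doug Carmean of Intel is Larrabee's chief architect;
2. So far the only occasion on which Larrabee details have been presented is the Stanford University CS448 course, in the talk "Intel Larrabee" given by Doug Carmean; because of NDA restrictions, the slides have not been made available for public download;
3. Two slide decks that touch on Larrabee are currently available: "Tera Tera Tera" by Ed Davis, chief architect at TerPro-Aject, and "HP & PetaFLOPS" by Richard Kaufmann of HP's high-performance computing division;
4. According to "Tera Tera Tera", the Larrabee architecture looks like this: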
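Clock: 1.7 GHz to 2.5 GHz
Cores: 16 to 24 in-order x86 ISA cores, 4 threads per core
Double-precision operations per core per cycle:
    Non-SSE: 2
    w/SSE: 8-16
Integer operations per core per cycle: ???
Cache per core (size, latency):
    L1 32 KB, 1 clock
    L2 256 KB, 10 clocks
    L3 none
    64-byte cache line
Core interconnect: ring bus, 256 bytes/cycle
Ring latency: ???
Memory: 1 to 2 GB, 128 GB/s, GDDR/FastDRAM
Device bus bandwidth: QPI, 17 GB/s per link, 50 ns latency
Peak: 14 to 40 GF/core, 0.2-1.0 TF/processor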
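As a sanity check, the peak figures on the slide can be roughly reproduced from the clock, the core count, and the w/SSE operations per cycle. The sketch below is only arithmetic on the numbers quoted above. It assumes that peak throughput is simply clock times operations per cycle, and that the low end of each range pairs with the low end of the others; neither assumption is stated on the slide, and the function names are made up for illustration.

```python
# Rough check of the "Tera Tera Tera" peak numbers.
# Assumption (not from the slide): peak = clock (GHz) x DP ops per cycle per core,
# with the low clock paired with 8 ops/cycle and the high clock with 16.

def peak_gflops_per_core(clock_ghz, ops_per_cycle):
    """Peak double-precision GFLOPS of one core."""
    return clock_ghz * ops_per_cycle

def peak_tflops_per_processor(clock_ghz, ops_per_cycle, cores):
    """Peak double-precision TFLOPS of the whole chip."""
    return peak_gflops_per_core(clock_ghz, ops_per_cycle) * cores / 1000.0

low_core = peak_gflops_per_core(1.7, 8)              # slide says 14 GF/core
high_core = peak_gflops_per_core(2.5, 16)            # slide says 40 GF/core
low_chip = peak_tflops_per_processor(1.7, 8, 16)     # slide says 0.2 TF
high_chip = peak_tflops_per_processor(2.5, 16, 24)   # slide says 1.0 TF

print(f"per core:      {low_core:.1f} to {high_core:.1f} GF")
print(f"per processor: {low_chip:.2f} to {high_chip:.2f} TF")
```

Under these assumptions the results land close to the slide's rounded 14 to 40 GF/core and 0.2 to 1.0 TF/processor, which suggests the peak row was derived from the w/SSE row rather than measured.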
5. According to "HP & PetaFLOPS", the core details are essentially the same as in "Tera Tera Tera", but the core count and memory bandwidth are considerably higher:
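Clock: 4 GHz
Cores: 32 in-order x86 ISA cores, 4 threads per core
Double-precision operations per core per cycle:
    Non-SSE: 2
    w/SSE: 8-16
Integer operations per core per cycle: ???
Cache per core (size, latency):
    L1 32 KB, 1 clock
    L2 256 KB, 10 clocks
    L3 none
    64-byte cache line
Core interconnect: ring bus, 256 bytes/cycle
Ring latency: ???
Memory: 1 to 2 GB, 192 GB/s, GDDR/FastDRAM
Device bus bandwidth: QPI, 17 GB/s per link, 50 ns latency
Peak: ~2 TF/processor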
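The ~2 TF figure is consistent with the top of the w/SSE range: 4 GHz x 32 cores x 16 operations per cycle. The sketch below also divides memory bandwidth by peak compute to show how many bytes of memory traffic are available per floating-point operation in each slide deck. It is only arithmetic on the quoted numbers; the function names are made up for illustration, and the assumption that peak equals clock times operations per cycle times cores is mine, not the slides'.

```python
# Peak compute and bandwidth-to-compute ratio from the two slide decks.
# Assumption (not from the slides): peak = clock x DP ops/cycle/core x cores.

def peak_tflops(clock_ghz, ops_per_cycle, cores):
    """Peak double-precision TFLOPS of the whole chip."""
    return clock_ghz * ops_per_cycle * cores / 1000.0

def bytes_per_flop(mem_gb_per_s, tflops):
    """Memory bandwidth available per floating-point operation at peak."""
    return mem_gb_per_s / (tflops * 1000.0)

# "HP & PetaFLOPS": 4 GHz, 32 cores, 192 GB/s, top of the w/SSE range.
hp_peak = peak_tflops(4.0, 16, 32)
hp_ratio = bytes_per_flop(192.0, hp_peak)

# "Tera Tera Tera", high end: 2.5 GHz, 24 cores, 128 GB/s.
tera_peak = peak_tflops(2.5, 16, 24)
tera_ratio = bytes_per_flop(128.0, tera_peak)

print(f"HP deck:   {hp_peak:.3f} TF, {hp_ratio:.3f} bytes/flop")
print(f"Tera deck: {tera_peak:.3f} TF, {tera_ratio:.3f} bytes/flop")
```

If these assumptions hold, memory bandwidth grows more slowly than peak compute between the two decks, so the larger configuration would lean harder on the per-core L2 to keep the vector units fed.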
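6. A demonstration is due in 2008, with a possible launch in 2009 or 2010;
7. The target markets are mainly high-end graphics and high-performance computing (HPC); suitable workloads include JPEG textures, physics acceleration, anti-aliasing, enhanced AI, and ray tracing;
8. For non-game graphics development, it may use an API that Intel calls Ct. [Update: it is now all but confirmed that the Larrabee SDK is provisionally named "Native SDK".]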
~~~~~~
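Suggested topics for discussion:
1. Whether pairing the x86 ISA with an in-order pipeline carries an inherent flaw that will severely limit performance
2. The architectural characteristics of Larrabee compared with Cell, G8X, and R600, how they differ, and the differences in applications and performance that may follow
3. How you see Larrabee's prospects
4. Which aspects of Larrabee you would like to see modestly improved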
~~~~~~
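When taking part in the discussion, please note:
1. Please do not copy news from other sites wholesale. If you want people to look at something, post the link and a few key paragraphs; wholesale copies will be deleted.
2. Please do not re-quote material that repeats or overlaps with the information above or with what other members have already posted.
3. Please mind your netiquette.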
Update:
June 3, 2008: a session titled "Larrabee: A Many-Core x86 Architecture for Visual Computing" has appeared on the Siggraph 2008 program. The listed participants are:
Larry Seiler, Doug Carmean, Eric Sprangle, Tom Forsyth (Intel Corporation), Michael Abrash (RAD Game Tools), Pradeep Dubey, Stephen Junkins, Adam Lake, Jeremy Sugerman, Robert Cavin, Roger Espasa, Ed Grochowski, Toni Juan (Intel Corporation), Pat Hanrahan (Stanford University)
Summary:
This paper introduces the Larrabee many-core visual computing architecture (a new software rendering pipeline implementation), a many-core programming model, and performance analysis for several applications. Larrabee uses multiple in-order x86 CPU cores that are augmented by a wide vector processor unit, as well as fixed-function co-processors. This provides dramatically higher performance per watt and per unit of area than out-of-order CPUs on highly parallel workloads and greatly increases the flexibility and programmability of the architecture as compared to standard GPUs.
The session takes place on August 12.
The paper has been released:
http://softwarecommunity.intel.c ... rrabee_manycore.pdf
At GDC 09 in 2009, Intel will disclose details of the LRBni ISA:
https://www.cmpevents.com/GD09/a ... =11&SessID=9139
SIMD Programming with Larrabee: A Second Look at the Larrabee New Instructions (LRBni) in Action
Speaker: Tom Forsyth (Programmer, Intel)
Date/Time: TBD
Track: Programming
Format: 60-minute Lecture
Experience Level: All
Session Description
Larrabee is Intel's revolutionary approach to take the current evolving programmability of the GPGPU to its logical end. The Larrabee architecture features many cores and threads, as well as a new vector instruction-set extension, the Larrabee new instructions (LRBni).
This talk follows Michael Abrash's first glimpse into LRBni and examines the programming methods and hardware instructions that help programmers get the most out of LRBni's extremely wide vector units. Starting with simple math examples that are fairly simple to vectorize, it moves through loops, conditionals, and more complex flow control, showing how to implement these algorithms in LRBni.
Next, the numerous choices of data format are examined - when to use SOA or AOS (and what those terms mean!), and how to use gather/scatter most efficiently from the same data structures used in an existing engine.
Finally, there is a quick look at efficient code scheduling and how to use the multiple hardware threads to help absorb instruction latencies.
Takeaway
The attendees will learn about the latest processor architecture from Intel, and the instruction set used to program it. Understanding how this architecture and instruction set works will give the attendee information on how to design the next iteration of their game engine, and the possibilities available when programming Larrabee natively.
Intended Audience and Prerequisites
Programmers will get the most from this talk, although it will be of interest to anyone interested in the nature of Larrabee, and the reasons why processor architecture is evolving in Larrabee's direction.
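For anyone unfamiliar with the terms in the abstract above: AOS (array of structures) stores each point's x, y, z together, while SOA (structure of arrays) stores all the x values together, then all the y values, and so on. The sketch below is a plain-Python illustration of why that matters for a 16-wide vector unit. Python lists stand in for memory and vector registers; `gather` and `scatter` are hypothetical helpers written for this post and are not LRBni instructions or syntax.

```python
# AOS vs SOA, with gather/scatter emulated on Python lists.
# Illustrative only: nothing here is actual LRBni code.

VECTOR_WIDTH = 16  # lanes per vector, per the "16-wide SIMD" in the GDC abstracts

def gather(memory, indices):
    """Load one element per lane from arbitrary positions in memory."""
    return [memory[i] for i in indices]

def scatter(memory, indices, values):
    """Store one element per lane to arbitrary positions in memory."""
    for i, v in zip(indices, values):
        memory[i] = v

num_points = VECTOR_WIDTH

# AOS: x, y, z of each point are adjacent -> x0 y0 z0 x1 y1 z1 ...
aos = []
for p in range(num_points):
    aos.extend([float(p), float(p) + 0.25, float(p) + 0.5])

# SOA: one contiguous array per field -> x0 x1 x2 ... / y0 y1 ... / z0 z1 ...
soa_x = [aos[3 * p] for p in range(num_points)]

# With SOA, 16 consecutive x values are a single contiguous vector load.
x_from_soa = soa_x[0:VECTOR_WIDTH]

# With AOS, the same 16 values sit 3 elements apart, so they need a gather.
x_indices = [3 * lane for lane in range(VECTOR_WIDTH)]
x_from_aos = gather(aos, x_indices)

assert x_from_soa == x_from_aos

# Double every x and write the results back into the AOS layout.
scatter(aos, x_indices, [2.0 * x for x in x_from_aos])
```

The point of the talk, as far as the abstract goes, seems to be that gather/scatter lets an existing engine keep its AOS data structures instead of converting everything to SOA up front; how expensive that is on real hardware is not something this sketch can show.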
Rasterization on Larrabee: A First Look at the Larrabee New Instructions (LRBni) in Action
Speaker: Michael Abrash (Programmer, Rad Game Tools)
Date/Time: TBD
Track: Programming
Format: 60-minute Lecture
Experience Level: All
Session Description
Larrabee is Intel's revolutionary approach to take the current evolving programmability of the GPGPU to its logical end. The Larrabee architecture features many cores and threads, as well as a new vector instruction-set extension, the Larrabee new instructions (LRBni).
This talk will provide an overview of LRBni and discuss the major instruction features - 16-wide SIMD, multiply-add, ternary instructions, predication, built-in data-format conversion, and gather/scatter.
The talk will then take a close look at a specific - and not obviously vectorizable - application of LRBni - rasterization. This is a crucial stage in the Larrabee rendering pipeline, and it demonstrates how developers can use the flexibility of the new instruction set to solve problems that are not obviously shader-like.
Takeaway
The attendees will learn about the latest processor architecture from Intel, and the instruction set used to program it. Understanding how this architecture and instruction set works will give the attendee information on how to design the next iteration of their game engine, and the possibilities available when programming Larrabee natively.
Intended Audience and Prerequisites
Programmers will get the most from this talk, although it will be of interest to anyone interested in the nature of Larrabee, and the reasons why processor architecture is evolving in Larrabee's direction. |
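To make "predication" and "multiply-add" from the abstract above a little more concrete, here is a plain-Python sketch of how a per-element `if` can be turned into a masked vector operation: a compare produces one mask bit per lane, and the multiply-add only writes the lanes whose bit is set. Python lists stand in for 16-wide registers; `compare_gt` and `masked_multiply_add` are hypothetical helpers made up for this post and do not correspond to any real LRBni mnemonic.

```python
# Predicated multiply-add emulated lane by lane on Python lists.
# Illustrative only: nothing here is actual LRBni code.

VECTOR_WIDTH = 16

def compare_gt(a, threshold):
    """Produce one mask bit per lane: True where a[lane] > threshold."""
    return [x > threshold for x in a]

def masked_multiply_add(dest, a, b, c, mask):
    """Per lane: a*b + c where the mask is set, otherwise keep dest unchanged."""
    return [x * y + z if m else d
            for d, x, y, z, m in zip(dest, a, b, c, mask)]

def scalar_reference(dest, a, b, c):
    """The same computation written as an ordinary loop with a branch."""
    out = list(dest)
    for i in range(len(a)):
        if a[i] > 0.0:
            out[i] = a[i] * b[i] + c[i]
    return out

a = [float(i - 8) for i in range(VECTOR_WIDTH)]  # -8.0 .. 7.0
b = [2.0] * VECTOR_WIDTH
c = [1.0] * VECTOR_WIDTH
dest = [0.0] * VECTOR_WIDTH

mask = compare_gt(a, 0.0)
result = masked_multiply_add(dest, a, b, c, mask)

# The branch-free vector version matches the branching scalar loop.
assert result == scalar_reference(dest, a, b, c)
```

All 16 lanes do the work regardless of the mask, so a branch that is rarely taken still costs a full vector operation; that trade-off is part of what makes rasterization "not obviously vectorizable".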
|