This post was last edited by itany on 2010-2-17 20:34
High end system vendors get the blues
Analysis: As servers get more generic, it's back to workstations
By Nebojsa Novakovic
Tuesday, 16 February 2010, 15:39
ONCE UPON A TIME, say 10 years ago, the PC hierarchy was very clear. At the bottom level, client PCs - desktop and laptop alike - were mainstream systems with similar performance and features, and correspondingly lower margins for both the manufacturers making them and the system integrators installing them. Then there were professional 3D graphics, desktop publishing and electronic design automation workstations, which were powerful, usually dual-socket, deskside machines with high priced OpenGL graphics cards. At the very top were servers, usually monsters with two or four CPUs, larger memory and I/O, as well as custom, expensive system level design.
As you climbed up every level, the volumes diminished, but both the unit prices and the percentage sales margins achievable jumped massively. So, a quad CPU server would cost as much as four dual CPU workstations or 40 PC clients, even though its plain bill of materials might only have been three times and 15 times as costly, respectively. The added value, service and support margins made up the difference.
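The ratios in that example can be put side by side in a short sketch. The tier names and the normalisation to one client PC are assumptions made for illustration; only the 4, 40, 3 and 15 multiples come from the text.

```python
from fractions import Fraction

# Figures relative to one client PC, read off the article's example:
# one quad CPU server is priced like 4 workstations or 40 PCs, while its
# bill of materials (BOM) is about 3x a workstation's and 15x a PC's.
PRICE = {
    "client PC": Fraction(1),
    "dual CPU workstation": Fraction(40, 4),  # 40 PC prices / 4 workstations
    "quad CPU server": Fraction(40),
}
BOM = {
    "client PC": Fraction(1),
    "dual CPU workstation": Fraction(15, 3),  # 15 PC BOMs / 3 workstation BOMs
    "quad CPU server": Fraction(15),
}


def relative_markup(tier: str) -> Fraction:
    """Price-to-BOM ratio of a tier, normalised so that a client PC is 1."""
    return PRICE[tier] / BOM[tier]


for tier in PRICE:
    ratio = float(relative_markup(tier))
    print(f"{tier}: {ratio:.2f}x the client PC's price-to-BOM ratio")
```

On these numbers each step up the hierarchy sells at a larger multiple of its parts cost than the step below it, which is the gap the value add, service and support margins filled.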
And that was a time when most of these big machines were still designed, and often made, in the US or other Western countries. A decade ago, the Taiwanese were still novices at workstation and server board design, and the cost savings or quick turnarounds that the little Chinese island can bring to the table didn't mean that much yet in this exclusive marketplace, which was dominated then by DEC, Compaq, HP, Sun, IBM and Silicon Graphics and what was then a healthy mix of various 64-bit RISC architectures as well as the ascendant but still 32-bit X86 at the low end.
Now, things look quite a bit different.
The RISC platforms with their extra performance and price niches are mostly gone. None remain in workstations, and they survive only in very high-end servers, with IBM POWER7 and Fujitsu SPARC64 VIII coming this year. No RISC in workstations means a far smaller software developer base, as having a machine on the desk can never be fully replaced by access to a remote big machine somewhere far away. Even Intel's Itanium, the supposed RISC-killer, seems destined to meet the same fate in due course. Too bad Nvidia didn't decide to get hold of the Alpha platform and its fast X86 binary translators, since if it had, maybe we'd now have some real competition there.
The 64-bit X86 has fully taken over the workstation and all classes of servers up to eight sockets now, with only a limited few RISC boxen still around in this mainstream server category. And, yes, it's mostly the Taiwanese firms like Supermicro, Asus, Gigabyte and Tyan - increasingly through their mainland China bureaus - churning out the actual designs for companies like HP, Dell, Fujitsu and others. Quick turnaround, low price and pretty much generic standard feature sets on, say, 1U dual processor (DP) server platforms now make them just as ubiquitous, and with similar zero value add, as a typical client PC.
As an example, a typical Intel dual processor Nehalem or Westmere Xeon platform, expected to account for about 90 per cent of all DP server shipments this year, has complete memory control and the QPI interconnect already in the CPU, and only Intel's Tylersburg chipset is available to choose from. Everyone even uses the same optional LSI SAS storage controller. So, how much 'added value' is there to add at the hardware level? It seems even less than on a high-end gaming desktop board. No value add means no margin to add.
When coupled with the tough competition between both these Taiwanese firms and the big US vendors using the designs, this means rock bottom margins for the generic 1U and 2U DP server platforms, not that different percentage-wise from, say, a desktop.
With the arrival of Intel's Nehalem-EX quad socket platform late next month, a similar 'standardisation' will happen in the higher end space. Except for very big boxes that scale to eight sockets and more - mind you, that's 64 cores and 128 threads here - the base four socket platform will, sooner or later, become as generic as the DP one. One CPU socket type, one I/O chipset, one memory buffer chip, and you can play around with the rest, but there's not much to play with at the base hardware level. Why should there be, when Intel did a great job with all these chosen components anyway? As you'll see at the end of next month, Nehalem-EX is expected to scale very well, with the best multiprocessor scaling ever for any X86 platform.
Let's see what tricks and treats the OEMs will come up with for the new platform, but in the meantime, is there any hope for more varied and more value added configurations in the mainstream and still predominant dual socket platform, Intel and AMD alike? And that's putting aside the more proprietary vendor-specific stuff like blades or other non standard formats.
The answer is back in the past - graphics workstations, a market with less pomp and glamour, but that has sustained its growth, unique value add and, therefore, reasonable margins, for many years already.
Workstations are more demanding than generic servers, with the exceptions of HPC supercomputing nodes or I/O intensive machines. A high-end 3D workstation needs the best of everything, from the fastest CPU with the most cores - remember Intel's workstation-only W-series Nehalem Xeons - to the fastest and often very large memory system, where capacity, bandwidth and latency all matter. Then, a fast multi-slot PCIe I/O system to feed both one or two fast GPUs and often a RAID or fast SSD array. Finally, networking and other I/O are there, but not before we throw in an expensive display, a powerful PSU or two and sometimes even an unusual 3D pointing device such as the spaceball.
Where can Taiwan vendors make that difference felt, to justify higher prices and therefore margins? Workstation users would love performance tuning just as much as PC overclockers, as long as it, well, works, and brings an added performance benefit, which here is very much linked to productivity. After all, finishing a render in four days instead of five cuts the job time by 20 per cent, which might mean up to 25 per cent more money earned per day. So, what Asus did with its Z7S-WS dual Xeon mainboard two years ago, and what EVGA intends to do with its dual Xeon Classified board in April, might apply well here: that is, reasonable CPU and memory performance optimisation options for frequency, bandwidth and latency. Of course, that requires an improved power system, good component choice and demanding overall board design.
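The render example can be checked with a few lines of arithmetic. This sketch assumes a fixed payment per finished job and back-to-back jobs, neither of which the article states; the function name is made up for illustration.

```python
from fractions import Fraction


def speedup_effects(old_days: int, new_days: int):
    """Return (share of time saved per job, share of extra jobs per day).

    Assumes jobs run back to back and each job pays the same amount,
    so income per day is proportional to jobs finished per day.
    """
    time_saved = 1 - Fraction(new_days, old_days)
    extra_per_day = Fraction(old_days, new_days) - 1
    return time_saved, extra_per_day


time_saved, extra_per_day = speedup_effects(old_days=5, new_days=4)
print(f"time saved per job: {float(time_saved):.0%}")
print(f"extra jobs (and income) per day: {float(extra_per_day):.0%}")
```

Under those assumptions, a job that takes four days instead of five saves a fifth of the time per job and fits a quarter more jobs into the same period.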
Then, a choice of I/O options: even though a single Intel Tylersburg 5520 chipset has 36 PCIe v2 lanes, these can be handled in many ways. One, for those happy with this number of lanes and wanting minimum latency, as in GPGPU work where the GPU to main memory round trip time is important, is to just optimise the traces and latencies on the 2 x PCIe x16 v2 slots, and leave the x4 v2 slot open for, say, a high-speed PCIe SSD card like the one Intel showed at the last IDF.
Another is for those requiring three or more GPGPU cards in one system, all hanging from one chipset. Then, as on the EVGA Classified, the mainboard designers would add two hot - beware of heat handling here - Nvidia Nforce 200 PCIe bridges, and have four full PCIe x16 v2 slots. An alternative, as in the Supermicro boards, is to use two Tylersburg 5520 bridges, one on each CPU, connected together via an additional QPI link. This way, you'll also get four full PCIe x16 slots, plus extra x4 slots for I/O expansion.
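The three slot layouts described above can be compared as a simple lane budget. This is a sketch under stated assumptions, not a description of any actual board: it assumes each Nforce 200 bridge takes one x16 uplink from the chipset and fans it out to two x16 slots, that the bridged board keeps one direct x4 slot, that the dual-chipset board spends its leftover lanes on two x4 slots, and that every 5520 exposes all 36 lanes. The class and layout names are made up for illustration.

```python
from dataclasses import dataclass
from typing import List

LANES_PER_5520 = 36  # PCIe v2 lanes per Tylersburg 5520, per the article


@dataclass
class BoardLayout:
    name: str
    chipsets: int              # number of 5520 chipsets on the board
    direct_slots: List[int]    # widths of slots wired straight to a chipset
    switch_uplinks: List[int]  # chipset lanes feeding each PCIe bridge
    switched_slots: List[int]  # widths of slots behind the bridges

    def chipset_lanes(self) -> int:
        return self.chipsets * LANES_PER_5520

    def lanes_used(self) -> int:
        # Only direct slots and bridge uplinks consume chipset lanes.
        return sum(self.direct_slots) + sum(self.switch_uplinks)

    def x16_slots(self) -> int:
        return (self.direct_slots + self.switched_slots).count(16)

    def oversubscription(self) -> float:
        """Downstream slot lanes per uplink lane behind the bridges."""
        if not self.switch_uplinks:
            return 1.0
        return sum(self.switched_slots) / sum(self.switch_uplinks)


LAYOUTS = {
    "single 5520": BoardLayout("single 5520", 1, [16, 16, 4], [], []),
    # Assumed wiring: two bridges, each x16 up and 2 x16 down, plus one x4.
    "5520 + 2 bridges": BoardLayout(
        "5520 + 2 bridges", 1, [4], [16, 16], [16, 16, 16, 16]
    ),
    # Assumed slot mix: four x16 and two x4 spread over both chipsets.
    "dual 5520": BoardLayout("dual 5520", 2, [16, 16, 16, 16, 4, 4], [], []),
}

for board in LAYOUTS.values():
    assert board.lanes_used() <= board.chipset_lanes(), board.name
    print(
        f"{board.name}: {board.x16_slots()} x16 slots, "
        f"{board.lanes_used()}/{board.chipset_lanes()} chipset lanes used, "
        f"{board.oversubscription():.1f}:1 behind bridges"
    )
```

On these assumptions, both four-slot designs offer the same number of x16 slots, but the bridged one shares 32 chipset lanes among 64 slot lanes, while the dual-chipset one gives every slot its own lanes at the cost of a second chipset.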
Another add-on category that makes sense is better I/O: for instance, using a more intelligent Gigabit Ethernet controller such as the Intel 82576, which takes on more TCP/IP protocol stack processing to offload work from the CPU. The same goes for USB3 and, why not, hardware-assisted SATA3 RAID capability. And, did I forget easy BIOS updates via a Flash utility within the BIOS itself, just like on most desktop boards?
There won't be a shortage of CPUs to fill these boards. After all, Intel's Xeon 5600 series of six-core Westmere CPUs will come out next month, and the high performance top bins are expected to set new speed records. Also, the vendors could consider workstation flavours of dual socket and quad socket Nehalem-EX boards, even with multiple QPI links between each CPU pair for higher bandwidth, as well as massive memory for EDA chip design and computational science simulations, which are the main targets of a possible Nehalem-EX workstation niche.
In summary, the hardware value add in a generic server looks just as bleak for any Taiwan mainboard maker as in a, well, generic client PC. But custom servers aren't the only way out. Good old workstations might reward those vendors who dare to make the right moves. And let's not just wait for Gigabyte, Asus and Supermicro. It's also up to us, the users, to voice our opinions and make our needs known, too. µ