Project Erebus has been in a constant state of change over the past several months. What started as a system with 6 water-cooled Nvidia GTX 480s in an over-the-top gaming motherboard has morphed into a system with 8 ATI 6970 cards housed in a rack-mounted server case. While I don’t think it’s safe to say the transformation is complete, it has certainly reached a plateau. In the ever-changing world of computing, the bar for performance and speed is constantly being raised. What is top of the line one day is old news the next. At least for a short while I have the honor of setting the bar in some respects, even if it is a losing battle.
The build, V 1.0
Erebus version 1 was composed of an EVGA SR-2 Classified motherboard and 6 water-cooled GTX 480s running at 1504 MHz. Water cooling was used so that all 6 cards could fit into the 7 available PCI Express slots, the intention being to leave the center slot open for the Magma expansion bay. The Magma PCI-E expansion chassis provided room for several more cards; however, a limitation of the SR-2’s 32-bit BIOS prevented more than 8 total cards. The chassis was fitted with two Galaxy O/C cards running at 1520 MHz, slightly faster than the EVGA cards but not requiring water cooling. As far as all-Nvidia systems go, this was one of the fastest personally owned single systems at the time. Any application capable of accessing multiple cards was able to utilize all 3,840 stream processors across the 8 cards.
In an effort to increase the available bus speed, and in the hope that all 10 slots of the TYAN FT72-B7015 motherboard could be used, an order was placed with RenderStream, a company based in Austin, TX. The board itself is much larger than even the already large SR-2. However, the barebones system comes with an entire chassis and power supply. After simply moving the memory, processors and hard drives over to the Tyan board, I was up and running in Ubuntu in no time. At nearly the same time Nvidia released the GTX 580, increasing clock speeds and unlocking all 512 stream processors of the GF110 core. I immediately ordered 4 EVGA GTX 580 Hydro Coppers, with plans to order 4 more once they were in stock. It was then that I discovered that not all 10 slots could in fact be used on the Tyan board, even with the Magma expansion bay.
There were some issues initially with getting all 8 cards to work, but once the BIOS was configured correctly both Windows 7 and 64-bit Ubuntu were able to detect all 8 cards. The system consisted of 4 GTX 480s and 4 GTX 580s. Any attempt to add another card, regardless of type, would cause the system not to POST; it seemed as if the BIOS was unable to address all the cards. Once again I was limited to 8 cards. It was a rather big disappointment, because I had hoped to use the expansion bay and have a system with potentially 10 or 11 cards. Then there were rumors that Nvidia was to release the GTX 590, essentially 2 underclocked 580 cores housed on one double-slot card. This would give me the potential to have 16 GPUs in total. However, shortly before the release of the GTX 590 I made a somewhat dramatic decision.
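As a quick sanity check on the numbers so far, here is a small Python tally of stream processors for each Nvidia configuration. It is only a sketch: the 480-per-GPU figure is taken from the 3,840 total quoted for the eight GTX 480s, the 512 figure is the GF110 count mentioned above, and the GTX 590 build is hypothetical, assuming each card simply counts as two 580-class cores. The names CORES_PER_GPU and total_cores are my own, made up for illustration.

```python
# Stream processors (CUDA cores) per GPU core.
# 480 follows from 3,840 / 8 cards; 512 is the full GF110 count.
CORES_PER_GPU = {"GTX 480": 480, "GTX 580": 512}


def total_cores(gpus):
    """Sum stream processors over a {model: gpu_count} mapping."""
    return sum(CORES_PER_GPU[model] * count for model, count in gpus.items())


# Version 1: six water-cooled GTX 480s plus two in the Magma chassis.
v1 = total_cores({"GTX 480": 8})

# Tyan build: four GTX 480s and four GTX 580s.
mixed = total_cores({"GTX 480": 4, "GTX 580": 4})

# Hypothetical: eight dual-GPU GTX 590s, treated as 16 GTX 580-class cores.
gtx590_build = total_cores({"GTX 580": 16})

print(v1, mixed, gtx590_build)
```

The tally ignores clock speeds, so it says nothing about how much the underclocking on the 590 would have cost in practice.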
After a short discussion in my regular IRC channel I searched for and found a good price on 8 ATI 6970s. There were some initial concerns about whether the Tyan motherboard would support ATI cards at all. I knew of only a small handful of people who had ordered the RenderStream system, and no one that I was aware of had even tried ATI cards in it. All the Nvidia cards were removed and the 6970s were installed. At first I thought that Ubuntu could only detect 6 cards, but as it turned out there were some issues with aticonfig correctly detecting everything. Manually setting the necessary configs got everything working, and the results have been nothing but positive.
My first test consisted of running a new oclHashcat beta application called oclHashcat-lite. Oclhc-lite is targeted at new users of the oclhc series of tools who don’t need to learn all the details of regular oclHashcat. In addition, oclhc-lite is optimized for single-hash performance, using round reversal and various other hashing-algorithm shortcuts and GPU techniques to achieve the highest possible speeds. First up was the MD5 hashing algorithm, the bench press of benchmarks, which immediately showed results just shy of what was predicted. Here are the results of the test.
To date these are some of the highest speeds achieved by a single personally owned machine.
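For anyone new to these tools, the difference between attacking a single hash and attacking a list of hashes can be sketched on the CPU in a few lines of Python. This is a toy illustration only, not how oclHashcat or oclHashcat-lite work internally: it uses no GPU code and no round reversal, and the function names and the tiny lowercase keyspace are my own assumptions. The single-hash loop does one comparison per candidate and stops at the first hit, while the multi-hash loop has to look every digest up among all remaining targets and keep going until each one is found.

```python
import hashlib
from itertools import product
from string import ascii_lowercase


def md5_hex(candidate):
    """Return the hex MD5 digest of a candidate password."""
    return hashlib.md5(candidate.encode()).hexdigest()


def candidates(max_len):
    """Yield every lowercase string of length 1..max_len (brute force)."""
    for length in range(1, max_len + 1):
        for chars in product(ascii_lowercase, repeat=length):
            yield "".join(chars)


def crack_single(target, max_len=3):
    """Single-hash mode: compare against one target, stop at the first hit."""
    for word in candidates(max_len):
        if md5_hex(word) == target:
            return word
    return None


def crack_multi(targets, max_len=3):
    """Multi-hash mode: look each digest up among all remaining targets."""
    remaining = set(targets)
    found = {}
    for word in candidates(max_len):
        digest = md5_hex(word)
        if digest in remaining:
            found[digest] = word
            remaining.discard(digest)
            if not remaining:  # every target recovered, stop early
                break
    return found


single = crack_single(md5_hex("gpu"))
multi = crack_multi([md5_hex("ati"), md5_hex("gpu")])
print(single, sorted(multi.values()))
```

On a GPU the extra lookup per candidate is one reason a multi-hash run is slower than a single-hash run over the same keyspace.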
Multi-hash performance obviously suffers, dropping to 14.8B/sec with 6500 MD5 hashes using regular oclhc. The setup is relatively new, and I plan to keep running benchmarks and posting them here. For now I’m pretty pleased with the setup overall. In the tests that I’ve run, ATI has surpassed Nvidia with ease. The only drawback I foresee is having to run a separate Nvidia system to support CUDA-only apps. Many thanks to those who have helped me iron out the kinks, and to Atom for making such wonderfully elegant tools.