A while ago I saw the Backblaze storage pod and was impressed: it is super cheap and has space for 45 drives in 4U.
I found a place to have them made in the UK (in matt black, not the Backblaze red). Though I sold most of them, I still have a few left for £800 each (including the nylon stand-offs and the port multipliers). Mail me at [email protected] if you are interested; if you need more than one I can give you a discount too.
After the first run I had them completely done by the fabrication place, so they are precision assembled (I was hopeless at it, almost as hopeless as I am at selling things).
As soon as you see so many disks in a case like that, it’s hard not to think of Sun’s Thumper and ZFS.
I’ve blogged about ZFS before and given talks on it. With so many disks that can fail (either noisily or silently), data loss is inevitable, and worse, you may not even be alerted. ZFS would solve this, or at least ensure you know about it. Backblaze use custom application logic to work around this, using Tomcat and HTTPS.
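To make the "silent failure" point concrete, here is a minimal sketch of the general idea of checksum-on-read with a mirrored copy. It is a toy illustration only, not how ZFS is implemented: the class name `ChecksummedMirror`, the use of SHA-256 and the two in-memory "disks" are all my assumptions for the example.

```python
import hashlib


class ChecksummedMirror:
    """Toy two-way mirror that keeps a digest for every block (illustrative only)."""

    def __init__(self):
        self.copies = [{}, {}]  # two pretend disks: block id -> bytes
        self.checksums = {}     # block id -> hex digest, stored apart from the data

    def write(self, block_id, data):
        # Record the checksum first, then write the block to both copies.
        self.checksums[block_id] = hashlib.sha256(data).hexdigest()
        for disk in self.copies:
            disk[block_id] = data

    def read(self, block_id):
        expected = self.checksums[block_id]
        for disk in self.copies:
            data = disk[block_id]
            if hashlib.sha256(data).hexdigest() == expected:
                # Found a good copy: overwrite any copy that differs from it.
                for other in self.copies:
                    if other[block_id] != data:
                        other[block_id] = data
                return data
        # No copy matches its checksum: fail loudly instead of returning bad data.
        raise IOError(f"block {block_id!r}: every copy failed its checksum")
```

If one copy is silently corrupted, `read` still returns the good data and repairs the bad copy; if both are corrupted, you at least get an error rather than garbage.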
It’s Not Highly Available
An ex-Sun guy has a critique here that is totally spot on, and he makes a few great points about the subtle changes Sun made to its design to accommodate vibration, noise and electromagnetic radiation. In so many ways the hardware is inadequate and does not have the uptime characteristics of an enterprise SAN/NAS. There are, however, a few smart software solutions to work around hardware failures, so the availability of a particular device is not so important.
It won’t be fast
That is largely a feature of the disks and the controllers; using the port multipliers slows you down too. A very cool feature of ZFS, Hybrid Storage Pools, allows for using SSDs as a second-level cache, which would help.
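As a rough sketch of why a second-level cache helps, here is a toy read path with a small RAM tier, a larger SSD tier and slow disks behind them. This is only an illustration under my own assumptions: the class `TwoLevelCache`, the plain LRU eviction and the `disk_reads` counter are made up for the example, and the real ZFS caches use a more sophisticated policy than this.

```python
from collections import OrderedDict


class TwoLevelCache:
    """Toy RAM + SSD read cache in front of a slow backing store (illustrative only)."""

    def __init__(self, backing, ram_slots, ssd_slots):
        self.backing = backing      # dict-like object standing in for the slow disks
        self.ram = OrderedDict()    # first-level cache, least recently used first
        self.ssd = OrderedDict()    # second-level cache, fed by RAM evictions
        self.ram_slots = ram_slots
        self.ssd_slots = ssd_slots
        self.disk_reads = 0         # how many reads had to go all the way to disk

    def read(self, key):
        if key in self.ram:
            self.ram.move_to_end(key)   # mark as most recently used
            return self.ram[key]
        if key in self.ssd:
            value = self.ssd.pop(key)   # SSD hit: promote back into RAM
        else:
            value = self.backing[key]   # miss in both tiers: go to disk
            self.disk_reads += 1
        self._put_ram(key, value)
        return value

    def _put_ram(self, key, value):
        self.ram[key] = value
        if len(self.ram) > self.ram_slots:
            # Demote the least recently used RAM entry to the SSD tier.
            old_key, old_value = self.ram.popitem(last=False)
            self.ssd[old_key] = old_value
            if len(self.ssd) > self.ssd_slots:
                self.ssd.popitem(last=False)  # SSD full: drop its oldest entry
```

Anything that falls out of RAM but is still on the SSD tier gets served without touching the slow disks, which is the whole point when the disks sit behind port multipliers.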
How can you make it HA?
The landscape has shifted a little since I last blogged about it, with Ceph and Riak CS being interesting additions. Ceph has an object store, a block device and a POSIX filesystem with distributed metadata in the works; it is the one to watch, I think.