Building A Backblaze Storage Pod

January 5th 2010 12:41 pm

[edit] You can now buy Backblaze pods at OpenStorageSystems. [/edit]

A while ago I saw the Backblaze storage pod and was impressed.

Like many others I thought:

  • I want one
  • Wouldn’t it work great with ZFS?
  • The hardware sucks

Building one

I used to build and sell Linux storage servers with media streaming software and easy setup (before all the ready-rolled ones came out), so the software side is not a major challenge. Backblaze have released a template for the cases and a list of the other components, and coincidentally a good friend is a Mechanical Design Engineer who can work on it for me. The cost of the cases drops precipitously if you buy in bulk, and he is looking at making the case storable flat and easy to assemble by folding the edges together, so I will take the plunge and buy a load. If anyone wants to get hold of some, please get in touch.

ZFS

As soon as you see so many disks in a case like that, it’s hard not to think of Sun’s Thumper and ZFS.

I’ve blogged about ZFS before and given talks on it. With so many disks to fail (either noisily or silently), data loss is inevitable and, worse, you may not even be alerted. ZFS would solve this (or at least ensure you know about it). Backblaze use custom application logic to work around this, using Tomcat and HTTPS.
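
To make the silent-failure point concrete, here is a minimal Python sketch of the general idea of end-to-end checksumming: record a checksum when a block is written and verify it on every read, so a flipped bit is reported instead of being handed back as good data. This is not ZFS's actual implementation or on-disk format. The names (ChecksummedStore, write_block, read_block) are made up for illustration, and SHA-256 is just a convenient stand-in for a checksum function.

```python
import hashlib


class ChecksumError(Exception):
    """Raised when a block's contents no longer match its stored checksum."""


class ChecksummedStore:
    """Toy block store (hypothetical, for illustration only)."""

    def __init__(self):
        self.blocks = {}     # block id -> bytearray standing in for the disk
        self.checksums = {}  # block id -> checksum kept apart from the data

    def write_block(self, block_id, data):
        self.blocks[block_id] = bytearray(data)
        self.checksums[block_id] = hashlib.sha256(data).hexdigest()

    def read_block(self, block_id):
        data = bytes(self.blocks[block_id])
        # Verify on every read, so corruption cannot pass unnoticed.
        if hashlib.sha256(data).hexdigest() != self.checksums[block_id]:
            raise ChecksumError(f"block {block_id} failed verification")
        return data


store = ChecksummedStore()
store.write_block(0, b"important data")
assert store.read_block(0) == b"important data"

# Simulate a silent bit flip on the disk: no I/O error is ever raised.
store.blocks[0][0] ^= 0x01

try:
    store.read_block(0)
    corruption_detected = False
except ChecksumError:
    corruption_detected = True
```

Detection is only half the story: with a redundant copy (a mirror or parity) the bad block can also be repaired from the good one, which is where ZFS's redundancy comes in.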

It’s Not Highly Available

A chap at Sun has a critique here that is totally spot on, and he makes a few great points about subtle changes in Sun’s design that accommodate vibration, noise and electromagnetic radiation. In many ways the hardware is inadequate and does not have the uptime characteristics of a device in Sun’s range. That said, an individual device from Sun is not as HA as, say, an EMC SAN (with mirrored write cache, dual SPs etc.), as it too relies mostly on commodity hardware. For five-nines availability you need to decide how you will protect against device failure anyway; the Backblaze devices just fail faster. That’s your trade-off.
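
For a sense of scale, the arithmetic behind five nines can be sketched in a few lines of Python. The 99% figure for a single pod is purely an assumed number for illustration (not a measurement of a Backblaze pod or of any Sun device), and the combined figure assumes the replicas fail independently, which hardware sharing a rack, a power feed or a firmware bug will not.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_minutes_per_year(availability):
    """Expected minutes of downtime per year at a given availability."""
    return (1 - availability) * MINUTES_PER_YEAR


def combined_availability(single, replicas):
    """Availability of N replicas where any one surviving copy is enough.

    Assumes failures are independent, which is optimistic.
    """
    return 1 - (1 - single) ** replicas


five_nines = downtime_minutes_per_year(0.99999)  # roughly 5.3 minutes a year

# Assumed figure: one cheap pod that is up 99% of the time.
one_pod = 0.99
for replicas in (1, 2, 3):
    a = combined_availability(one_pod, replicas)
    print(f"{replicas} pod(s): {a:.6f} available, "
          f"{downtime_minutes_per_year(a):.1f} min/year down")
```

On those assumed numbers, two independent pods reach about four nines and three clear five nines, which is the trade-off in a nutshell: cheaper boxes, but more of them.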

It won’t be fast

That is largely a feature of the disks and the controllers; if performance is a problem you could get a better motherboard, better disks and faster controllers, and perhaps eschew the port multipliers too. A very cool new feature of ZFS (L2ARC / Hybrid Storage Pools) allows an SSD to be used as a second-level cache, which would help. On Linux, dm-cache (or here) could probably achieve something similar.
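
The second-level cache idea can be sketched in Python: a small fast tier (standing in for RAM), a larger tier behind it (standing in for the SSD), and the slow disks as the backing store. This is only a toy LRU model to show why the extra tier cuts disk reads. It is not how ZFS's ARC and L2ARC or dm-cache work internally, and every name in it is hypothetical.

```python
from collections import OrderedDict


class TwoLevelCache:
    """Toy two-tier LRU read cache (illustrative only)."""

    def __init__(self, backing, l1_size, l2_size):
        self.backing = backing    # dict standing in for the slow disks
        self.l1 = OrderedDict()   # small, fast tier ("RAM")
        self.l2 = OrderedDict()   # larger, slower tier ("SSD")
        self.l1_size = l1_size
        self.l2_size = l2_size
        self.disk_reads = 0

    def _put(self, tier, size, key, value):
        """Insert into a tier; return the evicted (key, value) or None."""
        tier[key] = value
        tier.move_to_end(key)
        if len(tier) > size:
            return tier.popitem(last=False)  # evict least recently used
        return None

    def read(self, key):
        if key in self.l1:
            self.l1.move_to_end(key)
            return self.l1[key]
        if key in self.l2:
            value = self.l2.pop(key)   # hit in the second tier: promote
        else:
            value = self.backing[key]  # miss in both tiers: go to disk
            self.disk_reads += 1
        evicted = self._put(self.l1, self.l1_size, key, value)
        if evicted is not None:
            # Whatever falls out of the fast tier spills into the second one.
            self._put(self.l2, self.l2_size, *evicted)
        return value


disks = {n: f"block-{n}" for n in range(10)}
cache = TwoLevelCache(disks, l1_size=2, l2_size=4)

for n in (0, 1, 2, 3):  # cold reads: all four come from disk
    cache.read(n)
for n in (0, 1, 2, 3):  # working set does not fit in L1, but L2 holds the overflow
    cache.read(n)
```

In this run the second pass is served entirely from the two cache tiers, so disk_reads stays at 4. With the second tier sized to zero, the same access pattern goes back to disk for every block.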

How can you make it HA?

This is really another blog post (and a few weeks’ work hacking out the ideas), but I can think of several ways of doing what Backblaze do in their software stack, exporting files (via NFS, SMB or another protocol) or block devices (ATAoE, iSCSI, NBD etc.) in a robust manner.

I have ordered some of the port multipliers, got my friend working on the case and will buy the sundry bits over the next few days.

This is one of my 101 goals

Posted by tom under 101 & BackBlazePod & ZFS

5 Responses to “Building A Backblaze Storage Pod”

  1. Nilay responded on 05 Jan 2010 at 6:14 pm #

    Very cool! We haven’t had a chance to load up Solaris & ZFS on our pod… Debian + JFS + mdadm has been very reliable for us, but we are always curious to see if there is a better way to do it.

    We’ll keep our eyes on your blog… and definitely keep us updated on your progress!

    – Nilay

  2. Ronald Duncan responded on 15 May 2010 at 12:15 pm #

    About 6 years ago, we built a collection of 5 terabyte servers; since then we have used EMC SANs and other large disk arrays.

    We are off to start building our own again. The main issue for us is a nice form of clustering, since we want to mirror the data across datacentres. Speed is not an issue at the moment since this is for our 2nd level backups, but if we could get it fast then we could use it for other things.

    It just needs to be mountable as NFS.

    A more complex software stack is http://www.openfiler.com/ and I would be interested in people’s experience with Openfiler.

    We tested out ZFS, and it did not cope with simple things like pulling a drive out and putting it back in. If you pull the drive out it goes into degraded mode, but it does not recognise the drive coming back in and you have a whole lot of pain to remount the drive and then rebuild things using the drive. Fine testing locally, but no use for someone that is likely to pull out the wrong drive in a remote data centre.

    How have you got on with building so far? We would like to build a couple of servers immediately.

    All the best
    Ronald

  3. Butch responded on 21 Jul 2011 at 8:23 pm #

    With regard to ZFS not handling drive removal well, that may be more a driver support issue at the time. I’ve used ZFS on several Solaris systems with known supported controllers and have had no such issues. You may have to run cfgadm -c configure on the specific “slot” to reconfigure the drive first, but as long as the OS can communicate properly with the controller there shouldn’t be any problems.

  4. syle responded on 23 Dec 2011 at 1:55 pm #

    Isn’t that the point to begin with, not to have to have the pain to have to login to a box and “figure out what slot”, then “figure out what command to run on that slot”,
    should be able to pull it out, stick new one in, and have it do this automatically.

  5. Roger Pickering responded on 13 Jan 2012 at 5:17 pm #

    Thanks so much for the article! I am very much interested in this system, but need to know if it will work as is with Solaris. I very much like the ZFS fs and would use this in a tiered system so my fast secondary tier would be an Oracle data server.

    One, perhaps dumb, question, assuming it works with Solaris, is: can I put several boxes into one ZFS pool? That would allow me to have files much larger than the 120 TB otherwise (assuming RAIDZ2 with two spares and one SSD cache).

    Thanks in advance!

    Roger
