I require a distributed filesystem for Windows

I am about to take over Network Admin duties at work. (Don’t ask.) In and of itself, this isn’t necessarily a Bad Thing, but XP was still new the last time I was a Network Admin. I have a need for a software solution, and I’m hoping that someone can point me in the right direction.

As the subject says: I require a filesystem that can be distributed across our WAN/LAN workstations, instead of being situated on a few big servers. I know that HiveCache used to do something like this, but it seems to have gone the way of the dodo. No, I do not want to push our data to a public service.

We are a small but geographically diverse company, with roughly 40 employees locally and another 40 or so employees in 2 other North American locations. The server infrastructure is a combination of Win2k and 2k3 machines, just shy of a dozen of them for various tasks.

The company is at the size where we are starting to generate a volume of data that is tedious to back up: a full data (non-OS) backup is about 2TB. Unfortunately, we are still small enough that an off-the-shelf SAN hardware solution is cost-prohibitive; anything more than maybe $2k will get nixed immediately.

We have a great network infrastructure, so I’d rather do a distributed backup across the entire network to a single carousel than run 6 different tape drives like we do now. Yeah, it would be cheap to just buy a few 750GB hard drives and RAID them together. But I don’t want to buy yet another server, and I don’t want to have yet another single point of failure. And I’d still have to dump it all to tape, so why bother?

Thus, I’d like to start using the spare hard drive space on the workstations to perform the function of network storage. I’ve done some napkin-math that suggests that roughly 1 of those 2 TB could be stored in workstation space, with the rest being distributed across the servers. Obviously, it would have to be transparent, redundant, and secure. The idea has been around for at least a decade now, if not longer, but I’ll be darned if I can find anyone that does it on Windows.
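For anyone who wants to redo the napkin math, here is a minimal sketch of the kind of calculation involved. It is not the exact arithmetic behind the figure above. It assumes one workstation per employee (so roughly 80 machines), treats 1 TB as 1000 GB, and uses a hypothetical `replicas` parameter for how many copies of each piece of data are kept; none of those numbers come from a real product.

```python
def spare_space_per_workstation_gb(data_tb, workstations, replicas):
    """Spare disk space each workstation would need to contribute, in GB.

    Assumes data is spread evenly and every piece is stored `replicas` times.
    """
    total_gb = data_tb * 1000 * replicas  # assumption: 1 TB = 1000 GB
    return total_gb / workstations


# 1 TB of data on ~80 workstations (assumed: one machine per employee)
for replicas in (1, 2, 3):
    need = spare_space_per_workstation_gb(1, 80, replicas)
    print(f"{replicas} copies: {need:.1f} GB per workstation")
```

Under those assumptions, even three copies of everything works out to a few tens of gigabytes per machine.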

Sun has Lustre, but that’s not exactly what I was going for as it wants a dedicated index server … and doesn’t run on Windows. Gluster seems closer, with distributed indexes, but also doesn’t run on Windows. Windows has DFS, but that’s just a unified namespace across servers (transparency), not actual redundancy.

And yes, I am fully aware of the implications of what I am asking. Yes, I know that if more than 30% of the people turn their PCs off then someone might not be able to get their data. Yes, I know that it will increase network overhead (albeit not significantly), as well as workstation processor overhead. Blah, blah, blah.
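To put a rough number on the "PCs turned off" worry, here is a small sketch under a deliberately simple model. It assumes each workstation is off independently with probability `p_off`, and that each piece of data lives on `replicas` different machines. Both names are hypothetical, and real office behaviour (everyone going home at 5pm) is not independent, so treat the output as optimistic.

```python
def block_unavailable_probability(p_off, replicas):
    """Chance that every machine holding a given block is off at once.

    Assumes machines fail independently, which is a simplification.
    """
    return p_off ** replicas


def replicas_needed(p_off, max_unavailable):
    """Smallest replica count that keeps unavailability at or below the target."""
    if not (0 <= p_off < 1) or max_unavailable <= 0:
        raise ValueError("need 0 <= p_off < 1 and max_unavailable > 0")
    replicas = 1
    while block_unavailable_probability(p_off, replicas) > max_unavailable:
        replicas += 1
    return replicas


# With 30% of PCs off and 3 copies, a given block is unreachable ~2.7% of the time.
print(block_unavailable_probability(0.3, 3))
# Copies needed to push that to 0.1% or lower under the same model.
print(replicas_needed(0.3, 0.001))
```

The point of the sketch is only that redundancy and the "who left their PC on" problem trade off directly: more copies cost more spare disk, fewer copies mean more missing files.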

Anyone? Bueller? Aren’t we in the future yet?

Published by

Rick Osborne

I am a web geek who has been doing this sort of thing entirely too long. I rant, I muse, I whine. That is, I am not at all atypical for my breed.