A Thought Experiment in Using the Web for Storage

Linus Torvalds famously quipped: "Only wimps use tape backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it ;)"

Consider all the free websites where you can post information of your own choosing: blogs, Facebook, web mail, (this one), and so on.

Each site has its own (de facto) API for reading and writing data.

Could a translation layer be built for an arbitrary site, presenting it to the Linux kernel as a block device? I believe something similar was done for Gmail via GmailFS, though as a FUSE filesystem rather than a block device.
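
To make that concrete, the translation layer might reduce every site to a tiny common interface. A minimal sketch in Python; the class, the methods, and the block size are all invented for illustration:

    from abc import ABC, abstractmethod
    from typing import Optional

    BLOCK_SIZE = 4096  # arbitrary fixed block size for this sketch

    class SiteBackend(ABC):
        """Hypothetical adapter wrapping one site's ad-hoc API as block storage."""

        @abstractmethod
        def read_block(self, index: int) -> bytes:
            """Fetch block `index`, e.g. by downloading a post or attachment."""

        @abstractmethod
        def write_block(self, index: int, data: bytes) -> None:
            """Store block `index`, e.g. by creating or editing a post."""

        @abstractmethod
        def capacity_hint(self) -> Optional[int]:
            """Best guess at usable bytes on this site, or None if unknown."""

A small user-space server speaking the NBD protocol could then hand the result to the kernel as a real block device.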

The trick would be to add a RAID layer on top, to deal with the transient nature of the sites.
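
Continuing the sketch, the simplest RAID-like policy would be plain N-way mirroring; the names and the fallback policy here are assumptions, not a design:

    class MirrorSet:
        """Hypothetical RAID-1-style layer: each block is stored on several sites."""

        def __init__(self, backends, copies=3):
            self.backends = list(backends)
            self.copies = copies  # target number of sites holding each block

        def write_block(self, index, data):
            written = 0
            for backend in self.backends:
                if written == self.copies:
                    break
                try:
                    backend.write_block(index, data)
                    written += 1
                except Exception:
                    continue  # site down or account deleted; try the next one
            if written == 0:
                raise IOError("no backend accepted block %d" % index)

        def read_block(self, index):
            for backend in self.backends:
                try:
                    return backend.read_block(index)
                except Exception:
                    continue  # transient failure; fall back to another copy
            raise IOError("block %d lost on all backends" % index)

In practice each block would also need a checksum stored alongside it, since a site could silently mangle whatever was uploaded.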

Another part of the mechanism would identify new sites and semi-automatically create accounts on them (prompting the system owner to solve a CAPTCHA, for example).

Once an account is added, existing data would be propagated to it.
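
Assuming the mirror sketch above, propagation could be little more than a resync loop:

    def propagate(mirror, new_backend, block_count):
        """Hypothetical resync: copy existing blocks onto a freshly added account."""
        mirror.backends.append(new_backend)
        for index in range(block_count):
            try:
                data = mirror.read_block(index)  # read from any surviving copy
            except IOError:
                continue  # block lost everywhere; nothing left to propagate
            new_backend.write_block(index, data)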

For the RAID layer, data would be kept in several places, but those places would differ not only in their size limits (many of them unknown) but also in their reliability and longevity. Reliability could be monitored directly; longevity would have to be estimated.

Some parameters for estimating longevity might be (a toy scoring sketch follows the list):

  • number of accounts
  • typical account lifetime
  • the operator's stock price
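
Those signals could be folded into a toy score; the weights and log transforms below are invented purely for illustration:

    import math

    def longevity_score(num_accounts, mean_account_age_days, stock_trend):
        """Toy longevity estimate: many accounts, long-lived accounts, and a
        rising stock price all hint that a site will stick around.
        Weights are made up; a real system would fit them to observed site deaths."""
        return (0.4 * math.log1p(num_accounts)
                + 0.4 * math.log1p(mean_account_age_days)
                + 0.2 * stock_trend)  # e.g. one-year price change as a fraction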

Self-imposed per-site size limits would probably be required to avoid ToS violations.
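
A placement rule honoring such self-imposed limits might look like this, reusing the hypothetical capacity_hint() from the first sketch:

    DEFAULT_QUOTA = 10 * 1024 * 1024  # conservative guess when the limit is unknown

    def pick_backend(backends, bytes_used, block_size):
        """Hypothetical allocator: skip any site whose self-imposed quota
        (set comfortably below the real ToS ceiling) would be exceeded."""
        for backend in backends:
            limit = backend.capacity_hint() or DEFAULT_QUOTA
            if bytes_used.get(backend, 0) + block_size <= limit:
                return backend
        raise IOError("every site is at its quota")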

changed December 7, 2012