ZFS Technology: One of Apple's Secret Leopard Weapons


By Carl Howe

Today, I'm going to take on a geeky technology topic. But before you click away, answer this question:

If all the applications, documents, digital photos, music files, and movies on your computer disappeared today, how upset would you be?

If your answer was anything other than "Not at all," read on. And no, I'm not going to tell you to go do your backups. I know that no one actually spends the time to do backups in real life.

Instead, I'm going to talk about some technology that Apple is using.

We've seen a flurry of articles this week about the coming role of Sun's open-sourced ZFS storage system in Mac OS X 10.5, Leopard. What we haven't seen much of is what it means to users and consumers.

The simplest way to describe ZFS is that it is a complete rethink of how computers store information. Everyone today thinks about storing files on a disk. A disk is, of course, that rectangular metal-and-PC-board blob attached to metal rails in your computer case with screws that you never have the right screwdriver for (no, not that one -- that's your power supply; think the other smaller metal blob). If you fold, damage, spindle, or mutilate your disk, your files die with it. If you want to hear horror stories about the fragility of this approach, ask your local IT guy, but be sure you have several hours to spend listening to the tales of woe. It isn't pretty.

Disk-based storage has other problems too. Like the fact that when your disk gets full, you have to buy another one, copy over all the data you had on your old one, and then get rid of the old one. Some may ask why you can't just use both. The answer is you can, but there's always some stupid piece of software that wants to dump stuff on the drive that's completely full instead of the shiny new one you just bought. Go figure.

Also, disks and files, even if they are all working correctly, often suffer from something whimsically called "bit rot." That's where information gets corrupted on the disk without anyone knowing. This phenomenon is best illustrated when you have to pull up the big presentation to the senior management for the project you've been working on for the last five years, and your computer responds with an error message instead of your slides.

That's bit rot. Fun, isn't it?

Now Mac users have been pretty insulated from the whole C: drive insanity that PC users have put up with for the last 26 years (yes, since 1981), but even with such luxuries as having hard drives named "Insanity" or "Starship Enterprise", we're still hostages to the model that files go on disks, and if disks fail, users wail. So much for Mac superiority. And the whole RAID array idea to keep that data secure by making multiple copies of files? RAID arrays are great in concept (we use them here at Blackfriars), but full of all kinds of nasty little performance and consistency surprises in practice. And you still have no protection against bit rot.

Now ZFS has a different model. It believes that users being aware of disks is so last century. ZFS deems all your disks a "pool" of storage. It manages those disks for you; you only deal with files. Want to add more storage? Just add another disk to the pool, and ZFS knows what to do. Want to replace a disk? Tell ZFS to remove it from the pool, and it clears it off for you. You don't know where or how many copies the system is storing -- you just know that they are always there for you. Even better is the fact that it explicitly looks for bit rot and can correct it on the fly. It's that smart. Oh, and did I mention that it will actually run faster under heavy disk loads than just about any storage system runs now?
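To make the "pool" and on-the-fly bit rot repair ideas a little more concrete, here is a toy sketch in Python. It is not how ZFS is implemented; the class `ToyMirroredPool`, the two-copy mirror, and the use of SHA-256 are all assumptions chosen for illustration. The only point is the general technique: keep a checksum for each piece of data, keep more than one copy, and on every read use the checksum to pick a good copy and rewrite any bad one.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a SHA-256 digest used to detect silent corruption."""
    return hashlib.sha256(data).hexdigest()


class ToyMirroredPool:
    """Hypothetical mirrored pool for illustration; not ZFS's actual design."""

    def __init__(self, copies: int = 2):
        # Each "disk" is just a dict mapping a block name to its bytes.
        self.disks = [{} for _ in range(copies)]
        # Checksums are stored separately from the data they describe.
        self.checksums = {}

    def write(self, name: str, data: bytes) -> None:
        # Record the checksum, then store one copy on every disk.
        self.checksums[name] = checksum(data)
        for disk in self.disks:
            disk[name] = data

    def read(self, name: str) -> bytes:
        expected = self.checksums[name]
        for disk in self.disks:
            data = disk[name]
            if checksum(data) == expected:
                # Found a good copy: repair any copy that no longer matches.
                for other in self.disks:
                    if checksum(other[name]) != expected:
                        other[name] = data
                return data
        raise IOError(f"every copy of {name!r} is corrupt")


pool = ToyMirroredPool()
pool.write("slides", b"five years of work")
pool.disks[0]["slides"] = b"five years of w0rk"  # simulate bit rot on one disk
recovered = pool.read("slides")
print(recovered)
```

The caller only ever asks for "slides" by name. Which disk holds the good copy, and the fact that a bad copy was quietly repaired along the way, stays hidden inside the pool.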

Now Apple, in its clever way, has not gone around saying, "We've got ZFS in Leopard!" Instead, Apple has begun marketing one major benefit of ZFS without saying anything about the implementation. Plus it gave it a more consumer-friendly name too: Time Machine. The benefit: "Time Machine automatically keeps a spare copy of everything you've done on your Mac." Cool, huh?

But with ZFS technology, Time Machine is actually a pretty simple thing. See, ZFS uses a technology called "Copy on Write" to modify files. Rather than overwriting data in place, it writes the changed data to a new location and leaves whatever was there before untouched. And the old data? Well, that old data just lives on in the Time Machine archive. Since ZFS uses this mechanism for every file change, Apple only needs a GUI to allow you to search and manage the archive, and boom, you've got Time Machine.
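For the curious, the copy-on-write idea can be sketched in a few lines of Python. This is a deliberately simplified model, not ZFS's on-disk format and not Apple's code; `ToyCowStore` and its methods are hypothetical names. The one rule it demonstrates is that a write never touches an existing block, so every earlier version stays readable.

```python
class ToyCowStore:
    """Hypothetical copy-on-write store; a sketch, not ZFS's real layout."""

    def __init__(self):
        self.blocks = []    # append-only list of data blocks, never overwritten
        self.current = {}   # file name -> index of its live block
        self.history = {}   # file name -> indexes of every block ever written

    def write(self, name: str, data: bytes) -> None:
        # Put the new data in a fresh block instead of touching the old one.
        self.blocks.append(data)
        index = len(self.blocks) - 1
        self.current[name] = index
        self.history.setdefault(name, []).append(index)

    def read(self, name: str) -> bytes:
        # Normal reads follow the pointer to the newest block.
        return self.blocks[self.current[name]]

    def versions(self, name: str) -> list:
        # Older versions survive because their blocks were never reused.
        return [self.blocks[i] for i in self.history[name]]


store = ToyCowStore()
store.write("report.txt", b"draft 1")
store.write("report.txt", b"draft 2")
latest = store.read("report.txt")
older = store.versions("report.txt")
```

Here `latest` is the second draft, while `older` still contains the first one. An archive browser in this model is just a friendlier front end to something like `versions()`.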

But having ZFS in the Mac OS X foundation means a whole lot more than Time Machine. ZFS will accelerate Apple's ability to both simplify tasks and amaze consumers in future releases.

Apple never talks about future plans, but ZFS opens up a wealth of simple benefits for Macs like:

  • Easy OS upgrades. Pulling out your Mac OS X install disks if an upgrade goes awry will be a thing of the past. Instead, the Apple installer will simply make a snapshot (a copy-on-write copy) of the user's existing system before doing any install. If the new install doesn't go well, the user can simply roll back to the snapshot with one click.

  • Built-in offsite backup for laptops. A long time ago, Microsoft promised a feature called Intellisync (that name has since been reused for an Outlook function), which would cleverly make copies of laptop information on servers when users were connected to a local network, thereby providing live backups. They never actually released the feature, largely because it was nearly impossible to make work reliably under Windows 2000. ZFS makes that type of opportunistic corporate backing up of user laptops simple because it has remote snapshots and mirroring built-in.

  • Really big storage. Today, a Mac system with a few terabytes is a bit of a monster to initialize, back up, and manage. But with ZFS in the foundation, petabyte (that's a million gigabytes) Mac OS-based systems will be just as easy to set up and manage as today's gigabyte-sized ones. In fact, the Z in ZFS stands for Zettabyte, which is a trillion gigabytes. The real beneficiaries of big storage systems made possible by ZFS will be Mac-using movie makers, photographers, and musicians who even today blow through terabytes of storage at a time. With ZFS, a one-man startup with a ZFS-powered Mac will be able to manage the digital assets for multiple computer-animated movies without even breathing hard -- a job that today requires a data center, servers, and a wealth of data center geeks.
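The snapshot-and-roll-back idea from the upgrade scenario in the list above can be sketched the same way. Again, this is a toy model in Python under stated assumptions: `ToySnapshotFS`, the snapshot label, and the file paths are all made up for illustration, and a real copy-on-write filesystem would share blocks rather than copy a dictionary.

```python
class ToySnapshotFS:
    """Hypothetical filesystem with snapshots; illustration only."""

    def __init__(self):
        self.files = {}      # path -> contents
        self.snapshots = {}  # label -> frozen copy of the path table

    def snapshot(self, label: str) -> None:
        # Copy only the path table; the bytes objects themselves are shared.
        self.snapshots[label] = dict(self.files)

    def rollback(self, label: str) -> None:
        # Discard everything changed since the snapshot was taken.
        self.files = dict(self.snapshots[label])


fs = ToySnapshotFS()
fs.files["/System/version"] = b"10.4"             # made-up path and contents
fs.snapshot("before-upgrade")

# The "installer" runs and things go badly.
fs.files["/System/version"] = b"10.5-broken"
fs.files["/System/new-driver"] = b"buggy"

fs.rollback("before-upgrade")                     # the one-click undo
```

After the rollback, the half-installed files are gone and the old version file is back, which is the whole appeal: the undo is one cheap operation, not a reinstall from disks.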

At the end of the day, though, there's one overriding benefit to Apple from moving to the new, simpler foundation of ZFS: differentiation. As I noted at the beginning of this article, Windows Vista STILL has letter-drive-based storage models hidden under its flashy GUI today. It will take years for Microsoft to cast off that disk-based legacy and all the business processes that accompany it. In the meantime, the productivity and innovation edge that ZFS gives Apple will give it even more of a lead over its Windows competitors than it has now. And that all translates into more sales for Apple's computer business.

Back in 2000, I wrote a research report for Forrester titled "Surviving the Storage Boom." In that report, I warned computer vendors that they needed to make fundamental changes in how they sold and implemented storage to avoid their customers being overwhelmed by the tsunami of storage and its management headaches this decade. In today's world of 12-megapixel digital cameras, 100-gigabyte music collections, and uncompressed high-definition movies, no one can afford to manage disks any longer. Apple's strategy of incorporating ZFS into Mac OS X demonstrates that it has identified storage as a problem and will try to solve it in an elegant way. The big question is how long it will be before the rest of the personal computer industry wakes up.

UPDATE: Some readers have wondered if I really understand how Leopard does Time Machine, so I apologize for being unclear in the article above. I am not suggesting that Time Machine is implemented on top of ZFS in Leopard; in fact, I'm pretty sure it isn't. Further, Leopard's support for ZFS is rumored to be read-only. I simply used Time Machine as an example of a storage service that becomes vastly simpler with a ZFS foundation rather than a disk-based one. I apologize for the confusion I may have created.

For more information about ZFS, see Sun's ZFS page on OpenSolaris.org.

Full disclosure: the author owns Apple stock.