Mac OS X 10.6 Snow Leopard: the Ars Technica review

In June of 2004, during the WWDC keynote address, Steve Jobs revealed Mac OS X 10.4 Tiger to developers and the public for the first time. When the finished product arrived in April of 2005, Tiger was the biggest, most important, most feature-packed release in the history of Mac OS X by a wide margin. Apple's marketing campaign reflected this, touting "over 150 new features."

All those new features took time. Since its introduction in 2001, there had been at least one major release of Mac OS X each year. Tiger took over a year and a half to arrive. At the time, it definitely seemed worth the wait. Tiger was a hit with users and developers. Apple took the lesson to heart and quickly set expectations for the next major release of Mac OS X, Leopard. Through various channels, Apple communicated its intention to move from a 12-month to an 18-month release cycle for Mac OS X. Leopard was officially scheduled for "spring 2007."

As the date approached, Apple's marketing machine trod a predictable path.

Steve Jobs at WWDC 2007, touting 300 new features in Mac OS X 10.5 Leopard

Apple even went so far as to list all 300 new features on its website. As it turns out, "spring" was a bit optimistic. Leopard actually shipped at the end of October 2007, nearly two and a half years after Tiger. Did Leopard really have twice as many new features as Tiger? That's debatable. What's certain is that Leopard included a solid crop of new features and technologies, many of which we now take for granted. (For example, have you had a discussion with a potential Mac user since the release of Leopard without mentioning Time Machine? I certainly haven't.)

Mac OS X appeared to be maturing. The progression was clear: longer release cycles, more features. What would Mac OS X 10.6 be like? Would it arrive three and a half years after Leopard? Would it and include 500 new features? A thousand?

At WWDC 2009, Bertrand Serlet announced a move that he described as "unprecedented" in the PC industry.

Mac OS X 10.6 - Read Bertrand's lips: No New Features!

That's right, the next major release of Mac OS X would have no new features. The product name reflected this: "Snow Leopard." Mac OS X 10.6 would merely be a variant of Leopard. Better, faster, more refined, more... uh... snowy.

This was a risky strategy for Apple. After the rapid-fire updates of 10.1, 10.2, and 10.3 followed by the riot of new features and APIs in 10.4 and 10.5, could Apple really get away with calling a "time out?" I imagine Bertrand was really sweating this announcement up on the stage at WWDC in front of a live audience of Mac developers. Their reaction? Spontaneous applause. There were even a few hoots and whistles.

Many of these same developers applauded the "150+ new features" in Tiger and the "300 new features" in Leopard at past WWDCs. Now they were applauding zero new features for Snow Leopard? What explains this?

It probably helps to know that the "0 New Features" slide came at the end of an hour-long presentation detailing the major new APIs and technologies in Snow Leopard. It was also quickly followed by a back-pedaling ("well, there is one new feature...") slide describing the addition of Microsoft Exchange support. In isolation, "no new features" may seem to imply stagnation. In context, however, it served as a developer-friendly affirmation.

The overall message from Apple to developers was something like this: "We're adding a ton of new things to Mac OS X that will help you write better applications and make your existing code run faster, and we're going to make sure that all this new stuff is rock-solid and as bug-free as possible. We're not going to overextend ourselves adding a raft of new customer-facing, marketing-friendly features. Instead, we're going to concentrate 100% on the things that affect you, the developers."

But if Snow Leopard is a love letter to developers, is it a Dear John letter to users? You know, those people that the marketing department might so crudely refer to as "customers." What's in it for them? Believe it or not, the sales pitch to users is actually quite similar. As exhausting as it has been for developers to keep up with Apple's seemingly never-ending stream of new APIs, it can be just as taxing for customers to stay on top of Mac OS X's features. Exposé, a new Finder, Spotlight, a new Dock, Time Machine, a new Finder again, a new iLife and iWork almost every year, and on and on. And as much as developers hate bugs in Apple's APIs, users who experience those bugs as application crashes have just as much reason to be annoyed.

Enter Snow Leopard: the release where we all get a break from the new-features/new-bugs treadmill of Mac OS X development. That's the pitch.

Uncomfortable realities

But wait a second, didn't I just mention an "hour-long presentation" about Snow Leopard featuring "major new APIs and technologies?" When speaking to developers, Apple's message of "no new features" is another way of saying "no new bugs." Snow Leopard is supposed to fix old bugs without introducing new ones. But nothing says "new bugs, coming right up" quite like major new APIs. So which is it?

Similarly, for users, "no new features" connotes stability and reliability. But if Snow Leopard includes enough changes to the core OS to fill an hour-long overview session at WWDC more than a year before its release, can Apple really make good on this promise? Or will users end up with all the disadvantages of a feature-packed release like Tiger or Leopard—the inevitable 10.x.0 bugs, the unfamiliar, untried new functionality—but without any of the actual new features?

Yes, it's enough to make one quite cynical about Apple's real motivations. To throw some more fuel on the fire, have a look at the Mac OS X release timeline below. Next to each release, I've included a list of its most significant features.

That curve is taking on a decidedly droopy shape, as if it's being weighed down by the ever-increasing number of new features. (The releases are distributed uniformly on the Y axis.) Maybe you think it's reasonable for the time between releases to stretch out as each one brings a heavier load of goodies than the last, but keep in mind the logical consequence of such a curve over the long~~horn~~ haul.

And yeah, there's a little upwards kick at the end for 10.6, but remember, this is supposed to be the "no new features" release. Version 10.1 had a similar no-frills focus but took a heck of a lot less time to arrive.

Looking at this graph, it's hard not to wonder if there's something siphoning resources from the Mac OS X development effort. Maybe, say, some project that's in the first two or three major releases of its life, still in that steep, early section of its own timeline graph. Yes, I'm talking about the iPhone, specifically iPhone OS. The iPhone business has exploded onto Apple's balance sheets like no other product before, even the iPod. It's also accruing developers at an alarming rate.

It's not a stretch to imagine that many of the artists and developers who piled on the user-visible features in Mac OS X 10.4 and 10.5 have been reassigned to iPhone OS (temporarily or otherwise). After all, Mac OS X and iPhone OS share the same core operating system, the same language for GUI development, and many of the same APIs. Some workforce migration seems inevitable.

And let's not forget the "Mac OS X" technologies that we later learned were developed for the iPhone and just happened to be announced for the Mac first (because the iPhone was still a secret), like Core Animation and code signing. Such conspiracy theories certainly aren't helped by WWDC keynote snubs and other indignities suffered by Mac OS X and the Mac in general since the iPhone arrived on the scene. And so, on top of everything else, Snow Leopard is tasked with restoring some luster to Mac OS X.

Got all that? A nearly two-year development cycle, but no new features. Major new frameworks for developers, but few new bugs. Significant changes to the core OS, but more reliability. And a franchise rejuvenation with few user-visible changes.

It's enough to turn a leopard white.

The price of entry

Snow Leopard's opening overture to consumers is its price: $29 for those upgrading from Leopard. The debut release of Mac OS X 10.0 and the last four major releases have all been $129, with no special pricing for upgrades. After eight years of this kind of fiscal disciplining, Leopard users may well be tempted to stop reading right now and just go pick up a copy. Snow Leopard's upgrade price is well under the impulse purchase threshold for many people. Twenty-nine dollars plus some minimal level of faith in Apple's ability to improve the OS with each release, and boom, instant purchase.

Still here? Good, because there's something else you need to know about Snow Leopard. It's an overture of a different sort, less of a come-on and more of a spur. Snow Leopard will only run on Macs with Intel CPUs. Sorry (again), PowerPC fans, but this is the end of the line for you. The transition to Intel was announced over four years ago, and the last new PowerPC Mac was released in October 2005. It's time.

But if Snow Leopard is meant to prod the PowerPC holdouts into the Intel age, its "no new features" stance (and the accompanying lack of added visual flair) is working against it. For those running Leopard on a PowerPC-based Mac, there's precious little in Snow Leopard to help push them over the (likely) four-digit price wall of a new Mac. For PowerPC Mac owners, the threshold for a new Mac purchase remains mostly unchanged. When their old Mac breaks or seems too slow, they'll go out and buy a new one, and it'll come with Snow Leopard pre-installed.

If Snow Leopard does end up motivating new Mac purchases by PowerPC owners, it will probably be the result of resignation rather than inspiration. An Intel-only Snow Leopard is most significant for what it isn't: a further extension of PowerPC life support on the Mac platform.

The final interesting group is owners of Intel-based Macs that are still running Mac OS X 10.4 Tiger. Apple shipped Intel Macs with Tiger installed for a little over one year and nine months. Owners of these machines who never upgraded to Leopard are not eligible for the $29 upgrade to Snow Leopard. They're also apparently not eligible to purchase Snow Leopard for the traditional $129 price. Here's what Apple has to say about Snow Leopard's pricing (emphasis added).

Mac OS X version 10.6 Snow Leopard will be available as an upgrade to Mac OS X version 10.5 Leopard in September 2009 [...] The Snow Leopard single user license will be available for a suggested retail price of $29 (US) and the Snow Leopard Family Pack, a single household, five-user license, will be available for a suggested price of $49 (US). For Tiger® users with an Intel-based Mac, the Mac Box Set includes Mac OS X Snow Leopard, iLife® '09 and iWork® '09 and will be available for a suggested price of $169 (US) and a Family Pack is available for a suggested price of $229 (US).

Ignoring the family packs for a moment, this means that Snow Leopard will either be free with your new Mac, $29 if you're already running Leopard, or $169 if you have an Intel Mac running Tiger. People upgrading from Tiger will get the latest version of iLife and iWork in the bargain (if that's the appropriate term), whether they want them or not. It sure seems like there's an obvious place in this lineup for a $129 offering of Snow Leopard on its own. Then again, perhaps it all comes down to how, exactly, Apple enforces the $29 Snow Leopard upgrade policy.

(As an aside to non-Mac users, note that the non-server version of Mac OS X has no per-user serial number and no activation scheme of any kind, and never has. "Registration" with Apple during the Mac OS X install process is entirely optional and is only used to collect demographic information. Failing to register (or entering entirely bogus registration information) has no effect on your ability to run the OS. This is considered a genuine advantage of Mac OS X, but it also means that Apple has no reliable record of who, exactly, is a "legitimate" owner of Leopard.)

One possibility was that the $29 Snow Leopard upgrade DVD would only install on top of an existing installation of Leopard. Apple has done this type of thing before, and it bypasses any proof-of-purchase annoyances. It would, however, introduce a new problem. In the event of a hard drive failure or simple decision to reinstall from scratch, owners of the $29 Snow Leopard upgrade would be forced to first install Leopard and then install Snow Leopard on top of it, perhaps more than doubling the installation time—and quintupling the annoyance.

Given Apple's history in this area, no one should have been surprised to find out that Apple chose the much simpler option: the $29 "upgrade" DVD of Snow Leopard will, in fact, install on any supported Mac, whether or not it has Leopard installed. It will even install onto an entirely empty hard drive.

To be clear, installing the $29 upgrade to Snow Leopard on a system not already running a properly licensed copy of Leopard is a violation of the end-user license agreement that comes with the product. But Apple's decision is a refreshing change: rewarding honest people with a hassle-free product rather than trying to punish dishonest people by treating everyone like a criminal. This "honor system" upgrade enforcement policy partially explains the big jump to $169 for the Mac Box Set, which ends up re-framed as an honest person's way to get iLife and iWork at their usual prices, plus Snow Leopard for $11 more.

And yes, speaking of installing, let's finally get on with it.

Installation

Apple claims that Snow Leopard's installation process is "up to 45% faster." Installation times vary wildly depending on the speed, contents, and fragmentation of the target disk, the speed of the optical drive, and so on. Installation also only happens once, and it's not really an interesting process unless something goes terribly wrong. Still, if Apple's going to make such a claim, it's worth checking out.

To eliminate as many variables as possible, I installed both Leopard and Snow Leopard from one hard disk onto another (empty) one. It should be noted that this change negates some of Snow Leopard's most important installation optimizations, which are focused on reducing random data access from the optical disc.

Even with this disadvantage, the Snow Leopard installation took about 20% less time than the Leopard installation. That's well short of Apple's "up to 45%" claim, but see above (and don't forget the "up to" weasel words). Both versions installed in less than 30 minutes.

What is striking about Snow Leopard's installation is how quickly the initial Spotlight indexing process completed. Here, Snow Leopard was 74% faster in my testing. Again, the times are small (5:49 vs. 3:20) and again, new installations on empty disks are not the norm. But the shorter wait for Spotlight indexing is worth noting because it's the first indication most users will get that Snow Leopard means business when it comes to performance.

Another notable thing about installation is what's not installed by default: Rosetta, the facility that allows PowerPC binaries to run on Intel Macs. Okay Apple, we get it. PowerPC is a stiff, bereft of life. It rests in peace. It's rung down the curtain and joined the choir invisible. As far as Apple is concerned, PowerPC is an ex-ISA.

But not installing Rosetta by default? That seems a little harsh, even foolhardy. What's going to happen when all those users upgrade to Snow Leopard and then double-click what they've probably long since forgotten is a PowerPC application? Perhaps surprisingly, this is what happens:

Rosetta: auto-installed for your convenience

That's what I saw when I tried to launch Disk Inventory X on Snow Leopard, an application that, yes, I had long since forgotten was PowerPC-only. After I clicked the "Install" button, I actually expected to be prompted to insert the installer DVD. Instead, Snow Leopard reached out over the network, pulled down Rosetta from an Apple server, and installed it.

No reboot was required, and Disk Inventory X launched successfully after the Rosetta installation completed. Mac OS X has not historically made much use of the install-on-demand approach to system software components, but the facility used to install Rosetta appears quite robust. Upon clicking "Install," an XML property list containing a vast catalog of available Mac OS X packages was downloaded. Snow Leopard uses the same facility to download and install printer drivers on demand, saving another trip to the installer DVD. I hope this technique gains even wider use in the future.

Installation footprint

Rosetta aside, Snow Leopard simply puts fewer bits on your disk. Apple claims it "takes up less than half the disk space of the previous version," and that's no lie. A clean, default install (including fully-generated Spotlight indexes) is 16.8 GB for Leopard and 5.9 GB for Snow Leopard. (Incidentally, these numbers are both powers-of-two measurements; see sidebar.)

A gigabyte by any other name

Snow Leopard has another trick up its sleeve when it comes to disk usage. The Snow Leopard Finder considers 1 GB to be equal to 10⁹ (1,000,000,000) bytes, whereas the Leopard Finder—and, it should be noted, every version of the Finder before it—equates 1 GB to 2³⁰ (1,073,741,824) bytes. This has the effect of making your hard disk suddenly appear larger after installing Snow Leopard. For example, my "1 TB" hard drive shows up in the Leopard Finder as having a capacity of 931.19 GB. In Snow Leopard, it's 999.86 GB. As you might have guessed, hard disk manufacturers use the powers-of-ten system. It's all quite a mess, really. Though I come down pretty firmly on the powers-of-two side of the fence, I can't blame Apple too much for wanting to match up nicely with the long-established (but still dumb, mind you) hard disk vendors' capacity measurement standard.

Snow Leopard has several weight loss secrets. The first is obvious: no PowerPC support means no PowerPC code in executables. Recall the maximum possible binary payload in a Leopard executable: 32-bit PowerPC, 64-bit PowerPC, x86, and x86_64. Now cross half of those architectures off the list. Granted, very few applications in Leopard included 64-bit code of any kind, but it's a 50% reduction in size for executables no matter how you slice it.

Of course, not all the files in the operating system are executables. There are data files, images, audio files, even a little video. But most of those non-executable files have one thing in common: they're usually stored in compressed file formats. Images are PNGs or JPEGs, audio is AAC, video is MPEG-4, even preference files and other property lists now default to a compact binary format rather than XML.

In Snow Leopard, other kinds of files climb on board the compression bandwagon. To give just one example, ninety-seven percent of the executable files in Snow Leopard are compressed. How compressed? Let's look:

% cd Applications/Mail.app/Contents/MacOS
% ls -l Mail
-rwxr-xr-x@ 1 root  wheel  0 Jun 18 19:35 Mail

Boy, that's, uh, pretty small, huh? Is this really an executable or what? Let's check our assumptions.

% file Applications/Mail.app/Contents/MacOS/Mail
Applications/Mail.app/Contents/MacOS/Mail: empty

Yikes! What's going on here? Well, what I didn't tell you is that the commands shown above were run from a Leopard system looking at a Snow Leopard disk. In fact, all compressed Snow Leopard files appear to contain zero bytes when viewed from a pre-Snow Leopard version of Mac OS X. (They look and act perfectly normal when booted into Snow Leopard, of course.)

So, where's the data? The little "@" at the end of the permissions string in the ls output above (a feature introduced in Leopard) provides a clue. Though the Mail executable has a zero file size, it does have some extended attributes:

% xattr -l Applications/Mail.app/Contents/MacOS/Mail
com.apple.ResourceFork:
0000     00 00 01 00 00 2C F5 F2 00 2C F4 F2 00 00 00 32    .....,...,.....2
0010     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
(184,159 lines snipped)
2CF610   63 6D 70 66 00 00 00 0A 00 01 FF FF 00 00 00 00    cmpf............
2CF620   00 00 00 00                                        ....

com.apple.decmpfs:
0000   66 70 6D 63 04 00 00 00 A0 82 72 00 00 00 00 00    fpmc......r.....

Ah, there's all the data. But wait, it's in the resource fork? Weren't those deprecated about eight years ago? Indeed they were. What you're witnessing here is yet another addition to Apple's favorite file system hobbyhorse, HFS+.

At the dawn of Mac OS X, Apple added journaling, symbolic links, and hard links. In Tiger, extended attributes and access control lists were incorporated. In Leopard, HFS+ gained support for hard links to directories. In Snow Leopard, HFS+ learns another new trick: per-file compression.

The presence of the com.apple.decmpfs attribute is the first hint that this file is compressed. This attribute is actually hidden from the xattr command when booted into Snow Leopard. But from a Leopard system, which has no knowledge of its special significance, it shows up as plain as day.

Even more information is revealed with the help of Mac OS X Internals guru Amit Singh's hfsdebug program, which has quietly been updated for Snow Leopard.

% hfsdebug /Applications/Mail.app/Contents/MacOS/Mail
...
  compression magic    = cmpf
  compression type     = 4 (resource fork has compressed data)
  uncompressed size    = 7500336 bytes

And sure enough, as we saw, the resource fork does indeed contain the compressed data. Still, why the resource fork? It's all part of Apple's usual, clever backward-compatibility gymnastics. A recent example is the way that hard links to directories show up—and function—as aliases when viewed from a pre-Leopard version of Mac OS X.

In the case of a HFS+ compression, Apple was (understandably) unable to make pre-Snow Leopard systems read and interpret the compressed data, which is stored in ways that did not exist at the time those earlier operating systems were written. But rather than letting applications (and users) running on pre-10.6 systems choke on—or worse, corrupt through modification—the unexpectedly compressed file contents, Apple has chosen to hide the compressed data instead.

And where can the complete contents of a potentially large file be hidden in such a way that pre-Snow Leopard systems can still copy that file without the loss of data? Why, in the resource fork, of course. The Finder has always correctly preserved Mac-specific metadata and both the resource and data forks when moving or duplicating files. In Leopard, even the lowly cp and rsync commands will do the same. So while it may be a little bit spooky to see all those "empty" 0 KB files when looking at a Snow Leopard disk from a pre-Snow Leopard OS, the chance of data loss is small, even if you move or copy one of the files.

The resource fork isn't the only place where Apple has decided to smuggle compressed data. For smaller files, hfsdebug shows the following:

% hfsdebug /etc/asl.conf
...
  compression magic    = cmpf
  compression type     = 3 (xattr has compressed data)
  uncompressed size    = 860 bytes

Here, the data is small enough to be stored entirely within an extended attribute, albeit in compressed form. And then, the final frontier:

% hfsdebug /Volumes/Snow Time/Applications/Mail.app/Contents/PkgInfo
...
  compression magic    = cmpf
  compression type     = 3 (xattr has inline data)
  uncompressed size    = 8 bytes

That's right, an entire file's contents stored uncompressed in an extended attribute. In the case of a standard PkgInfo file like this one, those contents are the four-byte classic Mac OS type and creator codes.

% xattr -l Applications/Mail.app/Contents/PkgInfo
com.apple.decmpfs:
0000   66 70 6D 63 03 00 00 00 08 00 00 00 00 00 00 00    fpmc............
0010   FF 41 50 50 4C 65 6D 61 6C                         .APPLemal

There's still the same "fpmc..." preamble seen in all the earlier examples of the com.apple.decmpfs attribute, but at the end of the value, the expected data appears as plain as day: type code "APPL" (application) and creator code "emal" (for the Mail application—cute, as per classic Mac OS tradition).

You may be wondering, if this is all about data compression, how does storing eight uncompressed bytes plus a 17-byte preamble in an extended attribute save any disk space? The answer to that lies in how HFS+ allocates disk space. When storing information in a data or resource fork, HFS+ allocates space in multiples of the file system's allocation block size (4 KB, by default). So those eight bytes will take up a minimum of 4,096 bytes if stored in the traditional way. When allocating disk space for extended attributes, however, the allocation block size is not a factor; the data is packed in much more tightly. In the end, the actual space saved by storing those 25 bytes of data in an extended attribute is over 4,000 bytes.

But compression isn't just about saving disk space. It's also a classic example of trading CPU cycles for decreased I/O latency and bandwidth. Over the past few decades, CPU performance has gotten better (and computing resources more plentiful—more on that later) at a much faster rate than disk performance has increased. Modern hard disk seek times and rotational delays are still measured in milliseconds. In one millisecond, a 2 GHz CPU goes through two million cycles. And then, of course, there's still the actual data transfer time to consider.

Granted, several levels of caching throughout the OS and hardware work mightily to hide these delays. But those bits have to come off the disk at some point to fill those caches. Compression means that fewer bits have to be transferred. Given the almost comical glut of CPU resources on a modern multi-core Mac under normal use, the total time needed to transfer a compressed payload from the disk and use the CPU to decompress its contents into memory will still usually be far less than the time it'd take to transfer the data in uncompressed form.

That explains the potential performance benefits of transferring less data, but the use of extended attributes to store file contents can actually make things faster, as well. It all has to do with data locality.

If there's one thing that slows down a hard disk more than transferring a large amount of data, it's moving its heads from one part of the disk to another. Every move means time for the head to start moving, then stop, then ensure that it's correctly positioned over the desired location, then wait for the spinning disk to put the desired bits beneath it. These are all real, physical, moving parts, and it's amazing that they do their dance as quickly and efficiently as they do, but physics has its limits. These motions are the real performance killers for rotational storage like hard disks.

The HFS+ volume format stores all its information about files—metadata—in two primary locations on disk: the Catalog File, which stores file dates, permissions, ownership, and a host of other things, and the Attributes File, which stores "named forks."

Extended attributes in HFS+ are implemented as named forks in the Attributes File. But unlike resource forks, which can be very large (up to the maximum file size supported by the file system), extended attributes in HFS+ are stored "inline" in the Attributes File. In practice, this means a limit of about 128 bytes per attribute. But it also means that the disk head doesn't need to take a trip to another part of the disk to get the actual data.

As you can imagine, the disk blocks that make up the Catalog and Attributes files are frequently accessed, and therefore more likely than most to be in a cache somewhere. All of this conspires to make the complete storage of a file, including both its metadata in its data, within the B-tree-structured Catalog and Attributes files an overall performance win. Even an eight-byte payload that balloons to 25 bytes is not a concern, as long as it's still less than the allocation block size for normal data storage, and as long as it all fits within a B-tree node in the Attributes File that the OS has to read in its entirety anyway.

There are other significant contributions to Snow Leopard's reduced disk footprint (e.g., the removal of unnecessary localizations and "designable.nib" files) but HFS+ compression is by far the most technically interesting.

Installer intelligence

Apple makes two other interesting promises about the installation process:

Snow Leopard checks your applications to make sure they're compatible and sets aside any programs known to be incompatible. In case a power outage interrupts your installation, it can start again without losing any data.

The setting aside of "known incompatible" applications is undoubtedly a response to the "blue screen" problems some users encountered when upgrading from Tiger to Leopard two years ago, which was caused by the presence of incompatible—and some would say "illicit"—third-party system extensions. I have a decidedly pragmatic view of such software, and I'm glad to see Apple taking a similarly practical approach to minimizing its impact on users.

Apple can't be expected to detect and disable all potentially incompatible software, of course. I suspect only the most popular or highest profile risky software is detected. If you're a developer, this installer feature may be a good way to find out if you're on Apple's sh*t list.

As for continuing an installation after a power failure, I didn't have the guts to test this feature. (I also have a UPS.) For long-running processes like installation, this kind of added robustness is welcome, especially on battery-powered devices like laptops.

I mention these two details of the installation process mostly because they highlight the kinds of things that are possible when developers at Apple are given time to polish their respective components of the OS. You might think that the installer team would be hard-pressed to come up with enough to do during a nearly two-year development cycle. That's clearly not the case, and customers will reap the benefits.

Snow Leopard's new looks

I've long yearned for Apple to make a clean break, at least visually, from Mac OS X's Aqua past. Alas, I will be waiting a bit longer, because Snow Leopard ushers in no such revolution. And yet here I am, beneath a familiar-looking section heading that seems to indicate otherwise. The truth is, Snow Leopard actually changes the appearance of nearly every pixel on your screen—but not in the way you might imagine.

Since the dawn of color on the Macintosh, the operating system has used a default output gamma correction value of 1.8. Meanwhile, Windows—aka the rest of the world—has used a value of 2.2. Though this may not seem significant to anyone but professional graphics artists, the difference is usually apparent to even a casual observer when viewing the same image on both kinds of displays side by side.

Though Mac users will probably instinctively prefer the 1.8 gamma image that they're used to, Apple has decided that this historical difference is more trouble than it's worth. The default output gamma correction value in Snow Leopard is now 2.2, just like everyone else. Done and done.

If they notice at all, users will likely experience this change as a feeling that the Snow Leopard user interface has a bit more contrast than Leopard's. This is reinforced by the new default desktop background, a re-drawn, more saturated version of Leopard's default desktop. (Note that these are two entirely different images and not an attempt to demonstrate the effects of different gamma correction settings.)

Dock Exposé spotlight effect

But even beyond color correction, true to form, Apple could not resist adding a few graphical tweaks to the Snow Leopard interface. The most apparent changes are related to the Dock. First, there's the new "spotlight" look triggered by a click-and-hold on an application icon in the Dock. (This activates Exposé, but only for the windows belonging to the application that was clicked. More later.)

Furthermore, any and all pop-up menus on the Dock—and only on the Dock—have a unique look in Snow Leopard, complete with a custom selection appearance (which, for a change, does a passable job of matching the system-wide selection appearance setting).

New Dock menu appearance. Mmmm… arbitrary.

For Mac users of a certain age, these menus may bring to mind Apple's Hi-Tech appearance theme from the bad-old days of Copland. They're actually considerably more subtle, however. Note the translucent edges which accentuate the rounded corners. The gradient on the selection highlight is also admirably restrained.

Nevertheless, this is an entirely new look for a single (albeit commonly used) application, and it does clash a bit with the default "slanty, shiny shelf" appearance of the Dock. But I've already had my say about that, and more. If the oath of Snow Leopard's appearance was to "first, do no harm," then I think I'm inclined to give it a passing grade—almost.

If I had to characterize what's wrong with Snow Leopard's visual additions with just two words, it'd be these: everything fades. Apple has sprinkled Core Animation fairy dust over seemingly every application in Snow Leopard. If any part of the user interface appears, disappears, or changes in any significant way, it's accompanied by an animation and one or more fades.

In moderation, such effects are fine. But in several instances, Snow Leopard crosses the line. Or rather, it crosses my line, which, it should be noted, is located far inside the territories of Candy Land. Others with a much lower tolerance for animations who are already galled by the frippery in Leopard and earlier releases will find little to love in Snow Leopard's visual changes.

The one that really drove me over the edge is the fussy little dance of the filename area that occurs in the Finder (surprise!) when renaming a file on the desktop. There's just something about so many cross-fades, color changes, and text offsets occurring so rapidly and concentrated into such a small area that makes me want to scream. And whether or not I'm actually waiting for these animations to finish before I can continue to use my computer, it certainly feels that way sometimes.

Still, I must unenthusiastically predict that most normal people (i.e., the ones who will not read this entire article) will either find these added visual touches delightful, or (much more likely) not notice them at all.

Branding

Animation aside, the visual sameness of Snow Leopard presents a bit of a marketing challenge for Apple. Even beyond the obvious problem of how to promote an operating system upgrade with "no new features" to consumers, there's the issue of how to get people to notice that this new product exists at all.

In the run-up to Snow Leopard's release, Apple stuck to a modified version of Leopard's outer space theme. It was in the keynote slideshows, on the WWDC banners, on the developer release DVDs, and all over the Mac OS X section of Apple's website. The header image from Apple's Mac OS X webpage as of a week before Snow Leopard's release appears below. It's pretty cut and dried: outer space, stars, rich purple nebula, lens flare.

Then came the golden master of Snow Leopard, which, in a pleasant change from past releases, was distributed to developers a few weeks before Snow Leopard hit the shelves. Its installer introduced an entirely different look which, as it turns out, was carried over to the retail packaging. For a change, let's line up the discs instead of the packaging (which is rapidly shrinking to barely enclose the disc anyway). Here's Mac OS X 10.0 through 10.6, top to bottom and left to right. (The 10.0 and 10.1 discs looked essentially identical and have been coalesced.)

One of these things is not like the others…

Yep, it's a snow leopard. With actual snow on it. It's a bit on the nose for my taste, but it's not without its charms. And it does have one big thing going for it: it's immediately recognizable as something new and different. "Unmistakable" is how I'd sum up the packaging. Eight years of the giant, centered, variously adorned "X" and then boom: a cat. There's little chance that anyone who's seen Leopard sitting on the shelf of their local Apple store for the past two years will fail to notice that this is a new product.

(If you'd like your own picture of Snowy the snow leopard (that's right, I've named him), Apple was kind enough to include a desktop background image with the OS. Self-loathing Windows users may download it directly.)

Warning: internals ahead

We've arrived at the start of the customary "internals" section. Snow Leopard is all about internal changes, and this is reflected in the content of this review. If you're only interested in the user-visible changes, you can skip ahead, but you'll be missing out on the meat of this review and the heart of Apple's new OS.

64-bit: the road leads ever on

Mac OS X started its journey to 64-bit back in 2003 with the release of Panther, which included the bare minimum support for the then-new PowerPC G5 64-bit CPU. In 2005, Tiger brought with it the ability to create true 64-bit processes—as long as they didn't link with any of the GUI libraries. Finally, Leopard in 2007 included support for 64-bit GUI applications. But again, there was a caveat: 64-bit support extended to Cocoa applications only. It was, effectively, the end of the road for Carbon.

Despite Leopard's seemingly impressive 64-bit bona fides, there are a few more steps before Mac OS X can reach complete 64-bit nirvana. The diagrams below illustrate.

Mac OS X 10.4 Tiger

Mac OS X 10.5 Leopard

Mac OS X 10.6 Snow Leopard

As we'll see, all that yellow in the Snow Leopard diagram represents its capability, not necessarily its default mode of operation.

K64

Snow Leopard is the first version of Mac OS X to ship with a 64-bit kernel ("K64" in Apple's parlance), but it's not enabled by default on most systems. The reason for this this is simple. Recall that there's no "mixed mode" in Mac OS X. At runtime, a process is either 32-bit or 64-bit, and can only load other code—libraries, plug-ins, etc.—of the same kind.

An important class of plug-ins loaded by the kernel is device drivers. Were Snow Leopard to default to the 64-bit kernel, only 64-bit device drivers would load. And seeing as Snow Leopard is the first version of Mac OS X to include a 64-bit kernel, there'd be precious few of those on customers' systems on launch day.

And so, by default, Snow Leopard boots with a 64-bit kernel only on Xserves from 2008 or later. I guess the assumption is that all of the devices commonly attached to an Xserve will be supported by 64-bit drivers supplied by Apple in Snow Leopard itself.

Perhaps surprisingly, not all Macs with 64-bit processors are even able to boot into the 64-bit kernel. Though this may change in subsequent point releases of Snow Leopard, the table below lists all the Macs that are either capable of or default to booting K64. (To find the "Model name" of your Mac, select "About This Mac" from the Apple menu, then click the "More info…" button and read the "Model Identifier" line in the window that appears.)

Product	Model name	K64 status
Early 2008 Mac Pro	MacPro3,1	Capable
Early 2008 Xserve	Xserve2,1	Default
MacBook Pro 15"/17"	MacBookPro4,1	Capable
iMac	iMac8,1	Capable
UniBody MacBook Pro 15"	MacBookPro5,1	Capable
UniBody MacBook Pro 17"	MacBookPro5,2	Capable
Mac Pro	MacPro4,1	Capable
iMac	iMac9,1	Capable
Early 2009 Xserve	Xserve3,1	Default

For all K64-capable Macs, boot while holding down "6" and "4" keys simultaneously to select the 64-bit kernel. For a more permanent solution, use the nvram command to add arch=x86_64 to your boot-args string, or edit the file /Library/Preferences/SystemConfiguration/com.apple.Boot.plist and add arch=x86_64 to the Kernel Flags string:

...
	<key>Kernel</key>
	<string>mach_kernel</string>
	<key>Kernel Flags</key>
	<string>arch=x86_64</string>
...

To switch back to the 32-bit kernel, hold down the "3" and "2" keys during boot, or use one of the techniques above, replacing "x86_64" with "i386".

We've already discussed why, at least initially, you probably won't want to boot into K64. But as Snow Leopard adoption ramps up and 64-bit updates of existing kernel extensions become available, why might you actually want to use the 64-bit kernel?

The first reason has to do with RAM, and not in the way you might think. Though Leopard uses a 32-bit kernel, Macs running Leopard can contain and use far more RAM than the 4 GB limit the "32-bit" qualifier might seem to imply. But as RAM sizes increase, there's another concern: address space depletion—not for applications, but for the kernel itself.

As a 32-bit process, the kernel itself is limited to a 32-bit (i.e., 4GB) address space. That may not seem like a problem; after all, should the kernel really need more than 4GB of memory to do its job? But remember that part of the kernel's job is to track and manage system memory. The kernel uses a 64-byte structure to track the status of each 4KB page of RAM used on the system.

That's 64 bytes, not kilobytes. It hardly seems like a lot. But now consider a Mac in the not-too-distant future containing 96GB of RAM. (If this sounds ridiculous to you, think of how ridiculous the 8GB of RAM in the Mac I'm typing on right now would have sounded to you five years ago.) Tracking 96GB of RAM requires 1.5GB of kernel address space. Using more than a third of the kernel's address space just to track memory is a pretty uncomfortable situation.

A 64-bit kernel, on the other hand, has a virtually unlimited kernel address space (16 exabytes). K64 is an inevitable necessity, given the rapidly increasing size of system memory. Though you may not need it today on the desktop, it's already common for servers to have double-digit gigabytes of RAM installed.

The other thing K64 has going for it is speed. The x86 instruction set architecture has had a bit of a tortured history. When designing the x86-64 64-bit extension of the x86 architecture, AMD took the opportunity to leave behind some of the ugliness of the past and include more modern features: more registers, new addressing modes, non-stack-based floating point capabilities, etc. K64 reaps these benefits. Apple makes the following claims about its performance:

250% faster system call entry point
70% faster user/kernel memory copy

Focused benchmarking would bear these out, I'm sure. But in daily use, you're unlikely to be able to attribute any particular performance boost to the kernel. Think of K64 as removing bottlenecks from the few (usually server-based) applications that actually do exercise these aspects of the kernel heavily.

If it makes you feel better to know that your kernel is operating more efficiently, and that, were you to actually have 96GB of RAM installed, you would not risk starving the kernel of address space, and if you don't have any 32-bit drivers that you absolutely need to use, then by all means, boot into the 64-bit kernel.

For everyone else, my advice is to be glad that K64 will be ready and waiting for you when you eventually do need it—and please do encourage all the vendors that make kernel extensions that you care about to add K64 support as soon as possible.

Finally, this is worth repeating: please keep in mind that you do not need to run the 64-bit kernel in order to run 64-bit applications or install more than 4GB of RAM in your Mac. Applications run just fine in 64-bit mode on top of the 32-bit kernel, and even in earlier versions of Mac OS X it's been possible to install and take advantage of much more than 4GB of RAM.

64-bit applications

While Leopard may have brought with it support for 64-bit GUI applications, it actually included very few of them. In fact, by my count, only two 64-bit GUI applications shipped with Leopard: Xcode (an optional install) and Chess. And though Leopard made it possible for third-party developers to produce 64-bit (albeit Leopard-only) GUI applications, very few have—sometimes due to unfortunate realities, but most often because there's been no good reason to do so, abandoning users of Mac OS X 10.4 or earlier in the process.

Apple is now pushing the 64-bit transition much harder. This starts with leading by example. Snow Leopard ships with four end-user GUI applications that are not 64-bit: iTunes, Grapher, Front Row, and DVD Player. Everything else is 64-bit. The Finder, the Dock, Mail, TextEdit, Safari, iChat, Address Book, Dashboard, Help Viewer, Installer, Terminal, Calculator—you name it, it's 64-bit.

The second big carrot (or stick, depending on how you look at it) is the continued lack of 32-bit support for new APIs and technologies. Leopard started the trend, leaving deprecated APIs behind and only porting the new ones to 64-bit. The improved Objective-C 2.0 runtime introduced in Leopard was also 64-bit-only.

Snow Leopard continues along similar lines. The Objective-C 2.1 runtime's non-fragile instance variables, exception model unified with C++, and faster vtable dispatch remain available only to 64-bit applications. But the most significant new 64-bit-only API is QuickTime X—significant enough to be addressed separately, so stay tuned.

64-bits or bust

All of this is Apple's not-so-subtle way of telling developers that the time to move to 64-bit is now, and that 64-bit should be the default for all new applications, whether a developer thinks it's "needed" or not. In most cases, these new APIs have no intrinsic connection to 64-bit. Apple has simply chosen to use them as additional forms of persuasion.

Despite all of the above, I'd still call Snow Leopard merely the penultimate step in Mac OS X's journey to be 64-bit from top to bottom. I fully expect Mac OS X 10.7 to boot into the 64-bit kernel by default, to ship with 64-bit versions of all applications, plug-ins, and kernel extensions, and to leave even more legacy and deprecated APIs to fade away in the land of 32-bit.

QuickTime X

Apple did something a bit odd in Leopard when it neglected to port the C-based QuickTime API to 64-bit. At the time, it didn't seem like such a big deal. Mac OS X's transition to 64-bit had already spanned many years and several major versions. One could imagine that it just wasn't yet QuickTime's turn to go 64-bit.

As it turns out, my terse but pessimistic assessment of the situation at the time was accurate: QuickTime got the "Carbon treatment". Like Carbon, the venerable QuickTime API that we know and love will not be making the transition to 64-bit—ever.

To be clear, QuickTime the technology and QuickTime the brand will most definitely be coming to 64-bit. What's being left behind in 32-bit-only form is the C-based API introduced in 1991 and built upon for 18 years thereafter. Its replacement in the world of 64-bit in Snow Leopard is the aptly named QuickTime X.

The "X" in QuickTime X, like the one in in Mac OS X, is pronounced "ten." This is but the first of many eerie parallels. Like Mac OS X before it, QuickTime X:

aims to make a clean break from its predecessor
is based on technology originally developed for another platform
includes transparent compatibility with its earlier incarnation
promises better performance and a more modern architecture
lacks many important features in its initial release

Let's take these one at a time. First, why is a clean break needed? Put simply, QuickTime is old—really old. The horribly blocky, postage-stamp-size video displayed by its initial release in 1991 was considered a technological tour de force.

At the time, the fastest Macintosh money could buy contained a 25 MHz CPU. The ridiculous chart to the right is meant to hammer home this point. Forward-thinking design can only get you so far. The shape of the world a technology is born into eventually, inevitably dictates its fate. This is especially true for long-lived APIs like QuickTime with a strong bent towards backward compatibility.

As the first successful implementation of video on a personal computer, it's frankly amazing that the QuickTime API has lasted as long as it has. But the world has moved on. Just as Mac OS found itself mired in a ghetto of cooperative multitasking and unprotected memory, QuickTime limps into 2009 with antiquated notions of concurrency and subsystem layering baked into its design.

When it came time to write the video-handling code for the iPhone, the latest version of QuickTime, QuickTime 7, simply wasn't up to the task. It had grown too bloated and inefficient during its life on the desktop, and it lacked good support for the GPU-accelerated video playback necessary to handle modern video codecs on a handheld (even with a CPU sixteen times the clock speed of any available in a Mac when QuickTime 1.0 was released). And so, Apple created a tight, modern, GPU-friendly video playback engine that could fit comfortably within the RAM and CPU constraints of the iPhone.

Hmm. An aging desktop video API in need of a replacement. A fresh, new video library with good performance even on (comparatively) anemic hardware. Apple connected the dots. But the trick is always in the transition. Happily, this is Apple's forte. QuickTime itself has already lived on three different CPU architectures and three entirely different operating systems.

The switch to 64-bit is yet another (albeit less dramatic) inflection point, and Apple has chosen it to mark the boundary between the old QuickTime 7 and the new QuickTime X. It's done this in Snow Leopard by limiting all use of QuickTime by 64-bit applications to the QTKit Objective-C framework.

QTKit's new world order

QTKit is not new; it began its life in 2005 as a more native-feeling interface to QuickTime 7 for Cocoa applications. This extra layer of abstraction is the key to the QuickTime X transition. QTKit now hides within its object-oriented walls both QuickTime 7 and QuickTime X. Applications use QTKit as before, and behind the scenes QTKit will choose whether to use QuickTime 7 or QuickTime X to fulfill each request.

If QuickTime X is so much better, why doesn't QTKit use it for everything? The answer is that QuickTime X, like its Mac OS X namesake, has very limited capabilities in its initial release. While QuickTime X supports playback, capture, and exporting, it does not support general-purpose video editing. It also supports only "modern" video formats—basically, anything that can be played by an iPod, iPhone, or Apple TV. As for other video codecs, well, you can forget about handling them with plug-ins because QuickTime X doesn't support those either.

For every one of the cases where QuickTime X is not up to the job, QuickTime 7 will fill in. Cutting, copying, and pasting portions of a video? QuickTime 7. Extracting individual tracks from a movie? QuickTime 7. Playing any movie not natively supported by an existing Apple handheld device? QuickTime 7. Augmenting QuickTime's codec support using a plug-in of any kind? You guessed it: QuickTime 7.

But wait a second. If QTKit is the only way for a 64-bit application to use QuickTime, and QTKit multiplexes between QuickTime 7 and QuickTime X behind the scenes, and QuickTime 7 is 32-bit-only, and Mac OS X does not support "mixed mode" processes that can execute both 32-bit and 64-bit code, then how the heck does a 64-bit process do anything that requires the QuickTime 7 back-end?

To find out, fire up the new 64-bit QuickTime Player application (which will be addressed separately later) and open a movie that requires QuickTime 7. Let's say, one that uses the Sorenson video codec. (Remember that? Good times.) Sure enough, it plays just fine. But search for "QuickTime" in the Activity Monitor application and you'll see this:

Pretty sneaky, sis: 32-bit QTKitServer process

And the answer is revealed. When a 64-bit application using QTKit requires the services of the 32-bit-only QuickTime 7 back-end, QTKit spawns a separate 32-bit QTKitServer process to do the work and communicate the results back to the originating 64-bit process. If you leave Activity Monitor open while using the new QuickTime Player application, you can watch the QTKitServer processes come and go as needed. This is all handled transparently by the QTKit framework; the application itself need not be aware of these machinations.

Yes, it's going to be a long, long time before QuickTime 7 disappears completely from Mac OS X (at least Apple was kind enough not to call it "QuickTime Classic"), but the path forward is clear. With each new release of Mac OS X, expect the capabilities of QuickTime X to expand, and the number of things that still require QuickTime 7 to decrease. In Mac OS X 10.7, for example, I imagine that QuickTime X will gain support for plug-ins. And surely by Mac OS X 10.8, QuickTime X will have complete video editing support. All this will be happening beneath the unifying facade of QTKit until, eventually, the QuickTime 7 back-end is no longer needed at all.

Say what you mean

In the meantime, perhaps surprisingly, many of the current limitations of QuickTime X actually highlight its unique advantages and inform the evolving QTKit API. Though there is no direct way for a developer to request that QTKit use the QuickTime X back-end, there are several indirect means to influence the decision. The key is the QTKit API, which relies heavily on the concept of intent.

QuickTime versions 1 through 7 use a single representation of all media resources internally: a Movie object. This representation includes information about the individual tracks that make up the movie, the sample tables for each track, and so on—all the information QuickTime needs to understand and manipulate the media.

This sounds great until you realize that to do anything with a media resource in QuickTime requires the construction of this comprehensive Movie object. Consider playing an MP3 file with QuickTime, for example. QuickTime must create its internal Movie object representation of the MP3 file before it can begin playback. Unfortunately, the MP3 container format seldom contains comprehensive information about the structure of the audio. It's usually just a stream of packets. QuickTime must laboriously scan and parse the entire audio stream in order to complete the Movie object.

QuickTime 7 and earlier versions make this process less painful by doing the scanning and parsing incrementally in the background. You can see this in many QuickTime-based player applications in the form of a progress bar overlaid on the movie controller. The image below shows a 63MB MP3 podcast loading in the Leopard version of QuickTime Player. The shaded portion of the movie timeline slowly fills the dotted area from left to right.

QuickTime 7 doing more work than necessary

Though playback can begin almost immediately (provided you play from the beginning, that is) it's worthwhile to take a step back and consider what's going on here. QuickTime is creating a Movie object suitable for any operation that QuickTime can perform: editing, track extraction or addition, exporting, you name it. But what if all I want to do is play the file?

The trouble is, the QuickTime 7 API lacks a way to express this kind of intent. There is no way to say to QuickTime 7, "Just open this file as quickly as possible so that I can play it. Don't bother reading every single byte of the file from the disk and parsing it to determine its structure just in case I decide to edit or export the content. That is not my intent. Please, just open it for playback."

The QTKit API in Snow Leopard provides exactly this capability. In fact, the only way to be eligible for the QuickTime X back-end at all is to explicitly express your intent not to do anything QuickTime X cannot handle. Furthermore, any attempt to perform an operation that lies outside your previously expressed intent will cause QTKit to raise an exception.

The intent mechanism is also the way that the new features of QuickTime X are exposed, such as the ability to asynchronously load large or distantly located (e.g., over a slow network link) movie files without blocking the UI running on the main thread of the application.

Indeed, there are many reasons to do what it takes to get on board the QuickTime X train. For the media formats it supports, QuickTime X is less taxing on the CPU during playback than QuickTime 7. (This is beyond the fact that QuickTime X does not waste time preparing its internal representation of the movie for editing and export when playback is all that's desired.) QuickTime X also supports GPU-accelerated playback of H.264, but, in this initial release, only on Macs equipped with an NVIDIA 9400M GPU (i.e., some 2009 iMacs and several models of MacBooks from 2008 and 2009). Finally, QuickTime X includes comprehensive ColorSync support for video, which is long overdue.

The X factor

This is just the start of a long journey for QuickTime X, and seemingly not a very auspicious one, at that. A QuickTime engine with no editing support? No plug-ins? It seems ridiculous to release it at all. But this has been Apple's way in recent years: steady, deliberate progress. Apple aims to ship no features before their time.

As anxious as developers may be for a full-featured, 64-bit successor to the QuickTime 7 engine, Apple itself is sitting on top of one of the largest QuickTime-riddled (and Carbon-addled, to boot) code bases in the industry: Final Cut Studio. Thus far, It remains stuck in 32-bit. To say that Apple is "highly motivated" to extend the capabilities of QuickTime X would be an understatement.

Nevertheless, don't expect Apple to rush forward foolishly. Duplicating the functionality of a continually developed, 18-year-old API will not happen overnight. It will take years, and it will be even longer before every important Mac OS X application is updated to use QTKit exclusively. Transitions. Gotta love 'em.

File system API unification

Mac OS X has historically supported many different ways of referring to files on disk from within an application. Plain-old paths (e.g., /Users/john/Documents/myfile) are supported at the lowest levels of the operating system. They're simple, predictable, but perhaps not such a great idea to use as the only way an application tracks files. Consider what happens if an application opens a file based on a path string, then the user moves that file somewhere else while it's still being edited. When the application is instructed to save the file, if it only has the file path to work with, it will end up creating a new file in the old location, which is almost certainly not what the user wanted.

Classic Mac OS had a more sophisticated internal representation of files that enabled it to track files independent of their actual locations on disk. This was done with the help of the unique file ids supported by HFS/HFS+. The Mac OS X incarnation of this concept is the FSRef data type.

Finally, in the modern age, URLs have become the de facto representation for files that may be located somewhere other than the local machine. URLs can also refer to local files, but in that case they have all the same disadvantages as file paths.

This diversity of data types is reflected in Mac OS X's file system APIs. Some functions take file path as arguments, some expect opaque references to files, and still others work only with URLs. Programs that use these APIs often spend a lot of their time converting file references from one representation to another.

The situation is similar when it comes to getting information about files. There are a huge number of file system metadata retrieval functions at all levels of the operating system, and no single one of them is comprehensive. To get all available information about a file on disk requires making several separate calls, each of which may expect a different type of file reference as an argument.

Here's an example Apple provided at WWDC. Opening a single file in the Leopard version of the Preview image viewer application results in:

Four conversions of an FSRef to a file path
Ten conversions of a file path to an FSRef
Twenty-five calls to getattrlist()
Eight calls to stat()/lstat()
Four calls to open()/close()

In Snow Leopard, Apple has created a new, unified, comprehensive set of file system APIs built around a single data type: URLs. But these are URL "objects"—namely, the opaque data types NSURL and CFURL, with a toll-free bridge between them—that have been imbued with all the desirable attributes of an FSRef.

Apple settled on these data types because their opaque nature allowed this kind of enhancement, and because there are so many existing APIs that use them. URLs are also the most future-proof of all the choices, with the scheme portion providing nearly unlimited flexibility for new data types and access mechanisms. The new file system APIs built around these opaque URL types support caching and metadata prefetching for a further performance boost.

There's also a new on-disk representation called a Bookmark (not to be confused with a browser bookmark) which is like a more network-savvy replacement for classic Mac OS aliases. Bookmarks are the most robust way to create a reference to a file from within another file. It's also possible to attach arbitrary metadata to each Bookmark. For example, if an application wants to keep a persistent list of "favorite" files plus some application-specific information about them, and it wants to be resilient to any movement of these files behind its back, Bookmarks are the best tool for the job.

I mention all of this not because I expect file system APIs to be all that interesting to people without my particular fascination with this part of the operating system, but because, like Core Text before it, it's an indication of exactly how young Mac OS X really is as a platform. Even after seven major releases, Mac OS X is still struggling to move out from the shadow of its three ancestors: NeXTSTEP, classic Mac OS, and BSD Unix. Or perhaps it just goes to show how ruthlessly Apple's core OS team is driven to replace old and crusty APIs and data types with new, more modern versions.

It will be a long time before the benefits of these changes trickle down (or is it up?) to end-users in the form of Mac applications that are written or modified to use these new APIs. Most well-written Mac applications already exhibit most of the desirable behavior. For example, the TextEdit application in Leopard will correctly detect when a file it's working on has moved.

Of course, the key modifier here is "well-written." Simplifying the file system APIs means that more developers will be willing to expend the effort—now greatly reduced—to provide such user-friendly behaviors. The accompanying performance boost is just icing on the cake, and one more reason that developers might choose to alter their existing, working application to use these new APIs.

Doing more with more

Moore's Law is widely cited in technology circles—and also widely misunderstood. It's most often used as shorthand for "computers double in speed every year or so," but that's not what Gordon Moore wrote at all. His 1965 article in Electronics magazine touched on many topics in the semiconductor industry, but if it had to be summed up in a single "law", it would be, roughly, that the number of transistors that fit onto a square inch of silicon doubles every 12 months.

Moore later revised that to two years, but the time period is not what people get wrong. The problem is confusing a doubling of transistor density with a doubling of "computer speed." (Even more problematic is declaring a "law" based on a single paper from 1965, but we'll put that aside for now. For a more thorough discussion of Moore's Law, please read this classic article by Jon Stokes.)

For decades, each increase in transistor density was, in fact, accompanied by a comparable increase in computing speed thanks to ever-rising clock speeds and the dawn of superscalar execution. This worked great—existing code ran faster on each new CPU—until the grim realities of power density put an end to the fun.

Moore's Law continues, at least for now, but our ability to make code run faster with each new increase in transistor density has slowed considerably. The free lunch is over. CPU clock speeds have stagnated for years, many times actually going backwards. (The latest top-of-the-line 2009 Mac Pro contains a 2.93 GHz CPU, whereas the 2008 model could be equipped with a 3.2 GHz CPU.) Adding execution units to a CPU has also long since reached the point of diminishing returns, given the limits of instruction-level parallelism in common application code.

And yet we've still got all these new transistors raining down on us, more every year. The challenge is to find new ways to use them to actually make computers faster.

Thus far, the semiconductor industry's answer has been to give us more of what we already have. Where once a CPU contained a single logical processing unit, now CPUs in even the lowliest desktop computers contain two processor cores, with high-end models sporting two chips with eight logical cores each. Granted, the cores themselves are also getting faster, usually by doing more at the same clock speed as their predecessors, but that's not happening at nearly the rate that the cores are multiplying.

Unfortunately, generally speaking, a dual-core CPU will not run your application twice as fast as a single-core CPU. In fact, your application probably won't run any faster at all unless it was written to take advantage of more than just a single logical CPU. Presented with a glut of transistors, chipmakers have turned around and provided more computing resources than programmers know what to do with, transferring much of the responsibility for making computers faster to the software guys.

We're with the operating system and we're here to help

It's into this environment that Snow Leopard is born. If there's one responsibility (aside from security) that an operating system vendor should feel in the year 2009, it's finding a way for applications—and the OS itself—to utilize the ever-growing wealth of computing resources at their disposal. If I had to pick single technological "theme" for Snow Leopard, this would be it: helping developers utilize all this newfound silicon; helping them do more with more.

To that end, Snow Leopard includes two significant new APIs backed by several smaller, but equally important infrastructure improvements. We'll start at the bottom with, believe it or not, the compiler.

LLVM and Clang

Apple made a strategic investment in the LLVM open source project several years ago. I covered the fundamentals of LLVM in my Leopard review. (If you're not up to speed, please catch up on the topic before continuing.) In it, I described how Leopard used LLVM to provide dramatically more efficient JIT-compiled software implementations of OpenGL functions. I ended with the following admonition:

Don't be misled by its humble use in Leopard; Apple has grand plans for LLVM. How grand? How about swapping out the guts of the gcc compiler Mac OS X uses now and replacing them with the LLVM equivalents? That project is well underway. Not ambitious enough? How about ditching gcc entirely, replacing it with a completely new LLVM-based (but gcc-compatible) compiler system? That project is called Clang, and it's already yielded some impressive performance results.

With the introduction of Snow Leopard, it's official: Clang and LLVM are the Apple compiler strategy going forward. LLVM even has a snazzy new logo, a not-so-subtle homage to a well-known compiler design textbook:

LLVM! Clang! Rawr!

Apple now offers a total of four compilers for Mac OS X: GCC 4.0, GCC 4.2, LLVM-GCC 4.2 (the GCC 4.2 front-end combined with an LLVM back-end), and Clang, in order of increasing LLVM-ness. Here's a diagram:

All of these compilers are binary-compatible on Mac OS X, which means you can, for example, build a library with one compiler and link it into an executable built with another. They're also all command-line and source-compatible—in theory, anyway. Clang does not yet support some of the more esoteric features of GCC. Clang also only supports C, Objective-C, and a little bit of C++ (Clang(uage), get it?) whereas GCC supports many more. Apple is committed to full C++ support for Clang, and hopes to work out the remaining GCC incompatibilities during Snow Leopard's lifetime.

Clang brings with it the two headline attributes you expect in a hot, new compiler: shorter compile times and faster executables. In Apple's testing with its own applications such as iCal, Address Book, and Xcode itself, plus third-party applications like Adium and Growl, Clang compiles nearly three times faster than GCC 4.2. As for the speed of the finished product, the LLVM back-end, whether used in Clang or in LLVM-GCC, produces executables that are 5-25% faster than those generated by GCC 4.2.

Clang is also more developer-friendly than its GCC predecessors. I concede that this topic doesn't have much to do with taking advantage of multiple CPU cores and so on, but it's sure to be the first thing that a developer actually notices when using Clang. Indulge me.

For starters, Clang is embeddable, so Xcode can use the same compiler infrastructure for interactive features within the IDE (symbol look-up, code completion, etc.) as it uses to compile the final executable. Clang also creates and preserves more extensive metadata while compiling, resulting in much better error reporting. For example, when GCC tells you this:

It's not exactly clear what the problem is, especially if you're new to C programming. Yes, all you hotshots already know what the problem is (especially if you saw this example at WWDC), but I think everyone can agree that this error, generated by Clang, is a lot more helpful:

Maybe a novice still wouldn't know what to do, but at least it's clear where the problem lies. Figuring out why the compiler doesn't know about NSString is a much more focused task than can be derived from GCC's cryptic error.

Even when the message is clear, the context may not be. Take this error from GCC:

Sure, but there are four "+" operators on that single line. Which one has the problematic operands? Thanks to its more extensive metadata, Clang can pinpoint the problem:

Sometimes the error is perfectly clear, but it just seems a bit off, like this situation where jumping to the error as reported by GCC puts you on the line below where you actually want to add the missing semicolon:

The little things count, you know? Clang goes that extra mile:

Believe it or not, stuff like this means a lot to developers. And then there are the not-so-little things that mean even more, like the LLVM-powered static analyzer. The image below shows how the static analyzer displays its discovery of a possible bug.

Aside from the whimsy of the little arrows (which, admit it, are adorable), the actual bug it's highlighting is something that every programmer can imagine creating (say, through some hasty editing). The static analyzer has determined that there's at least one path through this set of nested conditionals that leaves the myName variable uninitialized, thus making the attempt to send the mutableCopy message in the final line potentially dangerous.

I'm sure Apple is going hog-wild running the static analyzer on all of its applications and the operating system itself. The prospect of an automated way to discover bugs that may have existed for years in the depths of a huge codebase is almost pornographic to developers—platform owners in particular. To the degree that Mac OS X 10.6.0 is more bug-free than the previous 10.x.0 releases, LLVM surely deserves some significant part of the credit.

Master of the house

By committing to a Clang/LLVM-powered future, Apple has finally taken complete control of its development platform. The CodeWarrior experience apparently convinced Apple that it's unwise to rely on a third party for its platform's development tools. Though it's taken many years, I think even the most diehard Metrowerks fan would have to agree that Xcode in Snow Leopard is now a pretty damn good IDE.

After years of struggling with the disconnect between the goals of the GCC project and its own compiler needs, Apple has finally cut the apron strings. OK, granted, GCC 4.2 is still the default compiler in Snow Leopard, but this is a transitional phase. Clang is the recommended compiler, and the focus of all of Apple's future efforts.

I know what you're thinking. This is swell and all, but how are these compilers helping developers better leverage the expanding swarm of transistors at their disposal? As you'll see in the following sections, LLVM's scaly, metallic head pops up in a few key places.

Blocks

In Snow Leopard, Apple has introduced a C language extension called "blocks." Blocks add closures and anonymous functions to C and the C-derived languages C++, Objective-C, and Objective C++.

These features have been available in dynamic programming languages such as Lisp, Smalltalk, Perl, Python, Ruby, and even the unassuming JavaScript for a long time (decades, in the case of Lisp—a fact gladly offered by its practitioners). While dynamic-language programmers take closures and anonymous functions for granted, those who work with more traditional, statically compiled languages such as C and its derivatives may find them quite exotic. As for non-programmers, they likely have no interest in this topic at all. But I'm going to attempt an explanation nonetheless, as blocks form the foundation of some other interesting technologies to be discussed later.

Perhaps the simplest way to explain blocks is that they make functions another form of data. C-derived languages already have function pointers, which can be passed around like data, but these can only point to functions created at compile time. The only way to influence the behavior of such a function is by passing different arguments to the function or by setting global variables which are then accessed from within the function. Both of these approaches have big disadvantages

Passing arguments becomes cumbersome as their number and complexity grows. Also, it may be that you have limited control over the arguments that will be passed to your function, as is often the case with callbacks. To compensate, you may have to bundle up all of your interesting state into a context object of some kind. But when, how, and by whom that context data will be disposed of can be difficult to pin down. Often, a second callback is required for this. It's all quite a pain.

As for the use of global variables, in addition to being a well-known anti-pattern, it's also not thread-safe. To make it so requires locks or some other form of mutual exclusion to prevent multiple invocations of the same function from stepping on each other's toes. And if there's anything worse than navigating a sea of callback-based APIs, it's manually dealing with thread safety issues.

Blocks bypass all of these problems by allowing functional blobs of code—blocks—to be defined at runtime. It's easiest to understand with an example. I'm going to start by using JavaScript, which has a bit friendlier syntax, but the concepts are the same.

b = get_number_from_user();

multiplier = function(a) { return a * b };

Here I've created a function named multiplier that takes a single argument, a, and multiplies it by a second value, b, that's provided by the user at runtime. If the user supplied the number 2, then a call to multiplier(5) would return the value 10.

b = get_number_from_user(); // assume it's 2

multiplier = function(a) { return a * b };

r = multiplier(5); // 5 * 2 = 10

Here's the example above done with blocks in C.

b = get_number_from_user(); // assume it's 2

multiplier = ^ int (int a) { return a * b; };

r = multiplier(5); // 5 * 2 = 10

By comparing the JavaScript code to the C version, I hope you can see how it works. In the C example, that little caret ^ is the key to the syntax for blocks. It's kind of ugly, but it's very C-like in that it parallels the existing C syntax for function pointers, with ^ in place of *, as this example illustrates:

/* A function that takes a single integer argument and returns
   a pointer to a function that takes two integer arguments and
   returns a floating-point number. */
float (*func2(int a))(int, int);

/* A function that takes a single integer argument and returns
   a block that takes two integer arguments and returns a
   floating-point number. */
float (^func1(int a))(int, int);

You'll just have to trust me when I tell you that this syntax actually makes sense to seasoned C programmers.

Now then, does this mean that C is suddenly a dynamic, high-level language like JavaScript or Lisp? Hardly. The existing distinction between the stack and the heap, the rules governing automatic and static variables, and so on are all still in full effect. Plus, now there's a whole new set of rules for how blocks interact with each of these things. There's even a new __block storage type attribute to further control the scope and lifetime of values used in blocks.

All of that said, blocks are still a huge win in C. Thanks to blocks, the friendlier APIs long enjoyed by dynamic languages are now possible in C-derived languages. For example, suppose you want to apply some operation to every line in a file. To do so in a low-level language like C requires some amount of boilerplate code to open and read from the file, handle any errors, read each line into a buffer, and clean up at the end.

FILE *fp = fopen(filename, "r");

if (fp == NULL) {
  perror("Unable to open file");
}
else {
  char line[MAX_LINE];

  while (fgets(line, MAX_LINE, fp)) {
    work; work; work;
  }

  fclose(fp);
}

The part in bold is an abstract representation of what you're planning to do to each line of the file. The rest is the literal boilerplate code. If you find yourself having to apply varying operations to every line of many different files, this boilerplate code gets tedious.

What you'd like to be able to do is factor it out into a function that you can call. But then you're faced with the problem of how to express the operation you'd like to perform on each line of the file. In the middle of each block of boilerplate may be many lines of code expressing the operation to be applied. This code may reference or modify local variables which are affected by the runtime behavior of the program, so traditional function pointers won't work. What to do?

Thanks to blocks, you can define a function that takes a filename and a block as arguments. This gets all the uninteresting code out of your face.

foreach_line(filename, ^ (char *line) {
  work; work; work;
});

What's left is a much clearer expression of your intent, with less surrounding noise. The argument after filename is a literal block that takes a line of text as an argument.

Even when the volume of boilerplate is small, the simplicity and clarity bonus is still worthwhile. Consider the simplest possible loop that executes a fixed number of times. In C-based languages, even that basic construct offers a surprising number of opportunities for bugs. Let's do_something() 10 times:

for (int i = 0; i <= 10; i++) {
  do_something();
}

Oops, I've got a little bug there, don't I? It happens to the best of us. But why should this code be more complicated than the sentence describing it. Do something 10 times! I never want to screw that up again. Blocks can help. If we just invest a little effort up front to define a helper function:

typedef void (^work_t)(void);

void repeat(int n, work_t block) {
  for (int i = 0; i < n; ++i)
    block();
}

We can banish the bug for good. Now, repeating any arbitrary block of code a specific number of times is all but idiot-proof:

repeat(10, ^{ do_something() });
repeat(20, ^{ do_other_thing() });

And remember, the block argument to repeat() can contain exactly the same kind of code, literally copied and pasted, that would have appeared within a traditional for loop.

All these possibilities and more have been well explored by dynamic languages: map, reduce, collect, etc. Welcome, C programmers, to a higher order.

Apple has taken these lessons to heart, adding over 100 new APIs that use blocks in Snow Leopard. Many of these APIs would not be possible at all without blocks, and all of them are more elegant and concise than they would be otherwise.

It's Apple intention to submit blocks as an official extension to one or more of the C-based languages, though it's not yet clear which standards bodies are receptive to the proposal. For now, blocks are supported by all four of Apple's compilers in Mac OS X.

Concurrency in the real world: a prelude

The struggle to make efficient use of a large number of independent computing devices is not new. For decades, the field of high-performance computing has tackled this problem. The challenges faced by people writing software for supercomputers many years ago have now trickled down to desktop and even mobile computing platforms.

In the PC industry, some people saw this coming earlier than others. Almost 20 years ago, Be Inc. was formed around the idea of creating a PC platform unconstrained by legacy limitations and entirely prepared for the coming abundance of independent computing units on the desktop. To that end, Be created the BeBox, a dual-CPU desktop computer, and BeOS, a brand-new operating system.

The signature catch phrase for BeOS was "pervasive multithreading." The BeBox and other machines running BeOS leveraged every ounce of the diminutive (by today's standards, anyway) computing resources at their disposal. The demos were impressive. A dual 66 MHz machine (don't make me show another graph) could play multiple videos simultaneously while also playing several audio tracks from a CD—some backwards— and all the while, the user interface remained completely responsive.

Let me tell you, having lived through this period myself, the experience was mind-blowing at the time. BeOS created instant converts out of hundreds of technology enthusiasts, many of whom maintain that today's desktop computing experience still doesn't match the responsiveness of BeOS. This is certainly true emotionally, if not necessarily literally.

After nearly purchasing Be in the late 1990s, Apple bought NeXT instead, and the rest is history. But had Apple gone with plan Be instead, Mac developers might have had a rough road ahead. While all that pervasive multithreading made for impressive technology demos and a great user experience, it could be extremely demanding on the programmer. BeOS was all about threads, going so far as to maintain a separate thread for each window. Whether you liked it or not, your BeOS program was going to be multithreaded.

Parallel programming is notoriously hard, with the manual management of POSIX-style threads representing the deep end of that pool. The best programmers in the world are hard-pressed to create large multithreaded programs in low-level languages like C or C++ without finding themselves impaled on the spikes of deadlock, race conditions, and other perils inherent in the use of in multiple simultaneous threads of execution that share the same memory space. Extremely careful application of locking primitives is required to avoid performance-robbing levels of contention for shared data—and the bugs, oh the bugs! The term "Heisenbug" may as well have been invented for multithreaded programming.

Nineteen years after Be tilted at the windmill of the widening swath of silicon in desktop PCs, the challenge has only grown. Those transistors are out there, man—more than ever before. Single-threaded programs on today's high-end desktop Macs, even when using "100%" CPU, extend but a single glowing tower in a sea of sixteen otherwise empty lanes on a CPU monitor window.

And woe be unto the user if that pegged CPU core is running the main thread of a GUI application on Mac OS X. A CPU-saturated main thread means no new user inputs are being pulled off the event queue by the application. A few seconds of that and an old friend makes its appearance: the spinning beach ball of death.

Nooooooooo!!! Credit: Image from The Iconfactory

This is the enemy: hardware with more computing resources than programmers know what to do with, most of it completely idle, and all the while the user is utterly blocked in his attempts to use the current application. What's Snow Leopard's answer? Read on…

Grand Central Dispatch

Snow Leopard's answer to the concurrency conundrum is called Grand Central Dispatch (GCD). As with QuickTime X, the name is extremely apt, though this is not entirely clear until you understand the technology.

The first thing to know about GCD is that it's not a new Cocoa framework or similar special-purpose frill off to the side. It's a plain C library baked into the lowest levels of Mac OS X. (It's in libSystem, which incorporates libc and the other code that sits at the very bottom of userspace.)

There's no need to link in a new library to use GCD in your program. Just #include <dispatch/dispatch.h> and you're off to the races. The fact that GCD is a C library means that it can be used from all of the C-derived languages supported on Mac OS X: Objective-C, C++, and Objective-C++.

Queues and threads

GCD is built on a few simple entities. Let's start with queues. A queue in GCD is just what it sounds like. Tasks are enqueued, and then dequeued in FIFO order. (That's "First In, First Out," just like the checkout line at the supermarket, for those who don't know and don't want to follow the link.) Dequeuing the task means handing it off to a thread where it will execute and do its actual work.

Though GCD queues will hand tasks off to threads in FIFO order, several tasks from the same queue may be running in parallel at any given time. This animation demonstrates.

A Grand Central Dispatch queue in action

You'll notice that Task B completed before Task A. Though dequeuing is FIFO, task completion is not. Also note that even though there were three tasks enqueued, only two threads were used. This is an important feature of GCD which we'll discuss shortly.

But first, let's look at the other kind of queue. A serial queue works just like a normal queue, except that it only executes one task at a time. That means task completion in a serial queue is also FIFO. Serial queues can be created explicitly, just like normal queues, but each application also has an implicit "main queue" which is a serial queue that runs on the main thread.

The animation above shows threads appearing as work needs to be done, and disappearing as they're no longer needed. Where do these threads come from and where do they go when they're done? GCD maintains a global pool of threads which it hands out to queues as they're needed. When a queue has no more pending tasks to run on a thread, the thread goes back into the pool.

This is an extremely important aspect of GCD's design. Perhaps surprisingly, one of the most difficult parts of extracting maximum performance using traditional, manually managed threads is figuring out exactly how many threads to create. Too few, and you risk leaving hardware idle. Too many, and you start to spend a significant amount of time simply shuffling threads in and out of the available processor cores.

Let's say a program has a problem that can be split into eight separate, independent units of work. If this program then creates four threads on an eight-core machine, is this an example of creating too many or too few threads? Trick question! The answer is that it depends on what else is happening on the system.

If six of the eight cores are totally saturated doing some other work, then creating four threads will just require the OS to waste time rotating those four threads through the two available cores. But wait, what if the process that was saturating those six cores finishes? Now there are eight available cores but only four threads, leaving half the cores idle.

With the exception of programs that can reasonably expect to have the entire machine to themselves when they run, there's no way for a programmer to know ahead of time exactly how many threads he should create. Of the available cores on a particular machine, how many are in use? If more become available, how will my program know?

The bottom line is that the optimal number of threads to put in flight at any given time is best determined by a single, globally aware entity. In Snow Leopard, that entity is GCD. It will keep zero threads in its pool if there are no queues that have tasks to run. As tasks are dequeued, GCD will create and dole out threads in a way that optimizes the use of the available hardware. GCD knows how many cores the system has, and it knows how many threads are currently executing tasks. When a queue no longer needs a thread, it's returned to the pool where GCD can hand it out to another queue that has a task ready to be dequeued.

There are further optimizations inherent in this scheme. In Mac OS X, threads are relatively heavyweight. Each thread maintains its own set of register values, stack pointer, and program counter, plus kernel data structures tracking its security credentials, scheduling priority, set of pending signals and signal masks, etc. It all adds up to over 512 KB of overhead per thread. Create a thousand threads and you've just burned about a half a gigabyte of memory and kernel resources on overhead alone, before even considering the actual data within each thread.

Compare a thread's 512 KB of baggage with GCD queues which have a mere 256 bytes of overhead. Queues are very lightweight, and developers are encouraged to create as many of them as they need—thousands, even. In the earlier animation, when the queue was given two threads to process its three tasks, it executed two tasks on one of the threads. Not only are threads heavyweight in terms of memory overhead, they're also relatively costly to create. Creating a new thread for each task would be the worst possible scenario. Every time GCD can use a thread to execute more than one task, it's a win for overall system efficiency.

Remember the problem of the programmer trying to figure out how many threads to create? Using GCD, he doesn't have to worry about that at all. Instead, he can concentrate entirely on the optimal concurrency of his algorithm in the abstract. If the best-case scenario for his problem would use 500 concurrent tasks, then he can go ahead and create 500 GCD queues and distribute his work among them. GCD will figure out how many actual threads to create to do the work. Furthermore it will adjust the number of threads dynamically as the conditions on the system change.

But perhaps most importantly, as new hardware is released with more and more CPU cores, the programmer does not need to change his application at all. Thanks to GCD, it will transparently take advantage of any and all available computing resources, up to—but not past!—the optimal amount of concurrency as originally defined by the programmer when he chose how many queues to create.

But wait, there's more! GCD queues can actually be arranged in arbitrarily complex directed acyclic graphs. (Actually, they can be cyclic too, but then the behavior is undefined. Don't do that.) Queue hierarchies can be used to funnel tasks from disparate subsystems into a narrower set of centrally controlled queues, or to force a set of normal queues to delegate to a serial queue, effectively serializing them all indirectly.

There are also several levels of priority for queues, dictating how often and with what urgency threads are distributed to them from the pool. Queues can be suspended, resumed, and cancelled. Queues can also be grouped, allowing all tasks distributed to the group to be tracked and accounted for as a unit.

Overall, GCD's use of queues and threads forms a simple, elegant, but also extremely pragmatic architecture.

Asynchronicity

Okay, so GCD is a great way to make efficient use of the available hardware. But is it really any better than BeOS's approach to multithreading? We've already seen a few ways that GCD avoids the pitfalls of BeOS (e.g., the reuse of threads and the maintenance of a global pool of threads that's correctly sized for the available hardware). But what about the problem of overwhelming the programmer by requiring threads in places where they complicate, rather than enhance the application?

GCD embodies a philosophy that is at the opposite end of the spectrum from BeOS's "pervasive multithreading" design. Rather than achieving responsiveness by getting every possible component of an application running concurrently on its own thread (and paying a heavy price in terms of complex data sharing and locking concerns), GCD encourages a much more limited, hierarchical approach: a main application thread where all the user events are processed and the interface is updated, and worker threads doing specific jobs as needed.

In other words, GCD doesn't require developers to think about how best to split the work of their application into multiple concurrent threads (though when they're ready to do that, GCD will be willing and able to help). At its most basic level, GCD aims to encourage developers to move from thinking synchronously to thinking asynchronous. Something like this: "Write your application as usual, but if there's any part of its operation that can reasonably be expected to take more than a few seconds to complete, then for the love of Zarzycki, get it off the main thread!"

That's it; no more, no less. Beach ball banishment is the cornerstone of user interface responsiveness. In some respects, everything else is gravy. But most developers know this intuitively, so why do we still see the beach ball in Mac OS X applications? Why don't all applications already execute all of their potentially long-running tasks on background threads?

A few reasons have been mentioned already (e.g., the difficulty of knowing how many threads to create) but the big one is much more pragmatic. Spinning off a thread and collecting its result has always been a bit of a pain. It's not so much that it's technically difficult, it's just that it's such an explicit break from coding the actual work of your application to coding all this task-management plumbing. And so, especially in borderline cases, like an operation that may take 3 to 5 seconds, developers just do it synchronously and move onto the next thing.

Unfortunately, there's a surprising number of very common things that an application can do that execute quickly most of the time, but have the potential to take much longer than a few seconds when something goes wrong. Anything that touches the file system may stall at the lowest levels of the OS (e.g., within blocking read() and write() calls) and be subject to a very long (or at least an "unexamined-by-the-application-developer") timeout. The same goes for name lookups (e.g., DNS or LDAP), which almost always execute instantly, but catch many applications completely off-guard when they start taking their sweet time to return a result. Thus, even the most meticulously constructed Mac OS X applications can end up throwing the beach ball in our face from time to time.

With GCD, Apple is saying it doesn't have to be this way. For example, suppose a document-based application has a button that, when clicked, will analyze the current document and display some interesting statistics about it. In the common case, this analysis should execute in under a second, so the following code is used to connect the button with an action:

- (IBAction)analyzeDocument:(NSButton *)sender
{
  NSDictionary *stats = [myDoc analyze];
  [myModel setDict:stats];
  [myStatsView setNeedsDisplay:YES];
  [stats release];
}

The first line of the function body analyzes the document, the second line updates the application's internal state, and the third line tells the application that the statistics view needs to be updated to reflect this new state. It all follows a very common pattern, and it works great as long as none of these steps—which are all running on the main thread, remember—takes too long. Because after the user presses the button, the main thread of the application needs to handle that user input as fast as possible so it can get back to the main event loop to process the next user action.

The code above works great until a user opens a very large or very complex document. Suddenly, the "analyze" step doesn't take one or two seconds, but 15 or 30 seconds instead. Hello, beach ball. And still, the developer is likely to hem and haw: "This is really an exceptional situation. Most of my users will never open such a large file. And anyway, I really don't want to start reading documentation about threads and adding all that extra code to this simple, four-line function. The plumbing would dwarf the code that does the actual work!"

Well, what if I told you that you could move the document analysis to the background by adding just two lines of code (okay, and two lines of closing braces), all located within the existing function? No application-global objects, no thread management, no callbacks, no argument marshalling, no context objects, not even any additional variables. Behold, Grand Central Dispatch:

- (IBAction)analyzeDocument:(NSButton *)sender
{
  dispatch_async(dispatch_get_global_queue(0, 0), ^{
    NSDictionary *stats = [myDoc analyze];
    dispatch_async(dispatch_get_main_queue(), ^{
      [myModel setDict:stats];
      [myStatsView setNeedsDisplay:YES];
      [stats release];
    });
  });
}

There's a hell of a lot of packed into those two lines of code. All of the functions in GCD begin with dispatch_, and you can see four such calls in the blue lines of code above. The key to the minimal invasiveness of this code is revealed in the second argument to the two dispatch_async() calls. Thus far, I've been discussing "units of work" without specifying how, exactly, GCD models such a thing. The answer, now revealed, should seem obvious in retrospect: blocks! The ability of blocks to capture the surrounding context is what allows these GCD calls to be dropped right into some existing code without requiring any additional setup or re-factoring or other contortions in service of the API.

But the best part of this code is how it deals with the problem of detecting when the background task completes and then showing the result. In the synchronous code, the analyze method call and the code to update the application display simply appear in the desired sequence within the function. In the asynchronous code, miraculously, this is still the case. Here's how it works.

The outer dispatch_async() call puts a task on a global concurrent GCD queue. That task, represented by the block passed as the second argument, contains the potentially time-consuming analyze method call, plus another call to dispatch_async() that puts a task onto the main queue—a serial queue that runs on the main thread, remember—to update the application's user interface.

User interface updates must all be done from the main thread in a Cocoa application, so the code in the inner block could not be executed anywhere else. But rather than having the background thread send some kind of special-purpose notification back to the main thread when the analyze method call completes (and then adding some code to the application to detect and handle this notification), the work that needs to be done on the main thread to update the display is encapsulated in yet another block within the larger one. When the analyze call is done, the inner block is put onto the main queue where it will (eventually) run on the main thread and do its work of updating the display.

Simple, elegant, and effective. And for developers, no more excuses.

Believe it or not, it's just as easy to take a serial implementation of a series of independent operations and parallelize it. The code below does work on count elements of data, one after the other, and then summarizes the results once all the elements have been processed.

for (i = 0; i < count; i++) {
    results[i] = do_work(data, i);
} 

total = summarize(results, count);

Now here's the parallel version which puts a separate task for each element onto a global concurrent queue. (Again, it's up to GCD to decide how many threads to actually use to execute the tasks.)

dispatch_apply(count, dispatch_get_global_queue(0, 0), ^(size_t i) {
    results[i] = do_work(data, i);
});

total = summarize(results, count);

And there you have it: a for loop replaced with a concurrency-enabled equivalent with one line of code. No preparation, no additional variables, no impossible decisions about the optimal number of threads, no extra work required to wait for all the independent tests to complete. (The dispatch_apply() call will not return until all the tasks it has dispatched have completed.) Stunning.

Grand Central Awesome

Of all the APIs added in Snow Leopard, Grand Central Dispatch has the most far-reaching implications for the future of Mac OS X. Never before has it been so easy to do work asynchronously and to spread workloads across many CPUs.

When I first heard about Grand Central Dispatch, I was extremely skeptical. The greatest minds in computer science have been working for decades on the problem of how best to extract parallelism from computing workloads. Now here was Apple apparently promising to solve this problem. Ridiculous.

But Grand Central Dispatch doesn't actually address this issue at all. It offers no help whatsoever in deciding how to split your work up into independently executable tasks—that is, deciding what pieces can or should be executed asynchronously or in parallel. That's still entirely up to the developer (and still a tough problem). What GCD does instead is much more pragmatic. Once a developer has identified something that can be split off into a separate task, GCD makes it as easy and non-invasive as possible to actually do so.

The use of FIFO queues, and especially the existence of serialized queues, seems counter to the spirit of ubiquitous concurrency. But we've seen where the Platonic ideal of multithreading leads, and it's not a pleasant place for developers.

One of Apple's slogans for Grand Central Dispatch is "islands of serialization in a sea of concurrency." That does a great job of capturing the practical reality of adding more concurrency to run-of-the-mill desktop applications. Those islands are what isolate developers from the thorny problems of simultaneous data access, deadlock, and other pitfalls of multithreading. Developers are encouraged to identify functions of their applications that would be better executed off the main thread, even if they're made up of several sequential or otherwise partially interdependent tasks. GCD makes it easy to break off the entire unit of work while maintaining the existing order and dependencies between subtasks.

Those with some multithreaded programming experience may be unimpressed with the GCD. So Apple made a thread pool. Big deal. They've been around forever. But the angels are in the details. Yes, the implementation of queues and threads has an elegant simplicity, and baking it into the lowest levels of the OS really helps to lower the perceived barrier to entry, but it's the API built around blocks that makes Grand Central Dispatch so attractive to developers. Just as Time Machine was "the first backup system people will actually use," Grand Central Dispatch is poised to finally spread the heretofore dark art of asynchronous application design to all Mac OS X developers. I can't wait.

OpenCL

Somehow, OpenCL got in on the "core" branding

So far, we've seen a few examples of doing more with more: a new, more modern compiler infrastructure that supports an important new language feature, and a powerful, pragmatic concurrency API built on top of the new compilers' support for said language feature. All this goes a long way towards helping developers and the OS itself make maximum use of the available hardware.

But CPUs are not the only components experiencing a glut of transistors. When it comes to the proliferation of independent computation engines, another piece of silicon inside every Mac is the undisputed title holder: the GPU.

The numbers tell the tale. While Mac CPUs contain up to four cores (which may show up as eight logical cores thanks to symmetric multithreading), high-end GPUs contain well over 200 processor cores. While CPUs are just now edging over 100 GFLOPS, the best GPUs are capable of over 1,000 GFLOPS. That's one trillion floating-point operations per second. And like CPUs, GPUs now come more than one on a board.

Writing for the GPU

Unfortunately, the cores on a GPU are not general-purpose processors (at least not yet). They're much simpler computing engines that have evolved from the fixed-function silicon of their ancestors that could not be programmed directly at all. They don't support the rich set of instructions available on CPUs, the maximum size of the programs that will run is often limited and very small, and not all of the features of the industry-standard IEEE floating-point computation specification are supported.

Today's GPUs can be programmed, but the most common forms of programmability are still firmly planted in the world of graphics programming: vertex shaders, geometry shaders, pixel shaders. Most of the languages used to program GPUs are similarly graphically focused: HLSL, GLSL, Cg.

Nevertheless, there are computational tasks outside the realm of graphics that are a good fit for GPU hardware. It would be nice if there were a non-graphics-oriented language to write them in. Creating such a thing is quite a challenge, however. GPU hardware varies wildly in every imaginable way: number and type of execution units, available data formats, instruction sets, memory architecture, you name it. Programmers don't want to be exposed to these differences, but it's difficult to work around the complete lack of a feature or the unavailability of a particular data type.

GPU vendor NVIDIA gave it a shot, however, and produced CUDA: a subset of the C language with extensions for vector data types, data storage specifiers that reflect typical GPU memory hierarchy, and several bundled computational libraries. CUDA is but one entrant in the burgeoning GPGPU field (General-Purpose computing on Graphics Processing Units). But coming from a GPU vendor, it faces an uphill battle with developers who really want a vendor-agnostic solution.

In the world of 3D programming, OpenGL fills that role. As you've surely guessed by now, OpenCL aims to do the same for general-purpose computation. In fact, OpenCL is supported by the same consortium as OpenGL: the ominously named Khronos Group. But make no mistake, OpenCL is Apple's baby.

Apple understood that OpenCL's best chance of success was to become an industry standard, not just an Apple technology. To make that happen, Apple needed the cooperation of the top GPU vendors, plus an agreement with an established, widely-recognized standards body. It took a while, but now it's all come together.

OpenCL is a lot like CUDA. It uses a C-like language with the vector extensions, it has a similar model of memory hierarchy, and so on. This is no surprise, considering how closely Apple worked with NVIDIA during the development of OpenCL. There's also no way any of the big GPU vendors would radically alter their hardware to support an as-yet-unproven standard, so OpenCL had to work well with GPUs already designed to support CUDA, GLSL, and other existing GPU programming languages.

The OpenCL difference

This is all well and good, but to have any impact on the day-to-day life of Mac users, developers actually have to use OpenCL in their applications. Historically, GPGPU programming languages have not seen much use in traditional desktop applications. There are several reasons for this.

Early on, writing programs for the GPU often required the use of vendor-specific assembly languages that were far removed from the experience of writing a typical desktop application using a contemporary GUI API. The more C-like languages that came later remained either graphics-focused, vendor-specific, or both. Unless running code on the GPU would accelerate a core component of an application by an order of magnitude, most developers still could not be bothered to navigate this foreign world.

And even if the GPU did give a huge speed boost, relying on graphics hardware for general-purpose computation was very likely to narrow the potential audience for an application. Many older GPUs, especially those found in laptops, cannot run languages like CUDA at all.

Apple's key decision in the design of OpenCL was to allow OpenCL programs to run not just on GPUs, but on CPUs as well. An OpenCL program can query the hardware it's running on and enumerate all eligible OpenCL devices, categorized as CPUs, GPUs, or dedicated OpenCL accelerators (the IBM Cell Blade server—yes, that Cell—is apparently one such device). The program can then dispatch its OpenCL tasks to any available device. It's also possible to create a single logical device consisting of any combination of eligible computing resources: two GPUs, a GPU and two CPUs, etc.

The advantages of being able to run OpenCL programs on both CPUs and GPUs are obvious. Every Mac running Snow Leopard, not just those with the recent-model GPUs, can run a program that contains OpenCL code. But there's more to it than that.

Certain kinds of algorithms actually run faster on high-end multi-core CPUs than on even the very fastest available GPUs. At WWDC 2009, an engineer from Electronic Arts demonstrated an OpenCL port of a skinning engine from one of its games running over four times faster on a four-core Mac Pro than on an NVIDIA GeForce GTX285. Restructuring the algorithm and making many other changes to better suit the limitations (and strengths) of the GPU pushed it back ahead of the CPU by a wide margin, but sometimes you just want the system you have to run well as-is. Being able to target the CPU is extremely useful in those cases.

Moreover, writing vector code for Intel CPUs "the old-fashioned way" can be a real pain. There's MMX, SSE, SSE2, SSE3, and SSE4 to deal with, all with slightly different capabilities, and all of which force the programmer to write code like this:

r1 = _mm_mul_ps(m1, _mm_add_ps(x1, x2));

OpenCL's native support for vector types de-clutters the code considerably:

r1 = m1 * (x1 + x2);

Similarly, OpenCL's support for implicit parallelism makes it much easier to take advantage of multiple CPU cores. Rather than writing all the logic to split your data into pieces and distribute those pieces to the parallel-computing hardware, OpenCL lets you write just the code to operate on a single piece of the data and then send it, along with the entire block of data and the desired level of parallelism, to the computing device.

This arrangement is taken for granted in traditional graphics programming, where code implicitly works on all pixels in a texture or all vertices in a polygon; the programmer only needs to write code that will exist in the "inner loop," so to speak. An API with support for this kind of parallelism that runs on CPUs as well as GPUs fills an important gap.

Writing to OpenCL also future-proofs task- or data-parallel code. Just as the same OpenGL code will get faster and faster as newer, more powerful GPUs are released, so too will OpenCL code perform better as CPUs and GPUs get faster. The extra layer of abstraction that OpenCL provides makes this possible. For example, though vector code written several years ago using MMX got faster as CPU clock speeds increased, a more significant performance boost likely requires porting the code to one of the newer SSE instruction sets.

As newer, more powerful vector instruction sets and parallel hardware becomes available, Apple will update its OpenCL implementations to take advantage of them, just as video card makers and OS vendors update their OpenGL drivers to take advantage of faster GPUs. Meanwhile, the application developer's code remains unchanged. Not even a recompile is required.

Here be dragons (and trains)

How, you may wonder, can the same compiled code end up executing using SSE2 on one machine and SSE4 on another, or on an NVIDIA GPU on one machine and an ATI GPU on another? To do so would require translating the device-independent OpenCL code to the instruction set of the target computing device at runtime. When running on a GPU, OpenCL must also ship the data and the newly translated code over to the video card and collect the results at the end. When running on the CPU, OpenCL must arrange for the requested level of parallelism by creating and distributing threads appropriately to the available cores.

Well, wouldn't you know it? Apple just happens to have two technologies that solve these exact problems.

Want to compile code "just in time" and ship it off to a computing device? That's what LLVM was born to do—and, indeed, what Apple did with it in Leopard, albeit on a more limited scale. OpenCL is a natural extension of that work. LLVM allows Apple to write a single code generator for each target instruction set, and concentrate all of its effort on a single device-independent code optimizer. There's no longer any need to duplicate these tasks, using one compiler to create the static application executable and having to jury-rig another for just-in-time compilation.

(Oh, and by the way, remember Core Image? That's another API that needs to compile code just-in-time and ship it off to execute on parallel hardware like GPUs and multi-core CPUs. In Snow Leopard, Core Image has been re-implemented using OpenCL, producing a hefty 25% overall performance boost.)

To handle task parallelism and provision threads, OpenCL is built on top of Grand Central Dispatch. This is such a natural fit that it's a bit surprising that the OpenCL API doesn't use blocks. I think Apple decided that it shouldn't press its luck when it comes to getting its home-grown technologies adopted by other vendors. This decision already seems to be paying off, as AMD has its own OpenCL implementation under way.

The top of the pyramid

Though the underlying technologies, Clang, blocks and Grand Central Dispatch, will undoubtedly be more widely used by developers, OpenCL represents the culmination of that particular technological thread in Snow Leopard. This is the gold standard of software engineering: creating a new public API by building it on top of lower-level, but equally well-designed and implemented public APIs.

A unified abstraction for the ever-growing heterogeneous collection of parallel computing silicon in desktop computers was sorely needed. We've got an increasing population of powerful CPU cores, but they still exist in numbers that are orders of magnitude lower than the hundreds of processing units in modern GPUs. On the other hand, GPUs still have a ways to go to catch up with the power and flexibility of a full-fledged CPU core. But even with all the differences, writing code exclusively for either one of those worlds still smacks of leaving money on the table.

With OpenCL in hand, there's no longer a need to put all your eggs in one silicon basket. And with the advent of hybrid CPU/GPU efforts like Intel's Larabee, which use CPU-caliber processing engines, but in much higher numbers, OpenCL may prove even more important in the coming years.

Transistor harvest

Collectively, the concurrency-enabling features introduced in Snow Leopard represent the biggest boost to asynchronous and parallel software development in any Mac OS X release—perhaps in any desktop operating system release ever. It may be hard for end-users to get excited about "plumbing" technologies like Grand Central Dispatch and OpenCL, let alone compilers and programming language features, but it's upon these foundations that developers will create ever-more-impressive edifices of software. And if those applications tower over their synchronous, serial predecessors, it will be because they stand on the shoulders of giants.

QuickTime Player's new icon (Not a fan)

QuickTime Player

There's been some confusion surrounding QuickTime in Snow Leopard. The earlier section about QuickTime X explains what you need to know about the present and future of QuickTime as a technology and an API. But a few of Apple's decisions—and the extremely overloaded meaning of the word "QuickTime" in the minds of consumers—have blurred the picture somewhat.

The first head-scratcher occurs during installation. If you happen to click on the "Customize…" button during installation, you'll see the following options:

We've already talked about Rosetta being an optional install, but QuickTime 7 too? Isn't QuickTime severely crippled without QuickTime 7? Why in the world would that be an optional install?

Well, there's no need to panic. That item in the installer should actually read "QuickTime Player 7." QuickTime 7, the old but extremely capable media framework discussed earlier, is installed by default in Snow Leopard—in fact, it's mandatory. But the player application, the one with the old blue "Q" icon, the one that many casual users actually think of as being "QuickTime," that's been replaced with a new QuickTime-X-savvy version sporting a pudgy new icon (see above right).

The new player application is a big departure from the old. Obviously, it leverages QuickTime X for more efficient video playback, but the user interface is also completely new. Gone are the gray border and bottom-mounted playback controls from the old QuickTime Player, replaced by a frameless window with a black title bar and a floating, moveable set of controls.

The new QuickTime Player: boldly going where NicePlayer has gone before

It's like a combination of the window treatment of the excellent NicePlayer application and the full-screen playback controls from the old QuickTime Player. I'm a bit bothered by two things. First, the ever-so-slightly clipped corners seem like a bad idea. Am I just supposed to give up those dozen-or-so pixels? NicePlayer does it right, showing crisp, square corners.

Second, the floating playback controls obscure the movie. What if I'm scrubbing around looking for something in that part of the frame? Yes, you can move the controls, but what if I'm looking for something in an unknown location in the frame? Also, the title bar obscures an entire swath of the top of the frame, and this can't be moved. I appreciate the compactness of this approach, but it'd be nice if the title bar overlap could be disabled and the controls could be dragged off the movie entirely and docked to the bottom or something.

(One blessing for people who share my OCD tendencies: if you move the floating controls, they don't remember their position the next time you open a movie. Why is that a blessing? Because if it worked the other way, we'd all spend way too much time fretting about our inability to restore the controller to its default, precisely centered position. Sad, but true.)

The new QuickTime Player presents a decidedly iMovie-like (or is it iPhone-like, nowadays?) interface for trimming video. Still-frame thumbnails are placed side-by-side to form a timeline, with adjustable stops at each end for trimming.

Holding down the option key changes from a thumbnail timeline to an audio waveform display:

In both the video and audio cases, I have to wonder exactly how useful the fancy timeline appearances are. The audio waveform is quite small and compressed, and the limited horizontal space of the in-window display means a movie can only show a handful of video frames in its timeline. Also, if there's any ability to do fine adjustments using something other than extremely careful mouse movements (which are necessarily subject to a limited resolution) then I couldn't find it. Final Cut Pro this is not.

QuickTime Player has learned another new trick: screen recording. The controls are limited, so more demanding users will still have a need for a full-featured screen recorder, but QuickTime Player gets the job done.

There's also an audio-only option, with a similarly simplified collection of settings.

Finally, the new QuickTime Player has the ability to upload a movie directly to YouTube and MobileMe, send one via e-mail, or add it to your iTunes library. The export options are also vastly simplified, with preset options for iPhone/iPod, Apple TV, and HD 480p and 720p.

Unfortunately, the list of things you can't do with the new QuickTime Player is quite long. You can't cut, copy, and paste arbitrary portions of a movie (trimming only affects the ends); you can't extract or delete individual tracks or overlay one track onto another (optionally scaling to fit); you can't export a movie by choosing from the full set of available QuickTime audio and video codecs. All of these things were possible with the old QuickTime Player—if, that is, you paid the $30 for a QuickTime Pro license. In the past, I've described this extra fee as "criminally stupid", but the features it enabled in QuickTime Player were really useful.

It's tempting to attribute their absence in the new QuickTime Player to the previously discussed limitations of QuickTime X. But the new QuickTime Player is built on top of QTKit, which serves as a front-end for both QuickTime X and QuickTime 7. And it does, after all, feature some limited editing features like trimming, plus some previously "Pro"-only features like full-screen playback. Also, the new QuickTime Player can indeed play movies using third-party plug-ins—a feature clearly powered by QuickTime 7.

Well, Snow Leopard has an extremely pleasant surprise waiting for you if you install the optional QuickTime Player 7. When I did so, what I got was the old QuickTime Player—somewhat insultingly installed in the "Utilities" folder—with all of its "Pro" features permanently unlocked. Yes, the tyranny of QuickTime Pro seems to be at an end…

…but perhaps the key word above is "seems," because QuickTime Player 7 does not have all "pro" features unlocked for everyone. I installed Snow Leopard onto an empty disk, and QuickTime 7 was not automatically installed (as it is when the installer detects an existing QuickTime Pro license on the target disk). After booting from my fresh Snow Leopard volume, I manually installed the "QuickTime 7" optional component using the Snow Leopard installer disk.

The result for me was a QuickTime Player 7 application with all pro features unlocked and with no visible QuickTime Pro registration information. I did, however, have a QuickTime Pro license on one of the attached drives. Apparently, the installer detected this and gave me an unlocked QuickTime Player 7 application, even though the boot volume never had a QuickTime Pro license on it.

The Dock

The new appearance of some aspects of the Dock are accompanied by some new functionality as well. Clicking and holding on a running application's Dock icon now triggers Expos?, but only for the windows belonging to that application. Dragging a file onto a docked application icon and holding it there for a bit produces the same result. You can then continue that same drag onto one of the Exposé window thumbnails and hover there a bit to bring that window to the front and drop the file into it. It's a pretty handy technique, once you get in the habit of doing it.

The Exposé display itself is also changed. Now, minimized windows are displayed in smaller form on the bottom of the screen below a thin line.

Dock Exposé with new placement of minimized windows

In the screenshot above, you'll notice that none of the minimized windows appear in my Dock. That's thanks to another welcome addition: the ability to minimize windows "into" the application icon. You'll find the setting for this in the Dock's preference pane.

New Dock preference: Minimize windows into application icon

Minimized windows in a Dock application menu

Minimized window
denoted by a diamond

Once set, minimized windows will slip behind the icon of their parent application and then disappear. To get them back, either right-click the application icon (see right) or trigger Exposé.

The Dock's grid view for folders now incorporates a scroll bar when there are too many items to fit comfortably. Clicking on a folder icon in the grid now shows that folder's contents within the grid, allowing you to navigate down several folders to find a buried item. A small "back" navigation button appears once you descend.

These are all useful new behaviors, and quite a bonus considering the supposed "no new features" stance of Snow Leopard. But the fundamental nature of the Dock remains the same. Users who want a more flexible or more powerful application launcher/folder organizer/window minimization system must still either sacrifice some functionality (e.g., Dock icon badges and bounce notifications) or continue to use the Dock in addition to a third-party application.

The option to keep minimized windows from cluttering up the Dock was long overdue. But my enthusiasm is tempered by my frustration at the continued inability to click on a docked folder and have it open in the Finder, while also retaining the ability to drag items into that folder. This was the default behavior for docked folders for the first six years of Mac OS X's life, but it changed in Leopard. Snow Leopard does not improve matters.

Docking an alias to a folder provides the single-click-open behavior, but items cannot be dragged into a docked folder alias for some inexplicable reason. (Radar 5775786, closed in March 2008 with the terse explanation, "not currently supported.") Worse, dragging an item to a docked folder alias looks like it will work (the icon highlights) but upon release, the dragged item simply springs back to its original location. I really hoped this one would get fixed in Snow Leopard. No such luck.

Dock grid view's in-place navigation with back button

The Finder

One of the earliest leaked screenshots of Snow Leopard included an innocuous-looking "Get Info…" window for the Finder, presumably to show that its version number had been updated to 10.6. The more interesting tidbit of information it revealed was that the Finder in Snow Leopard was a 64-bit application.

The Mac OS X Finder started its life as the designated "dog food" application for the Carbon backward-compatibility API for Mac OS X. Over the years, the Finder has been a frequent target of dissatisfaction and scorn. Those bad feelings frequently spilled over into the parallel debate over API supremacy: Carbon vs. Cocoa.

"The Finder sucks because it's a Carbon app. What we need is a Cocoa Finder! Surely that will solve all our woes." Well, Snow Leopard features a 64-bit Finder, and as we all know, Carbon was not ported to 64-bit. Et voila! A Cocoa Finder in Snow Leopard. (More on the woes in a bit.)

The conversion to Cocoa followed the Snow Leopard formula: no new features… except for maybe one or two. And so, the "new" Cocoa Finder looks and works almost exactly like the old Carbon Finder. The biggest indicator of its "Cocoa-ness" is the extensive use of Core Animation transitions. For example, when a Finder window does its schizophrenic transformation from a sidebar-bedecked browser window to its minimally-adorned form, it no longer happens in a blink. Instead, the sidebar slides away and fades, the toolbar shrinks, and everything tucks in to form its new shape.

Despite crossing the line in a few cases, the Core Animation transitions do make the application feel more polished, and yes, "more Cocoa." And presumably the use of Cocoa made it so darn easy to add features that the developers just couldn't resist throwing in a few.

The number-one feature request from heavy column-view users has finally been implemented: sortable columns. The sort order applies to all columns at once, which isn't as nice as per-column sorting, but it's much better than nothing at all. The sort order can be set using a menu command (each of which has a keyboard shortcut) or by right-clicking in an unoccupied area of a column and selecting from the resulting context menu.

Even the lowly icon view has been enhanced in Snow Leopard. Every icon-view window now includes a small slider to control the size of the icons.

The Finder's icon view with its new slider control

This may seem a bit odd—how often do people change icon sizes?—but it makes much more sense in the context of previewing images in the Finder. This use case is made even more relevant by the recent expansion of the maximum icon size to 512x512 pixels.

The icon previews themselves have been enhanced to better match the abilities available in Quick Look. Put it all together and you can smoothly zoom a small PDF icon, for example, into the impressively high-fidelity preview shown below, complete with the ability to turn pages. One press of the space bar and you'll progress to the even larger and more flexible Quick Look view. It's a pretty smooth experience.

Not your father's icon: 512x512 pixels of multi-page PDF previewing

QuickTime previews have been similarly enhanced. As you zoom in on the icon, it transforms into a miniature movie player, adorned with an odd circular progress indicator. Assuming users are willing to wrangle with the vagaries of the Finder's view settings successfully enough to get icon view to stick for the windows where it's most useful, I think that odd little slider is actually going to get a lot of use.

The Finder's QuickTime preview. (The "glare" overlay is a bit much.)

List view also has a few enhancements—accidental, incidental, or otherwise. The drag area for each list view item now spans the entire line. In Leopard, though the entire line was highlighted, only the file name or icon portion could be dragged. Trying to drag anywhere else just extended the selection to other items in the list view as the cursor was moved. I'm not sure whether this change in behavior is intentional or if it's just an unexamined consequence of the underlying control used for list view in the new Cocoa Finder. Either way, thumbs up.

Double-clicking on the dividing line between two column headers in list view will "right-size" that column. For most columns, this means expanding or shrinking to minimally fit the widest value in the column. Date headers will progressively shrink to show less verbose date formats. Supposedly, this worked intermittently in Leopard as well. But whether Cocoa is bringing this feature for the first time or is just making it work correctly for the first time, it's a change for the better.

Searching using the Finder's browser view is greatly improved by the implementation of one of those little things that many users have been clamoring for year after year. There's now a preference to select the default scope of the search field in the Finder window toolbar. Can I get an amen?

Default Finder search location: configurable at last.

Along similar lines, there are other long-desired enhancements that will go a long way towards making the desktop environment feel more solid. A good example is the improved handling of the dreaded "cannot eject, disk in use" error. The obvious follow-up question from the user is, "Okay, so what's using it?" Snow Leopard finally provides that information.

(Yes, Mac OS X will refuse to eject a disk if your current working directory in a command-line shell is on that disk. Kind of cool, but also kind of annoying.)

Another possible user response to a disk-in-use error is, "I don't care. I'm in a hurry. Just eject it!" That's an option now as well.

Hm, but why did I get information about the offending application in one dialog, an option to force ejection in the other, but neither one presented both choices? It's a mystery to me, but presumably it's related to exactly what information the Finder has about the contention for the disk. (As always, the lsof command is available if you want to figure it out the old-fashioned way.)

So does the new Cocoa Finder finally banish all of those embarrassing bugs from the bad-old days of Carbon? Not quite. This is essentially the "1.0" release of the Cocoa Finder, and it has its share of 1.0 bugs. Here's one discovered by Glen Aspeslagh (see image right).

Do you see it? If not, look closer at the order of the dates in the supposedly sorted "Date Modified" column. So yeah, that old Finder magic has not been entirely extinguished.

There also remains some weirdness in the operation of the icon grid. In a view where grid snap is turned on (or is enabled transiently by holding down the command key during a drag) icons seem terrified of each other, leaving huge distances between themselves and their neighbors when they select which grid spot to snap to. It's as if the Finder lives in mortal fear that one of these files will someday get a 200-character filename that will overlap with a neighboring file's name.

The worst incarnation of this behavior happens along the right edge of the screen where mounted volumes appear on the desktop. (Incidentally, this is not the default; if you want to see disks on your desktop, you must enable this preference in the Finder.) When I mount a new disk, I'm often surprised to see where it ends up appearing. If there are any icons remotely close to the right edge of the screen, the disk icon will refuse to appear there. Again, the Finder is not avoiding any actual name or icon overlapping. It appears to be avoiding the mere possibility of overlapping at some unspecified point in the future. Silly.

Finder report card

Overall, the Snow Leopard Finder takes several significant steps forward—64-bit/Cocoa future-proofing, a few new, useful features, added polish—and only a few shuffles backwards with the slight overuse of animation and the continued presence of some puzzling bugs. Considering how long it took the Carbon Finder to get to its pre-Snow-Leopard feature set and level of polish, it's quite an achievement for a Cocoa Finder to match or exceed its predecessor in its very first release. I'm sure the Carbon vs. Cocoa warriors would have had a field day with that statement, were Carbon not put out to pasture in Leopard. But it was, and to the victor go the spoils.

Exchange

Snow Leopard's headline "one new feature" is support for Microsoft Exchange. This appears to be, at least partially, yet another hand-me-down from the iPhone, which gained support for Exchange in its 2.0 release and expanded on it in 3.0. Snow Leopard's Exchange support is weaved throughout the expected crop of applications in Mac OS X: iCal, Mail, and Address Book.

The big caveat is that it will only work with a server running Exchange 2007 (Service Pack 1, Update Rollup 4) or later. While I'm sure Microsoft greatly appreciates any additional upgrade revenue this decision provides, it means that for users whose workplaces are still running older versions of Exchange, Snow Leopard's "Exchange support" might as well not exist.

Those users are probably already running the only other viable Mac OS X Exchange client, Microsoft Entourage, so they'll likely just sit tight and wait for their IT departments to upgrade. Meanwhile, Microsoft is already making overtures to these users with the promised creation—finally—of an honest-to-goodness version of Outlook for Mac OS X.

In my admittedly brief testing, Snow Leopard's Exchange support seems to work as expected. I had to have one of the Microsoft mavens in the Ars Orbiting HQ spin up an Exchange 2007 server just for the purposes of this review. However it was configured, all I had to enter in the Mail application was my full name, e-mail address, and password, and it automatically discovered all relevant settings and configured iCal and Address Book for me.

Windows users are no doubt accustomed to this kind of Exchange integration, but it's the first time I've seen it on the Mac platform—and that includes my many years of using Entourage.

Access to Exchange-related features is decidedly subdued, in keeping with the existing interfaces for Mail, iCal, and Address Book. If you're expecting the swarm of panels and toolbar buttons found in Outlook on Windows, you're in for a bit of a shock. For example, here's the "detail" view of a meeting in iCal.

Clicking the "edit" button hardly reveals more.

The "availability" window also includes the bare minimum number of controls and displays to get the job done.

The integration into Mail and Address Book is even more subtle—almost entirely transparent. This is to be construed as a feature, I suppose. But though I don't know enough about Exchange to be completely sure, I can't shake the feeling that there are Exchange features that remain inaccessible from Mac OS X clients. For example, how do I book a "resource" in a meeting? If there's a way to do so, I couldn't discover it.

Still, even basic Exchange integration out-of-the-box goes long way towards making Mac OS X more welcome in corporate environments. It remains to be seen how convinced IT managers are of the "realness" of Snow Leopard's Exchange integration. But I've got to think that being able to send and receive mail, create and respond to meeting invitations, and use the global corporate address book is enough for any Mac user to get along reasonably well in an Exchange-centric environment.

Performance

The thing is, there's not really much to say about performance in Snow Leopard. Dozens of benchmark graphs lead to the same simple conclusion: Snow Leopard is faster than Leopard. Not shockingly so, at least in the aggregate, but it's faster. And while isolating one particular subsystem with a micro-benchmark may reveal some impressive numbers, it's the way these small changes combine to improve the real-world experience of using the system that really makes a difference.

One example Apple gave at WWDC was making an initial Time Machine backup over the network to a Time Capsule. Apple's approach to optimizing this operation was to address each and every subsystem involved.

Time Machine itself was given support for overlapping i/o. Spotlight indexing, which happens on Time Machine volumes as well, was identified as another time-consuming task involved in backups, so its performance was improved. The networking code was enhanced to take advantage of hardware-accelerated checksums where possible, and the software checksum code was hand-tuned for maximum performance. The performance of HFS+ journaling, which accompanies each file system metadata update, was also improved. For Time Machine backups that write to disk images rather than native HFS+ file systems, Apple added support for concurrent access to disk images. The amount of network traffic produced by AFP during backups has also been reduced.

All of this adds up to a respectable 55% overall improvement in the speed of an initial Time Machine backup. And, of course, the performance improvements to the individual subsystems benefit all applications that use them, not just Time Machine.

This holistic approach to performance improvement is not likely to knock anyone's socks off, but every time you run across a piece of functionality in Snow Leopard that disproportionately benefits from one of these optimized subsystems, it's a pleasure.

For example, Snow Leopard shuts down and restarts much faster than Leopard. I'm not talking about boot time; I mean the time between the selection of the Shutdown or Restart command and when the system turns off or begins its new boot cycle. Leopard doesn't take long at all to do this; only a few dozen of seconds when there are no applications open. But in Snow Leopard, it's so fast that I often thought the operating system had crashed rather than shut down cleanly. (That's actually not too far from the truth.)

The performance boosts offered by earlier major releases of Mac OS X still dwarf Snow Leopard's speedup, but that's mostly because Mac OS X was so excruciatingly sluggish in its early years. It's easy to create a big performance delta when you're starting from something abysmally slow. The fact that Snow Leopard achieves consistent, measurable improvements over the already-speedy Leopard is all the more impressive.

And yes, for the seventh consecutive time, a new release of Mac OS X is faster on the same hardware than its predecessor. (And for the first time ever, it's smaller, too.) What more can you ask for, really? Even that old performance bugaboo, window resizing, has been completely vanquished. Grab the corner of a fully-populated iCal window—the worst-case scenario for window resizing in the old days—and shake it as fast as you can. Your cursor will never be more than a few millimeters from the window's grab handle; it tracks your frantic motion perfectly. On most Macs, this is actually true in Leopard as well. It just goes to show how far Mac OS X has come on the performance front. These days, we all just take it for granted, which is exactly the way it should be.

Grab bag

In the "grab bag" section, I usually examine smaller, mostly unrelated features that don't warrant full-blown sections of their own. But when it comes to user-visible features, Snow Leopard is kind of "all grab bag," if you know what I mean. Apple's even got its own incarnation in the form of a giant webpage of "refinements." I'll probably overlap with some of those, but there'll be a few new ones here as well.

New columns in open/save dialogs

The list view in open and save dialog boxed now supports more than just "Name" and "Date Modified" columns. Right-click on any column to get a choice of additional columns to display. I've wanted this feature for a long time, and I'm glad someone finally had time to implement it.

Configurable columns in open/save dialogs

Improved scanner support

The bundled Image Capture application now has the ability to talk to a wide range of scanners. I plugged in my Epson Stylus CX7800, a device that previously required the use of third-party software in order to use the scanning feature, and Image Capture detected it immediately.

Epson scanner + Image Capture - Epson software

Image Capture is also not a bad little scanning application. It has pretty good automatic object detection, including support for multiple objects, obviating the need to manually crop items. Given the sometimes-questionable quality of third-party printer and scanner drivers for Mac OS X, the ability to use a bundled application is welcome.

System Preferences bit wars

System Preferences, like virtually all other applications in Snow Leopard, is 64-bit. But since 64-bit applications can't load 32-bit plug-ins, that presents a problem for the existing crop of 32-bit third-party preference panes. System Preferences handles this situation with a reasonable amount of grace. On launch, it will display icons for all installed preference panes, 64-bit or 32-bit. But if you click on a 32-bit preference pane, you'll be presented with a notification like this:

64-bit application vs. 32-bit plug-in: fight!

Click "OK" and System Preferences will relaunch in 32-bit mode, which is conveniently indicated in the title bar. Since all of the first-party preference panes are compiled for both 64-bit and 32-bit operation, System Preferences does not need to relaunch again for the duration of its use. This raises the question, why not have System Preferences launch in 32-bit mode all the time? I suspect it's just another way for Apple to "encourage" developers to build 64-bit-compatible binaries.

Safari plug-ins

The inability of of 64-bit applications load 32-bit plug-ins is a problem for Safari as well. Plug-ins are so important to the Web experience that relaunching in 32-bit mode is not really an option. You'd probably need to relaunch as soon as you visited your first webpage. But Apple does want Safari to run in 64-bit mode due to some significant performance enhancements in the JavaScript engine and other areas of the application that are not available in 32-bit mode.

Apple's solution is similar to what it did with QuickTime X and 32-bit QuickTime 7 plug-ins. Safari will run 32-bit plug-ins in separate 32-bit processes as needed.

Separate processes for 32-bit Safari plug-ins

This has the added, extremely significant benefit of isolating potentially buggy plug-ins. According to the automated crash reporting built into Mac OS X, Apple has said that the number one cause of crashes is Web browser plug-ins. That's not the number one cause of crashes in Safari, mind you, it's the number one cause when considering all crashes of all applications in Mac OS X. (And though it was not mentioned by name, I think we all know the primary culprit.)

As you can see above, the QuickTime browser plug-in gets the same treatment as Flash and other third-party 32-bit Safari plug-ins. All of this means that when a plug-in crashes, Safari in Snow Leopard does not. The window or tab containing the crashing plug-in doesn't even close. You can simply click the reload button and give the problematic plug-in another chance to function correctly.

While this is still far from the much more robust approach employed by Google Chrome, where each tab lives in its own independent process, if Apple's crash statistics are to be believed, isolating plug-ins may generate most of the benefit of truly separate processes with a significantly less radical change to the Safari application itself.

Resolution independence

When we last left Mac OS X in its seemingly interminable march towards a truly scalable user interface, it was almost ready for prime time. I'm sad to say that resolution independence was obviously not a priority in Snow Leopard, because it hasn't gotten any better, and may have actually regressed a bit. Here's what TextEdit looks like at a 2.0 scale factor in Leopard and Snow Leopard.

TextEdit at scale factor 2.0 in Snow Leopard

Yep, it's a bummer. I still remember Apple advising developers to have their applications ready for resolution independence by 2008. That's one of the few dates that the Jobs-II-era Apple has not been able to hit, and it's getting later all the time. On the other hand, it's not like 200-DPI monitors are raining from the sky either. But I'd really like to see Apple get going on this. It will undoubtedly take a long time for everything to look and work correctly, so let's get started.

Terminal splitters

The Terminal application in Tiger and earlier versions of Mac OS X allowed each of its windows to be split horizontally into two separate panes. This was invaluable for referencing some earlier text in the scrollback while also typing commands at the prompt. Sadly, the splitter feature disappeared in Leopard. In Snow Leopard, it's back with a vengeance.

(Now if only my favorite text editor would get on board the train to splittersville.)

Terminal in Snow Leopard also defaults to the new Menlo font. But contrary to earlier reports, the One True Monospaced Font, Monaco, is most definitely still included in Snow Leopard (see screenshot above) and it works just fine.

System Preferences shuffle

The seemingly obligatory rearrangement of preference panes in the System Preferences application accompanying each release of Mac OS X continues in Snow Leopard.

System Preferences (not running) with Dock menu

This time, the "Keyboard & Mouse" preference pane is split into separate "Keyboard" and "Mouse" panes, "International" becomes "Language & Text," and the "Internet & Network" section becomes "Internet & Wireless" and adopts the Bluetooth preference pane.

Someday in the distant future, perhaps Apple will finally arrive at the "ultimate" arrangement of preference panes and we can all finally go more than two years without our muscle memory being disrupted.

Before moving on, System Preferences has one neat trick. You can launch directly into a specific preference pane by right-clicking on System Preferences's Dock icon. This works even when System Preferences is not yet running. Kind of creepy, but useful.

Core location

One more gift from the iPhone, Core Location, allows Macs to figure out where in the world they are. The "Date & Time" preference pane offers to set your time zone automatically based on your current location using this newfound ability.

Keyboard magic

Snow Leopard includes a simple facility for system-wide text auto-correction and expansion, accessible from the "Language & Text" preference pane. It's not quite ready to give a dedicated third-party application a run for its money, but hey, it's free.

Global text expansion and auto-correction

The keyboard shortcuts preference pane has also been rearranged. Now, instead of a single, long list of system-wide keyboard shortcuts, they're arranged into categories. This reduces clutter, but it also makes it a bit more difficult to find the shortcut you're interested in.

The sleeping Mac dilemma

I don't like to leave my Mac Pro turned on 24 hours a day, especially during the summer in my un-air-conditioned house. But I do want to have access to the files on my Mac when I'm elsewhere—at work, on the road, etc. It is possible to wake a sleeping Mac remotely, but doing so requires being on the same local network.

My solution has been to leave a smaller, more power-efficient laptop on at all times on the same network as my Mac Pro. To wake my Mac Pro remotely, I ssh into the laptop, then send the magic "wake up" packet to my Mac Pro. (For this to work, the "Wake for Ethernet network administrator access" checkbox must be checked in the "Energy Saver" preference pane in System Preferences.)

Snow Leopard provides a way to do this without leaving any of my computers running all day. When a Mac running Snow Leopard is put to sleep, it attempts to hand off ownership of its IP address to its router. (This only works with an AirPort Extreme base station from 2007 or later, or a Time Capsule from 2008 or later with the latest (7.4.2) firmware installed.) The router then listens for any attempt to connect to the IP address. When one occurs, it wakes up the original owner, hands back the IP address, and forwards traffic appropriately.

You can even wake some recent-model Macs over WiFi. Combined with MobileMe's "Back to My Mac" dynamic DNS thingamabob, it means I can leave all my Macs asleep and still have access to their contents anytime, anywhere.

Back to my hack

As has become traditional, this new release of Mac OS X makes life a bit harder for developers whose software works by patching the in-memory representation of other running applications or the operating system itself. This includes Input Managers, SIMBL plug-ins, and of course the dreaded "Haxies."

Input Managers get the worst of it. They've actually been unsupported and non-functional in 64-bit applications since Leopard. That wasn't such a big deal when Mac OS X shipped with a whopping two 64-bit applications. But now, with almost every application in Snow Leopard going 64-bit, it's suddenly very significant.

Thanks to Safari's lack of an officially sanctioned extension mechanism, developers looking to enhance its functionality have most often resorted to the use of Input Managers and SIMBL (which is an Input-Manager-based framework). A 64-bit Safari puts a damper on that entire market. Though it is possible to manually set Safari to launch in 32-bit mode—Get Info on the application in the Finder and click a checkbox—ideally, this is not something developers want to force users to do.

Happily, at least one commonly used Safari enhancement has the good fortune to be built on top of the officially supported browser plug-in API used by Flash, QuickTime, etc. But that may not be a feasible approach for Safari extensions that enhance functionality in ways not tied directly to the display of particular types of content within a webpage.

Though I plan to run Safari in its default 64-bit mode, I'll really miss Saft, a Safari extension I use for session restoration (yes, I know Safari has this feature, but it's activated manually—the horror) and address bar shortcuts (e.g., "w noodles" to look up "noodles" in Wikipedia). I'm hoping that clever developers will find a way to overcome this new challenge. They always seem to, in the end. (Or Apple could add a proper extension system to Safari, of course. But I'm not holding my breath.)

As for the Haxies, those usually break with each major operating system update as a matter of course. And each time, those determined fellows at Unsanity, against all odds, manage to keep their software working. I salute them for their effort. I delayed upgrading to Leopard for a long time based solely on the absence of my beloved WindowShade X. I hope I don't have to wait too long for a Snow-Leopard-compatible version.

The general trend in Mac OS X is away from any sort of involuntary memory space sharing, and towards "external" plug-ins that live in their own, separate processes. Even contextual menu plug-ins in the Finder have been disabled, replaced by an enhanced, but still less-powerful Services API. Again, I have faith that developers will adapt. But the waiting is the hardest part.

ZFS MIA

It looks like we'll all be waiting a while longer for a file system in shining armor to replace the venerable HFS+ (11 years young!) as the default file system in Mac OS X. Despite rumors, outright declarations, and much actual pre-release code, support for the impressive ZFS file system is not present in Snow Leopard.

That's a shame because Time Machine veritably cries out for some ZFS magic. What's more, Apple seems to agree, as evidenced by a post from an Apple employee to a ZFS mailing list last year. When asked about a ZFS-savvy implementation of Time Machine, the reply was encouraging: "This one is important and likely will come sometime, but not for SL." ("SL" is short for Snow Leopard.)

There are many reasons why ZFS (or a file system with similar features) is a perfect fit for Time Machine, but the most important is its ability to send only the block-level changes during each backup. As Time Machine is currently implemented, if you make a small change to a giant file, the entire giant file is copied to the Time Machine volume during the next backup. This is extremely wasteful and time consuming, especially for large files that are modified constantly during the day (e.g., Entourage's e-mail database). Time Machine running on top of ZFS could transfer just the changed disk blocks (a maximum of 128KB each in ZFS, and usually much smaller).

ZFS would also bring vastly increased robustness for data and metadata, a pooled storage model, constant-time snapshots and clones, and a pony. People sometimes ask what, exactly, is wrong with HFS+. Aside from its obvious lack of the features just listed, HFS+ is limited in many ways by its dated design, which is based on HFS, a twenty-five year-old file system.

To give just one example, the centrally located Catalog File, which must be updated for each change to the file system's structure, is a frequent and inevitable source of contention. Modern file systems usually spread their metadata around, both for robustness (multiple copies are often kept in separate locations on the disk) and to allow for better concurrency.

Practically speaking, think about those times when you run Disk Utility on an HFS+ volume and it finds (and hopefully repairs) a bunch of errors. That's bad, okay? That's something that should not happen with a modern, thoroughly checksummed, always-consistent-on-disk file system unless there are hardware problems (and a ZFS storage pool can actually deal with that as well). And yet it happens all the time with HFS+ disks in Mac OS X when various bits of metadata get corrupted or become out of date.

Apple gets by year after year, tacking new features onto HFS+ with duct tape and a prayer, but at a certain point there simply has to be a successor—whether it's ZFS, a home-grown Apple file system, or something else entirely. My fingers are crossed for Mac OS X 10.7.

The future soon

Creating an operating system is as much a social exercise as a technological one. Creating a platform, even more so. All of Snow Leopard's considerable technical achievements are not just designed to benefit users; they're also intended to goad, persuade, and otherwise herd developers in the direction that Apple feels will be most beneficial for the future of the platform.

For this to work, Snow Leopard has to actually find its way into the hands of customers. The pricing helps a lot there. But even if Snow Leopard were free, there's still some cost to the consumer—in time, worry, software updates, etc.—when performing a major operating system upgrade. The same goes for developers who must, at the very least, certify that their existing applications run correctly on the new OS.

The usual way to overcome this kind of upgrade hesitation has been to pack the OS with new features. New features sell, and the more copies of the new operating system in use, the more motivated developers are to update their applications to not just run on the new OS, but also take advantage of its new abilities.

A major operating system upgrade with "no new features" must play by a different set of rules. Every party involved expects some counterbalance to the lack of new features. In Snow Leopard, developers stand to reap the biggest benefits thanks to an impressive set of new technologies, many of which cover areas previously unaddressed in Mac OS X. Apple clearly feels that the future of the platform depends on much better utilization of computing resources, and is doing everything it can to make it easy for developers to move in this direction.

Though it's obvious that Snow Leopard includes fewer external features than its predecessor, I'd wager that it has just as many, if not more internal changes than Leopard. This, I fear, means that the initial release of Snow Leopard will likely suffer the typical 10.x.0 bugs. There have already been reports of new bugs introduced to existing APIs in Snow Leopard. This is the exact opposite of Snow Leopard's implied promise to users and developers that it would concentrate on making existing features faster and more robust without introducing new functionality and the accompanying new bugs.

On the other side of the coin, I imagine all the teams at Apple that worked on Snow Leopard absolutely reveled in the opportunity to polish their particular subsystems without being burdened by supporting the marketing-driven feature-of-the-month. In any long-lived software product, there needs to be this kind of release valve every few years, lest the entire code base go off into the weeds.

There's been one other "no new features" release of Mac OS X. Mac OS X 10.1, released a mere six months after version 10.0, was handed out for free by Apple at the 2001 Seybold publishing conference and, later, at Apple retail stores. It was also available from Apple's online store for $19.95 (along with a copy of Mac OS 9.2.1 for use in the Classic environment). This was a different time for Mac OS X. Versions 10.0 and 10.1 were slow, incomplete, and extremely immature; the transition from classic Mac OS was far from over.

Judged as a modern incarnation of the 10.1 release, Snow Leopard looks pretty darned good. The pricing is similar, and the benefits—to developers and to users—are greater. So is the risk. But again, that has more to do with how horrible Mac OS X 10.0 was. Choosing not to upgrade to 10.1 was unthinkable. Waiting a while to upgrade to Snow Leopard is reasonable if you want to be sure that all the software you care about is compatible. But don't wait too long, because at $29 for the upgrade, I expect Snow Leopard adoption to be quite rapid. Software that will run only on Snow Leopard may be here before you know it.

Should you buy Mac OS X Snow Leopard? If you're already running Leopard, then the answer is a resounding "yes." If you're still running Tiger, well, then it's probably time for a new Mac anyway. When you buy one, it'll come with Snow Leopard.