With this payoff as incentive, a spectrum of methods has been devised to capture the repertoire of expressive things we do with our hands, from the reasonably straightforward to the arcane and highly computationally intensive attempts to infer the meaning of user gestures from video. Some of the more practical rely on RFID-instrumented gloves or jewelry to capture gesture; others depend on the body's own inherent capacitance. Startup Tactiva even offers PC users something called TactaPad, a two-handed touchpad that projects representations of the user's hands on the screen, affording a curious kind of immersion in the virtual space of the desktop.
With the pace of real-world development being what it is, this category of interface seems to be on the verge of widespread adoption. Nevertheless, many complications and pitfalls remain for the unwary.
For example, at a recent convention of cartographers, the technology research group Applied Minds demonstrated a gestural interface to geographic information systems. To zoom in on the map, you place your hands on the map surface at the desired spot and simply spread them apart. It's an appealing representation: immediately recognizable and memorable, intuitive, transparent. It makes sense. Once you've done it, you'll never forget it.
Or so Applied Minds would have you believe. If only gesture were this simple! The association that Applied Minds suggests between a gesture of spreading one's hands and zooming in to a map surface is culturally specific, as arbitrary as any other. Why should spreading not zoom out instead? It's just as defensibly natural, just as "intuitive" a signifier of moving outward as inward. For that matter, many a joke has turned on the fact that certain everyday gestures, utterly unremarkable in one culture—the thumbs-up, the peace sign, the "OK" sign—are vile obscenities in another.
What matters, of course, is not that one particular system may do something idiosyncratically: Anything simple can probably be memorized and associated with a given task with a minimum of effort. The problem emerges when the different systems one is exposed to do things in different ways: when the map at home zooms in if you spread your hands, but the map in your car zooms out.
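To make the arbitrariness concrete, here is a minimal sketch, with purely illustrative names and structure, of how a spread gesture might be bound to zoom. The direction of the mapping is treated as an explicit, shareable preference rather than something baked into any one device.

```python
# Hypothetical sketch: the binding between a two-handed "spread" gesture and a
# zoom direction treated as an explicit convention rather than a hard-coded
# assumption. Names and structure are illustrative only.

from dataclasses import dataclass

@dataclass
class GestureFrame:
    """Positions of the two hands on the map surface, in surface units."""
    left_x: float
    right_x: float

def spread_distance(frame: GestureFrame) -> float:
    return abs(frame.right_x - frame.left_x)

def zoom_factor(start: GestureFrame, end: GestureFrame,
                spread_means_zoom_in: bool = True) -> float:
    """Return a multiplicative zoom factor for a spread/pinch gesture.

    The sign of the mapping is a convention: one device may treat spreading
    the hands as zooming in, another as zooming out. Exposing it as a flag
    (or, better, a preference shared across devices) keeps the map at home
    and the map in the car from contradicting each other.
    """
    ratio = spread_distance(end) / max(spread_distance(start), 1e-6)
    return ratio if spread_means_zoom_in else 1.0 / ratio

# Example: hands move from 10 units apart to 20 units apart.
start = GestureFrame(left_x=45.0, right_x=55.0)
end = GestureFrame(left_x=40.0, right_x=60.0)
print(zoom_factor(start, end))                              # 2.0 (zoom in)
print(zoom_factor(start, end, spread_means_zoom_in=False))  # 0.5 (zoom out)
```

Nothing in the sketch makes one direction more "natural" than the other; whatever consistency the user experiences comes from sharing the setting across systems, not from the gesture itself.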
The final category of new interfaces in everyware concerns something still less tangible than gesture: interactions that use the audio channel. This includes voice-recognition input, machine-synthesized speech output, and the use of "earcons," or auditory icons.
The latter, recognizable tones associated with system events, assume new importance in everyware, although they're also employed in the desktop setting. (Both Mac OS and Windows machines can play earcons—on emptying the Trash, for example.) They potentially serve to address one of the concerns raised by the Bellotti paper previously referenced: Used judiciously, they can function as subtle indicators that a system has received, and is properly acting on, some input.
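A minimal sketch may help fix the idea; the event names and the audio facility below are assumptions, not any particular platform's API.

```python
# Hypothetical sketch of earcons as acknowledgment: each system event is bound
# to a short, recognizable tone played the moment the input is accepted, so the
# user knows the system heard them even when no screen is in view.

EARCONS = {
    "command.received": "tick.wav",    # input heard
    "command.completed": "chime.wav",  # action carried out
    "command.failed": "buzz.wav",      # something went wrong
}

def play_earcon(path: str) -> None:
    # Stand-in for whatever audio facility the platform actually provides.
    print(f"[audio] playing {path}")

def execute(command: str) -> None:
    # Placeholder for the action the system takes in response.
    print(f"executing: {command}")

def handle_input(command: str) -> None:
    play_earcon(EARCONS["command.received"])  # subtle "I heard you"
    try:
        execute(command)
    except Exception:
        play_earcon(EARCONS["command.failed"])
    else:
        play_earcon(EARCONS["command.completed"])

handle_input("lights off")
```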
Spoken notifications, too, are useful in situations where the user's attention is diverted by events in the visual field or by other tasks at hand: Callers and visitors can be announced by name, emergent conditions can be specified, and highly complex information can be conveyed at arbitrary length and precision.
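Again only as an illustration, with invented event fields and a stand-in for the actual speech synthesizer, a spoken notification might scale its length and precision to the occasion:

```python
# Assumption-laden sketch of a spoken notification: the same event can be
# rendered at different lengths and levels of detail, depending on how much of
# the listener's attention it deserves.

def speak(text: str) -> None:
    # Stand-in for the system's text-to-speech output.
    print(f"[speech] {text}")

def announce(event: dict, detail: str = "brief") -> str:
    if detail == "brief":
        utterance = f"{event['who']} is at the door."
    else:
        utterance = (f"{event['who']} arrived at {event['when']} "
                     f"and is waiting at the {event['where']}.")
    speak(utterance)
    return utterance

event = {"who": "Your sister", "when": "6:40 pm", "where": "front door"}
announce(event)                  # terse announcement
announce(event, detail="full")   # longer, more precise rendering
```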
But of all audio-channel measures, it is voice recognition that is most obviously called upon in constructing a computing that is supposed to be invisible but everywhere. Voices can, of course, be associated with specific people, and this can be highly useful in providing for differential permissioning—liquor cabinets that unlock only in response to spoken commands issued by adults in the household, journals that refuse access to any but their owners. Speech, too, carries clear cues as to the speaker's emotional state. A household system might react to these alongside whatever content is actually expressed—yes, the volume can be turned down in response to your command, but should