You follow this strange guide to a dark stairway, winding up inside the cathedral. At the top a low doorway takes you outside. The city stretches beneath you - transparent pathways curl around the dome and lead back to the ground. You see the little guide far below. He is dancing wildly as a stream of flying letters and books sails past him and whips inside a building. What is in there...?
This is a description of an experience in the CAVE(TM) virtual reality theatre. Virtual reality is the art and science of using computers to create three-dimensional worlds that users can be immersed in, explore at will, and interact with in real time.
The CAVE (a recursive acronym for CAVE Automatic Virtual Environment) is a virtual reality display device, but not the kind of head-mounted display normally associated with VR. Instead, it's more like a prototype for Star Trek's HoloDeck - a room that people can enter, with stereoscopic computer images projected on the walls and floor. The computer continually updates and redraws the display as users move through the environment.
One of the potentials of the CAVE is the creation of animated 3D worlds and characters that a user can interact with - in effect, making the user part of a story.
In 1991, DeFanti and Sandin decided to use their experience with video and interactive computer graphics to create a new approach to the growing field of virtual reality. Traditional VR systems were head-mounted - a pair of small video displays attached to a helmet or mechanical boom. Most such displays were low-resolution, encumbering, and isolated the user. EVL's new device - the CAVE - used video projection screens to create a VR display that users entered, rather than wore. The CAVE display was high-resolution, only required the users to wear lightweight shutter glasses, and could be shared by whole groups of people at once. It was implemented by EVL students - Carolina Cruz-Neira, Greg Dawe, Sumit Das, and others - and was first shown at the 1992 SIGGRAPH conference in Chicago. The full system was completed barely in time for the conference, and many of the applications demonstrated weren't seen in the CAVE itself until showtime.
The CAVE is a 10-foot by 10-foot cube; three walls are rear-projection screens, and the floor is projected onto from above. High-end Silicon Graphics computers, such as an Onyx2 Infinite Reality, generate the 3D images and simulate the dynamics of the virtual world. Another SGI machine, connected to loudspeakers in the four corners of the CAVE, creates the sounds of the environment.
The ImmersaDesk(TM) is a newer, smaller-scale version of the CAVE - a drafting-table-style display, rather than an entire room. Since 1992, over 50 CAVEs, ImmersaDesks, and similar devices have been installed in universities, corporate labs, and a few museums.
VR can be applied to any problem that can benefit from an immersive, three-dimensional, interactive solution - whether in molecular biology, cosmology, architecture and design, education, entertainment or the arts. General Motors has started using CAVEs to evaluate the design of new car interiors before having to build physical prototypes. Old Dominion University is using an ImmersaDesk to view computer simulations of the Chesapeake Bay ecosystem. At NCSA, Donna Cox used the CAVE program Virtual Director to create animation for the IMAX film Cosmic Voyage. A group at EVL has built a virtual island where children can tend a virtual garden and learn about environmental concepts. EVL also participates fully in the world of electronic art. Dan Sandin organised the opening show for the first CAVE installed in a museum of Art and Technology, the Ars Electronica Center in Linz, Austria, which featured projects by EVL faculty and students.
The Thing looks translucent. The triangular shapes forming its head, appendages and body do not seem to join up. It changes colors as it speaks and according to its moods. It is alternately bullying and loving. It has no specific gender. Its goal is to make you dance with it, which it takes as a sign of love and obedience.
To animate in the CAVE we use tools familiar to any computer animator. For example, the models for The Thing Growing are being made in Softimage and the textures in Photoshop. These models are then imported into the CAVE. In VR, the computer has to redraw the scene in about one-sixtieth of a second in order to keep the frame rate at 30 frames per second (remember, it has to draw a different view for each eye, so each frame means two redraws). Therefore, even with an Onyx2, these models must be far simpler than those used in computer animation for film and video, where you can spend minutes or hours rendering a single frame. The gain, and we think it's an exciting one, is being able to interact, in real time, with a virtual character and world.
Virtual reality applications can use a wide variety of methods for animating things: flipbooks, keyframing, motion capture, and procedural (computer-programmed) animation. The Multi Mega Book in the CAVE (described at the beginning of this article) uses a flipbook of 3D models to walk a wire-framed Judas out of Leonardo's painting of the Last Supper. In The Thing Growing, rocks come alive and chase the user; when a rock gets close enough it rears up and swallows her. In this case there are only four simple models, and the CAVE morphs between them to produce the rock's growing and grabbing action.
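A minimal sketch of this kind of morphing in C++, assuming the four rock models share the same vertex count and ordering (the article does not describe the actual data format or code used):

    #include <cstddef>
    #include <vector>

    // Hypothetical vertex and model layout; the real rock models are not
    // described in the article.
    struct Vertex { float x, y, z; };
    using Model = std::vector<Vertex>;

    // Linearly blend between two poses of the rock. t runs from 0 (all 'from')
    // to 1 (all 'to'); stepping t up over successive frames produces the
    // rearing-up and grabbing motion from only a handful of stored models.
    Model morph(const Model& from, const Model& to, float t)
    {
        Model out(from.size());
        for (std::size_t i = 0; i < from.size(); ++i) {
            out[i].x = from[i].x + t * (to[i].x - from[i].x);
            out[i].y = from[i].y + t * (to[i].y - from[i].y);
            out[i].z = from[i].z + t * (to[i].z - from[i].z);
        }
        return out;
    }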
We commonly use keyframe animation to move objects in a CAVE application. For example, to animate the flying letters referred to in the description of the Multi Mega Book, keyframes were set to determine the path of the letters through the city. As the letters move along this path, a simple behavior routine also makes them spin and orbit each other.
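In code, this can be as simple as linear interpolation between stored keyframes, with the spin layered on top as a per-frame behavior. The sketch below is illustrative only; the keyframe format and interpolation actually used are not given in the article:

    #include <cmath>
    #include <vector>

    // Hypothetical keyframe record: a time stamp and a position on the path.
    struct Keyframe { float time, x, y, z; };

    // Return the letter's position at time t by interpolating between the two
    // surrounding keyframes (keys are assumed non-empty and sorted by time).
    void samplePath(const std::vector<Keyframe>& keys, float t,
                    float& x, float& y, float& z)
    {
        if (t <= keys.front().time) {
            x = keys.front().x; y = keys.front().y; z = keys.front().z;
            return;
        }
        const Keyframe* prev = &keys.front();
        for (const Keyframe& k : keys) {
            if (t < k.time) {
                float u = (t - prev->time) / (k.time - prev->time);
                x = prev->x + u * (k.x - prev->x);
                y = prev->y + u * (k.y - prev->y);
                z = prev->z + u * (k.z - prev->z);
                return;
            }
            prev = &k;
        }
        x = prev->x; y = prev->y; z = prev->z;   // past the last keyframe
    }

    // The simple behavior layered on the keyframed path: each letter also
    // spins about its own axis at a fixed rate, independent of the path.
    float spinAngle(float t, float degreesPerSecond)
    {
        return std::fmod(t * degreesPerSecond, 360.0f);
    }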
The CAVE uses a tracking system to get the position and orientation of the user's head and hand, so that the computer can compute the perspective from a user-centered point of view. Tracking is done with electromagnetic systems such as the Ascension Flock of Birds; sensors are attached to the stereo glasses and to the 3D mouse. We use this same system to record motion-tracked animation.
Our first experiments with this were for the Multi Mega Book. Our collaborator, Franz Fischnaller, had already determined a shape for the guide character - a simple collection of geometric shapes. We hooked Franz up to four tracking sensors, one for the head, one for each arm, and one for the body. At the same time we ran a CAVE program that took the position and orientation information from the tracker and fed it to the parts of the character's body, which were then displayed. So as he moved, Franz could immediately see how the character would move. Its head moved as he moved his head. Its arm waved as he moved his arm. In this way we could build up animation for the individual body parts. Later, we could define keyframes to move the body as a whole along a path through the virtual world.
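As a rough sketch, each frame of this process just copies the latest sensor readings onto the corresponding body parts. The data layout and body-part names below are assumptions for illustration, not the project's actual code:

    #include <array>
    #include <vector>

    // One sample from an electromagnetic tracking sensor: a position plus an
    // orientation (Euler angles here for simplicity).
    struct SensorSample { float x, y, z, azim, elev, roll; };

    // The guide character is driven by four sensors: head, each arm, and body.
    enum BodyPart { HEAD = 0, LEFT_ARM, RIGHT_ARM, BODY, NUM_PARTS };

    struct GuideCharacter {
        std::array<SensorSample, NUM_PARTS> parts;

        // Called once per frame while the performer is wired up: the character's
        // parts mirror the performer's movement immediately.
        void driveFromTrackers(const std::array<SensorSample, NUM_PARTS>& sensors)
        {
            parts = sensors;
        }
    };

    // Recording an animation is then just a matter of appending each frame's
    // samples to a buffer that can be played back later, part by part.
    using Recording = std::vector<std::array<SensorSample, NUM_PARTS>>;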
Although motion tracking and keyframing are useful for creating the movements of characters and objects, they can't be used alone in VR. Because the virtual world is interactive, the user is an integral part of it, and the full progression of the storyline isn't known in advance. We can pre-animate elements of the action - walk cycles, dances, gestures - but these elements have to be combined dynamically in response to the user. Creating this level of interaction is easily the most challenging aspect of CAVE animation.
Building interaction for the CAVE means making objects intelligent and able to react to people. A simple example comes from The Thing Growing. At one point in the narrative the Thing becomes so angry with the user that it hides under one of the rocks in the vast plain the action takes place on. This is the point when the other rocks come alive and start to stalk and herd the user. Intelligence has to be programmed into the rocks - they have to know where the user is, they have to avoid each other, and they have to sneak up on the user and try to trap her. Instead of movement based on keyframes, the rocks are given a set of rules on how to move until one grabs the user. When that happens, all the other rocks scatter and the user's ability to navigate is taken away. She is trapped with a rock slobbering on her.
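A hedged sketch of what such per-rock, per-frame rules might look like (the actual rule set and tuning in The Thing Growing are not described): steer toward the user, but push away from any rock that comes too close.

    #include <cmath>
    #include <vector>

    struct Vec2 { float x, z; };   // motion on the plain is essentially 2D

    Vec2 stalkStep(const Vec2& rock, const Vec2& user,
                   const std::vector<Vec2>& otherRocks,
                   float speed, float avoidRadius)
    {
        // Rule 1: creep toward the user.
        Vec2 dir{ user.x - rock.x, user.z - rock.z };
        float len = std::sqrt(dir.x * dir.x + dir.z * dir.z);
        if (len > 0.0f) { dir.x /= len; dir.z /= len; }

        // Rule 2: avoid the other rocks by steering away from close neighbors.
        for (const Vec2& other : otherRocks) {
            float dx = rock.x - other.x, dz = rock.z - other.z;
            float d = std::sqrt(dx * dx + dz * dz);
            if (d > 0.0f && d < avoidRadius) { dir.x += dx / d; dir.z += dz / d; }
        }

        // Take one step along the combined direction. When a rock finally
        // grabs the user, the program would switch the others to a scatter
        // rule instead of this one.
        len = std::sqrt(dir.x * dir.x + dir.z * dir.z);
        if (len > 0.0f) { dir.x /= len; dir.z /= len; }
        return Vec2{ rock.x + dir.x * speed, rock.z + dir.z * speed };
    }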
For the Thing itself we are using motion tracking to build up a library of actions; in this case there are 8 body parts - head, two arms, body, four tail sections. Each action lasts a few seconds and has a corresponding sound bite. As the program runs, the Thing's intelligence unit selects an appropriate action and sound according to the point in the narrative, the user's actions, and the Thing's own emotional state. The computer will interpolate between the end of one action and the beginning of the next, so that the movement is smooth.
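One way to picture the action library and the smoothing between clips, using hypothetical data structures (the real storage format is not given in the article):

    #include <string>
    #include <vector>

    // Each recorded action is a few seconds of motion for all eight body
    // parts, plus the name of its accompanying sound bite.
    struct PartPose { float x, y, z, azim, elev, roll; };
    struct Frame    { PartPose part[8]; };   // head, two arms, body, four tail sections
    struct Action   { std::vector<Frame> frames; std::string sound; };

    // Blend from the last frame of the finished action toward the first frame
    // of the next one over a short interval, so the Thing never snaps between
    // clips. t runs from 0 to 1 over the transition (angle wrap-around is
    // ignored in this sketch).
    Frame blendFrames(const Frame& a, const Frame& b, float t)
    {
        Frame out;
        for (int i = 0; i < 8; ++i) {
            out.part[i].x    = a.part[i].x    + t * (b.part[i].x    - a.part[i].x);
            out.part[i].y    = a.part[i].y    + t * (b.part[i].y    - a.part[i].y);
            out.part[i].z    = a.part[i].z    + t * (b.part[i].z    - a.part[i].z);
            out.part[i].azim = a.part[i].azim + t * (b.part[i].azim - a.part[i].azim);
            out.part[i].elev = a.part[i].elev + t * (b.part[i].elev - a.part[i].elev);
            out.part[i].roll = a.part[i].roll + t * (b.part[i].roll - a.part[i].roll);
        }
        return out;
    }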
The Thing's intelligence also decides how the body as a whole moves. This movement may be relative to the user, as the Thing stays close or swoops in on the user. Or it may decide to move to a particular spot in the environment. All the while it has to avoid other objects. The computations for these movements are done on the fly using a set of rules and information about the position of the user, the Thing, and other objects.
The Thing has four basic moods: happy, depressed, manic, and angry. Its emotional state is established in part from information about the user. Essentially, all that the computer, and therefore the Thing, can know about a user is the tracked position and orientation of the user's head and one or both hands. So, for example, we keep track of the user's head position relative to the Thing - if the user looks at the Thing most of the time, it interprets that as attentiveness, and that makes it happy. We also monitor the general activity of the user. A user who moves about a lot registers as fast, and this will tend to make a happy Thing manic. If the Thing is not so happy, fast user movement will make it angry, and slow user movement will make it depressed. The Thing's emotional state also fluctuates according to an internal set of rules, so that its emotions are not simply a reflection of the user - too much of any one emotion will flip it over into a different one.
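A small sketch of how such an emotional state might be updated each frame. The thresholds, the flip interval, and the particular transitions below are illustrative guesses, not the project's actual tuning:

    enum Mood { HAPPY, DEPRESSED, MANIC, ANGRY };

    struct Emotion {
        Mood  mood     = HAPPY;
        float moodTime = 0.0f;                    // seconds spent in the current mood

        // 'attentive' means the user's tracked head is oriented toward the Thing;
        // 'activity' measures how much the head and hand have moved recently.
        void update(bool attentive, float activity, float dt)
        {
            const float FAST = 1.0f, SLOW = 0.1f, FLIP_AFTER = 30.0f;   // assumed values
            Mood next = mood;

            if (activity > FAST)                       next = (mood == HAPPY) ? MANIC : ANGRY;
            else if (activity < SLOW && mood != HAPPY) next = DEPRESSED;
            else if (attentive)                        next = HAPPY;    // being watched pleases it

            // Internal rule: too much of any one emotion flips it into another,
            // so the Thing is never just a mirror of the user.
            if (next == mood) {
                moodTime += dt;
                if (moodTime > FLIP_AFTER) next = (mood == ANGRY) ? DEPRESSED : ANGRY;
            }
            if (next != mood) { mood = next; moodTime = 0.0f; }
        }
    };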
In addition to monitoring the user in general, we check for specific user movements. The Thing is attempting to teach the user a dance. It will demonstrate each part of the dance, then observe or join the user as she copies the movement. Each part of the dance will be one of the Thing's actions, and each of these particular actions will have a test associated with it to verify whether the user is dancing correctly. The result of that test feeds back both to the Thing's emotional component and to its decision-making process. It may decide to repeat a part of the dance that the user is doing incorrectly. It will admonish, encourage or praise the user according to the user's behavior and its own mood.
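One plausible form for such a test, checking whether the user's tracked hand roughly followed the motion the Thing just demonstrated (the actual tests in The Thing Growing are not described in this detail):

    #include <cmath>

    struct Vec3 { float x, y, z; };

    // Returns true if the hand's net movement between the start and end of the
    // dance step is long enough and points roughly the demonstrated way.
    // 'expectedDir' is assumed to be a unit vector for the step being tested.
    bool danceStepMatches(const Vec3& handStart, const Vec3& handEnd,
                          const Vec3& expectedDir, float minDistance)
    {
        Vec3 moved{ handEnd.x - handStart.x,
                    handEnd.y - handStart.y,
                    handEnd.z - handStart.z };
        float dist = std::sqrt(moved.x * moved.x + moved.y * moved.y + moved.z * moved.z);
        if (dist < minDistance) return false;               // the user hardly moved

        float alignment = (moved.x * expectedDir.x +
                           moved.y * expectedDir.y +
                           moved.z * expectedDir.z) / dist;
        return alignment > 0.7f;                            // within about 45 degrees
    }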
After bringing the Thing to life through animation, the most daunting task is to give that life intelligence. The starting point is assessing its sensory inputs - the tracking information. In building the intelligence component we interpret the data the Thing receives, but we also try to construct its character so that the paucity of the information it is working with is not apparent. The Thing is high-handed and willful - in part because of the exigencies of the story line, and in part to hide its stupidity. It is inconsistent, arbitrarily praising or abusing the user for the same behavior - in part to mimic the inconsistency of many people, in part to hide its ignorance.
The Thing Growing is still under development. At this stage we can't judge how effective or real an experience it will give a user. But we hope that after testing and refining the program, it will be possible for the Thing to build a relationship with a user - however virtual that relationship may be.